arXiv 2025
Human-like motion requires human-like perception. We create a human motion generation system, named CLOPS, driven purely by egocentric visual observations. CLOPS moves realistically through a scene and uses egocentric vision to find a goal (a red sphere). We achieve this by combining a data-driven low-level motion prior with a Q-learning policy, which together form a closed loop of visual perception and motion.
Visual overview of CLOPS approach showing motion generation through visual observation.
The following examples demonstrate CLOPS in action. The Q-network receives egocentric observations at 1Hz and predicts goal poses for the avatar's head (visualised as coordinate frames). The motion generation network then generates natural motion to reach these head goals, and the loop continues:
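The perception-motion loop described above can be sketched in a few lines. The following is a minimal illustrative skeleton, not the authors' implementation: the `QNetwork`, `MotionPrior`, and `render_egocentric` placeholders, their shapes, and the toy brightest-pixel policy are all assumptions standing in for the trained networks and renderer.

```python
import numpy as np

class QNetwork:
    """Stand-in for the Q-network: maps an egocentric image to a head-goal offset."""
    def predict_goal(self, image):
        # Toy policy: step 0.5 m horizontally toward the brightest image column.
        row, col = np.unravel_index(np.argmax(image), image.shape)
        direction = np.array([col - image.shape[1] / 2, 0.0, 0.0])
        norm = np.linalg.norm(direction)
        return direction / norm * 0.5 if norm > 0 else np.zeros(3)

class MotionPrior:
    """Stand-in for the low-level motion prior: frames that reach the head goal."""
    def generate(self, head_pos, goal_offset, n_frames=30):
        # Linear interpolation toward the goal over one second at 30 fps.
        return [head_pos + goal_offset * (t + 1) / n_frames for t in range(n_frames)]

def render_egocentric(head_pos, goal_world=np.array([3.0, 0.0, 0.0])):
    # Stand-in renderer: an 8x8 image whose bright pixel encodes the goal's
    # horizontal offset from the head, mimicking a visible target in view.
    img = np.zeros((8, 8))
    col = int(np.clip(goal_world[0] - head_pos[0] + 4, 0, 7))
    img[4, col] = 1.0
    return img

# The 1 Hz loop: observe egocentrically, predict a head goal,
# generate a second of motion toward it, then observe again.
q_net, prior = QNetwork(), MotionPrior()
head_pos = np.zeros(3)
trajectory = [head_pos]
for _ in range(3):  # three 1 Hz perception-motion steps
    obs = render_egocentric(head_pos)
    goal = q_net.predict_goal(obs)
    frames = prior.generate(head_pos, goal)
    trajectory.extend(frames)
    head_pos = frames[-1]
```

With these placeholders the avatar's head advances 0.5 m toward the target per loop iteration; in CLOPS itself both the goal prediction and the motion generation are learned networks.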