For safe and efficient interactions between mobile robots and humans, it is key for the robot to understand human behavior, ranging from the human's 3D body pose to their high-level actions. This research area includes questions about suitable learning representations for human modeling, robustifying model predictions despite the large variety of human body shapes and the multi-modality of human actions, and leveraging contextual information for accurate predictions.