Semantic 3D perception and VLM-based planning for humanoid loco-manipulation

Short demonstration (YouTube mirror: link)