TextHOI-3D: Text-to-3D Hand-Object Interaction via Discrete Multi-View Generation and Joint Mesh Optimization
Researchers have developed TextHOI-3D, a framework for generating 3D hand-object interactions from text descriptions. The approach is staged: it first generates multi-view observations from the text, then uses them as an intermediate representation for recovering the hand and object geometry, bridging text-conditioned visual generation and geometry-aware recovery. Compared with single-view methods, the system is reported to improve hand-object contact accuracy and reduce penetration volume, which supports the use of discrete multi-view tokens for this 3D generation task.
IMPACT: Advances text-to-3D generation for complex hand-object interactions, with potential applications in virtual reality and content creation.