*: equal contribution, †: corresponding author
Deep Object-Centric 3D Perception |
|
Interactive Humanoid: Online Full-Body Motion Reaction Synthesis with Social Affordance Canonicalization and Forecasting |
|
ImOV3D: Learning Open Vocabulary Point Clouds 3D Object Detection from Only 2D Images |
|
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction |
|
QuasiSim: Parameterized Quasi-Physical Simulators for Dexterous Manipulations Transfer |
|
FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models |
|
PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation |
|
GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation |
|
TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding |
|
GenN2N: Generative NeRF2NeRF Translation |
|
Physics-aware Hand-object Interaction Denoising |
|
DreamLLM: Synergistic Multimodal Comprehension and Creation |
|
GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion |
|
Enhancing Generalizable 6D Pose Tracking of an In-Hand Object with Tactile Sensing |
|
CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding |
|
Semantic Complete Scene Forecasting from a 4D Dynamic Point Cloud Sequence |
|
Full-Body Motion Reconstruction with Sparse Sensing from Graph Perspective |
|
NSM4D: Neural Scene Model Based Online 4D Point Cloud Sequence Understanding |
|
TransTouch: Learning Transparent Objects Depth Sensing Through Sparse Touches |
|
LeaF: Learning Frames for 4D Point Cloud Sequence Understanding |
|
Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation |
|
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning |
|
3D Implicit Transporter for Temporally Consistent Keypoint Discovery |
|
ArrangementNet: Learning Scene Arrangements for Vectorized Indoor Scene Modeling |
|
Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining |
|
Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning |
|
CAMS: CAnonicalized Manipulation Spaces for Category-Level Functional Hand-Object Manipulation Synthesis |
|
JacobiNeRF: NeRF Shaping with Mutual Information Gradients |
|
GAPartNet: Learning Generalizable and Actionable Parts for Cross-Category Object Perception and Manipulation |
|
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy |
|
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer |
|
Semi-Weakly Supervised Object Kinematic Motion Prediction |
|
Self-Supervised Category-Level Articulated Object Pose Estimation with Part-Level SE(3) Equivariance |
|
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? |
|
Language-Assisted 3D Feature Learning for Semantic Scene Understanding |
|
Tracking and Reconstructing Hand Object Interactions from Point Cloud Sequences in the Wild |
|
MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds |
|
Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding |
|
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction |
|
Rotationally Equivariant 3D Object Detection |
|
AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation |
|
CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance |
|
Multi-Robot Active Mapping via Neural Bipartite Graph Matching |
|
APES: Articulated Part Extraction from Sprite Sheets |
|
Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction |
|
PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning |
|
Leveraging SE(3) Equivariance for Self-supervised Category-Level Object Pose Estimation from Point Clouds |
|
Contrastive Multimodal Fusion with TupleInfoNCE |
|
P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding |
|
Compositionally Generalizable 3D Structure Prediction |
|
Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments |
|
Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds |
|
Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks |
|
Curriculum DeepSDF |
|
SAPIEN: A SimulAted Part-based Interactive ENvironment |
|
Category-Level Articulated Object Pose Estimation |
|
StructEdit: Learning Structural Shape Variations |
|
AdaCoSeg: Adaptive Shape Co-Segmentation with Group Consistency Loss |
|
StructureNet: Hierarchical Graph Networks for 3D Shape Generation |
|
GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud |
|
TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes |
|
Supervised Fitting of Geometric Primitives to 3D Point Clouds |
|
PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding |
|
GeoNet: Deep Geodesic Networks for Point Cloud Analysis |
|
Deep Part Induction from Articulated Object Pairs |
|
Beyond Holistic Object Recognition: Enriching Image Understanding with Part States |
|
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space |
|
Learning Hierarchical Shape Segmentation and Labeling from Online Repositories |
|
SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation |
|
A Scalable Active Framework for Region Annotation in 3D Shape Collections |
|
ShapeNet: An Information-Rich 3D Model Repository |
|
3D-Assisted Image Feature Synthesis for Novel Views of an Object |
|
Image Super-Resolution Via Analysis Sparse Prior |