Paper Title
GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts
Paper Authors
Paper Abstract
For years, researchers have been devoted to generalizable object perception and manipulation, where cross-category generalizability is highly desired yet underexplored. In this work, we propose to learn such cross-category skills via Generalizable and Actionable Parts (GAParts). By identifying and defining 9 GAPart classes (lids, handles, etc.) across 27 object categories, we construct a large-scale part-centric interactive dataset, GAPartNet, which provides rich part-level annotations (semantics, poses) for 8,489 part instances on 1,166 objects. Based on GAPartNet, we investigate three cross-category tasks: part segmentation, part pose estimation, and part-based object manipulation. Given the significant domain gaps between seen and unseen object categories, we propose a robust 3D segmentation method from the perspective of domain generalization by integrating adversarial learning techniques. Our method outperforms all existing methods by a large margin on both seen and unseen categories. Furthermore, using the part segmentation and pose estimation results, we leverage the GAPart pose definition to design part-based manipulation heuristics that generalize well to unseen object categories, both in the simulator and in the real world. Our dataset, code, and demos are available on our project page.