Project: Visual-tactile geometric reasoning
Robotic grasp planning from raw sensory data is difficult due to occlusion and incomplete scene geometry. Often a single sensory modality does not provide enough context for reliable planning: a single depth image reveals nothing about the occluded regions of an object, and tactile information is spatially very sparse. We are building a deep convolutional neural network (CNN) that combines 3D vision and tactile information to perform shape completion of an object seen from a single view, and to plan stable grasps on the completed models.
This multi-modal system will use tactile and RGBD information to form a more complete object model for grasping. The project draws heavily on simulated vision and tactile data to create rich training sets. The DSI scholar will help create the simulated datasets and use machine learning frameworks (e.g., Caffe, Theano) to develop the CNN architecture. The contributions of this work include: an open-source dataset for training a shape-completion system on both tactile and RGBD information; a framework for integrating multi-modal sensory data to reason about object geometry; and results comparing completed object models built from depth alone versus combined depth and tactile information.
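A minimal sketch of the multi-modal input representation described above: a single-view depth observation and a handful of sparse tactile contacts are each voxelized into an occupancy grid, then stacked as channels of one tensor that a 3D shape-completion CNN could consume. All names, the grid resolution, and the channel layout here are illustrative assumptions, not the project's actual pipeline.

```python
import numpy as np

GRID = 40  # assumed voxel resolution per axis


def voxelize(points, grid=GRID):
    """Map 3D points in the unit cube [0, 1)^3 to a binary occupancy grid."""
    vox = np.zeros((grid, grid, grid), dtype=np.float32)
    idx = np.clip((points * grid).astype(int), 0, grid - 1)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vox


def fuse(depth_points, tactile_points):
    """Stack the two modalities as separate input channels for a 3D CNN."""
    return np.stack([voxelize(depth_points), voxelize(tactile_points)])


# Example: a dense visible front surface from the depth sensor, plus two
# sparse tactile contacts on the occluded back of the object.
depth = np.random.rand(500, 3) * np.array([1.0, 1.0, 0.3])   # visible front
touch = np.array([[0.5, 0.5, 0.9], [0.4, 0.6, 0.85]])        # occluded back
x = fuse(depth, touch)
print(x.shape)  # (2, 40, 40, 40): channels x depth x height x width
```

The two-channel encoding lets the network learn where tactile evidence should override or refine the depth-only prediction, which is exactly the comparison (depth only vs. depth plus tactile) the project aims to evaluate.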
Knowledge of robotics, computer vision, or machine learning is a plus.
Lab: Columbia Robotics Lab
Direct Supervisor: Peter Allen
Position Dates: 6/1/2018 - 8/14/2018
Hours per Week: 40
Paid Position: Yes
Positions Available: 1
Eligibility: Freshman, Sophomore, Junior, Senior, Master's; SEAS only