Fantastic.github.io
Integrated System for Robot Food Handling at the Robosoft 2023 Manipulation Competition

This site, by Max Yang, presents a system for autonomous pick and place of food items, as specified by the Manipulation Challenge at the 2023 IEEE International Conference on Soft Robotics (RoboSoft).



Table of Contents


Challenge Overview

Pick food items from the bins and place them onto a tray. The task involves object recognition, bin picking, motion planning around the environment, and end-effector design for food handling.
A robot must assemble as many complete food trays as possible in 30 minutes. The robot must use at least one "soft" component, such as, but not limited to, a gripper, and must be fully autonomous. The food tray shall contain seven types of food selected from the list below:
  • Frankfurter sausages (boiled)
  • Meatballs (cooked)
  • Broccoli (florets, raw)
  • Carrots (sliced, 2-5 mm in thickness, cooked)
  • Green beans (cooked)
  • Spaghetti Noodles (cooked)
  • Cookies from IKEA (KAFFEREP)
  • Fried eggs (sunny side up)
  • Surprise food item (it turned out to be iced gems)
  • Orange juice

Thirty minutes before the actual run, the surprise food item randomly replaced one of the food items on the list, so each team got a different combination of food items in the competition. For our team, broccoli was replaced with iced gems. In no specific order, the robot must pick food items from the source table and place them into particular containers on a tray. The robot may assemble up to two complete trays with a predefined quantity of each item. On the source table, the bins are filled with more food than the quantity required for assembly and are firmly glued to a fixed rack and the source table. During the actual run, a team can restart the autonomous system, but the score is then reset to zero. A detailed rule book can be found here.

Further detailed rules and scoring guidelines can be found here.

Hardware Integration


Due to the size of the workspace for the food handling challenge, we decided to position the camera parallel to the robot arm and the gripper at a 45-degree angle to the camera. We found that this configuration enabled the UR10 robot arm to move freely around the shelves without collision and to reach all the key areas of the workspace without singularities.

System Architecture



Motion Planning and Collision Avoidance

One of the difficulties in this competition is moving the robot arm around the workspace without colliding with the surrounding physical environment, such as the bins and the rack. Our solution was to build a digital twin in the MoveIt! motion planning framework. Since this framework plans the motion of a robot arm while avoiding virtual objects, both motion planning and collision avoidance can be exploited as long as the virtual objects and the virtual robot arm accurately represent the real ones.

In Ubuntu 20.04, we installed the Robot Operating System (ROS) Noetic and MoveIt! Noetic. The digital twin shown in MoveIt! consists of four parts: (1) connecting a virtual gripper to the virtual robot arm (UR5 or UR10), (2) loading the arm and the gripper in MoveIt!, (3) controlling the real robot from the virtual robot, and (4) building the surrounding environment for collision avoidance.
1. Connecting a virtual gripper to the virtual robot arm.
In our actual setup, a Robotiq 2F gripper is mounted on the UR10 robot. For its digital twin, we created a xacro file that combines two different URDF files, one representing the gripper and one representing the robot. For the technical details, please refer to this link.

2. Loading the arm and the gripper in MoveIt!
Loading the gripper-attached robot arm in the MoveIt! GUI can be tricky because there are many parameters to tune. One simple way to set it up is to use the MoveIt Setup Assistant. Although the tool does not solve all of your problems, it is a good starting point. We summarized how to use it here.

3. Controlling the real robot from the virtual robot
We first installed the "Universal_Robots_ROS_Driver" package in Ubuntu 20.04. For communication between the Ubuntu PC and the robot, we also installed the "externalcontrol-x.x.x.urcap" program on the teach pendant of our UR10-CB. Once installed, we were able to control the robot arm from MoveIt!. You can find the basic technical details here.
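
For reference, below is a minimal sketch of driving the arm through MoveIt!'s Python interface once the driver and a move_group node are running. It assumes the planning group is named "manipulator" (the usual UR default); the target pose values are placeholders only.

```python
#!/usr/bin/env python3
# Minimal sketch: command the arm through MoveIt!'s Python API.
# Assumes the UR driver and a MoveIt! move_group node are already running
# and the planning group is named "manipulator" (typical for UR robots).
import sys
import rospy
import moveit_commander
from geometry_msgs.msg import Pose

moveit_commander.roscpp_initialize(sys.argv)
rospy.init_node("ur10_moveit_demo", anonymous=True)

arm = moveit_commander.MoveGroupCommander("manipulator")
arm.set_max_velocity_scaling_factor(0.2)  # keep motions slow near the rack

# Example target pose in the robot base frame (placeholder values).
target = Pose()
target.position.x = 0.4
target.position.y = -0.2
target.position.z = 0.3
target.orientation.w = 1.0

arm.set_pose_target(target)
success = arm.go(wait=True)  # plan and execute in one call
arm.stop()
arm.clear_pose_targets()
rospy.loginfo("Motion %s", "succeeded" if success else "failed")
```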

4. Building the surrounding environment for collision avoidance
We designed all the tables, bins, trays, and the rack as 3D CAD files, converted the files to .dae meshes using MeshLab, and imported them into MoveIt!. As shown in the figure below, each .dae file can be positioned and rotated as desired in the "Scene Objects" tab.
Inserting dae files
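
Scene objects can also be added programmatically instead of through the GUI. Below is a sketch, using moveit_commander's PlanningSceneInterface, of loading one converted .dae mesh as a collision object; the file path, frame, and pose are placeholders rather than our actual values.

```python
#!/usr/bin/env python3
# Sketch: load a converted .dae mesh (e.g. the rack) into the planning scene
# as an alternative to the "Scene Objects" tab in RViz. Path, frame, and pose
# below are placeholders for illustration.
import sys
import rospy
import moveit_commander
from geometry_msgs.msg import PoseStamped

moveit_commander.roscpp_initialize(sys.argv)
rospy.init_node("load_collision_scene", anonymous=True)
scene = moveit_commander.PlanningSceneInterface(synchronous=True)

rack_pose = PoseStamped()
rack_pose.header.frame_id = "base_link"  # robot base frame
rack_pose.pose.position.x = 0.7          # measured offset of the rack
rack_pose.pose.orientation.w = 1.0

# The mesh is then treated as an obstacle during motion planning.
scene.add_mesh("rack", rack_pose, "/path/to/rack.dae", size=(1.0, 1.0, 1.0))
rospy.loginfo("Known objects: %s", scene.get_known_object_names())
```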


Object and Grasp Detection

For vision capabilities, our robot system is equipped with a RealSense Depth Camera D435i, which provides RGB-D images used for object and grasp detection. The purpose of the vision system is to localize grasping candidates inside a food bin or on the bottle sitting on the table. The grasping candidates are then transformed into the robot coordinate frame to execute the grasp primitive. The computer vision system consists of three parts: (1) object detection by YOLO, (2) generation of grasp poses in bins by the FGE algorithm, and (3) object localization on the workspace.

1. Object detection by YOLO
We used the You Only Look Once (YOLO) algorithm to quickly detect and classify objects. The algorithm generates a bounding box around each object in an RGB image and attaches a class label to it. In our food-picking system, the pre-trained YOLOv5x model was chosen to detect cookies and sunny-side-up eggs. Open Images V6-Food and Vietnamese food pictures were used for object detection, while the MAFood121 dataset was used for classification training. To detect the bottle containing orange juice, we used a model trained on the Objects365 dataset.
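
As an illustration, the sketch below runs a YOLOv5 model on an RGB frame via the torch.hub interface and extracts the box centers that serve as grasp pixels. The weights, confidence threshold, and input image here are placeholders and do not reproduce our exact training setup.

```python
# Sketch: detect objects in an RGB frame with YOLOv5 and take box centers
# as candidate grasp pixels. Weights and threshold are placeholder choices.
import cv2
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5x", pretrained=True)
model.conf = 0.4  # confidence threshold (tuning value)

frame = cv2.imread("workspace.png")           # stand-in for a RealSense frame
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # model expects RGB input
results = model(rgb)

# Each row of results.xyxy[0]: x1, y1, x2, y2, confidence, class id
for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0  # box center -> grasp pixel
    print(model.names[int(cls)], round(conf, 2), (int(cx), int(cy)))
```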

2. Generation of grasp poses in bins by the FGE algorithm
Random grasping, which executes a grasp at the center of the bin, works for some objects such as iced gems and pasta, but it is not a good strategy for picking up larger objects such as sausages and broccoli. From a depth map, the Fast Graspability Evaluation (FGE) algorithm generates multiple candidates for the optimal grasping position with fewer unexpected collisions with untargeted objects. The algorithm computes a graspability score, including a collision index calculated from the depth map and the gripper geometry, and uses it to determine the top five candidates. We referred to Xinyi Zhang's code for the implementation and optimized it for our scenario.
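
The sketch below illustrates the core idea behind FGE in a simplified form: convolve "contact" and "collision" footprints of the gripper with binarized height maps and rank locations where contact is high and collision is low. It is an illustration of the principle only, not the implementation we actually used.

```python
# Simplified illustration of the Fast Graspability Evaluation idea:
# score = (contact with graspable material) - (collision with obstacles),
# evaluated densely over the depth image by convolution.
import numpy as np
import cv2

def graspability_map(depth, grasp_depth, contact_mask, collision_mask):
    """depth: HxW depth image (m, smaller = higher); masks: 0/1 gripper footprints."""
    graspable = (depth < grasp_depth).astype(np.float32)          # material at grasp height
    obstacles = (depth < grasp_depth - 0.02).astype(np.float32)   # taller material the fingers would hit

    contact = cv2.filter2D(graspable, -1, contact_mask.astype(np.float32))
    collision = cv2.filter2D(obstacles, -1, collision_mask.astype(np.float32))
    return cv2.GaussianBlur(contact - collision, (15, 15), 0)

def top_candidates(score, k=5):
    """Return the k highest-scoring (row, col) grasp centers."""
    flat = np.argsort(score, axis=None)[::-1][:k]
    return [np.unravel_index(i, score.shape) for i in flat]
```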

3. Object localization on the workspace
To achieve object localization, we first perform eye-in-hand calibration, from which we obtain the transformation from the camera frame to the robot frame, \(^rT_c\). This allows us to transform detected points in the image frame into the robot frame. For objects detected by YOLO, we use the center of the bounding box with a fixed gripper orientation to generate the grasping pose. For the FGE algorithm, we use the center of each grasping candidate as the grasping pose.
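
A minimal sketch of this localization step is shown below: a detected pixel and its depth are deprojected with the camera intrinsics and then mapped into the robot frame with \(^rT_c\). The intrinsic and transform values are placeholders; in practice both come from calibration.

```python
# Sketch: map a detected pixel (u, v) with depth d into the robot base frame
# using camera intrinsics K and the eye-in-hand calibration result rTc.
# All numeric values are placeholders.
import numpy as np

K = np.array([[615.0,   0.0, 320.0],   # fx,  0, cx
              [  0.0, 615.0, 240.0],   #  0, fy, cy
              [  0.0,   0.0,   1.0]])

rTc = np.eye(4)  # 4x4 camera-to-robot homogeneous transform from calibration

def pixel_to_robot(u, v, depth_m):
    """Deproject pixel + depth to the camera frame, then map to the robot frame."""
    xyz_cam = depth_m * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    xyz_hom = np.append(xyz_cam, 1.0)  # homogeneous coordinates
    return (rTc @ xyz_hom)[:3]

grasp_xyz = pixel_to_robot(355, 210, 0.62)  # e.g. the center of a YOLO box
print(grasp_xyz)
```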



Effective end-effectors for ever-changing edibles

Pick and place of food items assumes a good enough grasp on the object. This can be a challenge because food is... messy: different shapes, textures, and sizes, with all their organic variance. Compliant grippers are key to handling such variation, and since the competition requires soft materials as part of the robot's functionality, an interesting soft gripper was needed.

The gripper developed for the competition is a variant of the popular Fin Ray effect adaptive fingers. The Fin Ray fingers were printed in a multi-material mix of Vero and Agilus on a Connex Objet printer. They are curved to promote scooping behavior and increase the grasp volume.


Project Team

  • Saekwang Nam
  • Loong Yi Lee
  • Max Yang
  • Naoki Shitanda
  • Lihaoya Tan
  • Jonathan Rossiter
  • Nathan F. Lepora



Acknowledgements

This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.