Structure Overview
We introduce a reinforcement learning framework that effectively exploits tactile priors, learns multi-modal features and rewards under the improved T-TD3 strategy, and performs deformable-object grasping subtasks in both simulation and the real world.
We develop a structured simulation environment, CR5GraspStable-Env, built on PyBullet for simulating deformable objects, and decompose the deformable-object grasping process into three consecutive subtasks: slip detection, stable grasping evaluation, and minimum grasping force tracking. We apply dedicated reward shaping to each subtask to better capture the grasping challenges posed by deformable objects.
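The per-subtask reward functions are not specified here; the sketch below illustrates one plausible way such shaped rewards could be structured. All thresholds, weights, and functional forms are illustrative assumptions, not the paper's actual reward definitions.

```python
import numpy as np

def slip_detection_reward(predicted_slip: bool, actual_slip: bool) -> float:
    """Reward correct slip classification, penalize mistakes (assumed form)."""
    return 1.0 if predicted_slip == actual_slip else -1.0

def stable_grasp_reward(object_displacement: float, tol: float = 0.005) -> float:
    """Higher reward the less the held object moves; `tol` (meters) is an
    assumed tolerance."""
    return 1.0 if object_displacement < tol else -object_displacement / tol

def min_force_reward(applied_force: float, min_stable_force: float) -> float:
    """Penalize squeezing harder than the minimum force that keeps the grasp
    stable; penalize dropping (force below the minimum) even more."""
    if applied_force < min_stable_force:
        return -2.0                    # grasp likely fails: strong penalty
    excess = applied_force - min_stable_force
    return float(np.exp(-excess))      # decays toward 0 as excess force grows
```

Shaping each subtask separately like this lets the agent receive dense feedback at every step instead of a single sparse success signal at the end of the grasp.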
Here are the sixteen everyday objects used in both the simulation environment and the real world:
Water Bottle
Apple
Tennis Ball
Insole
Potato Chip Bag
Cookies
Fabric
Pocky
Bagged Water
Plastic Jar
Plastic Cup
Orange
Bread
Banana
Egg
Cherry
We build CR5GraspStable-Env around these three subtasks: slip detection, stable grasping evaluation, and minimum grasping force tracking.
For the tactile modality obtained from the environment, we propose a unified tactile-priors representation that encodes the GelSight sensor's output as four feature maps: a flow map, a contact map, a depth map, and a force map.
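As a rough illustration, the four feature maps could be derived from consecutive GelSight depth frames along the following lines. The temporal-difference flow approximation, the contact threshold, and the linear-elastic force model are all simplifying assumptions made for this sketch, not the paper's actual extraction pipeline.

```python
import numpy as np

def tactile_priors(depth_prev: np.ndarray, depth_curr: np.ndarray,
                   contact_thresh: float = 0.1, stiffness: float = 5.0):
    """Build four tactile feature maps from two consecutive depth frames.
    `contact_thresh` and the linear `stiffness` model are assumed values."""
    depth_map = depth_curr
    # contact map: binary mask of pixels indented beyond the threshold
    contact_map = (depth_curr > contact_thresh).astype(np.float32)
    # crude flow: spatial gradient of the temporal depth difference
    gy, gx = np.gradient(depth_curr - depth_prev)
    flow_map = np.stack([gx, gy], axis=-1)
    # linear-elastic assumption: normal force proportional to indentation
    force_map = stiffness * depth_curr * contact_map
    return flow_map, contact_map, depth_map, force_map
```

Packing all four maps into one tensor gives the policy a fixed-size tactile observation regardless of the object being grasped.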
To effectively utilize the tactile priors and proprioception acquired by the agent, we introduce T-TD3, an extension of TD3. It fuses tactile priors and proprioception through a Multi-Scale Fusion Network (MSF-Net) serving as the actor-critic backbone. To improve exploration efficiency, we employ a dual-actor structure, and we apply a strategic noise-attenuation method to stabilize the model's decision-making during training. Additionally, we integrate critic regularization terms into the Q-value estimation to improve the model's fitting capability.
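The dual-actor and noise-attenuation ideas can be sketched in a few lines. The exponential decay schedule and the critic-arbitrated action selection below are one plausible reading of these mechanisms; the schedule constants and the exact arbitration rule are assumptions, not the paper's implementation.

```python
import numpy as np

def attenuated_noise(step: int, sigma0: float = 0.2,
                     decay: float = 1e-4, sigma_min: float = 0.02) -> float:
    """Exponentially decay exploration noise over training steps
    (illustrative schedule; the paper's attenuation rule may differ)."""
    return max(sigma_min, sigma0 * float(np.exp(-decay * step)))

def dual_actor_action(state, actor_a, actor_b, critic, step: int):
    """Query both actors, perturb each proposal with attenuated Gaussian
    noise, and keep the candidate the critic scores higher."""
    sigma = attenuated_noise(step)
    candidates = [actor(state) + np.random.normal(0.0, sigma, size=1)
                  for actor in (actor_a, actor_b)]
    q_vals = [critic(state, a) for a in candidates]
    return candidates[int(np.argmax(q_vals))]
```

Letting the critic arbitrate between two independently initialized actors keeps exploration diverse early on, while the shrinking noise makes behavior increasingly deterministic as training converges.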