T-TD3

T-TD3: A Reinforcement Learning Framework for Stable Grasping of Deformable Objects Using Tactile Prior

1Department of Control Science & Engineering, College of Electronics & Information Engineering, Tongji University, 2National Key Laboratory of Autonomous Intelligent Unmanned Systems, 3Frontiers Science Center for Intelligent Autonomous Systems

Abstract

Human tactile perception enables rapid assessment of deformable objects and the application of just enough force to prevent slip or excessive deformation. This task, however, remains challenging for robots. To address it, we propose the T-TD3 algorithm, which uses a multi-scale fusion neural network (MSF-Net) for the fused perception of multi-scale features, including tactile prior information obtained through preprocessing. Our approach decomposes the robotic grasping of deformable objects into three subtasks: slip detection, stable grasping evaluation, and minimum grasping force tracking. We develop a simulation environment, CR5GraspStable-Env, using PyBullet and TACTO for network training. Our method achieves a 94.81% success rate when grasping deformable objects in the real world, demonstrating strong sim-to-real transfer. Moreover, the proposed approach can be extended to other stable grasping tasks that rely on tactile perception.

Video

At present, the video is hosted as an unlisted link on YouTube and can only be accessed through the link below.

Support Video Link, please click!

Motivation

Structure Overview

We introduce a reinforcement learning framework that effectively exploits tactile priors, learns multi-modal features and rewards under the improved T-TD3 strategy, and performs deformable-object grasping subtasks in both simulation and the real world.

We develop a structured simulation environment, CR5GraspStable-Env, using PyBullet to simulate deformable objects, and decompose the grasping process into three consecutive subtasks: slip detection, stable grasping evaluation, and minimum grasping force tracking. We apply effective reward shaping to each subtask to better reproduce the grasping challenges posed by deformable objects.
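To make the per-subtask reward shaping concrete, here is a minimal sketch of what such shaped rewards could look like. The signal names (`slip_speed`, `deform_ratio`, `f`, `f_min`) and all thresholds are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: hypothetical shaped rewards for the three subtasks.
# All variable names and constants are illustrative, not the paper's.

def reward_slip_detection(slip_speed, threshold=1e-3):
    # Reward detecting a slip-free state; penalize relative slip
    # between the gripper and the object.
    return 1.0 if abs(slip_speed) <= threshold else -abs(slip_speed)

def reward_stable_grasp(slip_speed, deform_ratio):
    # Reward grasps that avoid both slip and excessive deformation.
    return 1.0 - abs(slip_speed) - deform_ratio

def reward_min_force(f, f_min, slipped):
    # Track the minimum grasping force that still prevents slip:
    # penalize slipping outright, otherwise penalize excess force.
    if slipped:
        return -1.0
    return -abs(f - f_min) / max(f_min, 1e-6)
```

A shaped reward of this kind lets the agent learn each subtask with a dense signal rather than a sparse success/failure flag.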

Here are the sixteen everyday objects we used in the simulation environment and the real world:

Water Bottle

Apple


Tennis Ball

Insole


Potato Chip Bag

Cookies


Fabric

Pocky


Bagged Water

Plastic Jar


Plastic Cup

Orange


Bread

Banana


Egg

Cherry


We build CR5GraspStable-Env around three subtasks: slip detection, stable grasping evaluation, and minimum grasping force tracking.

For the tactile modality obtained from the environment, we propose a unified tactile prior representation that encodes the output of the GelSight sensor as four feature maps: a flow map, a contact map, a depth map, and a force map.
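As a rough illustration, the four maps could be derived from a GelSight depth image as sketched below. This is a minimal assumption-laden sketch: `depth_ref` denotes a hypothetical no-contact reference frame, the contact threshold is arbitrary, and real implementations would use calibrated optical flow and force models rather than these proxies.

```python
import numpy as np

# Hedged sketch: deriving four tactile prior maps from GelSight depth.
# depth_ref, the threshold, and the flow/force proxies are assumptions.

def tactile_priors(depth, depth_ref, prev_depth=None, contact_thresh=0.02):
    # Indentation of the gel surface relative to the no-contact reference.
    indentation = np.clip(depth_ref - depth, 0.0, None)
    depth_map = indentation
    # Contact map: binary mask where indentation exceeds a small threshold.
    contact_map = (indentation > contact_thresh).astype(np.float32)
    # Force map: indentation inside the contact region as a crude proxy
    # for local normal force.
    force_map = indentation * contact_map
    # Flow map: frame-to-frame depth change as a crude proxy for the
    # marker flow a real pipeline would estimate optically.
    if prev_depth is None:
        flow_map = np.zeros_like(depth)
    else:
        flow_map = depth - prev_depth
    return flow_map, contact_map, depth_map, force_map
```

Stacking these maps as channels gives the tactile prior tensor that the fusion network can consume alongside proprioception.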

To effectively utilize the tactile priors and proprioception acquired by the agent, we introduce T-TD3, an extension of TD3. It integrates tactile priors and proprioception through a Multi-Scale Fusion Network (MSF-Net) serving as the actor-critic backbone. A dual-actor structure improves exploration efficiency, and a strategic noise attenuation method stabilizes the model's decision-making during training. Additionally, we integrate critic regularization terms into the Q-value estimation to improve the model's fitting capability.
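The three TD3 modifications can be sketched in isolation as follows. The decay schedule, the argmax-over-critic selection rule, and the regularization weight are illustrative assumptions standing in for the paper's exact design.

```python
import numpy as np

# Hedged sketch of three T-TD3 training ideas; constants and the exact
# selection/regularization rules are assumptions, not the paper's values.

def exploration_noise(step, sigma0=0.2, decay=1e-5, sigma_min=0.02):
    # Strategic noise attenuation: exponentially shrink exploration noise
    # so late-stage decision-making stays stable.
    return max(sigma_min, sigma0 * np.exp(-decay * step))

def dual_actor_select(actions, q_values):
    # Dual-actor exploration: both actors propose an action; execute the
    # one the critic scores higher.
    return actions[int(np.argmax(q_values))]

def critic_loss(q_pred, q_target, reg_weight=1e-3):
    # TD error plus a regularization term on the Q estimate itself,
    # discouraging runaway value estimates.
    td = np.mean((q_pred - q_target) ** 2)
    return td + reg_weight * np.mean(q_pred ** 2)
```

In a full agent these pieces would sit inside the standard TD3 loop, with MSF-Net providing the fused state representation to both actors and critics.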

Experiment Results

Here we show results generated with T-TD3. The videos show grasping results for the sixteen everyday objects.


More Experiments

Different Light Conditions

We evaluate the feature extraction performance of the proposed tactile feature extraction method using GelSight under both natural light and strong light conditions.

Acknowledgements

We borrow the website template from HyperNeRF. Special thanks to them!