🎉 Our work received the Best Paper Award at IWSDS 2024

Mar 6, 2024 ·
Shang-Chi (Sean) Tsai
· 1 min read

Title

ASMR: Augmenting Life Scenario using Large Generative Models for Robotic Action Reflection

Main Contribution

In this paper, we introduce a novel pipeline designed to enhance the collection of robotic life-support scenario data, traditionally a time-consuming process. Our approach leverages a large language model to simulate dialogues between humans and robots, and a large diffusion model to create corresponding images of the environments. We design two distinct types of dialogue generation: place-based augmentation, which focuses on scenarios occurring in specific places, and action-based augmentation, which centers on specific actions the robot might perform. Both approaches prove effective in generating realistic and relevant data, significantly aiding the training of the LLaVA model, which we fine-tune to predict suitable actions from ambiguous user requests and environmental imagery.
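As a rough illustration of the pipeline, the sketch below pairs an LLM-generated dialogue with a diffusion-generated scene image for each augmentation mode. The function names, prompt wording, and the `diffusers` backend are illustrative assumptions, not the exact implementation from the paper.

```python
# Minimal sketch of place-based and action-based augmentation (assumed interfaces).
from diffusers import StableDiffusionPipeline  # assumed diffusion backend


def build_prompt(mode: str, seed: str) -> str:
    """Compose an LLM prompt for one of the two augmentation modes."""
    if mode == "place":
        return (f"Write a short dialogue between a person and a home-support robot, "
                f"set in this place: {seed}. The person's request should be ambiguous.")
    if mode == "action":
        return (f"Write a short dialogue between a person and a home-support robot that "
                f"ends with the robot performing this action: {seed}. "
                f"The person's request should be ambiguous.")
    raise ValueError(f"unknown mode: {mode}")


def augment(mode: str, seed: str, llm, sd_pipe: StableDiffusionPipeline):
    """Generate one (dialogue, scene image) pair for fine-tuning a model such as LLaVA."""
    dialogue = llm(build_prompt(mode, seed))  # any text-generation callable
    # In practice the image prompt would be derived from the generated scenario;
    # here we simply reuse the seed as a placeholder description.
    image = sd_pipe(f"photo of {seed}, indoor household scene").images[0]
    return dialogue, image
```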

  • Source Code - Our code is available on GitHub
  • Paper Link - Our paper is available on arXiv