Abstract
Creative tool use, a hallmark of intelligence characterized by using tools beyond their intended function, presents significant challenges and opportunities in robotics. This study introduces RoboTool, a system that enables robots to use tools creatively by integrating large language models (LLMs). RoboTool accepts natural language instructions and outputs executable code for controlling robots. Its modular design is loosely inspired by the functional specialization of the cerebral cortex and comprises four components: (i) an Analyzer that interprets the natural language input to identify key task-related concepts, (ii) a Planner that generates comprehensive strategies from the language input and the key concepts, (iii) a Calculator that computes the parameters for each skill, and (iv) a Coder that translates these plans into executable Python code. We propose a novel benchmark to evaluate RoboTool across diverse robotic configurations, including a robotic arm and a quadrupedal robot. Our results show that RoboTool not only comprehends implicit physical constraints and environmental factors but also demonstrates creative tool use. Unlike traditional Task and Motion Planning methods, which rely on explicit optimization and are confined to formal logic, our LLM-based system offers a more flexible, efficient, and user-friendly solution for complex robotics tasks. Through extensive analysis, we validate that RoboTool is proficient at tasks that would be infeasible without the creative use of tools, thereby expanding the capabilities of robotic systems.
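The four-stage flow described above (Analyzer, Planner, Calculator, Coder) could be sketched roughly as follows. This is an illustrative sketch only, not RoboTool's implementation: the `LLM` callable, the prompt wording, `PipelineResult`, `run_pipeline`, and the `echo_llm` stub are all hypothetical names introduced here, and the sketch assumes each stage is a single text-in, text-out model call that receives the outputs of the stages before it.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for a language model call: prompt text in, text out.
LLM = Callable[[str], str]


@dataclass
class PipelineResult:
    """Intermediate and final outputs of the four stages (assumed to be text)."""
    key_concepts: str
    plan: str
    parameters: str
    code: str


def run_pipeline(instruction: str, llm: LLM) -> PipelineResult:
    """Chain the four stages, feeding each stage the outputs before it."""
    # (i) Analyzer: identify key task-related concepts in the instruction.
    key_concepts = llm(f"[Analyzer] Instruction: {instruction}")
    # (ii) Planner: build a strategy from the instruction and key concepts.
    plan = llm(
        f"[Planner] Instruction: {instruction}\nKey concepts: {key_concepts}"
    )
    # (iii) Calculator: compute parameters for each skill in the plan.
    parameters = llm(f"[Calculator] Plan: {plan}")
    # (iv) Coder: translate the plan and parameters into Python code.
    code = llm(f"[Coder] Plan: {plan}\nParameters: {parameters}")
    return PipelineResult(key_concepts, plan, parameters, code)


def echo_llm(prompt: str) -> str:
    """Stub model (an assumption for this demo): reports which stage called it."""
    role = prompt.split("]", 1)[0].lstrip("[")
    return f"<{role} output>"


if __name__ == "__main__":
    # Hypothetical instruction, used only to exercise the sketch.
    result = run_pipeline("Move the box across the gap.", echo_llm)
    print(result)
```

In practice the stub would be replaced by a real model client, and the generated code would need to be validated before being run on a robot; neither step is shown here.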