# dimos-unitree **Repository Path**: libaos/dimos-unitree ## Basic Information - **Project Name**: dimos-unitree - **Description**: No description available - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-08-05 - **Last Updated**: 2025-08-05 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README ![Screenshot 2025-02-18 at 16-31-22 DimOS Terminal](/assets/dimos_terminal.png)
dimOS interface

A simple two-shot PlanningAgent

3rd person POV

3rd person POV

# The Dimensional Framework *The universal framework for AI-native generalist robotics* ## What is Dimensional? Dimensional is an open-source framework for building agentive generalist robots. DimOS allows off-the-shelf Agents to call tools/functions and read sensor/state data directly from ROS. The framework enables neurosymbolic orchestration of Agents as generalized spatial reasoners/planners and Robot state/action primitives as functions. The result: cross-embodied *"Dimensional Applications"* exceptional at generalization and robust at symbolic action execution. ## DIMOS x Unitree Go2 We are shipping a first look at the DIMOS x Unitree Go2 integration, allowing for off-the-shelf Agents() to "call" Unitree ROS2 Nodes and WebRTC action primitives, including: - Navigation control primitives (move, reverse, spinLeft, spinRight, etc.) - WebRTC control primitives (FrontPounce, FrontFlip, FrontJump, etc.) - Camera feeds (image_raw, compressed_image, etc.) - IMU data - State information - Lidar / PointCloud primitives - Any other generic Unitree ROS2 topics ### Features - **DimOS Agents** - Agent() classes with planning, spatial reasoning, and Robot.Skill() function calling abilities. - Integrate with any off-the-shelf hosted or local model: OpenAIAgent, ClaudeAgent, GeminiAgent 🚧, DeepSeekAgent 🚧, HuggingFaceRemoteAgent, HuggingFaceLocalAgent, etc. - Modular agent architecture for easy extensibility and chaining of Agent output --> Subagents input. - Agent spatial / language memory for location grounded reasoning and recall. - **DimOS Infrastructure** - A reactive data streaming architecture using RxPY to manage real-time video (or other sensor input), outbound commands, and inbound robot state between the DimOS interface, Agents, and ROS2. - Robot Command Queue to handle complex multi-step actions to Robot. - Simulation bindings (Genesis, Isaacsim, etc.) to test your agentive application before deploying to a physical robot. - **DimOS Interface / Development Tools** - Local development interface to control your robot, orchestrate agents, visualize camera/lidar streams, and debug your dimensional agentive application. ## Docker Quick Start 🚀 > **⚠️ Recommended to start** ### Prerequisites - Docker and Docker Compose installed - A Unitree Go2 robot accessible on your network - The robot's IP address - OpenAI API Key ### Configuration: Configure your environment variables in `.env` ```bash OPENAI_API_KEY= ALIBABA_API_KEY= ANTHROPIC_API_KEY= ROBOT_IP= CONN_TYPE=webrtc WEBRTC_SERVER_HOST=0.0.0.0 WEBRTC_SERVER_PORT=9991 DISPLAY=:0 ``` ### Run docker compose ```bash xhost +local:root # If running locally and desire RVIZ GUI docker compose -f docker/unitree/agents_interface/docker-compose.yml up --build ``` **Interface will start at http://localhost:3000** ## Python Quick Start 🐍 ### Prerequisites - A Unitree Go2 robot accessible on your network - The robot's IP address - OpenAI/Claude/Alibaba API Key ### Python Installation (Ubuntu 22.04) ```bash sudo apt install python3-venv # Clone the repository git clone --recurse-submodules https://github.com/dimensionalOS/dimos-unitree.git cd dimos-unitree # Create and activate virtual environment python3 -m venv venv source venv/bin/activate sudo apt install portaudio19-dev python3-pyaudio # Install torch and torchvision if not already installed pip install -r base-requirements.txt # Install dependencies pip install -r requirements.txt # Copy and configure environment variables cp default.env .env ``` ### Agent API keys Full functionality will require API keys for the following: Requirements: - OpenAI API key (required for all LLMAgents due to OpenAIEmbeddings) - Claude API key (required for ClaudeAgent) - Alibaba API key (required for Navigation skills) These keys can be added to your .env file or exported as environment variables. ``` export OPENAI_API_KEY= export CLAUDE_API_KEY= export ALIBABA_API_KEY= ``` ### ROS2 Unitree Go2 SDK Installation #### System Requirements - Ubuntu 22.04 - ROS2 Distros: Iron, Humble, Rolling See [Unitree Go2 ROS2 SDK](https://github.com/dimensionalOS/go2_ros2_sdk) for additional installation instructions. ```bash mkdir -p ros2_ws cd ros2_ws git clone --recurse-submodules https://github.com/dimensionalOS/go2_ros2_sdk.git src sudo apt install ros-$ROS_DISTRO-image-tools sudo apt install ros-$ROS_DISTRO-vision-msgs sudo apt install python3-pip clang portaudio19-dev cd src pip install -r requirements.txt cd .. # Ensure clean python install before running source /opt/ros/$ROS_DISTRO/setup.bash rosdep install --from-paths src --ignore-src -r -y colcon build ``` ### Run the test application #### ROS2 Terminal: ```bash # Change path to your Go2 ROS2 SDK installation source /ros2_ws/install/setup.bash source /opt/ros/$ROS_DISTRO/setup.bash export ROBOT_IP="robot_ip" #for muliple robots, just split by , export CONN_TYPE="webrtc" ros2 launch go2_robot_sdk robot.launch.py ``` #### Python Terminal: ```bash # Change path to your Go2 ROS2 SDK installation source /ros2_ws/install/setup.bash python tests/run.py ``` #### DimOS Interface: ```bash cd dimos/web/dimos_interface yarn install yarn dev # you may need to run sudo if previously built via Docker ``` ### Project Structure ``` . ├── dimos/ │ ├── agents/ # Agent implementations │ │ └── memory/ # Memory systems for agents, including semantic memory │ ├── environment/ # Environment context and sensing │ ├── hardware/ # Hardware abstraction and interfaces │ ├── models/ # ML model definitions and implementations │ │ ├── Detic/ # Detic object detection model │ │ ├── depth/ # Depth estimation models │ │ ├── segmentation/ # Image segmentation models │ ├── perception/ # Computer vision and sensing │ │ ├── detection2d/ # 2D object detection │ │ └── segmentation/ # Image segmentation pipelines │ ├── robot/ # Robot control and hardware interface │ │ ├── global_planner/ # Path planning at global scale │ │ ├── local_planner/ # Local navigation planning │ │ └── unitree/ # Unitree Go2 specific implementations │ ├── simulation/ # Robot simulation environments │ │ ├── genesis/ # Genesis simulation integration │ │ └── isaac/ # NVIDIA Isaac Sim integration │ ├── skills/ # Task-specific robot capabilities │ │ └── rest/ # REST API based skills │ ├── stream/ # WebRTC and data streaming │ │ ├── audio/ # Audio streaming components │ │ └── video_providers/ # Video streaming components │ ├── types/ # Type definitions and interfaces │ ├── utils/ # Utility functions and helpers │ └── web/ # DimOS development interface and API │ ├── dimos_interface/ # DimOS web interface │ └── websocket_vis/ # Websocket visualizations ├── tests/ # Test files │ ├── genesissim/ # Genesis simulator tests │ └── isaacsim/ # Isaac Sim tests └── docker/ # Docker configuration files ├── agent/ # Agent service containers ├── interface/ # Interface containers ├── simulation/ # Simulation environment containers └── unitree/ # Unitree robot specific containers ``` ## Building ### Simple DimOS Application ```python from dimos.robot.unitree.unitree_go2 import UnitreeGo2 from dimos.robot.unitree.unitree_skills import MyUnitreeSkills from dimos.robot.unitree.unitree_ros_control import UnitreeROSControl from dimos.agents.agent import OpenAIAgent # Initialize robot robot = UnitreeGo2(ip=robot_ip, ros_control=UnitreeROSControl(), skills=MyUnitreeSkills()) # Initialize agent agent = OpenAIAgent( dev_name="UnitreeExecutionAgent", input_video_stream=robot.get_ros_video_stream(), skills=robot.get_skills(), system_query="Jump when you see a human! Front flip when you see a dog!", model_name="gpt-4o" ) while True: # keep process running time.sleep(1) ``` ### DimOS Application with Agent chaining Let's build a simple DimOS application with Agent chaining. We define a ```planner``` as a ```PlanningAgent``` that takes in user input to devise a complex multi-step plan. This plan is passed step-by-step to an ```executor``` agent that can queue ```AbstractRobotSkill``` commands to the ```ROSCommandQueue```. Our reactive Pub/Sub data streaming architecture allows for chaining of ```Agent_0 --> Agent_1 --> ... --> Agent_n``` via the ```input_query_stream``` parameter in each which takes an ```Observable``` input from the previous Agent in the chain. **Via this method you can chain together any number of Agents() to create complex dimensional applications.** ```python web_interface = RobotWebInterface(port=5555) robot = UnitreeGo2(ip=robot_ip, ros_control=UnitreeROSControl(), skills=MyUnitreeSkills()) # Initialize master planning agent planner = PlanningAgent( dev_name="UnitreePlanningAgent", input_query_stream=web_interface.query_stream, # Takes user input from dimOS interface skills=robot.get_skills(), model_name="gpt-4o", ) # Initialize execution agent executor = OpenAIAgent( dev_name="UnitreeExecutionAgent", input_query_stream=planner.get_response_observable(), # Takes planner output as input skills=robot.get_skills(), model_name="gpt-4o", system_query=""" You are a robot execution agent that can execute tasks on a virtual robot. ONLY OUTPUT THE SKILLS TO EXECUTE. """ ) while True: # keep process running time.sleep(1) ``` ### Calling Action Primitives Call action primitives directly from ```Robot()``` for prototyping and testing. ```python robot = UnitreeGo2(ip=robot_ip,) # Call a Unitree WebRTC action primitive robot.webrtc_req(api_id=1016) # "Hello" command # Call a ROS2 action primitive robot.move(distance=1.0, speed=0.5) ``` ### Creating Custom Skills (non-unitree specific) #### Create basic custom skills by inheriting from ```AbstractRobotSkill``` and implementing the ```__call__``` method. ```python class Move(AbstractRobotSkill): distance: float = Field(...,description="Distance to reverse in meters") def __init__(self, robot: Optional[Robot] = None, **data): super().__init__(robot=robot, **data) def __call__(self): super().__call__() return self._robot.move(distance=self.distance) ``` #### Chain together skills to create recursive skill trees ```python class JumpAndFlip(AbstractRobotSkill): def __init__(self, robot: Optional[Robot] = None, **data): super().__init__(robot=robot, **data) def __call__(self): super().__call__() jump = Jump(robot=self._robot) flip = Flip(robot=self._robot) return (jump() and flip()) ``` ### Integrating Skills with Agents: Single Skills and Skill Libraries DimOS agents, such as `OpenAIAgent`, can be endowed with capabilities through two primary mechanisms: by providing them with individual skill classes or with comprehensive `SkillLibrary` instances. This design offers flexibility in how robot functionalities are defined and managed within your agent-based applications. **Agent's `skills` Parameter** The `skills` parameter in an agent's constructor is key to this integration: 1. **A Single Skill Class**: This approach is suitable for skills that are relatively self-contained or have straightforward initialization requirements. * You pass the skill *class itself* (e.g., `GreeterSkill`) directly to the agent's `skills` parameter. * The agent then takes on the responsibility of instantiating this skill when it's invoked. This typically involves the agent providing necessary context to the skill's constructor (`__init__`), such as a `Robot` instance (or any other private instance variable) if the skill requires it. 2. **A `SkillLibrary` Instance**: This is the preferred method for managing a collection of skills, especially when skills have dependencies, require specific configurations, or need to share parameters. * You first define your custom skill library by inheriting from `SkillLibrary`. Then, you create and configure an *instance* of this library (e.g., `my_lib = EntertainmentSkills(robot=robot_instance)`). * This pre-configured `SkillLibrary` instance is then passed to the agent's `skills` parameter. The library itself manages the lifecycle and provision of its contained skills. **Examples:** #### 1. Using a Single Skill Class with an Agent First, define your skill. For instance, a `GreeterSkill` that can deliver a configurable greeting: ```python class GreeterSkill(AbstractSkill): """Greats the user with a friendly message.""" # Gives the agent better context for understanding (the more detailed the better). greeting: str = Field(..., description="The greating message to display.") # The field needed for the calling of the function. Your agent will also pull from the description here to gain better context. def __init__(self, greeting_message: Optional[str] = None, **data): super().__init__(**data) if greeting_message: self.greeting = greeting_message # Any additional skill-specific initialization can go here def __call__(self): super().__call__() # Call parent's method if it contains base logic # Implement the logic for the skill print(self.greeting) return f"Greeting delivered: '{self.greeting}'" ``` Next, register this skill *class* directly with your agent. The agent can then instantiate it, potentially with specific configurations if your agent or skill supports it (e.g., via default parameters or a more advanced setup). ```python agent = OpenAIAgent( dev_name="GreetingBot", system_query="You are a polite bot. If a user asks for a greeting, use your GreeterSkill.", skills=GreeterSkill, # Pass the GreeterSkill CLASS # The agent will instantiate GreeterSkill. # If the skill had required __init__ args not provided by the agent automatically, # this direct class passing might be insufficient without further agent logic # or by passing a pre-configured instance (see SkillLibrary example). # For simple skills like GreeterSkill with defaults or optional args, this works well. model_name="gpt-4o" ) ``` In this setup, when the `GreetingBot` agent decides to use the `GreeterSkill`, it will instantiate it. If the `GreeterSkill` were to be instantiated by the agent with a specific `greeting_message`, the agent's design would need to support passing such parameters during skill instantiation. #### 2. Using a `SkillLibrary` Instance with an Agent Define the SkillLibrary and any skills it will manage in its collection: ```python class MovementSkillsLibrary(SkillLibrary): """A specialized skill library containing movement and navigation related skills.""" def __init__(self, robot=None): super().__init__() self._robot = robot def initialize_skills(self, robot=None): """Initialize all movement skills with the robot instance.""" if robot: self._robot = robot if not self._robot: raise ValueError("Robot instance is required to initialize skills") # Initialize with all movement-related skills self.add(Navigate(robot=self._robot)) self.add(NavigateToGoal(robot=self._robot)) self.add(FollowHuman(robot=self._robot)) self.add(NavigateToObject(robot=self._robot)) self.add(GetPose(robot=self._robot)) # Position tracking skill ``` Note the addision of initialized skills added to this collection above. Proceed to use this skill library in an Agent: Finally, in your main application code: ```python # 1. Create an instance of your custom skill library, configured with the robot my_movement_skills = MovementSkillsLibrary(robot=robot_instance) # 2. Pass this library INSTANCE to the agent performing_agent = OpenAIAgent( dev_name="ShowBot", system_query="You are a show robot. Use your skills as directed.", skills=my_movement_skills, # Pass the configured SkillLibrary INSTANCE model_name="gpt-4o" ) ``` ### Unitree Test Files - **`tests/run_go2_ros.py`**: Tests `UnitreeROSControl(ROSControl)` initialization in `UnitreeGo2(Robot)` via direct function calls `robot.move()` and `robot.webrtc_req()` - **`tests/simple_agent_test.py`**: Tests a simple zero-shot class `OpenAIAgent` example - **`tests/unitree/test_webrtc_queue.py`**: Tests `ROSCommandQueue` via a 20 back-to-back WebRTC requests to the robot - **`tests/test_planning_agent_web_interface.py`**: Tests a simple two-stage `PlanningAgent` chained to an `ExecutionAgent` with backend FastAPI interface. - **`tests/test_unitree_agent_queries_fastapi.py`**: Tests a zero-shot `ExecutionAgent` with backend FastAPI interface. ## Documentation For detailed documentation, please visit our [documentation site](#) (Coming Soon). ## Contributing We welcome contributions! See our [Bounty List](https://docs.google.com/spreadsheets/d/1tzYTPvhO7Lou21cU6avSWTQOhACl5H8trSvhtYtsk8U/edit?usp=sharing) for open requests for contributions. If you would like to suggest a feature or sponsor a bounty, open an issue. ## License This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details. ## Acknowledgments Huge thanks to! - The Roboverse Community and their unitree-specific help. Check out their [Discord](https://discord.gg/HEXNMCNhEh). - @abizovnuralem for his work on the [Unitree Go2 ROS2 SDK](https://github.com/abizovnuralem/go2_ros2_sdk) we integrate with for DimOS. - @legion1581 for his work on the [Unitree Go2 WebRTC Connect](https://github.com/legion1581/go2_webrtc_connect) from which we've pulled the ```Go2WebRTCConnection``` class and other types for seamless WebRTC-only integration with DimOS. - @tfoldi for the webrtc_req integration via Unitree Go2 ROS2 SDK, which allows for seamless usage of Unitree WebRTC control primitives with DimOS. ## Contact - GitHub Issues: For bug reports and feature requests - Email: [build@dimensionalOS.com](mailto:build@dimensionalOS.com) ## Known Issues - Agent() failure to execute Nav2 action primitives (move, reverse, spinLeft, spinRight) is almost always due to the internal ROS2 collision avoidance, which will sometimes incorrectly display obstacles or be overly sensitive. Look for ```[behavior_server]: Collision Ahead - Exiting DriveOnHeading``` in the ROS logs. Reccomend restarting ROS2 or moving robot from objects to resolve. - ```docker-compose up --build``` does not fully initialize the ROS2 environment due to ```std::bad_alloc``` errors. This will occur during continuous docker development if the ```docker-compose down``` is not run consistently before rebuilding and/or you are on a machine with less RAM, as ROS is very memory intensive. Reccomend running to clear your docker cache/images/containers with ```docker system prune``` and rebuild.