PARTNR

PARTNR is a benchmark dataset created by FAIR at Meta for studying planning and reasoning in human‑robot collaboration. It contains 100,000 natural‑language tasks spanning 60 houses and 5,819 unique objects, designed to simulate cooperative everyday household activities. The tasks were generated through a semi‑automated pipeline that combines large language models (LLMs) with a simulated environment, and they emphasize spatial, temporal, and heterogeneous agent capability constraints. PARTNR is intended to advance human‑robot collaboration on complex tasks by targeting current model shortcomings in coordination, task tracking, and error recovery.

Updated 11/1/2024
arXiv

Description

PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi‑Agent Tasks

Overview

  • Dataset Name: PARTNR
  • Domain: Planning and reasoning in embodied multi‑agent tasks
  • Main Abstractions:
    • Agent: Represents a robot or a human capable of acting in the environment.
    • Planner: Represents a centralized or decentralized planner.
    • Tool: Abstracts a capability that allows an agent to sense or interact with the environment.
    • Skill: Low‑level abilities that agents can use to interact with the environment.
  • Dataset Generation: Produced by a semi‑automated pipeline that combines large language models (LLMs) with a simulated environment.
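
To make the four abstractions above concrete, here is a minimal sketch of how they might relate to one another. Every class, field, and method name below is hypothetical and chosen for illustration only; none of it is taken from the habitat‑llm code base, and the idea that a tool wraps exactly one skill is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Skill:
    """Low-level ability such as navigating or picking (hypothetical shape)."""
    name: str
    run: Callable[[str], str]  # takes a target name, returns a status string


@dataclass
class Tool:
    """Capability exposed to a planner; here it simply wraps one skill."""
    name: str
    skill: Skill

    def __call__(self, target: str) -> str:
        return self.skill.run(target)


@dataclass
class Agent:
    """A robot or a human, characterized by the tools it can use."""
    uid: int
    tools: Dict[str, Tool] = field(default_factory=dict)

    def act(self, tool_name: str, target: str) -> str:
        if tool_name not in self.tools:
            return f"agent_{self.uid} cannot {tool_name}"
        return self.tools[tool_name](target)


class ScriptedPlanner:
    """Stand-in for a planner: replays a fixed list of (tool, target) steps."""

    def __init__(self, steps: List[Tuple[str, str]]) -> None:
        self.steps = steps

    def run(self, agent: Agent) -> List[str]:
        return [agent.act(tool, target) for tool, target in self.steps]


navigate = Tool("navigate", Skill("navigate", lambda t: f"moved to {t}"))
pick = Tool("pick", Skill("pick", lambda t: f"picked {t}"))

robot = Agent(uid=0, tools={"navigate": navigate, "pick": pick})
human = Agent(uid=1, tools={"navigate": navigate})  # heterogeneous: no pick tool

plan = ScriptedPlanner([("navigate", "kitchen"), ("pick", "cup")])
robot_log = plan.run(robot)
human_log = plan.run(human)
print(robot_log)
print(human_log)
```

Giving the two agents different tool sets is one simple way to model the heterogeneous capabilities the benchmark emphasizes: the same plan succeeds for one agent and fails for the other.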

Code Organization

  • habitat‑llm:
    • Agent: Represents a robot or a human.
    • Tools: Abstracts sensing or interaction capabilities.
    • Planner: Represents centralized and decentralized planners.
    • LLM: Contains abstractions for Llama and GPT APIs.
    • WorldGraph: Hierarchical world graph that represents rooms, furniture, and objects.
    • Perception: Simulated perception pipeline that sends local detections to the world model.
    • Examples: Demonstrations and evaluation scripts for showcasing or analyzing planner performance.
    • EvaluationRunner: Abstracts the execution of a planner.
    • Conf: Hydra configuration files for all classes.
    • Utils: Various utility functions required by the code base.
    • Tests: Unit tests.
  • scripts:
    • hitl_analysis: Scripts for analyzing and replaying human‑in‑the‑loop interaction trajectories.
    • prediviz: Visualization and annotation tools for PARTNR tasks.
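
The WorldGraph entry above describes a hierarchy of rooms, furniture, and objects. A toy version of such a hierarchy can be sketched as follows. The class name ToyWorldGraph and its methods are assumptions made for this example and do not reflect the actual WorldGraph API.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class ToyWorldGraph:
    """Toy hierarchy: house -> rooms -> furniture -> objects."""

    def __init__(self) -> None:
        # Each node stores its parent; the root "house" has none.
        self.parent: Dict[str, Optional[str]] = {"house": None}
        self.children: Dict[str, List[str]] = defaultdict(list)

    def add(self, node: str, parent: str) -> None:
        if parent not in self.parent:
            raise KeyError(f"unknown parent: {parent}")
        self.parent[node] = parent
        self.children[parent].append(node)

    def move(self, node: str, new_parent: str) -> None:
        """Re-attach a node, for example after an object is placed elsewhere."""
        if new_parent not in self.parent:
            raise KeyError(f"unknown parent: {new_parent}")
        self.children[self.parent[node]].remove(node)
        self.parent[node] = new_parent
        self.children[new_parent].append(node)

    def path(self, node: str) -> List[str]:
        """Return the chain of nodes from the root down to the given node."""
        chain: List[str] = []
        current: Optional[str] = node
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain[::-1]


graph = ToyWorldGraph()
graph.add("kitchen", "house")
graph.add("counter", "kitchen")
graph.add("cup", "counter")
graph.add("living_room", "house")
graph.add("sofa", "living_room")

path_before = graph.path("cup")
graph.move("cup", "sofa")
path_after = graph.path("cup")
print(path_before)
print(path_after)
```

Storing a parent pointer per node keeps "where is this object?" queries cheap, which is the kind of lookup a planner would need when grounding a task description.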

Information Flow

  • EnvironmentInterface: Reads observations from each agent and forwards them to the perception module.
  • Perception Module: Processes observations and updates the world graph.
  • Planner: Uses the world graph and task description to select tools and interact with the environment.
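
The three stages above can be sketched as a single update step. This is a simplified illustration under stated assumptions: the observation format, the function names, and the flat dictionary used as a world model are all hypothetical stand-ins, not the habitat‑llm interfaces.

```python
from typing import Dict, List, Tuple

# Assumed observation format: a list of (object, location) detections per agent.
Observation = List[Tuple[str, str]]


def read_observations() -> Dict[int, Observation]:
    """Stand-in for the environment interface: one observation per agent."""
    return {
        0: [("cup", "kitchen_counter")],
        1: [("book", "sofa"), ("cup", "kitchen_counter")],
    }


def update_world(world: Dict[str, str], observations: Dict[int, Observation]) -> None:
    """Stand-in for perception: merge every agent's detections into the world model."""
    for detections in observations.values():
        for obj, location in detections:
            world[obj] = location


def plan_step(world: Dict[str, str], task_object: str) -> Tuple[str, str]:
    """Stand-in for a planner: choose a tool call from the world state and the task."""
    if task_object in world:
        return ("navigate", world[task_object])
    return ("explore", "house")  # object not seen yet, so look for it


world: Dict[str, str] = {}
update_world(world, read_observations())
known_step = plan_step(world, "book")
unknown_step = plan_step(world, "plate")
print(known_step)
print(unknown_step)
```

The planner only ever reads the world model, never raw observations, which mirrors the separation between perception and planning described in the list.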

Installation

Quick Start

  • Dataset Splits: train_2k, val, train, val_mini
  • Examples:
    • Decentralized Multi‑Agent ReAct Summary:
      python -m habitat_llm.examples.planner_demo --config-name baselines/decentralized_zero_shot_react_summary.yaml \
          habitat.dataset.data_path="data/datasets/partnr_episodes/v0_0/val_mini.json.gz" \
          evaluation.agents.agent_0.planner.plan_config.llm.inference_mode=hf \
          evaluation.agents.agent_1.planner.plan_config.llm.inference_mode=hf \
          evaluation.agents.agent_0.planner.plan_config.llm.generation_params.engine=meta-llama/Meta-Llama-3-8B-Instruct \
          evaluation.agents.agent_1.planner.plan_config.llm.generation_params.engine=meta-llama/Meta-Llama-3-8B-Instruct
      
    • Centralized Multi‑Agent ReAct Summary:
      python -m habitat_llm.examples.planner_demo --config-name baselines/centralized_zero_shot_react_summary.yaml \
          habitat.dataset.data_path="data/datasets/partnr_episodes/v0_0/val_mini.json.gz" \
          evaluation.planner.plan_config.llm.inference_mode=hf \
          evaluation.planner.plan_config.llm.generation_params.engine=meta-llama/Meta-Llama-3-8B-Instruct
      
    • Single‑Agent ReAct Summary:
      python -m habitat_llm.examples.planner_demo --config-name baselines/single_agent_zero_shot_react_summary.yaml \
          habitat.dataset.data_path="data/datasets/partnr_episodes/v0_0/val_mini.json.gz" \
          evaluation.agents.agent_0.planner.plan_config.llm.inference_mode=hf \
          evaluation.agents.agent_0.planner.plan_config.llm.generation_params.engine=meta-llama/Meta-Llama-3-8B-Instruct
      
    • Heuristic Planner:
      python -m habitat_llm.examples.planner_demo --config-name baselines/heuristic_full_obs.yaml \
          habitat.dataset.data_path="data/datasets/partnr_episodes/v0_0/val_mini.json.gz"
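
The commands above pass settings as dotted key=value overrides on top of a Hydra config file. The sketch below illustrates how such dotted keys map onto a nested configuration. It is a deliberately simplified stand-in written for this example, not Hydra's actual parser: it keeps every value as a string and ignores Hydra's typing rules and its + and ~ prefixes. The function name apply_overrides is hypothetical.

```python
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Set nested keys from 'a.b.c=value' strings (values are kept as strings)."""
    for item in overrides:
        # Split on the first '=' only, so values may themselves contain '='.
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})  # create intermediate levels as needed
        node[leaf] = value.strip('"')
    return config


overrides = [
    'habitat.dataset.data_path="data/datasets/partnr_episodes/v0_0/val_mini.json.gz"',
    "evaluation.planner.plan_config.llm.inference_mode=hf",
    "evaluation.planner.plan_config.llm.generation_params.engine=meta-llama/Meta-Llama-3-8B-Instruct",
]
config = apply_overrides({}, overrides)
print(config["evaluation"]["planner"]["plan_config"]["llm"])
```

Reading the overrides this way makes the difference between the commands easier to see: the decentralized and single‑agent runs set the LLM under evaluation.agents.agent_N.planner, while the centralized run sets it once under evaluation.planner.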
      

Results

  • Use python scripts/read_results.py <output_dir>/<dataset_name> to inspect progress and results.

Testing the Dataset

  • Run the following command to check that the dataset episodes are viable and to measure the initial step success rate:
    HYDRA_FULL_ERROR=1 python -m habitat_llm.examples.verify_episodes \
        --config-name examples/planner_multi_agent_demo_config.yaml \
        hydra.run.dir="." \
        evaluation=centralized_evaluation_runner_multi_agent \
        habitat.dataset.data_path="data/datasets/partnr_episodes/v0_0/val_mini.json.gz" \
        mode=data \
        world_model.partial_obs=False \
        evaluation.type="centralized" \
        num_proc=5
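
Before running a full verification pass, it can help to confirm that the episode file itself opens and parses. The sketch below reads a gzip‑compressed JSON file and reports only its top‑level structure. It makes no assumption about the PARTNR episode schema, which is not described here; the helper names load_json_gz and summarize are hypothetical.

```python
import gzip
import json
import os
from typing import Any


def load_json_gz(path: str) -> Any:
    """Read a gzip-compressed JSON file and return the parsed content."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def summarize(data: Any) -> str:
    """Describe the top-level structure without assuming a schema."""
    if isinstance(data, dict):
        return "dict with keys: " + ", ".join(sorted(data))
    if isinstance(data, list):
        return f"list with {len(data)} items"
    return type(data).__name__


if __name__ == "__main__":
    # Path taken from the commands in this document; skipped if not downloaded.
    episodes_path = "data/datasets/partnr_episodes/v0_0/val_mini.json.gz"
    if os.path.exists(episodes_path):
        print(summarize(load_json_gz(episodes_path)))
    else:
        print(f"{episodes_path} not found")
```

A file that fails at this stage points to a download or path problem rather than to an issue with the simulator or planner configuration.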
    

License

  • License: MIT

Topics

Human-Machine Collaboration
Planning and Reasoning

Source

Organization: arXiv

Created: 11/1/2024
