---
id: embodied-ai-and-robotics
title: "Embodied AI: Robots That Learn from the Physical World"
schema_type: TechArticle
category: ai
language: en
confidence: high
last_verified: "2026-05-28"
created_date: "2026-05-24"
generation_method: ai_structured
ai_models:
  - claude-opus
derived_from_human_seed: true
conflict_of_interest: none_declared
is_live_document: false
data_period: static
atomic_facts:
  - id: fact-embodied-1
    statement: >-
      RT-2 is a vision-language-action model that transfers web-scale vision-language knowledge to robotic
      control.
    source_title: "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control"
    source_url: https://arxiv.org/abs/2307.15818
    confidence: medium
  - id: fact-embodied-2
    statement: SayCan grounds language-model planning in affordance estimates for robotic tasks.
    source_title: "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances"
    source_url: https://arxiv.org/abs/2204.01691
    confidence: medium
  - id: fact-embodied-3
    statement: Habitat provides a simulation platform for embodied AI research in interactive 3D environments.
    source_title: "Habitat: A Platform for Embodied AI Research"
    source_url: https://arxiv.org/abs/1904.01201
    confidence: medium
completeness: 0.84
primary_sources:
  - title: "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control"
    type: academic_paper
    year: 2023
    url: https://arxiv.org/abs/2307.15818
    institution: Google DeepMind / arXiv
  - title: "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances"
    type: academic_paper
    year: 2022
    url: https://arxiv.org/abs/2204.01691
    institution: Google Research / arXiv
  - title: "Habitat: A Platform for Embodied AI Research"
    type: academic_paper
    year: 2019
    url: https://arxiv.org/abs/1904.01201
    institution: Meta AI / arXiv
known_gaps:
  - This compact repair keeps only source-mapped public claims from the sampled audit entry.
disputed_statements: []
secondary_sources: []
updated: "2026-05-28"
---

## TL;DR

Embodied AI studies agents, typically robots, that link perception, language, and action in the physical world. This entry records three source-backed claims, one each about RT-2, SayCan, and Habitat.

## Core Explanation

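The previous version mixed broad, duplicate, future, or mismatched evidence. The repaired entry keeps three public claims, each of which maps directly to one of the listed primary sources:

- RT-2 (2023) is a vision-language-action model that transfers web-scale vision-language knowledge to robotic control.
- SayCan (2022) grounds language-model planning in affordance estimates for robotic tasks.
- Habitat (2019) provides a simulation platform for embodied AI research in interactive 3D environments.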
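
To make the SayCan claim (fact-embodied-2) concrete, the sketch below shows one way a language-model preference can be combined with an affordance estimate to choose a robot's next skill. It is a minimal illustration under stated assumptions, not the paper's implementation: the function name `select_skill`, the skill names, and all numeric scores are hypothetical, and the per-skill product is used here as a simple stand-in for the combination step.

```python
from typing import Dict


def select_skill(lm_scores: Dict[str, float], affordances: Dict[str, float]) -> str:
    """Return the skill with the highest combined score.

    lm_scores: hypothetical language-model score that each skill is a
        useful next step for the instruction.
    affordances: hypothetical estimate that each skill can succeed from
        the robot's current state.
    """
    # Multiply the two scores per skill; a skill with no affordance
    # estimate is treated as infeasible (score 0.0).
    combined = {
        skill: lm_scores[skill] * affordances.get(skill, 0.0)
        for skill in lm_scores
    }
    return max(combined, key=combined.get)


# Made-up numbers: the language model prefers "pick up sponge", but the
# robot is not near the sponge, so its affordance estimate is low.
lm_scores = {"pick up sponge": 0.6, "go to table": 0.3, "pick up apple": 0.1}
affordances = {"pick up sponge": 0.1, "go to table": 0.9, "pick up apple": 0.2}

chosen = select_skill(lm_scores, affordances)
print(chosen)  # go to table
```

With these invented scores, the skill the language model ranks highest is passed over because the robot cannot currently perform it, which is the sense in which planning is grounded in affordances.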

## Further Reading

- [RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control](https://arxiv.org/abs/2307.15818)
- [Do As I Can, Not As I Say: Grounding Language in Robotic Affordances](https://arxiv.org/abs/2204.01691)
- [Habitat: A Platform for Embodied AI Research](https://arxiv.org/abs/1904.01201)