Physical Intelligence, a startup founded by former Google and Meta researchers, is training robots to understand and execute complex tasks using large language models, the AI technology behind ChatGPT.
The company's approach combines computer vision with language model capabilities so that robots can interpret human instructions and learn new behaviors without explicit programming for each task. Rather than relying on hand-coded responses for each scenario, the robots develop a generalizable understanding of physical environments and how to manipulate objects within them.
This approach is a fundamental shift from traditional robotics, where engineers typically program narrow, task-specific behaviors. Borrowing from language models, which learn patterns across vast text datasets, Physical Intelligence applies similar learning principles to robotic control and perception.
The startup's research draws on insights from natural language processing, where models trained on diverse data develop flexible problem-solving abilities. The team is attempting to create robots that can adapt to novel situations and learn from fewer examples than conventional machine learning approaches require.
The challenge remains substantial. Robots must process real-world sensor data, handle unpredictable physical conditions, and execute movements that achieve intended outcomes. Unlike text prediction, robot actions have immediate physical consequences, and errors carry higher costs. The startup must overcome difficulties in data collection, simulation, and transferring learned behaviors from controlled lab environments to real homes and workplaces.
Physical Intelligence has attracted significant investment based on this vision. The team believes that training robots on diverse task demonstrations, combined with language model reasoning, will unlock more capable autonomous systems than existing approaches allow.
If successful, this research could bring forward the point at which robots perform household and industrial work, and could reshape manufacturing, care work, and domestic automation. However, the path from prototype demonstrations to reliable, cost-effective deployment remains unproven. Current robotics applications still require careful task engineering and environmental control, which suggests substantial work lies ahead before language-model-inspired robots become genuinely autonomous agents.
