On June 16, 2022, Meta announced new scientific research that will help AI navigate unfamiliar, indoor 3D spaces, learning to understand the physical world much more flexibly and efficiently. This body of work advances visual navigation in embodied AI, a field of study that focuses on training AI systems through interactions with 3D simulations rather than typical 2D datasets. Here are some details about Meta’s new research.
A point-goal navigation model: It can navigate a completely new environment without a map or a GPS sensor. It was trained with Habitat 2.0, Meta's embodied AI platform, which runs simulations orders of magnitude faster than real time.
GPS data can be noisy, unreliable, or unavailable in indoor spaces. So Meta created new approaches to boost AI's ability to track its location simply based on visual inputs, a process called visual odometry.
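To make the idea of visual odometry concrete, here is a minimal sketch of the bookkeeping step only. It assumes some model has already estimated, from consecutive camera frames, how far the agent moved and turned at each step. The function names and the (forward, lateral, turn) format are illustrative assumptions, not Meta's implementation.

```python
import math

def integrate_egomotion(pose, delta):
    """Compose one relative motion estimate onto the current pose.

    pose:  (x, y, heading) in the world frame, heading in radians.
    delta: (forward, lateral, turn) in the agent's own frame, as a
           hypothetical visual odometry model might output per step.
    """
    x, y, theta = pose
    forward, lateral, turn = delta
    # Rotate the agent-frame displacement into the world frame.
    x += forward * math.cos(theta) - lateral * math.sin(theta)
    y += forward * math.sin(theta) + lateral * math.cos(theta)
    # Update the heading and wrap it into [-pi, pi).
    theta = (theta + turn + math.pi) % (2 * math.pi) - math.pi
    return (x, y, theta)

def track(start_pose, deltas):
    """Dead-reckon a pose by chaining per-step motion estimates."""
    pose = start_pose
    for delta in deltas:
        pose = integrate_egomotion(pose, delta)
    return pose

# Move 1 m forward while turning left 90 degrees, then 1 m forward again.
final_pose = track((0.0, 0.0, 0.0), [(1.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)])
```

Because each estimate is added to the previous pose, small per-step errors accumulate over a long episode, which is why the accuracy of the visual estimate matters so much when no GPS is available to correct it.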
Habitat-Web: It is a training data collection of over 100K varied human demonstrations for object-goal navigation, built to improve training without relying on maps. For each demonstration, a paid Mechanical Turk user is given a task instruction (e.g., "find the chest of drawers") and teleoperates the virtual robot through a web browser interface on their computer.
In another step toward more efficient, map-free learning approaches, Meta has demonstrated how to scale the paradigm of imitation learning from human demonstrations, which has previously been impossible due to a lack of data. They created Habitat-Web to close this gap.
A "plug and play" modular approach: Meta has created the first "plug and play" modular approach that allows robots to generalize to a varied set of semantic navigation tasks and goal modalities without retraining.
Using a first-of-its-kind zero-shot experience learning (ZSEL) framework, the model is trained once to capture the basic skills for semantic visual navigation and is then applied to diverse target tasks in a 3D environment without additional retraining.
A novel formulation: In addition, they're pushing for efficiency with a novel formulation for object-goal navigation tasks that achieves state-of-the-art outcomes while requiring 1,600x less training time than previous methods.
This is accomplished using Potential Functions for ObjectGoal Navigation with Interaction-free Learning (PONI), a new paradigm for learning modular ObjectNav policies that separate the object search skill (i.e., where to look for an object) from the navigation skill (i.e., how to navigate to a location (x, y)).
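The separation of "where to look" from "how to get there" can be illustrated with a toy sketch. It assumes the search skill has already assigned a score (a potential) to a few candidate map cells, and it uses a plain breadth-first search as a stand-in for the navigation skill. The scores, grid, and function names are hypothetical and are not taken from PONI itself.

```python
from collections import deque

def select_goal(potentials):
    """Search skill ("where to look"): pick the highest-scoring candidate cell.

    potentials maps (row, col) cells to scores. In this sketch the scores
    are hand-written; in a learned system a model would predict them.
    """
    return max(potentials, key=potentials.get)

def plan_path(grid, start, goal):
    """Navigation skill ("how to get there"): BFS shortest path.

    grid is a list of rows with 0 = free and 1 = obstacle; movement is
    4-connected. Returns the list of cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None

grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
potentials = {(0, 2): 0.2, (2, 0): 0.9}  # hypothetical scores
goal = select_goal(potentials)
path = plan_path(grid, (0, 0), goal)
```

Since the two functions share only a goal cell, the planner never needs to be learned, and the search component can be trained or replaced on its own.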
What’s next for embodied AI at Meta
Next, Meta plans to extend these gains from navigation to mobile manipulation, creating agents that can perform specified tasks such as "find my wallet and bring it back to me."
For more details, read: https://ai.facebook.com/blog/new-research-helps-ai-navigate-unfamiliar-indoor-3d-spaces/