
Indian-origin researchers develop framework for context-aware robots

The framework uses large language models (LLMs) to generate situational queries, create underlying object-state information, and then evaluate for contextual accuracy.

Vishnu Dorbala and Dinesh Manocha / LinkedIn / University of Maryland

A team of researchers at the University of Maryland developed a framework that allows robots to reason about everyday human environments using situational context.

The study, led by Indian-origin doctoral student Vishnu Dorbala, introduces Situational Embodied Question Answering (S-EQA), a paradigm designed to help robots interpret contextual queries about human environments.


Unlike traditional systems that identify objects in isolation, S-EQA enables robots to reason about situations—such as whether a home is secured for the night by checking the status of multiple objects and their interdependencies.

“Mobile robots are expected to make life at home easier. This includes answering questions about everyday situations like ‘Is the house ready for sleeptime?’” Dorbala said. “Doing so requires understanding the states of many things at once, like the doors being closed, the fireplace being off, and so on.

“Our work provides a novel solution for this problem using Large Language Models, paving the way toward making household robots smarter and more useful.”

Along with Dorbala, the team includes Dinesh Manocha, the Paul Chrisman Iribe Chair and Distinguished University Professor of computer science and electrical and computer engineering, and Reza Ghanadan, ISR professor and executive director of Innovations in AI.

The project began during Dorbala’s internship with Amazon’s Artificial General Intelligence group in 2023 and was later extended under Manocha’s guidance at UMD. Over two years, their sustained collaboration advanced both algorithm design and dataset development for embodied AI.

To build the system, the researchers developed the Prompt-Generate-Evaluate (PGE) algorithm, which uses an LLM to generate situational queries, synthesize the underlying object-state information, and then evaluate the queries for contextual accuracy.
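As a rough illustration of the three PGE stages, the sketch below wires a prompt, a generation step, and an evaluation step around a stub `llm()` function. The stub, its canned outputs, and the object-state format are hypothetical stand-ins for illustration only; the authors' actual pipeline calls GPT-4 and is not reproduced here.

```python
def llm(prompt: str) -> str:
    """Hypothetical LLM stub: returns canned strings keyed on the prompt
    so the control flow can run end to end without a real model."""
    canned = {
        "generate_query": "Is the house ready for sleeptime?",
        "generate_states": "front_door=closed; fireplace=off; lights=off",
        "evaluate": "consistent",
    }
    for key, answer in canned.items():
        if key in prompt:
            return answer
    return ""

def prompt_generate_evaluate():
    # 1. Prompt: ask the model to propose a situational query.
    query = llm("generate_query: propose a situational household query")
    # 2. Generate: produce the object states the query depends on.
    states = llm(f"generate_states for: {query}")
    # 3. Evaluate: check query and states for contextual consistency.
    verdict = llm(f"evaluate whether the states [{states}] answer: {query}")
    return query, states, verdict == "consistent"
```

In a real system each stage would be a separate model call with its own prompt template; the point of the sketch is only the generate-then-self-check loop.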

The team employed GPT-4 to generate situational datapoints, BERT embeddings to ensure novelty, and clustering methods to identify representative queries. The simulated household environment VirtualHome was used to produce a dataset of 2,000 situational queries.
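One common way to enforce novelty with embeddings is a cosine-similarity filter: a newly generated query is kept only if its embedding is sufficiently far from every query already accepted. The sketch below is a minimal, illustrative version using toy 3-d vectors in place of real BERT embeddings; the threshold value is an assumption, not a figure from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_novel(candidate, accepted, threshold=0.9):
    """Keep a candidate embedding only if it is not too close
    to any already-accepted query embedding."""
    return all(cosine(candidate, v) < threshold for v in accepted)

# Toy vectors standing in for BERT sentence embeddings.
accepted = [[1.0, 0.0, 0.0]]
near_duplicate = [0.99, 0.05, 0.0]   # nearly parallel: filtered out
fresh_query = [0.0, 1.0, 0.2]        # different direction: kept
```

Clustering the surviving embeddings and taking one query per cluster, as the team does to pick representative queries, would be a natural next step on top of this filter.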

To validate the dataset, they conducted a large-scale study on Amazon Mechanical Turk, which confirmed that 97.26 percent of the AI-generated queries were answerable by human consensus. 

However, when tested on answering these queries, language models aligned with human responses only 46 percent of the time, exposing a gap between generating realistic questions and executing accurate reasoning.

Manocha described the research as a major step toward practical household robotics and context-aware AI, emphasizing that situational awareness could allow systems to function more reliably in human-centered environments.

Their paper, “Is the House Ready for Sleeptime? Generating and Evaluating Situational Queries for Embodied Question Answering,” will be presented at the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) in Hangzhou, China.
