07-26-2024

Humans often misjudge and place too much trust in AI performance

Large language models (LLMs) have impressive, yet sometimes puzzling, capabilities. These tools can help a graduate student draft emails or guide a clinician in diagnosing cancer.

Their applications are endless, but so are the questions about how best to evaluate these models.

Now, a group of researchers at the Massachusetts Institute of Technology (MIT) has taken a different approach to understanding LLMs.

Let’s learn about their new perspective, which places human beliefs and expectations center stage in the evaluation process.

Machine learning and language models

The MIT research team brought together a trio of specialists from diverse fields.

Study co-author Ashesh Rambachan, an assistant professor of economics and a principal investigator in the Laboratory for Information and Decision Systems (LIDS), was a driving force behind this research.

He was joined by lead author Keyon Vafa, a postdoc at Harvard University, and Sendhil Mullainathan, an MIT professor.

Their research, with its unique perspective, will be presented at the International Conference on Machine Learning.

Human belief and generalization

The research centered on the concept of human generalization – how we form and update our beliefs about an LLM’s capabilities after seeing how it performs.

You’ve probably done it yourself; you see someone ace grammar correction and assume they would also be great at sentence construction.

That’s precisely what we do with language models. However, Rambachan points out that these LLMs aren’t, in fact, human, which is why our generalizations can lead to unexpected model failures.

Building on this idea, the team introduced a human generalization function to evaluate how well humans’ beliefs align with actual LLM performance.
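
To make the idea concrete, here is a minimal sketch of what such a function might look like. The rule, names, and probabilities below are illustrative assumptions, not the formulation used in the study.

```python
# Illustrative sketch of a "human generalization function": a mapping from what a
# person has observed an answerer get right or wrong to what they expect on a
# related question. The naive rule and probabilities are assumptions for clarity.

def human_generalization(observed_correct: bool, related_question: str) -> float:
    """Believed probability that the answerer will get the related question right."""
    # Naive projection: observed success (or failure) carries over to related questions.
    # (A richer model of belief would also depend on the related question itself;
    # that dependence is ignored in this toy rule.)
    return 0.9 if observed_correct else 0.2

# A person watches an LLM ace a grammar fix, then forms a belief about a related task.
belief = human_generalization(True, "Rewrite this sentence in the passive voice.")
print(belief)  # 0.9 - but the LLM's actual behavior may not follow this pattern
```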

The survey says...

So, how did the team approach this exciting endeavor? The researchers designed a survey to measure how people generalize when they interact with LLMs.

Participants were shown questions that a person or an LLM had answered correctly or incorrectly, and were then asked to predict whether that person or LLM would answer a related question correctly.

As a result, the team generated a dataset of nearly 19,000 examples that provides insight into how humans form beliefs about LLM performance across 79 diverse tasks.
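
As a rough sketch of how responses like these could be tabulated, the snippet below groups hypothetical survey rows by whether the answerer was a human or an LLM and computes how often participants' predictions matched reality. The field names and toy rows are assumptions for illustration, not the actual dataset format.

```python
# Hypothetical tabulation of survey responses: how often did participants correctly
# predict the answerer's performance on the related question? All rows are invented.

from collections import defaultdict

rows = [
    # (answerer, task, participant_predicted_correct, answerer_actually_correct)
    ("human", "grammar correction", True,  True),
    ("human", "arithmetic",         False, False),
    ("LLM",   "grammar correction", True,  False),
    ("LLM",   "arithmetic",         False, True),
]

hits, totals = defaultdict(int), defaultdict(int)
for answerer, _task, predicted, actual in rows:
    totals[answerer] += 1
    hits[answerer] += int(predicted == actual)

for answerer in totals:
    print(f"{answerer}: prediction accuracy {hits[answerer] / totals[answerer]:.2f}")
```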

Measuring misalignment in language models

The survey results revealed a fascinating pattern. Participants did quite well at predicting a human’s performance but fumbled when it came to foreseeing an LLM’s performance.

“Human generalization gets applied to language models, but that breaks down because these language models don’t show patterns of expertise like people would,” Rambachan pointed out.

Another interesting finding was that humans were more likely to update their beliefs about an LLM’s performance when it got a question wrong.

Simpler models seemed to outperform very large models, like GPT-4, in such scenarios. The researchers speculate that this could be because we are not as accustomed to interacting with LLMs as we are with humans.

Implications for model development

The findings from this research hold significant implications for the future development of language models.

As the researchers highlight, understanding how human beliefs shape our expectations of LLMs can inform model design and training methodologies.

By incorporating insights into human generalization patterns, developers can create models that are not only more robust but also align more closely with user expectations.

This could lead to enhanced transparency in model capabilities, facilitating better integration of LLMs into real-world applications.

Future research on language models

This investigation underscores the complex relationship between human belief systems and LLM performance, opening up a wealth of opportunities for future research.

Questions surrounding how different demographics interact with LLMs, or how context affects human generalization, remain largely unexplored.

Additional studies could focus on refining the human generalization function to improve alignment measures, or delve deeper into the psychological aspects of belief formation and adjustment in response to LLM capabilities.

Exploring these avenues could ultimately lead to more effective and user-friendly artificial intelligence technologies.

The road ahead

Armed with these insights, the researchers are excited to conduct further studies on how people’s beliefs about LLMs evolve over time as they interact with a model more frequently.

They also aim to understand how human generalization could be incorporated into the development of LLMs right from the start.

In the meantime, they hope that their dataset could be used as a benchmark for comparing LLM performance against the human generalization function. This could help improve the performance of models deployed in real-world situations.
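
One way to picture such a benchmark, purely as a hedged sketch: score a model not on raw accuracy alone, but on how often its actual behavior matches what a user, generalizing from what they have seen, would expect. The data format and scoring rule below are illustrative assumptions, not the authors' method.

```python
# Hedged sketch of an alignment-style benchmark score: a model earns credit when the
# belief a user would form about it (via a human generalization function) matches
# its actual behavior. Inputs and the scoring rule are illustrative assumptions.

def alignment_benchmark(examples) -> float:
    """examples: iterable of (believed_prob_correct, model_actually_correct) pairs."""
    examples = list(examples)
    score = sum(believed if actual else 1.0 - believed for believed, actual in examples)
    return score / len(examples) if examples else 0.0

# Toy pairs: high belief + success is rewarded, high belief + failure is penalized.
print(round(alignment_benchmark([(0.9, True), (0.8, False), (0.3, False)]), 2))  # 0.6
```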

“When we are training these algorithms in the first place, or trying to update them with human feedback, we need to account for the human generalization function in how we think about measuring performance,” the researchers remind us.

This research was funded, in part, by the Harvard Data Science Initiative and the Center for Applied AI at the University of Chicago Booth School of Business.

The study is available as a preprint on arXiv.

