Published on February 28, 2026

How Scientists Really Use AI Tools: An Analysis of 250,000 Real Queries

Researchers from Allen AI analyzed 250,000 queries to scientific AI tools to uncover how scientists genuinely interact with them in practice.

Research, 5–7 min read

Event Source: Ai2

When companies release AI tools for science, they typically focus on their capabilities: what the system can do, the problems it solves, and its accuracy. However, the question of how scientists actually use these tools in their daily work often remains unanswered. Specialists from Allen AI decided to investigate this and published an analysis of over 250,000 real queries submitted to their scientific AI tools.

Where the Data Comes From and Why It Matters

Allen AI is a non-profit research laboratory that develops AI tools specifically for the scientific community. Among their offerings are Semantic Scholar (a search engine for scientific papers) and several specialized services that assist researchers in working with literature.

The dataset, which they named ASTA (Academic Search and Task Analysis), comprises queries from real users – scientists, students, and analysts. In other words, this is neither synthetic data nor lab testing; it is exactly what people typed when they needed help with scientific texts.

Why is this valuable? Because developers of AI tools often build systems based on assumptions about how they will be used. Reality, however, frequently proves otherwise. Analyzing real queries is a way to validate expectations against actual usage.

What People Are Actually Asking

The first striking observation is that most queries don't resemble simple search phrases like «find an article about X». People formulate tasks – in detail, with context, sometimes almost like an email to a colleague.

The researchers identified several main types of interaction:

Literature Search – finding papers on a topic, often with specific conditions such as time period, methodology, or area of application.
Comprehension and Explanation – «explain what this term means», «what's the difference between these approaches?», or «briefly summarize the article».
Comparison and Synthesis – «how do different researchers approach this problem?» or «what does the literature say about this issue overall?»
Writing Assistance – phrasing, structure, and finding suitable citations.

This is an important observation: people perceive scientific AI tools not as mere search engines, but rather as thinking assistants to whom they can explain a task and receive a meaningful answer. The distinction is fundamental, and it influences how such systems should be designed.
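To make the four interaction types concrete, here is a minimal sketch of how one might roughly tag queries by type using keyword cues. The cue phrases, the label names, and the functions `tag_query` and `count_types` are illustrative assumptions; nothing here is taken from the dataset or from the researchers' own categorization method.

```python
from collections import Counter

# Hypothetical cue phrases for the four interaction types named above.
# These lists are illustrative assumptions, not derived from the dataset.
CUES = {
    "literature_search": ("find", "papers on", "articles about"),
    "comprehension": ("explain", "difference between", "summarize", "what is"),
    "comparison_synthesis": ("compare", "how do different", "what does the literature say"),
    "writing_assistance": ("rephrase", "citation", "cite", "structure my"),
}


def tag_query(query: str) -> list[str]:
    """Return every interaction type whose cue phrases appear in the query."""
    text = query.lower()
    return [label for label, cues in CUES.items() if any(cue in text for cue in cues)]


def count_types(queries: list[str]) -> Counter:
    """Count how often each interaction type is tagged across a list of queries."""
    counts = Counter()
    for q in queries:
        # Queries that match no cue are counted under a catch-all label.
        counts.update(tag_query(q) or ["unlabeled"])
    return counts
```

A query can receive several tags at once, which is deliberate: it mirrors the observation that people describe whole tasks rather than issue single-purpose searches. Substring matching of this kind is crude and would mislabel many real queries, so it is only a starting point for exploring the data.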

Queries Are More Complex Than They Seem

Another key finding is that a significant portion of queries are multifaceted. A user isn't just looking for an article – they want to find an article, understand its place within the context of the field, and get a brief summary. All in one go.

This presents a real challenge for AI systems. Processing a single, clear query is a straightforward task. But when a person formulates something like, «show me the latest papers on topic X, explain the main disagreements in this field, and help me understand if I should read paper Y», it becomes a complex set of subtasks requiring different capabilities.

According to the data, these complex queries constitute a significant portion of actual use. This implies that tools designed solely for simple searches do not meet the real needs of scientists.
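As a rough illustration of why such queries are harder to serve, the sketch below splits a compound request into clauses and maps each one to a capability. The splitting rule, the `CAPABILITIES` table, and the function names are hypothetical simplifications for this example; they do not describe how Allen AI's tools process queries.

```python
import re

# Hypothetical mapping from a clause's first word to the capability it needs.
CAPABILITIES = {
    "show": "search",
    "find": "search",
    "explain": "explanation",
    "help": "recommendation",
}


def split_subtasks(query: str) -> list[str]:
    """Naively split a compound request into clauses on commas and ' and '."""
    parts = re.split(r",\s*(?:and\s+)?|\s+and\s+", query)
    return [p.strip() for p in parts if p.strip()]


def plan(query: str) -> list[tuple[str, str]]:
    """Pair each clause with the capability suggested by its first word."""
    return [
        (CAPABILITIES.get(part.split()[0].lower(), "unknown"), part)
        for part in split_subtasks(query)
    ]


query = (
    "show me the latest papers on topic X, explain the main disagreements "
    "in this field, and help me understand if I should read paper Y"
)
for capability, clause in plan(query):
    print(f"{capability}: {clause}")
```

Even this toy version turns one request into three subtasks that call for three different capabilities. A real system would need far more robust parsing, since clauses in actual queries rarely separate this cleanly.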

Who Uses Them and What It Looks Like in Practice

The audience for these tools turned out to be broader than one might expect. Alongside experienced researchers, the system is actively used by students and individuals just entering a new field of knowledge. For them, an AI tool often becomes the first point of entry – a way to quickly orient themselves on an unfamiliar topic before diving deep into reading.

This changes the perspective on who these tools are actually made for. While one might have previously assumed they were primarily used by experts needing to quickly find a specific paper, the reality is more nuanced. A significant portion of users are in the process of learning, and they require more than just a list of relevant documents; they need assistance with understanding.

What This Means for AI Tool Developers

Allen AI is publishing this dataset with open access – and this is perhaps the main practical value of the publication. Any team developing tools for working with scientific texts can now rely on real usage patterns rather than untested hypotheses.

The conclusions that suggest themselves here are quite specific:

Tools must be able to handle complex, multifaceted queries – not just simple search phrases.
Explanation and synthesis are not optional features but fundamental user needs.
A large part of the audience consists of people who are just getting to grips with a topic, rather than established experts. The interface and the logic of the responses must take this into account.

To put it simply: if you build a scientific AI assistant based on how scientists really work with it, you get one kind of product. If you base it on assumptions, you'll likely get another – and it's not guaranteed to be useful.

Open Questions

For all its value, this research has its natural limitations. The queries were collected on specific Allen AI platforms, which means the sample reflects this particular audience and these specific tools. The behavior of users on other systems may differ.

Furthermore, the analysis reveals what people ask but doesn't always explain why they ask it that way. Why do some prefer detailed queries while others use short ones? To what extent does the phrasing depend on habit, the interface, or previous experience with AI? These questions remain open.

But even with these caveats, having 250,000 real examples of how scientists interact with AI is significantly better than building systems in an information vacuum. Such data gradually shifts development from intuition to an evidence-based approach – and this is, perhaps, exactly the direction we should be moving in.

#analysis #methodology #ai development #social impact of ai #data #human–machine interaction #thinking in the age of ai #scientific ai #ai assistants

Link to Original: https://allenai.org/blog/asta-interaction-dataset

Original Title: How do researchers actually use AI-powered science tools? Lessons from 250,000+ queries

Publication Date: Feb 27, 2026

Ai2 allenai.org A U.S.-based research institute developing language models and AI systems for science and education.
