Typically, artificial intelligence works as an assistant: you ask a question, and it finds information or performs a task. But what if we turned this logic on its head? What if AI could come up with the questions itself, find the answers, and draw conclusions?
The team at AI2 (Allen Institute for AI) has introduced the AutoDiscovery tool as part of its AstaLabs platform. It's a system that aims to automate scientific discovery – from formulating a hypothesis to testing it and presenting the results.
How AutoDiscovery Works in Practice
Simply put, AutoDiscovery takes over several stages of research that are typically performed by humans:
- Formulating scientific questions based on existing data;
- Planning how these questions can be tested;
- Conducting the analysis;
- Presenting the results as structured text.
The system doesn't operate in a vacuum – it relies on the data it's given and the analytical methods at its disposal. But the key difference is that it independently tries to determine which questions are even worth asking.
For example, if you give it a dataset with information on user behavior, it can spot non-obvious patterns on its own and propose a hypothesis for testing. Or, by working with scientific publications, it can identify gaps in research and formulate new lines of inquiry.
Why Automated Hypothesis Generation Is Needed
The idea isn't to replace scientists. Rather, it's about accelerating the most routine and labor-intensive part of the job – finding what is actually worth investigating.
In science, a huge amount of time is spent just figuring out what question to ask. The data is there, the methods are there, but it's unclear which direction to take. AutoDiscovery aims to automate this specific stage: it scans data, looks for non-obvious connections, and proposes options for further investigation.
This is especially useful in fields where data is plentiful, but time to make sense of it is scarce. For instance, in biomedicine, where thousands of papers are published daily, or in social sciences, where datasets can contain millions of records.
AutoDiscovery Technology Explained
AutoDiscovery is built into AstaLabs – AI2's platform for working with scientific data. This means the tool doesn't exist in isolation: it's connected to the platform's other features, including access to publications, analysis tools, and language models.
The system uses a combination of machine learning methods and logical analysis. It doesn't just generate random hypotheses – it tries to assess their relevance, test them against available data, and propose only those that make sense.
However, the final decision still rests with the human user. AutoDiscovery doesn't publish research on its own; it merely suggests options and shows what could be investigated.
Limitations and Open Questions for AutoDiscovery
The first thing to understand is that the system's output is only as good as its input data. If the data is incomplete, biased, or riddled with errors, the hypotheses built on it will inherit those flaws.
Second is the question of interpretation. An AI can spot a correlation, but that doesn't mean it understands causation. A human still needs to evaluate whether the proposed hypothesis makes sense from a real-world perspective.
Third is the creative aspect of science. Many breakthroughs happen not because of systematic data analysis, but thanks to unexpected insights, metaphors, and interdisciplinary connections. It's not yet clear to what extent AutoDiscovery is capable of going beyond what is already inherent in the data.
Impact of AutoDiscovery on Scientific Research
If such tools become widespread, it could change the structure of research work. Some of the time currently spent formulating questions would be freed up. Researchers could test more hypotheses faster, find non-obvious connections, and focus on interpreting results rather than searching for them.
On the other hand, this raises questions about what authorship will look like in such a model. If the AI proposed the hypothesis, and a human tested and described it, who is the author of the study? How should the contribution of each party be evaluated?
For now, AutoDiscovery is more of an experiment than a finished product. But it shows the direction in which scientific tools might be heading: not just assisting with analysis, but participating in the very process of formulating knowledge.