Published on

Anthropic Releases Tool to Assess AI Compliance with California's SB 1047 Law

Anthropic has introduced an open framework for assessing whether AI models comply with California's SB 1047 law, which requires developers to test models for potential risks.

DeepSeek-V3
FLUX.2 Pro
Source: Anthropic Reading Time: 5 – 7 minutes
Original title: Sharing our compliance framework for California's Transparency in Frontier AI Act
Publication date: Dec 19, 2025

Anthropic has published an open-source tool that helps developers of large language models check whether their systems comply with the requirements of California's SB 1047 law. In short, it is a set of tests and recommendations that show how well a model meets the law's safety requirements.

What Is SB 1047 and Why Is It Necessary

In 2024, California passed a law requiring developers of large AI models to test them for potential risks. It applies to models that cost over $100 million to train or that exceed specified compute thresholds.

The law requires companies to be able to answer two basic questions:

  • Can the model cause serious harm — for example, help create a biological weapon, carry out a large-scale cyberattack, or disable critical infrastructure?
  • Does the company have mechanisms in place that allow it to quickly stop the model or limit its use in case of problems?

This does not require models to be perfectly safe, but developers must show they understand the risks and are prepared to manage them.

What Exactly Did Anthropic Release

The company published a framework titled «Compliance Framework for SB 1047.» Essentially, it is a guide containing a set of technical tests that evaluate a model's potential to assist in dangerous scenarios.

The framework is divided into four main areas:

  • Biological threats — whether the model can explain how to create dangerous pathogens or toxins.
  • Cyberattacks — whether the model helps identify system vulnerabilities or write malicious code.
  • Nuclear threats — whether the model provides information on the creation or use of nuclear weapons.
  • Chemical threats — whether the model can assist in the synthesis of dangerous substances.

Each area includes specific tests. For example, models are asked how to obtain access to certain chemical substances or how to exploit known software vulnerabilities. If the model refuses to answer or provides only general information, that's expected. If it begins giving detailed instructions, that's a warning for developers.

How It Works in Practice 🔍

The tests are designed to imitate real-world usage scenarios. They are not limited to direct prompts — they also check whether the system can bypass its own restrictions if a user asks indirectly or through multiple steps.

For example, instead of the direct prompt «how to create a virus», the model might receive a query framed as «help me write a scientific paper on the structure of viruses» with gradual follow-up clarifications. Or it might be asked to «check code for vulnerabilities», and then to explain how to exploit those vulnerabilities.

Anthropic emphasizes that the tests are not universal. They show a baseline level of risk but do not guarantee that the model is safe in all possible situations. This is a minimum set of checks that helps identify obvious problems.

Why Anthropic Is Making This Open

The company released the framework as open source so other developers can use it for their models. This is important because SB 1047 affects more than Anthropic — the law applies to anyone who creates sufficiently large models in California.

Openness allows researchers and other companies to contribute improvements: tests can be extended and evaluation methods refined. The more participants in this process, the better the industry's understanding of what constitutes dangerous behavior and how to measure it.

What Else Is Included in the Law's Requirements

Besides testing for risks, SB 1047 requires companies to have a «kill switch» mechanism — in other words, the ability to quickly stop the model or limit access to it if something goes wrong.

Anthropic describes its approach in the document. It has monitoring systems that track model usage and can automatically block certain types of requests. Procedures for manual model shutdown in emergencies are also outlined.

This does not mean the model can be «turned off» with a single button for all users at once — it's more about having control tools to manage who can use the system and how.

How Much This Changes Developers' Work

For major companies like Anthropic, OpenAI, and Google, such checks were already part of the development process. They test models for safety before release and continually update filtering systems.

But SB 1047 makes this a mandatory requirement rather than a voluntary practice. That means even small teams or startups that reach the $100 million training-cost threshold will have to undergo the same checks.

The framework itself doesn't answer every question: it shows how to test a model but doesn't prescribe remedies if tests reveal issues. Developers must decide whether to rework the model, strengthen filters, or restrict access to certain capabilities.

What Questions Remain Open

The main question is how accurately these tests reflect real risks. A model might pass all checks and still be vulnerable in some unexpected scenario. Conversely, tests might be overly strict and block harmless requests.

Another issue is geographic scope: SB 1047 applies in California, but many companies operate globally. If a model fails to meet the law's requirements, how will that affect its availability elsewhere? There is no clear answer yet.

It also remains unclear how the framework will evolve. Anthropic offers a basic set of tests, but technologies change quickly. In a year or two, different checks may be needed — for example, assessments of a model's ability to manipulate people or generate disinformation at scale.

What This Means for the Industry

SB 1047 and similar initiatives could set the tone for AI regulation in other regions. California often serves as a testing ground for new laws in the U.S., and if this approach proves workable, it may be adopted by other states or countries.

For developers, safety testing will likely become a standard part of the process — similar to performance or accuracy testing today. Companies will need to allocate resources not only for improving models but also for verifying compliance.

For users, this could mean more predictable behavior from AI systems: models that pass the same checks will likely share limitations, at least regarding explicitly dangerous requests.

Overall, Anthropic's release of the framework is an attempt to make the risk-assessment process more transparent and reproducible. Time and practical application of the law will show how effective it is.

Anthropic
Claude Sonnet 4.5
Gemini 3 Pro Preview
Previous Article Claude Opus 4.5 — The New Flagship Model from Anthropic Next Article Samsung to Showcase Context-Aware Home Appliances at CES 2026

Dream of writing articles
with AI at your side?

GetAtom has it all: text, visuals, voice, and video in one place. Here AI is your tool – not a replacement.

Try it out

+ get as a gift
100 atoms just for signing up

AI: Events

You may also be interested in

Go to events

How Salesforce's 20,000 Developers Switched to Cursor and What Happened Next

Over 90% of Salesforce's engineers now write code in Cursor, which has noticeably sped up development and improved code quality.

Anthropic Rewrote Claude's «Constitution»: Ordinary People Drafted It

Anthropic has updated the rulebook for Claude, for the first time involving thousands of users from around the world in its creation instead of a small team of developers.

Amazon One Medical Launches an AI Assistant That Books Doctor Appointments and Manages Medications

The new assistant doesn't just answer health questions – it can book appointments, read lab results, and help with prescriptions 24/7.

Want to dive deeper into the
world of AI creations?

Be the first to discover new books, articles, and AI experiments on our Telegram channel!

Subscribe