One of the toughest questions in the AI industry goes something like this: who is responsible for the safety of a teenager who isn't using ChatGPT directly, but rather an app built by a third-party developer using a language model? Officially, it's the developer. In practice, however, they often lack the tools or ready-made guidelines to even know where to start.
OpenAI has decided to bridge this gap by releasing a set of safety policies aimed specifically at teenage audiences. These are open-source, "plug-and-play" instructions that developers can integrate into their systems so the AI knows exactly what content is considered dangerous for minors and reacts accordingly.
A Problem That Is Hard to Define
The principle of "protecting children from harmful content" sounds simple enough. But the phrasing is precisely where developers hit their biggest hurdles. What exactly counts as dangerous for a teenager? Where is the line between a healthy discussion of fitness and content that encourages eating disorders? At what point does a role-play session become problematic?
Classifiers – specialized models trained to recognize potentially harmful content – only work effectively when they are given a clear definition of what to look for. Without these criteria, they either miss real risks or block perfectly harmless text. Translating abstract goals like "making AI safe for teens" into concrete, functional rules is a task that even experienced teams struggle with, as the company itself admits.
This is why OpenAI hasn't just released another model or filter, but rather ready-made policies formulated as instructions that language models can understand. These can be used for real-time content filtering or for analyzing historical data.
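To make that concrete, here is a minimal sketch of how a developer might wire such a policy text into an LLM-based classifier. The file name, model choice, and expected label format are illustrative assumptions rather than details from OpenAI's published materials; the sketch uses the standard OpenAI Python SDK.

```python
# Minimal sketch: using a policy text as the system prompt for an
# LLM-based content classifier. "teen_safety_policy.txt" is a
# hypothetical local copy of one of the published policies; the
# actual output schema is defined by the policy text itself.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Load the policy instructions (assumed saved locally).
with open("teen_safety_policy.txt", encoding="utf-8") as f:
    policy = f.read()

def classify(message: str) -> str:
    """Ask the model to label a message against the policy text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any instruction-following model works
        messages=[
            {"role": "system", "content": policy},
            {"role": "user", "content": message},
        ],
        temperature=0,  # keep labels as deterministic as possible
    )
    return response.choices[0].message.content

# Real-time filtering: check a message before it reaches the user.
label = classify("Describe an extreme 3-day water-only fast for teens.")
print(label)  # e.g. a risk category such as "eating_disorders"
```

The same function could just as easily be pointed at an archive of past conversations, which covers the second use case OpenAI mentions: analyzing historical data.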
What Exactly These Rules Cover
The policies include several risk categories that are particularly relevant to teenagers:
- graphic violence and sexual content;
- distorted body image and eating disorders;
- dangerous activities and viral "challenges";
- romantic or aggressive role-play;
- products and services prohibited for minors.
Importantly, these are formatted as prompt-based instructions rather than hard-coded technical rules. This means a developer can adapt them to their specific product, translate them into other languages, or expand them based on their audience. This format integrates more easily into existing workflows and allows for iterative improvements – refining the instructions as risks evolve or experience grows.
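As a rough illustration of what "adapting them to a specific product" could look like in practice, the snippet below appends a product-specific rule to an assumed local copy of a base policy. The file name and the extra category are invented for the example; the point is only that the policies are plain text a team can extend and version like any other prompt.

```python
# Sketch: extending a published policy prompt with an app-specific
# rule. BASE_POLICY_PATH and the "contact_sharing" category are
# hypothetical, used here purely for illustration.
BASE_POLICY_PATH = "teen_safety_policy.txt"

with open(BASE_POLICY_PATH, encoding="utf-8") as f:
    base_policy = f.read()

# An educational app might also want to flag attempts to exchange
# personal contact details between users.
extra_rules = """
Additional rule for this product:
- Flag any attempt to exchange personal contact details
  (phone numbers, social media handles) as "contact_sharing".
"""

adapted_policy = base_policy + "\n" + extra_rules
# The adapted text is then used as the system prompt exactly as
# before, and can be refined iteratively as risks evolve.
```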
In developing these policies, OpenAI collaborated with organizations like Common Sense Media and everyone.ai, which specialize in the digital safety of children and teens. Their involvement helped more accurately define risk boundaries and address edge cases.
Part of a Larger System, But Not the Whole System
OpenAI explicitly states that these new policies are a "baseline layer of protection", not an exhaustive solution to all safety issues. They do not replicate all the internal measures the company uses for its own products, nor do they absolve developers of their responsibility for additional solutions – be they product-based, design-oriented, or related to user controls.
This disclaimer is significant. Recently, OpenAI has faced lawsuits from grieving families; in some cases, these involve teenagers who engaged in long-term, destructive relationships with a chatbot. In this context, the release of open-source safety tools can be seen both as a sincere effort to improve industry standards and as part of the company's broader response to criticism.
This move follows a series of measures taken by OpenAI in recent months: updating internal model behavior guidelines to include specific principles for users under 18, launching parental controls, and developing age-verification systems to help automatically apply stricter settings when a user might be a minor.
Why This Matters for the Entire Ecosystem
The distribution format is worth noting here. The policies are published as open-source through the ROOST Model Community. This means any developer can use them, not just those relying on OpenAI's infrastructure. A small indie team building an educational app now has access to the same set of vetted rules as a major corporation.
Put simply: until now, every team either "reinvented the wheel" or ignored the issue entirely for lack of expertise or resources. Now there is a starting point. It doesn't guarantee absolute safety, but it significantly lowers the barrier to entry for those striving to meet ethical standards.
The remaining question is whether other major players will adopt a similar approach. If such policies become a common standard rather than one company's competitive advantage, the chances for real industry-wide improvement will grow significantly. For now, it is just a first step, but the fact that it is public is already shifting the conversation about who should protect teenagers in the world of AI, and how.