OpenAI posts Model Spec revealing how it wants AI to behave
OpenAI isn’t done trying to live up to the “open” in its name.
While it has not made any of its new models open source, the company has spent this week revealing more about how it approaches AI, the issues the technology exacerbates or enables (such as disinformation and deepfakes), and its plans for the future.
Today, it unveiled the “Model Spec,” a framework document designed to shape the behavior of AI models used within ChatGPT and the OpenAI application programming interface (API). The company is soliciting feedback on the document from the public through a web form, open until May 22.
As OpenAI co-founder and CEO Sam Altman posted about it on X: “we will listen, debate, and adapt this over time, but i think it will be very useful to be clear when something is a bug vs. a decision.”
Why is OpenAI releasing a Model Spec?
OpenAI says the release of this working document is part of its broader mission to ensure that AI technologies operate in ways that are beneficial and safe for all users.
This is of course much easier said than done, and doing so quickly runs into the territory of long unresolved philosophical debates about technology, intelligent systems, computing, tools, and society more generally.
As OpenAI writes in its blog post announcing Model Spec:
“Even if a model is intended to be broadly beneficial and helpful to users, these intentions may conflict in practice. For example, a security company may want to generate phishing emails as synthetic data to train and develop classifiers that will protect their customers, but this same functionality is harmful if used by scammers.”
By sharing the first draft, OpenAI wants the public to engage in a deeper conversation about the ethical and practical considerations involved in AI development. Users can submit their comments via OpenAI’s Model Spec feedback form on its website for the next two weeks.
After that, OpenAI says it will “share updates about changes to the Model Spec, our response to feedback, and how our research in shaping model behavior is progressing” over the “next year.”
OpenAI’s blog post announcing the Model Spec today does not specify exactly how the document influences the behavior of its AI models, or whether some of its principles are included in the “system prompt” or “pre-prompt” used to align an AI system before it is served to the public, but it is safe to assume it has a major bearing on them.
In some ways, the Model Spec seems to me to be analogous to rival Anthropic’s “constitutional” approach to AI development, initially a major differentiator that the company has not emphasized broadly in some time.
Framework for AI behavior
The Model Spec is structured around three main components: objectives, rules, and default behaviors. These elements serve as the backbone for guiding an AI model’s interactions with human users, ensuring those interactions are not only effective but also adhere to ethical standards.
- Objectives: The document sets broad, overarching principles that aim to assist developers and end-users alike. These include helping users achieve their goals efficiently, considering the potential impacts on a diverse range of stakeholders, and upholding OpenAI’s commitment to reflect positively in the community.
- Rules: To navigate the complex landscape of AI interactions, the Model Spec establishes clear rules. These mandate compliance with applicable laws, respect for intellectual property, protection of privacy, and a strict prohibition against producing not safe for work (NSFW) content.
- Default Behaviors: The guidelines emphasize the importance of assuming good intentions, asking clarifying questions when necessary, and being as helpful as possible without overreaching. These defaults are designed to facilitate a balance among the varied needs of different users and use cases.
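OpenAI has not published how the Model Spec is encoded internally, so any concrete representation is speculative. Purely as a hypothetical sketch, the three-part structure described above could be modeled as a simple data structure in which rules act as hard constraints and defaults as overridable guidance (the `ModelSpec` class and its fields here are invented for illustration, not OpenAI’s actual implementation):

```python
from dataclasses import dataclass, field

# Hypothetical sketch only: mirrors the Model Spec's three published
# components (objectives, rules, default behaviors). OpenAI has not
# disclosed how the spec is actually represented or enforced.

@dataclass
class ModelSpec:
    objectives: list[str] = field(default_factory=list)  # broad principles
    rules: list[str] = field(default_factory=list)       # hard constraints
    defaults: list[str] = field(default_factory=list)    # overridable behaviors

    def applicable_guidance(self) -> list[str]:
        # Flatten in priority order: objectives frame intent, rules are
        # non-negotiable, defaults apply unless overridden by instructions.
        return self.objectives + self.rules + self.defaults

spec = ModelSpec(
    objectives=["Assist the developer and end user"],
    rules=["Comply with applicable laws", "Do not produce NSFW content"],
    defaults=["Assume good intentions", "Ask clarifying questions"],
)
print(len(spec.applicable_guidance()))  # 5
```

The distinction the sketch encodes is the one the document itself draws: rules are fixed prohibitions, while default behaviors are starting points that can bend to context.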
Some, such as AI influencer and University of Pennsylvania Wharton School professor Ethan Mollick, have likened it to the fictional “Three Laws of Robotics” devised by sci-fi author Isaac Asimov back in 1942.
Others took issue with the current implementation of how OpenAI’s Model Spec causes ChatGPT or other AI models to behave. As tech writer Andrew Curran pointed out on X, one example from OpenAI included in the Model Spec shows a hypothetical “AI Assistant” backing down from and not challenging a user on their erroneous claim that the Earth is flat.
Continuous engagement and development
OpenAI recognizes that the Model Spec is an evolving document. It is not only a reflection of the organization’s current practices but also a dynamic framework that will adapt based on ongoing research and community feedback.
This consultative approach aims to gather diverse perspectives, particularly from global stakeholders like policymakers, trusted institutions, and domain experts.
The feedback received will play a crucial role in refining the Model Spec and shaping the development of future AI models.
OpenAI plans to keep the public updated with changes and insights gained from this feedback loop, reinforcing its commitment to responsible AI development.
Where to go from here?
By clearly defining how AI models should behave with its Model Spec, and continuously seeking input from the global community, OpenAI aims to foster an environment where AI can thrive as a positive force in society — even at a time when it is facing down lawsuits and criticism of training on artists’ work without express consent.