AI Policy Weekly #26
RAND and Aschenbrenner’s security analyses, OpenAI’s data deals, and a proposed “right to warn about advanced AI”
Welcome to AI Policy Weekly, a newsletter from the Center for AI Policy. Each issue explores three important developments in AI, curated specifically for US AI policy professionals.
AI Companies Are Failing to Prioritize US National Security
In April, OpenAI fired safety employee Leopold Aschenbrenner for leaking information.
This week, Aschenbrenner offered his side of the story in an interview with Dwarkesh Patel. Aschenbrenner said the document in question was a short “brainstorming document” containing no confidential information, which he had shared with three external researchers for feedback.
OpenAI also told Aschenbrenner that a major reason for his firing was a memo he had shared with the company’s board last year expressing security concerns. The memo argued that OpenAI’s information security was “egregiously insufficient to protect against the theft of model weights or key algorithmic secrets from foreign actors.”
Aschenbrenner elaborated on these concerns in a 165-page treatise that he published this week. The document, titled “Situational Awareness,” argues at length that AI systems will grow dramatically more capable over the next few years, soon reaching the level of artificial general intelligence (AGI): systems that can automate all cognitive labor and accelerate AI research even further. The document then describes major sources of risk, including lax information security at AI companies.
Aschenbrenner pulls no punches, predicting that “In the next 12-24 months, we will leak key AGI breakthroughs to the CCP. It will be the national security establishment’s single greatest regret before the decade is out.”
Beyond the leaking of technical insights, Aschenbrenner also worries about the theft of model weights, the numerical parameters that encode everything an AI model knows. He says AI companies are “miles away” from sufficient security for those weights.
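For readers less familiar with the term, here is a minimal, purely illustrative sketch in PyTorch (a toy network, many orders of magnitude smaller than any frontier model) showing why weights are such a tempting target: they are just numbers that can be written to a single file and copied like any other data.

```python
# Illustrative toy example, not any company's actual code: model "weights" are
# simply the numerical parameters of a trained network, which can be serialized
# to a file and copied like any other data.
import torch
import torch.nn as nn

# A tiny stand-in network; frontier models have hundreds of billions of parameters.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))

num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params:,}")

# Whoever holds this file can run or fine-tune the model without the original
# developer's knowledge or consent.
torch.save(model.state_dict(), "weights.pt")
```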
As evidence, he highlights Google’s Frontier Safety Framework, where the company admits that its AI model weights are vulnerable to hacks from sophisticated attackers. He also emphasizes that the Justice Department recently arrested a Chinese national for stealing AI trade secrets from Google using remarkably simple tactics, such as copying data into Apple Notes.
The concern with stolen model weights is that thieves can easily repurpose the model for malicious purposes: it is remarkably cheap (e.g., $200) to remove any safety guardrails that the company instilled in the model before it was stolen.
Indeed, OpenAI has publicly reported that actors in China, Russia, and Iran are already using its models to assist in influence operations and cyber attacks. OpenAI can thwart these attempts by revoking account access, but it would have no such recourse if the attackers possessed the company’s model weights.
Aschenbrenner’s concerns about inadequate security are seconded by a new RAND report on securing model weights. The report identifies 38 potential attack vectors for stealing weights, assesses how capable different categories of attackers are of executing them, and outlines a framework for escalating company security to match these threats.
RAND recommends that leading (“frontier”) AI companies implement several critical security protections that “are not yet comprehensively implemented in frontier AI organizations,” such as insider threat programs. Most of these measures are feasible to implement within a year.
However, these protections are insufficient for defending against well-resourced attackers like nation states. Accomplishing that level of security will take years, and “it is unclear whether such actions will take place without proactive solicitation.”
The bottom line is that frontier AI companies are not on track to self-govern their way into adequate information security, especially if AI models soon become important tools for launching cyber attacks and building weapons of mass destruction. It’s high time for Congress to step in and ensure stronger security practices.
OpenAI Inks Twelfth Major Data Deal, Illustrating Steep Costs of AI Development
Last week, OpenAI announced content deals with The Atlantic and Vox Media (including Vox, The Verge, SB Nation, New York Magazine, and more).
These deals include permission for OpenAI to train AI models on articles from these outlets.
These add to at least ten existing deals OpenAI has made so far to acquire training data, including:
News Corp (including The Wall Street Journal, Barron’s, the New York Post, and more)
Axel Springer (including POLITICO and Business Insider)
Dotdash Meredith (including People, Better Homes & Gardens, Serious Eats, and more)
Prisa Media (including El País, Cinco Días, and El HuffPost)
The Financial Times
OpenAI’s financial commitments in these deals are often private, but some details have been leaked or announced.
For example, The Information reported that “OpenAI has offered some media firms as little as between $1 million and $5 million annually to license their news articles for use in training its large language models, according to two executives who have recently negotiated with the tech company.”
Additionally, the Axel Springer deal reportedly cost OpenAI “tens of millions of euros” over three years, and the News Corp deal reportedly cost $250 million over five years.
There is also evidence from other AI companies. Google recently signed a deal with Reddit that was “about $60 million per year.”
Add the deals up, and OpenAI could easily be paying over $100 million per year for training data.
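For a rough sense of that arithmetic, here is a back-of-the-envelope tally in Python based only on the figures cited above; the low and high values for each line are illustrative estimates, not disclosed contract terms.

```python
# Back-of-the-envelope estimate of OpenAI's annual data-licensing spending,
# using only publicly reported figures. The (low, high) values are illustrative
# assumptions, not disclosed contract terms.
deals_musd_per_year = {
    "News Corp ($250M reported over 5 years)": (50, 50),
    "Axel Springer ('tens of millions of euros' over ~3 years)": (10, 30),
    "Roughly 10 other deals ($1M-$5M+ each, per The Information)": (10, 50),
}

low = sum(lo for lo, _ in deals_musd_per_year.values())
high = sum(hi for _, hi in deals_musd_per_year.values())
print(f"Estimated total: ${low} million to ${high} million per year")
```

Even the conservative end of that range approaches nine figures annually, and the upper end comfortably clears $100 million.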
This is setting aside the tens or hundreds of millions of dollars that leading companies must spend renting cloud computing, or even more to purchase data centers outright, complete with chips, cooling equipment, and electricity, although those resources can be reused across multiple training runs.
It is also setting aside the salaries for the employees involved in each mini-Apollo mission that goes into building a frontier AI model: Google's Gemini 1.5 paper lists over 700 contributors, and each of the company's engineers easily earns six—if not seven—figures per year.
Further, the cost of building frontier AI models is growing exponentially, so these financial requirements will rapidly rise from here. Development costs have more than doubled every year for the past eight years, suggesting that state-of-the-art AI models will cost over $1 billion apiece in the coming years.
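As a simple illustration, the sketch below assumes today’s state-of-the-art models cost roughly $100 million to develop (an assumed figure in line with public estimates, not a confirmed number) and projects the annual doubling forward:

```python
# Projection of frontier AI development costs under continued annual doubling.
# The ~$100M starting point is an illustrative assumption, not a confirmed figure.
cost_musd = 100  # assumed cost of a state-of-the-art model today, in millions of USD
for year in range(1, 6):
    cost_musd *= 2
    print(f"Year {year}: ~${cost_musd:,} million")
# By year 4 the projected cost exceeds $1,000 million, i.e. more than $1 billion per model.
```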
For these reasons, cutting-edge AI development is swiftly becoming a project that only Big Tech can afford.
AI Employees Propose a “Right to Warn About Advanced Artificial Intelligence”
A group of current and former OpenAI employees, led by Daniel Kokotajlo, recently accused the company of having a “reckless” and unsafe culture in its rush to develop advanced AI systems.
They emphasize OpenAI’s historical use of restrictive non-disparagement agreements to prevent former employees from raising concerns about safety.
In this spirit, the group published an open letter calling for AI companies to adopt robust whistleblower protections, anonymous reporting procedures, and cultures that allow criticism.
Kokotajlo also alleged that OpenAI’s internal safety review board failed to prevent unapproved testing of GPT-4 by Microsoft in 2022. Microsoft initially denied this claim before backtracking and admitting to it.
An OpenAI spokesperson expressed agreement that “rigorous debate is crucial given the significance of this technology.”
The news adds to existing concerns about OpenAI’s ability to self-regulate its way through the creation of a transformative technology.
News at CAIP
We’re pleased to announce the newest full-time member of CAIP: Brian Waldrip is joining the team as Government Relations Director. Brian brings over two decades of Congressional affairs experience, including nine years as the Legislative Director for former Congressman John Mica of Florida.
We’re also happy to share that our 2024 summer interns have settled into the office and are getting started on their projects. Aileen Niu and Vedant Patel of Duke University are joining us through the DukeEngage summer program.
Save the date: we’re hosting a panel discussion on Wednesday, June 26th from 11am–12pm titled “Protecting Privacy in the AI Era: Data, Surveillance, and Accountability.”
ICYMI: transcripts for all seven episodes of the Center for AI Policy Podcast are now available on Substack.
Jason Green-Lowe wrote a new blog post: “Influential Safety Researcher Sounds Alarm on OpenAI's Failure to Take Security Seriously.”
We submitted a comment to the National Institute of Standards and Technology’s (NIST) Request for Comments on draft documents responding to NIST’s assignments under last fall’s AI Executive Order.
We’re hiring for an External Affairs Director.
Quote of the Week
Models can write code dozens of times faster than humans; and I’m assuming they’ll be as skilled as the best human hackers. If they put a subtle vulnerability into that code, it’d be difficult for humans to detect.
Cybersecurity against external attackers is hard enough. Against an adversary who’s adding tens of thousands of lines of code to your codebase every day, it’s far harder.
—OpenAI employee Richard Ngo outlining a concern that in the future, autonomous AI software agents could create serious cybersecurity vulnerabilities
This edition was authored by Jakub Kraus.
If you have feedback to share, a story to suggest, or wish to share music recommendations, please drop me a note at jakub@aipolicy.us.
—Jakub