Welcome to AI Policy Weekly, a newsletter from the Center for AI Policy (CAIP). Each issue explores three important developments in AI, curated specifically for U.S. AI policy professionals.
To help shape the future of this newsletter and make it more valuable for you, please consider taking a minute to share feedback here.
OpenAI Launches o3 and o4-mini Models with Enhanced Reasoning and Tool Integration
OpenAI recently released o3 and o4-mini (and o4-mini-high, which thinks more before responding). Paid ChatGPT subscribers can use all these models, while free users are limited to o4-mini.
o3’s benchmark scores are impressive, but they are lower than those of the early prototype that OpenAI flaunted in December. Granted, o3-pro is coming “in a few weeks,” and it might match or exceed the December scores by using more computation to solve problems.
o3 is arguably the best reasoning model in the world currently, though Google’s Gemini 2.5 Pro is a close competitor.
The model’s competence may stem from heavy computational investment, with one OpenAI employee stating that “we put in more than 10 times the training compute of o1 to produce o3.”
That compute is yielding impressive results. The third-party evaluator METR tested o3 and believes it surpasses previous projections for AI’s ability to complete increasingly lengthy software-related tasks entirely autonomously (albeit with 50% reliability).
Importantly, o3 can fluently wield tools, especially web search and image processing, as it reasons. Though many chatbots offer reasoning capabilities, search access, and image recognition, few integrate all of these together so smoothly.
Not all the news is rosy. Many users are reporting cases of o3 lying to them and confidently fabricating information. Indeed, its hallucination rate is higher than o1’s on the PersonQA benchmark.
Additionally, biological risks are visible on the horizon. OpenAI writes that “several of our biology evaluations indicate our models are on the cusp of being able to meaningfully help novices create known biological threats, which would cross our high risk threshold. We expect current trends of rapidly increasing capability to continue, and for models to cross this threshold in the near future.”
In line with that assessment, a new paper from SecureBio and the Center for AI Safety finds that language models score well on the Virology Capabilities Test (VCT), a benchmark with “322 search-proof, relevant, and multimodal questions on practical troubleshooting in virology, including coverage of many dual-use topics. The questions in VCT involve rare knowledge that trained virologists themselves consider hard-to-find or even tacit.”
Emerging AI-driven biorisks make it even more pressing for Congress to pass bills like the Strategy for Public Health Preparedness and Response to Artificial Intelligence Threats Act and the MedShield Act. The Center for AI Policy supports both these bills.

Trump Signs Executive Order on Advancing K–12 AI Education
On April 23rd, President Donald Trump signed an executive order titled “Advancing Artificial Intelligence Education for American Youth.”
The order sets the policy of the United States to “promote AI literacy and proficiency among Americans by promoting the appropriate integration of AI into education, providing comprehensive AI training for educators, and fostering early exposure to AI concepts and technology.”
A new White House Task Force on AI Education, led by the Office of Science and Technology Policy Director, will coordinate implementation efforts across multiple federal agencies, including the departments of Education, Labor, Agriculture, and Energy, alongside the National Science Foundation.
Within 90 days, the Task Force will plan a Presidential AI Challenge to be implemented over the ensuing 12 months. The Challenge will “encourage and highlight student and educator achievements in AI, promote wide geographic adoption of technological advancement, and foster collaboration [...] to address national challenges with AI solutions.”
The order also mandates public-private partnerships to develop K–12 AI literacy resources, with initial partnerships announced on a rolling basis and resources mobilized within 180 days of the first announcement.
Separately, the Department of Education will issue guidance within 90 days on using existing grant funds for “AI-based high-quality instructional resources; high-impact tutoring; and college and career pathway exploration, advising, and navigation.”
Through teacher training grants, the Department of Education will also support AI-related projects that reduce administrative work, improve teacher evaluation, provide professional development for teachers, and help educators “integrate the fundamentals of AI into all subject areas.”
The National Science Foundation will prioritize research on the use of AI in education, while the Agriculture Department will support AI education through 4-H programs and the Cooperative Extension System.
The Labor Department will expand AI-related Registered Apprenticeships, direct funding toward AI skills development, promote AI education certifications, and support high school AI courses through grants.
Overall, this executive order represents a significant federal push towards AI literacy across America’s K–12 educational landscape.

New Report Warns of Major Security Vulnerabilities in U.S. AI Projects
A newly released report from Gladstone AI identifies severe vulnerabilities in America’s frontier AI development and warns that espionage could give China access to U.S. AI breakthroughs before they benefit American national security.
According to authors Jeremie and Edouard Harris, Trump White House officials viewed the document, titled “America’s Superintelligence Project.”
While some sections are redacted in the public version, the report presents findings from a 12-month investigation involving over 100 specialists from intelligence, military, and AI research communities.
The authors identify several critical AI security and governance challenges:
Data center vulnerabilities: One experienced special forces operator assessed a $2 billion data center and identified a $30,000 attack that would disable it for six months or more.
Supply chain dependencies: Many AI infrastructure components come from China, making them vulnerable to compromise. For example, Taiwan-based ASPEED manufactures 70% of the world’s Baseboard Management Controllers (BMCs).
AI lab insecurity: According to one former OpenAI researcher, OpenAI had serious security vulnerabilities that “would have allowed any employee to exfiltrate model weights from the lab’s servers undetected.”
AI control challenges: The authors write that “highly capable and context-aware AI systems can invent dangerously creative strategies to achieve their internal goals that their developers never anticipated or intended them to pursue.”
Power concentration: The report warns that without robust checks and balances, “a small handful of people will end up in control of the most powerful technology ever created.”
To help address these issues, the authors recommend building highly secure data centers in remote locations, creating U.S.-based supply chains, developing robust AI control techniques, implementing oversight mechanisms similar to nuclear command protocols, and more.

CAIP News
CAIP supported a letter to the attorneys general of Delaware and California, urging them to preserve nonprofit control of OpenAI.
CNBC’s Hayden Field quoted Jason Green-Lowe in an article about the letter.
Politico’s Mohar Chatterjee quoted Jason Green-Lowe in a story covering concerns about superintelligent AI.
ICYMI: Mark Reddish wrote a blog post in support of the National AI Research Resource (NAIRR).
From the archives… Just under a year ago, Vox’s Kelsey Piper joined the CAIP Podcast to discuss OpenAI’s past use of nondisclosure and nondisparagement agreements in its exit documents. Tune in or read a transcript here.
Quote of the Week
We’re not just designing interactions for humans. The cows are our users, too.
—Jan Jacobs, the human-robot interaction design lead at Lely, a dairy farm robotics company
This edition was authored by Jakub Kraus.
If you have feedback to share, a story to suggest, or wish to share music recommendations, please drop me a note at jakub@aipolicy.us.
—Jakub