Welcome to AI Policy Weekly, a newsletter from the Center for AI Policy. Each issue explores three important developments in AI, curated specifically for U.S. AI policy professionals.
DeepSeek Goes Viral
DeepSeek, a Chinese AI company, has dominated headlines since releasing its reasoning-focused model, R1. The New York Times alone has published more than ten articles about DeepSeek this week.
What’s going on?
First, DeepSeek’s chatbot app surged in popularity on the iPhone App Store, ranking #1 among free apps. (App Store rankings depend on a variety of factors such as download velocity and positive reviews.)
Additionally, DeepSeek buzz appears to have triggered a 17% drop in NVIDIA stock on Monday, erasing nearly $600 billion from the company’s market capitalization. According to Forbes Senior Reporter Derek Saul, this was “by far the single greatest one-day value wipeout of any company in history.”
In benchmark testing, DeepSeek’s R1 model achieved notable scores: 49.2% on real-world software engineering tasks (SWE-bench Verified), 79.8% on competition mathematics problems (AIME 2024), and 71.5% on PhD-level science questions (GPQA Diamond).
These scores are comparable to those of leading U.S. AI models like OpenAI’s o1 and Google’s Gemini-2.0-Flash-Thinking, though they fall well short of the scores reported for OpenAI’s forthcoming o3.
What’s arguably most impressive about R1 is the computational efficiency with which it was trained.
To be specific, R1 is a specialized version of V3, a base model that DeepSeek built with its own computing cluster of 2,048 NVIDIA H800 GPUs. According to analysis from Epoch AI, V3’s training involved just under 5 * 10^24 computing operations. That’s almost 10 times less compute than OpenAI used to build GPT-4o.
The total computing time DeepSeek used for V3 would’ve cost approximately $5.6 million to rent from a cloud provider. For comparison, Anthropic CEO Dario Amodei says his company’s cutting-edge Claude 3.5 Sonnet model “cost a few $10M’s to train” and “remains notably ahead [of V3] in many internal and external evals.”
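The $5.6 million figure can be reproduced with simple arithmetic. The sketch below assumes the inputs reported in DeepSeek’s V3 technical report, which are not stated in this newsletter: roughly 2.788 million H800 GPU-hours of training and an assumed rental price of $2 per GPU-hour. The variable names are illustrative, and real cloud prices vary.

```python
# Back-of-the-envelope estimate of V3's training rental cost.
# Assumptions (taken from DeepSeek's V3 technical report, not this newsletter):
GPU_HOURS = 2_788_000        # total H800 GPU-hours used for training
PRICE_PER_GPU_HOUR = 2.00    # assumed rental price in U.S. dollars
CLUSTER_GPUS = 2_048         # cluster size cited in the text

# Rental cost is GPU-hours multiplied by the hourly price per GPU.
rental_cost = GPU_HOURS * PRICE_PER_GPU_HOUR

# Equivalent wall-clock time if every GPU in the cluster ran continuously.
cluster_days = GPU_HOURS / CLUSTER_GPUS / 24

print(f"Estimated rental cost: ${rental_cost / 1e6:.2f} million")
print(f"Equivalent cluster time: {cluster_days:.0f} days")
```

Note that this estimate covers only the rented compute for the final training run. According to the V3 report, it excludes the cost of prior research and ablation experiments, so it should not be read as DeepSeek’s total spending.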
Despite DeepSeek’s accomplishments, U.S. AI companies maintain a crucial edge through their access to advanced AI chips—access that Chinese companies lack due to U.S. export restrictions.
“Money has never been the problem for us,” said DeepSeek CEO Liang Wenfeng in an interview last year. “Bans on shipments of advanced chips are the problem.”

OpenAI Launches Its First AI Agent
OpenAI recently released Operator, an AI agent that performs web-based tasks by processing screen pixels and controlling a virtual mouse and keyboard. The system marks OpenAI’s first deployment of an autonomous AI that can take real-world actions on behalf of users.
In live demonstrations, Operator successfully booked restaurant reservations through OpenTable, ordered groceries on Instacart, searched for tennis court availability, arranged house cleaning services, and attempted to purchase sports tickets on StubHub—even running multiple tasks simultaneously. Users can take control of the remote browser at any time or let Operator work independently.
Operator is currently available to $200/month ChatGPT Pro subscribers in the United States.
In benchmark testing, the system achieved a 38.1% success rate on OSWorld, a test of AI models’ ability to control a computer. This is over sixteen percentage points higher than the previous record of 22.0% set by Claude 3.5 Sonnet in late 2024.
OpenAI has implemented several safety measures. The most notable guardrail is that Operator frequently checks in with the user for approval before taking consequential actions.
Early reviews indicate Operator is slow and prone to mistakes, sometimes taking longer than doing the task yourself. But AI agents will only improve from here, and improvements tend to happen quickly in AI.
Operator represents the dawn of agents, foreshadowing a future where AI systems can operate without us.

Trump Signs AI Executive Order, Calls For Action Plan
President Trump signed an AI executive order last Thursday, establishing it as “the policy of the United States to sustain and enhance America’s global AI dominance in order to promote human flourishing, economic competitiveness, and national security.”
By July 22nd, a range of senior White House policy officials will develop and submit a comprehensive action plan to achieve this policy.
Meanwhile, federal agencies must immediately review and “suspend, revise, or rescind” any actions taken under Biden’s flagship AI executive order that conflict with the policy.
Similarly, the Office of Management and Budget will revise its AI governance memoranda M-24-10 and M-24-18, which currently regulate federal AI usage and procurement.
This is the first AI executive order of Trump’s second term. It is unlikely to be the last.

News at CAIP
Jason Green-Lowe published a memo with five AI questions for Commerce secretary nominee Howard Lutnick.
Jakub Kraus wrote a blog post on the flurry of AI policy actions from the U.S. executive branch during Biden’s final week and Trump’s first week in office.
Jason Green-Lowe wrote a blog post on the $500 billion OpenAI infrastructure plan: “What’s at the Other End of Stargate?”
POLITICO quoted Jason Green-Lowe in the Morning Tech newsletter’s discussion of DeepSeek.
ICYMI: CAIP’s 2025 AI Action Plan emphasizes whistleblower protections, safety planning, and cybersecurity.
Quote of the Week
I’ve just come to realize AI is smarter than I am. Has better ideas, has more efficient ways to execute them. This is an existential moment, akin to what Kasparov felt in 1997 when he realized Deep Blue was going to beat him at chess.
—Paul Schrader, the American filmmaker whose psychological character studies, from Taxi Driver to First Reformed, have shaped modern cinema
This edition was authored by Jakub Kraus.
If you have feedback to share, a story to suggest, or music recommendations to pass along, please drop me a note at jakub@aipolicy.us.
—Jakub