Welcome to AI Policy Weekly, a newsletter from the Center for AI Policy. Each issue explores three important developments in AI, curated specifically for US AI policy professionals.
UK AI Safety Institute Leads the Way Towards Safer AI
In its first progress report in September 2023, the three-month-old UK AI Safety Institute (AISI) described its expert advisory board, successful hiring efforts, partnerships with AI safety organizations, and ambitious plans for the future. “We have an abundance of will to transform state capacity at the frontier of AI Safety,” boasted the growing government team.
That ambition has borne fruit. Six months after hosting the world’s first AI Safety Summit in November 2023, the UK AISI made a slew of announcements before and during this week’s follow-up AI Seoul Summit.
First, the UK AISI will invest up to £8.5 million ($10.8 million) to support research on systemic AI safety, which “aims to understand how to mitigate the impacts of AI at a societal level and study how our institutions, systems and infrastructure can adapt to the transformations this technology has brought about.”
The UK AISI also intends to collaborate with AI safety institutes in Canada and the United States on this research, according to a separate announcement on a UK-Canada science of AI safety partnership.
Beyond making grants to other organizations, the UK AISI plans to grow its own in-house research capacity, which already features over 30 technical experts. To more effectively accomplish this, AISI announced plans to establish an international office in Silicon Valley.
Meanwhile, the UK AISI’s technical team released research detailing results from risk evaluations of five leading AI systems from major AI companies. AISI found that several models “demonstrated expert-level knowledge of chemistry and biology” and that two models could autonomously complete “simple software engineering problems.” Additionally, the team open-sourced its software framework for conducting AI evaluations.
Beyond spearheading its own projects, the UK also helped lead the AI Seoul Summit, where public and private stakeholders made critical collective strides towards AI safety:
All the major AI companies signed the Frontier AI Safety Commitments, promising to identify specific thresholds beyond which an AI model would pose intolerable risks and to formulate explicit company processes for handling such scenarios.
Over 25 countries signed the Seoul Ministerial Statement outlining shared views and intentions for AI’s future. For instance, the countries acknowledged the potential dangers of AI systems acting autonomously to evade human oversight.
A smaller group of US allies signed the Seoul Declaration and accompanying Statement of Intent Toward International Cooperation on AI Safety, which included support for the growing global network of AI safety institutes.
The event’s activities were informed by thorough research from a major international collaboration among leading AI experts. Their “International Scientific Report on the Safety of Advanced AI: Interim Report” followed through on goals established at the November AI Safety Summit.
In summary, the UK AISI has made commendable progress. Many of these critical outcomes simply would not have happened without AISI’s support. This demonstrates the exciting potential of an efficient team acting as “a startup inside the government.”
For its part, the US AI Safety Institute announced its vision, mission, and strategic goals ahead of the Seoul Summit.
But so far, the US AISI has received less than 8% of the funding of its UK counterpart. If the US wants to lead the world in AI safety, then the US AI Safety Institute will need more funding.
So Much for Safety: OpenAI’s Irresponsible Week
Less than a year ago, OpenAI formed its Superalignment team, which aimed to make “scientific and technical breakthroughs to steer and control AI systems much smarter than us.” The team had ambitious plans to “solve this problem within four years.”
OpenAI committed to backing the effort with 20% of its existing computing resources, which may have been comparable in power to the entire National AI Research Resource in the US.
Now, less than a year later, OpenAI has reportedly disbanded the entire team. Further, the company failed to follow through on its computing resource commitments.
In fact, the problems were deeper than that. Jan Leike, the leader of the Superalignment team, departed OpenAI last Wednesday and initially posted a cryptic message on X that simply said “I resigned.” This exit, along with the departure of OpenAI co-founder and Chief Scientist Ilya Sutskever, added to the growing list of AI safety researchers leaving OpenAI.
Two days later, Leike elaborated on his reasons for leaving. He said his safety team had been “sailing against the wind” and “struggling for compute” over the past few months.
Leike expressed concern that OpenAI is “not on a trajectory” to adequately prepare for safety of upcoming AI models. He explained that “over the past years, safety culture and processes have taken a backseat to shiny products.”
Leike was one of the few former employees to openly criticize the company. It turned out this was not a coincidence. For years, OpenAI forced departing employees to sign unexpectedly strict non-disclosure and non-disparagement agreements. Employees who did not sign could lose all their vested equity in the company, which often made up the majority of their compensation.
Another departing OpenAI employee, Daniel Kokotajlo, stated publicly that he gave up 85% of his family’s net worth after refusing to sign the clauses.
OpenAI quickly backtracked on these policies after receiving backlash, although it remains unclear whether former employees will be fully free to criticize the company.
CEO Sam Altman claimed he did not know about the practice. But he also signed several documents that make it hard to see how he could have been out of the loop, adding fuel to the mistrust that some people felt towards the CEO after OpenAI’s November board drama.
If these events weren’t enough, OpenAI separately drew criticism from Scarlett Johansson, who said the company unsuccessfully approached her twice about using her voice in its AI chatbot, and then used a suspiciously similar voice without her permission.
Incidents like this demonstrate that OpenAI employees need to be able to disclose irresponsible practices without risking the loss of millions of dollars. That’s why Congress should protect whistleblowers with policies like the Center for AI Policy’s model legislation.
EU AI Act Passed (For Real This Time)
Across the Atlantic, the Council of the European Union approved the AI Act.
The Act has now completed the full legislative process, aside from a few signatures. It will appear in the EU’s Official Journal in the coming days and will enter into force twenty days after that. However, many of its provisions will not take effect for another two years.
The Council’s approval was unsurprising. The Act had already cleared its most challenging milestone in December, when the Council reached a provisional agreement with the European Parliament.
In the following months, the Council’s Committee of Permanent Representatives (COREPER) confirmed the Act, followed by confirmation from the Parliament’s Internal Market and Civil Liberties Committees. The full Parliament then adopted the Act in March, setting the stage for final approval from the Council.
Importantly, the EU AI Act includes regulations that target advanced general-purpose AI models trained with more than 10^25 floating-point operations (FLOPs) of compute.
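For a rough sense of what that threshold means in practice, here is an illustrative Python sketch using the common approximation that training a transformer model costs roughly six floating-point operations per parameter per training token. The model size and token count below are hypothetical assumptions for illustration, not figures drawn from the Act or any real system.

```python
# Illustrative back-of-envelope estimate of training compute.
# Assumption: the widely used approximation that transformer training
# costs roughly 6 FLOPs per parameter per training token.

EU_THRESHOLD_FLOPS = 1e25  # the AI Act's systemic-risk compute threshold


def estimate_training_flops(num_parameters: float, num_tokens: float) -> float:
    """Approximate total training compute as 6 * parameters * tokens."""
    return 6 * num_parameters * num_tokens


# Hypothetical example: a 500-billion-parameter model trained on 10 trillion tokens.
flops = estimate_training_flops(5e11, 1e13)
print(f"Estimated training compute: {flops:.1e} FLOPs")
print(f"Exceeds the EU AI Act threshold? {flops > EU_THRESHOLD_FLOPS}")
```

Under this approximation, publicly available estimates suggest that only the largest frontier models cross the 10^25 line, which is why the threshold serves as a proxy for the most capable general-purpose systems.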
The providers of such models must experimentally evaluate their systems to identify and mitigate systemic risks. They also need to ensure cybersecurity protections and report serious incidents. The Center for AI Policy is pleased to see these requirements outlined in Article 55.
News at CAIP
Save the date: we’re hosting an AI policy happy hour next Thursday, May 30th, from 5:30–7:30pm at Sonoma Restaurant & Wine Bar in DC.
We issued a press release responding to the Bipartisan Senate AI Working Group’s policy report: “The Senate’s AI Roadmap to Nowhere.”
CAIP Executive Director Jason Green-Lowe wrote an opinion piece on our blog: “OpenAI Safety Team's Departure is a Fire Alarm.”
We issued a press statement on the Council of the European Union’s approval of the EU AI Act.
We’re hiring for two roles: an External Affairs Director and a Government Relations Director.
Quote of the Week
One thing is clear: our generation’s collective safety now hangs in the balance. But no one actor can prepare or protect us. AI development and its risks are increasingly borderless, and so too is the responsibility of AI oversight.
—the AI 2030 platform from Encode Justice, the world’s largest youth movement for safe and equitable AI
This edition was authored by Jakub Kraus.
If you have feedback to share, a story to suggest, or music recommendations to pass along, please drop me a note at jakub@aipolicy.us.
—Jakub