Welcome to 3-Shot Learning, a weekly newsletter from the Center for AI Policy. Each issue explores three important developments in AI, curated specifically for AI policy professionals.
Congress Passes AI Policies Within the FY2024 NDAA
Last Thursday, Congress passed the National Defense Authorization Act (NDAA) for FY2024. Among the bill’s AI-related provisions, several stand out for advancing AI safety. For instance, Section 1542 tasks the DOD with developing a “bug bounty program” for “foundational AI models,” which the bill defines much as the well-known Stanford report defines “foundation models”: in short, general-purpose AI models that learn from a wide variety of data, such as ChatGPT. Although the bill omits an explicit definition of “bug bounty program,” the term refers to an initiative that incentivizes the detection and reporting of vulnerabilities in AI systems.
This bug bounty program may sound familiar, since Majority Leader Schumer’s bipartisan AI quartet introduced the idea in a bill earlier this year. Notably, that bill’s “vulnerability analysis study” has also made it into this year’s NDAA, as Section 1545. Specifically, the DOD will have one year to conduct a study covering topics that are common in AI safety. For example, the study will assess the R&D efforts needed to support interpretability and explainability of AI systems in the military, and identify any research gaps that the DOD could fund. Additionally, the study will examine how “increased agency” in AI systems, as well as interactions between multiple AI systems, may affect the ability to assess “new, complex, and emergent behavior.”
Another safety-relevant segment addresses “ethical and responsible AI.” In particular, the DOD will develop a test, evaluation, verification, and validation (TEVV) framework that “operationalizes responsible AI principles.” Furthermore, the DOD will implement a process for assessing compliance with the TEVV framework, including a protocol for discontinuing the use of AI systems that repeatedly fail the compliance assessment.
New DOE Office and AI Capability Evaluations
The DOE recently announced a new Office of Critical and Emerging Technology, intended to “accelerate progress” in critical technology areas like AI and to work towards solving “critical science, energy, and security challenges.” The leader of the new office is Helena Fu, who has also been named the DOE’s Chief AI Officer. The White House AI Executive Order tasked the DOE with establishing this office, as well as with testing AI capabilities that pose “nuclear, nonproliferation, biological, chemical, critical infrastructure, and energy-security threats.” Indeed, in a House E&C Committee hearing on Wednesday, Fu described how the DOE is assessing AI capabilities related to “some of the more existential catastrophic risks.”
OpenAI Prepares for Catastrophic AI Risks
On Monday, OpenAI released an early draft of its Preparedness Framework, which describes the organization’s plan to guard against AI threats that “could result in hundreds of billions of dollars in economic damage or lead to the severe harm or death of many individuals.” The framework involves regular testing for high-risk capabilities during the development of an advanced AI system. If safety mitigations are insufficient to control critical risks such as cyberattacks, bioweapon creation, and autonomous self-improvement, then OpenAI will pause development and “focus [their] efforts as a company towards solving these safety challenges.”
News at CAIP
We released a statement on the FY2024 NDAA.
Thomas Larsen discussed technical AI safety in an interview with The AGI Show.
Quote of the Week
I urge the global community of nations to work together in order to adopt a binding international treaty that regulates the development and use of artificial intelligence in its many forms.
—Pope Francis in his annual message for the World Day of Peace