After posing for a photo opportunity with then-Prime Minister Rishi Sunak in his private office, the men filed through into the cabinet room next door and took seats at its long, rectangular table. Sunak and U.K. government officials lined up on one side; the three CEOs and some of their advisers sat facing them. After a polite discussion about how AI could bring opportunities for the U.K. economy, Sunak surprised the visitors by saying he wanted to talk about the risks. The Prime Minister wanted to know more about why the CEOs had signed what he saw as a worrying declaration arguing that AI was as risky as pandemics or nuclear war, according to two people with knowledge of the meeting. He invited them to attend the world's first AI Safety Summit, which the U.K. was planning to host that November. And he managed to get each to agree to grant his government prerelease access to their companies' latest AI models, so that a task force of British officials, established a month earlier and modeled on the country's COVID-19 vaccine unit, could test them for dangers.
The U.K. was the first country in the world to reach this kind of agreement with the so-called frontier AI labs—the few groups responsible for the world's most capable models. Six months later, Sunak formalized his task force as an official body called the AI Safety Institute (AISI), which in the year since has become the most advanced program inside any government for evaluating the risks of AI. With £100 million ($127 million) in public funding, the body has around 10 times the budget of the U.S. government's own AI Safety Institute, which was established at the same time.
Inside the new U.K. AISI, teams of AI researchers and national-security officials began conducting tests to check whether new AIs were capable of facilitating biological, chemical, or cyber-attacks, or escaping the control of their creators. Until then, such safety testing had been possible only inside the very AI companies that also had a market incentive to forge ahead regardless of what the tests found. In setting up the institute, government insiders argued that it was crucial for democratic nations to have the technical capabilities to audit and understand cutting-edge AI systems, if they wanted to have any hope of influencing pivotal decisions about the technology in the future. “You really want a public-interest body that is genuinely representing people to be making those decisions,” says Jade Leung, the AISI’s chief technology officer. “There aren’t really legitimate sources of those [decisions], aside from governments.”
In a remarkably short time, the AISI has won the respect of the AI industry by managing to carry out world-class AI safety testing within a government. It has poached big-name researchers from OpenAI and Google DeepMind. So far, they and their colleagues have tested 16 models, including at least three frontier models ahead of their public launches. One of them, which has not previously been reported, was Google’s Gemini Ultra model, according to three people with knowledge of the matter. This prerelease test found no significant previously unknown risks, two of those people said. The institute also tested OpenAI’s o1 model and Anthropic’s Claude 3.5 Sonnet model ahead of their releases, both companies said in documentation accompanying each launch.
In May, the AISI launched an open-source tool for testing the capabilities of AI systems, which has become popular among businesses and other governments attempting to assess AI risks.
But despite these accolades, the AISI has not yet proved whether it can leverage its testing to actually make AI systems safer. It often does not publicly disclose the results of its evaluations, nor information about whether AI companies have acted upon what it has found, for what it says are security and intellectual-property reasons. The U.K., where it is housed, has an AI economy that was worth £5.8 billion ($7.3 billion) in 2023, but the government has minimal jurisdiction over the world’s most powerful AI companies. (While Google DeepMind is headquartered in London, it remains a part of the U.S.-based tech giant.) The British government, now controlled by Keir Starmer’s Labour Party, is incentivized not to antagonize the heads of these companies too much, because they have the power to grow or withdraw a local industry that leaders hope will become an even bigger contributor to the U.K.’s struggling economy. So a key question remains: Can the fledgling AI Safety Institute really hold billion-dollar tech giants accountable?
In the U.S., the extraordinary wealth and power of the tech industry have so far deflected meaningful regulation. The U.K. AISI’s lesser-funded U.S. counterpart, housed in moldy offices in Maryland and Colorado, does not look set to be an exception. But that might soon change. In August, the U.S. AISI signed agreements to gain predeployment access to AI models from OpenAI and Anthropic. And in October, the Biden Administration released a sweeping national-security memorandum tasking the U.S. AISI with safety-testing new frontier models and collaborating with the NSA on classified evaluations.
While the U.K. and U.S. AISIs are currently partners, and have already carried out joint evaluations of AI models, the U.S. institute may be better positioned to take the lead by securing unilateral access to the world's most powerful AI models should it come to that. But Donald Trump's electoral victory has made the future of the U.S. AISI uncertain. Many Republicans are hostile to government regulation—and especially to bodies like the federally funded U.S. AISI that may be seen as placing obstacles in front of economic growth. Billionaire Elon Musk, who helped bankroll Trump's re-election, and who has his own AI company called xAI, is set to co-lead a body tasked with slashing federal spending. Yet Musk himself has long expressed concern about the risks from advanced AI, and many rank-and-file Republicans are supportive of more national-security-focused AI regulations. Amid this uncertainty, the unique selling point of the U.K. AISI might simply be its stability—a place where researchers can make progress on AI safety away from the conflicts of interest they'd face in industry, and away from the political uncertainty of a Trumpian Washington.
ON A WARM JUNE MORNING about three weeks after the big meeting at 10 Downing Street, Prime Minister Sunak stepped up to a lectern at a tech conference in London to give a keynote address. “The very pioneers of AI are warning us about the ways these technologies could undermine our values and freedoms, through to the most extreme risks of all,” he told the crowd. “And that’s why leading on AI also means leading on AI safety.” Explaining to the gathered tech industry that his was a government that “gets it,” he announced the deal that he had struck weeks earlier with the CEOs of the leading labs. “I’m pleased to announce they’ve committed to give early or priority access to models for research and safety purposes,” he said.
Behind the scenes, a small team inside Downing Street was still trying to work out exactly what that agreement meant. The wording itself had been negotiated with the labs, but the technical details had not, and “early or priority access” was a vague commitment. Would the U.K. be able to obtain the so-called weights—essentially the underlying neural network—of these cutting-edge AI models, which would allow a deeper form of interrogation than simply chatting with the model via text? Would the models be transferred to government hardware that was secure enough to test for their knowledge of classified information, like nuclear secrets or details of dangerous bioweapons? Or would this “access” simply be a link to a model hosted on private computers, thus allowing the maker of the model to snoop on the government’s evaluations? Nobody yet knew the answers to these questions.
In the weeks after the announcement, the relationship between the U.K. and the AI labs grew strained. In negotiations, the government had asked for full-blown access to model weights—a total handover of their most valuable intellectual property that the labs saw as a complete nonstarter. Giving one government access to model weights would open the door to doing the same for many others—democratic or not. For companies that had spent millions of dollars on hardening their own cybersecurity to prevent their models’ being exfiltrated by hostile actors, it was a hard sell. It quickly became clear that the type of testing the U.K. government wanted to do would be possibl...