The AI Safety Conundrum: Why Everyone Defects from Safety Measures
In the rapidly evolving world of Artificial Intelligence (AI), a game of defection is underway, with each player vying for market domination. Even if all companies cooperated, academia, open source communities, nation-states, and individuals would continue research and development, ensuring that someone always defects.
The stakes are high, as the final defection could lead to an existential event. This underscores the urgency of restructuring the game before it is too late. The path to safety, however, is fraught with challenges.
Countries face a regulatory arbitrage dilemma: strict regulation risks economic disadvantage, while loose regulation risks safety failures, producing a race to the bottom on safety standards. Regulation, moreover, moves too slowly and carries too little force to keep pace with AI development, leaving it ineffective at ensuring safety.
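The race-to-the-bottom dynamic can be sketched in a few lines of Python. All the numbers here are illustrative assumptions: each country repeatedly matches the laxest rival's regulatory stringency and then undercuts it slightly to capture an economic edge.

```python
# Illustrative "race to the bottom" in regulatory stringency.
# Values are assumptions; only the downward dynamic matters.

def race_to_the_bottom(levels, undercut=0.1, rounds=20):
    """Each round, every country matches the laxest level, minus an undercut."""
    levels = list(levels)
    for _ in range(rounds):
        floor = min(levels)
        # Undercut the laxest rival; stringency cannot go below zero.
        levels = [max(0.0, floor - undercut) for _ in levels]
    return levels

start = [0.9, 0.7, 0.8]  # initial regulatory stringency per country (0..1)
print(race_to_the_bottom(start))  # converges to [0.0, 0.0, 0.0]
```

No single country can hold its level: the first undercut lowers the floor for everyone, so the only stable outcome is minimal regulation all around.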
Technical solutions, such as alignment research, interpretability, capability control, compute governance, and transparency, are important but cannot by themselves solve what is fundamentally a game-theory problem. The iterated game of AI development makes cooperation difficult because the industry is winner-take-all.
OpenAI, initially a non-profit focused on safe AI, transitioned into a for-profit subsidiary. DeepMind, which likewise began with a strong AI-safety focus, has shifted toward more business-oriented objectives in recent years under competitive pressure. Every AI company that started with safety-first principles has ultimately succumbed to these pressures.
The payoff matrix for AI companies is stark: defection (a speed-first strategy) promises market domination, massive valuations, and regulatory capture, while cooperation (a safety-first strategy) means slower progress, higher development costs, regulatory compliance burdens, and limited market share, with long-term survival as its only reward.
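This payoff structure can be written down as a standard two-player matrix. The numeric payoffs below are assumptions chosen only to preserve the ordering the text describes; what matters is that a speed-first strategy is the best reply no matter what the other player does.

```python
# Illustrative two-player payoff matrix for the AI race.
# Payoff values are assumed; only their relative ordering matters.
PAYOFFS = {
    # (row_strategy, col_strategy): (row_payoff, col_payoff)
    ("safety", "safety"): (3, 3),  # coordinated, careful progress
    ("safety", "speed"):  (0, 5),  # the cautious player loses the market
    ("speed",  "safety"): (5, 0),  # the defector dominates the market
    ("speed",  "speed"):  (1, 1),  # arms race: fast but risky for both
}

def best_response(opponent_strategy: str) -> str:
    """Return the row player's payoff-maximizing reply to a fixed opponent."""
    return max(
        ("safety", "speed"),
        key=lambda s: PAYOFFS[(s, opponent_strategy)][0],
    )

# Speed is a dominant strategy: it is the best reply to either choice.
print(best_response("safety"))  # speed
print(best_response("speed"))   # speed
```

Because "speed" dominates for both players, mutual defection is the predicted outcome even though mutual cooperation would leave both better off (3, 3) than the arms race (1, 1).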
Open source functions as a nuclear option: it eliminates competitors' advantages and makes safety controls effectively impossible to enforce. Meta, for instance, open-sources its models to destabilize competitors and preserve its platform power. Even safety researchers defect, joining speed-focused companies and feeding the arms race.
The US-China AI dilemma involves national security implications, economic dominance, and military applications, so each side perceives defection as necessary for national survival. International cooperation is further hampered by the lack of trust between countries, particularly between the US and China.
The unilateral disarmament problem means that if one company prioritizes safety, its competitors gain an insurmountable lead; the safety-focused company becomes irrelevant, investors withdraw, and it eventually dies. Self-regulation is equally ineffective: there are no enforcement mechanisms, no verification, and no aligned incentives.
Public markets punish caution: activist investors demand acceleration, and CEOs who resist are replaced. Top AI researchers are offered $5-10M packages to join speed-focused companies, draining talent from safety-focused labs. Even Google, previously cautious and research-focused, accelerated its AI releases after ChatGPT.
Safety advocates face an impossible position: they are ignored, discredited, or dismissed as alarmist, regardless of how accurate their warnings prove. Venture capitalists face their own prisoner's dilemma: funding safety means lower returns and limited partners withdrawing, while funding speed means higher returns but existential risk. Individual rationality thus produces collective irrationality across the industry.
In the AI version of the prisoner's dilemma, the unique Nash equilibrium is mutual defection: faster but potentially dangerous progress. With many players in AI development, coordination becomes nearly impossible, a single defector can break cooperation, and no enforcement mechanisms exist.
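The multi-player iterated version can be sketched with a simple simulation. It assumes a "grim trigger" convention (everyone cooperates until the first observed defection, after which cooperation collapses for good) and an illustrative per-player defection probability; the point is that with many players, even a tiny defection rate makes collapse near-certain.

```python
import random

def simulate(n_players: int, rounds: int, defect_prob: float, seed: int = 0):
    """Return the round at which cooperation collapses, or None if it holds.

    Assumes grim-trigger dynamics: a single observed defection ends
    cooperation for all players permanently.
    """
    rng = random.Random(seed)
    for t in range(rounds):
        # Each round, every player independently risks defecting.
        if any(rng.random() < defect_prob for _ in range(n_players)):
            return t  # one defector breaks cooperation for everyone
    return None

# Per-round collapse probability is 1 - (1 - p)^n, so with n = 50 players
# and p = 0.01, each round collapses with probability ~0.4: cooperation
# among many players is fragile even when each player is almost reliable.
print("cooperation collapsed at round:", simulate(50, 1000, 0.01))
```

This is why "one defector can break cooperation" bites so hard: the fragility grows with the number of players, and nothing in the game restores cooperation once it is lost.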
The question isn't why everyone defects, but whether we can restructure the game before the final defection makes the question moot. The race to AI dominance is on, but the consequences could be dire unless we can find a way to ensure safety while still driving progress.