Centre for Trustworthy Technology
Artificial Intelligence (AI) is shaping the future of industries, economies, and societies at an unprecedented pace. Its transformative power offers boundless opportunities for innovation and growth. Yet, as this frontier technology continues to evolve, so does the need for a balanced approach—one that harnesses its potential while addressing the risks inherent to its deployment. Building trust is not merely a goal but a prerequisite for realizing AI’s promise responsibly.
In 2024, over half of Fortune 500 companies cited AI as a ‘risk factor’ in their most recent annual reports – a marked increase of more than 470% in just two years. The widespread growth and adoption of AI have brought the risk of misuse to the board level across industries. Trust in technology development and deployment is sustained by balancing risk mitigation against the pursuit of potential benefits. This balancing act is especially salient for frontier technologies like AI, which present complex challenges and tradeoffs in weighing both present and speculative risks against benefits.
Bridging the Gap: Toward a Unified Understanding of AI Risk
Risk mitigation is universally recognized as a top priority for stakeholders across industries. Mitigating risk means, first and foremost, measuring risk. The fast-moving landscape of AI has sparked rapid responses from industry, civil society, and governments alike to address and mitigate its potential risks. Risk definitions and tolerances are evolving, often in response to incidents or harm. As a result, industries and stakeholders have developed various frameworks, taxonomies, and implementation guidelines to navigate this complex landscape. These efforts indicate not only the demand for risk classification but also the challenge of creating a collective understanding of such a high-impact and wide-reaching issue.
The lack of a collective understanding of AI risk is a foundational challenge to the trustworthy development and deployment of the technology. This fragmentation can result in misguided implementation, confused policy discussions, and gaps that expose communities to potential harm. MIT Future Tech recently released a Risk Repository, which seeks to address this fragmentation by systematically reviewing 700+ cited risks across 43 existing frameworks. The repository classifies risks based on how, when, and why an AI risk occurs, and extracts the categories and subcategories of risks from the included papers and reports into a living database. These defined categories offer insight into key concerns and allow for further classification and analysis across topics.
Figure 1: Seven defined categories of risk
Source: MIT Future Tech, Comprehensive AI Risk Repository
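To make the repository's approach more concrete, the short sketch below models a single cited risk classified along both the causal dimensions (how, when, and why the risk occurs) and one of the domain categories shown in Figure 1. This is a minimal illustration only: the field names, category labels, and example values are assumptions for the sake of the sketch, not the repository's actual schema.

```python
from dataclasses import dataclass

# Illustrative sketch only: field names and values are assumptions,
# not the actual schema of the MIT AI Risk Repository.

@dataclass
class RiskEntry:
    source: str       # paper or framework the risk was extracted from
    description: str  # the cited risk, in the source's own words
    entity: str       # causal dimension: who causes the risk ("human" or "AI")
    intent: str       # causal dimension: "intentional" or "unintentional"
    timing: str       # causal dimension: "pre-deployment" or "post-deployment"
    domain: str       # one of the seven domain categories (e.g. "Privacy & security")
    subdomain: str    # finer-grained category within the domain

def count_by_domain(entries: list[RiskEntry]) -> dict[str, int]:
    """Tally how many cited risks fall into each domain category."""
    counts: dict[str, int] = {}
    for entry in entries:
        counts[entry.domain] = counts.get(entry.domain, 0) + 1
    return counts

example = RiskEntry(
    source="Hypothetical framework (2023)",
    description="Model outputs leak personally identifiable information.",
    entity="AI",
    intent="unintentional",
    timing="post-deployment",
    domain="Privacy & security",
    subdomain="Data leakage",
)
print(count_by_domain([example]))
```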
This repository establishes a common frame of reference by aggregating the varied discourse of policymakers, technical experts, business leaders, and others. Key insights reveal significant variations in the risks covered across different frameworks, with certain domains receiving more attention than others in popular discourse, as shown in Figure 2. By consolidating risk assessments, this repository highlights the importance of co-constructing a comprehensive understanding of AI risks through the participation of a wide range of stakeholders.
Figure 2: Insights Table from Domain Taxonomy of AI Risks
Source: MIT Future Tech, Comprehensive AI Risk Repository
Defining Risk Thresholds
Risk thresholds are a fundamental component of measuring and regulating risk in frontier technologies, specifically AI. In practice, risk thresholds are often based on the capabilities of certain models or systems, or on the level of computational power (‘compute’) needed to train a model. Establishing these thresholds is a foundational step to developing effective mitigation strategies and governance frameworks.
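As an illustration of how a compute-based threshold might be operationalized, the minimal sketch below compares a model's estimated training compute against a fixed cutoff. The 10^25 FLOP figure echoes the presumption threshold used in the EU AI Act for general-purpose models with systemic risk; the function and trigger logic are illustrative assumptions rather than any regulator's or developer's actual procedure.

```python
# Minimal sketch of a compute-based threshold check. The 1e25 FLOP cutoff
# mirrors the EU AI Act's presumption threshold for general-purpose models
# with systemic risk; the function and trigger logic are illustrative
# assumptions, not any regulator's or lab's actual process.

COMPUTE_THRESHOLD_FLOP = 1e25

def exceeds_compute_threshold(training_flop: float,
                              threshold: float = COMPUTE_THRESHOLD_FLOP) -> bool:
    """Return True if estimated training compute crosses the threshold,
    which would trigger additional evaluations and governance obligations."""
    return training_flop >= threshold

# Example: a model trained with an estimated 4e25 FLOP crosses the threshold.
print(exceeds_compute_threshold(4e25))  # True -> heightened obligations apply
print(exceeds_compute_threshold(9e24))  # False -> below the presumption cutoff
```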
Risk thresholds for AI are both technical and social, requiring vulnerability assessments of affected populations. This process demands a blend of perspectives and expertise – from technical to social – to ensure that systems, businesses, and people are protected. The concept of setting risk thresholds has become a key area of discussion in policy and technical communities alike, with a range of approaches emerging. A 2024 paper from the Centre for AI Governance highlights this key topic in AI discourse and examines how risk thresholds can inform, rather than determine, decisions in AI design and development.
Trustworthy Decision-Making
Researchers from the Centre for AI Governance outline how a multifaceted approach to risk evaluations can strengthen a more trustworthy model of high-stakes decision-making in AI development and deployment. Currently, risk thresholds for frontier AI can feed directly or indirectly into critical decision-making. Direct use of risk thresholds typically takes the form of a case-by-case, binary evaluation, i.e., whether to halt or to go ahead. Indirect use may take the form of risk thresholds informing the safety measures required in specific circumstances.
Understanding the relationship between risk thresholds, capability thresholds, and compute thresholds is essential to creating a robust risk management framework. These thresholds – all of which are currently defined independently by different frontier AI company policies[1] and various federal regulations – can be interrelated in ways that help to weigh potential hazards. For example, the paper outlines how risk models are one way that risk thresholds can be used to identify model capabilities that may cause large-scale harm, along with the safety measures needed to address them. In this example, risk thresholds inform the scope of a risk model, which in turn defines a capability threshold for the broader system.
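As a simplified illustration of these two modes, the sketch below compares an estimated probability of large-scale harm against a risk threshold, either directly (a binary go/no-go decision) or indirectly (selecting a tier of required safety measures). The numeric values, tier names, and mapping are hypothetical and are not drawn from the company policies cited above.

```python
# Hypothetical sketch of how a risk threshold can feed into decisions directly
# (binary go/no-go) or indirectly (selecting safety measures). The numbers and
# tier names are illustrative assumptions, not any company's actual policy.

RISK_THRESHOLD = 0.01  # maximum tolerated probability of a large-scale harm

def direct_decision(estimated_risk: float) -> str:
    """Direct use: a case-by-case, binary evaluation against the threshold."""
    return "halt development/deployment" if estimated_risk > RISK_THRESHOLD else "proceed"

def indirect_decision(estimated_risk: float) -> str:
    """Indirect use: the threshold anchors a tier of required safety measures."""
    if estimated_risk > RISK_THRESHOLD:
        return "tier 3 safeguards: restricted access, external audits, staged rollout"
    if estimated_risk > RISK_THRESHOLD / 10:
        return "tier 2 safeguards: enhanced evaluations and monitoring"
    return "tier 1 safeguards: standard pre-deployment testing"

# A risk model could then translate the risk threshold into a capability threshold,
# e.g. "any capability estimated to push risk above 1% requires tier 3 safeguards".
print(direct_decision(0.02))     # -> halt development/deployment
print(indirect_decision(0.002))  # -> tier 2 safeguards
```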
Looking Forward: A Multi-Stakeholder Approach to Risk
Given the complexity of defining risk, the disparate practices that exist today, and the high stakes of AI decisions, determining the best way forward is an urgent priority for all stakeholders. Managing risk currently looks different for different stakeholders: AI companies are focused on improving risk management practices, while regulators conduct comparative studies and cost-benefit analyses to shape effective policy. Yet all require the mutual advancement of risk estimation methodology to carry out their respective functions.
This fall, the OECD held a public consultation on establishing risk thresholds for frontier AI, signaling a crucial step toward global alignment. There is potential to lower the barrier to entry for defining risk thresholds and to foster multi-stakeholder collaboration. This requires establishing not only a common lexicon but also a common threshold for the level of risk that is appropriate for our society.
Risk thresholds are more than technical measures; they are also societal measures that reveal both our collective values and the barriers to trust in technology. Establishing risk thresholds collectively is essential to ensuring that advancements in frontier technologies are guided by a universal commitment to safety and accountability. Multi-stakeholder participation is critical to prioritizing trust in this process, establishing a foundation where innovation serves humanity without compromising shared principles.
As the world navigates the transformative potential of AI alongside its inherent complexities, the responsibility to foster innovation guided by trust, accountability, and a commitment to the common good becomes paramount. A multi-stakeholder approach, underpinned by alignment on common frameworks, offers a pathway to ensuring AI serves as a tool for progress—one that respects societal values, safeguards communities, and delivers equitable benefits. The decisions made today will not only shape the trajectory of this powerful technology but also the legacy of an era defined by the pursuit of a trustworthy, human-centered future.
[1] Anthropic’s Responsible Scaling Policy, OpenAI’s Preparedness Framework, and Google DeepMind’s Frontier Safety Framework