Centre for Trustworthy Technology

Chapter Notes

The Industry of Ideas with Dr. Julia Lane.

From Economics and Statistics to Public Service (4:55)

Vision for Democratizing Data (6:58)

Data for Better Public Policy Decisions (10:07)

Introducing the Industry of Ideas (16:22)

Measuring the Impact of AI on Our Workforce (22:45)

A Proposal for The Centre for Data and Evidence (27:02)

Three Challenges of Data Reliance (30:16)

Looking Forward (39:08)


Trustworthy Tech Dialogues Transcript

Satwik Mishra

Welcome everyone to another exciting edition of Trustworthy Tech Dialogues. I'm Satwik Mishra, Executive Director of the Centre for Trustworthy Technology. In this episode, we dive into the power of designing data architectures that have the potential to shape a future that is both trustworthy and innovative. Today, industry has emphatically embraced the transformative power of data to drive emerging technologies and unlock their full potential.

But is the public sector keeping pace? Governments are pouring resources into AI and other emerging technologies, aiming to transform their economies. However, there is a challenge in effectively monitoring and measuring that spending to truly understand its impact. A paper released by the IMF this month offers a stark analysis: in the 1980s, total U.S. R&D investment was 2.2% of GDP.

Today, it stands at 3.4% of GDP. Conventional economic models suggest that this rise in R&D spending should have spurred faster economic growth. But contrary to expectations, the economy has experienced a slowdown instead of the anticipated growth. Between 1960 and 1985, productivity growth averaged 1.3% annually. In the following 35 years, productivity growth consistently fell below this historical average.

That brings forth the question: how can we ensure that these investments are smart, strategic and backed by data and evidence to truly make a difference? Where is the public sector excelling? Where does it need to improve, and what changes are necessary to stay ahead? Today, we dive into the transformative potential of building data architectures that can ignite innovation, inspire fresh ideas and power economic growth.

We'll also analyze the impact of artificial intelligence on the economy and the future of work. This remains a hotly debated issue, with predictions often pulling in opposite directions. Some argue that a productivity boost is on the anvil; others warn that it may be hype. Some say we are headed for a jobless future, while others argue that AI will surely elevate the quality of work.

Increasingly, it seems that when it comes to AI and the future of work, the only thing we can truly count on is uncertainty.

That raises another question: are we capturing the right data to truly understand what's happening in our economy? If yes, what does this data actually indicate? And if not, what do we need to do to get the answers that matter?

To discuss all of this and more, I'm delighted to be joined by Dr. Julia Lane. She is a renowned economist and statistician who has brought her quantitative skills to public service. Dr. Lane is a tenured professor at New York University's Wagner Graduate School of Public Service. She recently served on the Advisory Committee on Data for Evidence Building and the National AI Research Resource Task Force.

She currently serves on the Secretary of Labor's Workforce Innovation Advisory Committee and the National Science Foundation's Advisory Committee on Cyberinfrastructure. She's also the author of a fascinating book, "Democratizing Our Data: A Manifesto", which I highly recommend to everyone. In her extensive public sector experience, she has developed innovative and practical approaches to designing varied data infrastructures. Let's dive into the potential of building optimal data architectures to help us navigate today's uncertainties, enable smarter decision making and, as importantly, advance societal trust.

Dr. Julia, welcome.

Dr Julia Lane

Thank you so much. I’m thrilled to be here.

Satwik Mishra

Walk us through your journey from a quantitative background in economics and statistics to the world of public service. How did you get here?

Dr Julia Lane

I was really interested in answering, actually, partly the same questions that we still have: what is the impact of investment in job training on economic productivity, the earnings of workers, and the survival and growth of firms? This was 30 years ago. You can't answer those questions with survey data, because most workers work for big firms, but most firms are small. And you can't get information about workers over time and firms over time. So the only way to do it was to develop new sources of data.

So I thought, okay, let's go ahead and try to do that. That turned into a long journey. I created the LEHD program, the Longitudinal Employer-Household Dynamics Program, at the Census Bureau, and the integrated data infrastructure in New Zealand; as you can tell from my accent, I'm a New Zealander.

And what I found was that this foundational investment in data is way more valuable than writing a paper that maybe half a dozen people read, or even writing a book that thousands of people read. That was really what hooked me into investing in data: there is a massive gap, for firms, for businesses, for government. It's a public good, and you can make a huge difference by investing in it. That's how I got into it, just by chance.

Satwik Mishra

I do want to dive into all the current affairs that are happening, and I would love to know your views on them, but I've spent the last two weeks reading your fascinating book, and I don't want to get into current affairs before getting you to speak about the book and its key ideas.

Give us a snapshot of the book's recommendations (I wish we could do an entire podcast on the book itself). What is its vision for democratizing data?

Dr Julia Lane

The book really came out of a sense of enormous potential. When I first started the LEHD program at the Census Bureau, which was in the mid-1990s, you could see the world-changing potential of data, of moving from surveys to new types of data.

I went from Census to the National Science Foundation, where I could see what the computer scientists were doing in security and access and converting text; this was in 2004. So I thought, oh my God, we could securely make data available, we could democratize data. There was all this potential that could galvanize the statistical agencies.

But after 20 years of pushing and opening doors, the federal government couldn't move, and didn't. It was still the same old stuff, and I was brought to the White House to work on the Federal Data Strategy.

And I was having a series of conversations with people, and they were saying: it's never going to change. There are all these wonderful people, there's a lot of money getting spent, but nothing is ever going to change. So what the book asks is: what are the consequences of not changing? What does that mean for the people of the country when they don't have the information to make decisions?

Not the big companies; they can always buy economic intelligence. But the small businesses, the students who are making career decisions, the policymakers who are trying to make informed decisions: one, what are the consequences, and two, how do we get out of it? How do we reimagine public data? That's the notion of the book.

And the recommendation is very much not to build it within the government. You need an independent, trusted, data-driven resource that is complemented by high-quality research but driven by local needs, so you get timely, actionable information that can be used proactively, not just reactively. That's the book in a nutshell.

Satwik Mishra

There are so many practical recommendations in that book. But looking at today's environment, give us one example of how informed, granular or dynamic real-time data would have helped make better public policy decisions. What is the one aspect of the economy where you feel that, if we had better data right now, we would make better public policy decisions? I know there would be many.

Dr Julia Lane

Well, I'm not going to take one; I'm going to take two. One is COVID, and I'm going to give you a concrete example in just a minute. The second is the point that you opened with: how do we react to, and better manage, the investments in R&D, which we hope are going to stimulate the economy. Can I do the quick thing on COVID first and then R&D?

Satwik Mishra

Absolutely.

Dr Julia Lane

The COVID example is a classic one. COVID hit, and we had this massive impact on the workforce. People went home. They were getting laid off.

You went from 0.5% or 2% of people asking for unemployment insurance to 25%. The workers who were administering unemployment insurance were sent home. They didn't have access to the data; a lot of them had the data on their physical machines. They didn't have laptops, they didn't have access.

They couldn't produce information for the governors' offices and the workforce boards at a local, granular, timely level, so that you knew who was getting hit and how to get help to people. The crisis laid bare the inadequacies of a system that was designed a hundred years ago. Now, I had set up the Coleridge Initiative, so the states had put their data into a secure environment, informed by my NSF experience with cybersecurity and so on, and they were able to get access to the data and produce dashboards that could be provided to the workforce boards and the governors' offices. Illinois was the leader in this. They got the data about the layoffs and the unemployment insurance claims on a Monday.

The summaries, the tables, the dashboards were in the governor's office by Thursday. That is transformational. To answer your question: timely, local, actionable. So, take the same basic idea. I was on the National AI Task Force, asked to figure out what we need to spend money on in order to do better industrial policy, essentially: what are we going to spend it on?

The task force was asked to say what we are going to spend it on. And of course, people were saying compute, data, training and so on. And I said: how are you going to evaluate that?

Exactly the question you opened with. We don't have an infrastructure to answer those questions proactively rather than reactively. It's like NAFTA, the North American Free Trade Agreement: 20 years later, we didn't have enough data on the deaths of despair in the middle parts of the country that lost their jobs.

How do we avoid that? So, as part of that National AI Task Force, I said: what data do we currently have? The data doesn't make sense. The measures of AI that were put out, for example by the Stanford AI Index, which is what the committee relied on (they're changing their ways), are based on bibliometrics, which doesn't tell you anything about the workforce.

In the MIT Press piece that I circulated, and you'll probably provide a link to it, the Census Bureau provides an estimate that 5% of firms are using AI.

So, what are the consequences of not knowing the answers? Well, let me get very concrete, since you opened up on productivity and the differences in estimates. MIT's Daron Acemoglu thinks that AI will increase GDP by 0.93% over the next ten years.

Economists at Goldman Sachs say AI will increase GDP by 6%. Our task force had to figure out which one was right and where the money should go. The difference between those two estimates is 1.6 times larger than the world's third-largest economy, Germany, over the decade. These are not small amounts; these are really, really big amounts.

And part of the issue is that this is knowable. The data could be created to figure out what proportion of firms are using AI, what proportion of jobs are going to be replaced, what types of new jobs are going to be created and, finally, how skilled the workforce is and how it can be retrained.

All of that is knowable. So, build an infrastructure that can answer the question: what is the impact of R&D investment on productivity and the workforce? It's not currently known. Big differences, but knowable, and in a relatively short time period.

Satwik Mishra

This is a perfect segue to your industry of ideas, which is such a fascinating idea; I'll link that paper in our show notes as well. Speak to us about the industry of ideas: not only its potential, but also where you see it making a difference, and what you think are the pitfalls or challenges that may come up if we go down that path.

Dr Julia Lane

So, here I am on the NAIRR Task Force. We were charged by the President and Congress with writing a report on where to invest. As economists say: you need data to decide.

You don't just make big numbers up. So think about it: AI, quantum computing, all of these new areas of investment and industrial policy, and none of them has an industry code. None of them is a scientific field. If you can't even measure what you're investing in, how will you know what you get? As the management saying goes: what you measure is what you get.

So again, I'm an economist. You think: okay, what do these terms actually mean? Really, this is a new form of economic organization. So how are we going to describe it? I'm going to argue: as an industry of ideas. And let me tell you how I got to that. A hundred years ago, which, by the way, as I said before...

Satwik Mishra

Can I intervene with just one quote from your Nature article, which I found fascinating, and then we can absolutely get into this idea; it will set it up for the audience. In the Nature article, which we'll link in the show notes, you wrote that the North American Industry Classification System was modified in 2022 to add a single category for AI activities: AI research and development laboratories. In February, Adam Leonard, the Chief Analytics Officer of the Texas Workforce Commission in Austin, applied this new NAICS classification to Texas data and found a mere 298 AI research and development firms employing just 1,021 workers in total.

Dr Julia Lane

In the entire state of Texas.

Satwik Mishra

Exactly. The real workforce involved in AI-related activities, meanwhile, is likely to be much larger and spread across multiple industry sectors, ranging from hospitality and healthcare to oil exploration and more. So yes, what is the industry? Let's segue into the industry of ideas.

Dr Julia Lane

Thank you for jumping in right there to make it very concrete. So, a hundred years ago, we classified firms by what they produced, and that was mainly manufacturing and agriculture. We had thousands of manufacturing and agriculture industries. Then, 50 years ago, we realized we're not just producing physical goods anymore.

So we needed to change from SIC codes to what you referred to as NAICS codes, codes that are based on services: how things are being produced. That was 50 years ago, and it took ten years to get those categories. Now, as I look at it, these new industries, AI, quantum computing, are industries of ideas. The basic innovation is ideas: new ways of putting things together.

As Heidi Williams has pointed out, innovation is what breaks the shackles of scarcity, because you no longer have to just invest in capital, labor, energy, materials and services. If you have a better idea, you can put the same inputs together in a different way, like Steve Jobs did with the iPhone. There was nothing new in it; it was pure innovation.

That's what AI is. So how do we track those ideas? The reason the industry of ideas is such an important framing is that ideas are embodied in people. It's not things, it's not services; it's ideas embodied in people. So the whole idea behind investing in R&D is investing in great ideas and great people.

And, to give credit, Paul Romer got the Nobel Prize for this, endogenous technological change, back in 2018. He called it the economics of ideas. The way to think about it is that ideas don't get consumed like physical capital. One of the key implications is that creating a market with incentives for people to share ideas is massive.

If you have a great idea, the more you share it, the more impact it has. Unlike steel: when I consume it, it's gone. The second thing, going back to your opening statement, is that you then need to figure out where to invest in people. That has demographic consequences, geographic consequences, workforce consequences. And it tells you how to build the data infrastructure that has been built over the past ten years, about which we wrote the Nature piece.

So that's the key point here. It's a new economy. Firms are organized around ideas much more than around goods and services. And so the data infrastructure needs to reflect that economic organization.

Satwik Mishra

So, from the industry of ideas, give us an analysis of the issue everyone's debating right now: what will the future of work look like? How would you think about measuring and monitoring how jobs are changing?

What needs to change in our measurement of jobs so that we're able to analyze artificial intelligence and its impact on our workforce? How does the industry of ideas translate into that measurement?

Dr Julia Lane

My first answer is: you don't send out a bloody survey that no one's going to answer. You think strategically about how you're going to trace this through digital fingerprints. The first thing is that the shock to the economy is coming, as you just pointed out, from R&D spending.

That's the first thing we need to figure out how to track. Well, that part is straightforward, because when a government grant or contract is issued, there's a code that's generated, say "ABCDEFG". When that contract hits a firm or a university (I'm going to focus on universities for right now), a code gets created within the university or within the firm.

Everyone who charges time to that project is captured, and so are the vendors who are paid: the people who are producing new types of computers, the GPUs, the NVIDIAs. All of that is tracked in the HR and finance systems. So immediately you have a data infrastructure that describes the scientific production function and who is working on these new ideas.

And this is the thing we talk about: how do those ideas in the university get transmitted to the private sector? Well, it's not just the principal investigators.

Our research shows this, using data that have been built up at the University of Michigan's Institute for Research on Innovation and Science over the past 15 years. And it goes back further, because universities were able to pull out earlier records.

So it goes back 20 years. We show that there are 22 graduate students, postdocs, undergraduates and clinical researchers for every principal investigator. You can see those people. Those are the sparklers, like Larry Page and Sergey Brin, that go out and create new fire in the economy. And you know what those people were taught, because they described it in their proposals.

The funding agencies know what's going on. You can then track that. You can see the firms where they landed. Many of them get hired at high wages because they're expected to create that spark, and then you can see what happens to the jobs in those firms and the skill contours of the workers in those firms, because that skill information is available from additional sources: links to educational information from the public schools, as well as hiring information from job vacancy data.

That immediately tells you: it's local, it's timely, it's actionable, and it enables you to figure out proactively what's happening to the workforce and the hiring patterns of the firms that are affected.
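To make the kind of linkage Dr. Lane describes concrete, here is a minimal Python sketch of the idea: award charge codes tie people to funded projects, and employment records then show where those people land. All table and column names here are invented for illustration; this is not the actual IRIS/UMETRICS schema.

```python
import pandas as pd

# Hypothetical university payroll charges: who billed time to which award.
# (Names are illustrative only, not the real IRIS/UMETRICS schema.)
charges = pd.DataFrame({
    "award_code": ["ABC123", "ABC123", "XYZ789"],
    "person_id":  ["p1", "p2", "p3"],
    "role":       ["grad_student", "postdoc", "pi"],
})

# Hypothetical state employment records: where each person works later.
jobs = pd.DataFrame({
    "person_id": ["p1", "p2", "p3"],
    "employer":  ["AI Startup Inc", "BigTech Corp", "State University"],
    "wage":      [140_000, 180_000, 95_000],
})

# Join funded people to their later employers: the "digital fingerprint"
# of R&D spending flowing into the labor market.
flows = charges.merge(jobs, on="person_id")

# Where did the trainees (the "sparklers", not the PIs) land, and at what wages?
trainees = flows[flows["role"] != "pi"]
print(trainees.groupby(["award_code", "employer"])["wage"].mean())
```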

Satwik Mishra

Absolutely. So, this answers the question of why the industry of ideas. Another fascinating intervention you make in your writing answers the "how": you speak about your proposal for a Centre for Data and Evidence.

It is, if I may say, the institutional intervention required in our data ecosystem. Walk us through it: what is the proposal for the Centre for Data and Evidence, why is it important, and what do you think the structural arrangements around it should be?

Dr Julia Lane

So, I think what's important is getting new eyes in to solve these problems. There are so many smart people out there; I've already cited MIT people, but there are also people in other institutions throughout the country. We shouldn't just focus on the top institutions; there need to be many voices. The basic idea is to get new ideas in, and it can't be the Federal Statistical System. Just to be clear: survey methodologists and statisticians are not the ones with their ear to the ground.

It's the researchers, the local researchers, the people who are on the ground trying to provide services, community colleges and so on. The questions need to be driven from outside. The centre should be an independent, nonpartisan, trusted think tank, but a different think tank from the ones we historically see. It needs to be grounded in the idea of bringing together data from multiple sources to make sense of it, just like IRIS has done, for example, at the University of Michigan.

And like many other efforts, such as the multi-state data collaboratives that have been set up by NASWA, the National Association of State Workforce Agencies. You need people who know how to analyze data and people who want to understand why, and it needs to be built on top of an institutional infrastructure, so you're not building castles on sand; you're building on a firm foundation.

What does that look like? What I've argued is that philanthropic foundations are going to be central to getting it set up. They've been very successful in setting up institutional infrastructures in the past, but they have to be pure of heart.

You could also have funding lines that go to the states through federal agencies that also want the answers to the questions: the Department of Labor, Department of Education, Department of Commerce, Department of Defense, Department of Energy. All of those could provide funding to this independent centre, but the centre would be demand-driven, agile, and have the right incentive structure.

Satwik Mishra

We've answered the why and the how, all from your writing. But there's an older piece of yours that I went through last week, written in Issues in Science and Technology, where, as someone who advocates so strongly for building a robust data ecosystem, you also identify three challenges of relying a little too much on data, and urge wariness about them.

I'll give you the headers, and I would love for you to expand on them. The first is what you call "Campbell's Law", which I didn't know about, but which is very interesting. The second is the lag in investments in things like science and technology and education; that is something I'd love you to explain to listeners.

Why does that lag happen in science and technology and education? And the third, obviously, is privacy and confidentiality. So expand a little on all three of these challenges. How do you analyze them?

Dr Julia Lane

Why don't you start with number one? Why don't you spell out your take on that law you'd never heard of?

Satwik Mishra

So, as a novice, I understood Campbell's Law as: the more we rely on a quantitative indicator to make decisions, the more proclivity there will be in society to corrupt that indicator, precisely because decisions are made from it.

It was almost a light-bulb moment when I read about Campbell's Law. But that is just how I understood it; I would love for you to expand on it. How do you think about it? How do you envisage getting around something like that?

Dr Julia Lane

I wanted you to say it because I feel that if I talk all the time, I'm talking at people, so it's helpful; I'm going to ask you to summarize the other two as well once I go through this one. So, again: what you measure is what you get. The challenge is coming up with a measure that can't be gamed.

And people are really smart. Anytime you set up a measure, it's going to get gamed in some way. The advantage of investing in ideas and people is that it's harder to game. You immediately switch attention away from mindlessly producing the bibliometrics and papers and patents we're swamped with, to really investing in your students and your trainees and pushing that out.

But that can still be gamed. So you just have to pay attention, basically, and change: not because you're getting the wrong results, but if you feel there are bad actors in there, be aware and change, because people are always going to figure out how to mess things up.

Satwik Mishra

And the second one: investments in science and technology and education take time, from the moment of investment to their impact on society, and we need to make space for assessing, and almost acknowledging, the complexity of those investments.

We need to make space for that in a data ecosystem. Walk us through that risk as well.

Dr Julia Lane

So, it comes back to a very basic concept: what is your theory of change, and what kind of lags do you expect? You've got these inputs, which are the spend. In the introduction you said: well, it was 2.2%.

Now it's 3.4%. Everyone focuses on the inputs. The Lisbon protocol said every European country should spend 3%, and you're like: well, why 3%? Why not 2.9 or 3.1? What's the magic number? So, the theory of change: okay, we're going to spend money. This was also the charge to the National AI Task Force that I was on.

What activities are going to be generated? What are the initial outputs, what are the outcomes and what are the goals? Then, how are you going to measure at each step of the way, and what are the confounding factors? Now, for a lot of things, people are complex, organizations are complex; there are a lot of confounding factors.

And of course, what scientists tend to say when they're going to be evaluated is: it's so complex, you can't possibly measure how wonderful we are, because there are so many things that can happen. There are two answers to that response. One is: this is why they call you welfare queens in white lab coats on the Hill.

The second is: if you think that ideas are random and serendipitous, and you get that argument a lot, then you don't need to spend money. If, however, you think, to put it simply, that Y, which is ideas, equals X, which is money, times beta, plus epsilon, where epsilon is your noise, the serendipitous part,

then you're saying money makes a difference. Then you have to give me some idea of what beta is, and of the process whereby that beta and that X generate innovation. That's the theory of change. And as a scientist, you should bloody well be able to write that down. That's the core idea here. And yes, science is complicated.

So is investment in early childhood, so is investment in health, so is investment in many areas. But you have some idea, and as you go through the process, you write, you learn, and you get a better understanding of how that money should be structured to be most helpful.
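For readers who want it on paper, here is a minimal LaTeX rendering of the model Dr. Lane sketches verbally; the notation is a reconstruction of her words, not a formula taken from her papers.

```latex
% Ideas (Y) as a function of money (X):
% beta is the effect of funding, epsilon the serendipitous noise.
\[
  Y = X\beta + \varepsilon
\]
% Claiming ideas are purely serendipitous amounts to asserting beta = 0;
% claiming money matters commits you to saying something about beta.
```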

Satwik Mishra

And the third and final one, which is a little more straightforward but leads me to my final question: insufficient protection of privacy and confidentiality, including disclosure of intellectual property and national security risks, could lead to a loss of trust and, more importantly, a consequent disincentive to participate in the sharing of knowledge, which is so important.

Dr Julia Lane

I've spent my life worrying about these things; privacy and confidentiality issues are massive. That's partly why setting up secure environments is so important. What I learned from the NSA about security is that you can have multiple tiers of access, and your biggest security risk is people sharing information.

So figuring out how to indemnify individual behavior, which is what they do in the Defense Department, for example, is going to be critical. By the way, going back to the Illinois example: there were specialized, customized tabulations that were sent to the governor's office and the workforce board because they were trusted partners. So you have levels of trust that are institutionalized.

Standards can be set up around those levels that provide that trust. FedRAMP is an example that is being put in place.

Satwik Mishra

FedRAMP is a fascinating example of how that can function. My final question, as always, looks ahead. As we sit here in 2024, with all of these debates happening around us about what's happening in the economy: what is the best-case outcome you see going forward, and what are the pitfalls you worry about that we may need to work on more, lest we go down the wrong path?

Dr Julia Lane

I'd like to see the centre set up immediately. It's been 30 years; there are lots of reasons cans get kicked down the road. A philanthropic foundation, or a set of philanthropic foundations, dedicated to the public good should set this up and get it done. All the pieces are there; joined together, they make a massive opportunity that I don't think we can afford to miss.

So that's the opportunity I think can be capitalized on. What are the challenges? I think the incentive structure needs to be rationalized: it's not publications and patents, which enrich publishers; it should be investing in people. Immigration is going to be a big issue, because ideas aren't confined to a few people; you need as many of those bright ideas opened up as possible.

So, that’s going to be a major challenge.

In terms of privacy and confidentiality, there's been a big focus on things like differential privacy and synthetic data. But a secure environment with multiple eyes on the data is what's going to generate trust, not black-box fake data. And all of those things can be addressed in this new Centre for Data and Evidence.

Satwik Mishra

Dr. Julia, from the book to the Centre for Data and Evidence to the industry of ideas, each of these could be its own podcast, and I could just keep asking you questions, because it's such a fascinating body of work. Thank you so much for being here.

Dr Julia Lane

You’re a joy to work with. Thank you so very much.

Open Transaction Networks

Trustworthy Tech Dialogues Transcript

Satwik Mishra:
How would you define digital infrastructure?
Dr. Pramod Varma:
The industrial era is over.
Dr. Pramod Varma:
We are very excited about that portion where AI is supercharging OTN.
Satwik Mishra:
What amazes me is private-sector industry getting unleashed by a public layer.
Satwik Mishra:
When AI and DPI come together, in the case of OTN
Dr. Pramod Varma:
when OTN and AI come together, it is not 1 + 1,
Dr. Pramod Varma:
it's not an additive effect anymore, it's an exponential effect.
Dr. Pramod Varma:
Interoperability was always there, even in the pre-digital world.

Satwik Mishra:
Hello, everyone. I'm Satwik Mishra, the Executive Director of the Centre for Trustworthy Technology.

It's a World Economic Forum Centre for the Fourth Industrial Revolution, and today I have with me Dr. Pramod Varma.

Now, every time I have to introduce Dr. Pramod Varma, I find myself wrestling with brevity. His influence spans countless technological ecosystems, each marked by its own compelling narrative of innovation and a deep-seated commitment to building trust in technology. His work isn't just about technological advancement; it's also about shaping a future where technology amplifies societal trust.

Dr. Pramod Varma is the chief architect of Aadhaar, India's digital identity program, which has successfully covered more than a billion people. He's also the chief architect of various India Stack layers such as eSign, DigiLocker and the Unified Payments Interface. He's currently the CTO of EkStep Foundation, a non-profit creating learner-centric digital public goods, and the Co-Chair of the Centre for Digital Public Infrastructure, a global digital public infrastructure advisory. He is the genesis author of the open-source Beckn protocol, which underpins networks such as the Open Network for Digital Commerce and the Unified Health Interface. Finally, he's an old friend, a mentor and, to my mind, the quintessential public interest technologist in the world today. I'm eagerly looking forward to this conversation. Dr. Pramod, thank you so much for being here.

Dr. Pramod Varma:
Lovely to be here, and it’s a fantastic topic that you have selected. I look forward to having the conversation.
Satwik Mishra:
Okay, so let's dive in. First, to kick things off: to your mind, what is digital infrastructure in the world today? In this age of relentless technological advancement, what are its technical and structural tenets? How would you define digital infrastructure?

Dr. Pramod Varma:
I think the vocabulary of "digital public infrastructure" came about during the G20 discussions, as you know. The industrial era is over. We went thick and fast through the information era, and continued through it after the public Internet: a dramatic change in the way digital technologies have influenced humanity. The Internet, GPS, cloud computing, the smartphone, everything that came about. And even faster than that, we are walking straight into the intelligence era.

With earlier eras, we at least had enough time to think things through from an academic perspective: labor laws, labor productivity, the economy. Now, with AI coming in, the intelligence era is going to see a lot of questions asked.

Now, for us, as we navigated India's digital infrastructure journey, public infrastructure as a public good in one sense, our questions were twofold. One, we continued to see a massive division in society between people who have access, and get outcomes, and people who don't.

And I mean mere access: access to financial products, access to banking, cards, lending, saving opportunities; similarly, access to healthcare and primary care; access to education, better education; and if you're in agriculture, if you're a farmer, access to knowledge, agricultural knowledge, or agricultural market access. Much of this is driven by having access.

The first cut is access. Access will drive knowledge and agency, and then it'll drive outcomes. We can't go straight to outcomes without unblocking access and unblocking agency. Then come outcomes, eventually: is it a good outcome?

And we were asking why, for India specifically. I'll give you some statistics.

In 2009, India was one of the most poorly banked nations on earth: less than 20 percent of the people had access to a bank account. Nobody had a portable identity. By then, post-1992, India had changed its economic policies and resurrected itself, and the good thing is that this allowed people to explore newer opportunities, which meant people had to move away from their villages and hometowns. And in India, as you know, there's no unifying language, no unifying culture, no unifying weather, no unifying food.

India is just like a continent. So that meant lack of identity, lack of bank accounts, lack of mobile connections; all of this meant huge impediments for a large section of society. There were 1.3 billion people then; currently it's expected to be 1.4 billion. Out of that, maybe 50 to 75 million people had access to everything. Everybody else lived in what's called the informal economy. That means they somehow survive: they have roadside money lenders.

There are informal saving schemes they cook up among themselves. It's not well regulated, there's no consumer protection, and people lose money and get cheated.

But really the question was: how do we bridge this gap, how do we formalize their economy? It turned out that much of that access gap was simply because of cost structures in the system. And post-9/11, as you know, the global norms on financial money laundering and terrorism financing all got tightened. The tighter they got, the fewer people got into the system, because they would be asked for a hundred more papers just to prove they're okay. So we were asking: what does that cost amount to?

It amounted to a few basic things. The cost of who you are: identity verification. That means proving your credentials: proving I passed high school, or I'm a graduate, or I work in this company. Proof of work, proof of skill, proof of earnings; and proof of revenue if you're a small business.

Proof of existence: identity, trade licenses and so on. Every one of them adds cost to the system. And when the cost is high, your revenue has to be higher than the cost to make it viable. But these are people who might put $5 in a bank account, with $2 of savings, while the banking costs of customer acquisition, customer transactions, customer engagement, compliance and all the paperwork, and overall trust add up to far more.

These costs added up so much that most systems would not go after people beyond the top 50 to 75 million. Every company, every bank, every lender, every system was targeting the same cohort of 50 to 75 million, because the rest were simply unviable. That started our journey of asking that question:

how do you use digital to dramatically collapse that cost, as shared infrastructure? The shared infrastructure has to be built in a way that is universal, inclusive, low cost and high volume. Remember, that was very important for us: universal, inclusive, high volume, low cost.

If you can build that infrastructure as public infrastructure, we strongly believed, you close the access gap, and that is precisely what happened in India through digital public infrastructure.

If you don't reduce the cost of KYC, the cost of customer acquisition and the cost of paperwork, you can't open a bank account, you can't lend. It's obvious, right?

Today, 1.4 billion people have a digital identity in the country. They can digitally authenticate themselves anywhere; it's a public good. It's like GPS: GPS tells you where you are; identity tells the system who you are, not in the philosophical sense, but who you are in the system. And then UPI completely collapsed the cost of moving money: it's one seven-hundredth of a dollar to move money. It's really cheap.

And DigiLocker collapsed the cost of digital credentialing, verifiable credentials and so on. So it's very interesting: when we built these, we were able to do high-volume, low-cost public infrastructure.

In 2016, we had only about 50 million people doing digital payments. Today, 500 million people do digital payments: in a span of six to seven years, we went up 10x. Identity went from nobody to 1.4 billion people, and bank account penetration went from 17 or 18 percent in 2009 to near-universal coverage. This nonlinear inclusion and formalization was key. The exact outcome we were after, that access vector, was opened up by collapsing the cost of underlying common things like payments, identity and data sharing. So that's the story of DPI.

Satwik Mishra:
That's fascinating. I always think of the DPI story, and what amazes me is private-sector industry getting unleashed by a public layer. That is fascinating: what happened over the last 10-15 years, with the private sector just booming in India, is due to that public infrastructure layer that came out of the DPI story.

Let’s pivot to our main topic today.

As you know, we did some research on open transaction networks, which came out a couple of months back. We'll link it in the show notes as we release this video.

What are open transaction networks, and why are they important in the technological world we occupy today?

Dr. Pramod Varma:
I think this is a natural evolution from a platform-centric economy to a network-centric economy. That doesn't mean platforms will go away.

Platforms will exist, but many platforms will join together, and when platforms join together, they form a network. But when platforms join together, the network needs a set of underlying trust layers, protocols, which are nothing but the language by which these platforms talk to each other. If we were able to design that, we thought, a network would create a much more universal architecture than a single concentrated platform story.

And why were we inspired? We were inspired by the Internet. The Internet is not a platform; it is a network. Our telephone network is a network, not a platform. Our e-mail networks are networks, not platforms.

Anything that we have seen at humanity scale has mostly been done this way, including SWIFT and the payment networks around the world: banks can transfer to banks. Inefficient in some sense; in the new world we should find more efficient ways to do it.

But nevertheless, it was a universal payment infrastructure. Anywhere you see universal, multi-country infrastructure, you will see a network playing out. Otherwise, you would have one company or one platform from some part of the world connecting all of humanity, and the problem with that is that either it becomes extremely monopolistic in nature, or it'll be very hard to do.

Either way, we must think hard. Because we were building public infrastructure, digital infrastructure as public goods, we were thinking: what if we start extending the idea of the Internet? The Internet was held together by a set of protocols and standards, right? That's what we did.

When you type HTTP in the browser, that really means Hypertext Transfer Protocol. That's exactly what it stands for.

The fact is that you can send e-mail from Yahoo Mail to my Gmail, or to someone else on Proton Mail; maybe somebody in Europe wants to use Proton Mail, or privacy-centric folks want to use it. Sure, we need choice. Competition is good, choice is good.

But to create interoperability between these platforms, you need an interconnecting language between them, which is called a protocol, and then a set of standards, like HTML and all the standards that came about.
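As a concrete illustration of what a shared protocol buys, here is a minimal Python sketch: any client that speaks HTTP can talk to any server that speaks HTTP, regardless of who built either side. The host name is just an example.

```python
import http.client

# Any HTTP client can talk to any HTTP server, whoever built them,
# because both sides implement the same open protocol.
conn = http.client.HTTPSConnection("example.com")
conn.request("GET", "/")        # a protocol-defined message
response = conn.getresponse()   # a protocol-defined reply
print(response.status, response.reason)
conn.close()
```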

But for some reason, post Internet and GPS, our standards bodies did not extend this, or they were somewhat weakened, no different from traditional institutions. Sometimes an institution rises at the appropriate time, and after a while it no longer knows why it exists; the value the institution brings reduces.

This happened to tech standards bodies as well, and the speed at which companies were driving this closed platform economy was so fast that neither the academics nor the standards folks could catch up, take a step back and say: let's think about how the Internet became pervasive and universal.

It's because of the underlying protocols and standards. How did the telephony network become pervasive and universal?

Because of underlying protocols and standards. Shouldn't we be doing this for next-generation payments? Shouldn't we be doing this for next-generation data portability? For next-generation commerce and trade?

Individuals and companies exchange economic assets and economic resources, and there must be a means to exchange them. Money is an economic resource. Data is an economic resource. But you can also exchange other resources: any value, any product. If that ought to happen, we must start aggressively.

Think about the underlying connection and exchange protocols, transfer protocols like the Hypertext Transfer Protocol, HTTP. We need a commerce transfer protocol, a money transfer protocol and so on.

Somebody needs to define these, and by design such protocols and standards ought to be open. Protocols themselves cannot be locked in and proprietary, but platforms can be private.

You can have AT&T or Verizon, or Airtel in India: sure, they're all private companies, nothing wrong with that. And you can have private innovation in the handsets, the phones and devices. But underneath, if you want voice interoperability, content interoperability, money interoperability, commerce interoperability, data interoperability, we must think about protocols.
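To give a flavor of what a "commerce transfer protocol" message might look like, here is an illustrative Python sketch, loosely inspired by Beckn-style open networks. The field names and identifiers are invented for illustration; this is not the actual Beckn specification.

```python
# An illustrative "search intent" a buyer-side app might broadcast on an
# open commerce network. Field names are hypothetical, not the real spec.
search_intent = {
    "context": {
        "action": "search",                 # protocol verb, like GET in HTTP
        "domain": "retail",
        "bap_id": "buyer-app.example.com",  # buyer-side platform (example)
        "transaction_id": "txn-0001",
    },
    "message": {
        "intent": {
            "item": {"name": "rice"},
            "fulfillment": {"city": "Bengaluru"},
        },
    },
}

# Any seller-side platform that speaks the same protocol can parse this
# intent and respond with its catalog; no bilateral integration is needed.
```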

And so we were navigating from first principles, thinking from first principles, motivated and inspired by what happened with the Internet and GSM and the earlier payment networks, and saying we must continue to extend this protocol story, especially now.

Why is it important now? Because finally every human is walking around with a connected computer in their hand. Today our smartphones are faster and more powerful than a supercomputer from the 90s.

We are walking around with really powerful computers, with extremely powerful capabilities such as voice, pictures and cameras.

Everything is being commoditized. So if two parties both have these connected computers in their hands, shouldn't they be able to exchange money? Shouldn't they be able to trade? If one is a doctor and one is a patient, shouldn't they be able to connect and do a telemedicine call, things like that?

Why do they all need to be on one mega platform?

That didn't gel for us. We wanted universality, choice and fair competition. So we went one level below the platform and said: let's attack the protocol layer,

let's build the protocol. When that happens, if we thought the platform economy was fast, the network economy is 100x faster, because every node added to a network creates combinatorial value for the network.

That's what the Internet was all about, right? The Internet exploded because of this. It also creates a new set of innovation possibilities for private entrepreneurs, because you create a new set of protocols beyond the Internet's, which is what we saw with UPI in India. UPI, the Unified Payments Interface, was a protocol story, where Google Pay and WhatsApp and Walmart-owned PhonePe and others could all exchange money with private banks and public banks, all in real time.

350 banks, hundreds of apps and workflows, all integrated through a common protocol, exchanging money instantly in a guaranteed, trusted, consumer-protected manner. And we can talk about trust separately.

These protocols, with trust embedded in the networks, can create a hundred times more growth and new innovation possibilities than a single-platform story, and that takes us to a universal infrastructure approach rather than a platform-centric, more siloed approach.

It was very first-principles, computer-scientist-like thinking, and people thought we were a little bit wacko. Maybe we were wacko, we were thinking crazy, but we were doing open-source stuff anyway.

We thought: what do you lose? What is the worst case?

A year later we might have said: hey, this damn thing didn't work out, we must have been smoking something. But we wouldn't have lost anything.

But what we saw was hundreds of open-source folks joining. Now we have thousands of them around the world joining the movement to help the protocol in the open source, and entrepreneurs joining and saying: I can use this protocol to do what I knew I couldn't do otherwise, faster and cheaper. And they're all coming together.

So we said: maybe we've unleashed the next set of value into the private innovation ecosystem. If the ecosystem likes it, they will run with the ball, as long as we keep the protocol open and open source. That's what we've been doing for the last four years, and you covered it brilliantly in the paper you produced. If people are listening, you must go and read that paper.

Satwik Mishra:
So, on open transaction networks, one of the core tenets is interoperability. Over the last couple of years we've seen this word thrown about a lot; it has suddenly become way more prominent than it was when we were working on it five or six years back.

Europe has its own interoperability push coming about with the DMA. We're pushing it from the side of open transaction networks and protocols.

Give us the philosophical sense of interoperability. Why was it considered important when ARPANET and the World Wide Web came about, by the likes of Tim Berners-Lee and Vint Cerf? Why did they think it was so important, why is it important today, and what does it mean for the world economy today?

Dr. Pramod Varma:
Funny thing: interoperability was always there, even in the pre-digital world. The fact that a random company's tire, built to a specification, can fit a car manufactured by a completely different company, without the two ever talking to each other, tells you there's interoperability in the physical world.

Your house is constructed from a set of interoperable components. If everybody had to sit together and build custom doorknobs and hinges and all of that, you would never have built a house, or every house would have been so costly to build that it made no sense. Specifications, to unbundle and dynamically re-bundle later, have been a norm throughout the history of humanity.

There's no other way we would have survived. Interoperability is not a new word; it's not anything cool and new. It has always existed.

Anything that works at human scale has interoperability built in; there is no other way it would have survived. Everything nicely fits in; things come together.

When the digital world came, we extended the idea of interoperability into it. Computer protocols, TCP/IP and all the protocols we talked about are all interoperability.

How do two machines talk? If two machines are built by two different companies, how do they talk? Through some standards, some specifications by which they can talk.

How did independent modems talk in the early days of the Internet? Everything was about specification. Look at a Windows laptop: Microsoft doesn't build the mouse. Logitech builds a mouse, some other company builds a camera.

I'm using earphones right now built by some inexpensive brand in India. It's not that they all had to sit in a room and agree on how my pin will connect to your port. Standards like USB pins and device driver standards: all these things were built as interoperability.

What happens when interoperability is built? A system can now be unbundled and independently developed by a new ecosystem. A car is an assembly: the car company is primarily looking after the main engine and chassis; it doesn't build the dashboard or the steering wheel, they are all built by different people, right?

When you unbundle, each component can have its own ecosystem: a bunch of people making mice, a bunch of people making printers. But they all come together, because specifications necessitate that they come together. It's not new; it's always been there.

That's the only way we survive as humanity. For anything at human scale, anything sustainable, you need interoperability as a fundamental element. So why is it important now, in the digital world?

What sadly happened post-Internet is that much of the advancement, cloud computing and all of that, came as private goods. And when companies create private goods, they obviously want to create a moat. If I were running a private company, I would want to create a moat. Nothing wrong with that; that's the right thing to do.

In fact, if I were a private company, I would dig the moat deeper. I'd build a bigger moat, why not? But when you create the moat,

what happens is these moats become silos; they become walled gardens, disconnected walled gardens. That works at some scale, beyond which it stops working, or it replicates the problems at larger scale. If it becomes really large, then, as in the US or Europe, drivers start complaining against a large ride-hailing company, or merchants against a large e-commerce company: hey, anti-monopoly, you guys are squeezing small companies, my products are being downgraded in the marketplace unless I pay a premium. All the drama starts happening, right?

So eventually, platform walled gardens or closed-loop platforms either remain small, with not enough value (they seem big because the valuation is big, VC-driven valuation is big, but the volumes are really not that big), or, if they become big, all the anti-competitive discussions come up, right?

So we believe interoperability, as a means to create fair market competition and a new innovation playground, and thus choice and options for people, is a very interesting possibility. Even beyond that, I feel that if you want anything at 8-billion-people scale, you had better deal with interoperability: you can't have a one-platform story, you have to have a multi-platform story. But if you have a multi-platform siloed story, then individuals and SMEs get a raw deal, because they are not portable and interoperable.

So if you want anything at Internet or telephony scale across the world, where anybody can exchange voice and anybody can send money, with appropriate rules, that's no big deal; you can layer regulations and rules on top.

But the infrastructure itself should be universal and interoperable across the world. I think it's somewhat of a no-brainer. But then again, maybe not:

if you are a private company wanting to build a full platform moat, it might look like the protocol story, the interoperability story, is against you.

Because why should I interoperate? I'll build my own platform and I'll capture it. But if you want universality, you want many, many such platforms, and they should be able to interchange data, money, trade, commerce; to interoperate, right?

When you look at electric charging stations in most countries, none of them are interoperable. It's like saying it's your car, but to fill up you can only go to the car company's own gas station.

When we look back, that would never have worked at human scale. It would never have worked.

So obviously we unbundled and created interoperability between traditional cars and gas stations. But somehow, with electric cars, we have a proprietary charging thing going on. But I suppose that is the first phase.

The first phase will be a platform phase, and the second phase will always be a network phase, because otherwise there is no way to scale to humanity scale.

Satwik Mishra:
So, it's not so much an anti-platform play as a new way for platforms to speak. It's getting platforms to speak to each other and create more scale.

Dr. Pramod Varma:
Not only platforms talking to each other; it creates independent, innovative ecosystems. If you create a standard for chargers for electric bikes and cars, a new slew of innovators will come about to create all kinds of ATM-like charging machines: small chargers in your apartment, big chargers along the highway. A new set of innovation will kick in.

But if it is a full vertical, closed-loop play, then the car company has to innovate everything all the way down. And every car company saying my charger, my pin, my things, it makes no sense, by the way. It's like your mouse and your laptop each having custom pins instead of USB Type-C; you would have gone bonkers.

You would say, what the hell is this? But we seem to accept it when the innovation is new. It's a phase-one strategy, but it's never a scale strategy.

Satwik Mishra:
What would be the phase-two strategy for traditional industry? You're speaking about new entrepreneurship coming up around open transaction networks; there will be more opportunities as we unbundle and create more spaces for innovation.

What do you see as the role of industry in open transaction networks?

Dr. Pramod Varma:
In fact, an open transaction network doesn't mean free innovation, open source, everything given away, nothing like that. It simply means: think HTTP, think GSM, think SMTP, think payment protocols. The network is open, but the nodes on the network can all be private, completely private innovation.

Now, there might be some open-source nodes, like Firefox versus Chrome. There might be open-source nodes because some societies might need something free and government-provided, because of extreme poverty, or somebody else will support them.

But for the majority of nodes: in an open transaction network, the network is open, but the nodes are all private.
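As a rough illustration of "open network, private nodes," here is a hedged sketch in Python. The message fields and node names are invented for illustration and are not any real OTN specification; the point is that only the wire format is shared, while each node's internals stay proprietary.

```python
# Hypothetical sketch: the protocol (message schema) is open and shared,
# while each node's internals stay private. Field names are invented.
import json

def make_order(item: str, qty: int, buyer_node: str, seller_node: str) -> str:
    """Any node can emit this open, well-specified message."""
    return json.dumps({
        "type": "order",
        "item": item,
        "qty": qty,
        "from": buyer_node,
        "to": seller_node,
    })

class PrivateSellerNode:
    """Proprietary internals; only the wire format is standardized."""
    def __init__(self, name: str):
        self.name = name
        self._inventory = {"rice-5kg": 40}   # private state, never exposed

    def handle(self, wire_msg: str) -> str:
        msg = json.loads(wire_msg)
        ok = self._inventory.get(msg["item"], 0) >= msg["qty"]
        return json.dumps({"type": "ack", "accepted": ok, "to": msg["from"]})

seller = PrivateSellerNode("shop-42")
print(seller.handle(make_order("rice-5kg", 2, "buyer-app-1", "shop-42")))
```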

OK, remember: without industry, there is no OTN.

Without private industry and private innovation, don't even bother attempting to create protocols and OTNs.

Satwik Mishra:
That's true. And this is what I've been arguing most of the time: the actual win for open transaction networks and open-source protocols will come when industry joins in and sees merit in it. That's the only way it can scale going ahead.

Dr. Pramod Varma:
We have seen that in the automobile industry, and we have seen that in the healthcare industry. It's not as if the hospital builds every device itself.

All of these have been unbundled through independent specifications. Eventually, industry stands up and says, enough of this custom game, let's come together and do something so that universality happens. It has to happen.

Satwik Mishra:
Now, I know there are many pilots ongoing across the world. In our paper, we've exhibited pilots from Brazil, Gambia and India. But it's happening in the developed world too, in the developing world, in low-income countries, and in Amsterdam as well.

Paint us a picture of what you envisage open transaction networks doing in different economies: a developed economy, a middle-income economy, low-income economies.

Would it play out differently? Would it serve the same purpose? How would it help communities and industry in different economies?

Dr. Pramod Varma:
In fact, the idea of an open transaction network has nothing to do with developed nations or developing nations. Nothing. It has relevance to every digital economy.

Now, which use case and which industry is most attractive depends on the context of that region or country, right?

Take the energy transition. Europe is going through an energy transition. They'll say, OK, everybody's going electric, because European countries are smaller and more compact, and hence electric cars are much more practical there than on very long highways.

In the US it might still be hard: not enough charging stations, long distances, so you might still have range anxiety. Because of that, in the European nations,

for example, we see a lot more conversations related to interoperability in charging infrastructure and in new-age energy trade. Energy production is going decentralized, behind the grid, solar, all kinds of sources, and energy storage is also going decentralized, with batteries and battery banks and all of that.

Gone are the days when energy was centrally produced from one dominant source and then distributed one-way to homes.

Today, it's being produced, stored, and exchanged in a two-way situation. When that happens, you need a lot more programmability and interoperability among battery systems, solar systems, and smart grid systems, because you don't want to pump all the energy into the grid at once; the grid might collapse, the grid might not take it.

So you need to readjust the load, reprogram, pause, and bring load back at the right time. And with price differentiation you have a lever to play with: you can sell at peak time versus nighttime, and so on, right?

All of this is actually coming through right now. If you look at the IEA's energy papers, a brilliant set of papers they produce, it's very clear: the composability, programmability, and dynamism of the whole energy world is changing. It is no longer a linear, one-way distribution problem.

Now people are asking the same question: what is the interoperability standard for smart meters, smart grids, programmable exchange and trade, and so on? Somebody has to sit down together and figure these things out.

Otherwise, we will create a bunch of silos, which is not so bad in the early phase, to prove the innovation out, but it will never scale the innovation further, right?
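As a toy illustration of the programmability he describes, here is a sketch of a battery that charges, holds, or sells back based on an hourly price signal. The prices and thresholds are made-up numbers, not from any IEA paper or real tariff.

```python
# Hedged sketch of "programmable energy": a battery decides, hour by
# hour, whether to charge, hold, or sell back, based on a price signal.
HOURLY_PRICE = [3.0, 2.5, 2.0, 2.2, 4.5, 6.0, 7.5, 5.0]  # cents/kWh, invented

def dispatch(prices, capacity_kwh=10.0, charge_below=3.0, sell_above=6.0):
    level, plan = 5.0, []
    for hour, price in enumerate(prices):
        if price <= charge_below and level < capacity_kwh:
            level += 1.0                       # cheap power: store it
            plan.append((hour, "charge", price))
        elif price >= sell_above and level > 0:
            level -= 1.0                       # peak price: sell back
            plan.append((hour, "sell", price))
        else:
            plan.append((hour, "hold", price))
    return plan

for hour, action, price in dispatch(HOURLY_PRICE):
    print(f"hour {hour}: {action:6s} at {price} c/kWh")
```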

So, we are seeing developed nations focusing on newer problems such as energy, climate resilience and sustainability, the circular economy, and sustainable agriculture. In the US we are having this discussion with Stanford and Berkeley on energy and sustainability, and in Portland we have a pilot that's supposed to kick in on climate-resilient townships.

How do you reduce coordination cost at the time of a disaster? Because at the time of a disaster, everything is on random WhatsApp groups and you don't know what to do. So how do you create a coordination fabric and reduce the cost of coordination when a disaster happens? That, too, is protocol thinking, because everything there is a network of providers and consumers, and you have to connect them.
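A hedged sketch of that coordination-fabric idea: match relief requests to providers through a shared registry rather than ad hoc messages. All data here is invented; a real network would sign and route these messages over an open protocol.

```python
# Toy coordination fabric: a shared registry of providers, queried to
# match relief requests. Everything here is illustrative sample data.
providers = [
    {"id": "ngo-1", "item": "water", "qty": 500, "zone": "north"},
    {"id": "shop-7", "item": "blankets", "qty": 80, "zone": "south"},
]

def match(request):
    """Return providers that can serve this request in the same zone."""
    return [p for p in providers
            if p["item"] == request["item"]
            and p["zone"] == request["zone"]
            and p["qty"] >= request["qty"]]

print(match({"item": "water", "qty": 200, "zone": "north"}))  # finds ngo-1
```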

But in the US, problems like commerce, e-commerce, food delivery, and ride hailing are not that compelling. Drivers might be complaining and restaurants may be complaining, but it hasn't yet reached the point where people say we need an alternate model. On energy and newer problems, though, the US is thinking very keenly, and they keep asking whether there is an interoperability story for AI.

In Europe, you see new problems as well as old ones, because much of Europe has the Digital Markets Act and other interoperability pushes. And sometimes labour laws push platforms out: in Germany and Copenhagen, Uber and similar companies have had a tough time surviving because of regulations, labour laws, and other issues in Europe.

So they are also looking at alternatives. The Amsterdam project, for example, was open mobility and open commerce infrastructure. It was not an energy or climate problem; it was open mobility, ride hailing, in Amsterdam.

They wanted a more network-based approach so that multiple ride-hailing platforms, including public transport platforms, could join in. Then the consumer can choose between public transport, private transport, last-mile options, electric bikes, all the good stuff, because there is a lot of that going on in Europe, right?

The Global South is completely fragmented, because the Global South is not yet digitized. Even the largest ride-hailing platform in India does fewer transactions daily, across all of India, than a single city's metro, in Delhi or Bangalore for example.

India is so large and diverse that these platforms have only reached single-digit penetration. All of them, including the largest e-commerce platform, have only reached single-digit penetration in D2C commerce.

So India is not a monopoly problem; it's a fragmentation problem. That means much of the economy sits in a high-information-asymmetry, high-friction, high-cost, low-performing equilibrium.

So we are looking at a network approach to bring them all together into a unified, more formalized economy, so that everybody gets an advantage and every platform wins in this game. No platform will be a loser in India or in Brazil when we build the network, because it's an obviously large, under-penetrated country problem. In the Global North we see a different set of conversations.

So, networks are here to stay. Open standards, specifications, and protocols that interconnect many systems are the norm; we have had them for hundreds of years, and we are going to continue. What differs between countries is which use cases are truly compelling.

Satwik Mishra:
But as these use cases and pilots go forward, another aspect we should discuss, and I'd love your views on it, is risk. What risks do you foresee in these ecosystems as they are piloted in different industries and geographies, with different economic conditions and incentives? How should we think about risk in this space?

What should we be mindful about as we go forward?

Dr. Pramod Varma:
Yeah. By the way, just for the audience: much of the protocol story is decentralized by design. There's no data centralization or flow centralization, so no individual institution sits in the middle controlling it. The protocol allows you and me to exchange money; that's about it, right?

Now, there might be a bank where my money is stored. That's fine; other than that, it really is decentralized. And when it comes to commerce or ride hailing in India, for example,

it's between the driver and the passenger, so it's no big deal. But there are two big risks we should always look at. One is business viability risk. For some use cases, you have to really analyze the society in question.

What is the industry like in that society, in Brazil, India, Gambia, Amsterdam, the US, whichever country? In that environment, do we have enough private entrepreneurs who are ready to play the game?

As we said earlier, without industry, without private industry, there is no network play. A network is only meaningful when there's a private entrepreneurial play going on.

OK, remember, it has nothing to do with government. So we have to analyze, point one, the use case. What is the use case? Which is most attractive?

That means: what is the most value-unlocking use case in that society, with the least friction? Are there entrepreneurs ready to play the game? Because it's like the Internet: if there's a new playground, you need new players, and new players need an infusion of VC funding. All of that has to happen.

If everybody is off building another LLM right now and there's not enough money, then maybe there's a business risk here, right?

There may not be enough entrepreneurs to bootstrap, and networks need bootstrapping. Remember, a network is about two sides; it's a chicken-and-egg problem. Without supply, there's no demand; without demand, there's no supply. So we have to bootstrap these networks through use cases that unlock the most value with the least friction for the society.

We need entrepreneurs there. And if you don't get the context or environment right, there is a very high risk of a failed network, because you went after a problem statement for which neither the society nor the entrepreneurs were ready. We have to get society, entrepreneurs, and funders all ready for it.

So that is very important. The second risk is contract and consumer protection risk. When you unbundle through a network and then rebundle dynamically through a programmable rebundling layer, you might be sending money and buying something on app one from a seller on platform two, and platform two is interacting with platform three to transport that product, and everything should arrive nicely.

What if it doesn't arrive? How does consumer protection work? How does contract adherence work? If everything ends up in court, then we have a mess on our hands, right?

So we have to think through a contract and trust layer, basically trust by design in the network. Trust design includes participation agreements, SLAs, compliance and contract adherence, consumer protection, and grievance handling, tracking grievances to make sure they are not going through the roof, because then the network is going to collapse.
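To illustrate one small slice of such a trust layer, here is a sketch of a neutral facilitator tracking grievances per node and flagging nodes whose complaint rate breaches an agreed threshold. The 2% threshold is an assumption for illustration, not a number from any real network's rules.

```python
# Sketch of a grievance-tracking slice of the trust layer: a neutral
# facilitator counts transactions and complaints per node and flags
# nodes that breach an agreed SLA threshold (2% here, an assumption).
from collections import defaultdict

class GrievanceTracker:
    def __init__(self, max_complaint_rate=0.02):
        self.max_rate = max_complaint_rate
        self.transactions = defaultdict(int)
        self.complaints = defaultdict(int)

    def record_transaction(self, node: str) -> None:
        self.transactions[node] += 1

    def record_complaint(self, node: str) -> None:
        self.complaints[node] += 1

    def flagged_nodes(self):
        return [n for n, tx in self.transactions.items()
                if tx and self.complaints[n] / tx > self.max_rate]

t = GrievanceTracker()
for _ in range(100):
    t.record_transaction("platform-A")
for _ in range(3):
    t.record_complaint("platform-A")
print(t.flagged_nodes())  # ['platform-A']: 3% complaint rate breaches 2% SLA
```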

So there is a need for a neutral facilitator. The neutral facilitator doesn't have to be government; in fact, according to me, it should not be government. It should be a market alliance, a GSMA-type market alliance, some sort of alliance rising up and saying: we like the idea of the network.

New entrepreneurs are enthused about doing it, because we can now do it at scale: money movement across the world and all of that. But for us all to talk together, like the GSM players did, we need a GSMA, so that we can sit across from each other and say: guys, the tech has to be trustworthy, compliance and complaints have to be handled.

And money contracts: when I call from my Airtel phone here to an AT&T or Verizon phone, money is also being moved between companies, right?

What if someone stops paying? Who takes the risk? The network will start collapsing if people start cheating each other.

So you need all of this. You need a neutral party to make sure there is a healthy, trustworthy network. So, two parts: economic and business viability, and trust and consumer protection, the trust of the whole network. These are the biggest risks of any network. And if you don't get them right, you'll conclude that the idea of networks doesn't work. No, the idea of networks always works.

The use case may not work if you don't get the context right, or the way you went about it: in terms of governance, grievance handling, and consumer protection, you didn't think it through well.

And hence a massive number of consumer complaints build up. When complaints rise, the network automatically loses trust and collapses on itself.

Satwik Mishra:
This sets you up for a question I've been meaning to ask you for about a month and a half, since the paper was released, and we just haven't had time to speak. You were kind enough to review the paper and give your thoughts, and there is a part where you talk about explaining the technical aspects of this new way of thinking to our leaders.

You have this quote, which I'm paraphrasing, that says something to the effect of: it's not the technical interoperability that is the challenge, it's the structural trust, creating trust in the network, that is the challenge, as you've spoken about earlier.

A network play will require multiple parties to come together for every unique transaction. With these pilots coming up across the world, what are some of the lessons in building structural trust in open transaction networks? And given your experience with other ecosystems developed over the last 10-15 years, how important is trust in technology for open transaction networks as we go forward?

Dr. Pramod Varma:
Yeah, so it's very clear that anyone wanting to set up an open network, for commerce, mobility, energy, circular-economy networks, education networks, skilling and jobs networks, whatever networks you want to create, and we have covered some of those examples in your paper, needs, as I said earlier, two things.

One, a very systematic analysis of use-case readiness: is the society really ready? Are entrepreneurs really ready to play this game?

Otherwise, take a different use case, because not all use cases may be ready, for many other reasons that have nothing to do with the protocol or the technology, right?

Maybe it's just not mature enough. And maybe society isn't ready; with healthcare, for instance, we see that some habits in society are so strong that people are not ready to migrate to a digital health doctor. I would rather go to a physical doctor near me, right?

Things like that. It's just cultural habit, and so on. So whoever is attempting the network must have a systematic, objective way to answer the questions of use-case readiness, consumer readiness, and entrepreneurial readiness. That's point number one.

If you don't do that, you'll walk in saying: it intuitively makes sense to me, I want to do the same thing in Gambia. But Gambia may not be Amsterdam. Different culture, different habits, different entrepreneurial energy, different funding structures, different people.

So you have to analyze; don't copy-paste. You have to do the analysis in that context. That's point number one.

Point number two, and this is our learning:

You must incubate some sort of by-the-entrepreneurs, for-the-entrepreneurs coalition, cooperative, digital collective, alliance, whatever you want to call it. You must create it and ideally keep it not-for-profit and neutral, because the playground maker can't also be a player; that becomes very messy.

So create a playground maker, with all these entrepreneurs as members of that collective, ideally kept not-for-profit, so that you have a round table where you can actually discuss trust and contract issues, pricing issues. When you exchange, there might be commission structures,

commission money that everybody has agreed to under some terms. Those conversations have to happen in a neutral, facilitating fashion, not a controlling fashion.

It can't be a for-profit entity saying you shall do it, because then the network falls apart; then it's a platform play. In that case you should play the platform game, the closed-loop platform game, not an open network game.

So our learning is that you need to incubate such a facilitating entity very early and create the environment for conversation and co-creation of new governance rules. Remember, no governance rules can be created upfront.

Rules have to evolve as the network evolves. So you need a mechanism, a facilitating environment, that allows such conversations to happen and such rulemaking to happen in a participatory way, so that everybody on the network, all the nodes, have a voice and can participate.

Create an open and transparent mechanism for those conversations, so that trust is established, and so on.

At the protocol and technology level, a lot of this is taken care of: digital signatures, cryptographic contracts, cryptographic agreements. All of that is handled; it is not the worry.

But technology alone won't solve it. Technology plus governance conversations will make it a much more trustworthy network.

So when you instantiate a network, you need that facilitating organization to be instantiated too. That creates an environment for co-creation and open governance. This is a learning; there's no way to skip it. A by-the-industry, for-the-industry facilitating organization ought to exist, not-for-profit and ideally neutral. Without it, things get messy.

Satwik Mishra:
As we go towards the end of our paper, we end on sustenance: how do we create sustainability in these open transaction networks going forward, given that they're just starting off?

Do you think these lessons will accumulate over time into a mixed set from different geographies, more of a plug-and-play basis? Or, across this diverse set of pilots, will we be able to create some common standards and common schemes for the trust and sustainability of transaction networks?

Dr. Pramod Varma:
Excellent question.

There will be common patterns, but contextualization of those patterns is also necessary. Common patterns will emerge, and it will be wonderful if multiple networks can come together, publish their learnings, and publish their governance structures.

For their own transparency, anyway, they should publish and exchange knowledge. But that should not become a controlling or regulating model.

It should be purely: I don't need your blessing, you don't need my blessing, but I will share my learnings and you share yours. I'm old enough to have seen this.

We are certainly going to see patterns emerge, not just one pattern. But contextualization must happen, because countries differ. In Singapore, for example, nobody even doubts contract adherence; if you sign a contract with me, you will honour it.

But in India or another developing nation, signing a contract is just signing a contract. After that, all kinds of drama can happen, right?

So there we have to put an additional layer of governance so that things don't end up in court, because the courts are also clogged in India. Different contexts have different ways to govern. So let them be; let them contextualize it.

But common patterns will emerge, and they must be shared. Ideally they must be shared; that's the only way we are going to learn from each other.

Satwik Mishra:
We have managed to do something which rarely happens. We’ve managed to speak for about 50 minutes without bringing up AI.

But I can't let you go without asking. It is fascinating what's happening, and as somebody who has been following this for the last 10-12 years, you'll have even more context on it.

We are truly in a unique moment right now with the development of AI and what it's offering. What do you see as its role in open transaction networks?

What do you see as the productivity frontier that AI could bring about in this new play we're talking about?

Dr. Pramod Varma:
Excellent question.

We have been talking about the idea of AI together with open transaction networks as public infrastructure.

A lot of us feel AI is a massive catalyst to boost many of the things the network is trying to do. We keep saying it's DPI to the power of AI. When AI and DPI come together, or in our case OTN and AI, it is not 1+1; it's no longer an additive effect, it's an exponential effect.

That's because the consumer interface on these networks can now be supercharged through AI. I can have a natural-language bot doing the ordering. People can speak broken Hindi or Swahili and say: yeah, I want to order this, I'm not sure.

Can you first search for the price? What is it? I'm interested in buying this, all in broken, grammatically incorrect language, and suddenly AI is able to dissect it and provide a new way of interacting with computers beyond keyboard and touch. The keyboard itself was completely geeky; frankly, hardly anybody could really use it.

Touch at least allowed a lot more humans to use computers, including my mother. But look at transactions in India.

When it comes to transactions, only about 100 million people actually do self-service transactions, at best maybe 200 million. We have 1.4 billion people. Think about it; that's still under 20 percent, right?

What about the other 80 percent of the people? How do they fill out a form? Form filling, although digital, is very archaic.

A text box here, a list box there, select your country, radio buttons, a text area. For you and me, naturally, we'll just fill it in, because in English I more or less know what to do.

Most people can't fill this stuff in, so they'll always need assistance: please help me fill out a bank form or a loan form.

It's not at all easy; it's so unintuitive. It's very silly that we create all these forms. AI can change this dramatically, creating voice-based, next-generation human-computer interaction.
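As a toy stand-in for that language-model step, the sketch below turns a messy utterance into a structured query a network could act on. A real system would call a multimodal or language model here; this keyword matcher, with an invented mini-catalogue, only illustrates the shape of the input and output.

```python
# Toy intent parser standing in for an LLM call: turn a broken-language
# utterance into a structured query. Catalogue and prices are invented.
CATALOG = {"rice": 52.0, "dal": 110.0, "oil": 140.0}  # rupees/kg, sample data

def parse_utterance(text: str) -> dict:
    tokens = [w.strip(",.?!") for w in text.lower().split()]
    items = [w for w in tokens if w in CATALOG]
    wants_price = any(w in tokens for w in ("price", "cost", "kitna"))
    return {"intent": "price_check" if wants_price else "order",
            "items": items}

query = parse_utterance("yeah I want buying rice, what is price?")
print(query)  # {'intent': 'price_check', 'items': ['rice']}
if query["intent"] == "price_check":
    for item in query["items"]:
        print(item, "->", CATALOG[item], "rupees/kg")
```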

At the same time, on the provider side, look at SMEs, small and medium businesses. Why do we struggle to help small and medium businesses and nano-entrepreneurs in India become digital? Large companies have cataloguing people.

They can digitize their catalogue, digitize their inventory, outsource the IT, create an API endpoint, all the stuff we casually say they should do.

But what happens when it's a one-person shop? We have millions of one-person shops: literally husband-and-wife shops, father-and-son shops, mother-and-daughter shops.

They have no time to catalogue properly. They don't even have the language to describe a product in a catalogue. But we think that with today's multimodal models, whether from OpenAI, Llama 3, or others, models that handle voice, vision, and language together, a shopkeeper can literally photograph their inventory and have it auto-catalogued.
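Here is a hedged sketch of that auto-cataloguing flow: photograph the shelf, let a model draft a catalogue entry, publish it. The `describe_image` function is a stub standing in for a real multimodal model call; all field names are invented.

```python
# Sketch of the auto-cataloguing flow. `describe_image` is a stub; in
# practice it would call a multimodal model (OpenAI, Llama-family, etc.).
def describe_image(image_path: str) -> dict:
    # Stub standing in for a multimodal model call; values are invented.
    return {"name": "Handwoven cotton towel", "color": "blue",
            "suggested_price_inr": 250}

def auto_catalog(image_paths):
    entries = []
    for path in image_paths:
        entry = describe_image(path)
        entry["photo"] = path           # keep a link back to the photo
        entries.append(entry)
    return entries

for e in auto_catalog(["shelf_01.jpg", "shelf_02.jpg"]):
    print(e)
```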

So I can create productivity tools and advanced toolsets, and supercharge the ability of people and small and medium businesses to join the digital economy and transact on an open transaction network.

We need to make coming onto the grid easy and the transacting itself easy.

A shopkeeper should be able to say: oh, I received a new order through the network, yes, I accept it. And then the system asks: when can you ship?

Imagine the computer asking me that, and I can simply say: I can ship the day after tomorrow, not a problem. No need to click a mouse, fill out a form, press accept order, nothing.

I can actually talk to the computer and accept an order as a shopkeeper while I'm still serving a customer. It's amazing what you can do with LLMs.

So AI can dramatically improve access, convenience, and productivity, and reduce cost structures in transaction networks.

We are very excited about that portion, AI supercharging OTNs. On the other hand, there is also the dream of AI agents interacting with agents: one day my agent will schedule an appointment with a doctor or book a vacation. But to book a vacation I had to search, I had to book a cab in the new city,

I had to book hotels, I had to book flights. Imagine agents doing all of this. How do agents interoperate?

How do agents contract? How do agents create micro-contracts that are legally valid, so that what you commit to, say a room booking, you have to honour, right?

So you need cryptographic techniques and interoperability guidelines to take care of agent-to-agent trust, agent-to-agent contracting, agent-to-agent discovery, and transactability between agents.
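One small piece of that stack can be sketched today: an agent signing a booking commitment so a counterpart agent can verify it. This uses Ed25519 signatures from the Python `cryptography` package; the message format is invented for illustration, not any real agent protocol.

```python
# Hedged sketch of agent-to-agent commitment: one agent signs a booking
# commitment; the other verifies it against the signer's public key.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

hotel_agent_key = Ed25519PrivateKey.generate()

commitment = json.dumps({
    "type": "booking_commitment",   # invented message format
    "room": "deluxe-204",
    "date": "2025-03-14",
    "price": 180,
}).encode()

signature = hotel_agent_key.sign(commitment)

# The travel agent verifies against the hotel agent's public key.
try:
    hotel_agent_key.public_key().verify(signature, commitment)
    print("commitment verified; booking is binding")
except InvalidSignature:
    print("invalid commitment; reject")
```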

So we also think OTNs could be a way to supercharge a humanity-scale agent network. That is part two of the story.

OK, so part one is simply OTNs getting supercharged through AI techniques: language, voice, cataloguing, all the good stuff.

Part two is OTNs enabling the dream of agent-driven networks to come about.

Satwik Mishra:
Dr. Varma, thank you so much, as always. There are far too many topics and too little time to discuss them with you, but hopefully we will continue this conversation through our work on open transaction networks.

It was an absolute pleasure to have you here and to learn from your insights.

Dr. Pramod Varma:
Thank you for having me. I very much look forward to your support and to continuing with newer publications on OTN.

The idea of networks is here to stay: different contexts, different use cases, but networks are here to stay. Interoperability has always existed and is here to stay. So it's good times ahead, I think, if we put our heads together.

Satwik Mishra:
Good times ahead. Thank you.

Chapter Notes

Artificial Intelligence in HR: Promise, Pitfalls, and Protecting Civil Rights with Keith Sonderling

The Unique Role of the EEOC (05:38)

Balancing Benefits and Harms on AI in the Workplace (13:41)

Multi-Stakeholder Approach at the EEOC (24:10)

The Narrative around AI in Policymaking (33:18)

Federal Government’s Role in Technology Guidance (35:35)

Looking Forward to Emerging Technology and Regulation (43:22)

Trust in AI in the Workplace (48:50)

 

Artificial Intelligence in HR: Promise, Pitfalls, and Protecting Civil Rights with Keith Sonderling

Podcast Dialogues Transcript

Satwik Mishra

Welcome everyone to another episode of Trustworthy Tech Dialogues. I’m Satwik Mishra, Executive Director, Center for Trustworthy Technology. Today, we are exploring a topic that affects each of our professional lives, ‘The interface of artificial intelligence with human resources.’

In a world where AI increasingly interfaces with workplace operations, particularly in HR, the question arises: how do we ensure that this technology is used ethically, responsibly, and with a clear commitment to civil rights, in order to advance societal trust?

Now, on the surface, AI offers tremendous benefits. It can streamline recruitment by efficiently analyzing resumes, it can remove unconscious bias from hiring decisions, and it can create more tailored, data-driven approaches to employee management. But with these capabilities come significant challenges and considerations. How do we develop AI systems that are human-centered? How do we ensure that AI in HR (Human Resources) respects civil rights, promotes equity, and avoids perpetuating discrimination?

Trustworthy AI adoption isn’t just a technical challenge, it’s a deeply ethical one. It requires the collaboration of technologists, policymakers, business leaders and civil societies to ensure transparency in design, development and deployment. It demands a vigilant focus on safeguarding the dignity, integrity, security and privacy of data that AI systems are built upon. Because at the heart of every HR decision is a person, and that person’s rights and autonomy must be respected.

Today, we will explore all of this and more. We’ll talk about the importance of building AI systems that not only work for businesses, but also enhance fairness and protect civil rights within the workplace. And of course, we’ll dive into practical aspects like – What does AI mean for the future of HR? How can we harness this power responsibly? And what are the steps that we need to take to ensure that AI is used to empower, rather than undermine, the human experience at work?

To discuss these pertinent and timely issues, it is my absolute privilege to welcome the Commissioner of the U.S. Equal Employment Opportunity Commission (EEOC), Keith Sonderling, in today’s podcast. Commissioner Keith was confirmed by the US Senate with a bipartisan vote, to be a Commissioner on the US EEOC in 2020. Prior to his confirmation, he served as the Acting and Deputy Administrator of the Wage and Hour Division at the U.S. Department of Labor.

Before joining the Department of Labor, he practiced Labor and Employment law in Florida. He was a partner at one of Florida’s oldest and largest law firms, Gunster. At Gunster, he counseled employers and litigated labor and employment disputes. Commissioner Keith also serves as a Professional Lecturer in the Law at The George Washington University Law School, teaching employment discrimination.

Since joining the EEOC, one of Commissioner Keith’s highest priorities has been ensuring that artificial intelligence and workplace technologies are designed, developed and deployed consistent with longstanding civil rights norms.

Commissioner Keith, thank you so much for being here.

Keith Sonderling

Thank you for having me.

Satwik Mishra

All right. So, let’s dive in. I want to start off with a little bit about your journey at EEOC.

You’re currently wrapping up almost four years as the commissioner. Walk us through your path to this role. How were you introduced to the intersection of technology and politics?

Keith Sonderling

Well, it’s a great question. And to get there, you have to take a step back and learn a little bit more about my agency here.

So, the US EEOC, for all intents and purposes, we are the regulator of human resources. So, our mission is to prevent and remedy employment discrimination and advance equal opportunity for all in the workplace. So, when you think about some of the larger issues the workforce faces with discrimination, preventing discrimination, all the issues related to the #MeToo movement, diversity, equity, inclusion, social issues in the workplace, pay — That’s my agency.

So, you may ask, what is so significant about this agency and why we’ve been able to take such a global role in ensuring workers have protections related to not being discriminated against in the workplace, which are civil rights protections in the United States? That is because of our founding.

And this agency was founded out of the civil rights movement in the 1960s. Martin Luther King Jr. marching in Washington, DC, led to the Civil Rights Act of 1964, which created Title VII of the Civil Rights Act. Title VII gives employees, applicants, and even former employees civil rights protections in the workplace against discrimination based on characteristics such as sex, national origin, color, race, religion, age, disability, and pregnancy: all these very big-ticket items.

So, the United States was one of the first to create these civil rights protections in the workplace. Because of this very storied history, a lot of governments around the world have modeled their own employment laws and civil rights protections for workers in the workplace. So as the global leader, when it comes to these important issues, we really can set the pace for not only the compliance side, but also the enforcement side, because at our core, this is a civil law enforcement agency.

So, that’s a little bit about our mission and how we’re here. And I would like to call that the day job because, what’s also unique to US based employment law is that for most claims, you cannot just go sue your employer in court. You have to come to the EEOC first. Whether you work for a private company, state or local government, or even the federal government.

We see every case of employment discrimination for the most part in the United States. So, it really puts us in a very good position to see what those trends are and to see what the hot issues are. So now, to answer your original question, how did I become interested in the intersection of technology in workplace laws and workplace rights?

The answer is: when I got here, I really wanted to focus on an issue that was going to be proactive, an issue that would help not only workers but also employers comply with these very complicated federal laws. And if you look just at the recent history, the last 10-15 years of this agency, very much like HR departments and corporate governance departments, we've had to shift our priorities based upon what's in the news.

So, let's go back to the recession post-2008. There were a lot of claims of employment discrimination related to age, because a lot of older workers were disproportionately laid off after the recession; they were earning more. So, we started focusing on age discrimination. And then the global #MeToo movement happened, and we had to stop everything and focus on preventing sexual harassment, which of course has been illegal since the 1960s.

But that raised awareness. Then the U.S. women's soccer team made pay equity national news. Then COVID happened, and everything related to vaccines, working from home, and those accommodations was right at the center of our agency. Then came George Floyd and the social movements that followed, around racial injustice in the workplace. So, we had a focus there.

My point is that there's always going to be something, for us and for companies on the compliance side, that distracts based upon what's happening in the news. So, when I got here in 2020, I asked: how do we get ahead of the next #MeToo movement? How do we start thinking proactively about what the biggest issue for the workforce is going to be, and how do we tackle it?

So, I started talking to corporate leaders and a lot of different stakeholders, both on the employee and employer side. And they told me that artificial intelligence in the workplace is going to be the biggest issue moving forward. And like many, especially in the government, I didn’t really understand what that meant. For me, it kind of took me to this dystopian vision of these robot armies replacing human workers.

And I thought, maybe that's happening. You see it in manufacturing, logistics, some retail and fast food, where actual robots are doing the job. But here at the EEOC, we regulate every single industry across the board, so I needed something big enough to tackle, not just a couple of industry-specific things. Then, when I started learning about it, I said, wow, it's a lot more than just replacing workers.

Which is funny, because four years later, after generative AI, we are now actually talking about replacing a different kind of worker, our knowledge workers, which we can talk about. Then I realized how artificial intelligence, machine learning, natural language processing, all these buzzwords I had to learn, were already being vastly used within the HR function, not to displace workers, but to perform HR functions that HR managers have been doing their entire careers.

That meant having software designed, developed, and deployed to make decisions better than humans make them. Not just more efficiently, not just more economically, but addressing the biggest problem in human capital management, in human resources: the H, the human. Because a lot of it was, and still is, based on the idea that if we can remove the humans from these decisions, we will be without bias.

Then humans can't inject their bias into employment decisions, which we know have plagued employment decisions before and since these laws arrived. Taking a step back again: in the last two years, this agency has collected $1.2 billion from employers for violating these laws. Entrepreneurs and technologists look at stats like that and say, well, how can we have artificial intelligence, computers, do this better? And as I started to explore it, I said, yes, if properly designed and carefully used, AI can potentially make decisions without bias. But I also realized at the same time that if the AI is not properly designed, or not carefully used, it can scale discrimination to a degree we've never seen before.

So it really pushed me to understand not just the AI industry and the technology, but how impactful it has been, is, and will be for our space. That's why I decided to make this a priority for the agency. And a lot has happened since then.

Satwik Mishra

The EEOC absolutely has a fascinating history.

And like you said, with your term, building that proactive lens of getting ahead of the curve, you really embodied that. And might I say you are being too humble when you say you didn't know enough, because I've read everything you have published on the topic, and we've all learned so much from your work.

It's absolutely fascinating. Let's talk a little about your work at the EEOC now. As you say, over the last three to four years the focus has been on artificial intelligence and the changes it has brought, and you've focused extensively on the balance between the benefits and the harms of using it in the workplace.

Walk us through the factors that go into weighing this balance. And secondly, from your perspective, which potential benefit or harm do you think is most misunderstood or mis-valued in the workplace right now?

Keith Sonderling

Not being a technologist myself, and being in the executive branch rather than in a body like Congress that can create new laws, I've always understood the restrictions and confines we operate under, and that's how I knew we had to operate to be impactful. Taking a step back: at the end of the day, there's only a finite set of employment decisions, hiring, firing, wages, training, benefits, promotions, and so on. And that's what we regulate.

And that's what we've regulated since the 1960s. From our perspective, in analyzing these AI tools, don't forget that companies are making those decisions with or without technology, and have been making them for as long as companies have existed. So, in a sense, I try to simplify it and say: we regulate the outcome of the employment decision, whether you're using AI to completely make that decision or to supplement or augment it.

There's still going to be an employment decision, and that's what we know best. The question is simply: is the employment decision based upon the candidate's or employee's ability to perform the job, their skills and merit? Or is it based upon bias, which is unlawful? To simplify things, I always went back to those basics.

And, when you talk about artificial intelligence making all those decisions in the HR space, just so everyone understands: there's AI out there for every type of employment decision an employer makes. There's software that will completely draft the job description or job advertisement, software that will run the advertisement, software that will collect the resumes, review the resumes, rank the resumes, and determine which candidates to interview.

And then, of course, there's software that will conduct the entire interview, from scheduling to substantively asking the questions to grading the interview, and software to determine whether to make an offer and how much to pay the person. And once they accept the offer, there's AI software that will look at your social profile and tell you where you should physically sit within the office and which colleagues you should interact with.

And when you start working, there is software that will be your boss: it will tell you your assignments for each day, grade you, determine your performance reviews, and determine whether you get promoted or demoted. And there are AI algorithms that will tell you you're fired if you're not meeting expectations. So, this isn't hypothetical. This exists right now for every type of employment decision employers are making, and they're making those decisions with or without technology. And when it comes to using technology for these decisions, we've historically relied on humans.

I've taken the approach of asking: can AI help you make better, more transparent, more fair employment decisions, based on the ability to do the job, on neutral metrics, on industry standards, without human bias?

Absolutely. Is it the best thing in the world? Is it going to completely revolutionize everything? Maybe! Is it the worst thing in the world? Is it going to discriminate against everybody? Maybe! At the risk of sounding like a lawyer, which I am: it all depends. And that's the trickiest question. I always felt that here at the EEOC, we shouldn't be in the position of telling employers not to use certain software, because if we are, they're going to fall back on what we've been dealing with all along, humans, which is what led to the creation of this agency in the first place.

The decision you have to make is what software are you going to use? What purpose are you going to use it for? What are your existing policies and practices for that? And when you’re using these tools, how are you going to comply with the longstanding civil rights laws?

And that is so complicated, because every use of AI technology calls for a different analysis. Say you want to use facial recognition in a Zoom interview to determine a candidate's score: how much they're looking at the camera, how much they're smiling, grading that on a score. You are allowed to do that, except in certain states, of course.

But is that going to give you the best candidate? Or will it potentially lead to disability discrimination, because the person can't smile, or national origin discrimination, because it's not culturally appropriate to smile in that situation, even though that person would be the best employee? That is something you need to account for and assess.

And even with some of these programs that look for the best candidates, are they sophisticated enough to deal with some of the classic discrimination examples we've seen in the AI space, and specifically in employment? We can't factor in whether you're male or female, and yet you've seen cases where the vast majority of applicants are men.

And the AI concludes that being male must be the number one qualification for the job, because so many men are applying; that it must mean something more than skill, more than the ability to do the job. It will then play an unlawful role in the determination if the AI isn't sophisticated enough to exclude those characteristics, the ones you're not allowed to base an employment decision on, and actually look at merit.

So much of it is what I said earlier. There's the design side, which the vendors need to deal with, and then the use side, which the companies have to deal with. In both of these examples, whether it's data-set discrimination or failing to account for disability, national origin, or color, the vendor can design the software so that the AI does not look at any protected characteristics, such as male or female. Even if 90 out of 100 resumes are from men, the AI is not going to look at that; it's going to exclude and mask all of it, and look at the underlying actual skills that the more qualified applicants in the first round had, and then ask where the women rank on those same skills.

Where does everyone rank neutrally based upon skill? That is how hiring is supposed to work. Or, in the other example, how are these tools, especially once you start dealing with natural language processing, going to account for a foreign accent? How are they going to account for disability, to allow those individuals to be treated equally, which is what US law requires? 'Equal' is in our name. So much of that is in the design of the program, which the vendors control. And on the flip side, you can have the most perfectly designed algorithm and the most perfectly designed tool, but if you don't have those governance constraints within your organization, if you don't have that training, the vendor can do everything they promised

and somebody within your organization can still use these AI tools, now faster and more efficient than ever, to make an unlawful hiring decision, whether by including or excluding certain people based upon characteristics you're not allowed to consider. So, in the aggregate, it's difficult to paint one broad brush about whether AI is going to revolutionize everything in the HR space or cause significant discrimination. It could do both.
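As a minimal sketch of the "mask, then rank" design described above, the snippet below strips protected characteristics before scoring so that ranking rests on skills alone. The field names and the toy scoring rule are invented, not any vendor's actual method.

```python
# Minimal sketch of "mask, then rank": drop protected characteristics
# before scoring so candidates are ranked on skills alone.
PROTECTED = {"sex", "age", "race", "religion", "national_origin"}

def mask(candidate: dict) -> dict:
    """Remove protected characteristics before any scoring happens."""
    return {k: v for k, v in candidate.items() if k not in PROTECTED}

def score(candidate: dict) -> int:
    # Toy skills-only score; a real system would be far richer.
    return len(set(candidate.get("skills", [])) & {"python", "sql", "ml"})

applicants = [
    {"name": "A", "sex": "male", "skills": ["python", "sql"]},
    {"name": "B", "sex": "female", "skills": ["python", "sql", "ml"]},
]
ranked = sorted((mask(a) for a in applicants), key=score, reverse=True)
print([c["name"] for c in ranked])  # ['B', 'A']: skills, not sex, decide
```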

But that is on this community. Vendors, users, employers, developers, all the different stakeholders have to come together to deal with these factors, because unlike using AI in other areas, like reviewing documents or making shipping routes faster, in employment you're dealing with what you said in the intro: fundamental civil rights.

If the AI makes a mistake here, you're excluding somebody from the workforce based upon their religion, their age, their national origin, a disability, and so on, and preventing them from providing for their family, at scale. I think that's the difference here, and why leading with the workforce, leading with HR, from our perspective at the EEOC, has drawn so much attention: everybody gets it.

Everybody has entered the workforce, everyone has had a resume, everyone has had a boss. And the issues we're dealing with around this AI software apply equally whether it's used in the United States, the EU, or Asia; the core issues are the same.

Satwik Mishra

You've said this in several other interviews and in your papers, but one of the themes is to look under the hood.

It's not just technology; it's affecting people, affecting different communities. It could elevate them, or it could fail them. Based on what you said, it depends on how we design it and how conscious we are in our decisions about why we are deploying these technologies. In your role, you have also engaged, as you said at the beginning, with corporate leaders, with civil society, and with communities to understand what is happening. Walk us through that approach.

How does the EEOC approach and engage in multi-stakeholder dialogue to work through these critical issues, and how important has that been in framing your view of the current situation?

Keith Sonderling

I think it's probably the most important thing I was able to do here related to the AI governance world, to regulations, and to usage.

And let me just explain why. Again, you have to take a step back and look at our history. Since this agency has been around, we've had a very small world. I mean, a very big world, every employer and basically every employee, I don't want to minimize that. But our stakeholders were limited.

They were employee groups, employer groups, unions, and staffing agencies; that was everyone in this world. And we never had technology that was making or assisting with employment decisions. So, we've operated in a world of people who understand the issues here. But now you have technology coming into play and essentially making those decisions, or being a big part of them.

Then I looked at who makes up this new world of tech and AI, especially in the workplace, and it's such an expansion of those stakeholders. Where does it start? At the very beginning: the VC money going into this space, money that's used to going into tech. I realized that they need guidance and tools about what technology to invest in, because they're not steeped in federal employment law, and they do not want to invest in a product that's going to violate civil rights.

Then you have entrepreneurs who are extremely talented at actually making the technology, whether it's coding the AI or the business ideas that could revolutionize some of these hiring processes. And the common theme is that they are entrepreneurs: tech-savvy, but not HR professionals.

They speak a different language as well, and they don't want to use the money they just raised to build a product that violates civil rights laws. Then you have what I'd call the group in the toughest position, in the middle: the companies who need to decide what products to buy and deploy, dealing with the vendors on the front end.

Now they need to understand the language the vendors are speaking, and they're used to buying off-the-rack SaaS software, which doesn't work in our space, because the rights and requirements of each individual employee are different. And after determining which products to use, they have to determine how to use them on their applicant pool and in their workforce.

Ultimately in our space, which we'll talk about, the company using the product is going to be 100% liable for those decisions. And then there's the final stakeholder, arguably the most important: the consumers. Who are the consumers in our space? Unlike other spaces, the consumers are the applicants and your current workforce.

You're going to have to speak a different language with them, and a different level of trust is required when they are subject to these tools. When I realized that this is our world and everyone is speaking different languages, I knew it was going to be a lot of work getting everyone on the same page, although everyone is on the same page about not violating the law,

you'd hope, about not violating fundamental civil rights. But in all sincerity, everyone was looking at it from a different direction. For the VCs: how do we invest in the right products and get a return? For the entrepreneurs: how do we grow these companies? For the companies buying the products: how do we use them properly and make more efficient hiring decisions with less bias?

So, it's taken a lot of effort to bring everyone together to have those discussions and hear all the different, unique concerns. But more importantly, it's about explaining how we're going to look at these tools, including in a federal law-enforcement investigation, which of course is the fear around using these products: here is what we're going to look at.

So many people wanted to say, this is so novel. We don’t understand how an investigation is going to work. And that’s where I’ve been trying to change the narrative and saying, look, right now our investigators are trained to look at employment bias. And how do we do that? By interviewing people by, unfortunately, subpoenas and depositions and litigations.

But if this AI chain can be done correctly through transparency, through accountability, robustness, all these words that I know you’re very familiar with, here’s what they mean in the employment space. What is your data set? for most people in this world, they don’t know what a data set is. But to simplify that saying, your data set, let’s say, it’s your current workforce.

If you’re going to be using AI, what are the dynamics of that? Is it imbalance there, what going to lead to an imbalance, result potentially on one side and how do you potentially justify that. If not? We’re just talking about the scalability here when you’re using AI and then if you’re ever challenged on an employment decision, instead of, having to retrace that now you may have a contemporaneous record that’s defendable exactly what you asked the algorithm to do.

Let’s just say, in a hiring case where you have AI help you with the job description, and you have AI do those things based upon industry standards, based upon your own standards. And you can then show that the AI was only looking at these qualifications that are necessary to do the job, and not the other factor.

That’s how we selected these applicants, we didn’t know what religion they were. We didn’t know what their sexual orientation was or pick any of the factors. Then you potentially have an undeniable record of a proper employment decision in real time. And that’s taking a step back to explain how then you can use that to disprove discrimination.
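To make the idea of a contemporaneous record concrete, here is a minimal sketch of what such a log entry might look like. Every field name, value, and structure below is hypothetical; nothing here is a format prescribed by the EEOC or any of the laws discussed.

```python
# Hypothetical sketch of a contemporaneous decision record: log what the
# algorithm was asked to do at the moment it was used. Field names and
# structure are illustrative only, not a prescribed or recommended format.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class ScreeningRecord:
    requisition_id: str
    model_version: str
    # Job-related qualifications the tool was instructed to consider.
    criteria: tuple[str, ...]
    # Protected characteristics deliberately withheld from the tool.
    withheld: tuple[str, ...]
    # UTC timestamp captured when the screening request was made.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

record = ScreeningRecord(
    requisition_id="REQ-2024-001",
    model_version="screener-v1.3",
    criteria=("5+ years Python", "SQL", "cloud deployment experience"),
    withheld=("religion", "sexual orientation", "national origin"),
)
print(record)
```

The point is not the exact schema but that the record is written at decision time, so it can later be produced as evidence of which factors were, and were not, in front of the tool.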

I think it really helps everyone understand, saying: now I understand why these tools need to be properly designed and carefully used, now I understand why these factors need to be accounted for. Because if you start from the end, a class action lawsuit or an EEOC investigation, which everyone fears, and this is what they're going to look at, well, I can show that, and I can build an algorithm around that.

An algorithm that's going to account for that. And I think that was just a different way of thinking. Most people have never experienced a regulator trying to be proactive. It goes to my theme of why I got involved in this: not only do I want this to work, because it can help eliminate bias, but I also worry about what happens if it doesn't work.

The scale of these lawsuits, the scale of potential victims of discrimination through AI, is just far greater than anything this agency or the courts had ever seen before, and that requires learning different languages and dealing with different stakeholders. Many, especially when you start looking at the global community, had never heard of the EEOC, yet they're creating products directly regulated by us.

Everyone has heard of the FTC on the advertising side, but that's just the first layer. When you actually get to the core, when you get to the bias side, which is one of the hottest topics in AI, we're the core regulator. So, it really took almost injecting myself into that conversation to have a stake in it.

And once we did, everyone said: no matter where you are in the world, we want to get your side right as well.

Satwik Mishra

So, one of my former professors at the Kennedy School, Defense Secretary Ash Carter, always used to tell us: just look under the hood. Look under the wheel.

Keith Sonderling

It's not that complicated. If you're able to build out the pipeline of decisions being made by technology, and if you can create those structures, you'll be able to navigate through all of this very well. But what you managed to do was absolutely fascinating, and you mentioned this too: changing the narrative. When you came in and took over, you went ahead of the curve, or tried to go ahead of the curve, on what the current marketplace challenges might be.

I want to talk about one more thing on changing the narrative, because it gets into the Washington, D.C. part of this, where it's just easy for people in D.C. to say: this is bad. Before generative AI, before people started citing the benefits, if you look at some of the oversight toward this agency, the original interest from Congress and civil rights groups about AI was all bad.

It's going to discriminate. It's going to cause issues for older workers, for disabled workers. It's going to discriminate based upon race, based upon national origin. You had heard some of those original horror stories, with the discrimination against women, the very basic data set discriminations, which come up largely across the AI space, whether it's employment, housing, or finance and credit. The original thought leadership on how AI can discriminate was all bad.

When I realized early on that it's no longer a question whether organizations, especially large ones, are going to use AI in HR, that's when we had to flip the narrative to, like I said earlier: how are you going to use it? What purpose are you going to use it for? Rather than: don't use it, because it's going to be really bad and discriminatory.

So, I think that is also something really important, because a lot of governments inherently have a very important role to play in preventing fraud, preventing discrimination, preventing violations of the law, and you really need to take that compliance side of it as well. Saying: if this is where the markets are going, here's what we're going to require, as opposed to scaring you into not using it.

And that was a very, very different approach that I tried to lead.

Satwik Mishra

This is absolutely wonderful. But on exactly this aspect, I'm going to ask you to step back a little and maybe put on a lawyer's hat. You're at the interface of artificial intelligence in the workplace and the EEOC's work, which has been absolutely fascinating.

But tell us, how does this role shape your perspective on the federal government's role in emerging technology guidance in general? There are a lot of issues that we're dealing with in the marketplace, and like you said, there are several agencies looking at it. From your experience, what do you think about the federal government's role in technology guidance?

Keith Sonderling

Well, this is a very complicated question. Again, I'm in the executive branch, so I take the very narrow approach: we cannot make new laws. We have to enforce the laws that are on the books, and the laws on the books at this agency are from the 1960s. Although they may be old, they're not outdated.

They're going to apply to all of these decisions. And it doesn't matter to us whether it's an AI making the decision or a human making the decision, because the employer is the only one that can make an employment decision under our laws. It's easy for me to say, with the authority and power we have here: this is how the law works.

This is what Congress has told us to do, and it's all we're going to do. Now, I can't ignore the complexities of the changing dynamics in how governments want to regulate AI, and how they want to regulate AI in the HR space. So, setting aside those executive branch limitations: you're right, I've learned a lot and I've seen a lot.

And I think the trickiest thing is understanding that, without additional resources, without additional help, we are not technologists here. If Congress decides to give us a lot more money to hire technologists to actually dive into the algorithms, into the development and the deployment from a technical side, yeah, that will significantly change things.

Then it would be a different conversation than just the lawyers dealing with the results rather than dealing with the technologists on the design, which we just don't have the ability or the funds to do. Really, most agencies aren't able to do this. I think what you're seeing now, in the meantime, is states and foreign governments trying to get involved, trying to fill some of the gaps that we can't fill without congressional authority.

And if you look at the HR space in particular, it's one of the more active ones when it comes to AI regulation. A quick, brief history: the first laws related to AI in the workplace came out of Illinois and then Maryland, essentially banning facial recognition in interviews. Then New York City Local Law 144 was the first comprehensive AI HR law anywhere in the world related to artificial intelligence being used to make actual employment decisions in hiring and promotions. Then obviously the EU AI Act, placing some of the employment use cases in the higher-risk category. And the same with Colorado, and a lot of proposals in California and New York State.

Recently, just last week in Illinois, they passed another law related to disclosures, and to not being able to use zip codes as proxies in artificial intelligence, because of the longstanding discrimination there. So, you're seeing everyone wanting to get involved in that space. My personal thought on this, having that national perspective, is that I commend them for trying to dive into this very complicated issue. But without a federal standard, without these requirements coming from us, especially for national or global employers, it's going to cause potentially more confusion than compliance.

I would like to use the example of New York. New York's Local Law 144 has its own definition of what an AEDT (Automated Employment Decision Tool) means. For them, the tool has to essentially, for the most part, make the employment decision, versus being able to say: we have some humans in the loop.

If it's actually just one of many factors, then it doesn't qualify under the law, even though it's clearly AI. And if it does qualify, you need to do a pre-deployment test and a yearly bias audit, and post those results. But if you look under the hood, what does that bias audit entail on the front end? It allows you to use other employers' data or synthetic data, which the EEOC would never allow.

And then it only requires you to do the audits on race, sex, and ethnicity, where we have many more protected characteristics. So, I just don't want people to be lulled into thinking that they're fully compliant if they do these little one-offs in the certain areas states care about, whether it's video or certain disclosures, and then think they're completely compliant and be surprised when the EEOC comes out and says: well, you're using it for terminations, show us that.

And they say: we didn't think that was regulated. So, you see the issues there. Then, you're starting to see some common threads across the board, whether it's overseas or here in the U.S., in the HR space. What are the actual themes we're starting to see? What's emerging is consent, and disclosure of the vendors and of how the tools are actually being used.

And also that bias auditing requirement. Even in the EU, with the high-risk systems requiring those audits, that takes you a step back to saying: okay, now we have to do audits if we're using AI in HR, whether we're in Colorado, California, New York City, or in the EU. How do we do that?

And if these states and foreign governments are going to start diving in, well, what's the standard for auditing bias in HR? Everyone across the world said they would look to the EEOC. And if you look at the EEOC, it is based upon our Uniform Guidelines on Employee Selection Procedures from 1978.

All right. So, it all comes back to this: as progressive as some of these new laws are, the basic standards of how you actually get there are based on us and are not going to change. And I've argued, too: whether or not you're subject to the EU AI Act, whether or not you're subject to New York Local Law 144, look at the potential benefits you can see there and use for yourself as a risk mitigation tool, such as the bias audits. I've encouraged self-audits since I started diving into this. Because if you're using the EEOC standards, which of course all these places require you to do, you now have the ability to test these tools before ever making a decision on someone's life.

Prior to this, when you did an employment test where you have some kind of assessment, you hoped it didn't discriminate. You did everything in advance to mitigate it, and then, if there was discrimination, you were liable for it. Now, with these AI tools, you can test in real time to see if the skills and everything you're asking the tool to evaluate for your business are actually finding you the best candidates, or if it's disproportionately excluding certain groups based upon something you could change that is not necessary for the job, which we see all the time. So, even though that's not required under federal law, and even though you may have to do part of it in some places, if you're doing this in advance voluntarily, you get to root out employment discrimination, avoid the risk that the tool goes wrong, and feel much more confident than you can with your regular employment tools.
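To make the self-audit idea concrete, here is a minimal sketch of the four-fifths rule check from the 1978 Uniform Guidelines referenced above, the usual starting point for adverse impact analysis. The group labels and applicant counts are hypothetical, and a real audit involves far more than this one ratio.

```python
# Minimal sketch of a pre-deployment self-audit using the "four-fifths rule"
# from the 1978 Uniform Guidelines on Employee Selection Procedures.
# All group labels and applicant counts below are hypothetical.

def selection_rates(outcomes: dict[str, tuple[int, int]]) -> dict[str, float]:
    """outcomes maps group -> (selected, total applicants)."""
    return {group: sel / total for group, (sel, total) in outcomes.items()}

def four_fifths_check(outcomes: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Return each group's impact ratio versus the highest-rate group.
    A ratio below 0.8 is commonly treated as evidence of adverse impact."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    return {group: rate / best for group, rate in rates.items()}

# Hypothetical audit of a screening tool's pass-through decisions.
audit = four_fifths_check({
    "group_a": (48, 120),  # 40% selected
    "group_b": (30, 110),  # about 27% selected
})
for group, ratio in audit.items():
    flag = "REVIEW" if ratio < 0.8 else "ok"
    print(f"{group}: impact ratio {ratio:.2f} [{flag}]")
```

Running a check like this against every protected characteristic the federal laws cover, not just the handful a local law names, is the kind of voluntary testing being described.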

So that's a long-winded way of saying: if these state, local, and foreign laws are pushing people to do things that they could be doing voluntarily anyway to prevent discrimination, I think that's a really good thing.

Satwik Mishra

So, one aspect of this is that we are definitely starting to see the emerging trends of regulation and what it will look like.

And that was a great answer for understanding what those emerging trends will look like, what's missing, and what has been done well. Now for the two questions that I definitely want to ask you. One of the issues that all of us deal with is: what will the future of work look like?

What do you think, from your perspective, going ahead? Let's exercise some foresight. I would never ask about ten years ahead, because no one would know. But let's say 2 to 5 years ahead, not on the regulation side, but on the future of work side. What do you think should and will change? And more importantly, as you said, we've been talking about the constants.

The civil rights movement, the civil rights laws, the 1970s interventions, which have stayed constant. What do you think will stay consistent, and what do you think will change in the next 2 to 5 years and beyond?

Keith Sonderling

Well, I think this conversation has completely changed since generative AI. And everything we’ve been talking about so far is using machine learning to make human employment decisions.

The decisions that humans were making: whether or not to hire somebody, whether or not to pay somebody, how to pay somebody. But now we're seeing generative AI come into play, and the displacement of jobs is no longer limited to factory workers, where you actually have a physical robot. It's coming for everyone, for the knowledge workers, across the board.

So, I think that is the number one immediate impact on the workforce. And from my perspective, I have significant concerns, like I do with any kind of AI in the workplace: how is it going to be implemented? There is now a rush to implement some of these generative AI tools, because every CEO, every board of directors is reading the same studies we read about. How is it going to displace 300 million jobs?

Is it going to make everyone 80% more effective at work? You don't need to hire for these jobs anymore; people don't have to do the work they don't want to do, and they'll be happier working. You're seeing a rush to say: well, how do we get that into our organization? And if that is not built in with not only ethical and legal safeguards but, more importantly, trust, it is not going to go anywhere.

It's going to lead to more claims of discrimination, because workers are going to feel that they're just being pushed out, that the employers are not using these tools in good faith, that they're actually using these tools to displace them, and that employees are training their own robot replacements. And that really goes to the messaging, to trustworthiness, the name of your organization, the Centre for Trustworthy Technology.

It really resonates in this context. Because you have to account for how disabled workers are going to be able to use generative AI, whether they need more time or more adaptive devices, and how older workers, who didn't grow up on their phones like the Gen Z workers, are going to be able to use these tools.

Are workers really going to believe that the company is implementing this to help them, or that it's trying to push them out? We've seen a lot of workers who've been doing their job superbly for 30 years, with the highest performance reviews, and now suddenly, if they don't become a prompt engineer 30 years into their career, they're going to be forced out.

So, I think a lot of this on the front end is just how we message it: ensuring that workers with disabilities and older workers, who are more vulnerable to these issues of job displacement, are going to have the tools they need as well, or it's not going to work.

Before we can even get there, from our perspective, we don't want to see this lead to more discrimination. We want all those safeguards in place to ensure that everyone can use these programs properly and that there's no fear and anxiety. There have been studies showing that many workers are very anxious about these tools.

And that is leading to mental health issues in the workplace. So, you can see, from my perspective and from the laws we enforce, my head immediately goes there and asks: how are we going to do this right? What safeguards are going to be in place to make sure that's happening? And then, switching gears, when this inevitably comes, and it makes people much more efficient and they don't have to do the jobs they don't want to do, and so on, as all those studies say.

Then we start getting into: who is going to get the employment opportunities to reskill, to retrain, to start getting the newer jobs that are going to be created out of this? Historically, we see these job displacements impact certain groups over others. And there are studies showing that the workers who are going to be displaced immediately are in lower-wage jobs, and that it's going to impact certain groups more than others.

So, I think these are really important dynamics that have to be in play. Because the worst thing is for a company to spend millions of dollars on a generative AI tool that it thinks is going to help its workers and make them more efficient, and it leads to mass claims of discrimination because it disproportionately eliminates certain jobs that correlate to national origin, race, and so on, which we see happening.

So that's where I think, before we can get into this whole future of work conversation, I have immediate questions, and I'm raising immediate awareness about the implementation of this.

Satwik Mishra

A lot is at stake here, and no one brings light to this space, the HR space, better than you. My final question before I let you go.

We're coming up on time, but this is one of my favorite questions to ask leaders in their own fields. How do you think about trust? What is your perspective on trust, certainly within this domain of AI in the workplace?

Keith Sonderling

Well, for me it's a little different, because I'm a lawyer, I'm a regulator.

On the compliance side, there's going to be no trust if you're violating the law. There's going to be no trust, like I just said, if you're disproportionately impacting certain groups, whether it's females versus males, right. Look at some of the classic examples of who is being discriminated against by AI, not just in the employment space but in the housing space and the credit space: you see study after study showing that the same characteristics of groups who have been historically marginalized and discriminated against are being replicated and amplified when it comes to technology. So, I define trust a little differently. I'm saying that you can't even get to the trust conversation, which is incredibly important, in some cases more important than some of the issues we were discussing.

Because you can have everything legally solid and not have trust, and that's potentially going to lead to those issues. Well, how do you get the underlying foundation right? For me, that's the civil rights protections. Before you get to define what trust is, show me how you're going to comply and ensure that these very important protected characteristics, in the U.S. and around the world, are being protected, before you even tell me the benefits of the technology. So, I think that's how we get there in the HR space, by showing that. Look, most people don't trust employment decision-making made by humans either. There's a really interesting study by the Pew Research Center about this issue in HR that I encourage everyone to read. And we live in a bubble.

The vast majority of Americans don't know this technology exists in the workforce, right. So, at first glance, most people don't even know the topic. But when you dive into it and ask people, would you rather be rated by a human or a machine, a lot of people said human first, because of that distrust of the technology, of big tech, of algorithms.

Then they flipped the question and asked: does bias exist in employment, in hiring, in promotions? Most people said yes. Well, understanding that bias exists, flipping the question now: would you rather trust that same hiring manager or boss who just showed bias, or a potentially neutral algorithm? And then people said they'd rather trust the algorithm.

So, I think that's really a good example of trust, because if you can build that trust, then people are potentially going to want to be subject to this technology more than to humans. A lot of that comes from the good work you and other organizations are doing in trying to lead with trust, with ethics, with legal compliance.

And this is not a criticism of the tech industry. I just think it's new, because with a lot of the products developed previously, whether it's making data transfers faster, cloud tech, all these things, you weren't dealing with humans, you weren't dealing with civil rights. You were dealing with business issues, reviewing documents, storing them, all those things. And now you're using this for good, in civil rights areas.

It’s just a completely different mindset of what that trust is.

Satwik Mishra

Sure, Keith. Thank you so much for being here. It’s always such a privilege to talk to you and I always end up learning so much.

Keith Sonderling

Thanks for having me.

Tapan Singhel Podcast Chapter Notes

Journey to the Insurance Industry (06:03)

The Key to Innovation (13:18)

The Importance of Humble Curiosity (15:50)

Staying Ahead of the Curve (18:14)

Lessons from Crisis Management (24:10)

Driving Through the Uncertainties Amidst Emerging Technologies (28:59)

How to Scale a Standout Organization (36:05)

The Importance of Trust Today (39:44)

Advice to Young Leaders (43:03)

Tapan Singhel

Podcast Dialogues Transcript

Satwik Mishra

Welcome everyone to another exciting edition of Trustworthy Tech Dialogues. I am Satwik Mishra, Executive Director of the Centre for Trustworthy Technology. Today, we are diving into a topic of great significance, ‘Leadership in the era of emerging technology.’ As we navigate these uncertain times, exercising leadership has never been more critical. In today’s world, innovation, creativity, and curiosity are not only buzzwords, but the foundational pillars of a successful corporate strategy.

Leaders are increasingly recognizing that fostering these qualities within their organization is essential for staying competitive, driving growth, and navigating the complexities of a rapidly changing global landscape. Cultivating a culture that sparks new ideas and encourages out-of-the-box thinking has become the key to developing products and services that are not just relevant today, but future-ready. For this culture, curiosity is the engine that drives continuous learning and adaptability.

It is the key to staying connected with the ever-changing ground realities around us and staying ahead in today’s fast-paced, competitive world. Leaders who nurture curiosity within their teams create a culture of exploration and learning, leading to higher engagement and ultimately more satisfied customers. This commitment is particularly vital in industries where stability and trust are paramount, such as the insurance industry.

We find ourselves in an era characterized by profound economic uncertainty, unprecedented technological advancements, escalating climate change, and the looming reality of global health challenges. In this context, the insurance industry is increasingly seen as vital in delivering the stability we so urgently need. It is an industry which uniquely stands at the crossroads of the challenges and the opportunities presented by rapid change.

At the heart of this intersection lies trust, which I believe is fundamental to the industry’s mission. To explore these themes and more, today, it is my absolute privilege to be joined by a global luminary of the insurance industry, Mr. Tapan Singhel. With over 30 years of rich experience in the insurance sector, he has been a key figure at Bajaj Allianz for more than 22 years.

He has served as the company’s MD and CEO for over 12 years. He is a member of the Insurance Advisory Committee of IRDAI and the Pension Advisory Committee of PFRDA. He serves on the board of the Institute for Insurance and Risk Management. He has been the president of the Indo-German Chamber of Commerce and continues to serve as a board member.

He is also a member of the Governor's Council of the World Economic Forum. Under his visionary leadership, Bajaj Allianz has emerged as one of the largest private general insurers, known for its growth, profitability, and customer centricity. He has received innumerable awards and recognitions over his illustrious career, including, most recently, CEO of the Year at the India Insurance Summit 2024.

Beyond these astounding accomplishments and an enviable body of work, I have had the distinct privilege of knowing him for over a decade. Throughout this time, I have witnessed firsthand the leadership he embodies, which is not only strategic and steadfast but also profoundly humane. It is an absolute pleasure to welcome you to the Trustworthy Tech Dialogues. Thank you so much for being here.

Tapan Singhel

My pleasure, Satwik. It is amazing to be with you on this forum.

Satwik Mishra

Let us start off at the beginning. Walk us through your journey at Bajaj Allianz. What initially led you to the insurance industry and how has the industry evolved over the last few decades?

Tapan Singhel

If you go back to the era when I joined insurance, at that time, hardly anybody wanted to join insurance.

It was as if somebody who did not do well in life would join insurance. That was the image. And the funny part was that when we had this JV (joint venture) with Allianz, speaking to some of my Allianz friends, they said, 'No, in Germany it is a bit different. When you fail in life, you become a bartender, and when you fail as a bartender, you join the insurance industry.'

Imagine! That is the image the industry had in that era. I am a scientist by education, and I wanted to win the Nobel Prize for my country. That was my burning desire as a child growing up. But because of some challenges, my friends said: this is interesting, take this government examination. So I joined the public sector, which in India, you know, means a government-run insurance company.

Very early on, I paid some claims, and when I saw the difference it made to the lives of people, I realized it's a much bigger thing than what we understand. You contribute to society, to people, at a very, very different level. And you can make a huge difference to your own country by being part of this industry.

And that's why I stayed on in the insurance industry for over three decades. But look at the last decade, not only in insurance but across the globe. Earlier on, when we did an innovation or something new, we would reap the benefits for a decade or two, and you would feel like you had done something really amazing.

Today, you do something which you feel is amazing, and in just about 20 or 25 days everybody else has done it, just like that. The speed at which things are happening now is phenomenally fast. It is exciting. If you are not used to this excitement, it can become overwhelming. But if you use the excitement and constantly keep thinking about what's next, then you really enjoy this game.

The industry has gone through massive changes in the past 10 years. And we will talk about it as we move further. But I will leave you with the thought process that if you meet somebody on the street, they will have distrust in insurance.

And they would say, these guys do not pay claims. And this is happening globally. You ask anybody, any part of the country or world. You are in the U.S. right now, go and ask somebody what do you think of the insurance industry?

But look at the insurance industry, they pay millions of claims, and their profit margins are actually much smaller than any other industry which is known.

So, the thought is, why should this be the notion when they are paying claims so well? Why should there be distrust in the industry? I will leave you with this question to answer that later on, as we progress in the conversation.

Satwik Mishra

So, a little bit on the distrust. You have seen the shifts in the economy and society over the last two or three decades in this industry. What I especially wanted to explore and learn, for myself and the audience, is your distinct leadership style over the last three decades. How has it shifted?

And you spoke specifically about the last decade and the changes which have come about. How have the changing environment, economy, and society changed your leadership style at the company?

Tapan Singhel

Well, is it the economy, or my age, or the years of being there? Satwik, I can't define the reason for the changes. But let me tell you how I saw myself some years back, how I see myself now, and how I see myself going forward.

So, when I was younger, aggressively building the company, I liked to be very, very strong, like an alpha male, as you might define it. That is how it was. And I would feel that 'this is it.' Nothing can go wrong. This is how the world should be, this is how people should think, and this is how it is.

And I remember, way back, there was a professor from Harvard doing a course about two decades ago, and he was doing some profiling. He told me: you are like an eagle, but the worry is that you will scare the doves off. People will not speak when you are there.

That hit me really hard. I was thinking, why should that be, why should people not speak up? He said, no, you are too aggressive, and you do not listen. But my style was doing very well; I was very successful in what I was doing.

But that conversation really made me think, and I started making efforts to listen to everybody. Then, when I would be on the ground, looking at people and their difficulties, connecting with them, it brought a lot of different perspectives: on leadership, on business, on the way I see people, on what business should do for society, on how organizations and corporates can make a huge difference to the country, to society, to individuals.

Those thoughts came to my mind, and I saw a shift in the way I was behaving with people, with customers, with stakeholders. The level of empathy moved up significantly. And this also gets very good results; it was not that my early alpha-male perspective was the only one giving results. As I moved forward, I realized that it is not only about looking at things in a materialistic way.

How do you connect to a higher plane? How do you see a much bigger purpose in building an organization, a company, a team that is actually sustainable and lasts for hundreds of years? So over time, this has been a huge shift in the way I have looked at things.

But if you ask me about the current phase, I think it is better than the phase I was in during the past years.

Satwik Mishra

Eagles, doves, engagement, and empathy. That is fascinating. But one of the things which always stands out in your leadership style is your continuous emphasis on engagement. We have seen you engage with your employees and, more importantly, even customers.

I have seen you travel all across the country and even abroad, making sure that you meet people on the ground. For someone in your position, meeting people regularly, what is the value of that engagement with people, customers, and your regular employees? What drives you to do it, and what is the benefit of something like that in today's age?

Tapan Singhel

So, Satwik, somebody asked me a question, how do you innovate so much?

As a company, we have done a lot of things that were firsts in the world for the industry, not only firsts in the country. And I was thinking, how does it happen? The answer I gave was: to be able to innovate, I feel you need two things.

One, you should be eternally curious. Eternally curious means being like a sponge: pick up every bit of information and knowledge. Ask people, be with people, figure things out. Because your mind is a data lake; the more you fill it up with knowledge, even random knowledge, the better.

And second, you should have a high level of empathy, to be able to feel the pain and the concerns of an individual, of society, of the country, of the world, at a very deep level. When you combine these two, your mind will figure out the solution, and that is what we call innovation. And then you should obviously have the understanding to implement it.

See the transformation through and push it to the last mile, to get to where you want. Now, how do you get that understanding, be it of the business, of what is happening, or of what you feel is going to happen in the near future?

If you don't connect with people at an amazing level, being, how should I say, your humblest self, where you are able to listen to and understand every sentence, every word the person is saying, be it your customers, your stakeholders, your own team members, a person on the street, a person in Parliament, or anyone anywhere,

you will not be able to take your organization to the next level. So, meeting people is essential: essential to understand the business, to understand the requirements, to understand what you do, and also to understand, from people's perspective, what they are missing and what you as a leader should be doing.

Because leadership is not what you may think; it is not about leading like a very powerful marshal at the top. It is the lowest rung in the hierarchy, where you actually have to be everywhere, to figure out how you can ensure that people are more comfortable and happier from the time they met you to the time they meet you next.

Satwik Mishra

It is not just engagement, if I may say so. Not only are you meeting people, but you are meeting them with a lens of curiosity.

I say this because I have been following your work for over a decade, like I said. And I spoke about curiosity upfront because I read your blog regularly, and it speaks not only to corporate strategy but also to philosophy and personal development, things which are very wide-ranging and help individuals at different stages of their lives. How does your curiosity drive this engagement, and how has that changed over the years?

Tapan Singhel

So, if I go back again to when I was younger, I would feel that I was God's gift to mankind: whatever I say and do is the right way in the world, and nothing else. Today, I feel it's unbelievable.

Every individual I meet, everybody I see, even a tree I look at, they have so much to give, and I have so little time to understand and absorb. It is a huge shift. That is why I asked: is it age? Is it understanding? Is it life? Is it the economic condition?

But I have seen this transformation, and I am very happy with this equation of mine, in which I am eternally excited to meet people, to know and be awed by their experiences and knowledge, and to figure out: wow, this is something so amazing. It just hits me. Some people say it happens when you let go of your ego or become more humble.

I don't think I really worked on any of this. But moving from the equation where you think you are God's gift to mankind to a place where you feel that every individual is so amazing, I think that shift in your thinking actually opens you up much more to a bigger inner experience, a better understanding of life, and a much better perspective on things.

So, this has been the shift, Satwik.

Satwik Mishra

It's absolutely fascinating. Let me go back to another thing you mentioned earlier, about short cycles of product development and innovation, where you need to be at the forefront every 20 days, every quarter, if I may say so. One of the things Bajaj Allianz is especially known for is innovation and staying ahead of the curve as a leader.

How did you instill this in the workforce, and how does it drive the company going forward? And what value would you place on this for leaders today?

Tapan Singhel

So, let us understand this. If you are too far ahead of the curve, it doesn't really benefit your business a lot.

So how do you do it, and how does it happen? I will give you two examples, and then I will answer the question I left you with in the beginning, which I said I would answer later. One example: as a company, we were doing very well, you know, in that middle phase.

But the challenge was that there were the old government companies, which had been there for close to 100 years. They were bigger in terms of volumes and number of customers. And I was thinking: the difference is that they have more offices, in a huge country like India, compared to what I have. So, if I had to get ahead of them, I must have more offices.

That was the first thought. Now, opening more offices means investing a lot of money in infrastructure, which would be a huge capital requirement, which means going to the shareholders, and my ROI would be much lower. That was one thought. The next thought was that I was, at that time, the chairman of the Indo-German Chamber of Commerce.

When I would go to Europe, people would talk of the demographic dividend: that India has a demographic dividend, and Europe would not have it. So that was the first time, in the early 2000s, I was thinking: what are the speed breakers? I know about the dividend. But if we are not able to provide employment to the youth here, it could actually turn negative.

And then, you would normally say the government should do that; but as a professional, how do I contribute to it? That was the next question.

So, can I generate employment for a million youth? That was the second burning question. Now, if I am able to open offices across the country without having to invest capital in infrastructure, then I have a model.

Now you have two statements: I can, as a professional, provide employment to youngsters in my country, and I can generate more business and take on the competition. To do that looks very easy nowadays, as you have the internet, good cameras, mobiles, and tablets. But in those times, the internet was not so good.

But then you had the prime minister speaking about getting internet access to the villages, and maybe you at NITI Aayog were also working on that in that era. That conversation was going on. So, when I started seeing those conversations happening, I realized this was going to happen, and at the speed technology was progressing, the cameras, the internet speed, and the tablets were going to get better.

So, I told my team to start working on virtual offices, where you do not have a physical presence and everything can be done on a tablet. Today, it looks like everybody is doing it. But at that time, people told me it was a foolish idea. My board members told me it was a foolish idea, that it would not work, that there was no connectivity, nothing.

My bet was that by the time we developed this, the connectivity and other things would arrive. And that is what happened, Satwik. By the time we developed the technology and the infrastructure and did the recruitment, you had internet in all places, and the cameras got better. So, we were just ahead of the curve.

We were the first to do it. We launched a thousand locations, provided employment, generated business, and now we are ahead of most of the government companies in the Indian market. So, this is an example of starting to prepare before the technology has fully arrived, because it takes time to prepare.

If you start only after it arrives, you may be at the curve or just a bit behind it; and after everyone else has done it, you are too far behind. The one who goes just ahead of the curve prepares in such a way that by the time the technology hits, they are already ready to deploy it. That is where the money, the reach, and the confidence come from.

The second example I want to give you, as I told you, is the distrust of the insurance industry, even though insurers are paying so many claims, and really paying out very high amounts in claims.

It is because the process of paying claims is so cumbersome. You lodge a claim, and they ask so many questions. They ask so many questions simply because they hold public money and really cannot be giving it away to fraudsters. But that friction has to go. You have to make claims friction-free.

And that is when it happened, with machine learning; now you have advanced AI, generative AI, and blockchain. We used it initially to ensure that, as a customer, you could click pictures and upload them, and I'm talking about 15 years back, and we would transfer the money to your account. We tried to make it as friction-free as possible.

So, when you put all these thoughts together, you realize that for innovation you have to apply thinking that takes into account the developments that are happening, so that you are ready at the time they happen.

Satwik Mishra

It's almost about developing the foresight to see the curve: not being too far ahead, but definitely not being behind.

It is absolutely fascinating. But this is also about developing foresight, and in the insurance sector you would have navigated various crises in the economy. As a leader, is there a crisis that stands out which brought lessons for you?

And what would you suggest is the best framework for thinking about a crisis as a leader today?

Tapan Singhel

So, in India right now, there is a price war which has become very intense. I had to address a team some days back, and I could sense they were saying that the times are bad for business.

Then I got on stage. The first question I asked them was: in the 23 years since we started building this company, in which year did any one of you come and tell me, this is an amazing time, whatever target you give, we'll deliver 500 times that? That means every year there has been some crisis or other.

Every year is a difficult year. That is how business is. You cannot have everything falling into place the way you want it, where you just sit at home and business starts growing. It doesn't happen. Which means a crisis is not something to be afraid of; it is something to be loved. And the worse it is, the better it is for you.

Because in a crisis, the good companies become stronger and better, and the weak and shallow companies disappear.

Now, are you a good company? Do you have that kind of conviction and standard? This is what will determine how you move forward.

Okay, in all these years, when was there a time I felt everything was perfectly fine and I was just happy? It does not happen. There is always some pressure. It's about how you handle it. How do you see it, and at what point do you make the company stronger? What are your competencies? In your own life, how you handle it, and how you make things stronger than they were, takes you to the next level.

Crises are good. Bad times are amazing. You should be excited about them. The worse it is, the more excited you should be, because that gives you an opportunity to really learn, transform, and push things. That is the first thought process you should have.

The second is that you start looking at it. Be realistic about where you stand and how things are. Take a dispassionate look and start figuring it out from the end consumer's perspective. The end consumer does not mean only the client; it is also your own employees. I will give you an example from Covid as well, and how we handled that. What exactly is hitting them hard?

And how, within the constraints you have, are you going to stand by them and make a huge difference to them? How do you transform yourself to deliver that? That should be the obsession for the entire unit or organization that you lead. When you put these two together, you start enjoying it, and as time progresses, you start emerging stronger.

So, in our case: in 2007, completely free pricing came in. Or when we started out in 2001, there were only government companies, so why would somebody trust a private company at that time? People would say, Tapan, we know you, but how can we trust you? From government agencies not accepting our quotes, to prices falling 90%, to an environment where the profits of the market looked like they would disappear: in the 23 years of building this organization, I have seen at least ten huge crises.

But fast-forward to today: if I look at the past ten years, 70% of the industry's profit was made by our company alone. We have 150 million customers, grown organically, brick by brick. We have the largest bancassurance distribution in the world, with 240 banks and a hundred thousand agents. All of it built organically. Every crisis made us stronger than we were before it. It is just the thought process that you have to adopt.

During Covid, I remember when hospitals didn't have beds, we actually sent out 18 ambulances in different cities with six hours of oxygen capacity each. So, if something went wrong with a customer or an employee, we could actually keep them alive on oxygen and take them to a different city where beds were available.

So, in every type of crisis, you keep thinking on your feet and see what you can do to handle it, not just sitting back. I think that is what keeps the excitement alive.

Satwik Mishra

So, one aspect of this is not becoming overwhelmed by the crisis at hand; crises will always be there. The second aspect you spoke about was building resilience within your company and your organization, such that you are ready for any crisis that lies ahead.

But there's also the reality today that there are so many lenses of uncertainty around us. You spoke about 150 million customers; that's 150 million different uncertainties that you have to drive through and navigate. How should leaders think about uncertainty in today's times? What do you see, specifically through the lens of emerging technology?

I don’t think emerging technology has ever been as pervasive as it is right now in society. How do leaders think about driving through the uncertainties which have come about from these emerging technologies?

Tapan Singhel

So, somebody asked me this question, I think early on, when generative AI was just starting. They asked: how do you look at technology?

I said: to me, technology is a tool. I solve problems, and whatever technology has to be used, I use it to solve the problem. I think that is how we start. It is not that technology becomes the whole of what I am doing. Now, anything new, we are curious and excited about it, and we keep thinking about which problem we have where it can be a great fit.

And how do you do that? It will have risks; there will be disruptions. But the biggest risk is not being able to use it well for solving issues, problems, or customer experiences. If you are afraid of it and want to stay away from it, that is the bigger risk.

Risks evolve over time. When I started working, you would get your salary as a cheque, which we had to go collect in cash from the bank teller and carry home. The biggest risk then was pickpocketing, and you tried to figure out how to handle that. Today, the biggest risk is phishing and people taking money from your account through mobile hacking or whatever it is.

Risks have changed, but obviously convenience has also come in; in my experience as a customer, things have become much better. So never be afraid of new technology. It will have risks.

The second thing is: how do you build a mechanism where risk correction keeps happening? Let us look at generative AI. There are a lot of hallucinations that can happen; it can drift into unethical methodology. So, people are thinking of guardrails. But there should be continuous feedback, because there are more good people in this world than not-so-good people. Let us put it like that.

The majority, the overwhelming majority, are good people. That is why the world still exists; if the majority were not good people, the world would not have existed. So, if you have a feedback mechanism of self-correction, where more and more people contribute in the right way, generative AI actually gets more powerful in the right, ethical manner.

Now, if you remove that whole feedback mechanism because you are so scared, and you put in so many guardrails that you don't allow development to happen, you still can't stop it. Somewhere else, somewhere far off, a rogue unit or organization may actually start building it, using feedback from the negative side, and make it more negatively powerful than it otherwise could be.

So, my view on technology and the way I look at risk is: first and foremost, be super excited about all the development. Second, ask what problem it solves, and use it well. Third, keep it very open and allow society and the world to keep adding to it. Don't make it very restricted, because there are more good people than not-so-good people. Remember this.

That is how it works, and that will actually ensure it gets better and goes in the right direction. So that is the way I look at it, Satwik.

Satwik Mishra

So, the way I'm thinking about it, one part of your answer is a philosophical lens: not being awed by the uncertainty brought about by the technology, but flipping the paradigm and thinking about how the technology can be used to address the uncertainty. And, strategically, developing a framework with continuous feedback channels, which again is extremely interesting to think about with emerging technology.

But in your answer you also spoke about phishing and the challenges and risks that stem from tech in the world today. And I remember Bajaj Allianz was probably one of the first firms to come up with cyber insurance. Walk us through that journey of coming out with cyber insurance. What have been the lessons since then?

Tapan Singhel

Yeah, Satwik, it is again a story. In one of my interactions, an elderly person had lost some money because of phishing. I met him, and he was so sad. I was talking to him and thinking: as an insurer, we are missing out on solving this person's worry, and it is going to happen again.

This was much earlier. Today, every second person is getting phishing messages, and people have become smarter on both sides. But this was the early era. And we said: we have to solve it. So, we came up with a retail cyber cover; industrial cyber cover already existed, but now any individual could buy cover and be protected. And then we thought: there is also stalking happening in cyberspace.

People normally do not know how to react to somebody stalking them. How can an insurance company provide assistance in taking care of the legal fees? How do you stop it? What are the issues happening in cyber, and how are insurance companies going to protect against them? And we came out with the first retail cyber cover in India.

My daughter bought it; she was one of the first customers to do so, and we were super excited. But those were early times. At that time, cyber risk was not perceived the way it is today, so the policy did not take off as much as I thought it should. Because normally I am more excited than the times are.

But it started picking up as the risks moved in. Again, it came back to the same thought process: the risks have changed. Nobody's pocket is getting picked now; we don't have pickpockets anymore. So, insurance companies still providing cover against pickpocketing has no relevance. It has shifted.

So, how do you keep seeing the shift happening in the risks, and how do you keep moving toward it? How do you solve the issues, and what risks can you think of covering, so that the product evolves and the coverage evolves too?

A lot of companies have since come forward, and the risk has also moved up a lot now. But obviously we had the pleasure of being the first to launch it. And it started from that conversation with the elderly person who lost money to phishing. That again comes back to the question you asked me: why do I keep connecting with people? Because that's when you realize what you're missing and how to make yourself better.

Satwik Mishra

Wonderful. One particular question I've always wanted to ask, and I'm sure a lot of young leaders today would be very interested in understanding: you have had the distinct pleasure of seeing your firm grow over the last three decades,

and scale to where it is right now. How should leaders think about their business models as their businesses scale, as their industries scale, as they get more consumers, as they navigate the need for a larger workforce in times like these, as companies move from startups to resilient companies to standout organizations in the world?

What are some of the lessons that you think have worked? Why has Bajaj Allianz scaled over the last three decades?

Tapan Singhel

So, when you are a startup, every decision moves very fast. When you become a large organization, the biggest worry I have is that it becomes bureaucratic, it becomes siloed. You have, as I call it, emperors with their own kingdoms.

They do not want to know anybody else's moves, and a lot of arrogance comes in. That is what destroys organizations. Organizations and individuals get destroyed when they reach this place. So, one, you have to continuously keep a startup pace. Second, always be excited that you have much more to do.

There's so much to be done, and know what you can do. Be aware of the use of capital and cash. Be aware that it has to lead to the benefit of society, the country, and the world, and that what you do has to have an impact on that. Be aware that organizations exist for society, and for that, they have to be run with an ethical, good-governance perspective.

And be aware that the individuals in your organization define the organization, not you. The culture that you have, the people that you have: be very sensitive towards that. You cannot have a very toxic culture inside the organization and expect people to take good care of your customers. Be very aware of that.

Be obsessed with the thought that every day you get up, you ask what difference you make. To keep this up and not allow arrogance, bureaucracy, layering and politics to come in, you have to keep nipping them in the bud.

Continuously. Be it in communication or anywhere else. It is not that you change the culture once and it happens; it is continuous work that you have to keep doing, building that sensitivity. Then you can have organizations that outlast the average age of organizations, which is normally 20-25 years. Look at the stock market and pick the companies that were at the top 20 years back.

Most of them are not there today. So, to outlive that, you have to be very, very sensitive on this part. And this is fun and challenging too, because the natural tendency of large organizations is to become bureaucratic, to create silos, and to become arrogant.

Satwik Mishra

Nipping in the bud the arrogance that may develop as you scale, ensuring that bureaucracy does not come in the way of your scaling, and being very mindful about the culture within the organization, so that it does not lead to a toxic work environment and you are able to grow sustainably.

One of my favorite questions to ask leaders today is this: you spoke about the distrust that may exist amongst people in these changing, uncertain times. How do you personally think about trust, both within the firm while dealing with your employees and, just as importantly, while dealing with external stakeholders and partnerships? Bajaj Allianz itself is essentially a partnership.

What is your take on the importance of trust today?

Tapan Singhel

Trust is everything, Satwik. And trust does not mean that you are judging anybody; trust means that you accept everybody as they are and who they are, and you believe them. Start from the point that you trust everybody. Never start from the point that you will trust people only after they have proven themselves to be trustworthy.

I think that is the first point.

Let's say I meet even a stranger. I will start with complete trust and faith in the person, and awe at what the person has done. So, the way to look at trust is that when you give a person that comfort, that trust, that feeling, they also trust you back. Human beings actually mimic. It's a very interesting thing.

To understand this better, try this. Next time you meet somebody, just look a bit cranky. You have done nothing; you have not spoken a single word. But observe them, and after some time they will also become stiff. And the next time you meet somebody, just keep smiling for no reason, and you will see the results.

So human beings mimic. If you want trust, you decide to trust. It is not a game where somebody has to prove themselves before you trust them. I think that is the first fundamental: you start with trust. What will happen? Of 100 people you trust, one will betray you. Fair enough.

99 did not. So, you’re still winning more than losing. As I said, the world is full of amazing people.

There are a few who are not so amazing. It’s fine. Majority are. So why would you start with the negative?

So, in my view, trust does not come from the expectation that somebody else shows it first. It comes from us demonstrating that we really trust everybody we meet and talk to, and going all out in everything we can do. Even if somebody is not kind to you, you be kind, and watch the change happen.

I have never seen the change not happen. You be more kind. If somebody does not trust you, you trust that person more, and watch the change. So, I think the basic rule is to start strong: trust more, even blindly. Even at the cost of getting betrayed a couple of times, it is perfectly fine.

Satwik Mishra

So it is almost reciprocity: building reciprocity first, from your side. Once you start engaging people with trust, you get trust back.

Tapan Singhel

Don't expect. It will happen. If you expect, you will have disappointment, Satwik. So, don't expect.

Satwik Mishra

And my final question is: what advice would you give to young leaders today who are starting off in leadership positions and navigating this intersection of technology, whether in government, the corporate world, or civil society? What are some essential steps that young leaders should keep in mind in today's times?

Tapan Singhel

The first thing is – ‘It is not about you.’ Always keep that in mind. It is about everybody around you, first.

Second, what do you feel for the most? Figure that out because that is what you have to get into to try and help make it better. Third, the more you try to make the world better, the happier and more prosperous you will be.

Now remember that. If you look at any business that has become a unicorn or is worth billions of dollars, at the root of it, they made the world better than it was. So as a young leader, think about how you would make a difference to the world, and how you can make it better. Or, if not the world, then the country; if not the country, the society; if not the society, the family. And if nobody else, yourself. What difference can you make?

I think the first thing, the thing I always come back to, is to have a high level of empathy. And it is not about you.

The rest will follow. If you are obsessed with position, if you are obsessed with money, and you chase those, they may not come so soon. If you are obsessed with solving things and making a difference in society, they will follow very easily.

It is a very funny equation, like the one I told you about how people mimic others. So, do not be obsessed. Designations, power, money and fame will come to you if you do not obsess about them; if you do obsess about them, they will take a lot of time. Be obsessed with the difference you make.

'It's not about you.' Just remember that. And in the end, never forget: be happy, be joyful, keep smiling. Come what may, no crisis lasts forever. Even when it comes, smile through it. This too will pass.

Satwik Mishra

Those are wonderful words to end this podcast on. I wanted to understand leadership in today's times, and there is nobody better I could have discussed it with.

Thank you so much for being here.

Tapan Singhel

My honor, Satwik. Always great talking to you.

Chapters of Patricia Thaine Podcast

Patricia's Journey to GenAI (04:10)

Understanding & Minimizing Risk Within Data (05:58)

Introduction to Data Minimization (07:03)

What are Synthetic PIIs (09:58)

How Synthetic PIIs Safeguard Sentiment Analysis (11:26)

Assessing Risk for Re-Identification from Data Sets Within Organizations (12:28)

Privacy Preserving Solutions in High Concern Industries (15:41)

Federated Learning & Differential Privacy (19:04)

Adopting Privacy Preserving Solutions (21:43)

Clarity Within Data Regulations (24:58)

Trust in Private AI (26:57)

Looking to the Future (30:17)

Patricia Thaine

Podcast Dialogues Transcript

Satwik Mishra:
Hello everyone and welcome to Trustworthy Tech Dialogues. I’m Satwik Mishra, Executive Director for Centre for Trustworthy Technology, a World Economic Forum Centre for Fourth Industrial Revolution. Today, we will dive into the fascinating world of generative AI and explore the critical need to safeguard user privacy in this domain. This convergence is not just a technical issue, it’s a pivotal conversation about the future of technology and trust.

Large language models often capture and at times reproduce sensitive information from their training data, leading to several privacy concerns. Challenges such as unintentional data memorization, data leakage, and the risk of disclosing confidential information have remained significant issues even as the wave of generative AI keeps rising. These concerns highlight the delicate balance we must achieve between leveraging the utility of these powerful models and protecting user privacy. To discuss these pertinent issues, I’m delighted to be joined by Patricia Thaine.

Patricia is a computer science PhD candidate at the University of Toronto. Her research focuses on privacy-preserving natural language processing, machine learning and applied cryptography. She also conducts research on computational methods for the decipherment of lost languages. Patricia is the co-founder and CEO of Private AI, founded in 2019 by privacy and machine learning experts from the University of Toronto. Amongst many other accolades, Private AI is one of the World Economic Forum's Technology Pioneers this year.

Welcome, Patricia.

Patricia Thaine:
Thank you so much. It is such a pleasure to talk to you about this.
Satwik Mishra:
Let me begin by asking you about your journey so far. How did your research as a computer scientist lead you to the intersection of privacy and AI? And what led to the subsequent creation of Private AI?

Patricia Thaine:
Yes. So, I was doing work on acoustic forensics: who is speaking in a recording, what kind of educational background they have, etc. That is the kind of information you could gather to improve automatic speech recognition systems, as one application.

But it was obvious throughout this work that there is a lack of data, and there is a lack of data because of privacy issues. So privacy was, of course, a huge concern if you gathered that data. But privacy was also a blocker for innovation in that sense.

And it did not make sense to not be able to innovate because of privacy concerns. Of course, we don't want to ignore those concerns; they are super important. So then we started looking at privacy enhancing technologies, doing work with homomorphic encryption. And throughout this work, we noticed that even the very basics of privacy and of data protection regulation requirements were not being done well for the 80 to 90% of data out there that is unstructured: text, audio, images, documents.

And those basics start with: what kind of information do you have, and how can you minimize the amount of personal information you are collecting? The cool part is that in this unstructured data, there is a lot of gold around the personal information. You often do not need the personal information to do things like understanding conversation flows, topic modelling, summarization, etc.

So, it has been quite a fun journey to help other people innovate with their data.

Satwik Mishra:
If there were one challenge that Private AI is trying to address that you could speak to and break down for us, and explain why it is important to focus on in this era of generative AI, what would that be?
Patricia Thaine:

So, one of the main challenges we are trying to address is helping people understand the risks within their data and minimizing that risk. And the reason it is important for generative AI is that we are currently at an inflection point where we have no idea what kind of risks we are taking on with this innovative technology.

It's very similar to when we started doing cloud computing and there was a huge learning curve. Now this huge learning curve includes models that can go into unpredictable hands and unpredictable environments, with maybe customers using them, maybe internal parties within an organization using them, without normal access controls.

And these models can memorize information and spew it out in production. That is one of the reasons why, with regards to personal information, this is an even bigger risk than training machine learning models normally.

Satwik Mishra:
Could we go back to basics for everyone? Can you break down what data minimization is and why, especially in the generative AI context, it has been such a challenge?
Patricia Thaine:
Okay, absolutely. Data minimization means you use only the data you absolutely need: the personal information, and now in a sense also the confidential company information, that you absolutely need for a particular task. When it comes to generative AI, the concern is that those models are not only memorizing the information but can then spew it out in production.

That is one of the main concerns. With classifiers, you are getting classifications of terms, so it is a little bit harder to extract the information from the model. You still can, but it is more difficult, and you might need access to the model itself. So, when it comes to generative AI, there are many ways that privacy concerns can arise.

One of them is during fine-tuning or training. Another, which often gets overlooked, is the fact that word embeddings, or embeddings for RAG, can still contain information that you can extract. Up to 92% of the information even within dense text embeddings can be extracted; that means personal information like names, and also addresses and things like that, can be extracted.

So, one challenge stems from not necessarily understanding where the risks come from. Another is not understanding, early on, what personal information actually is. A lot of the time, for example, you might hear that anonymization doesn't work because such-and-such company tried it and the individuals were re-identified. If you dig into those cases, usually the data was not actually anonymized.

You'll find, for example, postal codes or zip codes in there, and other very identifiable information. But there's another step to it.

Even if you're not anonymizing the data, understanding what is personally identifiable, and what is risky data, can be challenging. And then, once you identify what you want to count as risky data, actually handling it at scale across these really complex data sets is very difficult, because you're dealing with optical character recognition errors when you work with documents, and with a variety of different format types.

You might be dealing with audio, with automatic speech recognition errors, or with just normal disfluencies. And you're dealing with multilingual, multinational data. And you have to do all this with massive amounts of data. So that is why this is a very tricky problem.

Satwik Mishra:
Along with data minimization, one of the most fascinating things we came across as we did our research on Private AI is one of its key offerings: the ability to generate synthetic PII.

Could you speak to what synthetic PII is and why it is important? And do you see it providing specific leverage going forward, protecting user privacy as generative AI scales?

Patricia Thaine:
Yeah. So, synthetic PII is the replacement of personal information with synthetic personal information: names with fake names, locations with fake locations. It adds a more natural edge to the data, and as opposed to fully synthetic data, you get to keep as much of the context as possible, which can be very important for things like conversation flow, for understanding the sentiment across a conversation, and for particular interactions. Fully synthetic data, by contrast, is really useful when you don't have enough data in the first place.

And when it comes to synthetic PII, there is an additional benefit with regards to privacy protection. Even when you have only the labels of the redacted personal information, such as a name or location placeholder, you can still fine-tune models with that, and you can still get good results when it comes to having a good conversation flow.

However, as you can see from HIPAA Expert Determination requirements, if you have synthetic PII replacing the personal information, you get what is called "hidden in plain sight": if anything is missed, it is very difficult to tell the original data from the fake data. And so, it makes it much more difficult to re-identify an individual. For many use cases, I would say this is a very good tool for privacy in generative AI.
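To make the idea concrete, here is a minimal sketch of synthetic PII replacement in Python. It assumes the third-party Faker library for generating fake values; the regex "detector", the example names and the function names are hypothetical stand-ins for illustration, not Private AI's product or API.

```python
import re
from faker import Faker  # third-party library: pip install Faker

fake = Faker()

# Toy detector: real systems use ML-based entity recognition, not hand-written regexes.
PATTERNS = {
    "NAME": re.compile(r"\b(?:Alice Chen|Bob Singh)\b"),  # stand-in for an NER model
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}
GENERATORS = {"NAME": fake.name, "PHONE": fake.phone_number}

def pseudonymize(text: str) -> str:
    """Swap each detected PII span for a synthetic value of the same type."""
    replacements = {}
    for label, pattern in PATTERNS.items():
        for span in set(pattern.findall(text)):
            # Reuse one fake value per original span so repeated mentions stay consistent.
            replacements.setdefault(span, GENERATORS[label]())
    for original, synthetic in replacements.items():
        text = text.replace(original, synthetic)
    return text

print(pseudonymize("Alice Chen called from 416-555-0199 about her claim."))
# e.g. "Jennifer Moore called from 555-201-0147 about her claim." (values vary per run)
```

Because any entity the detector misses then sits among plausible fakes, this is what gives the "hidden in plain sight" property described above.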

Satwik Mishra:
You spoke about sentiment analysis. Can you speak about how that works and how synthetic PII can help safeguard it?
Patricia Thaine:
Yeah. For sentiment analysis, we actually look at the language around the PII: what the choice of words is, and, if there is audio, you could also detect the tone of voice. But normally it is very much about which words are being chosen and how they match up with the expected sentiment associated with previous conversations like this one. And you don't need the PII in order to get that.

You also don't need the PII in order to extract topics, and you don't need it to summarize information a lot of the time. So that's one of the many tasks you can do in a privacy-preserving way, very effectively.
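A toy illustration of why sentiment survives redaction: the trivial word-list scorer below (everything in it is invented for illustration) produces the same score whether the name is real or replaced by a placeholder, because the sentiment lives in the surrounding words.

```python
# A deliberately crude lexicon-based scorer, for illustration only.
POSITIVE = {"great", "helpful", "happy"}
NEGATIVE = {"angry", "terrible", "frustrated"}

def sentiment(text: str) -> int:
    """Crude polarity score: count of positive words minus negative words."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

original = "I am frustrated that John Smith never returned my call"
redacted = "I am frustrated that [NAME_1] never returned my call"

# The score is identical: the PII carried no sentiment signal.
assert sentiment(original) == sentiment(redacted) == -1
```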

Satwik Mishra:
Interesting. So, let's get to a little bit of the playbook that Private AI would offer in terms of synthetic PII and data minimization. How can organizations assess whether their operations are at risk of re-identification from their data sets, and how do they adopt privacy enhancing technologies going forward?
Patricia Thaine:
Great questions. Okay. This might be a very long answer; I'll try to keep it concise. First of all, you have to be aware of what kind of information you have in your data sets. So do a privacy impact assessment that says: in these data sets, these are the kinds of information I found.

And because we're talking about organizations that have confidential information, they will also want a confidentiality impact assessment covering what kind of confidential information is in there. That can include things like trends, financial amounts, or, for example, the logos of organizations they're working with. All of that needs to be taken into account.

Then it's determining which tasks they actually need this data for, and who needs access to the responses of the generative AI model in these cases. And then, determining the minimum amount of this private or sensitive data that they can use.

And when determining that minimum amount of information, the goal might not necessarily be non-identifiability. If you want the data sets to fall outside of the GDPR, for example, you do have to anonymize the data. However, that is not always the goal for organizations. You might, for example, want to be Payment Card Industry Data Security Standard (PCI DSS) compliant, in which case what you're looking for are account numbers and names, removing those while still getting the most out of your data. That is very common in contact centers, for example.

And so, once you've determined which legislations you want to be compliant with and what kind of data you actually need, you need to think about four different pieces of the large language model setup.

First, fine-tuning and training: what kind of data is being memorized there? Second, the RAG embeddings, if you're using retrieval for grounding: what kind of information is being used to create those embeddings?

Third, the prompts themselves: what kind of information are you sending to a third-party LLM, and what are you collecting internally as a result of these prompts? You can strip out the personal information there and then reintegrate it into the response for a lot of different tasks.

You can set that up on a team-by-team basis as well, so different teams can send different types of information. And fourth, as a measure of last resort, you can block out personal information in the output of the model. So if, by accident, you're sending through a credit card number, a Social Security number, or an address that you didn't intend in that conversation flow, you block it out before the person receives it.
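A minimal sketch of the third piece described above, stripping PII from a prompt and reintegrating it in the response, using only the Python standard library. The SSN-only regex and the stand-in model reply are hypothetical simplifications, not any vendor's API.

```python
import re

PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # toy detector: US SSN shapes only

def redact(prompt: str):
    """Replace each detected PII span with a placeholder; remember the mapping."""
    mapping = {}
    def _sub(match):
        key = f"[PII_{len(mapping) + 1}]"
        mapping[key] = match.group(0)
        return key
    return PII_PATTERN.sub(_sub, prompt), mapping

def restore(response: str, mapping: dict) -> str:
    """Reinsert the original values wherever the model echoed a placeholder."""
    for key, value in mapping.items():
        response = response.replace(key, value)
    return response

safe_prompt, mapping = redact("Confirm that SSN 123-45-6789 is on file.")
# Only safe_prompt ("Confirm that SSN [PII_1] is on file.") leaves the organization.
model_reply = "Yes, [PII_1] is on file."   # stand-in for a real LLM call
print(restore(model_reply, mapping))       # "Yes, 123-45-6789 is on file."
```

The same filtering pass, run over the model's output instead of its input, is the "measure of last resort" mentioned above.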

Satwik Mishra:
Now, when I first came across synthetic PII, the first thing that came to mind was the impact and potential it could have in certain key industries and sectors, for instance banking or healthcare, where there are very specific concerns about privacy. Can you provide an example of how privacy preserving solutions can be integrated into these high-impact, high-concern industries?
Patricia Thaine:
Absolutely. One example is work we have been doing with Providence Health, even before the large language model boom, where they have been using training data to create chatbots for their users and patients, and they do so in a HIPAA-compliant way. For HIPAA, you can either follow HIPAA Safe Harbor, where you remove 18 listed entity types, or you can do Expert Determination.

And one thing I want to point out is how nice it is to have legislation that is very clear about what it takes to have acceptable data sets. While HIPAA Safe Harbor is by no means perfect, people more commonly choose HIPAA Expert Determination for their healthcare data sets, for an extra level of safety. That clarity actually leads to a much higher ability, and comfort, to innovate with the data. So there is a much higher level of confidence from the organizations.

And that is something we really need to see from other legislations. With GDPR, it is basically: you do the best that you possibly can, and you show the regulators these are the steps we took, please don't charge us too much money if there is a data breach. So that is one aspect of it.

Another aspect, for privacy enhancing technologies in general, is the ability to share data and do cross-border data transfer. For generative AI models, that is a little bit trickier. Normally you might be able to do that with phone numbers, for example, in a cross-Atlantic setting where you are trying to determine: is this phone number a spam number or not?

And you have a database to match it against. So you could do an encrypted search just to check that, and no personal information would have crossed the Atlantic. When it comes to generative AI, I'd say the most viable technology is federated learning, where you are sending the model to the source of the data or of the interaction.

You learn in that space, and then you send those learnings to a supermodel, if you will, that will update everybody else. However, that federated learning does have to be combined with the ability not to memorize the personal information. Because even if you're learning on the edge, you're still learning it, and the model can still spew it out in other people's environments if you're sending it to the mother model.

There's also differential privacy, which you can combine with this learning on the edge, with federated learning. However, it's very tricky to calculate the right kind of privacy that you want to include, the epsilon value, if you will. There aren't enough experts in the world for all of the tasks there are to do there.

Additionally, the folks who have been able to do this, like Google and Apple, sometimes have somewhat flexible definitions of their epsilon value, so the actual privacy is a little bit unclear there. So I'd recommend having the ability to really pinpoint what data you are, or are not, training on.

Satwik Mishra:
I'm going to try and learn a little bit in this podcast, for our audience as well. Can you tell us the basics of federated learning and differential privacy? What are they, and how are they different from the earlier status quo?
Patricia Thaine:

So, federated learning is basically the concept of learning in a federated way. You send models to a lot of different environments; in each of these environments, you learn; then you take that information, combine it together, update the massive model at the top, and send it over to the edge devices again.

And the reason that is a privacy preserving step is that you're not sending the data to a third party; you're training directly in your environment. That is very useful because if, for example, you're a hospital that can't share information with other folks, you can still contribute your data in this form to an organization that's training a model.
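A minimal sketch of one federated averaging round under the assumptions just described: each client runs a local gradient step on its private data, and only the model weights, never the raw data, travel back to be averaged into the "supermodel". The function names and the least-squares task are illustrative, not a specific framework's API.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1):
    """One gradient step of least-squares regression on a client's private data."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(global_weights, client_datasets):
    """Send the model out, train locally on each client, average the returned weights."""
    updates = [local_update(global_weights, X, y) for X, y in client_datasets]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(5)]
weights = np.zeros(3)
for _ in range(10):                 # ten communication rounds
    weights = federated_round(weights, clients)
print(weights)
```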

However, there's still the concern of whether memorization is taking place, and that's why it has to be federated learning combined with either differential privacy or the de-identification of the data. Differential privacy adds noise to the training process; the goal is to not have any one particular input affect the training too much.

So, you add noise to the gradient. I'll stop there, but basically gradients are these arrows that point in certain directions; you add noise to the direction and you shorten the arrow. That affects how much a particular input will change the training. And one aspect of that concerns rare data.

It means rare data doesn't affect the training much, which can be good in some cases for generalization, but not necessarily good if you have, for example, underrepresented populations that you need to train more heavily on.
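The "shorten the arrow, add noise" idea can be sketched as per-example gradient clipping plus Gaussian noise, in the spirit of DP-SGD. The clip norm and noise scale below are arbitrary illustrative values, not calibrated to any particular epsilon; a real deployment would use a privacy accountant.

```python
import numpy as np

def dp_average_gradient(per_example_grads, clip_norm=1.0, noise_std=0.5):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # "Shorten the arrow": cap any single example's influence on the update.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # "Add noise to the direction": mask any one example's contribution.
    total += np.random.normal(0.0, noise_std * clip_norm, size=total.shape)
    return total / len(per_example_grads)

grads = np.random.default_rng(1).normal(size=(32, 3))  # 32 examples, 3 parameters
print(dp_average_gradient(grads))
```

The clipping step is also what makes rare, outlying examples count for less, which is the generalization trade-off mentioned above.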

Satwik Mishra:
Thank you for that 101.

You mentioned HIPAA and GDPR, and there are several other legislations around the world being ideated upon right now. I want to get to those. But before I ask you about the particular legislations:

As you will have seen, there are several technology governance frameworks that increasingly include privacy recommendations without specifying or prescribing particular practices or techniques.

How can organizations that wish to comply with these responsible and trustworthy tech frameworks evaluate and adopt the right privacy preserving solutions for their businesses?

Patricia Thaine:
That is very tricky to answer, because there are so few experts in the area and the guidelines tend to be fairly vague. I will say that the Cloud Security Alliance has reports that are much more specific about what technologies are available and how to use them.

Ultimately, I think there are a few principles you can stick to that are pretty common across any task you need to do: doing privacy impact assessments, having an accurate assessment of the kind of data you have, and minimizing the information. As for which privacy enhancing technologies you use beyond that,

that has to be a conversation on a case-by-case basis. We do have a privacy enhancing technologies decision tree on our website that walks through, given certain criteria around your task, which privacy enhancing technologies to use.

However, it is very much a conversation about the maturity of the different technologies and how much risk you're willing to take in implementing something novel within your environment. And doing this, on top of the generative AI innovation that folks are undertaking and the need to understand ROI, can be a big ask.

So ultimately, I'd say: if you have questions about which privacy enhancing technologies to use, I'm happy to talk to you about it. But look at our decision tree, or look at the Cloud Security Alliance reports.

Satwik Mishra:
We’re going to link all of those in our show notes.
Patricia Thaine:
And also, the World Economic Forum is coming up with some really interesting reports as well. Keep an eye out for those.
Satwik Mishra:
All right. And now, getting to the privacy legislations.

There are several already in force, and various others in the US, at the state level and even the federal level, being debated and ideated upon right now. Taking a broad overview, is there one of these legislations or governance recommendations that got it right?

In terms of safeguarding privacy and promoting innovation, what is the optimal pathway for legislation and regulation when dealing with generative AI and user privacy?

Patricia Thaine:
With regards to enabling innovation, I'd say HIPAA somewhat got it right, because of its clarity and its specificity, as I was mentioning before.

Without that specificity, and without enough certifying bodies to give organizations confidence that they are doing the right thing and don't have to worry about it, it's currently pandemonium.

Folks have no idea if they're doing the right thing. There is such a patchwork of various data protection regulations, not just at the state level in the US, for example, but worldwide.

And many of them are quite unclear about how much data minimization is enough. Many of them basically say "follow best practices" without pointing to what the best practices really are at that point in time. That is why we need those certifying bodies to come in and make sure those best practices are happening.

What I do like, and it's coming out more and more, is the ability to have some sort of warranty around the model you're building. Mila AI, for example, is one company working on this. They're checking models for proper governance, and checking whether models are as accurate as they claim to be, with a warranty around that base level of accuracy.

And they're checking for bias. This is a really cool way of approaching it. Ultimately, though, that is one really useful tool in our toolkit. We also need regulations that can say: this is the correct way of going about it, or these are the bodies that will tell you the correct way of going about it as technologies develop, because the law cannot keep track of everything that's going on.

One thing I will say is that law is very aspirational. The EU AI Act is also very aspirational, because we don't yet have the technologies available for full compliance with these laws. And that is why there's a need for something that keeps track of best practices much more rapidly.

Satwik Mishra:
So, I want to ask one of my favourite questions in this podcast series: how do you think about trust when you are offering these services through Private AI's work? It is so important to the USP and the ethos of Private AI around protecting user privacy.

What is your thinking around trust and emerging technologies right now?

Patricia Thaine:
So, if you look at what counts as trustworthy AI, privacy is, of course, a pillar. You've got transparency of the information being output; transparency can mean many things, but it can include, for example, the source of the data.

There is also, of course, non-discrimination, which people look at as part of these trustworthy pillars, and the safety and security of the model as well. There are other things you could add to this framework, but these are some of the main ones. And the way we've seen organizations look at it, the first two things that come to mind are: how can we make sure we're not saying crazy things to our users?

And how can we make sure we can trace back the source of the information, and that we're using users' data properly? This is a board-level topic. Then there is the question of how we look at trust and how we integrate trust in a technological way.

There is, for example, NVIDIA NeMo Guardrails, which is on the path of letting people implement trust easily through a dialogue system. That's really cool, because you don't necessarily need as many technical chops to implement each of these pieces. When it comes to privacy, what we're seeing now is actually much more collaboration between the teams using these technologies and their privacy teams.

And the privacy teams are being asked for quite a bit of expertise on the technology front, and it is great to see those teams stepping up and being consulted throughout the process in a privacy-by-design way. That privacy-by-design way of thinking is one of the benefits, I'd say, that generative AI has brought: in previous projects those teams may not have been brought in at such an early stage, but because this is a board-level topic, they are in the conversation from the start.

So not only privacy by design, I'd say, but also trust by design is what's being implemented within organizations at this moment. And the organizations that are figuring out which tools to use and how to do this properly are the ones that are going to be able to scale their deployments much more easily than the ones that do this as an afterthought.

Satwik Mishra:
That's wonderful. Now I want to ask you a little bit about looking ahead.

You just spoke about how privacy enhancing technologies still need to advance to keep up with legislation. Assume an optimal pathway ahead, where we are able to advance these technologies and reach the point where privacy can be safeguarded.

What potential do you envision the private and public sectors unlocking in the marketplace? Will we have more trust in these emerging technology solutions? Will the risks and incidents that we are always so worried about, and that keep coming up, reduce? Are there other optimal pathways you envision going forward with advances in these technologies?

Patricia Thaine:
Yes. In particular, advances in understanding the data itself are especially interesting, because that is a core requirement for implementing any data protection regulation when it comes to these unstructured data sets.

When it comes to trust, I think that yes, the technology part is one piece of it.

But as I was mentioning, it's also the ability to know that you're doing the right thing. Just because the technology is there doesn't mean you know it's the right technology to use. So we can create as much beautiful technology as we possibly can, but without that certification, without that ability to know, it's still going to be a bit of a pandemonium.

There will be best practices that folks might learn in professional circles, the way we know, for example, that for cloud security there are best practices to keep up with beyond, say, SOC 2. But having that SOC 2 certification, for example, really helps people trust that the vendors they're using, and the practices within their own organizations, are following those best practices.

In addition, for the privacy enhancing technologies being built, there is often a perception that privacy has to be perfect, and that is never going to be the case.

There's already an acceptance in the cybersecurity community that cybersecurity is never going to be perfect; what we're doing is constantly working towards building more and more barriers against the bad guys.

It's very similar for privacy. However, we do need to change that frame of reference: whether you're using differential privacy, homomorphic encryption, federated learning or de-identification, none of these is going to be perfect. But they have to be taken in context with the cybersecurity frameworks that are in place as well, and with the risk assessments about who has access to the data.

Are the proper access controls put in place, not only within the organization but also for third parties? And is there a high risk of the data actually being accessed, in terms of how interesting the data is? For example, a data set of individuals' credit card numbers might be of much higher value than a data set of their conversations about vacuum cleaners.

Satwik Mishra:
Right.

And finally, I would love to know, in your current analysis, not only within the Western Hemisphere but globally: what do you think we need to pay more attention to in this domain? What are policymakers, industry and civil society not paying enough attention to that is really important for protecting user privacy as generative AI scales?

And what are you most excited about? Where do you see the most potential in these privacy enhancing technologies going forward?

Patricia Thaine:
So, what should we pay more attention to? There needs to be more research around what data minimization means for generative AI. And the reason more attention is needed is that we know what data minimization means for data sets you can actually see.

But what does it mean when you have these data sets for training or fine-tuning, where noise is essentially already being added through the training process? The output itself might not be direct; for example, "Satwik lives on 349 Park Place" might not come out exactly like that.

It might come out as "Satwik lives at" some other address. So, understanding what kind of noise is added through that process may really make a difference with regards to how much we have to modify the training data itself. And part of the data minimization process is really understanding how you can keep as much of the information as you possibly can while minimizing that risk.

And then the second question you asked was?

Satwik Mishra:
What are you most excited about?

You said that within data minimization, we need to do more work going forward.

But what do you see as the most important or exciting frontier in privacy enhancing technologies for the ecosystem going ahead? I never ask about more than two years out, but say a two-to-four-year time horizon.

What excites you the most in this domain?

Patricia Thaine:
Well, I'd say what we're working on. What we're working on at Private AI is really making those modular components that lead to comprehension of the data itself. That is key for proper access-to-information requests and requests to be forgotten, which are incredibly difficult right now.

Anybody could tell you that. Being able to integrate that directly, minimizing the data product by product, or software pipeline by software pipeline, prevents the huge mess of data leaks we have when people simply have no idea what's actually in their data.

And being able to do that in an extremely accurate way will make all the difference with regards to understanding what the risk in our data is. That is one of the very first steps we need in order to understand where the cybersecurity infrastructure should be heaviest. So that is what I'm most excited about: actually enabling people to comply.

Satwik Mishra:
Thank you so much. This was an absolutely fascinating conversation.
Patricia Thaine:
Thank you so much, Satwik. It was really fun.

Chapters of Sylvie Delacroix Podcast

Journey from Moral Philosophy to Data (03:03)

Introduction to Data Trusts (05:03)

Diverse Approaches to Data Trusts (10:32)

Private vs Public Sector Data Trusts (14:01)

The need for A Multi-Layered Approach to Data Governance (19:01)

Large Language Models and Ensemble Contestability (21:54)

Implementation Challenges and Data-Sharing Arrangements (29:04)

LLMs in Domains of Healthcare, Justice and Education (33:01)

Trustworthy Human-Large Language Model Collaboration in Healthcare (38:51)

Collective Modalities for Feedback in LLM Design (43:39)

Looking to the Future (46:25)

Prof. Sylvie Delacroix

Podcast Dialogues Transcript

Satwik Mishra:
Hello, everyone. I'm Satwik Mishra, Executive Director for the Centre for Trustworthy Technology, a World Economic Forum Centre for the Fourth Industrial Revolution. In this edition of Trustworthy Tech Dialogues, we are delighted to have with us Professor Sylvie Delacroix. Sylvie is the inaugural Jeff Price Chair in Digital Law and the Director of the Centre for Data Futures at King's College London.

Her research spans philosophy, law, ethics and regulation to address human agency amidst data intensive technologies. She has acted as an expert for public bodies such as the Department for Digital, Culture, Media, and Sport of the United Kingdom, and served on the Public Policy Commission on the use of algorithms in the justice system. She focuses on designing participatory infrastructure over the life of data-reliant tools deployed in varied contexts.

This spans right from data generation all the way to the design of human-computer interfaces. Sylvie is also the author of the book "Habitual Ethics", published by Bloomsbury in 2022. The book offers insights into the operation of habit in our lives, its contribution to moral deliberation and judgment, and its potential for distraction and distortion. Welcome, Sylvie. Thank you so much for being here.
Sylvie Delacroix:
Thank you, Satwik. I look forward to our discussion.
Satwik Mishra:
Likewise. So, let's begin by understanding your journey. How did you come to be interested in data?
Sylvie Delacroix:
Well, I actually come at data from a probably fairly unusual starting point, in that I've always been fascinated, and worried to some extent, about agency. And what do I mean by agency?
I mean a capacity to keep asking questions, to keep looking at the world and thinking: maybe it can be made better, maybe it can be different. This is a capacity we all have a priori, and much of my work has been focused on saying we can't take it for granted. This is actually a capability that can be compromised by the way we organize our institutions.

And one thing that really struck me is that if you look at the data governance regime today, it is structured in a way that really incentivizes what is best described as a habit of passivity. I'm sure you've experienced these pop-up windows on your screen asking: do you consent to this data being collected, or that data? Most of us try to click these windows away as quickly as we can. And I am really concerned about the extent to which this encourages a deep-rooted attitude or habit of passivity that I think is having widespread effects when it comes to the kind of civic participation we need to encourage if our democracies are to stand a chance.

So, I come at this from a more philosophical angle, probably. But it hasn’t stopped me from looking at fairly practical examples of how we can change things. So, I’m really passionate about bringing philosophical discussions to bear on very concrete design interventions.
Satwik Mishra:
Speaking of taking philosophical thought into practical implementation, that's a perfect segue into your work on data trusts. So, what are data trusts, and why are they important in today's age?
Sylvie Delacroix:
So, data trusts were born from random coffee conversations that I happened to have with a colleague, Neil Lawrence, maybe six or seven years ago; I lose track now. This was born in a really unexpected way. We started chatting, and we chatted more, and at the end of the day we wrote this paper called "Bottom-Up Data Trusts", which was meant to highlight the limits of the top-down regulatory efforts that have taken place to give us data rights. In Europe, we have the GDPR, which gives us personal data rights.

Now, that is a super important tool; nobody is going to deny that. But it is quite striking that on its own, this top-down governance regime is not sufficient to reverse the imbalance of power that has taken such a strong hold on our society today. It is overly optimistic, I think, to believe we can move away from a situation that has very detrimental effects at several levels, and we document those, just by sitting back and hoping that somehow data rights will solve the problem.

So, they are very important tools, but they are not enough on their own. What we have tried to argue is that we need bottom-up data empowerment institutions, and data trusts are one kind of bottom-up data empowerment institution. What's key about data trusts is two things. One thing, which is common to many data institutions, is that they are there to basically say: look, we need to pay attention to groups, not just individuals.

This is a real problem: our legal structures today are very much geared towards individuals. We have individual data rights, individual actions, etc. And yet when it comes to data, data is relational. My data is also my neighbor's data and my family's data. So it doesn't make sense to think only of individuals.

One thing we try to address is the fact that we need structures that allow people to come together and, in this case, pool together the rights they have over their data (in Europe, that would be GDPR rights) and entrust those rights to an intermediary. If it's a data trust, that will be a data trustee: a professional who has fiduciary responsibilities to act with undivided loyalty towards the people who have joined the data trust, the beneficiaries, who are also citizens. I'm not going to go into too much detail.

But what's key about data trusts is two things. There is this emphasis on groups, which is common to other data empowerment institutions. And the key thing is that data trusts come with built-in institutional safeguards that are stronger than those of any other institution based on contract or corporate structures.

There is nothing wrong per se with contractual structures or corporate structures. The only problem one has to be vigilant about is that the burden of proof is not reversed. When you have a trust, it is for the trustee to demonstrate that they have acted with undivided loyalty. And that is super important if you don't want to perpetuate a situation that is not really empowering people.

The imbalance of power cannot really be addressed if you expect often time-poor, resource-poor individuals to launch legal actions to try and get their rights recognized. So this is one very important thing that a trust brings that a contractual structure cannot. I think I'll stop here, because I could speak for a very long time about data trusts.

But the two really important things are enabling groups to pool together resources, in this case data rights, and entrusting or tasking an intermediary who is a professional. That's the other really important point: there has been talk of data intermediaries, and they are easily confused with corporations. When I say data intermediaries, I am not talking of corporations. I am talking of a much-needed 21st-century profession that is slow in developing but, I think, more needed than ever today. Just as at the end of the 19th century the medical profession was born out of progress in the medical sciences, today I would argue that progress in the data sciences warrants the birth of a new profession; in this case, let's call them Data Stewards, to be agnostic as to the legal form of the data institution.
Satwik Mishra:
It's an absolutely fascinating paper, and we'll link it in our show notes for everyone to go through. But let's dig a little deeper into the idea of data trusts. One of their key strengths is having a range of options with different terms to support varied approaches to data governance. So, which approaches are you most excited about, and which do you see being most effective in the digital economy we occupy today?
Sylvie Delacroix:
That's a great question, and I think it's very important to be concrete. One example that is close to my heart at the moment, although it hasn't happened yet, is that I was contacted at some point by a group of schools who were interested in creating a data trust because they wanted to deliver more personalized education to their pupils.

And the idea was to say: well, we have different regulatory regimes across the world; we operate in different countries. Rather than just trying to meet the minimum regulatory requirements in each country, why don't we empower both the children and the parents to choose a data trust that meets their aspirations and their attitude to risk? Accordingly, we as the school would be talking to the data trustee representing each group of parents and children, and we would be able to dynamically negotiate which data we have access to, in light of the terms of each trust.

To give you a concrete example, at one end of the spectrum you could have a group of parents and children (and I'll come back to that, because there is sometimes a difference between the two) who are very happy for a lot of data to be collected by the school, and the in-kind benefit they expect is basically better personalized education.

But they don't only expect those in-kind benefits for themselves; they also want to join in a kind of charitable aim. Let's say they want to improve remote education for kids worldwide who don't have access to in-person education. So they task the intermediary, in this case the data trustee, with negotiating data-sharing agreements with education providers around the world, for whom access to this rich, granular data could improve the delivery of education for kids other than themselves.

And I think that's a great example of the fact that these data trusts need not bottom out into profit maximization. Some people may want to prioritize financial returns, and that is a possibility; there may have to be regulation to put a stop, in some cases, to what could be seen as abusive monetization of certain kinds of data.

But what I find very important to highlight, something too many people have forgotten, is that sharing your data and, in this case, tasking an intermediary with stewarding that data can unlock really important in-kind benefits that are otherwise impossible to achieve.
Satwik Mishra:
Just one follow-up on that. It's also about thinking about operating models, like the corporations you mentioned. Data trusts, in their diverse approaches, can be funded through either the private sector or the public sector. How do you envision the public sector implementing the concept of a data trust, versus the private sector offering it as an initiative within its own operating model? What are the varied challenges if the public sector, or the private sector, supports and operationalizes data trusts?
Sylvie Delacroix:
So, one thing we emphasize in this paper is that it is high time we moved away from the one-size-fits-all approach to data governance. We need to be able to give people a choice, so that they may choose to join a data trust at time "T", like today, because they think it matches their aspirations and attitude to risk; and tomorrow they may change their mind and join another one. This is important for facilitating a debate that is very much lacking today: because there is no genuine choice, there is also no debate, which is not very surprising. Choice is important for more than the debate, but the debate is crucial.

The other thing, in terms of the private-public distinction, is that I think we have to acknowledge that there are different needs that are best addressed by different kinds of structures. For instance, in the UK, a lot of concern and attention is being paid today to the fact that we really need to do better when it comes to sharing medical research data, medical data, education data and social care data.

There are too many cases that fall through the gaps because of a lack of data sharing. In this case, there is quite a compelling argument that this might be something best funded by the public sector, possibly on a regional rather than national basis. And this could be, and here I am always very careful about the words I choose, a default structure to which people are assigned in the absence of choice. This acknowledges the fact that today, unfortunately, many people do not have much awareness that data is being collected every day, whether we like it or not, when we go shopping, when we go online, etc.

So the lack of data awareness today means that, realistically, even if we were to develop a wide ecosystem of bottom-up data empowerment structures, there is a risk of ending up in a situation where we only empower the least vulnerable part of the population, i.e. the people who are most aware of the risks they take when they share data. That is a problem we cannot fix with a magic wand.

I am always concerned about this, because advocating a default data trust has its risks and dangers; obviously there would have to be a very strong oversight mechanism to avoid abuses. But in answer to your question about private versus public, I do think it is very important that in some cases we have publicly financed bottom-up empowerment institutions, complemented by private structures that can offer a choice and an alternative.

Now, that is one example. There are other cases where you could consider, for instance, a data trust created purely to offer a service, effectively answering the needs of all those people who feel they don't have the time or the intellectual resources to make informed choices about which data they share, on which occasion, for what purpose, etc.

They are saying: "we need someone whom we can trust", a professional who can take on the brief of managing these data-sharing decisions, monitoring them, etc. There, I would happily pay a fee for the service, in which case it is basically a service of data stewardship at a personal level, or maybe a family level.

So that is another model, the fee-for-service model. There are others, but that is probably a model best instantiated through private initiatives rather than public ones.
Satwik Mishra:
Let's consider three levels of this: the first being the users you are talking about, coming together to enlist their data in these data trusts.

Then you talk about a new layer of intermediaries, which should emerge in the 21st century and is very much required. But this ecosystem also sits within top-down regulation and legislation; GDPR is a great example, and there is a privacy law being debated in the US right now. How do you envisage laws at the federal or state level supporting initiatives like bottom-up data empowerment through data trusts?
Sylvie Delacroix:
The first essential level is to give rights to people. In the US, for the moment, that is still lacking. In California you do have personal data rights, but other than that, it is a major drawback, I would argue. So the first step, the essential step, is to grant personal data rights.

That is not enough, of course, because you also have to be prepared to implement those data rights, and that is still lacking today. I have to say there is an increasingly high level of frustration among those who are testing the extent to which people can exercise a data portability request or a data access request; these are not rights that are easy to exercise.

Often, the answers people get are very poor. So we still have a long way to go in terms of implementing those data rights in a robust way that actually empowers people. That has to be highlighted. But beyond that, I do think we also need a multi-layered approach to data governance. Take medical governance, for example.

We don't just have top-down national legislation when it comes to medical practice. We also have a layer of professional regulation, and we often have regional or hospital-based levels of regulation. There are multiple layers, and I think this is very important for data too: to allow for complementary levels that respect the fact that sometimes decisions need to be made at a very local level and are best overseen by a very local oversight committee.

And at the same time, you do need, of course, national legislation. So that is something that’s still missing today. We don’t have the kind of multi-layered approach to governance that we need.
Satwik Mishra:
Another very important strand of your work, especially relevant in the current climate, concerns large language models.

A feature of that work is something you call Ensemble Contestability. Tell us a little about what it is and how it can foster a trustworthy relationship between users and data-driven systems.
Sylvie Delacroix:
I use this term because I was keen to engage with the existing computer science literature and look at the tools currently used by computer scientists that could be repurposed, if you like, to increase and incentivize critical engagement on the part of users of augmentation tools. By augmentation tools I mean anything from recommender systems to a system that optimizes the delivery of homework for children in remote education; there are many, many examples. The key point is that there has been much talk about transparency, about the need to provide explanations to users of those automated systems.

There’s nothing wrong with that per se. But I do think it rests on a problematic, almost fictional premise: the idea that what matters is to protect our individual deliberative selves. So when I make a decision, it’s very important that I make it in a way that could be seen as autonomous.

So, I’m not being misled into believing x, y, or z. Now, that’s all laudable, but it’s not enough if we want these tools to be the trustworthy partners we want them to be, especially in morally loaded contexts like education, justice, or healthcare. So what do we need there?

When you think of how we interact with doctors or teachers, what do we do?

We don’t just say, ‘I want an explanation’, and suppose the conversation stops there. We tend to ask questions of each other, and it’s absolutely crucial that these conversations take place and have an open-ended character. So what I’ve tried to do is look at the way some computer scientists use so-called ensemble techniques to avoid the risk of base learners being overfitted to a particular training data sample.

For that, they identify subsets of the training data, chosen according to certain rules, and then train base learners on each of these sub-datasets. So let’s say there are five different sub-datasets, giving you base learners A, B, C, D, and E, each trained on slightly different bits of training data.

Of course, they are going to differ at the end of the learning process. What is typically done then is to average the outputs of those base learners, or to use various other methods to harmonize the results. And what I propose is to say: look, forget about this last step. We don’t actually want to harmonize the results.

What we want to do is give users a very concrete chance to contest. Let’s imagine I am receiving remote homework from an education provider, and I complain that I always get easier physics homework than my brother. So I ask: why am I getting easier physics homework?

Now, according to the dominant model, I could be given a counterfactual explanation that says: well, Sylvie, if you hadn’t scored so highly on anxiety tests, and if you hadn’t scored quite poorly on your last physics test, you would have been given the same homework as your brother. Does that empower me? No. There’s nothing much I can do except think, oh, I should be less anxious and maybe score more highly on my next test. So instead, what I propose is this: what if I received an example of the kind of homework I would have been given had the provider used a slightly different sub-learner?

Say, a system trained on data from girls-only schools. And then I also see slightly different homework from a learner trained on data from boys-only schools. Then I could say to my homework provider: why are you using this particular model, the one trained on mixed-schools data? I think the homework I would have had, had you chosen this other sub training dataset, is much better.

What’s interesting here is that I can give feedback. I can justify why I prefer the output from a slightly different model. And this can be conducive to a wider conversation where my parents or the teachers can pitch in. This is key, in the sense that what is valuable about education is precisely that it’s a never-ending process of questions and answers.

We can never know for sure what constitutes good education, so we have to keep asking each other questions. That practice of asking each other questions is at risk of being compromised, of being dulled down if you like, by our increasing use of automated tools that discourage this kind of critical engagement, tools that just give me a counterfactual explanation saying, oh well, Sylvie, you’re just a bit too anxious.

Surely we can do better. So this is my attempt to say: let’s not just be philosophers who wave their hands in the air, say we should do better, and offer some complicated philosophical conception of why things should be different. Let’s look in practice at how we can change things.

So this is a hands-on attempt, if you like, to say: look, this is something we can actually do. There are complications, of course, and maybe I’ll come to those, but this is something we could implement tomorrow if there is the willingness to do it.
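
[Editor’s note: a minimal sketch of the ensemble contestability idea described above, in Python. The scikit-learn classifier, the toy data, and all names are illustrative assumptions, not Professor Delacroix’s actual implementation. The point it shows: train base learners on different sub-datasets, then surface each learner’s output instead of averaging them away, so the user can compare and contest.]

```python
# Sketch: expose per-sub-learner outputs rather than a harmonized result.
from sklearn.tree import DecisionTreeClassifier

def train_base_learners(subsets):
    """One base learner per sub-dataset (e.g. girls-only schools,
    boys-only schools, mixed schools)."""
    learners = {}
    for name, (X, y) in subsets.items():
        learners[name] = DecisionTreeClassifier(random_state=0).fit(X, y)
    return learners

def contestable_outputs(learners, student_features):
    """Skip the usual harmonizing step (averaging/voting): return every
    base learner's recommendation side by side."""
    return {name: model.predict([student_features])[0]
            for name, model in learners.items()}

# Toy sub-datasets: features = [last physics score, anxiety score],
# label = recommended homework difficulty.
subsets = {
    "girls_only": ([[55, 80], [90, 20]], ["standard", "standard"]),
    "boys_only":  ([[55, 80], [90, 20]], ["standard", "harder"]),
    "mixed":      ([[55, 80], [90, 20]], ["easier", "standard"]),
}
learners = train_base_learners(subsets)
print(contestable_outputs(learners, [55, 80]))
# e.g. {'girls_only': 'standard', 'boys_only': 'standard', 'mixed': 'easier'}
# A student shown all three outputs can ask why the mixed-schools model
# was chosen, and justify preferring another one -- feedback that can be
# logged and discussed.
```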
Satwik Mishra:
This brings me to a couple of follow-ups. One is practical, about implementation; the other is more philosophical, about why you chose the fields you did, and draws on your work. A fundamental challenge to providing users with multiple outputs and choices is having enough valid data to train these algorithms. So how can we address the problem of data for base learners while also encouraging a multiplicity of uses for these algorithms?
Sylvie Delacroix:
So, basically, there is a problem; nobody’s going to deny that. At the moment, the biggest obstacle to our all having these helpful tools that can improve education is not fancy algorithms, we have the fancy algorithms, it’s data. The biggest obstacle to my proposal is the fact that we currently struggle to put together even one good-enough-quality dataset, let alone five sub-datasets.

One from girls-only schools, one from boys-only schools. That’s going to be tricky.

It’s going to be tricky because we don’t have a data sharing culture. And why don’t we have a data sharing culture? Because we’ve been assuming that people are happy to sign blank checks. That’s what happens today, when I’m asked if I’m happy to share my data for X and Y study, well, I sign there and that’s it.

Do I have any means of monitoring the data sharing agreement? Do I have any way of thinking “Yes, someone is looking out for me. Someone is checking that the terms that govern this data sharing agreement are respected.” Well, I could hope so. I could put my faith in the very overworked institutions that are supposed to be cracking down on abuses of data sharing agreements, but in practice, that’s not enough.

I think that’s very close to a blank check, and I think we can do better than that. We can do better than make-believe consent. But for that we urgently need bottom-up empowerment structures. And we also need to keep an open mind: I’m not arguing that every data empowerment structure has to be a data trust. Far from it.

What I’m arguing is that we can’t afford to wait any longer to change the way our data governance is organized. My best hope is precisely to join the dots between the huge opportunities we have today when it comes to, say, large language models, and I’m sure we are going to talk about this, and data empowerment structures.

These are actually connected, and education is one of the best examples. Think of the example I gave you of those schools that wanted to create a data trust to empower children and educate them about choices pertaining to data. This is a great example where you could say: we are not just going to create a data trust to empower children and parents to make choices. We can also, through those structures, teach children that they can have a say over what the tools look like, and why they might prefer a tool that is slightly different.

And you can leave feedback, and this feedback is your data, over which you can have agency. Now, that’s a different world from the one we have today, and that’s what’s missing. So I do think it’s really important to join those so-far-separate conversations about, on the one hand, data empowerment and, on the other, how we can create participatory interfaces for AI.
Satwik Mishra:
My second follow-up is more philosophical, and you touched on this earlier. In your work on taking large language models into the domains of healthcare, the justice system, and education, you say something to the effect that these are “domains which hinge upon the perpetual reinterrogation of the foundational values.”

What is the importance of taking large language models into these fields? And why is the idea of contestability so important in these particular domains?
Sylvie Delacroix:
Thank you. Now, I almost have to apologize: the paper I put online was written for a philosophy journal, hence “the perpetual reinterrogation of foundational values.”

But basically, what does this mean? It means that today, if you look at something like 95% of the research currently taking place on improving the way large language models communicate uncertainty, and this is a massively big field of research. Why is it such a big field?

Because everybody knows that these tools are not going to fulfill their promise in fields like healthcare, justice, or education until they’re better at communicating uncertainty. It’s a major imperative if those tools are to live up to their potential. Now, if you look at this research, what’s fascinating is that almost all of it is focused on one objective.

To simplify, that objective is to avoid unwarranted epistemic confidence on the part of the user. To translate into normal language: let’s make sure you don’t just take the output of the large language model at face value. Sometimes you ought to do further fact-finding; sometimes you ought to go and check what the large language model is saying.

So how do we convey to the user of a large language model the need to sometimes question the output and check further? That research is moving fast right now. At the moment, large language models are mostly incentivized to provide the output most liked by the user, and many techniques are being developed to change those incentives.

The question is: can we slightly change the incentive structure so that they also have incentives to communicate uncertainty? The extra complication here is that there are, of course, many different kinds of uncertainty. I’m not going to go into further detail, but this is fascinating research. For me, though, there is an elephant in the room.

And that’s what I’ve tried to flag in this paper. When we communicate uncertainty, yes, sometimes it’s to invite our interlocutor to go and do some further fact-finding, but sometimes it serves a completely different objective. Let’s say we are talking about the merits of gender equality. I might express uncertainty in the way I speak not to invite you to further fact-finding, but to mark the fact that I’m committed to a certain type of conversation.

A conversation that is open to a variety of reasonable views. It’s basically saying: I am expressing uncertainty because I want us, me and my interlocutors, to commit to a conversation that is sufficiently open to be inclusive of a variety of worldviews. That’s the objective. Communicating uncertainty as a kind of humility marker that unlocks certain types of conversation is very different from the objective of inviting you to further fact-finding.

And it happens to be super important in fields like justice, healthcare, and education. Why? Because these fields are morally loaded. They are always works in progress. We will never, I hope, stop asking ourselves: what does it mean to say something is unjust? What does it mean to say that someone is not healthy? What does health mean?

These are not neutral words. So what I’m arguing is that I see large language models as a fantastically exciting opportunity in these fields, if, and this is an important “if”, we pay sufficient attention to the subtle way in which the expression of uncertainty can change the qualitative nature of future conversations.

And if you think about the scale at which these large language models are going to be deployed, this will have a massive impact. Imagine every judge in the country relying on a certain type of large language model just for advice, for summarizing previous cases, and so on. The way that large language model expresses itself will gradually shape the type of discourse, the type of conversation, that becomes dominant in a given field.

I find it bewildering, actually, that there hasn’t been more talk about this yet. But I also see it as very exciting, because there’s something we can do about it.
Satwik Mishra:
In your research, you also discuss a healthcare case study in this domain. Walk us through that. How can general practitioners participate in this trustworthy human-large language model collaboration via the interface you’re describing?
Sylvie Delacroix:
I provided an example based on conversations I had with a colleague who’s a retired general practitioner in the UK. The idea was to say: look, let’s imagine we have a general practitioner, which in the UK is basically a primary care doctor.

Typically, these doctors have ten minutes to have a conversation with a patient and figure out what, if anything, needs to be done. It’s a very intense, time-limited conversation. Now, we can imagine a situation where these doctors, who are typically time-starved and may be unsure, rely on a large language model to advise them on the kinds of tests they should be considering in light of the complaints or concerns expressed by the patient. Let’s say I’m a general practitioner and I see the list, and I’m very concerned that it doesn’t include a particular test I think is particularly important.

It’s not been listed by the large language model. Now, the model might express uncertainty in a color-coded way: the most obviously salient tests in red, and then tests that are less certain to be salient in, I don’t know, gradually lighter blue shades or whatever.

That’s an interesting attempt to communicate uncertainty to me. But what you haven’t communicated is incompleteness: you haven’t told me that you, the large language model, are not in front of the patient, and that there are things you may not be aware of because you have incomplete information.

So how does a large language model communicate incompleteness? Something like telling a primary care doctor: these are the tests that seem salient based on what you’ve told me, but remember, I’m not in front of the patient; please use your eyes and your ears to complete the list. Now, let’s say I’m a very diligent primary care doctor and I give feedback.

Let’s say there’s a discussion board somewhere, and I can say I was disappointed by this output because it forgot ‘x’. Given that, as a primary care doctor, I work as part of a community of other doctors, wouldn’t it be better if, instead of my feedback influencing the system on an individual basis, it was recorded and then discussed in an in-person group conversation, taking place, say, every two weeks, where doctors can discuss their experience: how the large language model has helped them, whether it has failed to communicate uncertainty, and so on?

Then you could imagine a situation where you curate the discussion and even validate the feedback produced on the basis of that conversation. For instance, in the UK we have the National Health Service. It’s very credible to imagine that if there were an official large language model used by primary care doctors, any feedback fed back into the system would have to be validated on a national basis.

Otherwise, you end up with a system that could drift in bizarre, unpredictable ways. So that is a very interesting case where there is a strong argument for modalities of collective feedback. And these cases tend to arise precisely in morally loaded fields like healthcare and education; you could imagine the same with teachers, and so on.

So this is, again, an opportunity that hasn’t really been explored to its full potential yet. I presented it at a conference with computer scientists a few months ago. They didn’t laugh in my face; they seemed to think it would not be impossible. So I am optimistic. But again, it’s high risk. This is not something that’s been done before.

And I really hope I’m talking to wise people right now to convince them to kind of implement this.
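
[Editor’s note: a minimal sketch of the kind of interface described above, in Python. All names, thresholds, and fields are hypothetical illustrations, not a real system: per-test salience expressed as a color code, an explicit incompleteness notice, and a feedback record queued for collective review rather than applied individually.]

```python
# Sketch: color-coded salience, an incompleteness notice, and feedback
# held for group discussion and validation before it can influence
# the system.
from dataclasses import dataclass, field
from datetime import date

def salience_colour(confidence: float) -> str:
    """Map the model's confidence that a test is salient to a colour."""
    if confidence >= 0.8:
        return "red"         # most obviously salient
    if confidence >= 0.5:
        return "dark blue"   # plausibly salient
    return "light blue"      # less certain to be salient

INCOMPLETENESS_NOTICE = (
    "These tests seem salient based on what you've told me. I am not "
    "in front of the patient; please use your eyes and ears to "
    "complete the list."
)

@dataclass
class FeedbackRecord:
    """A doctor's contestation, held for the biweekly group discussion
    and validated before it is allowed to influence the system."""
    doctor_id: str
    omitted_test: str
    comment: str
    submitted: date = field(default_factory=date.today)
    validated: bool = False   # set True only after collective review

# Hypothetical usage:
for test, conf in {"full blood count": 0.9, "thyroid function": 0.55}.items():
    print(f"{test}: {salience_colour(conf)}")
print(INCOMPLETENESS_NOTICE)
fb = FeedbackRecord("gp-042", "HbA1c", "Output omitted a diabetes check.")
```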
Satwik Mishra:
I hope the wise people take up your suggestion. Let me summarize the idea of contestability: it involves expressing uncertainty, communicating incompleteness, and providing a participatory interface through which you can engage with a large language model.

Now, let’s talk about the UI of the large language models in the marketplace today. Do you think we’ll need to devise new methods for this collaboration, or can it be integrated into the existing large language model UI set-up? How do you envisage this being integrated into large language models?
Sylvie Delacroix:
You mean the collective modalities for feedback?
Satwik Mishra:
Yeah.
Sylvie Delacroix:
At the moment, these large language models are mostly being developed by corporations. The scenario where the National Health Service in the UK designs its own large language model is a remote one, but it would be a highly desirable one, if you ask me. They have an amazing dataset.

Will it happen? I haven’t met a single person who’s optimistic enough to think so, which is a great shame, by the way. So we have a problem: at the moment we don’t have serious, publicly funded builders of large language models for public use.

Then comes the question: what buttons do you push to get the commercial builders of large language models to take on board the concerns of, say, a community of primary care doctors? I think this is not impossible. Consider, for example, that in the UK you have the Royal College of GPs.

It’s a professional body that represents all these primary care doctors. It could say: the only way we will adopt this model, the only way it can be approved for use, is if you put in place these collective modalities for feedback. So you will effectively need a kind of collective bargaining to put pressure on design choices, pushing them beyond just the most-liked output.

Otherwise the most-liked output is going to be the model that dominates. So yes, it’s a great question, and I very much hope there will be enough examples of this in the near future.
Satwik Mishra:
We are going to have a data driven future, there is no doubt about it. What are you most optimistic about in this future? And what do you think we should be wary of?
Sylvie Delacroix:
Part of my answer has already been aired: one thing I’m very concerned about is that, amid the optimism about building data empowerment infrastructure and so on, we are still at risk of developing solutions that only help the least vulnerable part of the population.

So how do you bridge the gap? How do you not only acknowledge, but take on board the fact that a large proportion of the population hasn’t even thought about data, is not concerned about data governance? How do you go from there to a system that gives them choice, encourages debate, etc., etc.? It’s tricky. And this is a crucial moment.

I feel that so much is at stake right now in the choices that are about to be made. That’s why I think one crucial thing is that we experiment with a wide enough variety of structures that we can learn from failures as well. Not every structure is going to succeed, and by structure I mean bottom-up empowerment structures.

But we need a variety of them, enough of them to see what works and what doesn’t, and how we can take on board what I would call the data awareness gap. This is one concern I have. I have many others, because of course there’s also a risk of being too European-centered or US-centered.

The system at the moment presupposes data rights that don’t exist in many, many countries. So what are the alternatives? One alternative is to create structures that collect the data themselves and use it as leverage to bargain for better working conditions. That’s an approach that has been used for Uber drivers, for example. That is one model.

But again, I am mindful of those risks. One is importing too European-centric an understanding of what constitutes vulnerability. What Europeans think of as a vulnerability, people in China may think of completely differently. So how do we also learn from each other, given that we are going to come up with different solutions?

Given the climate today, it requires quite a large dose of optimism to think we can share and learn enough from each other on that front. Because data is not national; data doesn’t have borders. This is one of the things we have not really come up with a solution for, that we’ve not properly addressed yet.
Satwik Mishra:
And optimism?
Sylvie Delacroix:
My optimism is very much tied to the fact that today we have an opportunity to join the dots between data empowerment initiatives, which are starting to gain momentum as a social movement. The Data Empowerment Fund had over 900 applications.

Now, that was bewildering: applications from all over the world, high-quality applications. To me, that’s fantastic. It means that something that started five or six years ago as a small academic thing is becoming a real worldwide movement. My best hope is that if we connect these data empowerment structures, particularly in the context of education, with participatory design for the interfaces of the AI tools we produce, the next generations will be much better equipped than us, in some ways, to move things along in ways that empower us rather than the opposite.
Satwik Mishra:
Professor Sylvie, thank you so much for being here. Your research is fascinating, and I’m sure its practical implementations will end up creating a more trustworthy technology landscape.
Sylvie Delacroix:
Thank you, Satwik.