Overcoming AI Hallucinations: Truist’s Chandra Kapireddy

Apr 15, 2025 - 12:02
On today’s episode, Chandra Kapireddy, head of generative AI, machine learning, and analytics at Truist, delves into the evolving landscape of AI with a particular focus on how GenAI tools reshape the way Truist and similar organizations must navigate model risk management and regulations. GenAI is more versatile than traditional AI, he notes, yet its flexibility introduces new challenges around ensuring model reliability, validating outputs, and making sure that AI-driven decisions don’t lead to unfair or opaque outcomes.

Chandra’s responsible AI approach at Truist focuses on risk mitigation while emphasizing the importance of human oversight in high-stakes decision-making. He points out that while GenAI can vastly improve productivity by handling repetitive or analysis-heavy tasks, it’s essential to train employees properly so they use the tools effectively and don’t over-rely on the outputs, especially given the models’ tendency to hallucinate or produce inaccurate results.

Subscribe to Me, Myself, and AI on Apple Podcasts or Spotify.

Transcript

Sam Ransbotham: Despite the tendency to hallucinate, generative AI [GenAI] solutions have promise. Hear from one banking executive about how his company uses both traditional and generative AI to enhance operations — and engage customers and employees.

Chandra Kapireddy: I’m Chandra Kapireddy from Truist, and you’re listening to Me, Myself, and AI.

Sam Ransbotham: Welcome to Me, Myself, and AI, a podcast on artificial intelligence in business. Each episode, we introduce you to someone innovating with AI. I’m Sam Ransbotham, professor of analytics at Boston College. I’m also the AI and business strategy guest editor at MIT Sloan Management Review.

Shervin Khodabandeh: And I’m Shervin Khodabandeh, senior partner with BCG and one of the leaders of our AI business. Together, MIT SMR and BCG have been researching and publishing on AI since 2017, interviewing hundreds of practitioners and surveying thousands of companies on what it takes to build, deploy, and scale AI capabilities and really transform the way organizations operate.

Sam Ransbotham: Hi, everyone. We’re back for another episode. Thanks for joining us. Today, Shervin and I are talking with Chandra Kapireddy. He’s the head of generative AI, machine learning, and analytics at Truist Bank. Chandra, thanks for taking the time to talk with us.

Chandra Kapireddy: Hey, Sam. Good to see you here. Happy to be part of the podcast.

Sam Ransbotham: Let’s start with Truist and your role. I’m guessing many people likely know of Truist. It’s one of the largest financial services companies in the United States, formed in 2019 by the merger of BB&T and SunTrust [Banks]. Chandra, can you explain your role at Truist?

Chandra Kapireddy: Sure. For those who may not be familiar with Truist, this is a top 10 financial services company. We operate in 17 states, [have] 15 million customers, $530 billion in assets. We offer products and services in consumer banking, wholesale banking — that includes wealth management, corporate investment banking, commercial banking — so pretty much all of the different products and services you can think of from a top 10 financial services [company].

I’m the head of AI, machine learning, [and] analytics, as you introduced at the beginning. My group broadly does three things. No. 1, we provide the AI strategy and drive AI policy for the company. Just think of that as a first bucket.

The second bucket: We build platforms and capabilities for [the] data science community, both for training the models and for serving them, and also for the model risk management teams.

The third bucket is really developing advanced machine learning and GenAI capabilities, including the applications themselves. Think of this as more like AI research and cutting-edge and next-gen kind of solutions for the company. Those are three areas that my group is accountable for.

Sam Ransbotham: Those three areas are pretty large. You’ve got strategy and policy, you’ve got platforms, you’ve got advanced research. Can you give us some examples of initiatives within the bank? How is the bank using artificial intelligence?

Chandra Kapireddy: We broadly classify AI into two buckets: traditional AI and generative AI. Within traditional AI, we have a number of AI models that are already in place: fraud detection models, customer segmentation models, and so on and so forth.

The second bucket is generative AI. We break that into three sub-buckets. No. 1, there are models that we can actually build, and that’s what we do: fine-tuned models we build based off the foundation models. The second is using the services [and] APIs: Azure OpenAI as a service, for example, or [Amazon] Bedrock as a service, [with] Anthropic behind the scenes. And the third is the applications themselves, for example, Microsoft 365 Copilot or an Amazon Connect transcription application.

Shervin Khodabandeh: Chandra, one of the things that this distinction between generative AI and what you call traditional AI brought to mind is that, you know, Sam and I have been looking at this space for almost a decade, and the fraction of companies that would say they have a robust AI strategy has steadily increased over time.

Now, one thing we saw with the advent of generative AI was that companies took a step back and said, “Well, maybe we don’t really have our strategy as together as we thought we did.” I’m just trying to get your sense of how real that is. You don’t have to comment specifically on Truist if you don’t want to, but just generally, [I] want to understand from you, from your peer group, folks you talk to: What do you see?

Chandra Kapireddy: Look, I think AI strategy is something that most financial services companies have been working on, and [they’re] constantly evolving it. I don’t think that evolution has slowed down or has not changed. I think it’ll continue to happen. Generative AI definitely brought awareness to a broader community. I think for folks who have been doing natural language processing (NLP), they probably knew this kind of innovation would come at some point in the future, but it just happened quickly.

As a matter of fact, 2017, I think, is when it all changed with “Attention Is All You Need” and the transformer architectures. At that time, people had access to GPTs [generative pre-trained transformers]. It was not GPT-3; it was GPT-1 and GPT-2, and the responses were nowhere near the point where humans could rely on them.

For financial services, as you know, we have always been on the cutting edge, leveraging the latest innovations in technology to offer better financial services and products to our customers. We’ve been watching this field for some time, so, definitely, the shift has occurred with generative AI, mainly from the standpoint of bringing in the latest and greatest innovations and [seeing] how we could replace some of the existing NLP technologies that [were] built with the limitations we had, given the accuracy of those models, where we could use GenAI to surpass some of those limits and bring efficiencies.

So that’s kind of where we are. I don’t think the strategies are changing because of that. That could give you the perception that, “Hey, we probably are moving away from AI strategy.” When we make statements like that, it probably means we’re moving away from relying just on traditional AI and really adding generative AI capabilities.

Shervin Khodabandeh: To build on that point, even some leaders I speak with now are still thinking of these two as very separate domains of capabilities: traditional AI versus GenAI. I’m curious about your thoughts about the combination of both. Do you have any comments on that?

Chandra Kapireddy: You’re absolutely right. I see them as complementary. I think your audience and everybody in the machine learning community, we all know that all models are wrong, but some are useful. And, similarly, every GenAI model hallucinates, right? We know that for sure. The rationale behind whether we should use one or not depends on understanding the degree of uncertainty that really lies with each of them.

Imagine a scenario where a customer [is] applying for a credit card, and behind the scenes there’s an underwriting model that helps with the decision of approving or rejecting that credit card. The level of explainability that has to be built into that model is quite humongous. This is where the regulations come into big play, and how we test the models. That model was built specifically to give a score that aids the decision to accept or reject the credit card application, and it was supposed to do exactly that and nothing else.

But if you take generative AI, you really could use it for multiple use cases, and that’s where the challenges are. Maybe you want to summarize the text, maybe you wanted to extract entities out of the text, maybe you want to get the sentiment of the text. Previously, we used to have individual specific models; with generative AI we could actually use one single model to do that. There are positives there. We don’t have to worry about training multiple individual models, but the downside of it is really, “Hey, the single model is trying to do multiple things. Can it actually? Can we rely on it?” And how do we make sure that we validate it and test it to ensure that we stay within the accuracy levels that we would expect it to [achieve]? And also [be] responsible in a way that we are not really making decisions that otherwise would be unfair and not transparent, and so on and so forth?

And that’s kind of where I see there are use cases for both, and both will stay together. I don’t think we’ll ever remove traditional AI. We would actually continue to have that while we continue to build on GenAI, and I think those two will actually converge at some point.

Sam Ransbotham: I think that’s particularly interesting. I mean, you mentioned SR 11-7, which is a [Federal Reserve] Supervision and Regulation letter. That guidance came out to the banking industry maybe 15 years ago, and that was really before our new wave of artificial intelligence. How do you reconcile a regulatory framework that was developed and put into practice well before the modern developments in models? It seems like a fundamental issue of timing is difficult there.

Chandra Kapireddy: You’re right, Sam. SR 11-7, as you said, came in 2011. SR 11-7 was all about the effective challenge of any model that gets built. It revolves around a check for conceptual fit, which is super important, whether you use a GenAI model or a non-GenAI model.

Take a call center summary: A customer calls a call center and talks about a specific product, maybe to express their dissatisfaction about a particular service, and so on and so forth. How do you make sure that call is transcribed? That’s a technology we’ve been using for some time; we mastered it. But how do you take that call transcript and then summarize it in a way that represents the call itself? Previously, we were actually using NLP technologies; let us say we replaced them with GenAI. How do we make sure we can rely upon that GenAI summarization capability? If that model is in place, you would expect to have the same ongoing monitoring, the same effective-challenge concepts that SR 11-7 dictated, right?

That’s why, if you talk to regulators today, some regulators really think, “Hey, the technology is changing behind the scenes, but what we really want all the banks to do from a due diligence [standpoint] in using these models and providing financial services, that’s not changing.” And I totally agree with that thinking.

I’m sure you also have seen the NIST AI Risk Management Framework that came into existence. After that, GenAI went live with ChatGPT, and then NIST had to release a generative AI profile to accompany the AI Risk Management Framework. But if you take all of that policy, the NIST AI Risk Management Framework, one would expect that SR 11-7 is still intact in terms of how you do model risk management.

Sam Ransbotham: Yeah. The framework is there, the skeleton is there, and the pieces may have changed, but the bones are still the same, perhaps.

Chandra Kapireddy: Absolutely.

Shervin Khodabandeh: I want to talk a little bit about the people in this equation. We’ve been talking about AI and technology, and I would say, with predictive AI, there was an impact on people, whether it was in underwriting or in customer servicing or in marketing, and the impact was making complex decisions, if you will, or complex analysis easier by processing information, predicting, or optimizing across billions of data points in a way that is impossible for a human to do.

However, I would say the extent of that impact was more limited than what we see with generative AI, where the models, as you said, are doing tasks that naturally were human tasks, whether it’s the summarization you talked about, or understanding tone, or even logic and planning, and some of the complex analysis or reasoning that humans do. And so the impact on people should be far greater with generative AI in terms of roles, the nature of work, skill sets, and that sort of thing. What are your views on that, and what are you seeing happening in the industry?

Chandra Kapireddy: Shervin, I think you hit on a very important point. At Truist, we know that generative AI, as I said, hallucinates. I think we know for sure that is the case if you truly understand the math behind it. Fortunately, we’ve been able to implement a seeding mechanism, so it at least gives the same output no matter how many times you ask a set of questions: same question, same output.
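
For illustration, a minimal sketch of that kind of seeding mechanism, assuming an OpenAI-compatible chat completions API; the model name, client setup, and prompt here are placeholders rather than Truist’s actual stack, and seed-based determinism is best-effort rather than guaranteed:

```python
# Minimal sketch: pin temperature and a seed so repeated calls with the same
# prompt return the same output. Model name and prompt are illustrative only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def ask(question: str, seed: int = 42) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": question}],
        temperature=0,  # remove sampling randomness
        seed=seed,      # request best-effort reproducible sampling
    )
    return response.choices[0].message.content


# Asking the same question twice should now yield the same answer.
print(ask("Summarize our dispute-resolution policy in one sentence."))
print(ask("Summarize our dispute-resolution policy in one sentence."))
```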

Previously, it was actually difficult even to get to that. You’d ask the same question twice and get a different answer, so we’ve come across many hurdles. I think we’ve made significant progress. At Truist, what we do is we take [a] responsible AI approach — what does it mean?

You take all the dimensions of responsible AI: privacy, explainability, transparency, accountability, safety, and security. What we did was codify all of that into a policy. The AI policy is how we drive the conversations across the company. And we translated the AI policy into a set of processes. To understand how we implement it at Truist, [we] take the life cycle approach: Every actor in the life cycle is made aware of what stages of the life cycle exist and what their roles are in each of those stages.

It starts with ideation, so if somebody in the company has an AI use case, they start with the ideation stage. Then we do the risk assessment and risk clearing; that’s the second stage of the life cycle. Then we get into development, testing, independent validation, implementation, and ongoing monitoring. Those are the seven stages of the life cycle, so we take the policy with all the responsible AI dimensions, and we make sure that in every single stage of the life cycle, those dimensions are accounted for.

And how we translate that to the actor that has to adhere to it is through the standards. So if somebody is at the ideation stage, they really have to go through the ideation standard, make sure they meet all the requirements, and they go to the next stage. And that’s how we implement it. And so the reason why we have taken such an approach is really making sure that whenever we build a GenAI solution, we have to ensure the reliability of it. We have to ensure there is a human in the loop who is absolutely [checking the] outputs, especially when it’s actually making decisions. We are not there yet. If you look at the financial services industry, I don’t think there is any use case that is actually customer facing, affecting the decisions that we would make without a human in the loop.

As an industry, when we present those GenAI-built or [GenAI-]driven applications, we make it clear that, “Hey, you are interacting with an application that is AI.” We also warn them that it could hallucinate so that people can take the appropriate actions. No. 1, they have to be trained on how to use it. No. 2, they have to know the consequences of relying on that output without applying their own judgment. That’s what we do at Truist every time we have a GenAI situation, and those are mandatory steps that we take.

Shervin Khodabandeh: I get it that it’s absolutely important not to rely on GenAI to make critical decisions given the state of the art, and hence all of the responsible AI framing that you referenced. But I have to believe that much of what people in the company do isn’t about making decisions, right? I mean, so much of it is about gathering information or doing analysis without necessarily making decisions.

In a year or two, when these models get better, when the frameworks get better, what is your view on what will happen to the nature of work, effectively, in many of these places where GenAI is taking [away] so much of the work — maybe even, say, none of the decision-making — but so much of the work that leads to the decision-making?

Chandra Kapireddy: You’re absolutely right. There are lots of productivity-type applications that we can build using GenAI, and even [for] productivity-based applications, the teammates have to be trained on how to use them. Take a simple example like Copilot: If you do not understand how to write a prompt to get an answer, and if you don’t understand how the system behaves and what it can and cannot do, then obviously you either give up on using that capability, or you use it in the most inefficient way, even though the product could actually do a lot more.

So that’s one area we take into account: how we train employees to use GenAI-based applications, because this is a brand-new technology. Also, most teammates who use those GenAI applications are knowledgeable. They’re the subject matter experts, and they can figure out when to use it and when not to, with the understanding that they’re interacting with a large language model that could hallucinate. Overreliance on that LLM over a period of time is a challenge, but at this point, I think that’s probably the lowest-risk class of application.

We see those happening at Truist as well as in the industry. I think the next step that we want to take is, really, how do we ensure that we do it more economically? Or how do we do it more efficiently? How do we scale it across [the] enterprise? That’s where the challenges are for those applications.

Sam Ransbotham: A lot of what you’ve been talking about is, I would say, operationally focused or internally focused on productivity, and that’s important. But at the same time, I wonder about the competitive environment and the business environment that banks find themselves in now that they have artificial intelligence.

You must be at the forefront of seeing a lot of the difficulties that AI has brought. Can you talk to us a little bit about the kinds of things you’re seeing change and, in general, roughly what you’re doing about it and how you’re thinking about it?

Chandra Kapireddy: You’re absolutely right. [Take] cybersecurity. GenAI is definitely [the] most favorable tool for folks who really want to take advantage of it to scale up their attacks with fake invoices, right? And not just the invoices but also the voices of your business. That has been going on for some time. The only thing that has changed is that the sophistication with which those documents are generated, those voices are generated, has gone up, and the technologies out there to catch them are constantly improving. We definitely have to take advantage of those.

I’m sure you’re all very familiar with GANs, the generative adversarial network models. As they get more sophisticated, it becomes extremely challenging to discriminate what is real versus what is fake. That’s obviously an area of focus for us, just like other banks. So that’s No. 1.

But No. 2 is really around innovation in the GenAI space. We definitely wanted to take advantage of generative AI capabilities, whether in terms of fine-tuning the existing models out there or leveraging the services and applications we talked [about] earlier. But if you look at the evolution now, we are now into agentic frameworks, right? For most people who are unfamiliar with the terminology here, everything is a large language model, and everybody refers to ChatGPT. But if you look at the implementation of GenAI capabilities, it’s really an orchestration of a number of different components; it’s not really a single model that’s going to help us. It’s a combination of different services that really have to work together.

Right from when somebody submits a prompt: understanding who that person is (authentication), understanding what this person can do using the LLM (the authorization piece of it), all the way through parsing the input, the guardrails on how this person is interacting with it, and whether we can actually catch any insider threat, for example … those kinds of parsers have to be built, and then [we determine] how the agents take all these inputs and decide which technologies the agents can use to interact with.

It’s not about LLMs. If somebody types in a prompt and says, “I want to know my checking balance for my account,” and you try to use generative AI to generate the balance, I’m sure the answer is going to be either approximate or completely wrong. That is not how we want to build applications, so we’re going to have a deterministic API call that the agent has to make to get a specific answer. How do you orchestrate when to use an API and when not to? That’s where the real challenges are. And if we do that, Sam, I think it’s the same exact thing for security as well. When the attacks go up, there’s a combination of both generative AI and traditional AI techniques and models that we use to counter them.
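
For illustration, a minimal sketch of that kind of routing, where a balance question goes to a deterministic API call rather than to the model; the helper functions and the keyword-based intent check are hypothetical stand-ins, not Truist’s implementation (a production agent would typically use an intent classifier or the model’s tool-calling interface):

```python
# Illustrative sketch: route account-balance questions to a deterministic
# API call and reserve free-form LLM generation for open-ended requests.
# get_checking_balance() and generate_with_llm() are hypothetical stubs.

def get_checking_balance(account_id: str) -> float:
    """Deterministic call to a core-banking API (stubbed with a fixed value)."""
    return 1234.56


def generate_with_llm(prompt: str) -> str:
    """Free-form LLM generation (stubbed)."""
    return "LLM-generated answer to: " + prompt


def answer(prompt: str, account_id: str) -> str:
    # Naive keyword routing stands in for a real intent classifier or
    # the LLM's tool-calling mechanism.
    if "balance" in prompt.lower():
        balance = get_checking_balance(account_id)
        return f"Your checking balance is ${balance:,.2f}."
    return generate_with_llm(prompt)


print(answer("I want to know my checking balance for my account", "acct-001"))
```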

Shervin Khodabandeh: Yeah, I think you are hitting the nail on the head.

Sam Ransbotham: I like the example of the balance; you don’t want the approximate balance.

Shervin Khodabandeh: The biggest unlock here is the ability to orchestrate across a variety of AI tools, so that the machines are working together as well as they can, and, on the other hand, to orchestrate the machine/human interaction so that humans help rather than hinder, and are themselves helped rather than hindered, right?

Chandra Kapireddy: Exactly. If you look at any implementation of GenAI that is working close to what one would expect, it’s that careful orchestration of different components that I alluded to at the beginning. It’s not only that but also the validation of the outputs, right? At the end of the day, we talk about the various RAG [retrieval-augmented generation] patterns: Somebody types in an input as a prompt, you contextualize it by extracting information relevant to the prompt, and then you send it to the LLM, including making API calls to get more deterministic answers. The LLM responds back to you with an output. How do you make sure this output is valid?

There are whole validation techniques that you have to put in place. This is where the guardrails come into play. Sometimes, you know, guardrails are provided by default by the service providers, but companies like Truist also build our own guardrails to make sure that the output is thorough and can be trusted and relied upon by the end user.

If we are not confident about the output, then our response is going to be, “Sorry, we can’t answer this question at this time.” That probably is a much better answer than really giving them some output with some probability of confidence. I think that’s an important step that most of the companies that are implementing GenAI at scale are thinking about and considering — that’s where the real challenges are right now.

So hopefully, we’ll get there soon, but as it stands, it’s really still quite speculative at this time.
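
For illustration, a bare-bones sketch of that retrieve-generate-validate pattern with the refusal fallback described above; retrieve(), call_llm(), and score_groundedness() are hypothetical stubs standing in for whatever retrieval, model, and guardrail services an institution would actually use:

```python
# Bare-bones RAG-with-guardrail sketch: retrieve context, ask the LLM,
# validate the draft answer, and refuse when confidence is too low.
# retrieve(), call_llm(), and score_groundedness() are hypothetical stubs.

REFUSAL = "Sorry, we can't answer this question at this time."
CONFIDENCE_THRESHOLD = 0.8


def retrieve(query: str) -> list[str]:
    """Fetch passages relevant to the query (e.g., via vector search)."""
    return ["...relevant policy text..."]


def call_llm(query: str, context: list[str]) -> str:
    """Send the contextualized prompt to the model and return its answer."""
    return "...draft answer grounded in the retrieved context..."


def score_groundedness(answer: str, context: list[str]) -> float:
    """Guardrail: estimate how well the answer is supported by the context."""
    return 0.9


def answer_with_guardrails(query: str) -> str:
    context = retrieve(query)
    draft = call_llm(query, context)
    if score_groundedness(draft, context) < CONFIDENCE_THRESHOLD:
        return REFUSAL  # prefer refusing over returning a shaky answer
    return draft


print(answer_with_guardrails("What is the cutoff time for same-day wires?"))
```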

Sam Ransbotham: Chandra, we have a section now where we ask a series of short questions. Just answer with the first thing that comes to your mind. What’s the biggest opportunity for artificial intelligence right now?

Chandra Kapireddy: Actually, I would give a little longer answer here. There are revenue-making opportunities, loss minimization opportunities, teammate productivity opportunities, and operational efficiency opportunities.

The sky is the limit in what we can do with AI. And AI is real to me. I know some folks think it’s hype, but it’s actually real. The underlying technology is getting better, the data is getting better, the application of the technology is getting better, so I say [there’s a] huge opportunity across all those domains.

Sam Ransbotham: So what’s the biggest misconception about AI? Is it that it’s all hype? What’s the biggest misconception that people have?

Chandra Kapireddy: Yeah, I think the misconception is the fact that there’s not enough evidence that it could work reliably for a longer period of time. The misconception is it may fade away, just like some of the other technologies that had a lot of hype in the past but had not lived up to the expectations. Generally, the misconception is about the lack of sufficient proof that it would actually work over a long period of time.

Sam Ransbotham: What was the first career you wanted?

Chandra Kapireddy: I wanted to be a movie director.

Sam Ransbotham: I was not expecting that.

Chandra Kapireddy: I know.

Sam Ransbotham: What happened?

Chandra Kapireddy: I think, when I was in India, I was contemplating whether to come to the United States to do my master’s or to join a film technology institute in India and get formal training on becoming a director. And I couldn’t resist coming to the U.S. because of the opportunities that I had and my passion for database management systems at that time.

My plan was to come to the U.S. quickly and learn the latest and greatest. I never went back, and you know how it happens. I mean, it’s not easy to go and become a director at this point, so I had to drop that idea, at least for this life.

Sam Ransbotham: Maybe that’s next. When is there too much artificial intelligence?

Chandra Kapireddy: Too much AI is when it understands you without you explicitly giving permission to it. That’s when I kind of feel like this is where too much AI is happening without all the stakeholders knowing about it.

Shervin Khodabandeh: When it understands you more than you understand yourself, then there’s way too much.

Sam Ransbotham: What’s one thing you wish AI could do that it can’t currently do?

Chandra Kapireddy: To be honest with you, in the corporate world, look at the number of emails that we get. There’s an attempt being made by a number of players at writing emails, summarizing emails, even taking meeting minutes and really feeding out action items. We’ve made a lot of progress in that. I wish we could do a little bit more. That is one area that I definitely see.

No. 2 is I want to see a day when humans don’t need to be in control. Will that day ever come? And what are the consequences of that happening?

We still have a ways to go. I know we talk about [artificial general intelligence] and [artificial superintelligence]. I don’t know if we are there yet, given how the models work behind the scenes. Once we get to that level of sophistication, I’m sure AI could do a lot more, as simple as managing your finances or filing your taxes. And if AI could do my taxes, then I could save tons of money. There are many examples like that.

Sam Ransbotham: That’s a good segue because I think we’ll probably publish this episode right around tax day. Thank you for taking the time to talk with us today. I think the ideas of orchestration are particularly resonating with me right now. The idea of the coordination involved, not just humans and machines, which is what we talk about a lot, but machines to machines and different types of machines — deterministic, generative machines, and how these all work together. It’s not just one piece; [it's] lots of pieces to put together. Thanks for taking the time to talk with us.

Shervin Khodabandeh: Thank you for this conversation; it’s been really, really illuminating.

Chandra Kapireddy: Thanks, Sam. Thanks, Shervin.

Sam Ransbotham: Thanks for listening. Next time, Shervin and I speak with Steve Preston, CEO of Goodwill Industries. Please join us.

Allison Ryder: Thanks for listening to Me, Myself, and AI. Our show is able to continue, in large part, due to listener support. Your streams and downloads make a big difference. If you have a moment, please consider leaving us an Apple Podcasts review or a rating on Spotify. And share our show with others you think might find it interesting and helpful.