Introducing season 2 of Unstructured Unlocked! Indico Data CEO, Tom Wilde, steps in as co-host alongside Michelle Gouveia, VP at Sandbox Insurtech Ventures. In this episode, they discuss Indico’s recent top position in Everest Group’s Intelligent Document Processing (IDP) Insurance PEAK Matrix® 2024.
Michelle Gouveia:
Hey everybody. Welcome to a new episode of Unstructured Unlocked. I’m Michelle Gouveia.
Tom Wilde:
And I’m co-host Tom Wilde.
MG:
And we are really excited today to be joined by Asif Syed, the Vice President of Data Strategy at HSB. Asif, welcome to the podcast.
Asif Syed:
Thank you Michelle. Good afternoon everyone, and nice to be here.
MG:
Thrilled to have you. We are really excited about the conversation we’re going to have today because we spend a lot of time on the podcast, as you all know, talking about automation, right? So data intake, data ingestion, how do you automate the submission workflows, claims workflows. But I think today we’re going to take a step back a little bit and talk about the strategy of how all of that comes to be, right? How do we make sure that the data is clean, is usable? How do you think through getting it into the organization, how do you use it? How do you make sure that people know how to use it? And I’m excited, Asif, that you’re going to walk us through how you think about that in your day-to-day. Maybe before we get started on all of that I just shared, can you talk a little bit about your experience and your background and your role today at HSB?
AS:
I’ve been working in the data analytics and technology field for 25-plus years. I started my career with Tata Consultancy Services, who sent me here to the US in 1999, and then I started working for GE as a consultant, building the data platform and the analytics platform for the analytics people. So I worked with quite a lot of companies before joining HSB: GE as a consultant, Yale University as a consultant. And then I did the modeling part at The Hartford. That’s where I built the actuarial pricing models. And then I worked for a couple of years at AXA XL before joining HSB a little more than nine years back. I joined here to build the predictive models in the analytics division. But soon after that there was an opportunity to move and lead the data governance and strategy part of it.
AS:
And so I took the opportunity to move to the data side of the house. I lead some of the data topics from a business perspective. I have the data engineering team. I also lead third-party data for HSB and the discussion around third-party data, and I help my boss, who is the chief data analytics and digital officer at HSB, with some of the business value of data. Another thing that is big for me is data quality. There are quite a lot of data quality topics and discussions that I lead to make sure that all our data is actually ready for consumption, both for analytics purposes and for reporting and other data usage purposes. So that’s another big topic. Another area I get very close to is the data teams and the data business partners; they work with me as well.
AS:
And in that role, we talk to a lot of business folks about what the data uses are and what they are doing with the data besides just building models and analytics on it. They do a lot of business reports and analytics on an ad hoc basis, and we support them as well. We also have a data technology team within IT who implements the actual IT projects on the data side of the house. So I partner with them as well on any large projects where we have to build some of the data tools and technologies through IT processes and practices.
MG:
Sounds like a huge job. Oh, thank you. With a lot of moving parts. There’s a lot to unpack there in what you just said, Asif. But I think one of the questions I had is, in your role, what’s the ratio maybe of conversations or times where you are identifying data needs or vendors that provide a certain type of data that you’re bringing to the business and saying, we think this may be helpful or relevant to you, versus the amount of times where the business is coming to you and requiring you to maybe vet or assess a data set before determining whether or not there’s value in bringing it into whatever the use case is, for supplementing a submission or even just risk modeling or pricing?
AS:
Yeah, no, it’s a partnership, Michelle. We really are very close to our business partners and we talk to them. So most of the time we understand what problem they’re trying to solve, and because we are closer to the data, we know what data is available in the marketplace, both internal data and external data. So we influence the business discussion by saying that maybe we should collect this data from our internal partners, the data that we get from our internal sources, or maybe we should buy this data from external sources, from a third-party data vendor, because it’s cheaper, it’s high quality, it’s available, and the process of getting that data ingested from an internal process or collecting it from our customers is probably not worth it. And so, because we are very close to our customers, our business partners, we bring some of those discussions to the table as well.
AS:
But as you know, the business people also travel. They go to the data conferences, they go to the business conferences, they hear about a lot of the third-party data that is available, and they bring some of those discussions to our attention as well. So it’s really a partnership. And another thing I want to add: you probably know that we are part of the Munich Re Group. Munich Re Group has a big investment in identifying third-party data. They used to call that the data hunting team. Now they call that the data asset team, but the goal is identifying all the third-party data available globally to solve our business problems, both to augment our internal data and to do business better with little internal data and maybe with external data if it’s cost effective.
TW:
Do you think of data now in almost a supply chain metaphor where you have sources, you have rules and transformations and you have outcomes you’re trying to achieve. Is that a good metaphor, do you think of it that way?
AS:
It’s there. I haven’t really thought about it from the supply chain perspective, but it is about managing the data and having that balance from both an internal and an external perspective. As you know, the data has a journey in itself, correct? The data gets created in some systems, whether internally or externally, and then how do you utilize that data in the right way to solve the business problem? There’s quite a lot of work, as you know, Tom, because it’s not just the cost of the data that you buy, but also the internal cost of aggregating that data, getting it to high quality, joining it with our internal data, and actually solving the business problem and putting that in production. It’s quite a journey, and having a process mindset to solve those business problems is really important because you really need to improve the process the right way with the data. And data is not just numbers in columns and rows, it’s also telling you something to solve a business problem, and we need to have the right balance there.
MG:
What do you wish third party data vendors knew about the internal vetting and assessment process? What would be helpful in those conversations or would help streamline that negotiation or just what are some pain points that you wish they were more understanding about?
AS:
I think third-party data vendors, many times, I have felt, don’t understand the use cases. They have the data, and so they don’t understand the cost of it, and probably they could do better from that perspective. Understanding the use cases of where and how the data is going to be used exactly would be helpful to have a more definitive and speedy conversation where we don’t waste anyone’s time. The other thing, Michelle, is that there is a lot going on in AI governance, the EU AI Act and all of those things. The departments of insurance are saying that we, as an insurance company, are liable if the third-party data vendor is not gathering or collecting data the right way, or if they have biases in that data and we put that data in our model and the model becomes biased in some form. Then as an insurance carrier, we are responsible for the data elements that we have acquired from a third-party data vendor. How quickly they can help us understand that there is no bias in their black box would be another good thing to have in play as well, besides a thorough understanding of what the carrier is trying to do with their data.
TW:
Given that you have a long track record in this space, what are some of the big changes you’ve seen since, say, 10 years ago? And then I’m equally interested in what’s changed in the last year, because obviously we’ve all seen incredible velocity of change in the last year. So I’m curious, across those two timeframes, what’s jumped out to you?
AS:
So data was always a centerpiece of the discussion. For the last 10 years, everybody was trying to build the data warehouse and the data repository, and many, many projects failed in the data warehouse scheme because the business outcome was not solidly defined. They just wanted to gather all the data together and didn’t know what they were going to do with that data. So that problem is still there, I think, in some form. It is like, okay, let’s gather all the data that is available, let’s combine everything, and there is a high cost to that effort. But very recently, in the last one year I would say, this AI innovation of large language models and ChatGPT has definitely changed the conversation more towards: we need to have an AI strategy.
AS:
Sometimes that is driven more by the new technology and by The Wall Street Journal or the Harvard Business School magazines saying that the world is going to change with this new model, and justifiably so, correct? As a technology breakthrough, these are big events in the industry and in the world. But where do you want to strike that balance? Because as you know, these AI models are dependent on good quality data, and the question is how ready you are with good quality data to take advantage of the recent development of the technology. Sometimes we go with the flow, thinking that the technology will solve all the problems, when your foundation is not ready for that kind of technology to be utilized properly. And many times there may not be a business case for that industry to be able to use it right away.
AS:
They might be better off waiting for a little bit than jumping into it, because they probably want to solve their foundational problems first. But because of the nature of the hype cycle of AI, sometimes people miss that. So having the right balance, I think, is always important. You do want to do the proof of concept and be ahead of those technology curves so that you don’t miss the boat, but first solving your other big problems of getting the data and getting the data ready for those AI applications is important to me.
TW:
It still seems that defining an outcome is still the most important thing regardless of the tool or the technology or the data that you’re going to use.
AS:
Oh, absolutely, Tom, absolutely. Because many times I think that’s where the projects fail, because we don’t define what business problem we are trying to solve. We say that AI will solve the problem, and as we know, correct, as technology practitioners, there are a lot of limitations to what AI can do, and it’s just not mature enough to solve all the world’s problems. Maybe someday it will. So I think a very calibrated approach is needed: what business problem am I trying to solve? Okay, what data do we need? Do we have all this data with high quality, and do we have our technology infrastructure ready to even deploy those models the right way? Correct, because it’s not just doing a POC or doing testing, it’s deploying those new algorithms in a production cycle. And is our culture ready to support that as well?
MG:
I have another question, but you just hit on internal infrastructure, which to me, and from my time working at insurance carriers, was always a challenge. And so from your perspective as someone who has worked with the different business units and oversees data engineering teams, you have probably seen, through all of your various roles and different companies, various states of technology readiness. What, to you, defines having an infrastructure that’s ready versus something that needs to go through a large transformation as it relates to being ready for AI?
AS:
So as you know, data quality is a big issue in all insurance carriers, it’s not just at HSB, correct? We are trying to improve, and we are really farther along than where we were five years back. But we are dependent on a lot of other issues within the system, within the policy systems, within the claims systems. And there is a constant movement of new tools and technologies, moving to the cloud and moving to new versions. So I think it is about having the right technology framework in place, with the data strategy and data technologies in parallel. How do you layer all of those things, especially in a big organization, correct? Because there are many different groups and many different technologies that come into play. So I think marrying the different layers, the technology practices, the data architecture, the business architecture, the business objectives, and the data strategy and quality, all layering one on another in a nice way, would deliver some of the business value that we’re trying to achieve together.
TW:
How do you think about your data strategy in terms of, say, sub-segmentation? I think about emergent categories that insurers are increasingly using: telematics, drone imagery, obviously in Indico’s case email and documents, and historical unstructured data that companies have accumulated for decades but maybe have not been able to leverage. Is that how you think about it, or is it more the business problem first and then the data? How do you matrix those two things together given that they’re both evolving at rapid paces but somewhat independently?
AS:
So at HSB, we don’t have some of that telematics data because, as you know, we are in a different kind of business within the P&C sector. At HSB, we are very big on equipment breakdown. We have a few products in cyber, we have a big presence in cyber, we have a big presence in some of the liability lines, and we are also very big on our IoT and sensor data. So some of the data is a little different in structure than at other P&C carriers, but we do have a lot of unstructured data, and there are a lot of benefits to using that unstructured data together with the structured data. But I am always a big believer, Tom, in having the right problem and the business value for getting that data. We want to incrementally get to a better place with the data, but if we don’t have a business problem to solve right away, just accumulating millions of data points for imagery or other things may not give you a lot of business value. And because, as you know, Tom, the technology is changing so fast, if you don’t use that data today, tomorrow there’ll be a better way of utilizing that data with different tools and technologies. And so to me it is always about striking the right balance: what you do with the data needs to solve the business problem today, and not, okay, I’ll do something with it 10 years from now or five years from now. Maybe we’ll need different data elements by that time.
TW:
Given how fast it’s changing, does the leadership see data as an opportunity for differentiation and competitive advantage or is it more a means to an end? How do you think about it?
AS:
No, absolutely. Leadership does think that data is a differentiator, because that’s what differentiates us from any other company: how well you manage and access the data, the data that you generate internally, and how you manage the external data along with the internal data. Those are really key differentiators, and at least at HSB, our senior leadership puts a premium on having the right data at the right time with the right quality. So we do have really good executive buy-in from the data perspective.
MG:
You mentioned the internal data that you create, right as a carrier and then the external data that you acquire. When you think about all of the data that is created within HSB or any carrier that they’re creating that data, all of the notes that they have when a claim is being adjudicated or that cycle of data that happens between claims and underwriting, how difficult or how valuable is it to take all of that historically unstructured internal data that may be in data silos in different groups and extract that and then make it either structured or accessible where it wasn’t otherwise accessible? Is that a major initiative that’s happening?
AS:
It’s a big initiative. I wouldn’t say that it’s a major initiative, but it’s really a big part of our data strategy and data journey. As you know, Michelle, for the majority of our business, we get data from our external customers or external vendors, not vendors, external partners, other insurance carriers. We have tie-ups with 600-plus, probably even more, other insurance carriers who give us the monthly exposure data, the premium bordereau data that we get from them. And we try to ingest all of that data, and much of that data is structured data, not unstructured data. How do you master those 600-plus client companies, or even more, I forgot the exact number, those client files that come in every month, every day? How do you master those? So that’s a big part of our effort: to get that data cleaned, get that data organized, get that data mastered, and be able to join claims with the exposure data.
AS:
But we do have quite a lot of unstructured data in our engineering systems. We have all the notes that engineers write down. We have in the claims system all the claims notes that the claims folks write down. How do you match those claims notes and engineering inspection notes with our master data, with our exposure and claims joined tables? It’s always a big effort, because we do quite a lot of analytics on those notes. Does that tell us anything from a claims processing perspective, or for optimizing or improving our engineering operation and performance as well? So from our data team’s perspective, we are always looking at optimizing that unstructured data and trying to join it with our internal structured data.
MG:
And usually the payoff of that is not immediate, right? Because you’re doing those analytics, you may think that there’s a correlation, and you really need to go through another cycle to identify whether the data that you added is actually a differentiator or not to the different outcomes.
AS:
Yeah, just as you said, Michelle. Yeah, it’s not immediate. And we do a lot of proofs of concept. And I think there is another balance that is required: do you create an IT process right away, or do you do more ad hoc processes where you extract some of that unstructured data, join it with your structured data, do a POC, see if there is any value, and then put it into production? Versus, okay, let’s do a big IT project and let’s get all the unstructured data that we have in our company for the last 30 years and join all that unstructured data with our MDM or master data effort. You may not gain a lot of value from doing a large IT project right away before you actually see the business value of it. So that’s where we do a lot of those activities in an ad hoc way. And then if we see value, some of it goes to IT to productionalize it. Michelle, as you know, now all of those things, especially for text, are big in the context of AI and large language models. Correct, because in the last one year, after ChatGPT and all of those things, all of us think that these LLMs can do a big thing in unstructured data mining, and there might be a lot of business value that our business can gain immediately by reading those texts and notes using LLMs.
TW:
On the subject of LLMs and GPT, what’s your personal view on how enterprises will adopt prompting? Will prompting become something that all employees have to become good at? Will it be restricted to a handful of people who have been specifically trained? Because prompting in some ways, in my view, has almost become a new software programming language, but it’s available to everybody. So it’s sort of an interesting challenge to solve. Everyone woke up and became a software programmer, but without any of the traditional software development processes that have been evolved over the last 30 years in the IT space.
AS:
That’s a good question, Tom. This prompting, and again, this is completely my personal research and opinion about it, is going to be embedded in every function of the organization that we do today, because it’ll be embedded already in all the Microsoft products. It’ll be embedded in all the other products that we use today or will use. So it’s not just what we’ll be building internally, or what companies will be building internally, but the products that you are going to utilize from external vendors will have those features embedded in them, and it’ll be more an augmentation of your work effort. So to me, prompt engineering is going to be another foundational skill. People need to know Excel. People need to know what those things are, what those keywords are, and how to ask the right questions to get your business done. It’s going to evolve, because right now all the companies are worried about their IP, correct? They don’t want to give away their secrets to the large language model providers. Again, there are some things coming up that make it a little bit more safe and secure, correct? Now enterprises can utilize it in a little more robust way, but we still are not there, where everybody in the organization is using the right prompt to get the right answer. But I believe in a few years we will be there. Everybody will be a prompt engineer and will need to know that skill just like they need to know Excel.
MG:
As someone who’s responsible for data governance and compliance, what about AI and large language models really excites you and what are some of those use cases and then what keeps you up at night about it?
AS:
As you know, AI governance is a big topic, correct? In this year itself, the EU came up with their EU AI Act, and then the Biden administration had their executive order on AI and the fair uses of data and fair uses of AI. And then Congress is working on different rules. As you know, many departments of insurance at the state level are coming up with their own versions of the AI governance and AI documentation that they would need. So this is really a big topic. Even the state of Connecticut very recently came up with their documentation and their requirements on how insurance companies use AI, from the DOI of the state of Connecticut. So we are all really working towards that, having a better governed AI process from end to end. As you know, Munich Re is a very big company and HSB is a part of that.
AS:
We are globally tackling that AI governance problem, and we are really a step ahead on all the data and the processes that we are using and putting into the models, and on the use cases that we are trying to solve. But as you know, Michelle, at least at HSB, we are still the predictive maintenance company, correct? We are still the equipment engineering company. The majority of our business is on the commercial side, on equipment. And I’m not really too worried, at least as of now, because the personal data that we have is very little at HSB, and most of the problems that people get in trouble with using AI are in the personal space. If you are a personal lines carrier writing auto and home, or if you’re doing life, if you’re doing health, those kinds of businesses are really at the forefront of how AI governance needs to be applied. We are really way ahead in our game because we do want to take it seriously, but I’m not really too worried about that, because our business model and the data that we have are a little different, not too personal.
TW:
I’m not sure if this applies to HSB because this is a relatively emergent area, but maybe flipping it over, how do you think about the insurance risks of underwriting the development of autonomous robotics and things like that in the engineering and manufacturing space? Because it’s sort of a problem in reverse. How do you assess the risks of these autonomous systems, whether it’s manufacturing or robotics for maintenance or any of those kinds of things? How do you think about that?
AS:
So, Tom, I get involved from a data perspective, and some of the recent changes in technology in the engineering space definitely affect the data that we get. But we are really working on a lot of the sensor data to get a step ahead of all the new technology and the new changes that are happening there. I could tell you that HSB is really at the forefront of all that sensor data and IoT data, and we are trying to marry it with our policy data and exposure data and trying to get ahead of basically the predictive maintenance nature of it. Correct. There are two aspects to it: pricing it, and then once you price it, can you do better predictive maintenance so that the claims don’t happen? So from a data perspective, as you know, HSB is really ahead on all the IoT data and sensor data, and we collect quite a lot of data there.
TW:
Yeah. Fascinating.
MG:
One of the things about HSB is that the product set is specialty. And so the products are very unique. They’re differentiated in that way. You’re not going to get the standard insurance lines, and it’s a very data-driven organization. So what data can you collect to develop a policy that is meaningful to cover some type of very niche, or I’ll call it nonstandard, risk? What do you predict are some interesting new insurance products that could come out of all of the data or the application of AI?
AS:
So there are a few big areas that HSB is working on, and cyber insurance is one of the big areas that we’re working on. I strongly believe some of those new developments in AI technology will help us in underwriting better in cyber insurance and also in preventing cyber claims better from a long-run perspective. So I think there is a lot of benefit from the new AI technologies in that space. Also, Michelle, at Munich Re globally we are doing aiSure products. We are actually insuring some of those AI models from an outcome perspective. And I think that’s another interesting area that Munich Re and HSB are working on from an AI and data perspective. Other than that, the products that we sell are all our traditional products. We are trying to get better in writing equipment breakdown. We are trying to get better on the liability products that we write, and in how we can use new technology and new AI models to do those jobs better.
TW:
In your role, what do you see as the biggest opportunity in the data space right now? And maybe what’s the biggest challenge if you had to kind of pin it down to one of each?
AS:
I think the biggest opportunity, Tom, is that there is buy-in, executive buy-in, to do the big, large data projects to get the data ready for all the stakeholders, from the ingestion of the data to creating model-ready data. Because as you know, there are tons of steps that go on, and traditionally we probably didn’t have a lot of investment in each of those steps. Now there is executive buy-in, and all those things that we want to do with AI. So I think, timing-wise, it’s right for the data people to get the right investment to build the data platforms. But the thing that worries me is the expectation gap: can we deliver what the business people are thinking that we are delivering? Because they’re thinking that this will solve this, this, and this problem, and data people are thinking about creating another data warehouse. There is a big gap between what the business is thinking it will solve with the data versus what the data team is delivering with another MDM or master data management effort. That gap sometimes worries me, because sometimes I feel that there is an expectation gap between the business team and our team on the data side, and that we might fail. And that’s what keeps me up at night.
MG:
We talk a lot about that in reference to initiatives generally, but especially within insurance carriers, where the goal is to do something with AI. And so I think what you’re saying is that shouldn’t be the goal. The goal should be: here’s what we’re driving to as a business, and then, can we use AI? Is there a solution out there where AI is applicable and would be beneficial to this? Or is there a different route we have to take? When you’re working with the businesses or trying to identify data sets for a specific outcome, what is some advice that you would give to other folks in a data governance or data management space when trying to partner with someone who maybe doesn’t fully understand the data or the AI capabilities, but has a very specific solution in mind already?
AS:
So Michelle, there I always like the balanced approach. One thing is that you want to guide the analytics team and the business team on what data we have and what the state of the data quality is for the problem that they’re trying to solve, because that’s important for them to understand. It means being a partner with them and saying, hey, the problem that we’re trying to solve is good, it’s great, but the data is not there, or the data quality is really too bad for you to do it at this point in time, so that they understand that problem. So that’s one. But the second, I think, is the proof of concept and understanding the capability of the new technology. Because many times the business may not appreciate or may not know what technology is capable of today, some of the large language models, something that you could do that the business people may not know about.
AS:
So that’s our role as a support function from a data strategy perspective as well: saying, hey, this data is available, and then partnering with the technology team and the analytics team and saying, hey, we have this data and this technology and these kinds of new models and methods, and we know the business problem because we are talking to the business and aligned with them all the time, so maybe we can do a POC and show them what it can do for them, and maybe they will like it. So I think that partnership is extremely important, along with having a balanced approach. One part is the production process of delivering iterative value to the business. Another is supporting their innovation goals, even when you know that, okay, the data is not there, it’s not a hundred percent, but maybe there is something that you could do because the technology is there, where the innovation will pay the bill or be a differentiator in our business outcome. It may not be today, it might take six months, but you have to do that innovation in a continuous way. And as data people, our goal is to support those conversations and those discussions.
TW:
Great. Well, I think it’s been an excellent conversation, Asif. Really appreciate your perspectives, especially in the role that you’ve held for a long time here and what you’ve seen in terms of innovation in the last 25 years. So super helpful context, and a really insightful conversation.
AS:
No, thank you Tom, and thank you Michelle for inviting me. Glad to be here talking about data strategy, governance and quality.
MG:
Excellent. Thanks for joining us.
TW:
Thanks so much.
AS:
Thank you. Have a nice day.
Check out the full Unstructured Unlocked podcast on your favorite platform, including: