Unstructured Unlocked episode 40 with Joe Curry, Head of Apollo 1971 Data Science

Watch Christopher M. Wells, Ph.D., Indico VP of Research and Development, and Michelle Gouveia, VP at Sandbox Insurtech Ventures, in episode 40 of Unstructured Unlocked with Joe Curry, Head of Apollo 1971 Data Science.

Listen to the full podcast here: Unstructured Unlocked episode 40 with Joe Curry, Head of Apollo 1971 Data Science

Michelle Gouveia: Hey everybody. Welcome to a new episode of Unstructured Unlocked. I’m co-host Michelle Gouveia.

Christopher Wells: And I’m co-host Chris Wells.

MG: And we are excited to be joined today by Joe Curry, Head of Data Science for Apollo 1971. Joe, welcome to the Unstructured Unlocked podcast.

Joe Curry: Thank you very much for having me. Hi Michelle. Hi Chris.

MG: We’re really excited to talk to you. Could you maybe start off by explaining what Apollo 1971 is and then a little bit more about your role within that?

JC: Sure, absolutely. Yeah, so Apollo Syndicate 1971, we are a syndicate within the Apollo Group. So Apollo Group was founded, I believe, in 2009, and we have a number of syndicates now within the group, and we are a Lloyd’s market insurer. So Apollo Syndicate 1971 specifically became its own syndicate, I believe, in 2021. And we focus primarily on insuring non-traditional Lloyd’s classes of business. We focus on insuring the future of mobility, so we set up working in the sharing economy space, working with some of your large tech clients within the States. And we’ve expanded to focus on innovative uses and the future of mobility. So we are looking at interesting things like autonomous vehicles. And so my role within the company is head of data science. I set up a data science team, a data science function, about four years ago when I joined.

And as a head of data science, I think everybody has their own definitions of what they think a data scientist does and what a data science team does, but our team is entirely embedded within the syndicate. So we work very closely with our underwriters and our actuaries, and directly with clients and brokers. And the role of our team is to engage with clients on what data is going to be valuable for us in terms of pricing and assessing risk, working on the best ways to transfer that data, feeding back data insights and data analytics, building some engineering pipelines, doing a lot of automation, and some of what you would probably call true data science, which I think is what everybody thinks of when you say data science: building your predictive models and really trying to innovate and do things that work with new technologies or that we think haven’t been done before, or taking aspects of machine learning and new tech stacks from other industries and applying them to insurance.

MG: Really cool. With a lot of the guests that we have, if we do start talking about specific products and needs in terms of underwriting, claims, or data, for the most part we’re talking to people who are working on what I’ll call established products. Future of mobility, to me, is just indicative of the fact that there’s still a lot of exploration going on. What do those products actually cover? And so I’m really interested to understand from you, how do you go about building, I guess, a data repository for something that is so new and is still developing in the greater tech world, and then trying to have insurance catch up to it or stay in step with it?

JC: Yeah, it’s a challenge. Like you say, we build insurance products that have never existed before, and so how do you take something that’s never been done and maybe data that doesn’t exist? So I mentioned autonomous vehicles. How do you build an insurance product off that and have enough comfort that you’re pricing that product effectively? So we are lucky in what we do, in that we build really strong partnerships with our clients and with our brokers and our partners. And what we look to do is really knowledge share and share data with clients. Obviously some of the tech clients that we deal with have terabytes and terabytes and terabytes of data. They collect every single movement from every single nanosecond of every single vehicle that they have deployed. And what we try to do is inform from a risk perspective. We might not have the capability to understand and work with the levels of data that they’re working with and build the sort of convolutional neural networks that they’re building.

But what we do have is that sort of risk hat and that understanding to be able to understand what data they have, which is going to be valuable to inform risk and what we know from other lines of business and other more traditional insurance products that we’ve worked on and really collaborate on that. So that’s what we really try to do. We try to take learnings from maybe more of your traditional lines of business data that we have, multiple years of data, we’ve got the experience, we know how it’s going to play out, and then try to apply those learnings to new clients and new customers and new innovative ways of doing things.

Christopher Wells: Sounds like really fun stuff. I’m curious how your team interfaces with, say the actuarial teams at Apollo. A lot of what you’re describing, I would’ve generally assumed traditionally would’ve gone to an actuarial role. So how do you interface with them?

JC: Yeah, I mean we work very closely with the actuarial team. I actually sit within the actuarial function, though I’m not an actuary myself. I flunked out after one exam, but I do have that sort of history of working within pricing teams and working within actuarial teams. My real interest and my focus is data and coding. I’m a bit of a nerd. I love tech. And so we do work very closely together. So you rightly point out that coming up with the price and assessing historical data is a very typical kind of actuarial function. I suppose where we would come in would be working with the potential terabytes of data from our clients, integrating with their tech stacks, understanding what additional insights we can bring beyond traditional actuarial techniques, and feeding that into the actuarial process. So we’re not doing two separate jobs. We will work very closely with the actuaries, we’ll feed directly to the underwriters, but the way the process works within ibott certainly is that on every single account you’ll have input from every single function, and we’ll all work together on all those accounts. So we complement each other, I’d say. I love to get stuck into the coding and the nerdy stuff.

CW: It’s fascinating to me that you’re distinguishing yourself as nerdy versus the actuaries, but that’s another podcast.

MG: I do want to hit on something that you just talked about: when something comes in, everyone there is part of that process. So when you think about the current state or the future state of the future of mobility products, what does a standard submission intake process include, and where are some of the challenges that you see today, whether it’s data related or whether it’s just speed of processing? Would love your thoughts.

JC: So I’d say the submission process is, I dunno how many Lloyd’s market insurers you’ve had on this podcast previously, but I’d say it’s very typical. There are a lot of challenges in terms of the quality of data, the amount of data, getting the right data, data cleanliness, the speed of being able to turn around submissions, the amount of submissions that come in. The submissions that we receive are very typical, I would say.

MG: Are they primarily still, how are they coming in, Joe? Is there a standard there? Is it email, is it through an automated post?

JC: It’s mostly email.

MG: Okay. Yeah.

JC: Yeah. And so I guess we, Apollo, are, I would say, genuinely an innovative company. Insurance, I don’t think, as an industry moves particularly quickly. And then you’ve got different layers within the insurance industry. You’ve got your personal lines where, 10, 15, 20 years ago, whatever it was, everything was coming through a website. The data was already really structured, it was enriched. They were building machine learning models then. And you get to commercial lines insurance, where you have a little bit less data, and then the specialty lines of insurance in the Lloyd’s market, where you have very little data, and it’s another step behind each time. And so you’re starting to build machine learning models now which they were probably building 10, 20 years ago in the personal lines space. So as a general rule for submissions, email is by far the most common way of doing things, but it’s not the best way of doing things. We can automate certain processes when something comes in through somebody’s email inbox, but again, that might not necessarily be the preferred way of doing things. I guess the issue that you have is if you’re interacting with brokers and clients, you need to get everybody in the whole supply chain on board if you’re going to revolutionize that process. So yeah, a long-winded answer to your question: email. But we try to be a little bit smarter with the triage and the automation, and we’re really working towards that. You’re never going to get rid of email.

MG: It’s the ultimate API.

JC: Exactly. Yeah.

MG: There’s a lot of discussion out there, as you’re spending all your time on the future of mobility specifically, about how autonomous vehicles will change how insurance carriers think about paying out claims. Where does liability lie? But there’s also the ingestion of all this data: you’re getting, I assume, real-time updates from the vehicles, or you’ve got information from the OEM that gives you more detailed information than you would’ve otherwise had. How do you see the claims process transforming with the mobility products as they grow based more on autonomous vehicles?

JC: Yeah, sure. I mean, at the point of claim now you have so much information already, and sometimes, and we see this with a lot of our more sophisticated partners, they know that a claim has happened before a claim actually comes in. So I think telematics, having that sort of automated first notice of loss, is absolutely massive for claims. And from an insurer’s perspective, you’ve got the efficiencies of first notice of loss, claims triage, and setting your claims handling authority almost immediately. It’s beneficial to the customer. They’re going to get payouts a lot quicker than they would have if you’ve got to do a bit of a deep dive into every single claim and weed out fraudulent claims and actually find out the specific details of the claim. So I think, yeah, it is huge, and some of our partners in particular have really built out quite sophisticated processes to handle these claims.

From our perspective, we work a lot in excess lines, and what we like to do is really get on top of large losses that are really going to impact us and impact our layer. And what we can do now is develop predictive tools to say, oh, you know what? This claim’s come in and it’s really early stages, but already from the claim description, where it is, and the details surrounding the circumstances of the claim, this is actually something that they need to notify their excess carrier about, or that we need to keep an eye on. And then we can really make that process a bit more efficient, where we are not removing any humans from the loop, but we can just get the right experts looking at it at the right time. So yeah, I think the more these things get integrated, the better it’s going to be and the more impact it’s going to have on claims.

I think the challenge we face is that a lot of the time this data is being collected, and you’ve got claims teams and risk management teams, and there’s still a disconnect there. Particularly with, let’s say, a new mobility operator, they might have set up as a startup company and they’re collecting loads and loads and loads of information, but it’s not until they’ve reached a certain size that they’ve actually hired an internal risk manager and gone, right, we need to cut our insurance costs. And they’ve collected all this information, but they’ve not thought about linking it to their claims data and their incident data. And then you’ve got a bit of time where you need to play catch-up and actually go, right, we need to map these two bits of data, and then we can actually fully assess what impact making certain risk-driven decisions is having on the actual value of your risk.

CW: Yeah, that makes sense. Really interesting. I want to dig into two areas here, one on the claim side and then one on the sort of underwriting side. So if the first notice of loss is automated to you, does that mean you’re reaching out to get the rest of the information to them, like inverting the process?

JC: So I would say with our claims partners that would be the case. Given where we sit quite often in the insurance layer, that wouldn’t necessarily be us personally, but yeah, that’s absolutely right. And depending on the departments that we might work with, they can see that something’s come in. We work with some of the largest ride sharing and delivery companies and stuff like that in the US, and they can reach straight out and say, oh, I hear you’ve had an accident, is everything okay? It allows them to get on top of that claims process. It brings down your claims handling costs, but it’s also massively beneficial to the customer as well.

CW: That’s so cool. I remember the first time I ever made an insurance claim for my car, I was totally befuddled about what I was supposed to do. So if someone could have just said, Hey, you had a wreck, give us this stuff and we’ll sort it out for you, man, that would’ve been amazing. So that’s really cool.

MG: Good use of the word befuddled there.

CW: Thank you. Showing off my doctorate this morning. And then on the underwriting side, one thing that we’ve heard over and over again is that the bottleneck that carriers are trying to break down right now is the process of getting structured data out of these unstructured bundles of documents, so that the clearance process can complete faster and then the underwriter can take over. Where are you seeing automation, and particularly artificial intelligence driven automation, having wins there?

JC: Moved on to large language models already?

CW: It took us longer than normal.

JC: Yeah, there are so many use cases for it, and actually not just large language models, or standard language models as I guess they’re called now, but other AI tools that exist. The value of it is huge. I mean, we talked about claims, but underwriting as well. There are so many huge documents, as you mentioned, that just from a reporting perspective, or just building tools for underwriters where they’re interacting with brokers that have X, Y, and Z account, being able to extract certain information from slips so that I can say, oh, here’s all of the information from every single slip for this particular broker, is a massive advantage and a massive time saver. That would’ve taken a UA a couple of days to do, and their time is now freed up to do something else. So there’s so much in the kind of standard manual day-to-day tasks that can be automated by extracting key bits of data from submissions.

One of the things that we find massively in our team is historical claims data where we’ve not been on an account. You receive data in all sorts of different formats, and it’s so informative as to the future performance of an account, but it could be embedded in a PDF on page 10 of this report, and how do I give that to someone and get them to scroll through and then type everything out? And to be honest, that’s what used to happen. But now, yeah, there are so many tools that exist, actually outside of large language models, that can already do that in such an efficient way. I feel like, for the most part, those kinds of manual tasks can be automated with what you would say were existing tools more than two years ago, with traditional kinds of AI tools and actually just some rule-based automation and standard script building. But as we’ve mentioned, there is then so much more that you can do with large language models on top of that: actually being able to interpret the context of the data, without really knowing what you want to pull from a slip or a claims adjuster’s report, and being able to get the machine to pull out relevant data when all you have to do is give it the term relevant, that kind of thing. There’s huge potential there as well.
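The "rule-based automation and standard script building" Joe mentions can be as simple as a regular expression run over text already extracted from a PDF page. The sketch below is only an illustration under stated assumptions: the page text, its layout, the field names, and the `extract_loss_history` helper are all hypothetical and do not describe Apollo's tooling.

```python
import re

# Hypothetical text, as it might look after being extracted from one page of a PDF report.
PAGE_TEXT = """
Policy year 2019  Claims: 14  Incurred: USD 1,250,000
Policy year 2020  Claims: 9   Incurred: USD 730,500
Policy year 2021  Claims: 11  Incurred: USD 2,010,000
"""

# One rule per row layout; a real report would need a rule for each format received.
ROW = re.compile(
    r"Policy year\s+(?P<year>\d{4})\s+"
    r"Claims:\s+(?P<count>\d+)\s+"
    r"Incurred:\s+USD\s+(?P<incurred>[\d,]+)"
)

def extract_loss_history(text):
    """Return one dict per matched row, with numeric fields parsed."""
    rows = []
    for match in ROW.finditer(text):
        rows.append({
            "year": int(match["year"]),
            "claim_count": int(match["count"]),
            "incurred_usd": int(match["incurred"].replace(",", "")),
        })
    return rows

history = extract_loss_history(PAGE_TEXT)
```

A rule like this only works when the layout is known in advance, which is the limitation Joe contrasts with large language models that can be asked for "relevant" data without a fixed template.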

CW: You’re singing my song. Let’s come back to that in a minute and we’ll really drill.

JC: Sure.

MG: Joe, I agree with everything that you just said. When I think about applying AI capabilities on top of, or just to automate or better streamline, the existing underwriting or claims processes, you think about the fact that there’s always this large set of historical decisions that have been made that the model can leverage. When you’re establishing a new product where the underwriting process is new, there’s still a lot of underwriting guidelines to be developed. You’re looking at everything in more detail, maybe because you’re not used to seeing that type of data come through a submission and there’s something new. And then even on the claims side, it’s a new type of claim or there are new factors. How do you think about applying AI? What are the differences to you, or the challenges, in applying AI to a process for a product that is so new or so nascent and still kind of being figured out even by the underwriters or experts in that space?

JC: Sure, that isn’t established and you don’t have a huge amount of historical data for.

MG: Exactly.

JC: Yeah. It is a challenge. Data scientists, data people, we would love a structured data set which gives you exactly what you’re trying to predict in your output, that’s a hundred percent accurate, so you can build a really easy model. I’d spend all my time just going through all the different types of models and looking at which one’s the most predictive. But in reality, the job is 95%, I’m going to say, data cleansing and finding that data and getting it structured in a computational way. Where we don’t have historical data, I think that’s where really embedding within other teams and relying on expert judgment actually comes in. I don’t think we’re arrogant enough as a data science community to say, we think the data says this and therefore this is the answer.

It’s always been the case in insurance that you might build a really predictive model for pricing an auto risk, and you go, oh man, I don’t know why diesel cars are such a high windscreen risk compared to petrol cars. It doesn’t make any sense to me, but if my model says so, I’m just going to include it. But if you talk to an underwriter, they’ll go, oh yeah, it’s because they do way more motorway driving and you’re more likely to get a pebble hit your windscreen. And underwriters have this expert knowledge about the industry that I don’t have and my team doesn’t have. So that’s where we would really look to collaborate. So our first version of a large loss model, when we’re looking at claims, doesn’t come from loads of historical data that we’re trying to get in the right format or that we don’t necessarily have.

The version one is going to be, okay, well we’ve got some expert claim adjusters here that have worked on this for the last 20 years. You tell me what you think is really important. And then we create a baseline of version one, and then as the data comes through, we then iterate through and we build a version two and we say, right, we think this is more predictive because of this. Maybe you’ve overrated this factor. What do you think about that? And it’s all a collaborative process. I don’t think we would ever personally and the way that we’re embedded within the team ever just focus entirely on what we think because the data says so. We’re very collaborative in that sense.
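The version-one approach Joe describes, where experts nominate the important factors and the weights are revisited as data arrives, could be sketched roughly as follows. This is a minimal illustration, not Apollo's model: the keywords, weights, threshold, and function names are all hypothetical stand-ins for what claims adjusters might supply.

```python
# Hypothetical version-one weights, standing in for claims adjusters' judgment.
EXPERT_WEIGHTS = {
    "fatality": 5.0,
    "pedestrian": 3.0,
    "hospital": 2.5,
    "multiple vehicles": 1.5,
    "attorney": 2.0,
}

# Hypothetical score at or above which a claim is routed to an expert.
REFERRAL_THRESHOLD = 4.0

def large_loss_score(description, weights=EXPERT_WEIGHTS):
    """Sum the weights of every expert keyword found in the claim description."""
    text = description.lower()
    return sum(weight for keyword, weight in weights.items() if keyword in text)

def needs_expert_review(description, threshold=REFERRAL_THRESHOLD):
    """Flag a claim for human review; the score never decides the claim itself."""
    return large_loss_score(description) >= threshold
```

For example, `needs_expert_review("Pedestrian struck, taken to hospital")` flags the claim, while a minor car park scrape scores zero. A version two would keep the same interface but replace the hand-set weights with ones fitted to observed outcomes and then reviewed again with the adjusters.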

CW: That’s interesting and it all really resonates. Back in my days as a data science lead, we did a project where we were trying to adjust some pricing models in the derivative space and the database holding those prices was just a mess. There were no validations. Everything was all over the place. And so we used NLP tools to go back to the original source documents and just rebuild the thing because that was a cleaner pipeline for the data. So all of that is a lead up to a question. The question is, as you’re seeing AI and automation becoming a part of intake processes for claims and underwriting, are you seeing cleaner data or has that not caught on yet?

JC: That’s a good question. I think cleaner in the sense that it looks cleaner at face value, it looks more structured, but maybe the fallibility of AI models is the quality and the trust issues that people might have around them. So I feel like the industry and the topic of AI is moving so quickly. We are an innovative insurer, but we are still an insurer. So I would say we haven’t adapted to the use of AI maybe as quickly as some of your massive tech companies. But yeah, I do think as a general rule that as long as you’re applying it in the right way, then yeah, absolutely. And to be honest, it’s the same with any technique that isn’t AI. It is the application of it.

CW: So it’s not just producing dirty data faster is what you’re saying?

JC: No, I mean maybe we’ve tested some things that have done that. We’ve had some very interesting examples where we’ve tried to adapt early versions of large language models and build certain things ourselves internally, and some of the output of that has been absolutely hilarious. They’ve come up with some absolutely cracked things. But obviously we’ve tested things and found that certain things don’t work, and we’re not going to use that. So in terms of actually figuring out the right way to use AI and new tools, I wouldn’t be doing my job well if it didn’t give better data. We wouldn’t be using it.

CW: Yeah, you might not be doing your job at all.

JC: That’s true. I read a great quote the other day attributed to Corey Schafer, that AI isn’t going to replace developers, but developers that use AI will replace developers that don’t. And I couldn’t agree with that more. I don’t think it’s going to replace people. I think I’ll still be doing my job, but I think I’ll be doing it better because I’m using AI.

CW: Yeah, I love that. So let’s start to talk about tools. What is the tool stack that your team uses? What are they into every day?

JC: We are big on our Python development in my team, and we’re big on Microsoft tools as well. So that’s our tech stack of choice. As with probably every company, we’re doing a lot of transformation projects and figuring out new tools for pricing and data. But for the most part we’re utilizing, I’d say, the Microsoft stack for AI and cognitive services and our data infrastructure, and Python as a general day-to-day tool to tie everything together, for scripting and coding and that kind of thing. Plus Excel. Can’t get away from it, can we?

CW: No, not in insurance or banking or anything else, really.

JC: Anything. It’ll be there forever.

CW: Yeah, no, that stack sounds like the right stack to me, and I can’t believe 10 years ago, I would never have said this, but I’m a huge fan of what Microsoft is doing in this space. I think they’re doing awesome stuff. I want to start to get into large language model discussion. I think I want to get your reaction to this. I think the industry, maybe all industries are thinking about the distinction the wrong way, large language model versus small or standard language model. I think that the distinction should really be based on task like discriminative or generative applications. And because people are distinguishing based on size, I think people are trying to use the wrong tools for the wrong things. What do you think about that?

JC: Yeah, I think that makes sense. I completely agree. I think there are the right tools for the right jobs, and large language models, I agree, came with a lot of uproar and a lot of hype, and we’ve got to that point where everyone’s so super hyped about it. Maybe that’s dropping off a little bit. There are uses for large language models. For instance, I think, anything where it’s going to make my life easier and it’s going to be more efficient for me to use a general generative AI large language model, where it’s not going to be a valuable use of my time to train a specific model for a specific purpose. So I’m thinking of things like how my day-to-day life has changed on a work basis using large language models. It is like having extra people in your team. It writes code so much quicker. I can then peer review that code, and that’s essentially what I would use it for. But if we’re looking at specific tasks that are going to be repeatable, and you really want a solid kind of audit trail almost, and you need to know exactly what language model you’ve used and you’re going to continue to use it, and I’m thinking something like maybe we’ve built a specific model ourselves for looking at sentiment that we’re applying to a specific use case, then I think actually maybe using a large language model isn’t necessarily the best thing to do in that case.

I think using large language models, given what is available now, is just so easy. That’s the thing, really. If you can adapt your use case to be able to use it, the temptation is there because it’s just so easy to do it.

CW: Yeah, let’s talk about that temptation. That is a hot topic in my world right now.

So we’ve had a lot of conversations with folks where, well, actually, let me rewind to the past. So when the standard language models came out, Indico had a lot of conversations where the data science team, your counterparts, my former counterparts, would jump in and say, oh, we can just get the weights for that model. We’ll build the automation. And the data science team was forgetting that there’s a whole lot of user experience that goes into gathering training data. And the data science team didn’t really want to, they don’t want to write JavaScript. I don’t want to write JavaScript. So building a user interface where the business owner and the folks on the automation side can work together is really important. So I’m having flashbacks to that nowadays, where we’ll be talking to a prospect and they’ll be like, oh, we don’t need your platform that has standard language models and large language models. We’ll just GPT the thing ourselves, right? And I think, one, I don’t think people fully understand, again, like I said, the right uses of these models. And two, I don’t think they understand both the limitations of the large language model and the capabilities of the standard language model. And I’m curious, are you seeing the same thing internally at Apollo, where it’s like, oh, we don’t have to talk to Joe, we will just get a GPT API key and we’re good?

JC: I think we’re good within Apollo, as in we’ve got a great IT central data team, and I think people listen to experts. And so a massive amount of work at the moment is going into understanding what the limitations are, what the issues are, and what the potential threats are concerning large language models. And from a data security perspective, that’s a massive thing. You can’t just give free rein to people to shove whatever they want into a large language model. There are rules against this: one, we’ve got our own IP, but two, there are actual laws that say you can’t just share certain bits of data. And so there’s been an educational piece around that. We’ve got a usage policy in general internally, but then what we’re really building out is that kind of educational piece and governance piece around what is the right way to use large language models and for what purpose.

For instance, like I mentioned before, you can’t assume validity of the output of a large language model. You just can’t. But that’s probably also true of a junior member of staff. I wouldn’t give an analyst a task on day one, have them give me the output, and just present it to the board. I’m not going to do that. And it’s similar, I think, with a large language model, in that you need to have that human pair of eyes over it. Using it for any task that you might hand off to an analyst and are then going to peer review, I think that’s a really good example of where it’s valuable and where it’s useful. But I think probably as an organization as well, there are the masses that are just using ChatGPT. I think everybody in the company probably now is using ChatGPT in some way, shape, or form.

And most people in most workplaces are doing exactly the same thing. So it’s really trying to educate them on what happens if you put this data in here and just assume that the output is going to be correct. What happens if you ask it about a regulation in insurance and it tells us nonsense? And probably try and show some examples of where this has actually happened, and that you can’t take anything as given. I think it’s a lot easier to use large language models internally. So when I’m writing code, I’ll take a snippet from it, but I’ll adapt it to my own personal use cases. That’s a great way to use it. But when it’s something that you really need to have a huge amount of trust in, and a person’s been removed from that kind of pipeline, that’s where I think the trickiness comes in. With large language models that are being used as chatbots and getting them to auto-respond to brokers, it’s a matter of time before something really bad comes out of it and it says something that it shouldn’t have, or writes a slip with a really incorrect term, if you’re not glancing over it.

MG: Yeah. Joe, I think you’ve hit on something really important that Chris and I have talked a lot about just between us, and have addressed with other folks that have been on the podcast as well: what are the regulations and the compliance protocols that come into place when you start using AI internally for different use cases, for different workflows? And just in my experience, in my role, a lot of the, I’ll call it concerns, but the things that we hear in conversation about carriers looking to partner with vendors that bring AI solutions, is that there is that concern about data privacy, the data exchange, and then the decision making that happens as part of those workflows being combined. What are you thinking about when you think about the regulatory landscape and how that’s progressing as it relates to the speed at which AI, and ChatGPT specifically, has democratized the use of data so much across insurance and outside? That’s one big question that I’m sure

JC: That is a very big question. I’m not sure regulation is really within my wheelhouse, but I think what I will say is people are going to use it and you can’t stop ’em. People are using it and they can’t be stopped. Even at companies that are banning ChatGPT, people are still using ChatGPT. They’ve got their personal accounts or whatever. And actually, in a sense, you’re just increasing risk. People go and log on using their personal account and start using company data. And we have internally a compliance team and the exec that are taking this very seriously. We very early on created a usage policy in terms of, we’re not using any IP or public or company data. You can’t do that. These are some issues. We’re not going to ban people from using ChatGPT, because that’s just probably going to make things worse. But we are looking at, like you say, actual private connections and internal solutions in that space. And I think data privacy is a really big one, in that we have to make sure that any data that somebody puts into a large language model isn’t going to be publicly available. We absolutely have to do that. And the only way that we are going to get that confidence is by setting up a private connection with OpenAI, Microsoft, AWS, or any of the other big tech companies.

But yeah, I think the regulation space is probably a question more at a global level or at a national level. I think the topic is huge. Honestly, I think probably the biggest growing industry is going to be AI regulation and compliance. It is going to take off, it’s going to be massive, because it is such a complex thing. I don’t think anybody really has the answer. There are so many lawsuits and stuff going on about just how the models have been trained, and they have started recreating things that are honestly similar to copyrighted pieces of work and stuff like that. It’s a super, super complex topic. It’s fascinating. But yeah, from my perspective, all I think we can do is try to be as secure over our own data, be as clear with our own usage policies as we possibly can, and educate in terms of the usefulness and actually the pitfalls of LLMs.

CW: What do you see as this regulation evolves? What do you see as the potential worst case output?

JC: The worst case output,

CW: I should have said outcome, worst case outcome for regulation

JC: As in regulation that comes in. What would be the worst case? From my perspective,

CW: Or just for a practitioner,

JC: Not kind of just a dystopian nightmare. What’s the worst case that could possibly happen with?

CW: No, I want people to keep listening for this.

JC: I don’t know the answer to that to be honest. Maybe I’m just too inherently an optimist. I’m not thinking about the worst case outcome. Okay. Yeah, I don’t know. I really don’t know what it is. Just such a big topic. What are your thoughts on the worst case outcome?

CW: Oh man. Great question, because I’ve spent way too much time thinking about this. I worry. So you’ve got OpenAI, and I would say somewhere down here you’ve got Google, and then AWS. This is my opinion, not the words of Indico Data, just one man’s opinion. You’ve got AWS somewhere lagging behind, although I do love the stuff that they’re doing with Anthropic. Anthropic’s a great company, but it’s really a three horse race effectively. And the last historical example I can think of that was like this is nuclear power. And nuclear power is now regulated like a utility, which is a miserable way to regulate a really powerful technology, except when it’s a potentially very dangerous technology. And so I worry that you have these sort of three big names get ensconced in that regulatory framework and it just destroys innovation for the rest of the industry. And so I really keep my eyes on what people are doing with the small language models. You can fit one on a single card and you could run it yourself and you could do your own experiments, but they’re way behind in capabilities right now. So anyway, that’s my answer. It becomes a utility. I think that would be a terrible outcome.

JC: Yeah. Yeah. I tend to agree with that kind of private versus public, and depending on how far it goes as well. We’ve gone from, in the space of a year, GPT-3, 3.5 to 4, and the capability levels are increasing, increasing, and you’ve got essentially one company invested in that. But then a couple of other companies, like you say, that are really heavily invested in this technology: at what point are they just so powerful? They’ve got the tool that every single person then becomes entirely reliant on in their workplace? Yeah. I don’t think that’ll be a nice situation to be in, and the whole open source and contributors from the community. But then also, yeah, I guess it comes to the point of this question, which is regulation, because actually nuclear power’s regulated like a utility, but you can’t just put it in everyone’s hands. Interesting. And there’s a lot of potential bad use cases for AI, and we are seeing them already. How can you tell something that’s AI and something that isn’t?

CW: Yeah,

JC: It’s really difficult.

CW: No, it is.

JC: The image creation ones have got insane.

CW: This took a really cheery turn.

JC: Sorry, Michelle.

CW: No, anytime we start talking about regulation, it goes this way. I wonder, Michelle, does my dystopian view resonate with you in terms of what it would do to innovation and the venture market with AI?

MG: It does. I think you’ve definitely spent a little bit more time thinking about it from that perspective than I have. I agree with you that there’s really those three large players. The conversation and the debate, or discussion rather, that we continue to have is what ultimately wins out. And I have an opinion on it that I won’t necessarily share, but it aligns with what you framed. There’s the large incumbents that are providing these solutions as part of a larger tech stack that you can just have at your fingertips because you’re already using those platforms. Or there’s the newer, I’ll call ’em startup solutions that are bringing AI, that are maybe more point solutions. Those are much more specific; you can think about them as a feature solution as opposed to their own standalone large business. And so what you’ve talked about gets back to who wins out there, and if it really is the three large incumbents winning out and everything else gets eaten up by that, then you do have some of that.

How is it going to be governed? And so eventually my discussion gets to the point that you made. But I think I’m still a little bit more in the mode of trying to explore all of the different options that we see day to day. But I still think, and we talked about this on a few episodes, what’s so different about gen AI and this hype cycle on it is that in the past, AI and machine learning, everything was all part of the data science actuary group; you didn’t touch it, you didn’t know about it, you didn’t understand it unless you were working in it day to day. And ChatGPT has just democratized the access to it. And so you have people that can use it, that are excited to use it, rightfully. But then, Joe, to your point, you have people that are using it but don’t understand the repercussions of using it. And that’s where the true danger lies, whether they’re doing it in their personal or in their work life. I think those are the considerations or the concerns that get brought up when there’s a new technology that kind of just sweeps the globe like this one did.

CW: Alright, let me try to make this more cheerful as we end. So two questions, looking to the future: one short-term horizon and one longer-term horizon. Okay. Short-term horizon, where do you think the near-term future of getting better with these models is? Would you say it’s fine-tuning them, or would you say it’s getting better at prompting and prompt chaining?

JC: Good question. I think fine-tuning them, if I had to pick one of those. Honestly, I think both are super important. I think prompt engineering is a massive skill and a big topic. The way I see it, developing the potential is developing the real use cases for them. And the tech’s out there now; as you say, Michelle, it’s been democratized. Everyone’s kind of got access to it. But then how do we really find what it’s best for? What are the best use cases for it, and how do we fine-tune it for specific use cases? And you’re going to have, I think, a million and one different kinds of technical applications that are powered by GPT and that have all been slightly fine-tuned to specific use cases. So yeah.

CW: Nice. And then say five years from now, I know that’s an infinite amount of time in the space we all work in, but how do you think the industry you’re working in will have changed in five years based on all these tools and technologies that we’ve discussed today?

JC: We’ll all be sitting on a beach, Chris, and we’ll have automated the whole thing.

CW: Best answer,

JC: No, it’s so hard to say. Look at where we were a year ago. A year, five years, I don’t know. I said before, we are an innovative insurer, but we’re still an insurer. Insurance isn’t renowned for moving that quickly overnight, but yeah, five years. I like to think that the game will completely change. All underwriting will be augmented underwriting, as our CUO likes to call it. Machines will be involved in most processes in some way, shape or form, but people won’t be removed. Realistically, I was only joking about being on the beach. Sorry Chris. No, it’s okay. They won’t. But the tasks

CW: Specifically,

JC: I think we’ll still be thinking of new and innovative use cases for automation and making people’s jobs easier and really focusing on where humans can add value.

CW: Yeah, see, there you go. We ended on a hopeful note. Well done. There you go. Also augmented underwriting. I am probably stealing that. That’s great. That’s a great buzzword. I love it.

JC: Get our chief underwriting officer on your

CW: Podcast. The other one. Alright, there you go. Okay, I’m up for that. Well, our guest today has been Joe Curry, who is the head of data science at Apollo 1971 and a wealth of information. Thanks Joe. This has been great.

JC: Absolute pleasure. Thank you very much.

CW: Thank you for joining us for this episode of Unstructured Unlocked. You can find all of our episodes wherever you listen to podcasts today. Be sure to write a review if you like what you hear.
