Watch Christopher M. Wells, Ph.D., Indico VP of Research and Development; Madison May, Indico co-founder; and Tom Wilde, Indico CEO, in episode 12 of Unstructured Unlocked, recorded live at Insurtech Insights London.
Christopher Wells: Hi, and welcome to another episode of Unstructured Unlocked. I’m your host, Chris Wells, Vice President of Research and Development at Indico Data, and today I am joined by two very special guests: Tom Wilde, CEO of Indico Data, and Madison May, ML architect and co-founder. Guys, how are you today?
Tom Wilde: Hey, Chris. Good to talk to you as always. Yeah. Hey, Madison.
Madison May: Doing great.
CW: Now, Tom, I gotta ask, most people don’t take these interviews from a discotheque. So, what’s going on in the background?
TW: I am reporting in here from Insurtech Insights at the O2 in London. A big event. About 5,000 people here, all focused on the future of insurance. And obviously, you know, London being a critical geography for the insurance market worldwide, with the London Market and Lloyd’s. So a very big event here.
CW: That’s exciting. Insurance seems to be like the flavor of the day. Is there a lot of energy there?
TW: A lot of energy. You know, I think that insurance perhaps gets a bad rap for how much they wanna transform their businesses. They’re obviously businesses that have been around for a very long time, true, but the energy around creating the next generation of, you know, the business itself, profits within the business, the organization of these businesses to respond to what is definitely a changing landscape out there in terms of risk and returns. Yeah, the energy here is terrific.
CW: Excellent. Today’s episode is about all things GPT-3 and ChatGPT, and I’m sure we’re gonna circle back to how those massive, groundbreaking technologies are influencing these industries. But let’s start off with quick intros. So Madison, you’re a co-founder, you get to go first. Tell us about your journey in this space.
MM: Of course. So I started at Indico nearly eight and a half years ago now. Dropped out of college along with the other three co-founders in order to kind of see if we could make this dream a reality. Indico has evolved quite a lot over time from the API company that we were in the very beginning. But some things have always held constant, and our interest in generative AI and natural language processing is one of those things. It’s been wild to see this technology and the industry mature, and to kind of have to adapt as academia and industry change what it means to work in natural language processing over the years.
CW: Yeah. Eight and a half years has been an eternity in this space, right? Like Absolutely. You’re one of the OGs. <Laugh>
MM: Natural language processing was very different eight and a half years ago. We got started right when computer vision was having its AlexNet moment, back when computer vision was making great strides because of deep learning and natural language processing was a little bit late to the party. So yeah, when we were getting started with Indico, it hadn’t yet experienced that same explosion that AlexNet had provided for computer vision. So we got to ride that first wave, and it feels like we’re preparing to ride another wave here as GPT-3 is the model of the hour.
CW: Yeah. Let’s, I’m gonna put a big pin in that one. We’re gonna come back to that. Tom, tell us about your journey with this company and just technology in general. How did you get here?
TW: Yeah, I’ve been in the enterprise content technology space for the majority of my career. You know, what started out as the transformation that search itself brought to these markets, really sparked by Google, and then extending into enterprise use cases and the possibilities around understanding and accessing unstructured information in ways not previously possible. This is sort of a carry-forward of a personal interest of mine since that time. So, my involvement with Indico started with meeting the founders at Indico, seeing the very disruptive innovation they developed in the way that we’re able to understand and comprehend unstructured data in ways that were previously impossible, sort of another step-function disruption and innovation. And what really struck me as fascinating was there was an opportunity to figure out what does the product look like?
Who is the customer? What is the problem we’re gonna solve? All the things you have to do when you start from a position of a technology innovation. So, how to take that and turn it into a business. And so I joined and we started down that path, and it’s been, you know, just wildly fascinating and also a lot of fun. And especially now, you see a lot of the beginnings of Indico having sort of come to life here in the market that we’re all experiencing with ChatGPT, et cetera.
CW: Yeah, that’s interesting. Not a loaded question, but do you ever think maybe Indico got there a little too early?
TW: Yeah, yeah, absolutely. I think that timing, when you’re building a young company, is a big part of the equation. And I think we were very early, then we were early, and, you know, now the market has really arrived, I’d say in the last 18 months. And that’s part of the startup journey: you need to get the product right, the market right, the customer right, the messaging right. And that comes together when it comes together sometimes. And, you know, when we think back to the way we articulated this in the early days, it was very much an enterprise AI pitch. And at the time, frankly, that was sort of the level of detail the enterprise was looking for. If you recall, there were VPs of AI at many insurance companies and banks. In retrospect, it seems a little bit funny, because today that would be like saying, I’m the VP of Java, right? Like, you wouldn’t do that. You know, AI is a technology, it’s not a solution. And so the market’s journey is very similar to ours. And we find ourselves now having this incredible technology, incredible product and team, and a market that is very, very interested in now taking advantage of these things to solve critical business problems.
CW: Yeah, the world seems ready finally. It’s exciting. Also, I hope all those VPs of AI are doing okay. I hope they landed all right. <Laugh> Anyway.
TW: I think so.
CW: Speaking of the market being ready, I’m gonna kick it over to Madison. Why don’t you help us understand, help the audience understand: what is GPT-3, 3.5, whatever, what’s the difference between that and ChatGPT, and why does everyone care so much? So three questions there.
MM: This might take a little while to unpack. We might need most of the episode for this, Chris, but yeah. So in many ways, GPT-3 is not that different from technology available probably five or six years ago. The earliest GPT wasn’t even called GPT at the time. It had some not-so-memorable name. It was just kind of a long paper title. And the name was actually given to the original model by some subsequent authors who had to cite that work. It stands for generative pre-trained transformer, and at its heart is something called a language model. I think it’s useful to understand just the basics of a language model in order to reason about how GPT-3 behaves: what tasks it’s good at, what tasks it’s not, what errors it’s prone to. So language modeling is just a fancy term for predicting the next word in a sequence. If you have part of a sentence and you ask the model to fill in the next word with whatever it believes is most likely, you call that task language modeling.
And it turns out that training a model to perform that task well enables it to build a useful foundation on which you can build all sorts of interesting applications, because learning to predict the next word in a sequence well requires quite a deep understanding of how the English language works, and requires solving many difficult sub-tasks along the way. So sometimes predicting the next word in a sentence is just a matter of understanding that, oh, hey, we probably need an adjective here or a noun here. It’s kind of simplistic to get the early wins at that task, but later on, it’s a matter of understanding that, oh, this pronoun refers to this proper noun over here. Maybe that’s what’s being referred to here. You know, maybe you need to evaluate some mathematical expression. You need to know some world knowledge, like what the capital of a particular country is, to predict the next word in a sentence. The long tail of tasks that a model needs to solve in order to do language modeling well is quite interesting.
CW: And even the simple stuff, right? Like, “I’m going to the grocery store” is more likely than “guillotine,” but given the broader context, either answer could be right. So learning that weighting of which word, even that is non-trivial, right?
MM: Absolutely. And language is very contextual. Very often you need more than just a couple of words of context to solve that well.
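To make the language modeling task described above concrete, here is a minimal sketch of a next-word predictor built from simple word-pair counts. It is not how GPT-3 works internally (GPT-3 is a large neural network that conditions on far more context); the toy corpus and the function names `train_bigram_model` and `next_word_probabilities` are invented purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the corpus."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following


def next_word_probabilities(model, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    counts = model.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the word never appeared with a successor
    return {candidate: count / total for candidate, count in counts.items()}


# Toy corpus, invented for illustration only.
corpus = [
    "i am going to the grocery store",
    "she is going to the grocery store",
    "we are going to the grocery store",
    "the king sent him to the guillotine",
]

model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "the")
most_likely = max(probs, key=probs.get)
print(probs, most_likely)
```

With only one word of context, "grocery" wins after "the" simply because it is the most frequent continuation, even in a sentence about a king. That is the weighting problem raised above: choosing well requires much more context than a single preceding word.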
CW: Gotcha. So why is GPT-3 so good? What’s the difference?
MM: Well, not much in terms of the actual model architecture. The general blueprint for the model is very similar to what it was five years ago with the initial GPT paper, but it’s been scaled up by a factor of a thousand, and it turns out that factor of a thousand mattered quite a lot in terms of the model’s ability to solve practical problems.
CW: So this thing isn’t gonna run on my MacBook. That’s what you’re telling me.
MM: Not even close.
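A rough back-of-the-envelope sketch of why a thousand-fold scale-up rules out a laptop. The parameter counts are the commonly cited public figures (about 117 million for the original GPT and 175 billion for the largest GPT-3 model); the 16 GB laptop memory is an assumption chosen for illustration, and the calculation covers only the storage of the weights, not activations or other runtime overhead.

```python
GPT1_PARAMS = 117_000_000        # commonly cited size of the original GPT
GPT3_PARAMS = 175_000_000_000    # published size of the largest GPT-3 model
LAPTOP_MEMORY_GB = 16            # assumption: a typical laptop, for illustration


def weights_size_gb(num_params, bytes_per_param):
    """Memory needed just to hold the weights, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


scale_factor = GPT3_PARAMS / GPT1_PARAMS
gpt3_fp32_gb = weights_size_gb(GPT3_PARAMS, 4)  # 32-bit floats
gpt3_fp16_gb = weights_size_gb(GPT3_PARAMS, 2)  # 16-bit floats
fits_on_laptop = gpt3_fp16_gb <= LAPTOP_MEMORY_GB

print(f"scale-up: about {scale_factor:,.0f}x")
print(f"weights at 16-bit precision: {gpt3_fp16_gb:,.0f} GB")
print(f"fits in {LAPTOP_MEMORY_GB} GB of laptop memory: {fits_on_laptop}")
```

Even at half precision, the weights alone come to hundreds of gigabytes under these figures, which is why a model of this size has to be split across multiple accelerators.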
CW: <Laugh> Not even close, yeah. Okay. All right. So that’s GPT-3. It’s a language model. It looks and quacks like previous language models. It’s got a lot more data and a lot bigger footprint, so to speak. So what’s ChatGPT, and why am I worried about my daughter using it to write term papers?
MM: ChatGPT is actually quite similar to the tech it’s built on, which is GPT-3. You can kind of think of ChatGPT as just a sparkling version of GPT-3. It’s almost the same, but it has a little bit of extra training in order to make it more conversational, and to make it a little bit less likely to hallucinate content, to produce profane content, or content that you might not want, like factual inaccuracies. So just kind of making it more appropriate for use in real applications.
CW: And I think one of the things that folks miss out there in the wild, whether it’s in the news media or LinkedIn, is that that extra layer of training to get that sparkling quality actually had human beings heavily involved, mm-hmm <affirmative>, right? Can you talk about that a little bit?
MM: Absolutely. Maybe the first thing to notice here is just that in this language modeling task that we talked about previously, this model was trained on a significant portion of the internet. And learning to solve that task well is about learning to model the data on the internet well, and the data on the internet has varying quality. Some of it is excellent, some of it is more like Yahoo Answers <laugh>, where learning to model Yahoo Answers well doesn’t make for a useful end product. It makes for hot garbage. And in order to get something useful at the other end, you need to guide the model towards the productive portion of the internet, or productive kinds of responses, intelligent kinds of responses. And that’s what this final bit of training is all about. Just steering the model in the right direction.
TW: I think what’s really interesting about ChatGPT that isn’t widely understood, to build on Madison’s point, is how important human in the loop as a concept is to ChatGPT as well: there are several hundred people behind the scenes curating a lot of these responses, not in real time but offline. Because again, it still needs guardrails and it needs guidance to be useful. And I think that’s a critical concept embedded in most AI, right? Which is, you still need to provide guardrails or it risks being not useful, or even worse than not useful. So I think that it’s not quite magic. You know, there are still a lot of controls that have to go into using these technologies to make them successful.
CW: Yeah, I think this has been just another hype cycle of like, ah, the robots are coming for our jobs. And really, I think everything I’ve seen, everything we’ve talked about internally, has been like, this is just another tool in the toolkit. If anything, it’s gonna make jobs better when the right guardrails are around it, as you’re making that point. So I
TW: Think, what ChatGPT is... sorry, Chris. I think what ChatGPT has really done is provide a very sort of democratic access and exposure to these remarkable technologies that just wasn’t there before. Right? The average person couldn’t really experience generative AI technologies prior. Yeah. And what ChatGPT has done is provide a consumer interface into that and let people play with it, right? And initially there was just, you know, of course, wild excitement and awe, and now you see some of the news stories where some of the drawbacks or weaknesses are. So it’s good. It’s a natural progression. But that to me is what ChatGPT has provided: this sort of profound consumer exposure to these remarkable technologies that didn’t exist until ChatGPT’s arrival.
CW: No, that’s right. Previously the most experience the average person had with AI was Siri or their Google Assistant, which, as we know, is not a great experience. Right? Very limited, Siri. Yeah. All right. I put a big pin in a topic, which I wanna circle back to now, ’cause I think it’s timely in this conversation. And this question is aimed at both of you. You know, just a year ago, maybe two years ago even, there was a lot of talk of, are we heading into the next AI winter? And now we find ourselves, Madison used the language of riding, you know, getting ready to ride this big wave of GPT-3, amazing artificial intelligence. So what is that wave going to look like? And what do you think it is about these technologies that’s captured the imagination of both folks in the enterprise who are really curious about this, and just the average person? Like, why is this wave cresting so high?
TW: Madison, why don’t you go first?
MM: Sounds good. And I guess I should lead off by saying I don’t wanna present an overly optimistic view of the technology, because like any technology, it has its flaws. Yep. But it does have some pretty major strengths. One of the things that I think is most revolutionary about ChatGPT is simply its ability to put machine learning technology in the hands of consumers. You don’t necessarily need a data science background in order to interact with it, in order to get it to do useful things for you. Just the ability to speak natural language, the ability to speak English, is all you need to get started. And it also has some key strengths over traditional data science that are appealing. Namely, it’s very, very easy to rapidly prototype and to iterate on an initial design, because with traditional machine learning, iteration requires essentially making model architecture changes, making changes to your dataset. There are a couple of factors that slow you down if you want to change your mind about how you framed some business problem as a machine learning solution. With GPT-3, that just requires changing the content, changing how you interact with it by asking a slightly different question, refining some of your criteria for what a good response looks like. It’s just a much quicker feedback loop. And short feedback loops drive massive productivity.
TW: Yeah, I think from my perspective, you know, what’s been so profound about Indico’s development, as well as the industry, is our focus on using context. And Madison used this word. Our focus on using context to determine meaning is really profound in that it freed us from the brittleness and constraints of trying to teach the machine with instructions, which is software, and moved us to teaching the machine by example, right? Which really depends on context. And that to me has been what’s most profound about all of these developments: teach by example has unlocked the ability for much less technical folks to allow the machine to do ever increasingly complex tasks. I think what GPT-3 has shown us, with this thousand-x increase in the size of the model, is just how complex language is, especially the context of language. And it also shows you how powerful human communication is, right? All of that context is embedded in how we speak to each other without even knowing it. That’s been historically very difficult to translate to a machine’s understanding of context. And this has really been a profound jump in that understanding, GPT-3 in particular, which unlocks even more robust use cases. So I think that’s part of the reason for the excitement, you know, is that it has suddenly become quite tangible.
CW: Yeah. It’s an interesting point you make, Tom. I think you’re talking about the progression, from Madison to you: to start off with, you have to know how to write TensorFlow code to get good answers out of these models. And then you have to know how to label a document the correct way. And now, with these models, it’s like the Oracle at Delphi. You have to ask the question in the right way to get a reasonable answer. So it’s interesting, the transfer of those skillsets. And, you know, looking at you, Madison, you’ve spent so much time in your life figuring out how to really architect models well and all of this stuff. Is there any mourning that’s going on as you think about the way that the onus has shifted from the backend engineer to the person just writing a prompt?
MM: Not at all. And part of that is because it comes with its fair share of weaknesses, which I’m sure we’ll discuss later. Moving from building datasets to writing prompts means that you don’t get the same quantitative metrics that you typically get when working on a machine learning project. And there are a couple of other related drawbacks too, which mean that you have to put together additional process to make sure that you can trust the outputs of GPT-3 as much as you can trust the outputs of a traditional, maybe transfer-learning-based, machine learning system.
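One form the "additional process" mentioned above could take is a small labeled test set that every prompt is scored against, so that prompt changes get a number attached to them. The sketch below is an assumption about how such a check might look, not a description of Indico's process. `answer_with_prompt` is a hypothetical stand-in for a call to a hosted language model; it uses a keyword rule so the example runs offline, and the documents and labels are invented.

```python
def answer_with_prompt(prompt_template, document):
    """Hypothetical stand-in for a call to a hosted language model.

    A real system would send the filled-in prompt to the model and return
    its text response. A keyword rule is used here so the sketch runs offline.
    """
    filled_prompt = prompt_template.format(document=document)
    del filled_prompt  # unused by the stub; a real model call would consume it
    return "claim" if "claim" in document.lower() else "policy"


def accuracy(prompt_template, labeled_examples):
    """Fraction of labeled examples the prompt-plus-model answers correctly."""
    correct = 0
    for document, expected_label in labeled_examples:
        predicted = answer_with_prompt(prompt_template, document)
        if predicted.strip().lower() == expected_label:
            correct += 1
    return correct / len(labeled_examples)


# Invented examples, for illustration only.
labeled_examples = [
    ("Please process my claim for water damage.", "claim"),
    ("Attached is the renewed policy schedule.", "policy"),
    ("Claim number 123 is still pending.", "claim"),
    ("I was in an accident and need to be reimbursed.", "claim"),
]

prompt = "Is the following document a claim or a policy?\n\n{document}\n\nAnswer:"
score = accuracy(prompt, labeled_examples)
print(f"accuracy: {score:.2f}")
```

The fourth example is a claim that never uses the word "claim", so the stub gets it wrong and the score drops. Without a held-out set like this, that kind of miss would go unnoticed when iterating on prompts by eye.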
CW: Yeah. Okay. All right. We’re gonna circle back to that hard. I wanna talk a little bit about the history of Indico. Madison, you mentioned that generative AI has been part of the DNA of this company for a long time. So I’m gonna kick that over to Tom. Why don’t you give us a little history lesson, and we’ll kick up our feet by the fire.
TW: Yeah. The origin story. Yeah. And this has become even more fun now than ever, now that the broader markets have the context to see why these initiatives and developments that we undertook six, seven years ago were so important and have become so impactful. You know, going back to the very beginning, the Indico co-founders, Madison, Slater Victoroff, Alec Radford, and Luke Metz, you know, these were generational talents, are generational talents, in the field of natural language processing. Alec, in fact, wrote what I think today is still one of the most cited papers in this category, on DCGANs, and then went on to do a lot of the research and writing around the original GPT, and then GPT-2, now even GPT-3.
So Indico has been using these technologies, which include generative and declarative types of technologies, to understand unstructured data since the very beginning. I think, you know, if we kind of fast forward, what makes us unique is we’ve built this incredibly powerful framework that allows the enterprise to take advantage of these technologies without having to do all of the tremendously hard work that goes with training, deploying, explaining, governing, right? Those are all things that are really part of the problem statement. It’s not enough just to build a model. It is how you’ll actually use and experience that. And if you look at our origin story, with each new development in the market, GPT, GPT-2, to RoBERTa, we are very, very good at understanding what those developments are really skilled at, what they’re good to point at, and refactoring them so that they can be used in the enterprise.
In fact, our product can be completely deployed inside the enterprise firewall, including, you know, the Indico large language model. So from a security standpoint, that’s a very important consideration. So now, with GPT-3, which has some very specific strengths and weaknesses, which we’ll talk about here in a bit, how do we allow the enterprise to take advantage of those strengths and not be put at risk by the weaknesses? And so that’s kind of the origins of Indico. This is, for us, very much business as usual, right? We always look for that next evolution of the technology, refactor it, make it part of our solution, and allow the enterprise to succeed.
MM: Yeah, go ahead. Just gonna kind of expand upon that. Our policy has always been that you should be skeptical of any startup that claims some singular technical advantage. Our philosophy has always been that in the machine learning space, we’re blessed that the community’s really embraced open source, has published work that would in other industries possibly be kept private for years and years. And really the key skillset needed to be productive in the machine learning industry is recognizing when the advances from academia and the giants like Google, Facebook, Microsoft, are worth actually productizing, and when it’s kind of incremental improvement over the norm.
CW: So I remember
TW: In some ways, I think of us at times as almost more akin to commercial open source, in that, you know, we provide the framework, the training, the deployment, the management of it. It’s not exactly a perfect analogy, but there’s definitely some parallels there for sure. Sorry, I didn’t on it. No, no,
CW: No, that’s fine. I was just gonna pull that thread a little bit further and ask, you know, we’ve all worked with open source projects, we all know they have pitfalls. And to your point, we’re very thankful that they exist, especially in the ML community. But what do you see as being the biggest gaps? And I’m gonna focus this question in a second on GPT-3. What do you see as being the biggest gaps, typically, between someone releasing the weights and biases of their model, with maybe some training scripts, and getting that into the hands of, say, a knowledge worker at an insurance company to help them with broker intake? What does that productization process look like? What is it usually fixing in these contexts?
TW: Okay, Madison, you wanna go first?
MM: Of course. So part of what we do is we try and make these models as efficient as possible, as cheap as possible to host. Often there’s not as much emphasis on that in academia, so that work often falls to industry. So this means looking at things like speedups from reduced floating point precision, or just speedups from the way we apply the technology to our problem. And there’s a lot of art required there. I wouldn’t say that it tends to be one or two singular big things. It tends to be hundreds of tiny details that need to be done correctly in order to get optimal performance.
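As a small illustration of the reduced floating point precision idea mentioned above, the sketch below stores the same handful of weights as 32-bit and as 16-bit floats using Python's standard `struct` module, then measures how much rounding error the smaller format introduces. The weight values are made up, and the sketch only shows the storage trade-off; whether lower precision actually runs faster depends on the hardware and the inference library, which this example does not model.

```python
import struct


def pack_weights(weights, fmt_char):
    """Serialize weights as 'f' (32-bit) or 'e' (16-bit) little-endian floats."""
    return struct.pack(f"<{len(weights)}{fmt_char}", *weights)


def unpack_weights(blob, fmt_char):
    """Read the weights back from their packed byte representation."""
    count = len(blob) // struct.calcsize(f"<{fmt_char}")
    return list(struct.unpack(f"<{count}{fmt_char}", blob))


# Made-up weights, for illustration only.
weights = [0.1, -0.25, 0.3333, 1.5, -2.75]

blob32 = pack_weights(weights, "f")
blob16 = pack_weights(weights, "e")
restored16 = unpack_weights(blob16, "e")

# Worst-case rounding error introduced by the 16-bit format.
max_error = max(abs(a - b) for a, b in zip(weights, restored16))

print(f"32-bit size: {len(blob32)} bytes, 16-bit size: {len(blob16)} bytes")
print(f"largest rounding error at 16 bits: {max_error:.6f}")
```

Halving the bytes per weight halves the memory footprint at the price of a small rounding error per value. Deciding where that error is tolerable is one of the many small details that productizing a model tends to involve.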
CW: Yeah. And those, those tiny details, sorry, Tom. I was just gonna say, having been an academic, those tiny details are not the kinds of things that end up in publications, right? So you really have to be an expert to get under the hood and find them and fix them. Sorry, go ahead, Tom.
MM: Yeah, go ahead.
TW: I think from the very beginning, we had a philosophy. I like to say that we’ve built our product for the business, but our technology for the enterprise, meaning, you know, we wanted to put in enough guardrails and help a business user succeed with these technologies, avoid some of the common traps and pitfalls, but not reduce the power of it so much that it was no longer useful. That’s a real art, you know, and Madison used the word art. That’s the art in this: that very careful navigating of those two poles to deliver an enterprise-worthy application, but one that also doesn’t require extraordinarily technical or skilled people to deploy and operate it. Not because those people aren’t important to the use case, but because there simply aren’t enough of them within the enterprise to do this at scale, cost effectively, et cetera.
CW: Yeah. And that is an excellent segue into the next section. I swear we didn’t rehearse this. I want to talk about hype versus reality. So what’s ready to go in the GPT-3 space for the enterprise, and what isn’t quite ready for primetime? What are some of those pitfalls, and what’s really working already?
MM: Maybe before we talk about what’s ready to go, we should talk about some of the weaknesses of the model. Let’s do it. So, for one, there’s no guarantee that models like GPT-3 aren’t going to make content up.
CW: Now, why is that, Madison?
MM: It’s kind of an artifact of the entire field of generative AI. With great flexibility comes great responsibility, and generative AI has this capacity to write any text. With a lot of the traditional machine learning technology, we’re solving a much more constrained problem. So we’re doing something like applying a label to the words in a document, or classifying a document into one of 10 categories. There’s no real choice for our model other than to put that document into one of 10 buckets. In the case of GPT-3, the model can produce any English word. And because it can produce any English word, sometimes the words it produces are grounded in the document that you’re interested in, and sometimes the words it produces are just simply likely words, for whatever reason. Maybe to make this more concrete, I think one of the examples shown in one of the early releases of Bing Chat was a question asking Bing Chat to provide restaurant recommendations in a particular area. And if you read through the brief recommendations that it provided, many of those restaurants have this string: “no reviews for this location yet.” And if you actually go and look up those businesses on Google or Bing, you’ll find that there are thousands of reviews. Yeah. But because there were just loads of businesses that did have no reviews, the model emitted that string simply because it was likely, ’cause there were a lot of cases where there were no reviews present for a business.
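The contrast drawn above, between a classifier that must choose one of a fixed set of buckets and a generator that can emit whatever text is likely, can be caricatured in a few lines. This is a deliberately simplified sketch: the label names, scores, and "training" lines are invented, and a real generative model conditions on its input rather than returning the single most common string. It only illustrates why a frequent phrase can surface even when it is false for the business being asked about.

```python
from collections import Counter

# A fixed label set: the classifier can never answer outside of it.
LABELS = ["claim", "policy", "invoice"]


def classify(scores):
    """Constrained output: always returns one of LABELS, whatever the scores."""
    best_index = max(range(len(LABELS)), key=lambda i: scores[i])
    return LABELS[best_index]


def most_likely_review_line(training_lines):
    """Caricature of unconstrained generation: emit the most frequent line
    seen in training, regardless of the business actually being described."""
    return Counter(training_lines).most_common(1)[0][0]


# Invented training snippets, for illustration only.
training_lines = [
    "No reviews for this location yet",
    "No reviews for this location yet",
    "No reviews for this location yet",
    "4.5 stars from 2,310 reviews",
    "3.9 stars from 87 reviews",
]

label = classify([0.2, 0.7, 0.1])
generated = most_likely_review_line(training_lines)
print(label)      # always a member of LABELS
print(generated)  # likely text, not necessarily true text
```

The classifier can be wrong, but only in a bounded way: its answer is always one of the known labels. The generator's output is fluent and statistically plausible, yet nothing ties it to the facts of the specific restaurant in question.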
CW: Would, would you say this is the language equivalent of like, you know, an image generation platform creating like crazy seven, eight fingered hands? Is that, is that kind of what’s going on here?
MM: It’s certainly similar.
CW: Yeah. If you squint hard enough, it’s a hand. If you squint hard enough, it’s a reasonable restaurant review. Right?
TW: Yeah. And I think part of this is, you know, this is a technology in some ways behaving like a teenager, right? It’s much too confident in its understanding of the world. Teenagers often fall into this category primarily because they don’t yet have enough experience, but that doesn’t prevent them from believing they have the experience to perform a certain task or have an opinion about something. That’s very much how ChatGPT behaves. If you ask it a question, it will give you an answer, whether or not it actually has the standing to provide such answers. And that’s where these hallucinations sometimes come into play. You don’t have the ability to really understand how confident it is or what basis it has for giving you that answer, but its opinion is, well, you asked, so I’m gonna tell you.
CW: Yeah, yeah. I mean, it’s trained to answer, right? That’s its whole shtick, it’s trained to answer. All right, so hallucinating, confident lying, big pitfall. What’s next on the list? Let’s keep marching through this.
MM: Cost is certainly a big one. GPT-3 is probably 10x more expensive, or more, than the task-specific models that we deploy at Indico. 10x is probably a dramatic underestimate.
CW: Wow. All right. So, as I joked earlier, it won’t fit on my laptop. In fact, it won’t even fit on a single compute node, right? It’s not fitting in a single processor. So not only is it expensive, but even if you wanted to run it on your own, the DevOps, you know, it’s a really specialized skillset to get this thing to run. Do you see that cost coming down?
MM: Yes. I think there’s still some low-hanging fruit to be reaped in this space. So maybe we can expect a 3x change in cost, but we’re not gonna see a dramatic reduction in cost in the near future.
CW: Tom, you talk to the folks that would have opinions on this more than I do. Do you think this cost is gonna pass muster in the enterprise? How’s this gonna work?
TW: I don’t think so. And I think there’s a little bit of a Moore’s law in reverse here with machine learning that I’m noticing, which is, you know, Moore’s law would’ve told you that, hey, don’t worry, in 18 months GPT-3 will be fine cost-wise. I don’t think that’s true. I think that it’s sort of like the engine manufacturers figured out how to build a thousand-horsepower engine, but now they build a 10,000-horsepower engine, right? So the cost won’t, like, come down; the performance will go way up. But I think the cost won’t. That’s not really how machine learning seems to be operating, unlike, say, computer performance historically. I don’t think that GPT-3 is viable for large-scale enterprise use cases right now. I think it’s going to have to be deployed for very specific pieces of the challenge, but not the whole challenge itself.
You know, some of our customers process billions of items a year within the Indico platform. The math for that, in terms of the cost it would take for GPT-3 to do that, does not justify the performance increase, right? So part of what vendors like Indico will have to be good at is carefully selecting when to use technologies like GPT-3, and also guardrailing them, because there are other issues. You know, for example, you have to use sort of a web-type integration with it, which means there are security issues. It’s SaaS only. So if you want it behind the firewall, you can’t have it. So all those things will make enterprise adoption a challenge. And, you know, that’s why the years of experience we have with adopting these kinds of technologies, I think, puts us in a really good position. Yeah.
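The cost argument above can be sketched with simple arithmetic. Every number below is a placeholder chosen for illustration: the per-item price of a task-specific model is hypothetical, the 10x multiplier is the rough lower bound quoted in the conversation, one billion items stands in for "billions of items a year", and the 5% routing share is an invented example of sending only specific pieces of the workload to the large model.

```python
def annual_cost_dollars(items_per_year, dollars_per_million_items):
    """Yearly processing cost, using integer dollars to avoid float rounding."""
    return items_per_year * dollars_per_million_items // 1_000_000


ITEMS_PER_YEAR = 1_000_000_000       # stand-in for "billions of items a year"
TASK_MODEL_PRICE = 100               # hypothetical: dollars per million items
LARGE_MODEL_MULTIPLIER = 10          # the "10x or more" estimate from the discussion
LARGE_MODEL_SHARE_PERCENT = 5        # hypothetical share routed to the large model

large_model_price = TASK_MODEL_PRICE * LARGE_MODEL_MULTIPLIER

all_task_model = annual_cost_dollars(ITEMS_PER_YEAR, TASK_MODEL_PRICE)
all_large_model = annual_cost_dollars(ITEMS_PER_YEAR, large_model_price)

# Hybrid: only a small, carefully selected share goes to the large model.
large_items = ITEMS_PER_YEAR * LARGE_MODEL_SHARE_PERCENT // 100
task_items = ITEMS_PER_YEAR - large_items
hybrid = (annual_cost_dollars(task_items, TASK_MODEL_PRICE)
          + annual_cost_dollars(large_items, large_model_price))

print(f"task-specific only: ${all_task_model:,}")
print(f"large model only:   ${all_large_model:,}")
print(f"hybrid ({LARGE_MODEL_SHARE_PERCENT}% routed): ${hybrid:,}")
```

Under these placeholder numbers, routing everything through the large model multiplies the bill tenfold, while routing a small share raises it by less than half. That is the shape of the "specific pieces of the challenge, but not the whole challenge" approach described above.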
MM: I wanted to touch on something that you said there, Tom. I think you kind of implied that GP three is always going to be more performant in terms of quality for a given problem. I also don’t think that’s true in general. GPD three tends to be a jack of all trades, master of none kind of model. It’s very good at getting you to good enough. It is not necessarily the best at getting you to production quality. And the reason for this is simply that it, there’s a trade off between breadth and depth and many companies have very specific problems. They have very specific processes, and you need to be very descriptive about exactly what you want in order to get the correct behavior. Mm-Hmm. You can only be so descriptive. Contrast
TW: It’s, you know, fantastic in straight-line acceleration, but don’t try to turn it, right? And yeah,
TW: Some of the analogies there would be, for example, it doesn’t have an understanding of the layout of the thing that it’s been given, right? Right. It only understands the text. And in enterprise use cases, the layout is a vital signal to help understand what that piece of data in question is about, if we’re talking about documents specifically. So to build on Madison’s point, this sort of master of none is a challenge when, in the enterprise, specificity and accuracy matter critically because of the kind of decisions that they are powering, right? In insurance, should I underwrite this risk? Should I approve this claim? In lending, should I lend this money? In healthcare, what is this diagnosis? These have a level of impact in terms of accuracy or mistakes that is far, far higher than, say, a consumer-based application or deployment of the technology.
CW: Interesting. So circling back to the question of how you productize something like this: with other models, we’ve found sort of where the potholes are and we’ve hit the landmines. What’s the path to productizing GPT-3? Is there one, or is it, you know, sort of a cute thing that’s gonna get a lot of enterprise buyers excited about AI, and that might get us into conversations? Is there really any meat on the bone here?
MM: Certainly. I think you just need to play to the model’s strengths. So I don’t think it’s an appropriate fit for many of the use cases that Indico tackles day in, day out. Its greatest strength is that speed of iteration, that time-to-value aspect. We’re able to get up and running and get to a good-enough solution lickety-split, and making use of GPT-3, I think, is gonna be a lot about making use of that particular strength.
CW: Okay. Tom, how about you? Yeah, yeah.
TW: The dream in machine learning has always been, you know, zero-shot learning, right? It just knows before I even ask it. And I think GPT-3 brings us a lot closer to a zero-shot understanding of context. That has pretty profound implications for how we initially train and build models, in a very positive way, because you can start to understand how to build a training set for these models very, very quickly, to the point about speed to impact and the ability to then iterate from that quickly. So I think that’s very profound. You know, I think that what we’re working on is how to then take that sort of initial traction that you get from GPT-3, but still make it cost-effective to use at scale. So that’s a lot of the interesting work to be done here, as one example.
CW: Yeah. So to my knowledge, Indico is the first platform out there that, you know, put transfer learning in the hands of the people who actually know how to read documents. And that was built on BERT, RoBERTa, the first couple of GPTs. It sounds like what you’re saying is there’s a path to using GPT-3 to do some of the easy parts of that transfer learning, like label the stuff that it can, and let the human worry about the hard cases. And to your point, that’s the straight-line acceleration, right? GPT-3 just has to be GPT-3, and then, you know, you take over the wheel afterwards.
TW: Interesting. That’s the way to think about it. And I think that, you know, we always promote the bionic arm as the right metaphor here. A lot of people want to use kind of robots and brains as the metaphor, and it’s just not appropriate. It’s not accurate. It sets the wrong expectation. I think it also vastly underestimates just how powerful the human brain is. We’re very, very far from that. That said, we believe the bionic arm is the right metaphor, because putting experts at the middle and allowing them to do more, faster, better, more accurately, more efficiently has very positive and dramatic ROI for the enterprise. But it also has the appropriate guardrail. I’ve used this word guardrail quite a bit; the human in the loop has to be that guardrail to make sure that the quality of these ultimate decisions is appropriate.
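The division of labor discussed above, where the model labels what it can and a human handles the hard cases, might be sketched roughly as follows. This is a minimal illustration, not Indico’s implementation: the `Prediction` type, the `route_predictions` function, and the 0.9 confidence threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    """A model-proposed label for one document (hypothetical structure)."""
    doc_id: str
    label: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def route_predictions(
    predictions: List[Prediction], threshold: float = 0.9
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into auto-accepted labels and a human review queue.

    The threshold is an assumed tuning knob; in practice it would be
    chosen by measuring accuracy on labeled data.
    """
    auto_accepted: List[Prediction] = []
    needs_review: List[Prediction] = []
    for prediction in predictions:
        if prediction.confidence >= threshold:
            auto_accepted.append(prediction)
        else:
            needs_review.append(prediction)
    return auto_accepted, needs_review


if __name__ == "__main__":
    batch = [
        Prediction("doc-1", "claim", 0.97),
        Prediction("doc-2", "policy", 0.55),
    ]
    accepted, review = route_predictions(batch)
    print([p.doc_id for p in accepted], [p.doc_id for p in review])
```

The human review queue is the guardrail in this sketch: anything the model is unsure about never reaches a downstream decision without an expert looking at it.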
CW: Yeah. It’s also not as cool of an analogy. Robots are much less cool than mech suits, right? <Laugh> Right. Let’s move on. All right. So Tom, you talk about one of the strengths being, you know, sort of playing off of Madison, the zero-shot capabilities. It will answer questions that it’s never seen before, and often it’s pretty good, as long as you know how to sift out the stuff that isn’t quite there. Madison, what other strengths do you see that this big model has that others before it didn’t have?
MM: That’s a good question. For one, I think it’s applicable to a larger set of use cases than previous models. With previous models, we talked about this classification framing: assigning a document to one of 10 buckets, yeah, or assigning a word in a document to one of 10 buckets based on function. Like, is this a company name or a price, something like that. These are very constrained problems. You know exactly what you’re gonna get out the other end, but you’re kind of limited to assigning labels to things. Yeah. On the other hand, with generative AI, its greatest weakness is its greatest strength. Yeah. In that it’s not restricted to just adding metadata to whatever you provide. It can produce new content. You know, you can ask it to write an article about insurance in the style of Shakespeare and get a reasonable result at the other end.
That’s something that’s simply not possible with the traditional machine learning paradigm of assigning labels to things. The knack is in finding the right applications for that technology. Summarization tends to do quite well. This is kind of a killer app for generative AI: natural language summarization of a longer document. <affirmative> Another application might be search followed by natural language question answering. So if you have a large corpus of documents, first narrow that large corpus down to a smaller number of relevant documents, and then ask GPT-3 to answer a question based on that context.
CW: Yeah. Like, what’s the difference between these two? Right?
MM: Exactly. Or yeah, that too. So I guess the set of problems that it’s applicable to is simply much broader than what we’re used to dealing with. And it’s gonna take a little bit of creative thinking in order to figure out how we can leverage that for practical business benefit.
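The search-then-answer pattern Madison describes can be sketched in a few lines. This is a rough illustration under stated assumptions: the retrieval step is a simple TF-IDF-style keyword overlap written with the standard library, and `generate` is a hypothetical stand-in for whatever text-generation call is available (no real GPT-3 API is invoked here). The function names are invented for the example.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_k_documents(question: str, documents: List[str], k: int = 2) -> List[str]:
    """Step 1: narrow the corpus to the k documents that best match the question."""
    doc_tokens = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)

    def idf(term: str) -> float:
        # Rarer terms get a higher weight (smoothed inverse document frequency).
        doc_freq = sum(1 for tokens in doc_tokens if term in tokens)
        return math.log((1 + n_docs) / (1 + doc_freq)) + 1.0

    question_terms = set(tokenize(question))
    scored = []
    for index, tokens in enumerate(doc_tokens):
        score = sum(tokens[term] * idf(term) for term in question_terms)
        scored.append((score, index))
    # Highest score first; ties broken by original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [documents[index] for score, index in scored[:k] if score > 0]


def build_prompt(question: str, context_docs: List[str]) -> str:
    """Step 2: place the retrieved documents in the prompt as context."""
    context = "\n\n".join(
        f"[{number}] {doc}" for number, doc in enumerate(context_docs, 1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def answer(
    question: str,
    documents: List[str],
    generate: Callable[[str], str],  # hypothetical text-generation function
    k: int = 2,
) -> str:
    """Retrieve relevant documents, then ask the generative model to answer."""
    return generate(build_prompt(question, top_k_documents(question, documents, k)))
```

Keeping retrieval separate from generation means the expensive generative model only ever sees a handful of documents per question, which also speaks to the cost concern raised earlier in the conversation.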
TW: I think, Chris, to build on your comment about your daughter writing term papers with ChatGPT, I put a comment on LinkedIn last week and I said, you know, rather than saying students shouldn’t use ChatGPT, let’s ask them, or let’s require, that they show their prompts. Because in many ways prompts are the new coding language. And I think that’s gonna be true within the enterprise as well: the skill required to, quote unquote, write good prompts will be sort of the next generation of software coding. And, you know, writing good prompts, and also helping enterprise users write good prompts and use them to get the information they need even more quickly, more robustly than they could before, that’s a profound change, and very, very compelling and exciting. But we’re at the very, very beginning of that.
CW: Yeah. I remember, old man story incoming, I remember back to when Google came out, right? You were talking about the advent of Google-style search, and that was a sea change from, like, AOL keywords, like purely faceted index search. Right? And I had friends
TW: Who, right.
CW: Boolean and Boolean. Yeah. Yeah. I had friends who just could not use Google adequately. Like, they couldn’t figure out how to write the search query in a way to get back the things that they cared about. And I think this skill that is emerging in terms of prompt engineering is just that on steroids, right?
TW: I remember the original search nerds at the beginning of the search wave. Same experience, right? The true information scientists absolutely rejected the Google-style search and said, it’ll never surpass Boolean, right? Boolean is still the only way to search information correctly. And this is kinda a moment like that, I think.
MM: It also might change the skills that we value. So by virtue of having such great generative AI, we need to become better discriminators. We need to get better at recognizing, based on our own intuition, our own judgment, when an output is trustworthy. Yeah. And when it really needs to be fact-checked, when we need to be a little bit more skeptical. So I think it’s gonna be interesting to just see people improve at that skill of discriminating between what is flowery but empty and what is actually genuinely insightful from GPT-3.
CW: Yeah. No, this is like, yeah, go ahead. Go ahead, Tom.
TW: I was gonna say, and going back to, you know, another weakness of ChatGPT right now is its lack of explainability, right? And the enterprise has to have explainability as part and parcel of any AI-based solution. It’s something that, you know, again, we were first to market with: explainability for AI models targeted at unstructured data. If you think about governance and compliance, if an AI says, yes, you should underwrite this risk, and you say, why? And it says, I don’t know, I just think you should, I mean, that’s obviously not acceptable. And I’m being glib, but that is the equivalency, right? Yeah, totally. Okay, you need to explain your decision to me so I can understand it, and I can also, by the way, tune it, because maybe your understanding is still not quite where it needs to be. So explainability is vital to using these technologies in the enterprise. And again, some of the things we’re working on is how do we bridge the gap between the lack of explainability in GPT-3 and the explainability we’ve built into our product experience so that it can be used, but used, for lack of a better word, safely, right?
CW: Yeah. No, the FinTech guy in me has crippling anxiety at the thought of trying to explain the data lineage of a decision that GPT-3 was involved in. Right? Like, yeah, that’s a big bridge. I don’t know exactly how to build it. So, circling in on the industry, Tom, since you’re at the insurance discotheque, what’s the buzz there? <Laugh> Like, what’s the mood? What are people worried about? What do they see as opportunities in the insurance space specifically?
TW: There’s a very large trend right now in major carriers reimagining their, for lack of a better term, their workbenches, right? Mm-Hmm. <affirmative> So claims workbench, policy admin workbench, underwriting workbench. And think of a workbench as the orchestration of all of the upstream data and the downstream decisions. You know, I’ve always had kind of a personal opinion that if you wanna distill what insurance companies are actually really good at, it’s making a specific set of decisions, right? Underwriting risk, approving claims, et cetera, et cetera. To do that well, they know now that data is the vital ingredient, and they’ve made huge investments in these decision systems and core platforms. But what they’ve realized is the source data is often very ugly. It is a scanned PDF, the body of an email, you know, information designed for human consumption, not designed for machine consumption. And so the power of the solution we built is that translation layer. We take information really designed for human consumption and translate it for machine consumption. Now you’re creating this superpower for your decision engines to operate on a broader set of data than they could before.
CW: Yeah. I’ve seen this light bulb appear over so many heads over the years, where it’s like, we automated something, or partially automated something, and oh my God, our data’s so clean now, we can actually use it downstream for things beyond just the process itself. So how is the GPT stack playing into that? Is it just sort of raising the level of excitement? Do people have specific thoughts about how it figures into the insurance industry? Or is it too late or too early to determine that?
TW: It’s definitely being discussed a lot right now at the show as well. Nobody knows for sure exactly what the killer use cases are yet. I think the level of interest is very high, as you often see. If we’re plotting this on the typical Gartner hype cycle, right, we’re on the incline in terms of the enthusiasm. We haven’t hit the peak of interest or even begun the descent into the trough of disillusionment that no doubt is ahead of us, like all new technologies. That’s okay. That’s where we learn how to really create value from these innovations. But very much we’re on the upward slope of the hype cycle right now, with good reason, right? I think there’s a lot to be hyped about.
CW: I love insurance folks. I have lots of friends in the insurance industry, but they tend to be fairly conservative people. So you’ve talked about the hype and the excitement. What are they worried about, Tom?
TW: Well, I think they’re realistic that things like explainability and governance, you know, are tough paths, for sure, in terms of the conservativeness. Because when you think about it, this is an industry where there are hundreds of billions of dollars at risk, and getting that wrong can be catastrophic. So there’s a good reason for that conservatism that you find in the insurance industry. But I think there’s a very strong grasp of the need to keep being able to take advantage of more and more data to make better and better decisions to drive the business forward. And a real understanding that that is a survival imperative. You know, it’s not something that they want exclusively; it’s something that they need.
CW: Interesting. All right. I wanna wrap this up. I wanna put two questions on the table, one for each of you. Madison, haven’t heard from you in a sec. So as you think about your peers out there, ML architects and engineers in large companies, like the insurance companies that Tom’s talking to this week, how should they be getting ready to deploy solutions that are built on top of these models? Whether it’s our platform or another platform, what can they do to get ready, to get educated, to remove roadblocks for themselves?
MM: Good question. For one, I think we need to be careful not to be enticed by the simplicity of the interface to GPT-3 and skip some of the best practices that data science teams are good at. GPT-3 doesn’t require that you necessarily build out a labeled training dataset in order to train the model or in order to measure the model’s performance. But we sure as heck should still be going through that same process of producing quantitative metrics of how well the model’s performing prior to putting something in production. It’s gonna be tempting not to, because it’s no longer strictly required as a portion of the model development process when you’re building models based on GPT-3, but it’s still super valuable. So I think maybe just making sure we’re cautious about how we deploy this new technology, and that we don’t circumvent some of the best practices just because we now have the option to.
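As a rough illustration of the practice Madison recommends, the sketch below scores model outputs against a small hand-labeled evaluation set before anything ships. It is a minimal example under stated assumptions: the function names and the 0.9 F1 bar are hypothetical, and a real team would pick metrics and thresholds that match its own use case.

```python
from collections import Counter
from typing import Dict, List


def per_label_metrics(gold: List[str], predicted: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall, and F1 for each label in a labeled evaluation set."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    true_pos: Counter = Counter()
    false_pos: Counter = Counter()
    false_neg: Counter = Counter()
    for gold_label, predicted_label in zip(gold, predicted):
        if gold_label == predicted_label:
            true_pos[gold_label] += 1
        else:
            false_pos[predicted_label] += 1
            false_neg[gold_label] += 1

    metrics: Dict[str, Dict[str, float]] = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_count = true_pos[label] + false_pos[label]
        actual_count = true_pos[label] + false_neg[label]
        precision = true_pos[label] / predicted_count if predicted_count else 0.0
        recall = true_pos[label] / actual_count if actual_count else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def ready_for_production(metrics: Dict[str, Dict[str, float]], min_f1: float = 0.9) -> bool:
    """Gate deployment on every label clearing an assumed minimum F1 score."""
    return all(scores["f1"] >= min_f1 for scores in metrics.values())
```

The same gate applies whether the predictions come from a fine-tuned model or from a prompt: the labeled set is what makes the comparison quantitative.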
CW: Interesting. So stick to your guns, stay rigorous, stick to your guns, fight all the hype.
MM: Yeah. Not like the Boolean guys, but
CW: Yeah, that’s right. Don’t go down in the firefight, but yeah, I get it. And then Tom, to you: for the enterprise buyer out there who’s got some mandate from their CEO to, like, get me some AI, I need some of that AI, how should they be thinking about this? How can they remove roadblocks for themselves? One, in terms of not buying the wrong thing, and two, in terms of how to evaluate the right thing?
TW: Yeah, I think the normal human reaction here is, wow, look at this new hammer. It’s, like, diamond-encrusted. Where can I go find some good nails to hit with this new hammer? And I think that’s natural, but it’s not very useful. So I think what’s more important, and what we really try to drive with our customer engagement, is: tell me about the outcome that’s going to have the business impact that you desire. Talk to me about the outcome. Don’t talk to me about the approach. Don’t talk to me about the methods, technologies, or tools. Talk to me about the outcome first, and then let’s work backwards to figure out which technologies and tools to use when. Now, that said, you also have to have enough imagination to look at it and say, what was previously not possible may now be possible. So I think we can help with that as a provider in this space. And I think the customers can help by understanding the strengths and weaknesses of these technologies, to say, hey, previously this was not possible. Is it possible now? But still be very rigorous about understanding the outcome you’re trying to craft, as opposed to the approach or method, because a lot of times that will lead you down the wrong road.
CW: Yeah. That’s great. They should get educated, I think is what I hear you saying. Maybe they should be listening to Unstructured Unlocked, for example.
TW: That’s a good starting point.
CW: Yeah. And on that note, I will bid our audience adieu. This has been another episode of Unstructured Unlocked. I’m your humble host, Chris Wells. My esteemed guests this morning have been CEO of Indico Data, Tom Wilde, and Principal ML Architect and co-founder of Indico Data, Madison May. It’s been a pleasure, fellas.
TW: Thank you guys for having us, Chris.
CW: You’re welcome. Best of luck out there.