One of the challenges insurance companies face with artificial intelligence is accurately labeling data to create effective automation models. But help is arriving in the form of AI large language models (LLMs), which are trained on so much data that they keep getting larger and more capable.
Let’s look at the traditional way of creating automation models, then dive into what zero-shot prediction makes possible for machine learning models, including GPT (Generative Pre-trained Transformer) models.
Traditional automation model labeling
Most intelligent automation models in insurance are created by labeling documents to indicate what data you want to extract. To automate an insurance claims intake process, you may take a few dozen documents and label fields such as claimant name and address, policy number, damage description, repair estimates and the like.
While effective tools exist to make this process relatively painless, it nonetheless takes time. For any given claims or underwriting policy, there are likely dozens of different document types that may be involved. Each has to be labeled.
Some companies address the issue by creating teams to label data. While that relieves the burden on any single team member, it can lead to problems akin to “too many cooks in the kitchen.” People interpret descriptions of labels differently. Or they simply do their jobs differently from one another, making what should be the same process more difficult to define and, hence, automate. The result is a sub-optimal model.
Simplifying labeling: zero-shot
But the labeling game is beginning to change.
LLMs like GPT have been trained on so much language that they are now large enough to unlock zero-shot predictions. Give the model a description of the data you’re looking for, and where in a document to find it, and it returns that data. In some instances, the name of the data element alone, such as “policy number,” may be enough.
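To make the idea concrete, here is a minimal sketch of what a zero-shot extraction prompt might look like. The `build_extraction_prompt` helper and the field descriptions are hypothetical illustrations, not Indico’s actual API; the point is that the model receives only descriptions of the target fields, never labeled examples.

```python
# Zero-shot extraction sketch: describe the fields in plain language and
# ask the model to return them as JSON. No hand-labeled documents are
# supplied. The resulting prompt could be sent to any LLM client; the
# helper names here are illustrative, not a real product API.

def build_extraction_prompt(document_text: str, fields: dict) -> str:
    """Build a prompt that describes each target field by name and hint."""
    field_list = "\n".join(f"- {name}: {hint}" for name, hint in fields.items())
    return (
        "Extract the following fields from the insurance document below.\n"
        f"{field_list}\n"
        "Respond with a JSON object using the field names as keys; "
        "use null for any field you cannot find.\n\n"
        f"Document:\n{document_text}"
    )

fields = {
    "claimant_name": "full name of the person filing the claim",
    "policy_number": "the policy identifier, usually near the top of the form",
    "repair_estimate": "the quoted repair cost in dollars",
}

prompt = build_extraction_prompt("...claim form text here...", fields)
```

Because the prompt is just text, refining the model means editing these descriptions rather than relabeling documents.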
Now, instead of labeling dozens of documents, users can go into prompting mode to create automation models. After seeing what the model comes back with, they can refine the prompt until it delivers the desired result, including adding instructions such as “if this word appears next to that one, ignore it.”
Additionally, you can run a model based on an initial prompt, then fire additional prompts against the result it returns, to get more finely tuned results or weed out erroneous values.
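This two-stage pattern can be sketched as follows. The hard-coded `first_pass` dictionary stands in for a real model response, and `build_validation_prompt` is a hypothetical helper, not a documented API; the sketch simply shows a follow-up prompt (and a cheap local filter) being run against an initial extraction result.

```python
# Prompt-chaining sketch: a first prompt extracts candidate values, then
# a second prompt asks the model to keep or discard each one. The
# first-pass result below is hard-coded sample data standing in for a
# real model response.
import json

first_pass = {"policy_number": "PN-4481-22", "repair_estimate": "N/A"}

def build_validation_prompt(extracted: dict) -> str:
    """Second-stage prompt: ask the model to review the first-pass values."""
    return (
        "Review these extracted values. For each one, answer keep or "
        "discard: discard placeholders like 'N/A' or values that do not "
        "match the field's expected format.\n\n"
        f"Extracted values:\n{json.dumps(extracted, indent=2)}"
    )

def drop_placeholders(extracted: dict, junk=("N/A", "", "unknown")) -> dict:
    """Weed out obvious junk locally before the second model call runs."""
    return {k: v for k, v in extracted.items() if v not in junk}

cleaned = drop_placeholders(first_pass)
# cleaned == {"policy_number": "PN-4481-22"}
validation_prompt = build_validation_prompt(cleaned)
```

Chaining a cheap local filter with a follow-up prompt keeps the second model call focused on genuinely ambiguous values.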
You can also combine traditional hand-labeling with zero-shot. Label five or ten documents, see what results you get, then go into prompting mode and modify the prompt until the results are satisfactory. Once you’re happy with the prompts, click a button and the GPT model labels the data set for you.
Indico 6 Prompt Studio
At least, this is how model creation works in Indico 6, thanks to our new Prompt Studio. In practice, it means you can create highly effective models far more quickly. Instead of spending perhaps a week hand-labeling documents, associates spend a couple of hours prompting, follow up with a couple of hours of quality assurance, and the model is ready to roll.
The Indico Unstructured Data Platform could already reduce insurance claims intake and underwriting submissions process times by around 85%. With Indico 6, you can realize that value faster by reducing the time spent creating automation models.