A sure fix for a vexing insurance automation problem: importing data from tables

By: Christopher M. Wells, Ph.D.
December 21, 2023 | Artificial Intelligence, Intelligent Intake

One of the challenges with automating insurance document intake processes is dealing with data that’s in tabular form, including spreadsheets, invoices, and the like. But help is on the way, as the latest artificial intelligence large language models are surmounting the challenge.

There’s a (relatively) old saying in the artificial intelligence world, “There’s no such thing as a table.” What that means is it’s exceedingly difficult to define a table in a way an AI model can consistently understand. As soon as you think you have it right, an Excel spreadsheet will come along with data formatted in some new way that breaks all the rules defined in your model. It’s Murphy’s Law meets spreadsheets, or tables of any kind.

The problem with tables in insurance automation

The problem is that in a typical high-throughput LLM, only a few hundred words fit in the context window at a time. Say you’ve got a table that spans multiple pages, or even a single-page table that’s dense with rows. On a wide table, you may get only three or four rows deep before the header row falls outside the window; the model can’t “see” it anymore.

So, not far into the table, the model loses the ability to predict, and you start getting sub-optimal results (read: garbage).
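
To make that concrete, here is a toy sketch of the effect. It is not how any particular LLM or the Indico platform handles input; the window size, the `flatten` and `windows` helpers, and the sample table are all made up for illustration. The table is read row by row into a flat list of words, then cut into fixed-size windows, and we check which windows still contain the header row.

```python
# Toy illustration only: a tiny "context window" measured in words.
# WINDOW_WORDS is a hypothetical value chosen so the effect is easy to see.
WINDOW_WORDS = 12


def flatten(table):
    """Read a table row by row into one flat list of words."""
    return [cell for row in table for cell in row]


def windows(words, size):
    """Split the flat word list into consecutive fixed-size windows."""
    return [words[i:i + size] for i in range(0, len(words), size)]


# Made-up sample data; the first row is the header row.
table = [
    ["Policy", "Insured", "Premium", "Deductible"],
    ["P-001", "Acme", "1200", "500"],
    ["P-002", "Globex", "950", "250"],
    ["P-003", "Initech", "1430", "1000"],
    ["P-004", "Umbrella", "2100", "750"],
    ["P-005", "Hooli", "880", "500"],
]

header = table[0]
chunks = windows(flatten(table), WINDOW_WORDS)

# For each window, can the model still "see" every column header?
sees_header = [all(h in chunk for h in header) for chunk in chunks]
print(sees_header)  # [True, False]
```

The second window holds nothing but bare values such as "1430" and "1000", with no indication of which is a premium and which is a deductible. That is the point at which predictions start to degrade.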

That doesn’t mean it’s impossible to automate intake of underwriting submissions and claims documents that include tables. But the automation routines that get the job done are typically brittle, taking a form-style approach. By that I mean the model treats the table like a static form and has rules to extract the values from various cells. If the format changes, or perhaps the scan quality is poor, all bets are off.
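
As a rough sketch of why the form-style approach is brittle, consider a rule that says "the premium is always the third cell in the row." The function names and sample data below are hypothetical and are not taken from any real product; they only illustrate the failure mode.

```python
def extract_premium_by_position(rows):
    """Form-style rule: the premium is always the third cell of each data row."""
    return [row[2] for row in rows[1:]]


def extract_premium_by_header(rows):
    """Look the column up by its header name instead of a fixed position."""
    idx = rows[0].index("Premium")
    return [row[idx] for row in rows[1:]]


# Made-up sample data in the layout the rule was written for.
original = [
    ["Policy", "Insured", "Premium"],
    ["P-001", "Acme", "1200"],
]

# Same data, but the sender has added a "Broker" column.
changed = [
    ["Policy", "Insured", "Broker", "Premium"],
    ["P-001", "Acme", "Smith & Co", "1200"],
]

print(extract_premium_by_position(original))  # ['1200']
print(extract_premium_by_position(changed))   # ['Smith & Co']
print(extract_premium_by_header(changed))     # ['1200']
```

The positional rule silently returns the wrong value as soon as the layout shifts. The header lookup survives this particular change, but it is still a hand-written rule: rename the column to "Annual Premium" or feed it a poor scan and it breaks too.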

Additional capacity for the table issue

In the latest version of our platform, Indico 6, we’ve added modeling capacity to address the table issue. We essentially squared our table model capacity, giving it the ability to keep track of information about an entire column or row in which a particular cell resides. That means headers no longer get lost. Rather, the model retains the context that is so important to accuracy in any LLM, and to the effectiveness of any insurance document processing automation solution.
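
The idea of a cell carrying its row and column context can be pictured with a small data-structure sketch. To be clear, this is not the Indico 6 model or its API; `CellWithContext` and `cells_with_context` are hypothetical names, the data is made up, and the sketch assumes a simple table whose first row holds column headers and whose first column holds row labels.

```python
from dataclasses import dataclass


@dataclass
class CellWithContext:
    """A cell value together with the headers that give it meaning."""
    value: str
    column_header: str
    row_label: str


def cells_with_context(table):
    """Pair every data cell with its column header and its row label.

    Assumes the first row is the header row and the first column
    of each data row is that row's label.
    """
    header, *body = table
    cells = []
    for row in body:
        row_label = row[0]
        for column_header, value in zip(header[1:], row[1:]):
            cells.append(CellWithContext(value, column_header, row_label))
    return cells


# Made-up sample data.
table = [
    ["Policy", "Premium", "Deductible"],
    ["P-001", "1200", "500"],
    ["P-002", "950", "250"],
]

for cell in cells_with_context(table):
    print(cell)
```

However deep into the table a cell sits, its header travels with it, so a value like "250" is never separated from "Deductible" and "P-002".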

Even better, getting the job done doesn’t take any additional labeling as compared to any other production use case. Just use the usual bounding box tool to carve out the row or column you want to pull data from; simple.

I don’t generally talk product in these missives, but I do believe this is a breakthrough. In beta tests of Indico 6 over the summer, we saw double-digit improvement in performance for documents containing tables. I’m confident in saying it’s best in class in terms of modeling large, complicated tabular data.

We’re happy to field any questions about how this works – just contact us. Or try it out for yourself by signing up for a free demo.
