How intelligent document processing helps address the ambiguity in automation efforts

By: Christopher M. Wells, Ph.D.
June 24, 2022 | Intelligent Document Processing, Intelligent Process Automation

As we’ve discussed previously, automation models can be brittle. Often it’s because models are rule-based and not all documents play by the rules, but another big issue is ambiguity. It’s not always easy or clear how to parse the meaning of a document in a way that translates to an automation model, which is all the more reason we need intelligent document processing – with an emphasis on “intelligent.”

Some colleagues and I explored this issue in an episode of our “Unstructured Data Explained” video series. In the episode, Indico Data Founder and Machine Learning Architect Madison May laid out the issue nicely when describing the process of converting a PDF to plain text, which is a first step used with templated approaches to automation or robotic process automation.

You can find the entire Unstructured Data Explained video series in our library.

Related content: “Gartner 2022 Market Guide for intelligent document processing solutions”

 

Why a lossy process is the enemy of automation

“Most people tend to think of documents as plain text,” he said. “You have some PDF and it’s really just a series of words and if you feed the series of words into a machine learning model you expect you can understand everything there was to understand about the document you vetted.”

But as he correctly pointed out, you can lose a lot of salient information in that conversion process. Such information is “present in the layout of the page, in how the page is styled, the size of your words on the page, potentially graphics,” he said. “Sometimes text is a caption for an image and if you lose the image the text ceases to make sense.”

It all adds up to what Madison called a “lossy process” where you lose much of the context behind the page before you ever feed it to a machine learning model.
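
To make the “lossy process” concrete, here is a minimal sketch in Python. The `Word` record and the two sample pages are hypothetical and invented for illustration; they are not the output of any particular PDF or OCR tool. The sketch only shows that two pages with different layout and styling can flatten to exactly the same plain text.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A hypothetical layout-aware word record (illustrative only)."""
    text: str
    x: float          # left edge on the page, in points
    y: float          # distance from the top of the page, in points
    font_size: float  # type size, in points


def to_plain_text(words):
    """Flatten words into one string, discarding position and styling."""
    return " ".join(w.text for w in words)


# Same words, different layout: a large heading above an amount...
heading_page = [
    Word("Total", 72, 100, 24.0),
    Word("Due", 140, 100, 24.0),
    Word("$500", 72, 140, 11.0),
]
# ...versus a single line of body text.
body_page = [
    Word("Total", 72, 100, 11.0),
    Word("Due", 100, 100, 11.0),
    Word("$500", 125, 100, 11.0),
]

same_text = to_plain_text(heading_page) == to_plain_text(body_page)
same_layout = heading_page == body_page
print(same_text, same_layout)  # prints: True False
```

Once the pages are reduced to plain text, a model has no way to tell them apart, even though a person looking at the originals would.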

There’s also ambiguity inherent in the process itself, before you even try to apply automation.

Most, if not all, document-based processes have decision points at which an employee faces some inherent ambiguity and must make a call, at times relying on experience to infer information that may not be in the document at all. Given that, two people who perform the same job may produce different outputs from the same document.

No artificial intelligence engine is going to solve that issue, so the goal is really to get your AI engine to the point where it can accurately automate the processing of documents that are unambiguous, or at least sufficiently low-risk that you’re comfortable letting the model make decisions. For the rest, you let the model do the best it can, then loop in a human to apply judgment. The human + AI interface is key here.

 

Difficulties in training automation models

Part of the reason AI will get you only so far is that it can be difficult to teach an automation model all the nuances inherent in a document like a PDF. This is where the conversation in our recent Unstructured Data Explained episode ventured into esoteric topics, with our CTO and co-founder Slater Victoroff posing questions like: What is a word? What is a sentence or a paragraph?

“If you’re a non-technical person and you hear that question you’re like, ‘Are you an idiot? You don’t know what a word is?’” Slater said.

But it turns out defining such things is not always so easy. A classic example is trying to define a paragraph that spans a page boundary. The page break might mean a separate concept is starting. Or it might not. You might need to read the text on the previous page to tell, or you might not, if the document indents every new paragraph.
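
The page-boundary case can be sketched as a simple rule of thumb. The function below is a hypothetical heuristic written for illustration, not how any particular product decides; it assumes new paragraphs are indented and that a finished thought ends in terminal punctuation. Both assumptions fail on plenty of real documents, which is exactly the ambiguity at issue.

```python
def continues_paragraph(prev_page_last_line: str, next_page_first_line: str) -> bool:
    """Guess whether the top of a page continues the previous page's paragraph.

    Illustrative heuristic only: it relies on indentation and punctuation,
    and will be wrong for documents that follow other conventions.
    """
    prev = prev_page_last_line.rstrip()
    if not prev or not next_page_first_line.strip():
        return False  # nothing to join on one side of the page break
    # Indentation at the top of the new page suggests a new paragraph.
    if next_page_first_line.startswith((" ", "\t")):
        return False
    # Terminal punctuation suggests the previous thought was finished.
    if prev.endswith((".", "!", "?", ":")):
        return False
    return True


print(continues_paragraph("the parties agree that the", "supplier shall deliver"))
print(continues_paragraph("This ends the clause.", "    A new clause begins here"))
```

Note the weak spot: a paragraph whose last sentence on the page happens to end with a period, and which continues unindented on the next page, is split incorrectly. Rules like this are brittle in the same way templates are.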

The more granular you want to get, the more you need precision that may be hard to come by. For example, is each bullet in a list its own sentence? Or should each element of the list include the task or heading that started the list? Are section headings or subheadings themselves sentences? To Slater’s point, words seem uncontroversial until you have to decide whether the dollar sign before a charge in an invoice is a separate word, as it would be if you read it out loud, or something else.
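
The dollar-sign question shows up as soon as you write a tokenizer. The sketch below uses only Python’s standard library; the sample invoice line and both tokenization rules are assumptions chosen for illustration. Neither rule is “correct”; they simply encode different answers to “what is a word?”

```python
import re

line = "Total due: $1,250.00"

# Option A: split on whitespace, so the currency symbol stays attached
# to the amount.
whitespace_tokens = line.split()

# Option B: treat the dollar sign as its own token, closer to how the
# line would be read aloud ("one thousand two hundred fifty dollars").
symbol_tokens = re.findall(r"\$|[^\s$]+", line)

print(whitespace_tokens)  # ['Total', 'due:', '$1,250.00']
print(symbol_tokens)      # ['Total', 'due:', '$', '1,250.00']
```

The same line yields three “words” under one rule and four under the other, and every downstream step, from labeling to extraction, inherits whichever choice was made.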

Related content: “Why intelligent document processing succeeds where other AI technologies fail”

 

Intelligent automation requires contextual understanding

All of these definitions can be important for various document process automation use cases. In some instances, a big part of a use case is breaking up a document, such as a contract, into its component parts. The ideal situation is that an optical character recognition engine does the job for you and the automation model can then do its thing on the resulting output.

But when boundaries get fuzzy, as with those paragraphs split across a page break, it takes contextual understanding to make accurate decisions. Contextual understanding only comes with intelligent automation platforms built on artificial intelligence technologies such as natural language processing, machine learning and deep learning.

It’s also helpful if your automation platform offers multiple types of models. Text-based models are a given. Object-based models that treat a page like an image are also helpful, with their ability to identify and extract data from elements such as a table, photo caption or logo. Now you can get models working together in parallel, each performing the tasks it’s best at.

The Indico Unstructured Data Platform, for example, offers numerous model types and gives you a confidence rating with each one. If the model is 95% sure a result is accurate, you can likely send that output along. If it’s only 75%, you may decide to kick it to an employee for review. You can basically turn knobs to deal with the imprecision and ambiguity in a process, defining how to deal with various issues that may crop up.
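
As a rough illustration of what “turning a knob” means, here is a minimal Python sketch of confidence-based routing. It is not the Indico API; the `Prediction` record, the `route` function, and the route names are hypothetical, and the 0.95 default simply mirrors the 95% figure above.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """A hypothetical model output (illustrative, not a real platform type)."""
    field: str
    value: str
    confidence: float  # the model's confidence, from 0.0 to 1.0


def route(prediction: Prediction, auto_accept_threshold: float = 0.95) -> str:
    """Send confident predictions straight through; queue the rest for review."""
    if prediction.confidence >= auto_accept_threshold:
        return "auto_accept"
    return "human_review"


print(route(Prediction("invoice_total", "$500.00", 0.95)))  # auto_accept
print(route(Prediction("invoice_total", "$500.00", 0.75)))  # human_review
```

The threshold is the knob: lowering it sends more documents straight through at higher risk, while raising it sends more of them to a reviewer. A low-risk process might accept 0.75, whereas a high-risk one might require 0.99.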

The platform is also smart enough to understand the context behind any kind of unstructured content or data, a testament to the database of some 500 million labeled data points upon which it’s built. So yes, with Indico, you can build models that know what a word is and can deal with paragraphs that span pages.

And in the cases where the artificial intelligence still can’t confidently make the decision for you, our platform’s human-in-the-loop capability ensures you can seamlessly combine artificial and human intelligence to maintain a high level of accuracy and achieve far greater efficiency than either could alone.

To learn more about just how smart the Indico Data platform is, check out our interactive demo or schedule an in-depth demo.
