3 Keys to Scaling Unstructured Data Automation in the Enterprise

November 18, 2021 | Intelligent Document Processing, Robotic Process Automation, Unstructured Data

Companies looking to achieve truly transformative change in the way they process documents and other data quickly learn that it’s not going to happen solely with robotic process automation (RPA) tools, for a simple reason: RPA solutions can’t deal with the unstructured content that accounts for the vast majority of data in most companies.

We see it with customers all the time. They start with an RPA platform and get some quick wins with simple processes involving highly structured content. But when they try to scale the solution and/or take on use cases involving unstructured documents, they hit a wall.

To achieve true enterprise scale with process automation, you need an intelligent document processing platform that addresses the following three key criteria.


Use case agnostic

We’ve seen lots of companies buy point automation solutions for use cases like contracts, invoices and email. They may work well enough for that given use case, but that’s it. Given the many varied process automation use cases enterprises would like to address, they will wind up with dozens of siloed point solutions. Now they’ve got an administrative nightmare and a highly inefficient automation program, because the solutions don’t talk to each other and employees have to learn how to use each point solution they may need.

True enterprise scalability calls for an industry-agnostic intelligent automation platform that can be applied to any use case involving unstructured content, including documents, images and video. That enables you to get more from your automation investment by applying the same tool to myriad use cases. Such a tool is particularly valuable if you have an automation center of excellence (COE) that can help find use cases and quickly get them into production, because the knowledge the COE builds with any one use case applies to all others.


No templates

In addition to RPA, we also see many customers try to use templated approaches to automation, often combined with optical character recognition. Like RPA, this approach is successful only with highly structured documents where the fields from which you want to extract data are exactly the same from one document to another.

In practice, this approach tends to break down even at modest scale, beyond 20 documents or so. That’s because even a document that may seem to be structured, such as an invoice, is less so at scale. In the invoice example, if you get only 20 invoices per month from maybe 5 different companies, it’s possible you can build templates to automate invoice processing.

But an enterprise must deal with hundreds or thousands of invoices from many different companies. Effectively, you’re now dealing with unstructured data, because there’s no way you can create templates to reliably process all of those different invoice formats.
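To make the template problem concrete, here is a minimal sketch in Python. The field patterns, vendor layouts, and function name are all hypothetical and are not drawn from any particular product; the sketch only shows how a template written for one invoice layout returns nothing when a second vendor words the same information differently.

```python
import re

# Hypothetical template for one vendor's invoice layout (illustrative only).
# Each field is tied to the exact wording that vendor uses.
VENDOR_A_TEMPLATE = {
    "invoice_number": re.compile(r"^Invoice No: (\S+)$", re.MULTILINE),
    "total": re.compile(r"^Total Due: \$([\d,]+\.\d{2})$", re.MULTILINE),
}

def extract_with_template(text, template):
    """Return each field's captured value, or None where the pattern does not match."""
    result = {}
    for field, pattern in template.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result

# Two made-up invoices carrying the same kind of information in different layouts.
vendor_a_invoice = "Invoice No: A-1001\nDate: 2021-11-01\nTotal Due: $1,250.00\n"
vendor_b_invoice = "INVOICE #B-77\nAmount payable (USD) 980.50\nIssued 1 Nov 2021\n"

result_a = extract_with_template(vendor_a_invoice, VENDOR_A_TEMPLATE)
result_b = extract_with_template(vendor_b_invoice, VENDOR_A_TEMPLATE)

print(result_a)  # fields found for the layout the template was written for
print(result_b)  # every field is None for the second vendor's layout
```

Each new layout would need its own set of patterns, which is manageable for a handful of vendors but not for hundreds.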

In such a case, you need a platform that has artificial intelligence capabilities such as deep learning, machine learning and natural language processing. Such technologies give an automation platform cognitive capabilities that enable it to read documents and data just as your employees would – with no need for templates.


Cost-effective, scalable architecture

Oftentimes, however, you may find that platforms with deep AI capabilities can’t effectively scale to handle use cases involving hundreds or thousands of documents. The server load gets too high and the automation routines break down.

It’s also common to find AI platforms that require multiple GPUs in order to scale, which means your infrastructure tab can quickly reach into the millions.

Indico Data takes a different approach. Our generalized base model is built on some 500 million labeled data points, enough to enable it to understand the context behind any document, image or video – with no templates. That model runs on a single GPU.

Customers can then customize that base model for their own use cases. But those custom models sit on relatively inexpensive CPUs. That means with the Indico Unstructured Data Platform, as you increase usage you’re adding only CPUs, not GPUs. The platform is also flexible enough to handle any use case involving unstructured content, meaning it’s use case agnostic.

To learn more, check out our interactive demo that lets you see the platform in action, with applications such as data classification and extraction – common automation use cases.

