Overcome the complexity of machine learning: get to know machine teaching

By: Christopher M. Wells, Ph.D.
May 6, 2022 | Citizen Developer, Machine Learning

The first line of a 2017 paper written by a group of 11 experts at Microsoft Research succinctly articulated a problem that too many companies face to this day: “The current processes for building machine learning systems require practitioners with deep knowledge of machine learning.” 

But the paper also posited a solution to the machine learning complexity problem and, in the process, coined the term “machine teaching.”

Machine teaching: a better way to build automation models

“We believe that in order to meet this growing demand for machine learning systems we must significantly increase the number of individuals that can teach machines. We postulate that we can achieve this goal by making the process of teaching machines easy, fast and above all, universally accessible,” the researchers wrote.

As it turns out, they were right. Today, solutions exist that make it fast and easy to build powerful machine learning models that automate processes involving documents and data, including unstructured content. These intelligent automation solutions, such as the Indico Unstructured Data Platform, are most definitely accessible. Indeed, they’re intended for employees who have no knowledge of machine learning whatsoever but deep knowledge of the process to be automated.

In a sense, that flips the traditional approach to building machine learning models on its head. Previously, when the business had a process it wanted to automate, it would take requirements to a data science team. The data scientists would try to build a machine learning model to extract key elements of documents or images and drive automation for the process in question. That would likely take months, because building such models from scratch is complicated, even for professional data scientists. And the model would almost certainly have to be fine-tuned before it went into production, because data scientists are experts in data generally, not in your business’s data specifically, which means more back-and-forth and more delays.

Like advanced coding tools for AI

I think of the progression of machine learning model-building as similar to what we witnessed with writing software. When I first learned to program, I wrote Java code in a notepad file, ran a compiler from the command line, and held my breath while waiting to see if the code did what I intended. If it didn’t, I had to review it line by line to troubleshoot it.
System.out.println("gET me oUt oF hERe!!!!");

Today we’ve got tools that will tell you code is flawed as soon as you write it. That is much more productive (albeit less challenging) and makes it easier to write complex programs.

Machine teaching offers the same sort of benefit when it comes to building intelligent document processing models. Your process experts – meaning the people who perform the task day-to-day – are the ones who “teach” the model. They do so by using simple tools to label documents, telling, or teaching, the model which components of a document are important. And just like developer tools tell devs when they miss a semicolon, machine teaching tools give the user feedback on where the model is learning well and where it needs help.

Really, this mimics the work employees perform in processing, say, an invoice: copying values such as name, amount and invoice number from the invoice and pasting them into a downstream ERP or other processing system, a process also known as “see and key.” With machine teaching, instead of copying and pasting, they use the labeling tools to teach an automated model how to do the job for them.
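To make the feedback idea concrete, here is a minimal sketch in Python. It is not how the Indico platform works internally; the field names, sample invoices, the `field_feedback` function and the 80% threshold are all assumptions made up for illustration. It compares what a process expert labeled against what a model predicted and reports, field by field, where the model is learning well and where it needs more labeled examples.

```python
from collections import defaultdict

# Hypothetical data: values a process expert labeled on three invoices...
labeled = [
    {"name": "Acme Corp", "amount": "1,200.00", "invoice_number": "INV-001"},
    {"name": "Globex", "amount": "87.50", "invoice_number": "INV-002"},
    {"name": "Initech", "amount": "430.00", "invoice_number": "INV-003"},
]
# ...and what an imaginary model extracted from the same invoices.
predicted = [
    {"name": "Acme Corp", "amount": "1,200.00", "invoice_number": "INV-001"},
    {"name": "Globex", "amount": "87.50", "invoice_number": "1NV-002"},
    {"name": "Initech", "amount": "430.00", "invoice_number": "INV-008"},
]


def field_feedback(labeled, predicted, threshold=0.8):
    """Return {field: (accuracy, status)} by comparing predictions to labels.

    The threshold is an assumed cutoff, not a recommended value.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, guess in zip(labeled, predicted):
        for field, value in truth.items():
            totals[field] += 1
            if guess.get(field) == value:
                hits[field] += 1

    report = {}
    for field in totals:
        accuracy = hits[field] / totals[field]
        status = "learning well" if accuracy >= threshold else "needs more labels"
        report[field] = (accuracy, status)
    return report


feedback = field_feedback(labeled, predicted)
for field, (accuracy, status) in feedback.items():
    print(f"{field}: {accuracy:.0%} ({status})")
```

In this toy data the model matches the expert on names and amounts but stumbles on invoice numbers, which is exactly the kind of signal that tells a process expert where to spend the next few minutes of labeling.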


Change made easy: Staggered Loop Training

Going a step further, Indico Data recently came out with a new capability that addresses another problem that has plagued machine learning: dealing with change. You can train a model to deal with a process as it exists today, but change is inevitable. Vendors change their invoice formats, regulations change, new documents crop up, and so on. Going back to your data science team to update models for all of these changes is time-consuming, expensive and impractical. Machine learning operations is a nascent field with emerging (maybe?) best practices and fragmented tooling.

To address this problem, Indico Data developed Staggered Loop Training. This is intended for “human in the loop” processes, where an employee is required to perform one or more steps in a given process, or simply to ensure accuracy in an automated process.

In such cases, whenever an employee finds an exception, they simply make a correction. The Indico Data platform will then learn from that correction and update the model accordingly, all on its own, with guardrails that the user puts in place. And it will do so without changing how it deals with all the content and data that haven’t changed. Think of it as continuing education for process automation models.
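The general shape of a correction loop with a guardrail can be sketched in a few lines of Python. To be clear, this is a toy illustration and not Staggered Loop Training itself, whose internals this post does not describe. The `ToyExtractor` class, the vendor names, the document fields and the "wait for two matching corrections" guardrail are all hypothetical. The "model" here is just a lookup of which label holds the invoice number for each vendor.

```python
from collections import Counter, defaultdict


class ToyExtractor:
    """Toy stand-in for a model: remembers which label holds the
    invoice number for each vendor. Entirely hypothetical."""

    def __init__(self, rules, min_corrections=2):
        self.rules = dict(rules)                # vendor -> label to read
        self.min_corrections = min_corrections  # guardrail chosen by the user
        self.pending = defaultdict(Counter)     # vendor -> candidate label counts

    def extract(self, vendor, document):
        # Returns None when the expected label is missing (an exception).
        return document.get(self.rules.get(vendor))

    def correct(self, vendor, document, right_value):
        """Record a reviewer's correction. The vendor's rule is updated only
        after enough corrections point to the same label. Returns True if
        the rule was updated."""
        for label, value in document.items():
            if value == right_value:
                self.pending[vendor][label] += 1
                if self.pending[vendor][label] >= self.min_corrections:
                    self.rules[vendor] = label  # only this vendor changes
                    del self.pending[vendor]
                    return True
                return False
        return False


model = ToyExtractor({"acme": "Invoice No", "globex": "Invoice #"})

# Acme changes its invoice format: the number now sits under "Ref".
new_acme_1 = {"Ref": "A-100", "Total": "50.00"}
new_acme_2 = {"Ref": "A-101", "Total": "75.00"}
globex_doc = {"Invoice #": "G-7", "Total": "12.00"}

first = model.extract("acme", new_acme_1)                # old label is gone
updated_1 = model.correct("acme", new_acme_1, "A-100")   # one correction: wait
updated_2 = model.correct("acme", new_acme_2, "A-101")   # second one: update
after = model.extract("acme", {"Ref": "A-102", "Total": "9.00"})
unchanged = model.extract("globex", globex_doc)
```

The design choice worth noticing is that the update is scoped: the correction changes how Acme invoices are read and leaves the Globex rule alone, and nothing changes at all until the guardrail is satisfied.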

Staggered Loop Training also addresses a fear you often hear about with respect to machine learning and artificial intelligence in general: that intelligent machines will take over all of our jobs. Staggered Loop is a great example of a far more likely reality, that of humans and machines working together. Employees teach machines how to be smarter and free themselves up for more valuable and less mundane work.

To learn more about Staggered Loop Training and other new features, check out Indico 5 or schedule an in-depth demo. You can also register for a free trial to test the platform for yourself or get in touch with any questions.
