When it comes to document process automation, you can write a rule to automate virtually any step in a process. But that doesn’t mean you should.
Writing rule after rule to address all the variables that may come up in a process involving numerous documents is a trap, says Slater Victoroff, CTO and founder of Indico Data.
“When you proceed in that way the ability to solve every problem is actually the biggest possible danger you could face because it sucks you into this belief that at some point your rules will be correct,” he says. The rules-focused approach assumes the problem is you haven’t written enough rules or haven’t written the right rules. “When in fact that is not the problem. The problem is that you are writing rules to begin with.”
In a recent installment of the “Unstructured Explained” video series, Victoroff discussed the issue with two Indico Data colleagues: ML Architect and Co-Founder Madison May and VP of Business Development Brandi Corbello.
Many rules make for brittle models
The problem with rules is they tend to make automation models brittle, May says. While any single rule may improve the quality of an automation application, taken together, they amount to numerous potential points of failure.
“It means when [the model] fails it fails much harder because you’re imposing stricter and stricter constraints on what your system can and cannot do,” May says. “And at a certain point it ceases to become useful to try and inject all of your preconceived knowledge into the problem and you should take a step back and let the model handle it for you.”
Corbello agrees and says a rules-based approach harks back to the days when shared service centers were convinced that optical character recognition would solve all their problems. For an invoice processing application, for example, the solution was to maintain a huge file of words that might appear on an invoice and train the OCR application to look for any or all of those words.
“It’s basically like this big ‘Control F’ was happening rather than actually understanding what was on the document,” she says. “There’s a big difference between those two things.”
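To make the “Control F” comparison concrete, here is a minimal Python sketch of keyword-list matching. It is an illustration only, not Indico's or any OCR vendor's actual implementation; the keyword list, function names, and sample documents are all hypothetical.

```python
# Hypothetical keyword file; a real one would hold far more entries.
INVOICE_KEYWORDS = {"invoice", "total", "due date", "bill to"}


def keyword_hits(text, keywords=INVOICE_KEYWORDS):
    """Return the keywords that appear verbatim (case-insensitive) in text."""
    lowered = text.lower()
    return {kw for kw in keywords if kw in lowered}


def looks_like_invoice(text, min_hits=2):
    """Treat a document as an invoice if enough keywords are found."""
    return len(keyword_hits(text)) >= min_hits


# Two made-up documents carrying the same information in different words.
standard = "INVOICE #1001\nBill To: Acme Corp\nTotal: $450.00\nDue Date: 2024-01-31"
reworded = "Statement of charges\nCustomer: Acme Corp\nAmount payable: $450.00\nPay by: 2024-01-31"

print(looks_like_invoice(standard))  # True: four keywords match
print(looks_like_invoice(reworded))  # False: no keyword matches
```

The second document asks for the same payment, but because none of the listed words appear in it, the lookup misses it entirely. Adding “amount payable” and “pay by” to the list fixes this one case and leaves the next variation unhandled, which is the cycle of rule-writing described above.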
Artificial intelligence delivers real understanding
Fully understanding what’s on a document requires the kind of intelligence provided by artificial intelligence technologies such as machine learning, natural language processing and transfer learning. Such technologies are the foundation upon which the Indico Unstructured Data Platform is built. They give the platform the ability to read and comprehend even unstructured documents just as your employees would, only far faster and with greater accuracy.
To learn more, check out the full video below.