In my experience as a data scientist for a large financial firm, you can learn a lot from technology vendors, a point my latest guest on the Unstructured Unlocked podcast proved in spades. Patricia Thaine offered practical advice on how best to train intelligent automation models and how to sell the idea of artificial intelligence projects to the business at large.
Thaine is CEO of Private AI, which addresses an important aspect of AI projects: ensuring privacy. Her company’s software makes it a simple matter to integrate privacy into application pipelines, whether to stay in compliance with regulations or simply as a matter of customer trust. Private AI enables you to find private data even amid complicated, unstructured data and content.
Listen to the full podcast here: Unstructured Unlocked episode 5 with Patricia Thaine
In that sense, Private AI has some similarities with Indico Data, which enables you to incorporate unstructured data into intelligent document processing models. So, we both have to be able to “read” and make sense of unstructured data.
Creating effective automation models
We got to talking about how best to create automation models that deal with unstructured data. It can be tricky because two people looking at the same document may deem different things to be important and worth extracting, depending on their point of view.
Take an invoice, for example. An employee who handles accounts receivable will be looking for different things than one who deals with accounts payable. Getting machine learning engineers or data scientists to understand all the subtleties inherent in automating invoice processing, then, will be quite difficult.
“This is actually something that healthcare AI has figured out because, interestingly, the data is just so foreign to the machine learning engineers,” Thaine said. “So who do they get to label it? Doctors. Nurses. People who have grown up in healthcare.”
Necessity is the mother of invention, as they say. And I would argue the issue goes well beyond healthcare, to pretty much any vertical. That is why it’s important that the employees who actually perform your business processes (the accountants, commercial insurance claims adjusters, underwriters and the like) be the same people who train your automation models.
That, of course, means you need an intelligent document processing platform that’s simple enough for those folks to use, one that does not require data science or IT experience. (Brief ad: Indico Data has you covered there.)
Selling AI in the enterprise
Another area in which Thaine and I share some experience is in selling AI projects in the enterprise. Before joining Indico Data, I was Data Science Team Lead for Chatham Financial. Part of that role involved convincing various groups to adopt AI as part of their data science toolbox.
Today, although Thaine and I are on the other side of the fence as vendors, the challenge is much the same in terms of educating enterprises on the value of AI. She has what I think is a practical, sensible approach that should resonate with any enterprise.
“I don’t see it so much as big companies having to embrace AI as there’s a solution to a problem, and maybe AI solves it better [than some alternative],” she said. “It’s actually the solution you’re selling to them, not AI itself.”
Taking that approach becomes an exercise in effectively framing the question. If you say, “I have a solution that can save 40% to 50% of the time we spend onboarding new commercial real estate customers,” for example, I suspect you’ll get the attention of business stakeholders.
Getting real about accuracy
Of course, as soon as you start throwing percentages around, you may well find yourself in a discussion about accuracy. Specifically, something to the effect of, “Why isn’t this 100% accurate?”
That’s a particular sticking point for Thaine’s company because it deals with data privacy – and no customer wants even a single leak of private data. Data scientists, on the other hand, are generally not comfortable with the idea of guaranteeing 100% accuracy with anything; it’s just not feasible. Think about it: are any of the processes handled solely by your employees 100% accurate? Correct answer: no, they’re not.
Yet some regulatory demands are not far off from that standard. HIPAA, the healthcare privacy regulation, defines a risk threshold of 0.04%, Thaine said. That means HIPAA-compliant privacy tools can err in only four out of 10,000 instances.
“What I like about that guideline, even though it’s certainly flawed, is that it is admitting that no one’s perfect,” she said. “And the thing is a lot of people expect privacy technologies to be perfect.”
In reality, Private AI’s aim is to minimize risk, not eliminate it entirely, because that’s not a reasonable goal. The same goes for intelligent document processing projects. You’re not going to get 100% straight-through processing every time, but you can achieve significant reductions in the amount of hands-on time your employees spend on various processes, which is a worthy, valuable endeavor.
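To make the 0.04% figure mentioned above concrete, here is a minimal Python sketch of the arithmetic. It assumes the threshold is applied as a simple ratio of missed items to total items reviewed. The function names and the sample counts are hypothetical, chosen only for illustration; this is not how Private AI or any regulator actually measures risk.

```python
import math
from fractions import Fraction

# Threshold quoted in the conversation: 0.04%, i.e. 4 in 10,000.
# Stored as an exact fraction to avoid floating-point rounding surprises.
RISK_THRESHOLD = Fraction(4, 10_000)


def miss_rate(missed: int, total: int) -> Fraction:
    """Return the share of items that were missed (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= missed <= total:
        raise ValueError("missed must be between 0 and total")
    return Fraction(missed, total)


def within_threshold(missed: int, total: int,
                     threshold: Fraction = RISK_THRESHOLD) -> bool:
    """True if the observed miss rate does not exceed the threshold."""
    return miss_rate(missed, total) <= threshold


def max_allowed_misses(total: int,
                       threshold: Fraction = RISK_THRESHOLD) -> int:
    """Largest whole number of misses that still stays within the threshold."""
    if total <= 0:
        raise ValueError("total must be positive")
    return math.floor(total * threshold)


if __name__ == "__main__":
    print(within_threshold(4, 10_000))   # 4 misses in 10,000 items
    print(within_threshold(5, 10_000))   # 5 misses in 10,000 items
    print(max_allowed_misses(2_500))     # budget for a smaller batch
```

Framed this way, the threshold becomes an error budget: a batch of 2,500 items leaves room for one miss, and a batch smaller than that leaves room for none. That is a useful way to explain to stakeholders why "not 100%" can still be a very demanding standard.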
While those are some highlights of my conversation with Thaine, we discussed plenty of other topics, including various ways to get value from unstructured data, attributes of innovative companies, why you may want to shy away from “rolling your own” automation software and more. You can read a transcript here or listen to the entire episode (no. 5) on your favorite platform, including: