In this age of digital transformation, companies of all stripes are turning to data analytics to find nuggets of golden insight in their various data sources. A big challenge in that effort is that 80% to 90% of all enterprise data is not in a structured format, which leaves companies searching for unstructured data analysis tools.
It’s a worthy endeavor, as companies can gain significant benefits from effective data and text analytics. In its “Market Guide for Text Analytics,” Gartner outlines numerous use cases for text analytics, including:
- Generating market intelligence in real time to improve information capture and react more quickly to opportunities
- Brand loyalty and sentiment analysis
- Identifying fraud, compliance, and legal risks by scanning through all text-based information and interactions in fraud analytics
- Automating information-centric structured processes to improve efficiency and quality
- Assessing customer insurance claim narratives and deriving feedback for new underwriting models
3 stages to unstructured text analytics
Gartner defines three stages of the text analytics pipeline, the first of which is preparation, which it describes as “Transforming myriad sources and types of content into a form that can be processed for analysis.” Next is the text analytics stage, where a data analysis tool performs analytics. Insights and representation is the final stage, where visualization, natural language generation, and other techniques present the data in a meaningful way.
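As a rough illustration of how the three stages hand off to one another, here is a minimal Python sketch. It is not taken from the Gartner report or from any vendor's product: the function names, the word lists, and the keyword-scoring approach are all assumptions made for the example, and a real text analytics tool would use trained models rather than keyword matching.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; real tools use trained models.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "refund"}


def prepare(raw_docs):
    """Preparation: turn raw text into lowercase token lists."""
    return [re.findall(r"[a-z']+", doc.lower()) for doc in raw_docs]


def analyze(token_lists):
    """Text analytics: label each document with a naive keyword score."""
    labels = []
    for tokens in token_lists:
        score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
        if score > 0:
            labels.append("positive")
        elif score < 0:
            labels.append("negative")
        else:
            labels.append("neutral")
    return labels


def represent(labels):
    """Insights and representation: summarize the labels in one readable line."""
    counts = Counter(labels)
    return ", ".join(f"{name}: {counts[name]}" for name in ("positive", "negative", "neutral"))


docs = [
    "Love the new app. Great support!",
    "Checkout is SLOW and the coupon is broken.",
    "Order arrived on Tuesday.",
]
summary = represent(analyze(prepare(docs)))
print(summary)
```

Keeping each stage as its own function mirrors the pipeline described above: the preparation step can be swapped out for a different content source without touching the analytics or presentation steps.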
The preparation stage is relatively straightforward for structured data: data stored in SQL databases, spreadsheets, and the like can be poured directly into a data analysis tool. Similarly, analytics tools such as Microsoft Power BI have integrations with data-generating applications such as Salesforce, Google Analytics, and Microsoft Dynamics.
How to analyze unstructured data
But many companies want to be able to unlock insights from various unstructured documents, including PDFs, emails, invoices, contracts, and audio and video files. That requires an unstructured data platform that can ingest these multiple formats, pull out relevant data and transform it into a format that’s acceptable to an analytics tool.
As the Gartner report details, the Indico Unstructured Data Platform is a solution for document intake and understanding that enables unstructured data analytics. “The platform, with its point and click interface, enables business SMEs to upload training samples, label and classify samples, orchestrate workflows, and review model output,” Gartner says.
The report also correctly points out that it takes only about 200 documents to train an Indico model to extract data with ultra-high accuracy. That’s because the Indico platform is built on a database of some 500 million labeled data points, enough to enable it to understand almost any document or image. Indico then takes advantage of artificial intelligence technologies, including transfer learning and natural language processing, that allow the base model to be applied to a customer’s specific use cases, including unstructured data analytics as well as text analytics.
The result is that you can pull data from unstructured documents and translate it into a structured format that analytics tools such as Microsoft Power BI, Tableau, Google’s Looker, and more can understand.
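To make "translate it into a structured format" concrete, the sketch below turns free-form invoice text into CSV rows that a BI tool could import. It is only an illustration of the general idea and does not represent how the Indico platform works: the regular expressions, field names, and sample documents are all hypothetical, and a production system would rely on trained extraction models rather than hand-written patterns.

```python
import csv
import io
import re

# Hypothetical field patterns for a simple invoice layout (assumptions for this example).
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*(\S+)", re.IGNORECASE),
    "invoice_date": re.compile(r"Date\s*:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    "total": re.compile(r"Total\s*(?:Due)?\s*:\s*\$?([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(text):
    """Pull each known field out of free text; missing fields become empty strings."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        value = match.group(1) if match else ""
        if name == "total":
            # Drop thousands separators so the value loads as a number downstream.
            value = value.replace(",", "")
        row[name] = value
    return row


def to_csv(rows):
    """Serialize extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATTERNS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


documents = [
    "ACME Corp\nInvoice #: INV-1001\nDate: 2023-04-05\nTotal Due: $1,250.00\nThank you!",
    "Invoice No. 7 from Bolt Ltd. Date: 2023-05-01. Total: 99.95",
]
rows = [extract_fields(doc) for doc in documents]
csv_text = to_csv(rows)
print(csv_text)
```

The resulting CSV has one row per document and one column per field, which is the tabular shape that tools such as Power BI or Tableau expect on import.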
Unlocking years of unstructured text data
That means you can now unlock all the intelligence stored in any document or image in your organization. MetLife, for example, has documents that date back more than 100 years. By using the Indico IPA platform to extract data from the unstructured documents and then applying data analytics, the company expects it can better predict mortality and morbidity, says Sean Nicolello, Vice President of Intelligent Automation at MetLife.
“A small adjustment to our actuarial models can result in billions of dollars of savings and revenue generation over the next 5 to 15 years,” he says.
That’s the kind of cost savings that digital transformation was intended to deliver. To learn more about how Indico can help you achieve unstructured data analytics, schedule a demo.