MSRB chief data officer explains how to extract value and insights from unstructured data

By: Christopher M. Wells, Ph.D.
December 8, 2022 | Center of Excellence, Unstructured Unlocked


Over the course of his three decades in various IT positions with financial services firms, Brian Anthony has witnessed the growth of technology from being a “necessary evil” to opening up opportunities and value propositions. Today, one of those opportunities revolves around using technology to extract value from unstructured data.

After cutting his teeth as a mainframe software programmer some 30 years ago, when technology was little more than a means of solving problems, Anthony found his way to the data side of the house. He is now Chief Data Officer at the Municipal Securities Rulemaking Board (MSRB), the principal regulator of the $4 trillion municipal bond market, charged with protecting both bond issuers and investors.

Listen to the full podcast here: Unstructured Unlocked episode 7 with Brian Anthony

Leveraging data in meaningful ways


Today his job is all about extracting value from data and finding new opportunities buried in mountains of data. In the past, organizations intuitively figured the more data they collected, the better. “The transition I’m starting to see is more about how we leverage the data that we have in more meaningful ways,” Anthony said.

Complicating the issue is the fact that the vast majority of data in any organization is unstructured – meaning it’s not neatly formatted in a database or spreadsheet. The bond market is chock full of such data.

One example is a bond offering document, which is like a prospectus that describes the bond in detail. These documents may run 200 to 300 pages and are required for every municipal bond, to convey the health of the issuer.

“All of that information comes in an unstructured document,” Anthony said, including financial data and any “faults” or other material information that may affect an investor’s decision.

“So, we collect unstructured [documents] and then we immediately disseminate back out unstructured,” he said. “You’ve got tens of thousands of people or organizations who are all doing the same thing, which is trying to derive insights to understand the health of the bond market.” And they’re all working from unstructured data.

Anthony estimates that since January 2020, the MSRB has received some 300,000 PDF documents, representing 28 or 29 million pages of text. The opportunity lies in giving structure to at least some of that data and offering insights into what the data is saying. “Then everyone doesn’t have to be doing the same thing,” he said.



Creating decision-ready data


The idea is to create “decision-ready data,” Anthony said. The problem today is that the investor community and business users are inundated with data. “So how do we present them with meaningful insights that can aid, not necessarily replace, but aid their decision-making process?”

An example of decision-ready data lies in the GPS applications we all use to navigate. Those applications draw on reams of raw data that, on its own, would not be particularly useful in helping you find the quickest route from point A to point B. But map applications make sense of the data and give you an aggregate picture with a limited number of options: one route will take you 15 minutes, another 20, one follows major highways, another back roads, and so on.

“That’s the type of information that we need to make decisions on; not every data point that everyone else has collected on your route,” Anthony said. “Decision ready, or decision useful data, means you’re getting the insights you need at the time you need them to make the decision.”



Giving structure to unstructured data


The next logical question is how to extract insights from these mountains of unstructured data, given that business intelligence and visualization tools like Tableau generally work only with structured data.

It amounts to cleaning up the data so you have a good data set to work with. “Let’s get that cleaned up as much as possible so we don’t spend an extraordinary amount of time every time we do a research project going back through that,” Anthony said.

For MSRB, the first step in that process was to make its data searchable. When you’re talking about millions of pages of text, that is a useful capability to say the least. “We’re proud to have released our EMMA Labs platform, which enables searching,” he said. (EMMA is MSRB’s flagship source for municipal securities data and documents.)
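
To make the idea of searchable text concrete, here is a minimal sketch of an inverted index, a common way to look up pages by the words they contain. It is an illustration only and is not a description of how EMMA Labs works; the function names, document IDs, and sample page text are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each token to the set of (doc_id, page_number) pairs containing it."""
    index = defaultdict(set)
    for (doc_id, page_number), text in pages.items():
        for token in tokenize(text):
            index[token].add((doc_id, page_number))
    return index


def search(index, query):
    """Return the pages that contain every token in the query, sorted."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)


# Hypothetical sample pages, invented for illustration only.
pages = {
    ("doc-a", 1): "General obligation bonds issued by the city",
    ("doc-a", 2): "Pension liabilities and debt service schedule",
    ("doc-b", 1): "Revenue bonds secured by water system revenue",
}
index = build_index(pages)
print(search(index, "bonds revenue"))
```

The index is built once and then reused for every query, which is the property that matters when the collection runs to millions of pages. A real system would also need to extract text from the PDFs first, a step this sketch assumes has already been done.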

Now the goal is to build off that development to create increasingly powerful analytical tools that “empower data users to better identify, visualize and understand market trends,” as Anthony told The Bond Buyer.

Indico Data is on board with that idea. We, too, see the value in unstructured data, and our mission is to help companies unlock that value by giving structure to it. Our intelligent intake solution enables companies to automate the processing of unstructured documents, including financial documents like those the MSRB deals with, insurance submission or claims forms, mortgage documents and more. Learn more about the Indico Data solution here.

For a deeper dive into my conversation with Brian Anthony, read the full transcript of the podcast here.



