
Interview with Automation Center of Excellence expert Cristina Duta

Christopher M. Wells, Ph.D., Indico VP of Research and Development, talks with Cristina Duta, Director of Intelligent Automation at AECOM, in episode 3 of Unstructured Unlocked. Tune in to discover how enterprise data and automation leaders are solving their most complex unstructured data challenges.

Listen to the full podcast here: Unstructured Unlocked episode 3 with Cristina Duta


Christopher Wells: Okay, here we go. Hi, and welcome to this episode of Unstructured Unlocked. I’m your host, Christopher Wells, VP of R&D at Indico Data. And I’m really excited to introduce you to our guest today, Dr. Cristina Duta, Director of Intelligent Automation at AECOM. Cristina, welcome to the show.

Cristina Duta: Hi Chris. Thank you for having me here. It’s a pleasure.

CM: Good. Yeah, it’s a pleasure for me too. As we’re getting started: you’ve had an interesting journey in your career. I’d love to hear about it, and then sort of take us all the way to today and what you’re doing, if you wouldn’t mind starting there.

CD: Sure. So I actually started my career as a technical expert. I started in IT almost 12 years ago, when I took a bachelor’s degree in computer science. Then I moved towards a master’s degree in parallel and distributed computer systems. And then I pursued one of my passions, which was research, and I did a PhD, also focused on computer science. I want to mention that I am a technology geek, which means that while I was doing my PhD, I also started working. I was a software engineer for almost five years and I engaged with different tools. Afterwards I thought it was the moment for a career shift, and that’s when I heard for the first time about RPA. I decided to pursue that path. I saw the potential in this particular technology seven years ago, and I said, I want to get in as fast as I can. So I started working on RPA as a software developer and then progressed towards technical leader, then towards leading different centers for RPA and centers of excellence. And now I’m in the position of director of a global center of excellence here at AECOM.

CM: That’s fantastic. Your career journey is almost as winding as mine. I did a PhD in theoretical physics and somehow ended up in the automation space. So it’s good to be talking to a fellow wanderer.

CD: I think they’re all intertwined, to be honest. You’re always going to find glimpses and pieces of research in the automation space. So you’re never necessarily going to miss doing research like you did for your PhD, for sure.

CM: That’s right. Yeah, absolutely. Well, that’s great. This show is all about talking to COE leaders and automation and analytics leaders. As you think about that journey you’ve been on and landing at AECOM, how does the center of excellence at AECOM function? You know, what are the roles, what are the business units you interface with? Just anything you could tell us about that would be interesting to our viewers.

CD: Sure. I think companies are very different. What is particular for us right now is that the global center of excellence sits under GBS, the Global Business Services, as an overall operating unit. As part of this, we interface with almost, I would say, every department across AECOM, and we are automating every back office process that is available for us to automate. In terms of the way that we are actually built, the center of excellence was officially formed almost a year and a half ago. There were a lot of attempts previously, and the foundation was already set by using different tools to automate business processes, but there wasn’t necessarily a formal context or a formal structure around it. That was defined one and a half years ago. I would say that right now we have a full-fledged COE, in the sense that there is a team of around 10 people and we have all the dedicated roles. We have our architects, we have business analysts, we have a delivery lead, we have developers, and we have a dedicated support team. And we have what I like to call shared business analysts, because in general the knowledge sits within the business and not necessarily within the COE. So I prefer to use the people that actually have the knowledge whenever we’re trying to automate processes.

CM: Interesting. So the shared business analysts are sort of double agents; they’re working for both sides, right? Exactly.

CD: Yes. And they’re our main drivers, to be honest, they put in the good word for us.

CM: Yeah, no, that’s good. Yeah. I assume they eventually become the champions within the business for the project, right?

CD: Yes.

CM: Yeah. Yeah. I’ve talked to a lot of folks that have that construct within their COE, but they haven’t given it a formal name. And I really like the shared business analyst framing of it. That’s cool. That’s really good.

CD: I think it’s about empowering people to see an opportunity for them to grow in the automation space, because from a business perspective everybody’s afraid of automation and how it can impact their daily jobs. So when you shift it towards, okay, you can bring more value by looking at automation, and you show how they can grow, it’s actually something that is driving more and more opportunities for us.

CM: Yeah, that’s excellent. So thinking about that team and those roles: you’ve been in this vertical for a while, but with this group for about 18 months. What’s the typical timeline for a project, from someone saying, hey, this might be an opportunity, to it being in production and people relying on the automation?

CD: I think the answer is that it depends, correct? It depends a lot on what department we are talking about, because each department comes with its own regulations, compliance and procedures. It depends on the applications we interact with, because you might have limitations in terms of the IT applications that processes interact with, so you also need to consider that. And it also depends on, let’s say, the priority at the strategic business level: is that something that’s going to produce an impact on you as a person, a group of people or the entire department? We usually consider this whenever we’re starting on the path of doing an assessment and prioritizing an opportunity. Then, let’s say we’ve completed the assessment: the process is feasible for automation, and it has a specific complexity. We define three complexities. We have small complexity, which is simple for us. We have medium complexity, we have medium towards high, and we have high complexity. Okay. So we try to put this in place, and, I’m not sure how many COEs are doing that, but we are actually following the agile methodology.

CM: Okay. 

CD: Which means that we are trying to show incremental value to the business. We’ve realized that it’s extremely important to show how the robot works progressively, so that they gain more knowledge and are able to interact with it in the end, instead of waiting until the end and saying, hey, this is the robot, test it out and see how it works. So for us, it works, and we see a lot of reduction in the way we are delivering the robots if we follow this type of agile methodology whenever we’re delivering. So, coming back to your question, because I took a detour.

CM: Yeah, no, that’s good. 

CD: To answer that: for a simple process, from the moment you assess the opportunity to the moment you actually put it in production, it is overall one month. Then it goes to between one month and eight weeks, so two months, whenever we’re talking about medium ones and medium towards high. And then we’re talking about probably three months whenever we’re talking about very complex automations that interact with, let’s say, more than 10 applications, or cross-department applications and cross-department users that we need to engage to get the requirements and be able to deliver that particular automation. So it comes down to that, to be honest. And that’s what we have currently in place.

CM: Amazing. I love the detail. To your previous comment, lots of COEs talk about being agile, but they’re really just waterfall in agile clothes. That’s been my experience. It’s hard to break those habits, especially since a lot of COE leads don’t have the software background that you have, right? You’ve actually done agile, and you know what it looks like and what it doesn’t, so that’s a big deal. And it doesn’t surprise me, given those timelines, that you’re moving so quickly, because you are taking agile seriously.

CD: We are. And again, I think it’s a methodology that people should embrace as we go down the path of automation, because it streamlines not only the delivery but also the way that the team is working. It’s very helpful for putting structure around the team, the tasks and everything that’s being delivered. You have a clear vision of what has been done and what is still pending. And everybody can have visibility into how the COE is working. So it’s a very easy way to have a measurement of the way you are actually doing delivery.

CM: Yeah, absolutely. Good. So let’s talk a little bit about the interaction between complexity and payoff, right? So, you know, there must be some efficient frontier where something becomes so complex that it’s not worth what it could pay off the impact to the business. And there must be, you know, some really easy to automate things that just don’t have enough payoff to be worth doing. So how do you, how do you think about the interaction between those two vertices?

CD: It comes down to how you have it set up in your COE. To give a particular example, right now we are looking at overall business objectives. We try to align whatever we’re trying to automate with the overall business objectives to see if it makes sense or not. Whenever we’re talking about quick wins, because we have quick wins, right? Like, I don’t know, a person wants to automate the emails they send out on a daily basis, or maybe something even simpler. With those types of opportunities, we do encourage people to come to us and tell us about them. But we are keeping them for what we call citizen developers. We like people to be able to work on real use cases, and that’s where we give them the pool of opportunities.

These are simple examples, real-life examples that we use within the company. So they get a sense of how to actually develop and build automations using those small-complexity use cases. And the rest we prioritize together with the leadership. We go through a SteerCo, where we validate, let’s say, the opportunities that may have a bigger ROI and align them with the strategic objectives, what we are trying to achieve this year, in the next two years and in the next three years, to see exactly where each one fits. Then we start delivering those opportunities. So mainly this is what we do when it comes to, I would say, the benefits that we get from automations.

CM: Okay. That’s interesting. So the quick wins they’re quick wins because they don’t affect a lot of different folks and therefore let the folks that they affect take care of them, I guess is sort of what you’re saying, right?

CD: Yes.

CM: Fascinating. And then the stuff that has a major impact, and therefore will need ongoing maintenance and governance, goes into the full machinery, and all of that’s taken care of, right?

CD: Yes.

CM: Yeah. That’s a good breakdown. I haven’t talked to a lot of COE leads that think that way. It’s usually either everything’s under control or it’s the total wild west and nothing’s under control. And I like the breakdown that you’ve got going on. That’s cool.

CD: I would say that people are definitely looking towards federating their COE, because there are too many opportunities that are going to remain untapped unless you try this approach. And if you start with strong governance, put that in place whenever you’re deciding to move towards a federated model, and embed that in the culture of the company, it’s going to be very easy to migrate into this space and start having citizen developers actually doing automations.

CM: Yeah. No, that’s great. That’s a great way to say it, embedding it in your culture. Okay. So let’s zoom out a little bit and talk about COEs more broadly for automation and analytics. From what you’ve described, I would place what you’re doing all the way on the far end of maturity in terms of how COEs think about things and process and all of that. I noticed that you were at Intelligent Automation Week about a month ago. So I’m wondering what you’re seeing lately in terms of how you think about maturity and how other companies are thinking about maturity. If you’re, you know, three standard deviations away from the mean in terms of maturity, where would you put the rest of this vertical?

CD: If we were to refer, let’s say, purely to automation maturity?

CM: Yeah.

CD: Okay. If we are referring to automation maturity, I think there has been a great evolution over the last two years. To be honest, everybody is growing towards that space where governance, procedures and delivery methodologies are in place and they have a clear vision of what they want to achieve over, let’s say, one-year, two-year or three-year terms. And that’s very important. One thing that is very important, and I realize it’s happening more and more, is the idea of diversity. Everybody is realizing that there is not one solution that fits every problem they’re going to have, and that there is not one technology that’s going to be only for their particular industry. So people are diversifying in terms of the tools they’re going to use for the challenges they’re trying to resolve. Whereas, let’s say, five years back we had three main RPA vendors and everybody was using them.

Right now, you have such a wide pool of tools, based on what you are trying to achieve, that it is actually much, much easier to pick them up, put them inside your COE, see how they integrate, and then build a strategy around them and see exactly how you’re going to scale them up or how they are going to evolve over time. This is something that is very visible right now. And I think it’s very good, because it gives us as COE leaders a lot of flexibility in what we are trying to do. Otherwise we would’ve been limited to a few options, and that’s not necessarily the best way to move forward with these technologies.

CM: Yeah, absolutely. Yeah. I’ve heard the same thing, that in the last few years the stack, you know, the automation stack or fabric, as some people will call it, has gotten more diverse. You don’t need to name the individual technologies, unless that’ll get you a break on your license, in which case go ahead. What are the categories of technologies that you’re now seeing as indispensable for getting your job done and for your team to be productive?

CD: If I were to start, think about the development life cycle. Yeah. Let’s start with opportunity assessment. Previously, this was a highly manual task that people were doing. We needed dedicated BAs for that. Right now that space is covered by very good technologies. We have task mining, which can help you drill down to the task level and get the insight that you need from systems that are not necessarily enterprise systems, because that is where task mining brings value. You can connect to systems that don’t necessarily give you particular logs. And then it comes to process mining, adding the process mining to be able to understand how your company is working. Where are the flaws in your processes, I would say? How can you improve your processes, reengineer your processes, and then apply automation? So it starts from the idea of process improvement and process discovery.

So you can identify where your areas are and how you can apply automation. And then I would go into the space of, okay, what type of technologies do you want to use for automation? Because RPA is not necessarily always the answer. You can think about system integrations. Yeah. We have API integrations, and that’s for sure something that you can leverage. You have RPA tools that you can definitely leverage. And you also have, I would say, the possibility to build custom applications that are managed centrally. That also gives you the possibility to automate within a particular flow, like building, let’s say, an engine that’s going to apply some decision rules for you, or building a particular flow and workflow inside of those tools. So it depends what you want to put in the middle in terms of processing. And I would say for overall automation, you can also do task automation. So it depends on what challenge you’re trying to resolve.

CM: Yes.

CD: And then I would say that in the end, it all comes down to performance measurement. This is something that was lacking in previous years. Right now people are looking, I would say, at a lot of data analytics tools that are going to help them understand: what is the performance, what are my actuals versus my estimates? You have a clear vision of what you have invested and what there is to continue to invest in that particular COE or in that particular capability. Yeah. And I would go even further than that, to the idea that you do need management for your solutions, to be able to do, I would say, incident and change management. All of these come together. And think about expanding even more, because so far we’re talking about inputs that come in as structured, right? We’re talking about data that comes from systems, data that comes from, let’s say, different digital formats. But you also have the possibility to use the new vendors in the intelligent document processing space to get data from unstructured documents, which also feeds into those automation solutions that you want to build, either with RPA, with integration or with any other tool that sits in the middle.

So I would say that it depends on which stage of the life cycle you are looking at and what exactly you would need to show the value of those particular stages.

CM: Great. All right, I want to race off into unstructured, but I’m going to try to be disciplined and recap all of that, because that was great information. So you talked about task mining, you talked about process mining, you talked about the actual automation stack, which may be traditional programming, but also the RPA tools themselves. Then you talked about visibility into the process and the ROI, and I’m assuming you’re talking about some dashboarding in Tableau or something like that, right? Yes. Okay. And then there’s tech to make the management oversight easy. Does that capture everything that you just said?

CD: Yes.

CM: Good. All right, good. That second cup of coffee was working this morning. 

No, that’s an excellent overview. And I think one of the things that you’re highlighting: one of our marketing folks was asking me, why is now the time to get into unstructured data? For me, there are two answers. One’s the pessimistic answer: there’s a lot of risk in the world, period, and unstructured data, because you can’t easily see into it like you could a database, potentially warehouses a lot of risk. But then there’s the optimistic answer, which is that the tools have gotten really good just in the last couple of years. And I think the more mature COEs, like the one that you’re running and driving, have realized that. So to switch gears with that optimistic idea in mind, what is unstructured data to you? Where does it come from? What does it do? And what’s hard about it?

CD: I would say that right now I’m considering as unstructured data everything that comes, I would say, not organized, right? Everything that doesn’t have a predefined schema or model is something that I would call unstructured. And I would say that is the data that cannot be... I like to call it unlabeled, unfindable and untapped. Right. Because it’s the data that sits everywhere, but you don’t have it organized or you don’t have it secured in any way. So it’s the data that sits out there, but can actually bring a lot of value.

CM: Yeah. Yeah. I like that. My usual definition is anything that doesn’t fit easily in a spreadsheet. And I think that’s good. I think our two definitions are pretty well...

CD: I will expand it with database, not necessarily a spreadsheet, because it holds more.

CM: That’s right. Yeah, absolutely. Although people do misuse Excel, don’t they? Right.

CD: I know

CM: Anyway, that’s enough of that. So unstructured doesn’t have a predefined schema. I liked untapped, right? It’s bytes on a disk somewhere and you don’t know what’s in there. So practically speaking, that ends up being documents of various types: Excel, for example, PowerPoint, PDFs, images, video, audio, and then, you know, any sort of mutant hybrids of all of those things. So...

CD: Think about social media right now, because it’s expanding more and more: social media posts, webpage content that you can actually use. And think about the marketing and sales teams that can benefit from those. Think about whenever we have meetings right now: we are doing transcripts, right? Every transcript that comes out of those meetings is something you can gain insights from. So everything that is around that space, even if you’re thinking about, I would say, articles across different subjects where you would need to get the information on one specific subject.

CM: Yeah. Yeah, absolutely. Social media is a great one, generating terabytes of, yeah, maybe mostly silly data every day, but still, it’s unstructured. So we talked a little bit about what it is. Why has it taken so long for technologies to become useful in this space? Another way to ask that question is: what’s hard about unstructured data?

CD: I would say it’s hard to organize it. That’s the main challenge: organizing the data and actually tracking the data. It’s very difficult. And that’s mainly why people consider that information either useless or of very bad quality. Sometimes they even consider it a liability, because if you have missing information and you are not connecting the dots, you’re reaching the space of, I would say, risks in the security and privacy space. So it’s very difficult, because companies need to, I would say, ask themselves three questions. How do you store that data? Because we’re talking about extremely large volumes of unstructured data. How do you integrate it? Because you need to put it into your enterprise systems. What do you do with it after you actually get the data? You need to see how you can fit it into your systems. And last but not least, how do you secure it? Because right now it’s not secure. How do you put measurements around that? Yeah.

CM: Yeah. Great. I mean, you’re basically naming all of the things that are not true about structured data, right? Like, you still have to secure a database, but that’s a lot easier than securing a blob store full of various... I mean, it could be anything in there, right? So it’s a really different problem. Tying this back to the COE, what kinds of tools are you using in the unstructured space? You talked about the categories. Where does unstructured tooling fit in that stack?

CD: I would say that for us, it’s just another tool within our technology stack. Yeah. And we do use them. It depends on the use case itself. For instance, we are using a combination. We are using OCR for semi-structured documents if we need that, but it’s not something that we actually promote, because it’s already past its moment within the technology space. We use out-of-the-box IDP tools to get value from unstructured documents, but we also separately use, for instance, NLP and machine learning whenever we need to customize or tweak based on the use cases that we have.

CM: Interesting. Okay. So you’re talking about the full spectrum of structure from like I assume some of that out of the box stuff works on like form style documents, right? Like very structured finance.

CD: I would say so. Most of the tools are very well defined in the finance spectrum and on finance documents.

CM: Okay. Right on. And then you’ve got generic OCR, which is good for sort of key-value style things that exist in the document. And then in terms of the custom machine learning models, do you have a platform? Are you building your own stuff? Hugging Face? How have you approached it? Given your background, I could imagine it being just about anything.

CD: I would say that we do have and build some models internally, because they fit only AECOM’s needs, and we use Python to build those models. Yeah. It’s not necessarily the best approach when you try to scale something very fast. So I would suggest that if there’s something out there on the market, try to use that first and then go for a customized approach. For one of our particular use cases, we weren’t able to find exactly what we wanted, and that’s why we started building it. But I would encourage people to use the ones that exist on the market. I usually say this because people have dedicated research and resources to building these kinds of tools. For me to start from scratch, there’s a new total cost of ownership to build that. It’s about finding the right skills and the right resources to be able to build that particular algorithm. So it’s better to go for an out-of-the-box approach if it’s possible and, if not, try to customize.

CM: Yeah. Oh, that’s fantastic advice. I try not to use this podcast as a sales pitch for Indico, and it’s not sexy, but one of the most valuable things about these platforms that you’re talking about is that they handle all that operational stuff, if they’re good. Anybody can build a machine learning model nowadays. You can watch YouTube videos that tell you exactly what to click and what code to type, but managing them out in the wild in the enterprise is a big job. So make sure you’re talking to your vendors about how they solve that problem for you. It’s really important. Great advice. All right. So you talked about some success with unstructured. It sounded like the finance space is sort of the hotspot right now. Where have you seen, and it doesn’t necessarily have to be you or what’s going on at AECOM, but where have you seen COEs struggle with unstructured? What are the hard use cases out there?

CD: If I were to pinpoint, from my previous experience, a few use cases where I saw a lot of struggles, I would start with data anonymization. This is extremely important. When the new regulations related to GDPR kicked in, for Europe especially, that is when everybody was looking at how they could anonymize all their data across all of their documents. Think about contracts, think about lease agreements, anything that has PII data. That’s something that you needed to solve quickly on actual data and historical data, because everything needed to be cleaned up and, I would say, scrambled and removed from those documents. So that was one area where I saw people struggling. And it’s something very important that is considered nowadays whenever we’re talking about data. One other space where I would say people are struggling is understanding the potential of unstructured data from large agreements like contracts. How do you leverage contracts to gain insights, and how can you build and streamline your contract review process, for instance? How can you do that?

And that’s a space where, I would say, there are still only a few use cases and a few companies that were able to do something, but I think there is much more potential in how you can identify contract types, how you can decide if the user is a buyer or a seller, and how you can flag the risks. Because imagine reading hundreds of pages of a contract: for sure there are problems, because you as a user get tired sometimes. So you may miss something that should actually be flagged as a risk.

CM: Yeah, no, that’s great. There are a few things I want to drill into there. In terms of the PII/PHI types of anonymization and spoofing and all of that, when that was a struggle, and I think it’s actually still probably a struggle in most cases: is it the technology? Is it the requirements, you know, the regulations being so stark? All of it? What makes that hard in your mind?

CD: I would say it’s both. First of all, you have a large volume of data that you need to cleanse instantly, which is extremely hard for you as a company to do, and you had specific timelines and deadlines to perform those tasks. The second part was related to technology: at that moment, not all of the tools were able to find PII data. Right now we are good in terms of finding, let’s say, addresses, names, birth dates, or, let’s say, pictures of national IDs or passports. So that’s at a very good point right now. But if you were to look two years back, when GDPR kicked in with all of these requirements, yeah, that’s when everybody was struggling with: how do we actually remove all that data from millions of documents?
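
Editor’s sketch: to make the anonymization task concrete, here is a minimal Python illustration of pattern-based redaction. It is only a sketch under stated assumptions, not the approach used at AECOM or by any vendor mentioned in the conversation. The `PII_PATTERNS` table and the `redact` function are hypothetical names introduced for this example, and the three regular expressions cover only simple email, date and phone formats. Finding names, addresses or ID photos, as described above, would need NER models or an IDP product rather than regexes.

```python
import re

# Hypothetical, deliberately narrow patterns for illustration only.
# Real PII detection needs much broader coverage than this.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Contact jane.doe@example.com, born 12/03/1985, phone +1 555 010 9999."
print(redact(sample))
```

Running the same function over historical archives as well as new documents is what turns this into the "millions of documents" problem described above; the patterns are the easy part.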

CM: Yeah. Yeah. And the requirement is that you remove it all, but of course these are all statistical technologies, right? So "at what confidence level?" is the thing that your data scientist is going to ask, and the regulator’s going to say a hundred percent, and those are not compatible ways of thinking, right? So that’s a struggle. On the contracts bit, I’m really interested in your thoughts on this. Intelligent document processing and intelligent process automation were sort of born out of, or at least born adjacent to, RPA and robotics, right? For better or worse, that’s the way these things have come about. RPA is very much about: we’ve got something, it’s a very clear process most of the time, and we can straight-through process things. If we have the right scope and the right tools and we document well, then we can straight-through process. I’ve seen a lot of folks talk about that with documents, that the ultimate goal is straight-through processing. And I’ve also seen others saying the goal is return on investment, reducing time on task and doing the simple things for the human. Where do you fall on this? Is it a future of robots? Is it a future of robot-human hybrids? Something else I’m missing? How are you thinking about that?

CD: I would say it’s about digital workforce. So it’s a hybrid workforce. I don’t see it as being necessarily the, the technology that’s going to have a hundred percent straight through throughput rate because that’s not going to happen to be honest. Yeah. You are still having variations and you’re still having elements that come as new every time and you cannot guarantee that. So that’s why it’s a human robot collaboration. That’s how I see it.

CM: Yeah, no, I see it the same way. And maybe I led you there partly in the answer, but, you know, you know more about this than I do. So okay. So given that it’s a, I think you called it a digital workforce, right? A hybrid workforce. The RPA tools and vendors don’t talk that way for the most part. Maybe they’re starting to a little bit, but given that framing of what you think is coming and where we are now, what’s missing in the tech stack? Like, what has to change in the tech stack to get us where you think we should be?

CD: Are you referring from a technology perspective or what exactly?

CM: I guess I’m interested in both. Like, are there technologies or interfaces that are missing, and then are there skill sets, people, process aspects of this space that are missing right now that would make it easier to get there?

CD: I would say that it’s a work in progress on the three domains: people, process, and technology. Yeah. And I will start with people first of all, because people need to understand and be aware of what it means to interact with an intelligent document processing technology. That’s still something very new for them. And the idea of errors and training the models is not something very familiar. Yeah. So people need to be educated and users need to be trained in that space. So that comes down to process as well. How do you define that process of the new way of working? How are people going, for instance, to do contracts moving ahead with such a solution in place? Where do they fit in within the entire process? Where do they do their validations? I think it’s also about the technology. The technology is definitely going to improve as we go through these types of use cases, in the sense that it’s going to be more user friendly and very easy for people to understand how they can interact with it. How do you do the validations? How do you do the changes? How can you customize these particular types of technologies to fit your needs or your particular processes? So I think it’s an improvement in that space. I’m not able to pinpoint right now a particular gap, but I would say that’s the direction: having something very user friendly that easily embeds into the day-to-day processes of people.

CM: Yeah, yeah, absolutely. Yeah. You talked a little bit about citizen developers. I’m not sure, in most places anyway, it sounds like you’ve gotten there at AECOM, but in most places, I think that dream of citizen developers for RPA hasn’t really been realized to the extent that we thought it might be. But I think on the unstructured side, there’s a real chance that that dream could be realized, in the sense that, you know, developing a machine learning algorithm is really just labeling data, right? And telling the machine what it is that the machine should care about. I actually talk about machine teaching a lot more than I talk about machine learning because, you know, that’s really what we want to do and make easy, the teaching. The learning’s already figured out, right? We have algorithms for that. Interesting. So let’s see. Let’s try to come in here for a landing. I said you were at Intelligent Automation Week a little while ago. We talked about some of the recent history of the COE and the automation space. You were sort of an early mover in that space. So I credit you for your foresight and I want to take advantage of that foresight. So given where we are right now, like, what are the next, it’s a fast-moving space, so let’s limit it to two years. What are the big things that are coming in the next two years for unstructured and for automation more generically?

CD: We need an extra hour for this. 

CM: I can make that happen. Don’t tempt me.

CD: There are so many things happening right now. I would start by mentioning that semantic AI is catching up very fast. It’s definitely being incorporated in the space of unstructured data. Yep. That is more and more embedded into the idea of how do we do automation and how do we apply it and where does it fit. I would then refer to the idea that unstructured documents and unstructured data also fit in the space of process mining and task mining. Because when I say unstructured data, it’s like you’re getting into emails, into documents, and into those spaces. So it’s still around there, and that’s where a big focus is going to be: on process mining, and on how process mining is going to be part of what people right now like to call, as a buzzword, the digital twin. And how do you do process mining within the digital twin strategy? That’s one other thing that’s coming right now. Related to the idea of semantic AI, hand in hand it goes with conversational AI, right? Yeah. How do you take the data from everything that’s happening right now with IVR, with chatbots, with every interaction that we’ve completely migrated from human agents towards the technology, and how do you actually gain insights from that data?

CM: Yeah.

CD: And I’m trying to think that overall, if there’s something right now at automation level that is growing more and more, I would say that the main focus is shifting almost everything towards AI and having decision-based AI, data-driven decisions in this particular space. I would say that there’s very high visibility right now on all of the tools that have AI embedded in them.

CM: Yeah. Okay. Now, people like to sprinkle AI into things to make it, you know, it’s almost like adding MSG to a dish, right? And it makes the flavor better. Depends. Yeah. So in terms of automation specifically, what are you most hopeful about with AI being added in? Is it that, you know, sort of Bayesian algorithms helping bots to make better decisions, or is there something else that you’re excited about?

CD: I would say that right now I’m envisioning what people like to call digital workers, like building complete roles and functions that would be able to work together with a person that has the same role and function. So for instance, if I’m an accountant, I wanna have a digital worker that is an accountant next to me, that’s able to do the same type of roles and responsibilities that I do.

CM: Yeah. And back to your point about conversational AI, you know, you can get them on Teams or Slack, right? When you need to interact with them.

CD: I think there’s real knowledge sharing as well. Even if it’s a technology or not, you still have to have it there.

CM: Yeah. Especially if you have the semantic technologies underneath of it, right? And it can make those connections for you. Interesting. Now, again, speaking of topics that could take another hour, Meta just released their giant chatbot, right? About two months ago. And one of the things that I found most interesting about that release was the level to which they disclaimed it. Like, basically, this thing’s probably biased. We don’t trust it. You definitely shouldn’t trust it. Like, user beware. And that raises the larger point that robotics is nice because we can pinpoint, like, this is what it’s doing, this is why it did what it did. How do we continue to get the enterprise comfortable with AI and machine learning solutions, which don’t have that characteristic, generally speaking?

CD: This is an ongoing challenge that everybody’s facing. How are we looking at AI from that perspective? I don’t know if there’s a particular solution right now for that problem. I would mention that it starts with the idea of building trust. AI is still new for a lot of companies and for a lot of people. And that’s why there is a general reluctance about what it can do, and about up to what particular point in your processes or within your company you should use it before it actually starts doing much more harm than good. So that’s something that starts from a trust level and then grows from there, in the sense of, okay, how do you put governance around it? How do you think about applications? But it’s not something that has a particular response right now. It’s an ongoing challenge. And it’s a subject that every time I’m talking to leaders in the industry, they bring up. How do we do it? And how do we promote AI as a technology that’s worth being implemented and embedded within companies without looking into that space of, okay, is it good or is it bad?

CM: Yeah, right. Yeah, eventually everyone sort of, not everyone that’s too strong. A lot of people eventually just punt and say, is it accurate enough? And of course that, you know, accuracy, doesn’t really tell you everything you need to know. But it’s an easier question to answer. The point about trust is really important. I think part of the reason it’s hard to trust the AI is often, and I’ve seen this in my work consulting for folks on, you know, getting an ML solution up and running as part of your automation stack. Part of the reason it’s hard to trust the AI is because we don’t have a really good understanding of what the humans are doing. Right. And so I’ve, I’ve been in situations where, you know, you have each person label 50 documents, that’s a part of their normal flow, right?

They’re doing data entry in these documents and, you know, the model should be great with 300 labeled documents, but it’s terrible. And it turns out it’s terrible because everyone labeled the documents differently. So in fact, what you thought was one process was actually six different processes and a latent variable, which is who got what email at what time and therefore processed it differently. So, like, you get those results, you think everyone’s doing the same thing. And then it’s like, hey, this model’s terrible. And it’s, well, it’s terrible because your process is terrible. So it learned exactly what it was supposed to. But, you know, having that honesty about it: humans are good at ambiguity and we’re good at sort of dealing with those things, and machines aren’t, right? And we have to own up to that, I think, if we’re going to keep working with machines.

CD: And it’s the idea of making sure that people are open to failure. I keep on saying that nothing is perfect. You’re not going to have an ML model that’s going to work the first time you put it there. You need to continue improving it as you go through the process. So nobody should expect, like, a 90% straight-through rate as soon as you put something in place. That’s not happening. You need to make sure that everything you’ve captured in terms of requirements and ways of working is actually aligned.

CM: Yeah.

CD: Failure is important. Extremely important.

CM: It is. Yeah, it is. I’ll come back to one of the points you made earlier, which is you need a vendor that supports like it has to be easy to get a model up and running and, and repeatable, right. Because it’s really, there are, there are very easy ways to fail on ML projects, which take nine months to a year before you actually realize that it was a failure. And so you need to be able to iterate quickly. Your ML projects need to be agile as well.

CD: I would agree on that one.

CM: Yeah. Don’t be afraid of failure. All right. You seem like a fairly optimistic person. What are you worried about with automation and robotics in the next few years? You talked about the exciting things too, but what’s scary to you?

CD: I wouldn’t say I’m necessarily scared of anything. I believe that everything that’s still new and out there to come is just another challenge we need to tackle. Okay. So I wouldn’t see it as something scary. I would say that what comes is going to come, and we are going to tackle it as soon as it arrives, and we’ll figure out then what we can do about it. I’m not sure there’s something scary about automation and the future of automation. I think it’s actually very exciting, to be honest.

CM: Yeah. Okay. I think I was right. You’re an optimist. I have some things I’m scared of, but I’m a little bit more, maybe. I am worried that the enterprise will continue to think about intelligent automation the same way it thinks about RPA, and a lot of intelligent automation projects will die on the table because, again, that hyper focus on accuracy, it’s entirely the wrong thing to be focused on. You should be focused on how your human and digital workers are making each other faster together, right? Rather than, you know, what’s my F1 score. So that one worries me. Worries me a lot. In fact, that’s one of the reasons we’re doing this podcast, right? It’s to tell people that, like you’ve been saying, there are better ways to think about these things. And you need to actually be agile. You need to be willing to fail fast, even on really hard things like ML-driven projects. So anyway, long story short, not being able to change the dialogue for the better broadly is something that worries me. I think a lot of potential could go untapped.

CD: I think it’s a work in progress, because you’re going to see more and more leaders with that mindset that whenever we’re talking about intelligent automation, we’re talking about more than cost and efficiency, and they’re not seeing it just as a tool that’s being added to their stack. They’re actually looking at it strategically. What do I want to do in the next three years? And how can intelligent automation actually support me overall, even if we’re talking about customer experience, right? So we want to include the customer experience space, even if we’re talking about improving employee experience and giving them the opportunity to grow, or even if we’re talking, let’s say, about streamlining our processes. I think right now that’s one of the best things that’s happening. We see more and more leaders that go into the space of what intelligent automation can do beyond the initial wins like cost.

CM: Yeah. Yeah. They’re starting to climb up the tree higher, right? The low-hanging fruit’s sort of been picked already.

CD: But it is work in progress. We’re not there yet.

CM: It is work in progress.

CD: Yeah. But we are going to get there. So everybody’s going to reach a particular digital maturity within the time.

CM: Yeah. Yeah. You think that’s, you think that’s in that, that same two year window that we were talking about?


CD: I would say yes, because the technology evolves very fast, and to stay relevant on the market you need to have this competitive advantage with automation.

CM: Okay. Yeah, I agree. All right. We’re coming down to the wire here. Sometimes I like to ask this question, what should I have asked you that I forgot to ask you? What else, what else do we need to hear?

CD: I’m trying to think if there’s something that you haven’t asked me. Yeah, no. I would say that whenever anybody is onboarding into the intelligent automation journey, they need to ask themselves why they want to do that, how they want to do that, and identify the people that are able to drive that for them. So it comes down to the idea of a people culture. And I think this is really important, because without people you don’t have success. And you mentioned that in the example of the machine learning driven project, right? So it comes down to raising awareness for your people and building your people into that space, so you can actually drive your automation journey. And I would say, this is it. So think about the people whenever you are onboarding new technology, whenever you’re changing processes, whenever you are looking at strategic objectives. How would this impact your people, either your colleagues, your leadership, or your day-to-day employees?

CM: Yeah. Great advice. We often spend so much time talking about the code and the robots and the data, and really all of them sort of serve at the pleasure and for the benefit of the people who are involved in the process, right? It’s good not to forget that. Well, this has been Unstructured Unlocked. I’m Chris Wells and I have had just a fantastic time talking to Dr. Cristina Duta, Director of Intelligent Automation at AECOM. Cristina, everyone out there needs to follow you on LinkedIn and Twitter, wherever you are. You’ve given us some great tidbits today. I’m really thankful that you took out the time.

CD: Thank you for having me here. It’s been a pleasure.

CM: Yeah. Wonderful. We’ll have to, we’ll have to do this again. It sounds like there’s more to talk about.

CD: Oh, sure.

CM: All right. Take care. Bye.

Check out the full Unstructured Unlocked podcast on your favorite platform.
