

Is it possible to learn without data (or at least one or two samples)—like humans—in AI?

May 26, 2018 | Ask Slater


Humans do not learn without data, but as you’ve said, they can often learn from very small amounts of it. This is an extremely active area of research, typically referred to as “one-shot learning” (learning from one example) or even “zero-shot learning” (learning from zero examples and just the name of the label you’re looking to apply). [Helpful paper: Zero-Shot Learning: A Comprehensive Evaluation of the Good, the Bad and the Ugly]
In general, these branches of investigation are part of “transfer learning”, which focuses on leveraging information learned in previous tasks to solve new tasks with less data. It’s a bit of a misnomer to say that you’re learning without data. What you’re really doing is taking things that you’ve learned previously and applying them to new problems, and learning the original problem still requires a massive amount of data.

Obviously, no machine learning model is going to have access to as much data as a human has (decades of training time, constantly processing all of the data coming in from every sensory organ), but we can train models on relatively large collections of images, video, and text and get behavior that comes close to this kind of learning from very few examples.

Now comes the kicker, though. How do you know if you’ve done a good job when you learn on one data point? You don’t. People often assume that they’ll just look at the results and see if they “feel” right. The real issue is that humans generally operate without oversight and their errors are ignored. When we test humans on their consistency in a given task, we often find that different people disagree very frequently, even on tasks as simple as sentiment analysis (is this text positive or negative?).

The issue is that without some data (not millions of examples, but at least dozens or hundreds) you have no idea whether the model works. You might think that you do, but you have literally no idea unless you create a dataset, have several different humans label it, check how consistently they agree with one another, and then compare that consistency rate to the accuracy of the machine.
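As a rough illustration of that evaluation loop, here is a minimal Python sketch. Everything in it is an assumption made for the example: the labels are invented, the function names are hypothetical, and using the human majority vote as the reference is just one reasonable choice among several.

```python
from collections import Counter
from itertools import combinations


def pairwise_agreement(annotations):
    """Average, over every pair of annotators, of the fraction of items
    on which the two annotators chose the same label."""
    rates = [
        sum(a == b for a, b in zip(first, second)) / len(first)
        for first, second in combinations(annotations, 2)
    ]
    return sum(rates) / len(rates)


def majority_labels(annotations):
    """Most common label per item across annotators (used as the reference)."""
    return [Counter(item).most_common(1)[0][0] for item in zip(*annotations)]


def accuracy(predictions, reference):
    """Fraction of items where the prediction matches the reference label."""
    return sum(p == r for p, r in zip(predictions, reference)) / len(reference)


# Invented sentiment labels: three hypothetical annotators, ten items each.
human_labels = [
    ["pos", "neg", "pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"],
    ["pos", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "pos", "neg"],
    ["pos", "pos", "pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg"],
]
# Invented model predictions for the same ten items.
model_labels = ["pos", "neg", "pos", "pos", "neg", "neg", "pos", "pos", "pos", "neg"]

reference = majority_labels(human_labels)
human_rate = pairwise_agreement(human_labels)
model_rate = accuracy(model_labels, reference)

print(f"human pairwise agreement: {human_rate:.2f}")
print(f"model accuracy vs. majority vote: {model_rate:.2f}")
```

Ten items is far too few to conclude anything; the point is only that both numbers come from the same labeled set, so the human consistency rate gives the model's accuracy something to be compared against.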

The other problem is that learning from a single example is not useful. Why? Because labeling a couple hundred examples is extremely cheap. Depending on the problem, you can generally create a dataset of a few hundred examples in a couple of hours. It takes much longer than this to create and evaluate a machine learning model (or even to read the paper that I linked). If you’re not willing to label a couple hundred examples, then your problem is not important.
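To put rough numbers on what a few hundred labels buy you, the sketch below computes a 95% Wilson score interval for a measured accuracy. This is a standard statistical tool rather than anything specific to this post; the function name and the example counts are assumptions chosen for illustration.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical evaluation results: (correct predictions, labeled examples).
for correct, total in [(1, 1), (27, 30), (270, 300)]:
    low, high = wilson_interval(correct, total)
    print(f"{correct}/{total} correct -> plausible accuracy {low:.2f} to {high:.2f}")
```

With a single correct example the interval spans most of the range from 0 to 1, so the measurement says almost nothing. At 300 labeled examples and 90% measured accuracy, the interval is only a few points wide on either side.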

So can you learn on zero or one example? Certainly. The transfer learning literature has no shortage of examples where people have done this. The problem is that in a real-world scenario, it would be tremendously irresponsible to ship something without some data to validate that the model works. It’s fascinating from a research perspective, and people are making more and more effective models with less data every day, but this is generally a solution to an imagined problem.

The question should be about learning in a low-data environment. Learning from zero or one data point generally just means you don’t want to define the problem that you’re solving: you can’t solve it, and you have no idea whether people are capable of evaluating it effectively.

View original question on Quora >

