Why Is It Called Deep Learning?

Adapted from: Deep Learning, the book co-authored by Ian Goodfellow, Yoshua Bengio and Aaron Courville, and “Representation Learning: A Review and New Perspectives” by Yoshua Bengio, Aaron Courville and Pascal Vincent.

The power of software has been its ability to codify tasks that can be clearly defined and listed.  In its early days, AI was fed problems that were intellectually hard for humans but relatively easy for computers: tasks that could be formally described by mathematical rules.  The real challenge turned out to be problems that humans solve intuitively (automatically) but that are hard for computers to “get”, such as recognizing images or spoken words with context and continuity.

Since this intuitive knowledge cannot be formally specified, the computer must instead learn from experience, fed to it as data.  It must build its own understanding of these experiences as a hierarchy of concepts, in which complicated concepts are built on simpler ones.  The degree of abstraction therefore increases as you move up toward the more complicated concepts.  If this hierarchy is drawn as a graph showing how the concepts are built on top of each other, the graph is deep, with many layers, and hence this approach to AI is called Deep Learning.
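To make the idea of “depth” concrete, here is a minimal Python sketch of a hierarchy of concepts written as a chain of functions.  The three layers (`edges`, `bumps`, `has_peak`) and the toy signal are hypothetical and hand-written purely for illustration; in an actual deep learning model each layer would be learned from data rather than coded by hand.

```python
from functools import reduce

def compose(*layers):
    """Chain layers so the output of one becomes the input of the next."""
    return lambda x: reduce(lambda value, layer: layer(value), layers, x)

# Hypothetical hand-written layers; a real model would learn these from data.
def edges(pixels):
    # Simple concept: change in intensity between neighbouring pixels.
    return [b - a for a, b in zip(pixels, pixels[1:])]

def bumps(edge_values):
    # More abstract concept: a rise immediately followed by a fall.
    return [a > 0 and b < 0 for a, b in zip(edge_values, edge_values[1:])]

def has_peak(bump_flags):
    # Most abstract concept: does the signal contain a peak anywhere?
    return any(bump_flags)

layers = [edges, bumps, has_peak]
model = compose(*layers)
depth = len(layers)  # number of composed layers in this toy hierarchy

signal = [0, 1, 3, 1, 0]  # a one-dimensional "image" with a single peak
result = model(signal)
print(f"depth={depth}, peak found: {result}")
```

Each function only understands the output of the one before it, so the final answer is built out of progressively more abstract intermediate concepts.  Adding more layers to the list makes the graph of concepts deeper.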

 

A Venn diagram showing how deep learning is a kind of representation learning, which is in turn a kind of machine learning, which is used for many but not all approaches to AI. Each section of the Venn diagram includes an example of an AI technology. Source: Deep Learning book


Thus, Deep Learning models “involve a greater amount of composition of either learned functions or learned concepts than traditional machine learning does”.  What these models learn, however, depends heavily on the choice of data representation (or features) to which they are applied.  That is why data representation, and the feature engineering that produces it, is so important.

“Such feature engineering is important but labor-intensive and highlights the weakness of current learning algorithms: their inability to extract and organize the discriminative information from the data.  Feature engineering is a way to take advantage of human ingenuity and prior knowledge to compensate for that weakness.  In order to expand the scope and ease of applicability of machine learning, it would be highly desirable to make learning algorithms less dependent on feature engineering, so that novel applications could be constructed faster, and more importantly, to make progress towards Artificial Intelligence (AI). An AI must fundamentally understand the world around us, and we argue that this can only be achieved if it can learn to identify and disentangle the underlying explanatory factors hidden in the observed milieu of low-level sensory data.”
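As a toy illustration of why the choice of representation matters, the following Python sketch builds two classes of points lying on concentric circles and checks how well a single threshold on one feature can separate them.  The data, the radii, and the helper names (`make_points`, `best_threshold_accuracy`) are assumptions made up for this example; it is a sketch of the idea, not a method taken from either source.

```python
import math

def make_points(radius, n=12):
    """Return n evenly spaced points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

inner = make_points(1.0)  # class 0: small circle
outer = make_points(3.0)  # class 1: large circle
points = inner + outer
labels = [0] * len(inner) + [1] * len(outer)

def best_threshold_accuracy(values, labels):
    """Best accuracy of any rule 'value > t' (or its flip), trying each value as t."""
    best = 0.0
    for t in values:
        preds = [1 if v > t else 0 for v in values]
        correct = sum(p == y for p, y in zip(preds, labels))
        # Flipping the rule turns every mistake into a hit, so take the better one.
        accuracy = max(correct, len(labels) - correct) / len(labels)
        best = max(best, accuracy)
    return best

# Raw representation: the x coordinate alone.
x_only = [x for x, _ in points]
# Hand-engineered representation: distance from the origin.
radius = [math.hypot(x, y) for x, y in points]

acc_x = best_threshold_accuracy(x_only, labels)
acc_r = best_threshold_accuracy(radius, labels)
print(f"threshold on x: {acc_x:.2f}, threshold on radius: {acc_r:.2f}")
```

No threshold on the raw x coordinate can separate the two circles perfectly, while a threshold on the radius can.  Here the better feature was supplied by a human; the aim described in the quote above is for the learning algorithm to discover such representations on its own.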

Stay tuned for more to come …
