Author: Matthew Simonson
Artificial intelligence is a topic of intense media hype. Machine learning, deep learning, and AI come up in countless articles, with deep learning being heralded as an incredible breakthrough in AI. In this blog post I will discuss deep learning, what deep learning has achieved so far, the significance of these contributions, how deep learning extends the capabilities of “shallow” machine learning approaches, and why deep learning is the “right” approach for companies to be investing in and for researchers to flock to.
What is machine learning?
Before we discuss deep learning, I think it’s important to have some understanding of what machine learning is and how deep learning relates to it. Machine learning arises from this question: can a computer learn how to perform some task without being given step-by-step instructions for completing it? Rather than programmers crafting data-processing rules by hand, could a computer automatically learn these rules by looking at data? In classical programming, humans input rules (a program) and data to be processed according to these rules. If the program works, we get answers as output. With machine learning, humans input example data as well as the answers expected from that data, and the output is a set of rules that maps the data to those answers. These rules can then be applied to new data to produce original answers. A machine-learning system is trained rather than explicitly programmed. A machine-learning model transforms its input data into meaningful output using a process that is “learned” from exposure to known examples of inputs and outputs.
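To make the contrast concrete, here is a minimal sketch in Python. It is an illustration only: the temperature-conversion task, the function names, and the choice of a least-squares line fit as the "learning" step are my own assumptions, picked because they fit in a few lines. The first function is a rule written by hand; the second recovers an equivalent rule from example inputs and example answers.

```python
# Classical programming: a human writes the rule.
def fahrenheit_rule(celsius):
    return celsius * 9 / 5 + 32

# Machine learning: we supply example inputs and expected answers,
# and a fitting procedure recovers the rule (here, a slope and an intercept).
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training examples: Celsius readings and the matching Fahrenheit answers.
examples_c = [-40.0, 0.0, 37.0, 100.0]
examples_f = [-40.0, 32.0, 98.6, 212.0]
slope, intercept = fit_line(examples_c, examples_f)

def learned_rule(celsius):
    """Apply the learned rule to new data."""
    return slope * celsius + intercept

print(fahrenheit_rule(25.0), learned_rule(25.0))
```

Nobody typed the conversion formula into `learned_rule`; it came out of the examples. Real machine-learning models learn far richer rules, but the flow of data and answers in, rules out, is the same.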
The central problem in machine learning is to meaningfully transform data. Machine-learning models are all about finding appropriate representations for their input data—transformations of the data that make it more amenable to the task at hand. All machine-learning algorithms consist of automatically finding such transformations that turn data into more useful representations for a given task.
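As a small illustration of why representation matters, consider the sketch below. The data points are made up for this example, and the helper `best_threshold_accuracy` is a hypothetical function of my own. One class sits near the origin and the other on a ring around it. No single cutoff on the raw x coordinate separates them, but after transforming each point into its distance from the origin, a single cutoff does.

```python
import math

# Hypothetical toy data: class 0 sits near the origin, class 1 on a ring around it.
points = [(0.1, 0.2), (-0.3, 0.1), (0.2, -0.2), (0.0, 0.3),
          (2.0, 0.0), (0.0, -2.1), (-1.9, 0.4), (1.4, 1.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

def best_threshold_accuracy(values, labels):
    """Best accuracy achievable by a single cutoff on one feature."""
    best = 0.0
    for t in values:
        for sign in (1, -1):  # try "above the cutoff" and "below the cutoff"
            preds = [1 if sign * v > sign * t else 0 for v in values]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            best = max(best, acc)
    return best

# Representation 1: the raw x coordinate.
xs = [x for x, _ in points]
# Representation 2: distance from the origin.
radii = [math.hypot(x, y) for x, y in points]

acc_raw_x = best_threshold_accuracy(xs, labels)
acc_radius = best_threshold_accuracy(radii, labels)
print(acc_raw_x, acc_radius)
```

Here the useful transformation was chosen by hand. The point of machine learning is to search for transformations like this automatically.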
Deep learning vs. machine learning
Deep learning is a specific subfield of machine learning: a particular approach to learning representations from data. With deep learning, the problem of meaningfully transforming data into useful representations is broken down step by step, by learning successive layers of increasingly meaningful representations. The “deep” in deep learning isn’t a reference to any kind of deeper understanding achieved by the approach; rather, it stands for this idea of successive layers of representations. Modern deep learning often involves tens or even hundreds of successive layers of representation that are all learned automatically from exposure to training data. Two characteristics define how deep learning learns from data. The first is the incremental, layer-by-layer way in which increasingly complex representations are developed in each subsequent layer. The second is that these intermediate representations are learned jointly, with every layer optimized with respect to the model’s overall performance.
By contrast, other approaches to machine learning tend to focus on learning only one or two layers of representations of the data; hence they’re sometimes called shallow learning.
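The idea of layers being optimized jointly can be sketched with a tiny two-layer model. Everything here is an assumption made for illustration: the toy data, the starting weights, the learning rate of 0.1, and the use of plain gradient descent. The thing to notice is that a single loss, measured at the output, drives the updates to both layers at once.

```python
import math

# Hypothetical toy data: a few (input, target) pairs.
data = [(-1.0, -0.8), (-0.5, -0.5), (0.0, 0.0), (0.5, 0.5), (1.0, 0.8)]

def predict(params, x):
    w1, b1, w2, b2 = params
    hidden = math.tanh(w1 * x + b1)   # layer 1: intermediate representation
    return w2 * hidden + b2, hidden   # layer 2: final output

def loss(params):
    """Mean squared error of the whole model over the toy data."""
    return sum((predict(params, x)[0] - t) ** 2 for x, t in data) / len(data)

def gradients(params):
    """Backpropagate the one shared loss through both layers."""
    w1, b1, w2, b2 = params
    g_w1 = g_b1 = g_w2 = g_b2 = 0.0
    for x, t in data:
        y, hidden = predict(params, x)
        d_y = 2.0 * (y - t) / len(data)          # gradient of the loss w.r.t. the output
        g_w2 += d_y * hidden
        g_b2 += d_y
        d_pre = d_y * w2 * (1.0 - hidden ** 2)   # chain rule back into layer 1
        g_w1 += d_pre * x
        g_b1 += d_pre
    return g_w1, g_b1, g_w2, g_b2

params = (0.5, 0.0, 0.5, 0.0)  # arbitrary starting weights
initial_loss = loss(params)
for _ in range(500):
    grads = gradients(params)
    # Every layer's parameters move together, driven by the same loss.
    params = tuple(p - 0.1 * g for p, g in zip(params, grads))
final_loss = loss(params)
print(initial_loss, final_loss)
```

In practice a framework computes these gradients for you, and models have millions of parameters rather than four, but the principle of one loss shaping all layers together is the same.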
Deep dive into deep learning
So, what is deep learning exactly? Deep learning is a mathematical framework for learning representations from data. In deep learning, layered representations are learned using models called neural networks, structured in literal layers stacked one after the other. You can think of a deep network as a multistage information-distillation operation, where information goes through successive filters and comes out increasingly purified.
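The "layers stacked one after the other" picture can be sketched in a few lines of Python. The weights below are hand-picked and hypothetical (a real network would learn them), and the helper names `dense` and `forward` are my own. Each layer takes the previous layer's output as its input, so the data passes through a sequence of intermediate representations on its way to the final result.

```python
def dense(inputs, weights, biases):
    """One fully connected layer followed by a ReLU activation."""
    outputs = []
    for row, b in zip(weights, biases):
        total = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(max(0.0, total))
    return outputs

# Hypothetical, hand-picked weights; a real network would learn these from data.
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0]),  # 2 inputs -> 3 units
    ([[1.0, 0.0, 1.0], [0.0, 2.0, 0.0]], [0.0, -1.0]),          # 3 units -> 2 units
    ([[1.0, 1.0]], [0.0]),                                      # 2 units -> 1 output
]

def forward(inputs, layers):
    """Pass the input through each layer in order, keeping every intermediate representation."""
    representations = [inputs]
    for weights, biases in layers:
        representations.append(dense(representations[-1], weights, biases))
    return representations

reps = forward([2.0, 1.0], layers)
for depth, rep in enumerate(reps):
    print(depth, rep)
```

Adding depth means appending more entries to `layers`; the `forward` loop does not change.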
What has deep learning achieved so far? In particular, it has produced breakthroughs in the following areas of machine learning, among others not listed here:
- Near-human-level speech recognition
- Ability to answer natural-language questions
- Improved machine translation
- Improved text-to-speech conversion
- Near-human-level image classification
- Near-human-level handwriting transcription
- Visual art generation (artistic style extraction and transfer)
- Progress toward near-human-level autonomous driving
- Digital assistants such as Siri, Google Now, and Amazon Alexa
- Improved advertisement targeting, as used by Google, Baidu, Bing, and others
- Estimation of the value of possible direct marketing actions, using deep reinforcement learning
- Improved search results on the Web
- Content-based music recommendation, using deep learning to extract meaningful features
- Automated drug discovery by predicting novel candidate biomolecules for disease targets
- Several applications in bioinformatics
- Superhuman game playing ability
What makes deep learning different? One reason deep learning is so popular is that it offers better performance on many tasks. Deep learning also makes problem-solving much easier, because it largely automates feature engineering, which used to be the most crucial step in a machine-learning workflow. Feature engineering is the time-consuming process of manually transforming the initial input data into a form that a given model can use. With deep learning, you learn all features in one pass rather than having to engineer them yourself. Have missing values in your data? That is usually not a problem for deep learning: once the missing entries are encoded with a placeholder value, the network can learn to ignore them. Have a huge number of features with complex interrelationships? That is also not a problem for deep learning: given enough training data, it can learn to weight these features properly and untangle their interrelationships. This has greatly simplified machine-learning workflows, often replacing sophisticated multistage pipelines with a single, simple, end-to-end deep-learning model.
Is there anything special about deep neural networks that makes them the “right” approach? Absolutely. Some specific reasons include:
- Simplicity: Deep learning reduces and automates much of the feature engineering process, replacing complex, brittle, engineering-heavy pipelines with simple, end-to-end trainable models.
- Scalability: Deep learning is easily parallelized and can take advantage of high-powered graphics processing units (GPUs). Also, deep-learning models are trained by iterating over small batches of data, allowing them to be trained on datasets of arbitrary size.
- Versatility and reusability: Unlike many prior machine-learning approaches, deep-learning models can be trained on additional data without restarting from scratch, an important property for very large production models. Furthermore, trained deep-learning models are easily repurposed and reused. For example, it’s possible to take a deep-learning model trained for image classification and drop it into a video-processing pipeline. This allows customers to reinvest previous work into increasingly complex and powerful models.
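Two of the points above, training over small batches and continuing to train an existing model on additional data, can be sketched as follows. This is an illustration under stated assumptions, not a production recipe: the model is a single linear unit rather than a deep network, `synthetic_stream` is a hypothetical data source that generates examples one at a time, and the batch size and learning rate are arbitrary.

```python
import random

def batches(stream, batch_size):
    """Group any iterable of (x, y) examples into small lists, without loading it all."""
    batch = []
    for example in stream:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

def train(params, stream, batch_size=8, lr=0.05):
    """Update (w, b) for the model y = w * x + b, one mini-batch at a time."""
    w, b = params
    for batch in batches(stream, batch_size):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
        grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def synthetic_stream(n, seed):
    """Hypothetical data source: yields n examples of y = 3x + 1, one at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        yield x, 3.0 * x + 1.0

def mse(params, examples):
    """Mean squared error of the model on a list of examples."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in examples) / len(examples)

held_out = list(synthetic_stream(200, seed=99))

# First round of training: only one small batch is held in memory at a time.
params_v1 = train((0.0, 0.0), synthetic_stream(2000, seed=0))

# Later, more data arrives: resume from the existing weights instead of starting over.
params_v2 = train(params_v1, synthetic_stream(2000, seed=1))

print(mse(params_v1, held_out), mse(params_v2, held_out))
```

Because `train` only ever sees one batch at a time, the size of the stream is limited by patience rather than memory, and because it accepts starting weights, a second call picks up where the first one stopped.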
In this blog post I have reviewed deep learning, what deep learning has achieved so far, the significance of these contributions, how deep learning extends the capabilities of “shallow” machine learning approaches, and when deep learning is the right approach for finding meaningful insights in your data that are hidden beneath the surface, beyond the simple observed trends. Contact Nihilent to find out how you can add deep learning to your company.