Table of Contents
Python is a general-purpose high level programming language that is widely used in data science and for producing deep learning algorithms.
This article introduces Python and its libraries like Numpy, Scipy, Pandas, Matplotlib; frameworks like Theano, TensorFlow, Keras. It also explains how the different libraries and frameworks can be applied to solve complex real world problems.
Deep structured learning or hierarchical learning or deep learning in short is part of the family of machine learning methods which are themselves a subset of the broader field of Artificial Intelligence.
Deep learning is a class of machine learning algorithms that use several layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output from the previous layer as input.
Deep neural networks, deep belief networks and recurrent neural networks have been applied to fields such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, and bioinformatics where they produced results comparable to and in some cases better than human experts have.
Deep Learning Algorithms and Networks −
- are based on the unsupervised learning of multiple levels of features or representations of the data. Higher-level features are derived from lower level features to form a hierarchical representation.
- use some form of gradient descent for training.
Here , we will learn about the environment set up for Python Deep Learning. We have to install the following software for making deep learning algorithms.
- Python 2.7+
- Scipy with Numpy
It is strongly recommend that Python, NumPy, SciPy, and Matplotlib are installed through the Anaconda distribution. It comes with all of those packages.
We need to ensure that the different types of software are installed properly.
Let us go to our command line program and type in the following command −
$ python Python 3.6.3 |Anaconda custom (32-bit)| (default, Oct 13 2017, 14:21:34) [GCC 7.2.0] on linux
Next, we can import the required libraries and print their versions −
import numpy print numpy.__version__
Installation of Theano, TensorFlow and Keras
Before we begin with the installation of the packages − Theano, TensorFlow and Keras, we need to confirm if the pip is installed. The package management system in Anaconda is called the pip.
To confirm the installation of pip, type the following in the command line −
Once the installation of pip is confirmed, we can install TensorFlow and Keras by executing the following command −
$pip install theano $pip install tensorflow $pip install keras
Confirm the installation of Theano by executing the following line of code −
$python –c “import theano: print (theano.__version__)”
Confirm the installation of Tensorflow by executing the following line of code −
$python –c “import tensorflow: print tensorflow.__version__”
Confirm the installation of Keras by executing the following line of code −
$python –c “import keras: print keras.__version__” Using TensorFlow backend
Artificial Intelligence (AI) is any code, algorithm or technique that enables a computer to mimic human cognitive behaviour or intelligence. Machine Learning (ML) is a subset of AI that uses statistical methods to enable machines to learn and improve with experience. Deep Learning is a subset of Machine Learning, which makes the computation of multi-layer neural networks feasible. Machine Learning is seen as shallow learning while Deep Learning is seen as hierarchical learning with abstraction.
Relating Deep Learning and Traditional Machine Learning
One of the major challenges encountered in traditional machine learning models is a process called feature extraction. The programmer needs to be specific and tell the computer the features to be looked out for. These features will help in making decisions.
Entering raw data into the algorithm rarely works, so feature extraction is a critical part of the traditional machine learning workflow.
This places a huge responsibility on the programmer, and the algorithm’s efficiency relies heavily on how inventive the programmer is. For complex problems such as object recognition or handwriting recognition, this is a huge issue.
Deep learning, with the ability to learn multiple layers of representation, is one of the few methods that has help us with automatic feature extraction. The lower layers can be assumed to be performing automatic feature extraction, requiring little or no guidance from the programmer.
The Artificial Neural Network, or just neural network for short, is not a new idea. It has been around for about 80 years. The act of sending data straight through a neural network is called a feed forward neural network. Our data goes from input, to the layers, in order, then to the output. When we go backwards and begin adjusting weights to minimize loss/cost, this is called back propagation.
Deep learning models/algorithms
Let us now learn about the different deep learning models/ algorithms.
Some of the popular models within deep learning are as follows −
- Convolutional neural networks
- Recurrent neural networks
- Deep belief networks
- Generative adversarial networks
- Auto-encoders and so on
The inputs and outputs are represented as vectors or tensors. For example, a neural network may have the inputs where individual pixel RGB values in an image are represented as vectors.
The layers of neurons that lie between the input layer and the output layer are called hidden layers. This is where most of the work happens when the neural net tries to solve problems. Taking a closer look at the hidden layers can reveal a lot about the features the network has learned to extract from the data.
Different architectures of neural networks are formed by choosing which neurons to connect to the other neurons in the next layer.
We will now learn how to train a neural network. We will also learn back propagation algorithm and backward pass in Python Deep Learning.
We have to find the optimal values of the weights of a neural network to get the desired output. To train a neural network, we use the iterative gradient descent method. We start initially with random initialization of the weights. After random initialization, we make predictions on some subset of the data with forward-propagation process, compute the corresponding cost function C, and update each weight w by an amount proportional to dC/dw, i.e., the derivative of the cost functions w.r.t. the weight. The proportionality constant is known as the learning rate.
The gradients can be calculated efficiently using the back-propagation algorithm. The key observation of backward propagation or backward prop is that because of the chain rule of differentiation, the gradient at each neuron in the neural network can be calculated using the gradient at the neurons, it has outgoing edges to. Hence, we calculate the gradients backwards, i.e., first calculate the gradients of the output layer, then the top-most hidden layer, followed by the preceding hidden layer, and so on, ending at the input layer.
The back-propagation algorithm is implemented mostly using the idea of a computational graph, where each neuron is expanded to many nodes in the computational graph and performs a simple mathematical operation like addition, multiplication. The computational graph does not have any weights on the edges; all weights are assigned to the nodes, so the weights become their own nodes. The backward propagation algorithm is then run on the computational graph. Once the calculation is complete, only the gradients of the weight nodes are required for update. The rest of the gradients can be discarded.
Gradient Descent Optimization Technique
One commonly used optimization function that adjusts weights according to the error they caused is called the “gradient descent.”
Gradient is another name for slope, and slope, on an x-y graph, represents how two variables are related to each other: the rise over the run, the change in distance over the change in time, etc. In this case, the slope is the ratio between the network’s error and a single weight; i.e., how does the error change as the weight is varied.
Each weight is just one factor in a deep network that involves many transforms; the signal of the weight passes through activations and sums over several layers, so we use the chain rule of calculus to work back through the network activations and outputs.This leads us to the weight in question, and its relationship to overall error.
Given two variables, error and weight, are mediated by a third variable, activation, through which the weight is passed. We can calculate how a change in weight affects a change in error by first calculating how a change in activation affects a change in Error, and how a change in weight affects a change in activation.
The basic idea in deep learning is nothing more than that: adjusting a model’s weights in response to the error it produces, until you cannot reduce the error any more.
The deep net trains slowly if the gradient value is small and fast if the value is high. Any inaccuracies in training leads to inaccurate outputs. The process of training the nets from the output back to the input is called back propagation or back prop. We know that forward propagation starts with the input and works forward. Back prop does the reverse/opposite calculating the gradient from right to left.
Each time we calculate a gradient, we use all the previous gradients up to that point.
Regularization methods such as drop out, early stopping, data augmentation, transfer learning are applied during training to combat overfitting. Drop out regularization randomly omits units from the hidden layers during training which helps in avoiding rare dependencies. DNNs take into consideration several training parameters such as the size, i.e., the number of layers and the number of units per layer, the learning rate and initial weights.
Deep learning has produced good results for a few applications such as computer vision, language translation, image captioning, audio transcription, molecular biology, speech recognition, natural language processing, self-driving cars, brain tumour detection, real-time speech translation, music composition, automatic game playing and so on.
Deep learning is the next big leap after machine learning with a more advanced implementation. Currently, it is heading towards becoming an industry standard bringing a strong promise of being a game changer when dealing with raw unstructured data.
Deep learning is currently one of the best solution providers fora wide range of real-world problems. Developers are building AI programs that, instead of using previously given rules, learn from examples to solve complicated tasks. With deep learning being used by many data scientists, deeper neural networks are delivering results that are ever more accurate.
The idea is to develop deep neural networks by increasing the number of training layers for each network; machine learns more about the data until it is as accurate as possible. Developers can use deep learning techniques to implement complex machine learning tasks, and train AI networks to have high levels of perceptual recognition.
Deep learning finds its popularity in Computer vision. Here one of the tasks achieved is image classification where given input images are classified as cat, dog, etc. or as a class or label that best describe the image. We as humans learn how to do this task very early in our lives and have these skills of quickly recognizing patterns, generalizing from prior knowledge, and adapting to different image environments.