There are two phases in a CNN:
Training phase: The training phase is the more compute-intensive phase, where we feed large amounts of data through the network, use the feedback (back-propagation) to adjust the weights, and produce the trained filter set that is then used in the inference phase.
Inference phase: Inference is when the trained network is used in a real-time application to classify new input.
Six layers are involved in a CNN:
- Convolution layer
- ReLU
- Pooling
- Flattening
- Fully Connected Network
- SoftMax
As in the image, a patch of the image passes through the different layers and finally reaches the last layer, which reports three categories:
The probability of being a leaf is: 95%
The probability of being grass is: 3%
The probability of being bushes is: 2%
Layers Explanation:
1. Convolution layer (3D): This is the fundamental building block of a Convolutional Neural Network. It is applied to the input image and creates multiple, diverse activation maps for that image; in effect, each filter extracts a different feature, such as edges for one filter, colours for another, and so on. Hence we have the following input parameters:
Input: W(idth) x H(eight) x D(epth) of the input image.
Kernels or Filters: These can be of any spatial size, such as 3x3 or 5x5, but their depth must match the depth of the input.
Example: 5x5xD
Number of Kernels or Filters: This is the number of activation maps that will be generated for the input image; assume 'N'.
After applying this set of kernels to the input image, the output generated by the convolution has (with appropriate padding) the same spatial dimensions as the input, and its depth equals the number of kernels:
Output: W(idth) x H(eight) x N(umber of kernels)
The spatial dimensions also depend on a factor called stride (the stride defines how many pixels the kernel moves at each step; a stride greater than 1 acts as downsampling).
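The parameters above combine into the standard output-size formula, sketched below as a small helper. The function name and the 32x32x3 example input are illustrative, not from the original text.

```python
def conv_output_size(w, h, kernel, stride=1, padding=0):
    """Spatial size of a convolution output, for a square kernel.

    Formula: out = (in - kernel + 2 * padding) // stride + 1
    """
    out_w = (w - kernel + 2 * padding) // stride + 1
    out_h = (h - kernel + 2 * padding) // stride + 1
    return out_w, out_h

# A 32x32 input convolved with 5x5 kernels, stride 1 and padding 2
# ("same" padding) keeps the spatial size; the depth of the output
# becomes N, the number of kernels.
print(conv_output_size(32, 32, kernel=5, stride=1, padding=2))  # (32, 32)
# A stride of 2 halves the spatial size, acting as downsampling:
print(conv_output_size(32, 32, kernel=5, stride=2, padding=2))  # (16, 16)
```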
2. ReLU Layer: This is the most commonly used activation in a Convolutional Neural Network. It is a non-linear activation layer, and its main role is to bring non-linearity into the network: everything less than zero is set to zero, and everything greater than zero is passed through unchanged.
Other activations, such as Leaky ReLU, Sigmoid, or TanH, can be used instead of ReLU. ReLU is cheap to compute and helps the network train effectively, which makes it more adaptable to real-world data.
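The ReLU rule described above is a one-liner; here is a minimal sketch over a plain Python list (the input values are made up for illustration):

```python
def relu(values):
    # Everything below zero becomes zero; positive values pass through.
    return [max(0.0, v) for v in values]

print(relu([-2.0, -0.5, 0.0, 1.5, 3.0]))  # [0.0, 0.0, 0.0, 1.5, 3.0]
```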
3. Pooling Layer: This is again a non-linear operation, used to down-sample the input data. It takes its input from the previous layer; for instance, the output of the previous (convolution) layer is W(idth) x H(eight) x N(umber of kernels), and the pooled output size depends on the size of the window used for pooling.
This alternation between the convolution layer and the pooling layer is repeated several times as the image is processed.
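Max pooling is the most common choice of pooling operation; the sketch below applies a 2x2 window with stride 2 to a small 2-D feature map (the numbers are a made-up example), halving each spatial dimension:

```python
def max_pool_2x2(grid):
    """2x2 max pooling with stride 2 over a 2-D list of numbers."""
    pooled = []
    for i in range(0, len(grid) - 1, 2):
        row = []
        for j in range(0, len(grid[0]) - 1, 2):
            # Keep only the largest value in each 2x2 window.
            row.append(max(grid[i][j], grid[i][j + 1],
                           grid[i + 1][j], grid[i + 1][j + 1]))
        pooled.append(row)
    return pooled

feature_map = [[1, 3, 2, 4],
               [5, 6, 1, 0],
               [7, 2, 9, 8],
               [0, 1, 3, 5]]
print(max_pool_2x2(feature_map))  # [[6, 4], [7, 9]]
```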
4. Flatten Layer: At this layer, the multidimensional data array output by the previous layer is converted into a 1-dimensional array, creating a single long feature vector, which is connected to the final fully-connected layer of the Convolutional Neural Network.
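Flattening is just an unrolling of the volume into one long list; a minimal sketch over nested Python lists, with a made-up 2x2x2 volume:

```python
def flatten(volume):
    """Turn a nested W x H x D list into one long 1-D feature list."""
    return [v for plane in volume for row in plane for v in row]

# A 2x2x2 volume becomes a vector of 8 values:
vol = [[[1, 2], [3, 4]],
       [[5, 6], [7, 8]]]
print(flatten(vol))  # [1, 2, 3, 4, 5, 6, 7, 8]
```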
5. Fully Connected Network Layer: The purpose of this layer is to identify the category. Every neuron at the input is connected to every output through a coefficient (weight), so this layer carries a heavy load of data, including the coefficients. After processing all the input data it produces a set of scores that is passed to the SoftMax layer.
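Each fully-connected output is a weighted sum of every input plus a bias; the coefficients and input values below are hypothetical, chosen only to show the mechanics:

```python
def fully_connected(inputs, weights, biases):
    """Each output neuron = weighted sum of all inputs + a bias."""
    return [sum(w * x for w, x in zip(ws, inputs)) + b
            for ws, b in zip(weights, biases)]

# 3 flattened features mapped to 2 class scores:
scores = fully_connected([0.5, 0.1, 0.9],
                         [[0.2, 0.4, 0.6],   # weights for class 0
                          [0.1, 0.3, 0.5]],  # weights for class 1
                         [0.0, 0.1])         # biases
print(scores)
```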
6. SoftMax layer: In this layer, the scores from the fully-connected layer are converted into probabilities so that the most relevant output can be picked.
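The softmax conversion exponentiates each score and normalises so the results sum to 1; a minimal sketch, with made-up scores chosen to roughly reproduce the leaf/grass/bushes split from the example above:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]  # for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

# Three class scores (leaf, grass, bushes) -> probabilities:
probs = softmax([4.0, 0.5, 0.1])
print([round(p, 3) for p in probs])  # approximately [0.952, 0.029, 0.019]
```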
Real-life applications of CNNs:
- Optical character and image recognition in scanners
- Object detection on the road for self-driving vehicles
- Face detection and recognition by security cameras
- Image analysis and reporting in the medical field