Based on the nature of the algorithms used, we can classify machine learning models into three categories: Statistical Machine Learning, Artificial Neural Networks, and Deep Learning. Artificial Neural Networks (ANNs), also known simply as Neural Networks, are algorithms inspired by the human brain. Although the comparison between ANNs and human brains is superficial, the analogy helps us understand ANNs in their simplest form. With this article, we begin our journey toward knowing this branch of Machine Learning in greater detail.
Key takeaways from this blog
Here, we will be discussing these topics in greater detail:
- An analogy from the human brain and defining ANNs in simple English language.
- Explanations of the constituent terms in the definition of ANNs.
- Key insights from the schematic diagram of NN.
- What are the Advantages and Disadvantages of ANNs?
- Practical use-cases of ANNs in real life.
The human brain learns over time from many experiences and circumstances. Neurons are the basic building blocks of the human brain. Billions of these neurons, along with their interconnections, store learned experiences. When our sensory organs, such as the eyes, skin, and ears, perceive a familiar situation, the brain responds the way it has learned to. Take the example of learning to drive: our brain experiences many situations on the road and learns the appropriate responses. After learning, based on signals from our eyes, nose, and ears, the brain sends signals to various body parts to keep the vehicle moving on the road.
Encountering any new event modifies the learning stored in the brain's neurons. Schematically, mapping input signals to response signals is what the human brain learns. Artificial neural networks mimic exactly this property: they use neurons and their interconnections to store the learnings gained from multiple data samples.
Defining ANNs in Layman's Terms
If we define the term ANN in plain English, we can say:
Neural Networks are user-defined nested mathematical functions with user-induced variables that can be modified on a systematic trial-and-error basis to arrive at the closest mathematical relationship between a given pair of inputs and outputs.
Let's examine the terms used in the above definition:
- User-defined: Machine Learning developers define an arbitrary mathematical function that includes some adjustable parameters. If users can define this function, what is the need for ML? But that's the basis for starting; otherwise, the machine would keep fitting infinitely many possible functions to the given dataset. So users define the function to limit the machine's search.
- Nested mathematical functions: In the diagram above, there is just one neuron in the hidden layer, but we can have multiple neurons in multiple hidden layers between the input and output layers. Schematically, the input of the first hidden layer comes from the input layer, and the output of the first hidden layer becomes the input of the second hidden layer. Finally, the output of the second hidden layer is the actual output we want from the NN. Because each function's output feeds into the next, we say these are nested mathematical functions.
- Mathematical Functions: Machines represent the relationship between input and output pairs as mathematical functions. Each mathematical function has two main components: 1. Aggregation function: the inputs are combined with the weights and biases (the user-induced variables) to calculate a weighted sum. 2. Activation function: an activation function is applied to the output of the aggregation function to introduce nonlinearity into the input-output relationship. This nested nonlinearity helps machines learn the complex patterns present in our dataset. In later blogs, we will learn about activation functions in greater detail.
- User-induced variables: Machines try to find the set of parameters that brings the final function as close as possible to the actual fitting function. These parameters, referred to as weights, are applied to the inputs to decide their importance. For example, if the function is Output = 2*Input + 3, then 2 is the weight applied to the input. If the input is multi-dimensional, the weight is also multi-dimensional. These weights are trainable parameters and are modified while learning from the input and output samples.
- Trial-and-error basis: The average of the differences (typically squared) between the function's outputs and the actual outputs is called the cost function. Machines change the values of the user-induced variables to make the cost function as low as possible. This change happens systematically so that the cost function decreases progressively.
- Systematic: Suppose the user-defined mathematical function cannot capture the complex patterns in the data. Then some other form of mathematical function is defined, or the nestedness is increased so that machines can learn the complex patterns. All these changes happen systematically.
- Closest mathematical relation: For example, suppose the user-defined mathematical function was a*X + b, where X is the input and a and b are user-induced variables that the machine can modify while fitting the best-suited mathematical relationship. Suppose the actual dataset came from the function 2*X + 3. Still, from the given samples and the limited number of iterations the machine took to modify a and b, it was only able to find the relationship 1.9*X + 3.2. This does not overlap entirely with the true function, but it is the closest mathematical relation the machine could find under the given conditions.
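The trial-and-error fitting described above can be sketched in a few lines. The snippet below is a minimal illustration, not the exact procedure any specific library uses: it fits the user-defined function a*X + b to noisy samples generated from the "true" function 2*X + 3 by repeatedly nudging a and b against the gradient of the mean-squared-error cost.

```python
import random

# Hypothetical sketch: fit y = a*x + b to data drawn from y = 2*x + 3
# using plain gradient descent on the mean squared error (the "cost
# function" from the text). All names and values here are illustrative.

random.seed(0)
xs = [i / 10 for i in range(-50, 50)]
ys = [2 * x + 3 + random.gauss(0, 0.1) for x in xs]  # noisy samples

a, b = 0.0, 0.0   # user-induced variables, started arbitrarily
lr = 0.01         # learning rate: size of each trial-and-error step
n = len(xs)

for _ in range(2000):
    # cost = mean((a*x + b - y)^2); its gradients drive the update
    grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
    a -= lr * grad_a
    b -= lr * grad_b

print(round(a, 2), round(b, 2))  # close to 2 and 3, rarely exact
```

With finite samples and iterations, the recovered a and b land near, but usually not exactly on, the true values, which is precisely the "closest mathematical relation" idea from the definition.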
With all this, we should now understand what exactly is present in any Neural Network. Now let's learn how exactly this nestedness works.
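The nestedness can be sketched as a tiny forward pass in which each layer's output becomes the next layer's input. The layer sizes, weights, and biases below are made up purely for illustration, not taken from any trained network; each neuron applies the aggregation function (weighted sum plus bias) followed by a sigmoid activation.

```python
import math

# Minimal sketch of "nested" functions: two hidden layers, where the
# output of one layer is the input of the next. Weights and biases are
# arbitrary illustrative values, not learned parameters.

def layer(inputs, weights, biases):
    """One layer: aggregation (weighted sum + bias), then activation."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b  # aggregation
        outputs.append(1 / (1 + math.exp(-z)))             # sigmoid activation
    return outputs

x = [0.5, -1.2]                                         # input layer: 2 features
h1 = layer(x, [[0.1, -0.3], [0.8, 0.2]], [0.0, 0.1])    # hidden layer 1
h2 = layer(h1, [[0.5, -0.5]], [0.2])                    # hidden layer 2
print(h2)  # h2(h1(x)): one layer's output feeds the next
```

Reading the last two lines inside-out, the whole network is the composition h2(h1(x)), which is exactly the nesting the definition refers to.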
Key insights from the Neural Network Diagram
While designing the structure of Neural networks, we need to keep these insights in mind.
- Each neuron in the input layer corresponds to one feature of the dataset. So, if we have 50 features in a dataset, the Input layer will have 50 neurons.
- The number of neurons in the Output layer equals the number of output categories. For example, if there are 10 categories in the output, the Output layer will have 10 neurons.
- The number of hidden layers and the number of neurons in each hidden layer are pre-defined in the network and are not trainable. These non-trainable parameters are called hyperparameters and are tuned through multiple experiments on the same dataset.
- Every neuron in a layer is connected to every neuron in the adjacent layers. For example, suppose hidden layer 1 has 20 neurons and hidden layer 2 has 60 neurons. Then each of the 20 neurons in hidden layer 1 is connected to all 50 neurons of the input layer and all 60 neurons of hidden layer 2.
- Every neuron in the hidden and output layers has one trainable parameter called a bias, and every connection between neurons carries a trainable parameter called a weight. Collectively, the weights and biases are called the weight matrix.
- For a Neural Network with 50 input features, 20 neurons in hidden layer 1, 60 neurons in hidden layer 2, and 10 output categories, the total trainable parameters are biases + weights. Biases: 20 + 60 + 10 = 90; weights: 50*20 + 20*60 + 60*10 = 2800; so the total is 2890 trainable parameters.
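The counting rule above generalizes to any fully connected network, and a short helper (a sketch with an illustrative name, not a library function) makes the arithmetic mechanical:

```python
# Sketch: count trainable parameters for a fully connected network.
# layer_sizes lists the neurons per layer: input, hidden..., output.

def count_parameters(layer_sizes):
    # one weight per connection between each pair of adjacent layers
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # one bias per neuron in every non-input layer
    biases = sum(layer_sizes[1:])
    return weights + biases

# The example from the text: 50 inputs, hidden layers of 20 and 60, 10 outputs
print(count_parameters([50, 20, 60, 10]))  # 2800 weights + 90 biases = 2890
```

The same function reproduces the hand calculation: 2800 weights plus 90 biases gives 2890 trainable parameters.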
Advantages of Neural Networks
Some of the key advantages of Neural Networks are:
- ANNs are capable of learning complex non-linear relationships between input and output data.
- ANNs generalize well to unseen data and predict with less error.
- There is no restriction on the distribution of the input data; ANNs work well even with data following heterogeneous distributions. This property makes them flexible enough for broader categories of data.
- ANNs are relatively robust to noise; their predictions are not affected much by noise in the data.
Disadvantages of Neural Networks
Although ANNs have numerous advantages, there are some drawbacks as well. Some of them are:
- Training ANNs is computationally expensive and requires better hardware support. As we increase the number of hidden layers or the number of nodes per hidden layer, the requirement for more powerful processors grows.
- The non-explainable nature of ANNs is one of their key disadvantages: they do not explain why and how they arrived at their predictions.
- There is no fixed way of designing ANNs. We need to tune numerous hyperparameters, such as the number of layers, the number of neurons, the choice of activation function, and many more.
Practical use-cases of ANNs in real life
ANNs are advisable when there is a high nonlinearity in the dataset. Some of the most prominent areas where data distributions follow highly complex patterns are:
- Optical Character Recognition: OCR is a complex problem statement because the characters present in images involve highly complex non-linear relationships. We can provide the complete image, in matrix form, as input to an ANN; the network processes it and finds the complex relationships present. OCR, facial recognition, and handwritten document verification are some recent real-life applications.
- Stock market price prediction: Forecasting stock market prices is a challenging task. This behavior is largely unpredictable, but nowadays many companies use NNs to anticipate whether prices will go higher or lower. Better predictions of future prices from past prices with ANNs help companies make millions.
In this article, we learned the basics of Artificial Neural Networks by decoding the constituent terms of their definition. We learned about the learnable parameters present in any Neural Network and how to calculate their total number. Finally, we saw some practical, real-life applications of ANNs. We hope you enjoyed the article.