Pathway For Attempting Machine Learning Projects


Artificially giving our computers the ability to possess intelligence is known as artificial intelligence. Machine intelligence, or artificial intelligence in a broader sense, tries to mimic the intelligence of human beings and sometimes performs even better than humans once the learning stage is complete.

One fantastic example where machines surpass human intelligence is analyzing large chunks of data. Suppose a human is studying a large dataset and, while doing so, observes 500 similar samples. They will probably become biased and skip going through every sample after that, which can lead to mistakes. Machines, on the other hand, will analyze every sample thoroughly and provide insights that are less error-prone.

Key takeaways from this blog

In this article, we will cover the following aspects of machine learning:

  1. Categorization of Machine Learning use-cases.
  2. Understanding the exact pathway to follow while solving any problem statement with machine learning.
  3. Splitting the available data into three crucial sets: training, validation, and testing datasets.
  4. A detailed use-case of the training, validation, and testing datasets.
  5. Making the model production-ready and rolling it out in the real world.

So let's start without any further delay.

Types of Machine Learning use-cases

In the current era of software engineering, we mainly have traditional Information Technology (IT) infrastructure, where business teams (or experts) make decisions and convey them to developers. Developers write this logic as programs and integrate it with the IT system, which produces outputs and, ultimately, decisions.
At the current stage of development, ML techniques cannot replace this entire process, but they can complement it. This improvement can come in two possible scenarios:

Decision Supportive:

In this case, IT experts define rules based on their experience. Our machine learning models analyze the data and provide insights that can either support the rules made by the experts or discard them.

For example, in traditional banking systems, bank experts make a rule: if a farmer has not repaid the previous loan on time, do not grant them another loan, as there is a higher chance they will not repay it. But suppose our machine learning model analyzed that farmer's loan repayment history and found that whenever the farmer takes a loan in the spring season, they repay it quickly because spring crops yield a more significant profit. In such a scenario, the ML model helps the bank experts rectify their decision.

Decision supportive use-case

Source: IVT Network

Other possible examples:

  1. Predicting the attrition rate for any company.
  2. Predicting fraudulent transactions via credit/debit cards.
  3. If you can think of any other such examples, please send them to us.

Cognitive: 

In this case, IT experts cannot define explicit rules for certain tasks. Our ML algorithms try to mimic the cognitive intelligence possessed by humans.

For example, a bank manager is reading a customer's mail and trying to identify whether the customer is happy with the banking services or not. He made a rule: if the content of the mail includes the word "fantastic", the customer is satisfied. Now suppose the customer writes,

The banking services are so fantastic that every issue gets solved within 3–4 years.

Although the mail includes the word "fantastic" and the phrase "issue gets solved", the bank manager can identify the sarcasm and see that the customer is unhappy. This kind of intelligence cannot be captured in hard-coded rules; here, machines try to learn humans' cognitive intelligence.
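As a toy illustration (not part of any real banking system), here is how such a hard-coded keyword rule might look in Python, and how it fails on the sarcastic mail above:

```python
def naive_rule(mail: str) -> str:
    # Expert's hand-written rule: the word "fantastic" implies a happy customer.
    return "happy" if "fantastic" in mail.lower() else "unhappy"

mail = "The banking services are so fantastic that every issue gets solved within 3-4 years."
print(naive_rule(mail))  # prints "happy", although the mail is clearly sarcastic
```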

Cognitive intelligence

Other possible examples:

  1. Understanding speech signals.
  2. Image recognition.
  3. If you can think of any other such examples, please send them to us.

If we look closely, we see that decision-supportive use cases generally involve structured data with rules defined in numerical format. Cognitive use-cases, on the other hand, involve data not in numbers but in image, video, or text format. Computers understand numbers, so we need to convert these data samples into a computer-readable format.
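As a concrete illustration, one common way to turn raw text into numbers is a bag-of-words representation. A minimal sketch using scikit-learn (assuming it is installed):

```python
from sklearn.feature_extraction.text import CountVectorizer

mails = [
    "The banking services are fantastic",
    "My issue is still not solved",
]

# Each mail becomes a vector of word counts that a model can consume.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(mails)

print(vectorizer.get_feature_names_out())  # vocabulary learned from the mails
print(X.toarray())                         # numeric, machine-readable matrix
```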

Life Cycle of Any Machine Learning Project

In computer science, and more specifically in the IT sector, business and technology need a high level of coordination. The gap between the "business demand" and the "feasibility of that demand via technology" is known as the IT error. The greater the coordination between business experts and coders, the smaller the IT error.

Machine learning solutions need the closest coordination between the technology and business verticals. In a decision-supportive use case, the technology needs the rules formulated by business experts and the corresponding data to support those rules or derive new ones. In a cognitive use-case, a close watch on the learning process is required, as humans have to evaluate the model's accuracy.

Machine Learning project completion steps

Any machine learning project coming from business experts has to pass through mainly seven different verticals or phases, all shown in the image above. Let's discuss each of these verticals in detail.

Problem Formation

Technology is developed to solve a particular problem, but this process requires a clear definition of the problem we want to solve. This is more like goal finalization, where the goal is the target we want to achieve.

For example, suppose we are the head of a banking firm and want our customers to be happy. So far, we have only roughly stated the central objective; this cannot yet be the problem statement that technology would target. Machine learning can enhance the customer experience in multiple ways, one of which could be alerting customers about fraudulent credit-card transactions.


Data Gathering

Now that we have a clear problem statement in mind, we can start thinking from the requirement perspective. We want to solve the defined problem statement using machine learning, and the core requirement of this technique is historical data. There can be three scenarios:

  • Case 1: Sufficient data is available with all the essential attributes.

In this case, we can start utilizing the available data and determine whether it is present in a supervised, unsupervised, or semi-supervised format. We have discussed the classification of machine learning models based on the nature of the input data, and the details can be found here.

  • Case 2: Sufficient data is available but lacks essential attributes.

In the second case, we need to analyze the data and check whether the available attributes are sufficient to build the model. This stage requires trial and error: we can start categorizing the data and check whether machine learning fulfills our objective. If not, we will have to capture a sufficient amount of data from scratch.

  • Case 3: Either insufficient data or no data is available.

In such a scenario, we will have to gather or record the data from the start. Here, we first need to gain some domain knowledge about the problem statement. For example, if we are trying to sense whether a transaction is fraudulent, we must decide which attributes (previous month's transaction history, type of shopping, etc.) would be essential for successfully learning the mapping function.

Note: There are multiple platforms, like Kaggle, that host publicly available datasets which can be used to practice or bootstrap such projects.
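Whatever the source, a quick first step with gathered data is to inspect which attributes are present and how complete they are. Here is a hedged sketch with pandas for the fraud-detection example; the file name and column names are hypothetical:

```python
import pandas as pd

# Hypothetical file: a dump of historical card transactions.
df = pd.read_csv("transactions.csv")

print(df.shape)             # how many samples and attributes we have
print(df.columns.tolist())  # e.g. amount, merchant_type, last_month_spend, is_fraud
print(df.isna().sum())      # how many values are missing per attribute
```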


Data Pre-processing and Analyzing

After collection, the data is in a raw format and cannot be used directly in the model-building process. We need to pass this raw data through a set of defined processes, known as data pre-processing steps. These steps vary based on the project requirements and the quality of the raw data. Some famous pre-processing steps for structured data are:

  • Feature selection
  • Feature engineering
  • Handling missing data samples
  • Reducing dimensionality, etc.

These steps vary with the problem statement and the quality of the available (or collected) data, but the overall objective of this process is to produce processed data that is rich in quality and in a machine-readable format. Unstructured data, like text, images, and videos, cannot be understood by computers directly; in pre-processing such data, we convert it into a machine-readable format.
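As an illustrative sketch of chaining such steps, scikit-learn pipelines are one common approach; here we impute missing values and scale numeric features (the file and column names are hypothetical):

```python
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("transactions.csv")               # hypothetical raw data
numeric_features = ["amount", "last_month_spend"]  # hypothetical columns

preprocess = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="median")),  # handle missing samples
    ("scale", StandardScaler()),                   # bring features to one scale
])

X = preprocess.fit_transform(df[numeric_features])
print(X[:5])  # processed, machine-readable feature matrix
```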

Data processing into computer readable format

Source: Datadriveninvestor


Model Selection

After processing the data, we know exactly what quality data will be used to train our machine learning algorithms.

But there are many algorithms in the market, so which one should we target?

The selection of a model mainly depends upon these factors:

  • Nature of the input data: Whether the data is in a supervised, unsupervised, or semi-supervised format.
  • Nature of the problem statement: Whether we have to solve a classification, regression, clustering, or dimensionality-reduction problem.
  • Need for explainability: Statistical models and famous machine learning models like SVMs, linear regression, logistic regression, decision trees, etc., possess an explainable nature. Explainability means the link between "what the business demands" and "how the ML model actually achieves it" should be traceable. Neural networks and deep learning models lack this explainable property.
  • Availability of computational capability: Limited computational power can force us to choose a less complex model by compromising on accuracy. Linear and logistic regression require very little compute power, but methods like gradient boosting can give better accuracy.
  • Availability of data samples: Algorithms like SVM can work with fewer data samples, while neural networks and deep learning demand a higher number of data samples.
  • Decision-supportive or cognitive use case: In a decision-supportive use-case, statistical models like linear regression and other ML algorithms can work perfectly. But in a cognitive use-case, deep learning algorithms are required to achieve substantial accuracy. A comparison sketch is shown below.
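To make these trade-offs concrete, here is a hedged sketch that compares an explainable linear model against a more complex ensemble on synthetic data; in practice, you would shortlist candidates using the factors above and compare them the same way:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a processed, supervised dataset.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

for model in (LogisticRegression(max_iter=1000), GradientBoostingClassifier()):
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validated accuracy
    print(type(model).__name__, scores.mean())
```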

Data Splitting

Once we have decided which algorithm to train, we need to split the processed data into three sets: training, validation, and test datasets. Let's first learn about these three datasets.

  • Training Data: This dataset is responsible for training our model. It is generally 60–80% of the processed data.
  • Validation Data: Once the model gets trained, we check its performance on the validation set to see whether the cost function has reached a minimum. If not, we tune the hyperparameters and train the model on the training dataset again. It is generally 10–20% of the processed data.
    In the image below, if we consider the choice of parameter initialization as a hyperparameter, we need to check multiple times which initialization gives the lowest cost.

3D plot of updating parameters with different initial values

  • Test Data: It is generally 10–20% of the processed data. Once the model performs well on the validation set, we test the final model on completely unseen (but processed) data to check whether our model-building process was successful. If yes, we can move ahead to the performance evaluation step and test the developed model on various parameters. If not, we need to debug the reason or select a different algorithm. A worked sketch of all three sets in action follows this list.
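Here is a minimal end-to-end sketch of how the three sets work together, using synthetic data; the regularization strength C of a logistic regression plays the role of the hyperparameter tuned on the validation set:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# 60% train, 20% validation, 20% test.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Tune the hyperparameter on the validation set only.
best_C, best_score = None, -1.0
for C in (0.01, 0.1, 1.0, 10.0):
    model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_C, best_score = C, score

# Evaluate the final model exactly once on the unseen test set.
final = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
print("best C:", best_C, "test accuracy:", final.score(X_test, y_test))
```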

But how do we split the data?

There are mainly three ways of doing so:

  1. Splitting randomly into three buckets.
  2. Splitting the data such that all three sets contain all possible variations in the dataset. This is considered the best split; a stratified split (sketched below) is one common way to achieve it for classification data.
  3. Performing multiple random splits and checking which combination works best.
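For the second strategy, a stratified split is the usual tool when the target classes are imbalanced; a minimal sketch (synthetic, imbalanced data) that preserves the class ratio in all three sets:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic dataset with a 90/10 class imbalance.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# stratify keeps the class ratio identical in every resulting set.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)

print(y_train.mean(), y_val.mean(), y_test.mean())  # all close to 0.1
```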

Data splitting into 3 sets


Performance Evaluation

Once our model is ready, we need to evaluate it rigorously on different metrics and check whether it beats previous methodologies. The evaluation metrics vary based on the project requirements; a sketch computing some common ones follows the list below.

  • Classification model evaluation metrics: Accuracy, confusion matrix, F1 score, Area under the curve, etc. 
  • Regression model evaluation metrics: Mean Absolute Error, Mean Squared Error, R-Squared, etc.
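A brief sketch of computing some of these metrics with scikit-learn; the label and prediction arrays below are placeholders, not real model outputs:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             mean_absolute_error, mean_squared_error, r2_score)

# Classification: true labels vs. predicted labels (placeholder values).
y_true, y_pred = [1, 0, 1, 1, 0], [1, 0, 0, 1, 0]
print(accuracy_score(y_true, y_pred))
print(f1_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))

# Regression: true values vs. predicted values (placeholder values).
r_true, r_pred = [3.0, 2.5, 4.0], [2.8, 2.7, 3.6]
print(mean_absolute_error(r_true, r_pred))
print(mean_squared_error(r_true, r_pred))
print(r2_score(r_true, r_pred))
```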

We sometimes need to design our own evaluation metric, but that is beyond the scope of this blog.


Rollout of Model

Once our model clears our evaluation criteria, we need to integrate it with our mobile apps and website APIs, where it will start giving predictions on user data in real time (a minimal serving sketch appears at the end of this section). But there can be several scenarios in which we might need to retrain our ML algorithms:

  • Data drift: Our model worked perfectly on the collected dataset, but the data trend has changed in ways that were not present in the raw data we collected earlier. For example, we built a model to predict a stock price based on the last 5 months of historical data, but the stock market trend completely changed over the previous 5 days. In such a case, the model needs to re-learn these patterns, which requires re-training.
  • New attributes: We figured out that we were not recording an attribute earlier that is scientifically beneficial. We must then include that attribute to enhance the accuracy of our model, which requires retraining with the newer features.
  • Learning from mistakes: There can be situations where, because of a particular word, we marked an email as spam but later realized this was incorrect. Since the raw data was built with the earlier assumption, the model learned the same thing. We need to rectify those mistakes and retrain the model so it performs better.

There can be more such reasons, but as we might have noticed, this technology demands close coordination with the business side so that whenever something new is needed, retraining can solve the problem.
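One common (though by no means the only) integration pattern is to persist the trained model and load it once inside the app or API process; a minimal sketch using joblib and a synthetic model:

```python
import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train and persist the model once, offline.
X, y = make_classification(n_samples=1000, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
joblib.dump(model, "model.joblib")

# Inside the app/API process: load once, then serve predictions in real time.
served_model = joblib.load("model.joblib")
print(served_model.predict(X[:1]))  # prediction for one incoming sample
```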


Conclusion

In this article, we have learned the different use-cases of Machine Learning technology. We also discussed the complete pathway that needs to be followed while building any Machine Learning project. We hope you enjoyed the article.

Enjoy Learning! Enjoy Thinking! Enjoy Algorithms!
