A regular expression is a pattern that defines a search over text and is used to find or extract the strings that match it. Today, regular expressions are available in almost every high-level programming language, and as data scientists or NLP engineers we should know their basics and when to use them.
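As a small illustration of extracting pattern-specific strings, here is a minimal sketch using Python's standard `re` module. The sample text and the `EMAIL_PATTERN` name are assumptions made for this example, and the pattern is a deliberately simplified email matcher rather than a complete validator.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Contact alice@example.com or bob.smith@example.org for details."

# Simplified email pattern (an assumption for this sketch, not a full
# RFC-compliant validator): local part, "@", then dot-separated domain labels.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# findall returns every non-overlapping match as a string.
emails = EMAIL_PATTERN.findall(text)
print(emails)
```

Compiling the pattern once with `re.compile` is useful when the same search is applied to many strings, which is the usual situation in text preprocessing.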
Random forest is a supervised learning algorithm that can be used to solve both classification and regression problems. It is widely applied in data science competitions and in practical, real-life situations, and it offers intuitive, heuristic solutions.
The best machine learning model uses the fewest features possible while keeping performance high. Determining the relevant features for the model-building phase is therefore necessary. In this session, we will see some feature selection methods and discuss the pros and cons of each.
Anomaly detection is the process of finding samples that behave abnormally compared with the majority of samples in the dataset. Anomaly detection algorithms have important use cases in data analytics and data science. For example, fraud analysts rely on anomaly detection algorithms to detect fraud in transactions.
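To make the idea of "abnormal compared with the majority" concrete, here is a minimal sketch of one simple technique, the modified z-score based on the median absolute deviation (MAD). It is only one of many possible approaches and is not necessarily the method a fraud analyst would use. The function name `find_anomalies`, the 3.5 threshold, and the transaction amounts are assumptions chosen for illustration.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The modified z-score measures distance from the median in units of the
    median absolute deviation (MAD), so a few extreme values do not distort
    the baseline the way they would with a mean and standard deviation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median; no usable scale.
        return []
    # 0.6745 rescales the MAD to be comparable to a standard deviation
    # under a normal distribution.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical transaction amounts with one obviously unusual entry.
amounts = [12.5, 14.0, 13.2, 15.1, 12.9, 14.4, 980.0, 13.7]
print(find_anomalies(amounts))
```

A median-based score is used here instead of the mean because a single large outlier would inflate the mean and standard deviation and could hide itself.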
Cancer classification is one area where ML can deliver a robust predictive model that estimates the likelihood of cancer from given observations. In this article, we build a cancer classification model that predicts whether cells are malignant (cancerous) or benign using a support vector classifier.