In machine learning, anomaly detection is the process of finding samples that behave abnormally compared to the majority of samples in a dataset. Anomaly detection algorithms have important use cases in data analytics and data science. For example, fraud analysts use them to detect fraudulent transactions.
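As a rough illustration of the idea, the sketch below flags values that sit far from the bulk of the data using a modified z-score built on the median and the median absolute deviation (MAD). The function name `mad_outliers`, the sample amounts, and the 3.5 threshold are assumptions chosen for this example, not something the article prescribes.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The 0.6745 constant and the 3.5 default are a commonly cited
    convention; treat both as assumptions and tune them for your data.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the samples are identical, so the score is undefined.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical transaction amounts with one suspicious entry.
amounts = [12.0, 15.5, 14.2, 13.8, 16.1, 980.0, 14.9]
print(mad_outliers(amounts))  # [980.0]
```

A median-based score is used here because a single extreme value can inflate the mean and standard deviation enough to hide itself in a small sample.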
In this article, we will learn about two essential techniques for scaling features in machine learning: normalization and standardization. Both bring features onto a comparable scale, so that a model is not biased toward a feature simply because its values have a larger or smaller magnitude.
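A minimal sketch of the two techniques, assuming the usual definitions: min-max normalization maps values to the range [0, 1], and standardization rescales them to zero mean and unit variance. The function names are illustrative, and the population standard deviation is used here as an assumption; some tools use the sample standard deviation instead.

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

print(min_max_normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
```

Note that standardized values are not confined to a fixed range, which is the main practical difference from min-max normalization.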
Companies are collecting large volumes of data, and the need for processed data is increasing. In this blog, we will work hands-on with several data preprocessing techniques in machine learning, such as feature selection, feature quality assessment, feature sampling, and feature reduction. We will use different datasets to demonstrate these preprocessing methods.
Time-series preprocessing techniques have a significant influence on data modeling accuracy. In this blog, we discuss: 1) The definition of time-series data and its importance. 2) Preprocessing steps for time-series data. 3) Structuring time-series data, finding missing values, denoising features, and finding outliers in the dataset.
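Two of the steps listed above, filling missing values and denoising, can be sketched as follows. This is only one possible approach: linear interpolation and a centered moving average are assumptions picked for the example, and the function names are hypothetical.

```python
def interpolate_missing(series):
    """Fill None gaps by linear interpolation between the nearest known points.

    Gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series contains no known values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled

def moving_average(values, window=3):
    """Smooth values with a centered moving average (shorter windows at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical evenly spaced readings with one missing observation.
readings = [1.0, None, 3.0]
filled = interpolate_missing(readings)
print(filled)                  # [1.0, 2.0, 3.0]
print(moving_average(filled))  # [1.5, 2.0, 2.5]
```

The sketch assumes evenly spaced timestamps; with irregular sampling, the interpolation weight should be computed from the actual time differences rather than from list positions.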
Unlike humans, machines don’t understand words and their semantic context. So, we convert processed text into numeric vectors that a machine can work with. In this blog, we will learn: 1) Word embedding. 2) Techniques to embed words (one-hot encoding, Word2Vec, TF-IDF, etc.). 3) Implementation of all these embeddings.
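To make two of these encodings concrete, here is a small sketch of one-hot encoding and TF-IDF in plain Python. It assumes whitespace tokenization and the basic textbook weighting tf × log(N / df); libraries often apply smoothed variants, so their numbers will differ. The function names and sample sentences are illustrative.

```python
import math
from collections import Counter

def one_hot(vocab):
    """Map each word to a vector with a single 1 at the word's index."""
    size = len(vocab)
    return {w: [1 if i == j else 0 for j in range(size)] for i, w in enumerate(vocab)}

def tf_idf(docs):
    """Return (vocab, vectors) using tf * log(N / df), with no smoothing."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    # Document frequency: in how many documents each word appears.
    df = Counter(w for doc in tokenized for w in set(doc))
    n_docs = len(tokenized)
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid dividing by zero for an empty document
        vectors.append([(counts[w] / length) * math.log(n_docs / df[w]) for w in vocab])
    return vocab, vectors

print(one_hot(["cat", "dog"]))  # {'cat': [1, 0], 'dog': [0, 1]}

vocab, vectors = tf_idf(["the cat sat", "the dog sat", "the cat ran"])
print(vocab)  # ['cat', 'dog', 'ran', 'sat', 'the']
print(vectors[1])
```

In this toy corpus, "the" appears in every document, so its weight is zero everywhere, while "dog" appears in only one document and gets the largest weight there.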
We need to clean text data before feeding it to machine learning algorithms. Fortunately, Python has excellent NLP libraries (NLTK, spaCy) to ease text analysis. In this blog, we will learn: 1) Working hands-on with a sentiment analysis dataset. 2) Techniques for cleaning text data. 3) Exploratory analysis of text data.
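A minimal cleaning pass might look like the sketch below, which uses only the standard library so it runs without NLTK or spaCy installed. The stop-word set is a tiny hand-written assumption for illustration; a real project would normally take a fuller list from one of those libraries.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "is", "it", "this", "and", "to", "of"}

def clean_text(text):
    """Lowercase, strip URLs, HTML tags and punctuation, then drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs before punctuation
    text = re.sub(r"<[^>]+>", " ", text)       # remove HTML tags such as <br>
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOPWORDS]

review = "This movie is GREAT!!! <br> Visit http://example.com"
print(clean_text(review))  # ['movie', 'great', 'visit']
```

The order of the steps matters: URLs and tags are removed first, because stripping punctuation earlier would break the patterns that identify them.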