Every machine learning project goes through these steps: problem finalization, data gathering, data pre-processing, model selection, data splitting, performance evaluation, and model deployment. In this blog, we have discussed: 1) Steps of implementing machine learning projects 2) How to make ML models production-ready, etc.
Machine learning is now advanced enough to be used in drug discovery, where it reduces the time needed to produce a new drug. In this blog, we have discussed: 1) Use cases of the drug discovery problem 2) Steps involved in drug discovery 3) Implementation steps of the XGBoost regressor model 4) Active and inactive compounds 5) Need for fingerprints, etc.
Sentiment analysis is a natural language processing (NLP) technique used to predict the emotion reflected by a word or a group of words. It is instrumental in brand monitoring, market research, social media monitoring, etc. This blog uses Naive Bayes to predict the sentiment of tweets.
As data scientists, we should know how to handle the date-time data and the standard set of date-time operations we can apply to transform the raw data. Fortunately, we have date-time manipulation libraries specifically for this purpose. In this blog, we will talk about all basic date-time manipulations, explorations, transformations, and applications.
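As a rough sketch of the kind of operations meant here, the snippet below uses Python's standard `datetime` module to parse, explore, and transform timestamps. The timestamp strings are hypothetical examples, not data from the blog.

```python
from datetime import datetime, timedelta

# Hypothetical raw timestamps, as they might appear in a CSV column.
raw = ["2023-01-15 08:30:00", "2023-01-17 18:45:00"]

# Manipulation: parse the strings into datetime objects.
start, end = (datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in raw)

# Exploration: pull out calendar features often used as model inputs.
features = {
    "year": start.year,
    "month": start.month,
    "weekday": start.weekday(),  # 0 = Monday, 6 = Sunday
    "hour": start.hour,
}

# Transformation: elapsed time between two events, and a shifted timestamp.
elapsed_hours = (end - start).total_seconds() / 3600
one_week_later = start + timedelta(weeks=1)

print(features, elapsed_hours, one_week_later.isoformat())
```

For tabular data, libraries such as pandas offer similar operations on whole columns; the standard library is used here only to keep the sketch dependency-free.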
In this blog, we will focus on applications of regex by applying it to some tedious tasks that would be impractical without regular expressions. Some standard applications of regular expressions in data science: 1) Web scraping and data collection 2) Text preprocessing (NLP) 3) Pattern detection for IDs, e-mails, names 4) Date-time manipulations
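The applications listed above can be sketched with Python's built-in `re` module. The sample text, the ID format, and the patterns are assumptions made for illustration; the e-mail pattern in particular is deliberately simplified and is not a full address validator.

```python
import re

# Hypothetical snippet of scraped text; names, addresses, and ID are made up.
text = (
    "Contact alice@example.com or bob.smith@mail.example.org "
    "before 2023-04-05. Order ID: AB-1234."
)

# Pattern detection: a simplified e-mail pattern (demo only).
emails = re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)

# Date-time extraction: ISO-style dates captured as (year, month, day).
dates = re.findall(r"\b(\d{4})-(\d{2})-(\d{2})\b", text)

# Pattern detection: IDs assumed to be two capitals, a hyphen, four digits.
ids = re.findall(r"\b[A-Z]{2}-\d{4}\b", text)

# Text preprocessing: lowercase, drop everything except letters and spaces,
# then collapse repeated whitespace.
review = "Great product!!! 10/10, would buy again :)"
cleaned = re.sub(r"[^a-z\s]", " ", review.lower())
cleaned = re.sub(r"\s+", " ", cleaned).strip()

print(emails, dates, ids, cleaned)
```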
In this blog, we analyze a company's attrition rate and build a machine learning model (logistic regression) to predict it. We explore some interesting patterns that lead to employee attrition, using Kaggle's IBM HR Analytics Employee Attrition and Performance dataset.
Many learners want to master machine learning and don’t know where to start. It seems like a formidable task, especially if one lacks a thorough background. This article will discuss some of the factors that can be obstacles in learning machine learning. Working around these obstacles can help us to develop a long-term interest in this subject.
Time series forecasting uses statistical models to predict future values from previously recorded observations. It is classified into two types: 1) Univariate time series forecasting (involves a single variable) 2) Multivariate time series forecasting (involves multiple variables). Note: A time series is a set of observations recorded at regular time intervals.
Time series preprocessing techniques have a significant influence on modeling accuracy. In this blog, we have discussed: 1) Definition of time-series data and its importance 2) Preprocessing steps for time-series data 3) Structuring time-series data, finding missing values, denoising features, and finding outliers in the dataset.
When we talk about a machine learning model, one question comes to mind: what errors are associated with its predictions? Bias and variance are the sources of these errors, and understanding them helps us diagnose the model. Bias, variance, and the bias-variance tradeoff are frequently asked topics in machine learning interviews.
Unlike humans, machines don’t understand words and their semantic context. So, we convert processed text into a format that the machine can understand using vector encoding. In this blog, we will learn: 1) Word embedding 2) Techniques to embed words (one-hot encoding, Word2Vec, TF-IDF, etc.) 3) Implementation of all these embeddings.
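To make the idea of vector encoding concrete, here is a minimal sketch of one-hot encoding in plain Python. The two-sentence corpus and the helper name `one_hot` are hypothetical; this illustrates the technique and is not the implementation from the blog.

```python
# Hypothetical toy corpus, used only for illustration.
corpus = ["the cat sat", "the dog barked"]

# Build a sorted vocabulary so that each word gets a stable index.
vocab = sorted({word for sentence in corpus for word in sentence.split()})
index = {word: i for i, word in enumerate(vocab)}


def one_hot(word):
    """Return a vector with a single 1 at the word's vocabulary index."""
    vector = [0] * len(vocab)
    vector[index[word]] = 1
    return vector


# Each sentence becomes a list of one-hot vectors, one per word.
encoded = [[one_hot(w) for w in sentence.split()] for sentence in corpus]

print(vocab)
print(one_hot("cat"))
```

One-hot vectors grow with the vocabulary and carry no notion of similarity between words, which is the gap that learned embeddings such as Word2Vec aim to fill.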
We need to clean text data before feeding it to machine learning algorithms. Fortunately, Python has excellent NLP libraries (NLTK, spaCy) to ease text analysis. In this blog, we will learn: 1) Real-time working on a sentiment analysis dataset 2) Techniques for cleaning text data 3) Exploratory analysis of text data.
Artificial intelligence and machine learning are among the most popular buzzwords in the tech industry. We often use them as synonyms, but they are not the same: machine learning is a subset of artificial intelligence. In this blog, we will discuss the basic comparison between artificial intelligence and machine learning.
Logistic regression is one of the most used machine learning algorithms. It is a supervised learning algorithm for problems where the target variable is categorical, such as positive or negative, or Type A, B, or C. Although the name contains the term "regression", it is used to solve classification problems.
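The sketch below shows where the "regression" in the name comes from for the binary case: the model computes a continuous score, squashes it into a probability with the sigmoid function, and only then applies a threshold to get a category. The weights, bias, and function names are hypothetical and hand-picked; a real model would learn its parameters from data, and the multi-class case (Type A, B, or C) would need an extension such as softmax or one-vs-rest.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_label(features, weights, bias, threshold=0.5):
    """Turn the continuous probability into a categorical prediction."""
    probability = predict_proba(features, weights, bias)
    return "positive" if probability >= threshold else "negative"


# Hypothetical, hand-picked parameters (not learned from any dataset).
weights, bias = [1.5, -2.0], 0.25

print(predict_label([2.0, 0.5], weights, bias))
print(predict_label([0.0, 1.0], weights, bias))
```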
Linear Regression is a supervised machine learning algorithm used to solve regression problems. In this blog, we have discussed: 1) What is linear regression? 2) Various types 3) The loss function for linear regression 4) Ordinary Least Squares (OLS) method 5) Polynomial regression 6) Python implementation of linear regression.
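As a minimal sketch of the OLS method for a single feature, the code below uses the closed-form estimates: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. The function names and the toy data (generated from y = 2x + 1) are assumptions for illustration only.

```python
def ols_fit(xs, ys):
    """Closed-form OLS estimates for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """The loss that OLS minimizes: average squared residual."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return sum(r ** 2 for r in residuals) / len(residuals)


# Toy data generated from y = 2x + 1 (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = ols_fit(xs, ys)
loss = mean_squared_error(xs, ys, slope, intercept)
print(slope, intercept, loss)
```

Because the toy data lies exactly on a line, the fitted loss is zero; with noisy data the same formulas give the line with the smallest mean squared error.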
Machine learning has become a tool used in almost every task that requires estimation. Companies like Cars24 and Cardekho.com use regression analysis to estimate used car prices. So we need to build a model to estimate the price of used cars. The model should take car-related parameters as input and output a selling price.
We sometimes overestimate the capabilities of machine learning. Here are some popular misconceptions: 1) ML, DL, and AI are the same 2) Machine learning can predict the future 3) Machine learning can solve any problem 4) One machine learning algorithm will be sufficient 5) More features in the data always lead to a better machine learning model, etc.
In this blog, we have discussed how exactly machines learn in machine learning. We have solved one common problem, finding the equation of a straight line, using two different approaches: traditional programming and machine learning. We also looked at what information the machine stores, which is what we call its learning.
This is a glossary of machine learning terms commonly used in the industry. Some popular machine learning terminologies: neural networks, supervised learning, unsupervised learning, reinforcement learning, regularization, classification, regression, clustering, optimizers, outliers, bias, variance, underfitting, overfitting, normalization, etc.
In recent years, the quantity of spam emails has decreased significantly due to spam detection and filtering software. In this blog, we have covered these concepts: 1) Methods to segregate incoming emails into spam and non-spam categories 2) Steps to implement a spam classifier using the k-NN algorithm 3) Real-life use cases of Gmail, Outlook, and Yahoo.
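To give a feel for how a k-NN spam classifier works, here is a small sketch in plain Python: each message becomes a word-count vector, and a new message takes the majority label of its k nearest training messages. The training messages, the Euclidean distance choice, and all function names are hypothetical; this is not the pipeline from the blog or from any mail provider.

```python
import math
from collections import Counter

# Hypothetical, hand-written training messages; not a real dataset.
train = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("lunch with the team tomorrow", "non-spam"),
]


def vectorize(text):
    """Bag-of-words representation: word -> count."""
    return Counter(text.lower().split())


def distance(a, b):
    """Euclidean distance between two word-count vectors."""
    words = set(a) | set(b)
    return math.sqrt(sum((a[w] - b[w]) ** 2 for w in words))


def knn_predict(text, k=3):
    """Label a message by majority vote among its k nearest neighbours."""
    query = vectorize(text)
    neighbours = sorted(
        train, key=lambda item: distance(query, vectorize(item[0]))
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


print(knn_predict("claim your free money"))
print(knn_predict("team meeting on monday"))
```

An odd k avoids tied votes in the two-class case. On real email data, TF-IDF weighting and a larger training set would usually be needed for useful accuracy.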