Types of Machine Learning – Data science
In machine learning, algorithms learn from historical data and make predictions about future events. Different types of machine learning algorithms are used for different purposes, and each type has elements that must be considered. The main types are described below.
Reinforcement learning algorithms:
These algorithms learn how to perform a task by trial and error. They are “reinforced” by reward feedback from the environment they act in. The most frequently used reinforcement learning algorithm is Q-learning.
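The trial-and-error loop can be sketched with tabular Q-learning on a toy "corridor" environment; the environment, reward scheme, and hyperparameters below are invented for illustration:

```python
import random

# Toy environment: a corridor of 5 cells. The agent starts in cell 0 and
# receives a reward of 1 only when it reaches the rightmost cell.
N_STATES = 5
ACTIONS = [0, 1]   # 0 = move left, 1 = move right

def step(state, action):
    """Return (next_state, reward, done) for the corridor environment."""
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-table: one row per state
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy choice: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning update: nudge the estimate toward the observed reward
            # plus the discounted value of the best next action.
            q[state][action] += alpha * (reward + gamma * max(q[next_state]) - q[state][action])
            state = next_state
    return q

q = q_learning()
# The learned greedy policy should be "move right" in every non-terminal state.
policy = [q[s].index(max(q[s])) for s in range(N_STATES - 1)]
print(policy)
```

Note that no one tells the agent which action is correct; it discovers the policy purely from the delayed reward signal.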
Supervised learning algorithms:
These algorithms learn how to predict a desired output value based on input values. The algorithm is “supervised” by a human who provides the correct outputs for the training data sets. The most frequently used supervised learning algorithm is linear regression.
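A minimal sketch of supervised learning with linear regression, fitting a line to labeled (input, output) pairs; the training data here is synthetic:

```python
import numpy as np

# Training data: inputs x and their known outputs y (the "supervision").
# The labels follow y = 2x + 1 exactly, so the fit should recover those numbers.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix with a bias column; solve the least-squares problem X w = y.
X = np.column_stack([x, np.ones_like(x)])
w, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
slope, intercept = w
print(round(slope, 3), round(intercept, 3))   # 2.0 1.0
```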
Semi-supervised learning algorithms:
These algorithms are used to improve the accuracy of machine learning models by training on a combination of labeled and unlabeled data sets. A common choice of base model in semi-supervised learning is the support vector machine.
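One common semi-supervised strategy is self-training: fit a model on the few labeled points, then pseudo-label the unlabeled points it is most confident about and refit. The sketch below uses a simple nearest-centroid base model instead of an SVM, and the data and confidence threshold are invented for illustration:

```python
import numpy as np

# Two well-separated 1-D clusters; only one labeled point per class, the rest unlabeled.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-3.0, 0.5, 50), rng.normal(3.0, 0.5, 50)])
y = np.full(100, -1)           # -1 marks "unlabeled"
y[0], y[50] = 0, 1             # one labeled example from each cluster

def self_train(X, y, rounds=5):
    y = y.copy()
    for _ in range(rounds):
        # Fit a nearest-centroid classifier on the currently labeled points.
        c0 = X[y == 0].mean()
        c1 = X[y == 1].mean()
        unlabeled = np.where(y == -1)[0]
        if len(unlabeled) == 0:
            break
        # Pseudo-label only the confident points: those clearly nearer one centroid.
        d0 = np.abs(X[unlabeled] - c0)
        d1 = np.abs(X[unlabeled] - c1)
        mask = np.abs(d0 - d1) > 1.0   # assumption: fixed confidence threshold
        if not mask.any():
            break
        y[unlabeled[mask]] = (d1[mask] < d0[mask]).astype(int)
    return y

labels = self_train(X, y)
# Unlabeled points should inherit the label of their own cluster.
acc = (labels[:50] == 0).mean() * 0.5 + (labels[50:] == 1).mean() * 0.5
print(acc)
```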
Neural network algorithms:
These algorithms are used to create machine learning models that:
- work in a way loosely inspired by the human brain,
- can be used for supervised or unsupervised learning tasks,
- recognize patterns in data.
Deep learning is a branch of machine learning that uses artificial neural networks to learn patterns in data. Neural networks are composed of interconnected processing nodes, or neurons.
Deep learning algorithms can learn to identify complex patterns in data such as images, text, or speech.
In this way, models can extract more information from data and improve the accuracy of their predictions.
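A minimal sketch of what such a network does: one hidden layer of neurons trained by backpropagation on the XOR pattern, which no single linear model can fit. The architecture and hyperparameters are arbitrary illustrative choices:

```python
import numpy as np

# XOR: not linearly separable, but learnable with one hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

rng = np.random.default_rng(42)
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer, 8 neurons
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses, lr = [], 0.5
for _ in range(2000):
    h = np.tanh(X @ W1 + b1)          # hidden activations
    p = sigmoid(h @ W2 + b2)          # predicted probabilities
    losses.append(float(np.mean((p - y) ** 2)))
    # Backpropagate the squared-error loss through both layers.
    grad_p = 2 * (p - y) / len(X)
    grad_z2 = grad_p * p * (1 - p)
    grad_h = grad_z2 @ W2.T
    grad_z1 = grad_h * (1 - h ** 2)
    W2 -= lr * (h.T @ grad_z2); b2 -= lr * grad_z2.sum(0)
    W1 -= lr * (X.T @ grad_z1); b1 -= lr * grad_z1.sum(0)

# Training should steadily reduce the loss as the network finds the pattern.
print(losses[0], "->", losses[-1])
```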
Different Types of learning problems
Machine learning algorithms have their own strengths for different applications.
Machine learning algorithms are used to predict everything from whether you’ll click on an ad to how likely you are to default on a loan. In supervised learning, the algorithm is given a set of training data with the correct answers. In unsupervised learning, the algorithm is given only the data and must figure out on its own what patterns exist within it.
Machine learning has come a long way in the past few decades. Algorithms are helping machines make better decisions every day.
Unsupervised Machine learning:
These algorithms are used with unlabelled training data. Unsupervised learning algorithms come in two different types: generative and discriminative.
Generative algorithms rely on a probabilistic model that specifies the conditional distribution of some hidden variables given some observed variables.
One example of a generative algorithm is Latent Dirichlet Allocation (LDA).
This algorithm assumes that documents are composed of topics and that each topic is a distribution over words; LDA learns these distributions from a set of documents.
The aim is to learn categories that account for most of the variance present in the data. Discriminative models, by contrast, directly predict the desired label for new data points. One example of a discriminative algorithm is the linear Support Vector Machine (SVM).
Rather than learning a distribution over hidden variables, SVMs learn a decision boundary directly, which lets them make accurate predictions about the labels of new data points.
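A sketch of the discriminative idea: a linear SVM trained by subgradient descent on the hinge loss. The synthetic data and hyperparameters are illustrative choices:

```python
import numpy as np

# Linearly separable 2-class data: a discriminative model only needs the
# boundary between the classes, not a full probabilistic model of the data.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal([-2, -2], 0.5, (30, 2)),
                    rng.normal([2, 2], 0.5, (30, 2))])
y = np.array([-1] * 30 + [1] * 30)   # SVM convention: labels in {-1, +1}

# Subgradient descent on the regularized hinge loss
# L(w, b) = lam * ||w||^2 + mean(max(0, 1 - y * (w.x + b))).
w = np.zeros(2); b = 0.0
lam, lr = 0.01, 0.1
for epoch in range(100):
    for i in rng.permutation(len(X)):
        margin = y[i] * (X[i] @ w + b)
        if margin < 1:                     # point inside the margin: push it out
            w += lr * (y[i] * X[i] - 2 * lam * w)
            b += lr * y[i]
        else:                              # correctly classified: only shrink w
            w -= lr * 2 * lam * w

pred = np.sign(X @ w + b)
print((pred == y).mean())
```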
Clustering is an unsupervised learning task where the goal is to find natural groupings of objects in a data set, so that objects in the same group are more similar to each other than to objects in other groups.
Other clustering algorithms include spectral clustering and DBSCAN.
These can find clusters based on implicit links between observations in the original space, using a distance function.
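Grouping similar objects can be sketched with k-means, a standard clustering algorithm (distinct from the two named above); the synthetic data and the farthest-point initialization are illustrative choices:

```python
import numpy as np

# k-means: alternate between assigning points to the nearest centroid and
# moving each centroid to the mean of its assigned points.
def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    # Farthest-point initialization: each new centroid is the point farthest
    # from all centroids chosen so far.
    centroids = [X[rng.integers(len(X))]]
    while len(centroids) < k:
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: recompute each centroid as the mean of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(3)
X = np.concatenate([rng.normal([0, 0], 0.3, (40, 2)),
                    rng.normal([5, 5], 0.3, (40, 2))])
labels, centroids = kmeans(X, k=2)
# Each true cluster should map onto exactly one k-means cluster.
print(sorted(set(labels[:40])), sorted(set(labels[40:])))
```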
Dimensionality reduction methods: – These methods are used to reduce a large number of random variables to a smaller set by applying a suitable transformation to the data.
Principal Component Analysis (PCA) is an example of dimensionality reduction. It uses a mathematical approach to represent a set of variables as a smaller set of uncorrelated variables, called principal components.
These new variables have a simpler relationship among themselves. Other techniques that can find a low-dimensional representation of the data include:
- Linear Discriminant Analysis
- Canonical Correlation Analysis
- Non-negative Matrix Factorization
- Partial Least Squares
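PCA itself can be sketched via the singular value decomposition of the centered data matrix; the synthetic data is stretched along one axis so the first component is predictable:

```python
import numpy as np

# PCA via SVD: center the data, then the right singular vectors of the
# centered matrix are the principal components (directions of most variance).
rng = np.random.default_rng(0)
# 2-D data stretched along the x-axis: the first component should be ~(1, 0).
X = rng.normal(0, [5.0, 0.5], (200, 2))

Xc = X - X.mean(axis=0)                  # center each variable
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt                          # rows = principal components
explained_var = S ** 2 / (len(X) - 1)    # variance along each component

# Project onto the first component to get an uncorrelated 1-D representation.
X1 = Xc @ components[0]
print(components[0], explained_var)
```

Because the SVD sorts singular values in decreasing order, the components come out ordered by how much variance they explain.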
Natural language processing (NLP):
Natural language processing is a subfield of artificial intelligence. It studies how to make computers understand human language and produce language of their own.
NLP applies machine learning algorithms to large sets of documents to find patterns in the data that can be used as predictors for some classifier.
Ex:- You can build an algorithm that automatically distinguishes spam from legitimate e-mails by learning from past examples of each type.
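Such a spam filter can be sketched with multinomial naive Bayes and Laplace smoothing; the six training messages below are invented toy examples:

```python
from collections import Counter
import math

# Tiny hand-made corpus of past examples of each type.
spam = ["win money now", "free money offer", "win a free prize now"]
ham = ["meeting at noon", "lunch tomorrow", "project meeting notes"]

def train(docs):
    counts = Counter(w for d in docs for w in d.split())
    return counts, sum(counts.values())

spam_counts, spam_total = train(spam)
ham_counts, ham_total = train(ham)
vocab = set(spam_counts) | set(ham_counts)

def log_prob(text, counts, total, prior):
    # Multinomial naive Bayes with add-one (Laplace) smoothing,
    # so unseen words do not zero out the whole probability.
    lp = math.log(prior)
    for w in text.split():
        lp += math.log((counts[w] + 1) / (total + len(vocab)))
    return lp

def classify(text):
    p_spam = log_prob(text, spam_counts, spam_total, 0.5)
    p_ham = log_prob(text, ham_counts, ham_total, 0.5)
    return "spam" if p_spam > p_ham else "ham"

print(classify("free money"), classify("meeting tomorrow"))   # spam ham
```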
Another common NLP task is text summarization, which is usually divided into two sub-tasks: abstractive summarization and extractive summarization.
Abstractive summarization: – It builds a summary by rewriting the original text in fewer words while preserving its meaning.
Extractive summarization: – It builds a summary by selecting sentences from the original document and using them as they are.
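The extractive variant can be sketched by scoring sentences on average word frequency and keeping the top sentences verbatim; the scoring rule and example text are illustrative assumptions, not a standard algorithm:

```python
from collections import Counter
import re

# Frequency-based extractive summarizer: score each sentence by the word
# frequencies of the whole text, then keep the best sentences as-is.
def summarize(text, n_sentences=1):
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency rewards sentences built from the text's common words.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # Emit the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in ranked)

text = ("Machine learning learns patterns from data. "
        "The weather was pleasant yesterday. "
        "Models trained on data can predict patterns in new data.")
summary = summarize(text, 1)
print(summary)
```

On this toy text the off-topic weather sentence scores lowest, so it never makes the summary.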
Real-time bidding (RTB) is an advertising mechanism that enables publishers to sell online ad inventory via real-time auctions between advertisers competing for ad positions.
The RTB process has three steps: planning, buying, and reporting.
- Demand-side platforms (DSPs) – These combine data management platforms (DMPs) and campaign management platforms (CMPs). A DSP calculates expected bids from on-site content and user information drawn from its own data set and/or third-party sources, such as the Google Display Network (GDN) and DoubleClick Campaign Manager.
- The expected bids are sent to ad exchanges, which run real-time ad transactions. DSPs adjust their bid prices based on how much inventory is available from the publisher at a given time.
- Publishers receive these bids for their advertising inventory.
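The auction step can be sketched as a simplified second-price auction, a mechanism historically common in RTB exchanges; the DSP names, bid values, floor price, and tick size below are all invented for illustration:

```python
# Simplified second-price auction: the highest bidder wins the impression
# but pays only the second-highest bid plus a small tick.
def run_auction(bids, floor_price=0.0, tick=0.01):
    """bids: dict of bidder -> bid in dollars. Returns (winner, price) or None."""
    valid = {b: v for b, v in bids.items() if v >= floor_price}
    if not valid:
        return None                      # no bid cleared the publisher's floor
    ranked = sorted(valid.items(), key=lambda kv: kv[1], reverse=True)
    winner, top_bid = ranked[0]
    # Clearing price: runner-up bid + tick, bounded by the floor and the winning bid.
    runner_up = ranked[1][1] if len(ranked) > 1 else floor_price
    price = min(top_bid, max(floor_price, runner_up + tick))
    return winner, price

result = run_auction({"dsp_a": 2.50, "dsp_b": 1.80, "dsp_c": 0.40}, floor_price=0.50)
print(result[0], round(result[1], 2))   # dsp_a 1.81
```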
There are a few elements of each type of learning algorithm that must be considered. The amount of training data the algorithm needs is determined by the classifier’s complexity. The dimensionality of the training sets can influence the output variance, which can be mitigated by adjusting the bias. Overfitting and underfitting are problems that produce inaccurate outputs because of how the model fits the data.