Machine learning notes taken in English, to practice English along the way.
There isn’t a well-accepted definition of what is and what isn’t machine learning.
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
- Tom Mitchell
Let’s say your email program watches which emails you do or do not mark as spam. So in an email client like this, you might click the Spam button to report some email as spam but not other emails. And based on which emails you mark as spam, say your email program learns better how to filter spam email.
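As a rough sketch of this spam example in Python: the function names, the word-matching rule, and the sample emails below are all made up for illustration, and this is not how a real email client filters spam. It only shows the three pieces of the definition, a task, some experience, and a performance measure that goes up as experience grows.

```python
def train(labelled_emails):
    """Experience E: learn from (text, is_spam) pairs the user has marked."""
    spam_words, ham_words = set(), set()
    for text, is_spam in labelled_emails:
        target = spam_words if is_spam else ham_words
        target.update(text.lower().split())
    # Keep only words that appeared in spam and never in normal mail.
    return spam_words - ham_words


def classify(text, spam_only_words):
    """Task T: decide whether one email is spam (True) or not (False)."""
    return any(word in spam_only_words for word in text.lower().split())


def accuracy(spam_only_words, test_emails):
    """Performance P: fraction of emails classified correctly."""
    correct = sum(classify(text, spam_only_words) == is_spam
                  for text, is_spam in test_emails)
    return correct / len(test_emails)


# Hypothetical emails the user has marked (True = spam).
marked = [
    ("win free money now", True),
    ("meeting at noon", False),
    ("free prize claim now", True),
    ("lunch at noon tomorrow", False),
]
# Hypothetical held-out emails used to measure P.
held_out = [
    ("claim your prize", True),
    ("free money", True),
    ("see you at lunch", False),
    ("noon meeting moved", False),
]

p_little = accuracy(train(marked[:2]), held_out)  # less experience
p_more = accuracy(train(marked), held_out)        # more experience
print(p_little, p_more)
```

With this toy data, accuracy on the held-out emails rises from 0.75 to 1.0 once the filter has seen all four marked emails.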
Classifying emails is the task T.
Watching you label emails as spam or not spam is the experience E.
The fraction of emails correctly classified might be the performance measure P.
There are several different types of learning algorithms.
The two main types are what we call supervised learning and unsupervised learning.
I hope to make you one of the best people in knowing how to design and build serious machine learning and AI systems.
In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.
Supervised learning problems are categorized into “regression”(回归) and “classification”(分类) problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.
We can put a straight line through the data, that is, fit a straight line to the data. (purple)
And there might be a better one. For example, instead of fitting a straight line to the data, we might decide that it’s better to fit a quadratic function, or a second-order polynomial to this data. (blue)
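To make the line versus quadratic comparison concrete, here is a minimal least-squares sketch in plain Python. The helper names (`polyfit`, `predict`, `squared_error`) and the five data points are assumptions made up for this note, not anything from the course; the fit solves the normal equations with Gaussian elimination, which is fine for tiny examples like this one.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    n = degree + 1
    # Normal equations A c = b: A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def squared_error(coeffs, xs, ys):
    """Sum of squared differences between predictions and targets."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys))


# Made-up data that curves upward, so a straight line fits it poorly.
xs = [1, 2, 3, 4, 5]
ys = [1.1, 3.9, 9.2, 15.8, 25.1]

line_fit = polyfit(xs, ys, degree=1)   # straight line
quad_fit = polyfit(xs, ys, degree=2)   # second-order polynomial
print(squared_error(line_fit, xs, ys), squared_error(quad_fit, xs, ys))
```

On data like this the quadratic has the smaller squared error. Since both models output a continuous value, both are regression.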
We’re trying to predict a discrete-valued output: zero or one. Sometimes you can have more than two possible values for the output.
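A small sketch of classification with more than two output values, in plain Python. It uses a nearest-centroid rule, which is a technique chosen here only because it is short; the notes do not name a specific classifier. The function names, the two-feature points, and the labels 0, 1, 2 are all made up for illustration.

```python
def fit_centroids(points, labels):
    """Average the training points of each class to get one centroid per label."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        total = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(point, centroids):
    """Return the label whose centroid is closest to the point."""
    def dist2(centroid):
        return sum((p - c) ** 2 for p, c in zip(point, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))


# Hypothetical training data: two features per example, three classes.
points = [(1.0, 1.2), (0.8, 1.0), (4.0, 4.2), (4.2, 3.9), (8.0, 1.0), (8.3, 1.1)]
labels = [0, 0, 1, 1, 2, 2]

centroids = fit_centroids(points, labels)
print(classify((1.1, 0.9), centroids))
print(classify((4.1, 4.0), centroids))
print(classify((7.9, 1.3), centroids))
```

The output is always one of a fixed set of labels, never a value in between, which is what separates this from regression.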
Some learning algorithms can deal with not just a handful of features but an infinite number of them.
Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don’t necessarily know the effect of the variables.
We can derive this structure by clustering(聚集) the data based on relationships among the variables in the data.
With unsupervised learning there is no feedback based on the prediction results.
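Clustering without any labels can be sketched with k-means in plain Python. This is one common clustering method picked for brevity, not something the notes prescribe. The points, the choice of k = 2, and the naive "first k points" initialisation are assumptions for this example.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(point, centroid))
    return min(range(len(centroids)), key=lambda i: dist2(centroids[i]))


def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (assignments, centroids)."""
    # Naive initialisation (an assumption for this sketch): the first k points.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Step 2: move each centroid to the mean of its group.
        for i, members in enumerate(groups):
            if members:  # keep the old centroid if a group is empty
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return [nearest(p, centroids) for p in points], centroids


# Made-up, unlabeled data: no "correct answer" is given to the algorithm.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (0.9, 1.1), (9.2, 8.8), (8.9, 9.1)]
assignments, centroids = kmeans(points, k=2)
print(assignments)
```

The algorithm only sees the points themselves. The cluster numbers it returns are arbitrary group IDs, not labels supplied by anyone.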
Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.
Non-clustering: The “Cocktail Party Algorithm” allows you to find structure in a chaotic environment (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).