I am working through Google’s Machine Learning Crash Course. The notes in this post cover  and .
A lot of ML quickstarts dive right into jargon like model, feature, y’, L2, etc., which makes it hard for me to learn the basics: “what are we doing and why?”
The crash course also presents some jargon, but at least explains each concept and links to a glossary, which makes it easier to learn.
After a few days of poking around, one piece of jargon seems irreducible: linear regression. In other words, this is the kind of basic ML concept I’ve been looking for. This is where I’d start if I were helping someone learn ML.
I probably learned about linear regression in the one statistics class I took in college, but have forgotten about it after years of string parsing 🙂
The glossary entry for linear regression describes it as “Using the raw output (y’) of a linear model as the actual prediction in a regression model”, which is still too dense for me.
The linear regression module of the crash course is closer to my level:
“Linear regression is a method for finding the straight line … that best fits a set of points.”
The crash course provides a good example: a line fitted to points that plot cricket chirps per minute against temperature.
The “linear” in “linear regression” refers to this straight line, as in linear equation. The “regression” comes from “regression to the mean”, a statistical observation that is, unfortunately, unrelated to statistical methods like the least squares technique described below, as John Seymour humorously explains.
Math is Fun describes a technique called “least squares regression” for finding such a line. Google’s glossary also has an entry for least squares regression, which gives me confidence that I’m bridging my level (Math is Fun) with the novel concept of ML.
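Here is a minimal sketch of least squares regression in Python, as I understand the technique. The points are made up for illustration (they are not the crash course's cricket data), and the function name `least_squares_line` is my own. It uses the standard closed-form sums for the slope and intercept of the line y = mx + b.

```python
def least_squares_line(points):
    """Return (m, b) for the line y = m*x + b that minimizes squared vertical error."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)

    # Slope first, then the intercept that makes the line pass through the mean point.
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]
m, b = least_squares_line(points)
print(f"y = {m:.2f}x + {b:.2f}")  # prints: y = 0.60x + 2.20
```

The function would divide by zero if every point had the same x value, since no single non-vertical line fits that case; I left that check out to keep the sketch short.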
Helpful tip from StatQuest’s “Machine Learning Fundamentals: Bias and Variance”: differences are squared so that negative distances don’t cancel out positive distances.
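To convince myself of that tip, here is a small Python sketch. The points and the candidate line y = 0.6x + 2.2 are assumptions I picked for illustration. It compares the plain sum of the differences with the sum of the squared differences.

```python
# Hypothetical points and a candidate line y = 0.6x + 2.2 (both assumed for illustration).
points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]
m, b = 0.6, 2.2

# Difference between each actual y and the line's y:
# positive when the point is above the line, negative when it is below.
residuals = [y - (m * x + b) for x, y in points]

plain_sum = sum(residuals)                    # positives and negatives cancel out
squared_sum = sum(r ** 2 for r in residuals)  # every term is non-negative, so nothing cancels

print(f"sum of differences:         {plain_sum:.4f}")
print(f"sum of squared differences: {squared_sum:.4f}")
```

The plain sum comes out at (essentially) zero even though no point sits exactly on the line, so it says nothing about how well the line fits. The squared sum stays positive and shrinks only as the line gets closer to the points.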
The crash course even describes the equation of this line as a “model”: “By convention in machine learning, you’ll write the equation for a model slightly differently …”
All this helps me understand in the most basic sense:
- A “model” is just an equation
- “Training” and “learning” are just performing a regression calculation to generate an equation
- Performing these calculations regularly and on large data sets is tedious and error-prone, so we use a computer, hence “machine learning”
- “Prediction” and “inference” are just plugging x values into the equation
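The list above can be sketched in a few lines of Python. This is my own toy illustration, not how the crash course or any ML library does it: the function names `train` and `predict` and the example data are hypothetical, and I am assuming the ML-style form y' = b + w1 * x1 (bias plus weight times feature) for the equation.

```python
def train(examples):
    # "Training": run a least squares regression over (x1, y) examples.
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    w1 = (sum((x - mean_x) * (y - mean_y) for x, y in examples)
          / sum((x - mean_x) ** 2 for x, _ in examples))
    b = mean_y - w1 * mean_x
    return b, w1  # the "model" is just the numbers that define the equation


def predict(model, x1):
    # "Prediction" / "inference": plug x1 into y' = b + w1 * x1.
    b, w1 = model
    return b + w1 * x1


# Made-up examples that lie exactly on y = 3 + 2x, so the result is easy to check by hand.
examples = [(0, 3), (1, 5), (2, 7), (3, 9)]
model = train(examples)
prediction = predict(model, 10)
print(model, prediction)  # prints: (3.0, 2.0) 23.0
```

Real data never lines up this neatly, but the shape is the same: training produces the numbers in the equation, and prediction just evaluates it.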