Locally Weighted Linear Regression
Locally weighted linear regression is a non-parametric learning algorithm: the amount of data we must keep around to represent the hypothesis hθ(x) grows linearly with the size of the training set m. Thus memory requirements increase with the training set.
The idea: an algorithm that makes it easy to fit curved data
- Look at the data in a small neighborhood around the point you're interested in
- Build a local hypothesis just for that neighborhood and use it to predict in that area
- Given a query location x where we want to make a prediction,
- The weights depend on the particular point x at which we're trying to evaluate the hypothesis. A standard choice is w(i) = exp(−(x(i) − x)² / (2τ²)), so that:
	if |x(i) − x| is small, then w(i) is close to 1
	if |x(i) − x| is large, then w(i) is small (close to 0)
- So how do we determine the appropriate value of θ?
	Fit θ to minimize Σᵢ w(i)(y(i) − θᵀx(i))², i.e. the training examples closest to the query point get the highest weight in the fit
- Bandwidth parameter τ: the exponential is selected because we want a bell-shaped curve that peaks close to x and then falls off quickly
	τ controls the width of that bell (fat vs. thin)
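The steps above can be sketched as a short NumPy function; `lwr_predict` and its `tau` parameter are illustrative names, assuming a design matrix whose first column is the intercept:

```python
import numpy as np

def lwr_predict(X, y, x_query, tau):
    """Locally weighted linear regression prediction at one query point.

    X: (m, n) training inputs (first column = 1 for the intercept),
    y: (m,) targets, x_query: (n,) query point, tau: bandwidth.
    """
    # Gaussian weights: close to 1 for training points near x_query,
    # close to 0 for points far away.
    diffs = X - x_query
    w = np.exp(-np.sum(diffs ** 2, axis=1) / (2 * tau ** 2))
    # Fit theta by weighted least squares: minimize
    # sum_i w(i) * (y(i) - theta^T x(i))^2, whose closed form is
    # theta = (X^T W X)^{-1} X^T W y.
    W = np.diag(w)
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return x_query @ theta
```

Note that a new θ is solved for at every query point, which is what makes the method non-parametric.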
Regular normal equation: θ = (XᵀX)⁻¹Xᵀy
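A minimal sketch of the normal equation in NumPy (the function name is illustrative); solving the linear system is preferred to forming the inverse explicitly:

```python
import numpy as np

def normal_equation(X, y):
    # Closed-form least-squares fit: theta = (X^T X)^{-1} X^T y,
    # computed by solving (X^T X) theta = X^T y.
    return np.linalg.solve(X.T @ X, X.T @ y)
```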
Probabilistic interpretation of the data: assume y(i) = θᵀx(i) + ε(i),
where ε(i) is an error term which captures unmodeled effects or random noise, and ε(i) ~ N(0, σ²).
The density of the ε(i) is given by
	p(ε(i)) = 1/(√(2π)σ) · exp(−(ε(i))² / (2σ²))
This implies that
	p(y(i) | x(i); θ) = 1/(√(2π)σ) · exp(−(y(i) − θᵀx(i))² / (2σ²))
i.e. the distribution of y(i) given x(i), parameterized by θ, is N(θᵀx(i), σ²).
Given the design matrix X, which contains all the training inputs x(i) as rows, the likelihood of the data is
	L(θ) = p(y | X; θ) = ∏ᵢ p(y(i) | x(i); θ)
Maximum likelihood estimation
We should choose θ so as to make the observed data as probable as possible, i.e. maximize L(θ).
Equivalently, we can maximize the log likelihood l(θ):
	l(θ) = log L(θ) = Σᵢ log [1/(√(2π)σ) · exp(−(y(i) − θᵀx(i))² / (2σ²))]
	     = m · log(1/(√(2π)σ)) − (1/σ²) · (1/2) Σᵢ (y(i) − θᵀx(i))²
Hence maximizing l(θ) is the same as minimizing (1/2) Σᵢ (y(i) − θᵀx(i))², which is exactly the least-squares cost function J(θ).
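A quick numerical check of this equivalence, under the assumption σ = 1 (the `log_likelihood` helper is illustrative): the Gaussian log likelihood should be maximized exactly at the least-squares θ.

```python
import numpy as np

def log_likelihood(theta, X, y, sigma=1.0):
    # l(theta) = m * log(1/(sqrt(2*pi)*sigma))
    #            - (1/sigma^2) * (1/2) * sum_i (y(i) - theta^T x(i))^2
    m = len(y)
    resid = y - X @ theta
    return (m * np.log(1.0 / (np.sqrt(2.0 * np.pi) * sigma))
            - resid @ resid / (2.0 * sigma ** 2))
```

Evaluating `log_likelihood` at the normal-equation solution and at randomly perturbed parameters shows the former is always at least as large, as the derivation predicts.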