
Statistical Decision Theory

In this post, we will develop a framework for tackling a supervised learning problem. We will also see how this framework leads to one of the most common supervised learning algorithms: K-Nearest Neighbors. A supervised learning problem involves predicting an output (y) given an input (x). Assuming we are given the joint probability distribution of x and y, our goal is to find an approximating function (f) which predicts the value of y given x.

To analyze the performance of a prediction function, we define a loss function (L). Typically, loss functions are real-valued functions bounded below by 0. If y and y' are two real numbers with $y \neq y'$, a loss function satisfies $L(y, y) \leq L(y, y')$: predicting the true value never costs more than predicting any other value. Our goal is to select the function f which minimizes the expected loss $E[L(y, f(x))]$ under the joint distribution of x and y. Assuming the joint distribution is discrete, we get: \begin{align} E[L(y, f(x))] = \sum_x \sum_y L(y, f(x)) \, P(x, y) \end{align} ...
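To make the expected-loss computation concrete, here is a minimal sketch in Python. The joint distribution `P`, the squared-error loss, and the candidate predictor are all illustrative assumptions of mine, not taken from the post:

```python
# A minimal sketch of expected loss under a discrete joint distribution.
# The distribution P, the loss, and the predictor below are assumptions
# chosen for illustration only.

# P[(x, y)] = joint probability of observing input x with output y.
P = {
    (0, 0): 0.3,
    (0, 1): 0.1,
    (1, 0): 0.2,
    (1, 1): 0.4,
}

def squared_error(y, y_pred):
    """A typical loss: 0 when y == y_pred, positive otherwise."""
    return (y - y_pred) ** 2

def expected_loss(f, loss, joint):
    """E[L(y, f(x))] = sum over (x, y) of L(y, f(x)) * P(x, y)."""
    return sum(loss(y, f(x)) * p for (x, y), p in joint.items())

# A candidate predictor: predict the label equal to the input.
identity = lambda x: x
print(expected_loss(identity, squared_error, P))  # 0.1 + 0.2 = 0.3
```

Comparing `expected_loss` across candidate predictors is exactly the selection criterion described above: among all functions f, we prefer the one with the smallest expected loss.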