Fitting the line to data: the Quality Metric

The estimated function is:

$$\hat{f}(x) = \hat{w}_0 + \hat{w}_1 x$$

In this setting, $\hat{w}_0$ (the intercept) and $\hat{w}_1$ (the slope) are the estimated regression coefficients, also just called $\hat{w}$ to generalize the notion. So in our discussion, we can talk about $\hat{w}$ rather than $\hat{f}$, because the coefficients fully describe the function.

We can calculate the ‘cost’ of using a given function (in this case a line) using the Residual Sum of Squares (RSS). A residual is the difference between an observed value and the value the function predicts for it. To calculate the RSS for a set of regression coefficients, use the coefficients to compute a predicted value for each training point, take the difference between the observed value and the predicted value, and square that difference. Do this for all the training data and sum the squared differences. This could be more descriptively called the Sum of the Squared Residuals (but don’t do that). Writing $x_i$ and $y_i$ for the input and observed output of the $i$-th of $N$ training points, the RSS can be written:

$$\mathrm{RSS}(w_0, w_1) = \sum_{i=1}^{N} \left( y_i - [w_0 + w_1 x_i] \right)^2$$
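The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names `predict` and `rss` and the small data set are made up for the example, and the line is assumed to have the form intercept plus slope times input.

```python
def predict(w0, w1, x):
    """Predicted value from a line with intercept w0 and slope w1."""
    return w0 + w1 * x


def rss(w0, w1, xs, ys):
    """Residual Sum of Squares of the line (w0, w1) on the training data."""
    total = 0.0
    for x, y in zip(xs, ys):
        residual = y - predict(w0, w1, x)  # observed minus predicted
        total += residual ** 2             # square it and accumulate
    return total


# Hypothetical training data that lies exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(rss(1.0, 2.0, xs, ys))  # the true line: every residual is 0, prints 0.0
print(rss(0.0, 2.0, xs, ys))  # intercept off by 1: four residuals of 1, prints 4.0
```

Because each residual is squared, points above and below the line both add to the cost, so errors cannot cancel each other out.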

Now, given this function, we can calculate the cost of any estimated line (the line with estimated coefficients $\hat{w}_0$ and $\hat{w}_1$). The idea, then, is to find the line with the lowest cost and use that as our model for predicting values.
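To make the idea of "lowest cost" concrete, the sketch below scores a handful of hand-picked candidate lines and keeps the one with the smallest RSS. The data and the candidate list are hypothetical, and comparing a fixed list is not how the best line is found in practice; it only shows that the RSS lets us rank lines against each other.

```python
def rss(w0, w1, xs, ys):
    """Residual Sum of Squares of the line y = w0 + w1 * x on the data."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical noisy observations scattered around y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

# Candidate lines as (intercept, slope) pairs, chosen by hand for illustration.
candidates = [(0.0, 2.0), (1.0, 2.0), (1.0, 1.5), (2.0, 1.0)]

# Cost of each candidate line on the training data.
costs = {c: rss(c[0], c[1], xs, ys) for c in candidates}

# The preferred model is the candidate with the lowest cost.
best = min(candidates, key=lambda c: costs[c])

for c in candidates:
    print(c, round(costs[c], 2))
print("lowest-cost line:", best)
```

Here the candidate with intercept 1.0 and slope 2.0 wins, since its residuals are the smallest at every training point.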
