Implementing Regression and goodness of fit



1.     Implementing OLS regression
2.     Implementing goodness of fit – chi-square
Concept:
In statistics, ordinary least squares (OLS) or linear least squares is a method for estimating the unknown parameters in a linear regression model. This method minimizes the sum of squared vertical distances between the observed responses in the dataset and the responses predicted by the linear approximation. The resulting estimator can be expressed by a simple formula, especially in the case of a single regressor on the right-hand side.
OLS is used in economics (econometrics), political science and electrical engineering (control theory and signal processing), among many areas of application.
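For the single-regressor case, the "simple formula" can be sketched in a few lines of plain Python. This is a minimal sketch, not a full regression routine: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept follows from the means. The function names `ols_fit` and `r_squared` and the sample data are illustrative assumptions.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line (y must not be constant)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up sample data, roughly following y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = ols_fit(x, y)
print(intercept, slope, r_squared(x, y, intercept, slope))
```

With several regressors the same idea is usually written in matrix form and handed to a numerical library rather than coded by hand.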
A chi-squared test, also referred to as a chi-square test or χ² test, is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true. A test is also considered a chi-squared test if this holds asymptotically, meaning that the sampling distribution (if the null hypothesis is true) can be made to approximate a chi-squared distribution as closely as desired by making the sample size large enough.
Pearson's chi-squared test is used to assess two types of comparison: tests of goodness of fit and tests of independence.
·        A test of goodness of fit establishes whether or not an observed frequency distribution differs from a theoretical distribution.
·        A test of independence assesses whether paired observations on two variables, expressed in a contingency table, are independent of each other (e.g. polling responses from people of different nationalities to see if one's nationality is related to the response).
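As a rough illustration of the test of independence, the sketch below computes the χ² statistic for a contingency table. It assumes the usual construction in which each expected count is (row total × column total) / grand total and the degrees of freedom are (rows − 1) × (columns − 1). The function name `independence_statistic` and the example counts are made up for illustration.

```python
def independence_statistic(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a positive total")

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical poll: rows are two groups, columns are "yes" / "no" answers.
chi2, df = independence_statistic([[20, 30], [30, 20]])
print(chi2, df)
```

The statistic is then compared with a critical value for `df` degrees of freedom, taken from a chi-squared table at the chosen significance level.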
The procedure of the test includes the following steps:
1.   Calculate the chi-squared test statistic, χ², which resembles a normalized sum of squared deviations between observed and theoretical frequencies.
2.   Determine the degrees of freedom, df, of that statistic, which is essentially the number of frequencies reduced by the number of parameters of the fitted distribution.
3.   Compare χ² to the critical value from the chi-squared distribution with df degrees of freedom, which in many cases gives a good approximation of the distribution of χ².
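The three steps can be sketched for a goodness-of-fit test as follows. The sketch makes these assumptions:

·        The statistic takes the usual Pearson form, χ² = Σ (Oᵢ − Eᵢ)² / Eᵢ, over the observed counts Oᵢ and expected counts Eᵢ.
·        The fixed total count is treated as costing one degree of freedom, so df = categories − 1 − number of estimated distribution parameters.
·        The critical value is read from a chi-squared table by the caller; 11.070 is the tabulated value for df = 5 at the 5% significance level.
·        The function names and the die-roll counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Step 1: sum of squared deviations, each normalized by its expected count."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def goodness_of_fit(observed, probabilities, critical_value, estimated_params=0):
    """Return (chi2, df, reject) for observed counts against theoretical probabilities."""
    total = sum(observed)
    expected = [total * p for p in probabilities]
    chi2 = chi_square_statistic(observed, expected)
    # Step 2: the fixed total costs one degree of freedom (assumed convention),
    # plus one for each distribution parameter estimated from the data.
    df = len(observed) - 1 - estimated_params
    # Step 3: reject the null hypothesis if the statistic exceeds the critical value.
    reject = chi2 > critical_value
    return chi2, df, reject


# Hypothetical experiment: a six-sided die rolled 60 times, tested for fairness.
observed = [5, 8, 9, 8, 10, 20]
fair_die = [1 / 6] * 6
chi2, df, reject = goodness_of_fit(observed, fair_die, critical_value=11.070)
print(chi2, df, reject)
```

A statistics library would normally supply the critical value or a p-value directly; passing it in by hand keeps the sketch free of external dependencies.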
