Simple Regression

 


 How do we know what will happen in the future if we adopt a policy?

One way is to look at what happened in other places when similar policies were adopted.

 Using the Chi Square test, we could test for a relationship between two nominal variables.

The simplest example is the 2 by 2 crosstab.

 Crosstab of Dichotomous (Yes-No) Data

                               Dependent Variable
                               No        Yes
 Independent Variable   No      7         13
                        Yes    13          7
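As a rough illustration of the Chi Square test on the crosstab above, the sketch below computes the Pearson Chi Square statistic by hand in Python. The function name chi_square is made up for this example, no continuity correction is applied, and 3.841 is the standard critical value for 1 degree of freedom at the 5 percent level.

```python
# Observed counts from the 2 by 2 crosstab above.
observed = [[7, 13],
            [13, 7]]


def chi_square(table):
    """Pearson Chi Square statistic (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count we would expect in this cell if the variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat


statistic = chi_square(observed)
CRITICAL_05_DF1 = 3.841  # critical value: 1 degree of freedom, 5 percent level

print(f"Chi Square = {statistic:.2f}")
print("Significant at the 5 percent level:", statistic > CRITICAL_05_DF1)
```

With only 40 cases, a difference this size falls just short of the 5 percent cutoff, which is one reason to look for more informative interval data when it is available.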

 

But sometimes the policy variable, or the expected outcomes of that policy, are interval level data.

Linear Regression (also called Ordinary Least Squares) can be used on interval level data.

 

For example, imagine a municipality is considering adopting a 1.5 percent local payroll tax.

What will happen to wages?  What will happen to new job creation?

 

It would be very easy if we could take two similar municipalities, impose a 1.5 percent payroll tax on one and no such tax on the other, and see how their subsequent development differs.

But in a free society, we cannot impose our experiments on other people.

Municipalities, however, may conduct their own experiments.

One city may impose a half percent payroll tax, another a two percent tax, and another a five percent tax.  It is possible that no one has ever imposed a 1.5 percent payroll tax before.

Different levels of payroll tax are interval data.

 On the outcome side, the economies of these cities may grow at different rates from two to six percent a year.  Different growth rates are also interval data.

 When both our dependent and independent variables are interval data, we can use regression analysis.  From our results we can estimate the change in our dependent variable (like new job creation) for each percentage point of payroll tax.  Thus we could predict the response to a 1.5 percent payroll tax, even if no one had ever imposed a payroll tax at that rate before, because 1.5 percent falls within the range of rates that other cities have tried.
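To make the prediction idea concrete, here is a minimal Python sketch. The five cities and their numbers are invented purely for illustration (they are not real data), and the helper name fit_line is hypothetical. The sketch fits a least squares line to tax rates and job growth, then reads off the predicted growth at a 1.5 percent tax.

```python
# Hypothetical data, invented for illustration only (not real cities).
payroll_tax = [0.5, 1.0, 2.0, 3.0, 5.0]  # X: payroll tax rate, percent
job_growth = [5.9, 5.2, 4.8, 3.6, 2.3]   # Y: new job growth, percent per year


def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(payroll_tax, job_growth)

# Predict the outcome for a tax rate that no city in the data actually used.
predicted = intercept + slope * 1.5

print(f"Change in job growth per point of payroll tax: {slope:.3f}")
print(f"Predicted job growth at a 1.5 percent tax: {predicted:.2f}")
```

The prediction is only as good as the straight-line assumption and the comparability of the cities, so it should be read as an estimate rather than a guarantee.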

 

The basic assumption of simple regression is that changes in X cause changes in Y.

The first thing to do is put your Ys in one column, and your Xs in another, and graph them.

 Linear regression assumes that the relationship is a straight line.  If it is not, you may have to transform the data to make the relationship a straight line.  This is something that you would learn in a course devoted entirely to linear regression.


The assumptions of linear regression are:

 1. Both X and Y are interval data

 2. The relationship between X and Y is linear.

 3. The errors (differences between the observed and predicted values of Y) are normally distributed with a mean of zero.  This results in a bell shaped curve.  Note: in the last class we found that we use probabilistic reasoning and a bell shaped curve to estimate significance.

 4. The spread (variance) of the errors is constant regardless of the value of X.  (If the errors were greater over a certain range, usually at high values of X, then our predictions and estimates of significance would not be as valid over that range.)

 5. Errors must be independent of each other.  Violations of this assumption can happen if the subject of the experiment has a memory, or if past values of the independent variable are still influencing behavior.
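One informal way to look at assumptions 3 and 4 is to compute the errors (residuals) after fitting the line and see how they behave. The Python sketch below does this on invented data; the numbers and the helper name fit_line are hypothetical. Note that the residuals of a least squares line with an intercept always average zero, so the mean is only a check on the arithmetic. The comparison of spread at low and high X is the rough check on assumption 4.

```python
# Hypothetical data, invented for illustration only, sorted by X.
xs = [0.5, 1.0, 2.0, 3.0, 4.0, 5.0]  # payroll tax rate, percent
ys = [5.9, 5.2, 4.8, 3.6, 3.1, 2.3]  # job growth, percent per year


def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(xs, ys)

# Residual = observed Y minus the Y predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)

# Compare the typical size of the errors at low X and at high X.
half = len(residuals) // 2
low_spread = sum(abs(r) for r in residuals[:half]) / half
high_spread = sum(abs(r) for r in residuals[half:]) / (len(residuals) - half)

print(f"Mean residual: {mean_residual:.6f}")
print(f"Average error size, low X:  {low_spread:.3f}")
print(f"Average error size, high X: {high_spread:.3f}")
```

With so few points this is only a rough look. In practice a plot of the residuals against X is the usual first check.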

 

 

 

EXCEL Instructions

   

The instructions for interpreting a regression analysis are in your Essential Statistics book.

The instructions for generating a simple regression on EXCEL are as follows:

Enter your data, with Y (the dependent variable) in one column and your X(s) (the independent variable or variables) in the next column or columns.

Click your mouse on: Tools/ Data Analysis/ Regression.

On the resulting page:

 Click on the X range and highlight your Xs

 Click on the Y range and highlight your Ys

 Hit enter and your output will appear on a new worksheet (usually sheet 4).

 

Your Adjusted R Square is the proportion of the variation in Y explained by X, adjusted for the number of X variables in the model.

Your “Intercept” is the predicted value of Y when X is zero.

Your “X Variable 1” is the change in Y for each one-unit change in X (the slope of the line).
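If you want to see where these three numbers come from, the Python sketch below computes them by hand for a single X variable. The data are invented for illustration and the variable names are hypothetical. The adjustment to R Square uses the standard formula 1 - (1 - R Square)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of X variables.

```python
# Hypothetical data, invented for illustration only.
xs = [0.5, 1.0, 2.0, 3.0, 5.0]  # X: payroll tax rate, percent
ys = [5.9, 5.2, 4.8, 3.6, 2.3]  # Y: job growth, percent per year

n = len(xs)
k = 1  # number of X variables in a simple regression
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope ("X Variable 1") and intercept of the least squares line.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# R Square: share of the total variation in Y that the line accounts for.
fitted = [intercept + slope * x for x in xs]
ss_total = sum((y - mean_y) ** 2 for y in ys)
ss_resid = sum((y - f) ** 2 for y, f in zip(ys, fitted))
r_square = 1 - ss_resid / ss_total

# Adjusted R Square penalizes R Square for the number of X variables.
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(f"Intercept:         {intercept:.3f}")
print(f"X Variable 1:      {slope:.3f}")
print(f"R Square:          {r_square:.3f}")
print(f"Adjusted R Square: {adjusted_r_square:.3f}")
```

Entering the same two columns into the Excel regression tool should give matching values, which is a handy way to confirm you have selected the X and Y ranges correctly.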

 

Homework is on the Blackboard Course site.