A Note on the Role of Behavioral Agents in Estimating Causal Treatment Effects Using Observational Data
To appreciate this point, consider the following simple example.
Assume that the interest rate = 0%. Assume that people live for two years. If you go to college for one year, you only work the 2nd year. If you don't go to college, you work both years.
Person     Earnings if go to college     Earnings if do not go to college
Matthew    86000                         41000
Jill       99000                         43000
Billy      108000                        42000
Suppose that the tuition cost = $10,000
If each of these 3 different people is a lifetime net income maximizer, then we see that:
Matthew does not go to college because 86000 - 10000 < 2*41000
Jill does go to college because 99000 - 10000 > 2*43000
Billy goes to college because 108000 - 10000 > 2*42000
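The three comparisons above can be checked with a few lines of Python. This is only a sketch of the decision rule as stated in the text; the names `EARNINGS`, `TUITION`, and `goes_to_college` are my own labels, not anything from the Heckman paper.

```python
# Numbers from the table: (earnings if go to college, one year's earnings if not).
EARNINGS = {
    "Matthew": (86000, 41000),
    "Jill": (99000, 43000),
    "Billy": (108000, 42000),
}
TUITION = 10000  # tuition cost assumed in the text


def goes_to_college(college_earnings, annual_no_college, tuition=TUITION):
    """True if lifetime net income is higher with college (interest rate = 0%)."""
    net_college = college_earnings - tuition  # works only in the 2nd year
    net_no_college = 2 * annual_no_college    # works both years
    return net_college > net_no_college


decisions = {name: goes_to_college(*pay) for name, pay in EARNINGS.items()}
for name, goes in decisions.items():
    print(name, "goes to college" if goes else "does not go to college")
```

Only Matthew stays out, which is the sorting that the rest of the example relies on.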
In this case, in the face of essential heterogeneity, the researcher using these observational data would conclude that the average treatment effect of going to college =
E(earnings|go to college) - E(earnings| not go to college) = 103500-41000*2
Note that the 103500 is the average lifetime earnings of Jill and Billy as college graduates, and the 41000*2 = 82000 is observed because this is Matt's lifetime earnings, as he chooses not to go to college. The estimated effect is therefore 21500.
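To make the researcher's calculation concrete, here is a small sketch that uses only what is observed: each person's realized lifetime earnings given the choice they made. The variable names are mine, and the attendance flags simply encode the sorting described above with tuition = $10,000.

```python
# (earnings if go to college, one year's earnings if not, attends college?)
people = {
    "Matthew": (86000, 41000, False),
    "Jill": (99000, 43000, True),
    "Billy": (108000, 42000, True),
}

# The researcher sees one outcome per person, never both.
observed_college = [c for c, n, goes in people.values() if goes]
observed_no_college = [2 * n for c, n, goes in people.values() if not goes]

mean_college = sum(observed_college) / len(observed_college)
mean_no_college = sum(observed_no_college) / len(observed_no_college)
naive_effect = mean_college - mean_no_college

print(mean_college, mean_no_college, naive_effect)
```

The comparison group here is a single person, Matthew, whose own gain from college is much smaller than Jill's or Billy's.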
A key point in the Heckman paper is that there can be instrumental variables that would nudge Matt to go to college. From the participation equation above, note that if Matt is offered free tuition to go to college (a cost shifter), he will choose to go to college because 86000 > 2*41000. Matt is "at the margin" in this 3-person economy. A small subsidy is not enough, though: if he is offered a tuition break of only $500, he will still choose not to go to college because 86000 - 9500 < 2*41000.
Facing this zero tuition, all 3 would go to college and a researcher would not be able to estimate the average returns to going to college because there is no control group! Nobody chose to not go to college.
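The effect of the cost shifter can be sketched by recomputing who attends at different tuition levels. The helper `attendees` and the dictionary `break_even` are hypothetical names I am introducing for illustration; the rule is the same net income comparison used earlier.

```python
# (earnings if go to college, one year's earnings if not)
EARNINGS = {
    "Matthew": (86000, 41000),
    "Jill": (99000, 43000),
    "Billy": (108000, 42000),
}


def attendees(tuition):
    """Names of the people who choose college at the given tuition."""
    return [name for name, (c, n) in EARNINGS.items() if c - tuition > 2 * n]


# Full tuition, a $500 break, and free tuition.
for tuition in (10000, 9500, 0):
    print(tuition, attendees(tuition))

# Tuition at which each person is exactly indifferent:
# college earnings minus two years of non-college earnings.
break_even = {name: c - 2 * n for name, (c, n) in EARNINGS.items()}
print(break_even)
```

Matthew's break-even tuition is the lowest of the three, so he is the only one whose choice responds to the subsidy, and at zero tuition nobody is left in the comparison group.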
Now suppose that we continue with this economy of 3 types of people, but we make the new assumption that there are thousands of each of these 3 types. The 3 types are the same as before except for 1 new assumption.
Suppose there are 5000 Matthews, 5000 Jills and 5000 Billys. In this economy of 15,000 people, let each of these decision makers choose whether to go to college at random. This may reflect a personality trait such as being over-confident or being impatient about bearing upfront costs (paying for education) versus later benefits. So, I am assuming that this personality trait is iid with mean zero across the 15,000 people. Essentially, these people are flipping coins to decide whether they go to college or not.
Under the assumption that these behavioral personality traits are independent of one's skill level and that these behavioral traits determine whether one attends college or not, what does the econometrician observe?
1/2 of the Billys go to college and 1/2 do not.
1/2 of the Matthews go to college and 1/2 do not.
1/2 of the Jills go to college and 1/2 do not.
The econometrician recalculates:
E(earnings|go to college) - E(earnings| not go to college) =
(1/3)*(86000 + 99000 + 108000) - (1/3)*2*(41000 + 42000 + 43000) ≈ 13667
and this is an accurate average college treatment effect for the entire population.
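A quick simulation of the coin-flipping economy illustrates the claim. This is a sketch under the assumptions in the text (5000 of each type, a fair coin independent of type); the seed value is arbitrary and only makes the run repeatable.

```python
import random

# (earnings if go to college, one year's earnings if not)
EARNINGS = {
    "Matthew": (86000, 41000),
    "Jill": (99000, 43000),
    "Billy": (108000, 42000),
}
N_PER_TYPE = 5000

rng = random.Random(0)  # arbitrary fixed seed

college, no_college = [], []
for c, n in EARNINGS.values():
    for _ in range(N_PER_TYPE):
        if rng.random() < 0.5:        # coin flip, unrelated to the person's type
            college.append(c)         # observed lifetime earnings with college
        else:
            no_college.append(2 * n)  # observed lifetime earnings without college

estimate = sum(college) / len(college) - sum(no_college) / len(no_college)

# Population average of each type's own gain (needs both columns of the table).
true_ate = sum(c - 2 * n for c, n in EARNINGS.values()) / len(EARNINGS)

print(round(estimate), round(true_ate))
```

The simulated difference in means lands close to the population average gain because each type is represented in roughly equal shares on both sides of the comparison.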
What just happened? When economic agents sort on the gain, OLS overestimates the population average treatment effect. When economic agents sort into treatment at random, OLS performs much better. Behavioral agents can be thought of as "randomly sorting".
In reality, only a subset of decision makers sort by such random factors, but this means that the presence of an unknown subset of behavioral agents shrinks the OLS treatment estimator back toward "the truth".
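The shrinkage can be sketched by mixing the two kinds of agents. The function below assumes, as my own simplification, that the same share of each type is behavioral (flips a coin) and the rest maximize net income at tuition = $10,000; `naive_estimate` and `share_behavioral` are hypothetical names.

```python
# (earnings if go to college, one year's earnings if not)
EARNINGS = {
    "Matthew": (86000, 41000),
    "Jill": (99000, 43000),
    "Billy": (108000, 42000),
}
TUITION = 10000


def naive_estimate(share_behavioral):
    """Population difference in mean observed lifetime earnings,
    college minus no college, for a given share of coin-flipping agents."""
    p = share_behavioral
    college_mass = college_sum = 0.0
    no_mass = no_sum = 0.0
    for c, n in EARNINGS.values():
        rational_goes = c - TUITION > 2 * n
        # Behavioral agents attend with probability 1/2; rational agents sort on the gain.
        prob_college = p * 0.5 + (1 - p) * (1.0 if rational_goes else 0.0)
        college_mass += prob_college
        college_sum += prob_college * c
        no_mass += 1 - prob_college
        no_sum += (1 - prob_college) * 2 * n
    return college_sum / college_mass - no_sum / no_mass


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, round(naive_estimate(p)))
```

With no behavioral agents the function reproduces the sorted-economy estimate, and with everyone flipping coins it reproduces the population average gain; intermediate shares fall in between.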
So, the point of this blog post is to highlight that behavioral agents improve the quality of life of the reduced form econometrician.