Interpretation of Regressions with Multiple Proxies

Darren Lubotsky (lubotsky@uiuc.edu), University of Illinois at Urbana-Champaign
Martin Wittenberg (wittenbergm@sebs.wits.ac.za), University of the Witwatersrand

April 2003
 

Abstract

Multiple proxy variables are typically available for one unobserved explanatory variable in a linear regression. We provide a procedure by which the coefficient of interest can be estimated from a multiple regression in which all the proxies are included simultaneously. This estimator is strictly superior in large samples to the more common practice of creating an index or summary measure of the proxy variables, such as the average of the proxies or the first principal component. We use our procedure to examine the relationship between parents' permanent income and children's reading test scores in the United States, and the relationship between parents' assets and children's school enrollment in India, and demonstrate that the reduction in attenuation bias from a better use of proxy variables can be significant.
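
The attenuation problem described above can be illustrated with a small simulation. The sketch below is not the paper's procedure, which the abstract does not spell out. It assumes a hypothetical setup with two proxies, each equal to the unobserved variable plus independent classical measurement error with a different noise level, and it combines the multiple-regression coefficients by simply summing them, which is only a sensible choice under that equal-loading assumption. The function name simulate and all parameter values are illustrative.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n


def simulate(n=50_000, beta=1.0, noise_sd=(0.7, 1.4), seed=0):
    """Compare three ways of using two noisy proxies for x_star.

    Assumed (hypothetical) model:
        y   = beta * x_star + e
        x_j = x_star + u_j,   j = 1, 2
    with x_star, e, u_1, u_2 mutually independent normals.
    """
    rng = random.Random(seed)
    x_star = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * x + rng.gauss(0, 1) for x in x_star]
    x1 = [x + rng.gauss(0, noise_sd[0]) for x in x_star]
    x2 = [x + rng.gauss(0, noise_sd[1]) for x in x_star]

    # (a) Simple regression of y on the less noisy proxy alone.
    b_single = cov(x1, y) / cov(x1, x1)

    # (b) Simple regression of y on the average of the two proxies.
    x_avg = [(p + q) / 2 for p, q in zip(x1, x2)]
    b_avg = cov(x_avg, y) / cov(x_avg, x_avg)

    # (c) Multiple regression of y on both proxies, solving the
    #     2x2 normal equations by Cramer's rule, then summing the
    #     two slopes (valid here only because both loadings are 1).
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b_sum = b1 + b2

    return {"single": b_single, "average": b_avg, "all_proxies": b_sum}


if __name__ == "__main__":
    for name, estimate in simulate().items():
        print(f"{name:12s} {estimate:.3f}")
```

Under these assumptions all three estimates fall below the true coefficient, and the estimate that uses both proxies in one regression is the least attenuated, because the regression implicitly gives more weight to the less noisy proxy than a plain average does.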
