Estimate the Impact of Climate Change: An Exploration of the Bin Regression Model
Abstract
In recent years, a large amount of empirical work has been undertaken to study the influence of climate variables on a wide range of social and economic outcomes. As noted
by Hsiang (2016), the measurement and representation of these climate variables is a critical step
in identifying the impacts of climate change. The “bin” regression model, which is a flexible semiparametric
method for representing one or more of these climate variables, has emerged as the
workhorse approach for empirical work (e.g., Deschênes and Greenstone, 2011).
This paper is the first to formally explore the econometric properties of the bin regression approach
in the climate economics literature. The bin regression approach takes the desired explanatory variable
and discretizes it, much like a histogram, so that each bin is represented by a count (e.g., the
number of days during the growing season on which mean temperature falls in a specified interval).
Formally, the bin approach specifies a piecewise constant function with M pieces, where the intervals
defining the M bins are chosen by the researcher.
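
As a minimal sketch of the construction just described, the following Python code builds bin counts from simulated daily temperatures and fits a linear regression on them. The bin edges, sample sizes, coefficients, and the helper name bin_counts are all hypothetical choices made for illustration; they are not taken from the paper. Because the counts sum to the number of days for every unit, one bin is omitted as the reference category.

```python
import numpy as np

def bin_counts(daily_temps, edges):
    """Count the days falling in each bin; the two outer bins are open-ended."""
    # np.digitize returns i such that edges[i-1] <= x < edges[i]
    idx = np.digitize(daily_temps, edges)
    return np.bincount(idx, minlength=len(edges) + 1)

rng = np.random.default_rng(0)
edges = np.array([10.0, 20.0, 30.0])   # hypothetical researcher-chosen edges, M = 4 bins
n_units, n_days = 200, 180             # hypothetical panel dimensions

# Simulated daily mean temperatures: each unit has its own seasonal mean.
unit_means = rng.uniform(12.0, 28.0, size=(n_units, 1))
temps = rng.normal(loc=unit_means, scale=6.0, size=(n_units, n_days))

# Regressors: one count per bin and unit (rows sum to n_days).
X = np.vstack([bin_counts(row, edges) for row in temps])

# Hypothetical per-day effects; bin 1 (10 to 20 degrees) is the reference with effect 0.
true_beta = np.array([-0.02, 0.0, 0.01, -0.05])
y = X @ true_beta + rng.normal(scale=0.05, size=n_units)

# Drop the reference bin to avoid perfect collinearity with the intercept.
design = np.column_stack([np.ones(n_units), X[:, [0, 2, 3]]])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
print("estimated effects relative to the reference bin:", coef[1:])
```

In this simulation the outcome is piecewise constant in temperature by construction, so the regression recovers the per-bin effects; the sketch says nothing about what happens when the true response varies within a bin.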
We show that, although the bin regression approach often produces reasonable results, it produces
consistent parameter estimates only under very stringent and unlikely assumptions on the true data
generating process (DGP), which we characterize in detail. Furthermore, because the researcher chooses
the bin definitions, the approach is not truly semiparametric.
The bin approach has two other major problems. First, because information within a bin is lost and
the bin boundaries introduce points of discontinuity, it is possible for an overall shift in the
distribution of the variable of interest, such as an average increase in temperature, to have either
little or an extreme effect.
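
The first problem can be illustrated with a small, deliberately stylized Python sketch. The bin edges, the temperature values, and the one-degree shift are hypothetical and chosen only to make the contrast stark: the same uniform shift leaves the bin counts untouched when all days sit in the middle of a bin, and moves every day into a new bin when they sit just below an edge.

```python
import numpy as np

edges = np.array([10.0, 20.0, 30.0])  # hypothetical bin edges

def bin_counts(temps, edges):
    """Count observations per bin; the two outer bins are open-ended."""
    return np.bincount(np.digitize(temps, edges), minlength=len(edges) + 1)

shift = 1.0                       # uniform warming of one degree
mid_bin = np.full(100, 25.0)      # every day in the middle of [20, 30)
near_edge = np.full(100, 29.5)    # every day just below the 30-degree edge

# Counts before and after the same shift, for each stylized season.
mid_before, mid_after = bin_counts(mid_bin, edges), bin_counts(mid_bin + shift, edges)
edge_before, edge_after = bin_counts(near_edge, edges), bin_counts(near_edge + shift, edges)

print("middle of bin:", mid_before, "->", mid_after)
print("near bin edge:", edge_before, "->", edge_after)
```

In the first case the regressors do not change at all, so the fitted model predicts no effect of the warming; in the second, all 100 days move to the hottest bin and the predicted effect is as large as the model allows.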
We propose alternatives to the bin model for estimating the impact of climate change that assume the
same underlying DGP. The main line of attack we take comes from recognizing that the bin
regression approach conflates two problems: (a) how to summarize the distribution of an
exogenous stimulus variable and (b) how to determine