Some additional thoughts related to Serena Ng's World Congress piece (earlier post here, with a link to her paper):

The key newish dimensionality-reduction strategies that Serena emphasizes are random projection and leverage score sampling. In a regression context both are methods for optimally approximating an NxK "X matrix" with an Nxk X matrix, where k<<K, but they are very different, and each raises issues of its own. Random projection delivers a smaller X matrix whose columns are linear combinations of those of the original X matrix, as for example with principal-component regression, which can sometimes make for difficult interpretation. Leverage score sampling, in contrast, delivers a smaller X matrix whose columns are simply a subset of those of the original X matrix, which feels cleaner but has issues of its own.
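To make the contrast concrete, here is a minimal NumPy sketch of the two reductions (illustrative only; the dimensions, and the particular leverage-score recipe based on the top right-singular vectors, are my own choices, not Serena's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated "wide" regressor matrix: N observations, K regressors, K large.
N, K, k = 200, 500, 20
X = rng.standard_normal((N, K))

# --- Random projection: compress columns via X @ R, R a K x k random matrix.
# The compressed regressors are linear combinations of the originals.
R = rng.standard_normal((K, k)) / np.sqrt(k)
X_rp = X @ R            # N x k

# --- Leverage score sampling: keep a subset of the original columns,
# sampled with probability proportional to their leverage.  Here the
# leverage of column j is its squared norm in the top-k right-singular
# vectors -- one common recipe.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
lev = np.sum(Vt[:k, :] ** 2, axis=0)
lev /= lev.sum()
cols = rng.choice(K, size=k, replace=False, p=lev)
X_ls = X[:, cols]       # N x k, columns are a subset of X's columns

print(X_rp.shape, X_ls.shape)   # (200, 20) (200, 20)
```

Either way we land on an Nxk matrix, but only the leverage-score version keeps the original, interpretable regressors.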

Anyway, a crucial observation is that for successful predictive modeling we don't need deep interpretation, so random projection is potentially just fine -- if it works, it works, and that's an empirical matter. Econometric extensions (e.g., to VAR's) and evidence (e.g., in macro forecasting) are just now emerging, and the results appear encouraging. An important recent contribution in that regard is Koop, Korobilis, and Pettenuzzo (in press), which significantly extends and applies earlier work of Guhaniyogi and Dunson (2015) on Bayesian random projection ("compression"). Bayesian compression fits beautifully in an MCMC framework (again see Koop et al.), including model averaging across multiple random projections, attaching greater weight to projections that forecast well. Very exciting!
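For intuition, here is a rough frequentist-flavored sketch of averaging across multiple random projections with weights tied to predictive fit (a loose caricature of the idea, not the actual Bayesian machinery of Guhaniyogi-Dunson or Koop et al.; the holdout-based softmax weighting and all names are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: y depends on only a few of many regressors.
N, K, k, M = 300, 200, 10, 50
X = rng.standard_normal((N, K))
beta = np.zeros(K)
beta[:5] = 1.0
y = X @ beta + rng.standard_normal(N)

# Split into training and holdout samples.
X_tr, X_ho = X[:200], X[200:]
y_tr, y_ho = y[:200], y[200:]

preds, scores = [], []
for _ in range(M):
    R = rng.standard_normal((K, k)) / np.sqrt(k)     # one random projection
    Z_tr, Z_ho = X_tr @ R, X_ho @ R
    b, *_ = np.linalg.lstsq(Z_tr, y_tr, rcond=None)  # OLS on compressed X
    preds.append(Z_ho @ b)
    scores.append(-np.mean((y_ho - Z_ho @ b) ** 2))  # predictive fit

# Weight projections by a softmax of predictive fit, so projections
# that forecast well get more weight.
scores = np.array(scores)
w = np.exp(scores - scores.max())
w /= w.sum()
y_avg = np.sum(w[:, None] * np.array(preds), axis=0)   # averaged forecast
```

The Bayesian version replaces the ad hoc softmax with posterior model probabilities, all handled inside the MCMC.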

## Sunday, August 20, 2017

## Monday, August 14, 2017

### Analyzing Terabytes of Economic Data

Serena Ng's World Congress piece is out as an NBER w.p. It's been floating around for a long time, but just in case you missed it, it's a fun and insightful read:

Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data

by Serena Ng - NBER Working Paper #23673.

http://papers.nber.org/papers/w23673

(Ungated copy at http://www.columbia.edu/~sn2294/papers/sng-worldcongress.pdf)

Abstract:

This paper seeks to better understand what makes big data analysis different, what we can and cannot do with existing econometric tools, and what issues need to be dealt with in order to work with the data efficiently. As a case study, I set out to extract any business cycle information that might exist in four terabytes of weekly scanner data. The main challenge is to handle the volume, variety, and characteristics of the data within the constraints of our computing environment. Scalable and efficient algorithms are available to ease the computation burden, but they often have unknown statistical properties and are not designed for the purpose of efficient estimation or optimal inference. As well, economic data have unique characteristics that generic algorithms may not accommodate. There is a need for computationally efficient econometric methods as big data is likely here to stay.


## Saturday, August 12, 2017

### On Theory, Measurement, and Lewbel's Assertion

Arthur Lewbel, insightful as always, asserts in a recent post that:


The people who argue that machine learning, natural experiments, and randomized controlled trials are replacing structural economic modeling and theory are wronger than wrong.

As ML and experiments uncover ever more previously unknown correlations and connections, the desire to understand these newfound relationships will rise, thereby increasing, not decreasing, the demand for structural economic theory and models.

I agree. New measurement produces new theory, and new theory produces new measurement -- it's hard to imagine stronger complements. And as I said in an earlier post,

Measurement and theory are rarely advanced at the same time, by the same team, in the same work. And they don't need to be. Instead we exploit the division of labor, as we should. Measurement can advance significantly with little theory, and theory can advance significantly with little measurement. Still each disciplines the other in the long run, and science advances.

The theory/measurement pendulum tends to swing widely. If the 1970's and 1980's were a golden age of theory, recent decades have witnessed explosive advances in measurement linked to the explosion of Big Data. But Big Data presents both measurement opportunities *and* pitfalls -- dense fogs of "digital exhaust" -- which fresh theory will help us penetrate. Theory will be back.

[Related earlier posts: "Big Data the Big Hassle" and "Theory gets too Much Respect, and Measurement Doesn't get Enough"]

## Saturday, August 5, 2017

### Commodity Connectedness

Forthcoming paper here.

We study connectedness among the major commodity markets, summarizing and visualizing the results using tools from network science.

Among other things, the results reveal clear clustering of commodities into groups closely related to the traditional industry taxonomy, but with some notable differences.

Many thanks to the Central Bank of Chile for encouraging and supporting the effort via its 2017 Annual Research Conference.


## Sunday, July 30, 2017

### Regression Discontinuity and Event Studies in Time Series

Check out the new paper, "Regression Discontinuity in Time [RDiT]: Considerations for Empirical Applications", by Catherine Hausman and David S. Rapson. (NBER Working Paper No. 23602, July 2017. Ungated copy here.)

It's interesting in part because it documents and contributes to the largely cross-section regression discontinuity design literature's awakening to time series. But the elephant in the room is the large time-series "event study" (ES) literature, mentioned but not emphasized by Hausman and Rapson. [In a one-sentence nutshell, here's how an ES works: model the pre-event period, use the fitted pre-event model to predict the post-event period, and ascribe any systematic forecast error to the causal impact of the event.] ES's trace to the classic Fama et al. (1969). Among many others, MacKinlay's 1997 overview is still fresh, and Gürkaynak and Wright (2013) provide additional perspective.
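The one-sentence nutshell above can be sketched in a few lines (a toy AR(1) simulation of my own devising, not drawn from any of the cited papers):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate an AR(1) series with a level shift (the "event") at time T0.
T, T0, effect = 300, 200, 2.0
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.7 * y[t - 1] + rng.standard_normal()
y[T0:] += effect

# 1. Model the pre-event period: fit an AR(1) by OLS on t < T0.
Y = y[1:T0]
X = np.column_stack([np.ones(T0 - 1), y[: T0 - 1]])
(c, phi), *_ = np.linalg.lstsq(X, Y, rcond=None)

# 2. Forecast the post-event period from the fitted pre-event model,
#    iterating forward from the last pre-event observation
#    (no post-event data enter the forecast).
yhat = np.empty(T - T0)
prev = y[T0 - 1]
for h in range(T - T0):
    prev = c + phi * prev
    yhat[h] = prev

# 3. Ascribe the systematic forecast error to the causal impact of the event.
est_effect = np.mean(y[T0:] - yhat)   # should land near the true effect, 2.0
```

Everything, of course, hinges on step 1: the estimated effect is credible only insofar as the pre-event model is.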

One question is what the RDiT approach adds to the ES approach and, relatedly, what it adds to the well-developed time-series toolkit of other methods for assessing structural change. At present, and notwithstanding the Hausman-Rapson paper, my view is "little or nothing". Indeed in most respects it would seem that an RDiT study *is* an ES, and conversely. So call it what you will, "ES" or "RDiT".

But there are important open issues in ES / RDiT, and Hausman-Rapson correctly emphasize one of them, namely issues and difficulties associated with "wide" pre- and post-event windows, which is often the relevant case in time series.

Things are generally "easy" in cross sections, where we can usually take narrow windows (e.g., in the classic scholarship exam example, we use only test scores very close to the scholarship threshold). Things are similarly "easy" in time series *IF* we can take similarly narrow windows (e.g., high-frequency asset return data facilitate taking narrow pre- and post-event windows in financial applications). In such cases it's comparatively easy to credibly ascribe a post-event break to the causal impact of the event.

But in other time-series areas like macro and environmental, we might want (or need) to use wide pre- and post-event windows. Then the trick becomes modeling the pre- and post-event periods successfully enough so that we can credibly assert that any structural change is due exclusively to the event -- very challenging, but not hopeless.

Hats off to Hausman and Rapson for beginning to bridge the ES and regression discontinuity literatures, and for implicitly helping to push the ES literature forward.


## Tuesday, July 25, 2017

### Time-Series Regression Discontinuity

I'll have something to say in next week's post. Meanwhile check out the interesting new paper, "Regression Discontinuity in Time: Considerations for Empirical Applications", by Catherine Hausman and David S. Rapson, NBER Working Paper No. 23602, July 2017. (Ungated version here.)

## Sunday, July 23, 2017

### On the Origin of "Frequentist" Statistics

Efron and Hastie note that the "frequentist" term "seems to have been suggested by Neyman as a statistical analogue of Richard von Mises' frequentist theory of probability, the connection being made explicit in his 1977 paper, 'Frequentist Probability and Frequentist Statistics'". It strikes me that I may have always subconsciously assumed that the term originated with one or another Bayesian, in an attempt to steer toward something more neutral than "classical", which could be interpreted as "canonical" or "foundational" or "the first and best". Quite fascinating that the ultimate "classical" statistician, Neyman, seems to have initiated the switch to "frequentist".


## Sunday, July 9, 2017

### On the Identification of Network Connectedness

I want to clarify an aspect of the Diebold-Yilmaz framework (e.g., here or here). It is simply a method for summarizing and visualizing dynamic network connectedness, based on a variance decomposition matrix. The variance decomposition is not a part of our technology; rather, it is the key *input* to our technology. Calculation of a variance decomposition of course requires an identified model. We have nothing new to say about that; numerous models/identifications have appeared over the years, and it's your choice (but you will of course have to defend your choice).

For certain reasons (e.g., comparatively easy extension to high dimensions) Yilmaz and I generally use a vector-autoregressive model and Koop-Pesaran-Shin "generalized identification". Again, however, if you don't find that appealing, you can use whatever model and identification scheme you want. As long as you can supply a credible / defensible variance decomposition matrix, the network summarization / visualization technology can then take over.
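To illustrate, here is a small sketch of the summarization step, starting from a hypothetical variance decomposition matrix (the numbers are made up; any identified model could supply D):

```python
import numpy as np

# A hypothetical H-step variance decomposition matrix D for 3 variables:
# D[i, j] = share of variable i's forecast-error variance due to shocks
# to variable j.  Rows sum to 1.
D = np.array([
    [0.80, 0.15, 0.05],
    [0.10, 0.75, 0.15],
    [0.20, 0.10, 0.70],
])

off_diag = D - np.diag(np.diag(D))

# Total connectedness: average share of forecast-error variance
# coming from shocks to other variables.
total = off_diag.sum() / D.shape[0]

# Directional connectedness: "from others" (row sums of off-diagonals)
# and "to others" (column sums of off-diagonals), and net.
from_others = off_diag.sum(axis=1)
to_others = off_diag.sum(axis=0)
net = to_others - from_others

print(round(total, 3))   # 0.25
```

Note that the summarization step never touches the model: swap in a differently identified decomposition matrix D and everything downstream goes through unchanged.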

## Monday, July 3, 2017

### Bayes, Jeffreys, MCMC, Statistics, and Econometrics

In Ch. 3 of their brilliant book, Efron and Hastie (EH) assert that:

Jeffreys' brand of Bayesianism [i.e., "uninformative" Jeffreys priors] had a dubious reputation among Bayesians in the period 1950-1990, with preference going to subjective analysis of the type advocated by Savage and de Finetti. The introduction of Markov chain Monte Carlo methodology was the kind of technological innovation that changes philosophies. MCMC ... being very well suited to Jeffreys-style analysis of Big Data problems, moved Bayesian statistics out of the textbooks and into the world of computer-age applications.

Interestingly, the situation in econometrics strikes me as rather the opposite. Pre-MCMC, much of the leading work emphasized Jeffreys priors (RIP Arnold Zellner), whereas post-MCMC I see uniform at best (still hardly uninformative, as is well known and as noted by EH), and often Gaussian or Wishart or whatever. MCMC of course still came to dominate modern Bayesian econometrics, but for a different reason: It facilitates calculation of the *marginal* posteriors of interest, in contrast to the *conditional* posteriors of old-style analytical calculations. (In an obvious notation and for an obvious normal-gamma regression problem, for example, one wants posterior(beta), not posterior(beta | sigma).) So MCMC has moved us toward marginal posteriors, but moved us away from uninformative priors.
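For concreteness, here is a minimal Gibbs-sampler sketch for that normal-gamma regression problem (a standard textbook-style implementation under a flat prior on beta and a Jeffreys-type prior on sigma^2; the toy data and all details are my own choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy regression: y = X beta + e, e ~ N(0, sigma^2).
N = 100
X = np.column_stack([np.ones(N), rng.standard_normal(N)])
beta_true, sigma_true = np.array([1.0, 2.0]), 0.5
y = X @ beta_true + sigma_true * rng.standard_normal(N)

XtX, Xty = X.T @ X, X.T @ y
beta_ols = np.linalg.solve(XtX, Xty)

# Gibbs sampler: alternate draws from the two *conditional* posteriors.
draws, sigma2 = [], 1.0
for it in range(2000):
    # beta | sigma^2, y  ~  N(beta_ols, sigma^2 (X'X)^{-1})
    beta = rng.multivariate_normal(beta_ols, sigma2 * np.linalg.inv(XtX))
    # sigma^2 | beta, y  ~  Inverse-Gamma(N/2, SSR/2)
    ssr = np.sum((y - X @ beta) ** 2)
    sigma2 = 1.0 / rng.gamma(N / 2, 2.0 / ssr)
    draws.append(beta)

# After burn-in, the retained beta draws approximate the *marginal*
# posterior of beta -- sigma^2 has been integrated out by the sampling.
draws = np.array(draws[500:])
print(draws.mean(axis=0))   # posterior mean, near beta_true
```

Each individual draw uses only conditional posteriors, yet the collection of beta draws delivers the marginal posterior(beta) -- which is precisely the point.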