[–] Integralds (I am the rep agent AMA) 9 points (5 children)

WARNING: tl;dr post incoming (700 words)

I want to talk about replication. We've talked about replication before, but my views have evolved slightly.

Suppose that some researcher X has written a paper. What does it mean for you to "replicate" the results of that paper in economics?

Assume that the paper is using observational public-use data like the Penn World Table, or World Development Indicators, or FRED, or something like that. Something that is maintained, easily accessible, updated over time, and possibly subject to revisions.

Some public official (or your advisor) asks you to "replicate" the results in their Figures and Tables. What does that mean? What should it mean?

Here are the levels of replication I would distinguish among. They're not quite nested, but there is a clear theme as you go down the line.

  1. The author supplies you with their data. The author supplies you with their code. You run their do-file.

    • This is mere verification. Do the author's code and data produce the tables they say they do? It's important, but it's not replication.
  2. The author supplies you with their data. You write your own code, based on instructions in the paper plus any print or online appendices. You run your code on their data to attempt to match their tables.

    • This is verification+. Does the paper contain sufficient instructions to compute the numbers found in the authors' tables?
  3. The author supplies you with their code. You construct the dataset based on instructions in the paper plus any print or online appendices. You run their code on your data.

    • This is verification+. Does the paper contain sufficient instructions to construct the dataset the author used?
  4. The author dies and can't give you anything. You construct the dataset based on instructions in the paper plus any print or online appendices. You write code based on the instructions in the paper plus any print or online appendices.

    • You try to get as close to the author's data as possible. So if they use GDP from 1960 to 1995, and use the 1997 NIPA tables, you go to ALFRED and get GDP from 1960-1995, using the 1997 vintage.
    • This is replication. You've followed all the instructions in the paper; can you get the same numbers they did?
  5. Same as (4), but:

    • You use the most recent vintage of the data the author used. So if they use GDP from 1960 to 1995, obtained from the 1997 NIPA tables, you go to FRED and get GDP from 1960-1995, using the current vintage.
    • This is replication+. Do the author's results survive data revisions?
  6. Same as (5), but:

    • You use the most recent vintage of the data the author used, plus any more recent data. So if they use GDP from 1960 to 1995, obtained from the 1997 NIPA tables, you go to FRED and get GDP from 1960-2016, using the current vintage.
    • This is replication+. Do the author's results survive data revisions and extending the sample?
    • You could even run the authors' regressions with 1960-1995 data, then 1995-2016 data, then 1960-2016 data, and perform standard Chow tests for parameter stability (a minimal sketch of such a test appears just after this list).
  7. Same as (6), but:

    • You use as close an analogue as possible to the author's data, but not the same source. So if they use GDP across countries, 1960 to 1985, from the Penn World Table 3.0, you use GDP across countries, 1960 to 1985, from the 2016 edition of the World Development Indicators.
    • This is replication++. Do the author's results survive data revisions and extending the sample and using different measurements of the same underlying data?
    • You could do the same things the authors do, but with the new data source, and perform Chow tests across the two data sources.
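
The thread talks in terms of do-files, but as a language-agnostic illustration of the Chow test mentioned in (6) and (7), here is a minimal Python sketch. It is not anyone's actual code: the CSV path, the column names, the specification, and the 1995 break date are all placeholders. It runs the same regression on the pooled sample and on each subsample, then forms the standard Chow F-statistic for parameter stability.

```python
# Minimal Chow-test sketch for the "replication+" checks in (6).
# Assumes a tidy CSV with columns year, y, x1, x2 -- all placeholder names.
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

df = pd.read_csv("replication_data.csv")   # hypothetical dataset
formula = "y ~ x1 + x2"                    # stand-in for the paper's specification

pre  = df[df.year <= 1995]
post = df[df.year > 1995]

fit_all  = smf.ols(formula, data=df).fit()
fit_pre  = smf.ols(formula, data=pre).fit()
fit_post = smf.ols(formula, data=post).fit()

k = fit_all.df_model + 1                   # estimated coefficients, including the intercept
ssr_pooled = fit_all.ssr
ssr_split  = fit_pre.ssr + fit_post.ssr
n = len(pre) + len(post)

# Chow F-statistic: does splitting the sample at 1995 change the coefficients?
F = ((ssr_pooled - ssr_split) / k) / (ssr_split / (n - 2 * k))
p_value = stats.f.sf(F, k, n - 2 * k)
print(f"Chow F = {F:.3f}, p = {p_value:.3f}")
```

The same pattern, run with the two datasets stacked and split by source rather than by year, covers the cross-source comparison in (7).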

Clearly there is a gap between (3) and (4). For (1) to (3), the author gives you something; from (4) onwards, you construct everything based on instructions in the paper and any print or online appendices. And as you go from (4) to (7), you move into a fuzzy region between "replication" and "robustness checks."
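
To make the (4)-versus-(5) distinction concrete, here is a rough sketch of "same vintage" versus "current vintage" retrieval against the FRED/ALFRED observations endpoint. The series ID, observation window, vintage date, and API key are placeholders, and the realtime_start/realtime_end parameter names should be checked against the current API documentation rather than taken from this sketch.

```python
# Sketch: pull a fixed historical vintage (level 4) vs. the current vintage (level 5).
# Series ID, observation window, vintage date, and API key are all placeholders.
import requests

BASE = "https://api.stlouisfed.org/fred/series/observations"
API_KEY = "YOUR_FRED_API_KEY"              # hypothetical key

common = {
    "series_id": "GDPC1",                  # swap in whatever series the paper actually used
    "observation_start": "1960-01-01",
    "observation_end": "1995-12-31",
    "api_key": API_KEY,
    "file_type": "json",
}

# Level 4: the data as the author would have seen it, at a hypothetical 1997 release date.
vintage_1997 = requests.get(BASE, params={**common,
                                          "realtime_start": "1997-07-01",
                                          "realtime_end": "1997-07-01"}).json()

# Level 5: whatever the series looks like today, over the same observation window.
current = requests.get(BASE, params=common).json()
```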

Usually when I'm replicating, I mean something like 5 and 6. Say Mankiw, Romer, and Weil (1992) uses cross-country data, 1960-1985, from the PWT 1.0. My first hunch at a replication would be to go to the current PWT, extract the 1960-1985 data, and run regressions on that data. Then I try 1960-2015 to see if anything changes. Then I may try 1985-2015 to see if the sample split is important.
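
For what it's worth, that hunch translates into very little code once a PWT extract is in hand. Below is a minimal sketch, assuming a local CSV of the current PWT with placeholder column names; the specification is a textbook MRW-style one, not necessarily the exact published regression.

```python
# Sketch of the level (5)/(6) workflow for an MRW-style cross-country regression.
# pwt_current.csv and its column names are placeholders for a real PWT extract.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

pwt = pd.read_csv("pwt_current.csv")       # hypothetical current-vintage PWT extract

def run(start, end, label):
    """Cross-sectional regression on country averages over [start, end]."""
    window = pwt[(pwt.year >= start) & (pwt.year <= end)]
    cross = window.groupby("country")[["gdp_per_worker", "inv_share", "n_g_d"]].mean()
    fit = smf.ols("np.log(gdp_per_worker) ~ np.log(inv_share) + np.log(n_g_d)",
                  data=cross).fit()
    print(label, fit.params.round(3).to_dict())
    return fit

run(1960, 1985, "Original window, current vintage (level 5):")
run(1960, 2015, "Extended window (level 6):")
run(1985, 2015, "Post-1985 only (sample-split check):")
```

Eyeballing how the coefficients move across the three windows is the informal version; the Chow test sketched above is the formal one.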

What am I missing from this schema?

How should I modify the schema for experimental or RCT-derived datasets?

[–] ivansml (hotshot with a theory) 1 point (0 children)

IMO, one should explicitly distinguish between replication and robustness checks (cf. Clemens, "The Meaning of Failed Replications"). In (1)-(4) you're checking whether the authors made any errors in their analysis; in (5)-(7) you're extending the analysis and reporting new results. If your results don't agree with the originals, the interpretation (and the authors' reaction) should obviously depend on which of the two you're reporting. There's a difference between "the authors slipped when copying Excel cells and thus their results are garbage" and "the original results seem fine, but don't generalize out of sample."

It gets even more complicated when discussing experiments, where replication often means redoing the experiment from scratch. My understanding is that the discussion about replicability in psychology or medical science really has more to do with publication bias and p-hacking than with posting do-files on the web.

[–] besttrousers 2 points (0 children)

> How should I modify the schema for experimental or RCT-derived datasets?

Mostly 7, right? For example, I think that the Duflo et al. Microfinance stuff falls into this.

It's possible that it could be number 8, especially when you start thinking about Mechanism Experiments. There are ways you could be testing the same theory or hypothesis, but using a very different experimental design.

Perhaps "8" is more about experiments that aren't close analogues to the author's data? ie, looking at external validity issues.

Blattman's Impact Evaluation 2.0 might be a good read: http://chrisblattman.com/documents/policy/2008.ImpactEvaluation2.DFID_talk.pdf

As is Blattman's Impact Evaluation 3.0: http://www.chrisblattman.com/documents/policy/2011.ImpactEvaluation3.DFID_talk.pdf

One of the big changes in the last few years that Blattman captures is the movement from "M+E" to "R+D". Hamilton captures it in Smarter, Better, Faster: The Potential for Predictive Analytics and Rapid-Cycle Evaluation to Improve Program Development and Outcomes.

[–] gorbachev 2 points (1 child)

I think robustness checks are a perfectly valid part of replication. If it turns out your results, say, only hold given a very precise set of controls -- and your paper hides this fact -- that's a problem. I bring this up because the most interesting replication efforts I've seen were of this variety. Don't really care if #2 or whatever doesn't quite work out -- I'd give the benefit of the doubt and wait for someone else to find Mike LaCour. But if it turns out your results were p-hacked real hard? I care a lot!

Edit: Don't forget literal replications - as in, literally re-running the experiment they ran. That's relevant in an RCT setting among others.
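
As a rough illustration of the point above about results that only survive one precise set of controls, a sketch like the following (placeholder file, outcome, treatment, and control names) just re-runs the headline regression over every subset of the candidate controls and reports the range of the coefficient of interest:

```python
# Sketch: check whether the headline coefficient survives across control sets.
# File name, variable names, and the candidate-control list are placeholders.
from itertools import combinations

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study_data.csv")                # hypothetical replication dataset
controls = ["age", "educ", "income", "region"]    # hypothetical candidate controls

estimates = []
for r in range(len(controls) + 1):
    for subset in combinations(controls, r):
        rhs = " + ".join(("treatment",) + subset)
        fit = smf.ols(f"outcome ~ {rhs}", data=df).fit()
        estimates.append((subset, fit.params["treatment"], fit.pvalues["treatment"]))

betas = [beta for _, beta, _ in estimates]
print(f"treatment coefficient ranges from {min(betas):.3f} to {max(betas):.3f} "
      f"across {len(estimates)} specifications")
```

If the sign or significance of the treatment coefficient flips depending on which controls go in, and the paper never shows that, that is exactly the kind of hidden fragility described above.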

[–] Ponderay (Follows an AR(1) process) 1 point (0 children)

Is that a qualitatively different thing? Saying a paper doesn't replicate due to a coding error is different than saying a paper has a flawed identification strategy.

[–] Integralds (I am the rep agent AMA) 2 points (0 children)

Followup because the post was getting too long.

A paper is said to be verifiable if (1) can be performed.

A paper is said to be weakly replicable if a graduate student can perform (4) and get the same numbers as the authors. That is, an interested, competent grad student, working alone, can reproduce the tables using the instructions in the paper.

A paper is said to be replicable if a grad student can perform (5) and get the same numbers as the authors.

A paper is said to be strongly replicable if a grad student can perform (6) or (7) and get reasonably close numbers. "Reasonably" may be defined statistically via Chow tests or economically via eyeball tests.