Handbook of Experimental Economic Methodology

Guillaume R. Fréchette and Andrew Schotter

Print publication date: 2015

Print ISBN-13: 9780195328325

Published to Oxford Scholarship Online: March 2015

DOI: 10.1093/acprof:oso/9780195328325.001.0001



19 The Lab and the Field: Empirical and Experimental Economics

David Reiley

Oxford University Press

Abstract and Keywords

This chapter comments on two articles, by Kagel and by Harrison, Lau, and Rutstrom. These articles emphasize “control” as one of the most important aspects of experimental economics. By contrast, the chapter suggests that “control” is not always an unambiguously good thing. There are three reasons why control might be undesirable. First, many economic decisions take more time to reach than the typical time limits in a laboratory session. Second, despite careful laboratory protocols to prevent communication, people often do talk to others when making real-world decisions. Third, we do not know whether economic theory is correct in its primitive assumptions, such as modeling charities as public goods or bidder values as privately known in auctions. If it turns out that such features matter for economic decision-making, we will only know the truth by going outside the traditional laboratory setting.

Keywords: experimental economics, laboratory experiments, field experiments, economic theory


I am grateful for this opportunity to reflect on the purpose of experimentation in economic science. I have been asked to comment on two articles, by Kagel and by Harrison, Lau, and Rutstrom. These articles express a sentiment, shared with a number of other laboratory experimentalists, emphasizing “control” as one of the most important aspects of experimental economics. By contrast, when I think of experiments, I think about a source of exogenous variation in data.

My difference in perspective has something to do with my background and training. As a graduate student, I trained as an empirical economist. One of the most important achievements of applied econometrics in recent decades is a deep understanding of the difficulty of causal inference: using observational data to infer causality rather than mere correlation. Empirical labor economists have been exploiting natural experiments for this purpose for several decades. When I learned about experimental economics circa 1995, I enthusiastically joined the experimental fold, seeing experiments as an opportunity to create exogenous variation in situations where we weren’t lucky enough for Nature to have provided us with a natural experiment. Over the past 15 years, the profession has accumulated hundreds of examples of field experiments, a majority of them conducted by people who, like me, trained in empirical (observational) economics.

By contrast, I observe that the majority of laboratory experiments have been conducted by scholars who trained in economic theory, laboratory methods, or both. We field experimentalists are indebted to these laboratory experiments for paving the way and for demonstrating to us that economics is not merely an observational science (as had been assumed for many decades) but also an experimental one.

Field experimentalists are now taking this important message to other empirical economists. Our most important dialogue is with traditional empiricists, demonstrating to them the feasibility of experiments for answering important questions of measurement and causal inference. Which auction format would raise more revenue for the types of collectibles sold on eBay? Which fundraising strategy would raise more revenue for a public-radio station? Does advertising cause incremental sales? What is the price elasticity of demand for soda? Does an amusement park truly set the monopoly price for soda within the park? What is the magnitude of racial discrimination against African-American job applicants? What are the productivity impacts of incentive-compensation schemes for salespeople? These are questions that previously would have been answered only by analysis of observational data, which is usually woefully inadequate. (How often does the U.S. Forest Service change its timber-auction format? And on the rare occasion when it does, how do we know that the auction format is uncorrelated with the quality of the timber being auctioned?) In my view, field experiments are about the creative use of experiments in situations where economists previously assumed their science was purely observational.

“Lack of control” was one of the earliest complaints I heard about field experiments. And I agree: certainly a field experiment provides less control than a laboratory experiment. In auction field experiments, we do not get to control the bidders’ values for the good, nor do we even get to observe them directly. Thus, the laboratory provides the valuable ability not only to manipulate the environment, but even to observe it. Field experimentalists know that we give up valuable control in order to get increased realism, such as when we have subjects bid for real goods instead of induced values.

Because of my interest in engaging in dialogue with empirical economists, I do not always see “control” as an unambiguously good thing. Here are three reasons illustrating why control might be undesirable. First, many economic decisions take more time to reach than the typical time limits in a laboratory session. Second, despite careful laboratory protocols to prevent communication, people often do talk to others when making real-world decisions. Third, we do not know whether economic theory is correct in its primitive assumptions, such as modeling charities as public goods, or bidder values as privately known in auctions. If it turns out that such features matter for economic decision-making, we will only know the truth by going outside the traditional laboratory setting.

John Kagel: “Laboratory Experiments”

John Kagel’s paper provides a great summary of the relationship of the laboratory to the field on two important topics: the winner’s curse and gift exchange. On the winner’s curse, Kagel draws lessons from experiments comparing students with professional construction bidders, and he also makes comparisons to available field data. On gift exchange, Kagel compares laboratory experiments with field experiments, where the field experiments differ from the lab in offering payment for work done outside the laboratory.

Regarding the winner’s curse, Kagel reviews his own work, dating back to Dyer et al. (1989), in which he and his coauthors recruited professional construction bidders to participate in abstract laboratory experiments involving common-value auctions. This extremely innovative work was the first example I am aware of to make a serious effort to examine the importance of subject pools in laboratory experiments.

Because of their professional experience estimating construction costs and bidding based on their estimates, we might expect these subjects to behave very differently from students concerning the winner’s curse. However, the experiments show that, if anything, the professional bidders bid even more aggressively than students, falling even more prey to the winner’s curse overall.
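
The mechanism behind the winner’s curse can be sketched with a tiny simulation (my own illustrative setup and parameters, not data from Kagel’s experiments): each bidder receives an unbiased signal of the common value and naively bids that estimate, so winning selects the most over-optimistic signal and average profits are negative.

```python
import random

def winners_curse_sim(n_bidders=6, n_auctions=20000, noise=10.0, seed=0):
    """Average profit of a naive winner in a pure common-value auction.

    Each bidder observes signal = value + noise (unbiased) and bids her
    own estimate; the highest signal wins and pays its bid.  Conditional
    on winning, the estimate is biased upward: the winner's curse.
    """
    rng = random.Random(seed)
    total_profit = 0.0
    for _ in range(n_auctions):
        value = rng.uniform(50.0, 150.0)
        signals = [value + rng.uniform(-noise, noise)
                   for _ in range(n_bidders)]
        total_profit += value - max(signals)  # naive winner bids her signal
    return total_profit / n_auctions
```

With six bidders and signal noise of plus or minus 10, the naive winner overpays on average by roughly the expected maximum of six noise draws (about 7 here), even though every individual estimate is unbiased; shading one’s bid in anticipation of this selection effect is exactly what naive bidders fail to do.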

With hindsight, we can see that the abstract task of the laboratory is a rather different task than what the professional construction managers face in practice, as documented in Dyer and Kagel (1996). In the real world, a number of other concrete factors affect managers, but are missing from the abstract environment of the lab. First, managers have a feel for their own capacity for error in estimating the cost of a real construction job, but they have no experience dealing with the abstract “signals” in a laboratory common-value experiment. Second, in real construction auctions it turns out that bidders who significantly underestimate costs are often able to withdraw their bids without penalty after the auction results are realized, but such protection is not offered to bidders in the abstract laboratory setting. Third, the common-value component in construction auctions appears (upon investigation) to be much less important than theorists might have assumed; instead, privately known differences in firms’ unused capacity play a much larger role than the uncertainty in their estimates of job cost.

As Kagel notes, learning about the importance of these institutional features of the construction industry was a major benefit to the research program comparing professionals to students in the lab. This is a key role of both experimental and empirical economics. Theorists often are unable to say exactly how well their models apply to real-world situations. For auctions in particular, it is very difficult to know how much private uncertainty in values exists in a given auction market; hence it is difficult to know whether private-value or common-value models are more applicable. Theorists have commonly cited construction as a clear example of a market with a significant common-value component, but upon further empirical and experimental analysis this becomes much less clear. I wish to highlight the importance of collecting real facts about markets and behavior, as well as cite the work by Kagel and coauthors as a prime example of a research program that does just that.

I also want to bring attention to the important role of context in laboratory experiments. Construction bidders showed no better mastery of the winner’s curse than students did in an abstract experiment. The bidding skills the professionals have in the field, where we believe they know something about their ability to underestimate costs, do not necessarily transfer to a task where they are asked to understand the variance of an abstract probability distribution.

That is, context matters quite a bit, in the sense that people find it much easier to make good decisions when the setting is more concrete and more familiar. This reinforces results from psychology, notably the Wason (1966) task, where concrete details (such as “Are all the females old enough to drink?”) enable subjects to understand a logic puzzle much more readily than in a completely abstract setting (such as “Do all the even-numbered cards have a consonant on the back?”). In Levitt et al. (2010), my coauthors and I show that professional poker players and professional soccer players, despite their experience in games where mixing of strategies is important to success, show just as much deviation from mixed-strategy Nash equilibrium as do student subjects. Because of our belief in the importance of context, we set out to replicate the surprising results of Palacios-Huerta and Volij (2008) that professional soccer players played exactly the Nash proportions in abstract mixed-strategy matrix games. We were able to show that Nash equilibrium is a decent predictor of the average mixing proportions, but unlike in Palacios-Huerta and Volij (2008), we find that professionals have just as much variance away from Nash equilibrium as student subjects do. As Kagel notes, “one should not automatically expect economic agents who are fine tuned to the field settings they typically operate in to be able to adjust to experimental environments that deprive them of their typical contextual cues.”

This is one reason why I have been so excited to promote field experiments during my career. By doing field experiments in real markets for real goods and services, we can ensure that subjects have a familiar context and are more likely to behave naturally. Learning about any new economic environment typically takes place only slowly. I can easily find numerous examples of slow learning from my own life, each of which cost me thousands of dollars’ worth of utility. I bought three houses using my real estate agent’s recommended mortgage lender before it occurred to me that I should shop around for the lender with the best interest rate. I underestimated the benefits I would have received from a prenuptial agreement in my second marriage, in part because my first divorce had been so amicable. When I changed employers, I bought a bare-bones health insurance plan, not realizing that my preferred doctor would subsequently drop that plan because it was not financially generous enough to him. Of course, one hypothesis is that I am unrepresentatively slow to understand new economic decision problems, but I believe that the experimental data on the importance of context allow us to reject that hypothesis.

Regarding gift exchange, Kagel’s second example of interplay between the laboratory and the field, I believe that the laboratory experiments in this area have been extremely interesting and important in documenting reciprocity at odds with standard noncooperative game theory. I also believe that field experiments have been very important for probing the limits of the laboratory results, finding that positively reciprocal behavior can diminish over timescales longer than the typical laboratory session. Kagel’s review has convinced me that this issue is far from settled, and we need much more research in this area, in a greater variety of field settings.

Kagel also gives a great summary of the advantages of laboratory experiments, which I think bears repeating here. Much more is observable in a laboratory experiment than in a field experiment: the cost of effort, the value of increased effort to the employer, and the beliefs that employees have about the game that they are playing. There are also more options available to the experimenter, in the sense that once subjects are in the laboratory, we can ask them if they would like to do “work” at wages that would be unacceptably low in a field experiment, either for legal reasons (minimum wage) or for ethical reasons (IRB) or both. I particularly like Kagel’s point that we need to think carefully about the baseline level of wages in future work, because a higher-than-normal wage may already be eliciting positive reciprocity in the control group of a field experiment, even before we implement a treatment that surprises some subjects with even higher wages.

I also like Kagel’s observation that, unlike in the laboratory, workers in field settings typically can respond to wage offers along multiple dimensions (quality, quantity, etc.). Kagel takes this to be a disadvantage of field experiments relative to the lab, but it can also be seen as an advantage. If workers in the real world can respond along multiple dimensions, then many things can happen. For example, workers might have some confusion and contribute positively reciprocal actions along a dimension that turns out to have little value for the firm, thereby diluting the positive effect of paying “efficiency wages” relative to what we might estimate in the laboratory setting. It is absolutely an advantage of the laboratory to be able to create simplified, abstract environments that help us isolate individual behavioral effects. However, the richness of the real world demands that we complement our laboratory experiments with field experiments that allow us to estimate causal effects even in the messy real world. Laboratory experiments allow us to isolate precise causal effects on behavior in the simplified models assumed by theory, while field experiments allow us to examine whether the theoretical models have captured the most important aspects of a messy reality.

I agree with Kagel that laboratory experiments have documented the importance of reciprocity in economic transactions and that this provides valuable information to theorists writing macroeconomic models involving downward-sticky wages. As an empirical economist, I would like to go even further and begin to measure the extent to which we should expect wages to be sticky in the real-world economy. The field experiment of Gneezy and List (2006) has made a first attempt in that direction. Knowing what fraction of subjects engage in positive or negative reciprocity in gift-exchange laboratory experiments does not get us very far toward understanding how far labor demand can decrease in the real world before wages must begin to fall.

Kagel is pessimistic about prospects for greater verisimilitude, asserting that “the very structure of these experiments, both laboratory and field studies, cannot approach the target environment of ongoing labor relations within firms” and that being fully faithful to the real-world environment is “an impossibility under any circumstances.” (Harrison et al. (2015) similarly remark in their paper, in a different context, “There is simply no way to run a naturally-occurring field experiment in this case.”) However, I am more optimistic. Bandiera et al. (2005) have already accomplished some very nice field experiments on employee compensation within ongoing firms. More generally, we have been making big, creative advances in the use of field experiments in recent years, conducting real-world experimental measurements in many areas that would have been assumed impossible just 15 years ago: auctions, charitable giving, incentive compensation, and economic-development policy are a few notable examples. I wouldn’t be surprised if a clever researcher manages to design a field experiment that gets us much closer to measuring real-world labor demand. I look forward to additional field experiments that may get us closer to an answer to this messy, but important, question.

Harrison, Lau, and Rutstrom: “Theory, Experimental Design and Econometrics are Complementary”

Let me begin by noting, for the record, my (long-standing) disagreement with these authors about the use of the term “field experiment.” Field experiments have been the passion of my career. I have invested considerable effort in persuading other economists that experiments in real markets with real goods are an important technique for learning about the world. It wasn’t easy, at first, to get people to take the work seriously. Preston McAfee, who recruited me recently to Yahoo! Research, confessed that when I was doing auction field experiments in graduate school, he and many others thought that I was “just playing games.” He said that it took him a number of years to recognize the importance of field experiments, but he is now a great admirer of this research technique. I feel gratified that economists have now begun to see field experiments as an important part of the economist’s toolkit.

I disagree with the usage, by these authors among others, of “field experiments” to refer to induced-value experiments that were performed using a portable laboratory in the “field” rather than a stationary laboratory at a university. What Harrison et al. call “field experiments,” I prefer to call “lab experiments with subject pools other than university students,” the same style of experiment that Dyer, Kagel, and Levin did with construction bidders back in 1989. I think this style of research is valuable, but I felt that the nomenclature diluted the term “field experiment,” which I, and others, had been promoting as experiments with real transactions in naturally occurring markets.

My esteemed colleague John List, who largely shares my sentiments, coauthored with Harrison an influential paper categorizing types of field experiments (Harrison and List, 2004). I gather that they initially disagreed strongly over how to define a field experiment, but they finally settled (compromised?) on a taxonomy of three main types of field experiments. “Artifactual field experiments” refers to what I call portable-lab experiments, while “natural field experiments” refers to the sorts of studies I call field experiments. “Framed field experiments” are intermediate between the two, such as an experiment involving bidding on a commercial auction site for artificial goods with values induced by the experimenter.

Harrison and List have been quite influential with their taxonomy, as I now see many papers published with Harrison and List’s “natural field experiment” nomenclature in their titles. However, I disagree with the nomenclature. First, I think that “artifactual field” and “natural field” are too cumbersome, when we could easily replace them with “field experiment” and “portable-lab experiment” (or “lab with Danish subjects aged 25–75,” as in Harrison et al. (2002)). Second, I think that these two research strategies have fundamentally different intentions, and to use the same term to describe them only serves to confuse matters. A “natural field experiment” is designed to go into an existing market and make a manipulation that generates exogenous variation, in order to solve the problems of observational economists where Nature has not provided a good experiment. Natural field experiments also include various policy experiments (on educational policy, crime, job-training programs, etc.). An “artifactual field experiment” is designed to take protocols developed in the lab and experiment on different subject pools in order to explore robustness and document heterogeneity. Both share a commitment to using interventions to generate useful data, but they differ greatly in technique and intended purpose. Using “field experiment” to describe both activities increases confusion, and it requires people to use extra words to reduce the confusion. (Unfortunately, “natural field experiment” also runs the risk of confusion with “natural experiment,” despite the important distinction that a field experiment involves deliberate treatment assignment by the researcher while a natural experiment cleverly takes advantage of a situation where Nature ran the experiment for us.)

Third, I disagree about the decision to lay out three categories in the taxonomy, instead of just making a contrast between two styles. What I love about the discussion of “framed field experiments” is the explicit observation that there are many ways to be intermediate between a field experiment and a lab experiment: task, context, institutions, induced values, and even whether or not the subjects know they are being experimented on. All of these various features of the transactions are worth exploring for the different things they can teach us in the trade-off between realism and control. What I dislike is the attempt to make arbitrary distinctions to delineate the boundaries of the intermediate category. For example, I like to think of my dissertation research on auctions as a canonical example of a field experiment, but the Harrison–List taxonomy dictates calling that work a “framed field experiment”—it doesn’t qualify for the full “natural field experiment” distinction because the MIT Human Subjects Committee required me to let all my online-auction bidders know that they were in an experiment. The variety of ways that experiments can get classified into the “framed field experiments” category makes them very different from each other, which makes that category less than useful. In my view, the concept of a continuum between laboratory and field experiments is really valuable, particularly as we note that the continuum has multiple dimensions to it. What’s not helpful is trying to put some papers into a precise intermediate box, because that box contains more differences than similarities, in my opinion. This further muddies the water about what a field experiment is really about.

Despite the relatively wide acceptance of the Harrison–List taxonomy, I prefer to call my experiments either “field experiments” (most of my work) or “laboratory experiments with professional soccer players” (Levitt, List, and Reiley, 2010). To me, the important part of the work with soccer players is not that we went out into the field and played games with these players in their team locker room, but rather that we recruited laboratory subjects with a particular kind of experience in playing games. The important characteristic of my field experiments on auctions, charitable fundraising, and advertising is that we have been able to use them to generate exogenous data to estimate causal effects of various real-world policy changes (auction format, solicitation strategy, advertising intensity). For that reason, I think that “natural field experiments” are much more different from typical laboratory experiments than are “artifactual field experiments.” I would like to return to a simple distinction between “field experiments” and “lab experiments.” In this framework, “field experiments” would include all “natural field experiments” and most “framed field experiments,” while “laboratory experiments” would include most “artifactual field experiments” and perhaps a few “framed field experiments.” Laboratory experiments with unusual subject pools, or various other innovations, could refer to themselves as such. There will be a gray area in the middle that defies clear classification, but in those cases I don’t mind whether we call it a field experiment or a lab experiment. My goal with this proposal is to use terms that are simple, clear, and evocative. I think that labels matter, and I prefer to see them be as useful as possible.

Despite my disagreements about taxonomy, I can certainly find agreement with Harrison et al. when they say “whether these are best characterized as being lab experiments or field experiments is not, to us, the real issue: the key thing is to see this type of experiment along a continuum taking one from the lab to the field, to better understand behavior.” I completely concur. I like to think that we can use a simpler taxonomy without obscuring that important point, which I first learned from Harrison and List (2004). I do think that labels matter and are useful.

Now for the substance of the paper. First, I really like the iMPL technique for eliciting risk preferences. I think that this iterative procedure is a major advance over the previous technique of using a fixed set of lottery choices to infer risk preferences. Assuming that the subjects behave relatively consistently across choices, this technique gives us the ability to zoom in on a more precise measurement of an individual’s risk attitudes.
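
As I understand the mechanics, the iterative idea can be sketched roughly as follows (the row counts, number of rounds, and interface here are my own hypothetical simplification, not the authors’ exact protocol): each round presents a price list over the current interval, finds the row where the subject switches from the safe to the risky option, and makes the next round’s list span just that row’s bracket.

```python
def impl_switch_point(choose_safe, lo=0.0, hi=1.0, rounds=4, options=10):
    """Sketch of an iterative multiple price list (iMPL).

    choose_safe(p) reports whether the subject prefers the safe lottery
    when the risky lottery pays its high outcome with probability p.
    Each round presents `options` evenly spaced rows over [lo, hi],
    locates the subject's switch row, and zooms into that bracket.
    """
    for _ in range(rounds):
        step = (hi - lo) / options
        probs = [lo + step * i for i in range(1, options + 1)]
        # first row at which the subject abandons the safe option
        switch = next((i for i, p in enumerate(probs)
                       if not choose_safe(p)), options)
        if switch < options:
            hi = probs[switch]
        if switch > 0:
            lo = probs[switch - 1]
    return (lo + hi) / 2  # point estimate of the indifference probability

# A consistent subject who is indifferent at p = 0.63:
estimate = impl_switch_point(lambda p: p < 0.63)
```

Four rounds of a ten-row list narrow the switch point to an interval of width one part in ten thousand, versus one part in ten for a single fixed list; this is the sense in which the iterative version “zooms in” on risk attitudes, assuming the subject answers consistently across rounds.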

Second, I consider it an important observation that consistent estimates of time preference depend on measurements of individuals’ risk preferences. If we assume risk neutrality when estimating time preferences (as is common practice), we find that people are much more impatient than if we allow for risk aversion and estimate it from lottery choices. By the way, as far as I can tell, the dependency only goes in one direction: we should be able to estimate risk preferences just fine without having to estimate time preferences. (I found the paper somewhat unclear in its discussion of this issue.) The paper also makes the important point that correctly estimating standard errors on the discount rate requires joint estimation of risk and time preference parameters, so that any uncertainty about risk preference will correctly result in larger estimated uncertainty in the discount rate.
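
The direction of the bias is easy to see in a back-of-the-envelope version of the argument (my own illustrative numbers and a plain CRRA-plus-exponential-discounting setup, not the paper’s estimates). Indifference between x today and y at horizon t implies u(x) = exp(-delta * t) * u(y); with CRRA utility u(x) = x**(1 - r) / (1 - r), the implied annual discount rate is delta = (1 - r) * ln(y / x) / t, which shrinks as the risk-aversion coefficient r rises.

```python
import math

def implied_discount_rate(x_now, y_later, t_years, r=0.0):
    """Annual discount rate delta making a subject indifferent between
    x_now today and y_later after t_years, under CRRA utility
    u(x) = x**(1 - r) / (1 - r); r = 0 is risk neutrality."""
    return (1 - r) * math.log(y_later / x_now) / t_years

# A subject indifferent between $100 now and $110 in six months:
rn = implied_discount_rate(100, 110, 0.5, r=0.0)  # risk neutral: ~19% a year
ra = implied_discount_rate(100, 110, 0.5, r=0.7)  # risk averse:  ~5.7% a year
```

Concave utility dampens the difference between the two money amounts, so the same choice is consistent with far less impatience; assuming risk neutrality therefore overstates the discount rate, which is the one-directional dependency noted above.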

Next, I would like to step back from the trees of structural preference estimation and take a look at the forest. What are we really measuring when we measure risk aversion with an experiment? Are we measuring utility over income in the experiment? Utility over annual income? Utility over different wealth levels? Though this work does not state the assumption explicitly, researchers often assume implicitly that these different types of risk preferences are the same. It may well be that what we care about for policy questions is risk aversion over large changes in permanent wealth; but since we can’t easily manipulate permanent wealth in the laboratory, we extrapolate from laboratory experiments on income and assume the parameters we estimate are relevant in the situations we care about. I want us to be more aware of these assumptions we’re making.

I agree strongly with a comment made by Lise Vesterlund earlier in this conference: economic theory’s comparative statics are much easier to verify with experiments than are point predictions. Therefore, I believe that Harrison et al. are likely pushing too hard on structural estimates (point predictions) of risk preferences when they assume that we can go a very long way toward answering important welfare questions using measures elicited from experiments. Before I am willing to believe in the welfare estimates, I need some reassurance that the required extrapolation is valid. One kind of evidence that could convince me would be a demonstration that we can use estimates from lottery-choice experiments to predict any sort of real-world choice behavior. Can we use this sort of risk-preference estimate to predict life insurance purchases? Lottery ticket purchases? Deductibles chosen in automobile insurance policies? Investment choices in retirement accounts? I have not yet seen any evidence that we can use laboratory data to predict real-world transactions. If such evidence could be collected, I would get much more excited about the preference-elicitation research program.

To put things slightly differently, Harrison et al. are quite devoted to structural estimation of preference parameters. I am always wary of structural estimation, because such research tends, in my opinion, to rely too heavily on the particular parametric models chosen. The two-parameter expo-power model of risk preferences must be wrong at some level, because it’s only a model. We just don’t know how good or bad an approximation it is to the truth. Even though we go to great lengths to estimate accurate standard errors given the model, we don’t ever inflate our standard errors to the extent that we are uncertain about the model itself (probably because we don’t yet have a good mathematical procedure for incorporating specification uncertainty).1 Thus, all structural parameter estimates are based on a large degree of faith in the model, so I prefer to maintain a high degree of skepticism about the estimates. I would be persuaded that we’re really measuring the right thing if we could take structural point estimates and validate them by testing some kind of out-of-sample prediction. In this case, that could look like using structural estimates of risk and time preferences from laboratory-style elicitation experiments and using them to predict behavior in insurance choices, investment choices, education choices, and so on. Perhaps some future research program will employ clever field experiments (the sort that involve natural transactions) to test the structural estimates from laboratory-style experiments on the same individuals and see how well we can use them to extrapolate to other types of economic decision-making.
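
To make concrete what such a structural exercise involves, here is a deliberately stripped-down skeleton (a one-parameter CRRA model, a logistic choice rule, and a grid search; the authors’ actual expo-power specification and estimation machinery are far richer): every number that comes out is conditional on these parametric choices being right.

```python
import math

def crra_u(x, r):
    # CRRA utility; log utility at r = 1
    return math.log(x) if abs(r - 1) < 1e-9 else x ** (1 - r) / (1 - r)

def expected_u(lottery, r):
    # lottery is a list of (probability, payoff) pairs
    return sum(p * crra_u(x, r) for p, x in lottery)

def fit_crra(choices, mu=0.1):
    """Grid-search MLE of the CRRA coefficient r from binary lottery
    choices under a logistic (Luce) choice rule with noise mu.
    Each element of `choices` is (lottery_a, lottery_b, chose_a)."""
    def sigmoid(z):  # numerically safe logistic
        if z >= 0:
            return 1 / (1 + math.exp(-z))
        e = math.exp(z)
        return e / (1 + e)

    def loglik(r):
        ll = 0.0
        for a, b, chose_a in choices:
            p_a = sigmoid((expected_u(a, r) - expected_u(b, r)) / mu)
            ll += math.log(max(p_a if chose_a else 1 - p_a, 1e-300))
        return ll

    grid = [i / 100 for i in range(0, 121)]  # r in [0, 1.2]
    return max(grid, key=loglik)

# Simulated subject with r = 0.5 choosing between a sure $20 and a
# lottery paying $40 with probability p, $5 otherwise:
subject_r = 0.5
data = []
for p in [i / 10 for i in range(1, 10)]:
    a, b = [(1.0, 20.0)], [(p, 40.0), (1 - p, 5.0)]
    data.append((a, b, expected_u(a, subject_r) > expected_u(b, subject_r)))
r_hat = fit_crra(data)  # recovers a value near 0.5
```

Even in this toy version, the recovered coefficient depends on the assumed choice-noise parameter mu and on the chosen utility family; swapping either changes the estimate, which is exactly the specification uncertainty that conventional standard errors do not capture.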

Final Observations: Theory Testing, Measurement, and Research Complementarities

What do field experiments contribute to theory testing? As noted above, laboratory experiments generally take a theory very seriously, impose all its primitives (independent private values, convex costs, etc.), and investigate whether behavior is consistent with predictions. I believe that laboratory experimentalists sometimes lose sight of the fact that theory is an abstraction and that the theory can miss a crucial detail that may change predictions. For example, when a charity raises money in a capital campaign for a threshold public good, will it really burn the money if it fails to reach its stated goal? Does auction theory accurately model bidder behavior on eBay, or does the existence of multiple competing auctions cause bidders to bid differently? In a field experiment, we jointly test the primitives and the decision-making process, to see how accurately a theory’s predictions hold in the real world.

What do field experiments contribute to measurement and policy? A particularly brilliant idea in development economics, promoted by Michael Kremer and Esther Duflo, has been to randomize proposed policy treatments in order to figure out what actually works to promote health, education, income, and so on. It is hard for me to imagine an economic laboratory experiment credibly measuring the effects of medically de-worming schoolchildren on their educational attainment (Miguel and Kremer, 2003), or the extent to which providing free mosquito netting can improve children's health outcomes (Cohen and Dupas, 2010).

Field experiments are clearly not a replacement for laboratory experiments on these topics. Rather, they open up new lines of inquiry on important economic questions that would be difficult to answer with a laboratory experiment. I can think of many other microeconomic examples of this sort. What is the optimal tax rate on cigarettes? How much does advertising affect consumer purchases? How do consumers react to prices ending in 9 versus prices ending in 0? What is the elasticity of labor supply with respect to wages?

Often there will be productive interaction between laboratory and field experiments; one illustrative example comes from my own work. In Lucking-Reiley (1999), I used online field experiments to test revenue equivalence between auction formats. By contrast with the laboratory experiments of Cox et al. (1982), I found that I raised more money in a Dutch (declining-price) auction than in a first-price auction. There were a number of differences between my field experiments and the previous laboratory experiments, and it was not clear which caused the difference in revenues. Katok and Kwasnica (2008), in a subsequent laboratory experiment, managed to isolate one of these differences and show that it could explain the results. In my online auctions, I ran very slow Dutch auctions in order to fit the institutional features of the market: prices decreased just once per day, instead of the six-second interval previously used in the laboratory. Katok and Kwasnica showed that they could reverse the revenue rankings of the two auction formats in the laboratory by slowing the Dutch auction down. Thus, the field experiment led to a new discovery based on real institutional features, and the laboratory experiment was able to isolate an important cause of the field results.

I wish to close by stating how much I agree with a very important sentiment expressed in both of the papers I have been asked to discuss: field experiments and laboratory experiments are highly complementary. Laboratory experiments allow us to observe and manipulate far more variables in service of testing theory. Field experiments, by involving us in real-world transactions, help us measure economic behavior in more realistic contexts.

I believe I speak for most field experimentalists when I acknowledge my debt to laboratory experimentalists, who have taught us to think of economics as an experimental science rather than a merely observational one. I believe that field experiments have the potential to engage the entire economics profession in learning more from experiments in both the field and the laboratory. For some time, laboratory experiments have successfully engaged theorists in exploring more realism in decision-making processes. Now, by demonstrating their ability to generate exogenous variation in real-world situations, field experiments are getting the attention of empirical economists who have largely ignored experiments in the past, on the assumption that "real" economics was only an observational science. I feel strongly that improved dialogue between experimentalists and econometricians will lead to a much deeper understanding of economic behavior, and I urge other experimentalists to join me in promoting that dialogue.



Bibliography references:

Bandiera, O., I. Barankay, and I. Rasul. 2005. Social Preferences and the Response to Incentives: Evidence from Personnel Data. Quarterly Journal of Economics 120: 917–962.

Cohen, J. and P. Dupas. 2010. Free Distribution or Cost-Sharing? Evidence from a Randomized Malaria Prevention Experiment. Quarterly Journal of Economics 125: 1–45.

Cox, J. C., V. L. Smith, and J. M. Walker. 1982. Tests of a Heterogeneous Bidders Theory of First Price Auctions. Economics Letters 12: 207–212.

Dyer, D. and J. H. Kagel. 1996. Bidding in Common Value Auctions: How the Commercial Construction Industry Corrects for the Winner's Curse. Management Science 42: 1463–1475.

Dyer, D., J. H. Kagel, and D. Levin. 1989. A Comparison of Naïve and Experienced Bidders in Common Value Offer Auctions: A Laboratory Analysis. Economic Journal 99: 108–115.

Gneezy, U. and J. List. 2006. Putting Behavioral Economics to Work: Field Evidence of Gift Exchange. Econometrica 74: 1365–1384.

Harrison, G. W. and J. A. List. 2004. Field Experiments. Journal of Economic Literature 42: 1009–1055.

Harrison, G. W., M. Lau, and E. E. Rutstrom. 2015. Theory, Experimental Design and Econometrics Are Complementary (And So Are Lab and Field Experiments). In Handbook of Experimental Economic Methodology, eds. G. Fréchette and A. Schotter, pp. 296–338. New York: Oxford University Press.

Harrison, G. W., M. Lau, and M. Williams. 2002. Estimating Individual Discount Rates in Denmark: A Field Experiment. American Economic Review 92: 1606–1617.

Kagel, J. H. 2015. Laboratory Experiments. In Handbook of Experimental Economic Methodology, eds. G. Fréchette and A. Schotter, pp. 339–359. New York: Oxford University Press.

Katok, E. and A. M. Kwasnica. 2008. Time is Money: The Effect of Clock Speed on Seller's Revenue in Dutch Auctions. Experimental Economics 11: 344–357.

Levitt, S. D., J. A. List, and D. Reiley. 2010. What Happens in the Field Stays in the Field: Exploring Whether Professionals Play Minimax in Laboratory Experiments. Econometrica 78: 1413–1434.

Lucking-Reiley, D. 1999. Using Field Experiments to Test Equivalence Between Auction Formats: Magic on the Internet. American Economic Review 89: 1063–1080.

Miguel, E. and M. Kremer. 2003. Worms: Identifying Impacts on Education and Health in the Presence of Treatment Externalities. Econometrica 72: 159–217.

Palacios-Huerta, I. and O. Volij. 2008. Experientia Docet: Professionals Play Minimax in Laboratory Experiments. Econometrica 76: 75–115.

Wason, P. C. 1966. Reasoning. In New Horizons in Psychology, ed. B. M. Foss. Harmondsworth: Penguin.


(1) To their credit, the authors do relax their model with an analysis of a mixture model. But even this requires structural assumptions that, to my mind, are untested. My hope is that experiments will eventually nail down the structural regularities that we can rely on as tools in future estimation, but I don’t think we’re even close to this point yet.