Housing Expectations and Market Behavior
Last registered on September 17, 2019


Trial Information
General Information
Housing Expectations and Market Behavior
Initial registration date
July 17, 2019
Last updated
September 17, 2019 4:09 AM EDT

Primary Investigator
Other Primary Investigator(s)
PI Affiliation
Additional Trial Information
In development
Start date
End date
Secondary IDs
We designed a field experiment to measure how housing price expectations affect real estate transactions in the United States. Our motivation stems from the fact that home price expectations play a prominent role in many accounts of the housing boom that occurred during the mid-2000s. We will launch a large-scale high-stakes information experiment with 60,000 households who have recently listed their houses for sale.
External Link(s)
Registration Citation
Bottan, Nicolas and Ricardo Perez-Truglia. 2019. "Housing Expectations and Market Behavior." AEA RCT Registry. September 17. https://doi.org/10.1257/rct.3663-3.0.
Former Citation
Bottan, Nicolas and Ricardo Perez-Truglia. 2019. "Housing Expectations and Market Behavior." AEA RCT Registry. September 17. http://www.socialscienceregistry.org/trials/3663/history/53510.
Experimental Details
We will send letters to a sample of 60,000 individuals who have listed their houses on the market. Subjects will be randomly assigned to different types of letters, each containing different signals about the evolution of home prices. We will use publicly available information to measure the effects of the information contained in the letters on individual market behavior (e.g., whether the property was sold).
Intervention Start Date
Intervention End Date
Primary Outcomes
Primary Outcomes (end points)
The primary outcomes correspond to the market behavior of the subjects.
Primary Outcomes (explanation)
The market outcomes are: i. the time elapsed from the day of letter delivery until the time the property was sold; ii. the price at which the house was sold.

The variable "time until sale" will be right-censored at the time we collect the data (e.g., if a house has not been sold 3 months after the intervention, we do not know if it will be sold 3, 4, 6... months after the intervention). We can use standard methods to account for the censoring. The variable "price sold" will also be censored for some subjects, because the price is observed only if the home is sold. We will use standard methods to deal with this censoring.
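As a minimal sketch of how the right-censored "time until sale" variable could be constructed, the Python snippet below returns a duration together with a flag for whether the sale was actually observed. The `Listing` class, its field names, and the function name are hypothetical and are not part of the registration; any standard duration model would take such a pair as input.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Listing:
    """Hypothetical record for one subject (field names are assumptions)."""
    delivery_date: date          # day the letter was delivered
    sale_date: Optional[date]    # None if no sale has been recorded


def time_until_sale(listing: Listing, collection_date: date) -> Tuple[int, bool]:
    """Return (duration in days, sale_observed).

    If the home sold on or before the data collection date, the duration is
    the time from letter delivery to the sale. Otherwise the duration is
    right-censored at the data collection date.
    """
    sold = listing.sale_date is not None and listing.sale_date <= collection_date
    end = listing.sale_date if sold else collection_date
    return (end - listing.delivery_date).days, sold


collection = date(2019, 11, 1)
sold_home = Listing(delivery_date=date(2019, 8, 1), sale_date=date(2019, 8, 31))
unsold_home = Listing(delivery_date=date(2019, 8, 1), sale_date=None)

print(time_until_sale(sold_home, collection))    # (30, True)
print(time_until_sale(unsold_home, collection))  # (92, False)
```

The same flag marks the subjects for whom "price sold" is unobserved, since the price is only recorded when a sale has occurred.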

Additionally, there are two "intermediate" forms of behavior related to the main outcomes that we may be able to study: i. whether the property was actively taken off the market; ii. the listing price.
Secondary Outcomes
Secondary Outcomes (end points)
Survey measures on housing price expectations.
Secondary Outcomes (explanation)
What we really care about is whether changes in housing price expectations affect market behavior. The effect of the experimental information on behavior is the "reduced form" regression. We will collect survey data on housing price expectations to estimate the "first stage" regression: i.e., the effect of the experimental information on housing price expectations.

We will collect survey data in two ways. First, the letter sent to the subjects will also include a URL to complete an online survey. A sample of this online survey is attached to this registration. This survey includes questions about the expected future median price in the ZIP code of the respondent (1 year ahead and 5 years ahead).

There is a high likelihood that this survey data will not be useful. First, based on similar surveys, the response rate to the online survey will probably be very low. Second, the treatments may affect the response rate, which would challenge the internal validity of the analysis of the survey data. Relatedly, the sample of subjects responding to this survey may be highly unrepresentative. For these reasons, we will conduct a complementary survey experiment with an auxiliary sample (Amazon Mechanical Turk workers). In that auxiliary survey, we will not be able to measure the primary outcomes (i.e., market transactions), but we will be able to measure the secondary outcomes (i.e., survey expectations). We will deploy the auxiliary survey around the date on which we send the letters. We will also attempt to conduct a follow-up survey with this auxiliary sample a month after the baseline survey. Please find attached screenshots of the baseline and follow-up auxiliary surveys.
Experimental Design
Experimental Design
This is an information provision experiment. Subjects will be randomly assigned to different treatments, which result in different signals about future house prices being included in the letter.

In a deceptive design, we would just randomize the signal given to the subjects: e.g., we tell them with 50% probability that the housing prices will increase by 1%, and we tell them with 50% probability that the housing prices will increase by 10%. Instead, we use a non-deceptive design: we randomize the subject into one of multiple valid signals about the home price dynamics (e.g., forecasts produced by different econometric models).

To generate the non-deceptive variation in signals, subjects are randomly assigned to different treatment types (and to sub-treatments within some of these types). Samples of each letter type (and sub-types) are attached to this application. These letter types are identical in every respect except for the content of the table included in the middle of the first page:

- Present Letter Type: the current median price of similar homes in the same ZIP Code.

- Future Letter Type: the current median price of similar homes in the same ZIP Code, as well as a forecast for the price 1 year ahead. Within this letter type, subjects are randomized to one of three sub-treatments. Each sub-treatment corresponds to a different forecasting model. For a given individual, the three forecasting models result in forecasts of X%, Y% and Z%. The model that the individual is assigned to will determine the forecast that the individual receives.

- Past Letter Type: the current median price of similar homes in the same ZIP Code, in addition to information about past prices. Within this letter type, there are two sub-treatments: the past-1 sub-treatment includes information about the price 1 year ago; the past-2 sub-treatment includes information about the prices 1 and 2 years ago. The idea is that giving sellers information about past price changes can influence expectations about the future because they extrapolate from past price changes to future price changes. Consider an individual for whom the price changes were X% two years ago and Y% one year ago. If this individual is shown the past-1 letter, she will observe an average past change of Y% (i.e., over the last year); but if she is shown the past-2 letter, she will observe an average annual change of ((X+Y)/2)% (i.e., over the last two years).
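The past-1 versus past-2 example can be made concrete with a small calculation. The sketch below follows the simple average described in the text, ((X+Y)/2)%, rather than a compounded annual rate; the function name and the numbers are purely illustrative.

```python
from typing import Tuple


def past_signals(change_two_years_ago: float, change_last_year: float) -> Tuple[float, float]:
    """Return the average annual price change (in %) implied by each past letter.

    past-1 shows only the last year's change (Y).
    past-2 shows both years, so the average annual change is (X + Y) / 2.
    """
    past_1 = change_last_year
    past_2 = (change_two_years_ago + change_last_year) / 2
    return past_1, past_2


# Illustrative values only: X = 2% two years ago, Y = 6% last year.
past_1, past_2 = past_signals(2.0, 6.0)
print(past_1, past_2)  # 6.0 4.0
```

With these illustrative values, the same seller would see a 6% signal under past-1 and a 4% signal under past-2, so random assignment between the two sub-treatments shifts the signal by 2 percentage points without misstating any price.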

The main hypothesis of the study is that higher price expectations affect the transaction date and the transaction price: i.e., a seller who expects his or her house to appreciate more should be willing to wait a bit longer to sell the property for a higher price. We cannot manipulate house price expectations directly, but we do it indirectly through the information provision experiments.

Our main regression model exploits treatment heterogeneity. In other words, our main interest is NOT to compare the average behavior between individuals who receive the past, present and future letter types. Instead, we want to exploit the rich variation in signals given to the subjects. Take for example individuals within the future letter type. Assume that one forecast predicts a 3% increase and the other forecast predicts a 6% increase. So, relative to the first forecast, receiving the second forecast is equivalent to being treated with a 3-percentage-point higher signal about future house prices.

Our baseline regression extends this logic from the future letter type to the pooled sample with all letter types: the right-hand-side variables are a dummy for whether the individual received information about price dynamics (i.e., past or future letter types), the value of the signal that was included (or could have been included, for the present letter type), and the interaction between these two variables. The coefficient on this interaction variable (i.e., the treatment heterogeneity) is the main object of interest. We will use this exact same specification with the survey data on housing price expectations and with the administrative data on market behavior. In the survey data we also observe prior beliefs (i.e., expectations before receiving or not receiving the information), which we can use to augment the model.
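In notation of our own choosing (the symbols below do not appear in the registration), the baseline specification described above can be sketched as:

```latex
Y_i = \alpha + \beta\, D_i + \gamma\, S_i + \delta\, (D_i \times S_i) + \varepsilon_i
```

Here $Y_i$ is the outcome for subject $i$ (a survey expectation or a market outcome), $D_i$ equals 1 if the subject received a past or future letter and 0 if the subject received the present letter, and $S_i$ is the signal that was included in the letter (or that could have been included, for the present letter type). The coefficient $\delta$ on the interaction is the main object of interest. This sketch omits any additional controls, such as the prior beliefs available in the survey data.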

Last, note that the identification of this model would be possible even if we were to drop the future or the past letter types. The reason for having the two letter types is of a more practical nature: individuals may be more willing to incorporate one type of signal than the other. For example, individuals may be more comfortable extrapolating from past prices than trusting black-box forecasts produced by researchers, or vice-versa. The main objective is to pool all the treatment arms (to maximize power), but we will consider the possibility that one type of signal (past or future) was more effective than the other.

We will send a large number of letters (60,000) because, even if we find the average effect to be close to zero, the next step would be to ask whether there are at least some groups of individuals who are influenced by the information included in the letters. To address this question, we would analyze heterogeneity across subject characteristics, focusing on subgroups of the population for which, ex ante, we expect the letters to have a stronger influence. For example, it is possible that the seller's price expectations have a greater influence when the local market is a seller's market than when it is a buyer's market. Another example is that the letters may have a smaller effect on individuals who are selling their primary residences, because they may have less flexibility to delay the sale.

We will also use the heterogeneity analysis in the auxiliary survey to guide the heterogeneity analysis in the field experiment. For example, if the auxiliary survey suggests that less educated individuals put more weight on the signals about future house prices, we will then test whether the letters had a stronger effect on the behavior of less educated sellers.

We have rich data on subjects' characteristics to conduct this heterogeneity analysis. The administrative records already contain some information about the subjects (e.g., whether the seller is living in the property or not, the number of days since the property was put on the market). Additionally, we will merge external data on other characteristics of the subjects (from a market research company) and data on the local housing markets (from Zillow Research).

When looking at these and other sources of heterogeneity, we will use standard methods for joint hypothesis testing. We will also consider more modern machine learning methods.
Experimental Design Details
Not available
Randomization Method
Randomization done in office by a computer.
Randomization Unit
Was the treatment clustered?
Experiment Characteristics
Sample size: planned number of clusters
60,000 individuals
Sample size: planned number of observations
We will send 60,000 letters. It is important to note that there is a significant delay (i.e., weeks) between the time the data is acquired and the actual delivery of the letters. As a result, we anticipate that a non-trivial fraction of these 60,000 individuals will have already sold their properties by the time the letters are delivered. We will only analyze the effects on the subjects who had not sold their houses by our estimated delivery date (because it is impossible for our letter to affect a transaction that has already happened). If there is enough power, we can use the rest of the sample for a falsification test in an event-study fashion.
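The sample restriction described above amounts to a simple split on the sale date. The following Python sketch illustrates it under assumed names (`Subject`, `split_by_delivery`, and a single estimated delivery date shared by all subjects are simplifications that the registration does not specify).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Subject:
    """Hypothetical subject record (field names are assumptions)."""
    subject_id: int
    sale_date: Optional[date]  # None if no sale has been recorded


def split_by_delivery(
    subjects: List[Subject], estimated_delivery: date
) -> Tuple[List[Subject], List[Subject]]:
    """Split subjects into (analysis sample, falsification sample).

    Subjects who sold on or before the estimated delivery date cannot have
    been affected by the letter, so they go to the falsification sample.
    """
    analysis: List[Subject] = []
    falsification: List[Subject] = []
    for s in subjects:
        sold_by_delivery = s.sale_date is not None and s.sale_date <= estimated_delivery
        (falsification if sold_by_delivery else analysis).append(s)
    return analysis, falsification


subjects = [
    Subject(1, None),               # still on the market
    Subject(2, date(2019, 7, 25)),  # sold before the letter arrived
    Subject(3, date(2019, 8, 20)),  # sold after the letter arrived
]
analysis, falsification = split_by_delivery(subjects, date(2019, 8, 1))
print([s.subject_id for s in analysis])       # [1, 3]
print([s.subject_id for s in falsification])  # [2]
```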
Sample size (or number of clusters) by treatment arms
Subjects will be randomized into the following letter types: 20% to the "present" type, 30% to the "past" type, and 50% to the "future" type. Within the "past" letter type, subjects will be randomized with equal probability to 1-year or 2-year sub-treatments. Within the "future" letter type, subjects will be randomized with equal probability to one of the three possible forecast sub-treatments.
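A two-stage assignment with these probabilities could look like the Python sketch below. It assumes independent draws per subject (so realized shares are only approximately 20/30/50), and the sub-treatment labels `model-1` to `model-3`, the function names, and the seed are all hypothetical; the registration states only that randomization is done in office by a computer.

```python
import random
from collections import Counter
from typing import List, Optional, Tuple

# Letter type -> (assignment probability, equally likely sub-treatments).
# Sub-treatment labels are placeholders, not names used in the registration.
LETTER_TYPES = {
    "present": (0.20, [None]),
    "past": (0.30, ["past-1", "past-2"]),
    "future": (0.50, ["model-1", "model-2", "model-3"]),
}


def assign_one(rng: random.Random) -> Tuple[str, Optional[str]]:
    """Draw a letter type, then a sub-treatment with equal probability."""
    types = list(LETTER_TYPES)
    weights = [LETTER_TYPES[t][0] for t in types]
    letter_type = rng.choices(types, weights=weights, k=1)[0]
    sub_treatment = rng.choice(LETTER_TYPES[letter_type][1])
    return letter_type, sub_treatment


def assign_all(n: int, seed: int) -> List[Tuple[str, Optional[str]]]:
    """Assign n subjects reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [assign_one(rng) for _ in range(n)]


assignments = assign_all(60_000, seed=0)
shares = {t: c / len(assignments) for t, c in Counter(t for t, _ in assignments).items()}
print(shares)  # roughly 0.2 / 0.3 / 0.5 across present / past / future
```

If exact arm sizes were required, a shuffled list with fixed counts per arm would be the alternative design.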
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Supporting Documents and Materials

IRB Name
Cornell University Institutional Review Board
IRB Approval Date
IRB Approval Number
IRB Name
Institutional Review Board at University of California Los Angeles
IRB Approval Date
IRB Approval Number