class: title-slide

<br><br><br>

# Lecture 13

## Learning Models

### Tyler Ransom

### ECON 6343, University of Oklahoma

---

# Plan for the Day

1. Learning from noisy signals

2. How learning models relate to factor models

3. Bayesian updating & the Kalman filter

4. How to code learning models

5. Examples of learning models in economics

---

# Attribution

Some of these slides are based on content from Peter Arcidiacono's course on learning models.

---

# Imperfect information

- Imperfect information abounds in economics (and real life)
    - life is full of "noisy signals"
    - see also: "it's better to be lucky than good"
    - how do we know someone or something is "lucky" or "good"?
    - how do we know a restaurant we visited for the first time is actually good?
    - how do we know we didn't just happen to get their best dish on a good night?

---

# Tractable imperfect information

- How do we estimate models where a person has imperfect information?
- We've done some of this already with dynamic discrete choice models:
    - people can't see the future
    - instead, have expectations about their future states & preference shocks
    - we compute individuals' expectations according to the `\(\mathbb{E}\max\)` formula
    - we impose a strong assumption on the distribution of `\(\epsilon\)`
- That way, we can tractably compute the `\(\mathbb{E}\max\)`

---

# How learning works

- Consider a setting where an agent is trying to learn about something, call it `\(a_i\)`
- For simplicity, assume `\(a_i\)` is continuous and drawn from CDF `\(F_a\)`
- The agent doesn't know the exact value of `\(a_i\)`, but has beliefs denoted `\(\mathbb{E}_t [a_i]\)`
- He gains additional information about `\(a_i\)` from a noisy signal `\(S_{it}\)`
- That is, `\(S_{it} = a_i + \varepsilon_{it}\)` where `\(\varepsilon_{it}\)` is pure noise
- The agent updates his beliefs to `\(\mathbb{E}_{t+1} [a_i]\)` by incorporating the new information in `\(S_{it}\)`
- This process repeats itself in each period where `\(S_{it}\)` is
received

---

# A little bit more math

- If `\(\varepsilon_{it}\)` is pure noise, then it is independent of `\(a_i\)` for all `\(t\)`
- Assume WLOG that `\(\mathbb{E}(a_{i})=0\)` and `\(\mathbb{E}(\varepsilon_{it})=0\)` for all `\(t\)`
- Then we can decompose the variance of the signal `\(S_{it}\)`

`\begin{align*}
\mathbb{V}(S_{it}) &= \mathbb{V}(a_{i}) + \mathbb{V}(\varepsilon_{it})\\
&= \sigma^2_a + \sigma^2_{\varepsilon}
\end{align*}`

- The .hi[signal-to-noise ratio (SNR)] is defined as

`\begin{align*}
\frac{\mathbb{V}(S_{it})}{\mathbb{V}(\varepsilon_{it})} &= \frac{\sigma^2_a + \sigma^2_{\varepsilon}}{\sigma^2_{\varepsilon}}
\end{align*}`

- This ratio measures the quality of the signal (bigger is better)

---

# Semantic detail

- Some people call the signal the "meaningful input" rather than the input + noise
- In this case, the SNR would be

`\begin{align*}
\frac{\mathbb{V}(a_{i})}{\mathbb{V}(\varepsilon_{it})} &= \frac{\sigma^2_a}{\sigma^2_{\varepsilon}}
\end{align*}`

- I couldn't find a consensus on this, so just keep track of which definition is being used

---

# Connection to factor models

- In factor models, we have `\(J\)` correlated noisy measurements `\(M_j\)`
- We try to separate the "factor" from `\(M\)` using correlation across the `\(M_j\)`'s
- Factor models also involve a composite error term (can label it `\(a_i + \varepsilon_{ij}\)`)
- .hi[Difference:] In factor models, the true value of the factor is known to the individual but not the researcher
- In learning models, the factor is unknown to both the researcher and the individual
- Learning models require panel data

---

# Learning models and factor models

- Rather than being substitutes, these two types of models are .hi[complements]
- Factor models recognize that agents might possess some private information
- Learning models underscore the potential importance of unknown information
- If we ignored private information, that might distort what we call "learning"

---

# Bayesian updating of beliefs

- How exactly do agents
update their beliefs given new information in `\(S_{it}\)`?
- The simplest way to handle this is to assume .hi[Bayesian updating]
- As the name implies, this comes from Bayes' rule
- Given .hi[prior] beliefs with mean `\(\mathbb{E}_t[a_i]\)` and variance `\(\mathbb{V}_t[a_i]\)`, agents update according to

`\begin{align*}
\mathbb{E}_{t+1}[a_i] &= \mathbb{E}_t[a_i]\frac{\sigma^2_\varepsilon}{\sigma^2_\varepsilon + \mathbb{V}_t[a_i]} + S_{it}\frac{\mathbb{V}_t[a_i]}{\sigma^2_\varepsilon + \mathbb{V}_t[a_i]} \\
\mathbb{V}_{t+1}[a_i] &= \mathbb{V}_t[a_i] \frac{\sigma^2_\varepsilon}{\sigma^2_\varepsilon + \mathbb{V}_t[a_i]}
\end{align*}`

- `\(\mathbb{E}_{t+1}[a_i]\)` and `\(\mathbb{V}_{t+1}[a_i]\)` are referred to as the .hi[posterior] beliefs

---

# Properties of Bayesian Learning

1. `\(\mathbb{V}_{t+1}[a_i]>0\)` for all `\(t\)`
    - One is never completely certain of what one has learned

2. If `\(\sigma^2_a>0\)` then `\(\frac{\partial\mathbb{V}_{t+1}[a_i]}{\partial t}<0\)`
    - As additional signals are received, uncertainty of beliefs goes down

3. If `\(\sigma^2_a>0\)` then `\(\lim_{t\rightarrow\infty}\mathbb{V}_{t+1}[a_i] = 0\)`
    - In the limit, uncertainty of beliefs vanishes

4. The .hi[speed of learning] is dictated by the signal-to-noise ratio

- These properties may not be desirable, but they are intrinsic to Bayesianism

---

# Non-Bayesian updating

- Bayesian updating is so popular because the math works out nicely
- This is because of what is known as a .hi[conjugate prior] (see Wikipedia; e.g., a normal prior combined with normally distributed noise yields a normal posterior)
- If we assume a different kind of updating, the math could get ugly real quick
- But a lot of properties of Bayesian updating make sense (e.g.
`\(\lim_{t\rightarrow\infty}\mathbb{V}_{t+1}[a_i] = 0\)`)
- Moreover, we often don't know people's beliefs
- If we had detailed data on people's beliefs, that would allow us to be more flexible
- Sort of like how getting stated preference data aids estimation of choice models

---

# Other considerations

- Things get more complicated if the signal is not continuous
    - Naturally, a discrete signal will provide less information
    - e.g. Pass/Fail on an exam, versus a 0-100 score
- Another complication arises when the signal is selected
    - For example, I only see a wage signal if I have a job
    - In this case, we need a choice model to resolve the sample selection problem
    - We'll talk about this towards the end of today's class

---

# The Kalman filter

- The .hi[Kalman filter] is a generalization of the Bayesian updating used in learning models
- Most common [application](http://greg.czerniak.info/guides/kalman1/): remote sensing of aircraft/spacecraft
    - Any given sensor sends back a "noisy" signal about exact location
    - Multiple sensors acting in sequence can provide more reliable location info
- Another cool example: estimating the effective reproduction number `\(R\)` of SARS-CoV-2
    - Arroyo Marioli, Bullano, Kucinskas et al. (2020)
    - They created a neat continuously updated [dashboard](http://trackingr-env.eba-9muars8y.us-east-2.elasticbeanstalk.com/)
    - `\(R\)` tends to 1 in equilibrium, as predicted by [Joshua Gans](https://joshuagans.substack.com/p/why-r-tends-towards-1)

---

# Multidimensional learning

- What would Bayesian updating look like if `\(a_i\)` were a vector rather than a scalar?
- Let `\(A_i\)` denote the vector, and suppose its population covariance is `\(\Delta\)`
- `\(\mathbf{S}_{it} = A_i + \boldsymbol\varepsilon_{it}\)` is a vector-valued signal, and beliefs update according to

`\begin{align*}
\mathbb{E}_{t+1}[A_i]&=(\mathbb{V}^{-1}_{t}[A_i] + \Omega_{it})^{-1}(\mathbb{V}^{-1}_{t}[A_i]\mathbb{E}_{t}[A_i]+\Omega_{it}\mathbf{S}_{it}) \\
\mathbb{V}_{t+1}[A_i]&=(\mathbb{V}^{-1}_{t}[A_i] + \Omega_{it})^{-1}
\end{align*}`

- `\(\Omega_{it}\)` is a diagonal matrix with `\(\frac{1}{\sigma^2_{\varepsilon_j}}\)` in the `\((j,j)\)` element
- Elements of `\(\Omega\)` and `\(\mathbf{S}\)` are set to 0 for signals that aren't received
    - (in other words, not all signals need to be received in every period)

---

# Updating, step by step

- This example will hopefully clarify how updating works
- In period 1, the individual begins with prior beliefs `\((\mathbb{E}_1[a_i],\mathbb{V}_1[a_i])\)`
- Usually, we set these to the population values `\((0,\sigma^2_a)\)` for all individuals
- Then, a signal `\(S_{i1}\)` is received and beliefs are updated according to the formulas:

`\begin{align*}
\mathbb{E}_{2}[a_i] &= \underbrace{\mathbb{E}_1[a_i]}_{0}\frac{\sigma^2_\varepsilon}{\sigma^2_\varepsilon + \underbrace{\mathbb{V}_1[a_i]}_{\sigma^2_a}} + S_{i1}\frac{\overbrace{\mathbb{V}_1[a_i]}^{\sigma^2_a}}{\sigma^2_\varepsilon + \underbrace{\mathbb{V}_1[a_i]}_{\sigma^2_a}}\\
& = \frac{S_{i1}\sigma^2_a}{\sigma^2_\varepsilon + \sigma^2_a}
\end{align*}`

---

# Updating the variance

- When `\(S_{i1}\)` is received, `\(i\)` updates the variance as follows:

`\begin{align*}
\mathbb{V}_{2}[a_i] &= \underbrace{\mathbb{V}_1[a_i]}_{\sigma^2_a} \frac{\sigma^2_\varepsilon}{\sigma^2_\varepsilon + \underbrace{\mathbb{V}_1[a_i]}_{\sigma^2_a}}\\
& = \frac{\sigma^2_a\sigma^2_\varepsilon}{\sigma^2_\varepsilon + \sigma^2_a}
\end{align*}`

- It is straightforward to show that `\(\mathbb{V}_{2}[a_i]<\mathbb{V}_{1}[a_i]\)` when `\(\sigma^2_a>0\)`

---

# Estimating a simple learning model

- Let's estimate a simple learning
model
- Suppose the signals are log wage residuals
- Individuals are trying to ascertain their ability from these residuals

`\begin{align*}
\log w_{it} &= X_{it}\beta + a_i + \varepsilon_{it}
\end{align*}`

- We want to estimate `\((\beta,\sigma^2_a,\sigma^2_\varepsilon)\)` by maximum likelihood
- We also want to recover each person's beliefs at each point in time
- Let's also compare the results with those from other panel data estimators (FE, RE)
- .hi[Note:] our learning model assumes `\(a_i \perp X_{it}\)`, so it is identical to RE

---

# Estimation code

- In the single linear equation case, estimation is identical to RE (so just use RE)
- For a multidimensional learning case, see [this GitHub repository](https://github.com/tyleransom/LearningModels)
- Let's estimate the simple learning model from the previous slide

.scroll-box-12[
``` julia
# Note: written for the package versions in use when these slides were made. Newer releases
# changed some of these calls (e.g. CSV.read needs a sink argument such as DataFrame,
# categorical! was removed from DataFrames.jl, and newer DataFramesMeta.jl releases use
# `:newcol = ...` inside @transform), so the code may need adjusting.
using Random, Statistics, LinearAlgebra, DataFrames, DataFramesMeta, CSV, FixedEffectModels, MixedModels

df = CSV.read("nlswlearn.csv")
dfuse = df[df.ln_wage.!=999,:]  # ln_wage == 999 marks periods with no observed wage

# FE
@show reg(dfuse, @formula(ln_wage ~ 1 + exper*exper + collgrad + race1 + fe(idcode)), Vcov.cluster(:idcode))

# RE
categorical!(dfuse, :idcode)
@show fm1 = fit(MixedModel, @formula(ln_wage ~ 1 + exper*exper + collgrad + race1 + (1|idcode)), dfuse)
# gives σ²_ε = .092187 and σ²_a = 0.106297

# Add columns to data indicating the signal (S_{it}) and prior/posterior mean/variances
sig_eps = .092187  # σ²_ε (a variance, despite the name)
df = @transform(df, signal = :ln_wage .- coef(fm1)[1] .- coef(fm1)[2].*:exper .- coef(fm1)[3].*:collgrad .- coef(fm1)[4].*:race1 .- coef(fm1)[5].*:exper.^2,
                    priorEbelief = zeros(length(:ln_wage)),
                    postrEbelief = zeros(length(:ln_wage)),
                    priorVbelief = 0.106297*ones(length(:ln_wage)),
                    postrVbelief = 0.106297*ones(length(:ln_wage)))

# Panel dimensions (previously undefined); the row arithmetic below assumes
# a balanced panel sorted by idcode, then t
N = length(unique(df.idcode))
T = maximum(df.t)

# loop through to apply belief formulas
for i = 1:N
    for t=1:T
        rowt = (i-1)*T+t    # row of person i in period t
        row1 = (i-1)*T+t+1  # row of person i in period t+1
        if df.ln_wage[rowt]==999
            # didn't get a signal this period
            df.signal[rowt] = 0
            df.postrEbelief[rowt] = df.priorEbelief[rowt]
            df.postrVbelief[rowt] =
                                    df.priorVbelief[rowt]
        else
            # these are the formulas from the slides
            df.postrEbelief[rowt] = df.priorEbelief[rowt]*(sig_eps./(sig_eps + df.priorVbelief[rowt])) + df.signal[rowt]*(df.priorVbelief[rowt])./(sig_eps + df.priorVbelief[rowt])
            df.postrVbelief[rowt] = df.priorVbelief[rowt]*(sig_eps./(sig_eps + df.priorVbelief[rowt]))
        end
        if t<T
            # set prior in t+1 to be posterior from t except in very last period
            df.priorEbelief[row1] = df.postrEbelief[rowt]
            df.priorVbelief[row1] = df.postrVbelief[rowt]
        end
    end
end
```
]

---

# Looking at the belief updating

.scroll-box-20[
``` julia
│ Row   │ ln_wage │ idcode │ t     │ signal     │ priorEbelief │ postrEbelief │ priorVbelief │ postrVbelief │
├───────┼─────────┼────────┼───────┼────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ 1     │ 999.0   │ 1      │ 1     │ 0.0        │ 0.0          │ 0.0          │ 0.106297     │ 0.106297     │
│ 2     │ 999.0   │ 1      │ 2     │ 0.0        │ 0.0          │ 0.0          │ 0.106297     │ 0.106297     │
│ 3     │ 1.45121 │ 1      │ 3     │ 0.071744   │ 0.0          │ 0.0384221    │ 0.106297     │ 0.0493702    │
│ 4     │ 1.02862 │ 1      │ 4     │ -0.436915  │ 0.0384221    │ -0.127359    │ 0.0493702    │ 0.0321516    │
│ 5     │ 1.58998 │ 1      │ 5     │ 0.0476647  │ -0.127359    │ -0.082101    │ 0.0321516    │ 0.0238378    │
│ 6     │ 1.78027 │ 1      │ 6     │ 0.17047    │ -0.082101    │ -0.0302093   │ 0.0238378    │ 0.0189402    │
│ 7     │ 1.77701 │ 1      │ 7     │ 0.167209   │ -0.0302093   │ 0.00343808   │ 0.0189402    │ 0.0157121    │
│ 8     │ 1.77868 │ 1      │ 8     │ 0.168878   │ 0.00343808   │ 0.0275291    │ 0.0157121    │ 0.0134241    │
│ 9     │ 2.49398 │ 1      │ 9     │ 0.825968   │ 0.0275291    │ 0.129018     │ 0.0134241    │ 0.0117178    │
│ 10    │ 2.55172 │ 1      │ 10    │ 0.883707   │ 0.129018     │ 0.214128     │ 0.0117178    │ 0.0103963    │
│ 11    │ 999.0   │ 1      │ 11    │ 0.0        │ 0.214128     │ 0.214128     │ 0.0103963    │ 0.0103963    │
│ 12    │ 2.42026 │ 1      │ 12    │ 0.752253   │ 0.214128     │ 0.268664     │ 0.0103963    │ 0.00934272   │
│ 13    │ 2.61417 │ 1      │ 13    │ 0.946164   │ 0.268664     │ 0.331007     │ 0.00934272   │ 0.008483     │
│ 14    │ 2.53637 │ 1      │ 14    │ 0.868366   │ 0.331007     │ 0.376288     │ 0.008483     │ 0.00776818   │
│ 15    │ 2.46293 │ 1      │ 15    │ 0.746001   │ 0.376288     │ 0.405021     │ 0.00776818   │ 0.00716446   │
├───────┼─────────┼────────┼───────┼────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ 16    │ 999.0   │ 2      │ 1     │ 0.0        │ 0.0          │ 0.0          │ 0.106297     │ 0.106297     │
│ 17    │ 999.0   │ 2      │ 2     │ 0.0        │ 0.0          │ 0.0          │ 0.106297     │ 0.106297     │
│ 18    │ 999.0   │ 2      │ 3     │ 0.0        │ 0.0          │ 0.0          │ 0.106297     │ 0.106297     │
│ 19    │ 1.36035 │ 2      │ 4     │ -0.019122  │ 0.0          │ -0.0102407   │ 0.106297     │ 0.0493702    │
│ 20    │ 1.2062  │ 2      │ 5     │ -0.259337  │ -0.0102407   │ -0.0971167   │ 0.0493702    │ 0.0321516    │
```
]

---

# Why learning is important

- Uncertainty and learning can explain some empirical puzzles
    - Why engage in something costly and not finish it? (e.g. college)
- Agents might act differently if they had more information
- This means we need to model how beliefs map into actions
- Thus, .hi[learning should be part of a dynamic choice model]
- A persistent question: how can we help agents become more informed?
- Information is valuable, but usually costly to obtain. How can we lower that cost?

---

# Papers that use learning models

- Education:
    - High school dropout (Fu, Grau, and Rivera, 2020)
    - College dropout (Stinebrickner and Stinebrickner, 2014a; Arcidiacono, Aucejo, Maurel et al., 2025)
    - College major choice (Arcidiacono, 2004; Stinebrickner and Stinebrickner, 2014b)

---

# Papers that use learning models

- Labor:
    - Occupational choice (Miller, 1984; James, 2011)
    - Employee quality (Farber and Gibbons, 1996; Altonji and Pierret, 2001)
- Family:
    - Marriage match quality (Brien, Lillard, and Stern, 2006)
- IO:
    - Learning about experience goods (Erdem and Keane, 1996; Ackerberg, 2003)

---

# References

.tinier[
Ackerberg, D. A. (2003). "Advertising, Learning, and Consumer Choice in Experience Good Markets: An Empirical Examination". In: _International Economic Review_ 44.3, pp. 1007-1040. DOI: [10.1111/1468-2354.t01-2-00098](https://doi.org/10.1111%2F1468-2354.t01-2-00098).

Adams, R. P. (2018). _Model Selection and Cross Validation_. Lecture Notes. Princeton University.
URL: [https://www.cs.princeton.edu/courses/archive/fall18/cos324/files/model-selection.pdf](https://www.cs.princeton.edu/courses/archive/fall18/cos324/files/model-selection.pdf). Ahlfeldt, G. M., S. J. Redding, D. M. Sturm, et al. (2015). "The Economics of Density: Evidence From the Berlin Wall". In: _Econometrica_ 83.6, pp. 2127-2189. DOI: [10.3982/ECTA10876](https://doi.org/10.3982%2FECTA10876). Altonji, J. G., T. E. Elder, and C. R. Taber (2005). "Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools". In: _Journal of Political Economy_ 113.1, pp. 151-184. DOI: [10.1086/426036](https://doi.org/10.1086%2F426036). Altonji, J. G. and C. R. Pierret (2001). "Employer Learning and Statistical Discrimination". In: _Quarterly Journal of Economics_ 116.1, pp. 313-350. DOI: [10.1162/003355301556329](https://doi.org/10.1162%2F003355301556329). Angrist, J. D. and A. B. Krueger (1991). "Does Compulsory School Attendance Affect Schooling and Earnings?" In: _Quarterly Journal of Economics_ 106.4, pp. 979-1014. DOI: [10.2307/2937954](https://doi.org/10.2307%2F2937954). Angrist, J. D. and J. Pischke (2009). _Mostly Harmless Econometrics: An Empiricist's Companion_. Princeton University Press. ISBN: 0691120358. Arcidiacono, P. (2004). "Ability Sorting and the Returns to College Major". In: _Journal of Econometrics_ 121, pp. 343-375. DOI: [10.1016/j.jeconom.2003.10.010](https://doi.org/10.1016%2Fj.jeconom.2003.10.010). Arcidiacono, P., E. Aucejo, A. Maurel, et al. (2016). _College Attrition and the Dynamics of Information Revelation_. Working Paper. Duke University. URL: [https://tyleransom.github.io/research/CollegeDropout2016May31.pdf](https://tyleransom.github.io/research/CollegeDropout2016May31.pdf). Arcidiacono, P., E. Aucejo, A. Maurel, et al. (2025). "College Attrition and the Dynamics of Information Revelation". In: _Journal of Political Economy_ 133.1. DOI: [10.1086/732526](https://doi.org/10.1086%2F732526). Arcidiacono, P. and J. 
B. Jones (2003). "Finite Mixture Distributions, Sequential Likelihood and the EM Algorithm". In: _Econometrica_ 71.3, pp. 933-946. DOI: [10.1111/1468-0262.00431](https://doi.org/10.1111%2F1468-0262.00431). Arcidiacono, P., J. Kinsler, and T. Ransom (2022b). "Asian American Discrimination in Harvard Admissions". In: _European Economic Review_ 144, p. 104079. DOI: [10.1016/j.euroecorev.2022.104079](https://doi.org/10.1016%2Fj.euroecorev.2022.104079). Arcidiacono, P., J. Kinsler, and T. Ransom (2022a). "Legacy and Athlete Preferences at Harvard". In: _Journal of Labor Economics_ 40.1, pp. 133-156. DOI: [10.1086/713744](https://doi.org/10.1086%2F713744). Arcidiacono, P. and R. A. Miller (2011). "Conditional Choice Probability Estimation of Dynamic Discrete Choice Models With Unobserved Heterogeneity". In: _Econometrica_ 79.6, pp. 1823-1867. DOI: [10.3982/ECTA7743](https://doi.org/10.3982%2FECTA7743). Arroyo Marioli, F., F. Bullano, S. Kucinskas, et al. (2020). _Tracking R of COVID-19: A New Real-Time Estimation Using the Kalman Filter_. Working Paper. medRxiv. DOI: [10.1101/2020.04.19.20071886](https://doi.org/10.1101%2F2020.04.19.20071886). Ashworth, J., V. J. Hotz, A. Maurel, et al. (2021). "Changes across Cohorts in Wage Returns to Schooling and Early Work Experiences". In: _Journal of Labor Economics_ 39.4, pp. 931-964. DOI: [10.1086/711851](https://doi.org/10.1086%2F711851). Attanasio, O. P., C. Meghir, and A. Santiago (2011). "Education Choices in Mexico: Using a Structural Model and a Randomized Experiment to Evaluate PROGRESA". In: _Review of Economic Studies_ 79.1, pp. 37-66. DOI: [10.1093/restud/rdr015](https://doi.org/10.1093%2Frestud%2Frdr015). Aucejo, E. M. and J. James (2019). "Catching Up to Girls: Understanding the Gender Imbalance in Educational Attainment Within Race". In: _Journal of Applied Econometrics_ 34.4, pp. 502-525. DOI: [10.1002/jae.2699](https://doi.org/10.1002%2Fjae.2699). Baragatti, M., A. Grimaud, and D. Pommeret (2013).
"Likelihood-free Parallel Tempering". In: _Statistics and Computing_ 23.4, pp. 535-549. DOI: [10.1007/s11222-012-9328-6](https://doi.org/10.1007%2Fs11222-012-9328-6). Bayer, P., R. McMillan, A. Murphy, et al. (2016). "A Dynamic Model of Demand for Houses and Neighborhoods". In: _Econometrica_ 84.3, pp. 893-942. DOI: [10.3982/ECTA10170](https://doi.org/10.3982%2FECTA10170). Begg, C. B. and R. Gray (1984). "Calculation of Polychotomous Logistic Regression Parameters Using Individualized Regressions". In: _Biometrika_ 71.1, pp. 11-18. DOI: [10.1093/biomet/71.1.11](https://doi.org/10.1093%2Fbiomet%2F71.1.11). Beggs, S. D., N. S. Cardell, and J. Hausman (1981). "Assessing the Potential Demand for Electric Cars". In: _Journal of Econometrics_ 17.1, pp. 1-19. DOI: [10.1016/0304-4076(81)90056-7](https://doi.org/10.1016%2F0304-4076%2881%2990056-7). Berry, S., J. Levinsohn, and A. Pakes (1995). "Automobile Prices in Market Equilibrium". In: _Econometrica_ 63.4, pp. 841-890. URL: [http://www.jstor.org/stable/2171802](http://www.jstor.org/stable/2171802). Blass, A. A., S. Lach, and C. F. Manski (2010). "Using Elicited Choice Probabilities to Estimate Random Utility Models: Preferences for Electricity Reliability". In: _International Economic Review_ 51.2, pp. 421-440. DOI: [10.1111/j.1468-2354.2010.00586.x](https://doi.org/10.1111%2Fj.1468-2354.2010.00586.x). Blundell, R. (2010). "Comments on: ``Structural vs. Atheoretic Approaches to Econometrics'' by Michael Keane". In: _Journal of Econometrics_ 156.1, pp. 25-26. DOI: [10.1016/j.jeconom.2009.09.005](https://doi.org/10.1016%2Fj.jeconom.2009.09.005). Bresnahan, T. F., S. Stern, and M. Trajtenberg (1997). "Market Segmentation and the Sources of Rents from Innovation: Personal Computers in the Late 1980s". In: _The RAND Journal of Economics_ 28.0, pp. S17-S44. DOI: [10.2307/3087454](https://doi.org/10.2307%2F3087454). Brien, M. J., L. A. Lillard, and S. Stern (2006).
"Cohabitation, Marriage, and Divorce in a Model of Match Quality". In: _International Economic Review_ 47.2, pp. 451-494. DOI: [10.1111/j.1468-2354.2006.00385.x](https://doi.org/10.1111%2Fj.1468-2354.2006.00385.x). Card, D. (1995). "Using Geographic Variation in College Proximity to Estimate the Return to Schooling". In: _Aspects of Labor Market Behaviour: Essays in Honour of John Vanderkamp_. Ed. by L. N. Christofides, E. K. Grant and R. Swidinsky. Toronto: University of Toronto Press. Cardell, N. S. (1997). "Variance Components Structures for the Extreme-Value and Logistic Distributions with Application to Models of Heterogeneity". In: _Econometric Theory_ 13.2, pp. 185-213. URL: [https://www.jstor.org/stable/3532724](https://www.jstor.org/stable/3532724). Caucutt, E. M., L. Lochner, J. Mullins, et al. (2020). _Child Skill Production: Accounting for Parental and Market-Based Time and Goods Investments_. Working Paper 27838. National Bureau of Economic Research. DOI: [10.3386/w27838](https://doi.org/10.3386%2Fw27838). Chen, X., H. Hong, and D. Nekipelov (2011). "Nonlinear Models of Measurement Errors". In: _Journal of Economic Literature_ 49.4, pp. 901-937. DOI: [10.1257/jel.49.4.901](https://doi.org/10.1257%2Fjel.49.4.901). Chintagunta, P. K. (1992). "Estimating a Multinomial Probit Model of Brand Choice Using the Method of Simulated Moments". In: _Marketing Science_ 11.4, pp. 386-407. DOI: [10.1287/mksc.11.4.386](https://doi.org/10.1287%2Fmksc.11.4.386). Cinelli, C. and C. Hazlett (2020). "Making Sense of Sensitivity: Extending Omitted Variable Bias". In: _Journal of the Royal Statistical Society: Series B (Statistical Methodology)_ 82.1, pp. 39-67. DOI: [10.1111/rssb.12348](https://doi.org/10.1111%2Frssb.12348). Coate, P. and K. Mangum (2019). _Fast Locations and Slowing Labor Mobility_. Working Paper 19-49. Federal Reserve Bank of Philadelphia. Cunha, F., J. J. Heckman, and S. M. Schennach (2010). 
"Estimating the Technology of Cognitive and Noncognitive Skill Formation". In: _Econometrica_ 78.3, pp. 883-931. DOI: [10.3982/ECTA6551](https://doi.org/10.3982%2FECTA6551). Cunningham, S. (2021). _Causal Inference: The Mixtape_. Yale University Press. URL: [https://www.scunning.com/causalinference_norap.pdf](https://www.scunning.com/causalinference_norap.pdf). Delavande, A. and C. F. Manski (2015). "Using Elicited Choice Probabilities in Hypothetical Elections to Study Decisions to Vote". In: _Electoral Studies_ 38, pp. 28-37. DOI: [10.1016/j.electstud.2015.01.006](https://doi.org/10.1016%2Fj.electstud.2015.01.006). Delavande, A. and B. Zafar (2019). "University Choice: The Role of Expected Earnings, Nonpecuniary Outcomes, and Financial Constraints". In: _Journal of Political Economy_ 127.5, pp. 2343-2393. DOI: [10.1086/701808](https://doi.org/10.1086%2F701808). Diegert, P., M. A. Masten, and A. Poirier (2025). _Assessing Omitted Variable Bias when the Controls are Endogenous_. arXiv. DOI: [10.48550/ARXIV.2206.02303](https://doi.org/10.48550%2FARXIV.2206.02303). Erdem, T. and M. P. Keane (1996). "Decision-Making under Uncertainty: Capturing Dynamic Brand Choice Processes in Turbulent Consumer Goods Markets". In: _Marketing Science_ 15.1, pp. 1-20. DOI: [10.1287/mksc.15.1.1](https://doi.org/10.1287%2Fmksc.15.1.1). Evans, R. W. (2018). _Simulated Method of Moments (SMM) Estimation_. QuantEcon Note. University of Chicago. URL: [https://notes.quantecon.org/submission/5b3db2ceb9eab00015b89f93](https://notes.quantecon.org/submission/5b3db2ceb9eab00015b89f93). Farber, H. S. and R. Gibbons (1996). "Learning and Wage Dynamics". In: _Quarterly Journal of Economics_ 111.4, pp. 1007-1047. DOI: [10.2307/2946706](https://doi.org/10.2307%2F2946706). Fu, C., N. Grau, and J. Rivera (2020). _Wandering Astray: Teenagers' Choices of Schooling and Crime_. Working Paper. University of Wisconsin-Madison. 
URL: [https://www.ssc.wisc.edu/~cfu/wander.pdf](https://www.ssc.wisc.edu/~cfu/wander.pdf). Gillingham, K., F. Iskhakov, A. Munk-Nielsen, et al. (2022). "Equilibrium Trade in Automobiles". In: _Journal of Political Economy_. DOI: [10.1086/720463](https://doi.org/10.1086%2F720463). Haile, P. (2019). _``Structural vs. Reduced Form'' Language and Models in Empirical Economics_. Lecture Slides. Yale University. URL: [http://www.econ.yale.edu/~pah29/intro.pdf](http://www.econ.yale.edu/~pah29/intro.pdf). Haile, P. (2024). _Models, Measurement, and the Language of Empirical Economics_. Lecture Slides. Yale University. URL: [https://www.dropbox.com/s/8kwtwn30dyac18s/intro.pdf](https://www.dropbox.com/s/8kwtwn30dyac18s/intro.pdf). Heckman, J. J., J. Stixrud, and S. Urzua (2006). "The Effects of Cognitive and Noncognitive Abilities on Labor Market Outcomes and Social Behavior". In: _Journal of Labor Economics_ 24.3, pp. 411-482. DOI: [10.1086/504455](https://doi.org/10.1086%2F504455). Hotz, V. J. and R. A. Miller (1993). "Conditional Choice Probabilities and the Estimation of Dynamic Models". In: _The Review of Economic Studies_ 60.3, pp. 497-529. DOI: [10.2307/2298122](https://doi.org/10.2307%2F2298122). Hurwicz, L. (1950). "Generalization of the Concept of Identification". In: _Statistical Inference in Dynamic Economic Models_. Hoboken, NJ: John Wiley and Sons, pp. 245-257. Ishimaru, S. (2022). _Geographic Mobility of Youth and Spatial Gaps in Local College and Labor Market Opportunities_. Working Paper. Hitotsubashi University. James, J. (2011). _Ability Matching and Occupational Choice_. Working Paper 11-25. Federal Reserve Bank of Cleveland. James, J. (2017). "MM Algorithm for General Mixed Multinomial Logit Models". In: _Journal of Applied Econometrics_ 32.4, pp. 841-857. DOI: [10.1002/jae.2532](https://doi.org/10.1002%2Fjae.2532). Jin, H. and H. Shen (2020). "Foreign Asset Accumulation Among Emerging Market Economies: A Case for Coordination". 
In: _Review of Economic Dynamics_ 35.1, pp. 54-73. DOI: [10.1016/j.red.2019.04.006](https://doi.org/10.1016%2Fj.red.2019.04.006). Keane, M. P. (2010). "Structural vs. Atheoretic Approaches to Econometrics". In: _Journal of Econometrics_ 156.1, pp. 3-20. DOI: [10.1016/j.jeconom.2009.09.003](https://doi.org/10.1016%2Fj.jeconom.2009.09.003). Keane, M. P. and K. I. Wolpin (1997). "The Career Decisions of Young Men". In: _Journal of Political Economy_ 105.3, pp. 473-522. DOI: [10.1086/262080](https://doi.org/10.1086%2F262080). Koopmans, T. C. and O. Reiersol (1950). "The Identification of Structural Characteristics". In: _The Annals of Mathematical Statistics_ 21.2, pp. 165-181. URL: [http://www.jstor.org/stable/2236899](http://www.jstor.org/stable/2236899). Kosar, G., T. Ransom, and W. van der Klaauw (2022). "Understanding Migration Aversion Using Elicited Counterfactual Choice Probabilities". In: _Journal of Econometrics_ 231.1, pp. 123-147. DOI: [10.1016/j.jeconom.2020.07.056](https://doi.org/10.1016%2Fj.jeconom.2020.07.056). Krauth, B. (2016). "Bounding a Linear Causal Effect Using Relative Correlation Restrictions". In: _Journal of Econometric Methods_ 5.1, pp. 117-141. DOI: [10.1515/jem-2013-0013](https://doi.org/10.1515%2Fjem-2013-0013). Lang, K. and M. D. Palacios (2018). _The Determinants of Teachers' Occupational Choice_. Working Paper 24883. National Bureau of Economic Research. DOI: [10.3386/w24883](https://doi.org/10.3386%2Fw24883). Lee, D. S., J. McCrary, M. J. Moreira, et al. (2020). _Valid t-ratio Inference for IV_. Working Paper. arXiv. URL: [https://arxiv.org/abs/2010.05058](https://arxiv.org/abs/2010.05058). Lewbel, A. (2019). "The Identification Zoo: Meanings of Identification in Econometrics". In: _Journal of Economic Literature_ 57.4, pp. 835-903. DOI: [10.1257/jel.20181361](https://doi.org/10.1257%2Fjel.20181361). Mahoney, N. (2022). "Principles for Combining Descriptive and Model-Based Analysis in Applied Microeconomics Research". 
In: _Journal of Economic Perspectives_ 36.3, pp. 211-22. DOI: [10.1257/jep.36.3.211](https://doi.org/10.1257%2Fjep.36.3.211). McFadden, D. (1978). "Modelling the Choice of Residential Location". In: _Spatial Interaction Theory and Planning Models_. Ed. by A. Karlqvist, L. Lundqvist, F. Snickers and J. W. Weibull. Amsterdam: North Holland, pp. 75-96. McFadden, D. (1989). "A Method of Simulated Moments for Estimation of Discrete Response Models Without Numerical Integration". In: _Econometrica_ 57.5, pp. 995-1026. DOI: [10.2307/1913621](https://doi.org/10.2307%2F1913621). URL: [http://www.jstor.org/stable/1913621](http://www.jstor.org/stable/1913621). Mellon, J. (2020). _Rain, Rain, Go Away: 137 Potential Exclusion-Restriction Violations for Studies Using Weather as an Instrumental Variable_. Working Paper. University of Manchester. URL: [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3715610](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3715610). Miller, R. A. (1984). "Job Matching and Occupational Choice". In: _Journal of Political Economy_ 92.6, pp. 1086-1120. DOI: [10.1086/261276](https://doi.org/10.1086%2F261276). Mincer, J. (1974). _Schooling, Experience and Earnings_. New York: Columbia University Press for National Bureau of Economic Research. Ost, B., W. Pan, and D. Webber (2018). "The Returns to College Persistence for Marginal Students: Regression Discontinuity Evidence from University Dismissal Policies". In: _Journal of Labor Economics_ 36.3, pp. 779-805. DOI: [10.1086/696204](https://doi.org/10.1086%2F696204). Oster, E. (2019). "Unobservable Selection and Coefficient Stability: Theory and Evidence". In: _Journal of Business & Economic Statistics_ 37.2, pp. 187-204. DOI: [10.1080/07350015.2016.1227711](https://doi.org/10.1080%2F07350015.2016.1227711). Pischke, S. (2007). _Lecture Notes on Measurement Error_. Lecture Notes. London School of Economics. 
URL: [http://econ.lse.ac.uk/staff/spischke/ec524/Merr_new.pdf](http://econ.lse.ac.uk/staff/spischke/ec524/Merr_new.pdf). Ransom, M. R. and T. Ransom (2018). "Do High School Sports Build or Reveal Character? Bounding Causal Estimates of Sports Participation". In: _Economics of Education Review_ 64, pp. 75-89. DOI: [10.1016/j.econedurev.2018.04.002](https://doi.org/10.1016%2Fj.econedurev.2018.04.002). Ransom, T. (2022). "Labor Market Frictions and Moving Costs of the Employed and Unemployed". In: _Journal of Human Resources_ 57.S, pp. S137-S166. DOI: [10.3368/jhr.monopsony.0219-10013R2](https://doi.org/10.3368%2Fjhr.monopsony.0219-10013R2). Rudik, I. (2020). "Optimal Climate Policy When Damages Are Unknown". In: _American Economic Journal: Economic Policy_ 12.2, pp. 340-373. DOI: [10.1257/pol.20160541](https://doi.org/10.1257%2Fpol.20160541). Rust, J. (1987). "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher". In: _Econometrica_ 55.5, pp. 999-1033. URL: [http://www.jstor.org/stable/1911259](http://www.jstor.org/stable/1911259). Shalizi, C. R. (2019). _Advanced Data Analysis from an Elementary Point of View_. Cambridge University Press. URL: [http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/ADAfaEPoV.pdf](http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/ADAfaEPoV.pdf). Smith Jr., A. A. (2008). "Indirect Inference". In: _The New Palgrave Dictionary of Economics_. Ed. by S. N. Durlauf and L. E. Blume. Vol. 1-8. London: Palgrave Macmillan. DOI: [10.1007/978-1-349-58802-2](https://doi.org/10.1007%2F978-1-349-58802-2). URL: [http://www.econ.yale.edu/smith/palgrave7.pdf](http://www.econ.yale.edu/smith/palgrave7.pdf). Stinebrickner, R. and T. Stinebrickner (2014a). "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model". In: _Journal of Labor Economics_ 32.3, pp. 601-644. DOI: [10.1086/675308](https://doi.org/10.1086%2F675308). Stinebrickner, R. and T. R. Stinebrickner (2014b). "A Major in Science? 
Initial Beliefs and Final Outcomes for College Major and Dropout". In: _Review of Economic Studies_ 81.1, pp. 426-472. DOI: [10.1093/restud/rdt025](https://doi.org/10.1093%2Frestud%2Frdt025). Su, C. and K. L. Judd (2012). "Constrained Optimization Approaches to Estimation of Structural Models". In: _Econometrica_ 80.5, pp. 2213-2230. DOI: [10.3982/ECTA7925](https://doi.org/10.3982%2FECTA7925). Train, K. (2009). _Discrete Choice Methods with Simulation_. 2nd ed. Cambridge; New York: Cambridge University Press. ISBN: 9780521766555. Wiswall, M. and B. Zafar (2018). "Preference for the Workplace, Investment in Human Capital, and Gender". In: _Quarterly Journal of Economics_ 133.1, pp. 457-507. DOI: [10.1093/qje/qjx035](https://doi.org/10.1093%2Fqje%2Fqjx035). Young, A. (2020). _Consistency without Inference: Instrumental Variables in Practical Application_. Working Paper. London School of Economics. ]
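
---

# Appendix: beliefs after several signals

- A sketch that is not part of the original slides: iterating the scalar updating formulas from "Updating, step by step" gives the beliefs in closed form
- Assumptions: the prior is `\((0,\sigma^2_a)\)`; `\(n\)` (the number of signals received so far) and `\(S_{i(k)}\)` (the `\(k\)`-th received signal) are notation introduced here for illustration
- Inverting the variance formula shows that each signal adds `\(1/\sigma^2_\varepsilon\)` to the precision of beliefs:

`\begin{align*}
\frac{1}{\mathbb{V}_{t+1}[a_i]} &= \frac{\sigma^2_\varepsilon + \mathbb{V}_t[a_i]}{\mathbb{V}_t[a_i]\,\sigma^2_\varepsilon} = \frac{1}{\mathbb{V}_t[a_i]} + \frac{1}{\sigma^2_\varepsilon} \\
\mathbb{V}[a_i \mid n \text{ signals}] &= \left(\frac{1}{\sigma^2_a} + \frac{n}{\sigma^2_\varepsilon}\right)^{-1} = \frac{\sigma^2_a\sigma^2_\varepsilon}{\sigma^2_\varepsilon + n\sigma^2_a} \\
\mathbb{E}[a_i \mid n \text{ signals}] &= \frac{\sigma^2_a}{\sigma^2_\varepsilon + n\sigma^2_a}\sum_{k=1}^{n} S_{i(k)}
\end{align*}`

- With `\(n=1\)` this reproduces `\(\mathbb{E}_2[a_i]\)` and `\(\mathbb{V}_2[a_i]\)` from the step-by-step slides
- The variance falls toward 0 as `\(n\)` grows, and it falls faster relative to `\(\sigma^2_a\)` when `\(\sigma^2_a/\sigma^2_\varepsilon\)` is larger
- With `\(\sigma^2_a = 0.106297\)` and `\(\sigma^2_\varepsilon = 0.092187\)`, the variance is about 0.0494 after one signal and 0.0322 after two, in line with the `postrVbelief` column of the belief table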