class: title-slide

<br><br><br>

# Lecture 8

## Estimating Dynamic Models Without Solving Value Functions

### Tyler Ransom

### ECON 6343, University of Oklahoma

---

# Attribution

Many of these slides are based on slides written by Peter Arcidiacono. I use them with his permission.

---

# Plan for the Day

1. Show the relationship between differences in conditional value functions and conditional choice probabilities

2. Show how conditional choice probabilities are helpful in estimating problems with terminal choices

3. Show how to estimate the Rust bus engine problem such that the computational time is less than two minutes

---

# Hotz and Miller (1993)

- Dynamic discrete choice models are complicated to estimate because of the future value terms

- Hotz and Miller (1993) show:

    - Differences in conditional value functions `\(v_j-v_{j'}\)` can be mapped into .hi[conditional choice probabilities] (`\(p_j\)`'s)

    - We can pull the `\(p_j\)`'s from the data in a first stage

- Empirical example: optimal stopping with respect to couples' fertility

---

# Difference in `\(v\)`'s and logit errors

- Consider an individual who faces two choices where the errors are Type I extreme value (T1EV)

- The probability of choice 1 is:
`\begin{align*}
p_1&=\frac{\exp(v_1)}{\exp(v_0)+\exp(v_1)}
\end{align*}`

- The ratio `\(p_1/p_0\)` is then:
`\begin{align*}
\frac{p_1}{p_0}&=\frac{\exp(v_1)}{\exp(v_0)} = \exp(v_1 - v_0)
\end{align*}`
implying that:
`\begin{align*}
\ln(p_1/p_0)&=v_1-v_0
\end{align*}`

---

# General structure

- The inversion theorem of Hotz and Miller says that there exists a mapping, `\(\psi\)`, from the conditional choice probabilities, the `\(p\)`'s, into the differences in the conditional valuation functions, `\(v_j-v_k\)`:
`\begin{align*}
V_{t+1}&=v_{0t+1}+\mathbb{E}\max\{\epsilon_{0t+1},v_{1t+1}+\epsilon_{1t+1}-v_{0t+1},...,\\
&\phantom{\text{-------------------------------}}v_{{J}t+1}+\epsilon_{{J}t+1}-v_{0t+1}\}\\
V_{t+1}&=v_{0t+1}+\mathbb{E}\max\{\epsilon_{0t+1},\psi_0^1(p_{t+1})+\epsilon_{1t+1},...,\psi_0^{{J}}(p_{t+1})+\epsilon_{{J}t+1}\}
\end{align*}`

The `\(p\)`'s can be taken from the data. However:

1. We need the mapping, `\(\psi\)`

2. We need to be able to calculate the expectations of the `\(\epsilon\)`'s

3. We need to do something with the `\(v_0\)`'s

---

# Terminal choices

Consider the conditional value function `\(v_{1t}\)`:
`\begin{align*}
v_{1t}(x_t)&=u_1(x_t)+\beta\sum_{x_{t+1}}V_{t+1}(x_{t+1})f_1(x_{t+1}|x_t)\\
&=u_1(x_t)+\beta\sum_{x_{t+1}}\Big[v_{0t+1}(x_{t+1})+\\
&\phantom{\text{----}}\mathbb{E}\max\{\epsilon_{0t+1},\psi_0^1(p_{t+1})+\epsilon_{1t+1}\}\Big]f_1(x_{t+1}|x_t)
\end{align*}`

- If `\(v_{0t+1}=X_{t+1}\alpha_0\)`, then .hi[we don't need to solve a backwards recursion problem]

- ... so long as we can deal with the last line

---

# Terminal choices and logit errors

- When the `\(\epsilon\)`'s are Type I extreme value, `\(V_{t+1}\)` is given by (where `\(c\)` is Euler's constant):
`\begin{align*}
V_{t+1}(x_{t+1})&=\ln\left[\exp(v_{0t+1}(x_{t+1}))+\exp(v_{1t+1}(x_{t+1}))\right]+c
\end{align*}`

- We can then express the conditional value function as:
`\begin{align*}
v_{1t}(x_t)&=u_1(x_t)+\beta\sum_{x_{t+1}}\bigg(v_{0t+1}(x_{t+1})+\\
&\phantom{\text{----}}\ln\left\{1+\exp[v_{1t+1}(x_{t+1})-v_{0t+1}(x_{t+1})]\right\}\bigg)f_1(x_{t+1}|x_t)+\beta c
\end{align*}`
which, because `\(p_{0t+1}=1/\{1+\exp[v_{1t+1}-v_{0t+1}]\}\)`, can now be written as a function of the conditional choice probabilities:
`\begin{align*}
v_{1t}(x_t)&=u_1(x_t)+\beta\sum_{x_{t+1}}\bigg(v_{0t+1}(x_{t+1})-\ln\left[p_{0t+1}(x_{t+1})\right]\bigg)f_1(x_{t+1}|x_t)+\beta c
\end{align*}`

---

# Terminal choices and logit errors 2

- In general, `\(v_{0t+1}(x_{t+1})\)` will still be recursive: it has `\(V_{t+2}\)` in it

- But if choice 0 is terminal, we'll have something linear for `\(v_{0t+1}\)` (i.e. no `\(V_{t+2}\)`)

- We can then use the data to calculate `\(p_{0t+1}(x_{t+1})\)` (e.g. a bin estimator)

- Note that this is similar to getting the `\(f_j(x_{t+1}|x_t)\)`'s in a first stage

- Things just about reduce down to a simple logit!
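---

# Terminal choices and logit errors: a code sketch

The two first-stage ingredients above can be sketched in Python as follows. This is only an illustration under stated assumptions: the state is discrete and indexed `0, ..., n_states-1`, the function names `bin_ccp` and `fv_term` are hypothetical (they come from neither the slides nor any package), and the constant `\(\beta c\)` is omitted because it drops out when differencing.

```python
import numpy as np

def bin_ccp(x, d, n_states):
    """Bin (frequency) estimator of p_0(x) = Pr(d = 0 | x) for a discrete state x."""
    x = np.asarray(x)
    chose_0 = (np.asarray(d) == 0).astype(float)
    n_obs = np.bincount(x, minlength=n_states)                    # observations per state
    n_zero = np.bincount(x, weights=chose_0, minlength=n_states)  # times choice 0 was made
    # Empty bins are returned as NaN rather than guessed
    return np.where(n_obs > 0, n_zero / np.maximum(n_obs, 1), np.nan)

def fv_term(v0_next, p0_next, f1, beta):
    """beta * sum_{x'} [v_0(x') - ln p_0(x')] f_1(x'|x), omitting the constant beta*c.

    f1[x, x'] is the transition matrix under choice 1 (rows sum to one).
    """
    return beta * (f1 @ (v0_next - np.log(p0_next)))

# Check the identity ln[exp(v0) + exp(v1)] = v0 - ln(p0) on made-up values
v0 = np.array([0.5, -1.0, 0.2])
v1 = np.array([1.0, 0.3, -0.4])
p0 = np.exp(v0) / (np.exp(v0) + np.exp(v1))
assert np.allclose(np.log(np.exp(v0) + np.exp(v1)), v0 - np.log(p0))
```

Only the `\(-\ln p_{0t+1}\)` part is free of structural parameters; with `\(v_{0t+1}=X_{t+1}\alpha_0\)`, the `v0_next` input would have to be rebuilt at each guess of `\(\alpha_0\)`.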
---

# Derivation

- The key idea is that `\(V_{t+1} = v_{jt+1} - \ln p_{jt+1} + c\)` when the `\(\epsilon\)`'s are T1EV

- The derivation trick is to multiply and divide the inside of the log sum by `\(\exp(v_{jt+1})\)`:

.smallest[
`\begin{align*}
V_{t+1}(x_{t+1})&=\ln\left[\sum_k\exp(v_k(x_{t+1}))\right]+c\\
&=\ln\left[\frac{\exp(v_j(x_{t+1}))}{\exp(v_j(x_{t+1}))}\sum_k\exp(v_k(x_{t+1}))\right]+c\\
&=\ln\left[\exp(v_j(x_{t+1}))\frac{\sum_k\exp(v_k(x_{t+1}))}{\exp(v_j(x_{t+1}))}\right]+c\\
&=\underbrace{\ln\exp(v_j(x_{t+1}))}_{v_j(x_{t+1})}+\ln\left[\underbrace{\frac{\sum_k\exp(v_k(x_{t+1}))}{\exp(v_j(x_{t+1}))}}_{p_j(x_{t+1})^{-1}}\right]+c\\
&=v_j(x_{t+1}) - \ln p_j(x_{t+1}) + c \,\,\,\, \text{for every choice } j
\end{align*}`
]

---

# Derivation for GEV

- If the `\(\epsilon\)`'s are GEV, we can still express `\(V_{t+1}\)` as a closed-form function of `\(p_{jt+1}\)`

- But the math gets more complicated because it depends on the form of `\(G\)`

- Recall: `\(V_{t+1} = \ln G + c\)`, where `\(G = \sum_k \exp(v_{kt+1})\)` if the `\(\epsilon\)`'s are T1EV

- For nested logit, the formula will involve the nesting parameters (the `\(\lambda\)`'s)

- If the `\(\epsilon\)`'s are Normal, there is no closed-form expression for `\(V_{t+1}\)`

- You would need to use simulation to compute the `\(\mathbb{E}\max\)` integral

- The only paper I've seen use CCPs with GEV is Coate and Mangum (2019)

---

# Another way of looking at the problem

- We can also write `\(V_{t+1}\)` as
`\begin{align*}
V_{t+1}(x_{t+1})&=v_{0t+1}(x_{t+1})+\\
&\phantom{\text{----}}\sum_{j=0}^1p_{jt+1}(x_{t+1})\bigg[v_{jt+1}(x_{t+1})-v_{0t+1}(x_{t+1})+\mathbb{E}(\epsilon_{jt+1}|d_{jt+1},x_{t+1})\bigg]\\
&=v_{0t+1}(x_{t+1})+\\
&\phantom{\text{----}}\sum_{j=0}^1p_{jt+1}(x_{t+1})\bigg[\ln\left(\frac{p_{jt+1}(x_{t+1})}{p_{0t+1}(x_{t+1})}\right)+\mathbb{E}(\epsilon_{jt+1}|d_{jt+1},x_{t+1})\bigg]
\end{align*}`

- Hotz and Miller (1993) eq.
(4.12) shows that
`\begin{align*}
\mathbb{E}(\epsilon_{jt+1}|d_{jt+1},x_{t+1}) = c - \ln[p_{jt+1}(x_{t+1})]
\end{align*}`

---

# Another way of looking at the problem

- So it is possible to write `\(V_{t+1}\)` in terms of `\(v_0\)` and a bunch of probabilities:
`\begin{align*}
V_{t+1}(x_{t+1})&=v_{0t+1}(x_{t+1})+\\
&\phantom{\text{----}}\sum_{j=0}^1p_{jt+1}(x_{t+1})\bigg[\ln\left(\frac{p_{jt+1}(x_{t+1})}{p_{0t+1}(x_{t+1})}\right)-\ln\left(p_{jt+1}(x_{t+1})\right)\bigg] + c
\end{align*}`

- Because the `\(p_{jt+1}\)`'s sum to one, the bracketed sum collapses to `\(-\ln p_{0t+1}(x_{t+1})\)`, so `\(V_{t+1}=v_{0t+1}-\ln p_{0t+1}+c\)` as before

- This is the preferred notation of Hotz and Miller (1993)

---

# Hotz and Miller (1993), p. 505, eq. (4.13)

- `\(j=1\)` means not sterilizing; `\(j=2\)` means sterilizing

- `\(j=2\)` is terminal, meaning that `\(v_{2t} =\)` some number (no more `\(\epsilon\)`'s)

- Suppose a couple does not sterilize

- Then there is some probability `\(\alpha\)` of having a child in the next period

- Conditional on having a child, the terms in eq. (4.13) that have CCPs are:
`\begin{align*}
&\phantom{\text{----}}\left\{\sum_{j=1}^2p_j(H_t,1)\left(c-\ln\left[p_j(H_t,1)\right]\right)\right\}+p_1(H_t,1)\ln\left[\frac{p_1(H_t,1)}{p_2(H_t,1)}\right]
\end{align*}`

- Other terms in (4.13) are the `\(v_{2t}\)` formula or integrating over the `\(f\)`'s

---

# Renewal

- An action is termed .hi[renewal] if, by taking the action, the effect of previous choices on the state becomes irrelevant. With choice 0 as the renewal action:
`\begin{align*}
\sum_{x_{t+1}}f_0(x_{t+2}|x_{t+1})f_j(x_{t+1}|x_{t})&=\sum_{x_{t+1}}f_0(x_{t+2}|x_{t+1})f_{j'}(x_{t+1}|x_{t}) \qquad \textrm{for all } \{j,j'\}
\end{align*}`

---

# Renewal 2

- For choice 1, expressing the future value term relative to the renewal action (choice 0) yields:
`\begin{align*}
v_{1t}(x_{t})&=u_1(x_t)+\beta\sum_{x_{t+1}}\left[v_{0t+1}(x_{t+1})-\ln(p_{0t+1}(x_{t+1}))\right]f_1(x_{t+1}|x_t)+\beta c
\end{align*}`

- Now substitute in for `\(v_{0t+1}(x_{t+1})\)` with:
`\begin{align*}
v_{0t+1}(x_{t+1})&=u_0(x_{t+1})+\beta\sum_{x_{t+2}}V_{t+2}(x_{t+2})f_0(x_{t+2}|x_{t+1})
\end{align*}`

- The term involving `\(V_{t+2}(x_{t+2})\)`
is then:
`\begin{align*}
\beta^2\sum_{x_{t+1}}\sum_{x_{t+2}}V_{t+2}(x_{t+2})f_0(x_{t+2}|x_{t+1})f_1(x_{t+1}|x_t)
\end{align*}`

---

# Renewal 3

- Recall that in estimation we work with _differenced_ conditional value functions

- Now consider `\(v_{0t}(x_t)\)`: again express the future value term relative to choice 0 and substitute in for `\(v_{0t+1}(x_{t+1})\)`:
`\begin{align*}
v_{0t}(x_{t})&=u_0(x_t)+\beta\sum_{x_{t+1}}\left[u_0(x_{t+1})-\ln(p_{0t+1}(x_{t+1}))\right]f_0(x_{t+1}|x_t)+\\
&\phantom{\text{----}}\beta^2\sum_{x_{t+1}}\sum_{x_{t+2}}V_{t+2}(x_{t+2})f_0(x_{t+2}|x_{t+1})f_0(x_{t+1}|x_t)+\beta c
\end{align*}`

- The renewal property implies that the `\(V_{t+2}(x_{t+2})\)` terms are the same, and will .hi[cancel out] (along with `\(\beta c\)`) once we take differences:

.smallest[
`\begin{align*}
v_{1t}(x_{t})-v_{0t}(x_{t})&=u_1(x_t) - u_0(x_t)+\beta\sum_{x_{t+1}}\left[u_0(x_{t+1})-\ln(p_{0t+1}(x_{t+1}))\right]f_1(x_{t+1}|x_t)-\\
&\phantom{\text{--- --- --- --- --- --- --- -}}\beta\sum_{x_{t+1}}\left[u_0(x_{t+1})-\ln(p_{0t+1}(x_{t+1}))\right]f_0(x_{t+1}|x_t)
\end{align*}`
]

---

# Back to Rust (1987)

- Rust (1987) has two choices with the following flow payoffs:
`\begin{align*}
u(x_t,d_t,\theta)=\left\{\begin{array}{ll}-c(x_t,\theta)&\textrm{if }d_t=0\\-[\overline{P}-\underline{P}+c(0,\theta)]&\textrm{if } d_t=1\end{array}\right.
\end{align*}`

- The value of replacing the engine at `\(t+1\)` then does not depend upon whether the engine was replaced at `\(t\)`

- This implies that we only need the one-period-ahead probability of replacement for the future utility component

---

# Rust (1987) with CCPs

`\begin{align*}
v_1(x)&=u_1(x)+\beta\left[v_1(0)-\ln(p_1(0))\right]+\beta c\\
v_0(x)&=u_0(x)+\beta\sum_{x'}\left[v_1(x')-\ln(p_1(x'))\right]f(x'|x)+\beta c
\end{align*}`

- In this case `\(v_1(0)\)` and `\(v_1(x')\)` are the same: replacement resets mileage to zero, so `\(v_1\)` does not depend on the current state

- Taking differences (the `\(v_1\)` terms and `\(\beta c\)` cancel) yields:
`\begin{align*}
v_1(x)-v_0(x)&=u_1(x)-u_0(x)+\beta\left[\sum_{x'}\left(\ln[p_1(x')]-\ln[p_1(0)]\right)f(x'|x)\right]
\end{align*}`

- Estimation is then as simple as a logit with an adjustment term, where the `\(p_1\)`'s and `\(f(x'|x)\)` are calculated in a first stage

---

# CCPs with finite mixture distributions

- Arcidiacono and Miller (2011) show how to use CCPs with unobserved heterogeneity

- They show that you can adjust the Rust (1987) model to incorporate unobservable bus attributes

- The model still estimates quickly due to additive separability in the model components (Arcidiacono and Jones, 2003)

---

# CCPs with actions that are not terminal or renewal

- Rust (1987) provides an example of a renewal action

- Hotz and Miller (1993) show an example of a terminal action

- We can still use CCPs even if no such actions exist in our model

- The main difference is that we will need more CCPs than just `\(\ln p_{0t+1}\)`

- Through a property known as .hi[finite dependence] we can achieve cancellation after at most 3 periods (depending on the model)

- Recent examples:

    - Arcidiacono, Aucejo, Maurel et al. (2016): see equation (29)

    - Ransom (2022): see equation (A.14) and Figures A4 and A5

---

# Counterfactuals and CCPs

- The main rub with CCPs is that they don't simplify counterfactual simulations

- Why?
Because we don't observe `\(\ln p_{0t+1}\)` in the counterfactual world

- If we could, we probably wouldn't need a structural model to begin with

- So we still must do a backwards recursion computation to get counterfactuals

- Or restrict ourselves to short-run counterfactuals

---

# References

.smallest70[
Ackerberg, D. A. (2003). "Advertising, Learning, and Consumer Choice in Experience Good Markets: An Empirical Examination". In: _International Economic Review_ 44.3, pp. 1007-1040. DOI: [10.1111/1468-2354.t01-2-00098](https://doi.org/10.1111%2F1468-2354.t01-2-00098). Adams, R. P. (2018). _Model Selection and Cross Validation_. Lecture Notes. Princeton University. URL: [https://www.cs.princeton.edu/courses/archive/fall18/cos324/files/model-selection.pdf](https://www.cs.princeton.edu/courses/archive/fall18/cos324/files/model-selection.pdf). Ahlfeldt, G. M., S. J. Redding, D. M. Sturm, et al. (2015). "The Economics of Density: Evidence From the Berlin Wall". In: _Econometrica_ 83.6, pp. 2127-2189. DOI: [10.3982/ECTA10876](https://doi.org/10.3982%2FECTA10876). Altonji, J. G., T. E. Elder, and C. R. Taber (2005). "Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools". In: _Journal of Political Economy_ 113.1, pp. 151-184. DOI: [10.1086/426036](https://doi.org/10.1086%2F426036). Altonji, J. G. and C. R. Pierret (2001). "Employer Learning and Statistical Discrimination". In: _Quarterly Journal of Economics_ 116.1, pp. 313-350. DOI: [10.1162/003355301556329](https://doi.org/10.1162%2F003355301556329). Angrist, J. D. and A. B. Krueger (1991). "Does Compulsory School Attendance Affect Schooling and Earnings?" In: _Quarterly Journal of Economics_ 106.4, pp. 979-1014. DOI: [10.2307/2937954](https://doi.org/10.2307%2F2937954). Angrist, J. D. and J. Pischke (2009). _Mostly Harmless Econometrics: An Empiricist's Companion_. Princeton University Press. ISBN: 0691120358. Arcidiacono, P. (2004).
"Ability Sorting and the Returns to College Major". In: _Journal of Econometrics_ 121, pp. 343-375. DOI: [10.1016/j.jeconom.2003.10.010](https://doi.org/10.1016%2Fj.jeconom.2003.10.010). Arcidiacono, P., E. Aucejo, A. Maurel, et al. (2016). _College Attrition and the Dynamics of Information Revelation_. Working Paper. Duke University. URL: [https://tyleransom.github.io/research/CollegeDropout2016May31.pdf](https://tyleransom.github.io/research/CollegeDropout2016May31.pdf). Arcidiacono, P., E. Aucejo, A. Maurel, et al. (2025). "College Attrition and the Dynamics of Information Revelation". In: _Journal of Political Economy_ 133.1. DOI: [10.1086/732526](https://doi.org/10.1086%2F732526). Arcidiacono, P. and J. B. Jones (2003). "Finite Mixture Distributions, Sequential Likelihood and the EM Algorithm". In: _Econometrica_ 71.3, pp. 933-946. DOI: [10.1111/1468-0262.00431](https://doi.org/10.1111%2F1468-0262.00431). Arcidiacono, P., J. Kinsler, and T. Ransom (2022b). "Asian American Discrimination in Harvard Admissions". In: _European Economic Review_ 144, p. 104079. DOI: [10.1016/j.euroecorev.2022.104079](https://doi.org/10.1016%2Fj.euroecorev.2022.104079). Arcidiacono, P., J. Kinsler, and T. Ransom (2022a). "Legacy and Athlete Preferences at Harvard". In: _Journal of Labor Economics_ 40.1, pp. 133-156. DOI: [10.1086/713744](https://doi.org/10.1086%2F713744). Arcidiacono, P. and R. A. Miller (2011). "Conditional Choice Probability Estimation of Dynamic Discrete Choice Models With Unobserved Heterogeneity". In: _Econometrica_ 79.6, pp. 1823-1867. DOI: [10.3982/ECTA7743](https://doi.org/10.3982%2FECTA7743). Arroyo Marioli, F., F. Bullano, S. Kucinskas, et al. (2020). _Tracking R of COVID-19: A New Real-Time Estimation Using the Kalman Filter_. Working Paper. medRxiv. DOI: [10.1101/2020.04.19.20071886](https://doi.org/10.1101%2F2020.04.19.20071886). Ashworth, J., V. J. Hotz, A. Maurel, et al. (2021). 
"Changes across Cohorts in Wage Returns to Schooling and Early Work Experiences". In: _Journal of Labor Economics_ 39.4, pp. 931-964. DOI: [10.1086/711851](https://doi.org/10.1086%2F711851). Attanasio, O. P., C. Meghir, and A. Santiago (2011). "Education Choices in Mexico: Using a Structural Model and a Randomized Experiment to Evaluate PROGRESA". In: _Review of Economic Studies_ 79.1, pp. 37-66. DOI: [10.1093/restud/rdr015](https://doi.org/10.1093%2Frestud%2Frdr015). Aucejo, E. M. and J. James (2019). "Catching Up to Girls: Understanding the Gender Imbalance in Educational Attainment Within Race". In: _Journal of Applied Econometrics_ 34.4, pp. 502-525. DOI: [10.1002/jae.2699](https://doi.org/10.1002%2Fjae.2699). Baragatti, M., A. Grimaud, and D. Pommeret (2013). "Likelihood-free Parallel Tempering". In: _Statistics and Computing_ 23.4, pp. 535-549. DOI: [ 10.1007/s11222-012-9328-6](https://doi.org/%2010.1007%2Fs11222-012-9328-6). Bayer, P., R. McMillan, A. Murphy, et al. (2016). "A Dynamic Model of Demand for Houses and Neighborhoods". In: _Econometrica_ 84.3, pp. 893-942. DOI: [10.3982/ECTA10170](https://doi.org/10.3982%2FECTA10170). Begg, C. B. and R. Gray (1984). "Calculation of Polychotomous Logistic Regression Parameters Using Individualized Regressions". In: _Biometrika_ 71.1, pp. 11-18. DOI: [10.1093/biomet/71.1.11](https://doi.org/10.1093%2Fbiomet%2F71.1.11). Beggs, S. D., N. S. Cardell, and J. Hausman (1981). "Assessing the Potential Demand for Electric Cars". In: _Journal of Econometrics_ 17.1, pp. 1-19. DOI: [10.1016/0304-4076(81)90056-7](https://doi.org/10.1016%2F0304-4076%2881%2990056-7). Berry, S., J. Levinsohn, and A. Pakes (1995). "Automobile Prices in Market Equilibrium". In: _Econometrica_ 63.4, pp. 841-890. URL: [http://www.jstor.org/stable/2171802](http://www.jstor.org/stable/2171802). Blass, A. A., S. Lach, and C. F. Manski (2010). 
"Using Elicited Choice Probabilities to Estimate Random Utility Models: Preferences for Electricity Reliability". In: _International Economic Review_ 51.2, pp. 421-440. DOI: [10.1111/j.1468-2354.2010.00586.x](https://doi.org/10.1111%2Fj.1468-2354.2010.00586.x). Blundell, R. (2010). "Comments on: ``Structural vs. Atheoretic Approaches to Econometrics'' by Michael Keane". In: _Journal of Econometrics_ 156.1, pp. 25-26. DOI: [10.1016/j.jeconom.2009.09.005](https://doi.org/10.1016%2Fj.jeconom.2009.09.005). Bresnahan, T. F., S. Stern, and M. Trajtenberg (1997). "Market Segmentation and the Sources of Rents from Innovation: Personal Computers in the Late 1980s". In: _The RAND Journal of Economics_ 28.0, pp. S17-S44. DOI: [10.2307/3087454](https://doi.org/10.2307%2F3087454). Brien, M. J., L. A. Lillard, and S. Stern (2006). "Cohabitation, Marriage, and Divorce in a Model of Match Quality". In: _International Economic Review_ 47.2, pp. 451-494. DOI: [10.1111/j.1468-2354.2006.00385.x](https://doi.org/10.1111%2Fj.1468-2354.2006.00385.x). Card, D. (1995). "Using Geographic Variation in College Proximity to Estimate the Return to Schooling". In: _Aspects of Labor Market Behaviour: Essays in Honour of John Vanderkamp_. Ed. by L. N. Christofides, E. K. Grant and R. Swidinsky. Toronto: University of Toronto Press. Cardell, N. S. (1997). "Variance Components Structures for the Extreme-Value and Logistic Distributions with Application to Models of Heterogeneity". In: _Econometric Theory_ 13.2, pp. 185-213. URL: [https://www.jstor.org/stable/3532724](https://www.jstor.org/stable/3532724). Caucutt, E. M., L. Lochner, J. Mullins, et al. (2020). _Child Skill Production: Accounting for Parental and Market-Based Time and Goods Investments_. Working Paper 27838. National Bureau of Economic Research. DOI: [10.3386/w27838](https://doi.org/10.3386%2Fw27838). Chen, X., H. Hong, and D. Nekipelov (2011). "Nonlinear Models of Measurement Errors". In: _Journal of Economic Literature_ 49.4, pp. 
901-937. DOI: [10.1257/jel.49.4.901](https://doi.org/10.1257%2Fjel.49.4.901). Chintagunta, P. K. (1992). "Estimating a Multinomial Probit Model of Brand Choice Using the Method of Simulated Moments". In: _Marketing Science_ 11.4, pp. 386-407. DOI: [10.1287/mksc.11.4.386](https://doi.org/10.1287%2Fmksc.11.4.386). Cinelli, C. and C. Hazlett (2020). "Making Sense of Sensitivity: Extending Omitted Variable Bias". In: _Journal of the Royal Statistical Society: Series B (Statistical Methodology)_ 82.1, pp. 39-67. DOI: [10.1111/rssb.12348](https://doi.org/10.1111%2Frssb.12348). Coate, P. and K. Mangum (2019). _Fast Locations and Slowing Labor Mobility_. Working Paper 19-49. Federal Reserve Bank of Philadelphia. Cunha, F., J. J. Heckman, and S. M. Schennach (2010). "Estimating the Technology of Cognitive and Noncognitive Skill Formation". In: _Econometrica_ 78.3, pp. 883-931. DOI: [10.3982/ECTA6551](https://doi.org/10.3982%2FECTA6551). Cunningham, S. (2021). _Causal Inference: The Mixtape_. Yale University Press. URL: [https://www.scunning.com/causalinference_norap.pdf](https://www.scunning.com/causalinference_norap.pdf). Delavande, A. and C. F. Manski (2015). "Using Elicited Choice Probabilities in Hypothetical Elections to Study Decisions to Vote". In: _Electoral Studies_ 38, pp. 28-37. DOI: [10.1016/j.electstud.2015.01.006](https://doi.org/10.1016%2Fj.electstud.2015.01.006). Delavande, A. and B. Zafar (2019). "University Choice: The Role of Expected Earnings, Nonpecuniary Outcomes, and Financial Constraints". In: _Journal of Political Economy_ 127.5, pp. 2343-2393. DOI: [10.1086/701808](https://doi.org/10.1086%2F701808). Diegert, P., M. A. Masten, and A. Poirier (2025). _Assessing Omitted Variable Bias when the Controls are Endogenous_. arXiv. DOI: [10.48550/ARXIV.2206.02303](https://doi.org/10.48550%2FARXIV.2206.02303). Erdem, T. and M. P. Keane (1996). "Decision-Making under Uncertainty: Capturing Dynamic Brand Choice Processes in Turbulent Consumer Goods Markets". 
In: _Marketing Science_ 15.1, pp. 1-20. DOI: [10.1287/mksc.15.1.1](https://doi.org/10.1287%2Fmksc.15.1.1). Evans, R. W. (2018). _Simulated Method of Moments (SMM) Estimation_. QuantEcon Note. University of Chicago. URL: [https://notes.quantecon.org/submission/5b3db2ceb9eab00015b89f93](https://notes.quantecon.org/submission/5b3db2ceb9eab00015b89f93). Farber, H. S. and R. Gibbons (1996). "Learning and Wage Dynamics". In: _Quarterly Journal of Economics_ 111.4, pp. 1007-1047. DOI: [10.2307/2946706](https://doi.org/10.2307%2F2946706). Fu, C., N. Grau, and J. Rivera (2020). _Wandering Astray: Teenagers' Choices of Schooling and Crime_. Working Paper. University of Wisconsin-Madison. URL: [https://www.ssc.wisc.edu/~cfu/wander.pdf](https://www.ssc.wisc.edu/~cfu/wander.pdf). Gillingham, K., F. Iskhakov, A. Munk-Nielsen, et al. (2022). "Equilibrium Trade in Automobiles". In: _Journal of Political Economy_. DOI: [10.1086/720463](https://doi.org/10.1086%2F720463). Haile, P. (2019). _"Structural vs. Reduced Form" Language and Models in Empirical Economics_. Lecture Slides. Yale University. URL: [http://www.econ.yale.edu/~pah29/intro.pdf](http://www.econ.yale.edu/~pah29/intro.pdf). Haile, P. (2024). _Models, Measurement, and the Language of Empirical Economics_. Lecture Slides. Yale University. URL: [https://www.dropbox.com/s/8kwtwn30dyac18s/intro.pdf](https://www.dropbox.com/s/8kwtwn30dyac18s/intro.pdf). Heckman, J. J., J. Stixrud, and S. Urzua (2006). "The Effects of Cognitive and Noncognitive Abilities on Labor Market Outcomes and Social Behavior". In: _Journal of Labor Economics_ 24.3, pp. 411-482. DOI: [10.1086/504455](https://doi.org/10.1086%2F504455). Hotz, V. J. and R. A. Miller (1993). "Conditional Choice Probabilities and the Estimation of Dynamic Models". In: _The Review of Economic Studies_ 60.3, pp. 497-529. DOI: [10.2307/2298122](https://doi.org/10.2307%2F2298122). Hurwicz, L. (1950). "Generalization of the Concept of Identification".
In: _Statistical Inference in Dynamic Economic Models_. Hoboken, NJ: John Wiley and Sons, pp. 245-257. Ishimaru, S. (2022). _Geographic Mobility of Youth and Spatial Gaps in Local College and Labor Market Opportunities_. Working Paper. Hitotsubashi University. James, J. (2011). _Ability Matching and Occupational Choice_. Working Paper 11-25. Federal Reserve Bank of Cleveland. James, J. (2017). "MM Algorithm for General Mixed Multinomial Logit Models". In: _Journal of Applied Econometrics_ 32.4, pp. 841-857. DOI: [10.1002/jae.2532](https://doi.org/10.1002%2Fjae.2532). Jin, H. and H. Shen (2020). "Foreign Asset Accumulation Among Emerging Market Economies: A Case for Coordination". In: _Review of Economic Dynamics_ 35.1, pp. 54-73. DOI: [10.1016/j.red.2019.04.006](https://doi.org/10.1016%2Fj.red.2019.04.006). Keane, M. P. (2010). "Structural vs. Atheoretic Approaches to Econometrics". In: _Journal of Econometrics_ 156.1, pp. 3-20. DOI: [10.1016/j.jeconom.2009.09.003](https://doi.org/10.1016%2Fj.jeconom.2009.09.003). Keane, M. P. and K. I. Wolpin (1997). "The Career Decisions of Young Men". In: _Journal of Political Economy_ 105.3, pp. 473-522. DOI: [10.1086/262080](https://doi.org/10.1086%2F262080). Koopmans, T. C. and O. Reiersol (1950). "The Identification of Structural Characteristics". In: _The Annals of Mathematical Statistics_ 21.2, pp. 165-181. URL: [http://www.jstor.org/stable/2236899](http://www.jstor.org/stable/2236899). Kosar, G., T. Ransom, and W. van der Klaauw (2022). "Understanding Migration Aversion Using Elicited Counterfactual Choice Probabilities". In: _Journal of Econometrics_ 231.1, pp. 123-147. DOI: [10.1016/j.jeconom.2020.07.056](https://doi.org/10.1016%2Fj.jeconom.2020.07.056). Krauth, B. (2016). "Bounding a Linear Causal Effect Using Relative Correlation Restrictions". In: _Journal of Econometric Methods_ 5.1, pp. 117-141. DOI: [10.1515/jem-2013-0013](https://doi.org/10.1515%2Fjem-2013-0013). Lang, K. and M. D. Palacios (2018). 
_The Determinants of Teachers' Occupational Choice_. Working Paper 24883. National Bureau of Economic Research. DOI: [10.3386/w24883](https://doi.org/10.3386%2Fw24883). Lee, D. S., J. McCrary, M. J. Moreira, et al. (2020). _Valid t-ratio Inference for IV_. Working Paper. arXiv. URL: [https://arxiv.org/abs/2010.05058](https://arxiv.org/abs/2010.05058). Lewbel, A. (2019). "The Identification Zoo: Meanings of Identification in Econometrics". In: _Journal of Economic Literature_ 57.4, pp. 835-903. DOI: [10.1257/jel.20181361](https://doi.org/10.1257%2Fjel.20181361). Mahoney, N. (2022). "Principles for Combining Descriptive and Model-Based Analysis in Applied Microeconomics Research". In: _Journal of Economic Perspectives_ 36.3, pp. 211-22. DOI: [10.1257/jep.36.3.211](https://doi.org/10.1257%2Fjep.36.3.211). McFadden, D. (1978). "Modelling the Choice of Residential Location". In: _Spatial Interaction Theory and Planning Models_. Ed. by A. Karlqvist, L. Lundqvist, F. Snickers and J. W. Weibull. Amsterdam: North Holland, pp. 75-96. McFadden, D. (1989). "A Method of Simulated Moments for Estimation of Discrete Response Models Without Numerical Integration". In: _Econometrica_ 57.5, pp. 995-1026. DOI: [10.2307/1913621](https://doi.org/10.2307%2F1913621). URL: [http://www.jstor.org/stable/1913621](http://www.jstor.org/stable/1913621). Mellon, J. (2020). _Rain, Rain, Go Away: 137 Potential Exclusion-Restriction Violations for Studies Using Weather as an Instrumental Variable_. Working Paper. University of Manchester. URL: [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3715610](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3715610). Miller, R. A. (1984). "Job Matching and Occupational Choice". In: _Journal of Political Economy_ 92.6, pp. 1086-1120. DOI: [10.1086/261276](https://doi.org/10.1086%2F261276). Mincer, J. (1974). _Schooling, Experience and Earnings_. New York: Columbia University Press for National Bureau of Economic Research. Ost, B., W. Pan, and D. 
Webber (2018). "The Returns to College Persistence for Marginal Students: Regression Discontinuity Evidence from University Dismissal Policies". In: _Journal of Labor Economics_ 36.3, pp. 779-805. DOI: [10.1086/696204](https://doi.org/10.1086%2F696204). Oster, E. (2019). "Unobservable Selection and Coefficient Stability: Theory and Evidence". In: _Journal of Business & Economic Statistics_ 37.2, pp. 187-204. DOI: [10.1080/07350015.2016.1227711](https://doi.org/10.1080%2F07350015.2016.1227711). Pischke, S. (2007). _Lecture Notes on Measurement Error_. Lecture Notes. London School of Economics. URL: [http://econ.lse.ac.uk/staff/spischke/ec524/Merr_new.pdf](http://econ.lse.ac.uk/staff/spischke/ec524/Merr_new.pdf). Ransom, M. R. and T. Ransom (2018). "Do High School Sports Build or Reveal Character? Bounding Causal Estimates of Sports Participation". In: _Economics of Education Review_ 64, pp. 75-89. DOI: [10.1016/j.econedurev.2018.04.002](https://doi.org/10.1016%2Fj.econedurev.2018.04.002). Ransom, T. (2022). "Labor Market Frictions and Moving Costs of the Employed and Unemployed". In: _Journal of Human Resources_ 57.S, pp. S137-S166. DOI: [10.3368/jhr.monopsony.0219-10013R2](https://doi.org/10.3368%2Fjhr.monopsony.0219-10013R2). Rudik, I. (2020). "Optimal Climate Policy When Damages Are Unknown". In: _American Economic Journal: Economic Policy_ 12.2, pp. 340-373. DOI: [10.1257/pol.20160541](https://doi.org/10.1257%2Fpol.20160541). Rust, J. (1987). "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher". In: _Econometrica_ 55.5, pp. 999-1033. URL: [http://www.jstor.org/stable/1911259](http://www.jstor.org/stable/1911259). Shalizi, C. R. (2019). _Advanced Data Analysis from an Elementary Point of View_. Cambridge University Press. URL: [http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/ADAfaEPoV.pdf](http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/ADAfaEPoV.pdf). Smith Jr., A. A. (2008). "Indirect Inference". In: _The New Palgrave Dictionary of Economics_. 
Ed. by S. N. Durlauf and L. E. Blume. Vol. 1-8. London: Palgrave Macmillan. DOI: [10.1007/978-1-349-58802-2](https://doi.org/10.1007%2F978-1-349-58802-2). URL: [http://www.econ.yale.edu/smith/palgrave7.pdf](http://www.econ.yale.edu/smith/palgrave7.pdf). Stinebrickner, R. and T. Stinebrickner (2014a). "Academic Performance and College Dropout: Using Longitudinal Expectations Data to Estimate a Learning Model". In: _Journal of Labor Economics_ 32.3, pp. 601-644. DOI: [10.1086/675308](https://doi.org/10.1086%2F675308). Stinebrickner, R. and T. R. Stinebrickner (2014b). "A Major in Science? Initial Beliefs and Final Outcomes for College Major and Dropout". In: _Review of Economic Studies_ 81.1, pp. 426-472. DOI: [10.1093/restud/rdt025](https://doi.org/10.1093%2Frestud%2Frdt025). Su, C. and K. L. Judd (2012). "Constrained Optimization Approaches to Estimation of Structural Models". In: _Econometrica_ 80.5, pp. 2213-2230. DOI: [10.3982/ECTA7925](https://doi.org/10.3982%2FECTA7925). Train, K. (2009). _Discrete Choice Methods with Simulation_. 2nd ed. Cambridge; New York: Cambridge University Press. ISBN: 9780521766555. Wiswall, M. and B. Zafar (2018). "Preference for the Workplace, Investment in Human Capital, and Gender". In: _Quarterly Journal of Economics_ 133.1, pp. 457-507. DOI: [10.1093/qje/qjx035](https://doi.org/10.1093%2Fqje%2Fqjx035). Young, A. (2020). _Consistency without Inference: Instrumental Variables in Practical Application_. Working Paper. London School of Economics. ]