class: center, middle, inverse, title-slide .title[ # NIMBLE models ] .subtitle[ ## NIMBLE NESS 2022 short course ] .author[ ### NIMBLE Development Team ] .date[ ### May 2022 ] --- # NIMBLE models: some concepts ### Models are graphs - Scientists in many application domains and many statisticians speak of "hierarchical models". - Computer scientists and others sometimes speak of "graphical models". - A hierarchical model is typically a directed acyclic graph (DAG). -- ### Models are objects When you create a NIMBLE model, it is an object in R. You can: - Get or set parameter or data values. - Determine graph relationships. - Calculate log probabilities. - Simulate (draw) from distributions. - More. --- ### Linear regression example Let's use a really simple model: - Linear regression with 4 data points. ```r set.seed(1) code <- nimbleCode({ intercept ~ dnorm(0, sd = 1000) slope ~ dnorm(0, sd = 1000) sigma ~ dunif(0, 100) for(i in 1:4) { predicted.y[i] <- intercept + slope * x[i] y[i] ~ dnorm(predicted.y[i], sd = sigma) } }) model <- nimbleModel(code, data = list(y = rnorm(4)), inits = list(intercept = 0.5, slope = 0.2, sigma = 1, x = c(0.1, 0.2, 0.3, 0.4))) ``` ``` ## Defining model ``` ``` ## Building model ``` ``` ## Setting data and initial values ``` ``` ## Running calculate on model ## [Note] Any error reports that follow may simply reflect missing values in model variables. ``` ``` ## Checking model sizes and dimensions ``` `x` values are neither observations ('data') nor parameters. `x` can also be provided in `constants` (this is generally best, unless you plan to change `x`) or in `data` (for consistency with WinBUGS/JAGS). --- ### Draw the graph <!-- --> - Think of each line of BUGS language code as declaring one or mode *nodes*. - The graph is stored inside the nimble model object (`model$graph`) --- ### Some definitions The parameters and data in a model are represented as nodes in a graph. Here we define some terms: - **parameter**: an unknown quantity in the model that is represented as a random variable and will generally be estimated in an algorithm, e.g. `intercept` in the linear model above - **data**: a known quantity in a model, also represented as a random variable, e.g., `y[1], ..., y[1], ...` in the linear model - **constants**: other fixed quantities involved in the model, e.g., the number of observations - **node**: an element in the model graph representing data, parameter, or a deterministic quantity that is a function of other quantities in the model, e.g., `predicted.y[1]`, `predicted.y[2]`, in the linear model - nodes can be univariate or multivariate depending on whether they are defined based on a univariate or multivariate density (or scalar or vectorized assignment for deterministic nodes) - **variable**: a collection of one or more nodes with the same name, e.g., `predicted.y` or `y`in the linear model. --- # NIMBLE models: language ### How NIMBLE is the same as BUGS and JAGS * Most distributions and functions are supported - For distributions, see [User Manual Section 5.2.4](http://r-nimble.org/manuals/NimbleUserManual.pdf#page=39) and our [cheatsheet](https://r-nimble.org/cheatsheets/NimbleCheatSheet.pdf) - For functions, see [User Manual Section 5.2.5](http://r-nimble.org/manuals/NimbleUserManual.pdf#page=44) * Most syntax is supported - Truncation syntax is different when using `nimbleCode`. (It can be the same as for JAGS if reading code from a file with `readBUGSmodel()`). See our [guide online](https://r-nimble.org/quick-guide-for-converting-from-jags-or-bugs-to-nimble). --- ### How NIMBLE extends BUGS - Alternative distribution parameterizations (like R). - Named parameters (like R). - Vectorized math (e.g., `pred[1:5] <- b0 + b1*x[1:5]`) and linear algebra. - Definition-time if-then-else (multiple model variants from the same code). - User-defined functions and distributions. - Distinction between `constants` and `data`. ### How NIMBLE is limited compared to BUGS and/or JAGS: NIMBLE is stricter about requiring square brackets and informative indices for non-scalar nodes. --- ## R-like alternative and named parameters ```r littersCode <- nimbleCode({ for (i in 1:G) { for (j in 1:N) { # likelihood (data model) r[i,j] ~ dbin(p[i,j], n[i,j]) # latent process (random effects) p[i,j] ~ dbeta(a[i], b[i]) } # prior for hyperparameters a[i] ~ dgamma(shape = 1, `rate` = .001) b[i] ~ dgamma(shape = 1, `scale` = 1/.001) } }) ``` Note that we could use the mean/sd parameterization of the beta distribution, which would then require different hyperparameter specification. --- ## R-like alternative and named parameters - BUGS/JAGS: Only `dnorm(mu, tau)` is supported, where `tau` is precision. - NIMBLE: Alternative parameterizations and named parameters are supported: - `dnorm(mean = mu, sd = sigma)` - `dnorm(mean = mu, var = sigma_squared)` - `dnorm(mean = mu, tau = phi)` - Distributions with alternative parameterizations are listed in Table 5.2 of [User Manual Section 5.2.4](https://r-nimble.org/html_manual/cha-writing-models.html#subsec:dists-and-functions) --- ## What are constants? What are data? ### Constants are values needed to define model relationships - Index ranges like *N* in the litters model - Constant vectors used for indexing: e.g., *block* in `mu[block[i]]` - Constants must be provided when creating a model with `nimbleModel`. ### Data represents a flag on the role a node plays in the model - E.g., data nodes shouldn't be sampled in MCMC. - Data values can be changed. - Data can be provided when calling `nimbleModel` or later --- Here's an example using the litters model: .scroll-box-16[ ```r littersModel$isData('r') ``` ``` ## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE ## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE ## [31] TRUE TRUE ``` ```r littersModel$isData('p') ``` ``` ## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE ## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE ## [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE ``` ```r littersModel$r ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] ## [1,] 13 12 9 9 8 8 12 11 9 9 8 11 4 5 ## [2,] 12 11 10 9 10 9 9 8 8 4 7 4 5 3 ## [,15] [,16] ## [1,] 7 7 ## [2,] 3 0 ``` ```r littersModel$p ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] ## [1,] NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## [2,] NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## [,15] [,16] ## [1,] NA NA ## [2,] NA NA ``` ] --- Data values are protected from simulation unless you are sure you want to simulate those ```r littersModel$simulate('r') ``` ``` ## NULL ``` ```r littersModel$simulate('p') littersModel$r ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] ## [1,] 13 12 9 9 8 8 12 11 9 9 8 11 4 5 ## [2,] 12 11 10 9 10 9 9 8 8 4 7 4 5 3 ## [,15] [,16] ## [1,] 7 7 ## [2,] 3 0 ``` ```r littersModel$p ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] ## [1,] 0.5923401 0.2780523 0.6355158 0.70135781 0.6591489 0.4143388 0.8678220 ## [2,] 0.7458808 0.7124073 0.5210273 0.06657146 0.6707607 0.4841694 0.4561155 ## [,8] [,9] [,10] [,11] [,12] [,13] [,14] ## [1,] 0.6089581 0.3288683 0.04563478 0.7928597 0.4873260 0.4954329 0.7517793 ## [2,] 0.1392273 0.3669903 0.61664857 0.8403967 0.4710227 0.6083624 0.4848245 ## [,15] [,16] ## [1,] 0.7221339 0.6639406 ## [2,] 0.1561506 0.3841565 ``` --- Data values are protected from simulation unless you are sure you want to simulate those ```r littersModel$simulate('r', includeData = TRUE) littersModel$r ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] ## [1,] 8 3 6 5 4 3 10 4 3 1 8 6 3 6 ## [2,] 10 9 6 0 7 7 6 2 3 3 7 4 5 2 ## [,15] [,16] ## [1,] 6 8 ## [2,] 1 6 ``` --- ### Providing data and constants together. - Data and constants can be provided together **as `constants`**. - It would be slightly easier for BUGS/JAGS users to call this "data", but that would blur the concepts. - NIMBLE will usually disambiguate data when it is provided as constants. --- ### What are covariates and other non-parameters/non-observations? - Covariates/predictors are examples of values that are not parameters nor data in the sense of the likelihood. - Covariates/predictors can be provided via `constants` if you don't need to change them (often the case). - Covariates/predictors can be provided via `data` or `inits` if you want to change them. - NIMBLE will not treat them as 'data nodes'. --- ### More explicit need to provide dimensions - Square brackets must always be provided to indicate number of dimensions - If `x` is 2-dimensional, use `x[,] %*% beta[]`, not `x %*% beta` - Sometimes NIMBLE is not as smart as BUGS/JAGS at determining dimensions. There are two solutions: - Give dimensions explicitly: `x[1:n, 1:m]`, OR - Provide a `dimensions` argument to `nimbleModel`. Example: `dimensions = list(x = c(n, m))`. --- ### Vectorized math and linear algebra Instead of writing this in your model code: ```r nimbleOptions(verbose = FALSE) m1 <- nimbleModel( nimbleCode({ for(i in 1:5) { predicted[i] <- beta0 + beta1 * x[i] } } )) ``` you can write this: ```r m2 <- nimbleModel( nimbleCode({ predicted[1:5] <- beta0 + beta1 * x[1:5] } )) ``` The two models (in terms of internal representation) are not equivalent --- The two models (in terms of internal representation) are not equivalent ```r ## m1 has 5 scalar nodes m1$getNodeNames() ``` ``` ## [1] "predicted[1]" "predicted[2]" "predicted[3]" "predicted[4]" "predicted[5]" ``` ```r ## m2 has 1 vector node m2$getNodeNames() ``` ``` ## [1] "predicted[1:5]" ``` One is not necessarily better than the other. It depends on the model and the MCMC configuration. (More on those topics later.) Vectorized declarations do not work for distributions. --- ### Be careful about scalar vs. vector vs. matrix vs. array This will not work: ```r x[1:5] <- A[1:5, 1:5] %*% b[1:5] + c[1:5] ``` The problem is that the right-hand-side returns a matrix, so we can't assign it to a vector. This will work: ```r x[1:5] <- (A[1:5, 1:5] %*% b[1:5] + c[1:5])[,1] ``` --- ## Definition-time if-then-else - If you wish to define multiple alternative models in one set of code, you can use if-then-else statements. - These will be evaluated based on variables in the R environment when the model is defined. - **You cannot use if statements as statements within model code apart from this usage of defining alternative models.** --- ## Definition-time if-then-else For example: ```r code <- nimbleCode({ sigma ~ dunif(0, 10) beta0 ~ dnorm(0, sd = 1000) beta1 ~ dnorm(0, sd = 1000) if(INCLUDE_X2) { beta2 ~ dnorm(0, sd = 1000) } else {} for(i in 1:10) { if(INCLUDE_X2) { y[i] ~ dnorm(beta0 + beta1 * x1[i] + beta2 * x2[i], sd = sigma) } else { y[i] ~ dnorm(beta0 + beta1 * x1[i], sd = sigma) } } }) INCLUDE_X2 <- FALSE m1 <- nimbleModel(code) INCLUDE_X2 <- TRUE m2 <- nimbleModel(code) ``` --- .scroll-box-20[ ```r m1$getNodeNames() ``` ``` ## [1] "sigma" ## [2] "beta0" ## [3] "beta1" ## [4] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[1]" ## [5] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[2]" ## [6] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[3]" ## [7] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[4]" ## [8] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[5]" ## [9] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[6]" ## [10] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[7]" ## [11] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[8]" ## [12] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[9]" ## [13] "lifted_beta0_plus_beta1_times_x1_oBi_cB_L6[10]" ## [14] "y[1]" ## [15] "y[2]" ## [16] "y[3]" ## [17] "y[4]" ## [18] "y[5]" ## [19] "y[6]" ## [20] "y[7]" ## [21] "y[8]" ## [22] "y[9]" ## [23] "y[10]" ``` ```r m2$getNodeNames() ``` ``` ## [1] "sigma" ## [2] "beta0" ## [3] "beta1" ## [4] "beta2" ## [5] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[1]" ## [6] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[2]" ## [7] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[3]" ## [8] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[4]" ## [9] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[5]" ## [10] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[6]" ## [11] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[7]" ## [12] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[8]" ## [13] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[9]" ## [14] "lifted_beta0_plus_beta1_times_x1_oBi_cB_plus_beta2_times_x2_oBi_cB_L8[10]" ## [15] "y[1]" ## [16] "y[2]" ## [17] "y[3]" ## [18] "y[4]" ## [19] "y[5]" ## [20] "y[6]" ## [21] "y[7]" ## [22] "y[8]" ## [23] "y[9]" ## [24] "y[10]" ``` ] m2 has `beta2` while m1 does not. The long names are "lifted nodes" -- more on those later. --- ### Models as objects: operating the model NIMBLE provides users and programmers with the ability to get information about the nodes and variables in the model and the relationships among them. We can use NIMBLE to ask questions such as: - What's the current value in a variable? .scroll-box-8[ ```r littersModel$r ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] ## [1,] 8 3 6 5 4 3 10 4 3 1 8 6 3 6 ## [2,] 10 9 6 0 7 7 6 2 3 3 7 4 5 2 ## [,15] [,16] ## [1,] 6 8 ## [2,] 1 6 ``` ```r littersModel$mu[1] ``` ``` ## [1] 0.5 ``` ```r littersModel$p ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] ## [1,] 0.5923401 0.2780523 0.6355158 0.70135781 0.6591489 0.4143388 0.8678220 ## [2,] 0.7458808 0.7124073 0.5210273 0.06657146 0.6707607 0.4841694 0.4561155 ## [,8] [,9] [,10] [,11] [,12] [,13] [,14] ## [1,] 0.6089581 0.3288683 0.04563478 0.7928597 0.4873260 0.4954329 0.7517793 ## [2,] 0.1392273 0.3669903 0.61664857 0.8403967 0.4710227 0.6083624 0.4848245 ## [,15] [,16] ## [1,] 0.7221339 0.6639406 ## [2,] 0.1561506 0.3841565 ``` ] Setting values: .scroll-box-8[ ```r littersModel$a[1] <- 1 littersModel$a ``` ``` ## [1] 1 2 ``` ] --- ### Querying nodes and variables - What are the nodes in the model? - What are the dependencies of a given node? Note that this is information used in many algorithms. ```r nodes <- littersModel$getNodeNames() nodes[1:11] ``` ``` ## [1] "a[1]" "a[2]" "b[1]" "b[2]" "p[1, 1]" "p[1, 2]" "p[1, 3]" ## [8] "p[1, 4]" "p[1, 5]" "p[1, 6]" "p[1, 7]" ``` ```r top <- littersModel$getNodeNames(topOnly = TRUE) top ``` ``` ## [1] "a[1]" "a[2]" "b[1]" "b[2]" ``` ```r muDeps <- littersModel$getDependencies('mu') muDeps[1:11] ``` ``` ## [1] "mu[1]" "mu[2]" NA NA NA NA NA NA NA ## [10] NA NA ``` --- ### Querying nodes and variables - What are data nodes? .scroll-box-20[ ```r littersModel$getNodeNames(dataOnly = TRUE) ``` ``` ## [1] "r[1, 1]" "r[1, 2]" "r[1, 3]" "r[1, 4]" "r[1, 5]" "r[1, 6]" ## [7] "r[1, 7]" "r[1, 8]" "r[1, 9]" "r[1, 10]" "r[1, 11]" "r[1, 12]" ## [13] "r[1, 13]" "r[1, 14]" "r[1, 15]" "r[1, 16]" "r[2, 1]" "r[2, 2]" ## [19] "r[2, 3]" "r[2, 4]" "r[2, 5]" "r[2, 6]" "r[2, 7]" "r[2, 8]" ## [25] "r[2, 9]" "r[2, 10]" "r[2, 11]" "r[2, 12]" "r[2, 13]" "r[2, 14]" ## [31] "r[2, 15]" "r[2, 16]" ``` ```r littersModel$isData('r') ``` ``` ## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE ## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE ## [31] TRUE TRUE ``` ```r ## parameters (including imputed data) littersModel$getNodeNames(stochOnly = TRUE, includeData = FALSE) ``` ``` ## [1] "a[1]" "a[2]" "b[1]" "b[2]" "p[1, 1]" "p[1, 2]" ## [7] "p[1, 3]" "p[1, 4]" "p[1, 5]" "p[1, 6]" "p[1, 7]" "p[1, 8]" ## [13] "p[1, 9]" "p[1, 10]" "p[1, 11]" "p[1, 12]" "p[1, 13]" "p[1, 14]" ## [19] "p[1, 15]" "p[1, 16]" "p[2, 1]" "p[2, 2]" "p[2, 3]" "p[2, 4]" ## [25] "p[2, 5]" "p[2, 6]" "p[2, 7]" "p[2, 8]" "p[2, 9]" "p[2, 10]" ## [31] "p[2, 11]" "p[2, 12]" "p[2, 13]" "p[2, 14]" "p[2, 15]" "p[2, 16]" ``` ] --- ### More details There are a variety of options to `getNodeNames()` and `getDependencies` that allow you to fine-tune the information you get. .scroll-box-8[ ```r args(littersModel$getDependencies) ``` ``` ## function (nodes, omit = character(), self = TRUE, determOnly = FALSE, ## stochOnly = FALSE, includeData = TRUE, dataOnly = FALSE, ## includeRHSonly = FALSE, downstream = FALSE, returnType = "names", ## returnScalarComponents = FALSE) ## NULL ``` ```r args(littersModel$getNodeNames) ``` ``` ## function (determOnly = FALSE, stochOnly = FALSE, includeData = TRUE, ## dataOnly = FALSE, includeRHSonly = FALSE, topOnly = FALSE, ## latentOnly = FALSE, endOnly = FALSE, returnType = "names", ## returnScalarComponents = FALSE) ## NULL ``` ] .scroll-box-8[ ```r latents <- littersModel$getNodeNames(latentOnly = TRUE, stochOnly = TRUE, includeData = FALSE) latents ``` ``` ## [1] "p[1, 1]" "p[1, 2]" "p[1, 3]" "p[1, 4]" "p[1, 5]" "p[1, 6]" ## [7] "p[1, 7]" "p[1, 8]" "p[1, 9]" "p[1, 10]" "p[1, 11]" "p[1, 12]" ## [13] "p[1, 13]" "p[1, 14]" "p[1, 15]" "p[1, 16]" "p[2, 1]" "p[2, 2]" ## [19] "p[2, 3]" "p[2, 4]" "p[2, 5]" "p[2, 6]" "p[2, 7]" "p[2, 8]" ## [25] "p[2, 9]" "p[2, 10]" "p[2, 11]" "p[2, 12]" "p[2, 13]" "p[2, 14]" ## [31] "p[2, 15]" "p[2, 16]" ``` ```r littersModel$getDependencies(latents, dataOnly = TRUE) ``` ``` ## [1] "r[1, 1]" "r[1, 2]" "r[1, 3]" "r[1, 4]" "r[1, 5]" "r[1, 6]" ## [7] "r[1, 7]" "r[1, 8]" "r[1, 9]" "r[1, 10]" "r[1, 11]" "r[1, 12]" ## [13] "r[1, 13]" "r[1, 14]" "r[1, 15]" "r[1, 16]" "r[2, 1]" "r[2, 2]" ## [19] "r[2, 3]" "r[2, 4]" "r[2, 5]" "r[2, 6]" "r[2, 7]" "r[2, 8]" ## [25] "r[2, 9]" "r[2, 10]" "r[2, 11]" "r[2, 12]" "r[2, 13]" "r[2, 14]" ## [31] "r[2, 15]" "r[2, 16]" ``` ```r ## littersModel$getNodeNames(dataOnly = TRUE) ``` ] --- ### Querying nodes and variables - What are the variables in the model? What information is available about them? .scroll-box-8[ ```r littersModel$getVarNames() ``` ``` ## [1] "r" "p" "a" "b" "mu" "theta" ``` ```r littersModel$getVarInfo('a') ``` ``` ## Reference class object of class "varInfoClass" ## Field "varName": ## [1] "a" ## Field "mins": ## [1] 1 ## Field "maxs": ## [1] 2 ## Field "nDim": ## [1] 1 ## Field "anyStoch": ## [1] TRUE ## Field "anyDynamicallyIndexed": ## [1] FALSE ``` ] As of the current version of NIMBLE, information about variables is not competely nicely arranged for a user (there aren't as many query functions), but it is available. This variable has 1 dimension (`nDim`), and its size is 2 (`maxs`). Currently `mins` is always 1. If at least one node within the variable is stochastic, then `anyStoch` will be `TRUE`. --- ### Operating the model: *simulate()* We have control over the model. In particular, for every node, NIMBLE provides **`calculate()`** and **`simulate()`** functions that calculate the current probability density value for the node and simulate a new value for the node from its (prior) distribution (i.e., given only parent nodes). These operations lie at the heart of many algorithms. **`simulate()`** puts new values into the model; if you want to see those values, you need to look into the model. .scroll-box-12[ ```r set.seed(1) # so the calculations are reproducible littersModel$simulate('p') # simulate from prior littersModel$p ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] ## [1,] 0.5803922 0.2715744 0.02844934 0.2276611 0.65840976 0.1855204 0.1300447 ## [2,] 0.6639406 0.7458808 0.71240731 0.5210273 0.06657146 0.6707607 0.4841694 ## [,8] [,9] [,10] [,11] [,12] [,13] [,14] ## [1,] 0.4492384 0.03374921 0.2108929 0.5782571 0.06969556 0.3494547 0.3390992 ## [2,] 0.4561155 0.13922729 0.3669903 0.6166486 0.84039667 0.4710227 0.6083624 ## [,15] [,16] ## [1,] 0.09446711 0.1146782 ## [2,] 0.48482454 0.1561506 ``` ```r littersModel$getLogProb('p') # log prob not yet updated! ``` ``` ## [1] NA ``` ```r littersModel$calculate('p') # update it ``` ``` ## [1] 7.521954 ``` ```r littersModel$getLogProb('p') # now we're good ``` ``` ## [1] 7.521954 ``` ] --- ### Operating the model: *calculate()* Let's change values in the model and recalculate the density values. - NIMBLE sometimes introduces *hidden nodes* (remember those lifted nodes?) not specified in the model, - When calculating model density values, it's best to ask NIMBLE to do so based on the dependencies of the altered node. ```r littersModel$getLogProb('p') ``` ``` ## [1] 7.521954 ``` ```r littersModel$a[1] <- 1 littersModel$b[1] <- 3 littersModel$getLogProb('p') # recall why this hasn't changed yet ``` ``` ## [1] 7.521954 ``` --- ### Operating the model: *calculate()* ```r ## DON'T DO THIS! (though it's ok to do here...) ## littersModel$calculate('p') ## INSTEAD DO THIS deps <- littersModel$getDependencies(c('a[1]','b[1]')) littersModel$calculate(deps) ``` ``` ## [1] -7.760781 ``` ```r ## alternatively, we could just update the entire model to be safe: ## littersModel$calculate() ## now that model state is updated, can ask for logProbs littersModel$getLogProb('p') ``` ``` ## [1] 8.249862 ``` --- ### The importance of querying a model: lifted nodes NIMBLE implements some features by inserting its own nodes. **You should never assume you know what nodes are in a model simply because you wrote the model code.** Let's look at the two main ways this happens: ```r code <- nimbleCode({ mu ~ dnorm(0, sd = 10) tau ~ dunif(0, 100) x ~ dnorm(mu, tau) #by default, tau is a precision }) m1 <- nimbleModel(code, data = list(x = 2.5), inits = list(mu = 0, tau = 1)) ``` --- <center><img src="variables.png"></center> --- ```r m1$getNodeNames() ``` ``` ## [1] "mu" "tau" ## [3] "lifted_d1_over_sqrt_oPtau_cP" "x" ``` The node `lifted_d1_over_sqrt_oPtau_cP` has been inserted between `tau` and `x`. The resulting model would equivalently have been created by this model code: ```r nimbleCode({ mu ~ dnorm(0, sd = 10) tau ~ dunif(0, 100) lifted_d1_over_sqrt_oPtau_cP <- 1/sqrt(tau) x ~ dnorm(0, sd = lifted_d1_over_sqrt_oPtau_cP) # override and make 2nd arg the SD }) ``` --- NIMBLE has *lifted* the calculation of standard deviation from precision so that it is part of the model's graph. Therefore *you will make a mistake if you assume that the dependencies of `tau` include only `x`*: ```r m1$tau ## Original value of tau ``` ``` ## [1] 1 ``` ```r m1$getParam('x', 'sd') ## Original value of sd based on tau ``` ``` ## [1] 1 ``` ```r ## Equivalent to getParam(m1, "x", "sd") m1$tau <- 16 m1$x <- 1 m1$calculate('x') ## Wrong: the lifted node is being neglected ``` ``` ## [1] -1.418939 ``` ```r m1$calculate(m1$getDependencies('tau')) ## Wrong: the lifted node is being neglected ``` ``` ## [1] -12.13781 ``` ```r m1$getParam('x', 'sd') ## Verifying the sd is not up to date. ``` ``` ## [1] 0.25 ``` --- ### The importance of querying a model: model-generic programming Instead we need to ask NIMBLE for the dependencies of any nodes that have changed and use *model-generic* programming: .scroll-box-14[ ```r deps <- m1$getDependencies('tau') m1$tau <- 16 m1$x <- 1 m1$calculate(deps) ## Correct: the lifted node is updated too ``` ``` ## [1] -12.13781 ``` ```r deps ``` ``` ## [1] "tau" "lifted_d1_over_sqrt_oPtau_cP" ## [3] "x" ``` ```r m1$getLogProb('x') ``` ``` ## [1] -7.532644 ``` ```r m1$getParam('x', 'sd') ## Verifying the sd is up to date ``` ``` ## [1] 0.25 ``` ] So the call to *calculate* causes the lifted node to get updated and then calculates the log probability density for *x*.