class: center, middle, inverse, title-slide

# Miscellaneous Tips and Tricks in .mono[R]
## EC 425/525, Lab 7
### Edward Rubin
### 17 May 2019

---
class: inverse, middle

# Prologue

---
name: schedule

# Schedule

## Last time

Simulation in .mono[R]

## Today

Helpful tips and tricks in .mono[R]

---
layout: true
# Tips and tricks

---
class: inverse, middle

---
name: applys

## The apply family

In general, `for` loops are not the "preferred" route in .mono[R].

--

1. Many functions are vectorized: you can apply a function over a vector.
--
<br>_E.g._, the square root of the numbers from 1 to 10: `sqrt(1:10)`.

--

1. That said, sometimes you just gotta loop.
--
<br>For these situations, `base` .mono[R] offers a family of `apply` functions.

---
name: lapply

## The apply family

The `apply` family *applies* a function over a vector, list, data frame, *etc.*

--

For example, `lapply()` takes two arguments: `X` and `FUN`.

--

- .purple[`X`] A vector/list of values.

--

- .purple[`FUN`] The function you want to evaluate on each value of `X`.

--

`lapply()` returns a list of the results.

--

.ex[Example] `toupper()` capitalizes characters
--
, _e.g._, `toupper("a")` yields `"A"`.

--

`lapply(X = c("a", "pig"), FUN = toupper)`
--
 returns `list("A", "PIG")`.

--

.note[Note] This is a silly example, as you can directly use `toupper()` on vectors.

---
name: apply

## Plain apply

The related `apply()` function *applies* a given function (`FUN`) along the margins (`MARGIN`) of a given array/matrix (`X`).

--

Your options for `MARGIN` are `1` for rows and `2` for columns.

--

.ex[Example] Let's find the maximum value in each row of a matrix.

```r
# Create a matrix
ex_matrix <- matrix(data = 1:16, nrow = 4, byrow = TRUE)
# Find the maximum value in each row.
apply(X = ex_matrix, MARGIN = 1, FUN = max)
```

```
#> [1]  4  8 12 16
```

---
name: mapply

## Multiple apply

Like `lapply()`, `mapply()` repeatedly evaluates a function (`FUN`) for each value in a vector of inputs.
--

However, `mapply()` allows you to evaluate across .b[multiple] vectors.

--

In addition, `mapply()` allows you to dictate whether/how the results are simplified (_e.g._, `SIMPLIFY = TRUE` for vector or matrix) or kept as a `list`.

--

.ex[Example] Random normal draws with different means and standard deviations.

```r
mapply(FUN = rnorm, n = 1, mean = c(0, 10, 20), sd = 1:3)
```

```
#> [1]  0.8005418  8.8048199 25.8529457
```

---

## Custom apply

All of our examples used already-defined functions for `FUN`, _e.g._,

--

```r
lapply(X = c("a", "pig"), FUN = toupper)
```

--

Alternatively, you can define your own function at `FUN`, _e.g._,

--

```r
lapply(X = 1:2, FUN = function(i) {i > 1})
```

```
#> [[1]]
#> [1] FALSE
#>
#> [[2]]
#> [1] TRUE
```

---
name: apply-more

## Other packages

Other packages offer similar (and parallelized) functions.

.left20[
.hi-pink[`base`]
<br> `lapply()`
<br> `apply()`
<br> `mapply()`
]

--

.left25[
.hi-orange[`purrr`/`furrr`]
<br> `map()`
<br> ?
<br> `map2()`
]

--

.left30[
.hi-turquoise[`future.apply`]
<br> `future_lapply()`
<br> `future_apply()`
<br> `future_mapply()`
]

--

.left25[
.hi-purple[`parallel`]
<br> `mclapply()`
<br> `parApply()`
<br> `mcmapply()`
]

---
name: for

## `for()` loops

However, if you're really committed to running `for` loops, the syntax is

```r
# Create an empty vector
our_vector <- c()
# Run the for loop for some numbers
for (i in c(1, 1, 2, 3, 5, 8)) {
  # Print 'i'
  print(i)
  # Append 'i' to the end of our_vector
  our_vector <- c(our_vector, i)
}
```

---
name: lists

## Lists and unlisting

Lists (_e.g._, as returned by `lapply()`) can be helpful, but they can also be fairly annoying.

--

Enter `unlist()`.
--

.col-left[
.b[List output]
```r
lapply(
  X = 1:2,
  FUN = as.character
)
```

```
#> [[1]]
#> [1] "1"
#>
#> [[2]]
#> [1] "2"
```
]

--

.col-right[
.b[`unlist()`-ing to vector]
```r
lapply(
  X = 1:2,
  FUN = as.character
) %>% unlist()
```

```
#> [1] "1" "2"
```
]

---
name: list-df

## From lists to data frames

Sometimes you don't want to entirely `unlist()` a list.

--

For example, you might have a list of data frames that you want to bind into a new data frame.

--

In this case, you can use `bind_rows()` or `bind_cols()` from `dplyr`.

--

Alternatively, you might be able to make use of `map_dfr()` or `map_dfc()` from `purrr`.

---
name: list-index

## Indexing lists

.note[Also] Don't forget that you can index lists using double brackets.

```r
# Capitalize the alphabet
our_list <- lapply(X = letters, FUN = toupper)
# The third letter
our_list[[3]]
```

```
#> [1] "C"
```

---
name: which

## Logical vectors and `which()`

Finally, the simple function `which()` can be surprisingly helpful.

--

`which()` tells you *which* of the entries in a logical vector are `TRUE`
--
, _i.e._, *which* element (or elements) satisfies your logical condition(s).

---
layout: false
class: clear

```r
letters
```

```
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p"
#> [17] "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
```

--

```r
letters > "m"
```

```
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [12] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#> [23]  TRUE  TRUE  TRUE  TRUE
```

--

```r
which(letters > "m")
```

```
#>  [1] 14 15 16 17 18 19 20 21 22 23 24 25 26
```

--

```r
letters[which(letters > "m")]
```

```
#>  [1] "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
```

---
class: clear, middle

Alternatively, we could have just used the logical vector.
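---
class: clear

.note[Note] Using `which()` and using the logical vector directly can give different results when the data contain missing values. A minimal sketch, assuming a made-up vector `x` that is not part of the lab: `which()` skips `NA`, while indexing with the logical vector itself keeps an `NA` entry.

```r
# Hypothetical vector with a missing value (for illustration only)
x <- c(1, NA, 3)
# The comparison returns NA wherever x is NA
is_big <- x > 2
# Indexing with the logical vector keeps the NA entry
by_logical <- x[is_big]
# which() returns only the positions that are TRUE
by_which <- x[which(is_big)]
```

Here `by_logical` has two elements (`NA` and `3`), while `by_which` has only one (`3`).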
---
class: clear

```r
letters
```

```
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p"
#> [17] "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
```

--

```r
letters > "m"
```

```
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [12] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#> [23]  TRUE  TRUE  TRUE  TRUE
```

--

```r
letters[letters > "m"]
```

```
#>  [1] "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
```

---

# Tips and tricks

## Logical vectors, continued

This logic-based selection works on many classes of objects, but it may change the class/structure of the object.

.col-left[
```r
# Create a matrix
mat <- matrix(1:9, ncol = 3)
# Print it out
mat
```

```
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
```
]

--

.col-right[
```r
# Is the entry even?
mat %% 2 == 0
```

```
#>       [,1]  [,2]  [,3]
#> [1,] FALSE  TRUE FALSE
#> [2,]  TRUE FALSE  TRUE
#> [3,] FALSE  TRUE FALSE
```
]

--

.col-right[
```r
# Print the even entries
mat[mat %% 2 == 0]
```

```
#> [1] 2 4 6 8
```
]

---
layout: false

# Table of contents

.pull-left[
### Tips and tricks

.small[
1. [The apply family](#applys)
  - [`lapply()`](#lapply)
  - [Plain `apply()`](#apply)
  - [`mapply()`](#mapply)
1. [`for()` loops](#for)
1. [Lists](#lists)
  - [`unlist()`-ing](#lists)
  - [Binding to data frame](#list-df)
  - [Indexing](#list-index)
1. [Logical vectors and `which()`](#which)
]]

---
exclude: true
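---

# Tips and tricks

## Logical vectors and positions

Logical selection on a matrix returns a plain vector, so the positions of the matches are lost. If you need to know where the matching entries sit, one option is `which()` with `arr.ind = TRUE`. A minimal sketch, reusing the 3-by-3 matrix from the logical-vectors example; the object names `even_pos` and `first_col` are hypothetical.

```r
# Recreate the matrix from the earlier example
mat <- matrix(1:9, ncol = 3)
# Row and column positions of the even entries
even_pos <- which(mat %% 2 == 0, arr.ind = TRUE)
# Separately: drop = FALSE keeps a single column as a matrix
first_col <- mat[, 1, drop = FALSE]
```

`even_pos` is a matrix with one row per even entry and columns named `row` and `col`. `first_col` stays a 3-by-1 matrix rather than collapsing to a vector.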