In your opinion, what do lines 3 to 24 do? See ‘system prerequisites’ here
What do all the lines starting with RUN do?
What do all the lines starting with COPY do?
What does the very last line do?
Dockerizing our project (1/2)
The project is dockerized in scripts/Docker/dockerized_project
There’s:
A Dockerfile
A renv.lock file
A _targets.R file (not discussed here)
The source of our analysis: analyse_data.Rmd
Required functions in the functions/ folder
Build the image: docker build -t housing_image .
Dockerizing our project (2/2)
Run a container:
First, create a shared folder on your computer
Then, use this command, but change /path/to/shared_folder to the one you made: docker run --rm --name housing_container -v /path/to/shared_folder:/home/housing/shared_folder:rw housing_image
Check the shared folder on your computer: the output is now there!
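The build and run steps above can be sketched as one small script. This is only an illustration under a few assumptions: it is run from the folder that contains the Dockerfile, and SHARED_DIR is a hypothetical location for the shared folder that you should change to suit your machine. The Docker commands are the ones from the slides; the guard around them is an addition so the script does nothing harmful when Docker is missing.

```shell
#!/bin/sh
# Assumption: run from scripts/Docker/dockerized_project, next to the Dockerfile.
# SHARED_DIR is a hypothetical default; override it or edit it as needed.
SHARED_DIR="${SHARED_DIR:-$PWD/shared_folder}"

# 1. Create the shared folder on the host (no error if it already exists).
mkdir -p "$SHARED_DIR"

# 2. Build and run only when Docker and a Dockerfile are actually available.
if command -v docker >/dev/null 2>&1 && [ -f Dockerfile ]; then
  docker build -t housing_image .
  docker run --rm --name housing_container \
    -v "$SHARED_DIR":/home/housing/shared_folder:rw \
    housing_image
  # 3. Whatever the container wrote should now be visible on the host.
  ls -l "$SHARED_DIR"
else
  echo "Docker or Dockerfile not found: created $SHARED_DIR only."
fi
```

The host side of -v must be an absolute path, which is why the default is built from $PWD.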
Docker: a panacea?
Docker is very useful and widely used
But the entry cost is high
Single point of failure (what happens if Docker gets bought, abandoned, etc.?)
Docker does not deal with reproducibility per se; we’re “abusing” it in a way
The Nix package manager
Package manager: tool to install and manage packages
Package: any piece of software (not just R packages)
A popular package manager:
Google Play Store
Reproducibility in the R ecosystem
Per-project environments not often used
Popular choice: {renv}, but deals with R packages only
Still need to take care of R itself
System-level dependencies as well!
A popular approach: Docker + {renv} (see Rocker project)
Nix deals with everything, with one single text file (called a Nix expression)!
A basic Nix expression (1/6)
let
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
  system_packages = builtins.attrValues {
    inherit (pkgs) R;
  };
in
  pkgs.mkShell {
    buildInputs = [ system_packages ];
    shellHook = "R --vanilla";
  }
There’s a lot to discuss here!
A basic Nix expression (2/6)
Written in the Nix language (not discussed)
Defines the repository to use (with a fixed revision)
Lists packages to install
Defines the output: a development shell
A basic Nix expression (3/6)
Software for Nix is defined as a mono-repository of tens of thousands of expressions on GitHub
GitHub: we can use any commit to pin package versions for reproducibility!
For example, the following commit installs R 4.3.1 and associated packages:
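As a hedged sketch of pinning in practice, the script below reuses the revision from the fetchTarball call in the basic expression shown earlier. It asks Nix for a temporary shell containing R and prints the R version that this revision provides. The variable names are illustrative, and the command is only run if nix-shell is found.

```shell
#!/bin/sh
# Revision copied from the fetchTarball call in the expression shown earlier.
NIXPKGS_REV="976fa3369d722e76f37c77493d99829540d43845"
NIXPKGS_URL="https://github.com/NixOS/nixpkgs/archive/${NIXPKGS_REV}.tar.gz"

if command -v nix-shell >/dev/null 2>&1; then
  # -I points the nixpkgs lookup path at the pinned tarball,
  # -p R asks for a temporary shell containing R,
  # --run executes one command in that shell and then exits.
  nix-shell -I nixpkgs="$NIXPKGS_URL" -p R --run "R --version"
else
  echo "Nix is not installed; the pinned source would be: $NIXPKGS_URL"
fi
```

Changing NIXPKGS_REV to another commit of the repository is all it takes to get the package versions as they were at that commit.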
rix: reproducible development environments with Nix (1/4)
{rix} (website) makes writing Nix expressions easy!
Simply use the provided rix() function:
library(rix)

rix(r_ver = "4.3.1",
    r_pkgs = c("dplyr", "ggplot2"),
    system_pkgs = NULL,
    git_pkgs = NULL,
    tex_pkgs = NULL,
    ide = "rstudio",
    # This shellHook is required to run RStudio on Linux
    # you can ignore it on other systems
    shell_hook = "export QT_XCB_GL_INTEGRATION=none",
    project_path = ".")
rix: reproducible development environments with Nix (2/4)
List required R version and packages
Optionally: more system packages, packages hosted on GitHub, or LaTeX packages
Optionally: an IDE (RStudio, Radian, VS Code or “other”)
Work interactively in an isolated environment!
rix: reproducible development environments with Nix (3/4)
rix::rix() generates a default.nix file
Build expressions using nix-build (in terminal) or rix::nix_build() from R
“Drop” into the development environment using nix-shell
Expressions can be generated even without Nix installed
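The terminal side of this workflow might look like the sketch below. It assumes a default.nix generated by rix::rix() sits in the current folder. The STATUS variable is purely illustrative, and --run is used so the script does not stop in an interactive shell.

```shell
#!/bin/sh
# Assumption: default.nix was generated by rix::rix() in the current folder.
if command -v nix-build >/dev/null 2>&1 && [ -f default.nix ]; then
  # Build the environment described by default.nix, then run one command
  # inside it. Plain `nix-shell` (without --run) would instead drop you
  # into an interactive shell.
  if nix-build && nix-shell --run "R --version"; then
    STATUS="built"
  else
    STATUS="failed"
  fi
else
  echo "Nix or default.nix not found; nothing was built."
  STATUS="skipped"
fi
echo "status: $STATUS"
```

For day-to-day work you would typically run nix-build once and then plain nix-shell to work interactively in the environment.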
rix: reproducible development environments with Nix (4/4)
Can install specific versions of packages (write "dplyr@1.0.0")