How to run rocker/rstudio images with podman on Fedora

Run several isolated instances of RStudio with different versions of R, easily and without interference, with full access to our home folder, as if they were installed directly on the host.

containers

docker

podman

rocker

rstudio

Author

seergi

Published

March 20, 2023

Manage pods, containers, and container images.

The Rocker Project. Docker containers for the R environment.

Let’s get to the point

We assume that you are working with Fedora 37 and have podman 4.3.1 or above installed on your system. Check this with podman --version:

$ podman --version
podman version 4.4.2
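The version check can also be scripted. The following is a minimal sketch, assuming GNU sort (for its -V version ordering) and awk are available, as they are on a default Fedora install. The helper name version_ge is hypothetical and not part of podman:

```shell
# version_ge A B: succeeds when version A is greater than or equal to B.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query podman when it is installed. The third field of
# "podman version 4.4.2" is the version number.
if command -v podman >/dev/null 2>&1; then
  installed=$(podman --version | awk '{print $3}')
  if version_ge "$installed" 4.3.1; then
    echo "podman $installed is recent enough"
  else
    echo "podman $installed is older than 4.3.1, consider upgrading"
  fi
fi
```

Version sort is used instead of a plain string comparison so that a release such as 4.10.0 is correctly treated as newer than 4.3.1.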

First, we create a directory inside our home folder for RStudio to store its data and configuration. In this case, we are going to use the rocker/verse:4.2.3 image, which gives us an RStudio instance with R 4.2.3, tidyverse and LaTeX included. I’ll name it verse423, in reference to the image used:

mkdir ~/verse423

Choosing images

The Rocker Project maintains several incremental builds of RStudio: rocker/rstudio, rocker/tidyverse, rocker/verse and so on. Check out the available images on https://rocker-project.org/images/ and choose the one that best fits your needs.

There are two container registries available: Docker Hub and GitHub, with R versions ranging from 3.3.2 up to the latest one.

Now we are ready to spin up the RStudio container as follows:

podman run -d -p 8423:8787 --name verse423 \
  -u root --userns=keep-id:uid=1000,gid=1000 \
  --security-opt label=disable \
  -v ~/verse423:/home/rstudio \
  -v ~:/home/rstudio/$USER \
  -e DISABLE_AUTH=true \
  docker.io/rocker/verse:4.2.3

That’s it! Open your browser and point it to localhost:8423. We now have a fully functional RStudio (R 4.2.3) with quarto, rmarkdown, tidyverse, devtools, pandoc, data.table, Git and TeX Live pre-installed. It has seamless full access to our host home directory, as if it were installed directly on our host, without permissions or ownership issues.

Stop and restart containers

The verse423 container will stop when:

Logging out or shutting down the computer.
Manually stopping it with podman stop verse423.

The container can be restarted at any time with podman restart verse423.
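A small helper can bring a stopped container back only when that is needed. This is a sketch rather than a definitive script: start_if_stopped is a hypothetical name, and it relies on the podman container exists, podman inspect and podman start subcommands:

```shell
# start_if_stopped NAME: start the container only when it exists and is
# not already running.
start_if_stopped() {
  if ! podman container exists "$1"; then
    echo "container $1 does not exist, create it with podman run first" >&2
    return 1
  fi
  # .State.Running is "true" or "false" in the inspect output.
  running=$(podman inspect --format '{{.State.Running}}' "$1")
  if [ "$running" = "true" ]; then
    echo "$1 is already running"
  else
    podman start "$1" >/dev/null && echo "$1 started"
  fi
}
```

After logging in again, start_if_stopped verse423 would start the container from the beginning of this post without touching it if it is already up.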

Let’s dig deeper

Motivation

The aim of this post is to show how to use the rocker/rstudio images locally in Fedora without installing Docker and keeping its background service running, using only the default container manager already included in this distro: podman. In the past it was difficult to run these images with podman in a consistent way due to bugs, missing features, difficulties with SELinux and the Docker-centric design of the rocker/rstudio images. Since version 4.3.1, however, it is possible to run them as easily as with Docker, with just a few tweaks to the run command.

To get a better understanding of how the podman run command shown above works, it is useful to take a look at how these images are run with Docker.

How it is done with Docker under Ubuntu

Following the guidelines provided by rocker-project, this is the run command to be used with Docker to spin up an RStudio container with full access to our host home folder:

docker run -d -p 8423:8787 --name verse423 \
  -v ~/verse423:/home/rstudio \
  -v ~:/home/rstudio/$USER \
  -e USERID=$(id -u) -e GROUPID=$(id -g) \
  -e DISABLE_AUTH=true \
  docker.io/rocker/verse:4.2.3

podman with Fedora and Docker with Ubuntu

Note that the podman commands were run under Fedora 37 and Docker commands were run under Ubuntu 22.04. That’s because although it is possible to install and run Docker in Fedora, the main idea of this post is to take advantage of the container engine included in Fedora to run the rocker/rstudio images without the need to install any additional software.

Breaking down the common options

-d: Run the container in detached mode, allowing us to continue using the terminal. An alternative is to replace it with -ti, which shows the stdout of the container at the cost of locking the terminal. With -ti we can also override the /init command by placing bash at the end of the run command, in order to use the container directly from the terminal (before RStudio is deployed).
-p 8423:8787: The port mapping for the container, in the form HOST_PORT:CONTAINER_PORT. The latter is always 8787 in rocker/rstudio images; the first one can be changed at will. If you are going to use several instances of RStudio, make sure to assign a different port to each one. For example -p 9500:8787, -p 9501:8787 and so on. Ports below 1024 are privileged and should not be used.
--name verse423: Name for the container. I chose verse423 to make it easy to identify the rocker/rstudio image and R version used. If you need to spin up more than one container, change the name (as with ports, there cannot be two containers with the same name).
-v HOST_DIR:CONT_DIR: Volumes to be used by the container. Host folders must exist before executing the command. With this setup, you’ll have a dedicated persistent folder for RStudio to store its configuration files, workspace, history, etc., always accessible from the host, and a second (optional) volume which bind-mounts the entire home folder of the host to give RStudio access to all your files. You could modify this second one to point only to your project or working folder if you don’t want to share the entire home directory.
-e DISABLE_AUTH=true: Environment variable which tells RStudio to disable authentication, appropriate since we are using the image locally. Otherwise, you should set up a password with -e PASSWORD=<mypassword> option.
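When several instances are planned, it may help to validate the host ports before writing the run commands. The sketch below checks that a port is a plain number in the unprivileged range. The helper name valid_host_port and the container names other than verse423 are hypothetical:

```shell
# valid_host_port PORT: succeeds for an unprivileged TCP port (1024-65535).
valid_host_port() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty or not a plain number
  esac
  [ "$1" -ge 1024 ] && [ "$1" -le 65535 ]
}

# Assign consecutive host ports to three instances. Every one of them
# maps to 8787, the port RStudio listens on inside the container.
port=9500
for name in verse423 tidyverse420 rstudio363; do
  valid_host_port "$port" && echo "$name -> -p $port:8787"
  port=$((port + 1))
done
```

This only checks the range, not whether another process already listens on the port.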

The difference: Handling the user namespace

The way we handle the user namespace is critical in order to give RStudio access to read and write files on bind-mounted volumes without having permissions problems or weird changes of ownership of directories and files.

With Docker, we manage it by including these options in the run command:

-e USERID=$(id -u) -e GROUPID=$(id -g)

While with podman we include these options:

-u root --userns=keep-id:uid=1000,gid=1000 \
--security-opt label=disable

Check your UID/GID

We can check our user id and group id with the id command:

$ id
uid=1002(sergi) gid=1003(sergi) groups=1003(sergi),27(sudo),115(docker)

The $(id -u) and $(id -g) command substitutions give us the UID and GID of our user. In this case 1002 and 1003.

With Docker, the key point is to make the UID/GID of the user in RStudio (inside the container) match the UID/GID of the user who runs the container on the host. Because the rocker/rstudio images were designed with Docker in mind, the environment variables USERID and GROUPID were provided to allow us to change the UID/GID of the RStudio user (which by default is uid=1000,gid=1000). More information about this here.

Your host user ID and RStudio user ID

If you are the only user on your computer, chances are you have uid=1000,gid=1000. In that case the UID/GID of your host user automatically matches the UID/GID of the default RStudio user, and the following command (without setting the USERID and GROUPID options) will work fine:

docker run -d -p 8423:8787 --name verse423 \
  -v ~/verse423:/home/rstudio \
  -v ~:/home/rstudio/$USER \
  -e DISABLE_AUTH=true \
  docker.io/rocker/verse:4.2.3

But be aware that it will not work if your host user has a different UID/GID. Always include the USERID/GROUPID options with Docker to avoid potential issues while accessing the volumes. Also, you can check your id both on the host and in the terminal inside RStudio: the user name will be different, which doesn’t matter, but the id numbers must be the same.
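Whether the host IDs happen to match the default RStudio user can be checked with a few lines of shell. This is a small sketch that only assumes the standard id command. The helper name matches_rstudio_default is hypothetical:

```shell
# matches_rstudio_default UID GID: succeeds when both IDs equal 1000,
# the default UID/GID of the rstudio user in the rocker images.
matches_rstudio_default() {
  [ "$1" -eq 1000 ] && [ "$2" -eq 1000 ]
}

if matches_rstudio_default "$(id -u)" "$(id -g)"; then
  echo "Host IDs match the default RStudio user (1000:1000)"
else
  echo "Host IDs differ: pass -e USERID=$(id -u) -e GROUPID=$(id -g) to docker run"
fi
```

Even when the IDs match, passing the two options does no harm, which is why including them every time is the safer habit.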

Figure 1: Error shown by RStudio when user namespace is not handled properly and there are bind-mounted volumes. It could happen both with Docker and podman.

User namespace and volumes

Note that the handling of user namespace is only necessary when we are bind-mounting volumes. If we just want an ephemeral RStudio or we are fine storing all data inside the container and not having access to the host machine, we can ignore it and the following run command will work fine both with Docker and podman:

podman run -d -p 3333:8787 --name test \
  -e DISABLE_AUTH=true \
  docker.io/rocker/rstudio:4.2.3

With podman the USERID and GROUPID settings are useless and have no effect on mapping the user namespace, so we have to take a different approach: Instead of using the options provided by rocker in their images (like with Docker), we are going to tell podman directly how to handle the user namespace.

First, we want to map our host user id (for example uid=1002,gid=1003) to the default RStudio user id inside the container (uid=1000,gid=1000). This way, we tell podman to treat every read or write that the RStudio user performs on a volume as if the host user had performed it. This option is set using --userns=keep-id:uid=1000,gid=1000 in the run command.

Always set this option if volumes are bind-mounted

With podman, we need to set the --userns=keep-id:uid=1000,gid=1000 option if we are using bind-mounted volumes, even if our host user has the same uid and gid as the default RStudio user. By default, podman maps the host user with the root user uid=0,gid=0 inside the container. Including this option overrides this behavior and maps the host user to the default RStudio user instead.

But there is a problem: If we set this option alone, we also override the entry user of the container, which is root by default. This strips the user of privileges, preventing RStudio from starting, according to the rocker documentation:

rocker/rstudio etc. requires the root user to execute the /init command to start RStudio Server. So, do not set the --user option if you want to use RStudio Server.

Instead, the UID and GID of the default user for logging into RStudio can be changed at container start by specifying environment variables. Please check the reference page.

Thus, we need to tell podman to make the entry user root again, overriding the entry user set by the --userns option. This is done by using the -u root option.
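One way to observe the resulting mapping on a running container is podman top, which accepts format descriptors such as user (the user inside the container) and huser (the corresponding user on the host). The wrapper below is only a sketch. The function name show_user_mapping is hypothetical:

```shell
# show_user_mapping NAME: list each process of a running container with the
# user it runs as inside the container (user), the matching user on the
# host (huser) and its command line (args).
show_user_mapping() {
  podman top "$1" user huser args
}
```

With the setup described here, the processes started by /init would be expected to show root in the user column, while the RStudio session processes would show rstudio inside the container and your own account in the huser column.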

You are not root inside RStudio

Note that the -u root option only affects the entry user inside the container, not RStudio. Inside RStudio you will be a non-privileged user by default. If for some reason you need to be root inside RStudio (you want to run sudo apt update for example) you have to specify it with the -e ROOT=true option in the run command. If that’s the case, remember to also set a password with the -e PASSWORD=<mypassword> option.

We are almost there, but there is one last obstacle: Volumes will not work as expected because SELinux (Security-Enhanced Linux, included by default in Fedora) prevents the container from accessing our files. To solve this issue we have to dig deep into the podman documentation. There are several options to handle this problem, but quoting the key paragraph:

The option --security-opt label=disable disables SELinux separation for the container. For example if a user wanted to volume mount their entire home directory into a container, they need to disable SELinux separation.

So we have to add the --security-opt label=disable option to our run command to allow the container to read and write files on the bind-mounted volumes, especially if we mount our entire home folder.

SELinux is not disabled

Note that SELinux will not be disabled in the system, only the container with this option in place can ignore its labels. It shouldn’t be a problem when working with these containers locally. Otherwise, check the documentation for alternative solutions. Hint: The use of the flags :U, :z or :Z in volumes could do the trick.
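As an illustration of the volume flags mentioned in the hint: :z asks podman to relabel the directory with a label shared between containers, while :Z applies a private label for a single container. Relabeling an entire home directory is generally discouraged, so the sketch below refuses to do it. The helper name volume_arg is hypothetical:

```shell
# volume_arg HOST_DIR CONT_DIR [z|Z]: print a -v argument, optionally with an
# SELinux relabel flag. Refuses to relabel the whole home directory.
volume_arg() {
  if [ -n "${3:-}" ] && [ "$1" = "$HOME" ]; then
    echo "refusing to relabel $HOME, use --security-opt label=disable instead" >&2
    return 1
  fi
  # ${3:+:$3} expands to ":z" or ":Z" only when a flag was given.
  printf '%s %s:%s%s\n' -v "$1" "$2" "${3:+:$3}"
}

volume_arg "$HOME/verse423" /home/rstudio Z
```

This approach suits a dedicated folder such as ~/verse423 or a single project folder. When the whole home folder is mounted, --security-opt label=disable remains the option described in this post.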

That’s it, we’ve solved it! Now we can have one or more RStudio instances in the flavor we choose, with the R version we want, with flawless access to our home folder, keeping our host system clean and without needing Docker and its background service running.

Installed packages and updates

Note that the installed packages and updates are stored inside the container, not in the volumes. Therefore, they will be lost as soon as you delete the container. However, the workspace, environment, history, RStudio configuration, and any other files you stored in the /home/rstudio volume are persistent.

This is a useful feature because if you encounter issues with package installation, dependencies or updates, and you want to start over while keeping your configuration and files, you can simply delete the container using the podman rm <container> command and then re-run it with the same volume setup. Otherwise, you can keep the container and restart it using podman restart <container> whenever necessary, and your installed packages and updates will remain intact.

Execution flow of the run commands with podman and Docker

flowchart
  A(PODMAN\nrun command) --> B(Host user is:\nuid=1002,gid=1003)
  B --> C(Entry user of the container is forced to be:\nuid=0,gid=0 root\nby the -u root option\nOnly root can start RStudio server)
  C --> D(Root user at entry of the\ncontainer spins up RStudio)
  D --> E(RStudio user is\nuid=1000,gid=1000\nby default)
  E --> F(RStudio user is mapped\nto the host user by\nthe --userns option)
  F --> G(The --security-opt option\nallows the container to\nread/write files properly)

flowchart
  A(DOCKER<br>run command) --> B(Host user is:\nuid=1002,gid=1003)
  B --> C(Entry user of the container is:\nuid=0,gid=0 root\nby default)
  C --> D(RStudio user is changed to\nuid=1002,gid=1003\nby the -e USERID and -e GROUPID\noptions)
  D --> E(Root user at entry of the\ncontainer spins up RStudio)
  E --> F(RStudio user is mapped\nto the host user because\nthey share the same uid/gid)
  F --> G(RStudio user reads/writes\nfiles as if it were the\nhost user properly)

Check everything is in order

I must admit that I had doubts that the options -u root and --userns=keep-id would work together as I expected, but they do!

We can check everything is working as expected following these steps:

Run this ephemeral RStudio container (make sure the ~/test folder exists on the host beforehand):

podman run --rm -it -p 7000:8787 --name test \
  -u root --userns=keep-id:uid=1000,gid=1000 \
  --security-opt label=disable \
  -v ~/test:/home/rstudio \
  -v ~:/home/rstudio/$USER \
  -e DISABLE_AUTH=true \
  docker.io/rocker/rstudio:4.2.3 bash

Once inside, check your id. It should be: uid=0(root) gid=0(root) groups=0(root),1000(rstudio).
Then spin up RStudio with the /init command.
Go to localhost:7000, and check your id from the RStudio terminal. It should be: uid=1000(rstudio) gid=1000(rstudio) groups=1000(rstudio),50(staff).
Write a file from R to check the first volume:
write("Hello from RStudio","/home/rstudio/hello.txt")
Write a file from R to check the second volume:
write("Hello from RStudio 2","/home/rstudio/<your_user>/hello2.txt")
From the host machine, check the ownership of these two files: both should belong to your host user.
Press Ctrl+C in the host terminal to stop RStudio, then type exit to quit the container. Because it was started with --rm, podman removes it automatically on exit.
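The ownership check on the host can be scripted as well. The following sketch assumes GNU stat (the -c option), as shipped with Fedora, and the two file locations that result from the volume mapping above: ~/test/hello.txt and ~/hello2.txt. The helper name owned_by_current_user is hypothetical:

```shell
# owned_by_current_user FILE: succeeds when the owner UID of FILE equals
# the UID of the user running this script.
owned_by_current_user() {
  [ "$(stat -c '%u' "$1")" = "$(id -u)" ]
}

# /home/rstudio maps to ~/test and /home/rstudio/<your_user> maps to ~.
for f in "$HOME/test/hello.txt" "$HOME/hello2.txt"; do
  if [ -e "$f" ]; then
    if owned_by_current_user "$f"; then
      echo "OK: $f belongs to $(id -un)"
    else
      echo "WRONG OWNER: $f"
    fi
  fi
done
```

Files that do not exist are skipped silently, so the loop is safe to run before the test files have been written.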

Let’s see an example

Suppose we are working on three different projects:

Two old projects developed under R 3.6.3 and R 4.0.0 respectively. Let’s call them project A and project B, with files located at ~/projects/projectA and ~/projects/projectB. We don’t need them to have access to our entire home folder, just to their own project folder.
One new project, called project C, which is being developed under the latest version of R (4.2.3) and whose files are located at ~/projects/projectC. We need access to our entire home folder from this project.

First, we need to create the home folders where RStudio will keep its configuration files, one for each project/instance of RStudio:

mkdir -p ~/projects/rstudio_homes/projectA
mkdir -p ~/projects/rstudio_homes/projectB
mkdir -p ~/projects/rstudio_homes/projectC

Then, we can create the containers with their appropriate volumes bind-mounted:

For Project A:

podman create -p 7001:8787 --name projectA \
  -u root --userns=keep-id:uid=1000,gid=1000 \
  --security-opt label=disable \
  -v ~/projects/rstudio_homes/projectA:/home/rstudio \
  -v ~/projects/projectA:/home/rstudio/projectA \
  -e DISABLE_AUTH=true \
  docker.io/rocker/rstudio:3.6.3

For Project B:

podman create -p 7002:8787 --name projectB \
  -u root --userns=keep-id:uid=1000,gid=1000 \
  --security-opt label=disable \
  -v ~/projects/rstudio_homes/projectB:/home/rstudio \
  -v ~/projects/projectB:/home/rstudio/projectB \
  -e DISABLE_AUTH=true \
  docker.io/rocker/rstudio:4.0.0

For Project C:

podman create -p 7003:8787 --name projectC \
  -u root --userns=keep-id:uid=1000,gid=1000 \
  --security-opt label=disable \
  -v ~/projects/rstudio_homes/projectC:/home/rstudio \
  -v ~/projects/projectC:/home/rstudio/projectC \
  -v ~:/home/rstudio/$USER \
  -e DISABLE_AUTH=true \
  docker.io/rocker/rstudio:4.2.3

podman create

You can create the containers without running them right away using podman create instead of podman run. Remove the -d or -ti options from the command when creating containers that way.

Start the appropriate container whenever you need to work on a project. For project A, that would be podman start projectA or podman restart projectA. Then point your web browser to localhost:7001 and you will be there. In this example port 7001 is mapped to project A, port 7002 is mapped to project B and port 7003 is mapped to project C.
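Since the three create commands differ only in name, port, R version and the extra home mount, they can be generated from a small table. The sketch below only prints the commands so they can be reviewed before running them. The function name project_create_cmds and the home marker in the fourth column are hypothetical conventions of this example:

```shell
# Print one podman create command per project. Columns of the table:
# container name, host port, R version, and "home" when the whole home
# folder should also be mounted. Nothing is executed here.
project_create_cmds() {
  while read -r name port rver extra; do
    printf 'podman create -p %s:8787 --name %s' "$port" "$name"
    printf ' -u root --userns=keep-id:uid=1000,gid=1000'
    printf ' --security-opt label=disable'
    printf ' -v %s/projects/rstudio_homes/%s:/home/rstudio' "$HOME" "$name"
    printf ' -v %s/projects/%s:/home/rstudio/%s' "$HOME" "$name" "$name"
    if [ "$extra" = "home" ]; then
      printf ' -v %s:/home/rstudio/%s' "$HOME" "${USER:-$(id -un)}"
    fi
    printf ' -e DISABLE_AUTH=true docker.io/rocker/rstudio:%s\n' "$rver"
  done <<EOF
projectA 7001 3.6.3
projectB 7002 4.0.0
projectC 7003 4.2.3 home
EOF
}

project_create_cmds
```

Adding a fourth project is then a matter of adding one line to the table, with a new name and an unused port.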

Let’s thank you!

It has been a rather lengthy post, but I believe it can help to better understand how podman and Docker work, especially when using the rocker/rstudio images. I hope it’s useful to you. Thanks for reading!