Running Puppeteer in a Docker container on Raspberry Pi
Puppeteer is a Node.js module that allows interacting with a (headless) web browser programmatically. This is extremely useful for automating website testing, generating screenshots and PDFs of web pages or programmatic form submission.
Docker offers numerous benefits including a standardized environment, isolation, and rapid deployment to name a few. These benefits might be why you’d want to run Puppeteer inside a Docker container. Unfortunately, doing so on Raspberry Pi is not straightforward. There are a few issues that make it harder than usual. Luckily, they are all solvable. Let’s take a look.
Problem 1: Chromium included in Puppeteer does not work on Raspberry Pi
By default, Puppeteer downloads a matching version of Chromium that is guaranteed to work out of the box on supported platforms. Unfortunately, there is currently no ARM build of that Chromium that works on Raspberry Pi, so running stock Puppeteer on Raspberry Pi ends in a crash. This can be solved by installing Chromium with apt-get install chromium -y and telling Puppeteer to use it by passing executablePath: '/usr/bin/chromium' to the launch() function as follows:
const browser = await puppeteer.launch({
  executablePath: '/usr/bin/chromium',
  args: []
});
With this change, Puppeteer no longer needs to download Chromium because it uses the version installed with apt-get, so it makes sense to skip that step by setting the corresponding environment variable in the Dockerfile:
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
This will significantly reduce the time needed to install node modules.
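If the same application also has to run on machines where Chromium is installed elsewhere, the path can be resolved at startup instead of being hard-coded. The following is a minimal sketch and not part of the original app: resolveChromiumPath, buildLaunchOptions and the CHROMIUM_PATH variable are hypothetical names, and /usr/bin/chromium-browser is an assumed alternative location.

```javascript
const fs = require('fs');

// Hypothetical helper: returns the first usable Chromium path, or null.
// `env` and `exists` are parameters so the logic can be exercised without
// touching the real environment or file system.
function resolveChromiumPath(env = process.env, exists = fs.existsSync) {
  const candidates = [
    env.CHROMIUM_PATH,            // hypothetical override, not a Puppeteer setting
    '/usr/bin/chromium',          // path used in this post
    '/usr/bin/chromium-browser',  // assumption: location on some other distributions
  ];
  for (const candidate of candidates) {
    if (candidate && exists(candidate)) {
      return candidate;
    }
  }
  return null;
}

// Builds the options object that would be passed to puppeteer.launch().
function buildLaunchOptions(executablePath, args = []) {
  if (!executablePath) {
    throw new Error('Chromium not found; install it with apt-get install chromium -y');
  }
  return { executablePath, args };
}
```

In the app this could be used as puppeteer.launch(buildLaunchOptions(resolveChromiumPath())), which fails early with a readable message when Chromium is missing.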
Problem 2: The base image installs an old version of Chromium
Some node Docker images are based on old distributions that contain only older versions of Chromium. Most notably, the node:16 image is based on buster. If you use this base image, the only version of Chromium you will be able to install with apt-get is 90.0.4430.212-1. Unfortunately, this version doesn’t work in a Docker container: it just hangs indefinitely. Moving to the node:16-bullseye base image allows installing a much newer version of Chromium (108.0.5359.124), where this is no longer a problem.
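Because the old version hangs rather than failing with an error, it can help to check the installed version before launching. Below is a rough sketch with hypothetical helper names. It assumes the version text contains a four-part version number, and it uses 108 as the minimum only because that is the version reported as working here; versions between 90 and 108 are untested.

```javascript
// Extracts the major version from text such as the output of
// `chromium --version`. Returns null when no four-part version is found.
function parseChromiumMajor(versionText) {
  const match = /(\d+)\.\d+\.\d+\.\d+/.exec(versionText);
  return match ? Number(match[1]) : null;
}

// Assumption: 108 is chosen because it is the version this post found to work.
const MIN_KNOWN_GOOD_MAJOR = 108;

// True only when a version was found and it is at least `minMajor`.
function isKnownGoodChromium(versionText, minMajor = MIN_KNOWN_GOOD_MAJOR) {
  const major = parseChromiumMajor(versionText);
  return major !== null && major >= minMajor;
}
```

The version text could come from child_process.execFileSync('/usr/bin/chromium', ['--version'], { encoding: 'utf8' }), so the app can exit with a clear message instead of hanging.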
Problem 3: Puppeteer crashes on launch
Puppeteer will not launch in a Docker container without additional configuration. Chromium is not able to provide sandboxing when running inside a container, so it needs to be launched with at least the --no-sandbox argument. Otherwise, it will crash with the following error message:
Failed to move to new namespace: PID namespaces supported, Network namespace supported, but failed: errno = Operation not permitted
The sandbox is a security feature, and running without it is generally discouraged. Unfortunately, running without a sandbox currently appears to be the only way to run Puppeteer inside a Docker container. In the past, the --no-sandbox option required running Puppeteer as root, which only increased the risk. Luckily, this no longer seems to be the case: it is now possible to launch Puppeteer with the --no-sandbox option as a non-privileged user.
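To avoid accidentally combining --no-sandbox with root, the app can check its own user id before launching. This is only a sketch: assertSafeSandboxConfig is a hypothetical helper, and it relies on process.getuid(), which Node.js provides on POSIX systems only.

```javascript
// Hypothetical guard: throws when --no-sandbox would be combined with root.
// `uid` defaults to the current user id, or -1 where getuid is unavailable.
function assertSafeSandboxConfig(args, uid = process.getuid ? process.getuid() : -1) {
  const sandboxDisabled = args.includes('--no-sandbox');
  if (sandboxDisabled && uid === 0) {
    throw new Error('Refusing to launch Chromium with --no-sandbox as root');
  }
  return args; // returned unchanged so the call can be used inline
}
```

It could be used inline when building the launch options, for example args: assertSafeSandboxConfig(['--no-sandbox']).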
There are a few more options that might be worth exploring if launching Puppeteer inside a container fails:
- --disable-gpu: disables GPU hardware acceleration (which is usually not available when running in Docker)
- --disable-dev-shm-usage: prevents Chromium from using shared memory (/dev/shm/)
- --disable-setuid-sandbox: disables the setuid sandbox
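One way to explore the flags listed above is to try progressively larger sets until a launch succeeds. The sketch below is one possible approach, not a recommendation: FLAG_LADDER, its ordering, and launchWithFallback are assumptions made for illustration, and the launch function is injected so the retry logic does not depend on Puppeteer itself.

```javascript
// Flag sets to try in order. The ordering is an assumption of this sketch.
const FLAG_LADDER = [
  ['--no-sandbox'],
  ['--no-sandbox', '--disable-gpu'],
  ['--no-sandbox', '--disable-gpu', '--disable-dev-shm-usage'],
  ['--no-sandbox', '--disable-gpu', '--disable-dev-shm-usage', '--disable-setuid-sandbox'],
];

// `launch` is any function shaped like puppeteer.launch(options).
// Resolves with the browser and the flag set that worked; rejects with the
// last launch error when every flag set fails.
async function launchWithFallback(launch, executablePath, ladder = FLAG_LADDER) {
  let lastError;
  for (const args of ladder) {
    try {
      const browser = await launch({ executablePath, args });
      return { browser, args };
    } catch (error) {
      lastError = error; // remember the failure and try the next flag set
    }
  }
  throw lastError ?? new Error('No flag sets to try');
}
```

With Puppeteer this could be called as launchWithFallback((options) => puppeteer.launch(options), '/usr/bin/chromium'). Logging the returned args shows which flags the environment actually needs.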
Putting everything together
The information provided above should be all that is needed to build a Docker image for a Node.js app that uses Puppeteer and runs on Raspberry Pi. Below is an example Dockerfile for such an image. The comments show where each of the solutions discussed above is applied.
# Ensure an up-to-date version of Chromium
# can be installed (solves Problem 2)
FROM node:16-bullseye
# Install a working version of Chromium (solves Problem 1)
RUN apt-get update && apt-get install -y chromium
ENV HOME=/home/app-user
RUN useradd -m -d $HOME -s /bin/bash app-user
RUN mkdir -p $HOME/app
WORKDIR $HOME/app
COPY package*.json ./
COPY index.js ./
RUN chown -R app-user:app-user $HOME
# Run the container as a non-privileged user (discussed in Problem 3)
USER app-user
# Make `npm install` faster by skipping
# downloading default Chromium (discussed in Problem 1)
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
RUN npm install
CMD [ "node", "index.js" ]
Because the application also requires a couple of modifications to how the headless browser is launched, here is a small example application illustrating these changes with comments:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    // use Chromium installed with `apt` (solves Problem 1)
    executablePath: '/usr/bin/chromium',
    args: [
      // run without sandbox (solves Problem 3)
      '--no-sandbox',
      // other launch flags (discussed in Problem 3)
      // '--disable-gpu',
      // '--disable-dev-shm-usage',
      // '--disable-setuid-sandbox',
    ]
  });
  const page = await browser.newPage();
  await page.goto('https://www.google.com/', {waitUntil: 'networkidle2'});
  // look up the Doodle element; `e` is null when there is none
  const e = await page.$('div#hplogo');
  const p = await e?.getProperty('title');
  if (p) {
    console.log(`Today's doodle: ${await p.jsonValue()}`);
  } else {
    console.log('No Doodle today :(');
  }
  // wait for the browser to shut down before the script ends
  await browser.close();
})();
Finally, when run in a container, this application prints the title of the current Google Doodle, or "No Doodle today :(" if there is none.
Both the application and the Dockerfile are also available on GitHub.
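If page.goto() or another step throws, the example application never reaches browser.close(). A small wrapper can make the cleanup unconditional. The sketch below uses a hypothetical withBrowser helper; the launch function is passed in, so it works with puppeteer.launch() or with a test double.

```javascript
// Runs `task(browser)` and always closes the browser afterwards,
// even when the task throws. `launch` is injected, for example
// () => puppeteer.launch(options).
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}
```

The body of the example app would then move into the task callback, for example withBrowser(() => puppeteer.launch(options), async (browser) => { const page = await browser.newPage(); /* ... */ }).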
Conclusion
Running Puppeteer inside a Docker container is tricky, especially on Raspberry Pi. This post discussed the key obstacles and provided solutions to overcome them. In addition, a demo containerized app was included to illustrate the main points.