🔐 Reduce Docker size by half + improve security #465

timoa · 2024-11-14T22:44:53Z

Details

This PR optimizes the Dockerfiles (frontend and backend), significantly reducing the image size and improving security by running the containers as a non-root user (node).

Image	Current size	New size	Improvement
perplexica-frontend	2.37GB	846MB	-64%
perplexica-backend	1.84GB	640MB	-65%

Here are the fundamental changes and explanations:

Multi-stage build: I used a two-stage build process. The first stage (builder) installs dependencies and builds the application. The second stage only copies the necessary files for running the application.
Removed ARG variables on the backend image: Since Docker Compose or Kubernetes will provide the environment variables, I removed them from the Dockerfile. You can set these variables in your Docker Compose file or Kubernetes deployment configuration.
Optimized copying and building: I first copy only the package.json and yarn.lock files, then install dependencies. This allows better caching of the dependency installation step.
Minimal production image: The final stage only copies the built assets, node_modules, and necessary files from the builder stage, resulting in a much smaller Docker image.
Use a more standard folder for the app: I replaced /home/perplexica with /app and updated the docker-compose.yaml file to use the new path.
Use a non-root user (node): Instead of using the root user by default, I changed the container user to node (the default user for official Node Docker images). I had to set this user's permissions in the Dockerfiles and in docker-compose.yaml to avoid permission issues on the SQLite DB file.
Use an ARG variable for the backend image: by default, the image runs as the node user (for Kubernetes), and it can be built to run as the root user for Docker Compose. Docker Compose volumes are created by the root user, so the SQLite DB is read-only when the container runs as the node user.
Update the Node version to Node v22 for the frontend and backend.
I tried to move all the ENV vars to a shared .env in the root folder by providing the right syntax to the Docker Compose file, but I haven't figured out why the frontend app is still looking for its ./ui/.env file. If you have any insights, I will be happy to fix it. For now, it uses the same .env file for the backend and frontend (I have updated the README with additional instructions).
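
To make the list above more concrete, here is a minimal sketch of what a two-stage backend Dockerfile along these lines could look like. It is not the Dockerfile from this PR: the base image tag, the `dist` output folder, the `build` script, and the `dist/app.js` entry point are all assumptions for illustration.

```dockerfile
# Sketch only: the image tag, paths, build script, and start command are
# assumptions, not the Dockerfile from this PR.

# Stage 1 (builder): install dependencies and build the application.
FROM node:22-alpine AS builder
WORKDIR /app

# Copy only the manifests first so the dependency layer stays cached
# until package.json or yarn.lock changes.
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile

# Copy the remaining sources and build (assumes a "build" script exists).
COPY . .
RUN yarn build

# Stage 2 (runtime): copy only what is needed to run the application.
FROM node:22-alpine
WORKDIR /app

# --chown makes the unprivileged "node" user the owner of the copied files.
COPY --from=builder --chown=node:node /app/package.json ./
COPY --from=builder --chown=node:node /app/node_modules ./node_modules
COPY --from=builder --chown=node:node /app/dist ./dist

# Drop root privileges; the "node" user ships with the official Node images.
USER node

CMD ["node", "dist/app.js"]
```

Because the second stage starts from a fresh base image, everything that only the build needs (sources, caches, intermediate files) stays in the builder stage and never reaches the final image.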

Important

The downside of running the backend Docker image as a non-root user is that Docker Compose mounts the volume as root, so the node user only has read-only access to the DB.
Docker Compose must therefore always be run with the --build flag, which forces a rebuild of the Docker image with the root user.
By default, the Docker images will be published to Docker Hub with the node user.
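
As a rough illustration of how one Dockerfile could serve both cases, the user can be chosen with a build argument. This is a sketch under assumptions: `APP_USER` is a hypothetical argument name, and the image tag, paths, and start command are placeholders rather than the ones used in this PR.

```dockerfile
# Sketch only: APP_USER is a hypothetical argument name; the image tag,
# paths, and start command are assumptions, not taken from this PR.
FROM node:22-alpine

# Defaults to the unprivileged "node" user (the case for published images).
ARG APP_USER=node

WORKDIR /app
COPY . .

# Switch to the selected user for everything that follows, including CMD.
USER ${APP_USER}

CMD ["node", "dist/app.js"]
```

With this layout, `docker build --build-arg APP_USER=root .` would produce a root image for local Docker Compose use, and the same value could be passed under the service's `build.args` key in a Compose file, while a plain build keeps the node user.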

We will not have this issue with the SQLite DB permissions on Kubernetes because the volumes are managed differently.

Moving to a Postgres DB will fix this issue and help scale the project later. Docker Compose will be able to launch a Postgres image, and the same applies on Kubernetes, with a dedicated pod or a managed database like AWS RDS.

You can keep the root user for the backend image if you think that is too much for simple use with the Docker Compose file, but using the non-root user (node) will be more secure.

timoa · 2024-11-14T22:48:02Z

cc: @rrfaria: I closed the previous PR to provide a cleaner branch.

ItzCrazyKns · 2024-11-16T09:23:56Z

You mentioned that you fixed the issue with NEXT_PUBLIC_ENV vars. It cannot be fixed: since I now provide prebuilt images, the public env vars are hardcoded in the code generated by Next.js. It's not feasible to change them, and changing them via a script is not a practical approach. Your PR seems good for users who want to build images locally, but otherwise, even if I link the env variable, the vars would still be hardcoded.
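
For readers following along, the build-time behaviour described above can be sketched as a frontend build stage. This is only an illustration and not the project's Dockerfile: the image tag, paths, and the `build` script are assumptions. Next.js inlines `NEXT_PUBLIC_*` values into the client bundle while building, so the value has to exist when the build command runs.

```dockerfile
# Sketch only: the image tag, paths, and build script are assumptions.
FROM node:22-alpine AS builder
WORKDIR /app

COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile
COPY . .

# The value is supplied at image build time and exposed to the build step.
ARG NEXT_PUBLIC_API_URL
ENV NEXT_PUBLIC_API_URL=${NEXT_PUBLIC_API_URL}

# The build inlines NEXT_PUBLIC_* values into the generated client code,
# so setting the variable later, when the container starts, does not
# change the code that is shipped to the browser.
RUN yarn build
```

A locally built image can therefore pick its own value with `--build-arg NEXT_PUBLIC_API_URL=...`, whereas a prebuilt image carries whatever value was present when it was built.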

timoa · 2024-11-16T18:20:06Z

> You mentioned that you fixed the issue with NEXT_PUBLIC_ENV vars. It cannot be fixed: since I now provide prebuilt images, the public env vars are hardcoded in the code generated by Next.js. It's not feasible to change them, and changing them via a script is not a practical approach. Your PR seems good for users who want to build images locally, but otherwise, even if I link the env variable, the vars would still be hardcoded.

There are a lot of projects on GitHub that use NEXT_PUBLIC_API_URL, but maybe you have a specific use case.
I'm not a Next.js expert, but since you're building a Docker image, maybe you don't need to build the static version of the frontend? It would run on the Node engine and have access to the ENV vars.
In that case, it would use the .env file provided by Docker Compose, as in my PR.
It would also work when deployed on Kubernetes, where it would use the ENV vars provided by the K8s pod.

My last Helm chart was for the TypeBot project (PR in progress), a Next.js app with a backend and frontend.
The Dockerfile is a bit complex, but it gets the ENV vars from the .env file, and it works well on Kubernetes using public images. I will try to look at the frontend's build process and see if it can work for Perplexica.

Froggy232 · 2024-11-25T03:51:00Z

Hi there,
First, thanks for your messages; it seems there is some hope for this to work!
Do you have any news on this? I'm trying to run Perplexica in a Podman pod, and I think I have a problem that this would solve.
Thanks a lot. Of course, I would totally understand if you haven't had the time or haven't found a solution.
Have a nice day,
Best regards

timoa added 6 commits November 14, 2024 23:35

fix(docker): reduce Docker size + improve security

652e665

fix(docker): update the Frontend docker to with the node user perms

f9f7dc9

fix(docker): fix the Docker copy commands + change the source image to node:18-alpine

0c5280e

fix(docker): fix missing ENV variables & files

68b649c

fix(docker): fix Docker compose to use .env files

eda2c39

fix(docker): fix the permissions issue when running Docker Compose

2351c5c

Docker Compose mount the volumes as root by default and the node user can't access the SQLite DB (read-only)

timoa marked this pull request as ready for review November 14, 2024 22:48

timoa mentioned this pull request Nov 14, 2024

🔐 Reduce Docker size by half + improve security #434

Closed

rrfaria approved these changes Nov 15, 2024

timoa mentioned this pull request Nov 15, 2024

NEXT_PUBLIC_API_URL not used when using pre-built Docker images #460

Closed

fix(docker): fix the env.example files to be in sync with the `docs/installation/NETWORKING.md` file

003fb68
