A Cloud / Local production ready Archipelago Deployment using Docker and soon Kubernetes.
Running Archipelago Commons on a live public instance using SSL with Blob/Object Storage backend
- Cloud-based deployment, e.g. AWS/Azure under Linux
- Self-managed servers running Linux
- x86/AMD86 or ARM64/v8 CPU architectures
- Running your own local/development Archipelago. For that we suggest using https://github.com/esmero/archipelago-deployment
- 4 Gbytes of RAM (e.g AWS EC2 t3.medium) 2 CPUs, Single SSD Drive of 100 Gbytes
- 8 Gbytes of RAM (AWS EC2 t3.medium) 2 CPUs, Single SSD Drive of 100 Gbytes, optional: one magnetic Drive of 500 Gbytes for Caches/Temp files/Backups.
- 16 Gbytes of RAM (AWS EC2 m6g.xlarge - Graviton) 4 CPUs, Single SSD Drive of 200 Gbytes, optional: one magnetic Drive of 1TB for Caches/Temp files/Backups.
- Ubuntu 20.04 /Amazon Linux 2/Debian 10.9 / AlmaLinux (Centos replacement) matching your CPU architecture (of course)
- Most recent
Docker
running as a service anddocker-compose
- Basic Unix/Linux terminal skills and a root/sudo account
Docker
basics knowledge and how to manage packages in your System- Knowledge of how to manage AWS/Azure or any other cloud-based provider for Computing Instances and S3 compatible Object Storage system and the associated permissions credentials to do so.
- To be patient. Please read everything. Do not skip steps. Do not only copy and paste commands. Explanations give context and might help you troubleshoot issues.
Basically this guide is meant for humans with basic to medium DevOps
background or humans with patience that are willing to troubleshoot, ask, and try again when that background is not (yet) enough. And we are here to help.
Deploy your base system
Make sure your Firewall/AWS Security group has these ports open for everyone to access
- 443 (NGINX SSL)
- 80 (NGINX HTTP) And protected/modally open for your own development/testing/administration
- 8183 (Cantaloupe)
- 8983 (Solr)
- 6400 (NLP64)
- 9001 (Minio)
- 22 (so you can ssh into your machine)
Setup your system using your favorite package manager with
- Docker
- git
- htop
- tree
- docker-compose
e.g. for Amazon Linux 2 (x86/amd64) these steps are tested:
sudo yum update -y
sudo amazon-linux-extras install -y docker
sudo service docker start
sudo usermod -a -G docker ec2-user
sudo chkconfig docker on
sudo systemctl enable docker
sudo yum install -y git htop tree
sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
sudo reboot
Reboot is needed to allow Docker to take full control over your OS resources.
In your location of choice clone this repo
git clone https://github.com/esmero/archipelago-deployment-live
cd archipelago-deployment-live
git checkout 1.0.0
Setup your deployment enviromental variables by copying the template
cp deploy/ec2-docker/.env.template deploy/ec2-docker/.env
and editing it
nano deploy/ec2-docker/.env
The content of that file would be similar to this.
ARCHIPELAGO_ROOT=/home/ec2-user/archipelago-deployment-live
ARCHIPELAGO_EMAIL=[email protected]
ARCHIPELAGO_DOMAIN=your.domain.org
MINIO_ACCESS_KEY=THE_S3_AZURE_OR_LOCAL_MINIO_KEY
MINIO_SECRET_KEY=THE_S3_AZURE_OR_LOCAL_MINIO_SECRET
MYSQL_ROOT_PASSWORD=YOUR_MYSQL_PASSWORD_FOR_ARCHIPELAGO
MINIO_BUCKET_MEDIA=THE_NAME_OF_YOUR_S3_BUCKET_FOR_PERSISTEN_STORAGE
MINIO_FOLDER_PREFIX_MEDIA=media/
MINIO_BUCKET_CACHE=THE_NAME_OF_YOUR_S3_BUCKET_FOR_IIIF_STORAGE
MINIO_FOLDER_PREFIX_CACHE=iiifcache/
REDIS_PASSWORD=YOUR_REDIS_PASSWORD
What does each key mean?
ARCHIPELAGO_ROOT
: the absolute path to yourarchipelago-deployment-live
git repo in your host machine.ARCHIPELAGO_EMAIL
: a valid email, will be used to register your SSL Certificate via Certbot.ARCHIPELAGO_DOMAIN
: a valid domain name for your repository. This domain will be also used to request your SSL Certificate via Certbot.MINIO_ACCESS_KEY
: If you are running a Cloud Service backed S3/Azure Storage this needs to be generated there. The user/IAM owner of this ACCESS KEY needs to have access to read/write the bucket you will configure in this same.env
. If running localmin.io
whatever you set will be used.MINIO_SECRET_KEY
: If you are running a Cloud Service backed S3/Azure Storage this needs to generated there. The user/IAM owner of the matching SECRET_KEY needs to have access to read/write the bucket you will configure in this same.env
file. If running localmin.io
whatever you set will be used.MYSQL_ROOT_PASSWORD
: The MYSQL 8 or Mariadb 15 password. This password will be used later also during Drupal deployment viadrush
MINIO_BUCKET_MEDIA
: The name of your Persistant Storage Bucket. If using mini.io local we recommend keeping it simple, e.g.archipelago
.MINIO_FOLDER_PREFIX_MEDIA
: Thefolder
(a prefix really) where your DO Storage and File storage will go inside theMINIO_BUCKET_MEDIA
Bucket.media/
is a fine name for this one and common in archipelago deployments. IMPORTANT: Always terminate these with a/
.MINIO_BUCKET_CACHE
: The name of your IIIF Cache storage Bucket. May be the same asMINIO_BUCKET_MEDIA
. If different make sure your yourMINIO_ACCESS_KEY
and/or IAM role ACL have permission to read write to this one too.MINIO_FOLDER_PREFIX_CACHE
: Thefolder
(a prefix really) where Cantaloupe will/can write itsiiif
caches.iiifcache/
is a lovely name we use a lot. IMPORTANT: Always terminate these with a/
.REDIS_PASSWORD
: Password for your REDIS (Drupal Cache/Queue storage) if you decide to enable the Drupal REDIS module.
IMPORTANT NOTE
: For AWS EC2. If you selected an IAM role
for your server when setting it up/deploying it, min.io
will use the AWS EC2-backed internal API to request access to your S3. This means the ROLE itself needs to have read/write access (ACL) to the given Bucket(s) and your key/secrets won't be able to override that. Please do not ignore this note. It will save you a LOT of frustration and coffee. You can also run an EC2 instace without a given IAM and in that case just the ACCESS_KEY/SECRET will matter.
Now that you know, you also know that these values should not be shared and this .env
file should not be committed/kept in version control. Please be careful.
docker-compose
will read this .env
and start all services for you based on its content.
Once you have modified this you are ready for your first big decision.
This means you will use the docker-compose-aws-s3.yml
. Do the following:
cp deploy/ec2-docker/docker-compose-aws-s3.yml deploy/ec2-docker/docker-compose.yml
Running a fully qualified domain you wish a valid/signed certificate for ARM64/Apple M1 Architecture?
This means you will use the docker-compose-aws-s3-arm64.yml
. Do the following:
cp deploy/ec2-docker/docker-compose-aws-s3-arm64.yml deploy/ec2-docker/docker-compose.yml
If you have more than a single domain you may create a text file inside
config_storage/nginxconfig/certbot_extra_domains/your.domain.org
and write for each subdomain there an entry/line.
Only if you are not running a fully qualified domain you wish a valid/signed. We really DO not recommend this route. IF you plan on using this deployment for local testing or running on non SSL please go for https://github.com/esmero/archipelago-deployment which delivers the same experience in less than 20 minutes deployment time.
Generate a self signed Cert
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout data_storage/selfcert/private/nginx.key -out data_storage/selfcert/certs/nginx.crt
sudo openssl dhparam -out data_storage/selfcert/dhparam.pem 4096
cp deploy/ec2-docker/docker-compose-selfsigned.yml deploy/ec2-docker/docker-compose.yml
Note: Self signed docker-compose.yml file is setup to use min.io with local storage
volumes:
- ${ARCHIPELAGO_ROOT}/data_storage/minio-data:/data:cached
This folder will be created by min.io. If you are using a secondary Drive (e.g. magnetic) you can modify your deploy/ec2-docker/docker-compose.yml
to use a folder there, e.g.
volumes:
- /persistentinotherdrive/data_storage/minio-data:/data:cached
Make sure your logged in user can read/write to it.
NOTE: If you want to use AWS S3 storage for the self signed version replace the minio Service yaml block with this Service Block in your new deploy/ec2-docker/docker-compose.yml
. You can mix and match services and even remove all :cached
statements for improved R/W volumen performance.
sudo chown 8183:8183 config_storage/iiifconfig/cantaloupe.properties
sudo chown -R 8183:8183 data_storage/iiifcache
sudo chown -R 8183:8183 data_storage/iiiftmp
sudo chown -R 8983:8983 data_storage/solrcore
Time to spin our docker containers for the first time. We will start all without going into background so log/error checking is easier, especially if you have selected a Valid/Signed Cert choice and also want to be sure S3 keys/access are working.
cd deploy/ec2-docker
docker-compose up
You will see a lot of things happening. Check for errors/problems/clear alerts and give all a minute or so to start. Ok, let's assume your setup managed to request a valid signed SSL cert, you will see a nice message!
- Congratulations! Your certificate and chain have been saved at:XXXXX
Your certificate will expire on 20XX-XX-XX. To obtain a new or
tweaked version of this certificate in the future, simply run
certbot again. To non-interactively renew *all* of your
certificates, run "certbot renew"
Archipelago will do that for you whenever it's about to expire so no need to deal with this manually, even when docker-compose
restarts.
Now press CTRL+C. docker-compose
will shutdown gracefully. Good!
Copy the shipped default composer.default.json to composer.json and composer.default.lock to composer.lock (ONLY if you are installing from scratch):
cp ../../drupal/composer.default.json ../../drupal/composer.json
cp ../../drupal/composer.default.lock ../../drupal/composer.lock
Start Docker again
docker-compose up -d
Wait a few seconds and run:
docker exec -ti esmero-php bash -c "chown -R www-data:www-data private"
docker exec -ti esmero-php bash -c "chown -R www-data:www-data web/sites"
docker exec -ti esmero-php bash -c "composer install"
Composer install will take a little while and bring all your PHP libraries.
Once done, execute our setup script that will prepare your Drupal settings.php
and bring some of the .env
enviromental variables to the Drupal environment.
docker exec -ti esmero-php bash -c 'scripts/archipelago/setup.sh'
And now you can deploy Drupal!
IMPORTANT: Make sure you replace in the following command inside root:MYSQL_ROOT_PASSWORD
the MYSQL_ROOT_PASSWORD
string with the value you used/assigned in your .env
file for MYSQL_ROOT_PASSWORD
. And replace ADMIN_PASSWORD
with a password that is safe and you won't forget! That passwords is for your Drupal super user (uid:0).
docker exec -ti -u www-data esmero-php bash -c "cd web;../vendor/bin/drush -y si --verbose --existing-config --db-url=mysql://root:MYSQL_ROOT_PASSWORD@esmero-db/drupal --account-name=admin --account-pass=ADMIN_PASSWORD -r=/var/www/html/web --sites-subdir=default --notify=false;drush cr;chown -R www-data:www-data sites;"
After installation is done (may take a few) you can install initial users and assign them roles. Copy each line separately. A missing permission will not let you ingest the initial Metadata Displays and AMI set.
docker exec -ti esmero-php bash -c 'drush ucrt demo --password="demo"; drush urol metadata_pro "demo"'
docker exec -ti esmero-php bash -c 'drush ucrt jsonapi --password="jsonapi"; drush urol metadata_api "jsonapi"'
docker exec -ti esmero-php bash -c 'drush urol administrator "admin"'
Before ingesting the base content we need to make sure we can access your JSON-API
on for your new domain. That means we need to change internal urls (https://esmero-web
) to the new valid SSL driven ones. This is easy:
On your host machine (no need to docker exec
these ones), replace first in the following command your.domain.org
with the domain you setup in your .env
file. Go to (cd into) your base git clone folder (Important: YOUR BASE CLONE FOLDER) and then run
sed -i 's/http:\/\/esmero-web/https:\/\/your.domain.org/g' drupal/scripts/archipelago/deploy.sh
sed -i 's/http:\/\/esmero-web/https:\/\/your.domain.org/g' drupal/scripts/archipelago/update_deployed.sh
Now your deploy.sh
and update_deployed.sh
are update and ready. Let's ingest some Twig Templates, an AMI Set, menus and a Blocks.
docker exec -ti esmero-php bash -c 'scripts/archipelago/deploy.sh'
NOTE: update_deployed.sh
is not needed when deploying for the first time and totally discouraged on a customized Archipelago.
If you make modifications to your Twig templates
, that command will replace the ones shipped by us with fresh copies overwriting all your modifications. Only run to restore larger errors or when needing to update non-customized ones with newer versions.
By default archipelago ships with a public facing and an internal facing IIIF Server URLs configured. These urls are used by a number of IIIF enabled viewers and need to be changed to reflect your new reality (a real Domain name and a proxied path!). These settings belong to the strawberryfield/format_strawberryfield
module.
First check your current settings:
docker exec -ti esmero-php bash -c "drush config-get format_strawberryfield.iiif_settings"
You will see the following:
pub_server_url: 'http://localhost:8183/iiif/2'
int_server_url: 'http://esmero-cantaloupe:8182/iiif/2'
Let's modify pub_server_url
. Replace in the following command your.domain.org
with the domain you defined in your .env
file.
NOTE: We are passing the -y
flag to drush
avoid that way having to answer "yes".
docker exec -ti esmero-php bash -c "drush -y config-set format_strawberryfield.iiif_settings pub_server_url https://your.domain.org/cantaloupe/iiif/2"
Finally Done! Now you can log into your new Archipelago using https
and start exploring. Thank you for following this guide!
This applies to AWS m6g
and t3g
Instances and is documented inline in this guide. Please open an ISSUE in this repository if you run into any problems.
Please review https://github.com/esmero/archipelago-deployment-live/blob/1.0.0/deploy/ec2-docker/docker-compose-aws-s3-arm64.yml for more info.
Run
uname -m
- For an
x86(64 bit)
processor system output will bex86_64
- For an
ARM(64 bit)
processor system output will beaarch64
This software is a Metropolitan New York Library Council Open-Source initiative and part of the Archipelago Commons project.