
mod-data-export-spring

Copyright (C) 2021-2023 The Open Library Foundation

This software is distributed under the terms of the Apache License, Version 2.0. See the file LICENSE for more information.

Introduction

API for the Data Export Spring module.

Environment variables:

| Name | Default value | Description |
|------|---------------|-------------|
| DB_HOST | postgres | Postgres hostname |
| DB_PORT | 5432 | Postgres port |
| DB_USERNAME | folio_admin | Postgres username |
| DB_PASSWORD | - | Postgres password |
| DB_DATABASE | okapi_modules | Postgres database name |
| KAFKA_HOST | kafka | Kafka broker hostname |
| KAFKA_PORT | 9092 | Kafka broker port |
| OKAPI_URL | http://okapi:9130 | Okapi URL |
| SYSTEM_USER_NAME | data-export-system-user | Username of the system user |
| SYSTEM_USER_PASSWORD | - | Password of the system user |
| SYSTEM_USER_ENABLED | true | Defines whether the system user must be created at service tenant initialization or used for egress service requests |
| ENV | folio | Logical name of the deployment; must be set if Kafka/Elasticsearch are shared between environments. Only a-z (any case), 0-9, -, _ symbols are allowed |

Additional information

Data Export Spring API provides the following URLs:

| Method | URL | Permissions | Description |
|--------|-----|-------------|-------------|
| GET | /data-export-spring/jobs/ | data-export.job.collection.get | Gets jobs |
| GET | /data-export-spring/jobs/{id} | data-export.job.item.get | Gets a job by the job ID |
| POST | /data-export-spring/jobs/ | data-export.job.item.post | Upserts a job |
| GET | /data-export-spring/configs/ | data-export.config.collection.get | Gets a list of data export configurations |
| PUT | /data-export-spring/configs/{id} | data-export.config.item.put | Changes an export configuration |
| POST | /data-export-spring/configs/ | data-export.config.item.post | Adds an export configuration |
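
For illustration, a request to the jobs collection endpoint might look like the sketch below. This is not taken from the module's code or test suite; the Okapi URL (http://okapi:9130), tenant id (diku), and token are placeholder values, and any standard HTTP client can be used in place of java.net.http.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: listing export jobs through Okapi. Okapi URL, tenant id and token are placeholders.
public class ListJobsSketch {
    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://okapi:9130/data-export-spring/jobs"))
                .header("x-okapi-tenant", "diku")      // target tenant
                .header("x-okapi-token", "<token>")    // user needs data-export.job.collection.get
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode());
        System.out.println(response.body());           // JSON collection of jobs
    }
}
```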

More details can be found on the Data Export Spring wiki page: WIKI Data Export Spring.

Required Permissions

Institutional users should be granted the following permissions in order to use this Data Export Spring API:

  • data-export.config.all
  • data-export.job.all

Deployment information

Before Poppy release

Only ONE instance should be running until the issues described below are fixed:

  1. If more than one instance of the module is running, the same export tasks will be launched by all instances. As a result, information will be duplicated.

More details: Prevents execution of the same scheduled export task from another node

  2. If an instance of the module is restarted by Kubernetes or manually, the module is not registered again because its version has not changed. As a result, the information required for scheduling (Okapi headers, system user, tenant information) will not be stored in memory in a FolioExecutionContext.

More details: Export scheduling doesn't support work in the cluster and after restarting docker container

Short overview

Before running a scheduled task (job), there is a check that the module is registered for the Okapi tenant.

Tenant information is needed to define the DB schema used for storing information about jobs, etc.

The data-export-system-user system user for running scheduled export tasks is created in the post tenant API controller. The password must be set using the SYSTEM_USER_PASSWORD environment variable. Permissions are defined in src/main/resources/permissions/system-user-permissions.csv.
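
As background, a FOLIO system user is an ordinary user account that a module logs in with to obtain a token for its own egress requests. The sketch below shows what such a login against mod-login's /authn/login endpoint typically looks like; it is not the module's internal code, and the Okapi URL and tenant id are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of a FOLIO system-user login (illustrative, not the module's actual implementation).
public class SystemUserLoginSketch {
    public static void main(String[] args) throws Exception {
        // Password comes from the SYSTEM_USER_PASSWORD environment variable.
        String body = """
                {"username": "data-export-system-user", "password": "%s"}
                """.formatted(System.getenv("SYSTEM_USER_PASSWORD"));

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://okapi:9130/authn/login"))   // OKAPI_URL placeholder
                .header("x-okapi-tenant", "diku")                   // placeholder tenant
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // On success, the token is returned in the x-okapi-token response header.
        System.out.println(response.headers().firstValue("x-okapi-token").orElse("<no token>"));
    }
}
```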

The Okapi headers, system user, and tenant information are also stored in memory in a FolioExecutionContext.

Since Poppy release

Scheduling was changed to quartz: Quartz Scheduling Implementation. The issues above were fixed; there is no need to re-enable the module after restarts, and it can be scaled.

Migration to quartz scheduling
  1. Migration is done once, automatically, on module upgrade from a version that does not support quartz to a version that supports quartz (based on the moduleFrom and moduleTo versions in TenantAttributes).
  2. If reloading of existing schedules needs to be forced, it can be done by setting the forceSchedulesReload=true parameter in TenantAttributes in the module enable request (Example); see the sketch after this list.
  3. After the new version supporting quartz is deployed and enabled for tenants, the old module version has to be stopped; otherwise, jobs will be executed by both the old version (Spring scheduler) and the new version (quartz).
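
As an illustration of point 2 above, a module enable request that forces a schedules reload could look like the sketch below. The module versions, tenant id, and the module's base URL are placeholders, and in a real deployment this call is normally driven through Okapi's install API rather than sent to the module directly.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: forcing a reload of existing schedules during module enable.
// Module versions, tenant id and URLs are placeholders.
public class ForceSchedulesReloadSketch {
    public static void main(String[] args) throws Exception {
        String tenantAttributes = """
                {
                  "module_from": "mod-data-export-spring-2.0.0",
                  "module_to": "mod-data-export-spring-3.0.0",
                  "parameters": [
                    { "key": "forceSchedulesReload", "value": "true" }
                  ]
                }
                """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://mod-data-export-spring:8081/_/tenant")) // module tenant API (placeholder address)
                .header("x-okapi-tenant", "diku")
                .header("x-okapi-url", "http://okapi:9130")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(tenantAttributes))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```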
Module disabling for tenant

Deletion of a tenant's schedules is done as part of disabling the module with purge. If disabling with purge is not invoked for mod-data-export-spring, the tenant's scheduled jobs will continue to run in the background even after the tenant itself has been deleted. Details
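
As a sketch, disabling the module for a tenant with purge via Okapi's install API (which accepts a purge=true query parameter) might look like the following; the Okapi URL, tenant id, token, and module version are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: disabling mod-data-export-spring for a tenant with purge, via Okapi's install API.
// Okapi URL, tenant id, token and module version are placeholders.
public class DisableWithPurgeSketch {
    public static void main(String[] args) throws Exception {
        String body = """
                [ { "id": "mod-data-export-spring-3.0.0", "action": "disable" } ]
                """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://okapi:9130/_/proxy/tenants/diku/install?purge=true"))
                .header("x-okapi-token", "<supertenant token>")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```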

Issue tracker

See project MODEXPS at the FOLIO issue tracker.

Other documentation

Other modules are described, with further FOLIO Developer documentation, at dev.folio.org.