Releases: CAVaccineInventory/vaccine-feed-ingest-schema
v1.2.5
v1.2.4
v1.2.3
Add optional `content_hash` str field to `ImportSourceLocation`.
This field is used to round-trip an MD5 hash of the content of the `import_json` (sorted keys without the `source` field) to check if the source location has changed or not.
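
As a rough illustration of the hashing described above, the following sketch computes an MD5 digest over a JSON document with sorted keys and the top-level `source` field removed. The helper name `compute_content_hash` and the exact serialization are assumptions for illustration; the real pipeline may serialize differently.

```python
import hashlib
import json


def compute_content_hash(import_json: dict) -> str:
    """Hypothetical helper: MD5 of import_json with sorted keys, minus `source`."""
    # Drop the top-level `source` field so fetch metadata does not affect the hash.
    content = {key: value for key, value in import_json.items() if key != "source"}
    # Sorting keys makes the serialization independent of dict ordering.
    serialized = json.dumps(content, sort_keys=True)
    return hashlib.md5(serialized.encode("utf-8")).hexdigest()
```

Two records that differ only in key order or in their `source` field hash to the same value, so an unchanged hash suggests the location content has not changed.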
v1.2.2
Release more `VaccineProvider` enum values:
LITTLE_CLINIC
MARIANOS
MARKET_STREET
MEDICAP
MEIJER
OSCO
PICK_N_SAVE
PRICE_CHOPPER
PUBLIX
QFC
RALEYS
SAV_ON
SHOP_RITE
SMITHS
SOUTH_EASTERN
STOP_AND_SHOP
THRIFTY
TOM_THUMB
WEIS
WINN_DIXIE
v1.2.1
v1.2.0
- Place length limits on string fields:
  - Note fields are limited to 2046 characters.
  - Enum fields are limited to 64 characters.
  - ID fields are limited to 128 characters.
  - All other fields are limited to 256 characters.
- The phone number validation has been relaxed to accept any 9-digit-style phone number.
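
The length limits above can be pictured with a small standalone check. This is only a sketch: the `kind` labels and the `check_length` helper are hypothetical, and the schema itself enforces these limits through its own field definitions.

```python
# Limits taken from the release notes; the "kind" labels are hypothetical.
MAX_LENGTHS = {"note": 2046, "enum": 64, "id": 128}
DEFAULT_MAX_LENGTH = 256  # applies to all other string fields


def check_length(value: str, kind: str = "other") -> str:
    """Return the value unchanged, or raise ValueError if it exceeds its limit."""
    limit = MAX_LENGTHS.get(kind, DEFAULT_MAX_LENGTH)
    if len(value) > limit:
        raise ValueError(f"{kind} field is {len(value)} characters; limit is {limit}")
    return value
```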
v1.1.0
Loosen the `Location` id restrictions to allow UUID style ids with `-` in them.
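
A loosened ID check might look like the sketch below. The pattern is an assumption, not the schema's actual regular expression: it allows dashes only in the part after the colon and keeps the source prefix to lowercase alphanumerics and underscores.

```python
import re

# Assumed pattern: "<source>:<location part>", with dashes allowed after the colon.
LOCATION_ID_RE = re.compile(r"[a-z0-9_]+:[a-z0-9_-]+")


def is_valid_location_id(location_id: str) -> bool:
    """Return True if the whole ID matches the assumed pattern."""
    return LOCATION_ID_RE.fullmatch(location_id) is not None
```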
v1.0.0
The 1.0.0 release makes big changes to the schema to make it much easier to extract clean, useful data from our `fetch`/`parse`/`normalize` stages.
High-level goal
The downstream pipelines that consume our data have to adapt to the wide variety of scraped data formats. To help us along, we are going to impose more structure on the schema formats so that the consumers of our data have to deal with fewer edge cases. In most cases, these changes should also make it easier to write correct ingestion stages.
Specific changes
- New pydantic enums to reduce ambiguity:
  - `State`, for describing US states and territories, with USPS two-character abbreviations.
  - `ContactType`, with `"general"` and `"booking"` options.
  - `DayOfWeek`, for the days Monday - Sunday and "public holidays".
  - `VaccineType`, with options for Pfizer/BioNTech, Moderna, Johnson & Johnson, and Oxford/AstraZeneca vaccines.
  - `VaccineSupply`, with options for vaccine stock status.
  - `WheelchairAccessLevel`, with various options for describing the wheelchair accessibility of the location.
  - `VaccineProvider`, for common parent organizations such as retail pharmacy chains.
  - `LocationAuthority`, for other authorities that identify locations, such as Google Places.
- Format validation for certain fields:
  - `Address`:
    - `zip` must be a ZIP or ZIP+4 code, if present.
    - `state` must be a `State`, if present.
  - `LatLng`:
    - `latitude` must be between -90 and 90, inclusive, if present.
    - `longitude` must be between -180 and 180, inclusive, if present.
  - `Contact`:
    - `contact_type` must be from the `ContactType` enum, if present.
    - `phone` must be in the format of a 9 or 10 digit US phone number, if present.
    - `website` must be an HTTP/HTTPS URL, if present.
    - `email` must be formatted as an email address, if present.
  - `OpenHour`:
    - `day` must be a `DayOfWeek`.
    - `open` has been renamed `opens` for better parallelism with `closes`.
  - `Vaccine`:
    - `vaccine` must be a `VaccineType`.
    - `supply_level` must be from the `VaccineSupply` enum, if present.
  - `Organization`:
    - `id` should be from the `VaccineProvider` enum, if possible, but may be a string or empty. Using an enum value makes it easier for consumers to interpret the value.
    - `id` must use only lowercase alphanumeric characters and underscores.
  - `Link`:
    - `authority` should be a `LocationAuthority`/`VaccineProvider`, if possible, but may be a string or empty. Using an enum value makes it easier to use these links to match locations.
    - `authority` must use only lowercase alphanumeric characters and underscores.
    - `uri` must be a URL, if present.
  - `Source`:
    - `source` must use only lowercase alphanumeric characters and underscores.
    - `id` must not use a space or colon. These must be replaced with another character, such as a dash.
    - `fetched_from_uri` must be a URL, if present.
  - `Location`:
    - `id` must consist of only lowercase alphanumeric characters and underscores, with precisely one colon. The colon should separate the part of the ID that reflects the data source and the part of the ID that reflects the specific location.
- Additional requirements:
  - Each `Contact` should have precisely one field (`phone`, `website`, `email`, `other`). Do not coalesce several of these into a single method.
  - The `opens` value should be before or the same as the `closes` value on an `OpenDate`.
  - The `opens` value should be before or the same as the `closes` value on an `OpenHour`.
  - The `id` of a `Location` must be prefixed with the source name (specified in `Location.source.source`).
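
To make a few of the 1.0.0 rules concrete, here is a minimal standalone sketch using only the standard library. The function names and the ZIP pattern are assumptions for illustration; the schema itself expresses these rules as pydantic validation, and its exact patterns may differ.

```python
import re
from datetime import time
from typing import Optional

# Assumed pattern for a ZIP or ZIP+4 code.
ZIP_RE = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def is_valid_zip(zip_code: str) -> bool:
    """ZIP must be five digits, optionally followed by a dash and four digits."""
    return ZIP_RE.fullmatch(zip_code) is not None


def is_valid_lat_lng(latitude: float, longitude: float) -> bool:
    """Both bounds are inclusive."""
    return -90 <= latitude <= 90 and -180 <= longitude <= 180


def has_single_contact_method(
    phone: Optional[str] = None,
    website: Optional[str] = None,
    email: Optional[str] = None,
    other: Optional[str] = None,
) -> bool:
    """A contact should carry precisely one of the four method fields."""
    return sum(value is not None for value in (phone, website, email, other)) == 1


def opens_before_closes(opens: time, closes: time) -> bool:
    """`opens` must be before or the same as `closes`."""
    return opens <= closes


def id_has_source_prefix(location_id: str, source_name: str) -> bool:
    """The location ID must start with the source name followed by a colon."""
    return location_id.startswith(source_name + ":")
```

A parser that finds both a phone number and a website for a location would therefore emit two separate contacts rather than one contact with both fields set.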
v0.2.1
- Split `schema` into `location` and `load`. Users should now import this package using either `from vaccine_feed_ingest_schema import location` or `from vaccine_feed_ingest_schema import load`. The current method of importing (`schema` includes both `location` and `load`) has been maintained for compatibility, but a deprecation warning has been added. In a future major release, `schema` will be removed.
- Configure `pydantic` to:
  - strip whitespace from string values
  - error if undefined attributes are added to a class
  - error on assignment if the value does not match the desired type
  - store enums as strings - this will enable easier migration to future enum-enforced values
- Set all attributes on `Address` to be optional (previously, only `street2` was optional).

Note: `v0.2.0` was skipped.