Releases: snowplow/dbt-snowplow-normalize
Snowplow Normalize v0.3.5
Summary
This release adds support for schema grants.
Features
- Add support for schema grants
Under the hood
- Enforce full refresh flag to refresh manifest tables
Upgrading
To upgrade, simply bump the snowplow-normalize version in your packages.yml file. Note that the minimum required version of snowplow-utils is now 0.16.2.
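As a rough sketch, the bump might look like the following. It assumes the package is installed from the dbt Hub under the name snowplow/snowplow_normalize; that name is an assumption here, so keep whatever entry your project already uses and change only the version.

```yaml
# packages.yml
packages:
  # Assumed Hub name for this package; adjust to match your existing entry.
  - package: snowplow/snowplow_normalize
    version: 0.3.5
```

After editing the file, run dbt deps so the new version and its snowplow-utils dependency are installed.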
Snowplow Normalize v0.3.4
Summary
This version bumps the package dependency to add support for the latest snowplow utils package. Please note that from this version onwards this package is under the SPAL license.
Under the hood
- Bump support for latest utils
- Migrate to SPAL license
Upgrading
To upgrade, simply bump the snowplow-normalize version in your packages.yml file.
Snowplow Normalize v0.3.3
Summary
- Include the new base macro functionality from utils in the package
- Allow users to specify the timestamp used to process events (from the default of collector_tstamp)
Under the hood
- Simplify the model architecture
Upgrading
Bump the snowplow-normalize version in your packages.yml file.
Snowplow Normalize v0.3.2
Summary
Bumps the max supported snowplow-utils version to allow usage with our other packages.
Upgrading
Bump the snowplow-normalize version in your packages.yml file.
Snowplow Normalize v0.3.1
Summary
This version raises the minimum required version of the jsonschema package so that schemas using the MultipleOf property can be validated.
Fixes
- Bump jsonschema minimum version (Close #33)
Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project.
Snowplow Normalize v0.3.0
Summary
This version migrates our models away from the snowplow_incremental_materialization and moves them to the built-in incremental materialization, with an optimization applied on top.
🚨 Breaking Changes 🚨
Changes to materialization
To take advantage of the optimization we apply to the incremental materialization, users will need to add the following to their dbt_project.yml:
# dbt_project.yml
...
dispatch:
  - macro_namespace: dbt
    search_order: ['snowplow_utils', 'dbt']
For custom models please refer to the snowplow utils migration guide and the latest docs on creating custom incremental models.
Features
- Migrate from get_cluster_by and get_partition_by to get_value_by_target_type
- Migrate all models to use new materialization
Docs
- Update readme
Upgrading
Bump the snowplow-normalize version in your packages.yml file, and ensure you have followed the above steps. You can read more in our upgrade guide.
Snowplow Normalize v0.2.3
Summary
This release allows users to disable the days late data filter to enable normalizing of events that don't populate the dvce_sent_tstamp field.
Features
- Allow disabling of days late filter by setting snowplow__days_late_allowed to -1 (#28)
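A minimal sketch of how this setting might appear in a project is shown below. Scoping the variable under a snowplow_normalize key is an assumption about how your dbt_project.yml is organized; only the variable name and the value -1 come from this release.

```yaml
# dbt_project.yml
vars:
  # Assumed scoping key; place the variable wherever your other
  # snowplow-normalize variables are defined.
  snowplow_normalize:
    snowplow__days_late_allowed: -1  # -1 disables the days late filter
```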
Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project.
Snowplow Normalize v0.2.2
Summary
This release fixes an issue with column aliasing in BigQuery when the schema order differs from the table in the warehouse. It also adds the ability to alias your user_id column and add flat atomic.events columns into your users table.
Features
- Fix column alias ordering issue in BigQuery
- Add ability to alias user_id column
- Add ability to add flat columns to the events table
Under the hood
- Alter github pages publishing action
Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project. To use the new features please see our docs for the new values to add to your config file.
Snowplow Normalize v0.2.1
Summary
This release fixes the expected path for a private registry to require it to end with /api in line with other resolvers. It also upgrades the schema requirement for the resolver to 1.0.3 so you can use newer resolver files without issues.
Features
- Fix private registry uri requirement
- Bump iglu resolver schema version
Under the hood
- Fix github pages publishing action
Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project.
Snowplow Normalize v0.2.0
Summary
This release drops support for dbt versions <1.3 to use the latest dbt-utils package, adds functionality for custom user_ids and multiple events per table, as well as ensuring all appropriate versions of BigQuery contexts are used. Due to these changes this version contains a number of breaking changes, so please read the Upgrade section carefully.
🚨 Breaking Changes 🚨
- Config file structure has changed to enable multiple event types per table
- Macro inputs have changed to enable multiple event types per table and custom user_id field
- The filtered events table has a new column to enable multiple event types per table
- Support for versions of dbt < 1.3 has been dropped
Features
- Allow for multiple event types (including self-describing) per normalized table
- Allow for custom user_id field within users table
- BigQuery optimized to use all same major version number sdes and contexts (in line with other Snowplow packages)
- Enhanced testing and warnings under the hood
- Drop support for dbt versions below 1.3 (Close #17)
Upgrading
To upgrade the package, bump the version number in the packages.yml file in your project.
Upgrading your config file
To upgrade your config file:
- Change the event_name field to event_names and make the value a list
- Change the self_describing_event_schema field to self_describing_event_schemas and make the value a list
- If you wish to make use of the new features, see the example config or the docs for more information
Upgrading your models
Once you have upgraded your config file, the easiest way to ensure your models match the new macros is to re-run the Python script. If you would prefer not to do this, you can:
- For each normalized model:
  - Convert the event_name and sde_cols fields to lists, and pluralize the names in both the set and the macro call
  - Add a new field, sde_aliases, which is an empty list, and add it between sde_types and context_cols in the macro call
- For your filtered events table:
  - Change the unique_key in the config section to unique_id
  - Add a line between the event_table_name and from lines for each select statement: , event_id||'-'||'<that_event_table_name>' as unique_id, with the event table name for that select block.
- For your users table:
  - Add 3 new values to the start of the macro call, 'user_id','','', before the user_cols argument.
Upgrade your filtered events table
If you use the master filtered events table, you will need to add a new column for the latest version to work. If you have not processed much data yet, it may be easier to simply re-run the package from scratch using dbt run --full-refresh --vars 'snowplow__allow_refresh: true'. Alternatively, run the following in your warehouse, replacing the schema/dataset/warehouse and table name for your table:
ALTER TABLE {schema}.{table} ADD COLUMN unique_id STRING;
UPDATE {schema}.{table} SET unique_id = event_id||'-'||event_table_name WHERE 1 = 1;