Skip to content

Releases: snowplow/dbt-snowplow-normalize

Snowplow Normalize v0.3.5

18 Mar 09:04
Compare
Choose a tag to compare

Summary

This release adds support for schema grants

Features

  • Add support for schema grants

Under the hood

  • Enforce full refresh flag to refresh manifest tables

Upgrading

To upgrade simply bump the snowplow-normalize version in your packages.yml file. Note the minimum version of snowplow-utils required is now 0.16.2

Snowplow Normalize v0.3.4

26 Jan 15:59
Compare
Choose a tag to compare

Summary

This version bumps the package dependency to add support for the latest snowplow utils package. Please note that from this version onwards this package is under the SPAL license.

Under the hood

  • Bump support for latest utils
  • Migrate to SPAL license

Upgrading

To upgrade simply bump the snowplow-normalize version in your packages.yml file.

Snowplow Normalize v0.3.3

03 Oct 08:14
7bc8a05
Compare
Choose a tag to compare

Summary

  • Include the new base macro functionality from utils in the package
  • Allow users to specify the timestamp used to process events (from the default of collector_tstamp)

Under the hood

  • Simplify the model architecture

Upgrading

Bump the snowplow-normalize version in your packages.yml file.

Snowplow Normalize v0.3.2

12 Sep 12:47
Compare
Choose a tag to compare

Summary

Bumps the max supported snowplow-utils version to allow usage with our other packages.

Upgrading

Bump the snowplow-normalize version in your packages.yml file.

Snowplow Normalize v0.3.1

14 Jun 10:01
Compare
Choose a tag to compare

Summary

This version bumps the requirement of the jsonschema package to validate schemas with the MultipleOf property.

Fixes

  • Bump jsonschema minimum version (Close #33)

Upgrading

To upgrade the package, bump the version number in the packages.yml file in your project.

Snowplow Normalize v0.3.0

28 Mar 09:27
Compare
Choose a tag to compare

Summary

This version migrates our models away from the snowplow_incremental_materialization and instead move to using the built-in incremental with an optimization applied on top.

🚨 Breaking Changes 🚨

Changes to materialization

To take advantage of the optimization we apply to the incremental materialization, users will need to add the following to their dbt_project.yml :

# dbt_project.yml
...
dispatch:
  - macro_namespace: dbt
    search_order: ['snowplow_utils', 'dbt']

For custom models please refer to the snowplow utils migration guide and the latest docs on creating custom incremental models.

Features

  • Migrate from get_cluster_by and get_partition_by to get_value_by_target_type
  • Migrate all models to use new materialization

Docs

  • Update readme

Upgrading

Bump the snowplow-normalize version in your packages.yml file, and ensuring you have followed the above steps. You can read more in our upgrade guide

Snowplow Normalize v0.2.3

16 Mar 09:53
Compare
Choose a tag to compare

Summary

This release allows users to disable the days late data filter to enable normalizing of events that don't populate the dvce_sent_tstamp field.

Features

  • Allow disabling of days late filter by setting snowplow__days_late_allowed to -1 (#28)

Upgrading

To upgrade the package, bump the version number in the packages.yml file in your project.

Snowplow Normalize v0.2.2

13 Mar 07:41
Compare
Choose a tag to compare

Summary

This release fixes an issue with column aliasing in BigQuery when the schema order differs from the table in the warehouse. It also adds the ability to alias your user_id column and add flat atomic.events columns into your users table.

Features

  • Fix column alias ordering issue in BigQuery
  • Add ability to alias user_id column
  • Add ability to add flat columns to the events table

Under the hood

  • Alter github pages publishing action

Upgrading

To upgrade the package, bump the version number in the packages.yml file in your project. To use the new features please see our docs for the new values to add to your config file.

Snowplow Normalize v0.2.1

23 Jan 13:58
Compare
Choose a tag to compare

Summary

This release fixes the expected path for a private registry to require it to end with /api in line with other resolvers. It also upgrades the schema requirement for the resolver to 1.0.3 so you can use newer resolver files without issues.

Features

  • Fix private registry uri requirement
  • Bump iglu resolver schema version

Under the hood

  • Fix github pages publishing action

Upgrading

To upgrade the package, bump the version number in the packages.yml file in your project.

Snowplow Normalize v0.2.0

21 Dec 15:03
Compare
Choose a tag to compare

Summary

This release drops support for dbt versions <1.3 to use the latest dbt-utils package, adds functionality for custom user_ids and multiple events per table, as well as ensuring all appropriate versions of BigQuery contexts are used. Due to these changes this version contains a number of breaking changes, so please read the Upgrade section carefully.

🚨 Breaking Changes 🚨

  • Config file structure has changed to enable multiple event types per table
  • Macro inputs have changed to enable multiple event types per table and custom user_id field
  • The filtered events table has a new column to enable multiple event types per table
  • Support for versions of dbt < 1.3 has been dropped

Features

  • Allow for multiple event types (including self-describing) per normalized table
  • Allow for custom user_id field within users table
  • BigQuery optimized to use all same major version number sdes and contexts (in line with other Snowplow packages)
  • Enhanced testing and warnings under the hood
  • Drop support for dbt versions below 1.3 (Close #17)

Upgrading

To upgrade the package, bump the version number in the packages.yml file in your project.

Upgrading your config file

To upgrade your config file:

  • Change the event_name field to event_names and make the value a list
  • Change the self_describing_event_schema field to self_describing_event_schemas and make the value a list
  • If you wish to make use of the new features, see the example config or the docs for more information

Upgrading your models

Once you have upgraded your config file, the easiest way to ensure your models match the new macros is to re-run the Python script. If you would prefer not to do this, you can:

  • For each normalized model:
    • Convert the event_name and sde_cols fields to lists, and pluralize the names in both the set and the macro call
    • Add a new field, sde_aliases which is an empty list, add this between sde_types and context_cols in the macro call
  • For your filtered events table:
    • Change the unique_key in the config section to unique_id
    • Add a line between the event_table_name and from lines for each select statement; , event_id||'-'||'<that_event_table_name>' as unique_id, with the event table name for that select block.
  • For your users table:
    • Add 3 new values to the start of the macro call, 'user_id','','', before the user_cols argument.

Upgrade your filtered events table

If you use the master filtered events table, you will need to add a new column for the latest version to work. If you have not processed much data yet it may be easier to simply re-run the package from scratch using dbt run --full-refresh --vars 'snowplow__allow_refresh: true', alternatively run the following in your warehouse, replacing the schema/dataset/warehouse and table name for your table:

ALTER TABLE {schema}.{table} ADD COLUMN unique_id STRING;
UPDATE {schema}.{table} SET unique_id = event_id||'-'||event_table_name WHERE 1 = 1;