Suppose I have an Oracle table with two columns, ID (the primary key) and MDATUM (the update key).
Now I upload this table to Snowflake.
Next, one row is added to the table (ID=4) and one row is updated (ID=2), which sets its MDATUM, so the table looks like this:
How can I get those two rows (the new one and the updated one) into Snowflake using an incremental load?
When I load with
then I get this table:
As you can see, ID=3 is updated as expected. The new row (ID=4), on the other hand, is not inserted into the target table.
I believe this is in line with your documentation.
However, from a practical standpoint, this is not helpful. My example above is a real-life example from our production table, and its behavior is reasonable: when a new row is added, that is not an update, so the update_key stays empty. The update_key is set only when a row is modified.
I therefore believe there should be another load strategy that loads "records with update_key after max(update_key)" plus "new records with primary_key larger than max(primary_key)".
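To make the proposed strategy concrete, here is a minimal sketch of the selection logic in Python with an in-memory SQLite database. It is not Sling's implementation or API: the table names `src` and `tgt`, the function `rows_to_load`, and the sample dates are all hypothetical, and MDATUM is assumed to be stored as an ISO date string so that string comparison matches date order.

```python
import sqlite3


def rows_to_load(source_rows, target_rows, include_new_keys=True):
    """Return the source rows an incremental load would pick up.

    With include_new_keys=False only the update-key filter is applied.
    With include_new_keys=True the proposed primary-key filter is added.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, mdatum TEXT)")
    con.execute("CREATE TABLE tgt (id INTEGER PRIMARY KEY, mdatum TEXT)")
    con.executemany("INSERT INTO src VALUES (?, ?)", source_rows)
    con.executemany("INSERT INTO tgt VALUES (?, ?)", target_rows)

    # Rows modified since the last load. A NULL mdatum never satisfies
    # the comparison, which is why brand-new rows are skipped.
    where = "mdatum > (SELECT COALESCE(MAX(mdatum), '') FROM tgt)"
    if include_new_keys:
        # Proposed addition: rows whose key is beyond the target's max key.
        where += " OR id > (SELECT COALESCE(MAX(id), -1) FROM tgt)"

    result = con.execute(
        f"SELECT id, mdatum FROM src WHERE {where} ORDER BY id"
    ).fetchall()
    con.close()
    return result


# Hypothetical data: the target holds the first load, the source has since
# gained one modified row (id 2) and one new row (id 4).
target = [(1, "2024-01-01"), (2, None), (3, None)]
source = [(1, "2024-01-01"), (2, "2024-02-01"), (3, None), (4, None)]

update_key_only = rows_to_load(source, target, include_new_keys=False)
combined = rows_to_load(source, target, include_new_keys=True)
print(update_key_only)
print(combined)
```

In this sketch the update-key filter alone selects only the modified row, while the combined predicate also selects the new row whose MDATUM is still empty.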
A workaround is to run Sling twice:
First, I load using these parameters:
This gets me all rows where the update_key MDATUM is newer than max(MDATUM) in my target table, i.e., all updated rows.
Then I run Sling again and this time I use:
This should get me all rows where the ID is higher than max(ID) in my target table, i.e., all new rows.
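The two-pass workaround can be simulated in plain Python to check that the passes together bring the target in line with the source. This is only a sketch of the intended effect, not of what Sling actually executes: the `Table` type, the functions `pass_updated` and `pass_new`, and the sample dates are hypothetical, and MDATUM is again assumed to be an ISO date string.

```python
from typing import Dict, Optional

# Hypothetical in-memory table: id -> mdatum (ISO date string or None).
Table = Dict[int, Optional[str]]


def pass_updated(source: Table, target: Table) -> None:
    """Pass 1: upsert rows whose mdatum is newer than the target's max mdatum."""
    known = [m for m in target.values() if m is not None]
    high_water = max(known) if known else ""
    for row_id, mdatum in source.items():
        if mdatum is not None and mdatum > high_water:
            target[row_id] = mdatum


def pass_new(source: Table, target: Table) -> None:
    """Pass 2: insert rows whose id is higher than the target's max id."""
    max_id = max(target) if target else -1
    for row_id, mdatum in source.items():
        if row_id > max_id:
            target[row_id] = mdatum


# Hypothetical data: one modified row (id 2) and one new row (id 4).
source: Table = {1: "2024-01-01", 2: "2024-02-01", 3: None, 4: None}
target: Table = {1: "2024-01-01", 2: None, 3: None}

pass_updated(source, target)
after_first_pass = dict(target)
pass_new(source, target)
print(after_first_pass)
print(target)
```

After the first pass the target still lacks the new row; after the second pass it matches the source for this sample. The sketch does not cover deletes or rows whose MDATUM is changed to a value at or below the current maximum.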
I feel my use case is quite plausible and probably widespread.
What are your thoughts on this? And do you see a problem with my workaround?