Issue with disabling a table in the replication yaml after upgrade #441

Open
james-n-centric opened this issue Nov 18, 2024 · 4 comments

@james-n-centric

Issue Description

There seems to be an issue with the 'disabled' flag in the streams section of the replication YAML. It appears to have started with the 1.2.22 update I installed earlier today.

  • Sling version (sling --version): 1.2.22

  • Operating System (linux, mac, windows): mac

  • Replication Configuration:

This YAML works correctly (i.e. dbo.staging_node_usage2 is excluded from the replication):

source: AZURE_SQL_SERVER
target: POSTGRES_REMOTE_TGT

defaults:
  mode: full-refresh
  object: node_usage.{stream_table}

streams:
  dbo.*:
  dbo.staging_node_usage2:
    disabled: true
    object: node_usage.{stream_table}

This YAML, identical except that the disabled stream omits 'object', generates an error:

source: AZURE_SQL_SERVER
target: POSTGRES_REMOTE_TGT

defaults:
  mode: full-refresh
  object: node_usage.{stream_table}

streams:
  dbo.*:
  dbo.staging_node_usage2:
    disabled: true

Error message:

fatal:
~ Error compiling replication config
need to specify `object` for stream `dbo.staging_node_usage2`. Please see https://docs.slingdata.io/sling-cli for help.
@flarco
Collaborator

flarco commented Nov 18, 2024

I am unable to reproduce this. The config below works for me. Perhaps you have a stray whitespace character somewhere?

source: MSSQL
target: POSTGRES

defaults:
  mode: full-refresh
  object: mssql.{stream_table}

streams:
  dbo.*:
  dbo.test1k_sqlserver_bcp_wide:
    disabled: true
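
One way to rule out the whitespace theory is to scan the file for characters that look like ordinary spacing but are not. The following is a minimal sketch using only the Python standard library. The helper name find_suspect_whitespace and the set of characters it flags are illustrative assumptions, not part of sling.

```python
import unicodedata

# Characters that look like ordinary spacing (or like nothing at all) but can
# change how a YAML file is parsed. This set is an assumption, not exhaustive.
SUSPECT = {"\t", "\u00a0", "\u200b", "\u2028", "\u2029", "\ufeff"}


def find_suspect_whitespace(text):
    """Return (line, column, description) for each suspicious character."""
    hits = []
    # Split on "\n" only, so unusual line separators stay visible as characters.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT or (ch.isspace() and ch not in (" ", "\r")):
                # Control characters have no Unicode name, so fall back to U+XXXX.
                name = unicodedata.name(ch, "U+%04X" % ord(ch))
                hits.append((line_no, col, name))
    return hits
```

To check a real file, pass its contents, for example find_suspect_whitespace(open(path, encoding='utf-8').read()), where path points at the replication YAML. An empty list means none of the flagged characters were found.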

@james-n-centric
Author

Very strange. I've copied and pasted your YAML from above and it still fails. I've also tried in a brand new virtual environment, with the same result.

The failure seems to occur in the _run(cmd: str, temp_file: str, return_output=False, env:dict=None, stdin=None) function:

File ~/Documents/general-code/analytics/wrangling/sling_data/sling_env/lib/python3.12/site-packages/sling/__init__.py:454, in _run(cmd, temp_file, return_output, env, stdin)
    451 for k,v in os.environ.items():
    452   env[k] = env.get(k, v)
--> 454 for line in _exec_cmd(cmd, env=env, stdin=stdin):
    455   if return_output:
    456     lines.append(line)
...
    521 if proc.returncode != 0:
--> 522   raise Exception(f'Sling command failed:\n{lines}')

Exception: Sling command failed:

I can only assume something is odd in my Python environment.

@flarco
Collaborator

flarco commented Nov 19, 2024

Oh, you're not using the CLI? Can you share your python code?
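
To compare the two code paths, the same replication file can be run through the CLI from Python. This is a rough sketch. It assumes a sling binary is on PATH, which may not be the same binary or version that the Python package uses. The helper names build_cli_command and run_with_cli are hypothetical.

```python
import shutil
import subprocess


def build_cli_command(replication_path):
    """Build the argv for running a replication file with the sling CLI."""
    return ["sling", "run", "-r", replication_path]


def run_with_cli(replication_path):
    """Run the replication through the CLI and return (exit code, output)."""
    # Assumption: the CLI is installed separately and reachable on PATH.
    if shutil.which("sling") is None:
        raise FileNotFoundError("sling binary not found on PATH")
    proc = subprocess.run(
        build_cli_command(replication_path),
        capture_output=True,
        text=True,
    )
    # Merge stdout and stderr so any error message is visible in one place.
    return proc.returncode, proc.stdout + proc.stderr
```

If run_with_cli succeeds on a file that fails through the Python package, that would point at the wrapper rather than at the YAML itself.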

@james-n-centric
Author

Sure. This is in a notebook, using Python 3.12.3.

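from sling import Replication
import yaml

# Load the replication config from the YAML file into a dict.
with open('test5.yaml') as file:
    config = yaml.load(file, Loader=yaml.FullLoader)

# Build the replication from the parsed config and run it.
replication = Replication(**config)

replication.run()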

Contents of test5.yaml:

source: AZURE_SQL_SERVER
target: POSTGRES_REMOTE_TGT

defaults:
  mode: full-refresh
  object: node_usage.{stream_table}

streams:
  dbo.*:
  dbo.staging_node_usage:
    disabled: true
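
The first config in this issue, which repeats 'object' on the disabled stream, is reported to work. A possible stopgap is therefore to copy defaults.object onto every disabled stream before building the Replication. The sketch below works on the parsed dict only and has not been tested against sling. The helper add_object_to_disabled_streams is hypothetical, and whether it avoids the error in the Python package is an assumption.

```python
import copy


def add_object_to_disabled_streams(config):
    """Return a copy of config where disabled streams carry an explicit object."""
    patched = copy.deepcopy(config)
    default_object = (patched.get("defaults") or {}).get("object")
    for name, stream in (patched.get("streams") or {}).items():
        # A key with no body, such as `dbo.*:`, is loaded by PyYAML as None.
        if not isinstance(stream, dict):
            continue
        if stream.get("disabled") and "object" not in stream:
            if default_object is not None:
                stream["object"] = default_object
    return patched


# The mapping that yaml.load produces for test5.yaml above.
config = {
    "source": "AZURE_SQL_SERVER",
    "target": "POSTGRES_REMOTE_TGT",
    "defaults": {"mode": "full-refresh", "object": "node_usage.{stream_table}"},
    "streams": {"dbo.*": None, "dbo.staging_node_usage": {"disabled": True}},
}
patched = add_object_to_disabled_streams(config)
```

The patched dict can then be passed the same way as before, i.e. Replication(**patched).run(). The original config is left unmodified.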
