
Issue Loading chinese characters from Snowflake to SQL Server #443

Open
kkprab opened this issue Nov 19, 2024 · 4 comments

Comments

@kkprab

kkprab commented Nov 19, 2024

I am trying to load data from Snowflake to SQL Server.
In SQL Server, column values with Chinese characters are transformed to ???? (question marks).

Sample below:
[screenshot]

In Snowflake we can see the characters:
[screenshot]

In the config file I have tried encoding/decoding with UTF-8, but no success:
[screenshot]

Can you please suggest what configuration we should try?

@flarco
Collaborator

flarco commented Nov 19, 2024

A few questions:

  • What version of sling are you using?
  • Are you loading into SQL Server with bcp?
    • If so, can you try without bulk loading (target_options: { use_bulk: false })?
  • Can you try the CLI to stdout to see if the characters show OK? sling run --src-conn core_dev --src-stream dev.table --limit 10 --stdout
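For reference, a minimal sketch of where use_bulk: false would sit in a replication YAML (the connection and stream names here are placeholders, not taken from this thread):

```yaml
source: SNOWFLAKE_CONN        # placeholder source connection
target: SQLSERVER_CONN        # placeholder target connection

defaults:
  target_options:
    use_bulk: false           # skip bcp bulk loading; use regular inserts

streams:
  dev.table:
    object: dbo.some_table    # placeholder target table
```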

@kkprab
Author

kkprab commented Nov 19, 2024

Thanks, Flarco.

  • Using sling version 1.2.22.
  • Tried with use_bulk: false; it still shows ??? instead of Chinese characters.
  • While running the CLI as you suggested, I can see the Chinese characters in the output:

[screenshot]

@flarco flarco added the bug Something isn't working label Nov 19, 2024
@kkprab
Author

kkprab commented Nov 25, 2024

Update: The initial test was conducted on Linux. I installed Sling on Windows and repeated the test. This time, Chinese characters were successfully loaded into the SQL Server table. I suspected the issue might be related to the system locale. On Linux, the locale was set to en_US.UTF-8, while on Windows, it was Western European.

I changed the Linux locale to en_US.ISO-8859-15, but encountered the following error:

~ could not bulk import ~ SQL Server BCP Import Command -> bcp 'dbo.ETQ_SUPPLIERS_STAGE_tmp' in '/tmp/sqlserver.dbo.etq_suppliers_stage_tmp.1732545272071.TVo1.csv' -S '****' -d 'RelianceGateway' -t ',' -m '1' '-w' -q -b '50000' -F '2' -e '/tmp/sqlserver.dbo.etq_suppliers_stage_tmp.1732545272082.TSi.error' -U '****' -P '****' -u
SQL Server BCP Import Error -> The table name specified exceeds the maximum allowed length.

Following recommendations, I adjusted the SAMPLE_SIZE and set bulk_import to false, but the error persisted. When I reverted the locale back to en_US.UTF-8, the process worked, but Chinese characters were not loaded correctly.
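Since the behavior differs with the locale, a sketch of how one might inspect the active locale and force UTF-8 for a single sling invocation (this assumes en_US.UTF-8 has been generated on the system; the replication file name is a placeholder):

```shell
# Show the locale settings currently in effect for this shell
locale

# List the locales available on the system
locale -a

# Force a UTF-8 locale for one sling invocation only,
# without changing the system-wide locale
LC_ALL=en_US.UTF-8 sling run -r replication.yaml
```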

Any suggestions on how to resolve this issue?

@flarco
Collaborator

flarco commented Nov 25, 2024

Interesting that it worked on Windows.
Can you try use_bulk: false in Linux (so it won't use bcp)?

The error you're getting is actually different: The table name specified exceeds the maximum allowed length.
Can you try a shorter table name?

@flarco flarco removed the bug Something isn't working label Nov 26, 2024