Commit

fix: backslash syntax error thanks to @rickokin
jaanli committed Apr 29, 2024
1 parent 4bda4d3 commit ff180ff
Showing 2 changed files with 6 additions and 4 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -88,15 +88,15 @@ pip-compile

Initializing a dbt project:
```bash
-dbt init healthcare_data
+dbt init data_processing
```

## Building the datasets

1. Generate the synthetic healthcare data schemas using the data dictionary:

```bash
-cd healthcare_data
+cd data_processing
python scripts/generate_syh_dr_data_models.py ~/data/syh_dr https://www.ahrq.gov/sites/default/files/wysiwyg/data/SyH-DR-Codebook.pdf
```

@@ -109,7 +109,7 @@ dbt run --threads 8
3. Verify that you can query the data on the command line:

```bash
-duckdb -c "SELECT * FROM '/Users/me/data/syh_dr/syhdr_commercial_inpatient_2016.parquet'"
+duckdb -c "SELECT * FROM '~/data/syh_dr/syhdr_commercial_inpatient_2016.parquet'"
```

This should show the data:
4 changes: 3 additions & 1 deletion data_processing/scripts/generate_syh_dr_data_models.py
@@ -137,7 +137,9 @@ def process_csv_files(pdf_url, csv_folder):
print(csv_str)
username = os.environ.get("USER")
path_without_user = "~/" + csv_path.split(username + '/')[1]
-select_statement = f"SELECT\n {',\n '.join(column_list)}\nFROM read_csv('{path_without_user}', header=True, null_padding=true{csv_str if csv_types else ''})"
+select_statement = f"""SELECT
+{',\n '.join(column_list)}
+FROM read_csv('{path_without_user}', header=True, null_padding=true{csv_str if csv_types else ''})"""
f.write(select_statement)

print(f"Generated SQL model: {sql_file}")
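
Note on the f-string change: before Python 3.12 (PEP 701), a backslash is not allowed anywhere inside the `{...}` expression part of an f-string, and the triple-quoted version shown here still contains `',\n '` inside such an expression. One way to avoid the problem on older interpreters is to build the joined column list outside the f-string. The sketch below is an illustration only, not the code from this commit: the helper names `home_relative` and `build_select_statement`, the `home` parameter, and the four-space indent are assumptions. It also derives the `~/` path with `pathlib` rather than splitting on the `USER` environment variable, which may be unset.

```python
from pathlib import Path


def home_relative(csv_path, home=None):
    """Return csv_path rewritten as '~/...' when it lies under the home directory.

    Hypothetical helper: `home` defaults to the current user's home directory
    and can be overridden for testing.
    """
    home_dir = Path(home) if home is not None else Path.home()
    path = Path(csv_path)
    try:
        return "~/" + path.relative_to(home_dir).as_posix()
    except ValueError:
        # The path is not under the home directory; leave it unchanged.
        return path.as_posix()


def build_select_statement(column_list, csv_path, csv_options=""):
    """Build the SELECT over read_csv with no backslash inside an f-string expression."""
    # Join outside the f-string. Backslashes in the literal parts of an
    # f-string are fine on every Python 3 version; only the {...} parts
    # were restricted before 3.12.
    separator = ",\n    "
    columns = separator.join(column_list)
    return (
        "SELECT\n"
        f"    {columns}\n"
        f"FROM read_csv('{csv_path}', header=True, null_padding=true{csv_options})"
    )
```

With this shape, the call site would reduce to something like `build_select_statement(column_list, home_relative(csv_path), csv_str if csv_types else "")`.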