DuckDB is an in-process SQL OLAP database management system that uses vectorized data engine and parallel query processing optimized for analytics. DuckDB provides extensive SQL support with direct data loading and query capabilities for local and remote data sources in Parquet
, CSV
, and JSON
data formats.
Random Fractals, Inc. created two DuckDB Visual Studio Code IDE extensions that allow you to view and query local data, DuckDB files, and remote data sources: DuckDB Pro Tools and DuckDB SQL Tools.
The free and public DuckDB SQL Tools Preview VS Code extension adds basic DuckDB SQL support and provides DuckDB connection management, SQL query API and user interfaces for the popular SQL Tools extension, SQL query editor, SQL language server, and data processing tools.
DuckDB SQL Tools Preview extension v1.5.0 adds the following basic DuckDB SQL capabilities to VS Code IDE for working with DuckDB v0.9 instances:
- Connect to a local DuckDB v0.9.2 file instance
- Create new in-memory DuckDB instance
- View DuckDB v0.9.2 tables, columns, and views
- Run SQL queries on open DuckDB v0.9.2 connections
- Attach SQLite database files to in-memory DuckDB instances
- Query remote
CSV
andParquet
data files with DuckDB HTTPFS extension - Create in-memory DuckDB tables from remote data sources and query results
- Manage DuckDB connections in SQL Tools Connection Explorer
- Autocomplete SQL keywords, table names, column names, and view names in SQL editor for an active DuckDB connection
- Save named SQL query Bookmarks
- Use SQL Query History
- Export SQL query results in
CSV
andJSON
data formats
SQL Tools documentation contains more information about the other generic SQL Tools extension features.
Install DuckDB SQL Tools Preview extension from VS Code marketplace or directly in VS Code IDE from Extensions tab (ctrl+shift+x
) by searching for DuckDB
.
DuckDB SQL Tools Installation documentation has instructions on how to install this Preview extension in VSCodium and other VS Code based IDEs like Azure Data Studio.
DuckDB SQL Tools documentation also lists all the Limitations and restrictions of this free public DuckDB VS Code extension Preview version, and recommended SQL Tools extension Configuration settings to use with our DuckDB VS Code Tools.
Pro Data Tools is our new Visual Studio Code extensions pack with premium SQL query and data view features created for the monthly Pro sponsors of our extensions on GitHub.
Our Pro Data Tools for devs and data scientists using VS Code IDE include DuckDB Pro Tools, PRQL Pro Tools and new Markdown SQL Pro Tools.
Pro Data Tools enhance SQL development experience, documentation and notebooks SQL code preview, SQL query execution and results rendering capablities for the different database management systems supported by the SQL Tools extension and plugins.
DuckDB Pro Tools extension adds advanced DuckDB read/write connection features and support to VS Code IDE, and provides full Database Schemas display, DuckDB Extensions, Views, Functions, Types, Indexes, Sequences, SQL Keywords and Configuration Settings views, information schema and PG catalog views, SQL query API and user interfaces integrated with the popular SQL Tools extension, SQL query editor, SQL language server, and VS Code data processing tools.
The latest version of DuckDB Pro Tools extension adds the following capabilities to VS Code IDE for working with DuckDB v0.10 instances:
- Connect to a local DuckDB v0.10.2 instance
- Create new in-memory DuckDB instance
- Import local and remote
CSV
,JSON
andParquet
data files into in-memory DuckDB instance for exploratory data analysis (EDA) - View DuckDB v0.10.2 databases, schemas, tables, columns, views, indexes, sequences, extensions, settings, functions, types and keywords in SQL Tools Connections Explorer
- Run SQL queries on active DuckDB connections
- Attach SQLite database files to in-memory DuckDB instances to run analytical queries
- Query remote
CSV
,Parquet
, andJSON
data files with DuckDB HTTPFS extension and new DuckDB JSON extension - Create in-memory DuckDB tables from remote data sources and query results
- Manage DuckDB 0.10.2 connections in SQL Tools Connections Explorer
- Auto-complete SQL keywords, DuckDB instance schemas, table names, column names, and view names for active DuckDB connections in VS Code SQL editor
- Save named SQL query Bookmarks
- Use SQL Query History
- Export DuckDB query results in
CSV
andJSON
data formats - Use PRQL Code Lens from our new PRQL Pro Tools collection to generate and run SQL queries on active DuckDB connection
- Explore new
employees.duckdb
demo data, PRQL and SQL sample queries - Run sample
chicago-crimes
andgbif-observations
PRQL and SQL queries on GitHub and AWS S3 hostedparquet
data files - Use over 30 built-in
DuckDB Tools
views and metadata shortcut commands from VS CodeCommand Palette...
Become a Pro Sponsor of our new Pro Data Tools work on GitHub to access DuckDB Pro Tools, PRQL Pro Tools and our new Markdown SQL Pro Tools releases and extension .vsix
installation packages.
All of our GitHub Pro sponsors get invited as collaborators to our private Pro Data Tools repositories, and will be notified about the upcoming updates, new releases, additions, and new Data Notebook Pro Tools extension we'll be releasing later this year.
To install DuckDB Pro Tools, download the latest duckdb-pro-tools-x.x.x.vsix
from the attached Assets in repository releases.
Use VS Code IDE Install from VSIX feature to install downloaded DuckDB Pro Tools extension package in VS Code or any other VSIX
-compatible IDE. DuckDB Pro Tools work in VSCodium and the latest version of Azure Data Studio IDE.
The other DuckDB Sql Tools extension you see in the Installed DuckDB extensions image above is our Free Trial public version of DuckDB Tools that only supports limited set of features, demo DuckDB files and sample queries to run.
If you already have the free public DuckDB SQL Tools installed, and configured new DuckDBPro connection to use with DuckDB Pro Tools extension, you'll be prompted to update duckdb-async
library to v0.10.2 to use the latest DuckDB version and features.
You might need to restart VS Code after duckdb library update for the new DuckDB 0.10.2 connection and queries to work. Below is a quick demo of that DuckDB library update process.
Note: DuckDB Pro Tools use DuckDB Node.js API and require a local Node.js installation to query DuckDB instances.
Download and install Node.js from the official node.js downloads page. Node.js is used as a local web server to host local data and DuckDB files. Node.js will install npm tool we use to install duckdb-async
library to establish DuckDB connections.
We use Node.js DuckDB library instead of the limited DuckDB WASM browser JS library other similar DuckDB data tools use to enable faster data imports and loading via multi-threaded node-gyp
DuckDB native API interface.
DuckDB Pro Tools v1.1.0 and higher now display the full catalog of DuckDb instance objects to view and query from SQL Tools Connections Explorer. The extended DuckDB objects tree view includes main DuckDB instance, system
, and temp
databases with the corresponding main
, information_schema
and pg_catalog
schemas.
DuckDB provides many standard PostgreSQL pg_catalog
functions and views for compatibility with PostgreSQL database and SQL syntax. You can explore those internal DuckDB views and function signatures via pg_catalag
schema in the DuckDB Objects tree view.
DuckDB Pro Tools v1.1.0 and higher DB Objects tree view also shows the full list of DuckDB indexes, sequences, information_schema
and other custom duckdb
metadata functions and internal views.
New DuckDB configuration settings display with setting type icons and current values in the DuckDB Objects tree view item description.
DuckDB configuration settings, database schema types, table and view columns now show basic data type icons for strings, numeric, date and timestamp columns and setting types.
Additionally, DuckDB Pro Tools v1.1.0 and higher display primary columns in tables, and not null
column value constraints.
DuckDB Pro Tools v1.2.0 added over 30 built-in DuckDB views and metadata shortcut commands to the Command Palette you can access via ctrl/cmd+shift+p
shortcut in VS Code or from View -> Commnad Palette...
menu by typing DuckDB Tools
in the >
command prompt:
DuckDB egnine itself provides a number of Extensions you can install and load to work with local and remote CSV
and Parquet
data files over HTTPFS, enable Full Text Search, attach SQLite database with SQLite Scanner, or attach PostgreSQL database instance with DuckDB Postgres Scanner.
You can check the list of core and installed DuckDB extensions by running the following SQL query on an open/active DuckDB connection:
select * from duckdb_extensions();
DuckDB v0.7 and higher DuckDB v0.8.1
and v0.9.2
include new and improved JSON
extension capabilities to work with JSON documents. DuckDB Pro Tools install and load DuckDB JSON extension for all active DuckDB conenctions in order to enable quick JSON data imports and other JSON related features without extra SQL setup.
Quick DuckDB JSON extension query example and steps to parse and load remote JSON
data document from a public Github repository with DuckDB Pro Tools in VS Code:
β₯± Create new in-memory DuckDBPro config.
β₯± Use JSON data http://raw.githubusercontent.com URL in your SELECT statement.
β₯± Optionally use CREATE TABLE ... AS
SQL statement to load retrieved JSON
data into new DuckDB table and run multiple queries on that data loaded into memory.
Note: the US airports demo data in example above is from our VS Code Data Table Renderers extension we will use later in the upcoming Data Notebook Pro extension to query and display tabular data results from many different data sources, including DuckDB. New Data Notebook Pro extension will be released in Q4 of 2024.
Quick Data Notebook extension info preview with a simple Data Connections side panel and a few commands already implemented for that custom VS Code Notebook extension we plan to release with SQL and PRQL queries support.
DuckDB Pro Tools extension also installs and loads HTTPFS DuckDB extension by default for all the open database instances. You can add other DuckDB Extensions to an open database instance by running INSTALL
and LOAD
extension SQL statements.
Example of loading trimmed down Chicago crimes data reported in 2022 into an in-memory DuckDB instance from a .parquet
data file hosted in our Chicago Crimes data and analytical tools demo Github repository:
This example uses implicitly loaded DuckDB HTTPFS extension to query reported Chicago crimes parquet data file with over 210K recorded crime reports, creates a CrimeReports
table in a new DuckDB :memory:
instance from remote parquet data file, and queries imported data.
SQLite database users can use DuckDB Pro Tools VS Code extension and DuckDB SQLite Scanner extension to add data from SQLite database to in-memory DuckDB instance.
Run the following SQL statements to add SQLite Scanner DuckDB extension to an open database connection:
INSTALL sqlite;
LOAD sqlite;
With the loaded SQLite Scanner DuckDB extension you can attach SQLite database file to a DuckDB database instance. Attached SQLite database tables will show up as views in DuckDB instance.
Run the following SQL CALL
function to attach SQLite database instance:
CALL sqlite_attach('E:\\projects\\data\\tools\\duckdb-tools\\data\\chinook\\sqlite\\chinook.sqlite');
Note: DuckDB Pro Tools extension uses DuckDB NodeJS Client API. In order to work with local data files, you need to specify full path to your local database or data files in SQL statements that reference local file paths.
Future versions of this extension will simplify local file path references by deducing absolute file path from the local DuckDB file connection string or open VS Code project workspace folder path, and replace relative database or data file references with the corresponding absolute path in an open VS Code project workspace.
Also, note in the sqlite_attach()
SQL function call statement above we are escaping \
file path delimiters on Windows OS by using \\
characters sequence.
DuckDB Pro Tools extension documentation repository contains sample /data folder with the popular chinook.duckdb
and chinook.sqlite
database files you can experiment with.
This private beta repository also contains new SQL scripts, create demo DuckDB files Jupyter notebooks, schema.sql
and load.sql
DuckDB scripts with demo CSV
and Parquet
data files, and new PRQL and SQL query examples for Chicago Crimes, Employees, GBIF Observations, TPC-DS and TPC-H benchmarks, and some public Transporation datasets query examples our Pro sponsors on github can use to get started with DuckDB Pro Tools in VS Code.
The demo /data
folder contains sample .sql
files with SQL and PRQL query examples you can try. You can also easily generate new SQL queries using recently released Hugging Face Code Autocomplete VS Code extension.
For example, SQL queries in employees/prompt.sql were generated with HF Code Autocomplete.
We'll add more demo data, sample DuckDB files, PRQL and SQL query examples in future DuckDB Pro Tools releases.
While DuckDB Pro Tools offer more features and examples, and work with the latest version of DuckDB, there are still some limitations in the initial version of this extension which we plan to address in the upcoming monthly updates and releases.
The latest version of DuckDB Pro Tools extension supports local DuckDB instances created with DuckDB v0.10.2 engine. Database instances and files created with prior versions of DuckDB are not supported as they use different compression and storage formats and the structure of .duckdb
file has been changing as DuckDB engine is evolving.
You can still use our free limited capabilities public DuckDB SQL Tools extension to load and query older DuckDB v0.6.1 data files. Switching between older DuckDB Sql Tools and this DuckDB Pro Tools version and DuckDB file configs will prompt you to upgrade or downgrade the corresponding DuckDB Node.js library accordingly, but you can keep both extensions and switch the extension you want to use depending on the DuckDB files version you are working with.
Otherwise, Use DuckDB CLI to export data from the older database file versions and create new .duckdb
file using the latest DuckDB storage implemenation.
Read Announcing DuckDB 0.6.0, Announcing DuckDB 0.7.0, and Announcing DuckDB 0.8.0 blog posts for more information about DuckDB storage improvements, new features, and other changes.
DuckDB Pro Tools extension v1.1.0 and above now open .duckdb
database files in read-write
file access mode.
NOTE: current DuckDB write access mode and connection handling in DuckDB Pro Tools v1.1.0 and latest is very alpha and some restrictions still apply:
-
DuckDB v0.10.2 still supports only one active
write
connection, or multipplereaders
and nowriters
in its latest implementation. -
DuckDB Pro Tools will create a
read-only
connection if opening.duckdb
file inread-write
access mode fails. -
Additionally, we now add
PRAGMA enable_checkpoint_on_shutdown
to all open DuckDB connections to runcheckpoint
when all DuckDB connecitons are closed and the parent NodeJS process releases a lock on the.wal
write ahead log file in order to synchronize all writes and database updates to the main.duckdb
file.
Important: we recommend you don't close open DuckDB connections via SQL Tools Connections close context menu and let VS Code, NodeJS, and DuckDB processes handle data sync from .wal
files on shutdown.
DuckDB Pro Tools let you open multiple DuckDB connections while keeping only one DuckDB connection as current or active when you select that connection in SQL Tools Connections Explorer view. If you close an active DuckDB connection and get an error while trying to reconnect, we suggest you restart VS Code for DuckDB to release .wal
file lock and grant new write or read-only database file access.
We are working on a new custom DuckDB Connection Pooling interface that will handle DuckDB connections better, and will prompt you to close another write connection open by another process or a program, when the DuckDB .wal
file is detected or locked by another process for the local DuckDB instance you attempt to update.
DuckDB Pro Tools extension lets you create in-memory database instances by specifying :memory:
keyword in the Database File field of the new DuckDB connection in SQL Tools Connection Assistant as demonstrated in DuckDB HTTPFS and SQLite Scanner extension usage examples above.
You can leverage writable :memory:
DuckDB instances in DuckDB Pro Tools to a great extent considering they allow you to load and query remote data files and export data as needed, as demonstrated in JSON
, HTTPFS
and other examples above.
In-Memory DuckDB instances function similar to In-Memory SQLite Databases. You can use DuckDB Import Data and Atach features available via DuckDB SQL statements, and use Export Database SQL statements to export created in-memory DuckDB instances.
SQL Tools extension provides many configuration Settings users can toggle to change database connection and tree view display options, sql formatting, and results display.
The following SQL Tools Settings were used while creating and testing DuckDB Pro Tools extension and are recommended for working with DuckDB instances efficiently. We suggest you set these preferences in User Settings in VS Code by navigating to File -> Preferences -> Settings -> User -> Extensions -> SQLTools
, or adding them to your global VS Code settings.json
config file using the JSON code snippet below:
{
...
"sqltools.useNodeRuntime": true,
"sqltools.disableNodeDetectNotifications": true,
"sqltools.autoOpenSessionFiles": false,
"sqltools.results.limit": 10000,
"sqltools.results.location": "current",
}
Setting | Description |
---|---|
"sqltools.useNodeRuntime": true |
Enable Node runtime in order to use DuckDB NodeJS API this database SQL tools extension depends on. |
"sqltools.disableNodeDetectNotifications": true |
Disable Node runtime detection notifications after initial SQLTools extension installation to prevent Node runtime information message display on every new VS Code session start. |
"sqltools.autoOpenSessionFiles": false |
Prevent auto open of new session SQL editor instance after connecting to the databaase instance. |
"sqltools.results.limit": 10000 |
Maximum number of records to return in results. SQL Tools defaults to displaying only 50 records in query results view. Changing this limit setting to 1000 or 10000 will show more data rows to inspect in result views. |
"sqltools.results.location": "current" |
Defines the editor group to use for result table views. SQL Tools displays all results in the next editor group to show results on the side next to the active SQL query editor. Changing this setting to current will display results in the same editor group and display more result columns. |
Please use our DuckDB SQL Tools GitHub Discussions portal to submit your feedback, share examples of how you are using our DuckDB VS Code extensions, or request new trivial and premium features. Our goal with these DuckDB SQL and Pro Data Tools extensions is to make DuckDB more accessible and easier to use in VS Code IDE.
Become a Fan or a Pro Sponsor on GitHub to support our dev work on these DuckDB Tools and other Random Fractals, Inc. code and data viz extensions if you find them useful, educational, or enhancing your daily dataViz dev code workflows and exploratory data analysis experience.