Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support for DuckDB’s RANGE function with two timestamps for BigQuery Compatibility #10487

Open
1 task done
BugaM opened this issue Nov 13, 2024 · 1 comment
Open
1 task done
Labels
feature Features or general enhancements

Comments

@BugaM
Copy link

BugaM commented Nov 13, 2024

Is your feature request related to a problem?

In DuckDB, the RANGE function requires a third argument (an interval) when working with timestamps. This differs from BigQuery’s implementation, where the RANGE function can take two timestamp arguments without an interval. This incompatibility prevents the following query, which works in BigQuery, from running in DuckDB.

import ibis
import pandas as pd
from datetime import datetime

conn = ibis.duckdb.connect()

data = pd.DataFrame({
    'id': [1, 2, 1, 2],
    'value': [10, 20, 30, 40],
    'timestamp': [
        datetime(2023, 1, 1, 12, 0),
        datetime(2023, 1, 2, 12, 0),
        datetime(2023, 1, 3, 12, 0),
        datetime(2023, 1, 4, 12, 0)
    ]
})

conn.create_table('test_table', data)

query = conn.sql("""
    SELECT id, value, RANGE(timestamp, LEAD(timestamp) OVER (PARTITION BY id ORDER BY timestamp ASC)) as timestamp_range
    FROM test_table 
""", dialect="bigquery")

Error:

duckdb.duckdb.BinderException: Binder Error: No function matches the given name and argument types 'range(TIMESTAMP, TIMESTAMP)'. You might need to add explicit type casts.
        Candidate functions:
        range(BIGINT) -> BIGINT[]
        range(BIGINT, BIGINT) -> BIGINT[]
        range(BIGINT, BIGINT, BIGINT) -> BIGINT[]
        range(TIMESTAMP, TIMESTAMP, INTERVAL) -> TIMESTAMP[]
        range(TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH TIME ZONE, INTERVAL) -> TIMESTAMP WITH TIME ZONE[]

LINE 1: DESCRIBE SELECT id, value, RANGE(timestamp, LEAD(timestamp) OVER (...

The same query works in BigQuery:

WITH test_table AS (
    SELECT 1 AS id, 10 AS value, TIMESTAMP("2023-01-01 12:00:00") AS timestamp UNION ALL
    SELECT 2 AS id, 20 AS value, TIMESTAMP("2023-01-02 12:00:00") AS timestamp UNION ALL
    SELECT 1 AS id, 30 AS value, TIMESTAMP("2023-01-03 12:00:00") AS timestamp UNION ALL
    SELECT 2 AS id, 40 AS value, TIMESTAMP("2023-01-04 12:00:00") AS timestamp
)

 
SELECT id, value, RANGE(timestamp, LEAD(timestamp) OVER (PARTITION BY id ORDER BY timestamp ASC)) as timestamp_range
    FROM test_table 

Resulting:

image

What is the motivation behind your request?

I’m using Ibis to test BigQuery queries locally. This feature is needed to make sure BigQuery operators work the same way in Ibis. Specifically, it would support doing as of joins as suggested by the documentation. Enabling it also envolves adding the RANGE_CONTAINS function.

Describe the solution you'd like

Add support to BQ's RANGE and RANGE_CONTAINS functions when BigQuery dialect is chosen.

What version of ibis are you running?

9.5.0

What backend(s) are you using, if any?

DuckDB, BigQuery

Code of Conduct

  • I agree to follow this project's Code of Conduct
@BugaM BugaM added the feature Features or general enhancements label Nov 13, 2024
@BugaM BugaM changed the title feat: Support for DuckDB’s RANGE function with timestamps for BigQuery Compatibility feat: Support for DuckDB’s RANGE function with two timestamps for BigQuery Compatibility Nov 13, 2024
@gforsyth
Copy link
Member

Hi @BugaM -- if you are executing SQL queries via .sql then the translation issue here is with SQLGlot, not with Ibis.
sqlglot is what we use for handling dialect translation, and while we subclass some dialects to make tweaks, DuckDB has pretty robust support, so we currently don't do that.

This seems like an issue to open upstream at https://github.com/tobymao/sqlglot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Features or general enhancements
Projects
Status: backlog
Development

No branches or pull requests

2 participants