Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: window functions not working as expected in BigQuery backend #10390

Open
1 task done
efcaguab opened this issue Oct 29, 2024 · 2 comments
Open
1 task done

bug: window functions not working as expected in BigQuery backend #10390

efcaguab opened this issue Oct 29, 2024 · 2 comments
Labels
bug Incorrect behavior inside of ibis

Comments

@efcaguab
Copy link

efcaguab commented Oct 29, 2024

What happened?

When I try to use nunique with a group_by, or window I'm getting an error suggesting that the SQL is not allowed by BigQuery.

The same ibis expressions seem to work well using duckdb. Perhaps something to do with the row-wise arrangement of the window function?

import ibis
from ibis import _
import pandas as pd

bq_dataset = "scratch_fer"
data = pd.DataFrame({
    "foo": [1, 1, 2, 2, 3, 3],
    "bar": ["a", "b", "a", "a", "b", "b"]
})
bq = ibis.bigquery.connect()
bq.create_table("test", data, database=bq_dataset, overwrite=True)
test_table = bq.table(f"{bq_dataset}.test")

# All these commands below fail: (400 Window framing clause is not allowed if
# DISTINCT is specified at [3:60]; reason: invalidQuery, location: query,
# message: Window framing clause is not allowed if DISTINCT is specified
test_table.group_by("foo").mutate(bar=_.bar.nunique()).to_pandas()
test_table.filter(_.bar.nunique().over(ibis.window(group_by="foo")) > 1).to_pandas()

# However these (which I think will be the equivalent of above) work
test_table.sql(f"SELECT foo, COUNT(DISTINCT bar) OVER (PARTITION BY foo) AS bar FROM {bq_dataset}.test").to_pandas()
test_table.sql(f"SELECT * FROM {bq_dataset}.test QUALIFY COUNT(DISTINCT bar) OVER (PARTITION BY foo) > 1").to_pandas()

What version of ibis are you using?

9.5.0

What backend(s) are you using, if any?

bigquery

Relevant log output

(400 Window framing clause is not allowed if
# DISTINCT is specified at [3:60]; reason: invalidQuery, location: query,
# message: Window framing clause is not allowed if DISTINCT is specified

Code of Conduct

  • I agree to follow this project's Code of Conduct
@efcaguab efcaguab added the bug Incorrect behavior inside of ibis label Oct 29, 2024
@cpcloud
Copy link
Member

cpcloud commented Nov 1, 2024

This looks like an issue with what we're generating as the default window frame.

Thanks for the report!

@cpcloud
Copy link
Member

cpcloud commented Nov 11, 2024

I've started on this, but it's hairy and probably requires changing the behavior of window functions with Ibis default values.

I should be able to get to this for 10.0 which would be the ideal release to ship it in since it would be a big change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior inside of ibis
Projects
Status: backlog
Development

No branches or pull requests

2 participants