Casting existing timestamp to timestamp again strips timezone information #12218
Comments
I'd be happy to take this if it's not already being worked on. |
take |
i'd expect |
@findepi It appears so |
The SQL spec compliant behavior when casting
In a sense PostgreSQL's
Now, DF behavior when casting
IMO we should change DF cast from cc @alamb |
apache/arrow-rs#5827 seems related. |
So would the expected outcome be that |
I wouldn't say that timestamp has UTC time. |
Makes sense, so the expectation is for the date-time calculation to be correct for that TZ but no offset? So for the example above:
Should actually be:
|
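To make the two readings in this exchange concrete, here is a minimal sketch in Python rather than SQL. It is not DataFusion or arrow-rs code. The function names and the fixed UTC-7 offset are assumptions made for illustration only. It contrasts dropping the zone while keeping the local wall-clock time with converting to UTC first and then dropping the zone.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-7 offset stands in for a named zone, so the
# example does not depend on a time zone database being installed.
EXAMPLE_ZONE = timezone(timedelta(hours=-7))


def cast_keep_wall_clock(ts: datetime) -> datetime:
    """Drop the zone but keep the local date-time fields unchanged."""
    return ts.replace(tzinfo=None)


def cast_via_utc(ts: datetime) -> datetime:
    """Convert to UTC first, then drop the zone."""
    return ts.astimezone(timezone.utc).replace(tzinfo=None)


# 11:30 local time in the example zone is 18:30 UTC.
ts = datetime(2024, 8, 28, 11, 30, 0, tzinfo=EXAMPLE_ZONE)

local_naive = cast_keep_wall_clock(ts)
utc_naive = cast_via_utc(ts)

print(local_naive.isoformat())
print(utc_naive.isoformat())
```

The two results differ by exactly the zone offset, which is why the choice between them is visible as soon as the input is not already in UTC.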
@devanbenz Agreed |
Okay sounds good @findepi -- I do see: https://github.com/apache/arrow-rs/pull/5831/files which appears to be relevant. I think this change would be better suited for arrow vs. datafusion I suppose? |
@devanbenz i think so, yes |
@findepi @alamb It actually appears that datafusion (and, for that matter, apache arrow) is casting as expected afaik. It looks like the
Should built in functions be following PG as well? Postgres
Clickhouse
Datafusion
Note: All queries were run at around 11:30 AM CST. The PG and CH queries use the local TZ, whereas DF does not and uses Zulu time instead. Clickhouse does not support timestamptz, which is why it is not in the query outputs given in the examples. Taking a look at a similar query with
But instead of stripping it and making it Zulu time it just sets it back to whatever the @alamb I think that we shouldn't modify the current casting since it appears to be functioning as expected per the arrow spec apache/arrow-rs#5827 (comment) (at least for this bug specifically) but we should change the way that |
Just for the sake of clarity, my original description of the behaviour holds with times other than
That being said, if this is the expected behaviour of other SQL dialects then it seems reasonable to me that it would behave the same way in DataFusion as well. |
Ah yeah just testing something similar out in
But taking a look at postgres that is not the case as you had mentioned above:
Taking a look at MariaDB I'm seeing a similar output:
I think the |
Continuation from this thread
This investigation did lead me to two other findings that I think should be acted on. They are not
I believe the functionality is there in the code since I see that the |
I think you can do it like select now() AT TIME ZONE 'America/Denver'; Though it would be awesome if that was documented (it doesn't appear to be in the SQL reference): https://datafusion.apache.org/search.html?q=AT+TIMEZONE |
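As a rough illustration of the idea behind AT TIME ZONE, the Python sketch below re-expresses a zone-aware timestamp in another zone without changing the instant it denotes. This is an assumption about the intended semantics, not DataFusion's implementation. The helper name `at_time_zone` is hypothetical, and a fixed UTC-6 offset stands in for 'America/Denver'.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-6 offset used in place of 'America/Denver'
# (its daylight-time offset) to keep the example self-contained.
DENVER_DAYLIGHT = timezone(timedelta(hours=-6))


def at_time_zone(ts: datetime, zone: timezone) -> datetime:
    """Hypothetical helper: same instant, rendered in the target zone."""
    return ts.astimezone(zone)


utc_now = datetime(2024, 8, 28, 17, 30, 0, tzinfo=timezone.utc)
denver_now = at_time_zone(utc_now, DENVER_DAYLIGHT)

print(utc_now.isoformat())
print(denver_now.isoformat())
```

Both values compare equal as instants. Only the wall-clock fields and the attached offset differ.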
Taking a look at this, it appears that casting to a timezone does work in datafusion, but it doesn't function how I would expect (or maybe it needs to be a different type of cast, similar to clickhouse syntax?):
Clickhouse:
Datafusion:
That's more what I mean: when casting to a timestamp with a given timezone, the output should abide by the timezone, whereas it appears that |
Unfortunately as it stands the |
@Omega359 good point |
I think you could potentially get the server time from the https://docs.rs/datafusion/latest/datafusion/logical_expr/struct.ScalarUDF.html#method.simplify (aka you could rewrite |
I looked into that @alamb but I think what is required is access to ConfigOptions (so we can get tz from config, etc). I don't think
Augmenting |
Yeah I agree. There are other usecases for storing state / customized in UDFs (like pre-compiling regexps once per query rather than once per batch, for example). I keep trying to find some common API / pattern we can put in and follow Related discussions |
Describe the bug
Using the ::timestamp (or CAST()) function strips timezone information.
To Reproduce
Or as seen with actual values:
Expected behavior
I would expect that using the ::timestamp cast on a timestamp would not impact its timezone. This is how Postgres behaves.
Additional context
No response