Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL Date Nanos Addition and Subtraction #116839

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

not-napoleon
Copy link
Member

Resolves #109995

This adds support and tests for addition and subtraction of date nanos with periods and durations. It does not include support for date_diff, which is a separate ticket (#109999). The bulk of the PR is testing, the actual date math is all handled by library functions.

@not-napoleon not-napoleon added >non-issue auto-backport Automatically create backport pull requests when merged test-release Trigger CI checks against release build :Analytics/ES|QL AKA ESQL v9.0.0 v8.17.0 labels Nov 14, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Nov 14, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@@ -413,6 +413,14 @@ public static boolean isDateTimeOrTemporal(DataType t) {
return isDateTime(t) || isTemporalAmount(t);
}

public static boolean isDateTimeOrNanosOrTemporal(DataType t) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered adding DATE_NANOS to isDateTime, but that creates a few problems. First off, we use isDateTime in a lot of places, and touching all of them would have made this already quite large PR even larger. Second, there are clearly some places, such as EsqlDataTypeConverter which clearly need to check for exactly DATETIME type, and would have incorrect behavior if we expanded isDateTime to include DATE_NANOS. Given that, it seemed like adding a new function was the safest way to proceed.

.orElse(UNSUPPORTED);
try {
return types.stream()
.min((dt1, dt2) -> dataTypeCastingPriority.get(dt1).compareTo(dataTypeCastingPriority.get(dt2)))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a lot of time digging into why tests were failing here. Ultimately, it's unclear to me why the other functions supporting date nanos were not throwing an NPE here. The NPE happens because dataTypeCastingPriority.get(DATE_NANOS) returns null, since date nanos are not in that map, which then causes compareTo to fail with the observed NPE.

My understanding is that dataTypeCastingPriority is used for autocasting literals, which is not currently planned for date nanos(note that autocasting literals is different from the planned auto-union-type work in #110009). Given that, it seemed like adding DATE_NANOS to the map would result in incorrect behavior. Having said that, it is deeply unclar to me if catching the exception and returning UNSUPPORTED is correct here. Advice would be welcome.

@@ -117,13 +125,44 @@ public static Iterable<Object[]> parameters() {
return new TestCaseSupplier.TestCase(
List.of(
new TestCaseSupplier.TypedData(lhs, DataType.DATETIME, "lhs"),
new TestCaseSupplier.TypedData(rhs, DataType.DATE_PERIOD, "rhs")
new TestCaseSupplier.TypedData(rhs, DataType.DATE_PERIOD, "rhs").forceLiteral()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the course of working on this, I noticed that SubTests was only running the folding tests for date math, because we hadn't been using forceLiteral here. That also meant we weren't checking the evaluator to string, and the old value was not correct; the new evaluator string pattern shown here has been the actual value since long before this PR. I fixed that here and in a test below.

@@ -210,7 +250,7 @@ public static Iterable<Object[]> parameters() {
return original.getData().get(nullPosition == 0 ? 1 : 0).type();
}
return original.expectedType();
}, (nullPosition, nullData, original) -> original);
}, (nullPosition, nullData, original) -> nullData.isForceLiteral() ? equalTo("LiteralsEvaluator[lit=null]") : original);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also follows from using forceLiteral on the periods and durations.

private static long addNanos(Long date, TemporalAmount period) {
return DateUtils.toLong(
Instant.from(
ZonedDateTime.ofInstant(DateUtils.toInstant(date), org.elasticsearch.xpack.esql.core.util.DateUtils.UTC).plus(period)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not thrilled that this is essentially copied from the production code. It means these tests are not doing a whole lot to ensure we are getting correct values out of the computation. Unfortunately, I don't know of a more obviously correct way to check these results. I think it's okay for a couple of reasons:

  • A lot of these tests are meant to show that the function fits into the ecosystem correctly (e.g. has a good toString value, folds correctly, handles exactly the types it claims to handle, etc).
  • We have a lot of CSV tests validating the actual date math
  • The date math itself is being done by the java time library, which is quite well tested.

That said, I'm open to suggestions if anyone has a better idea. Although clearly we made the same choice for millisecond date math, so I'm assuming we didn't have a better idea then. (comment applies to the verification function for the subtraction tests as well)

 Conflicts:
	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) test-release Trigger CI checks against release build v8.17.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ESQL] Support addition and subtraction of date nanos
2 participants