test: interval analysis unit tests #14189

Open
wants to merge 1 commit into
base: main

hiltontj
Contributor

Added unit tests for the interval analysis method, which converts an Expr tree to a set of Intervals for the columns in a given schema.

Which issue does this PR close?

I did not have an issue for this, but I was experimenting with the analyze method for converting Expr trees into sets of Intervals and pushed the resulting tests up, since I did not see any that directly tested the analyze method.

Rationale for this change

I did not see any unit tests for the analyze method. In addition to this example, these provide a bit more of a sense of how the method works for different boundary expressions in queries.
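
To make the boundary behavior concrete, here is a minimal, dependency-free sketch of the idea the tests exercise. It assumes a single Int64 column compared against a literal. `Bounds`, `Cmp`, `bounds_for`, and `intersect` are hypothetical names used only for illustration; they are not DataFusion's `analyze` API or its `Interval` type.

```rust
/// Closed integer interval; `None` means unbounded on that side.
/// (Hypothetical stand-in, not DataFusion's `Interval`.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Bounds {
    pub lower: Option<i64>,
    pub upper: Option<i64>,
}

/// Comparison of a column against an integer literal (hypothetical).
#[derive(Debug, Clone, Copy)]
pub enum Cmp {
    Gt,
    GtEq,
    Lt,
    LtEq,
    Eq,
}

/// Bounds implied by `column <op> value`, or `None` if no i64 can satisfy it.
pub fn bounds_for(op: Cmp, value: i64) -> Option<Bounds> {
    match op {
        // Strict comparisons move the bound by one because the domain is integral.
        Cmp::Gt => value.checked_add(1).map(|lo| Bounds { lower: Some(lo), upper: None }),
        Cmp::GtEq => Some(Bounds { lower: Some(value), upper: None }),
        Cmp::Lt => value.checked_sub(1).map(|hi| Bounds { lower: None, upper: Some(hi) }),
        Cmp::LtEq => Some(Bounds { lower: None, upper: Some(value) }),
        Cmp::Eq => Some(Bounds { lower: Some(value), upper: Some(value) }),
    }
}

/// Intersection of two bounds (the effect of `AND`), or `None` if it is empty.
pub fn intersect(a: Bounds, b: Bounds) -> Option<Bounds> {
    // Keep the larger lower bound; a missing bound acts as negative infinity.
    let lower = match (a.lower, b.lower) {
        (Some(x), Some(y)) => Some(x.max(y)),
        (x, y) => x.or(y),
    };
    // Keep the smaller upper bound; a missing bound acts as positive infinity.
    let upper = match (a.upper, b.upper) {
        (Some(x), Some(y)) => Some(x.min(y)),
        (x, y) => x.or(y),
    };
    match (lower, upper) {
        (Some(lo), Some(hi)) if lo > hi => None,
        _ => Some(Bounds { lower, upper }),
    }
}
```

Under these assumptions, `a > 10` yields a lower bound of 11 with no upper bound, which matches the `("a > 10", Some(11), None)` case in the PR's test table.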

What changes are included in this PR?

Two new unit tests in the datafusion/physical-expr/src/analysis.rs module.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

github-actions bot added the physical-expr (Physical Expressions) label on Jan 18, 2025
@@ -57,6 +57,7 @@ petgraph = "0.7.1"
 [dev-dependencies]
 arrow = { workspace = true, features = ["test_utils"] }
 criterion = "0.5"
+datafusion = { workspace = true }
Contributor Author

I suspect this is the line that caused the circular dependencies check to fail. I added it so I could get the SessionContext in my test for parsing the SQL expressions - perhaps there is a better way to do that...

Contributor

I think the way to do so would be to put the test into one of the core integration suites: https://github.com/apache/datafusion/blob/main/datafusion/core/tests/core_integration.rs

Then you run it with:

cargo test --test core_integration

Basically, SessionContext lives in a different crate (datafusion) that depends on this crate, not the other way around, so adding it as a dev-dependency here creates a cycle.

Contributor

@alamb left a comment

Thanks @hiltontj -- this looks quite cool

Maybe @berkaysynnada knows if there are existing tests and/or where the tests could go

let schema = Arc::new(Schema::new(vec![make_field("a", DataType::Int64)]));
type TestCase = (&'static str, Option<i64>, Option<i64>);
let test_cases: Vec<TestCase> = vec![
("a > 10", Some(11), None),
Contributor

Another approach is to avoid parsing SQL and instead build these expressions programmatically

Like

Suggested change
("a > 10", Some(11), None),
(col("a").gt(lit(10)), Some(11), None),

There are some other examples here: https://docs.rs/datafusion/latest/datafusion/logical_expr/enum.Expr.html#column-references-and-literals

Contributor Author

I agree, I think it makes more sense to use the expression helpers for two reasons:

  • It brings these tests closer to unit tests and further from integration tests, i.e., no need for SessionContext and SQL parsing
  • It removes the need for the circular dependency

I don't mind changing them over to use the helpers.
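
For readers following along, this is roughly what a table of test cases looks like once the SQL strings are replaced by builder calls. It is a self-contained sketch: the `Expr` enum, `col`, `lit`, `gt`, `lt_eq`, and `simple_bounds` below are toy stand-ins that only mirror the shape of the helpers suggested in the review. They are not the DataFusion types, and `simple_bounds` is not the `analyze` method.

```rust
/// Toy expression tree (hypothetical; only mirrors the shape of a real `Expr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    Gt(Box<Expr>, Box<Expr>),
    LtEq(Box<Expr>, Box<Expr>),
}

/// Builds a column reference.
pub fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

/// Builds an integer literal.
pub fn lit(value: i64) -> Expr {
    Expr::Literal(value)
}

impl Expr {
    /// `self > other`
    pub fn gt(self, other: Expr) -> Expr {
        Expr::Gt(Box::new(self), Box::new(other))
    }

    /// `self <= other`
    pub fn lt_eq(self, other: Expr) -> Expr {
        Expr::LtEq(Box::new(self), Box::new(other))
    }
}

/// Returns `(lower, upper)` for `column <op> literal`.
/// Returns `None` for unsupported shapes or if the bound would overflow.
pub fn simple_bounds(expr: &Expr) -> Option<(Option<i64>, Option<i64>)> {
    match expr {
        Expr::Gt(l, r) => match (l.as_ref(), r.as_ref()) {
            // Strict `>` on integers raises the lower bound by one.
            (Expr::Column(_), Expr::Literal(v)) => Some((Some(v.checked_add(1)?), None)),
            _ => None,
        },
        Expr::LtEq(l, r) => match (l.as_ref(), r.as_ref()) {
            (Expr::Column(_), Expr::Literal(v)) => Some((None, Some(*v))),
            _ => None,
        },
        _ => None,
    }
}

/// One row of the table: expression, expected lower bound, expected upper bound.
pub type TestCase = (Expr, Option<i64>, Option<i64>);

/// Test table built with the helpers instead of SQL strings.
pub fn test_cases() -> Vec<TestCase> {
    vec![
        (col("a").gt(lit(10)), Some(11), None),
        (col("a").lt_eq(lit(20)), None, Some(20)),
    ]
}
```

A unit test would loop over `test_cases()` and compare `simple_bounds(&expr)` with the expected pair. Nothing here needs a SessionContext or a SQL parser, which is the point made in the two bullets above.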

Contributor Author

I did just that in a43bf00

Added unit tests to interval analysis method which converts Expr tree
to a set of Intervals for columns in a given schema.
@hiltontj force-pushed the hiltontj/analyze-interval-tests branch from 656f4fd to a43bf00 on January 19, 2025 14:01