September 2024 ASF Board Report #10156

Closed
alamb opened this issue Apr 20, 2024 · 6 comments

alamb commented Apr 20, 2024

Is your feature request related to a problem or challenge?

Per https://whimsy.apache.org/roster/committee/datafusion the DataFusion ASF board report schedule is:

March, June, September, December

Describe the solution you'd like

I would like to draft a board report for the ASF board meeting, ideally with community help.

The meetings are typically in the second or third week of the month.

Describe alternatives you've considered

I plan to do this in the same style that worked well in Arrow (see an example from @andygrove here: https://lists.apache.org/thread/7w4mgy98qomc6drvj2fo81gvhq6p0boc): make a Google Doc (or issue) that people can add relevant content to, and then the chair (me for the time being) submits it to the board.

Additional context

No response

alamb commented Aug 20, 2024

Due Wed Sep 11 2024 (in 22 days)

alamb commented Aug 20, 2024

Draft document is https://docs.google.com/document/d/10MrrMHyfALB9Vpjtl-6abGWdNz6VLiWR9yI1a6Yl1aw/edit

I also sent a note to the mailing list, but I don't see it posted yet.

alamb commented Sep 9, 2024

This is due Wed Sep 11

According to our records, you are listed as the chair of DataFusion,
a committee that is due to submit a report by Wed Sep 11th
for the next ASF board meeting. This is an initial reminder to prepare a
report for DataFusion and submit it as described below.

alamb commented Sep 9, 2024

I updated the doc https://docs.google.com/document/d/10MrrMHyfALB9Vpjtl-6abGWdNz6VLiWR9yI1a6Yl1aw/edit and plan to submit on the 11th.

Current content for the curious:

Description:

The mission of Apache DataFusion is the creation and maintenance of software
related to an extensible query engine

Project Status:

Current project status: New + Ongoing (high activity)
Issues for the board: None

Membership Data:

Apache DataFusion was founded 2024-04-16 (5 months ago)
There are currently 37 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:2.

Community changes, past quarter:

  • Jay Zhan was added to the PMC on 2024-08-11
  • Mehmet Ozan Kabak was added to the PMC on 2024-06-12
  • Ruihang Xia was added to the PMC on 2024-06-12
  • Berkay Şahin was added as committer on 2024-08-28
  • Eduard Karacharov was added as committer on 2024-08-14
  • Lewis Zhang was added as committer on 2024-06-14
  • Tim Saucer was added as committer on 2024-09-07
  • Weijun Huang was added as committer on 2024-08-27

Project Activity:

The project continues to be active with many PRs and issues opened and
closed per day.

We wrote two public blog posts about our work (1, 2), and DataFusion and systems built on it are being featured in high-profile (for the database world) venues such as the CMU Database Systems Seminar.

We are working to adopt the sqlparser crate into the project as well.

DataFusion core

https://github.com/apache/datafusion

We continue the monthly release cadence with versions 40.0.0 and 41.0.0, and are on track for version 42.0.0. The 41.0.0 release had almost 70 unique contributors.

We are currently focused on performance, including high cardinality aggregates, and on adding support for StringViewArrays. We completed a long-running project to ensure all aggregate functions use the same API and are beginning the same project for window functions.

We have been discussing what features to include, and working to add LogicalTypes, as well as to create a more differentiated CLI experience. See the roadmap ticket for more details.

Sub project: DataFusion Python

https://github.com/apache/datafusion-python

The DataFusion Python project has received significant contributions recently to make the project more “Pythonic” and now has regular activity from maintainers. Tim Saucer has been added as a committer who focuses more heavily on datafusion-python.

Sub project: DataFusion Comet

https://github.com/apache/datafusion-comet

The Comet project is very active and recently released its initial 0.1.0
source release. Blog post: https://datafusion.apache.org/blog/2024/07/20/datafusion-comet-0.1.0/

Sub project: DataFusion Ballista

https://github.com/apache/datafusion-ballista
https://github.com/apache/datafusion-ballista-python

The Ballista subproject is not very actively maintained, but there have been
some contributions recently to upgrade to more recent versions of the core
DataFusion project.

Recent Releases

  • COMET-0.2.0 was released on 2024-08-28.
  • 41.0.0 was released on 2024-08-11.
  • PYTHON-39.0.0 was released on 2024-07-02.
  • 39.0.0 was released on 2024-06-10.
  • PYTHON-38.0.1 was released on 2024-05-30.
  • PYTHON-37.1.0 was released on 2024-05-13.
  • 38.0.0 was released on 2024-05-10.

Community Health:

It is still hard to keep track of everything going on these days, which is a good thing. While it is always a struggle to get enough code review capacity, the committers keep things going and the community helps each other out with reviews. We continue to actively grow our committer and PMC ranks.

There are currently four meetups planned: New York City, San Francisco (for the second time!), Belgrade, and Seattle.

alamb commented Sep 11, 2024

Final submitted report:

Description:

The mission of Apache DataFusion is the creation and maintenance of software
related to an extensible query engine

Project Status:

Current project status: New + Ongoing (high activity)
Issues for the board: None

Membership Data:

Apache DataFusion was founded 2024-04-16 (5 months ago)
There are currently 37 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:2.

Community changes, past quarter:

  • Jay Zhan was added to the PMC on 2024-08-11
  • Mehmet Ozan Kabak was added to the PMC on 2024-06-12
  • Ruihang Xia was added to the PMC on 2024-06-12
  • Berkay Şahin was added as committer on 2024-08-28
  • Eduard Karacharov was added as committer on 2024-08-14
  • Lewis Zhang was added as committer on 2024-06-14
  • Tim Saucer was added as committer on 2024-09-07
  • Weijun Huang was added as committer on 2024-08-27

Project Activity:

The project continues to be active with many PRs and issues opened and closed
per day.

We wrote two public blog posts about our work (1, 2), and DataFusion and
systems built on it are being featured in high-profile (for the database
world) venues such as the CMU Database Systems Seminar.

We are working to adopt the sqlparser crate into the project as well.

DataFusion core

https://github.com/apache/datafusion

We continue the monthly release cadence with versions 40.0.0 and 41.0.0, and
are on track for version 42.0.0. The 41.0.0 release had almost 70 unique
contributors.

We are currently focused on performance, including high cardinality
aggregates, and on adding support for StringViewArrays. We completed a
long-running project to ensure all aggregate functions use the same API and
are beginning the same project for window functions.

We have been discussing what features to include, and working to add
LogicalTypes, as well as to create a more differentiated CLI experience. See
the roadmap ticket for more details.

Sub project: DataFusion Python

https://github.com/apache/datafusion-python

The DataFusion Python project has received significant contributions recently
to make the project more “Pythonic” and now has regular activity from
maintainers. Tim Saucer has been added as a committer who focuses more heavily
on datafusion-python.

Sub project: DataFusion Comet

https://github.com/apache/datafusion-comet

The Comet project is very active and recently released its initial 0.1.0
source release.

Blog post:
https://datafusion.apache.org/blog/2024/07/20/datafusion-comet-0.1.0/

Sub project: DataFusion Ballista

https://github.com/apache/datafusion-ballista
https://github.com/apache/datafusion-ballista-python

The Ballista subproject is not very actively maintained, but there have been
some contributions recently to upgrade to more recent versions of the core
DataFusion project.

Recent Releases

  • COMET-0.2.0 was released on 2024-08-28.
  • 41.0.0 was released on 2024-08-11.
  • PYTHON-39.0.0 was released on 2024-07-02.
  • 39.0.0 was released on 2024-06-10.
  • PYTHON-38.0.1 was released on 2024-05-30.
  • PYTHON-37.1.0 was released on 2024-05-13.
  • 38.0.0 was released on 2024-05-10.

Community Health:

It is still hard to keep track of everything going on these days, which is a good
thing. While it is always a struggle to get enough code review capacity, the
committers keep things going and the community helps each other out with
reviews. We continue to actively grow our committer and PMC ranks.

There are currently four meetups planned: New York City, San Francisco (for
the second time!), Belgrade, and Seattle.

alamb closed this as completed Sep 11, 2024

alamb commented Sep 11, 2024

Next report is in December: #10157
