Make Fedora and CentOS a pleasant platform for Data Engineering.
The data engineering ecosystem consists largely of software that does not fit well with the Fedora Packaging Guidelines. It tends to be Java, Python, or similar software that relies on bundled JARs, eggs, and other dependencies that do not match what is provided in Fedora.
This SIG will not attempt to create packages for inclusion in the core Fedora repositories. Instead, we will create "vendor" packages that install this software in /opt/ while still integrating with the rest of the Fedora/CentOS ecosystem (systemd service files, configs in /etc with sane defaults, .desktop files for GUI apps, data directories in /var/lib, logs in systemd-journald, howtos, etc.). For cluster-type software, we will also create packages that contain Ansible scripts to help deploy the cluster.
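To make the layout above concrete, here is a minimal Python sketch of where the files of one vendor package might land and what its systemd unit could look like. The package name "exampledb", the binary path, and the helper names `VendorLayout` and `render_unit` are hypothetical and only illustrate the idea. This is not the SIG's actual tooling, and real packages may need different unit options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorLayout:
    """Filesystem layout for one vendor package (hypothetical helper)."""

    name: str

    @property
    def prefix(self) -> str:
        # Bundled software (JARs, eggs, binaries) lives under /opt/.
        return f"/opt/{self.name}"

    @property
    def config_dir(self) -> str:
        # Configuration with sane defaults lives under /etc.
        return f"/etc/{self.name}"

    @property
    def data_dir(self) -> str:
        # Mutable application data lives under /var/lib.
        return f"/var/lib/{self.name}"

    @property
    def unit_path(self) -> str:
        # Packaged systemd units are installed here on Fedora/CentOS.
        return f"/usr/lib/systemd/system/{self.name}.service"


def render_unit(layout: VendorLayout, description: str, exec_start: str) -> str:
    """Render a simple service unit; exec_start is relative to the /opt prefix."""
    lines = [
        "[Unit]",
        f"Description={description}",
        "After=network.target",
        "",
        "[Service]",
        # Assumption: the package creates a system user named after itself.
        f"User={layout.name}",
        f"WorkingDirectory={layout.data_dir}",
        # The leading "-" tells systemd to ignore a missing environment file.
        f"EnvironmentFile=-{layout.config_dir}/{layout.name}.env",
        f"ExecStart={layout.prefix}/{exec_start}",
        # Send output to systemd-journald instead of private log files.
        "StandardOutput=journal",
        "StandardError=journal",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    layout = VendorLayout("exampledb")
    print(f"# would be installed as {layout.unit_path}")
    print(render_unit(layout, "Example DB server", "bin/exampledb-server"))
```

In an actual package the unit would more likely be a static file shipped in the RPM. Generating it here just keeps the /opt, /etc, /var/lib, and journald pieces visible in one place.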
To do this, we will build our packages in COPR and upload them to the SIG COPR group.
We plan to package:
How you can help
If you are not sure how to help us bring data engineering tooling to Fedora, here are some areas where we need help:
- Packagers: Many interesting packages are not yet packaged for Fedora or need to be updated.
- Testers: Try the RPMs we have prepared and report bugs. We need your feedback to improve them.
- Documentation: Write howtos and documents that make it easier for others to adopt the tools.
For now, we are on the Fedora Malaysia Discord server. We will probably request an official mailing list if the team gets bigger.