From Fedora Project Wiki


m (Besser82 moved page Oozie packaging to SIGs/bigdata/packaging/Oozie: moved to common namespace)
 
Latest revision as of 13:34, 10 April 2014

This is a preliminary evaluation by the Big Data SIG of the work required to get Apache Oozie (http://oozie.apache.org/) into Fedora. The evaluation is based on the Oozie 4.0 release, built against the hadoop 2.2.0 package in Fedora 20+. More issues than those discussed here are likely to surface once more of the missing dependencies are packaged for Fedora.

Issues To Be Resolved

Missing Java Dependencies

{| class="wikitable"
|+ Missing/Questionable Dependencies
! Project !! State !! Review BZ !! Packager !! Notes
|-
| activemq || Active || || samkottler || 5.8.0 needed, 5.6.0 currently packaged. Used for tests, which are disabled. Substituted with geronimo-jms where needed.
|-
| apache-log4j-extras || Complete || RHBZ #1059384 || rrati ||
|-
| greenmail || Complete || RHBZ #1059805 || rrati ||
|-
| hbase || Complete || RHBZ #1045556 || rrati ||
|-
| hive/hcatalog || Complete || RHBZ #1065446 || pmackinn ||
|-
| jung2, jung-algorithms, jung-api, jung-graph-impl, jung-visualization || Complete || RHBZ #1069366 || rrati ||
|-
| pig || Complete || RHBZ #1060277 || pmackinn ||
|-
| sqoop || Active || || pmackinn || Sqoop modules are disabled
|}
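
The activemq row records a gap between the version Oozie needs (5.8.0) and the version packaged (5.6.0). As a rough sketch, such a gap can be checked by comparing dotted version strings as integer tuples. The function names here are hypothetical, and the comparison is deliberately simplified: it assumes purely numeric versions and does not reproduce RPM's own version comparison rules.

```python
def parse_version(text):
    """Split a dotted numeric version such as '5.8.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_new_enough(packaged, required):
    """Return True when the packaged version is at least the required one."""
    # Tuples compare element by element, so (5, 10, 0) sorts after (5, 8, 0),
    # which a plain string comparison would get wrong.
    return parse_version(packaged) >= parse_version(required)
```

Calling `is_new_enough("5.6.0", "5.8.0")` returns `False`, matching the state recorded for activemq in the table.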

Oozie Build System Issues

The Oozie build system is rigid and offers little configuration. For example, the build always appears to produce artifacts for both hadoop 1.1.1 and hadoop 2.x, even when a specific hadoop version is given at build time. Similarly, there does not appear to be a command-line option to disable test compilation. Disabling specific modules will likely require editing the poms.
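
To illustrate what editing the poms to disable modules could look like, here is a minimal sketch that strips named `<module>` entries from a POM. It is not taken from the Oozie packaging work: the function name and the module names used with it are hypothetical, and in a Fedora spec file this kind of edit would more likely be done with the `%pom_disable_module` macro from javapackages-tools.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"


def disable_modules(pom_xml, names):
    """Return the POM text with every <module> whose name is in `names` removed."""
    # Keep the POM namespace as the default (unprefixed) namespace on output.
    ET.register_namespace("", POM_NS)
    root = ET.fromstring(pom_xml)
    # <modules> can appear at the top level and inside profiles, so visit all.
    # The list() copies avoid mutating the tree while iterating over it.
    for modules in list(root.iter("{%s}modules" % POM_NS)):
        for module in list(modules):
            if (module.text or "").strip() in names:
                modules.remove(module)
    return ET.tostring(root, encoding="unicode")
```

For instance, passing a POM that lists hypothetical modules `core` and `sqoop` together with `{"sqoop"}` returns the same POM with only the `core` entry left.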

The hadoop versions are also hard-coded in the pom files, but this should not be an issue, since Fedora's Java build tools ignore version requests except in the case of compatibility packages.

By default, Oozie builds with and targets Java 1.6. Setting javaVersion=1.7 builds with Java 1.7, and setting targetJavaVersion=1.7 targets Java 1.7.
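
As a sketch of how these two settings might be supplied, the helper below assembles a Maven command line. It assumes the settings are passed as `-D` properties, which the text above does not state explicitly. The helper name and the `clean package` goals are placeholders, not the invocation used for the Fedora package.

```python
def maven_args(build_java="1.7", target_java="1.7", goals=("clean", "package")):
    """Build an argument list for a Maven run with the two Java version settings."""
    return [
        "mvn",
        *goals,
        # Assumption: both settings are supplied as Maven -D properties.
        "-DjavaVersion=" + build_java,
        "-DtargetJavaVersion=" + target_java,
    ]
```

The resulting list can be handed to `subprocess.run` from a checkout of the Oozie sources, or joined with spaces to be run by hand.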

Dependency Version Mismatches

Jetty

Oozie depends on jetty 6.1.14. Oozie will need to be updated to support the current version of jetty in Fedora.

Tomcat

Oozie depends on tomcat 6. Oozie will need to be updated to support the current version of tomcat in Fedora.

Webapp Configuration

Similar to Hadoop's HTTPFS, Oozie downloads a copy of tomcat and uses it to create a specially configured tomcat instance for Oozie. This will likely require the tomcat shell scripts. If so, the webapp cannot be packaged in Fedora until the tomcat shell scripts are packaged.