(Change Proposal ready for 2013-07-17 FESCo meeting (#1137)) |
(Change accepted en block on Jul 17 FESCo meeting (#1137)) |
Revision as of 13:13, 18 July 2013
Apache Hadoop 2.x
Summary
Provide native Apache Hadoop packages.
Owner
- Name: Matthew Farrellee
- Email: matt@fedoraproject.org
- Release notes owner:
Current status
- Targeted release: Fedora 20
- Last updated: 8 July 2013
- Tracker bug: <will be assigned by the Wrangler>
Detailed Description
Apache Hadoop is a widely used, increasingly complete big data platform, with a strong open source community and growing ecosystem. The goal is to package and integrate the core of the Hadoop ecosystem for Fedora, allowing for immediate use and creating a base for the rest of the ecosystem.
Benefit to Fedora
The Apache Hadoop software will be packaged and integrated with Fedora. The core of the Hadoop ecosystem will be available with Fedora and provide a base for additional packages.
Scope
- Proposal owners:
- Note: target is Apache Hadoop 2.0.5-alpha
- Package all dependencies needed for Apache Hadoop 2.x
- Package the Apache Hadoop 2.x software
- Other developers: N/A (not a System Wide Change)
- Release engineering: N/A (not a System Wide Change)
- Policies and guidelines: N/A (not a System Wide Change)
Upgrade/compatibility impact
N/A (not a System Wide Change)
How To Test
- Install the hadoop rpms with:
- yum install hadoop-common hadoop-hdfs hadoop-libhdfs hadoop-mapreduce hadoop-mapreduce-examples hadoop-yarn
- Start the cluster by issuing:
- systemctl start hadoop-namenode hadoop-datanode hadoop-nodemanager hadoop-resourcemanager
- Initialize the HDFS directories:
- hdfs-create-dirs
- Create a directory for the user running the tests:
- runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -mkdir /user/<name>"
- runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -chown <name> /user/<name>"
- The user from the previous step can run jobs like the following mapreduce examples:
- hadoop jar /usr/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar pi 10 1000000
- hadoop jar /usr/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar randomwriter out
- These three must be run in the order shown:
- hadoop jar /usr/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar teragen 100 gendata
- hadoop jar /usr/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar terasort gendata 100
- hadoop jar /usr/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar teravalidate gendata reportdata
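The steps above can be collected into a small script. What follows is only a sketch under stated assumptions: the names DRY_RUN, TEST_USER, run, setup_cluster and run_examples are illustrative and are not shipped with the hadoop packages. By default it only prints the commands, in the same order as the list above, so the sequence can be reviewed before anything is executed.

```shell
#!/bin/bash
# Sketch only. DRY_RUN, TEST_USER, run, setup_cluster and run_examples are
# illustrative names; they are not shipped with the hadoop packages.
DRY_RUN="${DRY_RUN:-1}"               # 1 prints each command, 0 executes it
TEST_USER="${TEST_USER:-$(id -un)}"   # user that will own /user/<name>
EXAMPLES_JAR=/usr/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar

# Echo a command, then execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Administrative steps (install, start services, prepare HDFS); run as root.
setup_cluster() {
    run yum install hadoop-common hadoop-hdfs hadoop-libhdfs hadoop-mapreduce \
        hadoop-mapreduce-examples hadoop-yarn || return 1
    run systemctl start hadoop-namenode hadoop-datanode hadoop-nodemanager \
        hadoop-resourcemanager || return 1
    run hdfs-create-dirs || return 1
    run runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -mkdir /user/$TEST_USER" || return 1
    run runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -chown $TEST_USER /user/$TEST_USER"
}

# Example jobs; run as TEST_USER. The three tera* jobs keep their order.
run_examples() {
    run hadoop jar "$EXAMPLES_JAR" pi 10 1000000 || return 1
    run hadoop jar "$EXAMPLES_JAR" randomwriter out || return 1
    run hadoop jar "$EXAMPLES_JAR" teragen 100 gendata || return 1
    run hadoop jar "$EXAMPLES_JAR" terasort gendata 100 || return 1
    run hadoop jar "$EXAMPLES_JAR" teravalidate gendata reportdata
}
```

Source the file, call setup_cluster as root, then call run_examples as the test user. Set DRY_RUN=0 once the printed commands look right.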
User Experience
N/A (not a System Wide Change)
Dependencies
N/A (not a System Wide Change)
Contingency Plan
- Contingency mechanism: N/A (not a System Wide Change)
- Contingency deadline: N/A (not a System Wide Change)
- Blocks release? N/A (not a System Wide Change)
Documentation
Release Notes
- TODO
Effort details
People involved
Name | IRC | Focus | Additional |
---|---|---|---|
Matthew Farrellee | mattf | keeping track, integration testing | UTC-5 |
Peter MacKinnon | pmackinn | packaging, testing | UTC-5 |
Rob Rati | rsquared | packaging | UTC-5 |
Timothy St. Clair | tstclair | config, upstream tracking | UTC-6 |
Sam Kottler | skottler | packaging | UTC-5 |
Gil Cattaneo | gil | packaging | UTC+1 |
Christopher Meng | cicku | packaging, testing | UTC+8 |
Detailed status
- Last updated: 11 July 2013
- Percentage of completion
- Dependencies available in Fedora (missing since project initiation): 100%
- Adaptation of Hadoop 2.0.5a source via patches: 100%
- Hadoop spec completion: 95%
- Test suite (Updated to 2.0.5a; all skips are from upstream)
Module | Tests | Failures | Errors | Skipped |
---|---|---|---|---|
hadoop-auth | 48 | 0 | 0 | 0 |
hadoop-common | 1828 | 1 | 0 | 6 |
hadoop-hdfs | 1644 | 2 | 5 | 6 |
hadoop-hdfs-httpfs | 283 | 0 | 0 | 0 |
hadoop-hdfs-bkjournal | 31 | 0 | 1 | 0 |
hadoop-yarn-common | 109 | 0 | 0 | 0 |
hadoop-yarn-client | 17 | 0 | 0 | 0 |
hadoop-yarn-server-common | 3 | 0 | 0 | 0 |
hadoop-yarn-server-nodemanager | 153 | 0 | 1 | 0 |
hadoop-yarn-server-web-proxy | 9 | 0 | 0 | 0 |
hadoop-yarn-server-resourcemanager | 277 | 0 | 0 | 0 |
hadoop-yarn-server-tests | 7 | 0 | 0 | 0 |
hadoop-yarn-applications-distributedshell | 2 | 0 | 0 | 0 |
hadoop-yarn-applications-unmanaged-am-launcher | 1 | 0 | 0 | 0 |
hadoop-mapreduce-examples | 11 | 0 | 0 | 1 |
hadoop-mapreduce-client-core | 56 | 0 | 0 | 0 |
hadoop-mapreduce-client-common | 43 | 0 | 0 | 0 |
hadoop-mapreduce-client-shuffle | 4 | 0 | 0 | 0 |
hadoop-mapreduce-client-app | 210 | 2 | 0 | 0 |
hadoop-mapreduce-client-jobclient | 443 | 0 | 3 | 12 |
hadoop-mapreduce-client-hs | 128 | 2 | 0 | 0 |
hadoop-mapreduce-client-hs-plugins | 1 | 0 | 0 | 0 |
hadoop-streaming | 55 | 0 | 6 | 0 |
hadoop-distcp | 112 | 0 | 0 | 0 |
hadoop-archives | 2 | 0 | 0 | 0 |
hadoop-rumen | 3 | 0 | 0 | 1 |
hadoop-gridmix | 44 | 0 | 0 | 0 |
hadoop-datajoin | 1 | 0 | 0 | 0 |
hadoop-extras | 20 | 0 | 0 | 1 |
Approach
We are taking an iterative, depth-first approach to packaging. We do not have all the dependencies mapped out ahead of time. Dependencies are being tabulated into two groups:
- missing - the dependency requested by a hadoop-common pom has not yet been packaged, reviewed, or built into the Fedora repositories
- broken - the requested dependency version is out of date relative to the version currently in Fedora, and patches must be developed for inclusion in a hadoop rpm build to address any build, API, or source code deltas
Note that a dependency may show up in both of these tables.
Anyone who wants to help should find an available dependency below and edit the table, changing the state to Active and the packager to yourself.
If you are lucky enough to pick a dependency that itself has unpackaged dependencies, identify the sub-dependencies and add them to the bottom of the Dependencies table below, change your current dependency to Blocked and repeat.
If your dependency is already packaged but the version is incompatible, contact the package owner and resolve the incompatibility in a mutually satisfactory way. For instance:
- If the version available in Fedora is older, explore updating the package. If that is not possible, explore creating a package that includes a version in its name, e.g. pkgnameXY. Ultimately, the most recent version in Fedora should have the name pkgname while older versions have pkgnameXY. It may take a full Fedora release to rationalize package names. Make a note in the Dependencies table.
- If the version you need is older than the packaged version, consider creating a patch to use the newer version. If a patch is not viable, proceed by packaging the dependency with a version in its name, e.g. pkgnameXY. Make a note in the Dependencies table.
Tattletale dependency graph data is available for both the baseline branch and the Fedora development branch.
Running and debugging the unit test suite is discussed in the test suite section below and results are maintained in the test suite results table.
You will run into situations where the Apache Hadoop source needs to be patched to handle the Fedora version of a dependency. Those patches are candidates to propose upstream, are tracked in the upstream patch tracking table, and are maintained in the source repositories below. Any changes that are required to conform to Fedora's packaging guidelines or to deal with a package naming issue should be confined to the hadoop spec file.
In handling patches, the intention of this process is to isolate changes to a single dependency so that the resulting patches can be consumed upstream. Changes to the source must be isolated to one dependency and must be self-contained. A dependency is not necessarily a single jar file. Changes for a dependency should include everything needed to use the jar files from a later release of that dependency.
- https://github.com/fedora-bigdata/hadoop-common Fork of Apache Hadoop for changes required to support compilation on Fedora
- https://github.com/fedora-bigdata/hadoop-rpm Spec and supporting files for generating an RPM for Fedora
Dependency Branches
All code/build changes to Apache Hadoop should be performed on a branch in the hadoop-common repo that should be based off the
- branch-2.0.5-alpha
branch and should follow this naming convention:
- fedora-patch-<dependency>
Where <dependency> is the name of the dependency being worked on. Changes to this branch should ONLY relate to that dependency. Do not include the dependency version in the branch name. These branches will be updated as needed for Fedora or Hadoop updates until they are accepted upstream by Apache Hadoop. Leaving out the version lets a branch move from version 1->2->3 without confusion if that is required before the patch is accepted upstream.
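As a rough illustration of the convention above, a dependency branch could be started as follows. This is a sketch, not an official helper: the function names dep_branch and start_dep_branch are made up here, and it assumes the remote named origin points at the fedora-bigdata/hadoop-common repository.

```shell
#!/bin/bash
# Sketch only: dep_branch and start_dep_branch are illustrative names.
BASE_BRANCH=branch-2.0.5-alpha

# Print the branch name for a dependency (no version in the name).
dep_branch() {
    echo "fedora-patch-$1"
}

# Create the working branch for one dependency from the base branch.
# Assumes "origin" is the fedora-bigdata/hadoop-common remote.
start_dep_branch() {
    local dep="$1"
    git fetch origin &&
        git checkout -b "$(dep_branch "$dep")" "origin/$BASE_BRANCH"
}
```

For example, start_dep_branch jetty would create fedora-patch-jetty from origin/branch-2.0.5-alpha.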
Integration Branch
An integration branch should be created in the hadoop-common repository that corresponds with the release version being packaged using the following naming convention:
- fedora-<version>-integration
where <version> is the Hadoop version being packaged. All branches containing changes that have not yet been accepted upstream should be merged to the integration branch, and the result should pass the build and all tests. Once this is complete, a patch should be generated and pushed to the hadoop-rpm repository.
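One possible way to assemble the integration branch is sketched below. It is an assumption-laden outline rather than the project's actual procedure: the function names and the patch file name are hypothetical, the base branch is assumed to be origin/branch-<version>, and the build and test run still has to happen between the merge and the patch generation.

```shell
#!/bin/bash
# Sketch only: function names and the patch file name are hypothetical.

# Print the integration branch name for a Hadoop version.
integration_branch() {
    echo "fedora-$1-integration"
}

# Usage: build_integration <version> <dependency-branch>...
# Merges each not-yet-upstream branch, then writes one combined patch.
build_integration() {
    local version="$1"
    shift
    local branch
    branch="$(integration_branch "$version")"
    # Assumption: the upstream base is origin/branch-<version>.
    git checkout -b "$branch" "origin/branch-$version" || return 1
    local dep
    for dep in "$@"; do
        git merge --no-edit "origin/$dep" || return 1
    done
    # Build and run the test suite here before generating the patch.
    git diff "origin/branch-$version" "$branch" > "hadoop-fedora-integration.patch"
}
```

For example, build_integration 2.0.5-alpha fedora-patch-math fedora-patch-jetty would produce fedora-2.0.5-alpha-integration and a single patch file for the hadoop-rpm repository.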
Test suite
Before running any part of the test suite, you must first build the components (F18, F19):
- git clone git://github.com/fedora-bigdata/hadoop-common.git
- cd hadoop-common
- git checkout -b fedora-2.0.5-alpha-test origin/fedora-2.0.5-alpha-test
- mvn-rpmbuild -Pdist,native -DskipTest -DskipTests -DskipIT install
If you are interested in the whole ball of wax then
- mvn-rpmbuild -X -Dorg.apache.jasper.compiler.disablejsr199=true test
and go mow a football field or knit a sweater. Note that this could still result in spurious failures. Add -Dmaven.test.failure.ignore=true to the above line if you're seeking just test errors.
The fedora-2.0.5-alpha-test branch excludes identified consistently failing tests. You can edit your copy of hadoop-project/pom.xml to bring any of them back into play.
To investigate specific failures, such as the active ones in the test suite results table, target the module, test class, and even method as you see fit:
- mvn-rpmbuild -X -pl :hadoop-common test -Dtest=TestSSLHttpServer#testEcho
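The targeting pattern above can be wrapped in a small helper that prints the command for a given module, class and method. This is a minimal sketch; test_cmd is an illustrative name, not an existing tool, and it only assembles the same options shown in the example.

```shell
#!/bin/bash
# Sketch only: test_cmd is an illustrative helper name.
# Usage: test_cmd <module> [<TestClass> [<method>]]
test_cmd() {
    local module="$1" class="$2" method="$3"
    local cmd="mvn-rpmbuild -X -pl :$module test"
    if [ -n "$class" ]; then
        cmd="$cmd -Dtest=$class"
        if [ -n "$method" ]; then
            cmd="$cmd#$method"
        fi
    fi
    echo "$cmd"
}
```

Calling test_cmd hadoop-common TestSSLHttpServer testEcho prints the command from the example; omitting the method or the class widens the run to the whole class or module.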
All your hard work results in a patch? Great! Hit a contributor up with it and we'll review and apply if everything looks cool.
The -Dorg.apache.jasper.compiler.disablejsr199=true option is required for TestHttpServer#testContentTypes to pass, due to the use of glassfish JSP support.
Dependencies
State | Notes |
---|---|
Available | free for someone to take |
Active | dependency is actively being packaged if missing, or patch is being developed or tested for inclusion in hadoop-common build |
Blocked | pending packages for dependencies |
Review | under review, include link to review BZ |
Complete | woohoo! |
Project | State | Review BZ | Packager | Notes |
---|---|---|---|---|
hadoop | Active | | rrati,pmackinn | |
bookkeeper | Complete | RHBZ #948589 | gil | Version 4.0 requested. packaged 4.2.1. Patch: BOOKKEEPER-598 |
glassfish-gmbal | Complete | RHBZ #859112 | gil | F18 build |
glassfish-management-api | Complete | RHBZ #859110 | gil | F18 build |
grizzly | Complete | RHBZ #859114 | gil | Only for F20 for now. Cause: missing glassfish-servlet-api on F18 and F19. |
groovy | Complete | RHBZ #858127 | gil | 1.5 requested but 1.8 packaged in Fedora. Possibly, moving forward, the 1.8 series will be known as groovy18 and groovy will be 2.x. |
jersey | Complete | RHBZ #825347 | gil | F18 build. Should be rebuilt with grizzly2 support enabled. |
jets3t | Complete | RHBZ #847109 | gil | |
jspc-compiler | Complete | RHBZ #960720 | pmackinn | Passes preliminary overall hadoop-common compilation/testing. |
maven-native | Complete | RHBZ #864084 | gil | Needs patch to build with java7. NOTE: javac target/source is already set by mojo.java.target option |
zookeeper | Complete | RHBZ #823122 | gil | requires jtoaster |
Project | Packager | Notes |
---|---|---|
ant | | Version 1.6 requested, 1.8 currently packaged in Fedora. Needs to be inspected for API/functional incompatibilities(?) |
apache-commons-collections | pmackinn | Java import compilation error with existing package. Patches for hadoop-common being tracked at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-collections |
apache-commons-math | pmackinn | Current apache-commons-math uses math3 in pom instead of math, and API changes in code. Patches for hadoop-common being tracked at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-math |
cglib | pmackinn | Missing an explicit dep which old dep chain didn't need. Patches for hadoop-common being tracked at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-cglib |
ecj | rrati | Need ecj version ecj-4.2.1-6 or later to resolve a dependency lookup issue |
gmaven | gil | Version 1.0 requested, available 1.4 (but has broken deps) RHBZ #914056 |
hadoop-hdfs | pmackinn | glibc link error in hdfs native build. Patch for hadoop-common being tracked at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-cmake-hdfs |
hsqldb | tradej | 1.8 in fedora, update to 2.2.9 in the process. API compatibility to be checked. |
jersey | pmackinn | Needs jersey-servlet and version. Tracked at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-jersey |
jets3t | pmackinn | Requires 0.6.1. With 0.9.x: hadoop-common Jets3tNativeFileSystemStore.java error: incompatible types S3ObjectsChunk chunk = s3Service.listObjectsChunked(bucket.getName(). Patches for hadoop-common being tracked at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-jets3t |
jetty | rrati | jetty8 packaged in Fedora, but 6.x requested. 6 and 8 are incompatible. Patches tracked at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-jetty |
slf4j | pmackinn | Package in fedora fails to match in dependency resolution. jcl104-over-slf4j dep in hadoop-common moved to jcl-over-slf4j as part of jspc/tomcat dep. Patch being tracked at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-jasper |
tomcat-jasper | pmackinn | Version 5.5.x requested. Adaptations made for incumbent Tomcat 7 via patches at https://github.com/fedora-bigdata/hadoop-common/tree/fedora-patch-jasper. Reviewing fit as part of overall hadoop-common compilation/testing. |
Test suite results
Module | Name | Baseline 2.0.2a/2.0.5a | Fedora 2.0.2a | Fedora 2.0.5a | Tester | Notes |
---|---|---|---|---|---|---|
hadoop-common | TestDoAsEffectiveUser | Pass | Fail | Pass | pmackinn | Failed in hadoop-common test suite but frequently succeeded as standalone. testRealUserSetup |
hadoop-common | TestRPCCompatibility | Pass | Fail | Pass | pmackinn | Failed in hadoop-common test suite but frequently succeeded as standalone. testVersion2ClientVersion1Server: "expected:<3> but was:<-3>" |
hadoop-common | TestSSLHttpServer | Pass | Fixed | Fixed | pmackinn | Required addition of SslContextFactory (new to Jetty 8+) setup in advance of SslConnector activation. Tracked here. |
hadoop-hdfs | TestWebHdfsFileSystemContract | Pass | Fail | Pass | pmackinn | Spurious failures, infrequently reproducible. |
hadoop-hdfs | TestDelegationTokenForProxyUser | Pass | Fail | Pass | pmackinn | testWebHdfsDoAs: "expected:<200> but was:<401>". Can no longer reproduce. |
hadoop-hdfs | TestEditLogRace | Pass | Fail | Fail | pmackinn | testSaveRightBeforeSync |
hadoop-hdfs | TestHftpURLTimeouts | Pass | Fail | Pass | pmackinn | testHsftpSocketTimeout -> intermittent test failure. Infrequently reproducible. |
hadoop-hdfs | TestNameNodeMetrics | Fail | Fail | Pass | pmackinn | testCorruptBlock: Bad value for metric PendingReplicationBlocks expected:<0> but was:<1> in test suite. Was sporadic. Can no longer reproduce. |
hadoop-hdfs | TestCheckpoint | Fail | Fail | Pass | pmackinn | testSecondaryHasVeryOutOfDateImage: Test resulted in an unexpected exit in test suite. Standalone OK. Can not reproduce. |
hadoop-hdfs-bkjournal | TestBookKeeperJournalManager | Pass | Fixed | Fixed | pmackinn | testOneBookieFailure, testAllBookieFailure: BK 4.2 introduced "readonly" bookie which are enabled by default. This throws off test counts. |
hadoop-hdfs-bkjournal | TestBookKeeperAsHASharedDir | Pass | Fixed | Fixed | pmackinn | See above re: readonly bookies. |
hadoop-hdfs-bkjournal | TestBookKeeperHACheckpoints | Pass | Fail | Pass | pmackinn | testCheckpointWhenNoNewTransactionsHappened: Port in use: localhost:10001 (but not really). No longer reproducible. |
hadoop-yarn-server-nodemanager | TestNMWebServices* | Pass | Fail | Review | pmackinn | All tests in error with java.lang.InstantiationException: something with the adaptation of GuiceServletConfig. Can be removed in test code and made to pass but would like deeper understanding of failure. Fedora test-only fix tracked here. |
hadoop-yarn-server-nodemanager | TestNMWebServicesApps,Containers | Pass | Fail | Fixed | pmackinn | JSON assert logic needed fixing for new version of jettison. Tracked here. |
hadoop-yarn-server-resourcemanager | TestRMWebServices* | Pass | Fail | Review | pmackinn | All tests in error with java.lang.InstantiationException on GuiceServletConfig: same guice-servlet problem as noted previously. |
hadoop-yarn-server-resourcemanager | TestRMWebServicesApps,Nodes | Pass | Fail | Fixed | pmackinn | JSON assert logic needed fixing for new version of jettison. Tracked here. |
hadoop-yarn-server-resourcemanager | TestDelegationTokenRenewer | Pass | Fail | Pass | pmackinn | Was testDTRenewalWithNoCancel: renew wasn't called as many times as expected expected:<1> but was:<2>. No longer reproducible. |
hadoop-yarn-server-resourcemanager | TestApplicationTokens | Pass | Fail | Pass | pmackinn | Was testTokenExpiry: sporadic NPE. No longer reproducible. |
hadoop-yarn-server-resourcemanager | TestAppManager | Pass | Fail | Pass | pmackinn | Was testRMAppSubmit,testRMAppSubmitWithQueueAndName: app event type is wrong before expected:<KILL> but was:<APP_REJECTED>; sporadic. No longer reproducible. |
hadoop-yarn-client | TestYarnClient | Fail | Fail | Pass | pmackinn | Was testClientStop: Can only configure with YarnConfiguration. No longer reproducible. |
hadoop-yarn-applications-unmanaged-am-launcher | TestUnmanagedAMLauncher | Fail | Fail | Pass | pmackinn | Seemed designed to execute once by successfully contacting an RM but repeatedly retries with: yarnAppState=FAILED, distributedFinalState=FAILED. No longer reproducible. |
hadoop-mapreduce-client-app | TestAMWebServices* | Pass | Fail | Review | pmackinn | All tests in error with java.lang.InstantiationException on GuiceServletConfig: same guice-servlet problem as noted previously. |
hadoop-mapreduce-client-hs | TestHsWebServices* | Pass | Fail | Review | pmackinn | All tests in error with java.lang.InstantiationException on GuiceServletConfig: same guice-servlet problem as noted previously. |
hadoop-mapreduce-client-jobclient | TestMiniMRProxyUser | Pass | Fail | Pass | pmackinn | Was testValidProxyUser: assert fail. No longer reproducible. |
hadoop-mapreduce-client-jobclient | TestMRJobs | Pass | Fail | Pass | pmackinn | Was testDistributedCache: assert fail. No longer reproducible. |
hadoop-mapreduce-client-jobclient | TestMapReduceLazyOutput | Pass | Fail | Fail | pmackinn | testLazyOutput: assert fail |
hadoop-mapreduce-client-jobclient | TestEncryptedShuffle | Pass | Fail | Pass | pmackinn | Was encryptedShuffleWithClientCerts,encryptedShuffleWithoutClientCerts: assert fail. No longer reproducible. |
hadoop-mapreduce-client-jobclient | TestJobName | Pass | Fail | Pass | pmackinn | Was testComplexName,testComplexNameWithRegex: Job failed! No longer reproducible. |
hadoop-mapreduce-client-jobclient | TestJobSysDirWithDFS | Pass | Fail | Fail | pmackinn | testWithDFS: Job failed! |
hadoop-mapreduce-client-jobclient | TestClusterMapReduceTestCase | Pass | Fail | Pass | pmackinn | Was testMapReduce,testMapReduceRestarting: Job failed! No longer reproducible. |
hadoop-mapreduce-client-jobclient | TestLazyOutput | Pass | Fail | Fail | pmackinn | testLazyOutput: Job failed! |
hadoop-mapreduce-client-jobclient | TestMiniMRWithDFSWithDistinctUsers | Pass | Fail | Fail | pmackinn | testDistinctUsers: Job failed!. Fails more often than passes. |
hadoop-mapreduce-client-jobclient | TestRMNMInfo | Pass | Fail | Pass | pmackinn | testRMNMInfoMissmatch: |
hadoop-mapreduce-client-hs | TestJobHistoryParsing | Pass | Fail | Fail | pmackinn | testCountersForFailedTask: sporadic failures using OpenJDK 7, none with Oracle JDK 1.6. Appears to be thread related: a task is assumed to be killed with no counters because it has a null status. |
hadoop-distcp | TestCopyCommitter | Pass | Fail | Pass | pmackinn | Was testNoCommitAction: Commit failed. No longer reproducible. |
Tests are listed in the order of execution
Baseline: F18, maven 3.0.5, Oracle JDK 1.6u45
Upstream patch tracking
Currently tracking against branch-2 @ https://github.com/timothysc/hadoop-common
Branch | Committer | JIRA | Target | Status |
---|---|---|---|---|
fedora-patch-math | pmackinn | https://issues.apache.org/jira/browse/HADOOP-9594 | 2.1.0-beta | PENDING REVIEW |
| | pmackinn | https://issues.apache.org/jira/browse/HADOOP-9605 | 2.0.5-alpha | COMMITTED |
| | rsquared | https://issues.apache.org/jira/browse/HADOOP-9607 | 2.0.5-alpha | COMMITTED |
fedora-patch-collections | pmackinn | https://issues.apache.org/jira/browse/HADOOP-9610 | 2.1.0-beta | PENDING REVIEW |
fedora-patch-cglib | pmackinn | https://issues.apache.org/jira/browse/HADOOP-9611 | 2.1.0-beta | PENDING REVIEW |
fedora-patch-jersey | pmackinn | https://issues.apache.org/jira/browse/HADOOP-9613 | 2.1.0-beta | PENDING REVIEW |
fedora-patch-jets3t | pmackinn | | 2.1.0-beta | PENDING REVIEW |
fedora-patch-jetty | pmackinn | https://issues.apache.org/jira/browse/HADOOP-9650 | 2.1.0-beta | RE-EVAL DUE TO F19 jetty-9 |
fedora-patch-jasper | pmackinn | https://lists.fedoraproject.org/pipermail/bigdata/2013-June/000026.html | N/A | Carrying Patch Until Further Notice |
| | | | 2.0.5-alpha | Already Modified Upstream |
Packager Resources
Packager tips
- mvn-rpmbuild utility will ONLY resolve from system repo
- mvn-local will resolve from system repo first then fallback to maven if unresolved
- can be used to find the delta between system repo packages available and missing dependencies that can be viewed in the .m2 local maven repo (find ~/.m2/repository -name '*.jar')
- -Dmaven.local.debug=true
- reveals how JPP lookups are executed per dependency: useful for finding groupId,artifactId mismatches
- -Dmaven.test.skip=true
- tells maven to skip test runs AND compilation
- useful for unblocking end-to-end build
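To make the find command in the tips above easier to read, the jar paths can be turned into groupId:artifactId:version coordinates. The sketch below assumes the standard Maven repository layout (group path, then artifactId, then version directory); list_fetched_artifacts is an illustrative name, not part of mvn-local or any Fedora tooling.

```shell
#!/bin/bash
# Sketch only: list_fetched_artifacts is an illustrative helper name.
# Lists groupId:artifactId:version for every jar under a local Maven
# repository, assuming the standard <group path>/<artifactId>/<version>/ layout.
list_fetched_artifacts() {
    local repo="${1:-$HOME/.m2/repository}"
    find "$repo" -name '*.jar' | while read -r jar; do
        local rel="${jar#"$repo"/}"          # path relative to the repo root
        local version_dir="${rel%/*}"        # group/path/artifactId/version
        local version="${version_dir##*/}"
        local artifact_dir="${version_dir%/*}"
        local artifact="${artifact_dir##*/}"
        local group="${artifact_dir%/*}"
        echo "${group//\//.}:$artifact:$version"
    done | sort -u
}
```

Running it after an mvn-local build gives a list of coordinates that were pulled from outside the system repo, which are candidates for the missing dependencies table.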
An alternative to gmaven:
- apply a patch with the following content where required
- test support is not guaranteed and should not be expected to work.
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <version>1.7</version>
  <dependencies>
    <dependency>
      <groupId>org.codehaus.groovy</groupId>
      <artifactId>groovy</artifactId>
      <version>any</version>
    </dependency>
    <dependency>
      <groupId>antlr</groupId>
      <artifactId>antlr</artifactId>
      <version>any</version>
    </dependency>
    <dependency>
      <groupId>commons-cli</groupId>
      <artifactId>commons-cli</artifactId>
      <version>any</version>
    </dependency>
    <dependency>
      <groupId>asm</groupId>
      <artifactId>asm-all</artifactId>
      <version>any</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-nop</artifactId>
      <version>any</version>
    </dependency>
  </dependencies>
  <executions>
    <execution>
      <id>compile</id>
      <phase>process-sources</phase>
      <configuration>
        <target>
          <mkdir dir="${basedir}/target/classes"/>
          <taskdef name="groovyc" classname="org.codehaus.groovy.ant.Groovyc">
            <classpath refid="maven.plugin.classpath"/>
          </taskdef>
          <groovyc destdir="${project.build.outputDirectory}" srcdir="${basedir}/src/main" classpathref="maven.compile.classpath">
            <javac source="1.5" target="1.5" debug="on"/>
          </groovyc>
        </target>
      </configuration>
      <goals>
        <goal>run</goal>
      </goals>
    </execution>
  </executions>
</plugin>
YUM repositories
An RPM repository of dependencies already packaged and in, or heading towards, review state can be found here:
http://repos.fedorapeople.org/repos/rrati/hadoop/
Currently, only Fedora 18 x86_64 packages are available.