Spring for Apache Hadoop Annotation Configuration

License	The Apache Software License, Version 2.0
Categories	Data config Application Layer Libs Configuration
GroupId	org.springframework.data
ArtifactId	spring-data-hadoop-config
Last Version	2.5.0.RELEASE
Release Date	06-Jul-2017
Type	jar
Description	Spring for Apache Hadoop Annotation Configuration
Project URL	http://github.com/spring-projects/spring-hadoop
Project Organization	Spring by Pivotal
Source Code Management	http://github.com/spring-projects/spring-hadoop

Download spring-data-hadoop-config

Filename	Size
spring-data-hadoop-config-2.5.0.RELEASE.pom
spring-data-hadoop-config-2.5.0.RELEASE.jar	43 KB
spring-data-hadoop-config-2.5.0.RELEASE-sources.jar	38 KB
spring-data-hadoop-config-2.5.0.RELEASE-javadoc.jar	261 bytes

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/org.springframework.data/spring-data-hadoop-config/ -->
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-hadoop-config</artifactId>
    <version>2.5.0.RELEASE</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/org.springframework.data/spring-data-hadoop-config/
implementation 'org.springframework.data:spring-data-hadoop-config:2.5.0.RELEASE'

Gradle Kotlin

// https://jarcasting.com/artifacts/org.springframework.data/spring-data-hadoop-config/
implementation("org.springframework.data:spring-data-hadoop-config:2.5.0.RELEASE")

Apache Buildr

'org.springframework.data:spring-data-hadoop-config:jar:2.5.0.RELEASE'

Apache Ivy

<dependency org="org.springframework.data" name="spring-data-hadoop-config" rev="2.5.0.RELEASE">
  <artifact name="spring-data-hadoop-config" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='org.springframework.data', module='spring-data-hadoop-config', version='2.5.0.RELEASE')
)

Scala SBT

libraryDependencies += "org.springframework.data" % "spring-data-hadoop-config" % "2.5.0.RELEASE"

Leiningen

[org.springframework.data/spring-data-hadoop-config "2.5.0.RELEASE"]

Dependencies

compile (22)

Group / Artifact	Type	Version
org.apache.pig : pig Optional	jar	0.14.0
org.springframework : spring-beans	jar	4.3.9.RELEASE
org.apache.hadoop : hadoop-streaming	jar	2.7.3
org.springframework : spring-messaging	jar	4.3.9.RELEASE
org.apache.hadoop : hadoop-common	jar	2.7.3
org.apache.hadoop : hadoop-yarn-common	jar	2.7.3
org.apache.hadoop : hadoop-distcp	jar	2.7.3
org.apache.hadoop : hadoop-mapreduce-client-core	jar	2.7.3
org.apache.hadoop : hadoop-hdfs	jar	2.7.3
org.apache.hbase : hbase Optional	jar	0.98.5-hadoop2
org.springframework : spring-tx Optional	jar	4.3.9.RELEASE
org.springframework : spring-expression	jar	4.3.9.RELEASE
org.springframework : spring-context	jar	4.3.9.RELEASE
org.springframework : spring-core	jar	4.3.9.RELEASE
org.springframework : spring-aop	jar	4.3.9.RELEASE
org.apache.hbase : hbase-client Optional	jar	0.98.5-hadoop2
org.springframework.data : spring-data-hadoop-core	jar	2.5.0.RELEASE
org.springframework : spring-context-support	jar	4.3.9.RELEASE
org.springframework : spring-jdbc Optional	jar	4.3.9.RELEASE
org.apache.hive : hive-service Optional	jar	1.1.1
org.apache.hbase : hbase-common Optional	jar	0.98.5-hadoop2
org.apache.hadoop : hadoop-mapreduce-client-jobclient	jar	2.7.3

Project Modules

There are no modules declared in this project.

NOTICE: The Spring for Apache Hadoop project has reached End-Of-Life status on April 5th, 2019. The final Spring for Apache Hadoop 2.5.0 release was built using Apache Hadoop version 2.7.3 and no new releases are planned.

The Spring for Apache Hadoop project provides extensions to Spring, Spring Batch, and Spring Integration to build manageable and robust pipeline solutions around Hadoop.

Spring for Apache Hadoop extends Spring Batch by providing support for reading from and writing to HDFS, running various types of Hadoop jobs (Java MapReduce, Streaming, Hive, Spark, Pig) and using HBase. An important goal is to provide excellent support for non-Java based developers to be productive using Spring Hadoop and not have to write any Java code to use the core feature set.

Spring for Apache Hadoop also applies the familiar Spring programming model to Java MapReduce jobs by providing support for dependency injection of simple jobs as well as a POJO based MapReduce programming model that decouples your MapReduce classes from Hadoop specific details such as base classes and data types.
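
To make the idea of decoupling concrete, here is a minimal sketch of word-count logic written as a plain Java object. It extends no Hadoop base class and uses `String` and `Integer` rather than Hadoop's writable types. The class name `WordCountPojo` and its method names are hypothetical and chosen only for illustration; this is not the library's API, and how Spring for Apache Hadoop binds such a class to a job is not shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical POJO: plain Java types only, no Hadoop imports or base classes. */
final class WordCountPojo {

    /** Map step: counts each whitespace-separated token in one input line. */
    Map<String, Integer> map(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Reduce step: sums the partial counts collected for one word. */
    int reduce(String word, Iterable<Integer> partialCounts) {
        int total = 0;
        for (int count : partialCounts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        WordCountPojo pojo = new WordCountPojo();
        System.out.println(pojo.map("to be or not to be"));
        System.out.println(pojo.reduce("be", Arrays.asList(1, 1)));
    }
}
```

Because the class has no framework dependencies, it can be unit tested without a running cluster.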

Docs

You can find out more details from the user documentation or by browsing the javadocs. If you have ideas about how to improve or extend the scope, please feel free to contribute.

Artifacts

For build dependencies to use in your own projects see our Quick Start page.

Building

Spring for Apache Hadoop uses Gradle as its build system. To build the system simply run:

gradlew

from the project root folder. This will compile the sources, run the tests and create the artifacts. Note that by default the tests try to access a single-node Hadoop cluster on localhost.

Supported distros

By default, Spring for Apache Hadoop compiles against the Apache Hadoop 2.7.x stable release (hadoop27).

The following distros and versions are currently supported in this project's master branch:

Apache Hadoop 2.7.x (hadoop27) default
Apache Hadoop 2.6.x (hadoop26)
Pivotal HD 3.0 (phd30)
Cloudera CDH5 (cdh5)
Hortonworks HDP 2.5 (hdp25)
Hortonworks HDP 2.4 (hdp24)

(For older distro versions, look for older releases)

To compile against a specific distro version pass the -Pdistro=<label> project property, like so:

gradlew -Pdistro=hadoop26 build

Note that the chosen distro is displayed on the screen:

Using Apache Hadoop 2.6.x [2.6.0]

In this case, the specified Hadoop distribution (Apache Hadoop 2.6.x in the example above) is used to create the project binaries.
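
The label-to-distribution mapping listed in this section can be summarised as a small lookup table. The sketch below is illustrative only: the class `DistroLabels` and its `describe` method are hypothetical names, and rejecting an unknown label with an exception is an assumption, not a description of what the Gradle build does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical lookup of the distro labels listed above; not part of the build. */
final class DistroLabels {

    /** Label used when no -Pdistro property is given, per the list above. */
    static final String DEFAULT_LABEL = "hadoop27";

    private static final Map<String, String> SUPPORTED = new LinkedHashMap<>();
    static {
        SUPPORTED.put("hadoop27", "Apache Hadoop 2.7.x");
        SUPPORTED.put("hadoop26", "Apache Hadoop 2.6.x");
        SUPPORTED.put("phd30", "Pivotal HD 3.0");
        SUPPORTED.put("cdh5", "Cloudera CDH5");
        SUPPORTED.put("hdp25", "Hortonworks HDP 2.5");
        SUPPORTED.put("hdp24", "Hortonworks HDP 2.4");
    }

    private DistroLabels() {
    }

    /** Returns the distro name for a label, falling back to the default label. */
    static String describe(String label) {
        String effective = (label == null || label.isEmpty()) ? DEFAULT_LABEL : label;
        String name = SUPPORTED.get(effective);
        if (name == null) {
            // Assumption: unknown labels are treated as errors in this sketch.
            throw new IllegalArgumentException("Unsupported distro label: " + effective);
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(describe("hadoop26")); // Apache Hadoop 2.6.x
        System.out.println(describe(null));       // Apache Hadoop 2.7.x
    }
}
```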

CI Builds

The results for CI builds are available on Spring CI, under "Spring Data Hadoop: Project Summary".

Testing

For its tests, Spring for Apache Hadoop expects a pseudo-distributed/local Hadoop installation on localhost, with HDFS configured on port 8020. The local Hadoop setup allows the project classpath to be automatically used by the Hadoop job tracker. These settings can be customized in two ways:

Build properties

To override the defaults from the command line, use hd.fs for the file system, hd.rm for the YARN resourcemanager, hd.jh for the jobhistory and hd.hive for the Hive host/port information. To avoid confusion, specify the protocol for hd.fs, such as 'hdfs://' or 's3://'; if none is specified, hdfs:// will be used. For example, to run against HDFS at dumbo:8020 one would use:

gradlew -Phd.fs=hdfs://dumbo:8020 build
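
The default-protocol rule for hd.fs can be sketched in a few lines of Java. This is only an illustration of the rule as stated above, not the project's build logic: the class `FsUriDefaults` and the method `withDefaultScheme` are hypothetical, and detecting a protocol by looking for "://" is a simplifying assumption.

```java
import java.util.Objects;

/** Hypothetical helper illustrating the hd.fs default-protocol rule. */
final class FsUriDefaults {

    private FsUriDefaults() {
    }

    /** Returns the value unchanged if it names a protocol, else prefixes hdfs://. */
    static String withDefaultScheme(String value) {
        Objects.requireNonNull(value, "value");
        String trimmed = value.trim();
        // Assumption: a protocol is present whenever the value contains "://".
        return trimmed.contains("://") ? trimmed : "hdfs://" + trimmed;
    }

    public static void main(String[] args) {
        System.out.println(withDefaultScheme("dumbo:8020"));       // hdfs://dumbo:8020
        System.out.println(withDefaultScheme("s3://bucket/data")); // s3://bucket/data
    }
}
```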

Properties file

Edit the test.properties file under the src/test/resources folder. Further tweaks can be applied through the hadoop-ctx.xml file under src/test/resources/org/springframework/data/hadoop.

Enabling HBase/Hive/Pig/WebHdfs Tests

Note that by default only the vanilla Hadoop tests run. You can enable additional tests (such as Hive or Pig) by adding the tasks enableHBaseTests, enableHiveTests, enablePigTests or enableWebHdfsTests (or enableAllTests to enable them all). Use the test.properties file to customize the default locations of these services as well.

Disabling test execution

You can disable all tests by skipping the test task:

gradlew -x test

Contributing

Here are some ways for you to get involved in the community:

Get involved with the Spring community on StackOverflow using the spring-data-hadoop tag to post and answer questions.
Create JIRA tickets for bugs and new features and comment and vote on the ones that you are interested in.
Watch for upcoming articles on Spring by subscribing to the Spring Blog.

GitHub is for social coding: if you want to write code, we encourage contributions through pull requests from forks of this repository. If you want to contribute code this way, read the Spring Framework contributor guidelines.

Code of Conduct

This project adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to spring-code-of-conduct@pivotal.io.

Staying in touch

Follow the project team (Mark, Thomas or Janne) on Twitter.

In-depth articles can be found at the Spring blog, and releases are announced via our news feed.


Versions

Version	Release Date
2.5.0.RELEASE	06-Jul-2017
2.4.0.RELEASE	29-Jun-2016
2.3.0.RELEASE-hadoop26	22-Dec-2015
2.3.0.RELEASE	22-Dec-2015
