aaron-siegel/scalding-commons

scalding-commons

Common extensions to the Scalding MapReduce DSL.

Dfs-Datastores Integration

Scalding-Commons includes Scalding Sources for use with the dfs-datastores project.

This library provides VersionedKeyValSource, which lets Scalding write key-value pairs of any type to a binary sequence file. Serialization is handled by the Bijection trait from the bijection-core library.

VersionedKeyValSource allows multiple writes to the same path, because each write creates a new version. Optionally, given a Monoid on the value type, VersionedKeyValSource supports versioned incremental updates of a key-value database.

import com.twitter.scalding.source.VersionedKeyValSource
import VersionedKeyValSource._

// ## Sink Example

// The bijection library provides implicit Bijections
// from String -> Array[Byte] and Int -> Array[Byte].
val versionedSource = VersionedKeyValSource[String,Int]("path")

// creates a new version on each write
someScaldingFlow.write(versionedSource)

// because Scalding provides an implicit Monoid[Int],
// the writeIncremental method will add new integers into
// each value on every write:
someScaldingFlow.writeIncremental(versionedSource)

// ## Source Examples
//
// This Source produces the most recent set of kv pairs from the VersionedStore
// located at "path":
VersionedKeyValSource[String,Int]("path")

// This source produces version 12345:
VersionedKeyValSource[String,Int]("path", Some(12345))
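
The incremental behaviour described above can be pictured with plain Scala collections. This is only an illustrative sketch of the merge semantics, not the library's implementation: SimpleMonoid, IncrementalSketch, and mergeIncremental are hypothetical names introduced here, and an in-memory Map stands in for a stored version.

```scala
// Hypothetical stand-in for a Monoid: an associative `plus` with an identity `zero`.
trait SimpleMonoid[V] {
  def zero: V
  def plus(a: V, b: V): V
}

object IncrementalSketch {
  // Integer addition, mirroring the implicit Monoid[Int] mentioned in the example above.
  implicit val intAddition: SimpleMonoid[Int] = new SimpleMonoid[Int] {
    val zero: Int = 0
    def plus(a: Int, b: Int): Int = a + b
  }

  // Merge a batch of new pairs into the contents of the previous version.
  // Keys already present are combined with `plus`; new keys start from `zero`.
  def mergeIncremental[K, V](previous: Map[K, V], batch: Seq[(K, V)])(
      implicit m: SimpleMonoid[V]): Map[K, V] =
    batch.foldLeft(previous) { case (acc, (k, v)) =>
      acc.updated(k, m.plus(acc.getOrElse(k, m.zero), v))
    }
}
```

Under these assumptions, merging Seq("a" -> 10, "c" -> 3) into a previous version holding a -> 1 and b -> 2 yields a -> 11, b -> 2, c -> 3, while the previous version itself is left unchanged.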

Maven

Current version is 0.1.0. groupid="com.twitter", artifact="scalding-commons_2.9.2".
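
For sbt users, the coordinates above would translate roughly as follows. This is a sketch that assumes the artifact resolves from a repository your build already knows about; the README does not say whether an extra resolver is needed.

```scala
// build.sbt
// The artifact name already carries the Scala version suffix (_2.9.2),
// so a single % is used rather than %%.
libraryDependencies += "com.twitter" % "scalding-commons_2.9.2" % "0.1.0"
```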

Authors

License

Copyright 2012 Twitter, Inc.

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
