Friday, 13 June 2014

Reading an HDFS file in a functional way in Scala

This example reads an HDFS file in Scala in a functional style. It uses the Stream class so that lines are read lazily, only when they are needed.
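To illustrate what reading lazily means here, the following is a minimal sketch that does not touch HDFS at all. The object name LazyStreamSketch, the demo method and the counter are made up for illustration; the only library call is Stream.continually, which takes its argument by name and evaluates it again for each element that is requested.

```scala
object LazyStreamSketch {
  // Hypothetical demo: returns the elements taken and how many were actually computed.
  def demo(): (List[Int], Int) = {
    var evaluated = 0 // counts how many elements have been computed so far

    // The block is passed by name, so it runs once per element that gets forced.
    // The stream is conceptually infinite, but nothing beyond what is asked for is computed.
    val numbers = Stream.continually { evaluated += 1; evaluated }

    // Forcing three elements computes exactly three of them.
    val firstThree = numbers.take(3).toList
    (firstThree, evaluated)
  }

  def main(args: Array[String]): Unit = {
    val (firstThree, count) = demo()
    println(firstThree) // List(1, 2, 3)
    println(count)      // 3
  }
}
```

The same idea applies to the file: each call to readLine happens only when the next element of the stream is requested. On Scala 2.13 and later, Stream still works but is deprecated in favour of LazyList.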

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val path = new Path("/data/abc.csv")
val conf = new Configuration()
val fileSystem = FileSystem.get(conf)
val stream = fileSystem.open(path)

// It is important to make this a def: a val would keep a reference to the head of the Stream,
// and since a Stream memoizes the elements it has evaluated, every line read so far would stay in memory.
def readLines = Stream.cons(stream.readLine, Stream.continually(stream.readLine))

readLines.takeWhile(_ != null).foreach(line => println(line))
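
The snippet above never closes the input stream. As a rough sketch of how the same pattern could be wrapped so that the stream is always released, here is a version that can be run without a Hadoop cluster. It uses a java.io.BufferedReader over an in-memory string as a stand-in for fileSystem.open(path); the object name ReadLinesSketch, the helper lines and the sample text are assumptions made for illustration, not part of the original example.

```scala
import java.io.{BufferedReader, StringReader}

object ReadLinesSketch {
  // Hypothetical helper: exposes the lines of a reader as a lazy Stream.
  // It is a def, so no val keeps a reference to the head of the stream.
  def lines(reader: BufferedReader): Stream[String] =
    Stream.continually(reader.readLine()).takeWhile(_ != null)

  def main(args: Array[String]): Unit = {
    // Stand-in for the HDFS input stream; the CSV content is invented sample data.
    val reader = new BufferedReader(new StringReader("a,1\nb,2\nc,3"))
    try {
      lines(reader).foreach(println) // prints the three lines one by one
    } finally {
      reader.close() // release the underlying stream even if processing fails
    }
  }
}
```

Stream.continually(x) on its own produces the same elements as Stream.cons(x, Stream.continually(x)), which is why the helper uses the shorter form. The stream must be fully consumed inside the try block, because reading from it after close() would fail.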