JuliaCon India 2015

The first Indian conference on the Julia programming language

Crunching Big Data with Julia

Submitted by Tanmay K. Mohapatra (@tanmaykm) on Friday, 18 September 2015

Technical level: Beginner

Section: Crisp talk

Status: Confirmed & Scheduled

Objective

Introduce Julia's big data infrastructure, which can be used to read and write HDFS files and to run parallel Julia programs on a Yarn cluster.

Description

This talk will use Elly.jl to demonstrate a big data workflow in Julia. Elly is a Hadoop HDFS and Yarn client written in pure Julia, with no dependency on libhdfs. It provides:

  • A Julia ClusterManager interface, so that the familiar parallel constructs (addprocs, @parallel, @spawn, pmap, and so on) can be used on a Yarn cluster; see the sketch after this list.
  • Lower-level APIs for writing native Yarn applications.
  • A familiar Julia IO API for accessing HDFS files, sketched further below.
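To give a flavour of the ClusterManager route, here is a minimal sketch. It assumes Elly's YarnManager type and an addprocs call along the lines of the package README; the host, ports, and worker count are placeholder assumptions for a local single-node cluster, and the exact keyword names should be checked against Elly's documentation.

    using Elly

    # Connect to the Yarn resource manager and treat it as a Julia
    # ClusterManager (host and ports below are placeholder assumptions).
    yarncm = YarnManager(yarnhost="localhost", rmport=8032, schedport=8030,
                         launch_timeout=60)

    # Bring up Julia workers as Yarn containers, then use the usual
    # parallel constructs on them.
    addprocs(yarncm; np=4)

    # pmap distributes work across the Yarn-hosted workers exactly as it
    # would across local workers.
    results = pmap(x -> x^2, 1:100)

    # Release the containers when done.
    rmprocs(workers())

Once the workers are up, @parallel loops and @spawn behave just as they do with locally added workers.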

We shall use Elly, along with a few associated Julia packages, to process example datasets.
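As an illustration of the IO side, the following sketch reads a hypothetical dataset from HDFS. The HDFSClient and HDFSFile names follow the style of Elly's README, but the namenode address, the file path, and the exact set of supported IO methods are assumptions to be verified against the package.

    using Elly

    # Connect to the HDFS namenode (host and port are placeholder
    # assumptions for a local single-node setup).
    dfs = HDFSClient("localhost", 9000)

    # Hypothetical example dataset stored on HDFS.
    hfile = HDFSFile(dfs, "hdfs://localhost:9000/data/example.csv")

    # Read the file with the familiar Julia IO calls (Julia 0.4-era API).
    io = open(hfile, "r")
    contents = readall(io)
    close(io)

    println(length(split(contents, '\n')), " lines read from HDFS")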

Speaker bio

Tanmay K.M., Julia contributor. https://github.com/tanmaykm
