Hue now offers a Notebook application for interactively processing, visualizing and sharing data. A new Spark REST Job Server makes Spark Python and Scala shells available, as well as Streaming. These are ideal for quick big data crunching from anywhere, straight from a Web browser!
This talk details the architecture of the REST API and the Notebook UI, as well as their integration with the Hadoop ecosystem. It also describes the alternatives we tried and the challenges we faced. The capabilities will then be demoed live in Hue’s Notebook application through a real-life scenario combining Spark and Hadoop.
Attendees will learn how to ramp up on Spark and see how they could open it up to all their users and analyse even more data.
8. HADOOP WITH SPARK NOTEBOOK
• Married with full ecosystem
• File, Job browsers
• Create table wizards
• Any language (Hive, Spark...)
• Graphing
• Export/Import/Sharing
• Multi users
• Impersonation
10. LIVY SPARK SERVER
• REST Web server in Scala
• Interactive Spark Sessions and Batch Jobs
• Type Introspection for Visualization
• Running sessions in YARN local
• Backends: Scala, Python, R
• Open Source: https://github.com/cloudera/hue/tree/master/apps/spark/java
• Play with Curl http://gethue.com/how-to-use-the-
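To make the "Play with Curl" bullet concrete, here is a minimal Python sketch of the kind of REST calls involved. It is not taken from the talk. The base URL (`http://localhost:8998`), the `/sessions` and `/sessions/{id}/statements` paths, and the `kind` and `code` payload fields are assumptions based on the REST API that Livy later documented, and the version presented here may differ. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: a Livy server listening locally on port 8998.
LIVY_URL = "http://localhost:8998"


def build_request(method, path, payload=None, base_url=LIVY_URL):
    """Build (but do not send) a JSON request for the REST server."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def create_session_request(kind="pyspark"):
    """Request that asks the server to start an interactive session.

    Assumption: the backend is selected with a "kind" field.
    """
    return build_request("POST", "/sessions", {"kind": kind})


def run_statement_request(session_id, code):
    """Request that submits a snippet of code to an existing session."""
    return build_request(
        "POST", f"/sessions/{session_id}/statements", {"code": code}
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a running server, `send(create_session_request())` would start a session and `send(run_statement_request(session_id, "1 + 1"))` would submit code to it. Building and sending are kept separate so the requests can be inspected without a server.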
11. LIVY WEB SERVER ARCHITECTURE
[Diagram: Livy Server (Scalatra, Session Manager, Session); YARN Master (Spark Client); YARN Node (Spark Interpreter, Spark Context); two YARN Nodes (Spark Worker each)]
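The Session Manager box in this diagram can be pictured as a registry that maps session ids to session records. The sketch below is a hypothetical Python illustration of that idea only. It is not Livy's implementation (the server itself is written in Scala), and every class, field, and state name here is an assumption.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical record for one interactive session."""
    session_id: int
    kind: str                     # backend, e.g. "spark" or "pyspark"
    state: str = "starting"       # assumed state name
    statements: list = field(default_factory=list)


class SessionManager:
    """Hypothetical in-memory registry of sessions, keyed by id."""

    def __init__(self):
        self._ids = itertools.count()   # yields 0, 1, 2, ...
        self._sessions = {}

    def create(self, kind):
        """Register a new session and return its record."""
        session = Session(next(self._ids), kind)
        self._sessions[session.session_id] = session
        return session

    def get(self, session_id):
        """Return the session with this id, or None if unknown."""
        return self._sessions.get(session_id)

    def delete(self, session_id):
        """Forget a session; return True if it existed."""
        return self._sessions.pop(session_id, None) is not None
```

In the architecture on the slide, each session is backed by a Spark Interpreter and Spark Context running on a YARN node. The sketch only tracks the bookkeeping and launches nothing.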
12. LIVY WEB SERVER ARCHITECTURE
[Diagram, step 1: Livy Server (Scalatra, Session Manager, Session); YARN Master (Spark Client); YARN Node (Spark Interpreter, Spark Context); two YARN Nodes (Spark Worker each)]