Algebird 
Abstract Algebra 
for 
Analytics 
Sam BESSALAH 
@samklr 
Room 4 #Devoxx #algebird #scalding #monoid #hadoop @samklr
Abstract Algebra 
From Wikipedia
Algebraic Structure
"Set of values, coupled with one or
more finite operations, and a set of
laws those operations must obey."
e.g. Sum, Magma, Semigroup, Groups, Monoid,
Abelian Group, Semilattices, Rings, Monads,
etc.
Semigroup 
Semigroup Law : 
(x <> y) <> z = x <> (y <> z) 
(associativity) 
trait Semigroup[T] {
  def aggregate(x: T, y: T): T
}
Monoids 
Monoid Laws : 
(x <> y) <> z = x <> (y <> z) 
(associativity) 
identity <> x = x 
x <> identity = x 
(identity, also called zero)
trait Monoid[T] {
  def identity: T
  def aggregate(x: T, y: T): T
}
trait Monoid[T] extends Semigroup[T] {
  def identity: T
}
Groups 
Group Laws: 
(x <> y) <> z = x <> (y <> z) 
(associativity) 
identity <> x = x 
x <> identity = x 
(identity) 
x <> inverse x = identity 
inverse x <> x = identity 
(invertibility) 
trait Group[T] extends Monoid[T] {
  def inverse(v: T): T
} 
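The three traits can be put together in a minimal, self-contained sketch. It uses the slide's names (`aggregate`, `identity`, `inverse`) rather than Algebird's own; the `AlgebraSketch` wrapper, the `IntAddition` instance, and the `sumAll` helper are illustrative assumptions, not part of the talk.

```scala
object AlgebraSketch {
  trait Semigroup[T] {
    def aggregate(x: T, y: T): T // expected to be associative
  }

  trait Monoid[T] extends Semigroup[T] {
    def identity: T // neutral element for aggregate
  }

  trait Group[T] extends Monoid[T] {
    def inverse(v: T): T
  }

  // Hypothetical instance: integers under addition form a group.
  object IntAddition extends Group[Int] {
    def aggregate(x: Int, y: Int): Int = x + y
    val identity: Int = 0
    def inverse(v: Int): Int = -v
  }

  // Fold a collection with any monoid; an empty input yields the identity.
  def sumAll[T](xs: Seq[T])(m: Monoid[T]): T =
    xs.foldLeft(m.identity)(m.aggregate)
}
```

For example, `AlgebraSketch.sumAll(Seq(1, 2, 3, 4))(AlgebraSketch.IntAddition)` evaluates to 10, and the same call on an empty sequence evaluates to 0.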
Many More 
- Abelian groups (commutative groups)
- Rings
- Semilattices
- Ordered semigroups
- Fields ...
Many of those are in Algebird ...
Examples 
- (a min b) min c = a min (b min c), with Int
- a max (b max c) = (a max b) max c
- a or (b or c) = (a or b) or c 
- a and (b and c) = (a and b) and c 
- int addition 
- set union 
- harmonic sum 
- Integer mean 
- Priority queue 
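The mean entry deserves a note: averaging two averages is not associative in general, but carrying a (sum, count) pair and dividing only at the end is. A minimal sketch of that idea, assuming a hypothetical `Avg` case class (Algebird has its own averaged-value type, which is not shown here):

```scala
object MeanSketch {
  // Hypothetical carrier: keep sum and count so that combining stays associative.
  final case class Avg(sum: Long, count: Long) {
    def mean: Double = if (count == 0) 0.0 else sum.toDouble / count
  }

  val zero: Avg = Avg(0L, 0L)              // identity element
  def of(x: Int): Avg = Avg(x.toLong, 1L)  // lift a single observation
  def plus(a: Avg, b: Avg): Avg = Avg(a.sum + b.sum, a.count + b.count)

  // Lift every element, combine with plus, divide once at the end.
  def meanOf(xs: Seq[Int]): Double = xs.map(of).foldLeft(zero)(plus).mean
}
```

`MeanSketch.meanOf(Seq(1, 2, 3, 4))` gives 2.5, and the grouping of the `plus` calls does not change the result.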
Why do we need those algebraic 
structures ? 
We want to : 
- Build scalable analytics systems 
- Leverage distributed computing to perform aggregation 
on really large data sets. 
- A lot of operations in analytics are just sorting and 
counting at the end of the day 
Distributed Computing → Parallelism
Associativity enables parallelism 
Identity means we can ignore some data 
Commutativity helps us ignore order 
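Why associativity enables parallelism can be sketched in a few lines: if the operation is associative, each chunk of the input can be reduced on its own (as separate workers would) and the partial results merged afterwards, with the same answer as a single sequential pass. The `chunkedReduce` helper below is a hypothetical illustration running on one machine, not a distributed implementation.

```scala
object ParallelSumSketch {
  // Reduce each chunk independently, then combine the partial results.
  // Correct for any associative op with identity `zero`; commutativity is not
  // required here because chunks are merged in their original order.
  def chunkedReduce[T](xs: Seq[T], chunkSize: Int, zero: T)(op: (T, T) => T): T =
    xs.grouped(chunkSize)
      .map(chunk => chunk.foldLeft(zero)(op)) // per-chunk partial result
      .foldLeft(zero)(op)                      // merge of the partials
}
```

String concatenation is a useful check: it is associative but not commutative, and `chunkedReduce(Seq("a", "b", "c", "d", "e"), 2, "")(_ + _)` still returns "abcde".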
Typical Map Reduce ... 
Finding Top-K Elements in Scalding ...
class TopKJob(args: Args) extends Job(args) {
  Tsv(args("input"), visitScheme)
    .filter(...)
    .leftJoinWithTiny(...)
    .filter(...)
    .groupBy('fieldOne) { _.sortWithTake(visitScheme -> top)(biggerSale) }
    .write(Tsv(...))
}
.sortWithTake( ... )
Looking into .sortWithTake in Scalding, there's one
nice thing:
class PriorityQueueMonoid[T](max: Int)
    (implicit order: Ordering[T])
  extends Monoid[PriorityQueue[T]]
Let's take a look:
PQ1: 55, 45, 21, 3
PQ2: 100, 80, 40, 3
top-4 (PQ1 U PQ2): 100, 80, 55, 45
This makes Scalding go fast, by doing sorting, filtering and extracting in one single "map" step.
Priority Queue:
Can be empty
Two Priority Queues can be "added" in any order
Associative + Commutative 
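The merge on this slide can be reproduced with a small sketch. It models a bounded priority queue as a descending list of at most k integers; this is a simplification for illustration, not Algebird's `PriorityQueueMonoid`, and the names `TopKSketch`, `TopK`, `of`, and `plus` are assumptions.

```scala
object TopKSketch {
  // A bounded "priority queue" modelled as a descending-sorted list of at most k items.
  final case class TopK(k: Int) {
    val zero: List[Int] = Nil             // the empty queue is the identity
    def of(x: Int): List[Int] = List(x)   // a queue holding one element
    def plus(a: List[Int], b: List[Int]): List[Int] =
      (a ++ b).sorted(Ordering[Int].reverse).take(k) // merge, then keep the k largest
  }
}
```

`TopKSketch.TopK(4).plus(List(55, 45, 21, 3), List(100, 80, 40, 3))` returns `List(100, 80, 55, 45)`, matching the slide, and swapping the two arguments gives the same result.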
Stream Mining Challenges 
- Update predictions after each observation 
- Single pass : can't read old data or replay
the stream
- Full size of the stream often unknown 
- Limited time for computation per 
observation 
- O(1) memory size 
http://radar.oreilly.com/2013/10/stream-mining-essentials.html 
Tradeoff : Space and speed over 
accuracy. 
Use sketches.
Sketches 
Probabilistic data structures that store a summary
(mostly hashed) of a data set that would be costly to
store in its entirety, thus providing, most of the
time, sublinear algorithmic properties.
E.g. Bloom Filters, Counter Sketch, KMV counters,
Count Min Sketch, HyperLogLog, Min Hashes
Bloom filters 
Approximate data structure for set membership 
Behaves like an approximate set 
BloomFilter.contains(x) => NO | Maybe 
P(False Positive) > 0 
P(False Negative) = 0 
Internally : 
Bit Array of fixed size 
add(x) : for each hash function i, set b[h(x,i)] = 1
(Boolean OR => associative)
contains(x) : TRUE if b[h(x,i)] == 1 for all i.
(Boolean AND => associative)
Both are associative => BF can be designed as a Monoid 
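The internals described above can be sketched without Algebird. This is a hypothetical minimal Bloom filter over strings, not the library's implementation: the names `BloomSketch`, `Bloom`, `merge`, and `of` are assumptions, and so is the choice of MurmurHash3 seeded with the index i to stand in for the i-th hash function.

```scala
import scala.collection.immutable.BitSet
import scala.util.hashing.MurmurHash3

object BloomSketch {
  // Hypothetical minimal Bloom filter: `bits` holds the positions set to 1.
  final case class Bloom(numHashes: Int, width: Int, bits: BitSet = BitSet.empty) {
    // Position chosen by the i-th hash function (assumption: i is used as the seed).
    private def h(x: String, i: Int): Int =
      java.lang.Math.floorMod(MurmurHash3.stringHash(x, i), width)

    // add(x): set one bit per hash function.
    def add(x: String): Bloom =
      copy(bits = (0 until numHashes).foldLeft(bits)((b, i) => b + h(x, i)))

    // true means "maybe present"; false means "definitely absent".
    def contains(x: String): Boolean =
      (0 until numHashes).forall(i => bits(h(x, i)))

    // Monoid "plus": bitwise OR of the two bit arrays.
    def merge(that: Bloom): Bloom = copy(bits = bits | that.bits)
  }

  // Build a filter from a collection by folding add over an empty filter.
  def of(xs: Iterable[String], numHashes: Int, width: Int): Bloom =
    xs.foldLeft(Bloom(numHashes, width))(_.add(_))
}
```

Because `merge` is a bitwise OR, building one filter per partition and merging them gives exactly the same bits as building a single filter over all the data.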
Bloom filters 
import com.twitter.algebird._ 
import com.twitter.algebird.Operators._ 
// Generate a list
val A = (1 to 300).toList
// Generate a Bloom filter
val NUM_HASHES = 6
val WIDTH = 6000 // bits
val SEED = 1
implicit val bfm = new BloomFilterMonoid(NUM_HASHES, WIDTH, SEED)
// Approximate set with a Bloom filter
val A_bf = A.map { i => bfm.create(i.toString) }.reduce(_ + _)
val approxBool = A_bf.contains("150") // ---> ApproximateBoolean(true, 0.9995...)
Count Min Sketch 
Gives an approximation of the number of occurrences of an
element in a data stream.
Adding an element is a numerical addition 
Querying uses a MIN function. 
Both are associative. 
useful for detecting heavy hitters, topK, LSH 
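A minimal sketch of the structure just described, written without Algebird so it stays self-contained. The names `CmsSketch`, `CMS`, `estimate`, `merge`, and `of` are hypothetical, and seeding MurmurHash3 with the row index is an assumed stand-in for a family of independent hash functions.

```scala
import scala.util.hashing.MurmurHash3

object CmsSketch {
  // Hypothetical minimal Count-Min Sketch: `depth` rows (one per hash function) of `width` counters.
  final case class CMS(depth: Int, width: Int, table: Vector[Vector[Long]]) {
    private def col(x: String, row: Int): Int =
      java.lang.Math.floorMod(MurmurHash3.stringHash(x, row), width)

    // Adding an element increments one counter in every row.
    def add(x: String): CMS =
      copy(table = table.zipWithIndex.map { case (r, i) =>
        val c = col(x, i)
        r.updated(c, r(c) + 1L)
      })

    // Querying takes the minimum over the rows: it may overestimate, never underestimate.
    def estimate(x: String): Long =
      table.zipWithIndex.map { case (r, i) => r(col(x, i)) }.min

    // Monoid "plus": element-wise addition of two tables of the same shape.
    def merge(that: CMS): CMS = {
      require(depth == that.depth && width == that.width, "shapes must match")
      copy(table = table.zip(that.table).map { case (a, b) =>
        a.zip(b).map { case (x, y) => x + y }
      })
    }
  }

  def empty(depth: Int, width: Int): CMS =
    CMS(depth, width, Vector.fill(depth, width)(0L))

  // Build a sketch from a collection by folding add over an empty sketch.
  def of(xs: Iterable[String], depth: Int, width: Int): CMS =
    xs.foldLeft(empty(depth, width))(_.add(_))
}
```

Sketches built on separate partitions can be merged with `merge`, and the merged table is identical to the one built over the concatenated input.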
We have in Algebird : 
HyperLogLog 
Popular sketch for cardinality estimation.
Gives, within a probabilistic error bound,
the number of distinct values in a data set.
HLL.size = Approx[Number]
Intuition
Long runs of trailing 0s in a random bit
string are rare
But the more bit strings you look at, the more
likely you are to find a long one
The longest run of trailing 0-bits seen can be
an estimator of the number of unique bit strings
observed.
Adding an element uses Max and Sum functions.
Both are associative and form Monoids. (Max is really an
ordered semigroup in Algebird.)
Querying the cardinality uses a harmonic mean,
which is a Monoid. 
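The register update and merge can be sketched as follows. This is a simplified, hypothetical HyperLogLog (32-bit MurmurHash3, trailing-zero ranks, raw estimate only), not Algebird's implementation; the names `HllSketch`, `HLL`, `merge`, `estimate`, and `of` are assumptions, and the small-range and large-range corrections of the full algorithm are left out.

```scala
import scala.util.hashing.MurmurHash3

object HllSketch {
  // Hypothetical simplified HyperLogLog with 2^p registers.
  final case class HLL(p: Int, registers: Vector[Int]) {
    private val m = 1 << p

    def add(x: String): HLL = {
      val hash = MurmurHash3.stringHash(x)
      val idx  = hash & (m - 1) // low p bits pick a register
      val rest = hash >>> p     // remaining 32 - p bits
      // Rank = trailing zeros + 1, capped because `rest` only carries 32 - p bits.
      val rank = math.min(Integer.numberOfTrailingZeros(rest), 32 - p) + 1
      copy(registers = registers.updated(idx, math.max(registers(idx), rank)))
    }

    // Monoid "plus": element-wise max of the registers.
    def merge(that: HLL): HLL = {
      require(p == that.p, "precision must match")
      copy(registers = registers.zip(that.registers).map { case (a, b) => math.max(a, b) })
    }

    // Raw estimate based on the harmonic mean of 2^register; corrections are omitted.
    def estimate: Double = {
      val alpha = 0.7213 / (1.0 + 1.079 / m) // bias constant, valid for m >= 128
      alpha * m * m / registers.map(r => math.pow(2.0, -r)).sum
    }
  }

  def empty(p: Int): HLL = HLL(p, Vector.fill(1 << p)(0))

  // Build a sketch from a collection by folding add over an empty sketch.
  def of(xs: Iterable[String], p: Int): HLL =
    xs.foldLeft(empty(p))(_.add(_))
}
```

Since max is associative, commutative, and idempotent, merging the sketches of two partitions yields exactly the registers of a sketch built over their union, and merging a sketch with itself changes nothing.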
In Algebird : 
Many More juicy sketches ... 
- MinHashes to compute Jaccard similarity 
- QTree for quantile estimation. Neat for anomaly
detection.
- SpaceSaverMonoid, awesome for finding the approximate
most frequent and top-K elements.
- TopKMonoid 
- SGD, PriorityQueues, Histograms, etc. 
Summingbird : Lambda in a box
Heard of Lambda Architecture ? 
Summingbird
Same code for both batch and real-time processing.
But works only on Monoids. 
Uses Storehaus, as a mergeable store layer. 
http://github.com/twitter/algebird 
These slides : 
http://bit.ly/1szncAZ 
http://slidesha.re/1zhhXKU
Links 
- Algebra for analytics by Oscar Boykin (Creator of Algebird)
http://speakerdeck.com/johnynek/algebra-for-analytics
- Take a look into HLearn https://github.com/mikeizbicki/HLearn
- Great intro into Algebird by Michael Noll
http://www.michael-noll.com/blog/2013/12/02/twitter-algebird-monoid-monad-for-large-scala-data-analytics/
- Aggregate Knowledge http://research.neustar.biz/2012/10/25/sketch-of-the-day-hyperloglog-cornerstone-of-a-big-data-infrastructure
- Probabilistic data structures for web analytics.
http://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/
- http://debasishg.blogspot.fr/2014/01/count-min-sketch-data-structure-for.html
- http://infolab.stanford.edu/~ullman/mmds/ch3.pdf
#Devoxx #algebird #scalding #monoid #hadoop #spark 
@samklr

More Related Content

What's hot

Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Spark Summit
ย 
Developing distributed applications with Akka and Akka Cluster
Developing distributed applications with Akka and Akka ClusterDeveloping distributed applications with Akka and Akka Cluster
Developing distributed applications with Akka and Akka ClusterKonstantin Tsykulenko
ย 
Asynchronous stream processing with Akka Streams
Asynchronous stream processing with Akka StreamsAsynchronous stream processing with Akka Streams
Asynchronous stream processing with Akka StreamsJohan Andrรฉn
ย 
mesos-devoxx14
mesos-devoxx14mesos-devoxx14
mesos-devoxx14Samir Bessalah
ย 
Reactive Streams / Akka Streams - GeeCON Prague 2014
Reactive Streams / Akka Streams - GeeCON Prague 2014Reactive Streams / Akka Streams - GeeCON Prague 2014
Reactive Streams / Akka Streams - GeeCON Prague 2014Konrad Malawski
ย 
Reactive programming on Android
Reactive programming on AndroidReactive programming on Android
Reactive programming on AndroidTomรกลก Kypta
ย 
Writing Hadoop Jobs in Scala using Scalding
Writing Hadoop Jobs in Scala using ScaldingWriting Hadoop Jobs in Scala using Scalding
Writing Hadoop Jobs in Scala using ScaldingToni Cebriรกn
ย 
Scalding: Twitter's Scala DSL for Hadoop/Cascading
Scalding: Twitter's Scala DSL for Hadoop/CascadingScalding: Twitter's Scala DSL for Hadoop/Cascading
Scalding: Twitter's Scala DSL for Hadoop/Cascadingjohnynek
ย 
Introduction to Scalding and Monoids
Introduction to Scalding and MonoidsIntroduction to Scalding and Monoids
Introduction to Scalding and MonoidsHugo Gรคvert
ย 
Akka Actor presentation
Akka Actor presentationAkka Actor presentation
Akka Actor presentationGene Chang
ย 
whats new in java 8
whats new in java 8 whats new in java 8
whats new in java 8 Dori Waldman
ย 
Effective testing for spark programs Strata NY 2015
Effective testing for spark programs   Strata NY 2015Effective testing for spark programs   Strata NY 2015
Effective testing for spark programs Strata NY 2015Holden Karau
ย 
Introduction to Structured Streaming | Big Data Hadoop Spark Tutorial | Cloud...
Introduction to Structured Streaming | Big Data Hadoop Spark Tutorial | Cloud...Introduction to Structured Streaming | Big Data Hadoop Spark Tutorial | Cloud...
Introduction to Structured Streaming | Big Data Hadoop Spark Tutorial | Cloud...CloudxLab
ย 
Fresh from the Oven (04.2015): Experimental Akka Typed and Akka Streams
Fresh from the Oven (04.2015): Experimental Akka Typed and Akka StreamsFresh from the Oven (04.2015): Experimental Akka Typed and Akka Streams
Fresh from the Oven (04.2015): Experimental Akka Typed and Akka StreamsKonrad Malawski
ย 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Databricks
ย 
Async - react, don't wait - PingConf
Async - react, don't wait - PingConfAsync - react, don't wait - PingConf
Async - react, don't wait - PingConfJohan Andrรฉn
ย 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraEscape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraPiotr Kolaczkowski
ย 
Norikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In RubyNorikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In RubySATOSHI TAGOMORI
ย 
HadoopCon 2016 - ็”จ Jupyter Notebook Hold ไฝไธ€ๅ€‹ไธŠ็ทš Spark Machine Learning ๅฐˆๆกˆๅฏฆๆˆฐ
HadoopCon 2016  - ็”จ Jupyter Notebook Hold ไฝไธ€ๅ€‹ไธŠ็ทš Spark  Machine Learning ๅฐˆๆกˆๅฏฆๆˆฐHadoopCon 2016  - ็”จ Jupyter Notebook Hold ไฝไธ€ๅ€‹ไธŠ็ทš Spark  Machine Learning ๅฐˆๆกˆๅฏฆๆˆฐ
HadoopCon 2016 - ็”จ Jupyter Notebook Hold ไฝไธ€ๅ€‹ไธŠ็ทš Spark Machine Learning ๅฐˆๆกˆๅฏฆๆˆฐWayne Chen
ย 

What's hot (20)

Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
ย 
Developing distributed applications with Akka and Akka Cluster
Developing distributed applications with Akka and Akka ClusterDeveloping distributed applications with Akka and Akka Cluster
Developing distributed applications with Akka and Akka Cluster
ย 
Asynchronous stream processing with Akka Streams
Asynchronous stream processing with Akka StreamsAsynchronous stream processing with Akka Streams
Asynchronous stream processing with Akka Streams
ย 
mesos-devoxx14
mesos-devoxx14mesos-devoxx14
mesos-devoxx14
ย 
Reactive Streams / Akka Streams - GeeCON Prague 2014
Reactive Streams / Akka Streams - GeeCON Prague 2014Reactive Streams / Akka Streams - GeeCON Prague 2014
Reactive Streams / Akka Streams - GeeCON Prague 2014
ย 
Reactive programming on Android
Reactive programming on AndroidReactive programming on Android
Reactive programming on Android
ย 
XML-Motor
XML-MotorXML-Motor
XML-Motor
ย 
Writing Hadoop Jobs in Scala using Scalding
Writing Hadoop Jobs in Scala using ScaldingWriting Hadoop Jobs in Scala using Scalding
Writing Hadoop Jobs in Scala using Scalding
ย 
Scalding: Twitter's Scala DSL for Hadoop/Cascading
Scalding: Twitter's Scala DSL for Hadoop/CascadingScalding: Twitter's Scala DSL for Hadoop/Cascading
Scalding: Twitter's Scala DSL for Hadoop/Cascading
ย 
Introduction to Scalding and Monoids
Introduction to Scalding and MonoidsIntroduction to Scalding and Monoids
Introduction to Scalding and Monoids
ย 
Akka Actor presentation
Akka Actor presentationAkka Actor presentation
Akka Actor presentation
ย 
whats new in java 8
whats new in java 8 whats new in java 8
whats new in java 8
ย 
Effective testing for spark programs Strata NY 2015
Effective testing for spark programs   Strata NY 2015Effective testing for spark programs   Strata NY 2015
Effective testing for spark programs Strata NY 2015
ย 
Introduction to Structured Streaming | Big Data Hadoop Spark Tutorial | Cloud...
Introduction to Structured Streaming | Big Data Hadoop Spark Tutorial | Cloud...Introduction to Structured Streaming | Big Data Hadoop Spark Tutorial | Cloud...
Introduction to Structured Streaming | Big Data Hadoop Spark Tutorial | Cloud...
ย 
Fresh from the Oven (04.2015): Experimental Akka Typed and Akka Streams
Fresh from the Oven (04.2015): Experimental Akka Typed and Akka StreamsFresh from the Oven (04.2015): Experimental Akka Typed and Akka Streams
Fresh from the Oven (04.2015): Experimental Akka Typed and Akka Streams
ย 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
ย 
Async - react, don't wait - PingConf
Async - react, don't wait - PingConfAsync - react, don't wait - PingConf
Async - react, don't wait - PingConf
ย 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraEscape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
ย 
Norikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In RubyNorikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In Ruby
ย 
HadoopCon 2016 - ็”จ Jupyter Notebook Hold ไฝไธ€ๅ€‹ไธŠ็ทš Spark Machine Learning ๅฐˆๆกˆๅฏฆๆˆฐ
HadoopCon 2016  - ็”จ Jupyter Notebook Hold ไฝไธ€ๅ€‹ไธŠ็ทš Spark  Machine Learning ๅฐˆๆกˆๅฏฆๆˆฐHadoopCon 2016  - ็”จ Jupyter Notebook Hold ไฝไธ€ๅ€‹ไธŠ็ทš Spark  Machine Learning ๅฐˆๆกˆๅฏฆๆˆฐ
HadoopCon 2016 - ็”จ Jupyter Notebook Hold ไฝไธ€ๅ€‹ไธŠ็ทš Spark Machine Learning ๅฐˆๆกˆๅฏฆๆˆฐ
ย 

Viewers also liked

Machine Learning In Production
Machine Learning In ProductionMachine Learning In Production
Machine Learning In ProductionSamir Bessalah
ย 
A Study to Design and Implement a Manual for the Learning Process of Technica...
A Study to Design and Implement a Manual for the Learning Process of Technica...A Study to Design and Implement a Manual for the Learning Process of Technica...
A Study to Design and Implement a Manual for the Learning Process of Technica...UNIVERSIDAD MAGISTER (Sitio Oficial)
ย 
An application of abstract algebra to music theory
An application of abstract algebra to music theoryAn application of abstract algebra to music theory
An application of abstract algebra to music theorymorkir
ย 
Deep learning for mere mortals - Devoxx Belgium 2015
Deep learning for mere mortals - Devoxx Belgium 2015Deep learning for mere mortals - Devoxx Belgium 2015
Deep learning for mere mortals - Devoxx Belgium 2015Samir Bessalah
ย 
Information Security Seminar #2
Information Security Seminar #2Information Security Seminar #2
Information Security Seminar #2Alexander Kolybelnikov
ย 
Definition ofvectorspace
Definition ofvectorspaceDefinition ofvectorspace
Definition ofvectorspaceTanuj Parikh
ย 
Snapdragon Processor
Snapdragon ProcessorSnapdragon Processor
Snapdragon ProcessorKrishna Gehlot
ย 
Production and Beyond: Deploying and Managing Machine Learning Models
Production and Beyond: Deploying and Managing Machine Learning ModelsProduction and Beyond: Deploying and Managing Machine Learning Models
Production and Beyond: Deploying and Managing Machine Learning ModelsTuri, Inc.
ย 
Snapdragon processors
Snapdragon processorsSnapdragon processors
Snapdragon processorsDeepak Mathew
ย 
Kill the mutants - A better way to test your tests
Kill the mutants - A better way to test your testsKill the mutants - A better way to test your tests
Kill the mutants - A better way to test your testsRoy van Rijn
ย 
Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Marina Santini
ย 
Machine Learning on Big Data
Machine Learning on Big DataMachine Learning on Big Data
Machine Learning on Big DataMax Lin
ย 
Myths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsMyths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsDavid Pittman
ย 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningRahul Jain
ย 
Tutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsTutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsNhatHai Phan
ย 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The PeopleDaniel Tunkelang
ย 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
ย 
Build Features, Not Apps
Build Features, Not AppsBuild Features, Not Apps
Build Features, Not AppsNatasha Murashev
ย 

Viewers also liked (19)

Machine Learning In Production
Machine Learning In ProductionMachine Learning In Production
Machine Learning In Production
ย 
algebraic-geometry
algebraic-geometryalgebraic-geometry
algebraic-geometry
ย 
A Study to Design and Implement a Manual for the Learning Process of Technica...
A Study to Design and Implement a Manual for the Learning Process of Technica...A Study to Design and Implement a Manual for the Learning Process of Technica...
A Study to Design and Implement a Manual for the Learning Process of Technica...
ย 
An application of abstract algebra to music theory
An application of abstract algebra to music theoryAn application of abstract algebra to music theory
An application of abstract algebra to music theory
ย 
Deep learning for mere mortals - Devoxx Belgium 2015
Deep learning for mere mortals - Devoxx Belgium 2015Deep learning for mere mortals - Devoxx Belgium 2015
Deep learning for mere mortals - Devoxx Belgium 2015
ย 
Information Security Seminar #2
Information Security Seminar #2Information Security Seminar #2
Information Security Seminar #2
ย 
Definition ofvectorspace
Definition ofvectorspaceDefinition ofvectorspace
Definition ofvectorspace
ย 
Snapdragon Processor
Snapdragon ProcessorSnapdragon Processor
Snapdragon Processor
ย 
Production and Beyond: Deploying and Managing Machine Learning Models
Production and Beyond: Deploying and Managing Machine Learning ModelsProduction and Beyond: Deploying and Managing Machine Learning Models
Production and Beyond: Deploying and Managing Machine Learning Models
ย 
Snapdragon processors
Snapdragon processorsSnapdragon processors
Snapdragon processors
ย 
Kill the mutants - A better way to test your tests
Kill the mutants - A better way to test your testsKill the mutants - A better way to test your tests
Kill the mutants - A better way to test your tests
ย 
Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?
ย 
Machine Learning on Big Data
Machine Learning on Big DataMachine Learning on Big Data
Machine Learning on Big Data
ย 
Myths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsMyths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data Scientists
ย 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
ย 
Tutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsTutorial on Deep learning and Applications
Tutorial on Deep learning and Applications
ย 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The People
ย 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine Learning
ย 
Build Features, Not Apps
Build Features, Not AppsBuild Features, Not Apps
Build Features, Not Apps
ย 

Similar to Algebird : Abstract Algebra for big data analytics. Devoxx 2014

Everything is Permitted: Extending Built-ins
Everything is Permitted: Extending Built-insEverything is Permitted: Extending Built-ins
Everything is Permitted: Extending Built-insAndrew Dupont
ย 
Concurrent programming with Celluloid (MWRC 2012)
Concurrent programming with Celluloid (MWRC 2012)Concurrent programming with Celluloid (MWRC 2012)
Concurrent programming with Celluloid (MWRC 2012)tarcieri
ย 
The things we don't see โ€“ stories of Software, Scala and Akka
The things we don't see โ€“ stories of Software, Scala and AkkaThe things we don't see โ€“ stories of Software, Scala and Akka
The things we don't see โ€“ stories of Software, Scala and AkkaKonrad Malawski
ย 
Ruby โ€” An introduction
Ruby โ€” An introductionRuby โ€” An introduction
Ruby โ€” An introductionGonรงalo Silva
ย 
Taxonomy of Scala
Taxonomy of ScalaTaxonomy of Scala
Taxonomy of Scalashinolajla
ย 
Blocks by Lachs Cox
Blocks by Lachs CoxBlocks by Lachs Cox
Blocks by Lachs Coxlachie
ย 
Ruby 2: some new things
Ruby 2: some new thingsRuby 2: some new things
Ruby 2: some new thingsDavid Black
ย 
Pharo, an innovative and open-source Smalltalk
Pharo, an innovative and open-source SmalltalkPharo, an innovative and open-source Smalltalk
Pharo, an innovative and open-source SmalltalkSerge Stinckwich
ย 
Ruby ๅ…ฅ้–€ ็ฌฌไธ€ๆฌกๅฐฑไธŠๆ‰‹
Ruby ๅ…ฅ้–€ ็ฌฌไธ€ๆฌกๅฐฑไธŠๆ‰‹Ruby ๅ…ฅ้–€ ็ฌฌไธ€ๆฌกๅฐฑไธŠๆ‰‹
Ruby ๅ…ฅ้–€ ็ฌฌไธ€ๆฌกๅฐฑไธŠๆ‰‹Wen-Tien Chang
ย 
Spock: Test Well and Prosper
Spock: Test Well and ProsperSpock: Test Well and Prosper
Spock: Test Well and ProsperKen Kousen
ย 
Solr Masterclass Bangkok, June 2014
Solr Masterclass Bangkok, June 2014Solr Masterclass Bangkok, June 2014
Solr Masterclass Bangkok, June 2014Alexandre Rafalovitch
ย 
Ruby is an Acceptable Lisp
Ruby is an Acceptable LispRuby is an Acceptable Lisp
Ruby is an Acceptable LispAstrails
ย 
Ruby Topic Maps Tutorial (2007-10-10)
Ruby Topic Maps Tutorial (2007-10-10)Ruby Topic Maps Tutorial (2007-10-10)
Ruby Topic Maps Tutorial (2007-10-10)Benjamin Bock
ย 
Introductionto fp with groovy
Introductionto fp with groovyIntroductionto fp with groovy
Introductionto fp with groovyIsuru Samaraweera
ย 
Code is not text! How graph technologies can help us to understand our code b...
Code is not text! How graph technologies can help us to understand our code b...Code is not text! How graph technologies can help us to understand our code b...
Code is not text! How graph technologies can help us to understand our code b...Andreas Dewes
ย 
BASE Meetup: "Analysing Scala Puzzlers: Essential and Accidental Complexity i...
BASE Meetup: "Analysing Scala Puzzlers: Essential and Accidental Complexity i...BASE Meetup: "Analysing Scala Puzzlers: Essential and Accidental Complexity i...
BASE Meetup: "Analysing Scala Puzzlers: Essential and Accidental Complexity i...Andrew Phillips
ย 
Scala Up North: "Analysing Scala Puzzlers: Essential and Accidental Complexit...
Scala Up North: "Analysing Scala Puzzlers: Essential and Accidental Complexit...Scala Up North: "Analysing Scala Puzzlers: Essential and Accidental Complexit...
Scala Up North: "Analysing Scala Puzzlers: Essential and Accidental Complexit...Andrew Phillips
ย 
Code for Startup MVP (Ruby on Rails) Session 2
Code for Startup MVP (Ruby on Rails) Session 2Code for Startup MVP (Ruby on Rails) Session 2
Code for Startup MVP (Ruby on Rails) Session 2Henry S
ย 

Similar to Algebird : Abstract Algebra for big data analytics. Devoxx 2014 (20)

Everything is Permitted: Extending Built-ins
Everything is Permitted: Extending Built-insEverything is Permitted: Extending Built-ins
Everything is Permitted: Extending Built-ins
ย 
Concurrent programming with Celluloid (MWRC 2012)
Concurrent programming with Celluloid (MWRC 2012)Concurrent programming with Celluloid (MWRC 2012)
Concurrent programming with Celluloid (MWRC 2012)
ย 
The things we don't see โ€“ stories of Software, Scala and Akka
The things we don't see โ€“ stories of Software, Scala and AkkaThe things we don't see โ€“ stories of Software, Scala and Akka
The things we don't see โ€“ stories of Software, Scala and Akka
ย 
Ruby โ€” An introduction
Algebird : Abstract Algebra for big data analytics. Devoxx 2014

  • 1. Algebird Abstract Algebra for Analytics Sam BESSALAH @samklr Room 4 #Devoxx #algebird #scalding #monoid #hadoop @samklr
  • 5. Abstract Algebra
  • 6. From Wikipedia
  • 7. Algebraic Structure: “Set of values, coupled with one or more finite operations, and a set of laws those operations must obey.”
  • 8. Algebraic Structure: “Set of values, coupled with one or more finite operations, and a set of laws those operations must obey.” E.g. Sum, Magma, Semigroup, Groups, Monoid, Abelian Group, Semilattices, Rings, Monads, etc.
  • 9. Semigroup. Semigroup Law: (x <> y) <> z = x <> (y <> z) (associativity)
  • 10. Semigroup. Semigroup Law: (x <> y) <> z = x <> (y <> z) (associativity). trait Semigroup[T] { def aggregate(x : T, y : T) : T }
  • 11. Monoids. Monoid Laws: (x <> y) <> z = x <> (y <> z) (associativity); identity <> x = x and x <> identity = x (identity)
  • 12. Monoids. Monoid Laws: (x <> y) <> z = x <> (y <> z) (associativity); identity <> x = x and x <> identity = x (identity / zero). trait Monoid[T] { def identity : T; def aggregate(x : T, y : T) : T }
  • 13. Monoids. Monoid Laws: (x <> y) <> z = x <> (y <> z) (associativity); identity <> x = x and x <> identity = x (identity). trait Monoid[T] extends Semigroup[T] { def identity : T }
  • 14. Groups. Group Laws: (x <> y) <> z = x <> (y <> z) (associativity); identity <> x = x and x <> identity = x (identity); x <> inverse x = identity and inverse x <> x = identity (invertibility)
  • 15. Groups. Group Laws: (x <> y) <> z = x <> (y <> z); identity <> x = x and x <> identity = x; x <> inverse x = identity and inverse x <> x = identity. trait Group[T] extends Monoid[T] { def inverse(v : T) : T }
  • 16. Many More: Abelian groups (commutative groups); Rings; Semilattices; Ordered Semigroups; Fields; ... Many of those are in Algebird.
  • 17. Examples: (a min b) min c = a min (b min c) with Int; a max (b max c) = (a max b) max c; a or (b or c) = (a or b) or c; a and (b and c) = (a and b) and c; int addition; set union; harmonic sum; integer mean; priority queue
  • 19. Why do we need those algebraic structures?
  • 20. We want to: build scalable analytics systems; leverage distributed computing to perform aggregation on really large data sets. A lot of operations in analytics are just sorting and counting at the end of the day.
  • 21. Distributed Computing → Parallelism
  • 22. Distributed Computing → Parallelism. Associativity → enables parallelism
  • 24. Distributed Computing → Parallelism. Associativity enables parallelism. Identity means we can ignore some data. Commutativity helps us ignore order.
  • 25. Typical Map Reduce ...
  • 26. Finding Top-K Elements in Scalding ... class TopKJob(args : Args) extends Job(args) { Tsv(args("input"), visitScheme).filter(…).leftJoinWithTiny(…).filter(…).groupBy('fieldOne) { _.sortWithTake(visitScheme -> top)(biggerSale) }.write(Tsv(…)) }
  • 27. .sortWithTake(…): Looking into .sortWithTake in Scalding, there’s one nice thing: class PriorityQueueMonoid[T](max : Int)(implicit order : Ordering[T]) extends Monoid[PriorityQueue[T]]
  • 28. class PriorityQueueMonoid[T](max : Int)(implicit order : Ordering[T]) extends Monoid[PriorityQueue[T]]. Let’s take a look: PQ1: 55, 45, 21, 3. PQ2: 100, 80, 40, 3. top-4 (PQ1 U PQ2): 100, 80, 55, 45. Priority Queue: can be empty; two Priority Queues can be “added” in any order; Associative + Commutative.
  • 29. class PriorityQueueMonoid[T](max : Int)(implicit order : Ordering[T]) extends Monoid[PriorityQueue[T]]. Let’s take a look: PQ1: 55, 45, 21, 3. PQ2: 100, 80, 40, 3. top-4 (PQ1 U PQ2): 100, 80, 55, 45. Priority Queue: makes Scalding go fast, by doing sorting, filtering and extracting in one single “map” step; can be empty; two Priority Queues can be “added” in any order; Associative + Commutative.
  • 30. Stream Mining Challenges: update predictions after each observation; single pass: can’t read old data or replay the stream; full size of the stream often unknown; limited time for computation per observation; O(1) memory size
  • 31. Stream Mining Challenges: http://radar.oreilly.com/2013/10/stream-mining-essentials.html
  • 32. Tradeoff: space and speed over accuracy.
  • 33. Tradeoff: space and speed over accuracy. Use sketches.
  • 34. Sketches: Probabilistic data structures that store a summary (mostly hashed) of a data set that would be costly to store in its entirety, thus providing, most of the time, sublinear algorithmic properties. E.g. Bloom Filters, Counter Sketch, KMV counters, Count Min Sketch, HyperLogLog, Min Hashes
  • 35. Bloom filters: Approximate data structure for set membership. Behaves like an approximate set. BloomFilter.contains(x) => NO | Maybe. P(False Positive) > 0, P(False Negative) = 0
  • 36. Internally: bit array of fixed size. add(x): for every hash function i, b[h(x,i)] = 1. contains(x): TRUE if b[h(x,i)] == 1 for all i (Boolean AND => associative). Both are associative => BF can be designed as a Monoid
  • 37. Bloom filters
      import com.twitter.algebird._
      import com.twitter.algebird.Operators._
      // generate 2 lists
      val A = (1 to 300).toList
      // Generate a Bloomfilter
      val NUM_HASHES = 6
      val WIDTH = 6000 // bits
      val SEED = 1
      implicit val bfm = new BloomFilterMonoid(NUM_HASHES, WIDTH, SEED)
      // approximate set with bloomfilter
      val A_bf = A.map{i => bfm.create(i.toString)}.reduce(_ + _)
      val approxBool = A_bf.contains("150") // ---> ApproximateBoolean(true, 0.9995…)
  • 38. Count Min Sketch: Gives an approximation of the number of occurrences of an element in a data set.
  • 39. Count Min Sketch: Adding an element is a numerical addition. Querying uses a MIN function. Both are associative. Useful for detecting heavy hitters, topK, LSH. We have in Algebird:
  • 40. HyperLogLog: Popular sketch for cardinality estimation. Gives the number of distinct values in a data set, within a probabilistic error bound. HLL.size = Approx[Number]. Intuition: long runs of trailing 0s in a random bit chain are rare, but the more bit chains you look at, the more likely you are to find a long one. The longest run of trailing 0-bits seen can be an estimator of the number of unique bit chains observed.
  • 41. Adding an element uses a Max and a Sum function. Both are associative and Monoids. (Max is really an ordered semigroup in Algebird.) Querying for an element uses a harmonic mean, which is a Monoid. In Algebird:
  • 42. Many more juicy sketches ...: MinHashes to compute Jaccard similarity; QTree for quantile estimation, neat for anomaly detection; SpaceSaverMonoid, awesome to find the approximate most frequent and top K elements; TopKMonoid; SGD, PriorityQueues, Histograms, etc.
  • 43. SummingBird: Lambda in a box
  • 44. Heard of Lambda Architecture?
  • 45. SummingBird: Same code for both batch and real time processing.
  • 46. SummingBird: Same code for both batch and real time processing. But works only on Monoids. Uses Storehaus as a mergeable store layer.
  • 47. http://github.com/twitter/algebird
  • 48. http://github.com/twitter/algebird
  • 49. These slides: http://bit.ly/1szncAZ http://slidesha.re/1zhhXKU
  • 51. Links
      - Algebra for analytics by Oscar Boykin (creator of Algebird): http://speakerdeck.com/johnynek/algebra-for-analytics
      - Take a look into HLearn: https://github.com/mikeizbicki/HLearn
      - Great intro into Algebird by Michael Noll: http://www.michael-noll.com/blog/2013/12/02/twitter-algebird-monoid-monad-for-large-scala-data-analytics/
      - Aggregate Knowledge: http://research.neustar.biz/2012/10/25/sketch-of-the-day-hyperloglog-cornerstone-of-a-big-data-infrastructure
      - Probabilistic data structures for web analytics: http://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/
      - http://debasishg.blogspot.fr/2014/01/count-min-sketch-data-structure-for.html
      - http://infolab.stanford.edu/~ullman/mmds/ch3.pdf
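
The Semigroup and Monoid traits from slides 10 to 13, the top-4 priority-queue example from slide 28, and the "associativity enables parallelism" point from slide 24 can be sketched together in plain Scala. This is only an illustrative sketch and not Algebird's implementation: the object `MonoidSketch` and the helpers `intSum`, `topK` and `aggregateChunks` are hypothetical names, and a sorted `List[Int]` stands in for a real priority queue.

```scala
// Illustrative sketch only; all names here are hypothetical, not Algebird's API.
object MonoidSketch {
  // The two traits as they appear on the slides.
  trait Semigroup[T] { def aggregate(x: T, y: T): T }
  trait Monoid[T] extends Semigroup[T] { def identity: T }

  // Integer addition: associative, with 0 as identity.
  val intSum: Monoid[Int] = new Monoid[Int] {
    def identity: Int = 0
    def aggregate(x: Int, y: Int): Int = x + y
  }

  // Top-k monoid over lists that are sorted in descending order and hold
  // at most k elements (assumed invariant). The empty list is the identity.
  def topK(k: Int): Monoid[List[Int]] = new Monoid[List[Int]] {
    def identity: List[Int] = Nil
    def aggregate(x: List[Int], y: List[Int]): List[Int] =
      (x ++ y).sorted(Ordering[Int].reverse).take(k)
  }

  // Fold every chunk on its own (as independent workers could), then merge
  // the partial results. Associativity makes the grouping irrelevant, and
  // the identity gives a sensible result for empty chunks.
  def aggregateChunks[T](chunks: Seq[Seq[T]])(m: Monoid[T]): T =
    chunks
      .map(chunk => chunk.foldLeft(m.identity)(m.aggregate))
      .foldLeft(m.identity)(m.aggregate)
}
```

With the values from slide 28, `MonoidSketch.topK(4).aggregate(List(55, 45, 21, 3), List(100, 80, 40, 3))` yields `List(100, 80, 55, 45)`. Splitting the same input into different chunks for `aggregateChunks` does not change the result, which is the property a distributed aggregation relies on.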