Scala Parallel Collections 
Aleksandar Prokopec 
EPFL
Scala collections 
for { 
  s <- surnames 
  n <- names 
  if s endsWith n 
} yield (n, s) 
McDonald 
1040 ms
Scala parallel collections 
for { 
  s <- surnames.par 
  n <- names.par 
  if s endsWith n 
} yield (n, s) 
2 cores: 575 ms 
4 cores: 305 ms
for comprehensions nested parallelized bulk operations 
surnames.par.flatMap { s => 
  names.par 
    .filter(n => s endsWith n) 
    .map(n => (n, s)) 
}
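The equivalence between the for comprehension and the nested bulk operations can be checked on ordinary sequential collections. The sketch below is only an illustration: the `surnames` and `names` values are made-up sample data, and the object name `ForDesugaring` is hypothetical. Note that the compiler actually translates the `if` guard to `withFilter` rather than `filter`; the result is the same here.

```scala
object ForDesugaring {
  // Hypothetical sample data; the slides do not show the actual lists.
  val surnames = List("McDonald", "Johnson", "Smith")
  val names = List("Donald", "John", "son")

  // The for comprehension from the slides, on sequential collections.
  def viaFor: List[(String, String)] =
    for {
      s <- surnames
      n <- names
      if s.endsWith(n)
    } yield (n, s)

  // The same computation written as nested bulk operations.
  def viaBulkOps: List[(String, String)] =
    surnames.flatMap { s =>
      names
        .filter(n => s.endsWith(n))
        .map(n => (n, s))
    }
}
```

Because every step is a bulk operation on a collection, switching the receivers to parallel collections parallelizes both the outer and the inner level.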
Nested parallelism
Nested parallelism parallel within parallel 
composition 
surnames.par.flatMap { s => 
surnameToCollection(s) 
// may invoke parallel ops 
}
Nested parallelism going recursive 
recursive algorithms 
def vowel(c: Char): Boolean = ... 
def gen(n: Int, acc: Seq[String]): Seq[String] = 
  if (n == 0) acc 
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield 
    if (s.length == 0) s + c 
    else if (vowel(s.last) && !vowel(c)) s + c 
    else if (!vowel(s.last) && vowel(c)) s + c 
    else s 
gen(5, Array("")) 
1545 ms
Nested parallelism going recursive 
def vowel(c: Char): Boolean = ... 
def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = 
  if (n == 0) acc 
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield 
    if (s.length == 0) s + c 
    else if (vowel(s.last) && !vowel(c)) s + c 
    else if (!vowel(s.last) && vowel(c)) s + c 
    else s 
gen(5, ParArray("")) 
1 core: 1575 ms 
2 cores: 809 ms 
4 cores: 530 ms
So, I just use par and I’m home free?
How to think parallel
Character count use case for foldLeft 
val txt: String = ... 
txt.foldLeft(0) { 
  case (a, ' ') => a 
  case (a, c) => a + 1 
} 
going left to right - not parallelizable! 
[Diagram: the elements A B C D E F are folded one after another with _ + 1, the count going from 0 to 6]
Character count use case for foldLeft 
going left to right – not really necessary 
[Diagram: the halves A B C and D E F are each counted from 0 to 3 with _ + 1, and the two partial counts are merged with _ + _ to give 6]
Character count in parallel 
txt.fold(0) { 
  case (a, ' ') => a 
  case (a, c) => a + 1 
} 
[Diagram: each chunk A B C is counted from 1 to 3 with _ + 1, an operator of type (Int, Char) => Int]
Character count fold not applicable 
[Diagram: merging the partial counts 3 and 3 with _ + _ needs an operator of type (Int, Int) => Int]
fold takes a single operator whose two arguments and result all have the same type, so it cannot express both _ + 1 and _ + _
Character count use case for aggregate 
txt.aggregate(0)({ 
  case (a, ' ') => a 
  case (a, c) => a + 1 
}, _ + _) 
first operator, aggregation and element: _ + 1 
second operator, aggregation and aggregation: _ + _ 
[Diagram: each chunk A B C is counted from 1 to 3 with _ + 1, then the partial counts 3 and 3 are merged with _ + _]
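To make the two roles concrete, the sketch below simulates what `aggregate` does without using the parallel collections library: the text is cut into chunks, each chunk is folded with the first operator, and the partial counts are merged with the second. The object name `CharCount` and the `chunked` helper are hypothetical, and the chunks are processed sequentially here; only the decomposition is being illustrated.

```scala
object CharCount {
  // First operator: folds one character into a partial count.
  val seqop: (Int, Char) => Int = {
    case (a, ' ') => a
    case (a, _)   => a + 1
  }

  // Second operator: merges two partial counts.
  val combop: (Int, Int) => Int = _ + _

  // Sequential reference: a single left-to-right pass.
  def sequential(txt: String): Int = txt.foldLeft(0)(seqop)

  // Simulated aggregate: count every chunk independently, then merge.
  // chunkSize must be positive.
  def chunked(txt: String, chunkSize: Int): Int =
    txt.grouped(chunkSize).map(_.foldLeft(0)(seqop)).foldLeft(0)(combop)
}
```

Since addition is associative and 0 is its neutral element, the result does not depend on where the chunk boundaries fall.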
Word count another use case for foldLeft 
txt.foldLeft((0, true)) { 
  case ((wc, _), ' ') => (wc, true) 
  case ((wc, true), x) => (wc + 1, false) 
  case ((wc, false), x) => (wc, false) 
} 
“Folding me softly.” 
initial accumulation (0, true): 0 words so far, last character was a space 
a space: last seen character is a space 
a non space, last seen character was a space: a new word 
a non space, last seen character wasn’t a space: no new word
Word count in parallel 
P1: “Folding me “ gives wc = 2; rs = 1 
P2: “softly.“ gives wc = 1; ls = 0 
combined: wc = 3 
Word count must assume arbitrary partitions 
P1: “Foldin“ gives wc = 1; rs = 0 
P2: “g me softly.“ gives wc = 3; ls = 0 
combined: wc = 3 (1 + 3 - 1, because there is no space on either side of the boundary, so one word was counted twice)
Word count initial aggregation 
txt.par.aggregate((0, 0, 0)) 
(# spaces on the left, # words, # spaces on the right) 
(0, 0, 0) is the aggregation of the empty string ””
Word count combining two aggregations (second operator) 
... 
}, { 
  // one side is empty, e.g. ““ with “Folding me“, or “softly.“ with ““: keep the other side 
  case ((0, 0, 0), res) => res 
  case (res, (0, 0, 0)) => res 
  // no space at the boundary, e.g. “Folding m“ with “e softly.“: one word was counted twice 
  case ((lls, lwc, 0), (0, rwc, rrs)) => 
    (lls, lwc + rwc - 1, rrs) 
  // a space at the boundary, e.g. “Folding me” with “ softly.“: add the word counts 
  case ((lls, lwc, _), (_, rwc, rrs)) => 
    (lls, lwc + rwc, rrs) 
Word count combining an aggregation with an element (first operator) 
txt.par.aggregate((0, 0, 0))({ 
  // ”_”: 0 words and a space - add one more space each side 
  case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
  // ” m”: 0 words and a non-space - one word, no spaces on the right side 
  case ((ls, 0, _), c) => (ls, 1, 0) 
  // ” me_”: nonzero words and a space - one more space on the right side 
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
  // ” me sof”: nonzero words, last non-space and current non-space - no change 
  case ((ls, wc, 0), c) => (ls, wc, 0) 
  // ” me s”: nonzero words, last space and current non-space - one more word 
  case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
Word count in parallel 
txt.par.aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
case ((ls, 0, _), c) => (ls, 1, 0) 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
case ((ls, wc, 0), c) => (ls, wc, 0) 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
}, { 
case ((0, 0, 0), res) => res 
case (res, (0, 0, 0)) => res 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) 
})
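The claim that the two operators above work for arbitrary partitions can be checked without any parallel machinery. The sketch below is an illustration under stated assumptions: it reuses the two operators from the slides, but the object name `WordCount` and the helpers `chunk` and `countSplitAt` are hypothetical, and the "partitions" are simply the two halves of the string cut at a chosen index.

```scala
object WordCount {
  // (spaces on the left, words, spaces on the right)
  type Acc = (Int, Int, Int)

  // First operator: folds one character into a partial aggregation.
  val seqop: (Acc, Char) => Acc = {
    case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
    case ((ls, 0, _), _)     => (ls, 1, 0)
    case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
    case ((ls, wc, 0), _)    => (ls, wc, 0)
    case ((ls, wc, _), _)    => (ls, wc + 1, 0)
  }

  // Second operator: merges the aggregations of two adjacent chunks.
  val combop: (Acc, Acc) => Acc = {
    case ((0, 0, 0), res) => res
    case (res, (0, 0, 0)) => res
    case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
    case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
  }

  // Aggregation of a single chunk, computed left to right.
  def chunk(s: String): Acc = s.foldLeft((0, 0, 0): Acc)(seqop)

  // Word count obtained by cutting txt at index k and merging both halves.
  def countSplitAt(txt: String, k: Int): Int =
    combop(chunk(txt.take(k)), chunk(txt.drop(k)))._2

  // Sequential reference from the earlier foldLeft slide.
  def sequential(txt: String): Int =
    txt.foldLeft((0, true)) {
      case ((wc, _), ' ')   => (wc, true)
      case ((wc, true), _)  => (wc + 1, false)
      case ((wc, false), _) => (wc, false)
    }._1
}
```

Trying every cut position of “Folding me softly.” gives the same word count as the sequential fold, including the cuts that fall in the middle of a word.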
Word count using parallel strings? 
txt.par.aggregate((0, 0, 0))({ ... }, { ... }) 
(same two operators as on the previous slide)
Word count string not really parallelizable 
scala> (txt: String).par 
collection.parallel.ParSeq[Char] = ParArray(…) 
different internal representation! 
ParArray: copy string contents into an array
Conversions going parallel 
// `par` is efficient for... 
mutable.{Array, ArrayBuffer, ArraySeq} 
mutable.{HashMap, HashSet} 
immutable.{Vector, Range} 
immutable.{HashMap, HashSet} 
most other collections construct a new parallel collection!
Conversions going parallel 
sequential | parallel 
Array, ArrayBuffer, ArraySeq | mutable.ParArray 
mutable.HashMap | mutable.ParHashMap 
mutable.HashSet | mutable.ParHashSet 
immutable.Vector | immutable.ParVector 
immutable.Range | immutable.ParRange 
immutable.HashMap | immutable.ParHashMap 
immutable.HashSet | immutable.ParHashSet
Conversions going parallel 
// `seq` is always efficient 
ParArray(1, 2, 3).seq 
List(1, 2, 3, 4).seq 
ParHashMap(1 -> 2, 3 -> 4).seq 
"abcd".seq 
// `par` may not be... 
"abcd".par
Custom collections
Custom collection 
class ParString(val str: String) 
extends parallel.immutable.ParSeq[Char] { 
  def apply(i: Int) = str.charAt(i) 
  def length = str.length 
  def seq = new WrappedString(str) 
  def splitter: Splitter[Char] = 
    new ParStringSplitter(0, str.length)
Custom collection splitter definition 
class ParStringSplitter(var i: Int, len: Int) 
extends Splitter[Char] { 
  // splitters are iterators 
  def hasNext = i < len 
  def next = { 
    val r = str.charAt(i) 
    i += 1 
    r 
  } 
  // splitters must be duplicated 
  def dup = new ParStringSplitter(i, len) 
  // splitters know how many elements remain 
  def remaining = len - i 
  // splitters can be split 
  def psplit(sizes: Int*): Seq[ParStringSplitter] = { 
    val splitted = new ArrayBuffer[ParStringSplitter] 
    for (sz <- sizes) { 
      val next = (i + sz) min len 
      splitted += new ParStringSplitter(i, next) 
      i = next 
    } 
    splitted 
  } 
}
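The splitter contract (iterate, report the remaining size, duplicate, split into sub-ranges of requested sizes) can be tried out without the parallel collections library. The sketch below is a hypothetical stand-in, not the library's `Splitter` trait: `StringRangeSplitter` extends a plain `Iterator[Char]`, takes the string explicitly, and uses the assumed parameter names `i` and `limit`.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical, library-free analogue of the slides' ParStringSplitter:
// it iterates over the characters of str at indices i until limit.
class StringRangeSplitter(str: String, private var i: Int, limit: Int)
    extends Iterator[Char] {

  def hasNext: Boolean = i < limit

  def next(): Char = {
    val r = str.charAt(i)
    i += 1
    r
  }

  // Number of elements not yet traversed.
  def remaining: Int = limit - i

  // Independent copy positioned at the same index.
  def dup: StringRangeSplitter = new StringRangeSplitter(str, i, limit)

  // Hands out consecutive sub-ranges of the requested sizes and advances
  // this splitter past them; no characters are copied.
  def psplit(sizes: Int*): Seq[StringRangeSplitter] = {
    val parts = new ArrayBuffer[StringRangeSplitter]
    for (sz <- sizes) {
      val stop = math.min(i + sz, limit)
      parts += new StringRangeSplitter(str, i, stop)
      i = stop
    }
    parts.toSeq
  }
}
```

Splitting only creates new index pairs over the same string, which is what makes it cheap enough to be done repeatedly by a scheduler.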
Word count now with parallel strings 
new ParString(txt).aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
case ((ls, 0, _), c) => (ls, 1, 0) 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
case ((ls, wc, 0), c) => (ls, wc, 0) 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
}, { 
case ((0, 0, 0), res) => res 
case (res, (0, 0, 0)) => res 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) 
})
Word count performance 
txt.foldLeft((0, true)) { 
  case ((wc, _), ' ') => (wc, true) 
  case ((wc, true), x) => (wc + 1, false) 
  case ((wc, false), x) => (wc, false) 
} 
100 ms 
new ParString(txt).aggregate((0, 0, 0))({ ... }, { ... }) 
(same two operators as on the previous slide) 
cores | 1 | 2 | 4 
time | 137 ms | 70 ms | 35 ms
Hierarchy 
[Diagram: class hierarchy. General: GenTraversable, GenIterable, GenSeq. Sequential: Traversable, Iterable, Seq. Parallel: ParIterable, ParSeq.]
Hierarchy 
def nonEmpty(sq: Seq[String]) = { 
val res = new mutable.ArrayBuffer[String]() 
for (s <- sq) { 
if (s.nonEmpty) res += s 
} 
res 
}
Hierarchy 
def nonEmpty(sq: ParSeq[String]) = { 
  val res = new mutable.ArrayBuffer[String]() 
  for (s <- sq) { 
    if (s.nonEmpty) res += s 
  } 
  res 
} 
side-effects! 
ArrayBuffer is not synchronized!
ParSeq 
Seq
Hierarchy 
def nonEmpty(sq: GenSeq[String]) = { 
val res = new mutable.ArrayBuffer[String]() 
for (s <- sq) { 
if (s.nonEmpty) res.synchronized { 
res += s 
} 
} 
res 
}
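The effect of guarding the shared buffer can be reproduced with plain threads. The sketch below is an illustration under assumptions: the object `SharedBuffer` and its methods are hypothetical, and explicit `Thread`s stand in for the workers of a parallel collection. It also shows the side-effect-free alternative, `filter`, which needs no locking at all.

```scala
import scala.collection.mutable.ArrayBuffer

object SharedBuffer {
  // One thread per chunk appends to a single ArrayBuffer; every append
  // takes the buffer's monitor, as in the synchronized slide version.
  def collectNonEmpty(chunks: List[List[String]]): ArrayBuffer[String] = {
    val res = new ArrayBuffer[String]()
    val workers = chunks.map { chunk =>
      new Thread(new Runnable {
        def run(): Unit =
          for (s <- chunk) {
            if (s.nonEmpty) res.synchronized { res += s }
          }
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join()) // join makes all appends visible to the caller
    res
  }

  // Side-effect-free alternative: no shared state, no synchronization.
  def nonEmptyPure(sq: Seq[String]): Seq[String] = sq.filter(_.nonEmpty)
}
```

With the lock, no element is lost, but the order in which the threads append is not deterministic; the `filter` version keeps the original order.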
Accessors vs. transformers some methods need more than just splitters 
accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … 
transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … 
These return collections! 
Sequential collections – builders 
Parallel collections – combiners
Builders building a sequential collection 
[Diagram: a ListBuilder appends the elements 1 to 7 one at a time with += to a list ending in Nil, and result returns the finished list]
How to build parallel?
Combiners building parallel collections 
trait Combiner[-Elem, +To] 
extends Builder[Elem, To] { 
  def combine[N <: Elem, NewTo >: To] 
    (other: Combiner[N, NewTo]): 
    Combiner[N, NewTo] 
} 
[Diagram: two Combiners are merged into one Combiner]
Should be efficient – O(log n) worst case 
How to implement this combine?
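Parallel arrays 
[Diagram: the array chunks 1, 2, 3, 4 | 5, 6, 7, 8 | 3, 1, 8, 0 | 2, 2, 1, 9 produce partial results that are merged pairwise; a result array is then allocated and the partial results are copied into it: 2, 4, 6, 8, 8, 0, 2, 2]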
Parallel hash tables 
ParHashMap with the elements 0, 1, 2, 4, 5, 7, 8, 9 
e.g. calling filter 
[Diagram: two ParHashCombiners are filled independently, one with 0, 1, 4 and the other with 5, 7, 9]
How to merge? 
buckets! 
0 = 0000₂ 
1 = 0001₂ 
4 = 0100₂ 
combine 
[Diagram: the buckets of the two ParHashCombiners are joined into a single ParHashCombiner holding 0, 1, 4, 5, 7, 9]
no copying! 
[Diagram: the resulting ParHashCombiner is turned into a ParHashMap]
Custom combiners for methods returning custom collections 
new ParString(txt).filter(_ != ' ') 
What is the return type here? 
creates a ParVector!
Custom combiners for methods returning custom collections 
class ParString(val str: String) 
extends immutable.ParSeq[Char] 
with ParSeqLike[Char, ParString, WrappedString] 
{ 
  def apply(i: Int) = str.charAt(i) 
  ... 
  protected[this] override def newCombiner 
    : Combiner[Char, ParString] = 
    new ParStringCombiner
Custom combiners for methods returning custom collections 
class ParStringCombiner 
extends Combiner[Char, ParString] { 
  var size = 0 
  val chunks = ArrayBuffer(new StringBuilder) 
  var lastc = chunks.last 
  def +=(elem: Char) = { 
    lastc += elem 
    size += 1 
    this 
  } 
[Diagram: size is incremented by 1 and the element is appended to lastc, the last StringBuilder in chunks]
Custom combiners for methods returning custom collections 
... 
def combine[U <: Char, NewTo >: ParString] 
(other: Combiner[U, NewTo]) = other match { 
  case psc: ParStringCombiner => 
    size += psc.size 
    chunks ++= psc.chunks 
    lastc = chunks.last 
    this 
} 
[Diagram: the chunks of the other combiner are appended to this combiner's chunks, and lastc is moved to the new last chunk]
Custom combiners for methods returning custom collections 
... 
def result = { 
  val rsb = new StringBuilder 
  for (sb <- chunks) rsb.append(sb) 
  new ParString(rsb.toString) 
} 
... 
[Diagram: result appends every chunk to a single StringBuilder]
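The chunk-based design of the combiner can be exercised on its own. The sketch below is a hypothetical, library-free analogue of `ParStringCombiner`: the class `ChunkedStringCombiner` does not extend the library's `Combiner` trait and produces a plain `String` instead of a `ParString`, so only the bookkeeping of `size`, `chunks` and `lastc` is being illustrated.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical analogue of the slides' ParStringCombiner without the
// parallel collections library; result yields a String.
class ChunkedStringCombiner {
  var size = 0
  val chunks = ArrayBuffer(new StringBuilder)
  var lastc = chunks.last

  // Appends one element to the last chunk.
  def +=(elem: Char): this.type = {
    lastc += elem
    size += 1
    this
  }

  // Merges by appending the other combiner's chunk references;
  // no characters are copied at this point.
  def combine(that: ChunkedStringCombiner): ChunkedStringCombiner =
    if (that eq this) this
    else {
      size += that.size
      chunks ++= that.chunks
      lastc = chunks.last
      this
    }

  // Copies all chunks into one builder; this is the sequential step.
  def result: String = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb)
    rsb.toString
  }
}
```

In this sketch the cost of `combine` depends on the number of chunks rather than the number of characters, while all character copying is deferred to `result`.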
Custom combiners for methods expecting implicit builder factories 
// only for big boys 
... 
with GenericParTemplate[T, ParColl] 
... 
object ParColl extends ParFactory[ParColl] { 
implicit def canCombineFrom[T] = 
new GenericCanCombineFrom[T] 
...
Custom combiners performance measurement 
txt.filter(_ != ' ') 
106 ms 
new ParString(txt).filter(_ != ' ') 
cores | 1 | 2 | 4 
time | 125 ms | 81 ms | 56 ms 
[Plot: running time t/ms against the number of processors (1, 2, 4): 125 ms, 81 ms, 56 ms]
def result (not parallelized)
Custom combiners tricky! 
• two-step evaluation 
  – parallelize the result method in combiners 
• efficient merge operation 
  – binomial heaps, ropes, etc. 
• concurrent data structures 
  – non-blocking scalable insertion operation 
  – we’re working on this
Future work coming up 
• concurrent data structures 
• more efficient vectors 
• custom task pools 
• user defined scheduling 
• parallel bulk in-place modifications
Thank you! 
Examples at: 
git://github.com/axel22/sd.git

Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptxVinzoCenzo
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfkalichargn70th171
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
SoftTeco - Software Development Company Profile
SoftTeco - Software Development Company ProfileSoftTeco - Software Development Company Profile
SoftTeco - Software Development Company Profileakrivarotava
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencessuser9e7c64
 

Recently uploaded (20)

Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptx
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
SoftTeco - Software Development Company Profile
SoftTeco - Software Development Company ProfileSoftTeco - Software Development Company Profile
SoftTeco - Software Development Company Profile
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 

Scala Parallel Collections

  • 1. Scala Parallel Collections Aleksandar Prokopec EPFL
  • 2.
  • 3. Scala collections for { s <- surnames n <- names if s endsWith n } yield (n, s) McDonald
  • 4. Scala collections for { s <- surnames n <- names if s endsWith n } yield (n, s) 1040 ms
  • 5.
  • 6. Scala parallel collections for { s <- surnames n <- names if s endsWith n } yield (n, s)
  • 7. Scala parallel collections for { s <- surnames.par n <- names.par if s endsWith n } yield (n, s)
  • 8. Scala parallel collections for { s <- surnames.par n <- names.par if s endsWith n } yield (n, s) 2 cores 575 ms
  • 9. Scala parallel collections for { s <- surnames.par n <- names.par if s endsWith n } yield (n, s) 4 cores 305 ms
  • 10. for comprehensions surnames.par.flatMap { s => names.par .filter(n => s endsWith n) .map(n => (n, s)) }
  • 11. for comprehensions nested parallelized bulk operations surnames.par.flatMap { s => names.par .filter(n => s endsWith n) .map(n => (n, s)) }
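The equivalence between the for comprehension and the chained bulk operations can be checked on ordinary sequential collections. The sketch below uses small hypothetical sample data (the slides do not give the contents of `surnames` and `names`) and omits `.par` so that it runs on the standard library alone. Strictly speaking, the compiler rewrites the `if` guard to `withFilter` rather than `filter`, but the results are the same here.

```scala
object ForDesugar {
  // Hypothetical sample data, not taken from the talk.
  val surnames: Seq[String] = Seq("McDonald", "Johnson")
  val names: Seq[String] = Seq("Donald", "John", "son")

  // The for comprehension from the slides (sequential version).
  val viaFor: Seq[(String, String)] =
    for {
      s <- surnames
      n <- names
      if s endsWith n
    } yield (n, s)

  // The same computation written as nested bulk operations.
  val viaOps: Seq[(String, String)] =
    surnames.flatMap { s =>
      names
        .filter(n => s endsWith n)
        .map(n => (n, s))
    }
}
```

Both values contain the same pairs in the same order, which is why adding `.par` to the generators parallelizes the `flatMap`, `filter` and `map` calls underneath.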
  • 13. Nested parallelism parallel within parallel composition surnames.par.flatMap { s => surnameToCollection(s) // may invoke parallel ops }
  • 14. Nested parallelism going recursive def vowel(c: Char): Boolean = ...
  • 15. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc
  • 16. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield recursive algorithms
  • 17. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c
  • 18. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c
  • 19. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, Array(""))
  • 20. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, Array("")) 1545 ms
  • 21. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, ParArray(""))
  • 22. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, ParArray("")) 1 core 1575 ms
  • 23. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, ParArray("")) 2 cores 809 ms
  • 24. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, ParArray("")) 4 cores 530 ms
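For readers who want to run the recursive generator, here is a sequential sketch that needs only the standard library. The body of `vowel` is elided in the slides, so the definition below is an assumption made for illustration. The parallel variant on the slides differs only in using `ParSeq` and `ParArray`, which come from the parallel collections module.

```scala
object GenSketch {
  // Assumption: the slides elide this body; here a vowel is one of a, e, i, o, u.
  def vowel(c: Char): Boolean = "aeiou".indexOf(c) >= 0

  // Each recursion level tries to extend every string by every letter,
  // keeping the extension only if vowels and consonants alternate.
  def gen(n: Int, acc: Seq[String]): Seq[String] =
    if (n == 0) acc
    else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield {
      if (s.length == 0) s + c
      else if (vowel(s.last) && !vowel(c)) s + c
      else if (!vowel(s.last) && vowel(c)) s + c
      else s
    }
}
```

Every level multiplies the number of results by 26 (rejected extensions yield the unextended string again), so the work grows quickly with `n`.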
  • 25. So, I just use par and I’m home free?
  • 26. How to think parallel
  • 27. Character count use case for foldLeft val txt: String = ... txt.foldLeft(0) { case (a, ' ') => a case (a, c) => a + 1 }
  • 28. Character count use case for foldLeft txt.foldLeft(0) { case (a, ' ') => a case (a, c) => a + 1 } going left to right - not parallelizable! [diagram: the accumulator runs 0 1 2 3 4 5 6 over A B C D E F, applying _ + 1 at each element]
  • 29. Character count use case for foldLeft txt.foldLeft(0) { case (a, ' ') => a case (a, c) => a + 1 } going left to right – not really necessary [diagram: A B C and D E F are each counted 0 1 2 3 with _ + 1, then the two partial counts are merged with _ + _ to give 6]
  • 30. Character count in parallel txt.fold(0) { case (a, ' ') => a case (a, c) => a + 1 }
  • 31. Character count in parallel txt.fold(0) { case (a, ' ') => a case (a, c) => a + 1 } [diagram: counting A B C element by element with _ + 1 has type (Int, Char) => Int]
  • 32. Character count fold not applicable txt.fold(0) { case (a, ' ') => a case (a, c) => a + 1 } [diagram: merging two partial counts of 3 with _ + _ needs type (Int, Int) => Int!]
  • 33. Character count use case for aggregate txt.aggregate(0)({ case (a, ' ') => a case (a, c) => a + 1 }, _ + _)
  • 34. Character count use case for aggregate txt.aggregate(0)({ case (a, ' ') => a case (a, c) => a + 1 }, _ + _) [diagram: each partition A B C is counted 1 2 3 with _ + 1, then the partial counts 3 and 3 are merged with _ + _]
  • 35. Character count use case for aggregate txt.aggregate(0)({ case (a, ' ') => a case (a, c) => a + 1 }, _ + _) first operator (_ + 1 in the diagram): aggregation with element
  • 36. Character count use case for aggregate txt.aggregate(0)({ case (a, ' ') => a case (a, c) => a + 1 }, _ + _) first operator (_ + 1 in the diagram): aggregation with element; second operator (_ + _): aggregation with aggregation
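The two operators passed to `aggregate` can be exercised without any parallel machinery by splitting the string into chunks by hand. The sketch below is an illustration under that assumption: `countChunked` and the chunk size are hypothetical names, and the chunks are processed one after another rather than in parallel.

```scala
object CharCount {
  // Aggregation with element: (Int, Char) => Int
  val seqop: (Int, Char) => Int = {
    case (a, ' ') => a
    case (a, _)   => a + 1
  }

  // Aggregation with aggregation: (Int, Int) => Int
  val combop: (Int, Int) => Int = _ + _

  // Simulates partitions: fold each chunk with seqop, merge the partial counts with combop.
  def countChunked(txt: String, chunk: Int): Int =
    txt.grouped(chunk).map(_.foldLeft(0)(seqop)).foldLeft(0)(combop)
}
```

Because addition is associative and 0 is its neutral element, any chunk size gives the same answer as a single left-to-right `foldLeft`, which is the property the parallel version relies on.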
  • 37. Word count another use case for foldLeft txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) }
  • 38. Word count initial accumulation txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } 0 words so far last character was a space “Folding me softly.”
  • 39. Word count a space txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } “Folding me softly.” last seen character is a space
  • 40. Word count a non space txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } “Folding me softly.” last seen character was a space – a new word
  • 41. Word count a non space txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } “Folding me softly.” last seen character wasn’t a space – no new word
  • 42. Word count in parallel P1: “Folding me “ P2: “softly.“
  • 43. Word count in parallel P1: “Folding me “ (wc = 2; rs = 1) P2: “softly.“ (wc = 1; ls = 0)
  • 44. Word count in parallel P1: “Folding me “ (wc = 2; rs = 1) P2: “softly.“ (wc = 1; ls = 0) combined: wc = 3
  • 45. Word count must assume arbitrary partitions P1: “Foldin“ (wc = 1; rs = 0) P2: “g me softly.“ (wc = 3; ls = 0)
  • 46. Word count must assume arbitrary partitions P1: “Foldin“ (wc = 1; rs = 0) P2: “g me softly.“ (wc = 3; ls = 0) combined: wc = 3
  • 47. Word count initial aggregation txt.par.aggregate((0, 0, 0))
  • 48. Word count initial aggregation txt.par.aggregate((0, 0, 0)) tuple fields: (# spaces on the left, # words, # spaces on the right)
  • 49. Word count initial aggregation txt.par.aggregate((0, 0, 0)) tuple fields: (# spaces on the left, # words, # spaces on the right); the empty string ”” corresponds to (0, 0, 0)
  • 50. Word count aggregation with aggregation ... }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res examples: ““ merged with “Folding me“; “softly.“ merged with ““
  • 51. Word count aggregation with aggregation ... }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) example: “Folding m“ merged with “e softly.“
  • 52. Word count aggregation with aggregation ... }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) example: “Folding me” merged with “ softly.“
  • 53. Word count aggregation with element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) ”_” 0 words and a space – add one more space each side
  • 54. Word count aggregation with element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) ” m” 0 words and a non-space – one word, no spaces on the right side
  • 55. Word count aggregation with element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) ” me_” nonzero words and a space – one more space on the right side
  • 56. Word count aggregation with element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) ” me sof” nonzero words, last non-space and current non-space – no change
  • 57. Word count aggregation with element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) ” me s” nonzero words, last space and current non-space – one more word
  • 58. Word count in parallel txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
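To see that the two operators give the right answer for arbitrary partitions, they can be pulled out into named functions and applied to hand-made chunks. This is a sequential sketch on the standard library only; `step`, `merge`, `zero` and `countChunked` are names chosen here for illustration, while the case clauses are those of the slide.

```scala
object WordCount {
  // (spaces on the left, words, spaces on the right)
  type Acc = (Int, Int, Int)
  val zero: Acc = (0, 0, 0)

  // Aggregation with element, as on the slide.
  def step(acc: Acc, c: Char): Acc = (acc, c) match {
    case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
    case ((ls, 0, _), _)     => (ls, 1, 0)
    case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
    case ((ls, wc, 0), _)    => (ls, wc, 0)
    case ((ls, wc, _), _)    => (ls, wc + 1, 0)
  }

  // Aggregation with aggregation, as on the slide.
  def merge(l: Acc, r: Acc): Acc = (l, r) match {
    case ((0, 0, 0), res)                 => res
    case (res, (0, 0, 0))                 => res
    case ((lls, lwc, 0), (0, rwc, rrs))   => (lls, lwc + rwc - 1, rrs) // a word spans the cut
    case ((lls, lwc, _), (_, rwc, rrs))   => (lls, lwc + rwc, rrs)
  }

  // Simulates partitions of a fixed size, processed one after another.
  def countChunked(txt: String, chunk: Int): Int =
    txt.grouped(chunk).map(_.foldLeft(zero)(step)).foldLeft(zero)(merge)._2
}
```

Cutting “Folding me softly.” at any single position and merging the two halves yields 3 words, including the cut inside “Folding” where the `- 1` case applies.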
  • 59. Word count using parallel strings? txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
  • 60. Word count string not really parallelizable scala> (txt: String).par
  • 61. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…)
  • 62. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation!
  • 63. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation! ParArray
  • 64. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation! ParArray  copy string contents into an array
  • 65. Conversions going parallel // `par` is efficient for... mutable.{Array, ArrayBuffer, ArraySeq} mutable.{HashMap, HashSet} immutable.{Vector, Range} immutable.{HashMap, HashSet}
  • 66. Conversions going parallel // `par` is efficient for... mutable.{Array, ArrayBuffer, ArraySeq} mutable.{HashMap, HashSet} immutable.{Vector, Range} immutable.{HashMap, HashSet} most other collections construct a new parallel collection!
  • 67. Conversions going parallel sequential → parallel: Array, ArrayBuffer, ArraySeq → mutable.ParArray; mutable.HashMap → mutable.ParHashMap; mutable.HashSet → mutable.ParHashSet; immutable.Vector → immutable.ParVector; immutable.Range → immutable.ParRange; immutable.HashMap → immutable.ParHashMap; immutable.HashSet → immutable.ParHashSet
  • 68. Conversions going parallel // `seq` is always efficient ParArray(1, 2, 3).seq List(1, 2, 3, 4).seq ParHashMap(1 -> 2, 3 -> 4).seq "abcd".seq // `par` may not be... "abcd".par
  • 70. Custom collection class ParString(val str: String)
  • 71. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] {
  • 72. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length
  • 73. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str)
  • 74. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str) def splitter: Splitter[Char]
  • 75. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str) def splitter = new ParStringSplitter(0, str.length)
  • 76. Custom collection splitter definition class ParStringSplitter(var i: Int, len: Int) extends Splitter[Char] {
  • 77. Custom collection splitters are iterators class ParStringSplitter(var i: Int, len: Int) extends Splitter[Char] { def hasNext = i < len def next = { val r = str.charAt(i) i += 1 r }
  • 78. Custom collection splitters must be duplicated ... def dup = new ParStringSplitter(i, len)
  • 79. Custom collection splitters know how many elements remain ... def dup = new ParStringSplitter(i, len) def remaining = len - i
  • 80. Custom collection splitters can be split ... def psplit(sizes: Int*): Seq[ParStringSplitter] = { val splitted = new ArrayBuffer[ParStringSplitter] for (sz <- sizes) { val next = (i + sz) min len splitted += new ParStringSplitter(i, next) i = next } splitted }
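The splitter contract (iterate, duplicate, report the remaining size, split into sub-ranges) can be tried out in isolation. The class below is a standalone sketch: it extends the standard `Iterator` rather than the real `Splitter` trait from the parallel collections module, and the name `StringSplitterSketch` is hypothetical. The string is passed in explicitly because the sketch is not nested inside `ParString`.

```scala
import scala.collection.mutable.ArrayBuffer

// Iterates over str(i) until str(len - 1); not the library's Splitter trait.
final class StringSplitterSketch(str: String, private var i: Int, len: Int)
    extends Iterator[Char] {

  def hasNext: Boolean = i < len

  def next(): Char = {
    val r = str.charAt(i)
    i += 1
    r
  }

  // An independent copy positioned at the same index.
  def dup: StringSplitterSketch = new StringSplitterSketch(str, i, len)

  // Number of elements not yet traversed.
  def remaining: Int = len - i

  // Hands out consecutive sub-ranges of the requested sizes and advances past them.
  def psplit(sizes: Int*): Seq[StringSplitterSketch] = {
    val parts = new ArrayBuffer[StringSplitterSketch]
    for (sz <- sizes) {
      val end = (i + sz) min len
      parts += new StringSplitterSketch(str, i, end)
      i = end
    }
    parts.toSeq
  }
}
```

Splitting only creates new index ranges over the same string, so no characters are copied, which is the point of writing a custom splitter instead of converting to a `ParArray`.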
  • 81. Word count now with parallel strings new ParString(txt).aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
  • 82. Word count performance txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } new ParString(txt).aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) }) sequential foldLeft: 100 ms; ParString aggregate: 137 ms on 1 core, 70 ms on 2 cores, 35 ms on 4 cores
  • 83. Hierarchy general: GenTraversable → GenIterable → GenSeq; sequential: Traversable → Iterable → Seq; parallel: ParIterable → ParSeq
  • 84. Hierarchy def nonEmpty(sq: Seq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res }
  • 85. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res }
  • 86. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res } side-effects! ArrayBuffer is not synchronized!
  • 87. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res } side-effects! ArrayBuffer is not synchronized! ParSeq Seq
  • 88. Hierarchy def nonEmpty(sq: GenSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res.synchronized { res += s } } res }
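An alternative to synchronizing on a shared buffer is to avoid the side effect altogether and let a transformer method build the result. The sketch below shows this on a plain sequential `Seq` so that it runs on the standard library alone; the object name is hypothetical.

```scala
object NonEmptySketch {
  // No shared mutable state: filter builds and returns a new collection.
  def nonEmpty(sq: Seq[String]): Seq[String] = sq.filter(_.nonEmpty)
}
```

Written this way the function body contains nothing that could race, so the same expression stays correct whether the collection it is applied to is traversed sequentially or in parallel.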
  • 89. Accessors vs. transformers some methods need more than just splitters accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, …
  • 90. Accessors vs. transformers some methods need more than just splitters accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … These (the transformers) return collections!
  • 91. Accessors vs. transformers some methods need more than just splitters accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … Sequential collections – builders
  • 92. Accessors vs. transformers some methods need more than just splitters accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … Sequential collections – builders Parallel collections – combiners
  • 93. Builders building a sequential collection [diagram: a ListBuilder receives the elements 1 to 7 through repeated += calls, then result returns the list terminated by Nil]
  • 94. How to build parallel?
  • 95. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] }
  • 96. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } Combiner Combiner Combiner
  • 97. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } Should be efficient – O(log n) worst case
  • 98. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } How to implement this combine?
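  • 99. Parallel arrays [diagram: four chunks are processed in parallel: 1, 2, 3, 4 gives 2, 4; 5, 6, 7, 8 gives 6, 8; 3, 1, 8, 0 gives 8, 0; 2, 2, 1, 9 gives 2, 2; the partial results are merged pairwise, then the result array is allocated and the chunks are copied into it: 2 4 6 8 8 0 2 2]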
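The merge, allocate and copy scheme for arrays can be sketched as a small standalone class. This is an illustration only: `ChunkedIntCombiner` is a hypothetical name, it does not extend the library's `Combiner` trait, and it is specialized to `Int` to keep the array allocation simple.

```scala
import scala.collection.mutable.ArrayBuffer

// Keeps elements in a list of chunks so that combine never copies elements.
final class ChunkedIntCombiner {
  private val chunks = ArrayBuffer(ArrayBuffer.empty[Int])

  // Appends to the last chunk.
  def +=(x: Int): this.type = {
    chunks.last += x
    this
  }

  def size: Int = chunks.iterator.map(_.length).sum

  // Merging only links the other combiner's chunks; `that` must not be used afterwards.
  def combine(that: ChunkedIntCombiner): ChunkedIntCombiner = {
    if (that ne this) chunks ++= that.chunks
    this
  }

  // Allocate once, then copy every chunk to its offset.
  def result(): Array[Int] = {
    val out = new Array[Int](size)
    var offset = 0
    for (c <- chunks) {
      c.copyToArray(out, offset)
      offset += c.length
    }
    out
  }
}
```

In this sketch `combine` costs time proportional to the number of chunks being linked, not the number of elements, and the element copying is deferred to a single pass in `result`. In a real parallel array that final copy can itself be split across workers, since every chunk knows its target offset.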
  • 100. Parallel hash tables ParHashMap
  • 101. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 e.g. calling filter
  • 102. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner ParHashCombiner e.g. calling filter
  • 103. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner 0 1 4 ParHashCombiner 5 7 9
  • 104. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner 0 1 4 ParHashCombiner 5 9 5 7 0 1 4 7 9
  • 105. Parallel hash tables ParHashMap ParHashCombiner ParHashCombiner How to merge? 5 7 0 1 4 9
  • 106. Parallel hash tables buckets! ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner 0 1 4 ParHashCombiner 5 7 9 [diagram: elements are placed into buckets according to their binary hash codes: 0 = 0000₂, 1 = 0001₂, 4 = 0100₂]
  • 107. Parallel hash tables ParHashCombiner ParHashCombiner 0 1 4 9 7 5 combine
  • 108. Parallel hash tables ParHashCombiner ParHashCombiner 9 7 5 0 1 4 ParHashCombiner no copying!
  • 109. Parallel hash tables 9 7 5 0 1 4 ParHashCombiner
  • 110. Parallel hash tables 9 7 5 0 1 4 ParHashMap
  • 111. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') What is the return type here?
  • 112. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') creates a ParVector!
  • 113. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') creates a ParVector! class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) ...
  • 114. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ...
  • 115. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ... protected[this] override def newCombiner : Combiner[Char, ParString]
  • 116. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ... protected[this] override def newCombiner = new ParStringCombiner
  • 117. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] {
  • 118. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0
  • 119. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 size
  • 120. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) size
  • 121. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) size chunks
  • 122. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last size chunks
  • 123. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last size lastc chunks
  • 124. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last def +=(elem: Char) = { lastc += elem size += 1 this }
  • 125. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last def +=(elem: Char) = { lastc += elem size += 1 this } size lastc chunks +1
  • 126. Custom combiners for methods returning custom collections ... def combine[U <: Char, NewTo >: ParString] (other: Combiner[U, NewTo]) = other match { case that: ParStringCombiner => size += that.size chunks ++= that.chunks lastc = chunks.last this }
  • 127. Custom combiners for methods returning custom collections ... def combine[U <: Char, NewTo >: ParString] (other: Combiner[U, NewTo]) lastc chunks lastc chunks
  • 128. Custom combiners for methods returning custom collections ... def result = { val rsb = new StringBuilder for (sb <- chunks) rsb.append(sb) new ParString(rsb.toString) } ...
  • 129. Custom combiners for methods returning custom collections ... def result = ... lastc chunks StringBuilder
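Put together, the combiner built up over the preceding slides looks roughly like the sketch below. To keep it runnable with only the standard library, it is a simplified stand-in: `StringChunkCombiner` and `StringCombinerDemo` are hypothetical names, the class does not extend `Combiner[Char, ParString]`, and `result` returns a plain `String` instead of a `ParString`. The self-combine guard is an addition that the slides do not show.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-in for the slides' ParStringCombiner.
final class StringChunkCombiner {
  private var size = 0
  private val chunks = ArrayBuffer(new StringBuilder)
  private var lastc = chunks.last

  // Append one character to the last chunk.
  def +=(elem: Char): this.type = {
    lastc += elem
    size += 1
    this
  }

  // Link the other combiner's chunks behind ours; no characters are copied.
  def combine(that: StringChunkCombiner): StringChunkCombiner =
    if (that eq this) this // guard not shown on the slides
    else {
      size += that.size
      chunks ++= that.chunks
      lastc = chunks.last
      this
    }

  // Sequential final step: concatenate all chunks into one string.
  def result(): String = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb)
    rsb.toString
  }
}

object StringCombinerDemo {
  // Stands in for one worker filtering spaces out of its part of the text.
  def filterChunk(s: String): StringChunkCombiner = {
    val c = new StringChunkCombiner
    s.foreach(ch => if (ch != ' ') c += ch)
    c
  }

  def run(): String =
    (filterChunk("par allel ") combine filterChunk("col lections")).result()
}
```

The `result` method is the only step that touches every character again, and in this form it runs on a single thread.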
  • 130. Custom combiners for methods expecting implicit builder factories // only for big boys ... with GenericParTemplate[T, ParColl] ... object ParColl extends ParFactory[ParColl] { implicit def canCombineFrom[T] = new GenericCanCombineFrom[T] ...
  • 131. Custom combiners performance measurement txt.filter(_ != ' ') new ParString(txt).filter(_ != ' ')
  • 132. Custom combiners performance measurement txt.filter(_ != ' '): 106 ms; new ParString(txt).filter(_ != ' ')
  • 133. Custom combiners performance measurement txt.filter(_ != ' '): 106 ms; new ParString(txt).filter(_ != ' '): 1 core 125 ms
  • 134. Custom combiners performance measurement txt.filter(_ != ' '): 106 ms; new ParString(txt).filter(_ != ' '): 1 core 125 ms, 2 cores 81 ms
  • 135. Custom combiners performance measurement txt.filter(_ != ' '): 106 ms; new ParString(txt).filter(_ != ' '): 1 core 125 ms, 2 cores 81 ms, 4 cores 56 ms
  • 136. Custom combiners performance measurement 1 core 125 ms, 2 cores 81 ms, 4 cores 56 ms (chart: t/ms against number of processors 1, 2, 4)
  • 137. Custom combiners performance measurement 1 core 125 ms, 2 cores 81 ms, 4 cores 56 ms (chart: t/ms against number of processors 1, 2, 4) def result (not parallelized)
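One way to see how much the sequential `result` step might cost is a back-of-the-envelope estimate with Amdahl's law, T(n) = T(1) · ((1 − p) + p/n), where p is the parallelizable fraction. This is an assumption brought in for illustration, not something the slides claim, and `AmdahlEstimate` is a hypothetical name. The sketch fits p from the 1-core and 4-core timings and checks the fit against the 2-core timing.

```scala
object AmdahlEstimate {
  // Timings quoted on the slides for new ParString(txt).filter(_ != ' '), in ms.
  val t1 = 125.0
  val t2 = 81.0
  val t4 = 56.0

  // Solve T(n) = T(1) * ((1 - p) + p / n) for the parallel fraction p.
  def parallelFraction(tOne: Double, tN: Double, n: Int): Double =
    (1.0 - tN / tOne) / (1.0 - 1.0 / n)

  // Predict the running time on n cores for a given parallel fraction.
  def predictedTime(tOne: Double, p: Double, n: Int): Double =
    tOne * ((1.0 - p) + p / n)

  // Fitted from the 1-core and 4-core measurements: about 0.736.
  val p: Double = parallelFraction(t1, t4, 4)

  // Prediction for 2 cores: about 79 ms, against the measured 81 ms.
  val predictedT2: Double = predictedTime(t1, p, 2)
}
```

Under this model roughly a quarter of the work stays sequential, which is consistent with the slide's remark that `result` is not parallelized, though other overheads could contribute as well.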
  • 138. Custom combiners tricky! • two-step evaluation – parallelize the result method in combiners • efficient merge operation – binomial heaps, ropes, etc. • concurrent data structures – non-blocking scalable insertion operation – we're working on this
  • 139. Future work coming up • concurrent data structures • more efficient vectors • custom task pools • user defined scheduling • parallel bulk in-place modifications
  • 140. Thank you! Examples at: git://github.com/axel22/sd.git