Grid computing is a form of distributed computing that is growing in popularity in fields with high computation and/or data storage requirements. In this presentation we give an overview of grid computing, describe our experience using grid tools on a real project, and build a working grid across a two-node cluster using GridGain, an open source grid toolkit.
6. Grid?
• Multiple independent computing clusters which act like a "grid" (Wikipedia)
• Many nodes, each node is indistinguishable from other nodes
• Complete machines over co-located CPUs?
• Multiple processes?
• Commodity hardware?
• Homogeneous machines?
12. Requirements
• Callable from a Rails webapp
• Real-time: synchronous responses in less than 30 seconds
• Large dataset: 100 GB (computation runs across all data)
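To make these two requirements concrete, here is a rough back-of-the-envelope calculation of the scan rate they imply. It is only a sketch: it assumes the whole 30 second budget is spent scanning, that the data splits evenly across nodes, and decimal units (1 GB = 1000 MB). The function name is illustrative, and 32 is the largest cluster size mentioned later in the talk.

```python
# Scan rate implied by the requirements, under the assumptions stated above.
DATASET_GB = 100   # from the slide: dataset size
DEADLINE_S = 30    # from the slide: response time budget
MB_PER_GB = 1000   # assumption: decimal units


def per_node_mb_per_s(nodes: int) -> float:
    """MB/s each node must scan if the data is split evenly across nodes."""
    return DATASET_GB * MB_PER_GB / nodes / DEADLINE_S


for nodes in (1, 2, 32):
    print(f"{nodes:>2} nodes: {per_node_mb_per_s(nodes):7.1f} MB/s per node")
```

On a single machine this works out to roughly 3.3 GB/s, while 32 nodes bring it down to about 104 MB/s each, which is why the computation has to be spread out.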
13. Rails webapp
• Simple document-literal web service
• Ruby - soap4r
• Java - GlassFish, Spring-WS
• Not really interesting for this talk... see Brisbane.rb
14. Data
• Read-only
• Full control
• 45 TB (became 100 GB with pre-processing)
• SQL? 3 tables, one query with 2 joins
25. Progress
• Don’t need to distribute data, so no data grid
• No off-the-shelf solutions that scale/go fast
• Understand the data better, so happy to roll our own as a fallback
36. GridGain
• “fully open source full-stack grid computing platform for Java”
• Map/reduce-based computation
• Easy to set up and use
• Can be extended via SPI implementations
• Just works
• “Scalable” (we’ve had it up to 32 nodes)
38. When does it work
• When data is independent (pure/referentially transparent)
• When data can be combined (reduce) based solely on input
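One way to read the second condition is that the reduce step must get the right answer from the partial results alone. The sketch below contrasts a computation that combines cleanly (sum) with one that does not (median). The data and variable names are made up for illustration.

```python
from statistics import median

# Hypothetical data, already split across two nodes.
partitions = [[1, 2, 9], [3, 4, 5]]
whole = [x for part in partitions for x in part]

# Sum: the partial sums combine into the exact answer.
partial_sums = [sum(part) for part in partitions]
sum_ok = sum(partial_sums) == sum(whole)

# Median: the partial medians alone do not determine the overall median.
partial_medians = [median(part) for part in partitions]
median_ok = median(partial_medians) == median(whole)

print(sum_ok, median_ok)
```

The sum survives being split up, but the median of the per-node medians (3.0) differs from the true median (3.5), so a median would need a different decomposition before it fits this model.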
39. Diagram: word count as split, map, reduce
• Input lines: "foo bar", "bar baz", "quux bar", "baz bar"
• split: one word per record
• map: each word becomes a word:1 pair
• reduce: foo: 1, bar: 4, baz: 2, quux: 1
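The split, map, and reduce stages in this diagram can be sketched on a single machine as follows. The function names are illustrative and are not part of GridGain. The input lines are the ones shown on the slide.

```python
from collections import defaultdict

lines = ["foo bar", "bar baz", "quux bar", "baz bar"]  # input from the slide


def split(lines):
    """Break each line into individual words."""
    return [word for line in lines for word in line.split()]


def map_words(words):
    """Emit a (word, 1) pair for every word."""
    return [(word, 1) for word in words]


def reduce_counts(pairs):
    """Sum the 1s for each distinct word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


counts = reduce_counts(map_words(split(lines)))
print(counts)  # {'foo': 1, 'bar': 4, 'baz': 2, 'quux': 1}
```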
41. Diagram: the Grid as a black box
• In: "foo bar", "bar baz", "quux bar", "baz bar"
• Out: foo: 1, bar: 4, baz: 2, quux: 1
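42. Diagram: what sits between the input and the nodes? (shown as "?")
• In: "foo bar", "bar baz", "quux bar", "baz bar"; out: foo: 1, bar: 4, baz: 2, quux: 1
• Node: "foo bar", "bar baz" gives foo: 1, bar: 2, baz: 1
• Node: "quux bar", "baz bar" gives quux: 1, bar: 2, baz: 1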
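43. Diagram: a Master Node sits between the input and the nodes
• Master Node: takes the four input lines, returns foo: 1, bar: 4, baz: 2, quux: 1
• Node: "foo bar", "bar baz" gives foo: 1, bar: 2, baz: 1
• Node: "quux bar", "baz bar" gives quux: 1, bar: 2, baz: 1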
44. Diagram: two Master Nodes above two worker Nodes
• Master Node: "foo bar", "bar baz" gives foo: 1, bar: 2, baz: 1
• Master Node: "quux bar", "baz bar" gives bar: 2, baz: 1, quux: 1
46. Diagram: map runs on the Nodes, reduce runs on the Master Node
• Node (map): "foo bar", "bar baz" gives foo: 1, bar: 2, baz: 1
• Node (map): "quux bar", "baz bar" gives quux: 1, bar: 2, baz: 1
• Master Node (reduce): foo: 1, bar: 4, baz: 2, quux: 1
48. Diagram: the same Master Node and two Nodes, annotated with signatures
• Nodes: map[A, B](List[A], A → B) → List[B]
• Master Node: reduce[B, C](List[B], C, (C, B) → C) → List[C]
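The two signatures on this slide can be written out with Python type hints as a rough illustration. One assumption to flag: the sketch treats reduce as an ordinary left fold that returns a single C, whereas the slide's signature ends in List[C]. The names `map_` and `reduce_` are chosen only to avoid clashing with Python built-ins.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def map_(items: List[A], f: Callable[[A], B]) -> List[B]:
    """Apply f to every element independently."""
    return [f(item) for item in items]


def reduce_(items: List[B], initial: C, combine: Callable[[C, B], C]) -> C:
    """Left fold: thread an accumulator of type C through every B."""
    acc = initial
    for item in items:
        acc = combine(acc, item)
    return acc


lengths = map_(["foo", "quux", "bar"], len)          # [3, 4, 3]
total = reduce_(lengths, 0, lambda acc, n: acc + n)  # 10
print(lengths, total)
```

Because `map_` touches each element on its own, its work can be handed to different nodes. Only `reduce_` needs to see all the partial results.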
50. Diagram: Task, Job, Result
• Task: the four input lines arrive at the Master Node
• Job on Node: "foo bar", "bar baz" gives foo: 1, bar: 2, baz: 1
• Job on Node: "quux bar", "baz bar" gives quux: 1, bar: 2, baz: 1
• Result: foo: 1, bar: 4, baz: 2, quux: 1
51. Diagram: the Task is split into four Jobs, one per input line, two on each of the two Nodes
• Result: foo: 1, bar: 4, baz: 2, quux: 1
52. Diagram: the same four Jobs ("bar baz", "foo bar", "quux bar", "baz bar"), now one on each of four Nodes
• Result: foo: 1, bar: 4, baz: 2, quux: 1
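The Task, Job, and Result labels in these diagrams suggest a three-step shape: split the task into jobs, run each job on a node, and merge the job results on the master. The sketch below imitates that shape with a local thread pool standing in for remote nodes. It is not GridGain's API, and every function name is hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_task(lines, job_count):
    """Task -> Jobs: deal the input lines round-robin into job_count chunks."""
    return [lines[i::job_count] for i in range(job_count)]


def run_job(chunk):
    """Job: count the words in one chunk (the map step on a worker node)."""
    return Counter(word for line in chunk for word in line.split())


def reduce_results(partials):
    """Result: the master merges the partial counts from every job."""
    total = Counter()
    for partial in partials:
        total += partial
    return total


lines = ["foo bar", "bar baz", "quux bar", "baz bar"]
jobs = split_task(lines, job_count=4)

# Threads stand in for nodes here; a real grid would ship jobs over the network.
with ThreadPoolExecutor(max_workers=4) as nodes:
    result = reduce_results(nodes.map(run_job, jobs))

print(dict(result))
```

Changing `job_count` and `max_workers` reproduces the other layouts in the diagrams, such as four jobs shared between two nodes.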
56. Diagram: starting point for the failure scenario. The Task is split into four Jobs ("bar baz", "foo bar", "quux bar", "baz bar"), one on each of four Nodes
57. Diagram: the same four Jobs on four Nodes, with one Node marked X (failed)
58. Diagram: the "foo bar" Job has moved off the failed Node (X), so one of the surviving Nodes now runs two Jobs
59. Diagram: two Nodes marked X (failed); the four Jobs are shared between the two remaining Nodes
• Result is unchanged: foo: 1, bar: 4, baz: 2, quux: 1
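The failure diagrams show jobs moving from a failed node to a surviving one without changing the final result. A minimal way to imitate that behaviour is to retry each job on the next available node, as sketched below. This is not how GridGain implements failover: the node names, the `NodeDown` exception, and the retry order are all assumptions made for illustration.

```python
from collections import Counter


class NodeDown(Exception):
    """Raised when a job is sent to a node that has failed."""


def run_on_node(node, chunk, down):
    """Pretend to run a word-count job on `node`; fail if the node is down."""
    if node in down:
        raise NodeDown(node)
    return Counter(word for line in chunk for word in line.split())


def run_with_failover(jobs, nodes, down):
    """Try each job on its preferred node, then on the others in order."""
    results, placement = [], {}
    for i, chunk in enumerate(jobs):
        start = i % len(nodes)
        candidates = nodes[start:] + nodes[:start]  # preferred node first
        for node in candidates:
            try:
                results.append(run_on_node(node, chunk, down))
                placement[i] = node
                break
            except NodeDown:
                continue  # fall through to the next candidate node
        else:
            raise RuntimeError(f"no live node for job {i}")
    return sum(results, Counter()), placement


jobs = [["bar baz"], ["foo bar"], ["quux bar"], ["baz bar"]]
nodes = ["node-1", "node-2", "node-3", "node-4"]
total, placement = run_with_failover(jobs, nodes, down={"node-2"})
print(dict(total), placement)
```

With `node-2` down, the "foo bar" job lands on `node-3` next to that node's own job, and the merged counts match the run with no failures.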