1. Gary Berger
Technical Leader, Engineering
Office of the CTO, DSSG
Biggest Problems in Cloud Design Today
Source: http://visibleearth.nasa.gov
2. Internet being dominated by
real-time entertainment
Source: Sandvine, 2010 Global Internet Phenomena Report
3. What is an Architect?
IMHOTEP
DOCTOR, ARCHITECT, HIGH PRIEST, SCRIBE
AND VIZIER TO KING DJOSER
“An architect does not arrive at his finished
product solely by a sequence of
rationalizations, like a scientist, or through the
workings of the Zeitgeist. Nor does he reach
them by uninhibited intuition, like a musician
or painter. He thinks of forms intuitively, and
then tries to justify them rationally.” Peter
Collins, 1966
“Good architecture has been seen largely as
either working within a context or
circumventing it, depending on which
principles are adopted and where the cutting
edge is perceived.” Theory of Architecture,
Paul-Alan Johnson, 1994
6. Why is Architecture hard
to understand?
“Whereof one cannot speak, one must pass over in silence.”
Wittgenstein
7. Tacit Knowledge
(Informal Knowledge)
• Knowledge that is difficult to
transfer to another person by
means of writing it down or
verbalizing it.
• Knowledge which cannot be
codified, but can only be
transmitted via training or
gained through personal
experience.
• Inherent “know-how” -- as
opposed to “know-what”
(facts), “know-why”
(science), or “know-who”
(networking). It involves
learning and skill but not in a
way that can be written
down.
Source: http://en.wikipedia.org/wiki/Tacit_knowledge, adapted from 'The Tacit Dimension' by philosopher-chemist Michael
Polanyi
W.T. Wallington walks a 21,600lb
stone
9. "Knowledge as the
Competitive Resource”
• "Knowledge is not just another resource alongside the
traditional factors of production --labor, capital and
land- but the only meaningful resource today” -
[Drucker, 1993]
• “Knowledge is the source of the highest quality power
and is the key to the power-shift that lies ahead.
Knowledge is not merely an adjunct of money power
and muscle power but eventually will be the ultimate
replacement of other resources” - [Toffler, 1990]
• “The economic and producing power of a modern
corporation lies more in its intellectual and service
capabilities than in its hard assets such as land, plant
and equipment” - [Quinn, 1992]
12. Awesome Ladder!
Von Neumann Architecture
John von Neumann, 1903-1957
[Diagram: INPUT/OUTPUT; MEMORY (instruction & data); CPU (ALU, registers, control); control & address; data & instruction bus]
14. Independent
Compute POD
Data Network
Unified I/O 10GE
Data Snooping/Migration
Capacity Scaling
Block Store
Data Center Blueprint
I/O Scaling
POD Services Tier
Client Access Tier
HTTP
Compute/Data Grid
15. Things we are going to
talk about
• Dealing with Scalability
• Dealing with Data
• Dealing with Security
18. What is Scalability?
Mechanical and Biological systems all have limits
Scaling Factors
• All systems reach a limit
relative to their size.
• Understanding where
these limitations arise
gives us a clue where
to look for
performance
bottlenecks
• Architects typically find
limitations through trial
and error.
• Concurrency = The
interaction between
processors
• Contention = The degree
of serialization on shared
writable data
• Coherency = Penalty
incurred for maintaining
consistency of shared
writable data
21. Scalability Can Be
Measured
Guerrilla Capacity Planning, Gunther, 2007
Universal Scalability Law
• C(p) = relative capacity (scale-up or scale-out)
• p = number of processors
• a = serialized fraction (contention)
• k = coherency penalty, k >= 0
• Scalability is not infinite but a
concave function
We are assuming here
exponentially distributed
interarrival and service
times (i.e. Poisson arrivals)
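The Universal Scalability Law named on this slide is usually written C(p) = p / (1 + a(p - 1) + k p (p - 1)), using the slide's symbols a and k. A minimal Python sketch under that assumption follows; the function name and the coefficient values are hypothetical and chosen only to show the concave shape.

```python
def usl_capacity(p: int, a: float, k: float) -> float:
    """Relative capacity C(p) under the Universal Scalability Law.

    C(p) = p / (1 + a*(p - 1) + k*p*(p - 1))
    a: serialized fraction (contention), k: coherency penalty (k >= 0).
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return p / (1.0 + a * (p - 1) + k * p * (p - 1))


# Hypothetical coefficients, for illustration only.
A_CONTENTION = 0.05
K_COHERENCY = 0.001

# Capacity at a few processor counts; it rises, peaks, then falls.
sample = {p: usl_capacity(p, A_CONTENTION, K_COHERENCY) for p in (1, 8, 32, 128)}
```

With these illustrative values the capacity at 128 processors is lower than at 32, which is the "concave function" behavior the slide describes.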
22. Why Scale-Up is Important
Beyond Wimpy Cores
[Plot annotations: max capacity at p*; asymptotic maximum (ceiling); coherency (k) starts to dominate; Amdahl (k = 0)]
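The two landmarks on this plot have closed forms under the Universal Scalability Law C(p) = p / (1 + a(p - 1) + k p (p - 1)): capacity peaks at p* = sqrt((1 - a) / k), and with k = 0 (Amdahl) it approaches the ceiling 1/a. A small sketch, assuming that form of the law; the function names and sample coefficients are hypothetical.

```python
import math


def usl_peak_processors(a: float, k: float) -> float:
    """Processor count p* at which USL capacity peaks: sqrt((1 - a) / k)."""
    if k <= 0:
        raise ValueError("k must be positive; with k = 0 there is no peak")
    return math.sqrt((1.0 - a) / k)


def amdahl_ceiling(a: float) -> float:
    """Asymptotic maximum capacity when k = 0 (Amdahl's law): 1 / a."""
    if a <= 0:
        raise ValueError("a must be positive for a finite ceiling")
    return 1.0 / a


# Hypothetical coefficients, for illustration only.
peak = usl_peak_processors(a=0.05, k=0.001)   # sqrt(950), roughly 30.8
ceiling = amdahl_ceiling(a=0.05)              # 20x relative capacity
```

Past p* every added processor lowers total capacity, which is the region the slide labels as coherency starting to dominate.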
23. Conclusion
We Need Models
• Effectively modeling some of
these characteristics is a top-of-mind
problem for current
application architects
• Eric Brewer's CAP theorem
challenges architects to deal
with latency as a proxy for strong
consistency
• Much work is going on in
understanding these problems
and building a balance between
availability and consistency (i.e.
adaptive consistency)
• Some patterns are difficult to
model mathematically
Moore’s Impact [1]
• Technologist’s Moore’s Law
o Double Transistors per Chip every 2
years
o Slows or stops: TBD
• Microarchitect’s Moore’s Law
o Double Performance per Core
every 2 years
o Slowed or stopped: Early 2000s
• Multicore’s Moore’s Law
o Double Cores per Chip every 2 years
o Double Parallelism per
Workload every 2 years
o Aided by Architectural
Support for Parallelism
o Double Performance per Chip
every 2 years
Or GAME OVER?
[1] Amdahl’s Law in the Multicore Era, Hill and Marty, Wisconsin Multifacet Project
26. Data Management
Data management is the development, execution
and supervision of plans, policies, programs and
practices that control, protect, deliver and
enhance the value of data and information assets.
Source: Data Management International, dama.org
What are the two most important commands in the
data center today?
(NFS Read/Write)
27. Data Management
Models Practices
• Request level parallelism
• Data level parallelism
• Persistence model
• Durable, Volatile,
Transient
• Caching Eviction Policies
• Synchronous/Asynchronous Updates
• Denormalization of data
• Caching Trees
o Anti-cache spoilers
• Distributed Hash Tables
(NOSQL)
o Key/value
o Column
o Document
o Graph
• Messaging and
Serialization(IPC)
o Lightweight interfaces (PB, Thrift,
HC)
• Distributed transactions
o Opportunistic locking
o Vector Clocks
o Paxos protocols
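Vector clocks, listed above under distributed transactions, track causality between replica updates without a global clock. The following is an illustrative sketch only, not taken from the talk; the function names and the node ids "A" and "B" are hypothetical.

```python
from typing import Dict

VectorClock = Dict[str, int]


def tick(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced (local event)."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum: what a replica knows after receiving another's clock."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if a causally precedes b: a <= b everywhere and a < b somewhere."""
    nodes = a.keys() | b.keys()
    no_greater = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    some_less = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_greater and some_less


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if neither clock precedes the other (a conflict to be reconciled)."""
    nodes = a.keys() | b.keys()
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return a_ahead and b_ahead


# Two replicas write independently, then replica A reconciles both versions.
write_a = tick({}, "A")
write_b = tick({}, "B")
reconciled = tick(merge(write_a, write_b), "A")
```

Here write_a and write_b are concurrent, while both precede the reconciled version, which is how a store can detect conflicting writes and relax consistency without losing updates.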
28. Jason McHugh, Principal Engineer, Amazon
Flash Crowds
Demand spike on singular resource
• 31K requests for a single
object received in 69.6 seconds
• Cache spoilers
• Cache trees and
coherency protocol
built in to relax
consistency to protect
availability
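To put the flash-crowd figures in perspective, the arithmetic below converts them to a request rate and contrasts it with the origin load under a simple time-to-live cache. This is a rough sketch: reading "31K" as 31,000 requests, the 5-second TTL, the four cache nodes and the function names are all assumptions for illustration, not details from the talk.

```python
import math


def request_rate(requests: int, seconds: float) -> float:
    """Average requests per second over the spike."""
    return requests / seconds


def origin_fetches(seconds: float, ttl_seconds: float, cache_nodes: int = 1) -> int:
    """Upper bound on origin reads if each cache node refetches once per TTL window."""
    return math.ceil(seconds / ttl_seconds) * cache_nodes


# Figures from the slide: about 31,000 requests in 69.6 seconds.
rate = request_rate(31_000, 69.6)  # roughly 445 requests per second

# Hypothetical cache tier: 4 nodes, each allowed to serve data up to 5 s stale.
fetches = origin_fetches(69.6, ttl_seconds=5.0, cache_nodes=4)
```

Under these assumed parameters the origin sees a few dozen reads instead of tens of thousands, at the cost of serving data that may be a few seconds stale, which is the consistency-for-availability trade the slide describes.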
32. The “Illusion” of Security
• Perimeter defense seals
off data center so
attack surface moves
to the client
• Attackers find path of
least resistance
o Email Addresses
o Social Websites
o Standard naming practices (i.e.
firstname.lastname@company.com)
The Apple I,
Recently sold for $210,000
“Simply keeping out bad code is not sufficient to keep out bad
computation” Stefan Savage, UC San Diego
33. Modern Attacks
Easy to 0wn, Normal processing leads to code execution
Examples
• Memory Trespass
• Rogue AV through mass mailings
• Injection Flaws (SQL, OS, LDAP)
• Cross Site Scripting
• Broken Authentication and
Session Management
• Insecure Direct Object
References
• Cross-site Request Forgery
Mitigation Strategies
• ASLR (Address Space
Layout Randomization)
• DEP (Data Execution
Prevention)
• Stack Cookies
• Sandboxing
Summary
• Normal processing leads to code
execution
o Receive packet/request
o Parse display/data
• Need to understand
strategy more than
tactics
34. Source: Dino A. Dai Zovi, Memory Corruption, Exploitation and You
Workstation Attack
Surface
35. Zero Day Attacks
• The price of disclosure?
• There are 1419 Researchers working at ZDI?
• ZDI can be used to launch a new Aurora attack
37. Architectural Ladders
3000 BCE 300 CE
Neolithic Architecture
Sumerian Architecture
Ancient Egyptian
Architecture
Classical Architecture
38. Architecture
• Architecture is created to express
some intent but is not the purpose
itself, therefore architecture must
serve a purpose
• Architectures must evolve or die,
sometimes at the expense of the
intent and function
• Architectures can be rediscovered,
refactored and reused for a new
purpose or function
• Architectures may not realize their
full potential
• Architectures do not replace
fundamentals in engineering and
science but establish a pattern
from which to describe its
effectiveness
Foote, Yoder, 1999, The Big Ball of Mud
ZIGGURAT: Dubai’s Carbon Neutral Pyramid
Will House 1 Million
39. Conclusion
• Some of today's problems were recognized over a
decade ago but lacked the economic justification for
change
• History is repeating as we refactor architectures of
the past (“Engineered Solutions”), just at different scales
• New architectures are being proposed, some based on empirical
evidence, prototyping and experimentation, others just a
horrible guess
• Architects need to quickly establish new patterns with the goal
of pushing the bottlenecks to the least-cost contributor (e.g.
Energy Proportional Computing)
• Architecture should help us describe the intent of the product
or function, not merely serve as a generalization
• Architectures today are agile
• Architecture for efficient computing maximizes
the work done per joule of energy
40. Uggh.. Predictions?
• By 2012, 20 percent of businesses will own no IT assets
• By 2012, India-centric IT services companies will represent 20
percent of the leading cloud aggregators in the market (through
cloud service offerings)
• By 2012, Facebook will become the hub for social network
integration and Web socialization
• By 2013, mobile phones will overtake PCs as the most common
Web access device worldwide
• By 2014, most IT business cases will include carbon remediation
cost
• By 2014, over 3 billion of the world's adult population will be able
to transact electronically via mobile or Internet technology
• By 2015, context will be as influential to mobile consumer services
and relationships as search engines are to the Web
• By 2016, all Global 2000 companies will use public cloud services.
44. Meta Structures to scale
[Diagram: a Service Directory, six MetaData nodes and six Content nodes]
45. Persistency
pNFS (RFC 5661)
• Parallel opens by file
handle
• Asynchronous
notification on lock
availability
• Commands linearized
in slot table
• Support for File, Object
and Block targets
HoneyComb 2
• Automated data management
• Extreme data mobility
• Ability to run 3rd party storage apps
• Highly Reliable with self healing
• Flat name space
• Single management entity
• Multi‐cell architecture
• Programmatic APIs
• Immutable
• Automatic load balancing
• Transparent node upgrades
• Meta‐data support
• Storage apps support
• Deferred maintenance model
• Open‐Source Software only
46. Clustered Scalability
Guerrilla Capacity Planning, Gunther, 2007
Universal Scalability Law
• C(p) = intranode scalability
• n = nodes
• p,n = processors/node
• az = global internode contention
• kz = global internode coherency
Questions: How many people have been in a data center at any point in their career? How many people have been in a data center in the last year? How many people have been part of the construction, staging and turn-up of a data center in their career? How many people have in the last year?
Streamed or buffered audio and video (RTSP, RTP, RTMP, Flash), peercasting (PPStream, Octoshape), placeshifting (Slingbox, home media servers)
Architect must distill patterns to find a common way of testing for rational justification
Architected over hundreds of years. Scale evolved over several generations. Purpose and intent are left to interpretation, but it is believed this was a place to bury highly important people in the culture, possibly the architects themselves. 3000 BC: dug a ditch, a bank and a ring of 56 pits (the Aubrey Holes) under the chalk, possibly to hold bluestones from Wales. 500 years later the sarsen stones were put up and the bluestones were moved. Avenue to the river. Many generations, abandoning one form and moving to another. http://www.independent.co.uk/life-style/history/syrias-stonehenge-neolithic-stone-circles-alignments-and-possible-tombs-discovered-1914047. They didn’t have many scaling problems here; lots of mathematics and astronomical knowledge (moon and sun trajectories). Some, the "bluestones", weighed four tons each and were brought a distance of 150 miles from Pembrokeshire, Wales. http://video.pbs.org/video/1636852466/ Ended with the introduction of copper and gold; personal wealth led to individual burials.
http://www.independent.co.uk/life-style/history/syrias-stonehenge-neolithic-stone-circles-alignments-and-possible-tombs-discovered-1914047.html BCE: Before Common Era. They were excellent at dealing with wood and stone.
http://www.theforgottentechnology.com/newpage1
The Rosetta Stone, among other things, is a public notice, part of which says “with regard to the priests, that they should pay no more as the tax for admission to the priesthood than what was appointed them throughout his father's reign and until the first year of his own reign; and has relieved the members of the priestly orders from the yearly journey to Alexandria”, basically granting the priests tax relief.
Next slide: into Neolithic architecture. We are going to dabble in some ancient architectures and see how they can be related.
This is the low level
MESIF (Modified, Exclusive, Shared, Invalid, Forward); CAP (Consistency, Availability, Partition tolerance); REST (Representational State Transfer); pNFS (Parallel NFS, NFSv4.1); DHT (Distributed Hash Table); NOSQL (Not Only SQL); DSL (Domain Specific Language); ORM (Object Relational Mapper); PCM (Phase Change Memory); TSV (Through Silicon Via)
This is the high level
Scale goes from simple structures to whole cities. http://en.wikipedia.org/wiki/Pyramid_of_Djoser Architect: IMHOTEP. Invention of writing at 3100 BCE. The Sumerians were the first society to create the city itself as a built form.
http://en.wikipedia.org/wiki/Pyramid_of_Djoser Architect: IMHOTEP. Appears in late Neolithic.
Also Gunther. Our focus is to model Poisson arrival rates and service times, even though Ethernet exhibits some self-similar behavior (i.e. LRD, long-range dependence). Contention (e.g. spinlock, row lock, etc.). Coherency = consistency. “The problem of characterizing Internet traffic is not one that can be solved easily, once and for all. As the Internet increases in size and the technologies connected to it change, we must constantly monitor and reevaluate our assumptions to ensure that our conceptual models correctly represent reality.” [1]
Serialized contention: Hyperthreading (SMT), spinlocks, mutexes. There is a field of study around lockless algorithms. As parallel processes increase, the serialized contention becomes the prominent dependency. While there are other ways of modeling data, what is important to recognize is that a completely Poisson model is what allows us to balance out the load. The more self-similar or LRD the traffic, the more problematic it becomes to model behavior. Ethernet actually exhibits LRD behavior on the output; how much of this will cause bad architectural strategies? Like the conservation of mass, you have the conservation of bottlenecks: bottlenecks are neither created nor destroyed, they simply move from one point to another. Why should we pay attention to these models? Any architecture which is not based on these simple mathematics will have a difficult time being modeled correctly, and thus capacity planning will be completely ineffective. People always place the burden on the application to deal with bottlenecks, but there are only so many implementations which allow for a significant change, for instance the use of GPUs for vectorization. This is the classical “speed-up” model, in which we reduce execution time by adding more SIMD-capable computation engines, as opposed to scale-up, which allows application demand to grow while keeping the serialized overhead the same (i.e. same service rate) in order to protect customer expectations of service level. Other models: geometric model, quadratic model, exponential model. Think of a and k as state and federal taxes.
Gunther: coherency overhead. The two variables are sigma (serialized contention) and kappa, which is the coherency (consistency) overhead. “Brawny cores still beat wimpy cores, most of the time”, Urs Hölzle, Google. Software development costs often dominate a company’s overall technical expenses, so forcing programmers to parallelize more code can cost more than we’d save on the hardware side.
http://www.reshafim.org.il/ad/egypt/building/ The drawings on the left were found by the French at the quarries of Gebel Abu Feida in 1789. These pillar capitals, destined for a temple at Denderah being built by Cleopatra, were sketched with red ochre on the rock face in half the natural size. http://en.wikipedia.org/wiki/File:Ancient_Egypt_rope_manufacture.jpg List of inventions in Ancient Egypt: black ink; first ox-drawn plows; 365-day calendar and leap year; paper; first triangular-shaped pyramids; organized labor; hieroglyphics as an early system of writing; sails.
NFSV1 file striping
Number of elements in a set (find largest match)
http://en.wikipedia.org/wiki/Pyramid_of_Djoser
An ancient mechanical computer designed to calculate astronomical positions. The device, they say, is technically more complex than any known device for at least a millennium afterwards. The text is astronomical, with many numbers that could be related to planetary motions, and the gears are a mechanical representation of a second-century theory that explained the irregularities of the Moon's motion across the sky caused by its elliptical orbit.
“Memory trespass vulnerabilities are software weaknesses that allow memory accesses outside of the semantics of the programming language in which the software was written.” Fuzzing is used to discover unknown application behaviors, which can then be used to create an exploit.
We can see what they were able to accomplish but don’t know how or why. The architecture remains and can be studied even though it has no use today. Different scales, moving towards defined purposes: burial grounds, but for individuals with great wealth. From monuments for the group to monuments for the most wealthy and powerful. Each architecture develops to solve a purpose and then may be discarded or refactored for other purposes.
It wasn’t the intent of the ancient architects to define their architectural period; it is for us to analyze the history of design and how its patterns change. The pharaohs wanted to do something “godlike”, like live forever… It was the architect who had to figure out a way of explaining it, even though it required massive engineering skill. Maybe Imhotep's intent was to build the pyramid and he got a buyer for it.
Architected over hundreds of years. Scale evolved over thousands of years, from smaller stones to bigger stones. Many iterations, many stages. 500 years after the bluestones, the sarsen stones appeared. 3000 BC: dug a ditch, a bank and a ring of 56 pits (the Aubrey Holes) under the chalk to hold bluestones; the "bluestones" weighed four tons each and were brought a distance of 150 miles from Pembrokeshire, Wales. http://video.pbs.org/video/1636852466/ Many generations, abandoning one form and moving to another. http://www.independent.co.uk/life-style/history/syrias-stonehenge-neolithic-stone-circles-alignments-and-possible-tombs-discovered-1914047. They didn’t have many scaling problems here; lots of mathematics and astronomical knowledge (moon and sun trajectories). Ended with the introduction of copper and gold; personal wealth led to individual burials.
Other models: geometric model, quadratic model, exponential model. Think of a and k as state and federal taxes.
Cardinality: a measure of the number of elements in a set (find largest match)