2. +
Outline
īŽ Whatâs ORAM ?
īŽ Motivation
īŽ History
īŽ More on ORAM
īŽ Goldreichâs ORAM
īŽ Ostrovskyâs ORAM
īŽ Recent Developments
īŽ Conclusion
3. +
What is ORAM ?
īŽ Oblivious Random Access Machine [ORAM]
īŽ A machine is oblivious if the sequence in which it accesses memory locations is
equivalent for any two inputs with the same running time.
īŽ E.g. For a client-server model with outsourced data storage in server, the server
cannot gain info about actual memory access pattern from client requests
īŽ Main Idea:
Memory Access Pattern is hidden from adversary.
Caveat: Assume data content is not leaked as it is protected by traditional
cryptographic methods (encryption)
ServerClient
read(i)
write(i, d)
.
.
.
4. +
Outline
īŽ Whatâs ORAM ?
īŽ Motivation
īŽ History
īŽ More on ORAM
īŽ Goldreichâs ORAM
īŽ Ostrovskyâs ORAM
īŽ Recent Developments
īŽ Conclusion
5. + Motivation
īŽ Rise of Cloud Computing
īŽ Success of Pay-as-you-Go model
for Public Clouds
īŽ Affordability, Elasticity,
SLA Guarantees
ī§ Lots of âBigâ Data
ī§ Solution: Outsourced Data Storage
ī§ Accessible from many devices (desktop,
laptop, phone, car, ...).
ī§ Greater reliability.
ī§ May be cheaper (economy of scale).
īŽ Security Concerns behind
outsourcing data to Public Clouds
6. +
Motivation
īŽ Possible Solutions
īŽ Duh! Use encryption
īŽ Use âtrustedâ public cloud services
īŽ Comâon, Google is ânot evilâ
īŽ Use private cloud infrastructure
īŽ Eg. Walmart uses OpenStack
īŽ Is there a way to hide how we access our data but still use
public cloud infrastructure ?
īŽ Answer: ORAM
8. +
Motivation: Goal
īŽ âUntrustedâ means:
īŽ It may not implement Write/Read properly
īŽ It will try to learn about data
Goal: Securely Outsourcing Data - Store,
access, and update data on an untrusted
server.
M[0]=d1
M[1]=d2
M[2]=d3
âĻ
âĻ
M[n]=dn
read(i)
write(i, d)
.
.
.
Client
Remote Server
[Islam et al, â12]
9. +
Motivation: Solution
īŽ An ORAM emulator is an intermediate layer
that protects any client (i.e. program).
īŽ ORAM will issue operations that deviate from
actual client requests.
īŽ Correctness:
īŽ If server is honest then input/ output behavior is same for client.
īŽ Security:
īŽ Server cannot distinguish between two clients with same running time.
10. +
Outline
īŽ Whatâs ORAM ?
īŽ Motivation
īŽ History
īŽ More on ORAM
īŽ Goldreichâs ORAM
īŽ Ostrovskyâs ORAM
īŽ Recent Developments
īŽ Conclusion
11. +
History of ORAM
īŽ Pippenger and Fischer showed âoblivious Turing
machinesâ could simulate general Turing machines
īŽ Goldreich introduced analogous notion of ORAMs in â87
and gave first interesting construction
īŽ Ostrovsky gave a more efficient construction in â90
īŽ ... 20 years pass, time sharing systems become âcloudsâ
...
īŽ Then a flurry of papers improving efficiency: ~10 since
2010
12. +
Outline
īŽ Whatâs ORAM ?
īŽ Motivation
īŽ History
īŽ More on ORAM
īŽ Goldreichâs ORAM
īŽ Ostrovskyâs ORAM
īŽ Recent Developments
īŽ Conclusion
13. +
ORAM Assumptions
īŽ Assumption #1: Server does not see data.
īŽ Store an encryption key on emulator and re-encrypt on every
read/write.
īŽ Assumption #2: Server does not see op (read vs write).
īŽ Every op is replaced with both a read and a write.
īŽ Assumption #3: Server is honest-but-curious.
īŽ Store a MAC key on emulator and sign (address, time, data) on
each operation.
īŽ Not malicious
14. +
ORAM Security after Simplification
īŽ Whatâs left to protect is the âaccess patternâ of the program.
īŽ Definition: The access pattern generated by a sequence (i1,
..., in) with the ORAM emulator is the random variable (j1, ... ,
jT) sampled while running with an honest server.
īŽ Definition: An ORAM emulator is secure if for every pair of
sequences of the same length, their access patterns are
indistinguishable.
15. +
Trivial ORAMs
īŽ Example #1: Store everything in ORAM simulator cache
and simulate with no calls to server.
īŽ Client storage = N.
īŽ Example #2: Store memory on server, but scan entire
memory on every operation.
īŽ Amortized and worst-case communication overhead = N.
īŽ Example #3: Assume client accesses each memory slot
at most once, and then permute addresses using a PRP.
īŽ Essentially optimal, but assumption does not hold in practice.
16. +
ORAM Lower Bounds
īŽ Theorem (GOâ90):
Any ORAM emulator must perform Ί(t log t) operations to
simulate t operations.
īŽ Proved via a combinatorial argument.
īŽ Theorem (BMâ10):
Any ORAM emulator must either perform Ί(t log t log log
t) operations to simulate t operations or use storage Ί(N2-
o(1)) (on the server).
17. +
ORAM Efficiency Goals
īŽ In order to be interesting, an ORAM must
simultaneously provide
īŽ o(N) client storage
īŽ o(N) amortized overhead
īŽ Handling of repeated access to addresses.
īŽ Desirable features for an âoptimal ORAMâ:
īŽ O(log N) worst-case overhead
īŽ O(1) client storage between operations
īŽ O(1) client memory usage during operations
īŽ âStateless clientâ: Allows several clients who share a short key
to obliviously access data w/o communicating amongst
themselves between queries. Requires op counters.
18. +
Basic Tools: Shuffling
īŽ This means we move data at address i to address Ī(i).
īŽ Proof idea:
Use an oblivious sorting algorithm.
For each comparison in the sort, read both positions and rewrite
them, either swapping the data or not (depending on if Ī(i) >
Ī(j)).
Key Idea: Use sorting networks like Batcher or AKS, where
memory accesses are independent of input
Claim: Given any permutation Ī on {1 , ... , N}, we can permute the
data according to Ī with a sequence of ops that does not depend on
the data or Ī.
19. +
Oblivious Sort: Sorting Network
īŽ Sorts Fixed number of values using a Fixed Sequence of
Comparisons.
īŽ Perfect for Oblivious Shuffling
īŽ Can be represented as networks of wires and comparator
modules
īŽ Values (of any ordered type) flow across the wires. E
īŽ Each comparator connect two wires
Batcher Sorting Network for n = 4
Time Complexity = O(N log2 N)
21. +
Outline
īŽ Whatâs ORAM ?
īŽ Motivation
īŽ History
īŽ More on ORAM
īŽ Goldreichâs ORAM
īŽ Ostrovskyâs ORAM
īŽ Recent Developments
īŽ Conclusion
22. +
Goldreichâs âSquare Rootâ ORAM
īŽ Scan Cache from the Server
īŽ If data is not in server cache, read it from main
memory
īŽ If data is in server cache, read next dummy slot
īŽ Write data into server cache
īŽ Reshuffle with new K and flush cache after
every C reads.
Initialization: Pick PRP key K. Use it to obliviously shuffle
N data slots together with C âdummyâ slots. Empty cache.
N data
slots
C dummy
slots
C cache
slots
Server Storage: N + 2C slots [General Case]
Client Storage: O(1)
23. +
Goldreichâs ORAM : Efficiency
Taking C=N1/2
Batcher sort â extra log N factor in costs
Security: Relatively easy to prove.
Server sees an oblivious sort and then C unique, random looking
read/writes before reinitializing.
25. +
Outline
īŽ Whatâs ORAM ?
īŽ Motivation
īŽ History
īŽ More on ORAM
īŽ Goldreichâs ORAM
īŽ Ostrovskyâs ORAM
īŽ Recent Developments
īŽ Conclusion
26. +
Ostrovskyâs âHierarchicalâ ORAM
[Ostrovsky â96]
īŽ Aim: Minimize Amortized cost of Random Shuffling in
âSquare Rootâ Approach
īŽ Store data (including dummies) in Random-Hash Table,
rather than Randomly Sorted Array and recurse carefully
Main Idea:
īŽ Use Hierarchy of Buffers (hash tables) of different sizes
īŽ Shuffle buffers with frequency inversely proportional to their
sizes
Much more complicated technique for hiding repeated access to
same slots.
27. + Ostrovskyâs âHierarchicalâ ORAM
Design
Server Storage
âĸ log N âlevelsâ for N items
âĸ Level i contains 2i buckets
âĸ Each buckets contains log N slots
âĸ Each slot contains a ciphertext
encrypting data or dummy.
level
2
3
4
1
K
2
K3
K4
K
1
âĸ Data starts on lowest level into buckets,
overflow happens
Client Storage
PRP key Ki for each level
âĸ When accessed, data gets moved to level 1 with
negligible prob.
âĸ Eventually, data gets shuffled to lower levels
âĸ Invariant: ⤠2i data slots used in level i
âĸ (i.e. ⤠1 per bucket on average)
= data
PRP
Keys
28. +
Ostrovskyâs ORAM: Read/Write
Read/Write(addr) *
īŽ Scan both top buckets for data
īŽ At each lower level, scan exactly one bucket
īŽ Until found, scan bucket at F(Ki, addr) on that level
īŽ After found, scan a random bucket on that level
īŽ Write data into bucket F(K1, addr) on level 1
īŽ Perform a âshuffling procedureâ to maintain invariant
* Server is blind to ops i.e. every op is replaced with both a read
and a write (which might or might not modify the data).
29. +
Ostrovskyâs ORAM in Action
level
2
3
4
1
K
2
K3
K4
K
1
= data
PRF
Keys
Read/Write(blue address):
1. Scan both buckets at level 1
2. Scan bucket F(K2, addr) = 4 in level 2
3. Scan F(K3, addr) = 3 in level 3 (finding data)
4. Scan a random bucket in level
5. Move found data to level 1
30. +
Shuffling Procedure
īŽ We âmerge levelsâ so that each level has ⤠1 slot per
bucket on average
īŽ After T operations:
īŽ Let D={ max x : 2x divides T }
īŽ For i = 1 to D
īŽ Pick new PRP key for level i+1
īŽ Shuffle data in levels i and i+1 together into level i+1 using new
key
īŽ Level i is shuffled after every 2i ops.
31. +
Shuffling: Oblivious Hashing
īŽ 12-step Shuffling process with multiple calls to 2 primitives for
merging level i and i+1 into i
īŽ Primitives Used:
īŽ Scanning: Reading all words in memory array and possibly
modifying them
īŽ Oblivious in the sense, order of access is predetermined and
same content may be written back
īŽ 7 Calls
īŽ Oblivious Sorting:
Sorting memory array by sorting keys using Sorting Network such that
sequence of memory accesses are fixed and independent of input.
īŽ 4 Calls
Example of Sorting Networks: Batchers, AKS
33. +
Ostrovskyâs ORAM in Action
level
2
3
4
1
K
2
K3
K4
K
1
= data
PRF
Keys
1. Read a Slot
2. Read another Slot
īŽ Level 1 shuffled after 2^1 = 2 operations [stops here]
34. +
Ostrovskyâs ORAM in Action
level
2
3
4
1
K
2
K3
K4
K
1
= data
PRF
Keys
1. Read a Slot
2. Read another Slot
īŽ Level 1 shuffled after 2^2 = 2 operations
3. 2 more Reads
īŽ Level 1 shuffled after 2^2 = 4 operations
īŽ Level 2 shuffled after 2^2 = 4 operations [stops here]
35. +
Ostrovskyâs ORAM: Security
īŽ Security proof is more delicate than the first one.
īŽ Key observation: This scheme never uses the value F(Ki,
addr) on the same (key, address) twice.
īŽ Why? Suppose client touches for the same address twice.
īŽ After the first read, data is promoted to level 1.
īŽ During the next read:
īŽ If it is still on level 1, then we donât evaluate F at all.
īŽ If it is has been moved, a new key must have been chosen for
that level since last read due to shuffling.
īŽ Using key observation, all reads look like random bucket scans.
37. +
Outline
īŽ Whatâs ORAM ?
īŽ Motivation
īŽ History
īŽ More on ORAM
īŽ Goldreichâs ORAM
īŽ Ostrovskyâs ORAM
īŽ Recent Developments
īŽ Conclusion
38. +
Recent Developments
īŽ Cuckoo Hashing ORAMs
īŽ Replace bucket-lists with more efficient hash table
īŽ Path ORAM Protocol
īŽ Binary Tree based ORAM Framework with client stash
īŽ Each block is mapped to a uniformly random leaf
bucket in the tree,
īŽ Unstashed blocks are always placed in some bucket
along the path to the mapped leaf.
39. +
Outline
īŽ Whatâs ORAM ?
īŽ Motivation
īŽ History
īŽ More on ORAM
īŽ Goldreichâs ORAM
īŽ Ostrovskyâs ORAM
īŽ Recent Developments
īŽ Conclusion
40. +
Conclusion
Main Takeaway
īŽ Use sorting networks with PRP keys to simulate oblivious
shuffling after a certain number of memory accesses
īŽ Rinse and repeat
Improvement
īŽ Use Hierarchy of Hash table Buffers of different sizes
īŽ Shuffle buffers with frequency inversely proportional to their sizes
41. +
References
īŽ Goldreich, Oded, and Rafail Ostrovsky. "Software protection
and simulation on oblivious RAMs." Journal of the ACM (JACM)
43.3 (1996): 431-473.
īŽ Islam, Mohammad Saiful, Mehmet Kuzu, and Murat
Kantarcioglu. "Access Pattern disclosure on Searchable
Encryption: Ramification, Attack and Mitigation." NDSS. Vol.
20. 2012.
īŽ http://cseweb.ucsd.edu/~cdcash/oram-slides.pdf