8. What is Bitcoin Script?
• Forth-like, stack-based VM, RPN
• 1 byte opcodes
• All values are variable length byte
arrays
• Type interpreted by operations
• Only stack & alt-stack
• No return stack (no calls)
• No heap
• Deterministic - No side effects or I/O
13. Why Stack-based VM?
• memory efficient
• easy to implement VM
– no need for a lexer, parser or AST
• Portable
– run on devices: phones or calculators
without consuming too much bandwidth
• compact code
– storage on the Bitcoin Blockchain is
very expensive ($600K/GB @ $220/BTC)
16. Bitcoin Script Limitations
• deterministic, but not Turing complete
- intentionally
• no loops - disallow infinite loops
• no recursive functions
– no functions at all
• no jumps/goto
– but has (OP_IF,OP_ELSE,OP_ENDIF)
• many opcodes disabled (string ops)
• sigop counts – limit # of signature-checking ops
• scripts are limited in size - max 10,000 bytes per script,
520 bytes per pushed element
23. Tx Fees
• sum(TxInputs) ≥ sum(TxOutputs)
• TxFee = sum(TxInputs) – sum(TxOutputs)
• Min: 0.0001 BTC / 1000 bytes of tx (~ $0.022)
• fee goes to the miner who found a block,
which includes this tx
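The fee rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, amounts are in satoshis, and charging per started 1000 bytes (rounding up) is an assumption, not something the slide states.

```python
MIN_FEE_PER_KB = 10_000  # 0.0001 BTC per 1000 bytes, expressed in satoshis


def tx_fee(input_values, output_values):
    """Fee in satoshis: whatever the inputs carry beyond the outputs."""
    total_in, total_out = sum(input_values), sum(output_values)
    if total_in < total_out:
        raise ValueError("outputs exceed inputs: tx is invalid")
    return total_in - total_out


def min_fee(tx_size_bytes):
    """Minimum fee, charged per started 1000 bytes (rounding up is an assumption)."""
    kilobytes = -(-tx_size_bytes // 1000)  # ceiling division
    return kilobytes * MIN_FEE_PER_KB


fee = tx_fee([150_000, 60_000], [200_000])
print(fee, fee >= min_fee(226))  # 10000 True
```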
24. Blockchain – tree of Blocks
• Blocks (BlockHash 32B)
–Txs (TxId - reverse TxHash 32B)
• Inputs (#)
–reference to UTXO (TxId, output#)
–unlock script (scriptSig)
• Outputs (#)
–value (in satoshis; 1 satoshi = 10^-8 BTC)
–lock script (scriptPubKey)
28. Output(input) pair as an invocation
• output script - like a function (a hardcoded
function - tx type)
• input script - like the parameters to a function
31. Evaluation Logic
1. Start with an empty stack
2. Evaluate the “unlock script”
(scriptSig) from current
tx input
3. Evaluate the “lock script”
(scriptPubKey) from the UTXO
being spent
4. If result is true (1), tx is
valid, otherwise invalid
32. “Always Pay Anyone” ;)
Stack scriptSig scriptPubKey
… OP_TRUE
…
Concatenate both scripts & start with empty stack
33. “Always Pay Anyone” ;)
Stack scriptSig scriptPubKey
… OP_TRUE
…
…
Don’t care what’s in scriptSig – unless it invalidates the
Tx. It may even leave stuff on the stack.
34. “Always Pay Anyone” ;)
Stack scriptSig scriptPubKey
… OP_TRUE
…
…
…
Don’t care what’s in scriptSig – unless it invalidates the
Tx. It may even leave stuff on the stack.
35. “Always Pay Anyone” ;)
Stack scriptSig scriptPubKey
… OP_TRUE
…
1
…
…
The top of the stack is 1 (i.e. true) – the Tx is valid!
36. “Don’t Pay” – “Burn bitcoins”
Stack scriptSig scriptPubKey
… OP_FALSE
…
Concatenate both scripts & start with empty stack
37. “Don’t Pay” – “Burn bitcoins”
Stack scriptSig scriptPubKey
… OP_FALSE
…
…
Don’t care what’s in scriptSig – unless it invalidates the
Tx. It may even leave stuff on the stack.
38. “Don’t Pay” – “Burn bitcoins”
Stack scriptSig scriptPubKey
… OP_FALSE
…
…
…
Don’t care what’s in scriptSig – unless it invalidates the
Tx. It may even leave stuff on the stack.
39. “Don’t Pay” – “Burn bitcoins”
Stack scriptSig scriptPubKey
… OP_FALSE
…
0
…
…
The top of the stack is 0 (i.e. false) – the Tx is invalid!
These bitcoins are burned forever - unspendable!
40. Pay to math genius who knows
how much is 2 * 2 = ?
scriptSig: 4
scriptPubKey: 2 2 OP_MUL OP_EQUALVERIFY
Stack: (empty)
Concatenate both scripts & start with empty stack
41. Pay to math genius who knows
how much is 2 * 2 = ?
Stack scriptSig scriptPubKey
_4 2
2
OP_MUL
OP_EQUALVERIFY
4
Push constant to the stack
42. Pay to math genius who knows
how much is 2 * 2 = ?
Stack scriptSig scriptPubKey
_4 2
2
OP_MUL
2 OP_EQUALVERIFY
4
Push constant to the stack
43. Pay to math genius who knows
how much is 2 * 2 = ?
Stack scriptSig scriptPubKey
_4 2
2
2 OP_MUL
2 OP_EQUALVERIFY
4
Push constant to the stack
44. Pay to math genius who knows
how much is 2 * 2 = ?
Stack scriptSig scriptPubKey
_4 2
2
OP_MUL
4 OP_EQUALVERIFY
4
Multiply 2 values on top of the stack
45. Pay to math genius who knows
how much is 2 * 2 = ?
Stack scriptSig scriptPubKey
_4 2
2
OP_MUL
OP_EQUALVERIFY
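4 == 4 – the Tx is valid!
(Illustration only: OP_MUL is disabled in Bitcoin, and a real script would
end with OP_EQUAL so that a true value is left on the stack.)
Too easy – need a real cryptographic solution!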
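The evaluation logic and the three toy scripts above ("pay anyone", "burn", the 2 * 2 puzzle) can be replayed with a small Python sketch. It is not a real Script interpreter: the names `run` and `validate` are hypothetical, values are plain integers rather than byte arrays, and the final comparison uses OP_EQUAL so that a result stays on the stack for the validity check.

```python
def run(script, stack):
    """Execute one script, mutating the shared stack."""
    for op in script:
        if isinstance(op, int):            # a constant: push it
            stack.append(op)
        elif op == "OP_TRUE":
            stack.append(1)
        elif op == "OP_FALSE":
            stack.append(0)
        elif op == "OP_MUL":               # disabled in real Bitcoin; demo only
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        else:
            raise ValueError(f"unsupported opcode: {op}")


def validate(script_sig, script_pubkey):
    """Empty stack, run the unlock script, run the lock script, check the top."""
    stack = []
    try:
        run(script_sig, stack)
        run(script_pubkey, stack)
    except IndexError:                     # an opcode needed more items than available
        return False
    return bool(stack) and stack[-1] != 0


print(validate([], ["OP_TRUE"]))                       # True  (pay anyone)
print(validate([], ["OP_FALSE"]))                      # False (burned)
print(validate([4], [2, 2, "OP_MUL", "OP_EQUAL"]))     # True  (2 * 2 = 4)
print(validate([5], [2, 2, "OP_MUL", "OP_EQUAL"]))     # False
```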
46. Pay to PubKeyHash - P2PKH
scriptSig: <sig> <pubKey>
scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
Stack: (empty)
Concatenate both scripts & start with empty stack.
99% of all Bitcoin Txs use this script.
49. Pay to PubKeyHash - P2PKH
Stack scriptSig scriptPubKey
<sig> OP_DUP
<pubKey> OP_HASH160
<pubKey> <pubKeyHash>
<pubKey> OP_EQUALVERIFY
<sig> OP_CHECKSIG
Duplicate value on the top of the stack
52. Pay to PubKeyHash - P2PKH
Stack scriptSig scriptPubKey
<sig> OP_DUP
<pubKey> OP_HASH160
<pubKeyHash>
<pubKey> OP_EQUALVERIFY
<sig> OP_CHECKSIG
verify that 2 values are equal: if equal, continue;
else invalidate tx & stop execution
53. Pay to PubKeyHash - P2PKH
Stack scriptSig scriptPubKey
<sig> OP_DUP
<pubKey> OP_HASH160
<pubKeyHash>
OP_EQUALVERIFY
1 OP_CHECKSIG
Signature is checked for top two stack items.
1 on top of the stack – Tx is valid!
54. MultiSig – M out of N Tx
scriptSig: OP_0 <SigBuyer> <SigSeller>
scriptPubKey: 2 <PubKeyBuyer> <PubKeySeller> <PubKeyMediator> 3 OP_CHECKMULTISIG
Stack: (empty)
Any 2 out of 3 can sign this Tx:
Buyer & Seller, Mediator & Buyer or Mediator & Seller
55. MultiSig – M out of N Tx
Stack scriptSig scriptPubKey
OP_0 2
<SigBuyer> <PubKeyBuyer>
<SigSeller> <PubKeySeller>
<PubKeyMediator>
3
<SigSeller> OP_CHECKMULTISIG
<SigBuyer>
0
Any 2 out of 3 can sign this Tx:
Buyer & Seller, Mediator & Buyer or Mediator & Seller
56. MultiSig – M out of N Tx
Stack scriptSig scriptPubKey
3 OP_0 2
<PubKeyMediator> <SigBuyer> <PubKeyBuyer>
<PubKeySeller> <SigSeller> <PubKeySeller>
<PubKeyBuyer> <PubKeyMediator>
2 3
<SigSeller> OP_CHECKMULTISIG
<SigBuyer>
0
Any 2 out of 3 can sign this Tx:
Buyer & Seller, Mediator & Buyer or Mediator & Seller
57. MultiSig – M out of N Tx
Stack scriptSig scriptPubKey
OP_0 2
<SigBuyer> <PubKeyBuyer>
<SigSeller> <PubKeySeller>
<PubKeyMediator>
3
OP_CHECKMULTISIG
1
If 2 signatures matching any 2 out of 3 public keys
– Tx is Valid!
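The matching step of OP_CHECKMULTISIG can be sketched as follows. The signature check is a stand-in (an assumption for illustration): a "signature" is just the string `sig-by-` plus the key name, whereas real nodes verify ECDSA signatures over the transaction. What the sketch does show is the ordering rule: signatures must appear in the same order as their public keys, and each key is used at most once.

```python
def toy_verify(sig, pubkey):
    """Stand-in for ECDSA verification (assumption, not real cryptography)."""
    return sig == "sig-by-" + pubkey


def check_multisig(sigs, pubkeys, verify=toy_verify):
    """True if every signature matches a distinct key, in key order."""
    key_index = 0
    for sig in sigs:
        # Walk forward through the keys until one accepts this signature.
        while key_index < len(pubkeys) and not verify(sig, pubkeys[key_index]):
            key_index += 1
        if key_index == len(pubkeys):
            return False                  # ran out of keys
        key_index += 1                    # each key can be used only once
    return True


def multisig_valid(m, sigs, pubkeys):
    """M-of-N: exactly m signatures, all of which must match."""
    return len(sigs) == m and check_multisig(sigs, pubkeys)


keys = ["buyer", "seller", "mediator"]
print(multisig_valid(2, ["sig-by-buyer", "sig-by-seller"], keys))    # True
print(multisig_valid(2, ["sig-by-buyer", "sig-by-mediator"], keys))  # True
print(multisig_valid(2, ["sig-by-seller", "sig-by-buyer"], keys))    # False (wrong order)
```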
58. Standard Tx Script Types
• Pay-to-PubKey (P2PK) – obsolete
• Pay-to-PubKeyHash (P2PKH) – 99% of all Tx
• Pay-to-ScriptHash (P2SH)
• Multisig – obsolete
• Nulldata - OP_RETURN
59. Non-standard Txs
• standardness rules protect bitcoin nodes from DoS
attacks by peers that send non-standard tx
• an invalid script (and tx) will not be accepted
• a non-standard script (and tx) will not be
relayed to the network
• but some miner pools will accept them
(Eligius) – need to send directly to them
60. NullData Script
• OP_RETURN [up to 40 bytes metadata]
- immediately invalidates the tx
- allows embedding metadata into blockchain
- unspendable / non-redeemable (burned)
- Before OP_RETURN was whitelisted metadata
was encoded as fake addresses
- provably prunable
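As a rough sketch, a nulldata output script is the OP_RETURN opcode (0x6a) followed by a data push. The helper name `nulldata_script` is hypothetical, and the 40-byte cap is the limit quoted on the slide.

```python
OP_RETURN = 0x6A
MAX_METADATA = 40  # limit quoted on the slide


def nulldata_script(metadata: bytes) -> bytes:
    """Build OP_RETURN <metadata> as raw script bytes."""
    if not 1 <= len(metadata) <= MAX_METADATA:
        raise ValueError("metadata must be 1..40 bytes")
    # For pushes of 1..75 bytes, the push opcode is simply the length byte.
    return bytes([OP_RETURN, len(metadata)]) + metadata


script = nulldata_script(b"hello blockchain")
print(script.hex())
```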
63. Disadvantages
• Bitcoin Script can be used to implement a
weak version of Smart Contracts, but:
– Not Turing-complete
– Designed for Tx Validation – not general purpose
– Lack of state (either valid or invalid Tx, no storage)
– Value-blindness (i.e. just use UTXO value – can’t
pay arbitrary amount of BTC)
– Blockchain-blindness (can’t use blockchain data –
source of randomness, needed for gambling)
64. Smart Contracts on Bitcoin
• Smart Contracts on Bitcoin require multiple
technologies:
– Pay to Script Hash (P2SH) Multisig
– OP_RETURN to encode Metadata on the
Blockchain
– Oracles - network of external servers running
Smart Contracts’ deterministic Turing-complete
code (decisions by strict majority like Jury)
• Too Hacky, Complex & Error-prone!
91. Decentralization Continuum
Source: The “Unbundling of Trust”: how to identify good cryptocurrency opportunities? by Richard Brown
http://www.gendal.me/2014/11/14/the-unbundling-of-trust-how-to-identify-good-cryptocurrency-opportunities/
92. Decentralized
Centralized | Decentralized
Apple iTunes, Netflix | Bittorrent
Facebook | Diaspora*
WhatsApp | Jabber/XMPP
Cellular operators | Firechat – Mesh Networks
AOL | Internet
Post Office | email
Domain Registrars | Namecoin
PayPal | Bitcoin
104. Ethereum Blockchain
• Same concept as in Bitcoin
• Bitcoin block time ~ 10 min
• Ethereum block time ~ 12-15 sec (about 5 block
candidates per 1 min)
105. Ethereum Account Types
• EOA (Externally-owned Account)
– controlled by Human or application (DApp)
– only EOA can initiate transactions
• Contract Account
– can receive transactions
– can send messages to itself or other contracts
106. Ethereum Account
• Address – 160 bit excerpt from the hash of the public key
• Balance (in ether ~ $0.70/ETH now)
• Nonce
Contract Accounts in addition have:
• Code
• Storage
107. Contract
• “Contract” is not a good name, better names:
– Autonomous Agent
– Actor
– Object (like in OOP)
108. Contracts
• Contracts are like people:
– can call / send messages to other contracts
– … and return values
– can create new contracts
– can replicate themselves
– can “suicide”
– can pay (send ether) other contracts or people
– … can buy things
109. Create Contract
• Create Contract:
– Endowment (ETH)
– Init code (whatever returned from init code)
– Gas
– Signature
• On creation:
– Places a new account in the system with code
(code in account is whatever returned from init)
111. Send a Message Call to Contract
• Send a message call to a contract:
– Recipient account address (160 bit)
– Value (ETH)
– Data (byte array)
– Gas limit
– Gas price (ether paid per unit of gas, used for tx priority)
– Signature
• On message receipt:
– Value is transferred to recipient’s balance
– Recipient’s code (if any) runs
– Return result to the caller
112. Transactions & Messages
• A transaction always originates from an EOA
• It can result in multiple message calls to
contract accounts
• Transactions are recorded in the blockchain
• Message calls are transient (they exist only while
the transaction is executing)
114. Ethereum VM - EVM
• Stack of 32B (256bit) words
• Byte-addressable Memory (2^256
bytes addressable)
• Key/Value Storage (2^256 words addressable)
115. EVM - Storage
• Isolated from other accounts
• Storage address space modeled as Associative Array, not
a Linear Memory – Key/Value Store
• the only VM which uses Associative Array for Address
Space
• Every new (unused) word in memory/storage has 0 value
• Writing 0 to storage word - equivalent to deleting it
(freeing it)
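The storage rules above map naturally onto a dictionary. A minimal Python sketch (the class name `Storage` is hypothetical): unused keys read as 0, and writing 0 removes the entry.

```python
class Storage:
    """Per-account key/value store of 256-bit words, modeled with a dict."""

    def __init__(self):
        self._slots = {}

    def load(self, key):
        # Every unused word reads as 0.
        return self._slots.get(key, 0)

    def store(self, key, value):
        if value == 0:
            self._slots.pop(key, None)   # writing 0 frees the slot
        else:
            self._slots[key] = value

    def __len__(self):
        return len(self._slots)


s = Storage()
s.store(54, 20202020)
print(s.load(54), s.load(99), len(s))  # 20202020 0 1
s.store(54, 0)
print(len(s))                          # 0
```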
116. EVM State is 8-tuple:
{
block_state, // also references storage
transaction, // current transaction
message, // current message
code, // current contract’s code
memory, // memory byte array
stack, // words on the stack
pc, // program counter → code[pc]
gas // gas left to run tx
}
117. Why 256 bit?
• Crypto primitives:
– 256-bit hashes: SHA256, SHA3 (Keccak-256)
– public key is 256-bit uint (odd/even,x)
– signature using secp256k1/ECDSA is 2 256-bit
uints (r,s)
• 160-bit account addresses fit into 256-bit
• 256-bit SIMD ISAs (AVX) on modern CPUs
118. WORD – Data Types
• 256 bit big endian unsigned integers - uint256
• 256 bit 2-s complement signed integers - int256
• 256 bit hash (as big endian)
• 160 bit Account Address
– big endian, least significant 20 bytes only
– 12 most significant bytes discarded
• 32 bytes/characters
• 0 – False, 1 - True
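The same 256-bit word can be read in several ways, depending on the operation. A small Python sketch of those readings (the helper names are hypothetical):

```python
WORD_BITS = 256
MASK = (1 << WORD_BITS) - 1


def to_word(n):
    """Wrap any Python int into a 256-bit word (negatives become two's complement)."""
    return n & MASK


def as_int256(word):
    """Read a word as a two's complement signed integer."""
    return word - (1 << WORD_BITS) if word >> (WORD_BITS - 1) else word


def as_address(word):
    """Keep the least significant 20 bytes; the top 12 bytes are discarded."""
    return word & ((1 << 160) - 1)


def word_bytes(word):
    """Big-endian 32-byte encoding of a word."""
    return word.to_bytes(32, "big")


minus_one = to_word(-1)
print(minus_one == MASK, as_int256(minus_one))         # True -1
print(hex(as_address((0xAB << 160) | 0x1234)))         # 0x1234
print(word_bytes(1)[-1], len(word_bytes(1)))           # 1 32
```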
119. Ethereum VM (EVM) ISA
From To Opcode groups
00 0F Stop and Arithmetic Operations
10 1F Comparison & Bitwise Logic Operations
20 2F SHA3 hashing
30 3F Environmental Information
40 4F Block Information
50 5F Stack, Memory, Storage and Flow Operations
60 7F Push Operations
80 8F Duplication Operations
90 9F Exchange Operations
A0 AF Logging Operations
F0 FF Contract Operations
120. Arithmetic Ops
Hex Mnemonic δ α Description
01 ADD 2 1 Addition
02 MUL 2 1 Multiplication
03 SUB 2 1 Subtraction
04 DIV 2 1 Integer division
05 SDIV 2 1 Signed integer division. Where all values are treated as
two’s complement signed 256-bit integers
06 MOD 2 1 Modulo remainder
07 SMOD 2 1 Signed modulo remainder. Where all values are treated as
two’s complement signed 256-bit integers
08 ADDMOD 3 1 Modulo addition
09 MULMOD 3 1 Modulo multiplication
0A EXP 2 1 Exponential operation
0B SIGNEXTEND 2 1 Extend length of two’s complement signed integer
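A few of these opcodes, sketched in Python under the usual EVM conventions: results wrap modulo 2^256, division by zero yields 0, and ADDMOD/MULMOD apply the modulus to the full (unwrapped) intermediate result. The function names are illustrative only.

```python
MOD = 1 << 256


def add(a, b):
    return (a + b) % MOD


def sub(a, b):
    return (a - b) % MOD


def mul(a, b):
    return (a * b) % MOD


def div(a, b):
    return 0 if b == 0 else a // b      # no exception: x / 0 is 0


def addmod(a, b, n):
    return 0 if n == 0 else (a + b) % n  # intermediate sum is not wrapped


def mulmod(a, b, n):
    return 0 if n == 0 else (a * b) % n


print(add(MOD - 1, 1))          # 0  (overflow wraps around)
print(sub(0, 1) == MOD - 1)     # True
print(div(7, 0))                # 0
print(addmod(MOD - 1, 2, 10))   # 7, whereas add(MOD - 1, 2) % 10 would be 1
```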
121. 10s: Comparison & Bitwise Logic
Hex Mnemonic δ α Description
10 LT 2 1 Less-than comparison
11 GT 2 1 Greater-than comparison
12 SLT 2 1 Signed less-than comparison
13 SGT 2 1 Signed greater-than comparison
14 EQ 2 1 Equality comparison
15 ISZERO 1 1 Simple not operator
16 AND 2 1 Bitwise AND
17 OR 2 1 Bitwise OR
18 XOR 2 1 Bitwise XOR
19 NOT 1 1 Bitwise NOT
1A BYTE 2 1 Retrieve single byte from word. For Nth byte, we count
from the left (i.e. N=0 would be the most significant in big
endian)
122. 20s: SHA3 hashing
Hex Mnemonic δ α Description
20 SHA3 2 1 Compute Keccak-256 hash for the range in memory [start,
start+len-1]
$\mu'_s[0] \equiv \mathrm{Keccak}(\mu_m[\mu_s[0] \ldots (\mu_s[0] + \mu_s[1] - 1)])$
$\mu'_i \equiv M(\mu_i, \mu_s[0], \mu_s[1])$
123. Message Call Data Ops
Hex Mnemonic δ α Description
35 CALLDATALOAD 1 1 Get input data of current environment. This pertains to
the input data passed with the message call instruction
or transaction
36 CALLDATASIZE 0 1 Get size of input data in current environment. This
pertains to the input data passed with the message call
instruction or transaction
37 CALLDATACOPY 3 0 Copy input data in current environment to memory.
This pertains to the input data passed with the message
call instruction or transaction
124. Contract Code Ops
Hex Mnemonic δ α Description
38 CODESIZE 0 1 Get size of code running in current environment
39 CODECOPY 3 0 Copy code running in current environment to memory
3B EXTCODESIZE 1 1 Get size of an account’s code
3C EXTCODECOPY 4 0 Copy an account’s code to memory
125. 30s: Environmental Information
Hex Mnemonic δ α Description
30 ADDRESS 0 1 Get address of currently executing account (it’s like this /
self in OOP, self() in Erlang)
31 BALANCE 1 1 Get balance of the given account
32 ORIGIN 0 1 Get execution origination address. This is the sender of
original transaction; it is never an account with non-
empty associated code
33 CALLER 0 1 Get caller address. This is the address of the account
that is directly responsible for this execution
34 CALLVALUE 0 1 Get deposited value by the instruction/transaction
responsible for this execution
3A GASPRICE 0 1 Get price of gas in current environment. This is gas price
specified by the originating transaction
5A GAS 0 1 Get the amount of available gas
126. 40s: Block Information
Hex Mnemonic δ α Description
40 BLOCKHASH 1 1 Get the hash of one of the 256 most recent complete
blocks
41 COINBASE 0 1 Get the block’s coinbase address
42 TIMESTAMP 0 1 Get the block’s timestamp
43 NUMBER 0 1 Get the block’s number
44 DIFFICULTY 0 1 Get the block’s difficulty
45 GASLIMIT 0 1 Get the block’s gas limit
127. Memory
Hex Mnemonic δ α Description
51 MLOAD 1 1 Load word from memory
52 MSTORE 2 0 Save word to memory
53 MSTORE8 2 0 Save byte to memory
59 MSIZE 0 1 Get the size of active memory in bytes
128. Storage
Hex Mnemonic δ α Description
54 SLOAD 1 1 Load word from storage
55 SSTORE 2 0 Save word to storage
129. Control Flow
Hex Mnemonic δ α Description
00 STOP 0 0 Halts execution
56 JUMP 1 0 Alter the program counter
57 JUMPI 2 0 Conditionally alter the program counter
58 PC 0 1 Get the program counter
5B JUMPDEST 0 0 Mark a valid destination for jumps. This operation has no
effect on machine state during execution
130. Contract ops
Hex Mnemonic δ α Description
F0 CREATE 3 1 Pops a,b,c.
Creates a new contract with init code from memory[b : b+c]
and endowment (initial ether sent) a,
and pushes the address of the new contract
F1 CALL 7 1 Send message call to contract
F3 RETURN 2 0 Pops a,b.
Stops execution, returning memory[a : a+b]
FF SUICIDE 1 0 Sends all remaining ether to specified address,
returns and flags contract for deletion as soon as tx ends.
Like C++:
delete this;
131. Stack ops
Hex Mnemonic δ α Description
50 POP 1 0 Remove item from stack
60…7F PUSH1…PUSH32 0 1 Place a 1, 2 … 32-byte item on stack.
The bytes are read in line from the program code’s byte
array. The function c ensures the bytes default to zero if
they extend past the limits. The bytes are right-aligned (take
the lowest significant place in big endian).
DUP … Operations to duplicate values on the stack
SWAP … Operations to swap values on the stack
139. Name Registry contract
Compiled to EVM assembly:
PUSH1 0 CALLDATALOAD SLOAD ISZERO PUSH1 9 JUMPI
STOP JUMPDEST PUSH1 32 CALLDATALOAD PUSH1 0
CALLDATALOAD SSTORE
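To see what this program does, here is a toy Python interpreter that supports only the opcodes it uses. It is a sketch, not a faithful EVM: there is no gas accounting or memory, and the "is this name free?" test is implemented as a logical not (ISZERO in the opcode table). The function names are hypothetical.

```python
CODE = bytes([
    0x60, 0x00,  # PUSH1 0
    0x35,        # CALLDATALOAD  -> name
    0x54,        # SLOAD         -> storage[name]
    0x15,        # ISZERO        -> 1 if the name is still free
    0x60, 0x09,  # PUSH1 9
    0x57,        # JUMPI         -> jump to byte 9 if free
    0x00,        # STOP             (name taken: do nothing)
    0x5B,        # JUMPDEST         (byte offset 9)
    0x60, 0x20,  # PUSH1 32
    0x35,        # CALLDATALOAD  -> value
    0x60, 0x00,  # PUSH1 0
    0x35,        # CALLDATALOAD  -> name
    0x55,        # SSTORE           storage[name] = value
])


def calldataload(data, offset):
    """Read a 32-byte big-endian word, zero-padded past the end of the data."""
    return int.from_bytes(data[offset:offset + 32].ljust(32, b"\0"), "big")


def run(code, calldata, storage):
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == 0x00:                                   # STOP
            break
        elif op == 0x15:                                 # ISZERO
            stack.append(1 if stack.pop() == 0 else 0)
        elif op == 0x35:                                 # CALLDATALOAD
            stack.append(calldataload(calldata, stack.pop()))
        elif op == 0x54:                                 # SLOAD
            stack.append(storage.get(stack.pop(), 0))
        elif op == 0x55:                                 # SSTORE: key on top, then value
            key, value = stack.pop(), stack.pop()
            storage[key] = value
        elif op == 0x57:                                 # JUMPI: destination on top, then condition
            dest, cond = stack.pop(), stack.pop()
            if cond != 0:
                if code[dest] != 0x5B:
                    raise ValueError("jump target is not a JUMPDEST")
                pc = dest
        elif op == 0x5B:                                 # JUMPDEST: no effect
            pass
        elif op == 0x60:                                 # PUSH1: next code byte is the operand
            stack.append(code[pc])
            pc += 1
        else:
            raise ValueError(f"unsupported opcode {op:#x}")
    return storage


def words(*values):
    """Encode integers as consecutive 32-byte calldata words."""
    return b"".join(v.to_bytes(32, "big") for v in values)


registry = {}
run(CODE, words(54, 20202020), registry)
print(registry)                      # {54: 20202020}
run(CODE, words(54, 99), registry)   # name already taken: storage unchanged
print(registry)                      # {54: 20202020}
```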
140. EVM State is 8-tuple:
{
block_state, // also references storage
transaction, // current transaction
message, // current message
code, // current contract’s code
memory, // memory byte array
stack, // words on the stack
pc, // program counter → code[pc]
gas // gas left to run tx
}
141. EVM State inside Contract:
Invariant per Contract:
block_state, // also references storage
transaction, // current transaction
message, // current message
code // current contract’s code
Contract State:
{
pc, // program counter → code[pc]
gas, // gas left to run tx
stack, // words on the stack
memory, // memory byte array
storage // K/V store of words
}
142. Example of Tx
Zvi registers a domain “54” with IP “20202020”:
- Send Tx:
- From: “Zvi 160-bit address”
- To: “NameRegistry” contract’s address
- Value: 0 ether
- Data: [54, 20202020]
- GasLimit: 2000 gas
- GasPrice: 1.0 (1 gas == 1 wei)
143. Example of Tx - Gas
Calldata [54, 20202020] is 2 words of 32 bytes = 64
bytes.
StartGas * GasPrice = 2000 * 1 = 2000 wei
Tx costs:
• 500 + 5*TXDATALEN = 500 + 5*64 bytes = 820 gas
157. Gas Usage
• 1150 gas consumed by Tx execution
• 2000 gas – 1150 gas = 850 gas refund
• If GasLimit were set below 1150, the Tx would
fail midway and all gas would be
consumed (no refund)
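The gas arithmetic from these slides, as a short Python sketch. The constants (500 gas per tx, 5 gas per calldata byte) are the figures quoted in the example, and the function names are hypothetical.

```python
TX_BASE_GAS = 500        # per-transaction cost used in the example
GAS_PER_DATA_BYTE = 5    # per-byte calldata cost used in the example


def base_cost(calldata_len):
    """Gas charged before any code runs."""
    return TX_BASE_GAS + GAS_PER_DATA_BYTE * calldata_len


def settle(gas_limit, gas_used, gas_price=1):
    """Return (gas charged, wei refunded) for a finished transaction."""
    if gas_used > gas_limit:
        return gas_limit, 0              # out of gas: everything is consumed
    return gas_used, (gas_limit - gas_used) * gas_price


print(base_cost(64))          # 820
print(settle(2000, 1150))     # (1150, 850)
print(settle(1000, 1150))     # (1000, 0)
```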
158. Send the same Tx 2nd time
Zvi registers a domain “54” with IP “20202020”:
- Send Tx:
- From: “Zvi 160-bit address”
- To: “NameRegistry” contract’s address
- Value: 0 ether
- Data: [54, 20202020]
- GasLimit: 2000 gas
- GasPrice: 1.0 (1 gas == 1 wei)
167. Gas Usage (2nd Tx)
• 845 gas consumed by 2nd Tx execution
• 2000 gas – 845 gas = 1155 gas refund
• If GasLimit were set below 845, the Tx would
fail midway and all gas would be
consumed (no refund)
168. Acceptable uses of the EVM
• Acceptable uses:
– running business logic (“IFTTT - If This Then That")
– verifying signatures & other cryptographic objects
– applications that verify parts of other blockchains (eg. a
decentralized ether-to-bitcoin exchange)
• Unacceptable uses:
– using the EVM as a file storage, email or text messaging
– anything to do with GUI, web apps, etc.
– cloud computing, HPC, number crunching, ML, etc.
170. DSLs for Ethereum Smart Contracts
• Low-level
– EVM Assembly
– LLL (Triple-L) - Lisp-like Low-level Language
• High-level
– Serpent (Python-like) – going to be obsolete?
– EtherScript – Visual DSL
– Mutan (Go-like) – obsolete
– CLL (C-like language) – obsolete
– Solidity - (C/Javascript like with static types)
171. LLL (triple L)
• Lisp-like Low-level Language
• (*.lll)
• Used mostly for compilers & tools
• LISP-flavored EVM “MacroAssembly”
• S-expressions of opcodes
• Unlike EVM Assembly
– no need to manage stack
– no need to manage jumps & jump dest labels
• Can test & generate LLL from Clojure
– https://github.com/drcode/clll
178. Serpent
• Python-like syntax
• Python control flow (if, while, etc.)
• Infix operators
• EVM semantics
• Special variables to refer to EVM properties
• A little bit higher level than LLL
• Can write unit tests in Python
• (*.se)
183. Solidity (*.sol)
• DSL designed specifically for Ethereum contracts
• Syntax similar to C/C++
• Statically typed
• ABI – Application Binary Interface
– i.e. a function in one contract knows how to call, and marshal
arguments to, a function in another contract
– i.e. common contract code libraries
• Mix IDE for Solidity:
– https://github.com/ethereum/wiki/wiki/Mix:-The-DApp-IDE
• Solidity Online Compiler:
– http://chriseth.github.io/cpp-ethereum
188. Functions can access
private members
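contract Foo {
    // constructor: runs once, at contract creation
    function Foo() {
        x = 69;
    }
    // public getter for the private member
    function getx() returns (uint) {
        return x;
    }
    uint private x;  // private: only this contract's code can access it
}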
189. Types
• bool
• intN - N in [8:8:256] bit, int is int256
• uintN - N in [8:8:256] bit, uint is uint256
• hashN - N in [8:8:256] bit, hash is hash256
• address - 160 bit
• stringN - N in [0:32] bytes
– string0 - empty string
– string1 – character
– string32 – 32 char fixed-length string
192. Mappings (assoc arrays)
mapping (KEYTYPE => VALUETYPE) M;
• Regular finite-size member variables take contiguous
storage slots starting from position 0
• The mapping variable (M) itself takes unfilled slot in
some position p (i.e. p = addr(M) )
• Mappings layout in storage:
addr(M[k]) = sha3(k . p)
194. Data Structure Nesting
• No Arrays (yet)
• Structs can be nested
• Mappings can be nested
• Structs can include Mappings
• Mappings can include Structs
195. “Paid” Function Calls
contract InfoFeed {
    function info() returns (uint ret) {
        return 42;
    }
}
contract Consumer {
    InfoFeed feed;
    function setFeed(address addr) {
        feed = InfoFeed(addr);
    }
    function callFeed() {
        // send 10 wei and at most 800 gas along with the call
        feed.info.value(10).gas(800)();
    }
}
196. Subcurrency example
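contract Coin {
    // constructor: the creator receives the entire initial supply
    function Coin() {
        balances[msg.sender] = 1000000000;
    }
    // move `value` coins from the caller to `to`, if the caller can afford it
    function send(address to, uint value) {
        if (balances[msg.sender] >= value) {
            balances[msg.sender] -= value;
            balances[to] += value;
        }
    }
    mapping(address => uint) private balances;
}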
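The contract's behavior can be mimicked off-chain with a plain Python class, which is handy for reasoning about it. This is only a model under assumptions: the caller is passed explicitly instead of `msg.sender`, and `send` returns a bool for convenience, which the contract above does not do.

```python
class Coin:
    """Off-chain model of the Coin subcurrency contract."""

    def __init__(self, creator):
        # Mirrors the constructor: the creator gets the whole supply.
        self.balances = {creator: 1_000_000_000}

    def balance_of(self, account):
        return self.balances.get(account, 0)   # unset mapping entries read as 0

    def send(self, sender, to, value):
        if value < 0:
            raise ValueError("value is a uint and cannot be negative")
        if self.balance_of(sender) >= value:
            self.balances[sender] = self.balance_of(sender) - value
            self.balances[to] = self.balance_of(to) + value
            return True
        return False                            # insufficient funds: nothing changes


coin = Coin("zvi")
print(coin.send("zvi", "alice", 250))     # True
print(coin.send("alice", "bob", 1000))    # False
print(coin.balance_of("zvi"), coin.balance_of("alice"), coin.balance_of("bob"))
# 999999750 250 0
```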