3. “The network is the unsolved problem of cloud, and open source. We need
the network to be a first-class citizen of a cloud system.”
-Daniel Berg, distinguished engineer at IBM Cloud
Communication in the Cloud is a Challenge
4. “A standard approach to connecting and communicating with microservices is
needed. Standardization and uniformity simplifies management.”
-Tony Lock, distinguished analyst at Freeform Dynamics
22. Can We Do Something About This Mess?
Request-reply, Streaming, RPC,
Peer-to-peer, Real-time
23. Things We Can’t Do Anything About
It’s tempting to address complexity by imposing uniformity
of infrastructure. This is costly and constraining, and
doesn’t address any of the difficulties in building reliable
distributed applications.
Deployment Topology
24. Things We Can’t Do Anything About
Frameworks and languages have their place. We want to
be open to new technologies and allow developers the
freedom to choose the right tools for the job.
Proliferation of development frameworks
25. Things We Can Do Something About
Define a standard protocol that lets us use this small,
finite set of interactions across any combination of
infrastructure, language, and framework, so they don’t have
to be implemented, reimplemented, and implemented again!
27. Java API
public interface RSocket {
Mono<Payload> requestResponse(Payload payload);
Mono<Void> fireAndForget(Payload payload);
Flux<Payload> requestStream(Payload payload);
Flux<Payload> requestChannel(Flux<Payload> payloads);
}
29. Interaction Models – Request-Response
Mono<Payload> resp = client.requestResponse(requestPayload);
• Standard Request-Response semantics
• Likely to represent the majority of requests for the foreseeable future
• Even this basic interaction model improves on HTTP/1.x because it is
asynchronous and multiplexed
• Request with account number, respond with account balance
30. Interaction Models – Fire-and-Forget
Mono<Void> resp = client.fireAndForget(requestPayload);
• An optimization of Request-Response when a response isn't necessary
• Significant efficiencies
• Networking (no ack)
• Client/Server processing (immediate release of resources)
• Non-critical event logging
31. Interaction Models – Request-Stream
Flux<Payload> resp = client.requestStream(requestPayload);
• Analogous to Request-Response returning a collection
• The collection is streamed back instead of queuing until complete
• RequestN semantics mean data is not materialized until ready to send
• Request with account number, respond with real-time stream of account
transactions
32. Interaction Models – Channel
Flux<Payload> out = client.requestChannel(in); // in is a Flux<Payload>
• A bi-directional stream of messages
• An unstructured channel allows arbitrary interaction models
• Request burst of initial state, listen for subsequent updates, client
updates subscription without starting new connection
• Request with account number, respond with real-time stream of account
transactions, update subscription to filter certain transaction types,
respond with filtered real-time stream of account transactions
33. Message Driven Binary Protocol
• Requester-Responder interaction is broken down into frames that
encapsulate messages
• The framing is binary (not human readable like JSON or XML)
• Massive efficiencies for machine-to-machine communication
• Downsides only manifest rarely and can be mitigated with tooling
• Payloads are bags of bytes
• Can be JSON or XML (it's all just 1s and 0s)
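To make the idea of binary framing concrete, here is a minimal sketch of packing and unpacking a frame header with `java.nio.ByteBuffer`. The layout (a 31-bit stream id followed by a 6-bit frame type and 10 flag bits) and the two constants follow the published RSocket protocol specification as best understood here; the class and method names are hypothetical and are not part of any RSocket library.

```java
import java.nio.ByteBuffer;

/** Illustrative only: packs a 6-byte header of stream id, frame type and flags. */
final class FrameHeaderSketch {
    // Values assumed from the RSocket protocol specification.
    static final int TYPE_REQUEST_RESPONSE = 0x04;
    static final int FLAG_METADATA = 0x100; // "metadata present" flag bit

    /** Encodes the header big-endian (ByteBuffer's default byte order). */
    static byte[] encode(int streamId, int frameType, int flags) {
        if (streamId < 0) {
            throw new IllegalArgumentException("stream id must fit in 31 bits");
        }
        if (frameType < 0 || frameType > 0x3F) {
            throw new IllegalArgumentException("frame type must fit in 6 bits");
        }
        if (flags < 0 || flags > 0x3FF) {
            throw new IllegalArgumentException("flags must fit in 10 bits");
        }
        ByteBuffer buf = ByteBuffer.allocate(6);
        buf.putInt(streamId);                               // 4 bytes: stream id
        buf.putShort((short) ((frameType << 10) | flags));  // 2 bytes: type + flags
        return buf.array();
    }

    static int streamId(byte[] header) {
        return ByteBuffer.wrap(header).getInt(0) & 0x7FFFFFFF;
    }

    static int frameType(byte[] header) {
        return (ByteBuffer.wrap(header).getShort(4) & 0xFFFF) >>> 10;
    }

    static int flags(byte[] header) {
        return ByteBuffer.wrap(header).getShort(4) & 0x3FF;
    }
}
```

Six bytes carry what a text protocol would spend a header line or more on, which is where the machine-to-machine efficiency comes from; the payload that follows stays an opaque byte array.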
41. Multiplexed
• Connections that are only used for a single request are massively
inefficient (HTTP 1.0)
• Pipelining (ordering requests and responses sequentially) is a naive
attempt at solving the issue, but results in head-of-line blocking (HTTP 1.1)
• Multiplexing solves the issue by annotating each message on the
connection with a stream id that partitions the connection into multiple
"logical streams"
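The stream-id idea above can be sketched in a few lines. This is an illustration of demultiplexing only, not RSocket library code; `Frame` and `demultiplex` are hypothetical names, and messages are plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: splits one connection's frames into logical streams. */
final class MultiplexSketch {
    /** A message tagged with the id of the logical stream it belongs to. */
    static final class Frame {
        final int streamId;
        final String data;

        Frame(int streamId, String data) {
            this.streamId = streamId;
            this.data = data;
        }
    }

    /** Groups interleaved frames by stream id, preserving order within each stream. */
    static Map<Integer, List<String>> demultiplex(List<Frame> connection) {
        Map<Integer, List<String>> streams = new LinkedHashMap<>();
        for (Frame frame : connection) {
            streams.computeIfAbsent(frame.streamId, id -> new ArrayList<>())
                   .add(frame.data);
        }
        return streams;
    }
}
```

Because each frame names its stream, a slow response on one stream does not hold back frames for another, which is the head-of-line problem pipelining leaves unsolved.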
46. Bi-Directional
• Many protocols (notably not TCP) have a distinction between the client
and server for the lifetime of a connection
• This division means that one side of the connection must initiate all
requests, and the other side must initiate all responses
• Even more flexible protocols like HTTP/2 do not fully drop the distinction
• Servers cannot start an unrequested stream of data to the client
• Once a client initiates a connection to a server, both parties can be
requestors or responders to a logical stream
52. Reactive Streams Back Pressure
Network protocols generally send a single request, and receive an
arbitrarily large response in return
There is nothing to stop the responder (or even the requestor) from sending an
arbitrarily large amount of data and overwhelming the receiver
In cases where TCP back pressure throttles the responder, queues fill with large
amounts of un-transferred data
53. Reactive Streams (pull-push) back pressure ensures that data is only
materialized and transferred when the receiver is ready to process it
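A minimal sketch of the pull-push idea, using the JDK's `java.util.concurrent.Flow` interfaces rather than Reactor or the RSocket library. `CountingPublisher` and `ManualSubscriber` are hypothetical names, and the publisher is deliberately simplified: it is synchronous, ignores `cancel`, and does not validate demand, so it is not a fully compliant Reactive Streams implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

/** Illustrative only: items are created strictly in response to request(n). */
final class BackPressureSketch {
    /** Publishes 1, 2, 3, ... but produces a value only when demand exists. */
    static final class CountingPublisher implements Flow.Publisher<Integer> {
        int materialized = 0; // how many items have actually been produced

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                @Override
                public void request(long n) {
                    for (long i = 0; i < n; i++) {
                        subscriber.onNext(++materialized);
                    }
                }

                @Override
                public void cancel() {
                    // Not needed for this sketch.
                }
            });
        }
    }

    /** Collects items and exposes the subscription so the caller controls demand. */
    static final class ManualSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; }
        @Override public void onNext(Integer item) { received.add(item); }
        @Override public void onError(Throwable t) { }
        @Override public void onComplete() { }
    }
}
```

After `publisher.subscribe(subscriber)` nothing has been produced; calling `subscriber.subscription.request(2)` yields exactly two items. Over a network the same demand signal travels as a frame, so the responder never builds data the receiver has not asked for.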
65. Resumption/Resumability
• Starting as a client-to-edge-server protocol highlighted a common failing of
existing options
• Clients on unstable connections would often drop and need to re-establish
current state
• Led to inefficiencies in both network traffic and data-center compute
• Resumability allows both parties in a "logical connection" to identify
themselves on reconnection
• On Resumption both parties handshake about the last frame received and all
missed frames are re-transmitted
• Frame caching to support resumption is not defined by the spec, so
implementations can be very flexible
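The resumption handshake can be sketched as a sender-side cache plus a "what did you last receive?" lookup. This is an illustration under simplifying assumptions: positions here count whole frames, the cache is an unbounded in-memory list, and all names are hypothetical. A real implementation would choose its own caching strategy, which is exactly the flexibility the last bullet describes.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: retransmits frames the peer missed before a reconnect. */
final class ResumptionSketch {
    // Frame cache; how long frames are kept is left to the implementation.
    private final List<String> sentFrames = new ArrayList<>();

    /** Records a frame as sent and returns its 1-based position. */
    int send(String frame) {
        sentFrames.add(frame);
        return sentFrames.size();
    }

    /** Returns the frames sent after the position the peer reports as last received. */
    List<String> framesToRetransmit(int lastReceivedPosition) {
        if (lastReceivedPosition < 0 || lastReceivedPosition > sentFrames.size()) {
            throw new IllegalArgumentException("unknown position: " + lastReceivedPosition);
        }
        return new ArrayList<>(sentFrames.subList(lastReceivedPosition, sentFrames.size()));
    }
}
```

If four frames were sent and the peer reports position 2 on reconnect, only the third and fourth frames are resent, instead of re-establishing all state from scratch.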
77. Building applications with RSocket API
A building block that other programming models could build upon
RSocket is Programming Model Agnostic
The RSocket interface is a serviceable contract
78. RSocket Protocol
TCP WebSocket Aeron/UDP HTTP/2
Protobuf JSON Custom Binary
RPC-style Messaging
Java JavaScript C++ Go Flow
81. RPC-style (Contract Driven)
service RecordsService {
rpc records (RecordsRequest) returns (stream Record) {}
}
RecordsServiceClient recordsService =
new RecordsServiceClient(rsocket);
recordsService.records(RecordsRequest.newBuilder()
.setMaxResults(16)
.build())
.subscribe(record -> System.out.println(record));
You still need to manage this part
83. RPC-style (Contract Driven)
service RecordsService {
rpc records (RecordsRequest) returns (stream Record) {}
}
let recordServiceClient = new RecordsServiceClient(rsocket);
let req = new RecordsRequest();
req.setMaxResults(16);
recordServiceClient.records(req)
.subscribe();
94. Metadata and Data in Frames
• Each Frame has an optional metadata payload
• The metadata payload has a MIME-Type but is otherwise unstructured
• Very flexible
• Can be used to carry metadata about the data payload
• Can be used to carry metadata in order to decode the payload
• Generally means that payloads can be heterogeneous and each message
decoded uniquely
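As a small illustration of metadata that tells the receiver how to decode the data, the sketch below carries a charset name in the metadata bytes. The `Payload` class here is a hypothetical stand-in, not the RSocket `Payload` type, and using a bare charset name as metadata is an assumption made for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Illustrative only: metadata names the charset needed to decode the data. */
final class MetadataSketch {
    /** Hypothetical payload: two opaque byte arrays, as described on the slide. */
    static final class Payload {
        final byte[] metadata;
        final byte[] data;

        Payload(byte[] metadata, byte[] data) {
            this.metadata = metadata;
            this.data = data;
        }
    }

    /** Reads the charset name from the metadata, then decodes the data with it. */
    static String decode(Payload payload) {
        String charsetName = new String(payload.metadata, StandardCharsets.US_ASCII);
        return new String(payload.data, Charset.forName(charsetName));
    }
}
```

Two consecutive messages can therefore use different encodings and still be decoded correctly, since each one carries its own decoding hint.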
95. Fragmentation
• Payload frames have no maximum size
• The protocol is well suited to serving large payloads
• Still Images (Facebook), Video (Netflix)
• Both TCP MTUs and reliability on slow connections lead towards smaller
payloads
• Fragmentation provides a way to continue to reason about "logical
frames" while ensuring that individual payloads are smaller
• Applied transparently, after enforcement of RequestN semantics
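Fragmentation and reassembly can be sketched as follows. The `follows` field mirrors the idea of a "more fragments follow" marker; the class and method names are hypothetical, and the sketch ignores metadata and frame headers entirely.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: splits one logical payload into bounded fragments. */
final class FragmentationSketch {
    static final class Fragment {
        final byte[] data;
        final boolean follows; // true when more fragments of this logical frame follow

        Fragment(byte[] data, boolean follows) {
            this.data = data;
            this.follows = follows;
        }
    }

    /** Cuts the payload into pieces of at most maxFragmentSize bytes. */
    static List<Fragment> fragment(byte[] payload, int maxFragmentSize) {
        if (maxFragmentSize <= 0) {
            throw new IllegalArgumentException("maxFragmentSize must be positive");
        }
        List<Fragment> out = new ArrayList<>();
        int offset = 0;
        do {
            int end = Math.min(offset + maxFragmentSize, payload.length);
            out.add(new Fragment(Arrays.copyOfRange(payload, offset, end), end < payload.length));
            offset = end;
        } while (offset < payload.length);
        return out;
    }

    /** Joins fragments back into the original logical payload. */
    static byte[] reassemble(List<Fragment> fragments) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (Fragment f : fragments) {
            joined.write(f.data, 0, f.data.length);
        }
        return joined.toByteArray();
    }
}
```

A 10-byte payload with a 4-byte limit becomes fragments of 4, 4 and 2 bytes, with only the last one marked as final, while the application continues to see a single logical frame.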
96. Cancellation
• All interaction types support cancellation
• Cancellation is a signal by the requestor that any inflight processing
should be terminated eagerly and aggressively
• An obvious requirement for Request-Stream and Channel
• But useful even in Request-Response where the response can be
expensive to generate
• Early termination can lead to significant improvement in efficiency
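The effect of eager cancellation can be shown with a responder loop that checks a flag before producing each item. This is a single-threaded sketch with hypothetical names; in a real system the cancel signal would arrive as a frame from the requestor.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntConsumer;

/** Illustrative only: a responder that stops work once cancellation is seen. */
final class CancellationSketch {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    int produced = 0; // how many items were actually generated

    /** Signals that the requestor no longer wants results. */
    void cancel() {
        cancelled.set(true);
    }

    /** Emits up to limit items, checking for cancellation before each one. */
    void stream(int limit, IntConsumer onNext) {
        for (int i = 1; i <= limit && !cancelled.get(); i++) {
            produced++;
            onNext.accept(i);
        }
    }
}
```

If the consumer cancels on the third item of a 1000-item stream, `produced` ends at 3 and the remaining 997 items are never generated, which is the efficiency gain the last bullet refers to.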
97. Leasing
• Reactive back pressure ensures that a responder (or either party in a
Channel) cannot overwhelm the receiver
• This does not prevent a requestor from overwhelming a responder
• This commonly happens in server-to-server environments where
throughput is high
• Leasing enables responders to signal capacity to requestors
• This signal is useful for client-side load-balancing
• Without preventing server-side load-balancing
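A requestor-side view of leasing might look like the sketch below: the responder grants a number of requests valid for a time window, and the requestor checks the lease before sending. The class and method names are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```java
/** Illustrative only: tracks a lease of N requests valid for a limited time. */
final class LeaseSketch {
    private int remainingRequests = 0;
    private long expiresAtMillis = 0;

    /** Called when the responder grants capacity: requests allowed within ttlMillis. */
    void onLease(int requests, long ttlMillis, long nowMillis) {
        remainingRequests = requests;
        expiresAtMillis = nowMillis + ttlMillis;
    }

    /** Uses one permit if the lease is valid; false means "send this request elsewhere". */
    boolean tryAcquire(long nowMillis) {
        if (nowMillis >= expiresAtMillis || remainingRequests <= 0) {
            return false;
        }
        remainingRequests--;
        return true;
    }
}
```

A client-side load balancer could hold one such tracker per connection and route each request to a connection whose `tryAcquire` succeeds, leaving any server-side balancing untouched.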