This document summarizes a presentation about best practices and lessons learned for serverless applications. It discusses optimizing Lambda function performance by matching resource allocation to logic, using VPC configurations carefully to avoid impacting resilience, and separating core logic from handlers. It also provides tips on monitoring functions using X-Ray and CloudWatch, handling errors with dead letter queues, and avoiding orchestration in code by using Step Functions.
There are different layers: the Lambda compute substrate, the container, and your code. The substrate is invisible to customers, but the container and runtime are instantiated on demand for scale events, and the function is invoked for every request. Everything inside the handler is considered part of the function, and everything outside the handler is considered "initialization code" and runs along with the container initialization.
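A minimal Python sketch of this split (the names here are illustrative, not a real AWS API): module-level code runs once per container, while the handler body runs on every invocation.

```python
import time

# Module-scope code runs once, during container initialization
# (the "cold start"). Put expensive one-time setup here, such as
# creating SDK clients or loading configuration.
INIT_COUNT = 0


def _expensive_setup():
    """Stand-in for one-time work like creating a boto3 client."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"initialized_at": time.time()}


# Executed once per container, alongside runtime initialization.
CONFIG = _expensive_setup()


def handler(event, context):
    # Everything in here runs on *every* invocation of the function.
    return {
        "init_count": INIT_COUNT,   # stays at 1 across warm invocations
        "request_id": event.get("id"),
    }
```

Calling the handler repeatedly in the same container reuses `CONFIG` without re-running the setup, which is exactly why clients and connections belong outside the handler.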
When a request arrives for your function for the first time, Lambda downloads your code onto the part of the compute substrate where it will execute, starts a new container (sized based on your function's memory allocation), instantiates your runtime, and then executes your code. This container is kept alive for some window of time, depending on your traffic patterns, and if another request comes in, only the handler is executed again. You experience the first, full path as the infamous cold start, and the handler-only path as the ongoing warm start. Both are important – 99% of your requests will be warm starts.