Today
Introducing AWS Lambda Provisioned Concurrency!
But first: a primer on cold starts and concurrency
Then later: general best practices for all Lambda-based apps
Cold starts and you
A cold start is what occurs when a system needs to create a new resource in response to an event/request. For Lambda:
• Happens when new execution environments are needed to handle requests
• Typically <1% of all invokes for “production workloads”
• As measured by functions with consistent invokes over a period of time – i.e., not dev, and excluding rarely invoked functions
• Can vary from <100 ms to over 1 second
Why this variance?
Lambda API
1. Lambda directly invoked via the Invoke API (SDK clients → Lambda API → Lambda function)
• API provided by the Lambda service
• Used by all other services that invoke Lambda, across all invocation models
• Supports sync and async
• Can pass any event payload structure you want
• Client included in every SDK
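As a rough illustration of the Invoke API above, here is a minimal Node.js sketch using the SDK client (aws-sdk v2, the same client family the X-Ray slide uses); the function name and payload are hypothetical placeholders, not from this deck.

// Synchronous and asynchronous invokes through the Invoke API (sketch only)
const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

async function invokeExamples() {
  // Synchronous invoke: wait for the function's response
  const sync = await lambda.invoke({
    FunctionName: 'my-function',               // hypothetical function name
    InvocationType: 'RequestResponse',         // sync
    Payload: JSON.stringify({ orderId: 123 })  // any payload structure you want
  }).promise();
  console.log(sync.StatusCode, sync.Payload && sync.Payload.toString());

  // Asynchronous invoke: Lambda queues the event and returns immediately
  await lambda.invoke({
    FunctionName: 'my-function',
    InvocationType: 'Event',                   // async
    Payload: JSON.stringify({ orderId: 123 })
  }).promise();
}

invokeExamples().catch(console.error);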
Lambda execution model
• Synchronous (push): e.g., /order requests through Amazon API Gateway → Lambda function
• Asynchronous (event): e.g., Amazon SNS, Amazon S3 → Lambda function
• Stream (poll-based): e.g., changes on Amazon DynamoDB, Amazon Kinesis → polled by the AWS Lambda service → Lambda function
Function lifecycle – warm start
Request made to Lambda’s API → service identifies that a warm execution environment is available → invoke handler → complete invocation
This is the path that most invocations end up taking in applications with somewhat consistent traffic patterns: a small <1% of requests cause a cold start and ~99% land on warmed environments.
Function lifecycle – “full” cold start
Request made to Lambda’s API → service identifies that no warm execution environment is available → find available compute resource → download customer code → start execution environment → execute INIT → invoke handler → complete invocation
Function lifecycle
Subsequent invokes could then follow the warm path: request made to Lambda’s API → warm execution environment is available → invoke handler → complete invocation (warm start)
Other warm/cold factors
Once warm, environments do not stick around forever
• Regularly reaped every few hours to keep execution environments fresh
• Run on resources that could encounter failures
• On occasion, warm environments are ignored in order to re-balance load across Availability Zones
No affinity for repeat requests
• No concept of sticky sessions or ability to target a warmed environment
Updating your application code or changing your function configuration will flush the existing environments and cause cold starts
Anatomy of a Lambda function

// Dependencies, configuration information, common helper functions
Import sdk
Import http-lib
Import ham-sandwich
Pre-handler-secret-getter()
Pre-handler-db-connect()

// Your handler
Function myhandler(event, context) {
  if (<event handling logic>) {
    result = SubfunctionA()
  } else {
    result = SubfunctionB()
  }
  return result
}

// Common helper functions
Function Pre-handler-secret-getter() { … }
Function Pre-handler-db-connect() { … }
Function lifecycle – “full” cold start (recap)
Request made to Lambda’s API → no warm execution environment available → find available compute resource → download customer code → start execution environment → execute INIT
INIT – pre-handler code, dependencies, variables
Executes at the initial start of an execution environment (the imports and pre-handler calls above myhandler in the anatomy slide). A sketch follows this slide.
• Import only what you need
• Where possible, trim down SDKs and other libraries to the specific bits required
• Pre-handler code is great for establishing connections, but be prepared to then handle reconnections in further executions
• REMEMBER – execution environments are reused
• Lazily load variables in the global scope
• Don’t load it if you don’t need it – cold starts are affected
• Clear out used variables so you don’t run into left-over state
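A minimal Node.js sketch of the guidance above (not from the deck): a scoped import, a client created in pre-handler code and reused across warm invocations, and lazy loading of a rarely used dependency. The module names, table name, and event shape are hypothetical.

// Pre-handler (INIT) code: runs once per execution environment
const DynamoDB = require('aws-sdk/clients/dynamodb'); // import only the client you need
const docClient = new DynamoDB.DocumentClient();      // client reused while the environment is warm

let reportLib; // lazily loaded only on the code path that needs it

exports.handler = async (event) => {
  if (event.action === 'report') {
    // Lazy load: don't pay the INIT cost unless this path is actually hit
    reportLib = reportLib || require('./report'); // hypothetical local module
    return reportLib.build(event);
  }

  // The DocumentClient created during INIT is reused on every warm invoke
  return docClient.get({
    TableName: 'orders',              // hypothetical table
    Key: { id: event.orderId }
  }).promise();
};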
Anatomy of a Lambda function
Data shows that the longest cause of delay before handler execution comes from INIT/pre-handler code.
AWS X-Ray
Profile and troubleshoot serverless applications:
• Lambda instruments incoming requests for all supported languages and can capture calls made in code
• API Gateway inserts a tracing header into HTTP calls as well as reports data back to X-Ray itself

var AWSXRay = require('aws-xray-sdk-core');
var AWS = AWSXRay.captureAWS(require('aws-sdk'));
var s3Client = new AWS.S3();
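Beyond captureAWS, the same SDK can wrap your own code in a custom subsegment. A small sketch under the assumption that aws-xray-sdk-core’s captureAsyncFunc is used inside a handler with active tracing enabled; the subsegment name and the doWork stand-in are placeholders.

var AWSXRay = require('aws-xray-sdk-core');

function doWork(event) {
  // Stand-in for real business logic
  return Promise.resolve({ ok: true, received: event });
}

exports.handler = (event, context, callback) => {
  // Record a custom subsegment around a block of business logic
  AWSXRay.captureAsyncFunc('business-logic', function (subsegment) {
    doWork(event)
      .then(function (result) {
        subsegment.close();          // always close the subsegment
        callback(null, result);
      })
      .catch(function (err) {
        subsegment.close(err);
        callback(err);
      });
  });
};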
Concurrency and you
A single Lambda execution environment can only process a single event at a time
• Regardless of event source or invoke type
• A batch pulled from Amazon Kinesis Data Streams, Amazon SQS, or Amazon DynamoDB Streams counts as a single event
Concurrent requests will require new execution environments to be created
• Limited by a concurrency burst rate per account per region
Concurrency and you
Time →
Request 1: cold start, then execution
Request 2: cold start, then execution
Request 3: cold start, then execution
Request 4: cold start, then execution
Request 5: cold start, then execution
(After the first execution finishes, the first environment is now free and warm)
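A rough rule of thumb (not from this deck) for estimating how many environments that implies: concurrency ≈ average requests per second × average duration in seconds. For example, a steady 100 requests/second at a 500 ms average duration needs roughly 100 × 0.5 = 50 concurrent execution environments.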
Lambda per-function concurrency controls
Concurrency is a shared pool by default: 1,000 per account per region
Separate workloads using per-function concurrency settings
• Acts as a reservation
• Also acts as a max concurrency per function
• Especially critical for downstream resources like databases
“Kill switch” – set per-function concurrency to zero
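A minimal sketch (not from the deck) of setting such a reservation with the Node.js SDK; the function name and value are placeholders.

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

// Reserve (and cap) 100 concurrent executions for this function
lambda.putFunctionConcurrency({
  FunctionName: 'orders-api',            // hypothetical function
  ReservedConcurrentExecutions: 100
}).promise()
  .then(() => console.log('Reservation set'))
  .catch(console.error);

// “Kill switch”: setting ReservedConcurrentExecutions to 0 throttles all invokes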
Introducing AWS Lambda Provisioned Concurrency – NEW!
Pre-creates execution environments all the way up through the INIT phase
• Mostly for interactive workloads that are heavily latency sensitive
• Greatly improved consistency across the full long tail of performance
• Little to no changes to your code or the way you use Lambda
• Integrated with AWS Auto Scaling
• Adds a cost per unit of concurrency provisioned, but a lower duration cost for execution
• This could end up saving you money when heavily utilized
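A minimal sketch (not from the deck) of turning Provisioned Concurrency on for a published alias with the Node.js SDK; the function name, alias, and amount are hypothetical.

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

async function enableProvisionedConcurrency() {
  // Configured on a version or alias (not $LATEST – see “things to know” later)
  await lambda.putProvisionedConcurrencyConfig({
    FunctionName: 'orders-api',          // hypothetical function
    Qualifier: 'live',                   // hypothetical alias pointing at a version
    ProvisionedConcurrentExecutions: 50
  }).promise();

  // Check the allocation status (READY once the environments are pre-created)
  const status = await lambda.getProvisionedConcurrencyConfig({
    FunctionName: 'orders-api',
    Qualifier: 'live'
  }).promise();
  console.log(status.Status, status.AvailableProvisionedConcurrentExecutions);
}

enableProvisionedConcurrency().catch(console.error);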
Function lifecycle (recap) – on-demand “full” cold start
Request made to Lambda’s API → no warm execution environment available → find available compute resource → download customer code → start execution environment → execute INIT → invoke handler → complete invocation
Function lifecycle – provisioned concurrency start
Function configured with Provisioned Concurrency → find available compute resource → download customer code → start execution environment → execute INIT (all ahead of any request)
Function lifecycle – warm start
Request made to Lambda’s API → warm execution environment is available → invoke handler → complete invocation
This becomes the default path for all provisioned-concurrency execution environments.
Provisioned Concurrency – things to know
• Reduces the start time to your function handler to <100 ms
• Can’t be configured for $LATEST
• Use versions/aliases
• Soft limit of 500 provisioned execution environments created per minute
• No changes to function handler code performance
• Requests above the provisioned concurrency follow on-demand Lambda limits and behaviors for cold starts, bursting, and pricing
• Still limited by the overall account concurrency limit per region
Provisioned Concurrency – things to know
In order to maintain availability and help with scaling, we provision more resources than you request
• You don’t pay for these extra execution environments
• But because they execute INIT, they could consume other resources from your account (DB connections, calls to other services)
In order to keep execution environments fresh, we still reap them regularly, but will pre-create execution environments behind the scenes
• You’ll see INITs in your logs every few hours without configuration changes or other events
• This won’t impact performance
We give less CPU burst to Provisioned Concurrency than to On-Demand during INIT, so code could take longer to execute
Provisioned Concurrency – pricing
https://aws.amazon.com/lambda/pricing/ :
• Provisioned Concurrency is calculated from the time you enable it on your function until it is disabled, rounded up to the nearest 5 minutes. The price depends on the amount of memory you allocate to your function and the amount of concurrency that you configure on it. No free tier.
• Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 100 ms. The price depends on the amount of memory you allocate to your function.
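A small sketch of just the rounding rules quoted above (not from the deck); the per-GB-second and per-concurrency rates are deliberately left out since they vary by region and memory size.

// Provisioned Concurrency time is rounded up to the nearest 5 minutes
function billableProvisionedMinutes(enabledMinutes) {
  return Math.ceil(enabledMinutes / 5) * 5;
}

// Duration is rounded up to the nearest 100 ms
function billableDurationMs(executionMs) {
  return Math.ceil(executionMs / 100) * 100;
}

console.log(billableProvisionedMinutes(132)); // enabled for 132 minutes -> billed as 135
console.log(billableDurationMs(130));         // ran for 130 ms          -> billed as 200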
Tweak your function’s compute power
Lambda exposes only a memory control, with the % of a CPU core and the network capacity allocated to a function scaling proportionally with memory
Is your code CPU-, network-, or memory-bound? If so, it could be cheaper to choose more memory.
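A minimal sketch (not from the deck) of changing that single memory dial with the Node.js SDK; the function name and size are placeholders.

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

// Raise memory to 1024 MB; CPU share and network capacity scale with it
lambda.updateFunctionConfiguration({
  FunctionName: 'prime-cruncher',    // hypothetical function
  MemorySize: 1024                   // MB
}).promise()
  .then(cfg => console.log('Memory is now', cfg.MemorySize, 'MB'))
  .catch(console.error);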
Smart resource allocation
Match resource allocation (up to 3 GB!) to logic
Stats for a Lambda function that calculates, 1,000 times, all prime numbers <= 1,000,000:

  Memory     Duration        Cost
  128 MB     11.722965 s     $0.024628   (cheapest, but slowest)
  256 MB      6.678945 s     $0.028035   (most expensive)
  512 MB      3.194954 s     $0.026830
  1024 MB     1.465984 s     $0.024638   (fastest, nearly cheapest)

Going from 128 MB to 1024 MB: +$0.00001 in cost for −10.256981 s in duration
Multithreading? Maybe!
• <1.8 GB is still single core
• CPU-bound workloads won’t see gains – processes share the same resources
• >1.8 GB is multi-core
• CPU-bound workloads will see gains, but you need to multi-thread (see the sketch after this slide)
• I/O-bound workloads WILL likely see gains
• e.g., parallel calculations to return
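A minimal Node.js sketch (not from the deck) of multi-threading a CPU-bound task with the built-in worker_threads module so a >1.8 GB function can use its second vCPU; the range split, worker count, and prime-counting task are illustrative.

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

// CPU-bound helper: count primes in [start, end)
function countPrimes(start, end) {
  let count = 0;
  for (let n = Math.max(start, 2); n < end; n++) {
    let prime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { prime = false; break; }
    }
    if (prime) count++;
  }
  return count;
}

if (isMainThread) {
  // Split the range across two workers (one per vCPU)
  const ranges = [[0, 500000], [500000, 1000000]];
  Promise.all(ranges.map(range => new Promise((resolve, reject) => {
    const worker = new Worker(__filename, { workerData: range });
    worker.on('message', resolve);
    worker.on('error', reject);
  }))).then(counts => {
    console.log('Total primes:', counts.reduce((a, b) => a + b, 0));
  });
} else {
  // Worker: compute its chunk and send the result back to the main thread
  parentPort.postMessage(countPrimes(workerData[0], workerData[1]));
}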
FIN/ACK
With AWS Lambda Provisioned Concurrency, developers can build applications with consistent low latency
• Reduces cold-start-caused latency to sub-100 ms
• Can save money, but that depends on overall utilization
• Watch for INIT performance!
• It still matters, and you pay for it
• Built primarily for interactive workloads
• May not make sense for async or polling