Today
Introducing AWS Lambda Provisioned Concurrency!
But first: a primer on cold starts and concurrency
Then later: general best practices for all Lambda-based apps
Cold starts and you
A cold start is what occurs when a system needs to create a new resource in response to an event/request. For Lambda:
• Happens when new execution environments are needed to handle requests
• Typically <1% of all invokes for “production workloads”
• As measured by functions with consistent invokes over a period of time – i.e., not dev, and excluding rarely invoked functions
• Can vary from <100 ms to over 1 second
Why this variance?
Lambda API
1. Lambda directly invoked via the Invoke API (SDK clients → Lambda API → Lambda function)
• API provided by the Lambda service
• Used by all other services that invoke Lambda, across all invocation models
• Supports sync and async
• Can pass any event payload structure you want
• Client included in every SDK
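As a rough illustration of the Invoke API above, here is a minimal Node.js sketch using the SDK client (aws-sdk v2, the same client family the X-Ray slide uses); the function name and payload are hypothetical placeholders, not from this deck.

// Synchronous and asynchronous invokes through the Invoke API (sketch only)
const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

async function invokeExamples() {
  // Synchronous invoke: wait for the function's response
  const sync = await lambda.invoke({
    FunctionName: 'my-function',               // hypothetical function name
    InvocationType: 'RequestResponse',         // sync
    Payload: JSON.stringify({ orderId: 123 })  // any payload structure you want
  }).promise();
  console.log(sync.StatusCode, sync.Payload && sync.Payload.toString());

  // Asynchronous invoke: Lambda queues the event and returns immediately
  await lambda.invoke({
    FunctionName: 'my-function',
    InvocationType: 'Event',                   // async
    Payload: JSON.stringify({ orderId: 123 })
  }).promise();
}

invokeExamples().catch(console.error);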
Lambda execution model
• Synchronous (push): e.g., /order requests through Amazon API Gateway → Lambda function
• Asynchronous (event): e.g., Amazon SNS, Amazon S3 → Lambda function
• Stream (poll-based): e.g., changes on Amazon DynamoDB, Amazon Kinesis → polled by the AWS Lambda service → Lambda function
Function lifecycle – warm start
Request made to Lambda’s API → service identifies that a warm execution environment is available → invoke handler → complete invocation
This is the path that most invocations end up taking in applications with somewhat consistent traffic patterns: a small <1% of requests cause a cold start and ~99% land on warmed environments.
Function lifecycle – “full” cold start
Request made to Lambda’s API → service identifies that no warm execution environment is available → find available compute resource → download customer code → start execution environment → execute INIT → invoke handler → complete invocation
Function lifecycle
Subsequent invokes could then follow the warm path: request made to Lambda’s API → warm execution environment is available → invoke handler → complete invocation (warm start)
Other warm/cold factors
Once warm, environments do not stick around forever
• Regularly reaped every few hours to keep execution environments fresh
• Run on resources that could encounter failures
• On occasion, warm environments are ignored in order to re-balance load across Availability Zones
No affinity for repeat requests
• No concept of sticky sessions or ability to target a warmed environment
Updating your application code or changing your function configuration will flush the existing environments and cause cold starts
Anatomy of a Lambda function

// Dependencies, configuration information, common helper functions
Import sdk
Import http-lib
Import ham-sandwich
Pre-handler-secret-getter()
Pre-handler-db-connect()

// Your handler
Function myhandler(event, context) {
  if (<event handling logic>) {
    result = SubfunctionA()
  } else {
    result = SubfunctionB()
  }
  return result
}

// Common helper functions
Function Pre-handler-secret-getter() { … }
Function Pre-handler-db-connect() { … }
Function lifecycle – “full” cold start (recap)
Request made to Lambda’s API → no warm execution environment available → find available compute resource → download customer code → start execution environment → execute INIT
INIT – pre-handler code, dependencies, variables
Executes at the initial start of an execution environment (the imports and pre-handler calls above myhandler in the anatomy slide). A sketch follows this slide.
• Import only what you need
• Where possible, trim down SDKs and other libraries to the specific bits required
• Pre-handler code is great for establishing connections, but be prepared to then handle reconnections in further executions
• REMEMBER – execution environments are reused
• Lazily load variables in the global scope
• Don’t load it if you don’t need it – cold starts are affected
• Clear out used variables so you don’t run into left-over state
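A minimal Node.js sketch of the guidance above (not from the deck): a scoped import, a client created in pre-handler code and reused across warm invocations, and lazy loading of a rarely used dependency. The module names, table name, and event shape are hypothetical.

// Pre-handler (INIT) code: runs once per execution environment
const DynamoDB = require('aws-sdk/clients/dynamodb'); // import only the client you need
const docClient = new DynamoDB.DocumentClient();      // client reused while the environment is warm

let reportLib; // lazily loaded only on the code path that needs it

exports.handler = async (event) => {
  if (event.action === 'report') {
    // Lazy load: don't pay the INIT cost unless this path is actually hit
    reportLib = reportLib || require('./report'); // hypothetical local module
    return reportLib.build(event);
  }

  // The DocumentClient created during INIT is reused on every warm invoke
  return docClient.get({
    TableName: 'orders',              // hypothetical table
    Key: { id: event.orderId }
  }).promise();
};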
Anatomy of a Lambda function
Data shows that the longest cause of delay before handler execution comes from INIT/pre-handler code.
AWS X-Ray
Profile and troubleshoot serverless applications:
• Lambda instruments incoming requests for all supported languages and can capture calls made in code
• API Gateway inserts a tracing header into HTTP calls as well as reports data back to X-Ray itself

var AWSXRay = require('aws-xray-sdk-core');
var AWS = AWSXRay.captureAWS(require('aws-sdk'));
var s3Client = new AWS.S3();
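Beyond captureAWS, the same SDK can wrap your own code in a custom subsegment. A small sketch under the assumption that aws-xray-sdk-core’s captureAsyncFunc is used inside a handler with active tracing enabled; the subsegment name and the doWork stand-in are placeholders.

var AWSXRay = require('aws-xray-sdk-core');

function doWork(event) {
  // Stand-in for real business logic
  return Promise.resolve({ ok: true, received: event });
}

exports.handler = (event, context, callback) => {
  // Record a custom subsegment around a block of business logic
  AWSXRay.captureAsyncFunc('business-logic', function (subsegment) {
    doWork(event)
      .then(function (result) {
        subsegment.close();          // always close the subsegment
        callback(null, result);
      })
      .catch(function (err) {
        subsegment.close(err);
        callback(err);
      });
  });
};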
Concurrency and you
A single Lambda execution environment can only process a single event at a time
• Regardless of event source or invoke type
• A batch pulled from Amazon Kinesis Data Streams, Amazon SQS, or Amazon DynamoDB Streams counts as a single event
Concurrent requests will require new execution environments to be created
• Limited by a concurrency burst rate per account per region
Concurrency and you
Time →
Request 1: cold start, then execution
Request 2: cold start, then execution
Request 3: cold start, then execution
Request 4: cold start, then execution
Request 5: cold start, then execution
(After the first execution finishes, the first environment is now free and warm)
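A rough rule of thumb (not from this deck) for estimating how many environments that implies: concurrency ≈ average requests per second × average duration in seconds. For example, a steady 100 requests/second at a 500 ms average duration needs roughly 100 × 0.5 = 50 concurrent execution environments.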
Lambda per-function concurrency controls
Concurrency is a shared pool by default: 1,000 per account per region
Separate workloads using per-function concurrency settings
• Acts as a reservation
• Also acts as a max concurrency per function
• Especially critical for downstream resources like databases
“Kill switch” – set per-function concurrency to zero
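A minimal sketch (not from the deck) of setting such a reservation with the Node.js SDK; the function name and value are placeholders.

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

// Reserve (and cap) 100 concurrent executions for this function
lambda.putFunctionConcurrency({
  FunctionName: 'orders-api',            // hypothetical function
  ReservedConcurrentExecutions: 100
}).promise()
  .then(() => console.log('Reservation set'))
  .catch(console.error);

// “Kill switch”: setting ReservedConcurrentExecutions to 0 throttles all invokes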
Introducing AWS Lambda Provisioned Concurrency – NEW!
Pre-creates execution environments all the way up through the INIT phase
• Mostly for interactive workloads that are heavily latency sensitive
• Greatly improved consistency across the full long tail of performance
• Little to no changes to your code or the way you use Lambda
• Integrated with AWS Auto Scaling
• Adds a cost per unit of concurrency provisioned, but a lower duration cost for execution
• This could end up saving you money when heavily utilized
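A minimal sketch (not from the deck) of turning Provisioned Concurrency on for a published alias with the Node.js SDK; the function name, alias, and amount are hypothetical.

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

async function enableProvisionedConcurrency() {
  // Configured on a version or alias (not $LATEST – see “things to know” later)
  await lambda.putProvisionedConcurrencyConfig({
    FunctionName: 'orders-api',          // hypothetical function
    Qualifier: 'live',                   // hypothetical alias pointing at a version
    ProvisionedConcurrentExecutions: 50
  }).promise();

  // Check the allocation status (READY once the environments are pre-created)
  const status = await lambda.getProvisionedConcurrencyConfig({
    FunctionName: 'orders-api',
    Qualifier: 'live'
  }).promise();
  console.log(status.Status, status.AvailableProvisionedConcurrentExecutions);
}

enableProvisionedConcurrency().catch(console.error);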
Function lifecycle (recap) – on-demand “full” cold start
Request made to Lambda’s API → no warm execution environment available → find available compute resource → download customer code → start execution environment → execute INIT → invoke handler → complete invocation
Function lifecycle – provisioned concurrency start
Function configured with Provisioned Concurrency → find available compute resource → download customer code → start execution environment → execute INIT (all ahead of any request)
Function lifecycle – warm start
Request made to Lambda’s API → warm execution environment is available → invoke handler → complete invocation
This becomes the default path for all provisioned-concurrency execution environments.
Provisioned Concurrency – things to know
• Reduces the start time to your function handler to <100 ms
• Can’t be configured for $LATEST
• Use versions/aliases
• Soft limit of 500 provisioned execution environments created per minute
• No changes to function handler code performance
• Requests above the provisioned concurrency follow on-demand Lambda limits and behaviors for cold starts, bursting, and pricing
• Still limited by the overall account concurrency limit per region
Provisioned Concurrency – things to know
In order to maintain availability and help with scaling, we provision more resources than you request
• You don’t pay for these extra execution environments
• But because they execute INIT, they could consume other resources from your account (DB connections, calls to other services)
In order to keep execution environments fresh, we still reap them regularly, but will pre-create execution environments behind the scenes
• You’ll see INITs in your logs every few hours without configuration changes or other events
• This won’t impact performance
We give less CPU burst to Provisioned Concurrency than to On-Demand during INIT, so code could take longer to execute
Provisioned Concurrency – pricing
https://aws.amazon.com/lambda/pricing/ :
• Provisioned Concurrency is calculated from the time you enable it on your function until it is disabled, rounded up to the nearest 5 minutes. The price depends on the amount of memory you allocate to your function and the amount of concurrency that you configure on it. No free tier.
• Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 100 ms. The price depends on the amount of memory you allocate to your function.
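A small sketch of just the rounding rules quoted above (not from the deck); the per-GB-second and per-concurrency rates are deliberately left out since they vary by region and memory size.

// Provisioned Concurrency time is rounded up to the nearest 5 minutes
function billableProvisionedMinutes(enabledMinutes) {
  return Math.ceil(enabledMinutes / 5) * 5;
}

// Duration is rounded up to the nearest 100 ms
function billableDurationMs(executionMs) {
  return Math.ceil(executionMs / 100) * 100;
}

console.log(billableProvisionedMinutes(132)); // enabled for 132 minutes -> billed as 135
console.log(billableDurationMs(130));         // ran for 130 ms          -> billed as 200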
Tweak your function’s compute power
Lambda exposes only a memory control, with the % of a CPU core and the network capacity allocated to a function scaling proportionally with memory
Is your code CPU-, network-, or memory-bound? If so, it could be cheaper to choose more memory.
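A minimal sketch (not from the deck) of changing that single memory dial with the Node.js SDK; the function name and size are placeholders.

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

// Raise memory to 1024 MB; CPU share and network capacity scale with it
lambda.updateFunctionConfiguration({
  FunctionName: 'prime-cruncher',    // hypothetical function
  MemorySize: 1024                   // MB
}).promise()
  .then(cfg => console.log('Memory is now', cfg.MemorySize, 'MB'))
  .catch(console.error);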
Smart resource allocation
Match resource allocation (up to 3 GB!) to logic
Stats for a Lambda function that calculates, 1,000 times, all prime numbers <= 1,000,000:

  Memory     Duration        Cost
  128 MB     11.722965 s     $0.024628   (cheapest, but slowest)
  256 MB      6.678945 s     $0.028035   (most expensive)
  512 MB      3.194954 s     $0.026830
  1024 MB     1.465984 s     $0.024638   (fastest, nearly cheapest)

Going from 128 MB to 1024 MB: +$0.00001 in cost for −10.256981 s in duration
Multithreading? Maybe!
• <1.8 GB is still single core
• CPU-bound workloads won’t see gains – processes share the same resources
• >1.8 GB is multi-core
• CPU-bound workloads will see gains, but you need to multi-thread (see the sketch after this slide)
• I/O-bound workloads WILL likely see gains
• e.g., parallel calculations to return
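A minimal Node.js sketch (not from the deck) of multi-threading a CPU-bound task with the built-in worker_threads module so a >1.8 GB function can use its second vCPU; the range split, worker count, and prime-counting task are illustrative.

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

// CPU-bound helper: count primes in [start, end)
function countPrimes(start, end) {
  let count = 0;
  for (let n = Math.max(start, 2); n < end; n++) {
    let prime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { prime = false; break; }
    }
    if (prime) count++;
  }
  return count;
}

if (isMainThread) {
  // Split the range across two workers (one per vCPU)
  const ranges = [[0, 500000], [500000, 1000000]];
  Promise.all(ranges.map(range => new Promise((resolve, reject) => {
    const worker = new Worker(__filename, { workerData: range });
    worker.on('message', resolve);
    worker.on('error', reject);
  }))).then(counts => {
    console.log('Total primes:', counts.reduce((a, b) => a + b, 0));
  });
} else {
  // Worker: compute its chunk and send the result back to the main thread
  parentPort.postMessage(countPrimes(workerData[0], workerData[1]));
}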
FIN/ACK
With AWS Lambda Provisioned Concurrency, developers can build applications with consistent low latency
• Reduces cold-start-caused latency to sub-100 ms
• Can save money, but that depends on overall utilization
• Watch for INIT performance!
• It still matters, and you pay for it
• Built primarily for interactive workloads
• May not make sense for async or polling