Amazon S3 Boston 2025-05-07

Boston Lakehouse Meetup
2025-05-07
Cambridge Massachusetts

sullis

May 07, 2025

Transcript

  1. AWS re:Invent 2024

    Automatic generation of metadata that is captured when S3 objects are added or modified, stored in fully managed Apache Iceberg tables
  2. Amazon S3

    S3 is an object storage service with an HTTP REST API. https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html
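Because the service is exposed over HTTP, each object can be addressed by a URL. The following is a minimal sketch of building a virtual-hosted-style object URL. The bucket, region, and key values are hypothetical, and the sketch only builds the string: it sends no request and does not produce the signed authorization header that a private bucket would require.

```python
from urllib.parse import quote


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL for an S3 object.

    Only the URL is constructed; no request is sent and nothing is signed.
    """
    # Percent-encode the key but keep "/" so its path-like structure survives.
    encoded_key = quote(key, safe="/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}"


if __name__ == "__main__":
    # Hypothetical bucket, region, and key, for illustration only.
    print(object_url("example-bucket", "us-east-1", "events/2025/data file.parquet"))
```

A GET on such a URL reads the object and a PUT writes it, which is the sense in which S3 is "object storage with an HTTP REST API".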
  3. Amazon S3

    “There is a frontend fleet with a REST API, a namespace service, a storage fleet that’s full of hard disks, and a fleet that does background operations.” https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html
  4. S3 core concepts

    An Amazon S3 object represents a file or collection of data. Every object must reside within a bucket.
  5. S3 bucket names

    An Amazon S3 bucket name is globally unique. The namespace is shared by all AWS accounts.
  6. S3 pricing

    https://aws.amazon.com/s3/pricing/ “You pay for storing objects in your S3 buckets. The rate you’re charged depends on your objects' size, how long you stored the objects during the month, and the storage class”
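The three factors in the quote (object size, time stored during the month, storage class) can be combined into a rough estimate. This is only a sketch of the arithmetic: the class labels and per-GB-month rates below are hypothetical placeholders, not AWS prices, and request and data-transfer charges are ignored.

```python
# Hypothetical per-GB-month rates, for illustration only.
# Real rates are listed on the pricing page and vary by region and class.
HYPOTHETICAL_RATE_PER_GB_MONTH = {
    "standard": 0.020,
    "archive": 0.004,
}


def storage_cost(size_gb: float, hours_stored: float,
                 hours_in_month: float, storage_class: str) -> float:
    """Estimate a storage charge: size x fraction of month stored x class rate."""
    fraction_of_month = hours_stored / hours_in_month
    rate = HYPOTHETICAL_RATE_PER_GB_MONTH[storage_class]
    return size_gb * fraction_of_month * rate


if __name__ == "__main__":
    # 500 GB kept for half of a 720-hour month in the hypothetical "standard" class.
    print(f"{storage_cost(500, 360, 720, 'standard'):.2f}")  # prints 5.00
```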
  7. S3 storage classes

    https://aws.amazon.com/s3/storage-classes/ “Amazon S3 offers a range of storage classes that you can choose from based on the performance, data access, resiliency, and cost requirements of your workloads.”
  8. Creating an S3 bucket

    AWS Console UI, AWS CLI, AWS SDK, Infrastructure as Code (CloudFormation, AWS CDK, Terraform, Pulumi), Other
  9. S3 performance

    https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html “your application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned Amazon S3 prefix”
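Since the quoted figures apply per prefix, a rough baseline for a bucket scales with the number of prefixes the workload is spread across. A small sketch of that arithmetic, assuming load is distributed evenly and treating the quoted "at least" numbers as a floor rather than a guarantee for any particular workload:

```python
# Per-prefix baselines taken from the quoted guidance.
WRITES_PER_SECOND_PER_PREFIX = 3_500  # PUT/COPY/POST/DELETE
READS_PER_SECOND_PER_PREFIX = 5_500   # GET/HEAD


def aggregate_rates(prefix_count: int) -> tuple[int, int]:
    """Return (writes, reads) per second across prefix_count evenly loaded prefixes."""
    if prefix_count < 0:
        raise ValueError("prefix_count must be non-negative")
    return (prefix_count * WRITES_PER_SECOND_PER_PREFIX,
            prefix_count * READS_PER_SECOND_PER_PREFIX)


if __name__ == "__main__":
    writes, reads = aggregate_rates(10)
    print(writes, reads)  # prints 35000 55000
```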
  10. S3 performance

    https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html “There are no limits to the number of prefixes in a bucket. You can increase your read or write performance by using parallelization”
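One way to picture the parallelization mentioned in the quote is to issue reads for keys under several prefixes concurrently. The sketch below uses a thread pool from the Python standard library. `fake_fetch` is a hypothetical stand-in for a real S3 GET, and the `shard-N/` key layout is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fetch_all(keys: list[str], fetch: Callable[[str], bytes],
              max_workers: int = 8) -> dict[str, bytes]:
    """Fetch every key concurrently and return a key -> body mapping."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so bodies line up with keys.
        bodies = list(pool.map(fetch, keys))
    return dict(zip(keys, bodies))


def fake_fetch(key: str) -> bytes:
    """Hypothetical stand-in for a real S3 GET; returns the key as bytes."""
    return key.encode("utf-8")


if __name__ == "__main__":
    # Keys spread over four hypothetical prefixes.
    keys = [f"shard-{i % 4}/object-{i}" for i in range(8)]
    results = fetch_all(keys, fake_fetch)
    print(len(results))  # prints 8
```

In a real client, `fetch` would wrap an SDK or HTTP call. Threads suit this case because the work is dominated by network waits.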
  11. S3 performance

    https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html “While Amazon S3 is scaling to your new higher request rate, you may see some 503 (Slow Down) errors. These errors will dissipate when the scaling is complete.”
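A common client-side response to transient 503 (Slow Down) errors is to retry with exponential backoff and jitter. The sketch below is a generic illustration and is not taken from the deck: `SlowDownError` is a hypothetical exception standing in for however a given client surfaces the 503, and the delay values are arbitrary defaults. AWS SDKs generally include their own retry handling, so hand-rolled logic like this is mainly relevant to custom HTTP clients.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SlowDownError(Exception):
    """Hypothetical exception representing an HTTP 503 (Slow Down) response."""


def call_with_backoff(operation: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 0.1,
                      max_delay: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, retrying on SlowDownError with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except SlowDownError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Full jitter: wait a random time up to the capped exponential delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the retry logic can be exercised without real waiting.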
  12. S3

  13. I was just hoping you might give me some insight

    into the evolution of the Apache Iceberg table specification