AWS Simple Storage Service – S3 Overview
- High availability and scalability.
- Amazon S3 is a simple key-value object store.
- S3 provides unlimited storage space; an individual object can be from 0 bytes to 5 TB in size.
- S3 is object storage (not block-level storage), so it cannot be used to install an operating system or to host dynamic websites.
- S3 resources are private by default
- Each account has a default service limit of 100 buckets.
- Each object has user-defined metadata and system metadata. Some system metadata is controlled by the user, e.g. storage class and server-side encryption.
S3 Bucket and Objects
- S3 buckets are containers for storing objects.
- An S3 bucket is owned by the AWS account that creates it, and bucket ownership is not transferable.
- S3 bucket names are globally unique, regardless of the AWS region in which you create the bucket
- Even though S3 is a global service, buckets are created within a region specified during the creation of the bucket.
- There is no limit to the number of objects that can be stored in a bucket.
- Each AWS account has a soft limit of 100 buckets; a request must be made to AWS to increase it.
- Bucket names should be globally unique and DNS compliant
- Buckets cannot be nested; a bucket cannot contain another bucket.
- An object is uniquely identified within a bucket by its key name and version ID.
- An object consists of the following:
- Key: the object's name.
- Value: the object's data.
- Metadata: data about the data, as a set of name-value pairs that describe the object.
- Version ID: the version identifier of the object; in combination with the key, it uniquely identifies an object within a bucket.
- Subresources: provide additional information for an object.
- Access control information: controls access to the object.
- Object metadata cannot be modified after the object is uploaded; it can only be changed by performing a copy operation and setting the new metadata.
- Objects belonging to a bucket reside in a specific AWS region and never leave that region unless explicitly copied, for example with Cross-Region Replication.
- An object can be retrieved as a whole or partially.
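Partial retrieval is requested with the HTTP Range header. As a minimal sketch, the helpers below build a Range header and split an object of known size into inclusive byte ranges. The function names `range_header` and `split_into_ranges` are illustrative assumptions and are not part of any AWS SDK.

```python
from typing import Dict, List, Optional, Tuple


def range_header(start: int, end: Optional[int] = None) -> Dict[str, str]:
    """Build an HTTP Range header for bytes start..end (both inclusive).

    If end is omitted, the range runs to the end of the object.
    """
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    value = f"bytes={start}-" if end is None else f"bytes={start}-{end}"
    return {"Range": value}


def split_into_ranges(size: int, chunk: int) -> List[Tuple[int, int]]:
    """Split an object of `size` bytes into inclusive (start, end) ranges."""
    if size < 0 or chunk <= 0:
        raise ValueError("size must be >= 0 and chunk must be > 0")
    return [(s, min(s + chunk, size) - 1) for s in range(0, size, chunk)]
```

Each tuple from `split_into_ranges` can be passed to `range_header` to fetch one slice of the object per request.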
Operations on Object
- Objects of up to 5 GB can be uploaded in a single PUT operation.
- Multipart upload must be used for objects larger than 5 GB, up to the maximum object size of 5 TB, and is recommended once the object size reaches 100 MB.
- After a file is uploaded successfully, S3 returns an HTTP 200 status code.
- S3 allows listing of all the keys within a bucket
- A single listing request returns a maximum of 1,000 object keys.
- Keys within a bucket can be listed using Prefix and Delimiter.
- An object can be retrieved as a whole.
- An object can be retrieved in parts by using the Range HTTP header.
- Objects can also be downloaded by sharing pre-signed URLs.
- Metadata of the object is returned in the response headers
- An object of up to 5 GB can be copied in a single operation; for larger objects, up to 5 TB, the multipart upload API has to be used for the copy.
- User-defined metadata and user-controlled system metadata (storage class and server-side encryption) are copied as well.
- A copy operation can be used to:
- Create additional copies of an object.
- Rename an object by copying it and deleting the original.
- Move an object from one region to another.
- Change object metadata.
- S3 allows deleting a single object, or multiple objects (up to 1,000) in a single HTTP request.
- Delete non-versioned object – provide the object key, and the object is deleted permanently.
- Delete versioned object – if only the key is provided, the object is marked as deleted with a delete marker, and the marker's version ID is returned in the response.
If the key and version ID are provided, that specific version of the object is deleted permanently.
If the version ID maps to the delete marker of that object, S3 deletes the delete marker and the object reappears in the bucket.
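The delete behaviour for versioned objects can be modelled with a small in-memory sketch. `VersionedBucket` and its methods are hypothetical names used only for illustration; the model covers version stacking and delete markers and nothing else, and it is not how S3 is implemented.

```python
import itertools
from typing import Dict, List, Optional, Tuple


class VersionedBucket:
    """Toy model of deletes in a versioning-enabled bucket (illustrative only)."""

    def __init__(self) -> None:
        # key -> list of (version_id, data); data None represents a delete marker
        self._versions: Dict[str, List[Tuple[str, Optional[bytes]]]] = {}
        self._ids = itertools.count(1)

    def _new_id(self) -> str:
        return f"v{next(self._ids)}"

    def put(self, key: str, data: bytes) -> str:
        """Store a new version and return its version ID."""
        version_id = self._new_id()
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def get(self, key: str) -> Optional[bytes]:
        """Return the current version, or None if absent or hidden by a delete marker."""
        versions = self._versions.get(key)
        if not versions:
            return None
        return versions[-1][1]

    def delete(self, key: str, version_id: Optional[str] = None) -> Optional[str]:
        if version_id is None:
            # Key only: add a delete marker and return the marker's version ID.
            marker_id = self._new_id()
            self._versions.setdefault(key, []).append((marker_id, None))
            return marker_id
        # Key and version ID: permanently remove that version (or delete marker).
        remaining = [v for v in self._versions.get(key, []) if v[0] != version_id]
        self._versions[key] = remaining
        return None
```

In this model, deleting by key hides the object, and deleting the returned marker by its version ID makes the previous version visible again.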
- An archived object must be restored before it can be accessed.
- While the restored copy is available, storage is charged for both the archive and the copy.
- Multipart upload allows the user to upload a single object as a set of parts.
- Multipart upload supports 1 to 10,000 parts; each part can be from 5 MB to 5 GB, and the last part may be smaller than 5 MB.
- Multipart upload allows a maximum object size of 5 TB.
- Object parts can be uploaded independently and in any order. If a part upload fails, the part can be re-uploaded without affecting the other parts.
- After all parts of the object are uploaded and the completion request is sent, S3 assembles the parts and creates the object.
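The multipart limits above (1 to 10,000 parts, 5 MB to 5 GB per part, 5 TB per object) imply a simple part-size calculation. The following is a minimal sketch, assuming binary units (1 MB = 1024 * 1024 bytes) and a default preferred part size of 100 MB chosen only for illustration; `plan_parts` is a hypothetical helper, not an SDK function.

```python
from typing import Tuple

MB = 1024 * 1024  # assumption: binary units
GB = 1024 * MB
TB = 1024 * GB

MIN_PART_SIZE = 5 * MB
MAX_PART_SIZE = 5 * GB
MAX_PARTS = 10_000
MAX_OBJECT_SIZE = 5 * TB


def plan_parts(object_size: int, preferred_part_size: int = 100 * MB) -> Tuple[int, int, int]:
    """Return (part_size, part_count, last_part_size) for a multipart upload."""
    if not 0 < object_size <= MAX_OBJECT_SIZE:
        raise ValueError("object size must be between 1 byte and 5 TB")
    # Smallest part size that keeps the upload within 10,000 parts (ceiling division).
    min_for_count = (object_size + MAX_PARTS - 1) // MAX_PARTS
    part_size = max(preferred_part_size, MIN_PART_SIZE, min_for_count)
    if part_size > MAX_PART_SIZE:
        raise ValueError("part size exceeds 5 GB")
    part_count = (object_size + part_size - 1) // part_size
    # Only the last part may be smaller than the chosen part size.
    last_part_size = object_size - (part_count - 1) * part_size
    return part_size, part_count, last_part_size
```

For example, `plan_parts(250 * MB)` yields three parts: two of 100 MB and a final part of 50 MB.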
Advantages of multipart upload
- Improved throughput
- Quick recovery from any network issues
- Pause and resume object uploads
- Begin an upload before the final object size is known
Steps of multipart upload
- Initiation – the initiation request to S3 returns a unique upload ID for the multipart upload.
- This ID must be provided with each part upload.
- Parts upload
- A part number must be specified with each part upload request, along with the upload ID.
- Upload completion or abort
- S3 creates the object by concatenating the parts in ascending order of part number and associates the metadata with the object.
- The multipart upload completion request must include the upload ID.
- S3 must receive a completion or abort request for a multipart upload; otherwise it does not delete the uploaded parts, and their storage continues to be charged.
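The steps above can be sketched as a toy in-memory model. `MultipartStore` and its method names are hypothetical and exist only to illustrate the flow of initiate, upload parts, then complete or abort; the real completion request carries more information than this model does.

```python
import uuid
from typing import Dict


class MultipartStore:
    """Toy model of the multipart upload lifecycle (illustrative only)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self._uploads: Dict[str, dict] = {}

    def initiate(self, key: str) -> str:
        """Start an upload and return its unique upload ID."""
        upload_id = uuid.uuid4().hex
        self._uploads[upload_id] = {"key": key, "parts": {}}
        return upload_id

    def upload_part(self, upload_id: str, part_number: int, data: bytes) -> None:
        """Store one part; re-uploading a part number replaces only that part."""
        if not 1 <= part_number <= 10_000:
            raise ValueError("part number must be between 1 and 10000")
        self._uploads[upload_id]["parts"][part_number] = data

    def complete(self, upload_id: str) -> str:
        """Concatenate the parts in ascending part-number order into the object."""
        upload = self._uploads.pop(upload_id)
        parts = upload["parts"]
        self.objects[upload["key"]] = b"".join(parts[n] for n in sorted(parts))
        return upload["key"]

    def abort(self, upload_id: str) -> None:
        """Discard the upload and its stored parts."""
        self._uploads.pop(upload_id)

    def pending_bytes(self) -> int:
        """Bytes held by uploads that were neither completed nor aborted."""
        return sum(len(d) for u in self._uploads.values() for d in u["parts"].values())
```

`pending_bytes` stands in for the storage that keeps being charged until a completion or abort request arrives.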
- All buckets and objects are private by default.
- Pre-signed URLs allow a user to download or upload a specific object without requiring AWS security credentials or permissions.
- Pre-signed URLs are valid only until their expiration date and time.
Virtual Hosted Style vs Path-Style Request
- Path-style: the bucket name is not part of the domain.
- The URL will be http://s3-eu-west-1.amazonaws.com/mybucket/abc.mov
- Virtual hosted-style: the bucket name is part of the domain name in the URL.
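The difference between the two styles can be shown with two small URL builders. This is a sketch that assumes the `s3-<region>.amazonaws.com` host pattern taken from the example URL above; current regional endpoints are typically written with a dot (`s3.<region>.amazonaws.com`), and both function names are hypothetical.

```python
from urllib.parse import quote


def path_style_url(bucket: str, key: str, region: str, scheme: str = "http") -> str:
    """Bucket name appears in the path, after the regional S3 host."""
    return f"{scheme}://s3-{region}.amazonaws.com/{bucket}/{quote(key, safe='/')}"


def virtual_hosted_url(bucket: str, key: str, region: str, scheme: str = "http") -> str:
    """Bucket name appears as the first label of the host name."""
    return f"{scheme}://{bucket}.s3-{region}.amazonaws.com/{quote(key, safe='/')}"
```

With bucket `mybucket`, key `abc.mov` and region `eu-west-1`, `path_style_url` reproduces the example URL above, while `virtual_hosted_url` moves `mybucket` into the host name.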
- Amazon S3 costs vary by Region
- Charges in S3 are as follows:
- Storage – cost is per GB/month
- Requests – per request cost varies depending on the request type GET, PUT
- Data Transfer – data transfer in is free, and data transfer out is charged per GB
S3 Restriction and Limitation
- A maximum of 100 buckets can exist in an account; to get more, you need to contact AWS.
- A bucket name must be unique across all regions and all AWS accounts.
- The name must be at least 3 and no more than 63 characters long.
- The name must start with a number or letter and may contain only lowercase letters, numbers, periods and hyphens.
- The name cannot be formatted as an IP address, and periods and hyphens cannot follow each other.
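The naming rules above translate into a short validation check. This is a minimal sketch, not an official validator: `is_valid_bucket_name` is a hypothetical helper, it also requires the name to end with a letter or number (an AWS rule not listed above), and it interprets "cannot follow each other" as rejecting `..`, `.-` and `-.`, which is an assumption.

```python
import re

# Starts and ends with a lowercase letter or digit; the middle may also
# contain periods and hyphens.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]*[a-z0-9]")
# Four dot-separated groups of digits, i.e. formatted like an IPv4 address.
_IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a bucket name against the rules listed in this section."""
    if not 3 <= len(name) <= 63:
        return False
    if not _NAME_RE.fullmatch(name):
        return False
    # Assumption: adjacent periods, and a period next to a hyphen, are rejected.
    if ".." in name or ".-" in name or "-." in name:
        return False
    if _IP_RE.fullmatch(name):
        return False
    return True
```

Global uniqueness cannot be checked locally; it is only known once S3 accepts or rejects the bucket creation request.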