S3 stands for Simple Storage Service.
It is Amazon’s object-based storage service for uploading data files. It is not suited to installing an operating system, for instance, which needs block storage. It suits apps like Dropbox, which allow users to store all types of files in a cloud-based filesystem.
Total storage capacity is effectively unlimited, while each individual file can range in size from 0 bytes to 5 terabytes. Amazon expands the underlying capacity on the back end as needed.
When you upload a file to S3, it is saved in a bucket. A bucket is similar to a top-level folder: it houses all of your subfolders and files. Because all S3 buckets share one worldwide namespace, each bucket you create must have a name that no one else in the world is using. This is required because the bucket name forms part of your S3 URL, which must be unique.
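URL format: https://s3-<region>.amazonaws.com/<bucketname>
e.g.: https://s3-eu-west-2.amazonaws.com/mybucketname
This is the older path-style form. AWS now recommends the virtual-hosted style, https://<bucketname>.s3.<region>.amazonaws.com, in which the bucket name is part of the hostname.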
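The URL format above can be assembled with a couple of small helpers. This is only a sketch: the function names are hypothetical, and it builds strings without checking that the bucket or region actually exists.

```python
from urllib.parse import quote


def path_style_url(region: str, bucket: str) -> str:
    """Build a bucket URL in the s3-<region> format shown above."""
    if not region or not bucket:
        raise ValueError("region and bucket must be non-empty")
    return f"https://s3-{region}.amazonaws.com/{bucket}"


def object_url(region: str, bucket: str, key: str) -> str:
    """Append an object key to the bucket URL.

    Characters that are unsafe in a URL path (such as spaces) are
    percent-encoded; slashes in the key are kept as path separators.
    """
    return f"{path_style_url(region, bucket)}/{quote(key, safe='/')}"


print(path_style_url("eu-west-2", "mybucketname"))
print(object_url("eu-west-2", "mybucketname", "photos/my cat.jpg"))
```

Because the bucket name appears directly in the URL, two buckets with the same name would produce the same address, which is why names have to be globally unique.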
Reading and Writing Data to and from S3
When you upload a new file to S3, it has read-after-write consistency.
This means that once your file has been successfully written to S3, it is immediately readable from anywhere in the world.
Updates and deletes were originally only eventually consistent, so a change could take some time to become visible everywhere. Since December 2020, S3 has provided strong read-after-write consistency for overwrites and deletes as well.
You may still see stale data when a copy of the file is cached outside S3. For example, if your S3 file is cached at an edge site, it may take some time for the update or deletion to propagate through to all of these places.
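The way a cached copy can lag behind an update can be illustrated with a toy model. This is not S3 or CDN code: the `Origin` and `EdgeCache` classes, the 60-second TTL, and the explicit `now` clock are all assumptions made for the illustration.

```python
class Origin:
    """Stands in for the bucket: always holds the latest data."""

    def __init__(self):
        self.objects = {}

    def put(self, key, value):
        self.objects[key] = value

    def get(self, key):
        return self.objects[key]


class EdgeCache:
    """Stands in for an edge site that keeps copies for `ttl` seconds."""

    def __init__(self, origin, ttl):
        self.origin = origin
        self.ttl = ttl
        self._cached = {}  # key -> (value, time the copy was fetched)

    def get(self, key, now):
        entry = self._cached.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]  # copy is still within its TTL: serve it
        value = self.origin.get(key)  # missing or expired: refetch
        self._cached[key] = (value, now)
        return value


origin = Origin()
edge = EdgeCache(origin, ttl=60)

origin.put("report.txt", "v1")
first = edge.get("report.txt", now=0)    # fetched from the origin and cached
origin.put("report.txt", "v2")           # the file is overwritten
stale = edge.get("report.txt", now=30)   # the edge still serves the old copy
fresh = edge.get("report.txt", now=61)   # TTL has expired, new copy fetched
print(first, stale, fresh)
```

A reader going through the edge sees the old contents until the cached copy expires, even though the origin was updated straight away.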
Data stored in S3 can be held in one of several storage classes. The four covered here are:
- S3 Standard: the default class, for regularly accessed data that needs to be retrieved quickly. It is designed for very high durability and availability.
- S3 Standard-Infrequent Access (S3 Standard-IA): as the name implies, this is for files that are accessed infrequently but that must still be quickly available when you need them.
- S3 Reduced Redundancy Storage (RRS): for data that can be regenerated (so it isn’t a big deal if it’s lost), such as thumbnails or auto-generated documents. AWS now recommends S3 Standard in its place.
- Glacier: for archiving data. You can store your data here for a very low price; however, retrieving it can take 3 to 5 hours.
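As a rough sketch, the choice between these four classes could be written as a small decision function. The return values are the storage class identifiers used by the S3 API; the function name, its parameters, and the "less than once a month" threshold are assumptions chosen for illustration, not AWS guidance.

```python
def suggest_storage_class(reads_per_month: float,
                          reproducible: bool = False,
                          archive: bool = False) -> str:
    """Pick one of the four classes described above.

    reads_per_month: rough estimate of how often the file is read.
    reproducible:    True if the file can be regenerated when lost.
    archive:         True if waiting hours for retrieval is acceptable.
    """
    if archive:
        return "GLACIER"             # cheapest, slow to retrieve
    if reproducible:
        return "REDUCED_REDUNDANCY"  # loss is tolerable
    if reads_per_month < 1:          # hypothetical threshold
        return "STANDARD_IA"         # rarely read, still needed quickly
    return "STANDARD"                # regularly accessed data


print(suggest_storage_class(reads_per_month=200))
print(suggest_storage_class(reads_per_month=0.1))
print(suggest_storage_class(reads_per_month=50, reproducible=True))
print(suggest_storage_class(reads_per_month=0, archive=True))
```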
Files are stored with the following attributes:
- Key: the name of the file, including any folder-style prefix
- Value: the file data
- Version ID: if versioning is turned on, this identifies each version of the file. Versioning keeps every version of an object (even after you delete it), so it is useful for backups, and it can be combined with MFA Delete so that permanently removing a version requires multi-factor authentication.
- Metadata: data about the file, such as its last-modified time and content type
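These attributes, and the way versioning keeps old data around after a delete, can be modelled with a small in-memory sketch. It is not how S3 is implemented: the class names are hypothetical, and the sequential IDs ("v1", "v2", ...) stand in for the opaque version IDs that S3 generates.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ObjectVersion:
    key: str
    value: Optional[bytes]  # None means this entry is a delete marker
    version_id: str
    metadata: dict = field(default_factory=dict)


class VersionedBucket:
    """Toy bucket with versioning switched on."""

    def __init__(self):
        self._versions = {}  # key -> list of ObjectVersion, oldest first
        self._ids = itertools.count(1)

    def _append(self, key, value, metadata):
        version = ObjectVersion(key, value, f"v{next(self._ids)}", metadata)
        self._versions.setdefault(key, []).append(version)
        return version.version_id

    def put(self, key, value, metadata=None):
        """Store a new version and return its version ID."""
        return self._append(key, value, dict(metadata or {}))

    def delete(self, key):
        """Add a delete marker; earlier versions are kept."""
        return self._append(key, None, {})

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one if requested."""
        versions = self._versions.get(key, [])
        if version_id is None:
            wanted = versions[-1] if versions else None
        else:
            wanted = next(
                (v for v in versions if v.version_id == version_id), None)
        if wanted is None or wanted.value is None:
            raise KeyError(key)  # missing, or hidden by a delete marker
        return wanted.value


bucket = VersionedBucket()
v1 = bucket.put("notes.txt", b"first draft", {"content-type": "text/plain"})
v2 = bucket.put("notes.txt", b"second draft")
bucket.delete("notes.txt")
recovered = bucket.get("notes.txt", version_id=v1)
print(recovered)
```

After the delete, a plain `get` fails because the latest entry is a delete marker, yet the first draft can still be recovered by asking for its version ID.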
Data Lifecycle Management
As your data grows older in S3, you can configure it to move through the storage classes to save money. So, for example, old data that is rarely read can be moved to Glacier.
Lifecycle management can be applied to both current and previous versions of files. You can migrate files into S3 IA once they are at least 30 days old, and then into Glacier after another 30 days in IA. These periods are all user-configurable; however, each must be at least 30 days long. That is, a file following this path must spend at least 60 days in S3 + IA before being moved to Glacier.
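The timings above can be sketched as a rule builder plus a helper that reports where a file would sit at a given age. The function names, the rule ID, and the validation are assumptions for illustration; the 30-day minimums are taken from the description above rather than checked against AWS.

```python
MIN_DAYS = 30  # minimum period stated in the text above


def build_lifecycle_rule(ia_after_days=30, glacier_after_days=60, prefix=""):
    """Return a rule moving files to IA and then Glacier.

    Both day counts are measured from when the file was created.
    """
    if ia_after_days < MIN_DAYS:
        raise ValueError("files must spend at least 30 days in S3 first")
    if glacier_after_days - ia_after_days < MIN_DAYS:
        raise ValueError("files must spend at least 30 days in IA first")
    return {
        "ID": "age-out-old-data",  # hypothetical rule name
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }


def storage_class_at(age_days, rule):
    """Storage class a file of the given age would be in under the rule."""
    current = "STANDARD"
    for step in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= step["Days"]:
            current = step["StorageClass"]
    return current


rule = build_lifecycle_rule()
for age in (10, 45, 60):
    print(age, storage_class_at(age, rule))
```

The dictionary follows the shape of a lifecycle rule in the S3 API, so it could in principle be sent as `{"Rules": [rule]}` through boto3's `put_bucket_lifecycle_configuration`, though it is worth checking it against the current documentation before relying on it.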