What is BLOB Storage?
BLOB stands for Binary Large OBject, and BLOB storage has a wide range of use cases. A binary object can be an image, an audio file, a video or literally anything else, much like the files stored on your PC or laptop file system. Imagine that your file system had an HTTP API in front of it that lets you upload, download and delete files: that is essentially a BLOB store. Popular BLOB stores are AWS S3, Azure BLOB, MinIO and Ceph.
Why use a BLOB store?
Content Delivery Network
A CDN is used to serve content from locations close to clients. The backend (origin) for the CDN is usually a BLOB store that contains the assets we want to serve. CDNs “warm up” and cache those assets. Say we have images in our BLOB store and we point our CDN at it. The images are then cached at the CDN's POPs (Points of Presence), so when a client requests an image from anywhere in the world it is served from a nearby POP with low latency. Popular CDNs are Fastly, Amazon CloudFront, Azure CDN, Akamai and Cloudflare.
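To make the caching behaviour concrete, here is a minimal sketch of a single POP sitting in front of a BLOB store. The class name `EdgePop`, the `origin_blobs` dictionary and the object key are all made up for illustration. Real CDNs add expiry, invalidation and much more.

```python
from typing import Callable, Dict


class EdgePop:
    """A toy CDN point of presence that caches objects fetched from an origin."""

    def __init__(self, name: str, fetch_from_origin: Callable[[str], bytes]) -> None:
        self.name = name
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_fetches = 0  # how many times we had to go back to the BLOB store

    def get(self, key: str) -> bytes:
        if key not in self._cache:
            # Cache miss: fetch from the BLOB store (origin) and keep a local copy.
            self._cache[key] = self._fetch_from_origin(key)
            self.origin_fetches += 1
        # Cache hit (or freshly filled): serve from the POP itself.
        return self._cache[key]


# Hypothetical origin: a dict standing in for the BLOB store.
origin_blobs = {"images/logo.png": b"\x89PNG..."}
pop = EdgePop("london", origin_blobs.__getitem__)

first = pop.get("images/logo.png")   # miss, fetched from the origin
second = pop.get("images/logo.png")  # hit, served from the POP's cache
```

Only the first request reaches the BLOB store. Every later request for the same key is answered by the POP, which is where the low latency comes from.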
Data Lakes
We are in the age of Big Data, where we gather large amounts of data on application usage, metrics and logs. The data is usually large: multiple gigabytes or more in a 24-hour period. Normally an ingestion service processes the data and stores it in a format (such as Parquet or FASTQ) that can later be analysed for analytics or data science purposes. BLOB stores used this way are usually referred to as Data Lakes, and they can support Hadoop-based workloads.
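One common way an ingestion service lays out a data lake is to encode the date in the object key, so analytics jobs can read only the days they need. The sketch below uses the Hive-style `key=value` path convention. The function name, dataset name and file name are assumptions made for this example.

```python
from datetime import datetime, timezone


def partitioned_key(dataset: str, event_time: datetime, filename: str) -> str:
    """Build a date-partitioned object key for a data lake.

    Pass a timezone-aware datetime; it is normalised to UTC so that
    every writer agrees on which day a record belongs to.
    """
    utc = event_time.astimezone(timezone.utc)
    return f"{dataset}/year={utc:%Y}/month={utc:%m}/day={utc:%d}/{filename}"


# Hypothetical dataset and file name.
example_key = partitioned_key(
    "app-logs",
    datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    "part-0000.parquet",
)
# example_key == "app-logs/year=2024/month=01/day=15/part-0000.parquet"
```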
Data Sharing
BLOB stores can be used to share data with, and receive data from, third parties. Say you have a business partner who wants to send you files. Restricted credentials can be shared with the partner so that they can upload data securely.
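What "restricted" can mean in practice is a credential that only allows uploads, only under one key prefix, and only for a limited time. The sketch below illustrates that idea with a simple HMAC-signed grant. It is a made-up scheme for illustration, not the signing format of S3, Azure or any other real store, and `SECRET_KEY`, `issue_upload_grant` and `upload_allowed` are hypothetical names.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"demo-secret-do-not-use"  # hypothetical; known only to the store owner


def _signature(prefix: str, expires_at: int) -> str:
    # The signature covers the operation, the allowed prefix and the expiry.
    message = f"PUT\n{prefix}\n{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_upload_grant(prefix: str, ttl_seconds: int, now: Optional[float] = None) -> dict:
    """Create a grant that lets the holder upload under `prefix` until it expires."""
    issued = time.time() if now is None else now
    expires_at = int(issued) + ttl_seconds
    return {"prefix": prefix, "expires_at": expires_at,
            "signature": _signature(prefix, expires_at)}


def upload_allowed(grant: dict, key: str, now: Optional[float] = None) -> bool:
    """Check the grant on the store side before accepting an upload to `key`."""
    current = time.time() if now is None else now
    expected = _signature(grant["prefix"], grant["expires_at"])
    return (
        hmac.compare_digest(expected, grant["signature"])  # not tampered with
        and current < grant["expires_at"]                   # not expired
        and key.startswith(grant["prefix"])                 # inside the partner's area
    )


# Hypothetical partner prefix; `now` is fixed so the example is repeatable.
grant = issue_upload_grant("partner-a/incoming/", ttl_seconds=3600, now=1_700_000_000)
```

The partner never sees the secret key, so they cannot widen the prefix or extend the expiry without invalidating the signature.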
Application Storage Backends
Because BLOB storage is essentially like a file system, many applications can use it as their storage backend. For example, a Docker registry can store Docker images in S3 or Azure BLOB. Artefact stores such as Nexus and Artifactory can use BLOB storage to store binaries for deployments. BLOB stores that support versioning can even act as storage backends for version control systems. Another popular use case is storing Terraform state files.
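The reason so many applications can plug into a BLOB store is that the interface they need is tiny. Here is a rough sketch of what such a backend interface might look like, with an in-memory implementation standing in for S3 or Azure BLOB. `BlobBackend`, `InMemoryBackend`, `publish_artefact` and the key layout are all assumptions for this example, not the API of any real registry or artefact store.

```python
from typing import Dict, List, Protocol


class BlobBackend(Protocol):
    """The handful of operations an application needs from a BLOB store."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def delete(self, key: str) -> None: ...
    def list_keys(self, prefix: str = "") -> List[str]: ...


class InMemoryBackend:
    """Stand-in backend for tests; a real one would call S3, Azure BLOB, etc."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)  # deleting a missing key is not an error

    def list_keys(self, prefix: str = "") -> List[str]:
        return sorted(k for k in self._objects if k.startswith(prefix))


def publish_artefact(backend: BlobBackend, name: str, version: str, payload: bytes) -> str:
    """Application code only talks to the interface, never to a concrete store."""
    key = f"artefacts/{name}/{version}/{name}-{version}.bin"
    backend.put(key, payload)
    return key


backend = InMemoryBackend()
key = publish_artefact(backend, "api", "1.2.0", b"binary payload")
```

Swapping the in-memory backend for a real BLOB store should not require any change to `publish_artefact`.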
Centralised Configuration
In a more DevOps-oriented scenario, BLOB stores can be used to hold centralised configuration. Say we want to use a pull mechanism for large-scale deployments. Configuration can be uploaded to a BLOB store, and clients can pull it periodically to update their local configuration or run local Ansible playbooks.
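The pull mechanism can be sketched as a client that fetches the configuration and only applies it when the content has actually changed. `ConfigPuller` and the `remote` dictionary are hypothetical. In a real deployment `fetch` would download the object from the BLOB store and `apply` might write a file or kick off a playbook.

```python
import hashlib
from typing import Callable, Optional


class ConfigPuller:
    """Pull configuration and apply it only when its content changes."""

    def __init__(self, fetch: Callable[[], bytes], apply: Callable[[bytes], None]) -> None:
        self._fetch = fetch
        self._apply = apply
        self._last_digest: Optional[str] = None

    def poll_once(self) -> bool:
        """Return True if new configuration was applied, False if unchanged."""
        data = self._fetch()
        digest = hashlib.sha256(data).hexdigest()
        if digest == self._last_digest:
            return False  # same content as last time, nothing to do
        self._apply(data)
        self._last_digest = digest
        return True


# Hypothetical stand-ins for the BLOB store and the local "apply" step.
remote = {"config": b"workers=4\n"}
applied = []
puller = ConfigPuller(lambda: remote["config"], applied.append)

changed_first = puller.poll_once()   # True: first pull is always applied
changed_second = puller.poll_once()  # False: content is unchanged
remote["config"] = b"workers=8\n"
changed_third = puller.poll_once()   # True: new content detected
```

A real client would call `poll_once` on a timer. Many BLOB stores also expose an ETag over HTTP, which can let a client skip the download entirely when nothing has changed.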
Backups
Backups commonly come in the form of large (or small) binaries. These can easily be stored in a BLOB store and retrieved at a later date to restore service after an outage. In the Kubernetes landscape, Velero is a popular tool for backing up and restoring Kubernetes clusters as part of disaster recovery, and it keeps its backups in object storage.
Do you have a use case for BLOB storage that is not covered above? Please share in the comments!