Praise for the blobstore 📯

The phrase “blobstore” refers to a simple key-value store, optimized for large values. Amazon’s S3, Google Cloud Storage, App Engine’s Blobstore, and Twitter’s Blobstore are examples.

I’m writing to highlight what appears to be a fundamental role played by the blobstore pattern. This may be aspirational, but I’m curious to explore.

Caveat: I only have experience using blobstores and the related infra I mention below, eg CDNs, source control, document stores, CI, etc, not maintaining this infra.

I’ve bolded references to other key pieces of infrastructure.


Hosting

This is one of the more primitive, intuitive applications. Push a local file to the store and then retrieve it:

$ curl -T my_file https://blobstore.example/my_file
$ curl https://blobstore.example/my_file

Note how amenable the store is to a RESTful API. Relatedly, many of the connections made in this post remind me of CouchDB.

Custom domains

Enable custom domains and we have general hosting:

$ curl -T my_file -H "x-custom-domain: example.com" https://blobstore.example/my_file
$ curl https://example.com/my_file

Artifact store

Push executable blobs into the store and we have an artifact store, like Twitter’s Packer and Google’s MPM.


CDN

Put a cache in front of the store and we have a CDN. Use SSDs for hosting and we may not need a cache. Or use an external CDN and the blobstore as an “origin server”.
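The cache-in-front idea can be sketched as a read-through cache (a minimal sketch; a real CDN adds TTLs, invalidation, and many edge locations, and the names here are hypothetical):

```python
origin = {"logo.png": b"...png bytes..."}  # blobstore acting as origin server
cache = {}
origin_fetches = {"count": 0}

def get(key):
    if key not in cache:
        origin_fetches["count"] += 1   # cache miss: one fetch from origin
        cache[key] = origin[key]
    return cache[key]                  # cache hit: origin never touched

get("logo.png")
get("logo.png")
print(origin_fetches["count"])  # → 1  (second read never reaches the origin)
```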


Scalability

Key-value stores scale well because they require relatively little coordination: a key maps simply to a host and doesn’t need to be read before being written.

Hosting, for example, scales better when built on this key-value pattern.
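The coordination-free key-to-host mapping can be sketched with rendezvous (highest-random-weight) hashing; the host names and key here are hypothetical:

```python
import hashlib

def host_for(key: str, hosts: list) -> str:
    """Map a key to a host with rendezvous hashing. Every client computes
    the same mapping independently, so no coordinator is needed, and a
    write never has to read first."""
    def weight(host):
        digest = hashlib.sha256(f"{host}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(hosts, key=weight)

hosts = ["blob-1", "blob-2", "blob-3"]
# The same key always maps to the same host:
assert host_for("photos/cat.jpg", hosts) == host_for("photos/cat.jpg", hosts)
print(host_for("photos/cat.jpg", hosts))
```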

Document store

For better or worse, a blobstore is only aware of simple keys, so we can’t read into the values.

However, index the value and we have a document store (eg AWS DocumentDB). We can keep the blobstore pure by defining the document store as standalone infrastructure and depending on an event bus (eg Google Cloud Pub/Sub) and execution service (eg Google Cloud Functions):

  1. The document store subscribes to blobstore push events via the event bus, associating an executable indexing step
  2. A client pushes a value to the blobstore
  3. The blobstore writes the value and fires a push event carrying the path
  4. The document store processes the path, indexing the value
Now we can query values:

$ curl -d '{"k":"v"}' https://blobstore.example/my_doc
$ curl 'https://docstore.example/query?q=k>t'
> v


Search

Using an approach similar to building the document store, we can process the values stored in the blobstore to populate a search index.
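As one hypothetical sketch, the push-event subscriber could tokenize each value into an inverted index, mapping words back to blobstore keys:

```python
def index_blob(key, text, inverted):
    # Record, for each word in the value, which keys contain it.
    for word in set(text.lower().split()):
        inverted.setdefault(word, set()).add(key)

inverted = {}
index_blob("a.txt", "praise for the blobstore", inverted)
index_blob("b.txt", "the document store", inverted)
print(sorted(inverted["the"]))  # → ['a.txt', 'b.txt']
```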

Source control

Put a review process and commit log in front of writes to the blobstore and you have scalable source control. Use a blobstore for Git’s object store, and you get scalable replication. Use a document-aware source control mechanism, like Google’s, and you get a scalable mono-repo.

Blobstore data is often immutable and versioned, so commits could simply be references to file versions.
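Since blobs are immutable and versioned, a commit can simply pin each path to a blob version rather than copying data; the ids here are hypothetical:

```python
commit = {
    "parent": "commit-41",
    "message": "Bump config",
    "files": {
        "src/main.py": "v17",  # blobstore key + immutable version
        "config.yaml": "v3",
    },
}
print(commit["files"]["config.yaml"])  # → v3
```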


CI

Subscribe to push events from source control, using an event bus and execution service, and we have CI. We can listen for success events from our subscriber to get pre-submit quality control. We can write artifacts back to the blobstore as part of a continuous-deploy pipeline.
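A hypothetical CI subscriber, with the test runner and build step stubbed out:

```python
class Bus:
    def __init__(self):
        self.subs = []
    def subscribe(self, fn):
        self.subs.append(fn)
    def fire(self, event):
        for fn in self.subs:
            fn(event)

blobstore = {}                           # stand-in blobstore
statuses = {}                            # CI results, keyed by commit id

def run_tests(tree): return True         # stub: pretend the tests pass
def build(tree): return b"artifact"      # stub: pretend we built something

def ci(event):
    commit_id, tree = event
    ok = run_tests(tree)
    statuses[commit_id] = ok             # pre-submit quality signal
    if ok:                               # continuous deploy: write the
        blobstore[f"artifacts/{commit_id}"] = build(tree)  # artifact back

bus = Bus()
bus.subscribe(ci)
bus.fire(("c42", {"main.py": "print('hi')"}))
print(statuses["c42"], "artifacts/c42" in blobstore)  # → True True
```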


Stream processing

We’ve already described a few cases of stream processing via an event bus, but we can also periodically iterate over values and stream them through an execution service to rebuild indices, produce artifacts, normalize data, etc.
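The periodic pass is just a full scan over keys, streaming each value through a processing function; this sketch normalizes text values (names hypothetical):

```python
blobstore = {"a": "Hello", "b": "WORLD"}  # stand-in blobstore

def normalize(value):
    return value.lower()

# Batch pass: stream every stored value through the processor
# to rebuild a derived view (an index, an artifact, etc).
rebuilt = {key: normalize(value) for key, value in blobstore.items()}
print(rebuilt)  # → {'a': 'hello', 'b': 'world'}
```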

The weird world of observable keys 🔑👀

The get that keeps on getting.


Given a bus:

import Foundation

// A minimal bus: fan each fired event out to the subscribers.
class Bus {
  private var subscribers: [(Event) -> Void] = []
  func subscribe(_ handler: @escaping (Event) -> Void) { subscribers.append(handler) }
  func fire(_ event: Event) { subscribers.forEach { $0(event) } }
}

protocol Event {}

struct Value: Event {
  let key: String
  let val: Any?
}

protocol Store {
  func get(_ key: String)
  func set(_ key: String, _ val: Any?)
}

class LocalStore: Store {
  let db: UserDefaults
  let bus: Bus
  init(db: UserDefaults, bus: Bus) {
    self.db = db
    self.bus = bus
  }
  // Even reads are observable: get flushes the value into the bus.
  func get(_ key: String) {
    let val = db.object(forKey: key)
    bus.fire(Value(key: key, val: val))
  }
  func set(_ key: String, _ val: Any?) {
    db.set(val, forKey: key)
    bus.fire(Value(key: key, val: val))
  }
}

let bus = Bus()
let local = LocalStore(db: UserDefaults.standard, bus: bus)
local.set("foo", "bar")

A couple features I like:

  • The bus provides a consistent interface for adding and removing subscriptions
  • There’s a straightforward get method to flush a value into the bus


If rationalizing events bubbling up from UI and disparate data sources is challenging, normalizing to a single bus interface may be helpful.