What are the greatest perils of digital preservation?

  • the total collapse of our modern digital infrastructure, wiping out our digital artifacts and memories in one fell swoop?
  • small-scale hard drive failures and format obsolescence, surgically and quietly rendering our files inaccessible?
  • forgetting a single digital object during a software / hardware migration?

I had a near miss with the last one recently. It was a snowflake of an object. Without naming titles or identifiers, it was a book scanned as a one-off digital object. An important, interesting, and culturally valuable book. And this is precisely why it got lost in the shuffle.

During migration, or even general record-keeping, auditing, and intellectual control, we focus on the big collections. Or, we cut our teeth on the small ones, working up to the big push (which makes sense when, perhaps, one collection is literally 1,000x larger than the small ones). We measure our success in achieving a 100% migration rate, in both quantity and fidelity, by group: “got 2293/2293 for that collection, 422/422 for the other, and 16/16 for that little tyke over there,” and so on, and so forth.

But what about those other objects that have made it into our purview and custody? The objects that have no collection, that have no measurement of quantity outside of their self-reflexive parity? Those are the ones at risk.
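
To make that blind spot concrete, here is a minimal sketch of a migration audit that tallies per collection but also gives uncollected objects their own bucket. Everything in it is hypothetical: the function name, the shape of the inventory (a mapping of object id to collection name, with `None` for objects that belong to no collection), and the "(no collection)" label are assumptions for illustration, not a description of any real repository system.

```python
from collections import Counter

# Hypothetical label for objects that belong to no collection.
NO_COLLECTION = "(no collection)"

def audit_migration(source, migrated_ids):
    """Tally migrated/total per collection, and list unmigrated orphans.

    source: dict mapping object id -> collection name, or None if uncollected.
    migrated_ids: set of object ids found on the new platform.
    """
    total = Counter()
    done = Counter()
    missing_orphans = []
    for object_id, collection in source.items():
        key = collection if collection is not None else NO_COLLECTION
        total[key] += 1
        if object_id in migrated_ids:
            done[key] += 1
        elif collection is None:
            # An uncollected object that did not make it across.
            missing_orphans.append(object_id)
    report = {name: (done[name], total[name]) for name in total}
    return report, missing_orphans
```

A per-collection report alone would show every collection at 100% and never mention the one-off book; the extra bucket is what surfaces it.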

I have likened it to “hitching your wagon” to a known entity. Or “safety in numbers.” The list goes on. The moment we tether an object to another, preferably a bunch, they benefit from the visibility of the herd.

The original files for the object were always safe, but all the work that went into ingest, creating derivatives, and modeling for shifting platforms would have been lost. Not to mention any additional content, metadata, or insight that might have accompanied the object as it aged as a digital object.

I never did write up our conversion from single-object ebooks to ebooks that are modeled as multiple objects, but it was quite an undertaking. Not only did the object in question not belong to a collection, but once it had missed the ebook migration, it had two strikes against it. It no longer registered in QA and auditing as an “ebook”; instead, it drifted into the tepid abyss of non-intellectually controlled items.

Do you “have” an object if it is not controlled?

Is every connection to a collection or another object a distinction in an otherwise entropic stew of files on a server?

There are all kinds of safeguards and practices against “misplacing” a digital object like this, but in some way, don’t they all involve tethering? Even if but a sliver of metadata that reads, “I am object, hear me roar”?