Digilib And Image Processing

I am following a thread on the IIIF Google group forum, ruminating on how IIIF and the Image API might support more advanced image processing. I am probably mischaracterizing the conversation, or reading a bit too much into it, but something interesting to me emerged from some of the early comments.

There was the acknowledgement that as stewards of digital images, looking to the future, it’s likely that we will start undertaking image processing (OCR, classification, etc.) on the images we have at our disposal. Whether for metadata enrichment or digital humanities work, the possibilities are extensive. IIIF, and the Image API, provide an excellent and standardized way to access images. Image processing is often helped by preparing images in particular ways, such as converting to grayscale to help detect nodes and edges, and that is preparation IIIF might be able to help with. What if the API, in addition to rotating, scaling, selecting, and some limited color options, could help facilitate image processing of our visual resources?
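To make that concrete, here is a minimal sketch of how a grayscale, scaled-down rendition might be requested through the Image API before handing it to an image processing step. The helper name `iiif_image_url` and the example.org server are hypothetical; the path order (region, size, rotation, quality, format) and the `gray` quality come from the IIIF Image API itself. Note that `full` as a size is version 2.x syntax (3.0 uses `max`), and a server is not required to support every quality.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API URL (hypothetical helper, not part of any library).

    The API expresses each option as a path segment:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # Identifiers may contain characters such as "/" that must be percent-encoded.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Ask for a grayscale rendition that fits within 512x512 pixels,
# the kind of preparation an edge detection or OCR step might want.
url = iiif_image_url("http://example.org/iiif", "example-image",
                     size="!512,512", quality="gray", fmt="png")
print(url)
# http://example.org/iiif/example-image/full/!512,512/0/gray.png
```

The appeal is that the preparation happens server-side and is addressable by URL, so a downstream script only needs to know the identifier.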

This conversation has been fascinating on many levels.

Robert Casties responded to the thread, pointing out that the project digilib has some methods and functionality that would do just such things. I wasn’t familiar with digilib, but what a neat project! It appears to be out of Germany, dating back to the early to mid 2000s. In many ways, it mirrors the IIIF ecosystem of image servers and standardized APIs for requesting images. Details drift and overlap here and there, but it’s devilishly similar to image servers such as Loris (which we use here at Wayne) or Cantaloupe.

IIIF has what it calls the “Image API”, the particular URL syntax used to request images, with region, size, rotation, quality, and format expressed as path segments. Digilib appears to have something called the “Scaler API” that does the same. Digilib also appears to support IIIF, perhaps an update to a project that pre-dates the IIIF movement, acknowledging the increasing prevalence of IIIF in the digital repository sphere.

Though I’ve yet to install or interact with digilib, something deep in the fingers and toes tells me I like it. It has a page called “Ancient History”, in German, which makes sense given where digilib was engendered. In principle and architecture, it very much mirrors what I found so appealing about IIIF when I first stumbled on it in 2011 or 2012. This “Ancient History” page dates the project back to the late 1990s, when this kind of thinking about serving digital images online was pretty revolutionary.

I’ve strayed a bit from the original impetus for penning this post, that being ruminating on how standards like IIIF can support downstream image processing, but as I like to say, that’s okay! It’s been a fascinating thread to follow, and I’m hoping more will weigh in on how they envision emerging image delivery standards helping to get these images into machine learning environments.



At least 10-12 inches of snow are falling outside the window, and I spent the evening moving between various states of slipping in snow, to sliding in cars, to overheating indoors. And throughout these transitions of state, the term “rewiring” is flickering through my thoughts.

The venerable online etymological dictionary has an entry for “wire” that states,

Old English wir “metal drawn out into a fine thread,” from Proto-Germanic *wira- (source also of Old Norse viravirka “filigree work,” Swedish vira “to twist,” Old High German wiara “fine gold work”), from PIE *wei- (1) “to turn, twist, plait” (source also of Old Irish fiar, Welsh gwyr “bent, crooked;” Latin viere “to bend, twist,” viriæ “bracelets,” of Celtic origin).

Like working glass from a rod to an ornate and beautiful capture of heat and time, these definitions suggest that modest wire is a tangible expression of effort, or a plaited braid of circumstance (think of hanging cords, twisted around from radiating vents or curious cats, turned into something resembling the industrious form of wire).

Wire is dangerous. Wire is necessary. And wire is precise. This is no hip-shooting, boat wrangling rope rodeo. No, wire is the conduit of the 21st century’s most precise modes of communication, that by which the finest of movements are elicited from airplane rudders, and much more. Wire is a forgotten pillar of our modern infrastructure, with roots back to aesthetically pleasing arrangements of precious metals.

Rewiring, then, must be a reversal of some precise or circumstantial expression. At least temporarily to make way for a new pattern. Rewiring is slow. Rewiring is understanding potentially complex networks. And rewiring is permanent. It’s rare that we rewire a house, pull out the old, install the new, and ever rerewire the house with the old. Sure, importantly, that network of old wire can be reused, repurposed, or revered, but it’s unlikely it will ever power that same rotary dialer again.

And so, “rewiring” represents to me evolution. Irreversible change. Which isn’t always a bad thing.

False Conditions

Was reminded this morning of a lesson that drifts in and out of working on systems with lots of moving parts: all improvements are inextricably tied to the current condition of supporting infrastructure.

Said another way: anything you do, anything you change, is probably based on information available to you at the time.

But this isn’t the lesson. The lesson is that that kind of decision making is often flawed. I’m sure this is of no surprise to many, but I’m uncomfortable with how each iteration of an improvement to a particular part of the system brings this same lesson home. Fool me once, shame on you, fool me twice, yada yada.

A concrete example might help.

We have a series of pipes and routes in our server-side API that abstract routes for images from our IIIF-based Loris image server. So, we can ask for http://foo.bar/item/goober:tronic/thumbnail and get back a thumbnail from the more complicated URL path, http://foo.bar/loris/fedora|goober:tronic/full/full/0/default.png. The latter is not semantically meaningful to many, and contains hardcoded infrastructure such as loris in the URL. Our image server may change, and our goal is to have Cool URIs for things like thumbnails, metadata, etc. As always on this blog, over-simplification for the sake of idea exploring.
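A rough sketch of that kind of mapping, using only the two example URLs above. This is not our actual routing code: the function name `resolve_image_url`, the constants, and the single `thumbnail` route are all made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical constants, inferred from the example URLs in this post.
LORIS_PREFIX = "/loris"
DATASTREAM_PREFIX = "fedora|"
IIIF_PARAMS = {
    # public route name -> IIIF region/size/rotation/quality.format
    "thumbnail": "full/full/0/default.png",
}


def resolve_image_url(public_url):
    """Translate a public /item/{id}/{kind} URL into the image server URL."""
    parsed = urlparse(public_url)
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "item" or parts[2] not in IIIF_PARAMS:
        raise ValueError(f"unrecognized image route: {public_url}")
    _, item_id, kind = parts
    return (f"{parsed.scheme}://{parsed.netloc}{LORIS_PREFIX}/"
            f"{DATASTREAM_PREFIX}{item_id}/{IIIF_PARAMS[kind]}")


print(resolve_image_url("http://foo.bar/item/goober:tronic/thumbnail"))
# http://foo.bar/loris/fedora|goober:tronic/full/full/0/default.png
```

Everything specific to the image server lives in the constants, so swapping out Loris would mean changing those rather than every public link.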

Recently, we had the rare and supremely delicious surprise of server kernel patching improving server-side rendering of images in our Python-based image server, Loris. Dramatically. We are still exploring precisely what explains the speed increase (perhaps fodder for another post), but suffice it to say, it’s great. However, thumbnails started to break. The reason: one of our image proxies was streaming the results with the requests Python library. When speeds / rendering / IO were slower on the server, this sped up the load time for thumbnails. But when the server speed increased, it revealed what I’m assuming was some kind of race condition as the bits jumped through these proxied hoops. Again, this is all speculation at this point, but the fact remains that removing the streaming flag from a particular request has fixed the problem, and the thumbnails load even faster.
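For anyone unfamiliar with the flag in question, here is a stripped-down sketch of the two modes. It is not our proxy code, and `fetch_image` is a made-up name; it only illustrates what `stream=True` changes in requests. The `session` argument is assumed to be a `requests.Session` (anything with the same `get` interface would do).

```python
def fetch_image(session, url, stream=False):
    """Fetch image bytes from an upstream image server (illustrative only).

    `session` is assumed to be a requests.Session or a compatible object.
    """
    response = session.get(url, stream=stream, timeout=10)
    response.raise_for_status()
    if stream:
        # With stream=True, requests returns once the headers arrive and
        # defers the body; it has to be pulled through in chunks.
        return b"".join(response.iter_content(chunk_size=8192))
    # Without the flag, the whole body is downloaded before get() returns.
    return response.content
```

With the real library this would be called as `fetch_image(requests.Session(), url)` after `import requests`. The fix described above amounts to taking the non-streaming path: the proxy holds the complete image before passing anything along.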

Our original design to stream a response sped up the load time under a particular set of server conditions. Now that the conditions have changed, that decision is no longer correct. How interesting that a decision, once correct, becomes flawed with the passage of time. Such is a day in the life of managing a system with lots of moving parts.