Strange Attractors Part 1

A post, in two parts.

Part 1: Strange Attractors, and their strangely attractive backstory

I recently had the pleasure of driving to Texas and back, and while on the road, partook in some podcasts. One of them, “Stuff You Should Know”, dedicated an episode to Chaos Theory. They did an admirable job with a slippery idea, and while I took away some new insights, this topic was not new to me.

No, I’ve been a fan / advocate / evangelist / lunatic / devotee of Chaos Theory since the heady days of 2001. Early in my undergraduate forays into math and physics, I convinced a professor to let me explore the interesting ramifications of nonlinear and fractal geometry. Before going any further, if you’ve searched for “fractal geometry” or even just “fractals” on the internet, you’ve gotten some websites that looked like they emerged from the 1990s. You see, the people who love fractals have a need to share their insights. I was firmly in this camp.

I graphed Hénon maps in QBASIC, porting code from the appendix of – gasp – print books. I drew Sierpinski triangles everywhere. I used “bifurcate” every chance I could. It was a wild high, and I chased that tempestuous beast into the darkest of nights.

I became fascinated by coastlines: did you know they are infinitely long? On a map, the circuitous circumference around an island nation like Iceland may look to be… ~1,000 miles? But in reality, we can dive into any fjord or inlet, crouch low and begin poking around in the rocks that roll gently in the lapping tides. Where is the coastline? Is it before or after that pebble? If that pebble is included, don’t we have to include it in the total distance of the island perimeter? Yes, we most certainly do, as responsible members of society, reasonable stewards of information.
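To make the pebble argument a little more concrete, here is a minimal sketch that uses the Koch curve as a stand-in for a coastline. That substitution is my own assumption (real coastlines are far messier), and the function name is made up for illustration. Each time the ruler shrinks to a third of its size, four times as many ruler lengths are needed, so the measured length keeps growing.

```python
def koch_measurement(level):
    """Measure a unit-wide Koch curve with a ruler of size 3**-level.

    Returns (ruler_size, measured_length). At each level the curve is made
    of 4**level straight segments, each exactly one ruler long.
    """
    ruler = 3.0 ** -level
    segments = 4 ** level
    return ruler, segments * ruler


# Shrink the ruler and watch the "coastline" get longer every time.
for level in range(0, 9, 2):
    ruler, length = koch_measurement(level)
    print(f"ruler {ruler:.5f} -> measured length {length:.3f}")
```

The measured length is (4/3) raised to the level, which never settles on a final value, and that is the sense in which the coastline is "infinitely long".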

This leaves us with an uncomfortable truth that “edges” and “boundaries” are often much more complex than we anticipated. Fractal geometry also reveals that the same complex gravitational forces that create a valley of flour in your bowl when an egg is cracked into it also shape the timeless Grand Canyon (at least, many are shared; in all likelihood, your flour valleys have not been whipped by wind and rain for millennia). But the point remains: broccoli looks like trees, and a bunch of geese flying look suspiciously like reflections of light on lazily undulating waves.

And yet there is so much more to this story, which again, I must commend the hosts of that podcast for nobly wading through, with patience and a singular sense of where they are in the discussion. Restating known things can be useful for honing intuition, and setting the stage. So to explore the strange attraction of Strange Attractors, we must go back to Sir Isaac Newton, King Oscar II of Sweden, and the “n-body-problem”.

Newton could predict apples falling, and was having good success predicting comets and cannonballs. So what about predicting the location of a handful of planets 10, 100, 1000 years out? Not so much:

“Knowing three orbital positions of a planet’s orbit – positions obtained by Sir Isaac Newton (1643-1727) from astronomer John Flamsteed[6] – Newton was able to produce an equation by straightforward analytical geometry, to predict a planet’s motion; i.e., to give its orbital properties: position, orbital diameter, period and orbital velocity.[7] Having done so, he and others soon discovered over the course of a few years, those equations of motion did not predict some orbits very well or even correctly.[8] Newton realized it was because gravitational interactive forces amongst all the planets was affecting all their orbits.”

And here, the badlands of Chaos Theory and nonlinear dynamics peeked forth. Scientists and mathematicians started to understand that the planets were all simultaneously pulling on one another through gravity. Each moment, or “iteration”, magnified any variances or inaccuracies in the initial measurements that were plugged into the equations. If a cue ball hitting three pool balls with speed x is turned into equations, we can predict where they will be after 1000 “moments”. However, if speed x is ever so slightly different, those pool balls will be in wildly different places just a few “moments” later. This is a gross over-simplification of the n-body-problem, but it gets at the heart of it.
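A tiny sketch of that magnification, for the curious. Simulating pool balls or planets is more than a blog post can bear, so I'm swapping in the logistic map, a one-line equation that is a standard toy example of chaos. The starting value, the size of the nudge, and the function name are all my own choices for illustration.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), returning every value along the way."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two starting points that differ by one part in ten billion.
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)

# The gap between the two orbits at each "moment".
gaps = [abs(p - q) for p, q in zip(a, b)]
for step in (0, 10, 20, 30, 40, 50, 60):
    print(step, gaps[step])
```

The gap roughly doubles with each iteration, so a difference far too small to measure ends up as large as the values themselves within a few dozen steps.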

King Oscar II held a contest for anyone who could solve this problem. As I didn’t intend to delve too deeply into the history, but instead muse on what it might mean in the present, I continue to muddy and gloss over the finer points of this. This article explores the contest and solution in much better detail. But for our purposes here, Poincaré pointed out that it was unsolvable, and won the prize. He proved that a perfect prediction relied on infinitely accurate measurements, and we know that’s not possible. And Chaos Theory was born.

Fast forward lots of years, and we’re finally getting back to Strange Attractors. The meteorologist Edward Lorenz stumbled on this very problem while trying to distill complex equations around thermal dynamics to a simple form. This passage from Wikipedia sums it up nicely,

“Minute variations in the initial values of variables in his twelve-variable computer weather model (c. 1960, running on an LGP-30 desk computer) would result in grossly divergent weather patterns.[2] This sensitive dependence on initial conditions came to be known as the butterfly effect (it also meant that weather predictions from more than about a week out are generally fairly inaccurate).[13]”
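For anyone who wants to trace the shape themselves, here is a rough sketch. The passage above describes Lorenz's twelve-variable model; the picture most people associate with him comes from the three-variable system he published in 1963, so that is what I've assumed here, with the commonly used parameter values (sigma = 10, rho = 28, beta = 8/3). The step size, starting point, and function names are my own choices.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three-variable Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the system by one time step with classic Runge-Kutta (RK4)."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def trajectory(start, steps=5000, dt=0.01):
    """Return the list of (x, y, z) points visited, including the start."""
    points = [start]
    for _ in range(steps):
        points.append(rk4_step(points[-1], dt))
    return points


points = trajectory((1.0, 1.0, 1.0))
print(points[-1])
```

Plotting x against z from `points` with any charting tool should give the familiar two-lobed butterfly: the path never repeats, yet never wanders off.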

And here is what he graphed:

The hands grow cold, the coffee cup has tipped past halfway. Where are we in this discussion? Those dark spots in the graph above, those are the strange attractors we’ve danced around. Why does this matter? Why is it interesting? How is this related to coastlines, pebbles, broccoli, and eggs? That is fodder for another post…

Bag Validation

bag_validation.png

Something didn’t happen today, and it was incredibly validating. No validation pun intended here. Honestly.

We’ve been ingesting ebooks into our digital collections platform for quite some time now, and one perennial problem we face is that of malformed books. This can come in many forms, and it hinges on how we structure our digital ebooks. At a very high level, each physical page gets about five digital representations:

  • the page image, as a TIFF file
  • raw text from page, as a text file
  • coordinates of words on the page, as an ALTO XML file
  • PDF of the page, with words overlaying the page image
  • HTML of the page, including some limited layout (this one is not great, may be deprecated soon)

So, for a 100 page book, you might – should – end up with 500 distinct files: 001.tif, 001.pdf, 001.xml, 001.txt, 001.html, 002.tif, you get the point.

Before we had any ebook-specific checks in place, it was not uncommon for a book to enter the ingest workflow missing a single PDF, XML, or other file relating to a page. Or even an entire page (001, 002, 004, no 003). This would result in ebooks ingested, but missing key components, which would roil things down the pipeline in highly annoying and unforeseen ways. From a preservation standpoint, it was also not ideal to allow missing derivatives to slip through.

But those days are mostly over. We have included some bag validation for each content type in our digital collections, with checks that look for specific properties. For example, an Image object should roll through with an original image, a JP2 derivative, a thumbnail, etc. If one of those is missing, it fails the validation on that datastream. For ebooks, we’re looking for parity of derivative file counts for each page. If a page comes through missing something, we get notified in our Ingest Workspace (fodder for another post), and that bag (object) is prevented from getting ingested.
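The ebook check could be sketched roughly like this. To be clear, this is not our actual validation code: the function name and the report format are invented for illustration, and it assumes the flat 001.tif, 001.pdf, … naming described above.

```python
from collections import defaultdict
from pathlib import PurePath

# Derivative types expected for every page (from the 001.tif, 001.pdf, ... naming).
EXPECTED_EXTENSIONS = {"tif", "txt", "xml", "pdf", "html"}


def validate_ebook_pages(filenames):
    """Return a list of problems found; an empty list means the bag passes."""
    # Group the extensions that actually arrived under each page number.
    found = defaultdict(set)
    for name in filenames:
        path = PurePath(name)
        found[path.stem].add(path.suffix.lstrip(".").lower())

    problems = []
    # Pages that arrived, but without all of their derivatives.
    for page in sorted(found):
        missing = EXPECTED_EXTENSIONS - found[page]
        if missing:
            problems.append(f"page {page}: missing {', '.join(sorted(missing))}")

    # Whole pages that never arrived (001, 002, 004, no 003).
    numbers = sorted(int(page) for page in found if page.isdigit())
    if numbers:
        present = set(numbers)
        for n in range(numbers[0], numbers[-1] + 1):
            if n not in present:
                problems.append(f"page {n:03d}: no files at all")
    return problems


# A three-page bag with page 003 absent and one PDF dropped along the way.
files = [f"{p:03d}.{ext}" for p in (1, 2, 4) for ext in sorted(EXPECTED_EXTENSIONS)]
files.remove("002.pdf")
for problem in validate_ebook_pages(files):
    print(problem)
```

A bag that returns any problems would be held back and flagged, while the rest of the batch carries on.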

In this way, we can let 44/45 perfectly good books get ingested, then diagnose the rogue baddie. The wheels of digitization and access roll on, and we identify things that need fixing. The screenshot above shows this check firing on a batch today, long after I had forgotten about putting it in place. Great huzzahs!

Ribald

Long ago, and far away, I had a thousand other blogs, all lost to the sands of time on the internet. Well, not lost per se, more abandoned as I realized I would not be able to faithfully shepherd them along the winding roads of internet time. My goal was to consolidate here, hunkering down in the welcoming leaves of markdown and GitHub.

In the shuffle, however, I lost a blog I was most fond of, a simple “Word of the Day” or “For the Word”, or something along those lines. Each post was dedicated to a single word, a meditation on the wonderful acorns of knowledge and enshrined history that exist within a single word. Upon reading a word last night that fit the bill, I’d like to bring it back here. And so, without further ado…

Ribald

The impetus for blogging about this word can be traced back to a recent trip to Denton, TX to visit an old friend. While there, we saw the legendary performer Paul Slavens perform. He takes money from the audience and makes up songs on the spot based on song titles or themes they scribble on bar napkins.

Some estimates have him at more than 2,000 improvised songs in the last 20 years. So, needless to say, there are plenty of examples on YouTube. But, despite what you might see in the following video, it’s hard to capture the charm and wit that exist between songs; the real appeal and virtuosity.


Wikipedia says this about him,

“In the mid 90s Slavens began creating improvisational songs based on audience suggestions, and has created an estimated 2000 songs over the last 2 decades, many recordings exist, although few have been released. Often these songs are humorous in nature and can be quite ribald.”

And such was my (re)acquaintance with ribald. Without any objective definition, I knew precisely what this word meant. How is that possible? Moreover, can a definition ever replace this initial correlation of ribald that I now hold?

In Philosophical Investigations, Wittgenstein opens up with a quote from St. Augustine, and then sums up the picture of language it paints,

“The individual words in language name objects—sentences are combinations of such names. In this picture of language we find the roots of the following idea: Every word has a meaning. This meaning is correlated with the word. It is the object for which the word stands.”

But, interestingly, he immediately begins to push against this idea, suggesting it’s a far too simplistic understanding of language, and by extension, words. I think it’s safe to assume that Wittgenstein would support the idea that a word does not have a single meaning, but is actually given meaning from context, learning, and much more.

And so, returning to ribald, this word was perfectly defined for me through a personal experience and the persona of Paul Slavens. Now, sure, of course, I realize there is an agreed upon definition of ribald. From the venerable OED, for ribald as noun:

“1. a. In the medieval period: a person of low social status, esp. regarded as worthless or good-for-nothing; a rascal, vagabond. Also as a form of address. Now arch. or hist.”

“2. A foul-mouthed or blasphemous person; one who uses offensive, irreverent, or scurrilous language; one who jeers or jokes in a rude or lewd way. Now rare.”

“3. A promiscuous or loose woman; a wanton, a harlot. Obs.”

“4. A wicked, dissolute, or licentious person; a villain. Now arch. and regional (Sc.).”

and so on, and so forth. Also from the OED, for ribald as adjective:

“1. Of a person or persons: (in early use) lewd, coarse, or licentious in language or behaviour; deliberately and offensively abusive or impious; (now usually in weakened sense) given to bawdy, vulgar, or irreverent talk or behaviour; amusingly rude.”

“2. Of language, humour, etc.: coarse, vulgar, scurrilous, irreverent; (subsequently esp.) referring to sexual matters in an amusingly rude or irreverent way. Now the most common sense.”

which appears to be the much more common use.

So we have these definitions, and they are, unsurprisingly, expansive to say the least. We love words like this; forged in the streets of pre-industrial London, tumbled around during the bawdy – notice the similar ‘ald’, ‘awd’ sounds, coincidence? – early 1900s. And yet, with all that history and lyrical verse dedicated to this word, I meet it halfway with my intuited, pop-culture, internal definition.

It reminds me of work we’re doing with objects in digital repositories. There is a tension between front-ends that extract disconnected information from disparate sources, reconstituting client-side for a conceptual whole, vs. opinionated server-side models that pull some suggestions for stylings from here and there, but for the most part “push” or impel themselves through a series of pasta makers (I’m consciously choosing to move away from meat-based metaphors, you know, for the planet). The net effect is often the same to the unaware user, but the machinations that move the system are fundamentally different in their approach. Both have pros and cons. And relevant to this discussion, there is very rarely an ideal state that adheres entirely to one philosophy or another. These systems we build and work with are muddy, confused over time, and contorted to work in the real world.

Much like words; those great and wonderful puzzles.

ps. all typos and mis-spellings are my own, no editing has been performed.