Let’s kick off this blog with a technical post.
On my FreeBSD/amd64 desktop system, I often swap out a lot of files to external hard drives after archiving them to Verbatim DVD-Rs (made in Taiwan; I avoid those made in India like the plague, as they have failed me in the past). Just to be on the safe(r) side, I also append error-correcting Reed-Solomon codes to the ISO images with dvdisaster, which you should really evaluate, as even good media is bound to degrade sooner rather than later. Anyway, all this is boring sysadmin backup and restore routine. But hang on, now it gets interesting:
Some files I deem so important that I actually save them to two external HDDs, just in case disaster strikes. And because I’m the impatient kind of guy, I usually copy them simultaneously instead of sequentially. Now, that may not be the most prudent thing to do (say, lightning strikes and takes both external drives with it), but as I already have copies on DVD-Rs, I could still recover from that, albeit with a lot of effort. Yet, had I been more conservative and used the sequential approach (plugging one drive in, copying the files, unplugging it and moving it out of harm’s way, plugging the other drive in, copying the files again, unplugging that drive and so on), I’d never have noticed the following strange phenomenon:
Computer systems are full of unintended self-synchronizations.
Want to see that in action? Alright! Try this:
- Put two HDDs into USB enclosures. Ideally, both HDDs would have the same technical specs (rpm, bus speed etc.), but that’s not really necessary, since the USB link is the bottleneck. However, both enclosures should use the same USB speed (i.e. not one USB 1.0 and the other USB 2.0) talking with the host.
- Insert both enclosures into two USB ports.
- If not already done, put a file system on both drives and mount both drives (say, to mount points /mnt1 and /mnt2).
- Open two xterm windows and place them side by side.
- Use rsync -av to copy a directory structure from /path/to/source to /mnt1 and /mnt2. Type the following at the prompts of the two xterms, but don’t hit Enter yet:
term1% rsync -av /path/to/source /mnt1
term2% rsync -av /path/to/source /mnt2
- When both commands are ready, hit Enter in the first xterm, and wait a couple of seconds.
- After the first 10 or so copied files, hit Enter in the second xterm.
- Now observe both xterms, and enjoy the paths of files copied scrolling by.
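If you’d rather not time the two Enter presses by hand, the steps above can be sketched as a small shell function. This is only a rough sketch under my own assumptions: the function name staggered_copy, the default delay of 5 seconds and the log file names rsync1.log and rsync2.log are made up for illustration.

```sh
#!/bin/sh
# staggered_copy SRC DEST1 DEST2 [DELAY]
# Starts one rsync per destination, the second one DELAY seconds after
# the first (default 5), and writes each file list to its own log file
# in the current directory. All names here are illustrative.
staggered_copy() {
    src=$1
    dest1=$2
    dest2=$3
    delay=${4:-5}

    # First copy, running in the background.
    rsync -av "$src" "$dest1" > rsync1.log 2>&1 &
    pid1=$!

    # Give the first copy a head start.
    sleep "$delay"

    # Second copy, also in the background.
    rsync -av "$src" "$dest2" > rsync2.log 2>&1 &
    pid2=$!

    # Block until both copies have finished.
    wait "$pid1" "$pid2"
}
```

Call it as staggered_copy /path/to/source /mnt1 /mnt2 and run tail -f rsync1.log and tail -f rsync2.log in the two xterms to watch the file lists scroll by.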
If you try this, you’ll notice that both rsync processes are out of sync at first. No wonder: one started many seconds after the other. But a couple of seconds later, both processes magically become synchronized, and stay synchronized until all files have been copied!
There are a lot of reasons for this unintended synchronism, and you may discuss them in the comments section.
But no matter what the real cause in this special case is, there’s an important lesson to be learned here: in relatively tightly coupled systems like inside an operating system, or even in loosely coupled systems like across a whole network of computers, it is always very important to intentionally (try to) break those unintended synchronisms, lest they become a liability.
Even if the synchronism shown in this post is actually useful (it efficiently reuses the buffer cache, i.e. data is not read twice from the source disk), synchronized loads are usually a Bad Thing(tm). In computer networks, a synchronism could, for example, be cron(1)-induced. Imagine 100,000+ little home routers trying to access an NTP daemon or a relatively small pool of NTP daemons at (nearly) the same time. Can you hear the thundering herd hammering on the poor ntpd machines? Right! Or, less dramatic but no less effective: mirrors trying to update themselves from a master server, but all at 3:00am UTC?
Intentionally introducing random delays to break up synchronisms is not difficult, but is all too often neglected. Perhaps cron(1) could grow an extra option to randomize periodic tasks up to a configurable time interval, like the -j and -J jitter options in FreeBSD’s cron?
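Where cron itself offers no jitter, a wrapper script can do the job. Here is a minimal sketch, assuming a system with /dev/urandom and a POSIX od; the function names random_delay and jittered are my own invention, and the maximum delay should stay well below 65535 seconds because only two random bytes are read.

```sh
#!/bin/sh
# random_delay MAX: print a random integer between 0 and MAX (inclusive).
random_delay() {
    max=$1
    # Read two random bytes from the kernel as one unsigned 16-bit
    # decimal number and strip the padding that od adds.
    n=$(od -An -N2 -tu2 /dev/urandom | tr -dc '0-9')
    echo $(( n % (max + 1) ))
}

# jittered MAX COMMAND [ARGS...]: sleep for a random number of seconds
# (at most MAX), then run COMMAND with its arguments.
jittered() {
    max=$1
    shift
    sleep "$(random_delay "$max")"
    "$@"
}
```

A mirror update that used to fire at exactly 3:00am could then be started as jittered 3600 followed by the real command, which spreads the clients over the following hour. The randomness comes from /dev/urandom rather than from a clock-seeded generator, since machines that wake up in the same second would otherwise all pick the same delay.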