Solving Sleipnir's Problem

Vinnie leans back in his chair, hands behind his head. “Lessee if I got this straight. The computer’s muscles are its processors. It can have a bunch of them, different kinds for different jobs like a horse has different muscles for different moves. Computers got internal networks to connect the processors like a horse has tendons and ligaments. Me and Sy got a beef going about the bones, whether it’s data or memory ’cause nothing happens without both of ’em. That a good summary?”

“That’s about the size of it.”

“So what was that crack about some eight-legged horse being the most interesting case?”

Sleipnir image adapted from the Tjängvide runestone
from Wikimedia Commons under CC 4.0 license

Robert grabs a paper napkin. Coffee shop proprietor Al winces. “Consider the kangaroo. It has two legs and it uses both at the same time when it hops around. I’ll diagram its feet with 1 and 2 and color them both red, OK?”

“Kangaroo hopped through some red paint, gotcha.”

“A human has two feet and we alternate between them when we walk. Like this second pattern — red foot, blue foot, over and over. Then there’s your standard horse with four legs — many more possibilities, right? For one, the front pair and the back pair each can act like a simple walk but independently, like the third row here.”

Meanwhile, I’m fiddling with Old Reliable and find this video. “That’s a good description of the basic gait that the horsemen call the walk, no surprise.”

Vinnie’s looking at the video over my shoulder. “Huh! Look here at the trot. The front and rear legs on opposite sides work together but in-between the beat of the other pair. I suppose you’d draw it like this fourth sketch, right?”

“That’s the idea. I’m only keeping track of which feet get used at the same time or opposite times. I’m sure there are other combinations that don’t fit the two-color model.”

Vinnie’s still watching the video. “Say this one. The gallop is like it’s walking with its front feet and kangarooing off that beat with its back ones.”

“Well, there you go. On to my point. Sy, what’s a horse’s most important decision if it’s not going to trip up?”

“Which foot it’s going to move next, I suppose. Oh, I see where you’re going. Odin’s eight-legged horse would have a serious coordination problem — which legs to pair together and what order they’d work in.”

“Exactly. No surprise, a computer has the same coordination problem unless it’s extremely specialized. As soon as you have multiple tasks demanding service, yet another task has to direct traffic. That’s basically where operating systems come into play. An OS has low-level code that stands between the application programs and the hardware resources.”

“What’s it doing there besides getting in the way?”

“Simplifying things, Vinnie. You don’t want to recode your program or buy a new version of your spreadsheet software when you plug in a new hard drive. When your application issues a call to transfer some data to or from your hard drive, the OS translates that into bit-level instructions the hard drive understands. A different device from a different manufacturer probably uses different command bits. No problem, your OS satisfies your next I/O call with whatever instructions that device understands. But an OS does more than that.”
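A rough sketch of the translation Robert is describing, in Python. Everything named here is invented for illustration: the `Drive` interface, the two manufacturers, the `save` call and the command strings. Real drivers speak device-specific binary protocols, not text.

```python
from abc import ABC, abstractmethod


class Drive(ABC):
    """What the OS expects every storage driver to provide (hypothetical interface)."""

    @abstractmethod
    def write_block(self, block: int, data: bytes) -> str:
        ...


class AcmeDrive(Drive):
    def write_block(self, block: int, data: bytes) -> str:
        # Invented command format for an imaginary manufacturer.
        return f"ACME-WR {block:04d} len={len(data)}"


class ZenithDrive(Drive):
    def write_block(self, block: int, data: bytes) -> str:
        # A different imaginary manufacturer, different command bits.
        return f"Z:{block:#06x}:W:{len(data)}"


def save(drive: Drive, block: int, data: bytes) -> str:
    """The application's I/O call: identical no matter which drive is plugged in."""
    return drive.write_block(block, data)
```

The application only ever calls `save`. Swapping `AcmeDrive()` for `ZenithDrive()` changes the command that goes out to the device but not a line of the calling program.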

“Like what else?”

“Lots of things. Security, for one: it makes sure you’re authorized to log on and touch certain data. Network interfacing for another. But for system performance the critical OS functions involve choosing who gets how much resource to work with.”

“Like disk space? I keep hitting my limit in the Cloud.”

“The Cloud’s a whole ’nother level of complicated, but yeah, like that. The OS addresses performance by managing CPU time, throttling back low-priority tasks to give more time to high-priority work.”

“How’s it know the difference?”

“Depends on the OS. Generally it boils down to a list of privileged program names and user-ids versus everyone else.”

“How’s it do the throttling?”

“That also depends on the OS. Some of them meter out time slices, others fiddle with dispatch priority. Tricky business.”
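The two approaches Robert mentions can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any particular OS does it: the function names, the task names and the millisecond figures are all hypothetical, and real schedulers usually blend both ideas and handle tasks that arrive, block and wake up.

```python
from collections import deque
import heapq


def round_robin(tasks, slice_ms):
    """Time slicing: each task runs for at most slice_ms, then goes to the back of the line.

    tasks maps a task name to the milliseconds of CPU work it still needs.
    Returns the (name, ms_run) slices in the order they were dispatched.
    """
    queue = deque(tasks.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        run = min(slice_ms, remaining)
        schedule.append((name, run))
        if remaining > run:
            queue.append((name, remaining - run))
    return schedule


def priority_dispatch(tasks):
    """Dispatch priority: always pick the most urgent waiting task and run it to completion.

    tasks is a list of (priority, name, ms) tuples; a lower number is more urgent.
    """
    heap = list(tasks)
    heapq.heapify(heap)
    order = []
    while heap:
        _, name, ms = heapq.heappop(heap)
        order.append((name, ms))
    return order
```

With `round_robin({"calc": 25, "mail": 10}, 10)` the big calculation and the mail download take turns in 10-millisecond slices. With `priority_dispatch` the low-priority backup simply waits until everything more urgent has finished.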

“Tricky as running an eight-legged horse.”

~~ Rich Olcott

Memories: The Corners of Your Mind

Vinnie doesn’t let go of a question. “OK, Robert, I got that a computer’s internal network is sorta like a horse’s sinews, tying muscle and bone together. An’ I got that a computer’s processors of whichever kind are like a horse’s muscles. But what does for a computer what bones do for a horse?”

“The ‘bones’ are a bit of a stretch, Vinnie. Data’s one possibility, memory or storage is the other one.”

Vinnie takes the bait. “Horse muscles move horse bones. The processors move data, so data’s got to be the bones.”

For the sake of argument, I come back. “But when the electricity turns off, the data goes away, right? Memory’s still there, so memory must be the bones. Or is it storage? What’s the difference between memory and storage?”

“You’ve put your finger on it, Sy — persistence. If the data’s retained when the power’s off, like on a hard drive, it’s in storage. Otherwise it’s in memory. Setting aside power glitches, of course — a bad glitch can even kill some kinds of storage and the data it’s holding, which is one reason for doing backups. As a general rule, memory is smaller, more expensive and much faster than storage so there’s a trade-off. If you want a lot of speed, load up on fast memory but it’ll cost you cash and resilience.”

“I’ll bet that’s where your special skills come in handy, right, Robert?”

“Pretty much, Vinnie. The trick is to get the right data into the right kind of memory at the right time.”

“The right kind…?”

“Ohhhyeah, there’s a whole hierarchy out there — on-chip memory essentially inside the processor, on-board memory on separate chips, off-board memory and storage…. It goes on all the way out to The Cloud if you’re set up that way. There’s even special memory for keeping track of which data is where in the other memories. The internal network plays into it, too — the data bus to a given memory could be just a byte wide or many times fatter, which makes a big difference in access speed. The hardware takes care of some data placement automatically, but a lot of it we can affect with the software. That’s mostly where I come in.”

Horse skeleton from Wikimedia Commons by CC license

“Doin’ what? The hardware’s pretty much what your boss already bought, not much you can tinker with there. The bits are zoomin’ around inside at electronic speeds, you can’t pick and choose where to put ’em.”

“Yes, we can, if we’re smart and careful. You know Michael Corleone’s line, ‘Keep your friends close but your enemies closer’? With us it’s ‘Keep your next data byte close but your next program instruction closer.’”

The Memory Pyramid

“Whuzzat mean?”

“What you want to do is have bytes ready for the processor as soon as it’s ready to work with them. That means predicting which bytes it’ll want next and getting those to the top of the memory pyramid. Programs do a lot of short loops, enough that standard architectures have separate instruction memories just for that.”

“So how do you do that predicting? Like Vinnie said, things move fast in there.”

“You design for patterns. My favorite is sequential-and-discard. When you’re watching a movie you look at frames in series and you rarely go back. In the computer we deliver sequential bytes in an orderly manner to fast memory but we don’t have to worry about storing them back out again. Easy-peasy. Sequential-and-store is also highly predictable but then you have to down-copy, too.”
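Sequential-and-discard looks roughly like this in Python. It is only a sketch of the access pattern, assuming a byte stream standing in for the movie; the function names and the idea of summing bytes as "brightness" are made up for the example.

```python
import io


def stream_frames(source, frame_size):
    """Yield fixed-size frames in order; nothing is kept once the caller moves on."""
    while True:
        frame = source.read(frame_size)
        if not frame:
            break
        yield frame


def total_brightness(source, frame_size):
    """Consume frames one at a time, keeping only a running total."""
    total = 0
    for frame in stream_frames(source, frame_size):
        total += sum(frame)  # use the frame, then let it go
    return total
```

Only one frame sits in fast memory at any moment, and nothing is ever written back out, which is what makes this pattern the easy one.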

“Yeah, either way the data just flows through. What others?”

“Periodic is useful if you can arrange your program and data to exploit it. If you know a just-used series of bytes are going to be relevant again soon, you try to reserve enough close-in memory to hold onto them. Data references tend to spread out but sometimes you can tilt the odds by clumping together related bytes that are likely to be used together, like all weather data for one location.”
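Why "reserve enough close-in memory" matters can be shown with a toy cache in Python. The least-recently-used eviction rule is an assumption chosen for the sketch; the text does not say which policy Robert's hardware or software uses, and the class and function names are hypothetical.

```python
from collections import OrderedDict


class SmallFastMemory:
    """A tiny least-recently-used cache standing in for close-in memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, address, slow_fetch):
        if address in self.slots:
            self.hits += 1
            self.slots.move_to_end(address)  # mark as most recently used
            return self.slots[address]
        self.misses += 1
        value = slow_fetch(address)  # the expensive trip down the pyramid
        self.slots[address] = value
        if len(self.slots) > self.capacity:
            self.slots.popitem(last=False)  # evict the least recently used
        return value


def replay(capacity, pattern):
    """Run an access pattern through the cache and report (hits, misses)."""
    memory = SmallFastMemory(capacity)
    for address in pattern:
        memory.read(address, lambda a: a * 10)
    return memory.hits, memory.misses
```

Replaying the periodic pattern `[0, 1, 2] * 4` through three slots misses only on the first pass. Squeeze the same pattern into two slots and every access misses, because each byte is evicted just before it comes around again.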

“What if you don’t have any of those patterns?”

“Worst case scenario. You guess periodic, buy lots of memory and cross your fingers.”

~~ Rich Olcott

Computer Power, Or Not

A voice from the scone line behind me. “That’s like poetical, sayin’ a horse’s sinews tie muscle to bone and a computer’s internal network is like sinew ’cause it ties things together the same way. But what does for the computer what muscle and bone do for a horse? Hi, Robert, I’m Vinnie, me and Sy here go way back. I’ll have a strawberry scone, Al, and these guys are on me.”

“Sure thing, Vinnie, here ya go.”

“Thanks, Al.” “Thanks, Vinnie.” “Thanks — Vinnie, is it?”

“Yeah. Glad to meetcha. So what are they?”

“The computer equivalent of horse muscle and bone? Well, the horse’s muscle activity generates its power so the computer’s ‘muscles’ are clearly its processors.”

Horse musculature from artwork by Jenny Stout, with permission

“Processors, plural? My heavy-duty desk machine only has one CPU thingy in there, I looked.”

“Only one chip package, Sy, but there’s a lot inside that black block. Your ‘Central Processing Unit’ is probably multi-core, which means it has somewhere between four and dozens of more-or-less independent sub-processors, each with its own set of registers and maybe even local cache memories. If your operating system is multi-core-aware, at any given moment your system could be running a different program on each core.”

“Hey, you’re right, I often download emails and browse the internet at the same time I’ve got a big calculation going. Doesn’t seem to slow it down.”

“Mm-hm. I like to call those cores eccentric processing units because they’re not really central.” <Vinnie pretends to grab Robert’s scone.> “You’ve got a video card in there, too, right?”

“Of course.”

“This may come as a shock, but you probably have more raw compute power on that card than you do in your CPU module. The card’s primary chip has hundreds of millions of transistors allocated to hundreds or thousands of simple-minded micro-micro-processors ganged together to do identical calculations on separate inputs. Rotating a 3-D object, for example, requires three multiplications and two additions for each x-, y- and z-coordinate of every point on the object. No if-then logic, just a very small arithmetic program repeated a gazillion times.”
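The "very small arithmetic program" can be sketched in Python. This version assumes the simplest case, a rotation about the z-axis, and the function name is made up; a graphics card would run the same few multiplies and adds on many points at once, where this loop takes them one after another.

```python
import math


def rotate_about_z(points, angle_degrees):
    """Rotate every (x, y, z) point about the z-axis by the same angle."""
    c = math.cos(math.radians(angle_degrees))
    s = math.sin(math.radians(angle_degrees))
    # Identical arithmetic for every point, with no branching on the data.
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]
```

Each output coordinate is a short sum of products of the inputs, and no point's result depends on any other point's, which is exactly the kind of work that can be ganged across thousands of simple processors.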

“So the main CPU doesn’t have to do that.”

“Right. Same principle, you may have ASICs in there devoted to certain tasks like network interfacing.”

“A-six?”

“Application Specific Integrated Circuits. They’re everywhere from your smartphone to your hobby drone.”

“Don’t have a hobby drone, use mine for business.”

“OK, your business drone, Vinnie. Your drone and its controller both use ASICs.”

“How will the quantum computer play into this, Robert? I’ve been reading how it automagically tries all possible solutions and instantly comes up with the one that solves the problem.”

“That’s hype, mostly. Quantum computing could indeed give quick solutions, but to a very limited set of problems. For instance, everyone talks about factoring special large numbers. When QC succeeds in that it’ll disrupt internet security, cryptography, blockchain applications and a couple more not-here-yet technologies that depend on factorization being hard to do. But QC can only tackle problems that involve a small amount of data. It’s no good for Big Data kinds of problems like weather modeling or fingerprint matching or rummaging through a medical database to find the optimal treatment for a given collection of clinical findings.”

“Why’s that?”

“A quantum CPU works with a set of constraints and inputs. It does its try-everything thing to generate an output that’s consistent with the constraints and input. The factorization constraint, for example, is just one algorithm. The input is a single number. The output is one set of factors. Compare that with the weather problem where the goal is to calculate the future weather for every kilometer-by-kilometer-by-kilometer cell of atmosphere on the globe. The constraint is a whole series of equations governing atmospheric gases (especially water) together with the topography of the underlying surface. Each cell’s input is all the measurable weather variables (temperature, humidity, wind velocity, clouds, whatever) plus history for that cell and its neighbors. The output per time-step is predicted weather variables for a billion cells. Quantum’s no help with that data flood; you need good networks.”

~~ Rich Olcott

The Lengths We Go To

A new face in the scone line at Al’s coffee shop. “Morning. I’m Sy Moire, free-lance physicist and Al’s steadiest customer. And you’re…?”

“Robert Tobanu, newest Computer Science post-doc on Dr Hanneken’s team. He needed some help improving the performance of their program suite.”

“Can’t he just buy a faster computer?”

“He could if there is a faster computer, if his grant could afford its price tag, and if it’s faster in the way he needs to solve our problems. My job is to squeeze the most out of what we’ve got on the floor.”

“I didn’t realize that different kinds of problem need different kinds of computer. I just see ratings in terms of mega-somethings per second and that’s it.”

“Horse racing.”

“Beg pardon?”

“Horse speed-ratings come from which horse wins the race. Do you bet on the one with the strongest muscles? The one with the fastest out-of-the-box time? The best endurance? How about Odin’s fabulous eight-legged horse?”

“Any of the above, I suppose, except for the eight-legged one. What’s this got to do with computers?”

“Actually, eight-legged Sleipnir is the most interesting example. But my point is, just saying ‘This is a 38-mph horse’ leaves a lot of variables up for discussion. It doesn’t tell you how much better the horse would do with a more-skilled jockey. It doesn’t say how much worse the horse would do pulling a racing sulky or a fully-loaded Conestoga. And then there’s the dash-versus-marathon aspect.”

“I’m thinking about Odin’s horse — power from doubled-up legs would be a big positive in a pulling contest, but you’d think they’d just get in the jockey’s way during a quarter-mile dash.”

“Absolutely. All of that’s why I think computer speed ratings belong in marketing brochures, not in engineering papers. ‘MIPS’ is supposed to mean ‘Millions of Instructions Per Second’ but it’s actually closer to ‘Misleading Indication of Processor Speed.’”

“How do they get those ratings in the first place? Surely no-one sat there and actually counted instructions as the thing was running.”

“Of course not. Well, mostly not. Everything’s in comparison to an ancient base-case system that everyone agreed to rate at 1.0 MIPS. There’s a collection of benchmark programs you’re supposed to run under ‘standard’ conditions. A system that runs that benchmark in one-tenth the base-case time is rated at 10 MIPS and so on.”
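The rating arithmetic Robert describes is a simple ratio. A one-function sketch in Python, where the function and parameter names are hypothetical and the timings would come from whatever benchmark run you trust:

```python
def relative_mips(base_seconds, system_seconds, base_rating=1.0):
    """Rating relative to the base-case machine, which is defined as base_rating MIPS."""
    return base_rating * base_seconds / system_seconds
```

A system that finishes in 10 seconds what the base case needs 100 seconds for gets `relative_mips(100.0, 10.0)`, a rating of 10. Notice that nothing in the formula says which benchmark was run or under what conditions.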

“I heard voice-quotes around ‘standard.’ Conditions aren’t standard?”

“No more than racing conditions are ever standard. Sunny or wet weather, short-track, long-track, steeplechase, turf, dirt, plastic, full-card or two-horse pair-up — for every condition there are horses well-suited to it and many that aren’t. Same thing for benchmarks and computer systems.”

“That many different kinds of computers? I thought ‘CPU’ was it.”

Horse photo by Helena Lopes on Unsplash

“Hardly. With horses it’s ‘muscle, bone and sinew.’ With computers it’s ‘processor, storage and network.’ In many cases network makes or breaks the numbers.”

“Network? Yeah, I got a lot faster internet response when I switched from phone-line to cable, but that didn’t make any difference to things like sorting or computation that run just within my system.”

“Sure, the external network impacts your upload and download performance, but I’m talking about the internal network that transports data between your memories and your processors. If transport’s not fast enough you’re wasting cycles. Four decades ago when the Cray-1’s 12.5-nanosecond cycle time was the fastest thing afloat, the company bragged that it had no wire more than a meter long. Guess why.”

“Does speed-of-light play into it?”

“Well hit. Lightspeed in vacuum is 0.3 meters per nanosecond. Along a copper wire it’s about 2/3 of that, so a signal takes about 5 nanoseconds each way to traverse a meter-long wire. Meanwhile, the machine’s working away at 12.5 nanoseconds per cycle. If it’s lucky and there’s no delay at the other end, the processor burns a whole cycle between making a memory request and getting the bits it asked for. Designers have invented all sorts of tricks to get those channels as short as possible.”
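Robert's arithmetic, worked out in Python using only the rounded figures from the conversation. The constant and function names are made up for the sketch.

```python
SPEED_OF_LIGHT_M_PER_NS = 0.3  # vacuum, rounded as in the conversation
COPPER_FRACTION = 2 / 3        # signal speed along the wire relative to vacuum
CRAY1_CYCLE_NS = 12.5


def one_way_delay_ns(wire_length_m):
    """Nanoseconds for a signal to travel the length of the wire."""
    return wire_length_m / (SPEED_OF_LIGHT_M_PER_NS * COPPER_FRACTION)


def cycles_for_round_trip(wire_length_m, cycle_ns=CRAY1_CYCLE_NS):
    """Fraction of a machine cycle spent on the request going out and the bits coming back."""
    return 2 * one_way_delay_ns(wire_length_m) / cycle_ns
```

For a one-meter wire the one-way delay comes to about 5 nanoseconds and the round trip to about 0.8 of a 12.5-nanosecond cycle. Since the processor can't use a fraction of a cycle, that's the whole cycle gone, before counting any delay in the memory itself.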

“OK, I get that the internal network’s important. Now, about that eight-legged horse…”

~~ Rich Olcott

  • Thanks to Richard Meeks for asking an instigating question.