What's Past is Prologue

If you want to build something new, it's best to start with history. I am a firm believer in that. The best systems we have at our disposal today, software or otherwise, have evolved over years, decades, and centuries from simple beginnings. Rarely is a new system designed and built from scratch in a short time, and even then the successful ones borrow liberally from the systems that came before them.

Of Software Systems and Planets


A few months ago I read an article from David A. Dalrymple, whose opinion seemed to contradict these ideas when it comes to software systems. I found it rather curious, and it's been simmering in the back of my mind since then. I don't intend for this article to be directed at David or to discount the notion that we should strive to improve the systems we work with. I think many people share his viewpoint, and I offer this as another perspective on how computing history has developed, using some quotations from his article to guide the debate. The crux of his article is laid out relatively early (emphasis is David's):
This is the context in which the programming language (PL) and the operating system (OS) were invented. The year was 1955. Almost everything since then has been window dressing (so to speak). In this essay, I’m going to tell you my perspective on the PL and the OS, and the six other things since then which I consider significant improvements, which have made it into software practice, and which are neither algorithms nor data structures (but rather system concepts). Despite those and other incremental changes, to this day, we work exclusively within software environments which can definitely be considered programming languages and operating systems, in exactly the same sense as those phrases were used almost 60 years ago. My position is:
  • Frankly, this is backward, and we ought to admit it.
  • Most of this stuff was invented by people who had a lot less knowledge and experience with computing than we have accumulated today. All of it was invented by people: mortal, fallible humans like you and me who were just trying to make something work. With a solid historical perspective we can dare to do better. 
I have a problem with this line of thought. People coming from this position think that most of what we're using now is suboptimal because the people who created it didn't know what they were doing. The logical conclusion would be that, now that we have decades of experience with computing, we should throw the old stuff out and create whole new systems that do things better. The problem is that there is no guarantee such an undertaking would result in anything substantially better than what we have now. Robust systems are not developed this way. They grow and evolve over time, as John Gall said:
A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.
No other field of study or technology throws out a large portion of what it has already developed in order to do it again from scratch, because that approach does not work. Complex systems evolve from simple systems. Advanced ideas grow out of the extension and combination of established ideas. To get a different perspective on the development of ideas over time, let's take a moment and think about the planets.

In ancient times Ptolemy published the Almagest, laying out the motion of the planets in a geocentric universe where the sun, planets, and stars moved around the Earth on the surfaces of concentric spheres. This model held for many hundreds of years until 1543, when Copernicus published a heliocentric model of the universe that improved and simplified the predictions of planetary motion. It put the sun at the center of the universe and moved Earth out to the third orbiting planet.

Most people didn't give the heliocentric model much thought until Galileo turned the newly invented telescope to the sky and began observing the phases of Venus in 1610. The phases of Venus strongly disproved a pure geocentric model of the planets and paved the way, over strong objections from the Church, for the heliocentric model. Kepler further developed the model with his three laws of planetary motion, showing that the planets actually trace out ellipses instead of circles, sweep out equal areas in equal times, and have orbital periods tied directly to the sizes of their orbits.

Isaac Newton unified the motion of the planets with the motion of objects on Earth through his universal law of gravitation and his laws of motion. His laws and Kepler's laws of planetary motion led to the discovery of Neptune when the orbit of Uranus was found to be irregular and not fully described by them. Urbain Le Verrier calculated exactly where the new planet must be based on Uranus' orbit, and Johann Gottfried Galle found it in 1846, right where Le Verrier said it would be. This was a huge triumph of Kepler's and Newton's laws.
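
For concreteness, the machinery behind that prediction fits in a single line. In standard notation (my own summary, not anything from the original article), Newton's law of gravitation implies Kepler's third law for a planet of mass m orbiting the Sun:

    F = \frac{G m_1 m_2}{r^2} \quad\Longrightarrow\quad T^2 = \frac{4\pi^2}{G(M_\odot + m)}\, a^3

Here T is the orbital period and a the semi-major axis of the ellipse. Deviations of Uranus from the orbit these equations predict were precise enough to point Le Verrier straight at Neptune.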

Even this model was not good enough to explain a peculiarity in the orbit of Mercury: its perihelion precesses slightly faster than Newtonian gravity predicts. Mercury is close enough to the Sun to experience relativistic effects of gravitation that we can observe with high-precision measurements, and it wasn't until Einstein came up with his general theory of relativity that the deviation could be explained.

Over the course of centuries a great number of people developed and refined our model of the motion of the planets. At each step incremental changes were made that built on the work that came before. Indeed, it was Isaac Newton who said, "If I have seen further it is by standing on the shoulders of giants." If Newton were born today and grew up to be the same wickedly intelligent person that he was in his time, I have no doubt that he would completely grasp Einstein's work and the work of those who came after him, and he would be extending modern physics into new and unfathomable areas. The same is happening now with programming and computer science: we are extending the solid foundation we have, not tearing it up and trying to start from scratch.

The people who invented our computing paradigms were geniuses, and to pass off their accomplishments as ignorant attempts at developing systems for the future is naive. People like Alan Turing and John von Neumann built an incredible foundation for the software systems we have today, and they too stood on the shoulders of the giants who came before them. The systems we are using today have proven to be quite flexible and extensible. The fact that they're still around, yet much improved, should be evidence enough of that, and we will continue to evolve these systems in the future.

Text Vs. Graphics


Here is another peculiar stance in the article, this time on textual vs. graphical languages:
It is bizarre that we’re still expressing programs entirely with text 59 years later when the first interactive graphical display appeared 4 years later.
Language is the most natural, and yet most frustrating, way of expressing ideas that we have at our disposal. It has been developing for tens of thousands of years; to think that we could suddenly replace it with something else in less than 60 years is bizarre. Writers have struggled to express their ideas through words since words were invented, and the best of them show that the written word is more than adequate. William Shakespeare, Ernest Hemingway, Stephen King, and countless other great authors have shown what language is capable of.

A picture may be worth a thousand words, but they are different words for every person who looks at it. Visual art by its very nature is subjective, so how are we supposed to develop a precise visual representation of a program without resorting to language semantics? Language may also be imprecise, but it can at least be made rigorous. Physics and mathematics have shown that to be true.

Using any of the few graphical languages out there for programs of significant size quickly shows how cumbersome they get. LabVIEW gets really messy really fast, and programs become incredibly rigid and hard to change. Don't even think about doing delegation or lambdas in it. MATLAB's Simulink is fairly similar and takes large numbers of symbols to represent relatively simple concepts. Digital hardware designers moved away from schematics in favor of textual HDLs as quickly as they could because of the overwhelming advantages of text for representing dense, complex hierarchical structures.
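
As a rough illustration of the density that text affords (a hypothetical Python sketch of my own, not something from the original article), here is the kind of higher-order function that is painful to express as boxes and wires:

    # Compose two functions into a new one -- two lines of text,
    # versus a tangle of wires in a graphical dataflow language.
    def compose(f, g):
        return lambda x: f(g(x))

    # Build a conversion on the fly from two tiny lambdas.
    fahrenheit_to_kelvin = compose(lambda c: c + 273.15,        # Celsius -> Kelvin
                                   lambda f: (f - 32) * 5 / 9)  # Fahrenheit -> Celsius

    print(fahrenheit_to_kelvin(212))  # 373.15

Every name, argument, and composition step is explicit, searchable, and diffable, which is exactly what gets lost when the same structure has to be drawn as a diagram.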

Most programming languages are text-based because text is superior to graphics for representing complex procedural and computational ideas.

Operating Systems


The author's views on operating systems were also confusing:
The bizarreness about operating systems is that we still accept unquestioningly that it’s a good idea to run multiple programs on a single computer with the conceit that they’re totally independent. Well-specified interfaces are great semantically for maintainability. But when it comes to what the machine is actually doing, why not just run one ordinary program and teach it new functions over time? Why persist for 50 years the fiction that every distinct function performed by a computer executes independently in its own little barren environment?
Is this not exactly what an OS already accomplishes? It's not the only feature of an OS, but an OS can certainly be thought of as "one ordinary program" that you teach "new functions over time." How have we not already achieved this?

Isolating programs running on top of the OS is a good thing. Dealing with programs that are aware of other programs in the system becomes mind-bending rather quickly. The nearly infinite additional complexity of this type of environment in the general case is something we rationally decided to avoid. We used to have systems that allowed programs to walk all over each other, not necessarily intentionally. One such system was called DOS. Do we really want to go back to something like that? I don't.
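
Here is a tiny sketch of what that isolation buys us, using Python's multiprocessing module as a stand-in for separate OS processes (my own example, assuming nothing beyond the standard library):

    import multiprocessing

    counter = 0  # state that lives in the parent process

    def misbehave():
        # Runs in a child process with its own address space, so this
        # mutation only touches the child's private copy of the module.
        global counter
        counter += 1_000_000

    if __name__ == "__main__":
        child = multiprocessing.Process(target=misbehave)
        child.start()
        child.join()
        print(counter)  # still 0 -- the parent's memory was never touched

In a DOS-style single address space, the equivalent of misbehave() could scribble over the parent's state, or anyone else's. That is precisely the behavior the OS's process model exists to rule out.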

We now have preemptive multitasking and multi-core processors that run dozens of concurrent processes. In this kind of environment, virtualization is a key innovation that simplifies the programmer's task. Languages with concurrency built in, like Erlang, are also a view into the future of programming, but one that builds on the past instead of discarding it.

Software Vs. Hardware


I don't have a problem with the advances that were chosen in the article. I think they were all extremely important and worthy of inclusion on the list. I don't think they were the only ones, though. David Wheeler has compiled an incredibly thorough and detailed list of the most important software innovations, and there are quite a few more than eight. Some happened well before 1955, and plenty happened after 1970. Discounting all other advances in programming as somehow being derivatives of the eight items picked for this article, or claiming they're substantially less important, is too reductionist for me. The author starts to wrap up with this:
I find that all the significant concepts in software systems were invented/discovered in the 15 years between 1955 and 1970. What have we been doing since then? Mostly making things faster, cheaper, more memory-consuming, smaller, cheaper, dramatically less efficient, more secure, and worryingly glitchy. And we’ve been rehashing the same ideas over and over again.
And then in the next paragraph he claims that "Hardware has made so much progress." What kind of progress? The same kind of progress that he denigrates software systems for making. The primary advance in hardware over the past six decades can be summed up in two words: Moore's Law. And that law is getting pretty close to hitting a wall, either physical or economic.

The funny thing about this comparison is that the same argument he's making about software could be made about hardware. Most high-performance microprocessor features in designs today were invented between 1955 and 1970. Branch prediction was considered in the design of the IBM Stretch in the late 1950s. Caches were first proposed in 1962. Tomasulo developed his famous algorithm for out-of-order execution in 1967. And the CDC 6600 of the mid-1960s is often credited as the first superscalar design.

In fact, the CDC 6600 was a goldmine of hardware advances. The designers were some of the first to attempt longer pipelined processors. The 6600 was the first computer to have a load-store architecture. It was also arguably the first RISC processor. Essentially all modern processors are derivatives of the CDC 6600, and all we've been doing since then is making hardware faster, cheaper, wider, and with bigger caches.

I don't really believe that, though. Hardware has come a tremendously long way since 1970. It has been an iterative process with each new innovation building on previous successes, and software is the same way. Almost all modern languages may be able to trace their roots to FORTRAN, but Ruby, Python, C#, Erlang, and the rest are most certainly not FORTRAN. That would be like saying a Porsche Carrera GT is basically a Ford Model T because they both have four wheels, an engine, and you can get the Porsche in black. They are not the same car. We've made a few advances in automobiles since then.

Days of Future Past


This brings me to one of the closing statements of the article, and one of its main underlying ideas:
Reject the notion that one program talking to another should have to invoke some “input/output” API. You’re the human, and you own this machine. You get to say who talks to what when, why, and how if you please. All this software stuff we’re expected to deal with – files, sockets, function calls – was just invented by other mortal people, like you and I, without using any tools we don’t have the equivalent of fifty thousand of.
I think we engineers fall for this type of thinking all too often, but it's exactly backwards. We don't design better systems because we know so much more than those who came before us. We design better systems because we can stand on the shoulders of giants. The systems that have survived the test of time did so because they were designed by wickedly smart people who made them flexible and adaptive enough to evolve into what they are today.

We have plenty of equally smart people alive today, but why waste their time reinventing the wheel? The economist Mark Thoma once quipped, "I've learned that new economic thinking means reading old books." The same applies to us as software engineers. We can design better systems because those older systems exist and we can build on them. We can design better systems because of the knowledge we gain from the experience of those who have gone before us. We can design better systems when we take the best ideas and tools from our history and combine them in new and interesting ways.

We should stop worrying about whether or not we're designing the next big innovation or inventing the next paradigm shift in software systems. The people who achieved these things before us did it because they were solving real-world problems. They didn't know at the time how influential their discoveries would be, and neither will we know the influence of ours. The best we can do is solve the problems at hand as well as we can. When we look back in 50 years, we'll be amazed at what we came up with.
