
Tech Book Face Off: Envisioning Information Vs. Visual Explanations Vs. The Visual Display of Quantitative Information

I came across glowing recommendations for Edward Tufte's books on a number of blogs, and I finally got around to reading them. They're all fairly short books, filled with charts and graphics, so they didn't take long to get through. I'll say right up front that I have mixed feelings about them. The overarching theme of all three books is apparent from their titles. The goal is to describe to the reader the best way to present information in charts and graphics so that the data can be clearly analyzed without distortion or confusion. Tufte obviously cares deeply about the accurate, unencumbered display of information, and parts of each of these books are excellent. But other parts were frustratingly repetitive or overly simplistic. Let's take a deeper look at the merits of each of these visual manuals for the chart designer.


Envisioning Information

Tufte warned the reader in the introduction of this book that the prose would be terse, and he wasn't kidding. Don't expect to read a detailed exposition on the design of charts here. The writing is minimalist, bordering on unbearable, but I guess that's the point. The graphics should speak for themselves, and for the most part, they do, although some of them were too small to read easily. I still felt the writing did the book a disservice because the stilted prose made it much less engaging than it could have been.

It's likely that I read this book out of order. I read it first, going by its publication date, but I think it was actually written after Visual Explanations and possibly also The Visual Display of Quantitative Information (now on its second edition). I think it would have been better to read Envisioning Information after either of the other two books, or even not at all, because it's basically a quick summary of the principles of good chart design with a large number of examples. It's only 100 pages and took a couple hours to read through while examining the charts, and it felt like a series of review exercises for someone who had already been exposed to the material.

The book is split up into six concise chapters: Escaping Flatland, Micro/Macro Readings, Layering and Separation, Small Multiples, Color and Information, and Narratives of Space and Time. Each chapter contains an explanation of the idea and a number of charts and graphics that either exemplify the idea or show how its misuse can corrupt the presentation of data.

Each chapter covers important concepts to think about when designing charts, and in the first chapter, Tufte addresses the restrictions of the 2-dimensional page representing the 3-dimensional world with an analogy to writing. I definitely agree on the restrictiveness of text. Putting certain thoughts or descriptions into a linear sequence, and doing it well, takes a lot of focused effort. Reading explanations and gaining understanding from text is also difficult. Often you need to know many different things at once before you can truly understand a concept, but you can only read one sentence at a time. It takes an accumulation of knowledge before all of the concepts settle out and start to make sense. Designing charts is similarly difficult because charts are flat, but the world is not. Most of the time a chart is trying to distill a large multivariate system into a 2-dimensional representation that can't possibly show everything. The condensation of relevant details and the elimination of superfluous ones is part of what makes a good chart design effective.

A major theme of good chart design is the identification and removal of chartjunk—meaningless graphics and decorations that do not add in the least to the understanding of the data represented. Tufte maintains that charts should be data-rich and data-dense. Chartjunk takes up space that could be used for data at best, and it adds to confusion and distraction at worst.
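
One way to make the chartjunk idea concrete is the data-ink ratio that Tufte defines in The Visual Display of Quantitative Information: the ink devoted to data divided by the total ink used to print the graphic. The Python sketch below is only an illustration of that ratio. The `Element` class, the element names, and the ink amounts are all hypothetical, and in practice deciding what counts as data-ink is a judgment call.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One visual component of a chart (hypothetical model)."""
    name: str
    ink: float      # amount of ink in arbitrary, made-up units
    is_data: bool   # True if erasing it would remove information


def data_ink_ratio(elements):
    """Return data-ink divided by total ink for a list of elements."""
    total = sum(e.ink for e in elements)
    if total == 0:
        raise ValueError("chart has no ink at all")
    data = sum(e.ink for e in elements if e.is_data)
    return data / total


# A cluttered bar chart: most of the ink is decoration.
cluttered = [
    Element("bars", 30.0, True),
    Element("heavy gridlines", 20.0, False),
    Element("frame", 10.0, False),
    Element("drop shadows", 40.0, False),
]

# The same chart with the gridlines and shadows erased.
stripped = [e for e in cluttered if e.name in ("bars", "frame")]

print(data_ink_ratio(cluttered))  # 0.3
print(data_ink_ratio(stripped))   # 0.75
```

Erasing the decoration leaves the data untouched but raises the ratio, which is the whole argument against chartjunk in one number.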

While carrying the theme of reducing chartjunk through the rest of the book, Tufte covers how data can be layered to reveal new insights at multiple levels of detail, how repeated small charts with variations can bring out patterns through comparison, and how the judicious use of color, lines, and borders can greatly assist in understanding data.

One thing I particularly liked about this book, and something that was true of all of his books, was the way his descriptions were always on the same page as the charts they were describing, and if need be, he would repeat a chart so the reader wouldn't have to flip pages to see both the prose and the chart. He even called out this practice at the end of the book:
Descartes did the same thing in his Principia, repeating one particular diagram 11 times. Such a layout makes it unnecessary to flip from page to page in order to coordinate text with graphic.
It's a very nice change from most other books I've read with graphics of any kind (programming books and textbooks included), where graphics could be multiple pages removed from their descriptions.

Visual Explanations

While Envisioning Information was a set of examples presented in quick succession with little discussion, Visual Explanations was a much more engaging read with a smaller set of more focused examples and deeper analysis. All of the same concepts were covered and then some, and the format was much better.

In the first chapter, Images and Quantities, Tufte discussed the importance of correct scaling and appropriate labels to give perspective on charts and graphics. He cautioned that scale can be used to both illuminate and deceive, so you need to be careful how you use it. It's a fundamental concept that you see misused all the time.

Then the second chapter, Visual and Statistical Thinking: Displays of Evidence for Making Decisions, raised the bar with two great stories about the use of charts in the Cholera Epidemic of London, 1854 and the Challenger explosion. The careful analysis of cholera deaths charted on a map of London helped mitigate the epidemic, although it's debatable whether the removal of the water pump handle from the contaminated well stopped new incidents of the disease or if the contamination was already subsiding. Tufte had some words of wisdom on this matter:
... the credibility of a report is enhanced by a careful assessment of all relevant evidence, not just the evidence overtly consistent with explanations advanced by the report. The point is to get it right, not to win the case, not to sweep under the rug all the assorted puzzles and inconsistencies that frequently occur in collections of data.

He warns about the perils of data mining—searching for the best representation of the data to prove the desired outcome—and only presenting positive results.

In contrast, the poor presentation of O-ring data prior to the Challenger launch on that cold, fateful January day in 1986 resulted in a disastrous shuttle explosion. The discussion and analysis of both events was excellent, and the book is worth a read for this chapter alone.

Tufte did go a bit far in his criticisms of Richard Feynman and his makeshift experiment, in which he demonstrated how the O-ring material behaves at cold temperatures by submerging it in a cup of ice water at a public hearing of the Rogers Commission. Feynman obviously knew the difference between a careful scientific experiment and a theatrical demonstration, and I would give him credit for the effective use of the limited resources available to him to prove a crucial point. I'm sure he was well aware that he was in some part playing political theatre.

The next chapter used the field of magic to show how multiple layers of information can be represented with shading and outlines of objects behind other objects. Magic involves showing one thing to the audience while doing something different in the background. Teaching someone to do magic through pictures involves showing both of these perspectives at once, a valuable skill for representing multiple layers of data in a single graphic.

The rest of the book covered a number of other chart design principles, wrapping up with a chapter on many examples of graphics throughout history—some good, some horrible. This chapter went a bit long, and was not as clear and engaging as the rest. Still, it contained some useful advice on how to represent information in the form of pictures while telling a story in much less space than could be done with words.

The Visual Display of Quantitative Information

This book was my favorite of the three. It largely covered the same chart design principles as the other two books, but it went into more depth and the discussion was more organized. In this book Tufte methodically lays out what makes a good chart design and what will ruin otherwise good data.

He starts off with a nice exposition on the development of charts and graphics over time, and finishes the first chapter with a great summary of graphical excellence:
Graphical excellence is the well-designed presentation of interesting data—a matter of substance, of statistics, and of design.

Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency.

Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.

Graphical excellence is nearly always multivariate.

And graphical excellence requires telling the truth about the data.
Those are the goals to shoot for when designing graphics, and the next chapter dives into an equally nice exposition on how to reveal the truth in graphs. This chapter concludes with another nice summary, this time of principles for graphical integrity:
The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented.

Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data.

Show data variation, not design variation.

In time-series displays of money, deflated and standardized units of monetary measurement are nearly always better than nominal units.

The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data.

Graphics must not quote data out of context.
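
Two of these principles reduce to simple arithmetic. In the book, Tufte measures the first with what he calls the Lie Factor: the size of the effect shown in the graphic divided by the size of the effect in the data. Here is a minimal Python sketch of that ratio, along with a price-index deflation for the principle about money. The function names and every number below are my own hypothetical illustrations, not examples taken from the book.

```python
def relative_change(first, second):
    """Fractional change from first to second (0.5 means +50%)."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return (second - first) / abs(first)


def lie_factor(data_first, data_second, drawn_first, drawn_second):
    """Effect shown in the graphic divided by the effect in the data.

    A result near 1.0 means the drawn sizes are proportional to the
    numbers. Undefined when the data does not change at all.
    """
    data_effect = relative_change(data_first, data_second)
    if data_effect == 0:
        raise ValueError("data shows no change, lie factor is undefined")
    return relative_change(drawn_first, drawn_second) / data_effect


def deflate(nominal, index_then, index_base):
    """Convert a nominal amount into base-period money via a price index."""
    return nominal * index_base / index_then


# Hypothetical bar chart: the data rises from 100 to 150 (+50%), but the
# bars are drawn 2.0 cm and 6.0 cm tall (+200%).
print(lie_factor(100, 150, 2.0, 6.0))  # 4.0

# Hypothetical price index: 50 in the earlier year, 100 in the base year,
# so 1000 earlier units buy what 2000 base-year units buy.
print(deflate(1000.0, 50.0, 100.0))    # 2000.0
```

A truncated or stretched axis shows up here as a ratio well above 1, and a nominal time series run through `deflate` stops rewarding inflation as if it were growth.
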
After laying the foundation of graphical excellence and integrity, Tufte uses the rest of the book to show how to achieve these ends. I especially enjoyed the chapter Data-Ink Maximization and Graphical Design, where Tufte redesigns a number of different types of plots. Some redesigns I liked, some I didn't, but all of them made me think about why I preferred certain designs. His redesign of the histogram was pretty slick. He removed the chart border and tick marks, instead putting white lines through the bars themselves to mark grid values. It had a very clean look. On the other hand, I thought his redesign of the box plot as a new plot called a quartile plot, with its middle line offset from the line showing the extents of the data, was less clear than the original. I prefer a box plot with dots showing the max and min of the data set, a line going from the first to the third quartile, and a short cross bar at the median. It's still minimalist, while making the important qualities of the plot much easier to discern. Most of his redesigns don't seem to have caught on since this is the first I've seen of them in many years of looking at innumerable charts and graphs.
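
For reference, the handful of values that a conventional box plot is drawn from can be computed in a few lines. This is a rough sketch using Python's standard `statistics` module. The function name and the sample data are made up for illustration, and quartile conventions vary between tools (this one uses the module's default "exclusive" method), so other software may give slightly different quartiles for the same data.

```python
import statistics


def box_plot_summary(values):
    """Return the five values a basic box plot marks for a data set."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")
    # quantiles(n=4) returns the three cut points between the quarters.
    q1, median, q3 = statistics.quantiles(data, n=4)
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
    }


summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
# {'min': 1, 'q1': 2.5, 'median': 5.0, 'q3': 7.5, 'max': 9}
```

Whatever the drawing style, box plot or quartile plot, these same five numbers are all the ink has to encode.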

Throughout the book, Tufte seems to focus too much on increasing the density of information in charts. He advocates shrinking charts down to increase their resolution, and it gives the impression that higher density of information is always better. If a chart doesn't display more than a couple dozen values, he thinks that the data is better shown in a table than a chart. I somewhat disagree with this sentiment. While well-designed, high-density charts can be great for seeing patterns in complex data, simple charts can also be quite illuminating, and not everyone cares to squint to see them. A clean, organized chart of a dozen, or even a half dozen, values can make a point immediately obvious. A table of that same data would take more time and effort to discern the same pattern. Human beings are much better at pattern recognition of visual displays than of rows of numerical values, and the enabling of that recognition should also be a primary concern when displaying data.

Tufte wraps up the book with a summary chapter that brings the overarching concepts of good design together with a few interesting final thoughts. He advocates a better integration of text and graphics so that they flow together seamlessly, with neither able to stand alone. Instead of referencing graphics in the text with "(See Figure 5)" and attempting to fully describe the graphics in the prose, we should use the text to tell the reader how to interpret the graphics and annotate the graphics directly with concise labels of important features. That's quite a departure from how we learned to combine graphics and text in school, but it makes some sense. The goal is clarity and understanding, and a well-designed text and graphic combination will have those features without the constant referencing of the graphic in the text, often on a separate page. The information is most easily absorbed when important points are as close to the source as they can get.

Overall, I thought this book was quite good. Tufte covers the principles of chart design well, and he shows rather than tells the reader what he means with plenty of thoughtful examples. I would definitely recommend it to anyone who designs visual displays of data or wants to understand what makes a chart clear and effective.

Less is More

Despite the fact that these are all short books, with fewer than 500 pages between them and half of that being graphics, I wouldn't recommend reading all of them. They all cover the same topics, more or less, and reading them all gets fairly repetitive. The Visual Display of Quantitative Information is the most complete of the three books, and I would say the most engaging read. Even though I didn't agree with everything, it never failed to help me clarify my own thoughts, and if anything, that is the mark of a good book. If you really enjoy that book and want more, then go for Visual Explanations next. It covers most of the same material in a different way, and the in-depth examples of the Cholera Epidemic and the Challenger Explosion are a great read. Envisioning Information can safely be skipped since it doesn't cover anything new, and the material it does have is more superficial than the other two books. In this case less is more, and The Visual Display of Quantitative Information is enough.