A History of Treemap Research at the University of Maryland

Extract from an article by Ben Shneiderman

(Note that the notation treemap and heatmap are inter changeable in the context of this article.)

During 1990, in response to the common problem of a filled hard disk, I became obsessed with the idea of producing a compact visualization of directory tree structures. Since the 80 Megabyte hard disk in the HCIL was shared by 14 users it was difficult to determine how and where space was used. Finding large files that could be deleted, or even determining which users consumed the largest shares of disk space were difficult tasks.

Tree structured node-link diagrams grew too large to be useful, so I explored ways to show a tree in a space-constrained layout. I rejected strategies that left blank spaces or those that dealt with only fixed levels or fixed branching factors. Showing file size by area coding seemed appealing, but various rectangular, triangular, and circular strategies all had problems. Then while puzzling about this in the faculty lounge, I had the Aha! experience of splitting the screen into rectangles in alternating horizontal and vertical directions as you traverse down the levels. This recursive algorithm seemed attractive, but it took me a few days to convince myself that it would always work and to write a six line algorithm. This algorithm and the initial designs led to the first Technical Report (HCIL TR 91-03) in March 1991 which was published in the ACM Transactions on Graphics in January 1992. Choosing the right name took probably as long, but the term 'treemap' described the notion of turning a tree into a planar space-filling map.

My initial design simply nested the rectangles, but a more comprehensible design used a border to show the nesting. Finding an effective visualization strategy took only a few months but producing a working piece of software took over a year. Brian Johnson implemented the algorithms and refined the presentation strategies while preserving rapid performance even with 5,000 node hierarchies. The TreeViz application ran on color Macintosh models and led to the widely cited paper (HCIL TR 91-06) jointly authored paper in the October 1991 IEEE Conference on Visualization. This paper was reprinted in Readings in Information Visualization. I think treemaps are a convenient representation that has unmatched utility for certain tasks. The capacity to see tens of thousands of nodes in a fixed space and find large areas or duplicate directories is very powerful. I still use TreeViz for cleaning up my Macintosh. It does take some learning for novices to grasp the tree structure layout in treemaps, but the benefits are great.

PhD student Brian Johnson's implementation added many other interesting features such as zooming, sound (as a redundant or independent code, for example, larger files had a lower pitched sound), hue/saturation control, many border variations, and labeling control. We struggled to deal with the problem of many small files in some directories, but wound up showing only a blackened area that invited closer examination by zooming. We knew that encoding a linear variable such as file size as an area was breaking a graphic design guideline, but the benefits of seeing a large range of file sizes seemed like a compensation. We also knew that visually comparing long narrow rectangles to squarish ones was problematic, but cursoring over the boxes produced the exact file size on the bottom of the display.

My excitement about treemaps was great and like many innovators I thought millions of users would be using this tool within a few years. Our minds were not focused on getting a patent, since I thought this was more of a concept that a product. Brian's implementation of TreeViz was registered with the University's Office of Technology Liaison which sought to distribute TreeViz.

We found that new users took 10-15 minutes to get acquainted with the treemap display, so we began to explore improvements and training methods. We were impressed to examine thousands of nodes at 5-7 levels at once on the screen, but novices did better seeing 20-50 nodes at 1-3 levels. We had to bring our training times up to about 15 minutes in order to demonstrate the strong benefits of treemaps. Brian Johnson's dissertation (HCIL TR 94-04) reports on two studies, that were never published, that sort out the benefits of his treemaps. The 1992 HCIL Video Reports and the 1993 HCIL Video Reports showed TreeViz in action. TreeViz for Apple Macintosh is available for free downloading.

Masters student David Turo also built a treemap system on the Sun workstation and to make it more comprehensible we chose a fixed-level hierarchy. We used an appealing and familiar sports application: 453 basketball players, organized into the 27 teams in four leagues of the National Basketball Association. We had 48 statistics about each player for the 1991-92 season, so users could chose color and area coding from points scored, fouls, free throws, etc. . The 1993 HCIL Video Reports showed Turo's system with the basketball data. His Masters thesis (Unpublished!) describes his implementations and an empirical study. Johnson and Turo cooperated on a paper describing improvements they made to the visual presentation (HCIL TR 92-06).

By now we were pushing ahead on several application domains. A German visitor, Alexander Jungmeiseter worked with Dave Turo's implementation and built a stock portfolio visualization that showed clients, portfolios, industry groups, stocks and trades (HCIL TR 92-14) . Size might indicate worth of the holdings and color might indicate the degree of increase/decrease in value. I still believe that a worthwhile application would be a stock market monitor that would show the current daily trade activity. It could present the 30 Dow Jones Industrials, the Standard and Poor's 500, or all 2700 companies on the New York Stock Exchange. They would be grouped by industry (airlines, chemicals, drugs,...), area coded by volume of trading, and color coded by increase/decrease.

A Japanese visitor, Asahi Toshiyuki, built his own innovative treemap interface to implement the Analytical Hierarchy Process in decision making (HCIL TR 94-08). Users could express their opinions of the relative merits of a decision choice (such as which site to chose for a factory) by pumping up areas for their preferred choices, and pumping up the areas for importance of costs, availability of labor, tax breaks, etc. (Figure 5). The video demonstrates these processes (HCIL Video Reports 1994, and HCIL TR 95-04) and an empirical study showed users could succeed with this tool.

Another success story for treemaps was their inclusion in a satellite management system for Hughes Network Systems (HCIL Video Reports 1994 and HCIL Report TR 94-07). The three-level hierarchy showed each node of a network as a fixed size and color was used to indicate available capacity. The engineering-oriented community of ground station operators grasped this simplified version quickly.

In my travels to lecture about our work, treemaps became a major topic. However, I ran into resistance when showing still images of our hard disk directories with thousands of nodes. Once at the University of Washington after my talk produced a mixed reaction about treemaps, I asked my audience to follow me down to one of their labs. I installed TreeViz and examined their hard disk directories. I immediately spotted a problem, and with a few clues they could see for themselves that there were three copies of the same C compiler installed on this machine. The x-ray vision metaphor had proven to be effective on this occasion. Similarly, at Apple Computers, my audience much preferred the dynamic queries demos of the HomeFinder, but I gave copies of TreeViz to several interested attendees. The next day one of them reported finding many megabytes of useless information on their network servers.

While TreeViz was appreciated for the Apple Macintosh, we were getting requests for a Windows version. Graduate student Marko Teittinen took up the challenge and used Galaxy from Visix to produce a Windows 3.1 implementation called WinSurfer. Marko's carefully scripted video (HCIL Video Reports 1995) showed the features which were meant to match the Windows Explorer. WinSurfer allowed users to view, delete, copy, move, rename and run files. It worked nicely, but novices were often struggling to understand the layout that might show 5000 or more files at 7 or more levels. Simplifying the initial screen presentation to only 2 levels and allowing simpler user control never got implemented. In fact, we never produced a Windows 95 version and Galaxy is no longer a viable commercial product.

However, we did make a temporary patch so as to enter WinSurfer in the Browse-Off competition at ACM SIGCHI's 1997 Conference. This event drew six software tools for exploring hierarchies. There was no clear cut winner and WinSurfer was a leader in some tasks.

Micro Logic Corp, a New Jersey company, sells a commercial product, called DiskMapper for Windows machines based on the treemap idea. They have received great press attention and awards for their product. The University of Maryland receives a modest royalty on DiskMapper by way of a license agreement with the Office of Technology Liaison.

Statisticians point out that the mosaic display, shown by Bertin, and others is similar to the treemap concept. For fixed level hierarchies there is a great similarity, but the gist of the treemap idea was intimately tied to the computerized implementation and user control panel for setting attributes. A discussion of mosaic displays and an implementation that allows rapid switching of the hierarchy is in a Univ of Maryland graduate student project: CatTrees: Dynamic Visualization of Categorical Data Using Treemaps.

During 1997 an implementation was done in Delphi by Univ. of Maryland grad students Jerome Brown and Shaun Gittens. It is called TreeMap97. This general purpose version of slice-and-dice treemaps was revised by Chris North. You can download the software (updated version) and run the sample data set or import your own data sets.

During Summer 2000, the HCIL resumed work on treemaps, when Raghuveera Chalasani developed a Java version that included dynamic queries sliders. By early 2001 we polished his implementation into Treemap 3.0 which is licensed by the Univ. of Maryland's Office of Technology Liaison. Treemap 3.0 includes dynamic queries to filter out unwanted items, squarified layout (as well as slice-and-dice), improved input, infotips, and color/font controls. A demo version with five data sets is available for free. A useful feature of Treemap 3.0 is that it also allows you to visualize the contents of your Windows PC directory structure.

The cluster and squarified treemap algorithms are visually appealing in part because the rectangles are more s quare like. They avoid the thin rectangles in the slice-and-dice algorithm, but sacrifice the alphabetic ordering of nodes. This creates an additional problem when leaf node sizes change because the position of rectangles can alter dramatically. The goal of ordered treemaps was achieved by a novel algorithm that nicely balanced square-like nodes (aspect ratios close to one) while preserving order. This paper, "Ordered Treemap Layouts" was presented at the October 2001 IEEE Symposium on Information Visualization.

However, we found still more ways to improve the layout by organizing the screen space into horizontal (or vertical) strips. This idea helped keep the square-like aspect ratio and made for an easier to follow ordering. We demonstrated this with a study of 20 users. Another innovation, quantum treemaps, satisfy the need to accommodate fixed shape items like page thumbnails or photos. The paper, "Ordered and Quantum Treemaps: Making Effective Use of 2D Space to Display Hierarchies" appeared in ACM Transactions on Graphics (October 2002, vol 21, no 4, pp 833-854).

To support researchers and students, Ben Bederson and Martin Wattenberg wrote an open source Java 1.1 library of five Treemap Algorithms. They are generic, and should be readily usable by arbitrary applications. Each takes a list of numbers and an input rectangle, and generates a set of sub-rectangles that are proportional in area to the input numbers, and partition the space of the input rectangle.

The 5 treemap algorithms implemented are:

  • BinaryTree - Partially ordered, not very good aspect ratios, stable
  • Ordered - Partially ordered, medium aspect ratios, medium stability
  • SliceAndDice - Ordered, very bad aspect ratios, stable
  • Squarified - Unordered, best aspect ratios, medium stability
  • Strip - Ordered, medium aspect ratios, medium stability

Treemap 4, released in February 2003, includes a flexible hierarchy, color binning, improved color setting, aggregation, directory mapping, and export. Our applications include treemaps to show oil production data (SPE paper), project management (PMI paper), and a gene ontology with 14,000 nodes at 16 levels. Our work with biologist Eric Baehrecke led to a special version of Treemap to handle the Gene Ontology and gene expression data from microarray studies (see paper in BMC Bioinformatics 2004, 5:84) In august 2003, substantial code modifications were made to speed loading, permit use as an applet, and enhance usage of tm3 files. In addition, online documentation has been added in the usual textual format, and also as a set of 13 user selectable QuickDemos (animated with sound) that run from 11 to 75 seconds. A 9-minute video overview is also available.

Treemaps were much in the news during late 2005, when stories appeared in the The Economist and other publications including: the Financial Times (" Clearer views of the data mountain"), Informationweek ("Treemaps Rule: Even The Marines Think They're Cool"), OnWallStreet ("Got the Data Blues? Just Map It Out"), and the Univ. of Maryland newspaper the Diamondback (" Technology map out the future").

Back to Technology