Brief Overview of Heat Maps
If you’ve ever checked the weather or followed an election on television, you’re probably familiar with a specific form of data visualization known as a heat map. Heat maps—graphical representations of data in which a spectrum of colors is used to represent the values of individual data points in a set of data—are very popular in the world of data visualization.
Typically, heat maps take the form of an actual geographic map. For example, when watching The Weather Channel, you’re very likely to see various geographic maps of the United States (or a particular region of the US) shaded with various colors representing one of many different types of meteorological data such as temperature and precipitation. Below in Figure 1 is an example of a temperature heat map of the United States—probably the most common and straightforward type of heat map—from November 3rd, 2010. In the map, temperature is represented by colors ranging from white, pink, red, orange, and yellow for the hotter temperatures (think “white-hot,” or “red-hot” for these hotter temperatures; alternatively, think about the colors that make up a flame) to various shades of blue and (not depicted) purple (think “ice-cold”).
Those who closely follow US elections are probably also familiar with heat maps such as the one below in Figure 2. This heat map presents county-level data of election results. The more blue the county, the higher the percentage of Democrat votes. The more red the county, the higher the percentage of Republican votes. Shades of purple represent results closer to a 50-50 split between the two major political parties.
However, heat maps are a very flexible data visualization tool and can be used to represent data on other surfaces besides geographic maps. For example, below in Figure 3 is a heat map of a Webpage from the US Census Bureau Website. This heat map displays the concentration of clicks on the page. The more red the area, the more clicks there are in that area. The more blue the area, the fewer clicks there are in that area. Areas in the orange and yellow areas of the spectrum represent areas with an intermediate number of clicks.
The potential applications of heat maps are virtually limitless due to this flexibility. All that is needed in most cases is a matrix of data (such as one created in a spreadsheet program like Microsoft Excel) and a way to identify where on the surface the values are located (e.g., latitude and longitude coordinates to plot temperature on areas of a map). In its simplest form, a heat map can be superimposed right over the data matrix as in the one below in Figure 4. We’ll use this basic example for pedagogical purposes for the step-by-step guide further below.
Moving to education research, heat maps are often used to represent longitudinal academic data from students as well as clusters of students (obtained from advanced types of statistical analyses such as hierarchical cluster analysis). Below in Figure 5 we see academic performance for eight students from kindergarten through high school. The two major clusters are comprised of a group of students that tended to do well over the years and a group of students that didn’t. What’s useful and interesting with these types of heat maps is that performance trajectories can be represented by looking at the color changes (or stability) from left to right.
Step-by-Step Guide to Making Heat Maps in Microsoft Excel
Heat maps are now very commonly used in leadership development programs and educational assessment to represent individuals’ performance along various important dimensions (e.g., leadership skills, academic performance in a content area). The pedagogical example presenting here comes from a leadership development program. Microsoft Excel 2007 was used to create the heat map.
In the leadership development program, 75 upper-level managers from a large global company participated in a 360-degree feedback intervention. The managers were evaluated on seven areas by their manager, peers, and direct reports. The evaluated areas, or “competencies,” were clusters of critical knowledge, skills, and abilities deemed necessary for success on the job in that organization (e.g., team-building, business acumen). Raters used a six-point scale with lower values representing poorer standing on the competency and higher values representing higher standing on the competency. The data was aggregated across raters to obtain one score for each competency for each participant. What resulted was a matrix of data as shown below in Figure 6 with participants (P) in the rows and competencies (C) in the columns.
After obtaining or creating our matrix of raw data, we need to think about how we want to arrange or sort the data to get the desired graphical representation. In the margins of the matrix shown in Figures 7 and 8, the average rating across competencies for each participant, and the average score across participants for each competency were computed to be used for sorting in the next steps.
Once we have our heat map data and data we need to sort the cells, we can proceed to bring the heat map to life. First, highlight the entire region of the matrix to be represented in the heat map (see Figure 9).
Next (and this will differ slightly by the version of Excel you are using), click on “Conditional Formatting” in the “Styles” section of the “Home” ribbon, and then click “Color Scales” (see Figure 10). Several prearranged color schemes are provided. Click on “More Rules” to customize the color scheme.
We can get very creative here. In this example, we’ll change the Format Style to a “3-Color Scale,” then assign the “Minimum,” “Midpoint,” and “Maximum” fields as blue, white, and red to represent “cold” (low), average, and “hot” (high) values on the competencies (see Figure 11. The choice of color can sometimes be important, but here we just emphasize that it’s important to choose different colors to represent different ranges of values on the continuum.
After selecting the color scheme, the heat map will look like a random arrangement of reds, whites, and blues (see Figure 12). (Note: make sure that you highlight only the raw data and do not include the marginal data when adding the conditional formatting.)
Once the formatting is taken care of, we can now sort the data to create the heat map. In this example, we first rearranged the columns left-to-right (with the Cut and Insert Cut Cells functions) from the competency on which participants scored highest on average (C7) to the one on which participants scored lowest on average (C1). This rank-orders the competencies so we can see where this group of managers is strongest, and where it is weakest. Then, we sorted the rows (Sort & Filter function) by participant average score from highest to lowest. What results is the graphic below (see Figure 13). As can be seen, the data is arranged such that the top left corner displays data from the higher overall performers on the competencies on which the group as a whole was rated more strongly (strengths to leverage), and the bottom right displays data from the lower overall performers on the competencies on which the group as a whole were rated more poorly (areas to develop). An alternative way to interpret this map (and perhaps more useful way depending on the information desired) is to look at the top-right corner which indicates the areas in which top performers were evaluated more poorly (areas to develop), to the bottom-left corner where we have the areas in which the bottom performers were evaluated more favorably (strengths to leverage).
This matrix can be further manipulated by sorting each column separately to see the distribution of the data within each competency (see Figure 14). This is somewhat like creating a stacked bar graph in which each bar totals 100% of the data and different colors represent different ranges that the data points fall in. Note that all row data is now meaningless (i.e., competency scores across a row are not from one participant), but each cell can still be interpreted as the performance for one participant on one competency. With this sorting technique, the group results can be easier to interpret. In this example, it can be seen that the group was evaluated quite highly on Competency 7 (in fact, more than half received a score of 4 or better) and fairly low on Competency 1. Also, we can compare performance on two competencies with almost identical average scores but fairly different distributions. For example, for Competency 4 members of the group largely received moderate evaluations (i.e., lots of white area) whereas for Competency 2 members of the group received evaluations more spread out across the range.
These types of heat maps can be created with various different sorting variables. For example, we could obtain data for the division these managers worked in and see which divisions had the best performers, providing somewhat of a “geographical” heat map within the organization. Or, as was briefly mentioned in the education example further above, we can take overall performance scores across quarters or years to see improvements (or decrements) by arranging longitudinal data in chronological order from left to right (rather than sorting cross-sectional data from high to low as we did in the guide.)
To close this guide out, it should be noted again that this is a very basic heat map. More advanced heat maps require more work and usually different software. Note that there is a free app that can be downloaded (see https://store.office.com/geographic-heat-map-WA103304320.aspx?assetid=WA103304320) and added on to Excel 2013 where one can created geographic heat maps. Below I provide a few more links to other heat map tools, some of which are free and others for which there is a charge for the service.
- Open Heat Map (www.openheatmap.com)
- Crazy Egg (www.crazyegg.com)
- Free alternatives to Crazy Egg (http://www.paulolyslager.com/heatmap-hot-or-not/)