The old adage goes, “Seeing is believing.” However, when it comes to data visualization, we believe the truth is that seeing is understanding.
Ever since Flemish astronomer Michael Van Langren created what is credited as the world’s first data visualization in 1644, data visualizations have been used to make understanding data easier, faster, and more intuitive. After all, while numbers allow for complex calculations, human minds evolved strong visual processing abilities, meaning that, by one often-quoted estimate, we are able to understand visuals about 60,000 times faster than text alone.
However, a lot has changed since the 17th century, and the variety and detail of data visualizations have exploded in line with the amount of data itself. As many aspiring data scientists look to the frontier of data visualization to see how they can showcase their findings, we thought it would be beneficial to highlight our favorite data visualizations to date.
To make our list easy to digest, we highlight each entry’s visualization type, as well as the source and a link to the original project. We pored over roughly 300 data visualizations to find the most distinctive and effective example of each type, with an emphasis on showcasing a diversity of styles.
We hope you enjoy this list as much as we enjoyed putting it together.
As any avid audiophile will tell you, debate is the foundation of any good music appreciation. Wanting to determine which legendary musicians were having the greatest ripple effect in music, Italian data visualizer Michele Mauri created a spiral graph that shows which artists were having their catalogs covered the most by others. The result is a clear and beautiful illustration that uses color coding and line density to quickly demonstrate whose covers were dominant each year, and how that changed over time. An added bonus is the set of small nodes for select years that specify which song was the most popular cover that year.
It’s no secret to anyone that partisanship feels higher these days, but is it really? Data visualizer Mauro Martino attempts to address this question by plotting members of the U.S. House of Representatives as blue and red nodes, and using a linear-repulsion model with Barnes-Hut optimization to draw connections whenever Representatives agreed with those across the aisle. We’re huge fans of scatter plots like this, as the viewer is able to see trends over a 60-year period virtually immediately (whether or not they like the results).
Hey dude, do you call your bro “pal” or “buddy”? This fun and interactive heat map from Quartz uses geo-tagged tweets at the county level to explore how men address one another across the United States. Forensic linguist Jack Grieve used hot-spot testing (a common technique in spatial analysis) to understand geographic trends by measuring the concentration of dude-based language in each area compared to the frequency of that same language in surrounding areas. The result is a fun and simple heat map that lets you know if your duder-onomy is similar to that of those around you.
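For readers curious what a hot-spot test looks like in practice, here is a minimal sketch of one widely used statistic, the Getis-Ord Gi* score, with simple binary weights. Whether this exact statistic and weighting were used for the Quartz map is not stated here, so treat that as an assumption; the region names and rates below are invented purely for illustration.

```python
import math

def getis_ord_gi_star(values, neighbors):
    """Getis-Ord Gi* z-scores with binary weights.

    values:    {region: rate of a term in that region}
    neighbors: {region: iterable of adjacent regions}
    Each region's window is itself plus its neighbors (weight 1), all
    other regions get weight 0. Positive scores suggest a hot spot,
    negative scores a cold spot.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two regions")
    mean = sum(values.values()) / n
    # Population standard deviation; max() guards against tiny negative rounding.
    std = math.sqrt(max(0.0, sum(v * v for v in values.values()) / n - mean ** 2))
    scores = {}
    for region in values:
        window = {region} | {r for r in neighbors.get(region, ()) if r in values}
        k = len(window)  # with binary weights, sum(w) == sum(w^2) == k
        local_sum = sum(values[r] for r in window)
        spread = std * math.sqrt((n * k - k * k) / (n - 1))
        scores[region] = (local_sum - mean * k) / spread if spread else 0.0
    return scores

# Hypothetical counties in a row (A-B-C-D-E) with made-up usage rates of one term.
rates = {"A": 9, "B": 8, "C": 2, "D": 1, "E": 1}
adjacent = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
scores = getis_ord_gi_star(rates, adjacent)
for county, z in scores.items():
    print(f"{county}: {z:+.2f}")
```

In this toy data the western counties score positive and the eastern ones negative, which is the kind of contrast a heat map would then color.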
FiveThirtyTwenty was founded on one concept: budgeting shouldn’t be complicated. By spending 50% of income on needs and 30% on wants, and saving the remaining 20%, Americans should be able to achieve some degree of financial stability. The unfortunate reality? The average American is spending far more on needs (71%) and putting far less into savings (12%). While some data visualizations are impressive because of their complexity, this one is impressive for its simplicity, showing readers exactly how we miss the bar.
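The arithmetic behind that comparison is simple enough to check by hand. The sketch below uses the figures quoted above; the "wants" share is not given in the text, so it is derived here as whatever remains after needs and savings, which is an assumption.

```python
# Target split from the 50/30/20 rule, plus the actual figures quoted above.
TARGET = {"needs": 50, "wants": 30, "savings": 20}
actual = {"needs": 71, "savings": 12}

# Assumption: everything not spent on needs or saved goes to wants.
actual["wants"] = 100 - actual["needs"] - actual["savings"]

def gaps(target, observed):
    """Return observed minus target, in percentage points, per category."""
    return {category: observed[category] - target[category] for category in target}

budget_gaps = gaps(TARGET, actual)
for category, gap in budget_gaps.items():
    print(f"{category:<8} target {TARGET[category]:>2}%  "
          f"actual {actual[category]:>2}%  gap {gap:+d} pts")
```

Under that assumption, needs overshoot the target by 21 points, while wants and savings fall short by 13 and 8 points respectively.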
What happens if you map 1,000 Facebook friends as nodes in a 3D space? That’s what animator Evandro Barbosa sought to examine as he explored the more than 33,000 connections his sample cluster has amassed, with varying degrees of interconnectivity and isolation. While the result is mesmerizing, our only critique is that very little has been written about the methodology, making it more art than science.
The team at digital agency The Pudding wanted to answer one question: how many movies are actually about men? Putting on their data science caps, they parsed 2,000 screenplays (the largest undertaking of its kind), found characters who spoke at least 100 words, and matched each one to the gender listed for that character on IMDb. The result is an interactive, sortable bar chart with the most complete breakdown of which films are male dominated, which are female dominated, and which have gender parity.
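The counting step described above can be sketched in a few lines. This is only an illustration of the general idea, not the team's actual pipeline: the input format, the `gender_of` lookup, and the choice to rank films by share of words spoken are all assumptions, and the sample screenplay is invented.

```python
from collections import defaultdict

MIN_WORDS = 100  # threshold from the text: characters who spoke at least 100 words

def dialogue_share(lines, gender_of):
    """Share of spoken words per gender for one screenplay.

    lines:     iterable of (character, spoken_text) pairs
    gender_of: {character: gender label}, a hypothetical lookup table
    Characters under MIN_WORDS or missing from the lookup are ignored.
    """
    words_per_character = defaultdict(int)
    for character, text in lines:
        words_per_character[character] += len(text.split())

    words_per_gender = defaultdict(int)
    for character, count in words_per_character.items():
        if count >= MIN_WORDS and character in gender_of:
            words_per_gender[gender_of[character]] += count

    total = sum(words_per_gender.values())
    return {g: n / total for g, n in words_per_gender.items()} if total else {}

# Invented toy screenplay: CLERK speaks too little to be counted.
toy_lines = [("ALICE", "word " * 150), ("BOB", "word " * 300), ("CLERK", "word " * 10)]
toy_genders = {"ALICE": "female", "BOB": "male", "CLERK": "male"}
shares = dialogue_share(toy_lines, toy_genders)
print(shares)
```

Sorting films by a share like this is one plausible way to produce the sortable bar chart described above.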
This absolute combo of a graphic uses three different types of graphs to provide the most comprehensive look at who Nobel laureates are. The six line graphs are color coded and split according to the six Nobel Prize categories. The nodes on each graph represent the age of that year’s winner, with a timeline running left to right and a bar marking the average age of the winners in that category. Nodes for women winners are circled.
In addition to the line chart for age, the graphic has a bar chart showing level of educational attainment, a Sankey chart showing which university each laureate attended, and a stacked bar chart at the bottom showing the laureates’ most frequent hometowns, in 30-year increments.
The folks at The Washington Post have had a sneaking suspicion: “breaking” news seems to be happening a lot more frequently these days than ever before. Are the world’s catastrophes truly accelerating in a globalized world, or are news agencies simply taking advantage of your attention span? To answer this question, they mapped breaking news by source and day to track which news outlets were “breaking” news the most relative to their peers.
Kirk Goldsberry has been making a name for himself by bringing the NBA’s most interesting stats to life. His efficiency scatter plots are some of his most liked, using color, direction, and size to immediately convey a multitude of data points. In the graph below, fans can quickly see that not only is Steph Curry one of the most efficient shooters in the league, but he achieves that efficiency with tremendously high shot volume. Russell Westbrook, often criticized for being the opposite, shows up as the inverse of Curry’s efficiency while still maintaining the same high shot volume.
While the pandemic may be a tired subject at this point, this animated heat map is certainly a sight to behold for data enthusiasts. What makes this visualization so compelling is that it exemplifies how animation can capture the dimension of time in what would otherwise be a static data visualization. Because the pandemic was always about viral spread, the ability to show just how quickly a previously uninfected area can become a dense cluster of cases and spill over into neighboring communities is an inseparable part of this graphic. As proponents of adding animated dimensions to our graphics, we found ourselves big fans of this one, despite the morbid news.
Much has been made of Lake Mead’s declining water resources, so it was no surprise to us when our animated line and bar chart became one of the top posts on Reddit’s r/dataisbeautiful. By taking a 30-year retrospective on California’s 10 largest reservoirs, we are able to compare the current drought with cyclical ones in the state’s past, with alarming results.
When James Eagle released this visual, it quickly became one of the most shared data visualizations of the year. Once again demonstrating the power of time as a dimension in data viz, this is an extremely fun watch for anyone who lived through the browser shift of the 2000s, beginning with the collapse of the dominant Netscape Navigator, and the subsequent takeover by Mozilla, then Safari and Chrome. Telling what is ultimately a 30-year business story in 3 minutes, this visualization shows that data is both easy to understand and interesting to watch when presented correctly.
A counterexample to animated charts, this graphic by Chartr uses a 3x3 grid to compare 9 different social media platforms over the same period and show their respective periods of popularity. The reason this method is effective is twofold. First, stacking 9 companies on one graph would likely get so crowded that the individual components would be too difficult to see. Second, because the Y axes of the graphs aren’t uniform, putting them all on one chart would conflate absolute magnitude with relative popularity. Giving each platform its own graph makes it easiest to see each individual “spike”.
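To see why a shared axis hides the smaller platforms, consider what a per-panel Y axis effectively does: it scales each series to its own peak. The sketch below illustrates that idea with two invented series; the platform names and numbers are hypothetical and not taken from the Chartr graphic.

```python
def scale_to_own_peak(series):
    """Rescale non-negative values so the series maximum becomes 1.0."""
    peak = max(series)
    return [value / peak for value in series] if peak else list(series)

# Hypothetical user counts (millions), invented for illustration only.
platforms = {
    "big_platform": [500, 1500, 2500, 2400],
    "small_platform": [2, 40, 15, 5],
}

# On one shared axis, every series is measured against the largest value overall.
shared_max = max(max(values) for values in platforms.values())
shared_axis = {name: [v / shared_max for v in values] for name, values in platforms.items()}

# In small multiples with independent axes, each series fills its own panel.
panels = {name: scale_to_own_peak(values) for name, values in platforms.items()}

print(max(shared_axis["small_platform"]))  # small platform's peak is a sliver of the shared axis
print(max(panels["small_platform"]))       # same peak fills its own panel
```

The trade-off is that panels with independent axes can no longer be compared by absolute size, only by shape and timing.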
It’s no secret that here at Row64, we’re fans of both Python and data visualization. So when we came across this Instagram account creating visualizations using only Python, we were ecstatic. This particular visualization, built with Matplotlib, NumPy, and GeoPandas, illustrates the waterways of Africa, color coded to show the water basins to which they are connected, with a notable absence of color in the Sahara desert.
One of the largest and most high-profile sources of data visualization, The New York Times created a literal 3D heat map that is both timely and important. By color coding squares of longitude and latitude, this 3D globe clearly shows something climate scientists have been warning about for decades: the rise in temperature is being felt most acutely near the ice reserves in the north, which are experiencing a +6°C differential compared to the 20th-century average. While we love where this is headed, we would like to see an animated version in the future.