At the beginning of my professional journey in data visualization, I asked myself: Why do people use data visualization? Why is it necessary? Isn't it enough to have the statistical values and tables that have been used for ages? I began my research and found the answer.

As a data visualizer or data analyst, you communicate insights from data, and this is your most important skill. Usually, the most compelling way to communicate information about data is visual. To create a good data visualization, you need two skills:

  1. A design and artistic component, to create something that is beautiful and compelling
  2. A strong scientific and mathematical component, to deliver the right insights

On Wikipedia, I found an interesting article about Anscombe’s quartet. The quartet consists of four different data sets with nearly identical simple descriptive statistics. They were constructed in 1973 by the statistician Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers and other influential observations on statistical properties.

https://bit.ly/1RnJLIG

Try to answer the following questions using this data set:

  • What is true for the means associated with any of the X columns?
  • What is true for the means associated with any of the Y columns?
  • What is true for the standard deviation associated with any of the X columns?
  • What is true for the standard deviation associated with any of the Y columns?

What did you find?

I hope you arrived at this answer:

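  • The means of the X columns are the same in all four data sets, and so are the means of the Y columns (to two decimal places)
  • The standard deviations of the X columns are the same, and those of the Y columns are nearly the same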
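
If you want to check this yourself, here is a minimal Python sketch. It assumes the quartet values as they are commonly reproduced from Anscombe's 1973 paper; the names `quartet` and `summary` are only illustrative and are not part of any library.

```python
from statistics import mean, stdev

# Anscombe's quartet. Data sets I, II and III share the same X values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summary(xs, ys):
    """Return (mean X, mean Y, sample SD of X, sample SD of Y), rounded to 2 places."""
    return tuple(round(v, 2) for v in (mean(xs), mean(ys), stdev(xs), stdev(ys)))


# One summary per data set, so they can be compared side by side.
summaries = {name: summary(xs, ys) for name, (xs, ys) in quartet.items()}

for name, values in summaries.items():
    print(name, values)
```

Rounded to two decimal places, all four data sets print the same summary: means of 9 and 7.5, and sample standard deviations of about 3.32 and 2.03.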

If you visualize this data, you will see different graphs:

https://bit.ly/1RnJLIG

Only by visualizing the data can you see that each data set tells a different story, even though their statistical characteristics are nearly identical.

But that’s not all!

Alberto Cairo created the Datasaurus data set, which urges people to “never trust summary statistics alone; always visualize your data”. Following the same idea as Anscombe, Justin Matejka and George Fitzmaurice extended it into a more advanced collection, the Datasaurus Dozen. In it you will find data for 13 different graphs:


I calculated the standard deviation, correlation, and average for every pattern and, as you can see in the table, they are the same!
Once again, summary statistics alone are not enough: they can be practically identical for data sets that look completely different. Only those who visualize the data see the real picture of what is happening!
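
A calculation like this can be sketched in a few lines of Python. This is only an illustration, not the exact procedure used for the table: it assumes the data is available as rows of (pattern name, x, y), and the helper names `pearson`, `summarize` and `load_rows`, as well as the column names `dataset`, `x` and `y`, are hypothetical.

```python
import csv
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarize(rows):
    """Group (name, x, y) rows by name and compute summary statistics per group."""
    groups = defaultdict(lambda: ([], []))
    for name, x, y in rows:
        groups[name][0].append(float(x))
        groups[name][1].append(float(y))
    return {
        name: {
            "mean_x": mean(xs),
            "mean_y": mean(ys),
            "sd_x": stdev(xs),
            "sd_y": stdev(ys),
            "corr": pearson(xs, ys),
        }
        for name, (xs, ys) in groups.items()
    }


def load_rows(path):
    """Yield (name, x, y) from a tab-separated file.

    Assumed layout: a header line with the columns dataset, x, y.
    """
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            yield row["dataset"], row["x"], row["y"]
```

With a local copy of the data in that layout (the file name here is an assumption), `summarize(load_rows("DatasaurusDozen.tsv"))` returns one dictionary of statistics per pattern, which you can round and compare side by side.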

And last but not least:

Let’s have a look at the “Bandwidth of our senses” diagram, created by the Danish science writer Tor Nørretranders. It compares the amount of information each of our senses takes in per second, and so shows the power of vision compared to our other senses. It illustrates why visualizations are so effective at conveying huge amounts of information in a split second.

https://bit.ly/2JH8afd

Sight takes up the majority of the frame here. In fact, it has been said that sight processes information at roughly the bandwidth of a computer network, or an Ethernet cable, while the sense of taste has the bandwidth of a calculator. The small white box in the bottom corner is only 0.7% of the area, and it represents what we are consciously aware of while all this processing is happening.

Text alone is unlikely to be remembered in the long term. A graph in the text helps readers to:

  • Grasp the content of the text faster
  • Avoid misunderstanding the context
  • Remember the content of the article for a long time

Visualizing data is a modern method of communicating information. It cannot be avoided, considering how much data we have today. The volume of information in digital form grows day by day, minute by minute. We need to learn how to process this data faster in order to extract the important information, and data visualization is one of the most important tools in this process.