Core Foundations of Data Visualization
Understand key visualization terminology, the definitions and scopes of data, information, and narrative visualization, and their historical foundations.
Summary
Data Visualization: Terminology and Foundations
Understanding Variables in Data Visualization
Before you can visualize data effectively, you need to understand what kind of data you're working with. Data visualization techniques depend fundamentally on the type of variables you're representing.
Categorical Variables
Categorical variables group objects into distinct categories based on characteristics. These variables don't represent numerical measurements—instead, they describe qualities or classifications.
There are two important subtypes:
Nominal variables have no intrinsic order. Examples include: color (red, blue, green), country names (France, Japan, Brazil), or product types (shoes, hats, coats). The categories are simply different from one another with no ranking.
Ordinal variables have a natural, meaningful order. Examples include: education level (elementary, high school, college, graduate) or customer satisfaction (poor, fair, good, excellent). The categories exist in a sequence, though the distances between them may not be equal.
Quantitative Variables
Quantitative variables represent numerical measurements and amounts. These are the variables you can perform arithmetic on.
There are two subtypes:
Continuous variables can take any value within a range and can be measured with arbitrary precision. Examples include: height (5.23 feet or 5.234 feet), temperature, or time. In theory, they have infinitely many possible values.
Discrete variables can only take specific, separate values, typically whole numbers. Examples include: number of students in a classroom, number of cars sold, or number of website visitors. The possible values are countable, with gaps between them, though the set need not be finite.
Why does this matter? Different visualization techniques work better for different variable types. Bar charts suit categorical data, while line graphs excel at showing trends in continuous quantitative data. Understanding your variables is the first step toward choosing the right visualization method.
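The variable type also determines which summary operations are meaningful. The following is a minimal Python sketch of that idea, using only the standard library. The sample values and the function names (summarize_nominal, summarize_ordinal, summarize_quantitative) are invented for illustration and do not come from any library.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample data, invented for illustration.
colors = ["red", "blue", "red", "green", "red"]               # nominal
satisfaction = ["good", "poor", "excellent", "fair", "good"]  # ordinal
heights_ft = [5.23, 5.9, 6.01, 5.5]                           # continuous
students_per_class = [28, 31, 25]                             # discrete

# The ordering of an ordinal variable has to be stated explicitly.
SATISFACTION_ORDER = ["poor", "fair", "good", "excellent"]


def summarize_nominal(values):
    """Nominal data supports counting, but not ranking or arithmetic."""
    return Counter(values).most_common(1)[0][0]  # the most frequent category


def summarize_ordinal(values, order):
    """Ordinal data can be sorted by rank, so a median category is meaningful.

    For even-length input this returns the upper of the two middle values.
    """
    ranked = sorted(values, key=order.index)
    return ranked[len(ranked) // 2]


def summarize_quantitative(values):
    """Quantitative data supports arithmetic, such as a mean."""
    return mean(values)


print(summarize_nominal(colors))
print(summarize_ordinal(satisfaction, SATISFACTION_ORDER))
print(summarize_quantitative(students_per_class))
```

Note that averaging the satisfaction labels would require assigning them numbers, which assumes equal distances between categories; the sketch uses a median category instead because ordinal data only guarantees order.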
Tables and Graphs: When to Use Each
Two fundamental ways to present data are tables and graphs. While they both display information, they serve different purposes and work best in different situations.
Tables
A table is a structured arrangement of quantitative data in rows and columns, with categorical labels identifying what each row and column represents. Tables are the traditional way to store and display precise numerical information.
When to use tables:
You need to display exact numerical values with precision
Your audience needs to look up specific data points
You have a small number of data points to show
The audience will benefit from seeing all values simultaneously for comparison
Tables excel when precision and specific value lookup are the goals. If someone asks "What was the exact revenue for Q3?", a table gives you the answer directly.
Graphs
A graph (also called a chart) displays relationships among data by encoding values as visual objects—such as lines, bars, points, or areas—positioned within a coordinate system with axes. Graphs emphasize patterns and relationships rather than exact values.
When to use graphs:
You want to show trends, patterns, or relationships over time
Your audience needs to identify the "big picture" quickly
You have many data points and want to reveal patterns
You want to make comparisons easy and intuitive
The key insight: graphs reveal what tables hide. Looking at a column of numbers in a table doesn't immediately show you whether values are increasing, decreasing, or stable. A line graph makes this obvious instantly.
This visualization principle is often summarized as: tables for lookup, graphs for insight.
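To see the difference concretely, the sketch below renders the same small dataset twice: once as a table and once as a crude text bar chart, which stands in here for a plotted graph. It uses only the Python standard library; the revenue figures and the function names as_table and as_bar_chart are made up for illustration.

```python
# Hypothetical monthly revenue figures, invented for illustration.
revenue = {"Jan": 120, "Feb": 135, "Mar": 150, "Apr": 110, "May": 165, "Jun": 180}


def as_table(data):
    """Render exact values in aligned rows: good for lookup."""
    lines = [f"{'Month':<6}{'Revenue':>8}"]
    for label, value in data.items():
        lines.append(f"{label:<6}{value:>8}")
    return "\n".join(lines)


def as_bar_chart(data, width=30):
    """Encode each value as bar length: good for spotting the overall pattern."""
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<6}{bar}")
    return "\n".join(lines)


print(as_table(revenue))
print()
print(as_bar_chart(revenue))
```

The table answers "what was the exact figure for April?" directly, while the bars make the April dip and the overall upward trend visible at a glance, at the cost of exact values.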
Defining Data Visualization
Data visualization has a specific meaning in practice. It is not simply any picture of data: it is purposeful design intended to help people understand information.
Core Definition
Data visualization is the practice of designing and creating graphic representations of quantitative and qualitative data and information. Its primary purpose is to help an audience explore, discover, understand, interpret, and gain insights from data.
This definition highlights several important points:
It's a design practice, not an automatic process. You make intentional choices about how to represent data.
It works with both types of data: quantitative (numerical) and qualitative (descriptive, categorical).
It serves human goals: exploration, discovery, understanding, interpretation, and insight generation.
Forms of Visualization
Data visualizations can take different forms depending on their purpose:
Static visualizations are fixed images—a chart in a printed report or a saved graph image. They're excellent for communicating a specific finding.
Dynamic visualizations change over time without user interaction—like an animated sequence showing how data evolved year by year.
Interactive visualizations let viewers manipulate and explore the data themselves—zooming in, filtering categories, or switching between views. These empower audiences to answer their own questions.
The form you choose depends on your context: a presentation slide might use static visualization, a dashboard might use dynamic or interactive elements, and an online article might use all three.
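One way to think about the three forms is in terms of who or what decides which frame is drawn. The sketch below is a simplified text-only illustration in Python, not a real charting implementation; the data and the function names (render, static_view, dynamic_view, interactive_view) are hypothetical.

```python
# Hypothetical yearly counts per product type, invented for illustration.
DATA = {
    2021: {"shoes": 40, "hats": 15, "coats": 22},
    2022: {"shoes": 45, "hats": 18, "coats": 30},
    2023: {"shoes": 52, "hats": 12, "coats": 35},
}


def render(year, categories=None):
    """Draw one text frame for a year, optionally limited to some categories."""
    rows = DATA[year]
    shown = categories if categories is not None else list(rows)
    lines = [f"Year {year}"]
    for name in shown:
        lines.append(f"{name:<6}{'#' * (rows[name] // 2)}")
    return "\n".join(lines)


def static_view():
    """Static: a single fixed frame."""
    return render(2023)


def dynamic_view():
    """Dynamic: a fixed sequence of frames that plays without user input."""
    return [render(year) for year in sorted(DATA)]


def interactive_view(year, categories):
    """Interactive: the viewer's choices decide what is drawn."""
    return render(year, categories)


print(static_view())
print(interactive_view(2022, ["hats"]))
```

All three forms share the same rendering step; they differ only in whether the frame is fixed, sequenced by the author, or chosen by the viewer.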
Information Visualization vs. Data Visualization
Although the two terms are often used interchangeably, information visualization is a more specialized concept than general data visualization.
Information visualization deals specifically with large, complex datasets containing both quantitative and qualitative abstract information. Rather than visualizing concrete measurements (like temperature readings or sales figures), information visualization often tackles less tangible data—like how concepts relate to each other, how complex systems are structured, or how people interact in networks.
Key differences:
Scope: Data visualization is broader and applies to many contexts; information visualization focuses on complex, abstract datasets.
Goals: Both aim for understanding, but information visualization specifically emphasizes adding value to raw data, improving comprehension, reinforcing cognition (how people think and remember), and supporting decision-making.
Complexity: Information visualization typically handles more complex relationships and larger datasets that humans cannot easily understand without visual representation.
Example: A graph showing quarterly sales (data visualization) is straightforward. A network diagram showing how thousands of scientists cite each other's papers, revealing research communities and influence patterns (information visualization), requires more sophisticated representation.
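The citation example rests on relational data rather than a column of measurements. As a rough sketch of what sits underneath such a network diagram, the Python below stores citations as edges and counts incoming citations per author. The author names, the edges, and the function names build_adjacency and citation_counts are all invented for illustration, and incoming-citation count is only a crude stand-in for influence.

```python
from collections import Counter

# Hypothetical citation edges (citing author, cited author), invented for illustration.
citations = [
    ("Ada", "Ben"),
    ("Cy", "Ben"),
    ("Dee", "Ben"),
    ("Ben", "Eve"),
    ("Cy", "Eve"),
    ("Dee", "Ada"),
]


def build_adjacency(edges):
    """Map each author to the set of authors they cite."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # make sure cited-only authors appear too
    return graph


def citation_counts(edges):
    """Count incoming citations per author, a crude proxy for influence."""
    return Counter(target for _, target in edges)


graph = build_adjacency(citations)
counts = citation_counts(citations)
print(sorted(graph))
print(counts.most_common(2))
```

With six edges the structure is readable as a list; with thousands of authors it is not, which is why this kind of data calls for a visual representation.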
Narrative Visualization: Telling Stories with Data
While traditional visualizations present data for exploration and analysis, narrative visualization takes a different approach by combining visual elements with storytelling.
Narrative visualization integrates charts, graphs, maps, and other visual elements with a structured narrative flow to convey information through a data-driven story. Rather than showing all data at once and letting viewers explore freely, narrative visualization guides the audience through a predetermined journey, revealing insights in sequence.
Key characteristics:
Structured story flow: Information is presented in a sequence that builds understanding progressively
Guided interpretation: The narrative directs attention to specific patterns and relationships
Emotional context: Stories often incorporate context that makes data more meaningful and memorable
Multiple visual forms: Charts, maps, text, and imagery work together to convey the complete story
Purpose: Narrative visualization helps people identify trends, patterns, and relationships by making data engagement feel natural and compelling—much like reading an article rather than analyzing a spreadsheet.
Example: Instead of presenting a raw dataset about climate change with interactive tools, a narrative visualization might tell the story of how temperatures have changed across specific regions, introducing one location at a time, revealing impacts on communities, and building to a conclusion—guiding the reader through a coherent argument supported by data.
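Structurally, a narrative visualization can be thought of as an ordered list of scenes, each pairing a caption with the data revealed at that step. The Python sketch below illustrates that structure in text form only. The Scene class, the tell function, the region names, and the temperature figures are hypothetical and are not real climate data.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """One step in the story: a caption plus the data revealed at that step."""
    caption: str
    region: str
    change_c: float  # hypothetical temperature change in degrees Celsius


# Hypothetical figures, invented for illustration; not real climate data.
STORY = [
    Scene("We start with one coastal region.", "Region A", 0.8),
    Scene("Inland, the change is larger.", "Region B", 1.3),
    Scene("The far north has changed the most.", "Region C", 2.1),
]


def tell(story):
    """Yield one text frame per scene, keeping earlier regions visible for context."""
    revealed = []
    for scene in story:
        revealed.append(scene)
        lines = [scene.caption]
        for shown in revealed:
            bar = "#" * round(shown.change_c * 10)
            lines.append(f"{shown.region:<10}{bar} {shown.change_c:+.1f}")
        yield "\n".join(lines)


for frame in tell(STORY):
    print(frame)
    print()
```

The author fixes the order of STORY in advance, so the reader meets one location at a time and the final frame carries the conclusion, in contrast to an exploratory view that shows everything at once.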
<extrainfo>
Historical Development of Data Visualization
Understanding the history of data visualization provides context for modern practices, though the specific historical details may have limited direct exam relevance.
Key 20th-Century Contributions
The modern practice of data visualization was shaped by important figures and publications:
John Tukey introduced exploratory data analysis (EDA), a methodology where visualization is a primary tool for discovering patterns, identifying outliers, and suggesting hypotheses about data that can later be tested formally. Tukey's work emphasized that visualization should be a core part of statistical analysis, not just a final presentation step.
Edward Tufte authored The Visual Display of Quantitative Information, a seminal work that established many modern principles for effective data visualization. Tufte's work emphasized clarity, reducing unnecessary decoration ("chartjunk"), and maximizing the "data-ink ratio": the share of a graphic's ink that actually conveys data.
These contributions established visualization not as decoration or afterthought, but as a fundamental tool for understanding data and communicating insights.
</extrainfo>
Flashcards
What are the two subtypes of categorical variables based on their ordering?
Nominal and ordinal
What are the two subtypes of quantitative variables based on the values they can take?
Continuous and discrete
What do quantitative variables represent in a dataset?
Numerical measurements and amounts
Which six groups are used to categorize techniques in the periodic table of visualization methods?
Data
Information
Concept
Strategy
Metaphor
Compound
According to Tamara Munzner, what is the core purpose of providing computer-based visual representations of datasets?
To help people perform tasks more effectively
Does computer-based visualization aim to replace or augment human capabilities?
Augment human capabilities
What is the general practice of designing graphic representations of quantitative and qualitative data called?
Data visualization
What are the three possible states of visual items used in data visualization?
Static
Dynamic
Interactive
What are the main goals of information visualization regarding raw data and cognition?
Add value to raw data
Improve comprehension
Reinforce cognition
Aid decision-making
Which two elements are combined to create narrative visualization?
Visual elements (charts, maps, etc.) and storytelling
What does narrative visualization help people identify through its story-driven experience?
Trends, patterns, and relationships
Which 20th-century figure introduced the concept of exploratory data analysis?
John Tukey
Who authored the influential book "The Visual Display of Quantitative Information"?
Edward Tufte
Quiz
Question 1: Which of the following best describes an ordinal variable?
- A categorical variable with a meaningful order (correct)
- A categorical variable with no intrinsic order
- A quantitative variable measured continuously
- A quantitative variable taking only integer values
Question 2: What do graphs primarily display in data visualization?
- Relationships among data (correct)
- Exact numerical values in a table
- Textual descriptions of categories
- Hierarchical structures of files
Question 3: Which of the following is a category in the periodic table of visualization methods?
- Data (correct)
- Color
- Layout
- Animation
Question 4: According to Munzner, visualization primarily aims to ___.
- augment human capabilities (correct)
- replace human decision‑making
- store datasets efficiently
- encrypt data for security
Question 5: Narrative visualization combines charts, graphs, maps, and other visual elements with ___ to convey data.
- storytelling (correct)
- statistical testing
- database queries
- code compilation
Question 6: What categories can visual items used in data visualization belong to?
- Static, dynamic, or interactive (correct)
- Printed, auditory, tactile
- 3‑D, holographic, virtual reality only
- Animated, cinematic, narrative
Question 7: Data visualization chiefly aims to enable the audience to ___ with data.
- Explore and gain insights (correct)
- Automate decision‑making without human input
- Compress data for storage efficiency
- Encrypt data for security purposes
Question 8: Which contributions are correctly paired with their creators in the field of data visualization?
- Tukey introduced exploratory data analysis; Tufte authored “The Visual Display of Quantitative Information” (correct)
- Tukey created the periodic table of visualization methods; Tufte developed 3‑D rendering software
- Tukey invented interactive dashboards; Tufte pioneered database encryption techniques
- Tukey wrote “The Visual Display of Quantitative Information”; Tufte introduced exploratory data analysis
Key Concepts
Types of Visualization
Data visualization
Information visualization
Narrative visualization
Periodic table of visualization methods
Visualization (Munzner definition)
Data Variables
Categorical variable
Quantitative variable
Key Figures and Techniques
Exploratory data analysis
John Tukey
Edward Tufte
Definitions
Data visualization
The practice of designing graphic representations of quantitative and qualitative data to aid exploration, insight, and decision‑making.
Information visualization
The visual representation of large, complex datasets that combine quantitative and qualitative abstract information to enhance comprehension and cognition.
Narrative visualization
A storytelling approach that integrates charts, maps, and other visual elements into a structured narrative to convey data-driven stories.
Categorical variable
A type of variable that groups objects by characteristic, either nominal (unordered) or ordinal (ordered).
Quantitative variable
A variable representing numerical measurements, either continuous (any value within a range) or discrete (separate, countable values).
Periodic table of visualization methods
A classification scheme that organizes visualization techniques into six groups: data, information, concept, strategy, metaphor, and compound visualizations.
Exploratory data analysis
A set of statistical techniques introduced by John Tukey for summarizing main characteristics of data, often using visual methods.
John Tukey
An American mathematician and statistician who pioneered exploratory data analysis and contributed foundational concepts to data visualization.
Edward Tufte
An American statistician and author known for “The Visual Display of Quantitative Information,” shaping modern principles of data visualization.
Visualization (Munzner definition)
Computer‑based visual representations of datasets designed to help people perform tasks more effectively, augmenting human capabilities.