Five Key Visualizations for AI Literacy (2 of 3)
Part 2: Optimization Terrains and Confusion Matrices
I began this exploration of key visualizations for AI literacy with scatter plots, powerful tools for understanding data patterns and relationships (see Part 1).
This second installment focuses on two crucial visualizations: optimization terrains and confusion matrices.
Optimization terrains provide a visual metaphor for how AI systems learn and improve, while confusion matrices offer a clear picture of AI performance in classification tasks. These visualizations not only illuminate core AI concepts but also offer valuable insights applicable to human learning and decision-making processes.
2. Optimization Terrains: Learning to be the Best
Just as hikers navigate physical landscapes, AI systems traverse abstract terrains in search of optimal solutions. This mental image of a landscape, with peaks and valleys representing outcomes with different levels of “best”, is a powerful tool for understanding how AI learns and improves.
Imagine a vast, undulating landscape where each point represents a possible solution to a problem. The height of any point corresponds to how good (or bad) that solution is. The AI's goal in its learning process is to find the lowest (or highest) point in this landscape, which represents the best possible solution.
This mental model is particularly useful when explaining concepts like neural network training, where millions of parameters are adjusted to find the optimal configuration. While we can't visualize this high-dimensional space directly, our understanding of simpler terrains helps us grasp the principles at work.
This process mirrors how we often approach problem-solving in our daily lives, trying different approaches and gradually moving towards better outcomes.
For AI, and for most of the challenges our brains take on, this terrain is multi-dimensional (more than two or three information features matter) and multi-objective (many ways to measure “best”). Such terrains are impossibly complex to visualize fully. However, the 2D or 3D mental model we can picture for simpler problems helps us grasp key concepts:
Local vs. Global Optima: Just as a hiker might reach a small valley and mistake it for the lowest point, an AI can get stuck in a "local minimum" that isn't the best overall solution. This mirrors how we sometimes settle for "good enough" solutions in life. Sometimes that’s bad, the result of not knowing that better solutions exist. Other times finding the best solution is too difficult or time-consuming, and a quick, pretty-good solution is better. Optimization landscapes help frame these conversations.
Gradient Descent: AI often navigates these landscapes by following the steepest downward (or upward) slope, akin to a ball rolling down a hill (see the sketch after this list). This mimics our intuitive problem-solving approach of making incremental improvements.
Exploration vs. Exploitation: The AI must balance between exploring new areas of the landscape and exploiting known good regions, much like how we balance trying new things versus sticking with what we know works. One of the first lessons of optimization is that the learner can’t get unstuck without some randomness.
Learning Rate: How big should the steps be that the AI takes as it moves across the landscape? Too large, and it might overshoot the best solution; too small, and progress may be too slow to ever get there. This parallels how we adjust our approach to learning or problem-solving based on our progress.
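These dynamics are easy to play with in a few lines of Python. The sketch below is purely illustrative, not anything from a real AI system: the terrain function, starting points, and step sizes are all invented for this post. It puts a “hiker” on a 1D terrain with one shallow valley and one deep valley.

```python
# A toy 1D "terrain" with two valleys: a shallow local minimum near
# x = 1.13 and a deeper global minimum near x = -1.30 (function chosen
# purely for illustration).
def height(x):
    return x**4 - 3 * x**2 + x

def slope(x):  # the derivative of height(x): the local steepness
    return 4 * x**3 - 6 * x + 1

def descend(x, learning_rate=0.01, steps=2000):
    """Gradient descent: repeatedly step downhill, against the slope."""
    for _ in range(steps):
        x -= learning_rate * slope(x)
    return x

# Where the hiker starts determines which valley it ends up in.
print(round(descend(2.0), 2))    # 1.13  -- stuck in the shallow local minimum
print(round(descend(-2.0), 2))   # -1.3  -- finds the deeper global minimum

# Learning rate matters: with steps this large, the hiker overshoots
# the valleys entirely and the walk diverges.
# descend(2.0, learning_rate=0.2)
```

Notice that nothing about the starting point x = 2.0 looks wrong; the hiker simply rolls into the nearest valley. That is the local-vs-global trap in miniature, and why some randomness in the starting point (or in the steps) helps.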
The optimization terrain visualization helps illuminate why AI sometimes produces unexpected or suboptimal results. Just as a hiker might end up in an unexpected location due to the terrain's complexity, an AI might arrive at a surprising solution because of the intricacies of its optimization landscape.
By introducing students to this concept, we equip them with a powerful mental tool for understanding not just AI, but also complex problem-solving processes in various fields, from economics to engineering. It encourages a mindset of continuous improvement, brings forward critical conversations about what ‘best’ means, and helps develop intuition about how small, iterative changes can lead to significant results. These are principles that apply equally to realms as far-flung as AI development and personal growth.
3. Confusion Matrices: Clarifying Performance
At first glance, the term "confusion matrix" might seem paradoxical – how can confusion bring clarity? Yet this simple visualization tool is a powerful way to understand and evaluate the performance of classification systems, both in AI and human decision-making. Many tasks that we give AI are classification tasks, where it is trained to decide which of several mutually exclusive categories something belongs to. Confusion matrices aren’t the be-all of AI performance analysis, but since so much of what we and AI do involves classification, they’re a good introduction to the subject.
In education, confusion matrices can be applied to a wide range of subjects, since categorization is a broadly applicable concept. In a literature class, students might use one to analyze their ability to identify different types of figurative language. In a history class, it could be used to evaluate the accuracy of predictions about historical events. In English class, it could help evaluate a student's ability to classify different literary genres.
A confusion matrix is a table that compares predicted outcomes (what the AI said) against actual outcomes (“truth”). It's a way to see not just how often a system is right, but also how it's wrong – a crucial distinction in many real-world scenarios.
Confusion matrices can also help teach how classification systems are characterized. Is the system biased? What are its false positive rate, false negative rate, precision, recall, and accuracy? All of these can be calculated directly from a confusion matrix, as the sketch below shows.
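A minimal sketch in Python, with invented counts for illustration, shows how every one of these statistics falls straight out of the four cells of a binary confusion matrix:

```python
# The four cells of a binary confusion matrix (counts invented for
# illustration, not real data).
tp, fn = 40, 10   # actual positives: correctly caught vs. missed
fp, tn = 5, 45    # actual negatives: falsely flagged vs. correctly cleared

accuracy  = (tp + tn) / (tp + fn + fp + tn)  # how often right overall
precision = tp / (tp + fp)  # when it says "positive", how often is it right?
recall    = tp / (tp + fn)  # of the real positives, how many did it catch?
fpr       = fp / (fp + tn)  # false positive rate
fnr       = fn / (fn + tp)  # false negative rate

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} FPR={fpr:.2f} FNR={fnr:.2f}")
# accuracy=0.85 precision=0.89 recall=0.80 FPR=0.10 FNR=0.20
```

Note how the headline accuracy of 0.85 hides the detail that matters: one in five real positives is being missed.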
For AI systems, confusion matrices are invaluable for fine-tuning performance. They help developers understand not just the overall accuracy, but the specific types of errors being made. This nuanced view is crucial in many applications. In medical diagnostics, a false negative (missing a disease) might have very different consequences than a false positive (diagnosing a disease that isn't present). Incorrectly marking important emails as spam (false positive) is generally worse than letting a few spam emails through (false negative). In criminal justice, the implications of falsely convicting an innocent person versus letting a guilty person go free are profoundly different.
Confusion matrices can have many more than two categories. Each row represents the actual category, and each column represents the predicted category. The diagonal represents correct categorizations, while off-diagonal elements show various types of misclassifications.
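As a concrete example, the sketch below builds a small four-category matrix in Python. All counts are invented for illustration, chosen to show the kinds of patterns discussed next: categories B and C often confused with each other, and category D rare.

```python
# An illustrative four-category confusion matrix (all counts invented).
# Rows are actual categories; columns are predicted categories.
labels = ["A", "B", "C", "D"]
matrix = [
    # predicted: A   B   C   D
    [90,  4,  5, 1],   # actual A
    [ 3, 60, 35, 2],   # actual B -- often mistaken for C
    [ 4, 30, 64, 2],   # actual C -- often mistaken for B
    [ 2,  1,  1, 6],   # actual D -- rare, so its statistics are shaky
]

# Per-category recall: the diagonal count divided by the row total.
for i, label in enumerate(labels):
    correct, total = matrix[i][i], sum(matrix[i])
    print(f"recall({label}) = {correct}/{total} = {correct / total:.2f}")
```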
The confusion matrix concept helps illustrate several key ideas in both AI and human cognition:
Bias: Confusion matrices can show whether certain kinds of errors are more prevalent. In the example matrix above, categories B and C are more often confused with one another: when the actual category is B, the AI is biased toward predicting C far more than the other options. That might be because those categories are genuinely harder to distinguish, but if that categorization bias is ethically troublesome, then confusion matrices are the first step in determining whether there’s a problem.
Cost-Sensitive Learning: Not all errors are equal. Just as humans weigh the consequences of different types of mistakes, AI systems can be tuned to prioritize avoiding certain types of errors (see the sketch after this list). If one category is dark-skinned human faces, and another is gorilla faces, then misclassifications of human faces as gorillas should carry a much higher cost than other errors. While the cost of errors is a human question, the confusion matrix provides the raw material for such valuations.
Prevalence and Base Rates: A high accuracy can be misleading if one class is much more common than others. This relates to the human cognitive bias of neglecting base rates in probability judgments. In the example matrix above, category D (the “Actual D” row) occurs much less often than the other categories, so the statistics for that category rest on far fewer examples and deserve less trust.
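One way to make “not all errors are equal” concrete is to weight each cell of the confusion matrix by a human-chosen cost. The sketch below is illustrative: the cost values are invented, and choosing them is an ethical and practical judgment, not a mathematical one.

```python
# Cost-sensitive evaluation (illustrative): weight each confusion-matrix
# cell by a human-chosen cost. Rows are actual, columns are predicted,
# in the order [negative, positive].
confusion = [
    [45,  5],   # actual negative: 45 true negatives, 5 false positives
    [10, 40],   # actual positive: 10 false negatives, 40 true positives
]
costs = [
    [0, 1],     # a false positive costs 1 unit (e.g., one lost email)
    [5, 0],     # a false negative costs 5 units (e.g., a missed diagnosis)
]

total_cost = sum(
    confusion[i][j] * costs[i][j] for i in range(2) for j in range(2)
)
print(total_cost)  # 5*1 + 10*5 = 55
```

Two systems with identical accuracy can have very different total costs, which is why tuning a classifier shouldn’t stop at accuracy alone.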
By introducing students to confusion matrices, we equip them with a powerful tool for critical thinking about classification and decision-making. It encourages a more nuanced view of accuracy and performance, applicable not just to AI systems but to human judgments and societal decisions as well.