It was 1988, and I had started doing informal research with the professor who ended up being my dissertation advisor. Her research focused on understanding vision in humans and various animals, with the frog being a common subject. Along the way, she had invented a method for finding the image that would best stimulate a neuron’s firing.
That optimization method was a variation on others of the era that relied on complex math. Meanwhile, I was reading about new approaches built from combinations of simple, rough analogs to biological neurons: Artificial Neural Networks (ANNs).
ANNs are the basis of much of the world’s AI, and certainly of the best-performing systems. GPT-4 is rumored to have 1.76 trillion parameters that are adjusted during training, mainly the connection strengths between artificial neurons. The number of neurons is less clear but is likely in the billions. The human brain has something in the range of 100 billion neurons, so AI networks are in the size range of brains. (Note: Most AI companies no longer report model size, as it’s not the best proxy for performance.)
In contrast, I started working on models with just a few neurons. At the beginning, I “wired” them in software by setting each number myself. I played around with different neuron properties, network architectures, and learning rules and objectives.
In one experiment, I wired together artificial neurons in the connection patterns seen in the real retina. As with today’s AI, that artificial retina learned by adjusting the strengths of its connections. To my amazement, I could get the network to behave functionally like a real retina using only simple learning rules for each neuron. I also ran control conditions, like the time I wrapped the artificial retina around on itself to confirm that the behavior pattern I saw wasn’t just an edge effect.
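For concreteness, here is a tiny modern sketch of that kind of hand-wiring, in Python rather than anything I used in 1988. The center and surround weights are made-up values, and the wrap-around flag is the kind of control condition I described:

```python
import numpy as np

# Hand-set connection strengths: each unit excites itself and inhibits
# its two neighbors. The exact values are arbitrary, for illustration.
CENTER, SURROUND = 1.0, -0.4

def retina_response(stimulus, wrap=False):
    """Hand-wired 1-D lateral inhibition.

    With wrap=True, the line of units becomes a ring: a control
    condition for ruling out edge effects.
    """
    n = len(stimulus)
    out = np.empty(n)
    for i in range(n):
        r = CENTER * stimulus[i]
        for j in (i - 1, i + 1):          # the two neighbors
            if wrap:
                r += SURROUND * stimulus[j % n]
            elif 0 <= j < n:
                r += SURROUND * stimulus[j]
        out[i] = r
    return out

# A step edge: lateral inhibition makes the response peak at the
# boundary, the edge-enhancing behavior real retinas show.
edge = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
print(retina_response(edge))              # free ends
print(retina_response(edge, wrap=True))   # ring control
```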
Playing with those neural networks was fun, and the lessons have transcended 80s AI. Emergent collective behavior isn’t just about beehives, anthills, and bird flocks. Playing with ‘toy’ ANNs taught me about the power of distributed intelligence in any group, organization, or society. It showed me the influence of imprecise or conflicting objectives. I could clearly see how the training data affected behavior; I could even run ANN ‘sensory deprivation’ experiments. I learned about overfitting and bias, the tradeoff between learning rate and learning quality, and many other fundamentals of systems and adaptation.
These early experiences taught me an invaluable lesson about AI education: the power of starting small.
The Value of Starting Small
Learning about AI should start the way mine did, especially if becoming a computer scientist is a possibility. Students need to be able to see inside ‘the box’ at the beginning. They might manually change connection strengths so the influence of each one on the whole network is easy to see. They can use simple learning rules, uncover their weaknesses, and imagine the next step forward.
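Here’s a minimal sketch of the kind of exercise I mean, assuming plain Python and the classic perceptron learning rule (the learning rate and epoch count are arbitrary):

```python
# One artificial neuron (a perceptron) with its weights in plain sight.

def train_perceptron(samples, epochs=20, lr=0.1):
    w, b = [0.0, 0.0], 0.0  # connection strengths a student can hand-set
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            # The entire learning rule: nudge each weight toward the target.
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def predictions(w, b, samples):
    return [1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            for (x1, x2), _ in samples]

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print(predictions(*train_perceptron(AND), AND))  # [0, 0, 0, 1]: learned it
print(predictions(*train_perceptron(XOR), XOR))  # wrong somewhere, every time
```

The AND gate is learned in a few passes; XOR never is, because no single neuron can draw the boundary it needs. That failure is exactly the kind of weakness that invites a student to imagine the next step.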
I bet I sound like an old geezer saying, “back in my day…”, but that’s not really the point.
Conceptually, the AI of today isn’t fundamentally that different from what I started with in the late 80s. The models are far more complicated than they used to be, but most of the concepts are visible in smaller versions. My career just happened to trace the arc of computational power, as methods much like those of 80s AI went on anabolic steroids.
After all, I’m not recommending other old stuff. Don’t learn about computers by using anything I had in 1988. It was a coin flip whether my ANN would finish training before the computer crapped out.
I’m recommending starting small because black boxes at the scale of modern AI are too complicated to teach the fundamental lessons of how such systems are built. Just as importantly, they’re too complicated for young learners to develop a gut feel for. Many of the key lessons can be taught using much simpler neural networks, beginning with tinker-toy-ish network designs, basic learning rules, and easily understood problem statements.
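And the next step can stay transparent too. Here is a sketch, assuming a two-hidden-neuron sigmoid network with the gradient descent written out by hand (seed, learning rate, and epoch count are arbitrary, and with some seeds a network this small gets stuck, itself a worthwhile lesson):

```python
import math, random

random.seed(0)

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# A tinker-toy network: 2 inputs -> 2 hidden neurons -> 1 output.
# Small enough that every connection strength can be printed and pondered.
w_h = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(2)]
b_h = [0.0, 0.0]
w_o = [random.uniform(-1, 1), random.uniform(-1, 1)]
b_o = 0.0

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
lr = 0.5

for _ in range(5000):
    for (x1, x2), t in XOR:
        # Forward pass
        h = [sigmoid(w_h[j][0] * x1 + w_h[j][1] * x2 + b_h[j]) for j in range(2)]
        y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + b_o)
        # Backward pass: the chain rule, written out by hand
        d_y = (y - t) * y * (1 - y)
        for j in range(2):
            d_h = d_y * w_o[j] * h[j] * (1 - h[j])
            w_o[j] -= lr * d_y * h[j]
            w_h[j][0] -= lr * d_h * x1
            w_h[j][1] -= lr * d_h * x2
            b_h[j] -= lr * d_h
        b_o -= lr * d_y

for (x1, x2), t in XOR:
    h = [sigmoid(w_h[j][0] * x1 + w_h[j][1] * x2 + b_h[j]) for j in range(2)]
    y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + b_o)
    print((x1, x2), "target", t, "output", round(y, 2))
```

Even at this size, the chain-rule bookkeeping is near the edge of what intuition can track.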
It doesn’t take long in the learning progression before the transparent box becomes a black one. Beyond toy problems, ANNs get too big to intuitively understand. The training, learning, and data complexities get abstract.
Applying AI Principles to Education and Beyond
At this point, future computer scientists first encounter AI training in college, or in late high school if they’re lucky. Giving them toy problems doesn’t seem appropriate for their age, and they certainly don’t hand-wire artificial networks in their initial studies. They build a deep neural network using a readily available software library.
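Something like this minimal sketch, for instance, assuming PyTorch (the architecture and hyperparameters are arbitrary, and any mainstream library looks much the same):

```python
import torch
from torch import nn

torch.manual_seed(0)

# A few library calls stand up a deep network with hundreds of weights
# that nobody will ever inspect by hand.
model = nn.Sequential(
    nn.Linear(2, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 1), nn.Sigmoid(),
)
opt = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.BCELoss()

# The same XOR problem, now solved without touching a single weight.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

for _ in range(500):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()

print(model(X).detach().round())  # should recover [0, 1, 1, 0]
```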
I hand-wired stuff in my Ph.D. program, and I found it fascinating, but I don’t necessarily think college computer science students should start that way. There’s a lot for them to learn, and their immersion in the field and ability to think abstractly will probably teach them the same lessons in other ways.
Instead, I think every student should get a taste of this at much younger ages, because the lessons and tradeoffs are instructive far outside of an AI challenge. What I learned from small AI has affected how I think about a host of social and organizational challenges that have similar characteristics to neural networks.
For example, I often find myself searching for the fundamental keys to a challenge in ways that map to how ANNs are organized and trained. What changes to objectives, or to measurements of success, could drive different behaviors? What simple things can individuals do that become powerful when aggregated across the organization? In what ways might changing too quickly also bake in bad practices?
Computational thinking teaches software concepts through non-coding challenges that can be introduced at young ages.
AI thinking can similarly be taught to youngsters through age-appropriate exercises, especially ones where the mechanisms are transparent. Students can understand a lot of AI principles and tradeoffs through smaller-scale problems. If they play enough, play being the critical word, then the tradeoffs and lessons will last when the AI box gets more opaque, or when the evolving system is a group of people instead of artificial neurons. By introducing these concepts early and simply, educators can lay the groundwork for a deep, intuitive understanding of AI that will serve students well as they encounter more complex systems.
How Does K-12 Get Started?
Classrooms can bring the same hands-on AI exploration to their students at any age, in any subject, and in ways that don’t require much specialized teacher knowledge. Like with computational thinking, you’re teaching methods and principles that have ubiquitous application.
Success is when children build an intuition about these concepts because they apply them routinely. It's not about teaching them to code complex algorithms; it's about letting them play with the fundamental concepts.
There are many ANN principles; three central ones are distributed pattern recognition, self-organization, and optimization. The table below gives examples (from Claude 3.5 Sonnet, 6/28/2024) of how you could explore those at various ages and in any subject (though I show only three). I am only scratching the surface.
I have been saying for some time that AI literacy needs to hit more fundamental aspects than prompting. This is not only because understanding what’s behind the curtain—or inside the black box—helps for using AI productively. It’s also because the principles behind ANNs have wide applicability to human thinking and learning.
Yes, high schoolers didn’t need to understand network effects, computer vision, or even the word “ephemeral” to use Snapchat. Yet there they were, dabbling in computing that a comp sci graduate student a decade earlier would have thought too complex to explain to a high schooler.