
Science is thinking by other means

The scientific process mimics how our brains work
Published: 2015/06/26

Recently, it struck me how closely the scientific process mirrors the way our brains work.

I had given a lecture on the basics of visual perception a couple of weeks back. Our understanding of how visual perception works is based on two ideas: bottom-up and top-down processes.

Bottom-up processing describes the flood of raw data that arrives at our senses and is gradually turned into meaningful structures. Think of the vast amounts of data produced by your eyes and how they become lines, textures and finally cars and people in your brain.

Top-down processing is the other route, going from our vast knowledge of the world (our mental model) towards the senses. We test hypotheses about what we might be seeing based on the incoming data and quickly recognize objects that we've encountered before.

Visual perception combines bottom-up and top-down processing.

Optical illusions such as the Necker cube or the Charlie Chaplin illusion, where the same sensory input flips between interpretations depending on what we expect, point towards the importance of top-down processing. On the other hand, the top-down idea suffers from a cold-start problem - how do we initially build our mental model of the world?

Theories such as Neisser's perceptual cycle or schema theory unite these two concepts: bottom-up and top-down processing alternate in rapid succession until we know what we're seeing or have constructed a new visual concept.
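To make the cycle concrete, here's a minimal sketch in Python (all names, numbers and the correlation-based matching rule are my own illustration, not a model from Neisser or the perception literature): recognition first tries the top-down route through stored schemata and only falls back to bottom-up schema building when nothing fits.

```python
import numpy as np

rng = np.random.default_rng(0)

def match(schema, signal):
    """Top-down test: how well does a stored schema explain the signal?"""
    return float(np.corrcoef(schema, signal)[0, 1])

def perceive(signal, schemata, threshold=0.8):
    """One pass of the perceptual cycle."""
    scores = [match(s, signal) for s in schemata]
    if scores and max(scores) >= threshold:
        return schemata[int(np.argmax(scores))]  # top-down: recognized
    # Bottom-up: no schema fits, so build a new one from the raw data.
    schemata.append(signal.copy())
    return schemata[-1]

# A toy "world" with two underlying objects, seen through sensory noise.
objects = [rng.random(100) for _ in range(2)]
schemata = []
for _ in range(10):
    seen = objects[rng.integers(2)] + rng.normal(0.0, 0.05, 100)
    perceive(seen, schemata)

print(len(schemata))  # typically 2: one schema per object, then reuse
```

The first sighting of each object triggers bottom-up learning; every later sighting is handled top-down by reusing a stored schema.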

Coupling these two tools makes us extremely effective at processing data. For example, here's a crazy amount of data:


[Image: "Here's your big data!" - a photo of a beach scene]

That's 666,000 data points right there. Think about which statistics you'd need to make sense of them. Or you could just think: that's a person sitting on a bench at the beach. Simply because your top-down processes supply you with suitable patterns the moment they recognize them.
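For scale (the photo's actual dimensions aren't stated, so these are hypothetical), an unremarkable image resolution already lands you at roughly that number of raw values:

```python
# Hypothetical photo dimensions - the article doesn't state them.
width, height = 940, 708
print(f"{width * height:,}")  # 665,520 - roughly 666,000 data points
```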

Visual perception vs scientific perception

If we want to crazily simplify what our brains are up to, we can say that they have one goal: to recognize patterns. These patterns can be visual, auditory or even higher-level, as when we try to understand how animals, people and physical phenomena behave.

Which is also pretty close to what science is up to. Science is just thinking by other means. Since science is naturally much slower than our minds, it has taken us a couple of centuries to form hypotheses and recognize patterns in the natural world, thus replicating the process that happens in a growing mind: you get data from your senses and turn it into schemata that you can recognize in the future. Slowly, a theory of how the world works arises.

Science has gotten much closer to the amounts of data that our own minds have to digest every moment.

What's changed recently is that our tools for doing science have become better and better, letting us measure and store vast amounts of data. The Large Hadron Collider is probably our most prominent scientific instrument at the moment, and it produces a crazy amount of data: about 1 petabyte per second. Researchers from Google looked at this "Unreasonable Effectiveness of Data" for language processing and concluded that feeding large amounts of data to automated learning works much better than trying to come up with specific theories. Big data is all the rage, data-driven science has been dubbed the Fourth Paradigm, and was even hailed as the End of Theory. Science has thus gotten much closer to the amounts of data that our own minds have to digest every moment.

So: science solved, right? Bottom-up processing has won.

Well, it's not that easy. While it's convenient to overemphasize the crazy big numbers pouring out of our research instruments, the resulting data is still infused with domain knowledge. Even in the petabytes of LHC data we're looking for specific things like the Higgs boson. Learning algorithms could probably also find patterns in this data, but those patterns would most likely be meaningless to us and would not really further our scientific understanding.
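As a quick illustration of that point (my own sketch, not from the article; it assumes numpy and scikit-learn are available): a clustering algorithm will dutifully report "patterns" even in pure noise, and those patterns tell us nothing about the world.

```python
import numpy as np
from sklearn.cluster import KMeans

# Pure noise: there is no real structure here at all.
rng = np.random.default_rng(42)
noise = rng.random((10_000, 5))

# K-means will still happily carve the noise into 8 "patterns".
labels = KMeans(n_clusters=8, n_init=10).fit_predict(noise)
print(np.bincount(labels))  # every cluster is populated - all meaningless
```

Without a theory of what a cluster should correspond to, there's no way to tell these groupings apart from a real discovery.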

Data without theories is just ones and zeroes.

That means that even in today's data-centric science, applying domain knowledge and theories is still central. Data without theories is just ones and zeroes, and top-down processing is back with a vengeance.

So, thanks to our data-hurling machines, science had to develop its own form of bottom-up processing (machine learning) while sticking with our top-down scientific models of the world. We've arrived back at the perceptual cycle from before, and science is turning into a global mimicry of our minds: filtering massive amounts of data from its "senses" and turning them into meaning in a top-down way.

We've rebuilt our minds on a global scale.