I’ve long contemplated what math actually is. How can manipulating symbols on paper according to some rules tell us how to manipulate atoms that we cannot even see, or send rockets to the moon, or create phones, and everything else we’re able to do today?
In Zen and the Art of Motorcycle Maintenance, Phaedrus contemplates where hypotheses come from. Phaedrus says that in the scientific method, hypothesis creation is the biggest mystery. Hypotheses just appear, out of nowhere.
I don’t agree. I think there is a clear explanation as to how hypotheses seem to suddenly appear from nowhere. It is a step-by-step process.
A principle-based decision making framework. That is – I’ve come to realize – the key to success. Regardless of how you measure success. Let me explain.
I’m currently reading The 7 Habits of Highly Effective People. (Sidenote: probably the most life-changing book I’ve read in my life – I can’t believe I hadn’t read it until now – just imagining how my life would have been if I had read it when I was 20 or so.) The book mentions principle-centered living a couple of times, and even dedicates a whole chapter to the subject. But it wasn’t until I read this part that I really got it:
“Where does intrinsic security come from? It doesn’t come from what other people think of us or how they treat us. It doesn’t come from the scripts they’ve handed us. It doesn’t come from our circumstances or our position. It comes from within. It comes from accurate paradigms and correct principles deep in our own mind and heart.”
Suddenly, it made sense. I’ve been wrestling with a big decision for a couple of weeks now. It is actually a life changing decision.
I’ve always had a hard time making big decisions. I could never really feel that “gut feeling” that people talk about. I try to foresee the future. I weigh pros and cons. I analyze potential outcomes. I do everything – but decide, and just go with the decision.
In one blow, after reading that sentence, I realized this: I’ve been making decisions based on wrong premises all my life. I’ve been trying to predict the best outcome. This isn’t principle-centered decision making. It is outcome-centered decision making.
“What will lead to the greatest outcome” – rather than “what is the right thing to do, based on what I believe is right?”
Ironically, I realize now, outcome-centered decision making leads to worse outcomes, for several reasons.
First, it is prone to error. Nobody can predict the future.
Second, it is draining. By trying to predict the future, you will constantly end up evaluating your choice – never really “landing” in what’s the best course of action. Every bit of feedback you get along the way will make you question whether the choice you made will really lead to the best outcome, or if your other option was the better choice. You will end up constantly questioning yourself.
By making decisions based on what’s the right choice – based on principles you believe in – regardless of outcome, you will be more centered. More firm. More trustworthy. Have greater integrity. Knowing what you believe in, and following that belief regardless of the possible outcome, does that to you.
And perhaps more importantly, the gut feeling can develop. Finally, it will be possible to feel what is right – because that feeling will be based on whether you are following your principles, or whether you are doing something wrong. Suddenly, it is possible again to listen to your emotions when making decisions.
This will save a lot of energy. Ironically, because you don’t waste a lot of energy questioning yourself, you will also become more productive. And correct principles are time-tested. They have been proven to work by people throughout history. People who made the world a better place. People who were happy. So good principles are a good guide, and a great measuring stick for how well you are doing.
So what are the correct principles?
According to Covey, they are the following:
- Human Dignity
- Quality and Excellence
- Potential and Growth
- Patience, Nurturing and Encouragement
And what do they mean? What is it, to be fair? What is it, to have integrity, honesty, to view people with dignity, and the other principles?
Well, that, I guess you have to decide for yourself. And that’s the part that I guess you can second-guess based on experience and feedback.
Whether or not we can directly comprehend reality is irrelevant. What is relevant is how well we are able to predict the future. That is the purpose of science, and the only important factor to strive for, as that is what will give us increased control over our own fate. (Of course, this assumes that our purpose is to increase control over our own fate – some would object to that, mostly leftists and other types of collectivists.)
For example, whether or not something called “electron” actually exists is not important. What is important is that our model can use the construct it calls “electron” to predict future observations.
Our model consists of two major things:
- Constructs that model reality (for example the construct of “electron”)
- A way to translate observable reality into the language of our model
Using these two main components, our model can increase our ability to understand (and thus control) the future in the following way:
- We start by making an observation of the current reality, translating that into our model
- We then introduce a cause in our model, and verify through observation (may be done indirectly) that the same cause happens in reality
- We make a prediction in our model of what a future observation will look like, then we verify by making an observation and see if it is what we predicted
Whether or not the constructs used in our model do or do not actually exist in reality (i.e. Realism) is thus irrelevant. What is relevant is our connection to reality, and our connection to reality is the observations we can make. Thus, it is not important to predict actual reality, but to predict what we will observe using our model’s constructs.
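To make the loop above concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the falling-body model, the construct `g`, the names `translate` and `predict_fall_time`, and the stopwatch reading are all hypothetical. The point is only that the model is judged by whether its prediction matches the later observation, not by whether `g` "really exists".

```python
from dataclasses import dataclass


@dataclass
class FallingBodyModel:
    """A toy model whose only construct is a constant acceleration g."""
    g: float = 9.81  # m/s^2; an assumed value, used purely as a model construct

    def translate(self, observed_height_m: float) -> float:
        # Translate an observation into the language of the model (a height in metres).
        return observed_height_m

    def predict_fall_time(self, height_m: float) -> float:
        # Predict a future observation: seconds until a released object lands.
        return (2 * height_m / self.g) ** 0.5


def prediction_holds(predicted: float, observed: float, tolerance: float = 0.05) -> bool:
    # The model is evaluated only on agreement between prediction and observation.
    return abs(predicted - observed) <= tolerance


model = FallingBodyModel()
state = model.translate(observed_height_m=20.0)  # observe reality and translate it
predicted = model.predict_fall_time(state)       # introduce the cause (release) and predict
observed = 2.02                                  # hypothetical stopwatch reading in seconds
print(round(predicted, 2), prediction_holds(predicted, observed))
```

If the prediction had not held, the sensible reaction would be to revise the construct or the translation step, which is exactly the sense in which the construct’s “real existence” does not matter.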
Here’s a picture describing how this looks:
When it comes to solving problems, there are two main lines of thoughts:
- Plan, plan, do
- Do, learn (repeat)
I’ve found during my life experience that the second method is almost always the best one. It arrives at the solution quicker, and it solves the problem better.
In this post, I’ll explore why that is – how iterative problem solving actually works.
Solving math problems iteratively
During homework one day when I was a kid, I discovered that I could more easily solve my math problems by just testing and seeing what would happen. Even if I had no idea what would happen, I simply started jotting down a solution – with no idea as to whether it would hold or not.
What I discovered was that the process of “just trying something” would yield a solution far quicker than thinking ahead.
I proudly announced my discovery to my teacher. I don’t know if she really understood what I was talking about, but she applauded me nevertheless, encouraging me to keep doing what I did.
This approach to problem solving has stuck with me ever since. Now, I’m at a point where I want to explore the mechanisms behind this type of problem solving to understand when and where it can be applied. In order to do that, we have to look at the mechanisms that make this work.
How iterative problem solving works
In the situations where iterative learning works, what happens is as follows:
You have no clue what the solution is, but you do have some (far from correct) ideas or guesses or assumptions.
So based on these ideas/guesses/assumptions, you test a quick and dirty solution. What you arrive at is probably very wrong, but you will have gained something immensely valuable: Learning.
By testing your ideas, guesses and assumptions very quickly, you will see the actual results they yield. This is feedback, which gives you learning.
Thus, by trying and learning, you will probably have gained far more learning than you would have if you had tried to figure out the “correct” solution without getting your hands dirty right away.
Using those new learnings, you will have revised your guesses, assumptions and ideas, and you can try again. This time, from a higher level of understanding.
By repeating this process, you will continuously increase your learning until you are at a point where your assumptions, guesses and ideas are correct enough to bring you to the solution.
A formalized iterative learning process
Actually, learning IS making an assumption (read: guess) based on what you do know, then testing those assumptions to see if they hold.
So in your original “try and learn” approach, you might have tried to solve the problem by assuming (read: guessing) three things: Assumption 1, Assumption 2 and Assumption 3 (A1, A2 and A3).
If the assumptions produce the correct answer, great! You have verified that all three assumptions are correct.
If you get the wrong result, at least one of the above assumptions must be wrong. This, in itself, is valuable knowledge, because it presents you with two choices:
- If you have other ideas (for example A4 and A5) which you think are likely to produce the correct answer, simply try to solve the problem again using these.
- If you don’t have any more ideas, or if you have too many possible ideas to test, then you might want to drill down to A1-A3 to draw additional learnings about why they failed.
Number 1 is easy: Simply repeat the process.
Number 2 will create a “recursive iterative learning” cycle.
Recursive iterative learning
Pick one of your original assumptions to drill deeper into, for example A1.
Formulate sub-assumptions that underlie A1. For example, you might have some ideas (assumptions) about why A1 can’t be correct: Let’s call these A1.1, A1.2 and A1.3.
Pick one of these sub-assumptions, preferably one that would lead to some “chain reactions” in terms of your original solution (i.e. if any of them are correct, then it would also eliminate or strengthen some of your other assumptions). Then test it.
If it succeeds, great: You have learned something new. This new learning will have consequences for at least A1 (striking it from your list of possibly correct assumptions), and possibly more.
If it fails, repeat the process by testing the other assumptions in this level (A1.2, A1.3 and so on), or create new sub-sub-assumptions and test the sub-sub-assumptions (for example A1.1.1, A1.1.2 and so on). Do this until you can draw a definitive learning, and go back in your recursive learning chain and let all the recursive learnings fall into place.
You have now drawn a set of learnings from your original guess. From this new set of learnings, you can make new assumptions that are closer to the truth, test them, and repeat the process. With each iteration, you will come closer to the truth until you finally arrive at it.
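The drill-down described above can be sketched as a small recursive function. This is only one possible reading, under assumptions of my own: each sub-assumption is treated as a possible reason why its parent is wrong, a test returns `None` when it is inconclusive, and the names `Assumption`, `reasons_it_may_fail` and `investigate` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Assumption:
    name: str
    # Returns True or False when the test is conclusive, None when we cannot tell yet.
    test: Callable[[], Optional[bool]]
    # Sub-assumptions: ideas about why this assumption might be wrong.
    reasons_it_may_fail: list["Assumption"] = field(default_factory=list)


def investigate(assumption: Assumption, learnings: dict) -> Optional[bool]:
    """Test an assumption and drill into its sub-assumptions if the test is inconclusive."""
    result = assumption.test()
    if result is None:
        for reason in assumption.reasons_it_may_fail:
            if investigate(reason, learnings):  # a confirmed reason rules the parent out
                result = False
                break
    if result is not None:
        learnings[assumption.name] = result     # the learning falls into place
    return result


# Hypothetical tree: A1 is unclear, A1.1 is unclear but has a confirmed flaw (A1.1.1),
# and A1.2 is confirmed, which strikes A1 from the list.
a1 = Assumption("A1", lambda: None, [
    Assumption("A1.1", lambda: None, [Assumption("A1.1.1", lambda: True)]),
    Assumption("A1.2", lambda: True),
])

learnings = {}
investigate(a1, learnings)
print(learnings)
```

The dictionary that comes back is the “set of learnings” you carry into the next attempt at the original problem.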
What it looks like in real life
In reality, nobody (I hope..?) thinks like the above. Instead, the process happens unconsciously when we just “try something”.
For example, let’s take the math problem I was trying to solve as a kid described above. Here’s what actually happened:
I was sitting and looking at the problem, with no clue as to how I was supposed to solve it.
So instead of sitting there, stuck in my own thoughts, I decided to simply jot something down. I started by drawing a character, and then the next character. Before I started the process of jotting down each character, I had no idea which character would actually be “jotted down”. Instead, the actual character came to me as I started jotting.
A couple of times, I realized that the character or formula I jotted down didn’t make sense (=> my first iterative learning, happening organically). So I erased, and tried again (using the learning from the previous step to try something new, i.e. realizing that A1 didn’t work and trying A2 instead).
At some point, I thought I had arrived at the correct solution (using A2). But I realized everything I had done had been garbage, because it didn’t turn out the way I wanted it. I started wondering why the heck it didn’t work. I had an idea (A1.1). So I started experimenting on the side, trying to answer the question in my mind as to why my original solution didn’t work (testing A1.1). Suddenly, I got an interesting result (proving A1.1) which gave me a new idea (A3) which I used when starting with the original question from scratch (testing A3), which arrived at the correct conclusion.
In reality, the process is even messier than this. But the actual process is the same, only more complicated (more branches, more assumptions and sub-assumptions), not different.
A couple of scenarios in which you can apply iterative learning
So where can you actually apply “iterative learning”? Well, as it turns out, in a lot of places:
Programming: Trying a solution and seeing where it leads you, drawing learnings from that destination and trying again (Agile Software Development)
Starting companies: Start from where you are, using the knowledge you have, make a quick and dirty roadmap, start the journey, and learn and adjust as you go (Lean Startup).
Building rockets: Build a rocket as quickly as you can, using what you know (A1, A2 etc.). When the rocket crashes, analyze why it crashed, draw a new conclusion (A1.1), make a new assumption (A3) and build another one. (Elon Musk’s methodology as described in this biography)
And probably much more 🙂
Summary of iterative learning
So in summary, when you have a problem, even though you know that you don’t know the answer, simply assume things and get started. Then learn from the results you get, and start again with the higher level of knowledge you have. And so on, until you have ruled out all but the correct solution.
How the ambiguous nature of reality makes it difficult to model in a database application
I was reading the book “Data and Reality” (highly recommended) about how to model reality in a database application. In its first chapter, the book talks about the difficulty in doing so because of the inherent ambiguity of reality. However, you don’t want your database to be ambiguous – it has to be structured in a way that lets you efficiently categorize, separate, search, perform operations, etc.
Or does it?
How reality is ambiguous (from a data modelling perspective)
Reality is ultimately not a set of categories, states, etc. Humans interpret and model reality that way, using computers, language and other constructs, in order to interpret it in the most efficient way for our purposes.
At a very low level (enough for most of our purposes, if we are talking about practical everyday or business applications), the best model to approximate reality is simply physics. From the basic laws of physics, we can derive everything else (assuming higher level than quantum physics for now). We could, in theory, model reality as a set of atoms and molecules with physical properties that interact with each other. This makes reality incredibly fluid.
Any time you want to simplify that fluid reality into a database model, you lose that ambiguity – and with it, the true mapping of your database to reality (the more you simplify it, the rougher your approximation of reality will be, and this means real problems in your business applications).
As a data modeler, you trade flexibility for CPU
In reality, a collection of molecules which together create a metal rod is just that – a collection of molecules organized in a specific fashion with endless amounts of physical properties based on that assembly. But your database, depending on what it is designed to do, can categorize that reality as either a metal rod, a pipe, a baton or something else. And it may look at different properties of this thing depending on what the application is designed to do. For example, in some cases the weight will be important, in other cases the ability to conduct electrical current. This is the fundamental problem of modelling reality in a database. Which aspects do you model? Which aspects do you ignore? How do you categorize the thing you are modelling?
So, as a data modeler, you are making choices. These choices will limit that incredibly flexible “collection of atoms” thing into a more narrow field of use. Modelling it in a certain way, making those choices, necessarily limits what you see that thing as, and that reduces your application’s flexibility. If you call it “a collection of atoms and molecules of X type grouped in Y way”, you have incredible flexibility. Only your imagination and physical properties limit what you can use that thing as. If you categorize it as a rod, you disregard all those physical properties, and you see it as a rod, and a rod only.
By categorizing that thing upfront (based on your database design which assumes certain things), you are in fact making a trade-off: You give up flexibility for clarity and speed. The less defined your categorization is, the more processing power and intelligence you require to find, group, understand and make decisions based on the things you are modelling. Basically, by creating a category, you are making a decision upfront that will make your processing power more scalable (because you have made one decision that you can easily query for many times in the future – if you created the category “rod” and put objects into this category, it will be very easy to find all rods in your database) but also more rigid (simply because you HAVE made a decision upfront, and decisions are often irreversible).
Whereas if you don’t categorize, but instead put objects into the database based on their properties alone (for example a lower level of categorization of a rod would be “thing created of metal with length x and width y”), you have more flexibility in the future (you can define those things as either rods or something else), but the approach is less scalable. Each time you want to find all things in the system that can be used as a rod, you have to first define which properties “a rod” as per your definition has, and then compare each object in your database to that definition.
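Here is a minimal sketch of that trade-off in Python. The `Thing` record, its fields and the rule “metal and more than ten times as long as it is wide” are assumptions invented for this example, not a real definition of a rod.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    material: str
    length_cm: float
    width_cm: float


things = [
    Thing("metal", 100.0, 2.0),
    Thing("metal", 30.0, 30.0),
    Thing("wood", 120.0, 3.0),
    Thing("metal", 80.0, 1.5),
]


def categorize(thing):
    # Approach 1: decide the category once, up front, when the object is stored.
    is_rod = thing.material == "metal" and thing.length_cm > 10 * thing.width_cm
    return "rod" if is_rod else "other"


by_category = defaultdict(list)
for thing in things:
    by_category[categorize(thing)].append(thing)

# Finding rods is now a single cheap lookup, but the definition of "rod" is baked in.
rods_from_index = by_category["rod"]


def find(predicate):
    # Approach 2: store properties only and compare every object at query time.
    return [t for t in things if predicate(t)]


rods_from_scan = find(lambda t: t.material == "metal" and t.length_cm > 10 * t.width_cm)
# A new definition needs no re-categorization, only another full scan.
long_things = find(lambda t: t.length_cm > 10 * t.width_cm)

print(len(rods_from_index), len(rods_from_scan), len(long_things))
```

The index answers one question very quickly. The scan answers any question you can phrase as a predicate, at the cost of touching every object each time.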
Can we model and search based on sensory information rather than properties or categories?
We can follow the above logic further. If we go even lower level, you may not need to even provide any properties yourself. Perhaps the properties can be derived from more basic information – such as sensory information. Perhaps you can feed into your database images of rods and other sensory information such as temperatures of rods, roughness of rods, sounds rods make when they hit something, and so on. From that sensory information, it should be possible to derive any of your possible requirements.
Perhaps, we can even take your use case to a lower level. Perhaps you won’t search for rods by providing search properties, but simply another piece of sensory information (image, feel, etc.). Thus, without any categorization inherent in your database, you could let the system find all the rods similar to the rod you scanned in. You could then improve the search by giving the system feedback on the rods it presented.
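A very rough sketch of such a “search by example” in Python: objects are stored as raw signal vectors, and a query is just another vector. The three signal dimensions, the stored values and the use of plain Euclidean distance are all assumptions picked to keep the example small. A real system would need far richer signals and a learned notion of similarity.

```python
import math

# Hypothetical stored sensory signals: (elongation, hardness, ring when struck).
stored = {
    "object_1": [0.9, 0.8, 0.7],
    "object_2": [0.1, 0.2, 0.1],
    "object_3": [0.8, 0.9, 0.6],
}


def most_similar(query, k=2):
    """Return the names of the k stored objects whose signals are closest to the query."""
    ranked = sorted(stored, key=lambda name: math.dist(query, stored[name]))
    return ranked[:k]


scanned_rod = [0.9, 0.85, 0.7]  # signals from the rod we just scanned in
print(most_similar(scanned_rod))
```

Note that nothing here is labelled “rod”. The grouping only shows up as closeness between signals.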
Of course, this doesn’t mean that the items in the system aren’t categorized. They may be – but the categorization is internal in the system, not externally visible to you.
But doesn’t that mean that we haven’t actually solved the problem of how to categorize reality better? Because what we have done is to move the responsibility of categorizing from ourselves to the system itself. The categorization, simplification, and “upfront decision making” of how to model reality is still going on behind the scenes, and we really haven’t gained any flexibility at all (I will go as far as to say that such a system cannot possibly be flexible in the way we want it – and the way we want it is that it models reality without making any upfront decisions about which category something belongs to – because such decisions are actually the opposite of creativity, and creativity can be described as the ability to not pre-categorize objects, thus finding new ways to use existing tools).
What we are searching for is a way to save and retrieve information quickly without sacrificing flexibility. How can we do that? It seems like as soon as we start categorizing things, we gain speed but lose flexibility.
What if we copy and simulate reality instead of “represent” reality?
Well, actually there is another way. A different way of storing information which doesn’t even use the notion of categorization. We can model the thing we are saving physically and directly, by simulating it in another system. Basically, we can copy the physical world physically into another physical object, the “other” physical object having some properties which the first physical object lacks, namely 1) ability to change and adapt quickly and 2) ability to iterate quickly (actually follows from the first point). Bear with me for now, and I will describe what I mean below.
The “simulator” gets sensory input about the “real” object, and creates a replica of it inside itself. There are no categories, properties, data points. Just a simulation. This “simulator” can store many such replicas inside itself, each corresponding to real sensory information that it gets from its “reality sensors”.
It can also invent imaginary simulations, because it has the ability to copy, manipulate, and play with its own simulations inside itself. Basically, it is a simplified, more dynamic simulation of all the sensory input it has ever received from the real world, plus its own ability to manipulate those simulations (which corresponds to human imagination). When manipulating (or let’s call it “playing with”) its simulated physical world, our “reality simulator” may see recurring patterns. It may then save those recurring patterns as a separate “pattern object”. In the future, when saving new objects that are similar to the pattern object, it can simply save how those objects differ from the pattern object, thereby saving space and CPU (because it can save how the pattern object usually interacts with other objects, and derive from that how the actual object it is simulating should behave – including its differences from the pattern object).
So we are going from a “representing” paradigm to a “copying and simulating” paradigm. Depending on the physical implementation of this simulator, it may have both the efficiency of the “modelling” paradigm, and the flexibility of the real world.
Interestingly, such a “reality simulator” exists. Evolution has chosen it as the best way to model reality. Just about every decision-making organic entity on earth (including humans) uses it every day to copy and store a version of reality based on sensory input inside its nervous system. Humans are the most advanced species doing this, using our brains. We use this reality simulator to copy a version of reality inside our heads, predict the future based on simulations of this copy of reality inside our heads, and to query and retrieve information about (report on) reality to make (business or life) decisions.
How would a reality simulator be implemented?
A reality simulator (like our brains) doesn’t actually create an actual replica of actual objects. Instead, physical objects are mapped to sensory information, and that sensory information represents actual physical objects, and is stored. This is equivalent to simulating the physical object, since reality is mapped to specific sensory input.
This has some implications for our theory. It means that instead of creating a rigid model of reality which consists of set categories, we instead present reality to our computer as a set of sensory signals. Instead of representing “object with properties x or y” we present “vision sensory signal a, sound sensory signal b, touch sensory signal c” and so on. This combination of sensory signals can then represent anything that exists in reality, and the computer can replicate, duplicate, iterate on, combine etc. those representations in its internal world. In that way, the computer can simulate all of reality in its internal systems.
For example, when the computer senses a rod, it stores the rod (the sensory signals that represent the rod) inside its simulated reality. After sensing multiple rods a number of times, the computer may start to recognize that these objects are very similar to each other, and seem to be used for the same purposes. So it may create a “generic rod” pattern in order to optimize CPU and space, and predict reality based on how a generic rod usually interacts with other objects.
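One way to picture the “generic rod” idea is the sketch below. It assumes, purely for illustration, that each sensed rod is reduced to three numbers (length, width, weight), that the pattern is a simple average, and that a new rod is stored only as its difference from that pattern. The function names are hypothetical, and a brain-like system would obviously do something far less tidy.

```python
def build_pattern(observations):
    """Average the signals of similar objects into one generic pattern."""
    count = len(observations)
    return [sum(values) / count for values in zip(*observations)]


def encode(observation, pattern):
    """Store only how an object differs from the generic pattern."""
    return [value - base for value, base in zip(observation, pattern)]


def decode(delta, pattern):
    """Rebuild the object's signals from the pattern plus its stored difference."""
    return [base + diff for base, diff in zip(pattern, delta)]


# Hypothetical sensed rods: (length, width, weight).
rods = [[100, 2, 5], [110, 2, 6], [90, 2, 4]]
generic_rod = build_pattern(rods)

new_rod = [105, 2, 5]
delta = encode(new_rod, generic_rod)  # mostly zeros, so cheap to keep
restored = decode(delta, generic_rod)
print(generic_rod, delta, restored)
```

Anything learned about how the generic rod behaves can then be reused for every object stored as “generic rod plus a small difference”.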
How to take it to computers
As long as the human-created digital computer world does not move from a “representing” paradigm to a “simulation” paradigm, we will not have strong AI, regardless of how powerful CPUs get. The representing paradigm has inherent limitations that cannot be solved, because of the trade-offs discussed above (and probably for other reasons which I haven’t discussed).
However, if we embrace the simulation paradigm, we can create the same strong AI found in our brains, but improve on the inherent weaknesses of a living organism. As organic implementations, our brains consist of living tissue that deteriorates fast, and has limitations in scale, power, and size. By implementing the same simulation paradigm in a non-organic physical system, we can overcome these weaknesses and create a brain unconstrained.