One of the main reasons for the confusion is that the definition of “to understand” is not clear enough to implement in a computer.
According to the Merriam-Webster dictionary, “to understand” means:
- to know the meaning of (something, such as the words that someone is saying or a language)
- to know how (something) works or happens
“To know” and “to understand” are similar, but they differ in a key way. Knowing is associated with facts, whereas understanding implies deeper knowledge. I know his real age, I know the capital of China, I know the difference between 10 and 100, I know who you are, and so on. You may know that money is used to pay for things without understanding the full mechanics of money and the financial system.
I propose a more formal definition of understanding that has two levels: social contextual understanding and internal modeling.
Social contextual understanding
In a society, you understand something if you know it the way your peers understand it. If you are an ant, you understand that certain pheromones lead to food. If you are human, people assume you understand how to operate a car if you have a driver’s license. If you graduated from high school, you have probably heard of linear algebra, or at least the word “algebra”, and so recognize that it has to do with variables. If you have a bachelor’s degree in computer science, you probably have a basic understanding of linear algebra. A person with a PhD in English literature probably doesn’t understand linear algebra, but someone with a PhD in mathematics probably understands it deeply.
The point I want to make is that understanding something is contextual to the groups you belong to. The most important group is human society, with its general human knowledge. That mutual understanding allows us to have shared systems such as traffic lights: if we see someone stopped at a red light, they are most likely not going to move for a while. Other examples are our judicial system and our ability to communicate with strangers.
So for a computer to be able to communicate with us, it needs to understand the basic human concepts we deal with every day. For example: food is for replenishing energy, cars are for transportation, sleep is for resting, beds are for sleeping, plants are organic, as well as notions like gravity and pain. A computer that doesn’t understand concepts from our everyday world is going to have a hard time understanding basic sentences like “last night was a rough night” or “Eating a donut is a slippery slope if you are trying to lose weight”.
Internal modeling
So what does understanding mean inside our minds? To understand something like a concept, an internal model must be stored in your brain so that you can manipulate the concept. A model is a simplified view of how something works. Think of tiny movie clips or comic strips that show how something works over time. This ability to manipulate the concept allows you to see and use it in different ways. For example, the concept of a car can be manipulated in your brain so that you can model what happens if a car falls off a building, drives through a red light, drives too slowly, or is larger than normal. For each concept we understand, we can run it through a sophisticated simulation engine that only humans seem to have.
Each human stores concepts in a unique way based on their background, their education, and the other concepts they already hold. The key thing to note is that the concept is assimilated into your mind alongside your world model and your other models and concepts. To understand a concept means you can focus on it in your mind and manipulate it in unforeseen ways. For example, using the same car concept from earlier, we can manipulate it in new ways, such as imagining a car made of jello, a car with 20 wheels, a car with multiple steering wheels, a car that can also fly, or a car with square tires. You could also use the car concept in a new way, such as a large paperweight, a weapon, or a bed.
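As a rough illustration of what “manipulating a concept” could look like as a data structure, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Concept` class, its `properties` and `links` fields, the `vary` method, and the `usable_as_paperweight` check are hypothetical names, and a real internal model would be far richer than a dictionary of properties.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Concept:
    """A toy stand-in for an internal model of a concept (hypothetical)."""
    name: str
    properties: dict = field(default_factory=dict)
    links: tuple = ()  # names of other concepts this one is tied to

    def vary(self, **changes):
        """Return a new variant with some properties overridden.

        The original concept is left untouched, the way imagining a
        jello car does not change what you know about ordinary cars.
        """
        return replace(self, properties={**self.properties, **changes})


def usable_as_paperweight(concept, min_mass_kg=0.1):
    """Reuse a concept for a new purpose by checking one property."""
    return concept.properties.get("mass_kg", 0) >= min_mass_kg


# Illustrative values only; none of these numbers come from the essay.
car = Concept(
    "car",
    {"wheels": 4, "material": "metal", "can_fly": False, "mass_kg": 1500},
    links=("transportation", "road", "driver"),
)

jello_car = car.vary(material="jello")
many_wheels = car.vary(wheels=20)
flying_car = car.vary(can_fly=True)
```

The sketch only covers the easy part: overriding properties that someone has already written down by hand. It does not build the model from experience or simulate what a jello car would actually do, which is the hard part this section is describing.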
At the lowest level, the internal modeling we have for every concept is physical and grounded. Again with the car example, we know cars are heavy, wheels are rubbery, the metal feels slick, cars can feel bouncy when in motion, they can stop suddenly, some people get motion sickness inside them, you can feel the wind if you put your hand out the window, and so on. There are physically grounded properties we can attribute to every concept we understand, including abstract concepts. To integrate a new concept into our minds, we often link it to other concepts we already hold.
Current computer algorithms are used to build specific models of things, like trajectory models, space models, velocity models, molecule interaction models, and basically every phenomenon we study in science. But no computer algorithm has been invented that takes in new information, creates a unique model for it, and associates that model with its other internal models. Every computer model currently built is manually engineered, whereas human minds build these models mostly autonomously (school helps).
If you look at our most advanced AI systems, like GPT-3, they appear to have models of concepts, but if you dig down just a little, you can see that GPT-3 doesn’t understand anything; it produces semi-coherent nonsense. GPT-3 is great in one regard: like other deep learning systems, it can be trained on data that is basically unprocessed. So you can feed in large databases of unprocessed text and it can create a model of that data, but the model is still wrong. What GPT-3 really learns is a model of which words often appear with other words. If you want to build a computer that understands things, build an algorithm that parses sensory data into thousands of invariant, abstract, manipulable, and interlinked models in a single system.
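To make “a model of which words often appear with other words” concrete, here is a minimal Python sketch of a bigram counter. This is not how GPT-3 works internally: GPT-3 is a large transformer neural network trained to predict the next token, and this is a far cruder stand-in. The function names `train_bigrams` and `most_likely_next` and the tiny corpus are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def most_likely_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# A made-up corpus, chosen only to keep the example small.
corpus = "the car drove through the red light and the car stopped"
model = train_bigrams(corpus)

print(most_likely_next(model, "the"))    # "car" follows "the" most often
print(most_likely_next(model, "jello"))  # None: the word was never seen
```

The counter can continue a sentence plausibly, yet nothing in it represents what a car is, that it is heavy, or what happens when it runs a red light. That gap between word statistics and an internal model is the one the paragraph above points at.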