A computational definition of grounded compositionality

“The meaning of a whole is a function of the meaning of the parts and of the way they are syntactically combined”. This is the usual statement of the principle of compositionality, but there does not seem to be a clear consensus on how to implement it in computers.

I have gone through several research papers and collected several different definitions.


The most common definition of compositionality is systematicity, first proposed by Fodor and Pylyshyn in 1988.

“The ability to produce/understand some sentences is intrinsically connected to the ability to produce/understand certain others”.

Systematicity is a complex way to say recombination. If we understand the parts and rules, we can recombine them.

We can understand new sentence combinations we have not heard before because we understand the individual parts. If the meaning of a word/concept/atom is understood in one context, then we should be able to use that concept in a different context.

For example:

If you understand “black dog” and “red cat”, then you should be able to understand “red dog” and “black cat”.

If you understand “5 boxes in the basket” and “red circles in the box”, you should be able to understand “5 red circles in the basket”.

In Fodor and Pylyshyn’s 1988 paper, they contrast this with a computer that stores every sentence in memory as an atomic unit. Given a new sentence that is similar but slightly different, such a computer would not understand its meaning.
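To make the contrast concrete, here is a minimal sketch in Python. It is my own toy illustration, not anything from the paper: the `SEEN` phrases, the "adjective noun" word order, and all function names are assumptions. The atomic lookup only knows phrases it stored verbatim, while the compositional version recombines word meanings it extracted from those same phrases.

```python
# Toy "training data": two phrases with their meanings (assumed for illustration).
SEEN = {
    "black dog": {"color": "black", "kind": "dog"},
    "red cat": {"color": "red", "kind": "cat"},
}


def atomic_meaning(phrase):
    """Phrase-book lookup: only works for phrases stored verbatim."""
    return SEEN.get(phrase)


def build_lexicon(seen):
    """Extract per-word meanings, assuming every phrase is 'ADJECTIVE NOUN'."""
    lexicon = {}
    for phrase, meaning in seen.items():
        adjective, noun = phrase.split()
        lexicon[adjective] = ("color", meaning["color"])
        lexicon[noun] = ("kind", meaning["kind"])
    return lexicon


def compositional_meaning(phrase, lexicon):
    """Combine known word meanings; return None if any word is unknown."""
    meaning = {}
    for word in phrase.split():
        if word not in lexicon:
            return None
        slot, value = lexicon[word]
        meaning[slot] = value
    return meaning


LEXICON = build_lexicon(SEEN)

print(atomic_meaning("red dog"))                  # None
print(compositional_meaning("red dog", LEXICON))  # {'color': 'red', 'kind': 'dog'}
```

The atomic version fails on “red dog” even though it has seen both words, which is exactly the failure systematicity rules out.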


Von Humboldt, and later Chomsky, described language as making “infinite use of finite means”. With a finite vocabulary, we can generate and understand sentences of unlimited variety and length. In practice, the length of our sentences is constrained by our biological memory.

Neural networks currently struggle to understand sentences that are longer than the ones in their training data.

In traditional programming languages this is not a problem. As long as a sentence follows the grammar, the interpreter or compiler can “understand” it, whatever its length.

An example: If we know several adjectives, we can combine them: “A big pink ugly smelly dog”
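A grammar-based parser handles this kind of example easily. The sketch below is a toy of my own, assuming a tiny hypothetical vocabulary and a single rule, "determiner, any number of adjectives, noun". Because the rule does not mention length, the same code accepts four adjectives or a thousand.

```python
# Hypothetical toy vocabulary, assumed for illustration.
DETERMINERS = {"a", "the"}
ADJECTIVES = {"big", "pink", "ugly", "smelly"}
NOUNS = {"dog", "cat"}


def parse_noun_phrase(sentence):
    """Parse 'DETERMINER ADJECTIVE* NOUN'; return None if ungrammatical."""
    words = sentence.lower().split()
    if len(words) < 2:
        return None
    if words[0] not in DETERMINERS or words[-1] not in NOUNS:
        return None
    adjectives = words[1:-1]
    if any(word not in ADJECTIVES for word in adjectives):
        return None
    return {"noun": words[-1], "properties": adjectives}


print(parse_noun_phrase("A big pink ugly smelly dog"))
# {'noun': 'dog', 'properties': ['big', 'pink', 'ugly', 'smelly']}

# A sentence far longer than any example above still parses.
long_sentence = "a " + "big " * 1000 + "dog"
print(len(parse_noun_phrase(long_sentence)["properties"]))  # 1000
```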

Contextual references

If you are able to reference an object in one situation, you should be able to reference the same object in different situations, but it might need to be referenced differently.

In Scene A, you have a dog you are targeting. It weighs 20 pounds and there are larger dogs with it. You call it the small dog. In Scene B the same dog is still there, but the big dogs are replaced with tiny dogs that weigh 10 pounds. The easiest way to reference it now is by calling it the larger dog. Your model must be able to use relative references.
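Here is a rough sketch of the two scenes in Python. The function name, the dog names, and the 40 and 45 pound weights for the bigger dogs in Scene A are my own assumptions; only the 20 and 10 pound figures come from the example. The point is that the description is computed from the other objects in the scene rather than stored as a fixed label on the target.

```python
def describe(target, scene):
    """Pick a size word for `target` relative to the other dogs in `scene`.

    `scene` maps a dog's name to its weight in pounds.
    """
    weight = scene[target]
    others = [w for name, w in scene.items() if name != target]
    if not others:
        return "the dog"
    if all(weight < w for w in others):
        return "the small dog"
    if all(weight > w for w in others):
        return "the larger dog"
    return None  # size alone does not single out the target


# Scene A: the 20 lb target among bigger dogs (their weights are assumed).
scene_a = {"target": 20, "rex": 40, "duke": 45}
# Scene B: the same target among tiny 10 lb dogs.
scene_b = {"target": 20, "pip": 10, "dot": 10}

print(describe("target", scene_a))  # the small dog
print(describe("target", scene_b))  # the larger dog
```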
