
Just read this short and sweet paper from Guy Davidson et al.
Brenden Lake is also a co-author; I have read several of his papers.
Paper link: https://psyarxiv.com/byzs5/
They built mini programs in a DSL to represent game rules. The programs act as reward-generating functions (goals as reward-generating programs). Their line of reasoning follows Fodor’s Language of Thought (LoT) hypothesis, treating knowledge as programs and learning as program induction within the language.
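To make the "goals as reward-generating programs" idea concrete for myself, here is a minimal sketch in Python. It is not the paper's DSL: the state format, the predicate names (`thrown`, `in_bin`, `both`), and the `Goal` class are all things I made up for illustration. A goal is a small composed program that reads a trace of room states and returns a score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Assumption: a state is a dict mapping a relation name to the set of
# objects it currently holds for. The paper's actual state format may differ.
State = Dict[str, Set[str]]
Predicate = Callable[[State], bool]


def thrown(obj: str) -> Predicate:
    """Hypothetical predicate: `obj` has been thrown."""
    return lambda s: obj in s.get("thrown", set())


def in_bin(obj: str) -> Predicate:
    """Hypothetical predicate: `obj` is inside the bin."""
    return lambda s: obj in s.get("in_bin", set())


def both(p: Predicate, q: Predicate) -> Predicate:
    """Compose two predicates with logical AND."""
    return lambda s: p(s) and q(s)


@dataclass
class Goal:
    condition: Predicate
    points_per_hit: int

    def reward(self, trace: List[State]) -> int:
        """Award points each time the condition switches from false to true."""
        score, previous = 0, False
        for state in trace:
            current = self.condition(state)
            if current and not previous:
                score += self.points_per_hit
            previous = current
        return score


# A toy "throw the ball into the bin" game worth 5 points per successful throw.
goal = Goal(both(thrown("ball"), in_bin("ball")), points_per_hit=5)
trace = [
    {},
    {"thrown": {"ball"}},
    {"thrown": {"ball"}, "in_bin": {"ball"}},
    {"thrown": {"ball"}, "in_bin": {"ball"}},
    {},
]
score = goal.reward(trace)
```

The point of the sketch is that the same small predicates can be recombined into many different games, which is the property the paper cares about.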
Kids often engage in self-directed play, exploring and creating their own mini games such as stacking blocks or throwing balls. From a computational perspective, this is thought of as “generating new ideas and exploring novel search spaces”.
They took AI2-THOR, an open-source embodied 3D research platform, and made it look like a child’s room with toys all over (blocks, balls, and other stuff). Then they invited adults to a website (didn’t know it worked over the web!) to interact with the room and create a game, writing a description of how the game works and how points are scored. They ended up with ~100 games, and then turned those natural language descriptions into actual programs using their DSL.
They classified all the games and found these types, in order of frequency: throwing games (75), organizing games (10), and stacking games (8).
I love that they explored modeling games as programs. I have been thinking a lot about how to simulate the world with models, and in doing so I have been contemplating what the underlying data structure would look like. Is it vectors, neurons, natural language, programs, or something else? Programs as models are interesting because they are largely compositional: their primitive atoms can be reused and recombined in endless ways. There has already been a lot of research in this area, and some of it is referenced in the paper, such as PDDL from Ghallab et al.
They “found compositionality” in their programs, in that 50% of the programs reused and recombined the same blocks across games. They measured this by counting all the recurring code, and saw that certain blocks appear frequently but with minor variations. That means people are making semi-similar games around core themes with slight modifications. They also saw creativity, which they measured by high variance across subjects, and common sense via intuitive physics (throwing balls and stacking blocks).
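Counting recurring code across games can be sketched roughly like this. This is my own toy version, not the paper's analysis: I assume each game program is a nested tuple (an S-expression with the operator first), and the games below are invented examples. Each compound sub-expression is counted once per game, and only those appearing in more than one game are kept.

```python
from collections import Counter
from typing import Iterator, List, Tuple, Union

# Assumption: a program is either an atom (str) or a tuple (operator, *args).
Expr = Union[str, Tuple["Expr", ...]]


def subtrees(expr: Expr) -> Iterator[Expr]:
    """Yield every compound sub-expression of `expr`, including `expr` itself."""
    if isinstance(expr, tuple):
        yield expr
        for child in expr[1:]:  # skip the operator name at position 0
            yield from subtrees(child)


def shared_blocks(games: List[Expr]) -> Counter:
    """Count in how many games each block occurs, keeping blocks seen in 2+ games."""
    counts: Counter = Counter()
    for game in games:
        counts.update(set(subtrees(game)))  # set(): count once per game
    return Counter({block: n for block, n in counts.items() if n > 1})


# Three invented toy games.
games = [
    ("and", ("thrown", "ball"), ("in", "ball", "bin")),
    ("and", ("thrown", "ball"), ("on", "ball", "bed")),
    ("on", "block", "block"),
]
shared = shared_blocks(games)
```

Exact matching like this would miss blocks that differ only by a minor variation (say, a different object being thrown). Catching those would need some abstraction step, such as replacing object names with placeholders before counting.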
They have some interesting areas to look at in the future:
- Explore more models/programs that act on the game programs they built.
- Write more models that can generate, reason with, and pursue playful goals.
Some future questions they want to answer:
- Could they devise and train a model to predict whether a particular goal is easy or hard?
- Can a model propose games that are indistinguishable from human-generated ones?
TODO:
- reread https://cocolab.stanford.edu/papers/GoodmanEtAl2015-Chapter.pdf
- study LTL https://en.wikipedia.org/wiki/Linear_temporal_logic