AIs and Humans with Agency
We are heading towards a major transformation of AIs. To date, large language models (LLMs) have answered questions, written essays and solved well-described problems. What they have not done is make decisions affecting the real world. I would argue that even the phrase "real world" is not understood by LLMs: they have no senses, and they have been fed writings conjuring up vast numbers of alternate worlds. Their "life" is to spew out responses to questions. If LLMs are sent out to businesses to perform tasks formerly assigned to human employees, they will need to possess agency, the power to act in the real world. More specifically, this requires them to work together with humans, to collaborate and plan with them, and to read their desires and emotions even though they have no emotions themselves. None of this has been explored by the enthusiastic billionaire purveyors of this technology.
When I first got involved with AI, specifically with computer vision, I was inspired by David Marr's book Vision. There he pioneered the idea that AIs and human brains need to solve the same problems, so there should be a common "Theory of the Computation" which can be instantiated either in silicon or in brain tissue. I found this convincing then and still do. In 2020 I wrote an arXiv post, 2010.09101, with the title The Convergence of AI code and Cortical Functioning -- a Commentary, describing the huge parallels between LLMs and the brain. This post aims to look at agency in the same way.
I begin this post with a description of the lengthy process during which humans acquire the skills to collaborate and plan while simultaneously connecting and activating their frontal lobes. This has been studied from both a psychological and a neurological point of view. I then go on to look at robots and toddlers learning mobility, and at Yann LeCun's recent JEPA approach. After that I look at early attempts to endow LLMs with agency, and at the errors those attempts have produced. Finally, I sketch what I believe needs to be done if AI is introduced to multiple businesses: a multi-headed architecture reminiscent of the Indian demigod Shesha. This remains a huge project that may provoke significant backlash.