CBMM10 Panel: Interacting with the physical world
Date Posted:
November 6, 2023
Date Recorded:
October 6, 2023
CBMM Speaker(s):
Leslie P. Kaelbling, Marc Raibert, Matt Wilson
Speaker(s):
Daniela Rus, Mehrdad Jazayeri
Description:
Which aspects of human intelligence require embodiment? Motor control: was it the key for development of human intelligence? Is embodiment necessary for consciousness?
Panel Chair: D. Rus
Panelists: M. Jazayeri, L. Kaelbling, M. Raibert, M. Wilson
DANIELA RUS: So welcome to our panel on "Interacting with the Physical World." My name is Daniela Rus. I am a roboticist, and also the director of CSAIL. And I'm excited to be your host today. Now, this session will be an exciting dialogue about the intricacies at the intersection of human intelligence, embodiment, and the development of AI.
Now, embodiment refers to the tangible physical form or the vehicle that houses intelligence. It serves as the interface through which intelligent entities, whether engineered or natural, perceive or interact with the physical world, and influence their environment. It is not a passive vessel, but a dynamic participant that shapes and is shaped by the intelligence it contains.
Now, intelligence, on the other hand, is the capacity to acquire, possess, and apply knowledge. It enables entities to navigate, to adapt, to manipulate the surrounding world. And in the realm of humans and animals, intelligence is inherently embodied, with brain and body working in tandem to produce coherent, purposeful behavior.
But in machines, intelligence is instantiated through intricate algorithms and hardware that are designed to mimic or replicate aspects of biological intelligence in a digital or in a mechanical way. So are embodiment and intelligence a synergistic duo, each amplifying and refining the other's contributions to a holistic, functional, and adaptive system capable of responding to the challenges of the physical world? Or could intelligence exist, thrive, and even surpass its embodied counterparts in an abstract, disembodied state, where physical limitations are nonexistent and the possibilities are boundless?
Our panel will navigate through nuanced aspects of embodiment and intelligence. We will ask whether motor control, a foundational pillar in the tapestry of human intelligence, was key to its development. How vital is embodiment to the intelligence we observe in living organisms and in engineered machines? Does embodiment play a pivotal role in intelligence?
So in the session, we seek to unpack some of these complex questions, exploring the terrain where biology, embodiment, and artificial intelligence intersect and interact. And together, we will discuss profound questions, hoping to shed some light on the interaction between embodiment and intelligence between the tangible and the intangible in natural creatures, in biological creatures, and in engineered creatures. We have four extraordinary panelists to discuss these topics.
Professor Leslie Kaelbling is a very esteemed computer scientist who has dedicated her career to research at the intersection of machine learning and robotics. Leslie's contributions are extraordinary. They advance the frontiers of knowledge in AI and robotics. And they combine symbolic and model-based reasoning about the world. Her work is both deep and transformative, and has carved pathways for practical and impactful applications in intelligent systems.
Next, we have Marc Raibert, the visionary founder of Boston Dynamics. He's known for his pioneering efforts in robotics. Marc's efforts have given life to many engineered creations that mirror the agility and grace of living systems.
He has built robots that run, leap, and dance. And he doesn't just craft technology. He reshapes our understanding of what is possible with technology.
So joining our roboticists on the panel are two neuroscientists. Professor Mehrdad Jazayeri has expertise at the convergence of neuroscience and AI. His work dives deeply into the intricacies of the brain to fetch insights that are not only academically enriching, but are also crucial for the progressive development of AI. His work is a bridge that connects the functions of the mind with the infinite potential of machines.
Our fourth panelist is Professor Matt Wilson. His research is a treasure trove of insights into cognition, learning, and memory. Professor Wilson specializes in deciphering the roles and functionalities of the hippocampus, and his findings are instrumental for anyone venturing to design machines that can learn and evolve intelligently.
So without further ado, let's delve into the panel. And I would like to invite Leslie to give her opening remarks. Let the dialogue begin.
[APPLAUSE]
LESLIE KAELBLING: So hi. I'm Leslie. I do robots. OK, so what do I want to do? I want to make robots that work in complicated real-world domains.
And I think it requires human-level intelligence. If you think about doing disaster relief where everything is crazy, the structures are gone, helping a person in the household, doing construction on a complicated and messy construction site, managing a depot-- that really requires intelligence. And so I want to think about what class of intelligent systems are we studying.
So you could say at this moment, we have the large language models, and those are intelligent systems. But maybe we should refine our scope a little bit. You could think of systems that interact with the world around them. Those might be stock trading systems, for instance, or things like Siri in your phone.
There are embodied systems like robots. At least they have some physical connection to the world around them. You could think about animals. You could think about humans. There's all these different categories of systems that are intelligent.
And as a person who's interested in this constellation of areas, the question is, what subset of these systems should we study? And I would like to argue that systems that are actually connected to the world and have to operate in the world-- like you, like a dog, like a robot that's helping people-- are faced with roughly the same set of questions. And they have the same set of structural regularities in the world that they can exploit. And so I have a hope that, in fact, because we all face the same problems, it might turn out that the same underlying technical ideas will help us solve the whole problem class.
So in both cases-- in humans and in robots-- there's a closed-loop sensorimotor interaction with the environment. And you see overt signs of the intelligent thoughts that are going on inside. So in a Quest mission that I'm involved with, the goal is to define a broad range of naturalistic tasks, to study humans and animals that are solving them, and to build general-purpose robots that can try to solve the same set of tasks. And we hope that the similarity in the situation and the similarity in the problem class will induce similarities in the solutions.
So here's the problem I face as a robot engineer. I'm the person, I'm in the factory, I'm trying to make these robots. These robots should go out in the world and operate in a wide variety of different circumstances.
And the question is, what program in the world should I put in the robot so that it can be as good as you at helping somebody in their house? So that's a super-hard question. I don't really know how to solve it. And so I don't know. I was hoping maybe I could learn something from the natural intelligence people.
So what could I learn from the study of natural intelligence? I think some things I'd like to learn. What kind of stuff is innate? What should I try to build in the factory? What should the robot learn when it's out in the world?
One thing I'm particularly interested in is what corners we can cut safely. Almost all the problems that we study formally in artificial intelligence are intractable. You can't solve them.
If you're a theoretician, you might just say, oh, no, I can't do that problem. It's too hard. But humans do these problems at some level, and they do them by cutting corners. So I would like to know what corners you cut so that I can make the robot cut the same corners and be as effective as you.
There's a question about modularity. If I'm going to engineer systems, I have to engineer them, I think, with some modularity either in the system or in the process, because I'm a human and I can only understand a small piece at a time. So if I'm going to put pieces together, I'm hoping to learn about some kinds of modularity in natural systems and use that to inspire modularity in engineered systems.
There's questions about how brains encode spatial information and how brains encode information about object memory. How do you think about where you left your car keys, for instance? So these are all questions that are of great importance to everyone. But they would really help me if I knew some answers.
Similarly, we could ask, well, what can natural scientists learn from the study of robots? And there's the classic things. What are maybe some externally measurable correlates of different kinds of internal algorithms or structures?
One thing I think is important is what kind of perceptual information do we really need for performing actions in the world? There's a huge community focused on naming objects, but that might not be the most important thing in terms of visual perception. So what information is really necessary and how can we put these things together?
Another thing that might be helpful to talk about-- where people who understand something about robot systems could help natural science-- is thinking about hypotheses about algorithms and strategies for dealing with uncertainty, both at the low control levels and at very high levels. So thinking about uncertainty is another one. And another thing: if we think about systems that learn, there's a really important question, especially in embodied systems, about the degree to which embodiment actually changes the calculations about how much data costs, and how much data you can get, in order to effectively learn.
There are also questions, I think, that are really very important for embodied systems, about lifelong learning. How is it that you learn over time and change the way that you gather your own data? So I'm hoping that by studying embodied systems, we can all together develop a unified theory of natural and artificial embodied intelligence. Thanks.
[APPLAUSE]
MARC RAIBERT: I am so glad to be here. You know this is the 10th anniversary of CBMM. But it's the 50th anniversary of me being connected to Course 9, which used to be the psychology department when I was here.
I was a graduate student. And I started here 50 years ago last month. And it's interesting to hear the description of the history, where there was the connection over the last 10 years between computation and biology.
And then someone referred to the '80s, when there was a bridge being built. Well, there was actually one in the '70s, too, when David Marr was here and Tommy Poggio showed up as a visitor while I was a graduate student. And I think we shared a bicycle for a while or something like that. So thank you very much for having me.
Well, I started out here as a neurophysiologist-- I'd been an engineer as an undergrad, working in Peter Schiller's lab. And Emilio Bizzi was nearby, and I paid a lot of attention to his stuff. But in 1974, just about 18 months after I got here, I fell in love.
I fell in love with Nancy. But I also fell in love with robotics. I wandered over to the AI lab after an IAP class, and there was a robot arm taken apart on the table in about 1,000 pieces. And I just looked at that, and from that moment on, I was a roboticist. So that's just background.
There was a time about 10 years ago when I started using this slide, talking about intelligence having two parts, when there was so much excitement about AI and I felt like we were being left out as roboticists. So I thought, well, maybe I could claim that the physicality of robots and machines is a kind of intelligence, too. And I started thinking in terms of the athletic part-- the part that makes us balance and conserve energy, and interact in real time with other objects that are moving around the world in the real-time perception and all those things.
Those all fell in the athletic part. But then there was the part that we didn't do for many years, which is the thinking part-- making a plan of how to get to the airport on time and figuring out, well, what's the traffic going to be, how long is it going to take me, and working back through those things. Or all of you just using your cognitive intelligence to understand the words I'm saying and what they mean and relate them to what's going on.
So I'm going to show a couple of examples of athletic intelligence. And then I'm going to say something about my recent embracing of cognitive intelligence. I think in athletic intelligence, there are two basic approaches which are active right now. Most people call one the more traditional approach, which is using control systems.
Model predictive control is the one I'm going to show examples of. It's really not traditional. It's currently being developed and is very active.
But it's more algorithmic and hand-created. And in fact, even today, it's getting better and better all the time because people are investing research in it. And then the other is the more learning-based algorithms, which have caught the wave of everybody's imagination. So I'm just going to show some examples.
This is the Atlas robot developed by Boston Dynamics. And these are clips that have been developed over the last seven or eight years. This is a robot that's got 28 degrees of freedom. It's got computers on board-- well, all on board for this one-- that are reading all its sensors, doing real-time control in order to balance it.
In this case, the robot's got a vision system so it can see where the blocks are. And although it has some pre-run information that's telling it what we want it to do, all the real-time balancing, control, and adjustment to coordinate the robot with the terrain are going on here. You can see inside the robot's brain that it's looking at surfaces, seeing what the surfaces are, using an understanding of its mechanics and dynamics in order to coordinate the placement of feet and things like that.
And then recently, we've been expanding model predictive control to be able to do manipulation as well. So here, the controls are really controlling all of the degrees of freedom in the body of the robot, as well as some of the constraints that relate to the interaction between the robot and the world it's operating in. And all those things are important in order to get the dynamics of these behaviors to work out.
For a long time, we've been interested in really dynamic maneuvers like athletes do. And here's the most recent one. So that's representative of the model predictive control world. And again, even though some people call this traditional, it's really still an evolving and very powerful force. And I think that the highest achievements have still been done that way, even though people are optimistic about what's going to happen in machine learning.
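To make the model-predictive-control idea concrete, here is a minimal sketch in Python. The dynamics, cost, and constants are invented toy values, not anything from Boston Dynamics: at every tick the controller uses a model of the body to search over short action sequences, executes only the first action of the best sequence, and then re-plans.

```python
# Minimal random-shooting MPC sketch: balance a 1D point mass at the
# origin. A real humanoid controller optimizes over 28+ degrees of
# freedom and contact constraints, but the replanning loop is the idea.
import numpy as np

DT, HORIZON, SAMPLES = 0.05, 10, 256

def rollout_cost(state, actions):
    """Roll the model forward from state = (position, velocity)."""
    pos, vel = state
    cost = 0.0
    for a in actions:
        vel += a * DT                 # force integrates into velocity
        pos += vel * DT               # velocity integrates into position
        cost += pos**2 + 0.01 * a**2  # penalize error and effort
    return cost

def mpc_step(state, rng):
    """Sample candidate action sequences; return the best first action."""
    candidates = rng.uniform(-1, 1, size=(SAMPLES, HORIZON))
    costs = [rollout_cost(state, seq) for seq in candidates]
    return candidates[int(np.argmin(costs))][0]

rng = np.random.default_rng(0)
pos, vel = 1.0, 0.0                   # start displaced from the target
for _ in range(100):
    a = mpc_step((pos, vel), rng)     # plan, execute one action, re-plan
    vel += a * DT
    pos += vel * DT
print(f"final position: {pos:.3f}")   # settles near 0
```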
Now, I'm as embodied a guy as you're going to find. And I feel like I would be remiss if I didn't remind everybody that the hardware really matters, too. And innovating on the hardware side, I believe, is I'll say just as important-- I don't know if it's just as, but approximately as important as innovating on the computation side.
And not only does the computer science world sometimes ignore this point-- there are so many people who think computation is going to do everything-- I think this is true in, I'll say, your world, the brain world, too. Because there's a rift between the brain people and the biomechanics people. I used to organize conferences where we tried to pull together the biomechanics people and the neurophysiology people.
And man, there's no love there. And maybe that's changed now. But when I was doing it, there was no love there.
So I implore you to think about the body and its characteristics. The body does computing, too, in my view of the world. And I think it's really important.
Well, for the last 30 years-- or actually for the last 50 years, I've been working on the athletic side. And just recently, I started a new Institute which is located just two blocks from here in the Akamai building, which is continuing to work on the athletic side, but is really working on cognitive intelligence as an important component. We actually pitched this, and got it funded before ChatGPT had captured everybody's imagination. And so now I feel like we're drowning in attention-- maybe more attention than we want.
And I'll just give you some examples of the kind of things we would like to do. The institute's only been around for one year, so we don't really have any results yet. But right now, to get Atlas to do those performances you saw, you really need a whole room full of programmers that work very seriously for a fair amount of time in order to get the robots to do a choreography like that Parkour course.
And if I asked you to do it, if you had the physical skills, you could just listen to me describe like I was your director, and then go out and do it. And if I asked you to take on this assembly line job, you could go watch that man do his assembly job, and then take it over. Maybe it'd take you 15 minutes to watch, understand, do what he was doing.
So we have a project called Watch-Understand-Do. And not only do we hope this is going to lead to more practical ways of telling robots what you want them to do. But in order to understand enough about what it means to watch someone do something-- to understand all the steps and the intricacies of the physical task, the sensing, and then to do it, I think we're going to learn a huge amount about the kinds of things that this group and the CBMM is interested in.
There's the robot doing it-- that's the concept. This is all science fiction today. But we're going to work on making it happen.
Here's another project called Inspect-Diagnose-Fix, where you have a robot in an environment where there are machines. And sometimes those machines don't work right. We already have that robot in the space that this is a drawing of.
It's an integrated circuit fabrication facility up in Vermont. Ironically, I worked there as a co-op student in 1969 and '70. And now, 50 years later, Boston Dynamics has robots that are going around and taking measurements from the machines. They are not diagnosing or fixing anything yet. But that's, I think, the next step in the path.
And of course, I've shown examples of industrial applications. But the same kinds of skills and techniques can be used for domestic tasks as well. So here's Inspect-Diagnose-Fix. I'm sure all of you have had something break at home and had to deal with this. Thank you.
[APPLAUSE]
MEHRDAD JAZAYERI: Yeah. Hi, everyone. Thank you for inviting me. I feel very much out of my depth talking about embodied AI.
There's nothing to add after what you have seen from Leslie and Marc. I'm an enthusiast more than an opinion leader in how to think about embodied intelligence. But I have thoughts about it, so I'll just share them with you.
It seems to me that the dominant view in the machine learning circles is to build bigger and bigger models, train them with larger and larger data, and use more and more computing power, and intelligence will emerge. In large parts of neuroscience, the view is that the solution will present itself if we do enough experiments to have a full parts list of the brain, from genes to cell types to connectomes. The stuff we're talking about here is close to my heart. I think embodied intelligence is a really interesting topic as a way of thinking, and another way of getting our foot in the door of intelligence.
So I want to talk about two types of intelligence, one of which we haven't talked about-- and that's emotional intelligence, which is actually very close to my heart as an important topic. The other is what was just mentioned, cognitive intelligence. I won't talk about athletic intelligence because I don't know anything about it. But I'll talk about cognitive and emotional intelligence as two types of intelligence that I think are important to consider for a general program of embodied intelligence.
Now, I don't know how much of what I will say will be relevant for robotics. But I think it's really important for neuroscience-- for natural intelligence. And the fact of the matter is, even in neuroscience, we really have not started to study embodied intelligence. We've largely just been studying disembodied intelligence. So I think it's important to think about that in a domain that at least is close to my heart.
So in terms of cognitive embodied intelligence, I think it's not hard to appreciate intuitively why natural intelligence is embodied. You see a little kid playing with wooden blocks, and you hand them a new block. They examine it manually for a moment. And then they use it to make new structures. Clearly, that active exploration and dealing with the physicality of the object is an important part of that.
This is also true in the animal kingdom. My research in the lab is largely on non-human primates-- macaque monkeys. And the rich behavioral repertoire of macaque monkeys shows a similar embodied kind of intelligence. When they open packages to steal goods, when they trade objects for food, when they crack things open with tools, when they look at another monkey and learn what they need to do just from watching that monkey's movements-- in all of these, the movements seem to be an integral part of the intelligent behavior.
And we can think about that cognitive embodied intelligence in two ways. One I would consider the soft version of embodied intelligence, which is that maybe you don't need the actual body, but the key is that throughout evolution and development, the data structures on which the intelligent functions operate are really organized in the space of movements-- or at least have an important part of them organized in the space of movements.
The world in the monkey's mind or in a human's mind is not just more and more abstractions of the sensory inputs. Maybe you think about affordances in a way that is really important and different from appearances. So if you think about that, it seems important to study embodied intelligence, because it might give you a hint toward what the appropriate data structures are that brains rely on to function, even if the actual movement is not important.
And there is, of course, a hard version of embodied cognitive intelligence, in which you would imagine that, in fact, the sensory reafference-- which is just a technical word for saying you have to move your hand and get the feedback to come back to the brain-- is actually part of the process of being able to operate and do interesting computations. It may be the case. Certainly, it is the case that if that reafference is not present, many of the very basic things that humans can normally do, they cannot do. There are patients that attest to that.
So that's on the cognitive embodied intelligence, and I think it's really important to study it. And that's something that my lab has begun to get at by moving slowly away from reduced, contrived lab tasks to tasks where the monkey sits in front of a tabletop, deals with physical objects, and tries to figure out how to, for example, find a grape among objects that are movable or not movable, obstacles-- things that you might want to avoid, and things of that kind. And we hope, by doing that, to get inside the brain and see what are the data structures, and what are the functions that operate on those structures. And maybe we can learn about the representations and functions that are structured in relation to movements, not just in relation to how things look.
The second type of intelligence is one that I haven't started to work on, but I'm fascinated by it, and it's sort of a dream to get my hands on studying it at some point: emotional intelligence. Now, emotional intelligence encompasses such fundamental concepts as self-awareness, self-regulation, motivation-- obviously important for natural intelligence. And the idea of emotional intelligence is somehow using emotional states favorably, and not falling victim to them.
Emotional stress can motivate you or can cause a loss of control. So the idea of how you use those emotional states to produce context-dependent, condition-dependent, appropriate computation is a really important part of natural intelligence. And presumably, if you want to have robots that feel and care about other people's feelings, and have empathy, maybe we should be thinking about this space for robots as well.
Now, what are emotional states? In reality, we actually don't know. I think there is no real understanding of what emotional states are. In the animal models, where you can get your foot inside and try to understand things, really we have an impoverished theoretical framework for emotions.
We think of valence and intensity as being really two important axes, and we put our emotions on that two-dimensional space. I'm sure all of you, like me, have much more colorful emotions and much higher dimensional emotional space than valence and intensity. That's a very low bar for deciding what emotions are.
So what are emotional states? Again, I think it's not implausible to think that the patterned body states-- the embodiment has something deep to do with the emotional states that we experience. Anxiety of public speaking, as I am experiencing it now, comes with, like, dry mouth, butterflies in the stomach, heart racing.
I see a loved one-- again, heart racing. But then, I have a warm feeling in my chest. So there's these rich, patterned body responses that are associated with emotional states. And I think it's not crazy to imagine that those body states have something to do with those emotional states. And therefore, the idea of thinking about emotional intelligence as a really integrated, inherent interaction between brain and body seems to me like a really fruitful idea to explore.
And if we think about emotional intelligence as a computational problem, then the real idea here is that emotional states and their regulation are instantiated by, and controlled through, a control loop that connects interoceptive ascending pathways from the body states to the brain, in conjunction with controls that the brain sends to the body through the autonomic nervous system. And this interaction is fundamental. And of course, many of the things we have learned in the somatic system about control-- having a model of the body, computing prediction errors, combining those errors with sensory feedback to generate appropriate controls-- these are all concepts that are used in more traditional robotics, and that are also fundamental to how we control our skeletal-motor system. I think they are potentially also very, very important for how we control and regulate our emotions.
OK, so I just want to leave you with one idea that is intriguing to me and fun to share. Imagine an emotional state. I just want to give a little more tangible sense of what I think body states could be doing for this emotional intelligence.
So I get into a situation. I have an emotional reaction. I'm in a particular emotional state-- let's say fearful.
So one way to do this emotional processing disembodied is to dedicate a whole part of my brain that keeps remembering that I'm in this fearful state. Like, I have this recurrent activity, persistent activity in my working memory saying, remember, you need to be careful. You're in a jungle. Something bad is about to happen, so be careful.
Another way to do it is to actually download that information from the brain and use your body as a storage device. So here's one idea of how emotional intelligence could be embodied. The body's natural time constants are quite a bit longer than the brain's.
So one way to do this-- and it is not inconsistent with what we know about the brain-- is to send a phasic control signal from a place like prefrontal cortex or cingulate cortex, where we believe these types of signals exist, down to the body. And then the body can hold onto that information for a certain amount of time. And more important than holding it, the ascending pathway now becomes a tonic control signal that puts the brain in the appropriate state, if the brain has an appropriate internal model for it. That can then guide the brain to do the right thing under that emotional state.
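To give that loop a tangible form, here is a toy numerical sketch. The dynamics and time constants are entirely invented (and, as the speaker says of the idea itself, this is science fiction, not an established model): a brief phasic command drives a slow body variable, and the body's lingering ascending signal then keeps the brain in the corresponding state long after the command has ended.

```python
# Toy "body as memory" loop: a 0.5 s phasic burst from the brain is
# stored in a slow body state, whose tonic ascending signal sustains
# a brain state for tens of seconds. All constants are made up.
import numpy as np

DT = 0.01
TAU_BRAIN, TAU_BODY = 0.1, 5.0   # assumed: body is ~50x slower

steps = int(60 / DT)
brain = np.zeros(steps)          # e.g., a prefrontal "be careful" signal
body = np.zeros(steps)           # e.g., heart rate above baseline

for t in range(1, steps):
    time = t * DT
    phasic = 1.0 if 1.0 <= time < 1.5 else 0.0    # brief descending burst
    # Body integrates the descending command and decays slowly.
    body[t] = body[t-1] + DT * (-body[t-1] / TAU_BODY + phasic)
    # Brain state is driven tonically by the ascending body signal.
    brain[t] = brain[t-1] + DT * (-brain[t-1] / TAU_BRAIN + body[t-1])

for probe in (1.4, 5.0, 20.0, 50.0):              # sample a few moments
    i = int(probe / DT)
    print(f"t={probe:5.1f}s  body={body[i]:.3f}  brain={brain[i]:.3f}")
```

The point the printout makes: the brain signal outlasts the half-second command by orders of magnitude, decaying on the body's slow timescale rather than the brain's fast one, with no persistent neural activity doing the remembering.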
This is just to give a tangible-- of course, this is all science fiction at this point, but a tangible sense of how embodied intelligence could be really a serious way of thinking about controlling our behavior in situations where it's state-dependent. And I'm excited to hear what all of you and the panel have to say about this. I'm largely here to learn, but here are my thoughts anyway. Thank you.
[APPLAUSE]
MATT WILSON: Yeah, the 10 years has really been inspiring. I think the next 10 will be even more inspiring, because you can see the intersection of ideas in biology and cognitive science-- the really amazing videos that Marc shows. You can see where robotics and the engineering of intelligent systems are really going to lead to many, many great things.
Now, what do we have to contribute? Mehrdad and I were put on the panel as the neuroscientists-- this question of what does the brain have to contribute to the anticipated progress in the development of synthetic intelligence, but also to our understanding of human intelligence? And so Daniela and the group, we were thinking about what we would talk about. That was an issue. What do we actually have to contribute?
And so in thinking about that, I was sitting on my back porch. The inspiration for me came from squirrels. We have trees in the back.
And one thing about all these trees is they're just filled with squirrels. The squirrels are back there at this time of year. They're collecting all the various nuts and berries. And if you watch squirrels, they're really remarkable.
I mean, they are remarkably acrobatic. They go from branch to branch. They will jump from tree to tree.
And you think about what is the process? You can think, oh, could we build a robot that would actually do that? And you could imagine, there would be the engineering of the mechanical systems. There would be the planners and forward models that would allow you to predict and anticipate where you could go. But when you think about how does that actually manifest in the brains of squirrels, what does squirrel intelligence look like?
Now, the area that I study primarily is the hippocampus, where we think about the representation of space and how that representation is actually used to plan behavior. And in rodents-- rats-- we typically do this in kind of simple two-dimensional spaces. And when you record from the brains of rodents, you see that they think in spatial terms. Ten times a second, they're imagining: where am I now, where could I go in the future?
And so they have a very simple plan which you could simulate using your favorite next-state predictor. So you have a model, a state predictor. It could be a simple convolutional network. You could use transformers. It's just: given where I am now, what's going to happen next?
So you have the squirrel next-state predictor. It is: can I make it over to that branch on the next tree? And from that, if you actually watch them, they will capture the physics of the branch to allow them to bounce to the next branch. OK, so fine. So you imagine that in their brain, there is a manifold-- a space of planning, in which they see the world in not just three-dimensional, but tree-dimensional-- you like that one-- tree-dimensional terms.
[LAUGHTER]
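In code, such a next-state predictor is a small idea. Here is a minimal sketch: a linear model fit to trajectories of an invented "branch physics" system (a damped spring), which can then be rolled forward to imagine where a jump will land. Nothing here comes from the talk; it only illustrates the given-where-I-am-now-what-happens-next loop.

```python
# Fit a next-state predictor from experience, then roll it forward
# to "imagine" a trajectory without acting. State = (position, velocity).
import numpy as np

rng = np.random.default_rng(0)

def true_step(s):
    """Hidden world dynamics: a damped spring (the 'branch physics')."""
    A = np.array([[1.0, 0.1], [-0.3, 0.95]])
    return A @ s

# Collect experience: many (state, next_state) pairs.
states = rng.normal(size=(1000, 2))
nexts = np.array([true_step(s) for s in states])

# Fit the predictor by least squares: next ≈ state @ W.
W, *_ = np.linalg.lstsq(states, nexts, rcond=None)

# Imagination: iterate the learned model forward, offline.
s = np.array([1.0, 0.0])
for t in range(5):
    s = s @ W
    print(f"t={t+1}: predicted state = {np.round(s, 3)}")
```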
So now, you think, OK, so this model in some way is actually embedded in the representation of space and the world that is available to them in this next-state predictor. Now you imagine a person up in the tree, sitting on the same branch, thinking the same thing. Now, what is that person going to do?
Now, you could take your favorite-- again, transformer model. You're going to train it on the vast repository of videos on the internet-- lots of videos of squirrels. ChatGPT will figure out, oh, look, the squirrel is going to make this jump.
And you say, oh, what about the person? What's the person going to do? Well, if you know anything about the internet, you would think, OK, the smart thing is obviously, the person knows they can't actually make the jump. And so they won't do it.
Of course, on the internet, there are plenty of videos of people doing stupid things, and actually making the jump. Take the Jackass movies. These are things that people do, and that is that they will violate their internal representation-- if you look at the manifold of space, that manifold does not actually continue on to that next branch. And yet, they will actually make the leap.
So you can think of there being two spaces-- one, the grounded space, in which actual next-state predictions are based on the available affordances. The squirrel has a grounded representation of space. And you think about that-- the grounded representation is actually based on the available affordances. And this is something that's very deep. If you look at the spatial representations in a rat, for instance, it moves in three-dimensional spaces, but along surfaces. They run up walls.
And if you look at the manifold of that space, it is, in fact, two-dimensional, even though they move in three-dimensional space. If you look at this in a bat, the representation of space is actually three-dimensional. So bats have available to them-- they can think in this three-dimensional space.
What about the person? They're in the tree. They have no three-dimensional manifold. And yet, they will make the ungrounded leap. And so thinking about that divide between the grounded and the ungrounded-- there are the things that you can do, the robot that you would program knowing its affordances. This is what you can do, and that's the squirrel.
But then there are the things that you can't do, but you might be able to do. And so, Leslie, you brought this up as the pure abstract space, where you just imagine everything that might possibly happen. It's interesting, because when you look at the neurophysiology of this system-- the hippocampus, in which you see the kind of explicit prediction based on constrained affordances, the grounded representations-- there's another state in which that continuity is violated.
And that's in these states of quiet wakefulness or sleep, where we see the same expression of continuous evaluation of these little manifolds over short periods of time, but in a discontinuous way-- small chunks where you put together short sequences. And so the idea is to think about how you would actually incorporate both the necessary grounded constraints that tell you what you can and should do, and combine them with the ungrounded evaluation that allows you to imagine what you might do. And we see in rodents, and presumably in other organisms, that there's this interplay between evaluation of the continuous manifold and then, during these offline quiet, wakeful, or sleep states, the discrete juxtaposition of grounded elements-- but potentially in novel, ungrounded ways.
And so back to the person. The person's going to make the jump. You know that. You've watched the videos.
They will make the jump. They will probably not make it to the branch. But there will be a trace of that experience, as Leslie pointed out. This is one of the questions: how do you learn from experience?
To me, the most interesting part of both the study of the hippocampus and the work by John O'Keefe and the Mosers, which was awarded the Nobel Prize in 2014, is the insight that the same structure that is involved in this kind of predictive encoding and evaluation of next-state in navigation and movement is involved in the encoding and use of episodic, autobiographical, self-embodied memory. And so, we imagine, when the person makes the leap, there will be a single, one-time trace of exactly how that played out. The substrate for that episodic memory will be exactly the same substrate-- the system of the hippocampus-- that will capture that sequential or episodic memory.
And now, somehow, that experience or memory will be translated into a change, presumably in the manifold of the representation, that will be used to do planning. So that the next time, either they do it better-- jump farther-- or they get something that changes the kinematic model. I don't know, carry some balloons, put on some wings. But it's the transformation from the episodic memory into the planning manifold that's most compelling.
And I just want to point out that tomorrow, the first panel-- I don't know if Ila is here, but Ila is going to be on it. And she's done some really beautiful work showing that the same system-- that is, the system in the hippocampus, the grid cell system that's used to actually capture and represent these spatial manifolds-- also confers the ability to encode and express these episodic, sequential-like memories with a high degree of fidelity and capacity. So what's good for space is good for memory.
And so think about how this one system can combine space, episodic memory, and the continuous grounded and discontinuous ungrounded evaluation of experience, and form models that allow organisms to do what a transformer model does only after being trained on all of the available data-- data that has embedded within it all of the embodied constraints. So really, is it smarter than a squirrel if it knows what a squirrel is going to do? The squirrel does it without any training, and so it's this question of how you incorporate those sorts of embodied constraints.
And I think it's interesting to think about how something like the hippocampus and these low-dimensional manifolds, which are a key part of the whole system-- how they incorporate the fundamental elements of a forward kinematic model, to really be able to predict where the squirrel is going to go. Is it actually running a full kinematic simulation? I mean, it could be.
Or perhaps the planning space-- the representational space for action-- has already incorporated that. In a sense, it has been trained evolutionarily on the same kind of data that you would have used to train, perhaps, a transformer network. So this points to something that Jim had raised earlier: thinking about how you might evaluate or diagnose a network-- let's say an artificial network-- to think about how you might incorporate the same principles that have already been embedded in biological systems, in ways that would allow them to perform the kinds of feats that we would hope robotic systems would be able to do. That is, to be able to rapidly adapt to changing environments, to learn very quickly, to modify their actions and behaviors.
DANIELA RUS: Amazing.
[APPLAUSE]
So that was really a tour de force. I'd like to invite the panelists up front, so we can have a conversation. So that was really fantastic. I've learned so much.
And Marc, you're sitting right next to me. So let me ask you. Your robots are extraordinary. Do you actually find inspiration in the natural world? Or how do you think about the engineered solutions of your extraordinarily agile and wonderful machines? How important is natural embodiment as a source of inspiration for you?
MARC RAIBERT: Well, I think much of the inspiration for what to try and make a robot do comes from seeing a person or an animal do it. I'm also a squirrel enthusiast. And I think the first time I was ever interviewed, which was a long time ago, I used the squirrel. And I said, boy, if we could just do what a squirrel does, that would be amazing.
I first got started-- I showed that picture from 1974. In 1974 or '75, I went to a conference. And someone was showing a six-legged robot, very slow. It was trying to work the way a table works, where you keep the center of mass supported.
And I looked at that and said, wow, that's nothing like people and animals. People are dynamic things. They're bouncing on compliance. They're storing energy, they're balancing. So all the inspiration came from biology.
But if you went further, when we do a design these days, we don't worry about mitochondria for energy. We use either motors or engines or batteries. We use the engineering tools that are available to us in the fine structure of our solutions. I mean, I think there's opportunity to take motivation there, too. But I think, at least on my projects, we don't go all the way down.
DANIELA RUS: What do you think, Leslie?
LESLIE KAELBLING: You mean am I inspired by natural systems?
DANIELA RUS: Yes. And also, Marc just shared with us that deep biological insights are not so relevant for making machines.
LESLIE KAELBLING: I guess in my actual practice, I do a fair amount of armchair cognitive science via introspection. So in that sense, I'm inspired by natural systems. But I will try anything in order to get a robot to work better. So I love to be inspired by things from nature, but I don't feel beholden to them.
DANIELA RUS: Matt and Mehrdad, are there aspects of intelligence that require embodiment? What can we, on the engineering side, learn from your greatest insights into intelligence?
MATT WILSON: Embodiment-- if you really think about it as those elements of intelligence that are grounded, that is, that reflect the constraints, or known constraints, of our interaction with the world. And so I think of that as the rational world. And when you think about that-- OK, grounded experience, grounded behavior-- it makes sense if you simply want to ensure optimal interaction with the world.
But think about how you actually respond in an adaptive way when the environment changes. And so being able to think in an ungrounded way-- and Leslie, you pointed this out. I mean, you have synthetic AIs that aren't constrained in this way. When they're not embodied, everything is possible. And you can find arbitrary, often hidden, connections and correlations, which can be quite valuable.
So you could say, well, why don't we just have like one biological intelligence do that? Why doesn't biological intelligence just go free-wheeling and search for hidden connections? Well, it does. In fact, there are these two competing forces.
One, the rational, grounded approach, which often is engaged and expressed during active behavior. We see there are these two states-- the active attentive interactive state, and then this offline state, where there's no longer the need to actually interact with the environment. Now the ungrounded evaluation goes on.
We think about that as happening in these quiet, wakeful states and these sleep states. Interestingly, think of sleep. What is sleep like? It is like the full, ungrounded search of this possible space.
And sleep deprivation-- if you deprive someone of sleep, the body is really pushing you to go to sleep. And you get into this state where you are asleep but actually awake. And what that looks like behaviorally, phenomenologically, is psychosis. Psychosis, in a sense, it's delusions, it's hallucinations. It's the brain now unmoored and free to imagine anything.
MARC RAIBERT: So when I was listening to Matt's talk-- especially when you were talking about the Jackass thing, where the human takes the jump that they can't make. And I was thinking about what Mehrdad was saying about emotional intelligence. Is there a connection there in your mind?
MEHRDAD JAZAYERI: In my mind? Yeah, so I think it's a very interesting problem. I mean, this is somewhat analogous to children's play. You do things that are really weird.
I mean, from a survival perspective, children do things that, if you just think about it, it's like, this is crazy. Like, how did evolution come up with this solution? So I think there are interesting problems of learning here. I don't know. These are just interpretations of what's really going on. There's very little science to attach to it.
But you can imagine that one of the things that evolution imparts onto the brain is not necessarily always looking for solutions, but creating state spaces in which finding solutions is easy. And if you imagine now some of those things that somebody does-- jumping from tree to tree, or the things a child does-- it's kind of crazy sometimes. They can hurt themselves.
You can imagine this is creating a state space in which you can flexibly consider new cost functions, new possibilities. And having that flexible state space available to you, you're then better prepared in adulthood, or just in normal life, where you do face situations where things are a little bit unlike what you expected. So although on its face it seems not intelligent, it actually might be exactly the intelligent thing to do. Because by not doing that, you're doing to yourself what a neural network that can only generalize to its own local training space does.
Now, they're clever about it. They don't just go about doing everything. What they do is cleverly find solutions that are not too easy, not too hard. And that element of seeking surprise that can actually be resolved is a really fundamental aspect of children's intelligence.
And I think there's something definitely to be learned from that. And that definitely has an embodied component to it, because they go put themselves out there. They do something that is embodied. It's not completely nuts, but it is somewhat nuts. And by doing that, they create just enough exploration and surprise and resolution of surprise, which allows them to build an ever-richer internal model of the world.
MARC RAIBERT: I mean, there's definitely places in adaptive control where people inject noise in order to get the system out of the rut it's in, and thereby get data that broadens the solution.
LESLIE KAELBLING: Yeah, we've actually-- in some work with Josh and students we co-supervised-- done some stuff in the context of something like model predictive control, where a system has predictions about what it can do. The idea is to try to pick some dimension of the state space-- in terms of formulating a new state space-- and say: I wonder if I could change that dimension of the state space? I've never changed that dimension of the state space before. Let me try.
And I use my model and I make a plan. And I say, oh, I see, I think I could hit that water bottle with my elbow. And then I can try it and see if it works, and so on. And it does work to make learning more efficient, because you gather data that you're ready to consume, as opposed to data that's just too far away from what you can already understand.
MARC RAIBERT: Or the same data you already have.
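The exploration heuristic Kaelbling describes can be sketched in a few lines. Everything below is invented for illustration (the bookkeeping, the 3-D state space, the fixed perturbation size): find the state dimension you have varied least, and nominate a goal that changes it, leaving the actual planning to something like the MPC loop sketched earlier.

```python
# Novelty-directed exploration: perturb the least-varied state dimension.
import numpy as np

def pick_novel_dimension(history):
    """Return the state dimension with the smallest observed spread."""
    spread = history.max(axis=0) - history.min(axis=0)
    return int(np.argmin(spread))

def exploration_goal(history, delta=1.0):
    """Propose a goal that changes a dimension we've never changed."""
    dim = pick_novel_dimension(history)
    goal = history[-1].copy()
    goal[dim] += delta        # "I've never changed that; let me try."
    return dim, goal

# Imagined visit history over a 3-D state space: dimension 2 has never
# moved, so it gets nominated for the next experiment.
history = np.array([[0.0,  0.0, 0.5],
                    [1.0,  0.2, 0.5],
                    [2.0, -0.4, 0.5],
                    [1.5,  0.9, 0.5]])
dim, goal = exploration_goal(history)
print(f"explore dimension {dim}, target state {goal}")
# A model-based planner would now try to reach `goal`; whatever happens
# is new data just beyond, but connected to, what the system knows.
```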
MATT WILSON: There's this idea of optimal versus adaptive-- and biological intelligence is definitely not optimal. But it is very adaptable. And the idea is that you create representations that are capable of responding to changes in the world. That's how the group, in the collective sense, can survive.
And so I think, as with the Quest interest in collective intelligence, or how intelligence across heterogeneous organisms is designed to work-- it's designed for many to fail. That is, it's not optimal for individuals. But it gives you search of a larger space in which the group is more adaptable.
DANIELA RUS: So one of the arguments that we hear in AI around why AI is really not powerful, not intelligent enough, and it's questionable how good it could get, has to do with interaction with the physical world. It's the fact that an AI system is just not able to interact with the world the way a natural body-- a human or an animal, does. So how important is this interaction in defining intelligence?
Marc, for you it's not, right? You would argue no. But maybe I'm putting words in your mouth.
MARC RAIBERT: I think there's a bunch of thoughts there. I think a physical robot can interact the way a person does. But I don't think either of them can collect the massive amounts of data that are popular right now.
DANIELA RUS: But is intelligence all about this massive amount of data or not?
MARC RAIBERT: I don't think all intelligence is about massive data. But I think that those things could be intelligent-- getting back to the CBMM theme that there can be an artificial intelligence, and a biological intelligence, and presumably others. At least I think that.
So what I heard in the presentations before the panel was a sense that CBMM feels like its significance requires a solution that's better than ChatGPT, or something like that. Like, ChatGPT is this really cool thing. But is it human intelligence? I think most people think it's not, and that there are ways that human intelligence could be better-- and there could be ways that the current methods could be better than humans.
DANIELA RUS: So I want to come back to interaction. But since you took us down the language path, let me ask our neuroscientist colleagues about the role of language in intelligence, and in particular in transformers. The whole shtick about transformers is that you can learn everything in parallel. You can do all this training in parallel. But I don't think natural--
MATT WILSON: We have a whole panel tomorrow, and that will be the key topic. I think one interesting point-- and it relates to something, Marc, that you brought up, and, Mehrdad, that you touched on. So I talked about spatial memory, but there's also social intelligence and social learning. And that turns out to be dependent upon the same system.
So episodic memory, spatial memory-- just a different part of the hippocampus. So the hippocampus actually has a longitudinal extent, and so different parts of it-- same circuitry, different inputs, related function. And so the idea of thinking about prediction and movement in actual space, and thinking about prediction and movement in a social space-- same substrate, probably the same principles.
Now, you could think, oh, so what about language? If language is the same thing, do you need the hippocampus for language? And there, no.
So to me, that's interesting. It says that for language itself, there are different substrates, maybe different mechanisms. And so I don't know that-- and obviously, Tommy, you put up the MIT 150 slide, where you had Noam Chomsky. And of course, Noam Chomsky is very dismissive of the idea that the study of animals would give us insight into language.
And I feel like there are a lot of shared mechanisms, but I don't think this is one of them. So there's something different about that, which means that-- I don't know. Like, these large language models may be capturing some aspect of our experience that can be expressed in language.
But I think a lot of these fundamental things that we've been talking about, thinking about-- these fundamental representations that allow thinking, planning, and interaction with the world and other agents, that it's a conserved system. I mean, you see this from the squirrels and lizards and rats and people. So that's a common element. Language is something different. I think they'll touch on that.
DANIELA RUS: Well, I was thinking about language in the context of an intelligent creature-- perhaps a robot. And so with language, I understand that there are a number of interesting neuroscience experiments that show the extraordinary explosion in the reasoning abilities of children once they have language. And so it seems like language is a very important part of interaction with the world. And so for robots, we have begun to connect large language models with the control systems of robots.
And we are finding, in fact, the representations and the abstractions that we can synthesize through these mechanisms by connecting control with language concepts. So we get to higher levels of abstraction somehow faster by making this connection. So wouldn't that suggest that language is important for robot intelligence?
MATT WILSON: I think to this question of capturing elements of human intelligence-- and this may be something distinct, perhaps, from these common and shared substrates and their manifestation. The biological intelligence that you see might relate to basic principles of embodiment, and what it means to actually think about interacting with the world through interacting with the world. I'm perfectly fine with elevating humans and human intelligence above animals. It's not necessarily better.
But I think that it is different. And it may be that it's harnessed this ungrounded capacity that might be commonly expressed-- in systems like the hippocampus-- to explore, to imagine, to simulate in this new way, but has now anchored it and put it into a formal construct, so that it's not completely ungrounded, and therefore can serve a directed purpose, as in the examples that you give.
MARC RAIBERT: Well, I think one of the confusions about the language models is that they don't represent just language. They represent the language that humans actually used. And those were embodied humans, in most cases. So they're encoding not only language, but also the experience and the situation of whoever it was who uttered the sentences that have been digested by the large language models. So it's language plus context plus human situation that's all boiled into it. And thereby, some of those other features can make their way through to the value of the result.
MEHRDAD JAZAYERI: I think it would be interesting to interrogate them-- as Jim was pointing out with some of the visual models-- to find analogs of the biological system, to think about how those embedded, embodied constraints that are present in the data are actually manifest in different elements of these networks.
MARC RAIBERT: That's the hardest part.
MATT WILSON: You might find hippocampal-like representations hidden in there that capture those things.
DANIELA RUS: There's that. I just wonder if you can talk a bit more about the role of emotional intelligence. Is embodiment necessary for emotional intelligence?
MEHRDAD JAZAYERI: I mean, neuroscience has done very little work in that space. And I have done even less than the average neuroscientist in that space. So it's a little bit hard to tell, especially for me.
So the plan of action for someone like me is that I want to understand, really mechanistically, what's going on. I'm more of a discoverer in this space than an engineer. I just want to understand really what is the machinery. And working on animal models is really essential for that at this stage.
We don't have proper, noninvasive tools to really study this at its full richness in humans. But then, we have a pretty straightforward practical problem, which is that there's no way to know what emotion an animal is experiencing. So even if embodiment is not part of the larger theoretical framework that will emerge, I find it a really important way to get my foot in the door of understanding what emotions are, banking on the idea that emotional states have a reasonable representation within a state space whose axes are my body states.
So if I could, for example, come up with an outfit that I would wear as I walk around, and I just measure everything I can measure-- my heartbeat, the electrical activity of my heart, blood pressure, galvanic skin response, temperature all around-- and also register my emotions, I could then build a state space of emotions indexed by my body states. Then I think if we go to an animal model, we can take it as a possibility that, if we create a situation where the animal's body is in the same state that mine was-- or at least is homologous in some meaningful way-- then we have a way in to understand what that emotion is.
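As a sketch of that thought experiment, with entirely fake data and invented names: log body signals together with self-reported emotions, embed the body states in a low-dimensional space, and read out a new body state's likely emotion by proximity.

```python
# Emotion read-out from a body-state space. All data are synthetic;
# real signals would be heart rate, blood pressure, skin conductance,
# temperature, etc., logged alongside self-reported emotions.
import numpy as np

rng = np.random.default_rng(1)

# Fake log: rows = moments in the day, columns = body measurements.
signals = rng.normal(size=(200, 6))
labels = rng.choice(["calm", "fear", "joy"], size=200)

# Build the state space: center, then project onto the top 2 PCA axes.
mean = signals.mean(axis=0)
_, _, Vt = np.linalg.svd(signals - mean, full_matrices=False)
embed = (signals - mean) @ Vt[:2].T

def read_out_emotion(body_state):
    """Map a new body measurement to its nearest labeled neighbor."""
    point = (body_state - mean) @ Vt[:2].T
    nearest = int(np.argmin(np.linalg.norm(embed - point, axis=1)))
    return labels[nearest]

print(read_out_emotion(rng.normal(size=6)))   # e.g., "fear"
```

The same mapping, the suggestion goes, could then be attempted from an animal's measured body state, giving a homologous handle on an emotion one cannot ask about.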
And I think if we get that far, then we can put electrodes in the brain and see what the actual internal models are in different parts of the cortex, or elsewhere. We have learned a great deal about the circuitries involved in emotion, from cingulate and insular cortex, to hypothalamus, to amygdala, to brainstem circuits related to the autonomic system. But we really don't understand what an emotional state is at all.
So on the question of whether embodiment is important-- I think it's just essential, anyway, to be able to study this. And it seems to me reasonable for the brain not to spend a ton of spiking energy over longer timescales, because emotional states usually last longer than the fast timescales at which brains operate. So I think, actually, if there's one domain where embodiment will turn out to be quite important-- even if it turns out to be less important in just the somatic control of movement-- it will turn out to be really, really important for emotional states.
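A minimal sketch of the body-state-indexed emotion space Jazayeri describes might look like the following. The signal names, the numbers, and the nearest-neighbor lookup are all illustrative assumptions, not an existing pipeline:

```python
# Illustrative sketch only: a body-state "emotion space" built from
# self-reported labels, queried by nearest neighbor. Signal names,
# toy values, and the lookup rule are assumptions for exposition.
import numpy as np

SIGNALS = ["heart_rate", "blood_pressure", "skin_conductance", "temperature"]

# Hypothetical wearable recordings: each row is one time point's body state.
body_states = np.array([
    [62.0, 115.0, 2.1, 36.5],   # resting
    [110.0, 140.0, 8.4, 37.1],  # startled
    [95.0, 130.0, 6.0, 36.9],   # anxious
])
labels = ["calm", "fear", "anxiety"]  # self-reported emotions

def nearest_emotion(query, states, labels):
    """Map a new body state to the label of its closest recorded state."""
    mu, sd = states.mean(axis=0), states.std(axis=0)
    z = (states - mu) / sd          # normalize each physiological axis
    q = (query - mu) / sd
    return labels[int(np.argmin(np.linalg.norm(z - q, axis=1)))]

# A homologous body state measured in an animal model (hypothetical values).
print(nearest_emotion(np.array([105.0, 138.0, 7.9, 37.0]), body_states, labels))
# -> "fear"
```

The bet, on this sketch, is that an animal whose measured body state lands near a labeled region of the same space is plausibly in the homologous emotional state.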
[INTERPOSING VOICES]
DANIELA RUS: I just want to say, we can make you the shirts across the street. Let's talk. Leslie.
LESLIE KAELBLING: So agency-- I mean, I think a thing that ties all these things together, and really argues for embodiment, is the need for agency. How can you study emotion if you can't actually be frustrated because you can't do something? How can you gather the data?
How does your hippocampus work if you get carried around in a basket? Not very well. How do we gather data that lets us actually understand the causal structure of the world unless we get to poke it?
And the argument that says we can't get enough data to learn? I think that's not really right. I mean, we look at the current machine learning methods and we say, oh my goodness, they need a vast pile of data. That can't be right, because individual agents don't need very much data-- so the methods seem wrong.
But I think that's just a confusion, because evolution needed a ton of data, and I don't need that much. In the world of robotics, what do we do about that? One thing we can do is run simulations-- run lots of robots in parallel, gather all their data, and wad it into one big model. But those simulated robots and those actual robots had agency when they were gathering that data. They understood, in some sense, what actions they took, and how those actions produced results. So I think we could conceivably take all that data, and insights from natural science, and put it together, and try to do what evolution did: make systems that can then learn, in an embodied way, from a little bit more data about the world they live in.
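A schematic of the recipe Kaelbling outlines: pool experience from many simulated agents into one shared model, then let an individual agent adapt from far less of its own data. The toy simulator, the linear model, and the step sizes below are illustrative assumptions, not her method:

```python
# Sketch: pool (state, action, outcome) experience from parallel simulated
# robots into one shared model, then fine-tune on a little embodied data.
import numpy as np

rng = np.random.default_rng(0)

def rollout(n):
    """Toy simulator: outcome is a noisy linear function of state and action."""
    s, a = rng.normal(size=(n, 3)), rng.normal(size=(n, 2))
    x = np.hstack([s, a])
    y = x @ np.array([1.0, -2.0, 0.5, 3.0, -1.0]) + 0.1 * rng.normal(size=n)
    return x, y

# "Evolution": lots of cheap simulated experience, fit by least squares.
X_sim, y_sim = rollout(10_000)
w = np.linalg.lstsq(X_sim, y_sim, rcond=None)[0]

# "Individual": a handful of embodied samples nudges the shared model.
X_real, y_real = rollout(20)
for _ in range(50):                      # a few gradient steps suffice
    w -= 0.01 * X_real.T @ (X_real @ w - y_real) / len(y_real)
```

The point of the sketch is only the shape of the pipeline: the "evolutionary" phase consumes orders of magnitude more data than the embodied fine-tuning phase.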
DANIELA RUS: So we have--
MARC RAIBERT: And just to riff on that, there are foundation models, and then the derivative models. And you could argue that the foundation models are happening in evolution, and the derivative models are happening in an individual's experience. And there's much less data in those spinoffs-- those smaller, focused models.
MATT WILSON: Just an interesting history-of-neuroscience note. There's a circuit known as the Papez circuit, which was originally believed to subserve emotional control. And it involved, as Mehrdad mentioned, a couple of these structures-- a rogues' gallery of brain areas: the amygdala, hippocampus, mammillary bodies of the hypothalamus, anterior thalamus, cingulate.
What connects all of these things? Well, later, that same circuit-- the Papez circuit-- was revised to be a memory circuit. It's the circuit that connects to the hippocampus.
Damage to the mammillary bodies produces profound memory deficits-- Korsakoff syndrome. Damage to the anterior thalamus gives you similar memory deficits.
But an interesting observation is that when you go in neurophysiologically and record from all of these areas, there's one thing they all share-- and this was Leslie's point about agency. They all have movement correlates. That is, while they carry out these functions of emotion and memory control, they all reflect the engagement of the animal with the world through movement. So movement, emotion, and memory are somehow all tied together.
DANIELA RUS: Yeah, that's super-interesting. And we have a question from the audience related to what you were talking about. The question is whether active sensing and reinforcement learning are the future of creating the next generation of intelligent systems.
LESLIE KAELBLING: Well, it depends on what you mean by reinforcement learning, I think. I mean, that phrase has now taken over in popularity. And it can mean something very narrow or something very general. So active sensing is critical. That's the agency part.
If I'm not sure what's going on-- if I'm not sure if that remote is glued to the table, I have to be able to poke it and see if it moves. So the active sensing thing is critical. I have to be able to look in a bag to see what's in there because how could I find out otherwise? Reinforcement learning, I think, has an important role to play, but possibly not as big a role as people think it does.
DANIELA RUS: Care to elaborate?
LESLIE KAELBLING: Sure.
[LAUGHTER]
OK. So reinforcement learning, generally speaking, is trial-and-error learning. This is why I said people will argue about the definition, and then we could go on forever. But roughly, if you take it to mean learning from trial and error-- learning a strategy for doing something by trying it and correcting and fixing and so on-- that's super-important. That's how I learned to ride my bike, and how I learned to almost juggle, and how you learn to do a bunch of things that involve tight sensorimotor loops.
Is reinforcement learning how I learn to fly to London on an airplane and read the schedule and buy a new ticket if mine doesn't work? I don't think so. I think that that's more model-based.
I think I have to learn a predictive model of how the world works. I think I have to be able to take in new evidence and condition on it and reason. And I think there are technical computer science reasons why learning models and using them to do deliberation is more efficient in terms of data and computation than trying to learn a strategy that covers all possible things that could happen to me in advance. OK, now I'm done ranting.
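A minimal illustration of the distinction Kaelbling draws: estimate a predictive model from a handful of experiences, then deliberate with it, rather than learning a policy for every contingency up front. The tiny tabular MDP and the smoothed counts are illustrative assumptions:

```python
# Sketch: learn a tabular transition model from a few experiences, then
# deliberate (value iteration) against it instead of memorizing a policy
# for every contingency in advance. The 3-state MDP is a toy assumption.
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9
counts = np.ones((n_states, n_actions, n_states))   # smoothed visit counts
rewards = np.zeros((n_states, n_actions))

def observe(s, a, s2, r):
    """Record one (state, action, next state, reward) experience."""
    counts[s, a, s2] += 1
    rewards[s, a] = r

# A handful of experiences is enough to estimate the model...
for s, a, s2, r in [(0, 0, 1, 0.0), (1, 1, 2, 1.0), (0, 1, 0, 0.0), (1, 1, 2, 1.0)]:
    observe(s, a, s2, r)

# ...and deliberation does the rest: plan against the learned model.
P = counts / counts.sum(axis=2, keepdims=True)       # estimated transitions
V = np.zeros(n_states)
for _ in range(100):                                 # value iteration
    V = (rewards + gamma * P @ V).max(axis=1)
policy = (rewards + gamma * P @ V).argmax(axis=1)
```

New evidence just updates the counts, and replanning against the revised model is cheap compared with re-learning a strategy from scratch.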
DANIELA RUS: We have another question from the audience. And I'll give this one to you, Marc: what's the primary bottleneck for why research in robotics is far behind the software side?
[CHUCKLES]
MARC RAIBERT: That's a good question. It's related to the squirrel thing, I think.
[LAUGHTER]
MATT WILSON: It all comes down to squirrels.
MARC RAIBERT: What?
MATT WILSON: It all comes down to squirrels.
[LAUGHTER]
MARC RAIBERT: I think that fundamentally, people think that brainy intelligence is the hard part. I think brainy intelligence is the easy part. Physical intelligence about the world is the hard part.
Now, the fact that the squirrel can do it just belies the idea that humans are the top of the pyramid, I think, when it comes to squirrel-like behavior. It is true that human athletes are pretty incredible. But I don't know. I'm just meandering around your question.
What's the bottleneck? I guess I don't really know. I think that the machines are a hard part of it.
One thing I didn't talk about in my talk was manipulation. I think that locomotion, we've made huge progress on. And robots can go to a lot of places now. They can't go everywhere, but they can go to a lot of places. And I can see a path to them having the mobility to do almost anything-- certainly human levels of things.
But manipulation is a totally different beast, and we've been working on it a long time. And I would argue it's almost nowhere in terms of robots' capabilities. And I don't really know why that is. I think that humans are-- that's a place where a lot of inspiration comes from, because we are so good with our hands. We can sense with our hands without using vision, although we can also augment it with vision.
We have all these processes going on that we don't understand in our hands. There's haptic things. There's guided sensing. I don't have an answer. But--
AUDIENCE: It's very sophisticated.
DANIELA RUS: The hardware is hard.
[INTERPOSING VOICES]
DANIELA RUS: The hardware is very hard.
MARC RAIBERT: And if you look at robot hands that try and be like a human hand, they're really pathetic by and large. Not because the people doing it are pathetic, but it's just we haven't figured it out.
DANIELA RUS: There is a tension in the development of robot manipulators, where there's a desire to make the manipulators compliant. But that means that they're--
MARC RAIBERT: You lose control.
DANIELA RUS: You lose control. And also, the materials and the techniques we have to bake compliance into the hands are such that you also lose payload. You don't have much strength. And so controllability, compliance, and strength are really difficult to satisfy all at once.
MARC RAIBERT: So I have friends who think that the pendulum has swung too far on the computation side, and that it's going to swing back, because a lot of those things you just mentioned have to do with building structures that require mechanical engineering and materials, and all kinds of cool stuff that we don't know how to do yet.
AUDIENCE: So it'd be nice for us in the audience to hear if you have a taxonomy of tasks that you think are important. So you've given us two axes today. One is the dexterity of motion. And one is the problem-solving, planning axis.
And so I'm wondering on this plane, maybe there is a third dimension that is important. Where do different tasks lie, and how can we trace a path towards success by moving from one to the other? So this is for the roboticists.
And the flip side for the neuro people is, as you look across different species, we know that there are different types of intelligence. Some species have tremendous memory, some tremendous dexterity and manipulation, and so on. And sometimes, your labs are focused a little bit too much on one thing only for too long, which is a necessary condition, I think, for making progress. And so by comparing species, can you see a little bit of the same picture that maybe the roboticists can paint for us from their point of view?
DANIELA RUS: OK, who wants to start? Robotics?
LESLIE KAELBLING: Well, I have one quick reaction to the question of dimensions. The ones that I think about a lot are: What is the time constant-- the horizon? How far ahead do I have to look to do this? That's a very important dimension that I think causes us to use different techniques.
And another one is: How routine is it? You're going to use different methods to address the problems that you solve every day than the problems that only come up occasionally, but that you should still be able to cope with if they appear. And then uncertainty-- OK, three dimensions-- which is, how much do you know about the world state that you're operating in? So I think those are my main axes.
DANIELA RUS: Mark?
MARC RAIBERT: I think Leslie's answer is better than mine. But there's mobility, which I mentioned before. There's mobile manipulation. If you keep your shoulder fixed while you're sitting at the dinner table-- you can do this tonight-- ask yourself how much of the stuff on the table you can reach.
OK, I can reach this. But I promise you, you won't be able to eat dinner if you keep your elbow fixed. But what do people do? They really do this. They really do whole-body manipulation almost all the time.
And if you can free the manipulator-- a lot of robotics, the vast majority of it, is just an arm stuck on a post. And if you can free that, I think a whole world opens up. That I already said. There's in-hand manipulation; there's whole-body manipulation.
Another dimension is dynamics. If you look at the learning literature on robot manipulation, which is blazing along, almost all of it is static grasping of objects, where the problem is to figure out where the object is and understand how to position your hand with respect to it. And people don't work like that at all.
It's just like I was talking about. Legged things don't move statically. I think manipulation is more like compliant, dynamic juggling than it is like static grasping. So it's not exactly an answer to your question, but those are the things I think about.
DANIELA RUS: Mehrdad.
MEHRDAD JAZAYERI: I'm not sure I understood. Is there something to be learned from the museum of neurobiology?
AUDIENCE: I was wondering whether by comparing different species-- you talked about squirrels. And they have--
MEHRDAD JAZAYERI: I didn't, but lots of people did here.
AUDIENCE: You can see-- I don't know, crows and parrots using their beak a lot to move and also to handle objects, and do some things really well. There are some woodpeckers that can store acorns and find them six months later, but they are not very good at doing other things. I remember a little bird that got trapped inside Tommy's house two years ago. And it was clear that inside the house, it was an insoluble problem for the bird to figure out how to get out.
It was such an unfamiliar environment. It was just going at random and banging into walls. And you don't see birds banging into stuff outside. So it was clear that there was something profoundly amiss. And I was wondering whether, by comparing what different species can do, you can derive the different dimensions of the challenges that animals have to solve, and whether one can figure it out.
MEHRDAD JAZAYERI: Yeah, my guess is yes, for sure. But I think it's also a challenging path to integrate information like that, because the set of cost functions that different animals have to satisfy, both because of their biomechanics and because of their evolutionary goals, are quite different. So while it's going to be very valuable to do that-- and I think we should-- drawing principles out of it that will generalize across the wide variety of biomechanics and evolutionary goals of different species is going to be challenging. A particular animal does a certain thing with its beak because it has a beak, and because there are certain things it can do and wants to do.
And how you generalize from that to the manual dexterity of a primate is going to be challenging. It's not going to be an easy process. But there are fundamental questions about how you go from sense data and muscle data to abstractions in the brain that are relevant-- for instance, understanding what the axes of working memory are. Certainly, we have misunderstood working memory because of the way we study it, at least in primates. But I think that's largely the case.
We think of a stimulus: it comes in and goes into working memory. That certainly is not the way it happens. A stimulus comes in and goes through a large filter of long-term memory, where you actually compress that information in relevant ways through abstraction, and then put it in working memory.
Episodic memory is the same. The way you actually analyze information and represent states is really related to stuff you have learned and is relevant for your survival. So I think all animals probably do that.
And trying to understand how the neurobiology goes from sense data and muscle data to those abstractions is something we can potentially learn quite a bit about from looking across species. But the actual abstractions themselves will not necessarily match. The actual data structures might not be identical. But the computations that go from a high-dimensional representation to more abstraction are potentially going to be similar. That's ...
DANIELA RUS: Matt, do you want to add?
MATT WILSON: Yeah, I had one thought. And actually, I was thinking about some of the discussion of social memory, social learning, questions of observational learning. And this was actually something to throw back to the roboticists.
It made me think another domain of agency-driven intelligence is cooperative interaction-- that is, having a robot interact with another agent. It could be another person handing them a package, or working with another robot. In a sense, it brings in elements of things like theory of mind.
I remember when John Leonard-- this was early on; I think maybe it was at one of the intelligence-- the I-squared meetings. He brought up the left-turn problem. What was the biggest problem at the time in autonomous vehicles?
He said it was the left-turn problem, because when you're making a left turn, you have to be able to predict and anticipate: What's that person going to do? What's that squirrel going to do? It's just very challenging, basically. Now you have to predict the action of another agent. In any kind of cooperative interaction, if I'm going to hand you something, it's more than just my dexterity. It's knowing what--
DANIELA RUS: --having a rich model of the world and the other agents around it. In robotics, many of us often cut corners there, and we use communication as a way of coordination. So if it's all robots, then communication is a good solution.
But if it's robots and people, then the robots actually have to understand. For that left turn, the robot has to understand: is the oncoming vehicle egoistic or altruistic? And how do you measure that?
Well, it turns out that the social psychology literature has come up with a metric called social value orientation, which you can fold into the control cost of a robot and plan with. So it's interesting. There are really rich opportunities here that are actually very important as we think about mixed human-robot societies.
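A minimal sketch of how a social value orientation (SVO) angle can be folded into a planner's cost, in the spirit of the metric Rus mentions. The angle parameterization is one common convention from that literature; the candidate maneuvers and utility numbers are toy assumptions:

```python
# Sketch: weight own vs. the other agent's utility by an SVO angle when
# scoring candidate maneuvers (e.g., when to take a left turn).
import math

def social_utility(own, other, svo_angle):
    """Combine utilities by an SVO angle: ~0 rad is egoistic
    (weighs only oneself), ~pi/4 rad is prosocial."""
    return math.cos(svo_angle) * own + math.sin(svo_angle) * other

# Toy utilities for two candidate left-turn maneuvers (assumed values).
candidates = {
    "turn_now": {"own": 1.0, "other": -0.8},  # fast, but cuts the other car off
    "yield":    {"own": 0.2, "other": 0.9},
}

for svo in (0.0, math.pi / 4):  # egoistic vs. prosocial planner
    best = max(candidates, key=lambda m: social_utility(
        candidates[m]["own"], candidates[m]["other"], svo))
    print(f"SVO={svo:.2f} rad -> {best}")
```

Here the egoistic setting picks the aggressive turn while the prosocial setting yields; an SVO estimated for the oncoming driver can likewise sharpen predictions of what they will do before the robot commits to the turn.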
AUDIENCE: In a virtual world, where you can have this coexistence of physically embodied robots-- or at least virtual agents-- and me having a physical representation of myself, is that perhaps a path to having this kind of physical representation without having to solve all of the robotics problems? I'm just wondering if embodied agents in 3D give us a path to testing these various models of self-embodied representation. And then the second aspect is, of course, the fear of death. The concept is that I am a physical me, and I will cease to exist. There's this concept of self, which is tangible, and that perhaps could also be there in the virtual space.
MARC RAIBERT: I mean, I think physics-based simulations are very similar to what you just described. And they are an opportunity to generate lots of data or test out situations, and they're used heavily in the robotics world.
LESLIE KAELBLING: Right, but I think it's important that they not be too idealized. Because fundamentally, natural systems have to deal with, like, noise and mess and partial observability and things breaking and all that. And I think that's really fundamental to squirrels and to me. And so it's hard to make a simulator that has that depth and richness.
DANIELA RUS: Any response from neuroscience?
MEHRDAD JAZAYERI: No. I'll make it zero seconds.
MATT WILSON: Zero, well-- that's right.
[LAUGHTER]
DANIELA RUS: All right, let's thank our wonderful panelists.
[APPLAUSE]