
Tech

Embodiment is Essential for AGI

There are many paths to AGI. After the publication of DeepMind’s Gato paper, Nando de Freitas wrote that scale is all you need. One promising path is to use language interfaces to supervise learning from large datasets and to train ever-larger transformers. Gato trains and evaluates on data from robots, but there is another proposed route to AGI: bypassing robotics and embodiment entirely. This is the path taken by labs such as OpenAI, which disbanded its robotics team, and Anthropic. Recently, I was riding in a car with a friend who was a “member of technical staff” at OpenAI. He asked me, “Do you think solving embodied AI is necessary for AGI?” I have been thinking about this question ever since.

Language as the end-all of intelligence


Language is often treated as the beginning and end of intelligence in AGI conversations. However, intelligence did not evolve outside of embodied agents. Language itself evolved to facilitate multi-agent communication. Language and the structured coordination of agents help with reward maximization in an environment, and could be seen as an evolutionary byproduct of intelligence, in line with the Reward is Enough hypothesis. Language is an important tool for solving intelligence: it encodes a great deal of human knowledge. But is it enough to enable AGI?

What is AGI?


All of this comes down to the question of “What is AGI?” Andrej Karpathy says it is a feeling. Several Anthropic researchers define it, somewhat strangely, as “that which would make this world a strange place.” Nick Bostrom defines superintelligence as “an intellect that is much smarter than the best human brains in practically every field.” Wikipedia defines AGI as “the ability of an intelligent agent to understand or learn any intellectual task that a human being can.” Open Philanthropy describes “transformative AI” in economic terms: AI that would 10x the Gross World Product, making as much difference as agriculture or the industrial revolution. In OpenAI circles, AGI is often described as an AI that could be hired as a remote worker and be as competent as a human being. Is it AGI if an AI can perform like a researcher or physicist, make scientific breakthroughs, and write papers without moving a finger? Steve Wozniak suggested a coffee test for AGI: a machine that could figure out how to make coffee in an unfamiliar kitchen.

In the end, whether embodiment is necessary for AGI depends on which definition you choose. For this essay, I’ll take as a baseline an AI that is as good as or better than a human in every aspect. Many of the useful things one can do in the world are physical; transforming them would require embodied AI.

[Image: “Vintage robot gives book reading in 1950s beat cafe.” Generated by DALL·E, prompted by @Merzmensch]

There are three main ways an AGI could relate to embodiment:

  1. An AI trained on vision and language that exists only on the internet.
  2. An AI trained on vision and language tasks that can nonetheless perform embodied tasks.
  3. An AI trained on vision, language, and control that is embodied and is as intelligent and physically capable as a human.

Visual language AI that is completely digital is not AGI


In 1997, Deep Blue beat Garry Kasparov, then the strongest chess player in the world. In those days, chess was considered a pinnacle of intelligence. Since then, AI has surpassed humans at many video games, image generation, and language tasks. Asking whether non-embodied AI can be AGI is like asking whether an AI that can only play Go and chess is AGI. Specialized AI can perform certain tasks better than humans, but it is not universal. Suppose a non-embodied AI could do amazing things such as running a scientific laboratory, publishing papers, inventing breakthrough technologies, and writing beautiful prose and poetry. Even if software engineering jobs were automated, ex-software engineers would still need food, clothes, laptops, and delivery. The AI would be unable to serve many industries, including manufacturing, construction, logistics, driving, energy, mining, and agriculture: essentially every industry that existed before the internet. This is not AGI.

By the definition I chose, option 3 is unambiguously AGI. Let’s discuss option 2.

What is the steelman version of an AGI paradigm that excludes embodiment?


This hypothesis suggests that it is possible to train an intelligent AI that collects no data from robots but is trained on visual and language datasets, and that can zero-shot approximate actuator parameters and operate any robot at least as well as a human teleoperator.

If an AI manages chefs but a human chef performs the actual act of cooking, then is that AGI?


To perform embodied tasks, one needs embodiment; that is a tautology. But could an AI guide the actions of an embodied human worker? Is it AGI if every chef has an AI that can interpret vision and language and tell them what to do, in classic Ratatouille style? Even in this framework, an AI without embodiment could never reach its full economic value and be transformative, because it could not 10x physical productivity: AI would be bottlenecked along the mechanical axis. And AGI that involves a human body in a closed feedback loop is also a form of embodied AI.

For a non-embodied AGI to guide a robot, this proposition must hold. However, there are some caveats:

  1. Data from embodied agents: The AI needs to be trained on data from embodied agents, with their sensors as input and their actuators as outputs. This works similarly to offline learning, where you expose a network to YouTube videos of humans and other agents performing tasks, and the model maps them to a morphologically identical sensor/actuator configuration. Is an AGI that depends on data collected by embodied agents itself an embodied AI? If the agents are human, I would say no. If it requires data from teleoperated or learned robots, then yes. This is a subjective line I am drawing. By that subjective definition, DeepMind’s Gato is embodied, because it learns robotics from supervised demonstrations collected on real robots and simulated agents, and evaluates on real robots.
  2. Mapping morphology: A second consideration is how effective an AI that learns from human demonstrations but executes on a robot can be, i.e., how well it can generalize across morphologies from cross-embodiment demonstrations. This paper gives some baselines on where we are now, formulated in an RL rather than a supervised learning setting. A close example is quadruped robots learning to walk from dogs, which have very similar morphology.
  3. Offline learning: A third aspect is the effectiveness of an AI that only learns from policies executed by other agents. Such an AI would not be able to learn self-correcting behavior or model counterfactuals. When you learn to surf, awareness of your own abilities and limitations plays a role in your decision-making. While Flamingo attempts to few-shot visual learning tasks, this has not yet been done for control.
  4. Sim2Real: Domain adaptation using simulated agents is one way to add embodiment without its physical aspect. However, sim2real remains an active area of robotics research, because agents exposed only to simulation do not yet transfer well to the real world.
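The offline-learning caveat above can be made concrete with a minimal behavior-cloning sketch: fit a policy purely from logged (observation, action) pairs, with no interaction with the environment. Everything here is a hypothetical illustration (the linear “expert,” the data shapes), not any lab’s actual pipeline.

```python
import numpy as np

# Minimal behavior-cloning sketch: learn a policy purely from logged
# (observation, action) demonstration data, never interacting with the
# environment. The linear "expert" below is hypothetical and exists only
# to generate demonstrations.
rng = np.random.default_rng(0)

expert_gain = np.array([[1.5, -0.3],
                        [0.2,  0.8]])   # hypothetical expert controller
obs = rng.normal(size=(500, 2))         # logged sensor readings
actions = obs @ expert_gain.T           # logged actuator commands

# Least-squares fit: find a linear policy that imitates the demonstrations.
policy_T, *_ = np.linalg.lstsq(obs, actions, rcond=None)
policy = policy_T.T

# On states covered by the data, the clone matches the expert closely.
test_obs = rng.normal(size=(5, 2))
assert np.allclose(test_obs @ policy.T, test_obs @ expert_gain.T, atol=1e-6)
```

The catch the essay points out: nothing in this loop lets the policy observe the consequences of its own mistakes, so errors compound on states absent from the demonstrations, and no self-correcting behavior can be learned.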

Robotics is an AGI problem, but more importantly, AGI is a robotics problem


AI research that focuses primarily on language overlooks many problems that require intelligence, such as 3D spatial reasoning and decision-making under partial observability. Simultaneous localization and mapping (SLAM) is a skill you need to navigate a new place, solve a maze in a game, or drive. These skills go beyond what a language-based formulation can express. Language alone will not let you predict the movements of dynamic agents as you walk down a street, nor supply the control mechanisms you need to surf in a chaotic ocean. These are problems you will neither encounter nor solve using language.
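State estimation under partial observability, the kind of problem SLAM builds on, can be illustrated with a tiny histogram (Bayes) filter: a single sensor reading is ambiguous, but fusing a motion update with a second reading localizes the robot. The corridor map and sensor probabilities below are made up for illustration.

```python
# Tiny 1-D histogram (Bayes) filter: localize a robot in a cyclic corridor
# of colored cells when it can only sense the color beneath it. The map
# and sensor model are hypothetical.
world = ['green', 'red', 'red', 'green', 'red']
belief = [1.0 / len(world)] * len(world)   # uniform prior over cells

def sense(belief, color, p_hit=0.9, p_miss=0.1):
    """Measurement update: reweight each cell by the sensor likelihood."""
    b = [p * (p_hit if world[i] == color else p_miss)
         for i, p in enumerate(belief)]
    total = sum(b)
    return [p / total for p in b]

def move(belief, step):
    """Motion update: shift the belief around the cyclic corridor."""
    n = len(belief)
    return [belief[(i - step) % n] for i in range(n)]

# One reading of 'red' is ambiguous (three red cells), but the sequence
# "see red, move right, see red again" singles out one position.
belief = sense(belief, 'red')
belief = move(belief, 1)
belief = sense(belief, 'red')
print(max(range(len(belief)), key=belief.__getitem__))  # prints 2
```

No amount of text prediction performs this fusion of motion and measurement; it is a property of an agent situated in an environment.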

Even after adding vision to language, there remain aspects of precise control and their relation to the intrinsic variables of embodiment. Solving robotics requires solving vision (sensory perception, 3D reasoning), speech (instruction following and contextual reasoning), and control (manipulation and navigation).

Agent/Environment cannot be abstracted away from Intelligence


Our genes encode data accumulated over many evolutionary episodes, which enables us to learn over a lifetime. A baby can understand structured motion and the physical world around her before she can understand language. Intelligence is closely tied to survival advantages in an environment. For example, aquatic animals have visual systems much better adapted to seeing underwater, because they evolved to accommodate the refraction of water in a way humans did not. Our cameras attempt to replicate the human visual range, so the data we have collected, YouTube videos included, reflects human visual capabilities. In patients whose cataracts were removed, allowing them to see for the first time, it was observed that despite spending an entire life in a 3D world, they lacked an understanding of spatial imagery because their sensors had never provided that input. Intelligence cannot be defined apart from the agent and its environment.

The last ten years of AI research have produced large transformers that are extremely adept at multitask language and vision benchmarks. But a language-first AI is susceptible to the failure modes of a blind agent, beyond whatever visual context it absorbs from a training corpus gathered by humans who can see. By the same logic, a visual language model may be unable to approximate the actuator parameters necessary for precise control of an embodied agent. Reasoning about the real world requires not just thinking in language and abstract spaces, but grounding in real-world context.

In a world designed by and for humans, intelligence that is not sensitive to sensorimotor dynamics will be suboptimal. Physical agency and control are essential for achieving universal abilities. AGI cannot exist without embodiment.

I’d like to acknowledge Ben Mann, Otavio Good, Jared Kaplan and Vivek Aithal for valuable discussions and Kanishka Rao, Vincent Vanhoucke and Alex Zirbel for providing feedback and proofreading.



To engage with my ideas, add me on Twitter.


FIFA 23 lets you turn off commentary pointing out how bad you are

[Image: A player shouldering the ball (Image credit: EA)]

FIFA 23 might be the best soccer game yet for terrible sports fans, as it lets you turn off commentary that criticizes your bad playing.

Now that the early access FIFA 23 release time has passed, EA Play and Xbox Game Pass Ultimate subscribers can hop into the game ahead of its full release. But as Eurogamer spotted, they’ll find a peculiar option waiting for them.

FIFA 23 includes a toggle to turn off ‘Critical Commentary’. The setting lets you silence all negative in-match comments made about your technique, so you can protect your precious ego even when you miss an open goal or commit an obvious foul. The more positive commentary won’t be affected. 

Spare your feelings

[Image: A player dribbling the ball in FIFA 23 (Image credit: EA)]

The feature looks tailored toward children and new players, who don’t want to have their confidence wrecked within mere minutes of picking up the controller. But even experienced players who just so happen to be terrible at the game might benefit.

It’s not perfect, though. According to Eurogamer, the feature didn’t seem to work during a FIFA Ultimate Team Division Rivals match, with critical comments slipping through the filter. Still, who hasn’t benefited from a light grilling every now and then?

Polite commentary isn’t the only new addition in FIFA 23. It’s the first game in the series to include women’s club football teams, and fancy overhauled animations that take advantage of the PS5 and Xbox Series X|S’s new-gen hardware. EA will be hoping to end on a high, as FIFA 23 will be the last of its soccer games to release with the official FIFA licence.

If disabling critical commentary doesn’t improve your soccer skills, maybe building a squad of Marvel superheroes will. Although you might not do much better with Ted Lasso wandering the pitch.

FIFA 23 is set to fully release this Friday, September 30.

Callum is TechRadar Gaming’s News Writer. You’ll find him whipping up stories about all the latest happenings in the gaming world, as well as penning the odd feature and review. Before coming to TechRadar, he wrote freelance for various sites, including Clash, The Telegraph, and Gamesindustry.biz, and worked as a Staff Writer at Wargamer. Strategy games and RPGs are his bread and butter, but he’ll eat anything that spins a captivating narrative. He also loves tabletop games, and will happily chew your ear off about TTRPGs and board games. 


Google Pixel 7 price leak suggests Google is totally out of touch

[Image: The backs of the Pixel 7 and the Pixel 7 Pro (Image credit: Google)]

We’re starting to hear more and more Google Pixel 7 leaks, with the launch of the phone just a week away, but tech fans might be getting a lot of déjà vu, with the leaks all listing near-identical specs to what we heard about the Pixel 6 a year ago.

It sounds like the new phones – a successor to the Pixel 6 Pro is also expected – could be very similar to their 2021 predecessors. And a new price leak has suggested that the phones’ costs could be the same too, as a Twitter user spotted the Pixel 7 briefly listed on Amazon (before being promptly taken down, of course).

“Google Pixel 7 on Amazon US. $599.99. It is still showing up in search cache but the listing gives an error if you click on it. We have the B0 number to keep track of though! #teampixel pic.twitter.com/w5Z09D28YE” (September 27, 2022)


According to these listings, the Pixel 7 will cost $599 while the Pixel 7 Pro will cost $899, both of which are identical to the Pixel 6 and Pixel 6 Pro starting prices. The leak doesn’t include any other region prices, but in the UK the current models cost £599 and £849, while in Australia they went for AU$999 and AU$1,299.

So it sounds like Google is planning on retaining the same prices for its new phones as it sold the old ones for, a move which doesn’t make much sense.


Analysis: same price, new world

Google’s choice to keep the same price points is a little curious when you consider that the specs leaks suggest these phones are virtually unchanged from their predecessors. You’re buying year-old tech for the same price as before.

Do bear in mind that the price of tech generally lowers over time, so you can readily pick up a cheaper Pixel 6 or 6 Pro right now, and after the launch of the new ones, the older models will very likely get even cheaper.

But there’s another key factor to consider in the price: $599 might be the same number in 2022 as it was in 2021, but with the changing global climate, like wars and flailing currencies and cost of living crises, it’s a very different amount of money.

Some people just won’t be willing to shell out the same amount this year that they might have last year. But this speaks to a wider issue in consumer tech.

Google isn’t the only tech company to completely neglect the challenging global climate when pricing its gadgets: Samsung is still releasing super-pricey folding phones, and the iPhone 14 is, for some incomprehensible reason, even pricier than the iPhone 13 in some regions. 

Too few brands are actually catering to the tough economic times many are facing right now, with companies increasing the price of their premium offerings to counter rising costs, instead of just designing more affordable alternatives to flagships.

These high and rising prices suggest that companies are totally out of touch with their buyers, and don’t understand the economic hardship troubling many.

We’ll have to reach a breaking point sooner or later, either with brands finally clueing into the fact that they need to release cheaper phones, or with customers voting with their wallets by sticking to second-hand or refurbished devices. But until then, you can buy the best cheap phones to show that cost is important to you.

Tom’s role in the TechRadar team is to specialize in phones and tablets, but he also takes on other tech like electric scooters, smartwatches, fitness, mobile gaming and more. He is based in London, UK.

He graduated in American Literature and Creative Writing from the University of East Anglia. Prior to working at TechRadar, he freelanced in tech, gaming and entertainment, and also spent many years working as a mixologist. Outside of TechRadar he works in film as a screenwriter, director and producer.


DisplayMate awards the “Best Smartphone Display” title to the iPhone 14 Pro Max


Copyright © 2022 Xanatan