Sunday, March 26, 2023
Morning News

Sometimes it’s bad for AI to be too curious

by author
November 10, 2022
in Computer Sciences, Machine learning & AI
mario video game
Credit: Pixabay/CC0 Public Domain

It’s a dilemma as old as time. Friday night has rolled around, and you’re trying to pick a restaurant for dinner. (Assuming there are still reservations, since you waited until the last minute to book.) Should you go to your most beloved watering hole, or try a new establishment in the hopes of discovering something superior? Potentially, but that curiosity comes with a risk: if you explore, the food could be worse; if you exploit, you fail to grow out of your narrow pathway.

Curiosity drives AI to explore the world, now in a wide range of use cases: autonomous navigation, robotic decision-making, optimizing health outcomes. In some cases, machines use “reinforcement learning” to accomplish a goal: an AI agent iteratively learns by being rewarded for good behavior and punished for bad.

Just like humans selecting a restaurant, these agents struggle to balance the time spent discovering better actions (exploration) against the time spent taking actions that led to high rewards in the past (exploitation). Too much curiosity can distract the agent from making good decisions, and too little means the agent will never discover good decisions.
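To make the trade-off concrete, here is a minimal sketch of epsilon-greedy action selection on a toy multi-armed bandit. It is a standard textbook illustration, not anything taken from the MIT paper. The function name, the Gaussian reward model, and the parameter values are all assumptions chosen for illustration; each "arm" plays the role of a restaurant.

```python
import random


def epsilon_greedy(true_means, epsilon, steps, seed=0):
    """Run an epsilon-greedy agent on a toy Gaussian bandit (illustrative only).

    true_means: hypothetical average reward of each arm (restaurant).
    epsilon:    probability of exploring a random arm at each step.
    Returns (average reward per step, value estimates, pull counts).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # how often each arm was tried
    estimates = [0.0] * n_arms     # running mean reward of each arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an arm at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / steps, estimates, counts


if __name__ == "__main__":
    for eps in (0.0, 0.1, 1.0):
        avg, _, counts = epsilon_greedy([1.0, 2.0, 3.0], eps, steps=5000)
        print(f"epsilon={eps}: average reward {avg:.2f}, pulls {counts}")
```

With epsilon set to 0 the agent never explores at random and can lock onto the first restaurant that seems acceptable; with epsilon set to 1 it never uses what it has learned. Values in between trade one risk against the other.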

In the pursuit of making AI agents with just the right dose of curiosity, researchers from MIT’s Improbable AI Laboratory and Computer Science and Artificial Intelligence Laboratory (CSAIL) created an algorithm that overcomes the problem of AI being too “curious” and getting distracted from the task at hand. Their algorithm automatically increases curiosity when it’s needed, and suppresses it if the agent gets enough supervision from the environment to know what to do.

When tested on more than 60 video games, the algorithm succeeded at both hard and easy exploration tasks, whereas previous algorithms have been able to tackle only a hard or an easy domain, not both. With this method, AI agents need less data to learn decision-making rules that maximize incentives.

“If you master the exploration-exploitation trade-off well, you can learn the right decision-making rules faster, and anything less will require lots of data, which could mean suboptimal medical treatments, lesser profits for websites, and robots that don’t learn to do the right thing,” says Pulkit Agrawal, an MIT professor and director of the Improbable AI Lab, who supervised the research.

“Imagine a website trying to figure out the design or layout of its content that will maximize sales. If one doesn’t perform exploration-exploitation well, converging to the right website design or the right website layout will take a long time, which means profit loss. Or in a health care setting, like with COVID-19, there may be a sequence of decisions that need to be made to treat a patient, and if you want to use decision-making algorithms, they need to learn quickly and efficiently—you don’t want a suboptimal solution when treating a large number of patients. We hope that this work will apply to real-world problems of that nature.”

Curiosity killed the cat

It’s hard to capture the nuances of curiosity’s psychological underpinnings; the neural correlates of challenge-seeking behavior are poorly understood. Attempts to categorize the behavior have spanned studies that delve deeply into our impulses, deprivation sensitivities, and social and stress tolerances.

With reinforcement learning, this process is emotionally “pruned” and stripped down to the bare bones, but it’s quite complicated (surprise, surprise) on the technical side. Essentially, the agent should be curious and try out different things only when there’s not enough supervision available; when there is supervision, it must lower its curiosity.
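One way to picture “curiosity” in code is as a bonus reward for visiting unfamiliar states, added to the task reward with an adjustable weight. The sketch below is an illustration under stated assumptions, not the researchers’ algorithm: the count-based bonus is a common textbook choice, and `update_weight` is a hypothetical heuristic that lowers the weight whenever the curious agent scores worse on the task than a baseline without curiosity.

```python
import math


def novelty_bonus(visit_count):
    """Count-based curiosity bonus (assumed form): large for rarely
    seen states, shrinking as a state is visited more often.
    visit_count must be at least 1."""
    return 1.0 / math.sqrt(visit_count)


def mixed_reward(extrinsic, intrinsic, weight):
    """Task reward plus a weighted curiosity bonus."""
    return extrinsic + weight * intrinsic


def update_weight(weight, curious_return, baseline_return,
                  step=0.1, max_weight=1.0):
    """Hypothetical heuristic, not taken from the paper.

    If the curious agent earns less task reward than a baseline agent
    without curiosity, curiosity is a distraction, so reduce its weight.
    Otherwise, allow a little more curiosity. The result is clamped to
    the range [0, max_weight].
    """
    if curious_return < baseline_return:
        weight -= step
    else:
        weight += step
    return min(max(weight, 0.0), max_weight)


if __name__ == "__main__":
    w = 0.5
    # Curiosity is hurting task performance here, so the weight drops.
    w = update_weight(w, curious_return=10.0, baseline_return=25.0)
    print(mixed_reward(extrinsic=1.0, intrinsic=novelty_bonus(4), weight=w))
```

The design choice worth noting is that the weight is driven by task performance rather than fixed in advance, which mirrors the article’s description of raising curiosity when it helps and suppressing it when it does not.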

Since a large subset of gaming consists of little agents running around fantastical environments looking for rewards and performing long sequences of actions to achieve some goal, games seemed like the logical testbed for the researchers’ algorithm. In experiments with games like Mario Kart and Montezuma’s Revenge, they divided the games into two buckets: “hard” exploration games, where supervision was sparse and the agent had less guidance, and “easy” exploration games, where supervision was more dense.

Suppose, for example, that in Mario Kart you remove all rewards, so you don’t know when an enemy kills you. You’re not given any reward when you collect a coin or jump over pipes. The agent is told only at the end how well it did. This would be bucket one, with sparse supervision. Algorithms that incentivize curiosity do really well in this scenario.

But now, suppose the agent is provided dense supervision: a reward for jumping over pipes, collecting coins, and killing enemies. Here an algorithm without curiosity performs really well because it gets rewarded very often. An algorithm that also uses curiosity, by contrast, learns slowly. This is because the curious agent might attempt to run fast in different ways, dance around, or go to every part of the game screen, things that are interesting but do not help the agent succeed at the game. The team’s algorithm, however, consistently performed well, irrespective of what environment it was in.
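The two buckets can be illustrated with a toy reward function. In this sketch the event names and point values are made up for illustration, and a game is reduced to a list of events in one episode; it is not how any real game emulator reports rewards.

```python
def dense_rewards(events, values):
    """Dense supervision: the agent is rewarded at every step
    according to what just happened."""
    return [values.get(event, 0.0) for event in events]


def sparse_rewards(events, values):
    """Sparse supervision: the agent gets nothing until the final
    step, when it is told its total score for the episode."""
    if not events:
        return []
    total = sum(values.get(event, 0.0) for event in events)
    return [0.0] * (len(events) - 1) + [total]


if __name__ == "__main__":
    # Hypothetical point values for in-game events.
    values = {"coin": 1.0, "pipe": 0.5, "enemy": 2.0}
    episode = ["coin", "pipe", "nothing", "enemy"]
    print(dense_rewards(episode, values))
    print(sparse_rewards(episode, values))
```

Both versions add up to the same episode score; the difference is only in when the agent hears about it, which is what makes the sparse case a hard exploration problem.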

Future work might involve circling back to the exploration that’s delighted and plagued psychologists for years: an appropriate metric for curiosity. No one really knows the right way to define curiosity mathematically.

“Getting consistent good performance on a novel problem is extremely challenging, so by improving exploration algorithms, we can save your effort on tuning an algorithm for your problems of interest. We need curiosity to solve extremely challenging problems, but on some problems it can hurt performance. We propose an algorithm that removes the burden of tuning the balance of exploration and exploitation. Previously, a problem might take, for instance, a week to solve successfully. With this new algorithm, we can get satisfactory results in a few hours,” says MIT CSAIL Ph.D. student Zhang-Wei Hong, co-lead author along with Eric Chen, MIT CSAIL MEng ’22, on a new paper about the work.

“Intrinsic rewards like curiosity are fundamental to guiding agents to discover useful diverse behaviors, but this shouldn’t come at the cost of doing well at the given task. This is an important problem in AI and the paper provides a way to balance that tradeoff. It would be interesting to see how such methods scale beyond games to real-world robotic agents,” says Deepak Pathak, a faculty member at Carnegie Mellon University.

“One of the greatest challenges for current AI and cognitive science is how to balance exploration and exploitation—the search for information versus the search for reward. Children do this seamlessly, but it is challenging computationally,” notes Alison Gopnik, Distinguished Professor of Psychology and Affiliate Professor of Philosophy at UC Berkeley, who was not involved with the project.

“This paper uses impressive new techniques to accomplish this automatically, designing an agent that can systematically balance curiosity about the world and the desire for reward, [thus taking] another step towards making AI agents (almost) as smart as children.”

More information:
Eric R Chen, Zhang-Wei Hong, Joni Pajarinen, Pulkit Agrawal, Redeeming intrinsic rewards via constrained policy optimization. openreview.net/forum?id=36Yz37cEN_Q

Provided by
MIT Computer Science & Artificial Intelligence Lab

Citation:
Sometimes it’s bad for AI to be too curious (2022, November 10)
retrieved 10 November 2022
from https://techxplore.com/news/2022-11-bad-ai-curious.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.
Tags: algorithm, curiosity, good behavior, health outcomes