New Go-playing trick defeats world-class Go AI—but loses to human amateurs

Adversarial policy attacks blind spots in the AI—with broader implications than games.

In the world of deep-learning AI, the ancient board game Go looms large. Until 2016, the best human Go player could still defeat the strongest Go-playing AI. That changed with DeepMind’s AlphaGo, which used deep-learning neural networks to teach itself the game at a level humans cannot match. More recently, KataGo has become popular as an open source Go-playing AI that can beat top-ranking human Go players.

Last week, a group of AI researchers published a paper outlining a method to defeat KataGo by using adversarial techniques that take advantage of KataGo’s blind spots. By playing unexpected moves outside of KataGo’s training set, a much weaker adversarial Go-playing program (that amateur humans can defeat) can trick KataGo into losing.

To wrap our minds around this achievement and its implications, we spoke to one of the paper’s co-authors, Adam Gleave, a Ph.D. candidate at UC Berkeley. Gleave (along with co-authors Tony Wang, Nora Belrose, Tom Tseng, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, and Stuart Russell) developed what AI researchers call an “adversarial policy.” In this case, the researchers’ policy combines a neural network with a tree-search method called Monte Carlo tree search to find Go moves.

KataGo’s world-class AI learned Go by playing millions of games against itself. But that still isn’t enough experience to cover every possible scenario, which leaves room for vulnerabilities from unexpected behavior. “KataGo generalizes well to many novel strategies, but it does get weaker the further away it gets from the games it saw during training,” says Gleave. “Our adversary has discovered one such ‘off-distribution’ strategy that KataGo is particularly vulnerable to, but there are likely many others.”

Gleave explains that, during a Go match, the adversarial policy works by first staking claim to a small corner of the board. He provided a link to an example in which the adversary, controlling the black stones, plays largely in the top-right of the board. The adversary allows KataGo (playing white) to lay claim to the rest of the board, while the adversary plays a few easy-to-capture stones in that territory.

“This tricks KataGo into thinking it’s already won,” Gleave says, “since its territory (bottom-left) is much larger than the adversary’s. But the bottom-left territory doesn’t actually contribute to its score (only the white stones it has played) because of the presence of black stones there, meaning it’s not fully secured.”

Overconfident that it will win once the game ends and the points are tallied, KataGo passes. The adversary then deliberately passes as well, and because two consecutive passes end a game of Go, the game is over and the point tally begins. As the paper explains, “The adversary gets points for its corner territory (devoid of victim stones) whereas the victim [KataGo] does not receive points for its unsecured territory because of the presence of the adversary’s stones.”
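The scoring effect described above can be illustrated with a small sketch. It assumes a Tromp-Taylor-style area count, which this article does not name: each stone scores one point, and an empty region scores for a color only if every stone bordering it is that color. The function name `area_score`, the 5x5 boards, and the omission of komi are illustrative assumptions, not taken from the paper or from KataGo.

```python
from collections import deque

def area_score(board):
    """Count area points on a board given as equal-length strings.

    'B' = black stone, 'W' = white stone, '.' = empty point.
    Assumed rule: a stone is worth one point; an empty region is
    credited to a color only if it borders stones of that color alone.
    """
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:          # a stone: one point for its owner
                score[cell] += 1
                continue
            if (r, c) in seen:         # empty point already counted
                continue
            # Flood-fill the empty region containing (r, c).
            size, borders = 0, set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    neighbor = board[ny][nx]
                    if neighbor == ".":
                        if (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                    else:
                        borders.add(neighbor)
            # Credit the region only if a single color surrounds it.
            if len(borders) == 1:
                score[borders.pop()] += size
    return score

# Hypothetical position: black holds the top-right corner, white walls
# off the rest, and one stray black stone sits in white's area.
with_stray_stone = [
    "..WB.",
    "..WB.",
    "..WBB",
    "..WWW",
    "B....",
]
# Same position without the stray black stone.
without_stray_stone = with_stray_stone[:4] + ["....."]

print(area_score(with_stray_stone))
print(area_score(without_stray_stone))
```

On these toy boards the single stray black stone changes the result from 19 to 6 in white's favor into 7 to 6 in black's favor, because white's large empty region now touches a black stone and counts for neither side. A stronger player would simply capture the stray stone before passing.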

Despite this clever trickery, the adversarial policy alone is not that great at Go. In fact, human amateurs can defeat it relatively easily. Instead, the adversary’s sole purpose is to attack an unanticipated vulnerability of KataGo. Similar vulnerabilities could exist in almost any deep-learning AI system, which gives this work much broader implications.

“The research shows that AI systems that seem to perform at a human level are often doing so in a very alien way, and so can fail in ways that are surprising to humans,” explains Gleave. “This result is entertaining in Go, but similar failures in safety-critical systems could be dangerous.”

Imagine, for example, a self-driving car AI that encounters a wildly unlikely scenario it doesn’t expect, allowing a human to trick it into dangerous behavior. “[This research] underscores the need for better automated testing of AI systems to find worst-case failure modes,” says Gleave, “not just test average-case performance.”
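A toy sketch can make the difference between average-case and worst-case testing concrete. Everything here is hypothetical: `failure_score` stands in for some measurable failure rate of a system, the blind spot near 0.731 is invented for illustration, and a dense sweep is used as the simplest possible worst-case search rather than the learned adversary the researchers trained.

```python
def failure_score(x):
    """Hypothetical system: fails rarely, except in one narrow blind spot."""
    return 1.0 if abs(x - 0.731) < 0.0005 else 0.01

def average_case(inputs):
    """Mean failure score over a sample of typical inputs."""
    scores = [failure_score(x) for x in inputs]
    return sum(scores) / len(scores)

def worst_case(inputs):
    """Return the (score, input) pair with the highest failure score."""
    return max((failure_score(x), x) for x in inputs)

# A coarse sample of "typical" inputs misses the blind spot entirely.
typical_inputs = [i / 100 for i in range(100)]
# A much denser sweep, standing in for an adversarial search, finds it.
dense_inputs = [i / 10000 for i in range(10001)]

print("average-case failure score:", average_case(typical_inputs))
print("worst-case (score, input):", worst_case(dense_inputs))
```

The average over typical inputs stays near 0.01, while the worst-case search reports a failure score of 1.0, which is the kind of gap Gleave's comment points to.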

A half-decade after AI finally triumphed over the best human Go players, the ancient game continues its influential role in machine learning. Insights into the weaknesses of Go-playing AI, once broadly applied, may even end up saving lives.

https://arstechnica.com/information-technology/2022/11/new-go-playing-trick-defeats-world-class-go-ai-but-loses-to-human-amateurs/
