01-17-2018, 01:25 PM
Chronos
Charter Member
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 76,114
I guess what I'm getting at is that, at sufficiently high levels, the assumptions behind the Elo rating system probably break down. One can envision two extremely powerful players who almost always draw against each other, but such that A can beat B one game out of 100, while B can't beat A even one time in a billion. If B makes a game-losing error 1% of the time, but A makes a game-losing error in fewer than one game in a billion, then in some sense A is ten million times better... but their ratings will be very close.
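To put a number on "very close", here is a minimal sketch that assumes the standard logistic Elo expected-score formula (gap = 400 * log10(E / (1 - E))) and counts a draw as half a point. The function name `elo_gap` and the variable names are made up for illustration; the probabilities are the ones from the scenario above.

```python
import math

def elo_gap(expected_score: float) -> float:
    """Rating gap implied by an expected score (assumes the logistic Elo model)."""
    return 400 * math.log10(expected_score / (1 - expected_score))

# Scenario from the post: A beats B in 1% of games, B beats A about once
# in a billion games, and every other game is a draw.
p_win_a = 0.01
p_win_b = 1e-9
p_draw = 1 - p_win_a - p_win_b

# Expected score for A: a win counts 1 point, a draw counts half a point.
score_a = p_win_a + 0.5 * p_draw
gap = elo_gap(score_a)

print(f"A's expected score: {score_a:.4f}")
print(f"Implied rating gap: {gap:.1f} points")
```

Under these assumptions the gap comes out to only a few rating points (roughly 3.5), even though the two error rates differ by a factor of ten million.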
01-19-2018, 11:14 AM
BigAppleBucky
Join Date: Jan 2012
Posts: 2,253
Originally Posted by borschevsky
That number for AlphaZero's rating doesn't make sense. Since it scored 64% against Stockfish, its rating should be 100 points higher than Stockfish's.

I've seen estimates for the maximum possible rating at about 3600. If you imagine a perfect chess-playing entity, would Magnus Carlsen draw one out of every 50 games against it, and lose the other 49? If so, its rating would be about 3600.
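Both figures in the quoted post can be checked with a short sketch, again assuming the standard logistic Elo formula. The helper `elo_gap` is a hypothetical name, and the Carlsen rating of 2830 is an assumed round figure for early 2018, not a number taken from the thread.

```python
import math

def elo_gap(expected_score: float) -> float:
    """Rating gap implied by an expected score (assumes the logistic Elo model)."""
    return 400 * math.log10(expected_score / (1 - expected_score))

# A 64% score against Stockfish.
gap_alphazero = elo_gap(0.64)

# A perfect player that wins 49 of 50 games against Carlsen and draws 1:
# 49 wins plus half a point for the draw, out of 50 games.
perfect_score = (49 + 0.5) / 50
gap_perfect = elo_gap(perfect_score)

CARLSEN_RATING = 2830  # assumption: approximate rating, for illustration only

print(f"64% score -> gap of {gap_alphazero:.0f} points")
print(f"{perfect_score:.0%} score -> gap of {gap_perfect:.0f} points")
print(f"Implied rating of the perfect player: {CARLSEN_RATING + gap_perfect:.0f}")
```

With that assumed rating, the 99% score puts the perfect player a little above 3600, which matches the estimate quoted above.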

In these games the engines were given the first few moves of human-conceived openings. Not sure what the engines would do without those openings. Maybe white has a forced win, or maybe with best play the games are always draws.

The paper released by the developers covers 1200 games between AlphaZero & Stockfish 8: 12 openings, 100 games per opening. 10 games from the paper are in current videos. I haven't seen anyone mention anything about the other games. So I did this video & came up with a rating from the data, & also expressed some obvious implications of it.

Both programs played black & white 50 games each in every opening. The results for AlphaZero were:

As white: 242 wins, 353 draws, 5 losses.
As black: 48 wins, 533 draws, 19 losses.
Total: 290 wins, 886 draws, 24 losses.

23% wins, 73.83% draws, 2% losses. So it did lose to Stockfish. Why nobody is mentioning it I don't know, but I'm assuming investment reasons & hype.
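The percentages can be recomputed straight from the white and black tallies given above. This is only an arithmetic check; the variable names are made up for illustration.

```python
# AlphaZero's results as (wins, draws, losses), taken from the tallies above.
white = (242, 353, 5)
black = (48, 533, 19)

# Add the white and black results column by column.
wins, draws, losses = (w + b for w, b in zip(white, black))
total = wins + draws + losses

win_pct = round(100 * wins / total, 2)
draw_pct = round(100 * draws / total, 2)
loss_pct = round(100 * losses / total, 2)

print(f"{wins} wins, {draws} draws, {losses} losses over {total} games")
print(f"{win_pct}% wins, {draw_pct}% draws, {loss_pct}% losses")
```

The draw and loss shares match the figures quoted, but 290 wins out of 1200 games works out to about 24.2%, a little above the 23% stated.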

AlphaZero was run on a supercomputer platform; apparently Stockfish was not. That might have been extremely important.
01-19-2018, 11:40 AM
borschevsky
Join Date: Sep 2001
Location: Canada
Posts: 1,905
Originally Posted by BigAppleBucky
He writes the following in the video description:

For the rating compared to Stockfish which is around 3400, AlphaZero won 23% of the games & lost 2%, for a 21% strength over Stockfish, which is 700 points & a total rating of 4100+.
He's apparently taking that 21% and saying that 3400 + (3400 * 0.21) = 4100+. That is not remotely close to how the ratings work. Winning 23%, losing 2%, and drawing 75% is a 60.5% score, which is worth about 74 points; against a 3400-rated opponent that gives you a rating of 3474.
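For anyone who wants to reproduce the 3474 figure, here is a minimal sketch that assumes the standard logistic Elo formula and takes the 3400 Stockfish rating from the video description as given. The names `elo_gap`, `naive_rating` and `elo_rating` are hypothetical.

```python
import math

def elo_gap(expected_score: float) -> float:
    """Rating gap implied by an expected score (assumes the logistic Elo model)."""
    return 400 * math.log10(expected_score / (1 - expected_score))

STOCKFISH_RATING = 3400  # figure quoted in the video description
win, draw, loss = 0.23, 0.75, 0.02

# The video's method: treat (win% - loss%) as a percentage of the rating itself.
naive_rating = STOCKFISH_RATING + STOCKFISH_RATING * (win - loss)

# Elo method: turn the results into a score (draw = half a point),
# then convert that score into a rating gap.
score = win + 0.5 * draw
elo_rating = STOCKFISH_RATING + elo_gap(score)

print(f"Video's method: {naive_rating:.0f}")
print(f"Score: {score:.1%}, Elo-based rating: {elo_rating:.0f}")
```

The first line reproduces the video's "4100+" figure, and the second gives 3474. The difference is that Elo gaps depend on the score percentage through a logarithm, not on a fraction of the opponent's rating.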

