  #1  
Old 10-19-2017, 04:22 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
AlphaGo: Where's AlphaChess?

A year ago, Google developed AlphaGo, an AI that plays Go. A few months after that, it took the world by storm by handily defeating the best human players in the world, where previous efforts at Go AIs hadn't come anywhere close. But what was most interesting was how they did it: according to the designers, they didn't program it to play Go; rather, they programmed it to learn how to play a game, and then let it learn Go.

Now, given this, it seems to me that the logical next step is to take the same system and let it learn some other game as well, and the obvious choice is a game that's already been extensively studied by AI researchers, namely chess.

But I haven't heard anything about the Google programmers working in this direction. So why aren't they programming Alpha to play chess as well? Or are they, and I just haven't heard about it?
__________________
Time travels in divers paces with divers persons.
--As You Like It, III:ii:328
Check out my dice in the Marketplace
  #2  
Old 10-19-2017, 05:06 PM
DPRK DPRK is offline
Guest
 
Join Date: May 2016
Posts: 911
http://www.bbc.co.uk/news/technology-41668701

I would like to see the source code, but according to the news report it can learn to play a large class of games, which I assume includes chess (if it can learn to play Go, it can surely learn to play chess).

ETA: perhaps some of the AI experts on this board can describe in more detail how the algorithm works, and what hardware it requires to run

Last edited by DPRK; 10-19-2017 at 05:07 PM.
  #3  
Old 10-19-2017, 09:07 PM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Posts: 7,287
Probably because computer programs already beat humans at chess, so that wasn't exciting to them. They've done more than Go, though.

https://deepmind.com/research/public...ment-learning/
  #4  
Old 10-19-2017, 10:29 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
Huh, Atari games... I was not expecting that as the next step. I guess that their emulator is able to run them at many times actual speed, giving them the time needed for the system to learn them. And even at Atari resolutions, there are a lot more pixels on the screen than there are squares on a Go board, and it's having to react to them more quickly, too.
  #5  
Old 10-19-2017, 10:31 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
...Wait, that was 2015, so before they applied it to Go. I suppose that winning the Go game is a more nebulous end-state than just "get as high a score as possible", though.
  #6  
Old 10-19-2017, 11:24 PM
pulykamell pulykamell is offline
Charter Member
 
Join Date: May 2000
Location: SW Side, Chicago
Posts: 41,969
AlphaChess? What about AlphaStratego?! (Reference to another thread the OP and I participated in a few months ago.)
  #7  
Old 10-20-2017, 12:35 AM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Posts: 7,287
Yeah, didn't mean to imply that Atari was after Go, just that it's been worked on, and that algorithm was general enough that it could do well on a wide variety of games with different training.

For the next step in complexity, I think Starcraft is what DeepMind is concentrating on.
  #8  
Old 10-20-2017, 12:54 AM
DPRK DPRK is offline
Guest
 
Join Date: May 2016
Posts: 911
The article quotes the DeepMind chief as mentioning drug and materials design as well as more general applications in scientific research. It may be that playing board games is no longer cutting-edge enough for them to waste their time on it.
  #9  
Old 10-20-2017, 02:28 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
I went in with the assumption that Alpha was a machine programmed to learn how to play board games. It looks like a more accurate description might be that it's a machine programmed to learn. Which, really, is even more impressive.
  #10  
Old 10-20-2017, 04:37 PM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Posts: 7,287
Quote:
Originally Posted by Chronos View Post
I went in with the assumption that Alpha was a machine programmed to learn how to play board games. It looks like a more accurate description might be that it's a machine programmed to learn. Which, really, is even more impressive.
My understanding is that AlphaGo and its successor are programmed with the rules of the game and with the knowledge that winning a game is good, and that's about it. The Atari version, for instance, doesn't even know the rules of the games at the start: it just gets raw pixel input and infers the rules based on what sorts of actions tend to get positive results.
  #11  
Old 10-21-2017, 08:31 AM
BeepKillBeep BeepKillBeep is offline
Guest
 
Join Date: Jul 2014
Location: Canada
Posts: 1,848
Quote:
Originally Posted by DPRK View Post
http://www.bbc.co.uk/news/technology-41668701

I would like to see the source code, but according to the news report it can learn to play a large class of games, which I assume includes chess (if it can learn to play Go, it can surely learn to play chess).

ETA: perhaps some of the AI experts on this board can describe in more detail how the algorithm works, and what hardware it requires to run
I'll try to keep this as non-technical as I can, and if anybody has more technical follow-up questions, I'll answer them as I am able.

The algorithm basically works as follows. A neural network is a collection of virtual neurons, each of which takes a set of weighted inputs and applies some function (usually a sigmoid function) to transform them into an output. If you consider what this means, the final output from a neural network is purely a function of the inputs; i.e., there is no randomness, and once trained, a network is deterministic. So you can always unravel a neural network to produce a function. Now, the functions are extremely complex for any non-trivial neural network, but they are functions nonetheless. This is why neural networks are sometimes called "universal function approximators". So if you give one a set of samples and train the network, it will converge on a configuration that optimally transforms the samples into the desired outputs.
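To make the "it's just a function" part concrete for anybody who wants a little code, here is a minimal sketch of a single sigmoid neuron in Python with numpy. The weights are made-up numbers purely for illustration; training is the process of finding weights like these automatically.

Code:
import numpy as np

def sigmoid(z):
    # squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    # a neuron is just a function: weighted sum, then the activation
    return sigmoid(np.dot(weights, inputs) + bias)

w = np.array([0.5, -1.2, 0.8])  # made-up weights
b = 0.1
x = np.array([1.0, 0.0, 2.0])
print(neuron(x, w, b))  # same inputs always give the same output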

Reinforcement learning (Q-learning) is pretty simple. The idea is that an AI can determine good/bad actions based on a reward, as opposed to being told explicitly whether something is correct or not. As an example, suppose you're training a self-driving car. You give it the goal of "safely transporting the passenger and vehicle from point A to point B." Using reinforcement learning, the AI should very quickly discover that running a red light is bad, because neither the passenger nor the vehicle will arrive safely at B. This differs from explicitly telling the AI that it did something wrong every time it runs a red light. The advantage of reinforcement learning is that the AI is freer to find its own way, so to speak. For example, the AI might learn that running a red light is just fine when there is no traffic (say at 2 AM in a small town), which is actually pretty reasonable even if illegal. The drawback is that, much as with neural networks, why a particular policy (a set of preferred actions for every state) was selected isn't always clear; i.e., it might tell you that the policy allows running red lights at 2 AM, without the additional information that this is because there's no traffic at 2 AM. Here's the key thing, though. Reinforcement learning is ultimately represented as a function: a function that describes, for a given state and a given policy, the expected reward. Finding the particular function that optimally solves a problem is not easy, and in classical Q-learning the functions you can actually work with tend to be fairly simple.
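For flavor, here is what the core update of plain tabular Q-learning looks like as a Python sketch. The names and constants are mine, for illustration only, not anything DeepMind published.

Code:
# One step of tabular Q-learning. Q is a dict mapping
# (state, action) pairs to estimated expected reward.
def q_update(Q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    # best reward we currently think the next state can lead to
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    # nudge the old estimate toward (reward now + discounted future)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

Q = {}
q_update(Q, "red_light", "stop", 1.0, "intersection_clear",
         actions=["stop", "go"])
print(Q)  # {('red_light', 'stop'): 0.1}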

So what these researchers realized is that neural networks can approximate complex functions well, and that reinforcement learning could benefit from more complex functions, which are hard to find by hand. So they married the two together and use a neural network to find an optimal reinforcement-learning function, which, in effect, gives them the optimal policy.
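A rough sketch of that marriage, again in Python with numpy: the network replaces the Q-table. The sizes and names here are made up, and a real system trains the weights rather than leaving them random.

Code:
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(32, 4)), np.zeros(32)   # 4 state features in
W2, b2 = rng.normal(size=(3, 32)), np.zeros(3)    # 3 possible actions out

def q_values(state):
    # state vector in, one estimated Q-value per action out
    hidden = 1.0 / (1.0 + np.exp(-(W1 @ state + b1)))  # sigmoid layer
    return W2 @ hidden + b2

state = np.array([0.1, -0.3, 0.7, 0.0])
action = int(np.argmax(q_values(state)))  # greedy policy: take the best action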

I hope that helps. Again, if there are any follow-up questions, feel free to fire away.

Last edited by BeepKillBeep; 10-21-2017 at 08:34 AM.
  #12  
Old 10-21-2017, 04:20 PM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Posts: 7,287
Isn't the logistic/sigmoid activation function basically obsolete?
  #13  
Old 10-21-2017, 04:53 PM
BeepKillBeep BeepKillBeep is offline
Guest
 
Join Date: Jul 2014
Location: Canada
Posts: 1,848
Quote:
Originally Posted by Snarky_Kong View Post
Isn't the logistic/sigmoid activation function basically obsolete?
Not as far as I know; most of the papers I read still use a sigmoid function. The neuron's input-to-output (activation) function can be any function, but sigmoid functions remain commonly used.

Last edited by BeepKillBeep; 10-21-2017 at 04:56 PM.
  #14  
Old 10-21-2017, 05:24 PM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Posts: 7,287
Interesting. Pretty much everything I see, including AlphaGo, uses rectified linear units (ReLU).

When you say it can be any function, it must be non-linear, correct?

Last edited by Snarky_Kong; 10-21-2017 at 05:26 PM.
  #15  
Old 10-21-2017, 05:32 PM
Dr. Strangelove Dr. Strangelove is offline
Guest
 
Join Date: Dec 2010
Posts: 6,117
Yeah, my understanding is that ReLU is more popular these days, if for no other reason than that it's super-cheap (though I think it has better properties than sigmoid as well). It must be non-linear: a linear activation would make the whole net equivalent to a single matrix multiply, which isn't that interesting.
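You can check the collapse directly; here's a toy numpy snippet (matrices made up at random):

Code:
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(5, 3))   # "layer 1" with linear activation
W2 = rng.normal(size=(2, 5))   # "layer 2" with linear activation
x = rng.normal(size=3)

# two linear layers back to back...
two_layers = W2 @ (W1 @ x)
# ...are exactly one matrix multiply by the product matrix
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layers, one_layer))  # True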
  #16  
Old 10-21-2017, 07:32 PM
BeepKillBeep BeepKillBeep is offline
Guest
 
Join Date: Jul 2014
Location: Canada
Posts: 1,848
Quote:
Originally Posted by Snarky_Kong View Post
Interesting. Pretty much everything I see, including AlphaGo, uses rectified linear units (ReLU).

When you say it can be any function, it must be non-linear correct?
Technically, the function can be anything, but of course some functions are better or more meaningful than others. As for linear functions, the piecewise-linear activation function certainly isn't common, and I've never tried it, but it does show up every now and then. Non-linear functions are by far the more common, for certain.

Rectified linear units show up mainly in larger networks, specifically convolutional neural networks such as AlphaGo's. In smaller networks, sigmoid functions are still commonly used. In fact, I've seen more than a few papers where the deepest layers of a deep neural network are sigmoid (as there are few neurons) and the outer layers are ReLU. In non-deep networks, i.e. 3-layer perceptrons or single-layer recurrent networks, I still see a lot of plain sigmoid functions. This is mainly in computer vision papers, since I review those a lot; maybe in other domains it is different.

Doing a Google Scholar search, "neural network sigmoid activation function" returns 22,500 results since 2013 and "neural network rectified linear" returns 17,000. Not that result counts necessarily mean anything in terms of popularity, but certainly the sigmoid function is not obsolete.

Last edited by BeepKillBeep; 10-21-2017 at 07:36 PM.
  #17  
Old 10-21-2017, 09:40 PM
Dr. Strangelove Dr. Strangelove is offline
Guest
 
Join Date: Dec 2010
Posts: 6,117
Quote:
Originally Posted by BeepKillBeep View Post
In terms of linear functions, the piecewise linear activation function certainly isn't common, and I've never tried it, but it does show up every now and then.
ReLU is piecewise linear. "Piecewise linear" is just a type of non-linear. I think Snarky_Kong meant truly linear, like f(x)=.5x+.5 or whatever. But those are boring since you can just multiply through to the end and it turns into a matrix multiply.
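(A quick sanity check that ReLU, despite being built from two linear pieces, fails the definition of linearity:)

Code:
def relu(x):
    return max(0.0, x)

# a truly linear function would satisfy f(a + b) == f(a) + f(b)
print(relu(-1.0) + relu(1.0))  # 1.0
print(relu(-1.0 + 1.0))        # 0.0, so ReLU is not linear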
  #18  
Old 10-21-2017, 09:48 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
What's wrong with a Heaviside function? That's the way I'd always heard that real neurons worked, and it'd be computationally easier, though of course no Heaviside function in nature is ever actually truly Heaviside.
  #19  
Old 10-21-2017, 10:06 PM
Dr. Strangelove Dr. Strangelove is offline
Guest
 
Join Date: Dec 2010
Posts: 6,117
Quote:
Originally Posted by Chronos View Post
What's wrong with a Heaviside function?
It's not differentiable. NN learning works by backpropagation, which takes the output error and feeds a portion of it back toward the inputs. This only works where the derivative is defined. You could in principle do something similar with a Heaviside function, randomly changing a few inputs to flip the output activation, but having a derivative makes for "smoother" learning.
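A toy illustration (Python/numpy, my own naming): the sigmoid has a usable slope everywhere, while the Heaviside step has zero slope everywhere except at the jump, so backpropagation has nothing to work with.

Code:
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)     # non-zero everywhere, so error can flow back

def heaviside_grad(z):
    return np.zeros_like(z)  # zero almost everywhere: no learning signal

z = np.linspace(-3.0, 3.0, 7)
print(sigmoid_grad(z))    # smooth, usable gradients
print(heaviside_grad(z))  # all zeros; the weights would never update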
  #20  
Old 10-21-2017, 10:10 PM
Dr. Strangelove Dr. Strangelove is offline
Guest
 
Join Date: Dec 2010
Posts: 6,117
And FWIW, ReLU is the integral of the Heaviside function. So it's almost as simple as you can get.
  #21  
Old 10-21-2017, 10:41 PM
DPRK DPRK is offline
Guest
 
Join Date: May 2016
Posts: 911
It is not any more differentiable than the Heaviside function, though. What are the technical conditions on the function, and how do I judge which one is optimal?
  #22  
Old 10-21-2017, 11:11 PM
Dr. Strangelove Dr. Strangelove is offline
Guest
 
Join Date: Dec 2010
Posts: 6,117
Sorry, yeah--differentiability itself isn't exactly the criterion. I guess the better answer is that it needs a non-zero derivative over significant parts of its range.

As said, it needs to be non-linear, since most interesting functions are non-linear, and you can't build a non-linear function from linear ones.

The non-zero derivative is so backpropagation works. The question is just: given the difference between the current output and the output we want, how do we tweak the inputs to reduce that error? And the answer is to use the derivative; multiply that by the error and add that to the input.

You don't want the function to have huge amounts of compression in some regions but not others. Otherwise you're wasting bits, because large portions of the input range get squeezed into small portions of the output range. You need some compression due to the non-linearity, but you don't want to go crazy.

No one really knows how to judge whether one is optimal. It's still an emerging technology. ReLU has been shown, in practice, to work a little better than sigmoid and some others. About the only really solid criterion is the hardware requirements, which are trivial for ReLU (can be implemented in a few dozen gates), whereas sigmoid uses an exponential and divide--both expensive in HW. Of course, there's still plenty of room for experimentation with other functions.
  #23  
Old 10-22-2017, 12:12 PM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Posts: 7,287
Quote:
Originally Posted by Dr. Strangelove View Post
ReLU is piecewise linear. "Piecewise linear" is just a type of non-linear. I think Snarky_Kong meant truly linear, like f(x)=.5x+.5 or whatever. But those are boring since you can just multiply through to the end and it turns into a matrix multiply.
Right, linear activation functions can only do linear separations of examples. Boring.

As far as I can tell, it isn't known what makes an activation function optimal. Lots of comparisons are done on test data sets, and the two things generally compared are: 1) final prediction accuracy, and 2) training speed. So ReLU is faster and gives better results in many cases, but why that is probably isn't as well understood as it could be.

I've seen a few papers that have neural networks try to learn the activation function for a second neural network and gotten some novel possibilities.
  #24  
Old 10-22-2017, 12:36 PM
DPRK DPRK is offline
Guest
 
Join Date: May 2016
Posts: 911
Quote:
Originally Posted by Snarky_Kong View Post
I've seen a few papers that have neural networks try to learn the activation function for a second neural network and gotten some novel possibilities.
Sorry about all the elementary questions, but what functions did they get? Did the functions vary depending on the architecture of the second neural network, or not so much?
  #25  
Old 10-23-2017, 08:00 AM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Posts: 7,287
Quote:
Originally Posted by DPRK View Post
Sorry about all the elementary questions, but what functions did they get? Did the functions vary depending on the architecture of the second neural network, or not so much?
So, I've been trying to find the paper I was thinking of and failing. As I recall, there were two "families" of functions that worked best. One had the basic form of ln(a+e^x) which is basically the "softplus" / a smoothed ReLU. I don't recall the form of the other family.
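For reference, with a = 1 that form is the standard softplus, and you can see how closely it hugs ReLU at the extremes (quick Python check):

Code:
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softplus(x, a=1.0):
    # ln(a + e^x); with a = 1 this is the usual softplus
    return np.log(a + np.exp(x))

x = np.linspace(-5.0, 5.0, 5)
print(relu(x))      # [0.   0.   0.   2.5  5. ]
print(softplus(x))  # near 0 for very negative x, near x for large x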
  #26  
Old 10-23-2017, 09:59 AM
Pleonast Pleonast is offline
Charter Member
 
Join Date: Aug 1999
Location: Los 'Kamala'ngeles
Posts: 6,381
Quote:
Originally Posted by Dr. Strangelove View Post
No one really knows how to judge whether one is optimal. It's still an emerging technology. ReLU has been shown, in practice, to work a little better than sigmoid and some others. About the only really solid criterion is the hardware requirements, which are trivial for ReLU (can be implemented in a few dozen gates), whereas sigmoid uses an exponential and divide--both expensive in HW. Of course, there's still plenty of room for experimentation with other functions.
Sounds like the perfect problem for AI to solve!
  #27  
Old 10-23-2017, 11:38 AM
BeepKillBeep BeepKillBeep is offline
Guest
 
Join Date: Jul 2014
Location: Canada
Posts: 1,848
Quote:
Originally Posted by Snarky_Kong View Post
So, I've been trying to find the paper I was thinking of and failing. As I recall, there were two "families" of functions that worked best. One had the basic form of ln(a+e^x) which is basically the "softplus" / a smoothed ReLU. I don't recall the form of the other family.
I hope you can find it, I'd like to read it.
  #28  
Old 12-06-2017, 12:27 PM
borschevsky borschevsky is online now
Guest
 
Join Date: Sep 2001
Location: Canada
Posts: 1,865
Quote:
Originally Posted by Chronos View Post
Where's AlphaChess?
It's here:

https://arxiv.org/pdf/1712.01815.pdf

If I'm understanding this right, they trained it for four hours, without any inputs other than the rules of chess. So no opening book, no grandmaster games. After that training, they had it play a 100-game match against the world computer chess champion, Stockfish. AlphaZero won 28 and drew 72, and didn't lose a single game.

Some of the games are here. Crazy stuff in some of them.

Last edited by borschevsky; 12-06-2017 at 12:28 PM.
  #29  
Old 12-06-2017, 01:27 PM
Dead Cat Dead Cat is offline
Guest
 
Join Date: Feb 2005
Location: UK
Posts: 3,278
That is astonishing. I'll try to find time to play through some of those games. Thank you for posting this.
  #30  
Old 12-06-2017, 05:10 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
Wow. What was the white-black breakdown on those wins? And when AlphaChess plays against itself, what's the white-black record?
  #31  
Old 12-06-2017, 05:44 PM
borschevsky borschevsky is online now
Guest
 
Join Date: Sep 2001
Location: Canada
Posts: 1,865
When AlphaZero was white, it won 25 and drew 25. As black, it won 3 and drew 47. There aren't any white/black statistics from the training, unless I missed them. The paper does have some neat graphs, showing how the algorithm played openings at different rates as it learned.

As I understand it, they're going to release some further information, including some games from earlier in the learning process.
  #32  
Old 12-06-2017, 06:13 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
Quote:
There aren't any white/black statistics from the training, unless I missed them.
I don't mean training games; I mean games between the fully-trained program and itself (well, I'm sure it continues learning, but it's probably pretty close to the asymptote by now, close enough that a hundred more games won't move the needle much). I ask because a strong white-black skew could be expected as one sign of a high degree of chess mastery.
  #33  
Old 12-06-2017, 09:14 PM
Chessic Sense Chessic Sense is offline
Guest
 
Join Date: Apr 2007
Posts: 6,562
Quote:
Originally Posted by borschevsky View Post
Some of the games are here. Crazy stuff in some of them.
This is just crazy stuff. Look at White's development after move 12. Check out 26. Qh1! It's just nuts the things these computers are able to get away with.
  #34  
Old 12-06-2017, 09:24 PM
pulykamell pulykamell is offline
Charter Member
 
Join Date: May 2000
Location: SW Side, Chicago
Posts: 41,969
Quote:
Originally Posted by borschevsky View Post
It's here:

https://arxiv.org/pdf/1712.01815.pdf

If I'm understanding this right, they trained it for four hours, without any inputs other than the rules of chess. So no opening book, no grandmaster games. After that training, they had it play a 100-game match against the world computer chess champion, Stockfish. AlphaZero won 28 and drew 72, and didn't lose a single game.

Some of the games are here. Crazy stuff in some of them.
Pretty cool stuff. I figured at some point AI and chess would get to this (just feed it the rules, and it figures out best play for itself), but it's pretty cool to be at that point already. That's just amazing. I'm curious to see how the computer-learned games develop chess theory, and whether the computer finds any weird strategic curveballs that go against conventional theory. (IIRC, with Go there were a number of moves that went against conventional Go wisdom and opened up areas of theoretical interest, but I'm not that familiar with Go.) So far, from what I understand of the paper, it seems to have settled on a number of established openings on its own, so there's nothing too surprising there that I see.

Can't wait to see what more comes of this.
  #35  
Old 12-06-2017, 09:25 PM
pulykamell pulykamell is offline
Charter Member
 
Join Date: May 2000
Location: SW Side, Chicago
Posts: 41,969
Quote:
Originally Posted by Chessic Sense View Post
This is just crazy stuff. Look at White's development after move 12.
Whoa. That certainly does not look like your usual opening development with three central pawns missing and pieces exposed like that.
  #36  
Old 12-06-2017, 11:41 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
OK, I'm not a chess expert like you guys, but the move that seems most surprising to me is one of Stockfish's (which should be more or less conventional, I think): After 32. c4, why doesn't Black just take the pawn? Surely snatching up a free pawn (and getting two of your own passed in the process) should be a better opportunity than just making some feeble menacing facial expressions at White's bishop (which is worth less than the rook threatening it, and which is defended)?
  #37  
Old 12-07-2017, 12:45 AM
pulykamell pulykamell is offline
Charter Member
 
Join Date: May 2000
Location: SW Side, Chicago
Posts: 41,969
Quote:
Originally Posted by Chronos View Post
OK, I'm not a chess expert like you guys, but the move that seems most surprising to me is one of Stockfish's (which should be more or less conventional, I think): After 32. c4, why doesn't Black just take the pawn? Surely snatching up a free pawn (and getting two of your own passed in the process) should be a better opportunity than just making some feeble menacing facial expressions at White's bishop (which is worth less than the rook threatening it, and which is defended)?
You can actually play it out if you want on that website. (Just move the black pawn to capture, and follow the lines below.) I do see that if you make the move where black captures c4, white all of a sudden jumps to a huge advantage (of a piece), and all sorts of bad things happen with a response of 33. f4.

Last edited by pulykamell; 12-07-2017 at 12:46 AM.
  #38  
Old 12-07-2017, 12:54 AM
pulykamell pulykamell is offline
Charter Member
 
Join Date: May 2000
Location: SW Side, Chicago
Posts: 41,969
Unfortunately, after a few moves, it doesn't seem to let you analyze with the full depth (it limits it to 8 levels), so I can't quite see what's going on other than a lot of pressure and coordination of pieces against black's king side. Maybe an actual expert like glee can chime in? But, whatever the case, the analysis seems to suggest that taking the pawn is not good news for black in the long run.
  #39  
Old 12-07-2017, 07:34 AM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
Oh, obviously, or the (second-)best player in the world wouldn't have done it. I just can't see what the not-good-news is.

And on the scoring analysis, it also thinks that the move immediately prior to that by White, which enabled the capture, was a horrible mistake.
  #40  
Old 12-07-2017, 10:39 AM
borschevsky borschevsky is online now
Guest
 
Join Date: Sep 2001
Location: Canada
Posts: 1,865
I've messed around with that position for a while, and it looks like the issue is that white wants to make use of the c4 square with his queen. White is down material and attacking, so he wants to open up the opponent's king, with something like 32.f4. Black is trying to avoid opening lines, so he can try 32...g4. So you could get something like this:

32.f4 g4 33.Qxg4 Bxc3 34.f5 Rxf5 35.Bh6+ Kf7

If the black b-pawn wasn't covering c4, white would have Qc4+. So white tries to insert c4 bxc4 prior to going f4. Obviously there are many other lines to look at as well, but I think that's the idea.

I've read a bit more about this, and some people are pointing out that the hardware used for the Stockfish match may not really have been fair. AlphaZero is running on high-powered custom hardware, while Stockfish wasn't, and was given a relatively small hash table size and no access to tablebases. So maybe the result isn't quite so significant, in terms of the strength of the AlphaZero engine itself. If you put Stockfish itself on similarly-powered hardware, it would probably beat the testing version of Stockfish handily as well.
  #41  
Old 12-07-2017, 11:38 AM
pulykamell pulykamell is offline
Charter Member
 
Join Date: May 2000
Location: SW Side, Chicago
Posts: 41,969
Quote:
Originally Posted by borschevsky View Post
I've read a bit more about this, and some people are pointing out that the hardware used for the Stockfish match may not really have been fair. AlphaZero is running on high-powered custom hardware, while Stockfish wasn't, and was given a relatively small hash table size and no access to tablebases. So maybe the result isn't quite so significant, in terms of the strength of the AlphaZero engine itself. If you put Stockfish itself on similarly-powered hardware, it would probably beat the testing version of Stockfish handily as well.
Yeah, I saw that, but even if Stockfish were somewhat hobbled, I find AlphaZero Chess an impressive result for a self-learning system with no access to any outside inputs other than the chess rulebook.
  #42  
Old 12-07-2017, 10:35 PM
Chronos Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 73,183
Another thought: Stockfish was programmed by human chess masters, and designed to deal with what humans expect to see. Novelty itself is a weapon against such a design.

It wouldn't be the first time something like this has happened: POWs in Vietnam found themselves playing a lot of chess, because it was about the only way they could spend their time. None of them was particularly good at the start, and they knew little about conventional theory, but they ended up learning a lot just by practice. And when they were freed, it took a while for the conventional chess world to adapt to their unconventional techniques.

Well, here again we have someone becoming highly proficient, without contact with conventional theory. Maybe the conventional wisdom (as implemented in Stockfish) just doesn't know how to react, again. Well, not "just": That obviously can't explain all of a hundred-game lossless streak, but it might be part of it.

The logical test to perform would be to take multiple copies of AlphaZero, let them learn chess independently, and then play them against each other to get relative rankings. Would they all be about equally good, or would some of them have developed ideas that surprise even their siblings?
  #43  
Old 12-08-2017, 07:01 AM
N9IWP N9IWP is offline
Charter Member
 
Join Date: Aug 2001
Location: Southeast MN
Posts: 5,695
Related ArsTechnica article: https://arstechnica.com/gaming/2017/...hess-overlord/

Chess24 article: https://chess24.com/en/read/news/dee...-crushes-chess

Quote:
Meanwhile, though, it’s gratifying to see that the computer has justified 100s of years of chess development, since the program, entirely by itself, has ended up playing some of the best known human openings... The graphs are fascinating to study, since you can see how certain openings became popular in the algorithm’s training games – such as the French Defence and the Caro-Kann – before dropping off in popularity as its strength increased.
Brian