At freechess.org’s chess server, there is a “surveybot” compiling data on players’ FIDE ratings and their FICS (Freechess’s server) ratings. I can have it give me a list in which the two sets of values are displayed in parallel columns. At a glance I can see the two are generally within about 200 points of each other, and it seems it is a little more likely for the FIDE rating to be higher than the same person’s FICS rating. (I think.) In my ignorance, that’s all I can say about the list. But after the list, the bot does give the following:
My question is just this: Can someone tell me what the significance of that formula is? What does it say about the relationship between FIDE and FICS ratings, in layman’s terms?
The bot is saying there is a relationship between the two scores. The relationship is not the simple linear kind we are used to; it's a more complicated exponential one. This proposed model has an r^2 of .597, meaning (very roughly speaking) that the model explains about 60% of the variation in FIDE scores by using the FICS scores.
Or, to say it another way: you give me a FICS score, and I can 60% predict what your FIDE score will be using that information.
(These are very rough English ways to state the statistical concept.)
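To put a rough number on "60% predict", here is a small sketch of the arithmetic. It uses only the r^2 of .597 quoted above; the variable names are mine, and it ignores degrees-of-freedom corrections, so treat it as a back-of-the-envelope reading rather than anything the bot reports.

```python
import math

r2 = 0.597  # the r^2 value quoted from the bot's output

# Share of the variation in FIDE ratings the model leaves unexplained.
unexplained = 1 - r2

# Ratio of the typical prediction error (using the model) to the
# typical error you would make by guessing the average FIDE rating
# for everyone. This follows from r^2 = 1 - SS_res / SS_tot.
resid_sd_ratio = math.sqrt(unexplained)

print(f"unexplained share of variation: {unexplained:.3f}")
print(f"error relative to guessing the mean: {resid_sd_ratio:.3f}")
```

In other words, knowing the FICS rating would shrink the typical miss on the FIDE rating by a bit more than a third, not by 60%.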
FIDE ratings are based on strong national and all international events.
Usually there is at least a 4 hour session of play (and 7 hours is more common e.g. in the British Championship and the UK National league).
There will also be a tournament referee, who will submit the results for grading.
Any chess server provides a constant supply of opponents, but new players are given an arbitrary starting rating, nobody can check whether a computer is helping you, and games are usually played at a fast time control. Also, a player can drop a poorly rated username and start again.
So these ratings are pretty rough and I think it is very unwise to try to compare them with FIDE ratings.
To me it seems unwise to allow the considerations you mention to prevent one from carefully, rigorously, working through such a comparison. Using statistics.
It’s an interesting question whether there is any correlation between the two ratings, and I am glad they are doing something to answer that question. As explained above, it appears that what is being discovered is that there is a weak correlation, probably too weak to put too much stock in. This, of course, is what you suspected, and it is what I suspected as well, but now you and I know what before we merely suspected.
Anyway, without putting too much stock in the comparison, I will continue to keep in mind that almost without exception, the FICS score was within 200 points of the FIDE score. That’s something at least.
Of course there is the problem, and I confess I only just now thought of this, that the sample the “surveybot” is taking is completely self-selected. I bet people who try to game the rating system in less-than-honest ways are less likely to submit their info…
60% can be weak or strong depending on context. But nitpicking aside, 60% sounds pretty weak for something as straightforward as this. Glee’s points are well-taken, but to me they sound like they would produce a fairly straightforward set of linear biases.
The missing piece is that we don’t know what models the bot considered. Does it automatically only consider an exponential model, or does it go through other ways of looking at it and pick the best one?
I would suggest throwing the data into Excel and adding a trendline; there's an option to display the r^2. That way you can force a linear model and see what comes out. Just scatterplotting the two columns will let you see visually whether there's a connection.
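If you would rather not use Excel, the same comparison can be sketched in a few lines of Python. Note the assumptions: the rating pairs below are placeholder numbers I made up purely for illustration (they are not the surveybot's data), the function names are mine, and the exponential model is fitted the usual quick way, by running a straight-line fit on the logarithm of the FIDE rating. I don't know how the bot actually fits its model.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predictions):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# HYPOTHETICAL rating pairs, invented for illustration only.
fics = [1450, 1520, 1600, 1680, 1750, 1820, 1900, 1980, 2050, 2150]
fide = [1600, 1580, 1720, 1700, 1850, 1800, 2010, 1950, 2180, 2200]

# Linear model: FIDE = slope * FICS + intercept
slope, intercept = linear_fit(fics, fide)
r2_linear = r_squared(fide, [slope * x + intercept for x in fics])

# Exponential model: FIDE = a * exp(b * FICS), fitted on ln(FIDE)
b, ln_a = linear_fit(fics, [math.log(y) for y in fide])
r2_exp = r_squared(fide, [math.exp(ln_a + b * x) for x in fics])

print(f"linear:      r^2 = {r2_linear:.3f}")
print(f"exponential: r^2 = {r2_exp:.3f}")
```

Swap in the real columns from the bot's list and you can see whether the exponential form actually buys anything over a straight line.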
The correlation between two variables more or less measures the strength of the linear relationship between them, as well as the sign of the slope. R[sup]2[/sup], on the other hand, is a measure of how well a particular model fits observed data.
I generated a set of data in Excel over the interval [0, 5] using the formula y[sub]i[/sub] = x[sub]i[/sub]*(5 - x[sub]i[/sub]) + NORMSINV(RAND())/10. X and Y have a correlation of essentially 0, but the R[sup]2[/sup] for a quadratic regression is .9971.
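For anyone who wants to repeat that experiment outside Excel, here is a rough Python sketch of the same idea. The number of points (201, evenly spaced), the random seed, and all function names are my own assumptions, so the exact figures will differ a little from the Excel run; the polynomial fit is done by solving the normal equations directly, which is fine for a degree this low.

```python
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def solve(a, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    rhs = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(size)]
    return solve(a, rhs)

def r_squared(xs, ys, coeffs):
    """1 - SS_res / SS_tot for a fitted polynomial."""
    my = sum(ys) / len(ys)
    preds = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

random.seed(0)  # arbitrary seed, only so the run is repeatable
xs = [5 * i / 200 for i in range(201)]                     # evenly spaced on [0, 5]
ys = [x * (5 - x) + random.gauss(0, 1) / 10 for x in xs]   # parabola plus small noise

r = pearson_r(xs, ys)
r2_linear = r_squared(xs, ys, polyfit(xs, ys, 1))
r2_quad = r_squared(xs, ys, polyfit(xs, ys, 2))

print(f"correlation of X and Y: {r:.4f}")
print(f"R^2, straight line:     {r2_linear:.4f}")
print(f"R^2, quadratic:         {r2_quad:.4f}")
```

The straight-line fit is nearly useless here while the quadratic fit is almost perfect, which is the whole point: a near-zero correlation only rules out a linear relationship, not a relationship.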
Correlation also can’t be used to tell you anything about the relationship between three or more variables, so you’ll need to use R[sup]2[/sup] for models with multiple input and output variables.