22-06-2013, 03:18 AM
I’ve no issue with the ‘squad’ idea for selection either way, but the proposed ‘re-inventing’ of the rating system (as coined by Phil, I think) is a worry!
Some good points already made in the posts here. What basis is there for only considering the highest tournament results and ignoring the low ones? This seems to defeat the purpose of ratings. Is it the role of selectors to weigh up factors outwith the grading data, or to try to ‘improve’ the rating system itself by trying to squeeze more information out of the data? I thought it was the former.
One new point though, about those TPRs – which I always thought were just for fun, or should be. This paragraph illustrates the thinking in the parts of the ‘squad’ motion that discuss ratings:
“TPR: Tournament Performance Ranking is a measure of a players performance over the number of games played at a particular tournament. The TPR can reflect the current form of a player more accurately than the players live grade. This indicator is a valuable indication of a player’s progress from one tournament to another.”
http://www.chessscotland.com/downloads/2013additional.zip
These statements are not correct. The TPR doesn’t ‘measure’ performance. The change in rating is the indicator of progress over the period; individual tournament results jump up and down (for various reasons), and such variations do not indicate progress. That’s looking too closely.
To be fair, the actual point being made in the cited paragraph is about the further effect of games where the rating difference is outwith the 400 gap – but the motion does propose and discuss “the use of a combination of live grade, Scottish congress TPR and selector assessment..”
I’m no expert in the rating system, and I’m aware there are one or two areas where I’m not fully conversant with what actually happens – but this philosophy does seem to be at odds with the principles of the system, or rather with its statistical basis.
As I understand it, the mathematical meaning of a tournament performance rating is that it’s the rating which would, IF you already had that rating, make your tournament result come out right on average. Or something like that.
This – even although I like to think that way myself (if I have a good one, that is :-) ) – is NOT the same as the level of performance that is most likely given a player’s results! In fact, for a particularly good or bad tournament, especially one of shorter duration, it’s usually rather unlikely. For example, in a single game, if one 2000-rated player beats another, the ‘performance ratings’ come out at 2400 for the winner and 1600 for the loser. Not likely! Or if they played four games and one player won 3-1, the TPRs would be about 2200 and 1800 – still less likely than that one player performed slightly above 2000 and the other slightly below. As 400 is the difference at which the system predicts almost a 100% score, why should it also be taken as the actual difference when two players with the same rating play?
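To make that concrete, here’s a minimal Python sketch – my own illustration, assuming the standard Elo expectancy curve rather than whatever the grading system actually implements. It solves for the rating that would make the expected score equal the actual score, alongside the common linear rule of thumb, and reproduces the numbers above:

```python
# A minimal sketch, assuming the standard Elo expectancy curve.
# My own illustration -- not Chess Scotland's official TPR formula.

def expected_score(rating, opp_ratings):
    """Expected total score for `rating` against the listed opponents."""
    return sum(1.0 / (1.0 + 10 ** ((opp - rating) / 400.0)) for opp in opp_ratings)

def tpr_exact(score, opp_ratings, lo=0.0, hi=4000.0):
    """The rating that makes the actual score 'right on average':
    bisect for expected_score(r) == score. A 100% (or 0%) score has
    no finite solution, so it simply clamps to the search bounds."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if expected_score(mid, opp_ratings) < score:
            lo = mid  # guess too low: expected score falls short of the actual
        else:
            hi = mid
    return (lo + hi) / 2.0

def tpr_linear(score, opp_ratings):
    """The common rule of thumb: average opponent rating + 400*(W-L)/N."""
    n = len(opp_ratings)
    return sum(opp_ratings) / n + 400.0 * (2.0 * score - n) / n

# The four-game example: scoring 3/4 against 2000-rated opposition.
print(tpr_linear(3, [2000] * 4))        # 2200.0, as quoted above
print(round(tpr_exact(3, [2000] * 4)))  # ~2191 on the exact curve

# The single-game example: beating one 2000-rated opponent.
print(tpr_linear(1, [2000]))            # 2400.0 for the winner
```

Note the exact version has no finite answer for a 100% score – the expectancy curve never quite reaches 1 – which is exactly why single-game ‘performances’ like 2400 vs 1600 are convention rather than measurement.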
I'm not criticizing what selectors do. As various people have said, it is right to consider other factors on top of the rating – but that should mean factors that are not already included in the data!? Selectors tend to be quite good at that. I appreciate part of the rationale might be trying to firm up policy, make selection more objective, etc. But there is sound statistical reasoning for preferring at least 30 games! (Or nearest offer…) Scrutinizing sub-sequences of five games or so would surely just undermine it!? A rough simulation below shows how much of a five-game ‘performance’ is pure noise.
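Here is that simulation – again my own sketch, treating each game between equally rated players as a plain coin flip and ignoring draws, which is obviously a simplification:

```python
# Sketch of the sample-size point: a player whose true strength exactly
# matches 2000-rated opposition, so each game is a 50/50 coin flip
# (draws ignored for simplicity -- a deliberate simplification).
import random
import statistics

def tpr_spread(n_games, trials=20000):
    """Standard deviation of the linear TPR over many simulated events."""
    tprs = []
    for _ in range(trials):
        score = sum(random.random() < 0.5 for _ in range(n_games))
        tprs.append(2000 + 400.0 * (2 * score - n_games) / n_games)
    return statistics.stdev(tprs)

print(tpr_spread(5))   # ~179 rating points of pure noise (400 / sqrt(5))
print(tpr_spread(30))  # ~73 points -- still noisy, but far tighter
```

The spread of the linear TPR goes as 400/√N, so a five-game event carries nearly two and a half times the noise of a thirty-game sample – luck masquerading as ‘form’.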
Cheers