Confessions of a chess novice: Chess Tactics Server: Some Data

Monday, February 12, 2007

Chess Tactics Server: Some Data

Originally posted 12/05

I couldn't sleep, so I thought I'd do a little analysis of the data at the Chess Tactics Server (CTS). CTS has engendered more than a little interest amongst the Knights, especially in our intrepid Temposchlucker, whose mania in problem solving can only be described as insane. I got the data from all members who did more than 100 problems in the last month, and plotted rating versus number of problems solved. The scatter plot is here, where the x-axis is number of problems solved in the last month (on a logarithmic scale: most people have done a lot fewer than 1000 problems) and the y-axis is the rating:

The blue line is the best linear fit (slightly curved upward toward the right because of the logarithmic scale on the x-axis), the red dot is Tempo's rating, and the green dot is Jadoube's rating. Note that Tempo is right on the line: he is about at the average for people who have done ~5000 problems in the last month, while Jadoube is a bit below the line, as he is a bit below the average for people who have done ~2000 problems in the last month.

A couple of things to notice. The correlation coefficient of 0.09 is small but statistically significant (p<.05), and the slope of the line is ~0.0132 rating points per problem. Interpreting this is difficult, but it suggests that ratings are not strongly influenced by how many problems people do. This is pretty clear by inspection of the data, which shows no obvious upward trend. One possible interpretation is that for every thousand problems you do, you can expect to improve by about 13 rating points (0.0132 x 1000). This is considerably higher than Tempo's fun estimate of 2.4 points/1000 problems, which he hopes to gain in his training. On the other hand, interpreting such graphs is always problematic: it is likely that solving more problems is correlated with higher motivation to improve, as well as with other chess efforts which CTS doesn't measure. Remember, correlation does not equal causation: I could do 30000 problems in a month while watching TV, and have a rating of 10.
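For anyone who wants to try this kind of fit themselves, here is a minimal sketch in Python of how a least-squares slope and a correlation coefficient can be computed. It is not the code used for the plot above: the function names `fit_line` and `points_per_thousand` are hypothetical, and the five data points are made up purely so the script runs. Only the slope of 0.0132 comes from the analysis in this post.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x, plus Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def points_per_thousand(slope):
    """Convert a slope in rating points per problem to points per 1000 problems."""
    return slope * 1000


# Made-up (problems solved, rating) pairs, NOT the CTS data.
problems = [150, 400, 900, 2000, 5000]
ratings = [1390, 1450, 1380, 1460, 1470]

slope, intercept, r = fit_line(problems, ratings)
print(f"toy slope: {slope:.4f} points/problem, toy r: {r:.2f}")

# The slope reported in the post, rescaled to points per 1000 problems.
print(f"{points_per_thousand(0.0132):.1f} points per 1000 problems")
```

The last line is just the arithmetic behind the "13 rating points" figure: a slope per problem multiplied by 1000.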

At any rate, I just geeked out during a bout of insomnia.

20 Comments:

Pawnsensei said...: I hope you don't mind swearing on your blog but that was bad a**.

PS; 12/17/2005 07:05:00 AM
takchess said...: Man, you are a geek.... 8). Speaking of geeky things, in January I am taking a business trip up to Bar Harbor, Maine to see all the mice.

Glad you like the opening book. I have found that to be a good source just to thumb through and get ideas of what good moves might look like for various openings. Learn by osmosis. This is one of my few chess books that you can read before going to sleep, without a board and without devoting a lot of energy to it.; 12/17/2005 07:11:00 AM
Unknown said...: So, this is one of the times that anecdote appears to match empirical data. . .

Well done.; 12/17/2005 04:24:00 PM
Ed Doyle said...: This is what we need: real analysis.

Can I cite you in my PhD ?

Ed; 12/17/2005 05:15:00 PM
Ed Doyle said...: Actually this is one of my favourite blogs.

However, I'm intrigued to know what kind of shows you would need to watch to solve 30,000 problems and end up with a score of 10.

Jerry Springer comes to mind.

Ed; 12/17/2005 05:21:00 PM
Temposchlucker said...: When I suffer from insomnia I solve some chess problems. . .; 12/17/2005 07:20:00 PM
Blue Devil Knight said...: Thanks for the kind comments. It was a lot of fun to analyze data when it had nothing to do with my profession!; 12/17/2005 11:49:00 PM
Ed Doyle said...: Actually, all joking apart, I think this analysis is very worthwhile. My reading of the numbers is that improvement is not correlated in absolute terms with the number of problems attempted (should this be "solved"? I'm a little confused about what was being measured). This would appear, at least superficially, to disprove MDLM. However, as CTS generates problems (or does it have a fixed library??), one would expect the pattern recognition to kick in after a certain point, and the rate of improvement to show some kind of increase. This does not appear to be borne out by the data, notwithstanding the higher level of effort required to increase ratings by the same absolute number over time.

Blue Devil, you are bringing real thinking to this problem ..; 12/18/2005 05:12:00 AM
Jeff said...: Notice the acceleration of the curve toward the outermost end of the graph, where there is significantly less data? The rating hardly changes from 100 to 1000 problems, then changes slightly between 1000 and 6000 problems. Above 6000 problems it accelerates more rapidly.

The question is: is that an actual trend, or is it a product of sparse data?; 12/18/2005 09:55:00 AM
Blue Devil Knight said...: A couple of things.

First, the line is actually a straight line with the same slope along the whole range of the independent variable (number of problems attempted in the last month). As I mentioned in the original post, the line appears curved because I have transformed the data to a logarithmic coordinate solely because it improves visualization (without it, you just can't see all the data for the low-frequency users). For a discussion of this topic, see this Wikipedia article.
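To make the log-axis point concrete, here is a small sketch. The slope is the ~0.0132 from the post; the intercept of 1400 and the function name `rating_on_line` are assumptions chosen only for illustration. Each step from 100 to 1000 to 10000 problems covers the same distance on a logarithmic x-axis, yet the straight line rises ten times as much over the second step as over the first, which is why it looks like it bends upward on the plot.

```python
def rating_on_line(problems, intercept=1400.0, slope=0.0132):
    """Rating predicted by a straight line (intercept is a made-up value)."""
    return intercept + slope * problems


# Equal steps on a log axis: each x is 10 times the previous one.
decades = [100, 1000, 10000]
ratings = [rating_on_line(x) for x in decades]

# Rating gained over each equal-width step on the log axis.
gains = [later - earlier for earlier, later in zip(ratings, ratings[1:])]

for x, y in zip(decades, ratings):
    print(f"{x:>6} problems -> rating {y:.1f}")
print("gain per log-axis step:", [round(g, 2) for g in gains])
```

Plotted against a linear x-axis, the same three points would fall on one straight line.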

gxh7# makes the good point that there is clearly way more data for <1000 problems attempted. The algorithm to generate the line (roughly) minimizes the average distance between the line and all of the data points. Hence, the low-volume solvers will actually have a bigger effect on the slope of the line than the >1000 problems attempted group, so any bias will be toward trends exhibited in the low-volume group.

As for showing that MDLM is wrong, I think that cannot be inferred. This is mainly because this is really not even an evaluation of an MDLM program, but of CTS performance. Technically, the MDLM program involves repeating the exact same problems over and over again. Also, in the early MDLM circles you spend as much as 10 minutes on an individual problem while in CTS going slowly is a rating killer.

My bias is that a good way to learn is via repetition of the same problems, at first really taking the time to calculate. Then, later, problems that once took lots of calculation to solve become problems you can automatically recognize. With the CTS approach, you are always in pattern recognition mode, so it tests what you already know, while in MDLM you build up pattern recognition ability.

I, personally, hardly ever do CTS, because the MDLM approach better matches my style of learning. I like to start slow, and then by repetition acquire the ability to do problems quickly (just like I learned the multiplication tables, to use Heisman's analogy).

Caveat emptor: I am a total freakin' patzer and after 6 months of playing chess still suck. Just look at my numbers. (Though, in my defense I haven't started my circles yet, have improved greatly (e.g., added 200 points at Playsite in 6 months: I think ICC is just really hard for total beginners like me so my rating is still bouncing around at the bottom there), and I am still working through a chess tutor program for the first time, very slowly: see my post on the Divine Tragedy for details of what I'm doing).

Also, remember the plot is measuring performance in CTS, not OTB play, as Tempo rightly takes pains to differentiate in his post on ratings in CTS. The relation between the world of CTS and the world of OTB is complicated.; 12/18/2005 02:33:00 PM
Temposchlucker said...: The relation between the world of CTS and the world of OTB is complicated.

At the moment we don't even have a clear picture of the relation between working with CTS and improving at CTS.
That's the first thing that has to be understood well. Your post is helpful in this specific area.

The point is that the problemset is so big. When my rating is stable, my problemwindow of 10,000 problems is stable, and I can repeat all 10,000 problems 7 times. But at the moment things go well and my rating creeps up, my problemwindow immediately starts to shift and I get new problems. Only when I have done the whole problemset of all 23,000 problems 7 times, I can expect a boost in performance. But how effective will such exercise be?
How many words do you remember when you have read a vocabulary of 23,000 words of a foreign language 7 times?
Of course it will not be very effective.
So maybe it is necessary to repeat it 10 or 15 times before a boost will take place.
And if that all is true and a boost at CTS has taken place, we have no idea how it is going to work out in OTB play.
So indeed no sane man will undertake such an uncertain effort.

Ok, back to CTS now.:); 12/18/2005 06:50:00 PM
CelticDeath said...: There are a couple things I can see that would improve the analysis, but which would be very difficult from a data collection point of view:

1) Compare OTB and/or online rating improvement vs. # CTS problems solved per month

2) Stratify the participants in CTS by methodology/reason they use CTS (e.g., just for fun, preparation for tournament, tactical awareness improvement, etc.) Then, you could cull the ones who just do CTS for fun from the herd.; 12/19/2005 12:14:00 PM
Pale Morning Dun - Errant Knight de la Maza said...: WHOAAAAAAAAAA WAAAHHHHH WEEEEEEEEEEEEEEEEEEEEEEEEEEEEE!!!!!!!!!!!!!!!!!!!!

My man BDK has been busy! In the words of Chevy Chase, impersonating President Gerald Ford during a debate on Saturday Night Live:

"I was under the impression there would be no math."

Almost had a seizure looking at this thing. Cool. Where's my dot?; 12/20/2005 04:14:00 PM
Blue Devil Knight said...: Only people that had 100 puzzles attempted IN THE LAST MONTH are there. I looked for Knight names, but chances are I missed a couple. If anyone else was missed that does over 100 a month there, please let me know and I can redo it at some point in the future.

Also, where the hell did my sidebar go?!!; 12/21/2005 11:31:00 AM
phorku said...: I have been doing at least 100 a week for a couple months now.

I DEMAND MY DOT!!

Just kidding. A more interesting graph would be between CTS problems and ratings produced from game play.

My CTS rating has not really improved but my USCF rating is up about 260 points.; 2/12/2007 03:41:00 PM
Justin said...: hi,

Just saw ur chess blog. It's really interesting.
I just started playing chess again and like you I'm trying to get my rating up.
I also have a chess blog...it pretty basic and simple compared to your's.
its: chess306.blogspot.com

anyway I was hoping we could play a game together....?
Since I live in Australia we will hav to play over the net....
I play on Yahoo and also on www.chessmaniac.com under the username Maxwell843....
but I'm happy to play wherever you want....

we could just play by email...???
they say correspondance chess improves your game!

lookin forward to hearin back from u
talk soon

thanks
Justin
Nowra, NSW, Australia

my email is jmm5656@hotmail.com; 2/12/2007 11:39:00 PM
Nezha said...: Uhm, i hope you dont mind too.. but huh!? hehe.. the discussion and the comments are way over my head. Way way way waaaayy over my head.. =<

and oh yeah - Where is my dot?; 2/13/2007 02:00:00 AM
The Quacks of Life said...: I've never understood the point of those esoteric problems. Real game situations I can take.

Sam Loyd is a notable exception; his problems were fun.

PS - You just reek geekiness ;); 2/13/2007 12:28:00 PM
rockyrook said...: WRT CelticDeath's comment ... for what it's worth, I am kind of keeping stats on CTS and my FICS rating. My plan is to post the spreadsheet with the data on my tactics blog (sirrockyrook) once a week.

I'm working on memorization with the CT-ART problems every night and then doing about 50 problems on CTS ... just for practice. If I keep up with 50 problems a weeknight, then I'll be doing close to 1000 problems a month on CTS.

As for playing on FICS ... I'll try to play a certain number of blitz games a week as well as at least one standard game a week (OCL or other).

Maybe I can be somewhat of a guinea pig for tracking data on CT-ART, CTS and FICS.; 2/14/2007 03:13:00 PM
Edukator said...: As a struggling graduate student I have to say admiringly that your regression analysis is superb. As one who loves regression over ANOVA and whatnot...I also admire the fact you saw such an interesting opportunity to use the method!! I never saw it like this before.; 2/18/2007 09:27:00 PM

Email me:
bluedevil
-d.o.t. here-
knight
~a.t. symbol~
yahoo
-another d.o.t-
the business suffix
