Tuesday, February 20, 2007

Initiative/flexibility: how the computer should help

When going over games with a computer, flexibility is one factor on which the engine gives no feedback, yet it is extremely important for practical human play. Given two moves to which Fritz gives the same evaluation, if one leads to an extremely sharp position for the opponent and the other leads to an extremely flexible position for the opponent, then the former move is better (all else being equal). Conversely, if one of the moves leads to an extremely sharp variation for me, so that I have to walk a tightrope for the next 20 moves to end up with the same evaluation, that is much worse than a move which leaves me with many options in the follow-up. The fewer options you have, the more likely it is that your opponent will be able to take back the initiative.

Fritz and the other programs should give us the option of displaying a flexibility measure in their evaluation function. There are many ways it could work, any of them being trivial to implement.

Most helpful for humans would be the response-evaluation graph. For each move being evaluated, show a graph of the evaluation function for the responses to that move (on the x-axis is each response, with the best on the left, and on the y-axis is the evaluation after that response). Recall in interpreting such graphs that the best response leaves the evaluation unchanged, so the response-evaluation graph will have the same value as the present position where it intercepts the y-axis, and then will only go up (if it is presently white's move) or down (if it is black's move). If the graph changes sharply close to the y-axis, then you are putting your opponent on a tightrope and should weight the move more highly than a move that is given a similar numerical evaluation but has a flat response-evaluation graph. Programmers might use the derivative of the graph to help them tune algorithms against human opponents.
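As a rough illustration of the response-evaluation graph, here is a minimal Python sketch. It does not talk to Fritz or any real engine: the evaluations are made-up numbers (in pawns, from White's point of view), and the function names `response_curve` and `initial_slope` are hypothetical, chosen only for this example. The "derivative" is approximated as the average change per step over the first few replies.

```python
def response_curve(response_evals, mover_is_white=True):
    """Order the opponent's replies from best to worst for the opponent.

    response_evals maps each reply to the evaluation after that reply,
    in pawns from White's point of view (hypothetical input data).
    """
    # If White just moved, Black replies, and Black's best reply has the lowest score.
    # If Black just moved, White replies, and White's best reply has the highest score.
    return sorted(response_evals.items(), key=lambda kv: kv[1],
                  reverse=not mover_is_white)


def initial_slope(curve, k=3):
    """Average change in evaluation per step over the first k steps of the curve."""
    evals = [score for _, score in curve[:k + 1]]
    if len(evals) < 2:
        return 0.0
    return (evals[-1] - evals[0]) / (len(evals) - 1)


# Two candidate moves for White with the same evaluation (0.3) but different curves.
sharp = {"Kh8": 0.3, "Rf7": 2.1, "h6": 3.5, "Qe7": 4.0}    # opponent on a tightrope
flat = {"Nf6": 0.3, "d6": 0.35, "Be7": 0.4, "h6": 0.45}    # opponent has easy choices

for name, replies in (("sharp", sharp), ("flat", flat)):
    curve = response_curve(replies, mover_is_white=True)
    print(name, curve, "slope:", round(initial_slope(curve), 2))
```

A steep initial slope means the opponent's second- and third-best replies are already much worse than the best one, which is the tightrope situation described above.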

Another possibility is the more computer-friendly flexibility measure F. First, set some threshold value, call it T. In addition to giving the numerical evaluation of each possible move, Fritz should tell you how many follow-up moves by your opponent stay within T of the evaluation of that move. Call that the flexibility measure F. For example, assume you are playing white and the present position is given a 0.3, and let T=0.5. Say the best move Fritz finds is given the evaluation of 0.3 but F=10 (your opponent has ten responses that keep the evaluation between 0.3 and 0.8). Your second-best move is given 0.25 but F=2: that is, your opponent has only two moves that stay between 0.25 and 0.75!
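The count itself is easy to sketch. The Python below is only an illustration under stated assumptions: the evaluations are invented numbers rather than real Fritz output, the name `flexibility` is hypothetical, and, following the worked example above, the window is measured from the evaluation of the candidate move.

```python
def flexibility(move_eval, response_evals, threshold=0.5):
    """Count opponent replies whose evaluation stays within threshold of move_eval.

    move_eval      -- engine evaluation of the candidate move (pawns)
    response_evals -- evaluations after each opponent reply (hypothetical data)
    threshold      -- the value T from the text
    """
    return sum(1 for score in response_evals
               if abs(score - move_eval) <= threshold)


T = 0.5

# Best move: evaluation 0.3, and ten replies keep the opponent inside the window.
best_replies = [0.3, 0.32, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.7, 0.75, 1.2, 2.0]
# Second-best move: evaluation 0.25, but only two replies stay inside the window.
second_replies = [0.25, 0.6, 1.4, 2.0, 3.1]

print("best move:   F =", flexibility(0.3, best_replies, T))
print("second best: F =", flexibility(0.25, second_replies, T))
```

With these made-up numbers the first move gets F=10 and the second F=2, matching the example: the second move scores slightly lower but leaves the opponent far fewer safe replies.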

Such flexibility measures would be helpful in many ways. When building an opening repertoire, pick moves for which F is very low for the opponent. Also, it will give you an idea of which opponent responses you need to book up on: those moves your opponent makes that leave you with little flexibility. All of the same could be said for evaluating middlegame and endgame positions. This could help explain why good humans often disagree with computers, and why we can't blindly follow the computer's numerical evaluation.

13 Comments:

Blogger katar said...

dont listen to Tempo.

i posted a response at tempo's blog.

nice pic.

2/20/2007 02:40:00 PM  
Blogger Temposchlucker said...

I would be interested in flexibility too as evaluation information. Maybe a suggestion for the authors of Fritz.

2/20/2007 05:19:00 PM  
Blogger Blue Devil Knight said...

I wonder who to contact? Anyone know an email address?

2/20/2007 06:09:00 PM  
Blogger Temposchlucker said...

To my knowledge, Fritz is sold by ChessBase.
http://www.chessbase.com/contact/index.asp

2/20/2007 06:28:00 PM  
Blogger Unknown said...

You are aware that good computer chess programs already do all the things you are talking about?

2/22/2007 10:17:00 AM  
Blogger Blue Devil Knight said...

J'adoube: Yes, that's why I spent 20 minutes composing a post about how much I would like it. :P

Seriously, though, if they do such things they are hidden away. How do you do it in Fritz, for instance?

Note I am NOT talking about showing all the possible present moves (infinite analysis in fritz) and their evaluation. If that's what you are talking about, you've misunderstood my post.

2/22/2007 11:15:00 AM  
Blogger Unknown said...

Just because you've never seen a million dollars before doesn't mean it doesn't exist.

The types of evaluations you are talking about are part of the algorithms. That's the reason the machines play so well that humans, even the best players, can barely keep up with them - the programmers have done mountains of research about how GMs evaluate a position and then encoded it.

Although the semantics of the evaluation are different than what you describe, the evaluation you want to see is part of the chess engine.

If it wasn't, then all humans would have to do is exploit this and win all the time.

Unfortunately, most of the thinking about chess engines these days is rooted in criticisms from 10-15 years ago - but they don't apply anymore.

Probably the main reason you don't see all the evaluation criteria is because it's proprietary. Why let your competition see how you figure out a position?

2/22/2007 11:50:00 PM  
Blogger Unknown said...

Eric,

We can tell from the moves a chess engine makes that it uses some measure of your evaluation criteria (flexibility).

Good luck with your request to Chessbase, anyway. My guess is that they have a long list of items, but if there is a market demand for it, then it will get done.

2/23/2007 10:44:00 AM  
Blogger Blue Devil Knight said...

J'adoube:

Note my main point was that it would help if the programs displayed a flexibility measure, and I see you agree with me. I've emailed the Fritz people about it. This discussion about whether Fritz secretly uses such a measure is kind of tangential, but since it is interesting I'll pursue it.

We don't know how (or if) the computer uses flexibility, or combines it with other aspects of the evaluation function, so we need to be wary of blindly accepting the numbers spit out by a computer.

Clearly there are times when good humans would prefer one move over that suggested by a program solely because the computer isn't sufficiently weighing (or perhaps even considering) flexibility in its evaluation function.

You claim that a computer that didn't use flexibility could be exploited to be beaten by a human. This is a dubious claim. Computers excel in sharp, inflexible positions and would have no need to avoid them in the face of human players who tend to be much worse in such positions.

I would be curious to know if there is anything written about flexibility measures in computer evaluation functions.

2/23/2007 10:57:00 AM  
Blogger Blue Devil Knight said...

J'adoube replied to me before I responded!

:)

Joke: he responded while I was editing my original response, but luckily it wasn't all that different.

If it does use flexibility, it often doesn't weight it enough to be useful for me. For instance, if it sees a mate in 20, that move will make it to the top of the list no matter what, but in practice it is unlikely I would find that mate even if I made the first move it suggested. In real games, I typically take much longer to mate, simplifying the position to a more easily won endgame. The computer never does this if it sees mate.

2/23/2007 01:16:00 PM  
Blogger katar said...

relevant article in PDF

I disagree with Jadoube here, for the same reasons already outlined by BDK. (so i won't repeat them)

I would only add that human interpretation of an engine's raw output might obviate the need for enhanced engine settings. One can already get a sense of the "flexibility" and/or practical (as opposed to objective/theoretical) merits of a position by simply noting:

1) the difference between evaluations of the 1st and 2nd candidate moves

2) the "obviousness" of the 'best' move. Is a human likely to find the best move?

3) the "naturalness" of the losing moves. Is a human likely to play a losing move?

I feel pretty comfortable doing the above #1-3, and i think i can do it at least as well as some algorithm. As a human, i am a far better judge of #2 and #3 than a computer would be.

2/23/2007 05:43:00 PM  
Blogger katar said...

also, note that my #2 and #3 are essential variables, otherwise checks and captures would be unduly favored-- even when they amount to nothing--- simply b/c they limit opp's choice of replies.

2/23/2007 05:49:00 PM  
Blogger Sancho Pawnza said...

BDK,
This is why I love Chessbase 9. Load a position or game from scratch, hit the "Reference Tab" sit back and enjoy. It will show the stats (based on the selected database of course) of the move in question and the replies made.
Number of games, Scoring percentage, Last played, Highest ELO, Best Players, and Frequent Players.
From there just a simple right-click, and "Opening Report" will supply you with the "Moves and Plans" along with "Critical Lines".

I look at this stuff before I turn on Fritz, as I could give a flying fig as to whether or not Fritz thinks one line is better by .07 in its evaluation. It is more important to play through the games and see if the ideas appeal to you.
Ask yourself is this something that I would be comfortable playing, or do the moves seem alien.
Don't be scared of "alien" moves as sometimes they open doors into new understanding.

There have been some Kasparov games where I did not know which way was up, simply due to the fact he would reverse the direction of his pieces and then reroute them.
I have a hard enough time bringing pieces forward. :)

2/24/2007 10:35:00 PM  
