Entry Summaries

The attached mail is the collection of write-ups I received. As usual they are mostly unedited (only personal POTM-master insults were removed). Write-ups are provided in order of finish (more or less).

            SplitMoo    24 (  7/  9/  8)   gC Frederic vanderPlancke
SplitMoo submitted by Frederic van der Plancke at fvdp@decis.be
++ Please summarize the algorithm used by SplitMoo

The first two guesses are hardcoded; they were computed by a variant
of Mad_Cow_Cheating (a DICTIONARY ENTRY - see below).

For the remaining guesses:
for 4 seconds SplitMoo tries to find all solutions (words that are still
possible according to the outcomes of the previous guesses). If it does not
find all of them, for the next 4 seconds it tries to come up with a
random sample of the solution space. (Ideally this sample should be unbiased,
but in my program it is not, because I think that would be too expensive.)
Each solution is assigned an "englishness" score representing the likelihood
of its being an English word, based on the probability of each pair of
consecutive letters in my dictionary.

Then each solution found is evaluated against the solution sample (part of
the computation is done during the search). The heuristic aims at minimizing
the expected number of guesses needed to find the word, by
examining how each candidate splits the sample (hence the name "SplitMoo").

For the last 2 seconds, it searches for an improved guess by examining
mutations of the previous candidates. This part plays a really important
role! Doubling its running time was probably one key to my success.
(It is important for words like "STAT" (doesn't typecheck but I did test it anyway):
my program quickly finds that the word must be "S?A?", but then many
possibilities remain for the two remaining letters; if it checked only guesses
of the form "S?A?" it would lose time. Of course "englishness" helps as well
in this case.)
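
The sample-splitting heuristic can be sketched as follows (a minimal
illustration, not SplitMoo's actual code; the hits/misses convention and
the expected-remaining-size measure here are assumptions):

```python
from collections import Counter

def moo_score(guess, target):
    # hits: same letter, same position; misses: guess letters that do
    # not appear in the target at all (counted with multiplicity)
    hits = sum(g == t for g, t in zip(guess, target))
    common = sum((Counter(guess) & Counter(target)).values())
    return hits, len(guess) - common

def split_value(guess, sample):
    # Partition the sample by the answer each word would produce,
    # then return the expected size of the surviving part: a guess
    # that splits the sample evenly scores lower (better).
    parts = Counter(moo_score(guess, w) for w in sample)
    return sum(s * s for s in parts.values()) / len(sample)

sample = ["SPAT", "STAB", "SLAM", "SCAT", "SWAT"]
best = min(sample, key=lambda g: split_value(g, sample))
```

Minimizing the expected remaining-sample size is one concrete way to
"minimize the expectation of the number of guesses"; SplitMoo's real
heuristic is more elaborate.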

--- in more detail

The most involved part is the search for solutions. 
I wrote a (cut-down) constraint propagation engine for that purpose... 
Not spending too much source code on it was an interesting challenge!

My program essentially maintains the following information during the whole
search for a word:
- set of letters that are possibly/surely present in the solution
- for each i, set of possible/sure letters at position i
- for each guess, set of positions for which it is possible/sure we have a hit
  (this is redundant with the previous sets actually)
- "linear constraints", based on the number of misses for each guess

The engine reduces these sets according to the outcomes of the previous guesses
(before searching for any solution) and the previously instantiated letters
(during the search). It propagates simple deductions to do so.
("Simple" does not really mean "trivial" though, as witnessed by
my classes CharLinCtr and LetterUnionCtr!)

The search uses heuristics so that it first places letters that are needed,
according to the constraints: letters that are surely in the solution, 
but not yet known to be at any position; as a second choice,
other letters that can help reach the required number of non-misses in a guess.

++ Did SplitMoo use any letter frequency tables?  Where did you get them?

SplitMoo uses a table of the likelihood of every sequence of two letters
(including spaces after or before the word, to take into account the
likelihood of the initial and final letters of the word).
They were extracted from my dictionary.
Where I got this dictionary is explained in the "Mad_Cow_Cheating" entry below.
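
The two-letter "englishness" table can be sketched like this (the boundary
markers and the +1 smoothing are illustrative assumptions, not the exact
scheme used):

```python
import math
from collections import Counter

def bigram_table(words):
    # Count consecutive letter pairs, with '^' and '$' marking the
    # start and end of a word, so initial/final letters count too.
    counts = Counter()
    for w in words:
        padded = "^" + w + "$"
        counts.update(padded[i:i + 2] for i in range(len(padded) - 1))
    return counts

def englishness(word, counts):
    # Sum of log-counts of each pair (+1 smoothing for unseen pairs);
    # a higher total means the word looks more English-like.
    padded = "^" + word + "$"
    return sum(math.log(counts[padded[i:i + 2]] + 1)
               for i in range(len(padded) - 1))

table = bigram_table(["STAT", "STAR", "SEAT", "SLAB"])
```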

++ If I had it to do all over - here's what I would try ....

Most importantly, I would share my time more evenly among the different
parts of my design, and keep much more time for developing the
guess-generation algorithm and for overall tuning.
Two important parts that make my program as good as it is were added in the
last 24 hours: englishness measure, and the search for more-or-less random
alternative guesses.
I took almost no time for real tuning (although I tested PINEAPPLE with
every version of my program). It is possible that single tuning decisions
could improve my results even more. I didn't even try spending more than
2 seconds in the final phase, even though moving from 1 sec. to 2 sec.
improved my results a lot! The deadline was close at the time
and I was getting tired, with a wish to finish... (I did not realize I
had _that_ much of a chance to win!)

Other remarks:
* Perhaps the constraint propagation engine is overkill. Perhaps not. It surely
helped with maintainability.
I could perhaps have tried to be less brute-force-ish in my approach to finding
solutions.
* I should develop the last-two-seconds search for alternative guesses further;
maybe by introducing a more sophisticated genetic algorithm 
(like the one in Mad_Cow_Cheating). 

Two weeks before the deadline I tried a purely genetic approach. After much
refinement, it was able to find PINEAPPLE in 14 to... 36 guesses, so I
gave up the idea... It was based on SplitMoo's design but the task of
figuring out a solution sample was handled by a genetic program. 
The problem then is to keep this sample even and not concentrated around one 
word or family of words; especially when no word matching all constraints
is known.


I live and work in Brussels, Belgium. 
My main work project is a program for automated staff planning.

For fun: drawing, swimming, writing programs (my next big "home" project is a 
MineSweeper solver, for which I have good ideas), reading/doing some maths...

For my innermost secret, have a look at the "Mad_Cow_Cheating" 
answers... ;-)
[Avec un clin d'oeil a Caroline N, salut si par hasard tu me lis !]

POTM ideas:
Solving MineSweeper (that game that ships with Windows) or a variant
thereof (like allowing the player to continue the game after (s)he has
hit a bomb)... I can stay off-contest if you think it's fairer... (but
I'd prefer to stay in !)

     SchrodingersCow    25 (  7/  8/ 10)   gC William Little
SchrodingersCow submitted by William Little 
++ Please summarize the algorithm used by SchrodingersCow

The algorithm has 3 phases...

Generate a population, the members of which each match all of the
information gathered so far about the target word.  These members are
generated by starting with a random sequence of letters, and then applying
mutations.  The mutant is then compared to the original, and replaces it
unless it gets a higher (worse) score from the fitness function.  This
process continues for 2000 cycles, or until the score reaches zero.  If
zero is successfully reached, then the word is satisfactory and is added
to the population; otherwise it is discarded to avoid getting trapped in a
local minimum.

The fitness function determines the candidate's score by calculating the
number of hits and misses that would have been returned for all the
guesses made thus far if the target word were the candidate.  It totals
the absolute deviation from the actual values, and unless desperation has
kicked in, it also adds one for each letter triplet in the candidate which
is not listed as viable in the triplets table.
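
The fitness function described above might look like this (a hedged
sketch; the hits/misses convention and the `viable_triplets` set are
assumptions, not taken from the actual source):

```python
from collections import Counter

def moo_score(guess, target):
    # hits: same letter, same position; misses: guess letters that do
    # not appear anywhere in the target
    hits = sum(g == t for g, t in zip(guess, target))
    common = sum((Counter(guess) & Counter(target)).values())
    return hits, len(guess) - common

def fitness(candidate, history, viable_triplets, desperation=False):
    # Total absolute deviation from the answers already received;
    # zero means the candidate is consistent with everything so far.
    score = 0
    for guess, (hits, misses) in history:
        h, m = moo_score(guess, candidate)
        score += abs(h - hits) + abs(m - misses)
    if not desperation:
        # one penalty point per letter triplet not seen in real words
        score += sum(1 for i in range(len(candidate) - 2)
                     if candidate[i:i + 3] not in viable_triplets)
    return score
```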

If the size of the population is zero, then turn on desperation, and make
a random guess to buy time.  Otherwise, the population is now a sample of
words which might be the target word.  One of them might be the best
guess, so, for each member, calculate how its use as a guess would divide
the whole population, and the size of the largest chunk.  Choose the
member that would result in the smallest largest chunk as the starting
point for phase 3.

The result of phase 2 is a good guess, chosen from among a population of
possible target words, but the best guess need not be a possible target
word.  Therefore, phase 3 uses any remaining time until the 9 second mark
to apply mutations to the guess, much as in phase 1, with the hopes of
reducing the largest chunk size even further, with the difference that
draws go to the original, not the mutant.  Once time is up, the guess is
made, and the program terminates.

Those are the three phases of the program.  As for the triplet table, it
has one entry for each sequence of 3 letters, from AAA to ZZZ.  Each entry
has 3 relevant bits:  one determines whether a triplet occurs at the
beginning of any known word, one does the same for occurrences at the ends
of words, and the other for occurrences anywhere other than the
beginning or end.
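
The 3-bit entries could be packed along these lines (a sketch of the idea,
not the contestant's code):

```python
BEGIN, MIDDLE, END = 1, 2, 4   # one bit per context

def build_triplet_table(words):
    # Map each 3-letter string to a 3-bit mask recording whether it
    # occurs at the beginning, middle, or end of any known word.
    table = {}
    for w in words:
        for i in range(len(w) - 2):
            bit = 0
            if i == 0:
                bit |= BEGIN
            if i == len(w) - 3:
                bit |= END
            if 0 < i < len(w) - 3:
                bit |= MIDDLE
            table[w[i:i + 3]] = table.get(w[i:i + 3], 0) | bit
    return table

def viable(tri, context, table):
    # context is BEGIN, MIDDLE or END
    return bool(table.get(tri, 0) & context)

table = build_triplet_table(["PINEAPPLE", "STAPLE"])
```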

++ Did SchrodingersCow use any letter frequency tables?  Where did you get them?

No, it doesn't care a bit about letter frequency.  All it cares about is
whether a given letter triplet is viable.  As for the triplet table, I
wrote a separate program to generate that.

++ If I had it to do all over - here's what I would try ....

...fewer bad ideas and algorithms, before deciding on this one.


WHERE I LIVE: A quaint post-industrial globe, the native inhabitants of
which are still largely convinced that they live in a Euclidean

WHAT I DO FOR FUN: Write programs, exploit technology, search for the
perfect flashlight, and try to catch that blasted roadrunner.

WHAT I DO FOR WORK: Try to keep everything from falling apart.

INNERMOST SECRET: This extremely mysterious person's utmost secret?  It's
something...most entertaining.

POTM IDEAS: Large cash prizes!

           MooIndigo    27 (  7/  9/ 11)    c Scott Kennedy
MooIndigo submitted by Scott Kennedy at tsk@net2phone.com
++ Please summarize the algorithm used by MooIndigo

MooIndigo generates each possible letter combination and 
compares it against each previous guess.  If the comparison 
generates the same number of COWS and BULLS, then
it is printed as the next guess.
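
That consistency test is the classic one. Assuming BULLS counts
exact-position matches and COWS counts right-letter-wrong-place (the
usual convention), it can be sketched as:

```python
from collections import Counter

def cows_and_bulls(guess, target):
    # BULLS: right letter, right place.
    # COWS: right letter, wrong place (counted with multiplicity).
    bulls = sum(g == t for g, t in zip(guess, target))
    common = sum((Counter(guess) & Counter(target)).values())
    return common - bulls, bulls

def consistent(candidate, history):
    # A combination can be the target only if it would have produced
    # exactly the same (cows, bulls) answer for every previous guess.
    return all(cows_and_bulls(g, candidate) == (c, b)
               for g, (c, b) in history)
```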

The letters are chosen in order of their frequency in commonly
occurring words.  The first set of guesses is selected
to use the letters of the alphabet with no repeated letters.
For example,  if the word length is 6 letters, the first 
4 guesses will contain different letters. The fifth guess 
will include the 2 unused letters and possible letters 
from previous guesses (based on number of COWS).

The remaining guesses are generated by trying combinations 
with no repeated letters, 2 repeated letters, and finally 
any number of repeated letters.

Two passes are made.  On the first pass, letter combination 
frequency tables are used to reject bad letter combinations 
(e.g. all consonants, 'Q' but no 'U',  words beginning with  
'ZG', etc.).  On the second pass, all combinations are 
tested (just in case the POTM-MASTER found an obscure word).

The passes through the alphabet are made through a set of 
nested loops.  The current indexes of the loops are saved 
in the temporary file so that each time the program
starts up, it can reset the loop counters and pick up 
where it left off.

To fully utilize the 10 seconds allotted for each pass, 
MooIndigo looks ahead for other possible guesses.  Using the 
letter frequency tables, it determines a score for each
guess and if time expires, it prints the guess with the highest score.

If the 10 seconds expires and MooIndigo has not found a 
new guess, it prints the current combination that it is evaluating.

MooIndigo generates a list of currently available letters 
so that it can skip letters that could not possibly be in the word.  
For example, if a guess generates 0 COWS and 0 BULLS,
none of its letters appear anywhere in the word.  Similarly, knowing
which letters are not used, it can also determine which 
letters must be used.

++ Did MooIndigo use any letter frequency tables?  Where did you get them?

Yes.  The UNIX 'spell' dictionary was used to generate the 
following letter frequencies.

1) The letter frequency for each letter, based on word length

2) Frequency of 2-letter combinations at beginning of a word

3) Frequency of 2-letter combinations at end of a word

4) Frequency of 2-letter combinations in the middle of a word

5) Frequency of N consonants in a row.

6) A list of three-letter combinations that do not appear in words.

++ If I had it to do all over - here's what I would try ....

I would have liked to do additional work on the scoring mechanism
and some performance improvements to search more guesses within
the 10-second allotment.  I ran the program against a set of
about 4000 words to make sure any changes made to the scoring
and algorithms actually reduced the average number of guesses.

Also,  trying different approaches to determine which letters 
are repeated might reduce the number of guesses.


By the C, by the C, by the beautiful C.

     dancing_letters    27 (  7/ 10/ 10)   gC Pavel Ouvarov
dancing_letters submitted by Pavel Ouvarov at uvarov@sirius.ihep.su
++ Please summarize the algorithm used by dancing_letters
  The algo:
  After the first guess, build the Candidate Word Space (CWS).
Choose a word from the CWS, try it, and reduce the CWS. Repeat.

  There are two problems: 
  First, how to organize CWS? I invented a class that I called 'mask', which
can contain a great number of words. It is easily constructed and can be
easily overlapped with another 'mask'. Thus, CWS is a (relatively small) list 
of 'mask's. 
  Second, which word from the CWS to choose? I decided to choose the best word,
based on statistics. The CWS may be very large, and often there isn't enough time
to look through all the words. So I choose only the best 'mask's (based on
statistics) and split them into words. Then I choose the best word.
  (For details, ask me or look through the code.)

++ Did dancing_letters use any letter frequency tables?  Where did you get them?
  Yes, it uses three levels of statistics. (They were all taken from my linux
spell dictionary).
  First, simple letter frequency table for each place in the word.
  Second, letter pair frequency table for each place in the word.
  Third, letter triple allowance. It tells whether a letter triple
may be used at all, regardless of position (for example, AAA is not allowed).

++ If I had it to do all over - here's what I would try ....
  I'd follow better programming style... 
  Perhaps I'd try to find the word that is likely to reduce the CWS the most.

  I live in Russia in the best town in the world - Protvino. I'm now a
second-year student in the physics faculty at Moscow State University.
  DO FOR FUN? Friends, girls, beer...
  DO FOR WORK? Physics...
  INNERMOST SECRET? Tsss... I think that Eugeniya Bagdasarova is jealous
for you... Therefore she's the most pretty and sexual girl in the universe...

               Skunk    28 (  8/  8/ 12)   gC Alexey Zhelvis
	Sorry ... No questionnaire was returned for Skunk

          StupidComp    29 (  9/  8/ 12)   gC Trieu Nguyen
StupidComp submitted by Trieu Nguyen at trieu@ix.netcom.com
++ Please summarize the algorithm used by StupidComp

In my program, I define an "English word" as strings of consonants and
vowels next to each other.
For example: E-ngl-i-sh
An array holds all the consonant and vowel strings. At first the
program tries some words; after that it finds the matching one among all
English words.
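
Splitting a word into alternating vowel and consonant runs, as in the
E-ngl-i-sh example, is nearly a one-liner (a sketch, with Y treated as a
consonant):

```python
import re

def cv_split(word):
    # Break a word into maximal runs of vowels and of consonants,
    # e.g. "ENGLISH" -> ["E", "NGL", "I", "SH"]
    return re.findall(r"[AEIOU]+|[^AEIOU]+", word)
```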

++ Did StupidComp use any letter frequency tables?  Where did you get them?
The array of consonant and vowel strings is generated from an English
dictionary on the web.

++ If I had it to do all over - here's what I would try ....
  Don't dare to do it again :-)

Berkeley, California. Working at Lawrence Lab. Interested in computer chess.
I wrote a small Chinese chess playing program and let it play on the
internet. It beats many people in fast games, heheh.

        QuackTheDuck    30 (  5/ 17/  8)    c Wolfram Hinderer
 QuackTheDuck submitted by Wolfram Hinderer 
	at Wolfram.Hinderer@math.unikarlsruhe.de
 ++ Please summarize the algorithm used by QuackTheDuck

For my first try I used Pascal. I transformed the answers into linear
equations and simplified them. The problem then was to find the guesses which
would give the most information. I was not very successful with this 
(often 20 guesses or more), and I didn't know how I should use
information about English.

So I started again from scratch, using C. Well, that should probably be 
_abusing_ C. I have not so much experience programming in C, but I had to
use it (no Pascal support, and it's faaaaster). 
If you have a look at my source, you will see that it is just 
terrible. The recursion (see below) is the only
thing where I tried to write smart code: time is valuable!

But now the algorithm:
I use lots of data of the English language. I use triples, 
like "ING". For every combination of two letters 
("AA", "AB", ..., "ZY", "ZZ") there is a rank list of the 
letters that could possibly follow (like the "probability" that a
certain letter is the next letter in the word). 
For every position (first three letters in word, 
letters 2,3,4 in word etc.) I have separate lists. (I had 
different lists for different word lengths at the beginning, with 
significantly better results, but the size limit... I had to 
compress my data anyway!) The ranks for "AD" at the start of 
a word would be "VDJMOAHUERSILK", for example. 
(I have lists for the first letter, and for the second letter
depending on the first letter of a word.)

My guess would be the first word found that is not yet ruled out by the 
guesses made. The first step is to look for words only using 
"rank 1" letters, if that's not possible, use "rank 2" 
letters and so on.  More precisely: select the first letter;
for every guess already made, calculate the hits, non-hits
and non-misses for a word starting with this letter;
if the hits, non-hits or non-misses for any guess exceed the
values indicated by the answer to that guess, select another
first letter (with lower rank); else continue with the second letter,
and so on for every letter of the word (recursively). If the
last letter of the word has been successfully selected, the
misses still have to be checked, and then the word is printed.
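
A stripped-down version of that recursive search might look like this
(the rank lists, the pruning test and the scoring convention are
simplified assumptions; QuackTheDuck's real lists are positional and much
richer):

```python
from collections import Counter

def moo_score(guess, target):
    # (hits, misses): hits = right letter, right place;
    # misses = guess letters absent from the target
    hits = sum(g == t for g, t in zip(guess, target))
    common = sum((Counter(guess) & Counter(target)).values())
    return hits, len(guess) - common

FALLBACK = "ETAOINSHRDLUCMFWYPVBGKJQXZ"   # assumed overall rank order

def search(prefix, length, ranks, history):
    # Depth-first: try the highest-ranked continuation of the last two
    # letters first; prune any prefix that already contradicts a guess.
    if len(prefix) == length:
        ok = all(moo_score(g, prefix) == ans for g, ans in history)
        return prefix if ok else None
    for letter in ranks.get(prefix[-2:], FALLBACK):
        cand = prefix + letter
        # the hits fixed so far must not exceed any recorded answer
        if any(sum(a == b for a, b in zip(g, cand)) > ans[0]
               for g, ans in history):
            continue
        found = search(cand, length, ranks, history)
        if found:
            return found
    return None
```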

That's all. QuackTheDuck does not try to make guesses which give 
a lot of information, it simply makes the next possible guess. 
This seemed so different from the techniques discussed on the
message board that I decided to have a name with an animal
very different from a bull or cow. Since the program simply
utters some English-like words, and since I didn't think my
approach would work well, I named my program QuackTheDuck.
After I got it to work, I realized that somehow it worked
much better than I had thought, but I didn't change the name.
I thought it was funny, and at the time that seemed enough.
But I just looked up what Fred says about it in the
"Common POTM Rules", and there he says "something clever please!".

I have added something to the algorithm: 
If a guess gets wordlen-1 hits and there are more than 3 
possible words left, I try to make guesses that eliminate more 
than one possibility.

My approach has a big problem: some words can't be found. 
Example: words starting with "ADX", because "X" is not in the
list for "AD" at the start of a word. I find all words in 
my spell dictionary, and many more, but nevertheless 
I had to extend my algorithm: if no word can be found, 
fill all ranklists with missing letters. So everything will 
be found, even "ASDJJFX", but often would need many guesses... 
but at least it will be found!

If time is short, I print the "best" word found up to now.

++ Did QuackTheDuck use any letter frequency tables?  Where did you get them?

Yes. I used the rank lists described above. I used my spell dictionary to
create them. That's why QuackTheDuck finds all the words in my 
dictionary (and many more).

++ If I had it to do all over - here's what I would try ....

Not sure.


If there is ever a similar POTM contest, it would be fun to 
allow access to the dictionary - the dictionary could then be in 
any language, or be a fantasy or "complete" dictionary. 
A dictionary with exactly one word would be interesting...
In general, a first step, where every program generates its 
own statistics, and a second step, the guessing or whatever it 
might be, could be used in other situations.

<<<<<<<<<<<<  THE NEXT TWO ARE THE DICTIONARY ENTRIES  >>>>>>>>>>>>>>>
DICTIONARY ENTRY: bigmoo    15 (  5/  4/  6)    c Susan Adams
bigmoo submitted by Susan Adams at buttgirl_69@yahoo.com
++ Please summarize the algorithm used by bigmoo.

It was a walking dictionary. Over 63,000 words totaling about 650K.
They were divided into lists by word length. The actual code was only
about 1% of the total size. It used the idea that Fred so eloquently
described:
   2. a most useful "trick" was making good use of the
      scores of previous guesses to limit the "next"
      guess ... as follows:

      a) I'm going to make a guess - there are a large
         number of possible guesses to choose from

      b) IF my guess is to be a winner, then it MUST
         yield the identical scores that my previous
         guesses yielded.  If it doesn't, then it
         couldn't have been the target word - so
         perhaps I'm better off with another guess!
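
In code, that filtering step amounts to the following (a sketch; the
hits/misses convention is assumed):

```python
from collections import Counter

def moo_score(guess, target):
    # (hits, misses): hits = right letter, right place;
    # misses = guess letters absent from the target
    hits = sum(g == t for g, t in zip(guess, target))
    common = sum((Counter(guess) & Counter(target)).values())
    return hits, len(guess) - common

def next_guess(words, history):
    # Keep only dictionary words that would have yielded exactly the
    # scores the previous guesses received; any survivor is a sensible
    # next guess, since the target must be among them.
    survivors = [w for w in words
                 if all(moo_score(g, w) == ans for g, ans in history)]
    return survivors[0] if survivors else None
```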

++ Did bigmoo use any letter frequency tables? Where did you get them?

No way. What use would I have for those? Go big or don't go at all. I
got my dictionary from Paul Banta (Bozo).

++ If I had it to do all over - here's what I would try ....

1) I'd put my list of words in a comment. That way, they wouldn't count
against me for size. The rules specifically say that comments don't
count against you.

2) I'd copy the source code ("BigMoo.c"), comments and all, into my
temp file ("TEMP.BigMoo"). I don't think there are any rules against
that.
3) I'd read my temp file into a large array of guesses.

I'll have to check with Fred on this, but I think making these three
changes would yield a legal entry by the letter of the rules. It might
bend the spirit of the rules slightly, though.


I'm a perpetual student in Cheyenne, Wyoming. Do for fun? I don't know.
Girls just wanna have fun! Innermost Secret? That's a tough one. Help
me out here, Fred. POTM ideas? None.

DICTIONARY ENTRY: Mad_Cow_Cheating 17 (  7/  5/  5) gC Frederic vanderPlancke
Mad_Cow_Cheating submitted by Frederic van der Plancke
++ Please summarize the algorithm used by Mad_Cow_Cheating

The algorithm relies on a heuristic formula to evaluate a 
possible new guess: if the outcomes of the guess separate 
the set (of size S) of possible solutions in parts that 
have size s_1, s_2,... s_N then the value of the guess is
given by:

   bare_value(guess) := - sum_i (s_i) * log(s_i)
   value(guess) := log(S) + bare_value(guess) / S

where bare_value is used in the comparison, whereas value 
is the really meaningful quantity: it represents the expected 
quantity of information gained by the guess. (By quantity of 
information I mean the log of the size of the solution set;
I expected this to be proportional to the number of 
guesses needed. This  is roughly true in the beginning of a game, 
but becomes an underestimation towards the end.)
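
The formula checks out numerically (a small verification sketch; natural
logarithms assumed):

```python
import math

def value(part_sizes):
    # A guess splits the S possible solutions into parts s_1..s_N.
    #   bare_value = -sum_i s_i * log(s_i)
    #   value      = log(S) + bare_value / S
    # value is the expected log-reduction of the solution set.
    S = sum(part_sizes)
    bare = -sum(s * math.log(s) for s in part_sizes)
    return math.log(S) + bare / S
```

A guess that tells us nothing (one part of size S) gains 0; a guess that
pins the word down completely (S singleton parts) gains log(S), the
maximum possible.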

(NB. I didn't take the time to re-check the maths so I 
might be wrong)

When comparing a new guess that is a possible solution to 
one which is not (or is not known to be one) the above 
value is adapted to take it into account.

(1) The "basic" algorithm (used in earlier versions):
It considers all words in the dictionary (even words that 
cannot be the solution according to the current knowledge), 
and chooses the one that has the best heuristic value.

(2) The "genetic" algorithm (Mad_Cow_Cheating's algorithm):
It starts with the whole dictionary (*), or a random sample 
of it if the dictionary is too big. (I did not really bother 
with the 10 seconds limit, for quite obvious reasons ;-) 
but I still did not want to make Fred wait for  minutes...)
Then it creates new words from old ones, by mutations, 
permutations and cross-overs.

The innovative part is the structure I use for maintaining 
the genetic pool: a "heap" or "priority queue": that's a 
binary tree where each node is a better candidate than the 
two "son" nodes. This way, I can instantaneously get the best
move so far, and for choosing words for reproduction I can weight them
according to their depth in the tree, which is roughly 
correlated with their goodness. (To increase this correlation 
I insert the new guesses randomly in the bottom level, 
before restoring the heap invariant as usual).

(*) reduced to words of the right length, this goes without saying...
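
With Python's heapq the idea can be approximated like so (a sketch;
weighting parents by heap index stands in for the depth-based weighting
described above):

```python
import heapq
import random

def make_pool(candidates, score):
    # Min-heap: the best (lowest-score) candidate sits at the root,
    # so the best move so far is available instantly.
    pool = [(score(w), w) for w in candidates]
    heapq.heapify(pool)
    return pool

def best(pool):
    return pool[0][1]

def pick_parent(pool, rng=random):
    # Shallow heap positions hold roughly better candidates, so weight
    # index i by 1/(i+1) to bias reproduction toward good words.
    weights = [1.0 / (i + 1) for i in range(len(pool))]
    return rng.choices(pool, weights)[0][1]
```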

++ Did Mad_Cow_Cheating use any letter frequency tables?

Mad_Cow_Cheating uses a dictionary, so it does not need 
frequency tables ;-) I made searches and found two sources of 
special interest (for MOO purposes):
UKACD 1.5, and "Moby Words"
(especially the "common" dictionary in "Moby Words"; they have many
things including a pronunciation dictionary, the whole of
Shakespeare's work and the Constitution of the U.S.A.!)
The page where I found links to these is 

where you'll find links to a LOT of dictionaries of various kinds...

I took the intersection of these (UKACD 1.5 and MOBY common),
reduced to the 4- to 9-letter words, as my dictionary.

++ If I had it to do all over - here's what I would try ....

I think Mad_Cow_Cheating is quite good as is...

If I had the time... I could implement alpha-beta min-max 
analysis of the end of game (and perhaps of the whole game 
if resources permit; D. Knuth has done it for the version 
of MasterMind that has 1296 possible solutions. A complication 
here is that MasterMind symmetries are lost; and the 
possible moves are much more numerous...)

I noticed the unexpected fact that the "basic" algorithm 
outperforms the "genetic" algorithm on lengthy words.
This means dictionary words have an advantage that my 
heuristic does not capture, probably because they are more 
akin to the words to split, but I did not study this
question further. (The "basic" algorithm should find 
most 9-letter words in no more than 4 guesses, even 
when using a bigger dictionary.)

Oh yes, I could weight each dictionary entry 
according to its probability of being chosen by Fred ;-)

I live in Brussels (Schaerbeek) in Belgium.
Look at "SplitMoo" for the other answers...

             deepmoo    31 (  7/ 11/ 13)   gC Sean Barrett
deepmoo submitted by Sean Barrett at buzzard@world.std.com
++ Please summarize the algorithm used by deepmoo

A sort-of diary of the development of deepmoo is
at http://world.std.com/~buzzard/potm/journal.txt


Fixed opening book (does not change in response
to results), short-circuited if all letters are
encountered.  Binary encoding is used for most
probes for 7+ letter words (consider the meaning
of 5 misses for 'OAAEEEE').

The fixed opening book guarantees that every
letter (or every letter but one) has been probed
by the time the opening book is completed--or
else all letters in the word are accounted
for if short-circuited, but this cannot happen
if there are repeats.  (This avoids slow probes
through lots of letters during endgame.)


Depth-first search for strings that match all
previous guesses.  Heavily "pruned" for speed
(i.e. without changing search results).  Only
matches for which every set of three letters
matches some word in a precompiled dictionary
are allowed.  Search at each letter position
goes in order of letter frequencies at that
position.  (This is not actually using a dictionary,
but a set of frequency tables that have only
1-bit frequencies.  See below.)
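
Ignoring the per-position refinement, a 1-bit trigraph filter fits in a
single big integer (a sketch, assuming uppercase A-Z words; this is the
0/1 "frequency product" idea in filter form):

```python
def trigraph_bits(words):
    # One bit per possible 3-letter string: 26**3 bits in one integer.
    bits = 0
    for w in words:
        for i in range(len(w) - 2):
            a, b, c = (ord(ch) - 65 for ch in w[i:i + 3])
            bits |= 1 << ((a * 26 + b) * 26 + c)
    return bits

def plausible(word, bits):
    # True iff every trigraph of the word occurs in the dictionary,
    # i.e. the product of its 0/1 "frequencies" is 1.
    for i in range(len(word) - 2):
        a, b, c = (ord(ch) - 65 for ch in word[i:i + 3])
        if not bits >> ((a * 26 + b) * 26 + c) & 1:
            return False
    return True
```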


If all words that "match" the trigraph frequencies
are rejected, retraverse entire tree without
culling bogus trigraph words.  (Typically deepmoo
has enough information to only take one or two
extra guesses in this case if the word is actually
English or English-like but not in the original
dictionary used.)

++ Did deepmoo use any letter frequency tables?  Where did you get them?

I compiled a dictionary by getting two different
/usr/dict/words, removing all uppercase and non-alphabetic
words, running through it by hand looking for anything
too weird, and then passing the whole thing through 'spell'
on a machine belonging to an acquaintance which was
running the same OS version as the potmmaster.

I then generated frequencies as follows:
   For each letter position, compute the relative frequencies
   of each possible letter. I could have computed this independently
   for every possible word length, but I didn't bother.

   For every trigraph position, compute a 1-bit value for
   every three-letter string, indicating whether that string
   ever appears in that position.  Again, I could have computed
   this independently for each word length, and it would have
   made a big difference, but I couldn't afford that much data.

I generated an equivalent set of information with quadgraphs
(four letter strings), but was unable to compress them sufficiently.

The core premise of deepmoo is to use 1-bit frequencies so as
to effectively narrow the search space, rather than simply
orient it.  The final endgame guarantees that even words that
don't match the 1-bit frequencies are eventually explored.
In other words, multiply together the frequencies of each
of the trigraphs from each position.  In my case, the final
answer is either 1 or 0, which can be understood as
"much more likely" or "much less likely".  All of the former
are explored before any of the latter.

I have some serious concerns about the fact that there is
no real distinction between frequencies and dictionaries,
in some sense.  For a discussion of this, see the extract in
my journal (the most interesting/revealing thing in the
journal, I think).

++ If I had it to do all over - here's what I would try ....

Nothing, because by the end all I was trying to do was
compress tables for better narrowing the wordspace (see
the aforementioned URL).  There may be good alternative
approaches, but in *practice* this is a good attack for
making the program 'seem to know English', which is of
course something all of us playing the game by hand do,
but which programs are absolutely barred from doing.

In some sense, I don't think the chosen POTM problem is an interesting
problem space.  The problem space "you can have a dictionary,
but you should be able to solve even words which aren't
real words" is the problem space humans solve.  The problem
space "the word must be in this dictionary which you can use"
is an interesting problem.  The problem space "the word could
be any string with equal odds" is an interesting problem
space.  The chosen POTM problem gives a problem space in
which the goal is to model the english language without
bothering to use a dictionary.  As such, it's kind of boring.
For me, at least, it became the problem "write a 20K program
that produces a wordlist as minimally-larger than some fixed
list of words as possible, while still containing all those
words".  The best I could come up with was 20,000 times as
large.  (I had one less than 100 times as large, but couldn't
get it small enough.)  As such, I could watch my program and
see how it *sort of* knew English, but not really, and this
seemed the main front on which it could be improved.  (Indeed,
there may simply be entirely different strategies, considering
how far deepmoo is from the top of the rankings, but they're
not coming to mind here.)

With the "less than 100 times as large" quadgraph-based
structures, deepmoo generally guessed words that *were*
in the original dictionary on the first guess after the
opening book.  Therefore, at about that point, it would
become interesting to pursue shortening the opening book,
or looking for other strategies without this dichotomy.


I live in Boston.  I program computers for fun and work,
writing 3d graphics code and games.

            MooFight    32 (  9/ 11/ 12)    c Kris Kenis
MooFight submitted by Kris Kenis at kris.kenis@barco.com
++ Please summarize the algorithm used by MooFight

Step 1: make random guesses with all "different" letters (wordlength >
6 letters --> 7 different letters, wordlength <= 6 letters --> 4
different letters). Make guesses until all letters of the alphabet are
used or until all letters of the word are found (-> only when all the
letters of the word are different). Sometimes some letters can be
eliminated (misses = wordlength).

Step 2: make guesses with "only possible letters". Use 2 different
letters for wordlength <= 6 letters and 3 different letters for
wordlength > 6 letters. It is always possible to get the exact letters.
Check with previous guesses and try to eliminate other letters.

Step 3: Make guesses with exact letters using frequency tables.
Frequency table contains frequency information for a pair of letters.
Keep word with highest score.

++ Did MooFight use any letter frequency tables?  Where did you get
Yes. Somewhere in the potm message board.

++ If I had it to do all over - here's what I would try ....

Gent - Belgium - Graphics R&D Engineer

    NoBullPeacePrize    32 (  9/ 10/ 13)   gC Randy Saint
NoBullPeacePrize submitted by Randy Saint at saint@phoenix.net
++ Please summarize the algorithm used by NoBullPeacePrize
I borrowed this strategy from the Mastermind optimal algorithm.
Try to generate a complete list of all possible solutions (candidates).
Limit this list to 2000 words.
Test all candidates against each other to see which one splits the list the
"best".  If time is left over, generate random guesses to see if they can
split the list of candidates "better."
By "best" and "better", each candidate word is given a score (the sum of the
frequencies of each letter pair).  The "best" guess would be the one that
generates the most even histogram of scores for each possible guess result.
This way, the program would lean more towards finding guesses that would
split the candidates that were most likely words.
I also kept a list of the letters and the positions each letter could not
occupy.  This helped to reduce the number of possible candidates I had to
search through.
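The "most even histogram" criterion can be sketched as follows (a reconstruction, not the entry's code: the hit/miss scoring rule is my assumption about the contest, and the uniform default weight merely stands in for the letter-pair frequency score described above):

```python
# Sketch of scoring a guess by how evenly it splits the candidate list.
# Assumed rule: a hit is a position match, a miss is a guess letter that
# does not occur anywhere in the word (counting each occurrence).
from collections import defaultdict

def score(guess, word):
    hits = sum(1 for g, w in zip(guess, word) if g == w)
    misses = sum(1 for g in guess if g not in word)
    return hits, misses

def largest_bucket(guess, candidates, weight=lambda w: 1.0):
    """Weight of the biggest (hits, misses) bucket; smaller is a more
    even split of the candidate list."""
    buckets = defaultdict(float)
    for w in candidates:
        buckets[score(guess, w)] += weight(w)
    return max(buckets.values())

def best_guess(candidates, weight=lambda w: 1.0):
    return min(candidates, key=lambda g: largest_bucket(g, candidates, weight))
```

With candidates MOOD, MOOT and ROOT, for example, guessing MOOD puts all three candidates into distinct (hits, misses) buckets, while MOOT lumps two of them together.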

++ Did NoBullPeacePrize use any letter frequency tables?  Where did you get
Yes.  At first I got them from some lists that people had posted on the
original message board.
With only a week to go, I ran those lists through the spell program and
realized that there were many words in there that would fail.  I culled
those out of the list and used the remaining ones that were spelled
correctly.

++ If I had it to do all over - here's what I would try ....
I couldn't figure out any way to generate my list of candidates faster.
After a few guesses, the list of candidates was empty and I couldn't search
through enough of the possibilities to find even one.  Then the program
takes a previous guess, changes one letter and uses that.  That way I get a
little information about the two letters.

Live: Houston, Texas.
Fun: Programming Contests, and tinkering with my computer.  I purchased a
used motherboard to use in my second computer, but somehow screwed it up
with less than 2 days to go before the contest deadline.
Work: Software Development Manager with Lockheed Martin, at NASA, Johnson
Space Center.
Ideas: Number guessing vs. the other contestants, was a contest called

              MooBoo    33 (  8/ 13/ 12)    C Rolandas Gricius
	Sorry ... No questionnaire was returned for MooBoo

           one_guess    34 (  8/ 12/ 14)    c Ionel Santa

one_guess submitted by Ionel Santa at ionels@phobos.cs.unibuc.ro


1. My algorithm tries at every guess to find a word that could still be
the correct one. The words are searched for with a backtracking algorithm,
taking care to eliminate from the start the words that contain invalid
combinations of letters, and also the words that contradict the current
knowledge obtained from the previous guesses and answers. If no word is
found within seven seconds, the search is continued as a random search.
(Hint: if the space of solutions is very small, then the knowledge obtained
already makes backtracking a very easy task, and if the space of solutions
is very large, a random search (also conforming to that knowledge) will be
more likely to succeed.)
   My first 3-5 guesses (depending on the word length) were predefined. For
example, for length 7 I first answered something like XYYZZZZ, where X, Y,
Z are frequent letters, which let me know for each of these letters
whether or not it is in the final word. (For example, if MISSES = 3, I
can take it for sure that Z is in the word, and X and Y aren't.)
   I tried to make inferences about:
a) letters that can or cannot be at a specified position in the word;
b) the minimal and maximal number of letters from different sets of
letters that can be in the final word. At this point I make inferences
using intersection, difference, etc.


2. I make very little use of frequencies, only for my first guesses. I
built these tables by counting letters in a novel, regardless of word
length (I counted every letter).


3. Maybe I would try to work with schemes like:
['A'-'Z']['O']['R','Y']['Y'] - which means that the first letter is anything
from 'A' to 'Z', the second is 'O', etc. First I would start with
['A'-'Z']['A'-'Z']..., and then after every guess-answer I would split the
schemes that I have into others that conform to this guess-answer.
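The scheme idea can be sketched like this (a reconstruction under my reading of point 3; `matches` is a made-up helper, not code from the entry):

```python
# A per-position "scheme": each position carries the set of letters still
# allowed there, and a word fits the scheme if every letter lies in its
# position's set.
import string

def matches(word, scheme):
    return len(word) == len(scheme) and all(
        c in allowed for c, allowed in zip(word, scheme))

# The ['A'-'Z']['O']['R','Y']['Y'] example from the write-up:
scheme = [set(string.ascii_uppercase), {"O"}, {"R", "Y"}, {"Y"}]
```

Splitting a scheme after a guess-answer would then mean replacing it with several narrower schemes whose position sets are consistent with the reply.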


4. I live in Romania; for the moment I'm studying computer science in
Bucharest. I like literature and music very much, I cannot make my
secrets public, and for the moment I don't have any POTM ideas. Anyhow, I
think this contest is very well organized, and I thank FRED for that.

                Bozo    35 (  8/ 11/ 16)    c Paul Banta
Bozo submitted by Paul Banta at pbanta@yahoo.com
++ Please summarize the algorithm used by Bozo

Bozo is about 1/3 statistics and about 2/3 code.

Bozo makes some initial "pre-canned" guesses.

It then generates guesses based on the following idea. The word that I'm about
to guess MIGHT be the word that Fred is thinking of. If it IS the word that
Fred is thinking of, I should be able to compare it to all my previous guesses
and get the same number of hits and misses that Fred reported. So, that's
exactly what I do. I compare the word that I want to guess with all my previous
guesses and see if the hits and misses match Fred's hits and misses. If the new
guess doesn't match all of Fred's hits and misses, then I don't guess it - it
CAN'T be the right word. Instead, I keep looking for a word that matches ALL of
Fred's hits and misses. When I find such a word, that becomes my next guess.
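Bozo's consistency test can be sketched as follows (a reconstruction, not Bozo's source; the scoring rule below, position matches for hits and absent guess letters for misses, is my assumption about the contest's definition):

```python
# Sketch of Bozo's filter: a word may be guessed only if, played against
# every previous guess, it would reproduce the hits and misses Fred
# reported.  The scoring rule is an assumption, not quoted from the rules.

def score(guess, word):
    hits = sum(1 for g, w in zip(guess, word) if g == w)
    misses = sum(1 for g in guess if g not in word)
    return hits, misses

def could_be_secret(candidate, history):
    """history: list of (guess, hits, misses) triples reported so far."""
    return all(score(g, candidate) == (h, m) for g, h, m in history)
```

If a guess "ABCD" drew 0 hits and 1 miss, for instance, "BADE" survives the filter while "WXYZ" is eliminated.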

++ Did Bozo use any letter frequency tables?  Where did you get them?

Yes, but they were only minimally helpful. I generated them myself from my
63,000+ word dictionary. A more interesting question is where did I get my
dictionary from. I found several dictionaries on the web. I combined 3: a list
of english words, a Webster's dictionary, and the English half of an
English/Spanish dictionary. I then threw out all the words that were not 4-9
characters long. That gave me a list of about 100,000 words. I then ran this
list
through a Sun spell program (special thanks to Ken Been in NY for doing this
for me), and threw out the words that it didn't like. This gave me a list of
about 63,000 words.

What was more helpful than letter statistics were other self-generated
statistics that I use in guessing words. I have a table of which triple-letter
combinations occur and don't occur in my dictionary. For example, "QUI" does
occur in my dictionary, but "BBB" doesn't. This helps me to never even consider
guessing words with "BBB" in them. There are 17,576 possible three letter
combinations (26^3). I don't remember the specifics, but I think only about 1/3
of them occurred in my dictionary. So, 2/3 of them didn't occur. I also stored
and used statistics for which sequences of 4 consecutive consonants, 5
consecutive consonants, and 6 consecutive consonants occur in my dictionary.
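The trigram table can be built in a few lines (a sketch; the two-word list below merely stands in for the 63,000-word dictionary):

```python
# Build the set of all 3-letter substrings occurring in at least one
# dictionary word; candidates containing any other trigram (like "BBB")
# can then be discarded without ever being guessed.

def trigram_set(words):
    seen = set()
    for w in words:
        for i in range(len(w) - 2):
            seen.add(w[i:i + 3])
    return seen
```

"QUI" from "QUICK" survives the filter; "BBB" never appears in any word, so it never gets into the set.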

++ If I had it to do all over - here's what I would try ....

I had a somewhat different algorithm idea late in the contest, but didn't have
time to implement it.

Phase 1: You make an initial guess, say "ABCDEF" (6-letter word). Fred gives
you back hits and misses.

Phase 2: Your next guess uses only those 6 letters such that if it WAS the word
that Fred is thinking of, it would get the same number of hits and misses as
Fred reported. You keep making guesses this way until you can't make any more -
any guess that you make wouldn't have the same number of hits and misses with
all previous guesses that Fred reported. When you run out of guesses, you go
back to phase 1 and guess another 6 letters, say "GHIJKL" - now you have 12
letters to work with.

I think this algorithm is better than my algorithm, but I'm not sure how much
better. My current algorithm averages about 12.3 guesses per word, which
ain't too shabby.


Live: Colorado Springs, Colorado, USA

Fun: Hiking, Fishing, Camping, Backpacking, Volleyball, Breaking perfectly good
automobiles under the guise of "fixing" them, POTM.

Work: An old computer programmer trying to learn new tricks (java).

Innermost Secret: I have no more inner-most secret. I'm out of the closet (so
to speak). I'm "Mr. Bozo".

POTM Ideas: None that I haven't already sent to Fred.

              zigzag    35 (  7/ 13/ 15)    c Jaco Cronje
zigzag submitted by Jaco Cronje at s9805027@student.up.ac.za
++ Please summarize the algorithm used by zigzag
For words of length 4-6 :
For the first 4 picks, output something like ABCD then EFGH then IJKL 
then MNOP. After that, do a DFS to find a solution that is legal.
A "legal" solution is one that could be a possible answer in 
comparison with the previous picks. If the time is up and no "legal" 
solution was found, just output the last combination of letters

For words of length 7-9:
Find out exactly which letters are used in the word.
Maximum of 9 picks required.
Pick something in the form of : ABBCCCC ( For 7 letter words )
Then, check the result of the misses.
If 0 misses, then A, B and C are in the word.
If 1 miss, then B and C are in the word and A is not.
If 2 misses, then A and C are in the word and B is not.
If 3 misses, then C is in the word and A and B are not.
If 4 misses, then A and B are in the word and C is not.
If 5 misses, then B is in the word and A and C are not.
If 6 misses, then A is in the word and B and C are not.
If 7 misses, then A, B and C are all absent from the word.
( The same thing can be done for words of length 8 and 9)

When the number of letters found equals the number of letters in the
word and picks < 9, this process is stopped.
Now a DFS is done again, only on "legal" solutions, and only the
correct letters are used.
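The decoding step above can be sketched as follows (a reconstruction: it assumes "misses" counts every guess letter that does not occur anywhere in the secret word, which is what the table implies):

```python
# zigzag's trick: repeat the probed letters 1, 2 and 4 times (ABBCCCC),
# so each absent letter adds a distinct power of two to the miss count,
# which can then be decoded like a bitmask.

def present_letters(misses, letters=("A", "B", "C"), counts=(1, 2, 4)):
    """Letters of the ABBCCCC-style probe that must be in the word."""
    return [l for l, c in zip(letters, counts) if not misses & c]
```

For example, `present_letters(3)` returns `['C']`, matching the table's "if 3 misses, then C is in the word and A and B are not".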
++ Did zigzag use any letter frequency tables?  Where did you get them?
Yes, computed from the UNIX dictionary.
Used to optimize the DFS by searching the most likely solutions first.
++ If I had it to do all over - here's what I would try ....
Maybe a combination of a GA and a DFS.
My DFS could be optimized more, I think.
Better combinations to figure out which letters are used and how many
of each letter are used.

Live in : South-Africa, in Gauteng near Pretoria.
For Fun : Visit my girlfriend, play Quake, program games.
For work: Student at the University of Pretoria, doing Computer 
Science. I am a 2nd year student now.
Secret : Thats for you to find out.
Ideas :
* Playing a game against other people.
* Programming a robot who fights against the other bots.
* Quake like deathmatch in a 2D arena.
* or Maybe a Chesslike board game.
* or Chinese Checkers...
* or maybe Capture the Flag !!

            scharade    36 ( 10/ 12/ 14)   gC Michael Strauch
scharade submitted by Michael Strauch at michael.strauch@cityweb.de
++ Please summarize the algorithm used by scharade

In general, "scharade" uses the following strategy:

1) The first guesses test the complete alphabet, using a hard-coded
   list for each word length from 4 to 9 (if these first guesses
   reveal that all characters of the word have been identified, scharade
   continues immediately with step 2).

2) The next guesses are calculated by some kind of exhaustive search to find
   a word that could be a possible solution. This might not be the optimal
   strategy, it could be more efficient to use some guesses that are
   not possible any more, but I had no good idea how to accomplish this.
3) Sometimes, ten seconds are not enough for step 2 - if scharade exceeds a
   running time of 9 seconds, it stops immediately and chooses a random

I think step 2 needs some more explanation. First, I set up a
straight-forward backtracking algorithm for this task - but though I
optimized it a lot, scharade ran too often into step 3, especially for words
of length 8 or 9. Then I tried to make use of the fact that Fred said that
he would only use "spell-checking" words (see below). That made the situation
a little bit better, and decreased the average number of guesses.

Finally, I added an ingredient called the "Ordered Binary Decision Diagram"
(OBDD). This kind of data structure allows scharade to calculate quickly
all subsets of the alphabet that are possible candidates for Fred's hidden
word, according to the information of the previous guesses and their number
of "misses". Then the backtracking algorithm runs once for every possible
subset, now testing only words with characters from this subset, to find one
that meets the restrictions coming from the previous "hit" results. This
gave me the speed improvement I was looking for - step 3 was only reached
in very rare cases (on my machine, and Fred's machine seems to be comparable).

An OBDD is a kind of finite state machine that represents a binary
function. Scharade uses OBDDs to calculate the function f(x1,...,x26),
where x1,...,x26 are binary input variables representing one subset of the
alphabet, and where f(...)=1 if and only if (x1,...,x26) represents a subset
that meets all of those "miss restrictions". As you might have noticed, the
number of "misses" of a guess does not give you any information about the
order of the letters you are looking for; it only tells you which letters
are used. (Instead of using OBDDs, one could make a list of all ~5 million
possible subsets of the alphabet of size <=9 and delete all those subsets
that do not meet the "miss restrictions", but I do not think that would be
quicker.)
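The brute-force alternative mentioned in that last parenthesis can be sketched at toy scale (my reconstruction, not scharade's code; the miss rule, that a guess's misses equal the number of its letters absent from the word, is an assumption):

```python
# Encode each candidate alphabet subset as a bitmask and keep only those
# that reproduce every observed "miss" count.  A tiny alphabet stands in
# for A-Z so the enumeration stays small.
from itertools import combinations

def bit(c):
    return 1 << (ord(c) - ord("A"))

def feasible_subsets(guesses, alphabet="ABCDE", max_size=3):
    """All letter subsets consistent with the (guess, misses) history."""
    out = []
    for size in range(1, max_size + 1):
        for combo in combinations(alphabet, size):
            mask = sum(bit(c) for c in combo)
            if all(sum(1 for g in guess if not mask & bit(g)) == misses
                   for guess, misses in guesses):
                out.append("".join(combo))
    return out
```

For example, a guess "ABB" with 1 miss leaves exactly the subsets that contain B but not A; an OBDD represents the same feasible family implicitly instead of enumerating it.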

++ Did scharade use any letter frequency tables?  Where did you get them?

Scharade does not make use of a letter frequency table - but it uses
some data I extracted from a ~2.5 MB dictionary of English words. (I
took the dictionary from somewhere on the Web; don't ask me where I got
it.) The data I am talking about is the list of all 3-letter strings
that are part of some word in this dictionary (I found 7863 out of
26^3=17576 possible ones).
Words that contain a 3-letter string not in my dictionary are not used by
scharade - well, if Fred manages to find a word that spell-checks correctly
and contains one of those missing strings, scharade will fail
(no risk - no fun).
I managed to compress this "3-letter dictionary" into less than 5K of
source code (including a simple base64 decompressor). The only thing I
fear is that Fred will disqualify me - but I see my dictionary more as a
"statistical data table" than a real dictionary of English words (I hope
Fred thinks similarly about it).
	>>> Yeah, I allowed it ... =Fred
++ If I had it to do all over - here's what I would try ....

- try to develop a better theory of how to find "good" guesses
- try to encode more statistical information into 25Kb of
  source code
- perhaps, rewrite the thing in Perl, just for fun (well, that
  "10 secs per guess" rule gives me some headache about this)


I live in Germany, do POTM contests for fun, do software engineering for
work, won't tell my innermost secret -- it wouldn't be a secret any more
;-) -- and wonder each time where Fred gets all those ingenious ideas for
his contests from.

               steak    37 (  6/  9/ 22)   gc Ben Angrim
steak submitted by Ben Nye at angrim@montana.com
++ Please summarize the algorithm used by steak
The main idea of interest was to estimate the reduction in search space
resulting from a guess by creating a large sample of strings
that would be valid based on the previous guesses and the letter triples,
and then picking whichever guess partitioned that sample search space best.

This worked very well provided that I was able to find a large sample.
In the final test (final word) my program ran into time
trouble and created a very small sample search space, which resulted
in BAD partitioning.

The letter frequency tables were used to weight each entry in the
sample space based on how likely its letter pairs were; the 2 partitions
(XZXZ,XXXZ,ZZZZ,ZOOX),(ZOOM) would be scored better than the 3 partitions
(ZOOM,XZXZ,XXXZ),(ZZZZ),(ZOOX), for instance, because ZOOM would be
weighted so high.
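The weighting idea might look like this (a sketch with toy frequencies, not steak's real tables):

```python
# Weight each sample string by the product of its letter-pair frequencies,
# so that a partition isolating an English-looking string such as ZOOM is
# worth more than one isolating ZZZZ.  The table holds made-up values.

PAIR_FREQ = {"ZO": 0.8, "OO": 0.9, "OM": 0.7}   # toy table, an assumption

def englishness(word, default=0.05):
    weight = 1.0
    for a, b in zip(word, word[1:]):
        weight *= PAIR_FREQ.get(a + b, default)
    return weight
```

Under these toy numbers ZOOM scores 0.8 * 0.9 * 0.7 = 0.504, while XZXZ or ZZZZ fall to the tiny default-pair product.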

++ Did steak use any letter frequency tables?  Where did you get them?
Yes, 2 tables: 1 was the frequency of pairs of letters, the other was
the possibility of triples of letters.  This resulted in very English-like
words, some of them very funny :-)
I got my letter frequencies from the dictionary that came with the
hangman game on my Linux system.

++ If I had it to do all over - here's what I would try ....
Start coding 1 day earlier: I thought about the problem for months,
but didn't start coding till Thursday evening. The first version that
worked at all was around 4 PM Saturday. I knew that it was having
speed trouble with (some) 9-letter words when I sent it in but
was out of time.

Still living in Missoula, MT
do for fun: program, play Starcraft, once it gets a little warmer hike
Work: Unix/C/X11 programmer
Secret: a secret shared is no secret

               EISAR    38 ( 10/ 15/ 13)    c Alex Popiel
EISAR submitted by Alex Popiel at popiel@snugharbor.com
++ Please summarize the algorithm used by EISAR

EISAR's logic is divided into two parts: a deductive engine
and a guess generator.  Most of my work went into the deductive
engine, since that's what I understood the best.

The deductive engine worked to determine for each letter:
the maximum number of instances of it in the word, the
minimum number of instances of it in the word, and whether
or not it was in each of the possible positions.  After
each guess, it would cycle through all of the guess results
so far, deriving as much information as it could from the
guesses.  Many different rules were used to extract information
from the guess results... this is where about half of the 25K
program lived.  Based on the number of misses, it would figure
out which letters used in the guess were present in the target
word (raising the minimum count to 1) (it would use guesses
with repeated letters in proportions like 4 E's, 2 S's, and
1 R, so that the miss count was unambiguous).  Based on the
number of hits, it would fill in the position grid, and change
the minimum counts.  Based on unknown positions and other minima,
it would set the maximum count (and also would set the maximum
count based on hits if all unknowns were in a single guess).
By iterating through this repeatedly (and also using knowledge
like there was exactly one letter in each position), the
deductive engine would fill in positively known data at a
fairly brisk pace.
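The bookkeeping described above might be sketched like this (a reconstruction; the record layout and rule names are mine, not EISAR's):

```python
# Per-letter facts EISAR's deductive engine is described as tracking:
# minimum and maximum occurrence counts, plus the positions still allowed.
from dataclasses import dataclass, field
from string import ascii_uppercase

@dataclass
class LetterFacts:
    min_count: int = 0
    max_count: int = 9                        # no word is longer than 9
    allowed: set = field(default_factory=set)  # positions still possible

def init_facts(word_len):
    return {c: LetterFacts(allowed=set(range(word_len)))
            for c in ascii_uppercase}

def rule_absent(facts, letter):
    """Deduction after a miss count proves a letter is not in the word."""
    facts[letter].max_count = 0
    facts[letter].allowed.clear()

def rule_hit(facts, letter, pos):
    """Deduction after a letter is pinned to a position: no other letter
    may occupy that position."""
    for c, f in facts.items():
        if c != letter:
            f.allowed.discard(pos)
    facts[letter].min_count = max(facts[letter].min_count, 1)
```

Iterating rules like these until nothing changes is the "cycle through all of the guess results" loop the write-up describes.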

The guessing engine was a different beast entirely.  It ran
in two phases: first, figure out what letters were in the word
(and how many of each), and then second sort out what positions
they were in.  To figure out what letters were present, it would
generate guesses of two or three letters repeated in unambiguous
proportions, as mentioned above, starting with the most likely
occurring letters, and continuing until all letters were located.
Thrown in here would be guesses just to count occurrences of a known
present letter, if the probability of that letter being repeated
was higher than the probability that the next few unknown letters
would be present.  (Likelihoods were determined via composites of
raw frequency charts and frequency of bigrams with known present
letters.)  Also, during this phase, letters would be positioned
in the _least_ likely positions to try and keep the hit count low,
so that the negative info in the position matrix filled in faster.

The second phase of the guessing began when the minimum and maximum
counts were equal to each other for all letters (that is, all the
letters were known to be present or not, and in what quantity).
EISAR would then begin to alternate a try for the real word (based
on letter-in-position frequencies, bigram frequencies, and consonant
run-lengths) and position determination for a single letter.
Occasionally (as with DUCK), the first guess after figuring out
what letters were present would be the actual word.  After figuring
out positively where various letters were, it was generally pretty
good about getting English-like guesses.

The submission that actually ran in the finals is a rather old version
of EISAR.  I made a number of "improvements" to the deductive engine,
generalizing the rules it used and allowing it to use combinatorial
set theory to figure stuff out... but it ran very much slower, without
consistent improvement to the guess count.  (I have lots and lots of
histograms of running various versions of EISAR against my entire
dictionary, if anyone's curious to see them.)

One last note: EISAR is guaranteed to find the solution in less than
100 guesses... it will know the letters present in no more than about
20 guesses, and it can easily find the positions in the remaining
80 guesses.  This is true regardless of the english-ness of the word
to be guessed.

++ Did EISAR use any letter frequency tables?  Where did you get them?

Yes, EISAR used letter frequency tables of many forms.  I had
separate frequency tables for each length of word for each of
"Does this letter occur in the word?" and "Does this letter
occur more than once in the word?".  I also had a bigram frequency
table (which was not length differentiated).  All of this was
encoded in my program to reduce source size.

I generated all my tables myself based on a fairly nice dictionary
that I got from reading the message board.  Since the source of
the final test words was unknown, I did not worry overmuch about
whether my dictionary matched the word distribution of the final

++ If I had it to do all over - here's what I would try ....

I'm really not sure what I would do.  I probably spent too much
time on the deductive engine, and not enough on the guess generator...
but I was too interested in provably finding the answer in time,
instead of finding english-like words quickly.

I would have liked to differentiate my guessing algorithm by word
length (for instance, when finding which letters were present in
length four words, it would have been beneficial to quickly eliminate
the low-frequency letters in groups of four with the first couple

For the (missing?) Hints and Tricks section: Once again, I never
used malloc().  Didn't want to waste the time.  Also, all my state
information was in one big struct (with various arrays and such)
which could be shifted to and from disk easily with a single read
or write call.


I live in Mountain View, CA, USA.  I code for fun, I code for work,
code is my life.  (Well, okay, I also play games like scrabble and
othello and such with various friends, and go out to the occasional
movie, but don't tell anyone!)  My innermost secret?  Umm... I'm
obsessive-compulsive.  Wait, everybody knows that...

POTM ideas?  Given a set of shapes, how many different ways can you
fit them in a box?  Given a list of words, determine which are least
likely to be confusable?  Maybe a head-to-head game where you have
three artillery pieces, and each turn you can move them a certain
distance, or fire them, but not both, and the last one with a gun
left wins?

- Alex

          SimpleMind    39 ( 12/ 14/ 13)    c Edmond Bernardy
SimpleMind submitted by Edmond Bernardy at ebernardy@lucent.com
A History of SimpleMind, by ebernardy@lucent.com

A good strategy for solving a problem like mastermind seemed to be
to apply the concepts of information theory, and to try to maximize
at each step the information obtained in response to a proposed word.
The two numbers (hits and misses) can be considered as a message,
and if the probabilities of the different possible messages are known
(or can be computed/estimated), the information content of the
received message can be computed.
(The information of a message is computed as -SUM (p_i log(p_i)),
where p_i is the probability of a message instance, and the sum is
taken over all possible messages).
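Sketched over a candidate sample, the measure looks like this (a reconstruction; the (hits, misses) scoring rule is my assumption about the contest, not quoted from it):

```python
# Treat each possible (hits, misses) reply as a message and compute the
# entropy -SUM(p_i log p_i) of the reply distribution within the sample.
import math
from collections import Counter

def score(guess, word):
    hits = sum(1 for g, w in zip(guess, word) if g == w)
    misses = sum(1 for g in guess if g not in word)
    return hits, misses

def information(guess, sample):
    """Expected information (in bits) gained by making this guess."""
    counts = Counter(score(guess, w) for w in sample)
    n = len(sample)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())
```

Against the sample MOOD, MOOT, MOON, ROOT, the guess MOOT yields the reply (4,0) once and (3,1) three times, about 0.81 bits; a guess splitting the sample into more distinct replies would carry more.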

The strategy is simple to state, but Fred took care to make the
search space so large - up to 26^9, or about 5.4E12, at the
beginning - that counting the number of possible answers is not
a realistic proposition.
The idea was therefore to generate at each step a (representative)
sample of possible words, and to determine in that restricted sample
space the best proposal.

It was also decided to use only words which were consistent with
previous answers, although in some cases this is not the best choice.

An efficient method for generating possible words was obtained by
creating two arrays, which are updated at each trial and saved in
the temporary file.
One array contains the possible letter sets of already tried letters,
the other array contains the possible letters for each letter position.
Possible letter sets are coded as bit fields, 26 bits fitting nicely
into a 4 byte integer.
Sounds simple, but it took more time than expected to debug and make
it efficient, eliminating duplicates or impossible letter combinations.
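The bit-field representation is straightforward to sketch (helper names are mine):

```python
# Pack a set of letters A-Z into one integer, one bit per letter, so a
# whole letter set fits in a 4-byte value.

def to_mask(letters):
    mask = 0
    for c in letters:
        mask |= 1 << (ord(c) - ord("A"))
    return mask

def has(mask, c):
    return bool(mask & (1 << (ord(c) - ord("A"))))
```

With this packing, set intersection, union and difference become single AND, OR and AND-NOT operations on integers, which is what makes the candidate-generation arrays cheap to update.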

This strategy seemed to work OK for short words, but for words of more
than 6 letters it became obvious that letter and di-(tri?)graph frequencies
must be taken into account.

At this point, time was running short.
Single letter frequencies and frequencies of preceding and following
letters in digraphs were obtained for all words of at least 4 letters
from Webster's dictionary.

The present version of SimpleMind generates a word by first choosing
among the allowed letter sets the shortest set with the highest letter
frequency, and putting the letters in allowed positions. The word is
completed by adding the most probable letters, taking into account
already fixed preceding or following letters.
Weak spots are:
- different letter frequencies should be used for the first and last
  letters of a word: digraphs like 'rd' can appear at the end of a word,
  but not at the beginning
- in the first trials more weight should be given to using untried letters
  and avoiding (?) repeated letters

Conclusion: great fun, but next time START EARLIER!

      Nero_the_Horse    42 ( 12/ 16/ 14)  PERL Mike Liddell
Nero_the_Horse submitted by Mike Liddell at mike_liddell@hotmail.com
++ Please summarize the algorithm used by Nero_the_Horse
constraint solver --- use previous guesses to determine all possible
patterns, e.g. ABxxE would allow ABCDE (all letter restrictions are
observed, e.g. the third letter might be A..P,Q,S..Z if we know that R
isn't allowed).  This bit works fine, but I did very little work on using
this information -- I don't make particularly good guesses, and my initial
guesses were made semi-randomly.

++ Did Nero_the_Horse use any letter frequency tables?  Where did you get 
nup...it is not english-specific (dammit)

++ If I had it to do all over - here's what I would try ....
same approach, possibly in C as speed is a concern.  Spend time analysing
the second part of the problem (choosing the best guess once full info is

Melbourne, Aus.  wrote this potm entry in London, currently in Philadelphia.

Disclaimer: I had to stop working on this potm at the end of March due to 
travel etc.... so I have an excuse!

John Fitz: no entry?!?!?

Thanks once again, Fred, for a fun comp.

         not_bullish    45 ( 10/ 11/ 24)  PERL Eric Prestemon
not_bullish submitted by Eric Prestemon at ecp@io.com
++ Please summarize the algorithm used by not_bullish

Dumb initial guesses.

In the second phase, after sorting the previous guesses by hits and then
misses, it randomly fiddled its current guess to match as many previous
guesses as possible. Too slow on long words, so it would short-circuit

++ Did not_bullish use any letter frequency tables?  Where did you get them?

The program had no knowledge of the English language except the 26
letters. Not even its dumb guesses were optimized.

++ If I had it to do all over - here's what I would try ....

I did do it all over, with a best-first search algorithm that worked much
better on most words, but was prohibitively slow. Then I lost interest.


I program at an internet startup in silicon valley. However, I still
manage to lead a relatively sane life.

               TOEFL    46 ( 12/ 13/ 21)    c Sergey Kondrachshenko
TOEFL submitted by Sergey Kondrachshenko at SKondras@iacgsz.pavlodar.kz
++ Please summarize the algorithm used by TOEFL
++ Did TOEFL use any letter frequency tables?  Where did you get them?

 The algorithm uses the following statistical data:
- a ranking of the letters of the alphabet in decreasing order of
frequency of use;
- a ranking of the most frequently occurring letter pairs (in decreasing
order);
- a list of letter pairs that never occurred in the processed list of
words of the English language;
- a ranking of the most frequently occurring groups of three letters (in
decreasing order).
 The first three lists were built for each possible word length; the
last one is global.
 To obtain this statistical data, the words from the following source
were analysed: ftp://ftp.simtel.net/pub/simtelnet/msdos/lingstcs/ files
words1.zip, words2.zip, words3.zip, words4.zip.

The algorithm consists of two parts:

1. Generation of a new guess.
 To check whether letters are present in the word, a repetition scheme
such as 4444221 is used, depending on the length of the word, lengthening
the longest component as necessary (for example 22221 or 44444221); once
a letter is known to be absent, the scheme is followed strictly.
 Letters are tested in rank order from the first list, and later taking
into account the letters already found and the letter pairs from the
second list.
 When there is a letter whose presence at a certain position needs
checking, that check is carried out in parallel with the check of the
next portion of letters.
 The letter to position-check is chosen in the following order: first
from groups of three letters, then from pairs of letters, then from
other variants in view of the absent pairs (the letter with the fewest
possible positions is taken, and the position with the greatest number
of possible variants is checked), and in the last turn simply any
possible variant (chosen as in the previous case).
 For N=7 the scheme 4444221 is used if there are no letters to
position-check; otherwise the scheme 221 is used.
 When checking whether a letter is present, the guess is constructed,
whenever possible, so that the analysis also has the opportunity (under
certain conditions) to determine unambiguously the position occupied by
the letter (for example, for 1 copy this is a position for which no
letter has yet been determined, and for 2 copies it is one position
already occupied and one free).

2. Analysis of the results of the previous guess.
 From MISS and the scheme used, the letters present in the word are
determined and, when it is unambiguously possible, also the positions
they occupy (the various combinations of HITS and MISS are interpreted
in view of the guess offered).
 Additional analysis is then carried out:
a) positions already determined unambiguously need not be checked again;
b) if there is a letter with no determined position and only one
possible unchecked position, that position is determined exactly, and
item a) is applied;
c) if all letters have been checked for presence and there is a position
that can be occupied by only one letter, that position is determined
exactly, and items a) and b) are applied.
 If the sum of the determined positions and the letters not yet having
any position equals N, the remaining letters are not checked for
presence.
 If exactly one position remains undetermined for a letter, the guess
uses all the guessed letters plus the letter being position-checked, so
that if it coincides the complete answer is obtained at once.


Where do you live :  Republik of Kazakhstan ,  city Pavlodar
Do for fun :  programming, logical game
Do for work :  programmer, temporarily unemployed

      Leap_Year_Baby    50 ( 12/ 20/ 18)   gc Yoichi Kono
Leap_Year_Baby submitted by Yoichi Kono at kono98@geocities.co.jp
++ Please summarize the algorithm used by Leap_Year_Baby

In the first several steps, try to figure out which characters are
used exactly. After these steps, just try to figure out the
positions from the history. (Anyway it is not so smart.)

++ Did Leap_Year_Baby use any letter frequency tables?  Where did you get them?

Yes. I got them from an English-Japanese online dictionary :-)

++ If I had it to do all over - here's what I would try ....


          Cowculator    55 ( 19/ 18/ 18)  JAVA Ted Alper
Cowculator submitted by Ted Alper at alper@turing.stanford.edu
++ Please summarize the algorithm used by Cowculator
optimized for speed of writing the program, not quality of results...
I thought about the problem for a little while back in February, but
never got around to coding. The day before the deadline, I had
the chance to write something, so I tried to put the gist of the idea
in, but didn't have time to do it all.

The goal was:
     (A) use initial guesses to get precise information about which
     letters occur -- for example, if you guess ABBCCCC, then the
     number of misses uniquely determines which letters among "A",
     "B", and "C" occur in the answer. Of course, you may also get
     some information about letter placement.
     (B) once the number of possible letter-sequences that could be
     the answer becomes manageable, generate the collection of them
     and now make each guess one of the possible answers which
     does the best job of winnowing the set of possible answers in the
     worst case. 

++ Did Cowculator use any letter frequency tables?  Where did you get them?
NO, I had intended it to do so,
but I didn't have time to implement that. Heck, I even
just used alphabetical order to make my initial guesses.

++ If I had it to do all over - here's what I would try ....
I'd start coding more than 24 hours before the deadline! I'd use
letter frequency information to improve (A) and (B) -- guess most
common letters first, weight the possible answers by their likelihood
of being a word, etc. I'd improve my use of the information
from hits and misses, and my generation of valid words.

Palo Alto, CA. No fun; I wish I had more of a programming job;
lately I spend most of my work-time tutoring math students over
the internet.  

          Revolution    58 ( 11/ 24/ 23)    C Dave Krofssik
Revolution submitted by Dave Krofssik davek@soft-tek.com
++ Please summarize the algorithm used by Revolution

Save any previous match and what was hit and missed for later
comparisons.  If the word was all misses, eliminate all letters in
that word from the alphabet.

Output the letters in the alphabet (in frequency order) until
all the letters have been used at least once.

Go through all combinations of letters that have not been eliminated,
searching for one that is compatible with the responses that were
previously recorded (i.e. a guess that would produce all the previous
responses if it were the correct one).

If a word was not found before time expired, output a word that
contains only the next letter in the alphabet that has not been
absolutely included or eliminated.

> ++ Did Revolution use any letter frequency tables?  Where did you
> get them?

I did use a table I found on a website, but I did not record which one.

> ++ If I had it to do all over - here's what I would try ....

Speed up the processing, so I could handle more combinations in the
time allotted.

            MOOguess    59 ( 19/ 22/ 18)    c Arkady Zhikharev
MOOguess submitted by Arkady Zhikharev at arzhik@yorp.yaroslavl.su
++ Please summarize the algorithm used by MOOguess
I think that it isn't interesting to anybody (look at the finals :( )

++ Did MOOguess use any letter frequency tables?  Where did you get them?
Look above

++ If I had it to do all over - here's what I would try ....
I would think a little more... :) Unfortunately I don't have any free time

Russia. For fun I solved the last POTM problem (POTM is GOOD IDEA and GOOD
PROGRAMMERS TEST!). My work - to develop SCADA (Supervisory Control and Data
Acquisition) program. Secret ... shhhhhh - is a secret ;) If any idea
visits me, of course, I will let you know.

           MOOtPoint    61 ( 13/ 24/ 24)    c Shawn Smith
MOOtPoint submitted by Shawn Smith at smiths2zzyy@worldnet.att.net
++ Please summarize the algorithm used by MOOtPoint

There's nothing special about it.  It has two phases:  the first phase is
used to find all the letters that exist in the word, and the second phase
is used to find out where each letter goes.  It doesn't recognize all the
history in the game that would cut down the number of guesses, so it is
far from optimal.  The ONLY somewhat tricky part of the first phase is
that it uses the number of misses with a different number of each letter
to tell what letters are and are not in the target word.

++ Did MOOtPoint use any letter frequency tables?  Where did you get them?

It used letter frequency tables in the sense that the first phase selects
letters for its guesses from an examination of letter frequencies that I
did on a dictionary that I downloaded from the internet; I can't remember
where.  Here is a list of the initial guesses:

static int NLetter[10] = {0,0,0,0,2,2,2,3,3,3};
static int Dist[10][3] = {{0,0,0},{0,0,0},{0,0,0},{0,0,0},{1,3,0},
static char const *IniGuesses[6][14]  = {
static char const LetterOrder[6][28] = {

++ If I had it to do all over - here's what I would try ....

A few things:
1.  I was hoping to have a set of trees (1 for each word size) that would
    be used to select the next word that MOOtPoint would guess, given the
    current selections that it made and the results obtained.
2.  Instead of trying to find where each letter goes individually, guess
    possible solutions and examine the results for the next guess.  For
    example:  If I guessed PINETREES for PINEAPPLE, the result would have
    been 4 hits and 3 misses.  I would try to find the "next" word that
    would have 4 letters in the same position, and three letters that were
    not in PINETREES.
3.  Use neural nets to recognize English words, in order to cut down the
    number of possible guesses that would be available for #2.  There would
    have been one net for each size of word, and the inputs to each net
    would have been the letters, probably encoded in some way, and the
    output would have been 0 or 1, for answering the question, "Is this an
    english word?"


Las Vegas, Nevada.

The POTM, bicycle riding (because I'm too cheap to have a car), and watching a
few programs on television.  Also, finding a completely optimal solution
for the test words using my algorithm in the hopes that I could win a "Worst
abuse of the rules" consolation prize.

Maintain and modify part of the IR (information retrieval) system used on
the Yucca Mountain Site Characterization Project, for TRW Environmental
Safety Systems.

See my response on inverse boggle POTM.


        Daily_Farmer    67 ( 11/ 33/ 23)    C Gabriel Gambetta
Daily_Farmer submitted by Gabriel Gambetta at ggambett@ucu.edu.uy
++ Please summarize the algorithm used by Daily_Farmer
Quite straightforward. I'll use the word DOGS as an example.
I call LETTER a character that is neither a MISS nor a HIT
(i.e., right letter in a wrong place).

1) Isolate the "groups" where the HITS and LETTERS are. 
DF does that by checking the alphabet in groups, like this:

ABCD 0 3 (1L)
EFGH 1 3 (1H)
IJKL 0 4 (nothing)
MNOP 0 3 (1L)
QRST 0 3 (1L) ---> Got all the letters, stop here

It saves the useful groups (ABCD 1L, EFGH 1H, MNOP 1L, QRST 1L). 
It also saves an "empty" letter, in this case I (the first 
letter in the first useless group).

2) Get the HITS. It does this by testing each letter of the 
	group in its own place. 

EIII 0 4
IFII 0 4
IIGI 1 3 => Current word = "__G_"

3) For each group containing letters, check each letter in a 
valid place (a place where it was not tested and is
not used). This is for finding the letters and also for finding a 
hit with a lucky guess. It puts the current word's characters in, for the 
case in which the last letter is found, so the final guess 
is the "full guess".

IAGI 1 3
BIGI 1 3
CIGI 1 3 => No need to test D; it MUST be the letter in 
this group. So, locate the letter by testing it in the remaining places.

DIGI 2 2 => Current word = "D_G_"

Continue this process until the word is found.

4) (optional) If all groups were tested and the game continues 
(the word was not guessed), there were repeated letters. 
So begin testing each already-used letter in any unused space.

++++++++ Did Daily_Farmer use any letter frequency tables?  

Yes, actually it checked against EASR, INTO... I don't remember 
where I got these. I did my own research on a word list, 
but I don't remember which frequency chart I used in the end. 
The one I started using was posted on the Bulletin Board.

The other language-specific optimization I used was to arrange the 
valid positions for testing words such that a test that would 
imply two consecutive vowels goes later.

++++++++ If I had it to do all over - here's what I would try ....

I stopped coding and thinking about the problem after my 
program compiled for the first time. It was before Easter, I think. 
I had a couple of ideas anyway, like testing hits and letters 
from different groups in the same test. I don't think this 
could help much. Another idea was dividing the tests into halves 
or thirds, so if ABCD had a hit, then test AB with the second 
half of another group containing a hit, and so on... 


Montevideo, Uruguay, somewhere lost in South America
Play soccer, write programs, study Computer Science!
Write administrative-like programs for my company
Actually I'm The Brain, and this is just a small part of my 
	newest plan to take over the world...
Something like an Improved Showdown, maybe a tank war, with 
	terrain attributes, line of sight... I don't know...

            oxmoomoo    73 ( 17/ 22/ 34)    C Kwok Ngai
oxmoomoo submitted by Kwok Ngai at kwok@esun.att.com
++ Please summarize the algorithm used by oxmoomoo
Fill all positions with a list of letters.
While there are GOOD letters on the list:
	Loop through each position to determine the number of GOOD letters.
	For each GOOD letter:
		Loop through each position to determine the GOOD spot.

Continue the above process until all GOOD letters are found.  It also
keeps track of all possible GOOD spots and absolute BAD spots to speed up
or eliminate searching.

++ Did oxmoomoo use any letter frequency tables?  Where did you get them?
++ If I had it to do all over - here's what I would try ....
Do not know yet.

 I live in NJ and I am a programmer.


               sieve    73 ( 15/ 29/ 29)  JAVA Aivars Zogla
sieve submitted by Aivars Zogla at fix.lv!uvsk@kcig2.att.att.com
++ Please summarize the algorithm used by sieve
Two stages:
1) determining of letters
2) determining of correct positions
At the first stage, for words made of 4-6 letters, each guess looked like
ABBB[BB]; for words longer than 6: ABBCCCC[CC]. If doubles, or triples,
or so, appeared, the next few guesses were of the format AAAA[AAAAA] - a
single letter used.
Second stage - each guess took 2 letters plus an additional letter which isn't
in the word at all: ABXX[XXXXX]. Depending on the misses, a decision was taken
to make the next approaching guess to determine the correct position of at
least one letter. On a 2-miss guess, the first letter (out of the 2) was
queued for further examination.
Here's "sieve's" fight for PINEAPPLE:

no.	guess	hits	miss             no.	guess	hits	miss
===	=====	====	======           ===	=====	====	======
1	AEEIIIIII	0	0         14	EAFFFFFFF	0	7
2	OUUPPPPPP	2	3         15	AIFFFFFFF	1	7
3	RSSMMMMMM	0	9         16	AFFFFFFFF	0	8
4	NGGWWWWWW	0	8         17	PFNFFFFFF	2	7
5	KLLCCCCCC	0	7         18	FFFLEFFFF	0	7
6	DHHVVVVVV	0	9         19	FFFEPFFFF	1	7
7	TQQJJJJJJ	0	9         20	FFFEFFFFF	1	8
8	YBBXXXXXX	0	9         21	FFFFPEFFF	0	7
9	FZZYYYYYY	0	9         22	FFFFEAFFF	0	7
10	EEEEEEEEE	2	0         23	FFFFALFFF	1	7
11	AAAAAAAAA	1	0         24	FFFFAFFFF	1	8
12	IIIIIIIII	1	0         25	FFFFFPPFF	2	7
13	PPPPPPPPP	3	0         26	FFFFFFFEL	0	7
                                          27	PINEAPPLE	9	0

The most interesting thing at this POTM was the temporary file. For such an
algorithm it was enough to store at most about 30 bytes of "knowledge about
the past". It's even possible to restore the whole sequence from this
knowledge at any particular point of the game.

++ Did sieve use any letter frequency tables?  Where did you get them?
I did. Vowels first, consonants after. What frequency table could matter
over just 3 final words?

++ If I had it to do all over - here's what I would try ....
I got some new and fresh ideas for myself out of this POTM. I wouldn't like
to forget them.

Ugale, Latvia. For fun I'm thinking what I could do for fun. 
For work - still trying to pour knowledge about algorithms 
	into high-school students' heads.
Secret is very simple - there already exist perfect pictures 
	in nature, we just have to make them visible.
Fred's ideas for POTM are OK!

              Abacus    78 ( 17/ 28/ 33)    c Igor Volkov
Abacus submitted by Igor Volkov at i-volkov@bigfoot.com
++ Please summarize the algorithm used by Abacus
	Using a letter frequency table I tried to eliminate unused symbols,
then found duplicates.
++ Did Abacus use any letter frequency tables?  Where did you get them?
	It used a frequency table of letters. I got it from a list of (approx.
100000) English words.

++ If I had it to do all over - here's what I would try ....
	I would try to find more elegant solution.

	Belarus, Minsk. I work at Software Engineering Dept., BSU. 

           Max_Power    78 ( 20/ 23/ 35)   gC Ed Mandy
Max_Power submitted by Ed Mandy at erm@moby.atcorp.com
++ Please summarize the algorithm used by Max_Power

It's a brute-force approach that uses some "smarts" to cut down on the 
iterations necessary.  It should always get any (legal) word in the 
allotted time/iterations.

The algorithm is based on the idea that if you had an N-letter 
word, you could try every letter of the alphabet in order to figure out 
which letters are in the word.  

This pass has guesses that look like:

Then, once you are done with that pass, you know which letters are in 
and which letters are out.  You can then start placing the letters in 
their respective positions.

For example, if you knew that the letters A, C, S, and T were in the 
word, then you could go through iterations that looked like:
BABB		(now we know that A is in the 2nd space)
CBBB		(now we know that C is in the 1st space)
BBBS		(now we know that the string is CATS)

I also do some combining of the 2 passes just to cut down on iterations.

++ Did Max_Power use any letter frequency tables?  Where did you get them?

Yes, well, kind of.  It uses a rearranged alphabet so that the most 
likely letters are tried first.

I got the frequency table from a WWW site.

++ If I had it to do all over - here's what I would try ....

I would finish off my approach.  I think it's a good solid approach, 
but I did not optimize it enough.


In a suburb of Minneapolis, MN.


Play Magic (and now Pokemon), rollerblade, play computer games, play 
with computer software and hardware, play with my fiance, etc.


I program for Triticom (www.triticom.com).

              newbie    81 ( 17/ 30/ 34)   gc Arlie Stephens
newbie submitted by Arlie Stephens at arlie@netcom.com
++ Please summarize the algorithm used by newbie

Newbie was very much unfinished, due to my not spending sufficient time on it...
I'm embarrassed at the simple-minded algorithm... 

Very simpleminded: 
	- scan letters in bunches to determine whether or not they were in the 
		word (Mostly winnowing "not present" letters; some info saved
		for later)
	- rescan letters remaining individually to determine presence/absence
		and number of instances present
	- try each present letter in all possible positions (one guess per
		candidate position, using known-non-present fill)

++ Did newbie use any letter frequency tables?  Where did you get them?

Only the basic "ETAIONSHR" is most frequent in english, in that order, from
memory. (I used to solve ciphers for fun, and learned that list at that time.)

++ If I had it to do all over - here's what I would try ....

First and foremost, spend more time developing it. 

Second, read the rules more carefully. I had a better algorithm for determining
letter presence that I didn't use because I got it from another entrant, and 
didn't realize that it was legal to use his algorithm, provided I didn't use
his code. 

If I stayed with the simplistic algorithm above, a whole bunch of refinements
were in my to-do list. This included switching to slightly different algorithm
variants (including letter frequencies, batch sizes when excluding letters, 
etc.) for each word length, and a bunch of changes to garner more information 
with each guess. Also, take more advantage of known properties of the English
language; the algorithm I used took essentially no advantage of the fact that
the target was known to be an English word. 

I'd also have experimented with an entirely different approach. (Or perhaps
several.) I had an intuition that the algorithm style I used would max out 
(not be farther improvable) at a relatively high number of guesses, whereas
an algorithm based on letter patterns (arrangements, letter frequencies) 
within actual words of each length would potentially do much better ... but
do much worse in its early stages, be significantly more work to develop,
and like-as-not exceed the code size limit. 

Certainly I would have run dictionaries through various candidate algorithms 
looking for which worked best. 


I live in Sunnyvale CA (i.e. Silicon Valley). I work as a software engineer.
I haven't done much actual programming lately at work (I spend my time 
writing documents, attending meetings, etc.) and have noticed that my skills 
are slipping; I'm doing POTM in an attempt to get some of the rust off my
programming skills, and keep it off. 

             Mooniac    84 ( 22/ 30/ 32)  JAVA Denis Konovalchik
Mooniac submitted by Denis Konovalchik at denisk@compassplus.ru
++ Please summarize the algorithm used by Mooniac

My algorithm is simple enough. I ordered all Latin letters by 
their frequencies. Then I tried to compose a whole word from the
current letter. If there were no "cows" for this word, it was necessary 
to take the next letter, and so on. When the letter appeared in
the word, the algorithm tries to determine its positions in the 
word by the bisection method -- by splitting the "unguessed"
word's letter set into two parts, and applying this method to 
each of them. This routine runs until all occurrences
of the current letter are determined.

++ Did Mooniac use any letter frequency tables?  Where did you get them?

Yeah, 6 vectors of the single letter frequencies (for each word 
length, from 4 to 9 letters). These vectors were built on the
dictionary placed to the net by one kind man (he also dropped the 
link to this site into the POTM message board). Many thanxx for

++ If I had it to do all over - here's what I would try ....

It was necessary to upgrade this algorithm, but I didn't have 
enough time for it ;-( The main idea was to try each new letter not
alone but in company with other letters. So, receiving "4 misses" 
in answer to the version "ABBCCCC" we'll know that the word
has letters "A" and "B" within but doesn't contain the letter "C".


I live in Magnitogorsk, Russia.
Our city is famous for its location (we live both in Europe and Asia;
the border between these continents, the Ural River, splits our
city into two parts) and for its metallurgical plant (the biggest in 
the world - that's why the ecology is rather poor ;-( ).
The ice-hockey team "Metallurg" of our city this year became 
champion of Russia and Europe (it's a pity that it will not play
the NHL champion this season).

I'm a 24 y.o. programmer, working in a software firm, menacing C++ 
with Java and writing code on Microsoft mustdie (c) Windows. In
my vacation I write letters to Russian computer magazines, 
including "Computerra" (www.computerra.ru), where my opus about the
POTM was published a year ago (Fred Great and Awful even placed the 
link to it on his pretty site ;-)) I am also a
postgraduate student of Magnitogorsk Technical University.
I would be happy to work as a programmer in the United States some time later;
it's the real dream for every serious IT specialist in my
country. Anyway, for now I'm enjoying life.
Thanx very much everybody who entered this contest! I hope that 
we'll meet here once more.

      Brave_Cow_Word    90 ( 26/ 36/ 28)    c Lourenco Arnasalon
	Sorry ... No questionnaire was returned for Brave_Cow_Word

           HitOrMiss    91 ( 17/ 29/ 45)  JAVA Stuart Rowland
HitOrMiss submitted by Stuart Rowland at swrowland@lucent.com
++ Please summarize the algorithm used by HitOrMiss

HitOrMiss used logic to try to determine the word, mostly by
elimination.  It could solve any 4-9 letter sequence whether it was an
English word or not.  It first tried to eliminate as many letters as
possible by guessing words that contained the least frequently occurring
letters.  If the response was all misses, all the letters were
eliminated.  If the response showed that at least one letter was
correct, but no letter was in the correct position, each letter in the
guess could be eliminated in that position.  The temporary file
consisted of the set of possible letters for each position, the set of
letters that were known to be in the word, the set of letters that were
known not to be in the word, and the guess history.  The history was
reviewed each time.  If a guess in the history had one hit and all but
one of the letters were known not to be in the word, the remaining
letter had to be correct.  If a position was narrowed down to two
possible letters, a guess was constructed using one of the choices and
letters known not to be in the word.  The response would either select
the letter or reject it.  If a letter known to be in the word occurred
in only two positions, a guess was formed with the letter in one of the
positions and letters known not to be in the word.

++ Did HitOrMiss use any letter frequency tables?  Where did you get them?

I combed a dictionary to find the frequency of occurrence of the
letters in each position over the sets of words of length 4 to 9.
++ If I had it to do all over - here's what I would try ....


New Jersey

In the Seventh Guest adventure game there is a microbe game that would
make an interesting competition.  It is a territory game played on a
square grid.  Each player starts with a single piece in opposite
corners.  A move consists of either cloning or jumping.  A clone move
creates a new piece in one of the 8 empty spaces next to an existing
piece.  Jumping is moving a piece to an empty square two away in one of
the 8 possible directions.  At the end of a move, any of the opponents
pieces that touch the moved piece are changed to the players color.  The
game ends when all positions are occupied.  The winner is the player
with the most pieces.

          Spelling-B    93 ( 21/ 38/ 34)    c Andrew Gauld
Spelling-B submitted by Andrew Gauld at andrew.gauld@att.com
++ Please summarize the algorithm used by Spelling-B
Two-pass method: pass 1 finds how many of which letters are contained in the
word by using guesses containing only single letters (EEEEE, TTTTT, etc.); once
all letters are identified, go to the second pass.
The second pass identifies positions one by one, using a known bad letter to
pad the guesses (EZZZZ, TZZZZ); this proceeds until the last two letters,
at which point I just guess their order.

++ Did Spelling-B use any letter frequency tables?  Where did you get them?
Just the old traditional ETAOIN SHRDLU as a seed for the letter search

++ If I had it to do all over - here's what I would try ....
Winning the lottery so I'd have more time to spend on it.


               rayka    94 ( 23/ 37/ 34)    c O.Jansen-Olliges A.Gerns
rayka was submitted by Olaf Jansen-Olliges and Arnd Gerns
	at Ojan_Band@t-online.de
++ Please summarize the algorithm used by rayka

it's a really dumb one:

1. step: collect the letters in the word by testing
 aaaa, eeee, ffff, and so on.
2. step: find out the positions of the letters with the help of a letter
not in the word,
e.g. q (eqqq, qeqq, qqeq, ...);
after this step you can build the word.

++ Did rayka use any letter frequency tables?  Where did you get them?

Yes, we built them from a UNIX dictionary file.

++ If I had it to do all over - here's what I would try ....

1. do it better ;-) (perhaps find more than one letter per try, e.g. abbb
instead of aaaa)
2. spend more time (but unfortunately this time we had a lot to do)


Arnd lives in Braunschweig and I (Olaf) come from Hannover

And sometimes we like to sit together in my home-office  (a sun and
some LINUX boxes (5)) wasting some time programming c or c++ (not for
the job).

rayka - by the way - is the name of my daughter (15 months), one of the
reasons for me to leave my home-office not too late.

I'm looking forward to the next POTM-Problem !!

           superfreq    94 ( 20/ 36/ 38)  PERL Dan Schmidt
superfreq submitted by Dan Schmidt at dfan@alum.mit.edu
++ Please summarize the algorithm used by superfreq

superfreq is incredibly simple; it's actually the first program I got
working.  I then started a totally new program, superfreq2, which only
ever worked on a subset of legal words, but it does get HONESTLY in 7,
faster than any of the programs on the list :)

I'll describe superfreq2 from now on, since it's more interesting.  It
first determines what letters are in the word.  It guesses words like
EEEETTA, since we then know exactly what letters are in the word,
given the number of misses.

I then go through all possible permutations of those letters, checking
for compatibility with the results of previous guesses and then scoring
them.  Scoring is done with the aid of a trigraph table; the table
starts like this:

  ED$ 3073
  ING 2844
  NG$ 2505

where each line is the number of times that trigraph occurred in
/usr/dict/words (^ and $ stand for beginning- and end-of-string).

Anyway, the score is just the sum of these numbers for each
three-letter sequence in the word we're trying out: trigraphs not in
the table get a score of -10000.

Here it is at its best:

  d:\src\moo> sue HONESTLY
  ASAEEEEE -> 1 hits, 2 misses
  RRSRIRIN -> 0 hits, 6 misses
  TLOOLEOO -> 0 hits, 0 misses
  UECUDCCC -> 0 hits, 7 misses
  PMGPOMMM -> 0 hits, 7 misses
  BBBBHYHS -> 0 hits, 4 misses
  HONESTLY -> 2625 / 8 hits, 0 misses
  Got it in 7 guesses.

  d:\src\moo> sue LATRINES
  ASAEEEEE -> 1 hits, 0 misses
  RRSRIRIN -> 2 hits, 0 misses
  TLOOLEOO -> 0 hits, 4 misses
  LATRINES -> 4426 / 8 hits, 0 misses
  Got it in 4 guesses.

So it did great on 7+ letter words with no duplicates, but if there
were too many duplicates then my permutation code was too slow, and I
never even tried to handle 6-letter words (my EEEETTA trick for
quickly finding all the letters only works for 7+ letter words).

I had a fast multiset-permutation generator (stolen from Knuth),
scoring function, and validity checker, but even so I couldn't handle
more than one duplicate letter in a 9-letter word on my P166.

++ Did superfreq use any letter frequency tables?  Where did you get them?

It used a trigraph table that I created from /usr/dict/words on my system.
The only interesting thing about it is that it uses ^ and $ as well as
the 26 letters.  I think that helped a lot.

++ If I had it to do all over - here's what I would try ....

I would finish my better program!  But I never got a good idea for
what to do for short words or ones with too many duplicates.


I live in Cambridge, MA, and write music software.
A POTM idea is forthcoming...

                 Dan Schmidt -> dfan@harmonixmusic.com, dfan@alum.mit.edu
Honest Bob & the                http://www2.thecia.net/users/dfan/
Factory-to-Dealer Incentives -> http://www2.thecia.net/users/dfan/hbob/
          Gamelan Galak Tika -> http://web.mit.edu/galak-tika/www/

          Moot_Force    97 ( 26/ 35/ 36)   SH Brian Rapp
Moot_Force submitted by Brian Rapp at rapp@boutell.com
++ Please summarize the algorithm used by Moot_Force
Moot_Force uses a method which is guaranteed to solve the problem
in less than 100 guesses, but it is a very primitive method indeed.
It first guesses words consisting of all one letter (e.g. AAAAA)
and continues to do so until it discovers all the letters in the word.
Along the way it remembers one letter that is not in the word (e.g. Q).
In the next phase, it solves each position of the word one at a time
by guessing a valid letter in that position and the invalid letter
for all other positions (e.g. AQQQQ).  When all positions are known,
it suddenly guesses the correct word.  Because Moot_Force follows
such a simple strategy, it can almost completely ignore MISS data
and store its knowledge in a very small temp file.  Clearly such a
brute force strategy cannot _win_ the contest, but it does represent
an interesting and at least technically successful extreme case.

++ Did Moot_Force use any letter frequency tables?  Where did you get them?
Moot_Force does have a frequency list from which it guesses the letters
in order during the first phase.  Because the first positions in the word
will have more possibilities to be guessed in the second phase, I biased
the list somewhat towards letters near the beginning of words.  Because
Moot_Force isn't terribly concerned with winning, this list hasn't been
optimized very much, but it might be better than just using alphabetical order.

++ If I had it to do all over - here's what I would try ....
I found it very amusing to program Moot_Force in shell script,
but if I were to put in the effort for a serious attempt I'd certainly
need to start over from scratch.

Maryland.  Games.  Programming.  Mercury.  Battle-Royale of all programs.

            themoops   101 ( 32/ 30/ 39)    c Stephen Palmer
themoops submitted by Stephen Palmer at srpalmer@att.com
++ Please summarize the algorithm used by themoops
Very simple algorithm.  Loop through all letters (using a crude
order based on most common letters first) to find out how many of
each letter are in the word.  Once all letters are known, search
for one letter at a time to find its position.  Once all positions
are known, you've got the word.

++ Did themoops use any letter frequency tables?  Where did you get them?
A frequency table was used, and there seem to be loads of examples
depending on the source.  Mine came from a web site that ranked letter
usage based on several popular novels.

++ If I had it to do all over - here's what I would try ....
I never cared about the misses, and determined the letters and
positions only from hits.  I'm sure I could have decreased the number
of guesses if I had tried to use all available information.

I live in central New Jersey and work for AT&T designing web solutions
using Microsoft NT.  And I love POTM problems.

                 pat   104 ( 24/ 36/ 44)    c Bernard Hatt
pat submitted by Bernard Hatt at bmh@arkady.demon.co.uk
++ Please summarize the algorithm used by pat

The algorithm was to change one letter at a time and learn
which letters were correct or could be eliminated. From previous
POTM experience I have learned that simple robust algorithms
are better able to withstand Fred's choice of test data :-)
++ Did pat use any letter frequency tables?  Where did you get them?

Letter frequency tables were generated from the expanded
GNU ispell English dictionary, which stores words in a root+affix format.

For each position in the word I had a table of the most likely
letters for that position in order, and for each letter in each
position an ordered list of the previous and next letters.
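Building the per-position table is straightforward (a hedged sketch; the author used the expanded GNU ispell dictionary, but any word list works here):

```python
from collections import Counter

def position_tables(words):
    """For each position, the letters seen there, most frequent first.
    (The author built these from the expanded GNU ispell dictionary;
    any word list will do for the sketch.)"""
    by_pos = {}
    for w in words:
        for i, ch in enumerate(w):
            by_pos.setdefault(i, Counter())[ch] += 1
    return {i: [ch for ch, _ in c.most_common()] for i, c in by_pos.items()}
```

The previous/next-letter tables the author mentions are built the same way, just keyed on (letter, position) pairs instead of position alone.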
++ If I had it to do all over - here's what I would try ....

Changing only one letter at a time was never going to be a very
fast algorithm; (I think) a better solution would be to change
several letters at a time and store the information about
what was learned (e.g. there is no 'Z' in the word, or there is an 'H'
or a 'K' in the word, but not in position 3, etc.) to pick
the best guesses from the letter probabilities and the stored information.

I'm based in Hemel Hempstead in the UK (about 25 miles NW of London)


Programming/Gardening/Cooking/Eating/Wine making/Drinking.


I work as a sysadmin for a large database company, where I am one of
the very few people who knows very little about databases.



How about some type of shape-fitting problem, like Tetris?

            simental   104 ( 26/ 40/ 38)    c Marty Fouts
simental submitted by Marty Fouts at potm@fogey.com
++ Please summarize the algorithm used by simental

Simental is a 'stupid guesser' algorithm that did about as poorly as I
expected it would.  It works in two passes:

   * Pass 1 -- try each letter of the alphabet in decreasing frequency
     order to find out what letters are in the target word.

   * Pass 2 -- try each letter of the word in each possible position
     in the word, in left to right order, skipping those positions
     that have already been found.

The only 'cleverness' at all in simental is that the letters of the
alphabet were tried according to a frequency order, rather than alphabetically.

++ Did simental use any letter frequency tables?  Where did you get them?

Yes.  I invented the frequency table by analysing several online word lists.

++ If I had it to do all over - here's what I would try ....

Almost anything.  Unfortunately, schedule intervened and I never got
around to improving the algorithm that I sent in as a place holder.



Play classical guitar.

That the *real* answer to the universe is 17, not 42.

No, I'm not clever at making puzzles.

          Fools_Gold   111 ( 32/ 35/ 44)    c Jeffrey Jackson
Fools_Gold submitted by Jeffrey Jackson at jjackson@harmonic.com
++ Please summarize the algorithm used by Fools_Gold

step 1.  using my letter frequency array, make 'solid' guesses (e.g. SSSSSS
or HHHHHH for a 6 letter word) until i've found all the letters in the word.

step 2.  for each letter in the unknown word (left ot right) go through all
the letters we know are hits and check if it hits at that particular spot.
i used a letter known NOT to hit in the other spaces so i could check just
the one letter at a time.
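step 2 above can be sketched like this (an editorial Python sketch, not the entry; `hits` counts exact-position matches, and the filler is assumed to be a letter already known to be absent, so any hit must come from the probed letter):

```python
# Feedback stand-in: number of exact-position matches ("hits").
def hits(guess, secret):
    return sum(g == s for g, s in zip(guess, secret))

def locate(letter, filler, secret):
    """Probe one position per guess: filler is a letter already known
    NOT to be in the word, so any hit must come from the probed letter."""
    n = len(secret)
    return [i for i in range(n)
            if hits(filler * i + letter + filler * (n - i - 1), secret) == 1]
```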

++ Did Fools_Gold use any letter frequency tables?  Where did you get them?

i did use one.  it was just a simple array ordering the 26 letters.  i
pulled several thousand lines of plain text off the web and wrote a little
program that counted each letter.  i then ordered my array from most common
occurring letter to least.

++ If I had it to do all over - here's what I would try ....

i would come up with another dimension to my frequency table.  this other
dimension would carry frequencies for the most common letters following
certain other letters.  for example, the letter 'V' is almost always
followed by a vowel.  so if i found a 'V' anywhere, i could check the
vowels after that instead of sticking to my one dimensional frequency table.
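That second dimension is just a bigram successor table, which can be built from the same plain-text corpus (a hedged sketch, not the author's program):

```python
from collections import Counter

def successor_table(words):
    """For each letter, the letters that most often follow it; the entry
    for 'V', for instance, should be dominated by vowels."""
    table = {}
    for w in words:
        for a, b in zip(w, w[1:]):
            table.setdefault(a, Counter())[b] += 1
    return {a: [b for b, _ in c.most_common()] for a, c in table.items()}
```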


i live in minneapolis and write software for a living.  a real shocker i'm
sure.  haha.  fun?  is there time for that?  :-)  hey, if i told you my
innermost secret, it wouldn't be secret anymore.  :-)  and it might even
end up out on the internet somewhere!  haha.  yikes.

potm ideas....hmmmm.  although i didn't have time to code and submit an
entry, i really liked the one a few contests ago called 'scenic tour'.
more problems similar to that would be fun.  

               naive   111 ( 30/ 37/ 44)   gc Michael Chen
	Sorry ... No questionnaire was returned for naive

             MereMOO  1021 (999/ 11/ 11)    c Pavel Zaletskiy
MereMOO submitted by Pavel Zaletskiy at pavelz@dpt.ustu.ru
++ Please summarize the algorithm used by MereMOO

My program works in two steps:
1) determining the letters present in the word;
2) determining the word consisting of those letters.

For the first step the following algorithm was used.
If the number of letters in the word is greater than or equal to 7,
then each guessing word consists of 3 different letters.
The first letter appears there once, the second twice, and the
third four times. The other positions are filled with letters
already known to be present in the word.
From the result of the guess the program can determine whether
each of the three letters is in or out of the word.

If the word length is smaller than 7, the letters are divided into
groups, and for each group the program makes several guesses. Each
guess can ask for information about one group and one letter of
another group. If the word length is >= 5, each guess also asks
about one more letter, placed in the word in several copies so that
its presence can be determined unambiguously.

In the second step the program generates all possible words. For each
word it calculates a criterion and then chooses the word with the best
value. This criterion estimates both the probability that the word is
an English one and the number of candidate words that the next guess
would exclude.
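The exclusion half of that criterion can be sketched as follows (an editorial Python sketch under the simplifying assumption that feedback is just the number of exact-position hits): group the remaining candidates by the feedback a guess would receive; the smaller the largest group, the more words the guess is guaranteed to exclude.

```python
from collections import Counter

# Feedback stand-in: number of exact-position matches ("hits").
def hits(guess, secret):
    return sum(g == s for g, s in zip(guess, secret))

def elimination_score(guess, candidates):
    """Group the remaining candidates by the feedback this guess would
    receive; the smaller the largest group, the more words the guess is
    guaranteed to exclude."""
    groups = Counter(hits(guess, c) for c in candidates)
    return len(candidates) - max(groups.values())

def best_guess(candidates):
    return max(candidates, key=lambda g: elimination_score(g, candidates))
```

A real entry would blend this with the Englishness estimate rather than use it alone.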

++ Did MereMOO use any letter frequency tables?  Where did you get them?

Yes, it uses table for pairs and also it knows about mostly used triples and
chains of four letters.

++ If I had it to do all over - here's what I would try ....

I can only suggest trying to add some heuristics for short words (n <= 6)


I live in the Russian city of Ekaterinburg.
And I do it just for fun. Also so that my brain does not become too soft.
It is interesting for me to try such things, the more so because I can
do them rather well.
For work? - Maybe; I'm a student now and it can be a good recommendation.
Of course, if it works. :-)

          u_can_mu_2  1051 ( 22/ 30/999)   gc Mark McCann
u_can_mu_2 submitted by Mark McCann at McCann.Mark.MA@bhp.com.au
++ Please summarize the algorithm used by u_can_mu_2
	The u_can_mu_2 program used a simple algorithm. It 
was based on two parts: discovering the letters and discovering 
the positions. The first part was achieved by outputting all 
the letters and using the number of misses as a guide. The 
second part was achieved by trying the letters found in each 
place and using the number of hits to indicate whether each is in the 
right position. I combined the two methods to achieve a moderately 
fast implementation.
	The program stores in the file the word discovered so 
far, the size, the letter it last tried, the letters it knows 
are in the word with the number of times each occurs, and the last 
position it tried. 

   read input
   if ( size only )
      initialise file
   read file
   if number of misses == 0
      if the first known letter has unknown occurrences
         set occurrences to number of hits
         add the letter last tried to the known letters
         get the next letter to try
         if no other known letter
            set occurrences to number of hits - number in word
            set occurrences to unknown
   else if number of hits does not equal number of letters in the word
      add the known letter to the known word at that position
      get the next letter to try
      increment the position to try for the known letter
      get the next letter to try

   if the first known letter has unknown occurrences
      output a word filled completely with that letter
   else
      create the output word from: the default word, the
         known letter at the next position, and the next letter
         to try in the rest of the positions

   save the file and output the word.
	Simple, but I made a small mistake in one of the checks, 
which is why it failed; I didn't bother fixing it as I 
thought I could create a better method. (See question 3)

++ Did u_can_mu_2 use any letter frequency tables?  Where did you get them?
	Yes, found the letter frequency in a posting on the 
	POTM message board. The order is not that important 
	so I wasn't too worried about its accuracy. 

++ If I had it to do all over - here's what I would try ....
	I started on a plan to use a constraint based approach. 
	The program would choose a random set of letters and output it. 
	The program would then read in the result as a constraint. 
The program would then choose the next set of letters so that it 
conforms to this constraint. It would output it. 
	The program would continue to read, add the new constraint 
and choose the letters and their positions so that they did not 
violate any of the constraints read in so far. 
	The tricky part was choosing the letters so that you didn't 
run out of possibilities. To achieve this the program ordered the 
constraints so that the most restrictive were complied with first. 
Unfortunately the program was only partially successful and I ran 
out of time to fix the problems.    

	Newcastle, Australia (A city about 300km up the coast from 
		Sydney. Famous for its Hunter Valley wines.)
	Argue with project managers 
	Argue with project managers
	I really don't enjoy arguing with project managers
	An aid to arguing with project managers. 

      Scrooge_MooDuc  2006 (  8/999/999)    C Boris Bulayev
Scrooge_MooDuc submitted by Boris Bulayev at bbboris@rinet.ru
++ Please summarize the algorithm used by Scrooge_MooDuc

1. To find all the good letters.
2. To put them onto correct places
3. To fill the spaces.

++ Did Scrooge_MooDuc use any letter frequency tables?  Where did you get them?


++ If I had it to do all over - here's what I would try ....

normal genetic algorithm with letter frequency tables


hasn't changed

                Stan  2006 (  8/999/999)   gC Steven Burnap
	Sorry ... No questionnaire was returned for Stan

              GotMoo  2019 ( 21/999/999)  JAVA Jeffrey Fielding
GotMoo submitted by Jeffrey Fielding at JJProg@cyberbury.net
++ Please summarize the algorithm used by GotMoo
GotMoo first guesses a word made up of all A's. Then, for each word
after that, it starts with its last guess and increments the characters
until it finds a word that could be the answer. It checks each word
against each of its previous guesses by putting in the test word as the
answer and calculating the results for its previous guesses. If the
results differ from the actual results, it doesn't guess that word. When
it comes to a word that works, it guesses it.
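The consistency test and the odometer-style search can be sketched like this (an editorial Python sketch, not GotMoo itself; `hits` stands in for the judge's feedback by counting exact-position matches):

```python
# Feedback stand-in: number of exact-position matches ("hits").
def hits(guess, secret):
    return sum(g == s for g, s in zip(guess, secret))

def consistent(candidate, history):
    """candidate could be the answer only if every previous guess would
    have received exactly the feedback it actually got."""
    return all(hits(guess, candidate) == result for guess, result in history)

def next_guess(last, history, alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Treat the word as a base-26 number and count upward from the last
    guess until a consistent candidate appears."""
    n, base = len(last), len(alphabet)
    value = 0
    for ch in last:
        value = value * base + alphabet.index(ch)
    for v in range(value + 1, base ** n):
        word, x = "", v
        for _ in range(n):
            word = alphabet[x % base] + word
            x //= base
        if consistent(word, history):
            return word
    return None  # search space exhausted
```

Starting from all A's and never revisiting earlier words, this enumerates candidates in strict lexicographic order, exactly as described.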

++ Did GotMoo use any letter frequency tables?  Where did you get them?
I tried using letter frequency tables by writing another program to
count the letter frequencies for each word and letter position in that
word in a list of words that was on the message board. It improved the
program a bit, but it didn't work very well when I ported it to Java (I
wrote most of my test programs in Euphoria since it is simple).

++ If I had it to do all over - here's what I would try ....
I'd try to get my program to work using the algorithms I described, and
I would add some procedures to determine the probability that a
particular word is actually a word.

I live in Connecticut. I like to program and play chess.

           Moohahaha  2997 (999/999/999)    c Gneen Halfling
	Sorry ... No questionnaire was returned for Moohahaha

          UdderGuess  2997 (999/999/999)  JAVA Jim Rothenberger
	Sorry ... No questionnaire was returned for UdderGuess

               XYZZY  2997 (999/999/999)   gC Joshua Huntington
	Sorry ... No questionnaire was returned for XYZZY

              bovine  2997 (999/999/999)    c Mark Masten
	Sorry ... No questionnaire was returned for bovine

             nayobka  2997 (999/999/999)    c Evgeniya Bagdasarova
	Sorry ... No questionnaire was returned for nayobka
