Coding Countdown: The Letters Round

For the benefit of the uninitiated, Countdown is a long-running British TV game show comprising of multiple ‘letters’ and ‘numbers’ rounds, and concluding with an anagrams game called the ‘conundrum’. If, unlike me, you didn’t spend your entire childhood watching it with your grandparents, find out more here.

This blog post is about cheating in the letters round with the help of a bit of Python code. It seemed worth writing about as the obvious solution is far from optimal, and the optimal solution is far from obvious (to me, anyway). I’ll talk you through how to find all of the possible five-, six-, seven-, eight- and nine-letter English words in under 0.2 seconds.

Setting Up the Problem

Obviously, any algorithm is going to need some way of finding out whether or not some fudged combination of letters is a (British) English word.  For this purpose, I downloaded a spell-checking text file containing around 80,000 words from here.

For the purposes of testing, I also thought it would be best to simulate the drawing of vowels and consonants at the beginning of each game. Allegedly, ‘the frequencies of the letters within each pile are weighted according to their frequency in natural English, in the same manner as Scrabble. Candidates are allowed to choose how many vowels and consonants go on the board, to a total of nine, but typically pick between two and four vowels.

Some basic code to start us off, then, is going to look like this:

from random import sample, randint
from sys import argv

def solve(letters):
    '''
    Function to return the 'winning' countdown words
    for a given string of letters.
    '''
    words_file = open('brit-a-z.txt')
    # We'll write code to find the winning words here

# Allow the option for the user to pass their own letters.
# If none are given, generate nine random letters.
try:
    letters = argv[1]
except IndexError:
    VOWELS = 'eeeeeeeeeeeeaaaaaaaaaiiiiiiiiioooooooouuuu'
    CONSONANTS = 'nnnnnnrrrrrrttttttllllssssddddgggbbccmmppffhhvvwwyykjxqz'
    n_vowels = randint(2,4)
    vowels = sample(VOWELS, n_vowels)
    consonants = sample(CONSONANTS, 9 - n_vowels)
    letters = vowels + consonants
# Pass these letters to 'solve' function.
solve(letters)

The Obvious Way

The obvious way to solve this problem is to look at every possible permutation of the nine letters, and check each one against the word list. Python’s itertools.permute function will handle this nicely for us.

Supposing we were interested in words of five letters or more, we can calculate the number of possible permutations as follows (‘nPr’ denotes the number of r-permutations of n):

9P9 + 9P8 + 9P7 + 9P6 + 9P5 = 362880 + 362880 + 181440 + 60480 + 15120 = 982800

As for looking up words in the list, we could read the words file into a Python list using the ‘readlines’ method, loop over the permutations, then ask the computer to check if each permutation was in the list. This would be a bad idea for two reasons:

  • Lookups take a long time in large lists as they are O(n) operations.
  • Some of the words in the dictionary have fewer than five letters, and some have more than nine, meaning that we’re wasting time flicking through extraneous words.

As a solution, we could do something like this:

valid_words = set(word.strip() for word in f.readlines()
    if len(word) > 4 and len(word) < 10)

Here, we use a list comprehension to filter out words with too many or too few letters, and put them all in a set, in which lookup is close to O(1). (The .strip() method removes any newline and whitespace characters from the words.)

Even better, since the list of words isn’t going to change each time we run the program, we could pickle the valid_words set to a file and then just load it up again each time we run the program. Always try to avoid repeated effort, whether it’s yours or the computer’s! Our ‘solve’ function now looks like this, and takes around 0.6 seconds to solve a typical countdown letters round.

def solve(letters):
    '''
    Function to return the 'winning' countdown words
    for a given string of letters.
    '''
    valid_words = pickle.load( open('word_set.p') )
    for i in xrange(5, 10):
        print '\nLooking for %s letter words...\n'%(i)
        log = set()
        for permuted_letters in permutations(letters, i):
            word = ''.join(permuted_letters)
            if word in valid_words:
                log.add(word)
        found_words = ', '.join(
            sorted(word for word in log))
        print found_words

A Better Way

I grant that 0.6 seconds isn’t all that bad. However, the algorithm doesn’t scale well. What if we wanted to play ‘Super-Countdown’ with 12 letters? 12 letters have 12! permutations, which is over 479 million. We would, literally, be here all day.

A much more sophisticated method is to remove the requirement to permute the letters by sorting them. Let me demonstrate using the letters a, p and t.

There are 3! = 6 ways of permuting these letters. If we went through all six, we would find that we could spell ‘apt‘, ‘pat’ and ‘tap‘. However, what if we had a pre-sorted dictionary which told us every word that could be spelt using the letters a, p and t? We would only need to make one check, not six. This is crucial. This changes the problem from a permutations problem to a combinations problem. We can calculate the number of possible combinations for five to nine letters from nine as follows (‘nCr’ denotes the number of r-combinations of n):

9C9 + 9C8 + 9C7 + 9C6 + 9C5 = 1 + 9 + 36 + 84 + 126 = 256

Yes, we just reduced the search space by a factor of almost 4000.

Now, about that pre-sorted dictionary. We could quickly pickle something to do this. Note that dictionary lookup is also ~ O(1). You can read about the ‘-1′ argument to pickle.dump here, or simply think of it as magic and move on.

import cPickle as pickle

f = open('brit-a-z.txt')
valid_words = {}
for word in f:
    word = word.strip()
    l = len(word)
    if l < 5 or l > 9 or "'" in word:
        continue
    sorted_word = ''.join(sorted(word))
    if sorted_word not in valid_words:
        valid_words[sorted_word] = [word]
    else:
        valid_words[sorted_word].append(word)
# Improve performance slightly by storing the
# valid words as tuples instead of lists.
for key, value in valid_words.items():
    valid_words[key] = tuple(value)
pickle.dump(valid_words, open('word_dict.p', 'w'), -1)

Our finished Python code looks like this. It solves the Countdown letters round in about 0.11 seconds on an Intel i5 Apple Mac; most of which is taken up through loading the valid_words dictionary into memory.

from random import sample, randint
from sys import argv
from itertools import combinations
import cPickle as pickle

def solve(letters):
    '''
    Function to return the 'winning' countdown words
    for a given string of letters.
    '''
    valid_words = pickle.load( open('word_dict.p') )
    letters = sorted(letters)
    for i in xrange(5, 10):
        print '\nLooking for %s letter words...\n'%(i)
        log = set()
        for combined_letters in combinations(letters, i):
            sorted_word = ''.join(combined_letters)
            if sorted_word in valid_words:
                for word in valid_words[sorted_word]:
                    log.add(word)
        found_words = ', '.join(
            sorted(word for word in log))
        print found_words

# Allow the option for the user to pass their own letters.
# If none are given, generate nine random letters.
try:
    letters = argv[1]
except IndexError:
    VOWELS = 'eeeeeeeeeeeeaaaaaaaaaiiiiiiiiioooooooouuuu'
    CONSONANTS = 'nnnnnnrrrrrrttttttllllssssddddgggbbccmmppffhhvvwwyykjxqz'
    n_vowels = randint(2,4)
    vowels = sample(VOWELS, n_vowels)
    consonants = sample(CONSONANTS, 9 - n_vowels)
    letters = vowels + consonants
# Pass these letters to 'solve' function.
solve(letters)

Do you have any ideas on how this could be optimised further? Let me know!

The full source code can be downloaded as a zip file by clicking the following link:
countdown_letters