Spelling Replacers in Microsoft Speller Challenge

Microsoft/Bing recently introduced its Speller Challenge, and I immediately thought of using my spelling replacer code from Chapter 2, Replacing and Correcting Words, of Python Text Processing with NLTK Cookbook. The API is now online and can be accessed with a GET request to http://text-processing.com/api/spellcorrect/?runID=replacers&q=WORD. With an Expected F1 of ~0.5, I’m currently at number 12 on the Leaderboard, though I don’t expect that position to last long (I was at 10 when I first wrote this).

I’m actually quite surprised the score is as high as it is, considering how simple the approach is. It suggests there’s merit in replacing repeating characters, and/or that Enchant generally gives decent spelling suggestions when they are filtered by edit distance. Here’s an outline of the code, which should make sense if you’re familiar with the replacers module from that chapter:

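from replacers import RepeatReplacer, SpellingReplacer

repeat_replacer = RepeatReplacer()
spelling_replacer = SpellingReplacer()

def replacer_suggest(word):
    # First try collapsing repeated characters.
    suggest = repeat_replacer.replace(word)

    # If that changed nothing, fall back to a spelling correction.
    if suggest == word:
        suggest = spelling_replacer.replace(word)

    # Return a single (suggestion, weight) pair.
    return [(suggest, 1.0)]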
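
If you don’t have the replacers module to hand, the repeat-then-spelling fallback can be sketched in a self-contained way. This is only an illustration under stated assumptions, not the cookbook’s implementation: KNOWN_WORDS is a made-up vocabulary standing in for the real dictionary lookups (the version above relies on Enchant for suggestions), the max_dist default of 2 is an assumed threshold, and collapse_repeats, closest_known_word and suggest are hypothetical names.

```python
import re

# Hypothetical stand-in vocabulary; a real dictionary lookup would go here.
KNOWN_WORDS = {"love", "book", "language", "spelling", "good"}

# Matches a word containing at least one doubled character.
REPEAT_PATTERN = re.compile(r"(\w*)(\w)\2(\w*)")


def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def collapse_repeats(word):
    """Drop one doubled character at a time until the word is known."""
    while word not in KNOWN_WORDS:
        shorter = REPEAT_PATTERN.sub(r"\1\2\3", word)
        if shorter == word:  # no doubled characters left
            break
        word = shorter
    return word


def closest_known_word(word, max_dist=2):
    """Return the nearest known word within max_dist edits, else the word."""
    if word in KNOWN_WORDS:
        return word
    best = min(sorted(KNOWN_WORDS), key=lambda known: edit_distance(word, known))
    return best if edit_distance(word, best) <= max_dist else word


def suggest(word):
    """Try repeat collapsing first; fall back to the edit-distance lookup."""
    candidate = collapse_repeats(word)
    if candidate == word:
        candidate = closest_known_word(word)
    return [(candidate, 1.0)]


if __name__ == "__main__":
    print(suggest("looooove"))  # [('love', 1.0)]
    print(suggest("languege"))  # [('language', 1.0)]
```

The edit-distance cap is what keeps the fallback conservative: a word with no close match in the vocabulary is returned unchanged rather than replaced by something unrelated.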
  • Do they not release the code for any of the submissions? I couldn’t find any information on that on the web site. If that’s the case, entering into that contest sounds like a way to work for Microsoft for free… :/

  • The winners have to write a paper to collect their prize, but you’re right, it’s basically cheap R&D for Microsoft.