As nltk-trainer has become more stable, I've realized that I need some way to test the command line scripts. My previous ad hoc method of “test whatever script options I can remember” was becoming unwieldy and unreliable. But how do you make repeatable tests for a command line script? It doesn’t really fit into the standard unit testing model.
Enter roundup by Blake Mizerany. (NOTE: do not try to do apt-get install roundup. You will get an issue tracking system, not a script testing tool.)
Roundup provides a great way to prevent shell bugs by creating simple test functions within a shell script. Here’s the first dozen lines of train_classifier.sh, which you can probably guess tests train_classifier.py:
#!/usr/bin/env roundup

describe "train_classifier.py"

it_displays_usage_when_no_arguments() {
    ./train_classifier.py 2>&1 | grep -q "usage: train_classifier.py"
}

it_cannot_find_foo() {
    last_line=$(./train_classifier.py foo 2>&1 | tail -n 1)
    test "$last_line" "=" "ValueError: cannot find corpus path for foo"
}
describe is like the name of a module or test case, and all test functions begin with it_. Within the test functions, you use standard shell commands that should produce no output on success (like grep -q or the test command). You can also match multiple lines of output, as in:
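To make the pass/fail convention concrete, here is a minimal sketch in plain shell. It is not roundup itself: fake_cli is a hypothetical stand-in for a script under test, and run_test is a hypothetical mini-runner that only illustrates how a test function's exit status could be turned into a PASS or FAIL line.

```shell
#!/bin/sh

# Hypothetical stand-in for a command line script under test:
# it prints a usage message to stderr.
fake_cli() {
    echo "usage: fake_cli [options]" >&2
}

# Succeeds: grep -q prints nothing and exits 0 when the pattern matches.
it_displays_usage() {
    fake_cli 2>&1 | grep -q "usage: fake_cli"
}

# Fails: test exits non-zero because the two strings differ.
it_prints_version() {
    test "$(fake_cli 2>&1 | tail -n 1)" "=" "fake_cli 1.0"
}

# Hypothetical mini-runner (an assumption for illustration only):
# report PASS or FAIL based on the function's exit status.
run_test() {
    if "$1"; then
        echo "$1: [PASS]"
    else
        echo "$1: [FAIL]"
    fi
}

run_test it_displays_usage
run_test it_prints_version
```

In both functions the exit status of the last command is what decides the outcome, which is why silent commands like grep -q and test fit so well.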
it_trains_movie_reviews_paras() {
    test "$(./train_classifier.py movie_reviews --no-pickle --no-eval --fraction 0.5 --instances paras)" "=" "loading movie_reviews
2 labels: ['neg', 'pos']
1000 training feats, 1000 testing feats
training NaiveBayes classifier"
}
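The multi-line comparison works because a double-quoted shell string can span several lines, and command substitution strips trailing newlines from the captured output. A small sketch of the same idea, using a hypothetical fake_trainer function in place of a real script:

```shell
#!/bin/sh

# Hypothetical stand-in that prints several lines, like a training script.
fake_trainer() {
    echo "loading corpus"
    echo "2 labels"
    echo "training classifier"
}

# The expected text spans three lines and has no trailing newline,
# because $(...) removes trailing newlines from the captured output.
expected="loading corpus
2 labels
training classifier"

it_prints_all_lines() {
    test "$(fake_trainer)" "=" "$expected"
}

# Prints a message only if the whole output matched.
it_prints_all_lines && echo "output matched"
```

Since the comparison is exact, any change in the script's output, even in whitespace inside a line, makes the test fail.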
Once you’ve got all your test functions defined, make sure your test script is executable and roundup is installed, then run your test script. You’ll get nice output that looks like:
nltk-trainer$ tests/train_classifier.sh
train_classifier.py
  it_displays_usage_when_no_arguments: [PASS]
  it_cannot_find_foo: [PASS]
  it_cannot_import_reader: [PASS]
  it_trains_movie_reviews_paras: [PASS]
  it_trains_corpora_movie_reviews_paras: [PASS]
  it_cross_fold_validates: [PASS]
  it_trains_movie_reviews_sents: [PASS]
  it_trains_movie_reviews_maxent: [PASS]
  it_shows_most_informative: [PASS]
=========================================================
Tests: 9 | Passed: 9 | Failed: 0
So far, roundup has been a perfect tool for testing all the nltk-trainer scripts, and the only downside is the one-time manual installation. I highly recommend it for anyone writing custom commands and scripts, no matter what language you use to write them.