On September 14, 2011, I’ll be giving a 20-minute overview of NLTK for the San Francisco Python Meetup Group. Since it’s only 20 minutes, I can’t go into much detail, but I plan to quickly cover the basics of:
- tokenization and why it’s not as easy as str.split()
- part-of-speech tagging and why it’s important
- chunking and named entity recognition
- text classification and how it works for sentiment analysis
- training your own models with nltk-trainer
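To give a rough sense of the tokenization point, here is a minimal sketch that uses only Python’s standard re module rather than NLTK’s own tokenizers. The sample sentence and the regular expression are illustrative assumptions, not material from the talk, and a real tokenizer handles many more cases (abbreviations, URLs, hyphenation, and so on).

```python
import re

# Hypothetical sample sentence, chosen only for illustration.
text = "Hello, world! NLTK isn't hard."

# Whitespace splitting leaves punctuation glued to the neighboring word.
split_tokens = text.split()

# A simple pattern: a run of word characters with an optional internal
# apostrophe part (so "isn't" stays whole), or a single punctuation mark.
pattern = r"\w+(?:'\w+)?|[^\w\s]"
regex_tokens = re.findall(pattern, text)

print(split_tokens)
print(regex_tokens)
```

With str.split(), "Hello," and "hard." keep their punctuation, so they would not match "Hello" or "hard" in a vocabulary lookup; the regex version separates punctuation into its own tokens.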
I’ll also be soliciting feedback for an NLTK tutorial at PyCon 2012. So if you’ll be at the meetup and are interested in attending an NLTK tutorial, come find me and tell me what you’d want to learn.
Updated 9/15/2011: Slides from the talk are online – NLTK in 20 minutes