Programming Collective Intelligence Review

Programming Collective Intelligence is a great conceptual introduction to many common machine learning algorithms and techniques. It covers classification algorithms such as Naive Bayes and Neural Networks, and algorithmic optimization approaches like Genetic Programming. The book also manages to pick interesting example applications, such as stock price prediction and topic identification.

There are two chapters in particular that stand out to me. First is Chapter 6, which covers Naive Bayes classification. What stood out was that the algorithm presented is an online learner, which means it can be updated as data comes in, unlike the NLTK NaiveBayesClassifier, which can be trained only once. Another thing that caught my attention was Fisher’s method, which is not implemented in NLTK, but could be with a little work. Apparently Fisher’s method is great for spam filtering, and is used by the SpamBayes Outlook plugin (which is also written in Python).
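To make the "online learner" idea concrete, here is a minimal sketch of a Naive Bayes classifier that can be updated one example at a time. It is not the book's code and not NLTK's API: the class name `OnlineNaiveBayes`, its methods, and the add-one smoothing are my own assumptions, chosen only to show that training amounts to incrementing counts.

```python
import math
from collections import defaultdict


class OnlineNaiveBayes:
    """Hypothetical incremental Naive Bayes over word-presence features."""

    def __init__(self):
        # Number of training documents seen per label.
        self.label_counts = defaultdict(int)
        # Per label: number of documents containing each word.
        self.word_counts = defaultdict(lambda: defaultdict(int))

    def train(self, words, label):
        """Update the counts with one labeled document; callable at any time."""
        self.label_counts[label] += 1
        for word in set(words):
            self.word_counts[label][word] += 1

    def classify(self, words):
        """Return the most probable label, or None if nothing was trained."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            # Log prior plus the sum of log likelihoods avoids underflow.
            score = math.log(count / total)
            for word in set(words):
                seen = self.word_counts[label].get(word, 0)
                # Add-one smoothing (an assumption) keeps unseen words
                # from zeroing out the whole product.
                score += math.log((seen + 1) / (count + 2))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    clf = OnlineNaiveBayes()
    clf.train(["cheap", "pills"], "spam")
    clf.train(["meeting", "tomorrow"], "ham")
    print(clf.classify(["cheap"]))
    # New data can arrive later without retraining from scratch.
    clf.train(["cheap", "lunch"], "ham")
    print(clf.classify(["cheap", "lunch"]))
```

Because the model is nothing more than counts, each call to `train` is cheap and the classifier is usable between updates, which is the property that sets it apart from a train-once classifier.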

Second, I found Chapter 9, which covers Support Vector Machines and Kernel Methods, to be quite intuitive. It explains the idea by starting with examples of linear classification and its shortfalls. The examples then show that by first transforming the data in a particular way, linear classification suddenly becomes possible. The kernel trick is simply a neat and efficient way to reduce the amount of calculation needed to train a classifier on the transformed data.
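A small sketch may help show why the kernel trick saves work. The example below is my own illustration rather than the book's code: it assumes 2D points and a degree-2 polynomial kernel, and the function names are hypothetical. It compares the dot product computed after explicitly mapping each point into a higher-dimensional space with the same value obtained directly from the original coordinates.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def explicit_map(point):
    """Map a 2D point (x, y) to 3D: (x^2, sqrt(2)*x*y, y^2)."""
    x, y = point
    return (x * x, math.sqrt(2) * x * y, y * y)


def poly_kernel(a, b):
    """Degree-2 polynomial kernel: the squared dot product in 2D."""
    return dot(a, b) ** 2


if __name__ == "__main__":
    a, b = (1.0, 2.0), (3.0, -1.0)
    # Slow route: transform both points, then take the dot product.
    print(dot(explicit_map(a), explicit_map(b)))
    # Kernel route: same value without ever building the 3D vectors.
    print(poly_kernel(a, b))
```

Both routes agree up to floating-point rounding, but the kernel version never constructs the transformed vectors. With higher-degree kernels or more input dimensions the transformed space grows quickly, so skipping it is where the savings come from.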

The final chapter summarizes all the key algorithms, and for many it includes commentary on their strengths and weaknesses. This seems like valuable reference material, especially when you have a new data set to learn from and you’re not sure which algorithms will get the results you’re looking for. Overall, I found Programming Collective Intelligence to be an enjoyable read on my Kindle 3, and I highly recommend it to anyone getting started with machine learning and Python, as well as anyone interested in a general survey of machine learning algorithms.

  • http://www.michaeldhealy.com Michael D. Healy

    First Python book I ever read was PCI, and I still pick it up to re-read sections from time to time.
    Very approachable although I wouldn’t suggest trying to learn Python from PCI . . . trust me on that one.
    Highly recommended.
    Michael

  • http://alexott.net Alex Ott

    Manning will also release a similar book called “Machine Learning in Action” with examples in Python…

  • http://www.prodigyproductionsllc.com LuCuS

    I too want to give a thumbs-up for “Programming Collective Intelligence”. I also wrote a review about PCI that can be read at http://www.prodigyproductionsllc.com/articles/programming/programming-collective-intelligence/. I even began porting many of the Python apps from the book into C# which can be found at http://www.prodigyproductionsllc.com/articles/programming/programming-collective-intelligence-in-c/.