Hierarchical classification is an obscure but simple concept. The idea is that you arrange two or more classifiers in a hierarchy such that the classifiers lower in the hierarchy are only used if a higher classifier returns an appropriate result.
For example, the text-processing.com sentiment analysis demo uses hierarchical classification by combining a subjectivity classifier and a polarity classifier. The subjectivity classifier runs first, and determines whether the text is objective or subjective. If the text is objective, then a label of neutral is returned, and the polarity classifier is not used. However, if the text is subjective (or polar), then the polarity classifier is used to determine whether the text is positive or negative.
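The two-stage flow described above can be sketched in a few lines of Python. This is only an illustration, not the demo's actual implementation: the function name `hierarchical_sentiment`, the keyword sets, and the two toy classifiers are all hypothetical stand-ins for trained models.

```python
from typing import Callable

# A classifier here is just a function from text to a label.
Classifier = Callable[[str], str]

# Hypothetical keyword lists standing in for trained models.
POSITIVE_WORDS = {"love", "great", "good"}
NEGATIVE_WORDS = {"hate", "awful", "bad"}


def toy_subjectivity(text: str) -> str:
    """Label text 'subjective' if it contains any opinion word."""
    words = set(text.lower().split())
    if words & (POSITIVE_WORDS | NEGATIVE_WORDS):
        return "subjective"
    return "objective"


def toy_polarity(text: str) -> str:
    """Label text by whichever kind of opinion word is more common."""
    words = set(text.lower().split())
    pos = len(words & POSITIVE_WORDS)
    neg = len(words & NEGATIVE_WORDS)
    return "positive" if pos >= neg else "negative"


def hierarchical_sentiment(
    text: str, subjectivity: Classifier, polarity: Classifier
) -> str:
    """Run the subjectivity classifier first; only subjective text
    is passed on to the polarity classifier."""
    if subjectivity(text) == "objective":
        return "neutral"
    return polarity(text)


for sample in ("I love this movie", "The movie is two hours long"):
    print(sample, "->", hierarchical_sentiment(sample, toy_subjectivity, toy_polarity))
```

Because the polarity classifier is only ever called on subjective text, it can be trained on positive and negative examples alone, without needing a neutral class of its own.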
Hierarchical classification is a useful way to combine multiple binary classifiers, if you have a hierarchy of labels that can be modeled as a binary tree. In this model, each branch of the tree either continues on to a new pair of branches, or stops at a final label, and at each branching you use a classifier to determine which branch to take.
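The binary tree model can be written down directly as a small data structure. The sketch below is one possible way to do it, under the assumption that each binary classifier is a function returning True or False; the `Branch` class, the `classify` function, and the keyword-based classifiers are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Branch:
    """One branching point: a binary classifier plus its two outcomes.

    Each outcome is either another Branch (the tree continues) or a
    plain string label (the tree stops).
    """
    classifier: Callable[[str], bool]
    if_true: Union["Branch", str]
    if_false: Union["Branch", str]


def classify(node: Union[Branch, str], text: str) -> str:
    """Walk down the tree, letting each classifier pick a branch,
    until a label is reached."""
    while isinstance(node, Branch):
        node = node.if_true if node.classifier(text) else node.if_false
    return node


# Hypothetical toy classifiers standing in for trained binary models.
def is_subjective(text: str) -> bool:
    return bool(set(text.lower().split()) & {"love", "great", "hate", "awful"})


def is_positive(text: str) -> bool:
    return bool(set(text.lower().split()) & {"love", "great"})


# The sentiment hierarchy expressed as a tree: objective text stops at
# "neutral", subjective text continues to the polarity branch.
sentiment_tree = Branch(
    classifier=is_subjective,
    if_true=Branch(classifier=is_positive, if_true="positive", if_false="negative"),
    if_false="neutral",
)

print(classify(sentiment_tree, "I love it"))
print(classify(sentiment_tree, "It is Tuesday"))
```

Adding another level to the hierarchy only means replacing one of the string labels with a new `Branch`; the `classify` loop stays the same.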