Enhanced Question Classification with Optimal Combination of Features

Question classification is an important component of question answering systems. Its task is to predict the entity type of the answer to a natural language question.

In this work I introduce two new semantic features which improve the accuracy of classification. Furthermore, I developed a weighted approach to optimally combine different features. I also applied the Latent Semantic Analysis (LSA) technique to reduce the large feature space of questions to a much smaller and more efficient one. I adopted two different classifiers: Back-Propagation Neural Networks (BPNN) and Support Vector Machines (SVM). I tested the proposed approaches on the well-known UIUC dataset and achieved a new record classification accuracy on this dataset.
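As a rough illustration of what a weighted combination of feature sets can look like, the sketch below scales each feature set by its own weight and merges the results into one sparse vector. The weighting scheme shown here, the function name `combine_features`, and the example weights are assumptions made for illustration; they are not taken from the thesis or from the library.

```python
from typing import Dict

# A sparse feature vector: feature name -> value.
SparseVector = Dict[str, float]


def combine_features(feature_sets: Dict[str, SparseVector],
                     weights: Dict[str, float]) -> SparseVector:
    """Merge several feature sets into one vector, scaling each set by its weight.

    Keys are prefixed with the feature-set name so that, for example, the
    unigram "what" and the wh-word "what" stay separate dimensions.
    Feature sets without an explicit weight default to 1.0 (an assumption).
    """
    combined: SparseVector = {}
    for set_name, vector in feature_sets.items():
        weight = weights.get(set_name, 1.0)
        for feature, value in vector.items():
            combined[f"{set_name}:{feature}"] = weight * value
    return combined


if __name__ == "__main__":
    example = combine_features(
        {"unigram": {"what": 1.0, "capital": 1.0}, "wh_word": {"what": 1.0}},
        {"unigram": 0.5, "wh_word": 2.0},  # hypothetical weights
    )
    print(example)
```

The combined vector can then be passed to any classifier that accepts sparse input. Choosing the weights is the hard part, and this sketch does not cover it.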

Material related to my thesis:

Question Classification Library

Question Classification Library is an open-source library, written by Babak Loni and Gijs van Tulder, for classifying natural language questions. The library is very flexible and can be customized to use different features. It uses the Java implementation of LIBSVM and the Neuroph framework. It can extract the following features from a natural language question:

  • Unigrams
  • Bigrams
  • Word-Shapes
  • Wh-Words
  • Head-word
  • Head-rule
  • POS-Tags
  • Tagged Unigrams
  • Hypernyms
  • Query Expansion
  • Question Category
  • Related Words
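To make a few of the simpler features in this list concrete, here is a minimal sketch of how unigrams, bigrams, word shapes, and wh-words could be extracted from a question. It is not the library's API: the function names, the tokenizer, the set of wh-words, and the five word-shape categories are all assumptions chosen for illustration. Features such as head words, hypernyms, and POS tags need a parser, WordNet, or a tagger and are not covered here.

```python
import re
from typing import List

# Assumed set of wh-words; anything else falls into the "rest" category.
WH_WORDS = ("what", "which", "when", "where", "who", "how", "why")


def tokenize(question: str) -> List[str]:
    """Split a question into alphanumeric tokens and single punctuation marks."""
    return re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]", question)


def unigrams(tokens: List[str]) -> List[str]:
    """Lower-cased single tokens."""
    return [t.lower() for t in tokens]


def bigrams(tokens: List[str]) -> List[str]:
    """Lower-cased pairs of adjacent tokens."""
    lowered = [t.lower() for t in tokens]
    return [f"{a} {b}" for a, b in zip(lowered, lowered[1:])]


def word_shape(token: str) -> str:
    """Map a token to a coarse shape category (assumed categories)."""
    if token.isdigit():
        return "all-digit"
    if token.isalpha():
        if token.islower():
            return "lowercase"
        if token.isupper():
            return "uppercase"
        return "mixed"
    return "other"


def wh_word(tokens: List[str]) -> str:
    """Return the leading wh-word of the question, or "rest" if there is none."""
    first = tokens[0].lower() if tokens else ""
    return first if first in WH_WORDS else "rest"


if __name__ == "__main__":
    toks = tokenize("What is the capital of France ?")
    print(unigrams(toks))
    print(bigrams(toks))
    print([word_shape(t) for t in toks])
    print(wh_word(toks))
```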

A key strength of this library is its ability to extract semantic features based on the WordNet hierarchy. We also implemented the Latent Semantic Indexing (LSI) method in this library to reduce the large feature space of questions. The following two classifiers, or a combination of them, can be used in this library with different parameters:

  • Support Vector Machines
  • Back-Propagation Neural Networks

A comprehensive analysis of the different features and extraction techniques is given in my master's thesis report.

You can download the source code of this library. [ QC Lib Source Code ]

For questions about this library, contact babak.loni@gmail.com