An important component of question answering systems is question classification. The
task of question classification is to predict the entity type of the answer to a natural
language question.
In this work, I introduce two new semantic features that improve classification accuracy.
Furthermore, I developed a weighted approach to optimally combine different features.
I also applied the Latent Semantic Analysis (LSA) technique to reduce the large feature
space of questions to a much smaller and more efficient feature space. I adopted two different
classifiers: Back-Propagation Neural Networks (BPNN) and Support Vector Machines (SVM).
I tested the proposed approaches on the well-known UIUC dataset and succeeded in
achieving a new record for classification accuracy on this dataset.
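The weighted combination of features described above can be illustrated with a minimal sketch. It assumes each feature set is a sparse mapping from feature name to value and that each set is scaled by a single weight before merging. The function name `combine_weighted`, the set names, and the weight values are hypothetical and are not taken from the thesis or the library.

```python
from typing import Dict

FeatureVector = Dict[str, float]


def combine_weighted(feature_sets: Dict[str, FeatureVector],
                     weights: Dict[str, float]) -> FeatureVector:
    """Merge several sparse feature vectors, scaling each set by its weight."""
    combined: FeatureVector = {}
    for set_name, features in feature_sets.items():
        weight = weights.get(set_name, 1.0)  # sets without a weight default to 1.0
        for feature, value in features.items():
            # Prefix with the set name so the same token in two sets stays distinct.
            combined[f"{set_name}:{feature}"] = weight * value
    return combined


# Illustrative input only: two feature sets for one question.
question_features = {
    "unigram": {"what": 1.0, "city": 1.0},
    "related_words": {"location": 1.0},
}
weights = {"unigram": 1.0, "related_words": 0.5}  # assumed values, not tuned
combined = combine_weighted(question_features, weights)
```

In practice the weights would be chosen on held-out data; the values here only show the mechanics of scaling and merging.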
Material related to my thesis:
- [ Master Thesis Report ]
- B. Loni, S. Khoshnevis and P. Wiggers. Latent Semantic Analysis for Question Classification with Neural Networks, Proceedings of the IEEE Speech Recognition and Understanding Workshop, Dec 2011
- B. Loni, G. van Tulder, P. Wiggers, M. Loog and D. Tax. Question Classification by Weighted Combination of Features, Proceedings of the 14th International Conference on Text, Speech and Dialogue, Sep 2011
- B. Loni. A Survey of State-of-the-Art Methods on Question Classification, Literature Survey, Published on TU Delft Repository, Jun 2011 [ PDF ]
Question Classification Library is an open-source library written by Babak Loni and Gijs van Tulder for classifying natural language questions.
This library is very flexible and can be customized to use different features. It uses the Java implementation of LIBSVM and the Neuroph framework.
This library can extract the following features from a natural language question:
- Tagged Unigrams
- Query Expansion
- Question Category
- Related Words
A notable capability of this library is extracting semantic features based on the WordNet hierarchy.
We also implemented the Latent Semantic Indexing (LSI) method in this library to reduce the large feature space of questions. The following two classifiers,
or a combination of them with different parameters, can be used in this library:
- Support Vector Machines
- Back-Propagation Neural Networks
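The LSI-based reduction of the feature space mentioned above can be sketched with a truncated singular value decomposition. This is a minimal illustration using NumPy, not the library's Java implementation; the toy term-by-question matrix, the function names `lsa_fit` and `lsa_project`, and the choice of `k = 2` are assumptions made for the example.

```python
import numpy as np


def lsa_fit(term_question: np.ndarray, k: int):
    """Return the top-k left singular vectors and singular values."""
    u, s, _ = np.linalg.svd(term_question, full_matrices=False)
    return u[:, :k], s[:k]


def lsa_project(question_vector: np.ndarray, u_k: np.ndarray,
                s_k: np.ndarray) -> np.ndarray:
    """Fold a term-space question vector into the k-dimensional latent space."""
    return (question_vector @ u_k) / s_k


# Toy matrix: rows are terms, columns are questions (illustrative values only).
X = np.array([
    [1.0, 1.0, 0.0, 0.0],  # "what"
    [1.0, 0.0, 0.0, 0.0],  # "city"
    [0.0, 1.0, 0.0, 0.0],  # "country"
    [0.0, 0.0, 1.0, 1.0],  # "who"
    [0.0, 0.0, 1.0, 0.0],  # "president"
])

u_k, s_k = lsa_fit(X, k=2)
# A 5-dimensional question vector becomes a 2-dimensional one.
reduced = lsa_project(X[:, 0], u_k, s_k)
```

The reduced vectors, rather than the original sparse ones, would then be passed to a classifier such as a neural network, which benefits from the smaller input dimension.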
A comprehensive analysis of the different features and extraction techniques is given in my
master's thesis report.
You can download the source code of this library. [ QC Lib Source Code ]
For questions about this library, contact firstname.lastname@example.org