Forward Decoding Kernel Machines: A Hybrid HMM/SVM Approach to Sequence Recognition
Forward Decoding Kernel Machines (FDKM) combine large-margin
classifiers with Hidden Markov Models (HMM) for Maximum a Posteriori
(MAP) adaptive sequence estimation. State transitions in the sequence
are conditioned on observed data using a kernel-based probability
model, and forward decoding of the state transition probabilities with
the sum-product algorithm directly produces the MAP sequence. The
parameters in the probabilistic model are trained using a recursive
scheme that maximizes a lower bound on the regularized cross-entropy.
The recursion performs an expectation step on the outgoing state of
the transition probability model, using the posterior probabilities
produced by the previous maximization step. Similar to
Expectation-Maximization (EM), the FDKM recursion deals effectively
with noisy and partially labeled data.
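
The forward decoding step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the trained FDKM system: the array `trans_probs` is a hypothetical stand-in for the output of the kernel-based transition model, with `trans_probs[n][i][j]` read as the probability of state i at time n given state j at time n-1 and the observation at time n. The function name `forward_decode` and the per-step normalization are likewise illustrative choices.

```python
def forward_decode(trans_probs, alpha0):
    """Sum-product forward recursion with data-conditioned transitions.

    trans_probs: list over time steps of S x S nested lists, where
        trans_probs[n][i][j] ~ P(state i at n | state j at n-1, x[n])
        (assumed to be supplied by some probability model; hypothetical here).
    alpha0: list of S initial state probabilities.
    Returns the decoded state index at each step and the final state
    probability vector.
    """
    alpha = list(alpha0)
    decoded = []
    for P in trans_probs:
        # alpha_i[n] = sum_j P_ij[n] * alpha_j[n-1]
        new_alpha = [
            sum(P[i][j] * alpha[j] for j in range(len(alpha)))
            for i in range(len(P))
        ]
        # Renormalize to guard against numerical drift over long sequences.
        total = sum(new_alpha)
        alpha = [a / total for a in new_alpha]
        # Pick the most probable state given the observations so far.
        decoded.append(max(range(len(alpha)), key=alpha.__getitem__))
    return decoded, alpha
```

Because each step uses only the previous state probabilities and the current observation, the estimate at time n is available without a backward pass over the sequence.
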
We also introduce GiniSVM, a multi-class support vector machine for
sparse conditional probability regression based on a quadratic
formulation of entropy. Experiments with benchmark classification
data show that GiniSVM generalizes better than other multi-class SVM
techniques. In conjunction with FDKM, GiniSVM produces a sparse
kernel expansion of state transition probabilities, with drastically
fewer non-zero coefficients than kernel logistic regression.
Preliminary evaluation of FDKM with GiniSVM on a subset of the TIMIT
speech database reveals significant improvements in phoneme
recognition accuracy over other SVM and HMM techniques.
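
As a small illustration of what a quadratic formulation of entropy can look like, the sketch below compares Shannon entropy with the Gini index, 1 minus the sum of squared probabilities. The abstract does not state the exact form used by GiniSVM, so the formula, any scaling, and the function names here are assumptions made for illustration only.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)


def gini_index(p):
    """Quadratic impurity measure: 1 - sum_i p_i^2 (assumed form)."""
    return 1.0 - sum(x * x for x in p)


# Both measures are zero for a deterministic distribution and largest for
# a uniform one, but the Gini index is a polynomial in the probabilities.
for dist in ([1.0, 0.0, 0.0], [0.7, 0.2, 0.1], [1 / 3, 1 / 3, 1 / 3]):
    print(dist, shannon_entropy(dist), gini_index(dist))
```

A quadratic measure of this kind involves no logarithm, which is one plausible reason it would fit a quadratic-programming formulation like that of a support vector machine.
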