Abstract

Large templated factor graphs with complex structure that changes during inference have been shown to provide state-of-the-art experimental results on tasks such as identity uncertainty and information integration. However, learning parameters in these models is difficult because computing the gradients require expensive inference routines. In this paper we propose an online algorithm that instead learns preferences over hypotheses from the gradients between the atomic steps of inference. Although there are a combinatorial number of ranking constraints over the entire hypothesis space, a connection to the frameworks of sampled convex programs reveals a polynomial bound on the number of rankings that need to be satisfied in practice. We further apply ideas of passive aggressive algorithms to our update rules, enabling us to extend recent work in confidence-weighted classification to structured prediction problems. We compare our algorithm to structured perceptron, contrastive divergence, and persistent contrastive divergence, demonstrating substantial error reductions on two real-world problems (20% over contrastive divergence).

Citation

@inproceedings{wick09sample,
  author = {Michael Wick and Khashayar Rohanimanesh and {\bf Aron Culotta} and Andrew McCallum},
  title = {SampleRank: Learning preferences from atomic gradients},
  booktitle = {Neural Information Processing Systems (NIPS) Workshop on Advances in Ranking},
  shortbooktitle = {NIPS},
  year = {2009}
}