In order for relation extraction systems to obtain human-level performance, they must be able to incorporate relational patterns inherent in the data (for example, that one’s sister is likely one’s mother’s daughter, or that children are likely to attend the same college as their parents). Hand-coding such knowledge can be time-consuming and inadequate. Additionally, there may exist many interesting, unknown relational patterns that both improve extraction performance and provide insight into text. We describe a probabilistic extraction model that provides mutual benefits to both “top-down” relational pattern discovery and “bottom-up” relation extraction.


  author = {Aron Culotta and Andrew McCallum and Jonathan Betz},
  title = {Integrating probabilistic extraction models and data mining to discover relations and patterns in text},
  shortbooktitle = {HLT-NAACL},
  booktitle = {Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL)},
  year = {2006},
  pages = {296--303},
  address = {New York, NY},
  month = {June},