<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>WIRE Collection</title>
    <link>http://hdl.handle.net/2436/8715</link>
    <description />
    <pubDate>Mon, 20 May 2013 19:51:24 GMT</pubDate>
    <dc:date>2013-05-20T19:51:24Z</dc:date>
    <item>
      <title>Discovery of language resources on the Web: information extraction from heterogeneous documents</title>
      <link>http://hdl.handle.net/2436/15899</link>
      <description>Title: Discovery of language resources on the Web: information extraction from heterogeneous documents
Authors: Pekar, Viktor; Evans, Richard
Abstract: The present article is concerned with the problem of automatic database population via information extraction (IE) from web pages obtained from heterogeneous sources, such as those retrieved by a domain crawler. Specifically, we address the task of filling single multi-field templates from individual documents, a common scenario that involves free-format documents with the same communicative goal such as job adverts, CVs, or meeting/seminar announcements. We discuss challenges that arise in this scenario and propose solutions to them at different levels of the processing of web page content. Our main focus is on the issue of information extraction, which we address with a two-step machine learning approach that first aims to determine segments of a page that are likely to contain relevant facts and then delimits specific natural language expressions with which to fill template fields. We also present a range of techniques for the enrichment of web pages with semantic annotations, such as recognition of named entities, domain terminology and coreference resolution, and examine their effect on the information extraction method. We evaluate the developed IE system on the task of automatically populating a database with information on language resources available on the web.
Description: Metadata only</description>
      <pubDate>Mon, 01 Jan 2007 00:00:00 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2436/15899</guid>
      <dc:date>2007-01-01T00:00:00Z</dc:date>
    </item>
    <item>
      <title>A computer-aided environment for generating multiple-choice test items</title>
      <link>http://hdl.handle.net/2436/15898</link>
      <description>Title: A computer-aided environment for generating multiple-choice test items
Authors: Mitkov, Ruslan; Ha, Le An; Karamanis, Nikiforos
Abstract: This paper describes a novel computer-aided procedure for generating multiple-choice test items from electronic documents. In addition to employing various Natural Language Processing techniques, including shallow parsing, automatic term extraction, sentence transformation and computing of semantic distance, the system makes use of language resources such as corpora and ontologies. It identifies important concepts in the text and generates questions about these concepts as well as multiple-choice distractors, offering the user the option to post-edit the test items by means of a user-friendly interface. In assisting test developers to produce items in a fast and expedient manner without compromising quality, the tool saves both time and production costs.
Description: Metadata only</description>
      <pubDate>Sun, 01 Jan 2006 00:00:00 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2436/15898</guid>
      <dc:date>2006-01-01T00:00:00Z</dc:date>
    </item>
    <item>
      <title>A framework for named entity recognition in the open domain</title>
      <link>http://hdl.handle.net/2436/15896</link>
      <description>Title: A framework for named entity recognition in the open domain
Authors: Nicolov, Nicolas; Bontcheva, Kalina; Angelova, Galia; Mitkov, Ruslan
Description: Paper presented at the 2003 International Conference on “Recent Advances in Natural Language Processing”</description>
      <pubDate>Thu, 01 Jan 2004 00:00:00 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2436/15896</guid>
      <dc:date>2004-01-01T00:00:00Z</dc:date>
    </item>
    <item>
      <title>Applying Machine Learning Toward an Automatic Classification of It</title>
      <link>http://hdl.handle.net/2436/15859</link>
      <description>Title: Applying Machine Learning Toward an Automatic Classification of It
Authors: Evans, Richard
Abstract: In the majority of cases, the pronoun it illustrates nominal anaphora, tending to refer back to another noun phrase in the text. However, in a significant minority of cases, the pronoun is used in exceptional ways that fail to demonstrate strict nominal anaphora. The identification of these uses of it is important in all fields where pronoun resolution has an impact. After a survey of previous treatments of the pronoun it in the literature, some features of instances of it are proposed that can be used in a novel memory-based learning method to automatically classify those instances. On evaluating the method, it is found that the implemented system performs comparably to a rule-based system, and with an extended training set it is expected that the accuracy of the system will improve, offering greater coverage than rule-based methods.
Description: Metadata only</description>
      <pubDate>Mon, 01 Jan 2001 00:00:00 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2436/15859</guid>
      <dc:date>2001-01-01T00:00:00Z</dc:date>
    </item>
  </channel>
</rss>

