Portable extraction of partially structured facts from the web

  • Andrew Salway*
  • Liadh Kelly
  • Inguna Skadiņa
  • Gareth J.F. Jones
  • *Corresponding author for this work
  • Dublin City University
  • Tilde Company

Research output: Chapter in book/encyclopedia/conference proceedings › Conference paper › Research › Peer-reviewed

3 Citations (Scopus)

Abstract

A novel fact extraction task is defined to fill a gap between current information retrieval and information extraction technologies. It is shown that it is possible to extract useful partially structured facts about different kinds of entities in a broad domain, i.e. all kinds of places depicted in tourist images. Importantly, the approach does not rely on existing linguistic resources (gazetteers, taggers, parsers, etc.) and it ported easily and cheaply between two rather different languages (English and Latvian). Previous fact extraction from the web has focused on the extraction of structured data, e.g. (Building-LocatedIn-Town). In contrast, we extract richer and more interesting facts, such as a fact explaining why a building was built. Enough structure is maintained to facilitate subsequent processing of the information. For example, the partial structure enables straightforward template-based text generation. We report positive results for the correctness and interest of English and Latvian facts and for their utility in enhancing image captions.
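
To illustrate what "partially structured" could mean for template-based text generation, here is a minimal Python sketch. It is not the authors' implementation: the `PartialFact` fields (`entity`, `cue`, `remainder`), the function `to_caption_sentence`, and the sample fact are all hypothetical, chosen only to show how a fact with one free-text slot can still be dropped into a fixed template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialFact:
    """Hypothetical representation of a partially structured fact."""
    entity: str     # the place the fact is about
    cue: str        # fixed linking phrase, e.g. "was built"
    remainder: str  # unanalysed free text completing the fact


def to_caption_sentence(fact: PartialFact) -> str:
    """Fill a fixed three-slot template with the fact's fields."""
    # Normalise the free-text slot so the template controls the final period.
    remainder = fact.remainder.strip().rstrip(".")
    return f"{fact.entity} {fact.cue} {remainder}."


if __name__ == "__main__":
    # Invented example fact, for illustration only.
    fact = PartialFact("The tower", "was built", "to guide ships. ")
    print(to_caption_sentence(fact))
```

Only the entity and the linking phrase are structured here; the remainder stays as raw text, which is what allows richer content (such as a reason for building something) without requiring a parser.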

Original language: English
Title of host publication: Advances in Natural Language Processing - 7th International Conference on NLP, IceTAL 2010, Proceedings
Pages: 345-356
Number of pages: 12
Publication status: Published - 2010
Externally published
Event: 7th International Conference on NLP, IceTAL 2010 - Reykjavik, Iceland
Duration: 16 Aug 2010 to 18 Aug 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6233 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th International Conference on NLP, IceTAL 2010
Country/Territory: Iceland
City: Reykjavik
Period: 16/08/10 to 18/08/10
