Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-23604
Full metadata record
DC Field | Value | Language
dc.contributor.author | Smith, Ellery | -
dc.contributor.author | Papadopoulos, Dimitris | -
dc.contributor.author | Braschler, Martin | -
dc.contributor.author | Stockinger, Kurt | -
dc.date.accessioned | 2021-11-29T14:12:37Z | -
dc.date.available | 2021-11-29T14:12:37Z | -
dc.date.issued | 2021 | -
dc.identifier.issn | 0306-4379 | de_CH
dc.identifier.uri | https://digitalcollection.zhaw.ch/handle/11475/23604 | -
dc.description.abstract | Querying both structured and unstructured data via a single common query interface such as SQL or natural language has been a long-standing research goal. Moreover, as methods for extracting information from unstructured data become ever more powerful, the desire to integrate the output of such extraction processes with "clean", structured data grows. We are convinced that for successful integration into databases, such extracted information in the form of "triples" needs both 1) to be of high quality and 2) to have the necessary generality to link up with varying forms of structured data. It is the combination of both these aspects, which have so far usually been treated in isolation, where our approach breaks new ground. The cornerstone of our work is a novel, generic method for extracting open information triples from unstructured text, using a combination of linguistics and learning-based extraction methods, thus uniquely balancing both precision and recall. Our system, called LILLIE (LInked Linguistics and Learning-Based Information Extractor), uses dependency tree modification rules to refine triples from a high-recall learning-based engine, and combines them with syntactic triples from a high-precision engine to increase effectiveness. In addition, our system features several augmentations, which modify the generality and the degree of granularity of the output triples. Even though our focus is on addressing both quality and generality simultaneously, our new method substantially outperforms current state-of-the-art systems on the two widely-used CaRB and Re-OIE16 benchmark sets for information extraction. | de_CH
dc.language.iso | en | de_CH
dc.publisher | Elsevier | de_CH
dc.relation.ispartof | Information Systems | de_CH
dc.rights | http://creativecommons.org/licenses/by/4.0/ | de_CH
dc.subject | Information extraction | de_CH
dc.subject | Data integration | de_CH
dc.subject | Machine learning for database systems | de_CH
dc.subject.ddc | 006: Special computer methods | de_CH
dc.title | LILLIE : information extraction and database integration using linguistics and learning-based algorithms | de_CH
dc.type | Article in scientific journal | de_CH
dcterms.type | Text | de_CH
zhaw.departement | School of Engineering | de_CH
zhaw.organisationalunit | Institut für Informatik (InIT) | de_CH
dc.identifier.doi | 10.1016/j.is.2021.101938 | de_CH
dc.identifier.doi | 10.21256/zhaw-23604 | -
zhaw.funding.eu | info:eu-repo/grantAgreement/EC/H2020/863410//INODE - Intelligent Open Data Exploration/INODE | de_CH
zhaw.originated.zhaw | Yes | de_CH
zhaw.publication.status | publishedVersion | de_CH
zhaw.volume | 105 | de_CH
zhaw.publication.review | Peer review (publication) | de_CH
zhaw.webfeed | Datalab | de_CH
zhaw.webfeed | Information Engineering | de_CH
zhaw.webfeed | ZHAW digital | de_CH
zhaw.funding.zhaw | INODE – Intelligent Open Data Exploration (EU Horizon 2020) | de_CH
zhaw.author.additional | No | de_CH
zhaw.display.portrait | Yes | de_CH
Appears in collections: Publikationen School of Engineering

Files in This Item:
File | Description | Size | Format
2021_Smith_LILLIE_InformationSystems.pdf | | 2.36 MB | Adobe PDF
Smith, E., Papadopoulos, D., Braschler, M., & Stockinger, K. (2021). LILLIE : information extraction and database integration using linguistics and learning-based algorithms. Information Systems, 105. https://doi.org/10.1016/j.is.2021.101938
Smith, E. et al. (2021) ‘LILLIE : information extraction and database integration using linguistics and learning-based algorithms’, Information Systems, 105. Available at: https://doi.org/10.1016/j.is.2021.101938.
E. Smith, D. Papadopoulos, M. Braschler, and K. Stockinger, “LILLIE : information extraction and database integration using linguistics and learning-based algorithms,” Information Systems, vol. 105, 2021, doi: 10.1016/j.is.2021.101938.
SMITH, Ellery, Dimitris PAPADOPOULOS, Martin BRASCHLER and Kurt STOCKINGER, 2021. LILLIE : information extraction and database integration using linguistics and learning-based algorithms. Information Systems. 2021. Vol. 105. DOI 10.1016/j.is.2021.101938
Smith, Ellery, Dimitris Papadopoulos, Martin Braschler, and Kurt Stockinger. 2021. “LILLIE : Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms.” Information Systems 105. https://doi.org/10.1016/j.is.2021.101938.
Smith, Ellery, et al. “LILLIE : Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms.” Information Systems, vol. 105, 2021, https://doi.org/10.1016/j.is.2021.101938.

