LD4IE2017

Overview

LD4IE 2017 is the fifth international workshop on Linked Data for Information Extraction, co-located with ISWC2017.

LD4IE past editions:

LD4IE focuses on the exploitation of Linked Data for Web-scale Information Extraction (IE), which concerns extracting structured knowledge from unstructured and semi-structured documents on the Web. One of the major bottlenecks for the current state of the art in IE is the availability of learning materials (e.g., seed data, training corpora), which are typically created manually and are expensive to build and maintain. Linked Data (LD) defines best practices for exposing, sharing, and connecting data, information, and knowledge on the Semantic Web using uniform means such as URIs and RDF. These practices have so far produced a gigantic knowledge source of Linked Open Data (LOD), which constitutes a mine of learning materials for IE. However, the massive quantity of data requires efficient learning algorithms, and its unguaranteed quality requires robust methods to handle redundancy and noise.

LD4IE intends to gather researchers and practitioners to address the multiple challenges arising from the usage of LD as learning material for IE tasks, focusing on (i) modelling user-defined extraction tasks using LD; (ii) gathering learning materials from LD while assuring quality (training data selection, cleaning, feature selection, etc.); (iii) robust algorithms for various IE tasks using LD; (iv) publishing IE results to the LOD cloud.

Call for Contributions

Topics

Modelling Extraction Tasks

extracting knowledge patterns for task modelling
user-friendly approaches for querying linked data

Information Extraction

selecting relevant portions of LOD as training data
selecting relevant knowledge resources from linked data
IE methods robust to noise in training data
Information Extraction tasks/applications exploiting LOD (Wrapper induction, Table interpretation, IE from unstructured data, Named Entity Recognition, …)
Domain specific IE consuming and producing LOD (social data, scholarly data, health data, ...)
publishing information extraction results as Linked Data
linking extracted information to existing LOD datasets

Linked Data for Learning

assessing the quality of LOD data for training
selecting an optimal subset of LOD to seed learning
managing incompleteness, noise, and uncertainty of LOD
scalable learning methods
pattern extraction from LOD

All submissions must be written in English. We accept the following formats of submissions:

Full paper with a maximum of 12 pages including references.
Short paper with a maximum of 6 pages including references.

Two formats are possible for the submission: PDF and HTML. PDF submissions must be formatted according to the information for LNCS Authors (http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0). We would like to encourage you to submit your paper as HTML, in which case you need to submit a zip archive containing an HTML file and all used resources. If you are new to HTML submission, these are good places to start:

dokieli is a client-side editor for decentralised article publishing, annotations and social interactions. It is compliant with the Linked Research initiative. Example papers using LNCS and ACM are available at http://csarven.ca/dokieli-rww and on the website https://dokie.li/.
Research Articles in Simplified HTML (RASH) format: documentation and stylesheets at https://github.com/essepuntato/rash

In order to check if your HTML submission is compliant with the page limit constraint, please use one of the LNCS layouts and print/store it as PDF. Please submit your contributions electronically in PDF or HTML format to EasyChair.
Accepted papers will be published online via CEUR-WS.

Program

When: October 22nd 2017, 14:00-17:20
Where: room TC.2.01
Proceedings: CEUR volume and BibTeX file

14:00-14:40 Invited Talk: Capturing Social and Clinical Knowledge for Personalised Care
Vanessa Lopez
14:40-15:00 Multilingual Entity Linking: Comparing English and Spanish
Henry Rosales-Méndez, Barbara Poblete, Aidan Hogan
15:00-15:20 Semi-Automatic Example-Driven Linked Data Mapping Creation
Pieter Heyvaert, Anastasia Dimou, Ruben Verborgh, Erik Mannens

15:20-16:00 Coffee Break

16:00-16:20 Towards a Large Corpus of Richly Annotated Web Tables for Knowledge Base Population
Basil Ell, Sherzod Hakimov, Philipp Braukmann, Lorenzo Cazzoli, Fabian Kaupmann, Amerigo Mancino, Junaid Altaf Memon, Kai Rother, Abhishek Saini, Philipp Cimiano
16:20-16:40 Understanding Knowledge Networks
Krishna Mangaladevi, Wouter Beek, Tobias Kuhn
16:40-17:00 Discrimination of Word Senses with Hypernyms
Artem Revenko, Victor Mireles
17:00-17:20 Towards Odalic, a Semantic Table Interpretation Tool in the ADEQUATe Project
Tomas Knap

Organization

Workshop Chairs

Anna Lisa Gentile, IBM Research Almaden

Andrea Giovanni Nuzzolese, STLab, ISTC-CNR, Italy

Ziqi Zhang, Nottingham Trent University, UK

Program Committee
Rabeeh Ayaz Abbasi, King Abdulaziz University, Jeddah, Saudi Arabia
Nitish Aggarwal, IBM Research Almaden, CA, US
Payam Barnaghi, University of Surrey
Pierpaolo Basile, University of Bari
Amparo E. Cano, Data Scientist at Cube Global
Annalina Caputo, ADAPT - School of Computer Science and Statistics, Trinity College Dublin
Claudia d'Amato, University of Bari
Mauro Dragoni, Fondazione Bruno Kessler - FBK-IRST
Anca Dumitrache, VU University Amsterdam
Darío Garigliotti, University of Stavanger
Ashutosh Jadhav, IBM Research Almaden, CA, US
Petr Knoth, KMi, The Open University
Vanessa Lopez, IBM Research
Andrea Moro, Microsoft, London
Varish Mulwad, GE Global Research
Matthias Nickles, National University of Ireland, Galway, Digital Enterprise Research Institute
Jay Pujara, University of California, Santa Cruz
Achim Rettinger, Karlsruhe Institute of Technology
Martin Rezk, Rakuten, Inc.
Giuseppe Rizzo, ISMB
Mariano Rodríguez Muro, IBM Research
Victoria Uren, Aston University