LD4IE 2017 is the fifth International Workshop on Linked Data for Information Extraction, co-located with ISWC 2017.

LD4IE past editions:

LD4IE focuses on the exploitation of Linked Data for Web-scale Information Extraction (IE), which concerns extracting structured knowledge from unstructured and semi-structured documents on the Web. One of the major bottlenecks for the current state of the art in IE is the availability of learning materials (e.g., seed data, training corpora), which are typically created manually and are expensive to build and maintain. Linked Data (LD) defines best practices for exposing, sharing, and connecting data, information, and knowledge on the Semantic Web using uniform means such as URIs and RDF. These practices have so far produced a gigantic knowledge source, Linked Open Data (LOD), which constitutes a mine of learning materials for IE. However, its massive quantity requires efficient learning algorithms, and the unguaranteed quality of the data requires robust methods to handle redundancy and noise. LD4IE intends to gather researchers and practitioners to address the challenges arising from the use of LD as learning material for IE tasks, focusing on (i) modelling user-defined extraction tasks using LD; (ii) gathering learning materials from LD while assuring quality (training data selection, cleaning, feature selection, etc.); (iii) robust algorithms for various IE tasks using LD; and (iv) publishing IE results to the LOD cloud.

Call for Contributions


  • Modelling Extraction Tasks
    • extracting knowledge patterns for task modelling
    • user-friendly approaches for querying linked data
  • Information Extraction
    • selecting relevant portions of LOD as training data
    • selecting relevant knowledge resources from linked data
    • IE methods robust to noise in training data
    • IE tasks/applications exploiting LOD (wrapper induction, table interpretation, IE from unstructured data, named entity recognition, …)
    • domain-specific IE consuming and producing LOD (social data, scholarly data, health data, …)
    • publishing information extraction results as Linked Data
    • linking extracted information to existing LOD datasets
  • Linked Data for Learning
    • assessing the quality of LOD for training
    • selecting an optimal subset of LOD to seed learning
    • managing incompleteness, noise, and uncertainty of LOD
    • scalable learning methods
    • pattern extraction from LOD

All submissions must be written in English. We accept the following types of submission:

  • Full paper with a maximum of 12 pages including references.
  • Short paper with a maximum of 6 pages including references.

Two formats are possible for the submission: PDF and HTML. PDF submissions must be formatted according to the information for LNCS Authors (http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0). We encourage you to submit your paper as HTML, in which case you need to submit a zip archive containing an HTML file and all the resources it uses. If you are new to HTML submission, these are good places to start:

To check whether your HTML submission complies with the page limit, please use one of the LNCS layouts and print or store it as PDF. Please submit your contributions electronically in PDF or HTML format to EasyChair.
Accepted papers will be published online via CEUR-WS.

Important Dates

  • Abstract submission: as soon as possible before the paper submission deadline
  • Paper submission deadline: 28th July 2017 (extended from 21st July 2017)
  • Author notification: 24th August 2017
  • Final version deadline: 1st September 2017
  • Workshop date: 22nd October 2017


When: October 22nd 2017, 14:00-17:20
Where: room TC.2.01
Proceedings: CEUR volume and BibTeX file

15:20-16:00 Coffee Break


Workshop Chairs

Anna Lisa Gentile IBM Research Almaden

Andrea Giovanni Nuzzolese STLab, ISTC-CNR, Italy

Ziqi Zhang Nottingham Trent University, UK

Program Committee
Rabeeh Ayaz Abbasi, King Abdulaziz University, Jeddah, Saudi Arabia
Nitish Aggarwal, IBM Research Almaden, CA, US
Payam Barnaghi, University of Surrey
Pierpaolo Basile, University of Bari
Amparo E. Cano, Data Scientist at Cube Global
Annalina Caputo, ADAPT - School of Computer Science and Statistics, Trinity College Dublin
Claudia d'Amato, University of Bari
Mauro Dragoni, Fondazione Bruno Kessler - FBK-IRST
Anca Dumitrache, VU University Amsterdam
Darío Garigliotti, University of Stavanger
Ashutosh Jadhav, IBM Research Almaden, CA, US
Petr Knoth, KMi, The Open University
Vanessa Lopez, IBM Research
Andrea Moro, Microsoft, London
Varish Mulwad, GE Global Research
Matthias Nickles, National University of Ireland, Galway, Digital Enterprise Research Institute
Jay Pujara, University of California, Santa Cruz
Achim Rettinger, Karlsruhe Institute of Technology
Martin Rezk, Rakuten, Inc.
Giuseppe Rizzo, ISMB
Mariano Rodríguez Muro, IBM Research
Victoria Uren, Aston University

Contact Us