From hyperlinks to Semantic Web properties using Open Knowledge Extraction

Open information extraction approaches are useful but insufficient on their own for populating the Web with machine-readable information, because their results are neither directly linkable to nor immediately reusable from other Linked Data sources. This work proposes a novel paradigm, named Open Knowledge Extraction, and its implementation (Legalo), which performs unsupervised, open-domain, and abstractive knowledge extraction from text to produce machine-readable information. The implemented method is based on the hypothesis that hyperlinks (whether created by humans or by knowledge extraction tools) provide a pragmatic trace of semantic relations between two entities, and that such semantic relations, together with their subjects and objects, can be revealed by processing their linguistic traces (i.e., the sentences that embed the hyperlinks) and formalised as Semantic Web triples and ontology axioms. Experimental evaluations conducted on text extracted from Wikipedia pages and validated with the help of crowdsourcing confirm this hypothesis and show high performance.
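As a rough illustration of the hypothesis only, the sketch below turns a sentence that embeds two linked entities into a single subject-property-object triple. It is not Legalo's algorithm, which the abstract does not detail. The function names, the stop-word list, and the `http://example.org/prop/` namespace are hypothetical, and the "linguistic trace" is naively reduced to the words found between the two mentions.

```python
import re

# Hypothetical stop-word list: function words dropped from the property label.
STOP_WORDS = {"a", "an", "the", "is", "was", "are", "were"}


def camel_case(words):
    """Join words into a lowerCamelCase label; fall back to a generic name."""
    words = [w.lower() for w in words if w]
    if not words:
        return "relatedTo"
    return words[0] + "".join(w.capitalize() for w in words[1:])


def link_to_triple(sentence, subject, obj, namespace="http://example.org/prop/"):
    """Naive sketch: label the relation with the words between the two mentions.

    Returns a (subject, property IRI, object) tuple, or None when the subject
    mention does not precede the object mention in the sentence.
    """
    start = sentence.find(subject)
    end = sentence.find(obj)
    if start == -1 or end == -1 or start + len(subject) > end:
        return None
    between = sentence[start + len(subject):end]
    words = re.findall(r"[A-Za-z]+", between)
    content = [w for w in words if w.lower() not in STOP_WORDS]
    return (subject, namespace + camel_case(content), obj)


if __name__ == "__main__":
    print(link_to_triple("Rome is the capital of Italy", "Rome", "Italy"))
```

For the sentence "Rome is the capital of Italy", with "Rome" as the page subject and "Italy" as the hyperlink target, this toy version yields a property labelled `capitalOf`. A real system would need much deeper linguistic analysis to handle paraphrase, passive voice, and long-distance relations.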

Author or Creator: 
Presutti Valentina
Nuzzolese Andrea Giovanni
Consoli Sergio
Gangemi Aldo
Reforgiato Recupero Diego
IOS Press, Amsterdam, Netherlands
Semantic web (Print) 7 (2016): 351–378. doi:10.3233/SW-160221
info:cnr-pdr/source/autori:Presutti Valentina; Nuzzolese Andrea Giovanni; Consoli Sergio; Gangemi Aldo; Reforgiato Recupero Diego/titolo:From hyperlinks to Semantic Web properties using Open Knowledge Extraction/doi:10.3233/SW-160221/rivista:Semantic web
ISTC Author:
Aldo Gangemi
Valentina Presutti