MOSIG Master 2ND YEAR Research
YEAR 2016/2017

MASTER TOPIC PROPOSAL

ADVISOR: Jérôme David and Jérôme Euzenat

EMAIL: Jerome:David#inria:fr, Jerome:Euzenat#inria:fr

TEAM: Exmo team, INRIA & Univ. Grenoble Alpes

LABORATORY: LIG

MASTER PROFILE: Artificial intelligence and the web

Reference number: Proposal n°2155

TITLE:

Extracting RDF link keys with Formal Concept Analysis

The goal of the semantic web is to take advantage of formalised knowledge at the scale of the worldwide web. This has led to the release of a vast quantity of data expressed in semantic web formalisms such as RDF [Heath 2011a]. Part of the added value of linked data lies in the links that identify the same entity in different data sets, since these links allow inferences to be drawn across data sets. For instance, they may identify the same books and articles in different bibliographical data sources. Finding the manifestations of the same entity across several data sets is therefore an important task for linked data.

One way of identifying entities is to use link keys, which generalise the keys usually found in databases to several data sets. A link key [Atencia 2014b] is a statement of the form:

( {⟨p1, q1⟩,... ⟨pn, qn⟩} link key ⟨c, d⟩ )
stating that whenever an instance of the class c has the same values for properties p1,... pn as an instance of the class d has for properties q1,... qn, these two instances denote the same entity. For example, it may be that an instance of the class Livre is equivalent to an instance of the class Novel as soon as their properties auteur and titre on the one side and creator and title on the other side have the same values.
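The Livre/Novel example can be made concrete with a minimal sketch. It assumes each instance is represented as a dictionary from property names to sets of values, and that "the same values" means equal, non-empty value sets. The function name link_key_matches and the sample data are hypothetical and only serve as illustration.

```python
def link_key_matches(pairs, left, right):
    """Return the set of (a, b) identifier pairs linked by the link key.

    pairs: list of (p, q) property pairs of the link key.
    left, right: dicts mapping an instance identifier to a dict
                 {property name: set of values}.
    Assumption: "same values" is read as equal, non-empty value sets.
    """
    links = set()
    for a, a_props in left.items():
        for b, b_props in right.items():
            if all(a_props.get(p) and a_props.get(p) == b_props.get(q)
                   for p, q in pairs):
                links.add((a, b))
    return links


# Hypothetical instances of the classes Livre and Novel.
livres = {
    "l1": {"auteur": {"Victor Hugo"}, "titre": {"Les Misérables"}},
    "l2": {"auteur": {"Jules Verne"}, "titre": {"Vingt mille lieues sous les mers"}},
}
novels = {
    "n1": {"creator": {"Victor Hugo"}, "title": {"Les Misérables"}},
    "n2": {"creator": {"Victor Hugo"}, "title": {"Notre-Dame de Paris"}},
}

# The link key {<auteur, creator>, <titre, title>} link key <Livre, Novel>.
key = [("auteur", "creator"), ("titre", "title")]
links = link_key_matches(key, livres, novels)
print(links)
```

Here only l1 and n1 agree on both property pairs, so they are the only instances identified as the same entity. l1 and n2 share an author but differ on the title.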

Formal concept analysis (FCA) is a technique to extract concepts between two interdependent ordered sets [Ganter 1999a]. It has been used for inferring database keys by providing the dependencies between maximal sets of attributes and the partitions of the data that they generate. We provided the generalisation needed for database link keys [Atencia 2014d]. For RDF link keys there are several issues:

The goal of the project is to study possible extensions of the proposed FCA-based link key extraction techniques to RDF link keys by dealing with some of the issues above.
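As an illustration of the FCA view in the database setting, the sketch below builds a formal context whose objects are pairs of rows and whose attributes are pairs of properties on which the two rows agree. It then enumerates the intents of that context as candidate link keys. It assumes single-valued properties and uses a naive closure under intersection. The function names and data are hypothetical, and this is not the algorithm of the cited papers.

```python
from itertools import product


def build_context(left, right, props_left, props_right):
    """Map each pair of rows (a, b) to the property pairs (p, q) on which they agree."""
    context = {}
    for (a, row_a), (b, row_b) in product(left.items(), right.items()):
        context[(a, b)] = frozenset(
            (p, q)
            for p in props_left
            for q in props_right
            if row_a.get(p) is not None and row_a.get(p) == row_b.get(q)
        )
    return context


def concept_intents(context):
    """Intents with a non-empty extent: object intents closed under intersection."""
    intents = set(context.values())
    frontier = set(intents)
    while frontier:
        new = {x & y for x in frontier for y in intents} - intents
        intents |= new
        frontier = new
    return intents


def extent(context, intent):
    """Pairs of rows that agree on every property pair of the intent."""
    return {obj for obj, attrs in context.items() if intent <= attrs}


# Hypothetical single-valued data.
livres = {
    "l1": {"auteur": "Victor Hugo", "titre": "Les Misérables"},
    "l2": {"auteur": "Victor Hugo", "titre": "Notre-Dame de Paris"},
}
novels = {
    "n1": {"creator": "Victor Hugo", "title": "Les Misérables"},
    "n2": {"creator": "Jules Verne", "title": "Vingt mille lieues sous les mers"},
}

context = build_context(livres, novels, ["auteur", "titre"], ["creator", "title"])
# Each non-empty intent is a candidate link key; its extent is the set of links it generates.
candidates = {i: extent(context, i) for i in concept_intents(context) if i}
for intent, links in candidates.items():
    print(sorted(intent), sorted(links))
```

On this data the candidate {⟨auteur, creator⟩} links both l1 and l2 to n1, whereas the larger candidate {⟨auteur, creator⟩, ⟨titre, title⟩} links only l1 to n1. Choosing among such candidates requires a quality measure, which this sketch does not include.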

Expected results

References

[Atencia 2014b] Manuel Atencia, Jérôme David, Jérôme Euzenat, Data interlinking through robust link key extraction, Proc. 21st ECAI, Prague (CZ), pp15-20, 2014
[Atencia 2014c] Manuel Atencia, Michel Chein, Madalina Croitoru, Jérôme David, Michel Leclère, Nathalie Pernelle, Fatiha Saïs, François Scharffe, Danai Symeonidou, Defining key semantics for the RDF datasets: experiments and evaluations, Proc. 21st ICCS, Iasi (RO), pp65-78, 2014
[Atencia 2014d] Manuel Atencia, Jérôme David, Jérôme Euzenat, What can FCA do for database link key extraction?, Proc. ECAI workshop on "what can FCA do for AI?", Prague (CZ), 2014
[Ganter 1999a] Bernhard Ganter, Rudolf Wille, Formal concept analysis: mathematical foundations, Springer, Berlin (DE), 1999
[Heath 2011a] Tom Heath and Christian Bizer, Linked Data: Evolving the Web into a Global Data Space, Morgan & Claypool, 2011
[Hacene 2013a] Mohamed Rouane Hacene, Marianne Huchard, Amedeo Napoli, Petko Valtchev, Relational concept analysis: mining concept lattices from multi-relational data, Annals of Mathematics and Artificial Intelligence 67(1):81-108, 2013


http://exmo.inria.fr/training/M2R-2016-fcakey.html
