The web has been constantly evolving from a distributed hypertext system into a very large information processing machine. As fast as it is, this evolution is grounded in theoretical principles borrowed from several fields of computer science such as programming languages, databases, structured documentation, logic and artificial intelligence. The smooth operation of the past and future web at a large scale relies on these foundations. The goal of this course is to present them, the problems that they solve as well as those that they uncover. It considers three milestones of this evolution: XML, the social web and the semantic web.
The first part introduces the foundations of XML technologies: the XML language for document markup, DTDs as a type system for XML documents, the XML query languages XPath and XQuery, and the XML transformation language XSLT. We will consider the major results obtained on each of these languages as well as the open questions. Then we introduce the challenges that these technologies raise for theoretical computer science. This covers the formal methods used for grounding these technologies (tree automata, tree logics, their algorithms and complexity) as well as their application to XML query typing and static analysis of XML transformation languages.
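To give a flavour of path-based querying over XML trees, here is a minimal sketch in Python. It relies on the limited XPath subset supported by the standard `xml.etree.ElementTree` module, not on a full XPath or XQuery engine. The sample document, its element and attribute names, and the function `titles_by_lecturer` are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (not part of the course material).
DOC = """<course>
  <session lecturer="PG"><title>Core XML</title></session>
  <session lecturer="JE"><title>Querying RDF</title></session>
  <session lecturer="PG"><title>Tree Logics</title></session>
</course>"""


def titles_by_lecturer(xml_text, lecturer):
    """Return the titles of the sessions given by one lecturer.

    The path selects every <session> child of the root whose
    'lecturer' attribute equals the given value, then its <title> child.
    """
    root = ET.fromstring(xml_text)
    path = f"./session[@lecturer='{lecturer}']/title"
    return [title.text for title in root.findall(path)]


pg_titles = titles_by_lecturer(DOC, "PG")
print(pg_titles)
```

The query walks the document tree in document order, which is why the result keeps the order of the sessions in the source.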
The second part summarizes the data models and algorithms required to extract, manage and access massive amounts of social content. The course examples are drawn from real-world applications such as URL search and recommendation on Delicious, group recommendation in MovieLens and extracting travel itineraries from Flickr photos. The goals are to acquire knowledge of scalable algorithms for processing large volumes of social data and extracting value from that data, and to learn how to run and interpret large-scale user studies.
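As a toy illustration of recommendation over social bookmarking data, the sketch below scores unseen URLs by the Jaccard similarity between users' bookmark sets. This is one simple neighbourhood-based technique chosen for illustration, not necessarily the method presented in the course, and it makes no attempt at scalability. The user names, URLs and the functions `jaccard` and `recommend` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(bookmarks, user, k=3):
    """Rank URLs that `user` has not bookmarked yet.

    Each candidate URL accumulates the similarity of every other user
    who bookmarked it; ties are broken alphabetically.
    """
    mine = bookmarks[user]
    scores = {}
    for other, urls in bookmarks.items():
        if other == user:
            continue
        sim = jaccard(mine, urls)
        if sim == 0.0:
            continue  # users with nothing in common contribute nothing
        for url in urls - mine:
            scores[url] = scores.get(url, 0.0) + sim
    return sorted(scores, key=lambda u: (-scores[u], u))[:k]


# Made-up bookmarking data: user -> set of bookmarked URLs.
bookmarks = {
    "alice": {"example.org/xml", "example.org/rdf"},
    "bob": {"example.org/xml", "example.org/rdf", "example.org/owl"},
    "carol": {"example.org/xml", "example.org/recipes"},
    "dave": {"example.org/sports"},
}

suggestions = recommend(bookmarks, "alice")
print(suggestions)
```

Here bob shares two of three URLs with alice and carol only one of three, so bob's extra bookmark ranks first; dave shares nothing and is ignored.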
The third part introduces the semantics of knowledge representation on the web. The semantic web extends the web with richer and more precise information because it is expressed in a formal language using a vocabulary defined in an ontology (a structured vocabulary of concepts and properties defined in a logic). Ontologies are used for describing web resource content and reasoning about these resources formally. We introduce the semantic web languages (RDF, RDFS, OWL) and show their relations with knowledge representation formalisms (conceptual graphs, description logics) and XML. This provides tools for reasoning with ontologies and, in particular, for evaluating queries. However, the distributed nature of the web leads to heterogeneous ontologies which must be matched before they can be used together. We discuss ontology matching and explain how to semantically interpret the relations between ontologies. Finally, this is applied to networks of peers using knowledge together.
This year, the course will start with the semantic web part.
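To make the idea of reasoning with an ontology before answering a query concrete, here is a minimal sketch in Python. It represents an RDF graph as a set of (subject, predicate, object) tuples and saturates it with only two RDFS entailment rules: transitivity of `rdfs:subClassOf` and propagation of `rdf:type` along `rdfs:subClassOf`. It is a small fragment of RDFS, not a full reasoner, and the `ex:` names and the functions `rdfs_closure` and `match` are assumptions made up for illustration.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_closure(triples):
    """Saturate a graph with two RDFS rules until a fixpoint is reached.

    Rule 1: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type A), (A subClassOf B)      => (x type B)
    """
    closure = set(triples)
    while True:
        inferred = set()
        for (a, p, b) in closure:
            if p != SUBCLASS:
                continue
            for (s, q, o) in closure:
                if q == SUBCLASS and s == b:
                    inferred.add((a, SUBCLASS, o))
                elif q == TYPE and o == a:
                    inferred.add((s, TYPE, b))
        if inferred <= closure:
            return closure  # nothing new: fixpoint reached
        closure |= inferred


def match(graph, pattern):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if all(want is None or want == got for want, got in zip(pattern, t))
    )


# Hypothetical ontology and data.
graph = {
    ("ex:Lecturer", SUBCLASS, "ex:Researcher"),
    ("ex:Researcher", SUBCLASS, "ex:Person"),
    ("ex:je", TYPE, "ex:Lecturer"),
}

# Without reasoning, the query "who is a Person?" has no answer.
plain_answers = match(graph, (None, TYPE, "ex:Person"))
# After saturation, the entailed answer is found.
entailed_answers = match(rdfs_closure(graph), (None, TYPE, "ex:Person"))
print(plain_answers, entailed_answers)
```

Saturating the graph first and then matching patterns is only one way to answer queries under entailment; rewriting the query instead of the data is the other classical option.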
The course is scheduled in 12 sessions of three hours plus a final exam:
Title | Lecturer |
Semantic web languages (Data: URI, RDF, closure, interpolation lemma) | JE |
Semantic web languages (Ontologies: RDFS and OWL) | JE |
Querying RDF (SPARQL) | JE |
Querying data through ontologies (NSPARQL, PSPARQL, DL-Lite) | JE |
Alignment semantics and networked ontologies + Mid-term exam | JE |
Introduction to the social web | SAY |
Search and recommendation in the social web | SAY |
Core XML (XML, Schemas, Parsing) | PG |
Programming with XML (Streaming Validation, XPath, XQuery) | PG |
Foundations of XML Types (Tree Grammars, Tree Automata) | PG |
Tree Logics (FO, MSO) | PG |
Tree Logics continued (μ-calculus) | PG |
Final exam |
Lecturer: Jérôme Euzenat
This part of the course is now collected into a single volume of lecture notes. These notes are constantly evolving, so avoid printing them until shortly before the exams. It is easier to download (and update) the PDF and browse through it. It is divided into three parts corresponding to the main sessions.
Lecturer: Sihem Amer Yahia
Lecturer: Pierre Genevès
Slides are available from: http://www.pierresoft.com/pierre.geneves/teaching.htm.
In previous years, we had 3-hour exams at the end of the course. Starting in 2010-2011, we have two exams. The aim is to make sure that students know what is expected of them. In addition, here are some past exams.
Here are some questions from an exam given at EPFL in 2009 and their solutions (in English) for the XML part only.
Here is the exam of 2008-2009 (in French) and its solutions (in English) for the semantic web part only.
Here is the exam of 2009-2010 (in French or English) and its solutions (in English) for the semantic web part only.
Here is the exam of 2010-2011 (in French or English) and its solutions (in English) for the semantic web part only.
Here is the exam of 2012-2013 (in English) for the semantic web and social network part and its solutions (in English) for the semantic web part only.
Here is the exam of 2013-2014 (in English) for the semantic web and social network part and its solutions (in English) for the semantic web part only.
Here is the exam of 2014-2015 (in English) for the semantic web and social network part and its solutions (in English) for the semantic web part only.
Here is the midterm exam of 2015-2016 and its solutions for the semantic web part (in English).
Here is the midterm exam of 2016-2017 and its solutions for the semantic web part (in English).