Semantic Web and Web 3.0 - Lecture 9 - Web Technologies (1019888BNR)

Beat Signer
November 20, 2023

This lecture forms part of the course Web Technologies given at the Vrije Universiteit Brussel.

Transcript

  1. 2 December 2005
    Web Technologies
    Semantic Web and Web 3.0
    Prof. Beat Signer
    Department of Computer Science
    Vrije Universiteit Brussel
    beatsigner.com

  2. Beat Signer - Department of Computer Science - [email protected] 2
    November 21, 2023
    The Semantic Web
    I have a dream for the Web [in which computers] become capable of analyzing all the
    data on the Web–the content, links, and transactions between people and computers.
    A 'Semantic Web', which should make this possible, has yet to emerge, but when it
    does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be
    handled by machines talking to machines. The 'intelligent agents' people have touted
    for ages will finally materialize.
    Weaving the Web - The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor,
    Tim Berners-Lee, Harper San Francisco, September 1999
    Tim Berners-Lee

  3. The Semantic Web ...
    The Semantic Web is a vision: the idea of having data on
    the Web defined and linked in a way that it can be used by
    machines not just for display purposes, but for automation,
    integration and reuse of data across various
    applications. Metadata provides a means to make
    statements and create machine-readable statements.
    W3C, 2003

  4. The Semantic Web ...
    ▪ Meaning of data on the Web can not only be inferred by
    people but also discovered by machines without (or with
    less) human intervention
    ▪ Web of Data instead of Web of Documents
    ▪ the Web as a huge decentralised database (knowledge base)
    ▪ machine-accessible data
    ▪ data may be interconnected similar to today's webpages
    ▪ machine-readable metadata for existing web content
    ▪ combination of data from different sources to derive new facts
    ▪ machines (agents) may use logical reasoning to infer facts that
    are not explicitly recorded
    ▪ Crucial component of Web 3.0 or Giant Global Graph

  5. Video: The Future Internet

  6. Semantic Web Stack
    ▪ The Semantic Web Stack
    (or Semantic Web Cake)
    describes the architecture
    of the Semantic Web
    ▪ URI/IRI
    - unique identification of semantic
    web resources
    ▪ Unicode
    - representing/manipulating text
    in different languages
    ▪ XML
    - interchange of structured data
    over the Web
    [Figure: Semantic Web Stack with the layers Identifiers (URI/IRI), Character set (Unicode),
    Syntax (XML and XML Namespaces), Data interchange (RDF), Taxonomies (RDFS), Ontologies (OWL),
    Querying (SPARQL), Rules (RIF/SWRL), Unifying Logic, Proof, Trust, Cryptography, and
    User interface and applications.
    Based on http://en.wikipedia.org/wiki/File:Semantic-web-stack.png]

  7. Semantic Web Stack ...
    ▪ XML Namespaces
    - uniquely qualify markup from
    multiple sources (integration)
    ▪ Resource Description
    Framework (RDF)
    - define RDF triples and represent resource information in
    a graph structure
    ▪ RDF Schema (RDFS)
    - create hierarchies of classes
    and properties

  8. Semantic Web Stack ...
    ▪ Web Ontology Language
    (OWL)
    - language to define vocabularies
    - extends RDFS with more advanced features (e.g. cardinality)
    - enables reasoning based on
    description logic
    ▪ SPARQL
    - query language to query any
    RDF-based data
    ▪ Rule Interchange Format
    (RIF) and Semantic Web
    Rule Language (SWRL)
    - describe relations that cannot be
    described in OWL

  9. Semantic Web Stack ...
    ▪ Unifying Logic
    - logical reasoning (infer new
    facts and check consistency)
    ▪ Proof
    - explain logical reasoning steps
    ▪ Cryptography
    - protect RDF data via encryption
    - validate the source of facts by
    digitally signing RDF data
    ▪ Trust
    - authentication of sources and
    trustworthiness of derived facts
    ▪ User Interface
    - user interfaces for semantic web
    applications

  10. Resource Description Framework
    ▪ The Resource Description Framework (RDF) has
    been designed to describe
    ▪ data and metadata about specific subjects
    ▪ structure of data sets
    ▪ relationships between bits of data
    ▪ An RDF statement (triple) consists of three parts
    ▪ subject
    ▪ predicate (property)
    ▪ object (value)
    {person-1, name, "Niklaus Wirth"}
    subject predicate object
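The three-part structure of a statement can be sketched in Python as follows. This is only an illustration: the `Triple` type and its field names are assumptions made for this example and do not come from any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate (property) and object (value)."""
    subject: str
    predicate: str
    object: str


# The example statement from the slide.
statement = Triple("person-1", "name", "Niklaus Wirth")

print(statement.subject)    # person-1
print(statement.predicate)  # name
print(statement.object)     # Niklaus Wirth
```

A plain tuple would work equally well; the named fields only make the three roles explicit.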

  11. Resource Description Framework ...
    ▪ Subjects, predicates and objects are all resources
    ▪ subject is either a URI reference or a blank node
    ▪ predicate is a URI reference defining the relationship
    ▪ object is either a URI reference, a literal or a blank node
    ▪ RDF data is often stored in relational databases or
    so-called triplestores such as Apache Jena (TDB)
    ▪ up to billions of triples

  12. RDF Graph
    ▪ A set of RDF statements can be represented as a
    directed labelled graph
    ▪ note that in RDF we can only define statements about specific
    instances but not about generic concepts
    - RDFS/ontologies have to be used to define statements about generic concepts
    [Figure: RDF graph in which the resource https://wise.vub.ac.be/beat-signer has the
    properties w:hasGivenName "Beat" and w:hasFamilyName "Signer"]

  13. RDF Graph ...
    ▪ Anonymous resources have no explicit identifier
    ▪ in the example, the "office" is an anonymous resource
    ▪ anonymous resources are also called blank nodes or bnodes
    ▪ blank nodes can only be used as subjects or objects
    [Figure: RDF graph with the resources http://wise.vub.ac.be,
    https://wise.vub.ac.be/beat-signer (w:hasGivenName "Beat", w:hasFamilyName "Signer") and
    https://wise.vub.ac.be/lode-hoste (w:hasGivenName "Lode", w:hasFamilyName "Hoste"),
    connected by w:hasDirector, w:isMember and w:isColleague edges; a blank node reached
    via w:hasOffice has w:room "10F733" and w:phone "026293306"]
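A minimal sketch of how an anonymous resource could be handled in code. The `_:b0` style labels, the `new_blank_node` helper and the choice of lode-hoste as the owner of the office are assumptions for this illustration; the transcript does not show which resource the w:hasOffice edge starts from.

```python
from itertools import count

_ids = count()


def new_blank_node() -> str:
    """Return a fresh local label for an anonymous resource (blank node)."""
    return f"_:b{next(_ids)}"


def is_blank(term: str) -> bool:
    """Blank node labels are recognised by their '_:' prefix in this sketch."""
    return term.startswith("_:")


person = "https://wise.vub.ac.be/lode-hoste"  # assumed owner of the office

office = new_blank_node()
triples = [
    (person, "w:hasOffice", office),   # blank node used as an object
    (office, "w:room", "10F733"),      # blank node used as a subject
    (office, "w:phone", "026293306"),
]

# Blank nodes may appear as subjects or objects, but never as predicates.
assert not any(is_blank(predicate) for _, predicate, _ in triples)
```

The label only has meaning inside this one graph; the office still has no globally valid identifier.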

  14. RDF Reification
    ▪ An RDF triple is not a resource and therefore cannot
    become the subject of another statement
    ▪ we have to reify the original statement
    - make a resource out of the statement
    [Figure: RDF graph in which the isColleague relation between
    https://wise.vub.ac.be/beat-signer and https://wise.vub.ac.be/lode-hoste is reified.
    Labels shown include a statement node (rdf:Statement) with rdf:subject and rdf:object
    edges, isColleague with rdf:type rdf:Property, and a w:forYears property with the value 1,
    next to the w:hasDirector, w:isMember, w:hasGivenName and w:hasFamilyName edges]
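Reification can be sketched with the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The `reify` helper and the `_:stmt1` label are hypothetical names chosen for this example.

```python
from typing import List, Tuple

Triple = Tuple[str, str, object]


def reify(statement_id: str, triple: Triple) -> List[Triple]:
    """Describe `triple` as a resource using the RDF reification vocabulary."""
    subject, predicate, obj = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


original = (
    "https://wise.vub.ac.be/beat-signer",
    "w:isColleague",
    "https://wise.vub.ac.be/lode-hoste",
)

# "_:stmt1" is a hypothetical blank node label for the reified statement.
reified = reify("_:stmt1", original)

# The statement is now a resource, so it can be the subject of another triple.
reified.append(("_:stmt1", "w:forYears", 1))

for triple in reified:
    print(triple)
```

One original triple becomes four describing triples plus the new annotation, which is why reification is often considered verbose.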

  15. Advantages of RDF
    ▪ Simple
    ▪ Enables the combination (merging) of data from
    different data models
    ▪ not easily possible in a relational database (different schemas)
    ▪ The same resource can be annotated by different people
    ▪ resource referenced by URI
    ▪ separation of data and metadata
    ▪ Well-defined standard
    ▪ many tools available
    - triplestores, parsers, editors, frameworks, ...
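The merging advantage can be illustrated with plain Python sets. The two sources and their contents are made up for this sketch; only the URIs and property names are taken from the slides.

```python
# Two hypothetical sources annotating the same resource, identified by its URI.
beat = "https://wise.vub.ac.be/beat-signer"

source_a = {
    (beat, "w:hasGivenName", "Beat"),
    (beat, "w:hasFamilyName", "Signer"),
}
source_b = {
    (beat, "w:hasFamilyName", "Signer"),  # same fact, stated by both sources
    (beat, "w:isColleague", "https://wise.vub.ac.be/lode-hoste"),
}

# Merging two graphs amounts to taking the union of their triple sets;
# no schema alignment is needed and duplicate statements collapse.
merged = source_a | source_b

for triple in sorted(merged):
    print(triple)
```

This simple union is only safe when the graphs contain no blank nodes; blank node labels are local to a graph and would have to be renamed before merging.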

  16. RDF Schema (RDFS)
    ▪ Vocabulary description language for RDF
    ▪ domain vocabulary and structure
    ▪ Define common concepts and relationships
    ▪ classes (rdfs:Class) and subclasses (rdfs:subClassOf)
    ▪ properties and sub-properties (rdfs:subPropertyOf)
    ▪ domain (rdfs:domain) and range (rdfs:range) of a property
    ▪ rdfs:seeAlso, rdfs:isDefinedBy (utility properties)
    ▪ rdfs:label, rdfs:comment
    ▪ ...
    ▪ Provides the basic elements for the definition of
    ontologies

  17. RDF Schema Example
    [Figure: RDF Schema example. The instances https://wise.vub.ac.be/beat-signer
    (w:hasGivenName "Beat", w:hasFamilyName "Signer") and https://wise.vub.ac.be/lode-hoste
    (w:hasGivenName "Lode", w:hasFamilyName "Hoste") are linked by w:isColleague. The schema
    part shows the classes Researcher and Person (rdf:type rdfs:Class, related via
    rdfs:subClassOf) and the property isColleague (rdf:type rdf:Property, with rdfs:domain
    and rdfs:range). The instances are connected to the schema via rdf:type, and the four
    name values have rdf:type rdfs:Literal]

  18. Advantages of RDFS
    ▪ RDFS offers richer expressiveness
    (e.g. subClassOf) than RDF
    ▪ Simple reasoning (e.g. type hierarchy)
    ▪ Many existing tools to deal with RDFS
    ▪ However, some things cannot be expressed; for example
    ▪ "a person must have a family name"
    ▪ "a person can have at most one family name" (cardinality)
    ▪ "if Beat is a colleague of Lode then Lode is a colleague of Beat"
    (symmetry)
    → these issues are addressed by the Web Ontology
    Language (OWL)
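The kind of simple type-hierarchy reasoning that RDFS enables can be sketched as follows: if a resource has type C and C is a subclass of D, then the resource also has type D. The `infer_types` function and the `w:Researcher` / `w:Person` names are assumptions for this example, not an existing reasoner API.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def infer_types(triples: Set[Triple]) -> Set[Triple]:
    """Add the rdf:type triples entailed by rdfs:subClassOf until nothing changes."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_of = [(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"]
        for s, p, o in list(inferred):
            if p != "rdf:type":
                continue
            for sub, sup in subclass_of:
                new = (s, "rdf:type", sup)
                if o == sub and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


beat = "https://wise.vub.ac.be/beat-signer"
graph = {
    ("w:Researcher", "rdfs:subClassOf", "w:Person"),
    (beat, "rdf:type", "w:Researcher"),
}

closure = infer_types(graph)
print((beat, "rdf:type", "w:Person") in closure)  # True
```

Constraints such as cardinality or symmetry cannot be written as rules of this form over RDFS vocabulary alone, which is the gap the slide attributes to OWL.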

  19. RDF(S)/XML Serialisation
    ▪ Syntax not so easy to learn
    ▪ many different ways to construct the same statement
    ▪ long URIs are hard to read
    {https://wise.vub.ac.be/beat-signer, isColleague,
    https://wise.vub.ac.be/lode-hoste}



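    [Listing: RDF/XML serialisation of the statement (markup not preserved); the only
    surviving value is the literal "Beat", ...]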
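As a rough sketch, an RDF/XML document for the triple above can be generated with Python's standard `xml.etree.ElementTree` module. The `W_NS` namespace URI is a placeholder invented for this example, since the namespace behind the slides' prefix is not given.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
W_NS = "http://example.org/wise#"  # hypothetical namespace for the w: vocabulary

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("w", W_NS)

root = ET.Element(f"{{{RDF_NS}}}RDF")

# One rdf:Description per subject; each child element is one predicate.
beat = ET.SubElement(
    root,
    f"{{{RDF_NS}}}Description",
    {f"{{{RDF_NS}}}about": "https://wise.vub.ac.be/beat-signer"},
)
ET.SubElement(
    beat,
    f"{{{W_NS}}}isColleague",
    {f"{{{RDF_NS}}}resource": "https://wise.vub.ac.be/lode-hoste"},
)
given_name = ET.SubElement(beat, f"{{{W_NS}}}hasGivenName")
given_name.text = "Beat"

rdf_xml = ET.tostring(root, encoding="unicode")
print(rdf_xml)
```

Even for two statements the output is dominated by namespace declarations and long URIs, which matches the readability concern raised on the slide.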

  20. RDF Notation 3 (N3)
    ▪ Short non-XML serialisation
    ▪ separate predicates with a semicolon
    ▪ finish subject definition with a full stop
    ▪ Note that the N3 notation offers more features than are
    necessary for RDF(S) serialisation
    ▪ e.g. support for RDF-based rules
    <https://wise.vub.ac.be/beat-signer> m:isColleague <https://wise.vub.ac.be/lode-hoste> ;
    ...
    m:hasGivenName "Beat".
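The two punctuation rules (semicolon between predicates of the same subject, full stop at the end of a subject) can be sketched as a tiny serialiser. The `to_n3` function is a hypothetical helper for this illustration; it expects terms that are already written in N3 syntax and does not emit prefix declarations.

```python
from typing import Dict, List, Tuple


def to_n3(triples: List[Tuple[str, str, str]]) -> str:
    """Group triples by subject; ';' separates predicates, '.' ends a subject."""
    by_subject: Dict[str, List[Tuple[str, str]]] = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, []).append((predicate, obj))

    blocks = []
    for subject, pairs in by_subject.items():
        body = " ;\n    ".join(f"{predicate} {obj}" for predicate, obj in pairs)
        blocks.append(f"{subject} {body} .")
    return "\n".join(blocks)


# Terms are passed in N3 syntax: <URI>, prefixed name or "literal".
triples = [
    ("<https://wise.vub.ac.be/beat-signer>", "m:isColleague",
     "<https://wise.vub.ac.be/lode-hoste>"),
    ("<https://wise.vub.ac.be/beat-signer>", "m:hasGivenName", '"Beat"'),
]

n3_text = to_n3(triples)
print(n3_text)
```

Both statements share one subject, so the output names the subject once and separates the two predicates with a semicolon.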

  21. RDF Turtle Notation
    ▪ Terse RDF Triple Language
    ▪ Subset of N3 language
    ▪ only describes RDF features (RDF graph model)
    ▪ Syntax looks similar to Notation 3
    ▪ https://www.w3.org/TeamSubmission/turtle/
    ▪ Many RDF frameworks (e.g. Jena) offer Turtle parsing
    and serialisation features

  22. RDF Applications
    ▪ Annotea project
    ▪ defines an RDF schema for the types of annotations that can be
    used to annotate webpages
    ▪ RSS
    ▪ some RSS versions use RDF(S)/XML serialisation
    ▪ Dublin Core
    ▪ widely used to describe digital media (also in standard HTML)
    - bibliographic metadata such as title, creator, description, ...
    ▪ uses RDF(S)/XML serialisation as one possible representation

    ...



  23. SPARQL Query Language
    ▪ RDF query language which can be used to
    ▪ extract information as URIs, literals, blank nodes or subgraphs
    ▪ SPARQL SELECT queries return variable bindings
    ▪ SPARQL querying relies on graph pattern matching
    ▪ Example
    ▪ get the name and mbox of all subjects that have both of these
    properties defined
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?mbox
    WHERE { ?x foaf:name ?name .
            ?x foaf:mbox ?mbox }
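Graph pattern matching can be sketched in a few lines of Python. This is a toy evaluator for a list of triple patterns, not a SPARQL implementation; the `match` and `select` functions and the sample data are assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]


def match(pattern: Triple, triple: Triple, bindings: Bindings) -> Optional[Bindings]:
    """Try to extend `bindings` so that `pattern` equals `triple`."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                 # variable
            if result.setdefault(term, value) != value:
                return None                      # conflicts with an earlier binding
        elif term != value:                      # constant that does not match
            return None
    return result


def select(patterns: List[Triple], graph: List[Triple]) -> List[Bindings]:
    """Return all variable bindings that satisfy every triple pattern."""
    solutions: List[Bindings] = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical data: only alice has both a name and a mailbox.
graph = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:bob", "foaf:name", "Bob"),
]

rows = select([("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")], graph)
print([(row["?name"], row["?mbox"]) for row in rows])
# [('Alice', 'mailto:alice@example.org')]
```

Because both patterns share the variable `?x`, a subject without a mailbox produces no row, which mirrors the "both of these properties defined" condition in the example query.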

  24. Web Ontology Language (OWL)
    ▪ OWL evolved from DAML+OIL
    ▪ DAML is the DARPA Agent Markup Language
    ▪ OIL stands for Ontology Inference Layer
    ▪ There exist 3 different OWL sublanguages (flavours) with
    different expressiveness
    ▪ OWL Full
    - maximum expressiveness (full language)
    - no computational guarantee
    ▪ OWL DL
    - maximal OWL Full subset that is still computationally decidable
    ▪ OWL Lite
    - classification hierarchy and simple constraints (limited cardinality constraints)
    - weakest of the three variants

  25. Jena Semantic Web Framework
    ▪ Open source Semantic Web framework for Java
    ▪ create and access data from RDF graphs via an RDF API
    ▪ offers an OWL API
    ▪ data can be stored in files, databases or accessed via URLs
    ▪ https://jena.apache.org
    ▪ RDF graphs can be serialised into different formats
    ▪ RDF/XML
    ▪ Notation 3
    ▪ Turtle
    ▪ relational database
    ▪ SPARQL query interface
    ▪ Multiple reasoners

  26. Protégé
    ▪ Free open-source platform
    to create, manipulate and
    visualise ontologies
    ▪ Two modelling tools
    ▪ Protégé-Frames editor
    - build and populate frame-based
    ontologies
    - Java API for plug-ins
    ▪ Protégé-OWL editor
    - build Semantic Web ontologies

  27. Friend of a Friend (FOAF)
    ▪ First social Semantic Web
    application
    ▪ Miller and Brickley, 2000
    ▪ Describe a social network
    without a central database
    ▪ links can be followed by
    spiders (data mining)
    ▪ no unique identifier
    - identification by description
    (predicates and objects)
    ▪ "six degrees of separation" or
    "small world phenomenon"
    ▪ FOAFNaut browser

  28. Friend of a Friend (FOAF)
    ▪ Personal information and connections to friends in RDF
    ▪ http://www.foaf-project.org
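    [Listing: FOAF description in RDF/XML (markup not preserved). It declares the namespaces
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" and xmlns:foaf="http://xmlns.com/foaf/0.1/"
    and describes a person with the values "Beat Signer", "Prof.", "Beat", "Signer", "Beat" and
    ce6d419869307d57839feef6445a9d64f784eb36, ..., linked to a second person with the values
    "Moira C. Norrie" and 4cb61b36a6feaa48c78acbb51fcce7cb356afdd6, ...]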

  29. Semantic Wikis
    ▪ Use Semantic Web
    technologies to provide
    machine-processable
    Wiki content
    ▪ page content
    ▪ link metadata
    ▪ Ontology reasoning
    ▪ much richer query interface
    ▪ Existing semantic Wikis
    ▪ DBpedia
    ▪ Semantic MediaWiki
    ▪ ...

  30. Linked Data
    ▪ Link different data sources (URIs) on the Web
    ▪ provide metadata about the resources via RDF/XML, N3, etc.
    ▪ provide links to resources in other data sets on the Web
    ▪ Linked Open Data (LOD) cloud project
    ▪ RDF triples from currently 1314 datasets (DBpedia, GeneID, ...)
    ▪ more than 100 billion triples with billions of links
    https://lod-cloud.net

  31. Linked Open Data

  32. Semantic Desktops
    ▪ Apply Semantic Web technologies to personal information
    management (PIM)
    ▪ inter-application data sharing
    ▪ enhancement of limited
    filesystem functionality
    - add document metadata
    ▪ Examples
    ▪ Haystack
    ▪ Nepomuk
    Nepomuk Integration with Dolphin (KDE 4.0)

  33. Microformats
    ▪ Add semantics to (X)HTML pages
    ▪ Makes use of specific (X)HTML tag attributes
    ▪ class and rel attributes
    - e.g. rel="nofollow" for search engines
    ▪ Specific microformats
    ▪ hCard: contact information
    ▪ hCalendar: event information
    ▪ hProduct: product information

  34. hCard Microformat Example
    ▪ Some search engines (e.g. Google and Yahoo) pay
    attention to different types of microformats

    [Listing: hCard markup (HTML not preserved) with the values "Lode Hoste",
    "Vrije Universiteit Brussel", "32 2629 3306" and
    http://wise.vub.ac.be/members/lode-hoste]
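A sketch of how a crawler might read an hCard using only Python's standard `html.parser` module. The HTML below is a hypothetical reconstruction built around the values on the slide; the class names `vcard`, `fn`, `org`, `tel` and `url` are hCard property names, while `HCardParser` is a helper invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical markup; the slide's original HTML is not preserved in the transcript.
HCARD = """
<div class="vcard">
  <span class="fn">Lode Hoste</span>
  <span class="org">Vrije Universiteit Brussel</span>
  <span class="tel">32 2629 3306</span>
  <a class="url" href="http://wise.vub.ac.be/members/lode-hoste">homepage</a>
</div>
"""

PROPERTIES = {"fn", "org", "tel", "url"}


class HCardParser(HTMLParser):
    """Collect the values of elements whose class names an hCard property."""

    def __init__(self) -> None:
        super().__init__()
        self.card = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        found = set((attrs.get("class") or "").split()) & PROPERTIES
        if not found:
            return
        prop = found.pop()
        if prop == "url" and "href" in attrs:
            self.card[prop] = attrs["href"]   # the URL is taken from the attribute
        else:
            self._current = prop              # the value is the element's text

    def handle_data(self, data):
        if self._current and data.strip():
            self.card[self._current] = data.strip()
            self._current = None


parser = HCardParser()
parser.feed(HCARD)
print(parser.card)
```

The page stays ordinary HTML for human readers; the semantics live entirely in the `class` attribute values.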

  35. RDF in Attributes (RDFa)
    ▪ Add a set of attribute extensions to (X)HTML for
    embedding RDF metadata
    ▪ Different vocabularies
    ▪ FOAF, video, audio, commerce, …
    ▪ Search engines (e.g. Yahoo and Google) process certain
    RDFa metadata (e.g. product information)
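    [Listing: RDFa markup (partially preserved). An element with
    about="http://www.amazon.com/..." annotates a text about the book Touching the Void
    ("... and the will to live. Simpson dedicates the book Touching the Void to the ...
    The book was published in December 1989."), where the publication date carries
    content="1989-12-01"]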

  36. Microdata
    ▪ Add machine readable metadata (semantics) to
    HTML5 documents in the form of key/value pairs
    ▪ can be used by crawlers, search engines (SEO) and browsers to
    provide a richer browsing experience
    ▪ alternative to Microformats and RDFa
    W3C Working Group Note

    Hello, my name is Beat Signer and I am a Professor at the Vrije Universiteit Brussel.
    My address is: Pleinlaan 2, 1050 Brussels, Belgium.
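The key/value nature of Microdata can be sketched by extracting `itemprop` values with Python's standard `html.parser` module. The markup below is a hypothetical reconstruction of the example text; the schema.org types and property names are assumptions, since the slide's original attributes are not preserved, and `MicrodataParser` is a helper invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical markup using schema.org property names.
PAGE = """
<div itemscope itemtype="https://schema.org/Person">
  Hello, my name is <span itemprop="name">Beat Signer</span> and I am a
  <span itemprop="jobTitle">Professor</span> at the
  <span itemprop="affiliation">Vrije Universiteit Brussel</span>.
  <div itemprop="address" itemscope itemtype="https://schema.org/PostalAddress">
    My address is:
    <span itemprop="streetAddress">Pleinlaan 2</span>,
    <span itemprop="postalCode">1050</span>
    <span itemprop="addressLocality">Brussels</span>,
    <span itemprop="addressCountry">Belgium</span>.
  </div>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collect (key, value) pairs from elements carrying an itemprop attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.pairs = []
        self._key = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # An itemprop that opens a nested item (itemscope) has no text value itself.
        if "itemprop" in attrs and "itemscope" not in attrs:
            self._key = attrs["itemprop"]

    def handle_data(self, data):
        if self._key is not None and data.strip():
            self.pairs.append((self._key, data.strip()))
            self._key = None


parser = MicrodataParser()
parser.feed(PAGE)
for key, value in parser.pairs:
    print(f"{key}: {value}")
```

This flat extractor ignores the nesting of the address item; a fuller implementation would build one dictionary per `itemscope`.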


  37. Exercise 9
    ▪ Semantic Web

  38. References
    ▪ Tim Berners-Lee, James Hendler and Ora
    Lassila, The Semantic Web, Scientific American
    Magazine, May 2001
    ▪ https://www.scientificamerican.com/article.cfm?id=the-semantic-web
    ▪ The Future Internet: Service Web 3.0
    ▪ https://www.youtube.com/watch?v=off08As3siM
    ▪ Resource Description Framework (RDF)
    ▪ https://www.w3.org/RDF/
    ▪ Toby Segaran, Colin Evans and Jamie Taylor, Programming
    the Semantic Web: Build Flexible Applications with
    Graph Data, O'Reilly Media, August 2009

  39. References ...
    ▪ The Linked Open Data Cloud
    ▪ https://lod-cloud.net

  40. 2 December 2005
    Next Lecture
    Web Search and SEO
