Public Transport Route Planning over Lightweight Linked Data Interfaces

Research presentation on how to publish transport data for maximum reuse.

Presented at ICWE2017

Pieter Colpaert

June 08, 2017

Transcript

  1. pietercolpaert.be linkedconnections.org
    Content:
    1. State of the art in sharing data on the Web for route planners
    2. Proposal: let’s fragment our dataset instead
    3. Evaluation design: replaying real query logs on different set-ups
    4. Results: user perceived performance and cost-efficiency
    5. Conclusion: new valuable trade-off established
  2. Option 1: publishing a GTFS data dump (a zip file containing CSV files).
    Cheap publishing solution, straightforward to keep online.
    High effort required from user agents: they need to import your data over and over again.
  3. Let’s Web-engineer route planning!
    REST for a high user perceived performance, caching and cost-efficiency.
    Hypermedia for enabling intelligent agents.
    Linked Data for semantic interoperability.
  4. Let’s take a look at the data.
    A connection: departureTime + departureStop, arrivalTime + arrivalStop.
    Another connection: departureTime + departureStop, arrivalTime + arrivalStop.
  5. Connection Scan Algorithm ~ creating a minimum spanning tree through a sorted directed acyclic graph.
    [Figure: connections drawn as squares along a time axis]
  6. [Figure: Resource 1, Resource 2, Resource ..., Resource X along a time axis, linked by hydra:next]
    X requests needed instead of just 1.
  7. Evaluation
    Is this new trade-off more cost-efficient for the data publisher?
    How much slower is it for the data reuser?
  8. Three set-ups:
    1. A query server
    2. Linked Connections with always unique user agents
    3. Linked Connections with only one user agent over the entire Web
    Real cost-efficiency of the Linked Connections interface will be found in-between.
  9. Three set-ups:
    1. A query server
    2. Linked Connections server + client
    3. Linked Connections server + client with cache
    Real cost-efficiency of the Linked Connections interface will be found in-between 2 and 3.
    [Diagram labels: https://api.{myapp}/?from={A}&to={B}; https://{myhost}/{datafragmentid}; MongoDB with connections; server; client; cache]
  10. Open research data:
    1. Real schedules (source: https://gtfs.irail.be/)
    2. Real query logs (source: https://api.irail.be/logs)
    https://github.com/linkedconnections/benchmark-belgianrail
  11. Results:
    1. CPU time on the server
    2. Average time spent by the client per connection = an indication of the user perceived performance
  12. CPU time on the server
    Linked Connections definitely has a more lightweight interface.
    Real world lies between these 2 values.
  13. Average time spent by the client per connection
    Under low load, Linked Connections is slightly slower, yet under high load, Linked Connections gives better response times.
    Real world lies between these 2 values.
  14. Non-measured benefits:
    User profile only on your smartphone → privacy by design.
    Combining it with other datasets becomes easy.
    Route planning becomes merely adding a library to your software project.
    “happiest route” by @danielequercia
  15. “Transfers” are now a semantic interoperability problem: a problem we can solve with Linked Data.
    Connection A: departureTime T1, departureStop S1, arrivalTime T2, arrivalStop S2.
    Rail Station S2: longitude X1, latitude Y1, name ..., ParentStation Station S3.
    As S2 becomes reachable, other stops become reachable as well: nearby Bus Stop S4, Bus Stop S5, ...
    [Diagram label: has parent stop]
  16. Conclusion
    New trade-off established for cost-efficiently maximizing possible reuse of public transport data.
    [Diagram labels: Data dumps; Linked Connections; Answer any question on the server; Route planning algorithms as a service; Data publishing; Data services; http://api.{myapp}/?from={A}&to={B}; http://{myhost}/{datafragmentid}]
    Average cache hit-rate of 78%.
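
Slides 4 to 6 describe connections, the Connection Scan Algorithm, and time-ordered resources linked with hydra:next. The following is a minimal Python sketch of how a client could combine the three; it is not the speaker's implementation. The `Connection` and `Page` classes, the `earliest_arrival` function, the in-memory `PAGES` dictionary that stands in for HTTP fragments, and the minutes-since-midnight timestamps are all assumptions made for illustration. Transfer times and footpaths are ignored.

```python
from dataclasses import dataclass
from math import inf
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Connection:
    # Field names mirror the slide: departureTime + departureStop, arrivalTime + arrivalStop.
    departure_stop: str
    departure_time: int  # assumption: minutes since midnight
    arrival_stop: str
    arrival_time: int


@dataclass
class Page:
    connections: List[Connection]  # sorted by departure_time
    next: Optional[str]            # hypothetical stand-in for a hydra:next link


def earliest_arrival(
    pages: Dict[str, Page], first_page: str,
    origin: str, destination: str, depart_at: int,
) -> Tuple[Optional[int], int]:
    """Scan connections in departure order, page by page.

    Returns (earliest arrival time at destination or None, pages fetched).
    """
    earliest = {origin: depart_at}  # best known arrival time per stop
    fetched = 0
    page_id: Optional[str] = first_page
    while page_id is not None:
        page = pages[page_id]  # a real client would do an HTTP GET here
        fetched += 1
        for c in page.connections:
            # Connections are sorted, so nothing departing this late can help.
            if c.departure_time >= earliest.get(destination, inf):
                return earliest[destination], fetched
            reachable = c.departure_time >= earliest.get(c.departure_stop, inf)
            if reachable and c.arrival_time < earliest.get(c.arrival_stop, inf):
                earliest[c.arrival_stop] = c.arrival_time
        page_id = page.next
    return earliest.get(destination), fetched


# Made-up example data: three fragments chained by their `next` field.
PAGES = {
    "p1": Page([Connection("A", 480, "B", 490),
                Connection("B", 495, "C", 510)], next="p2"),
    "p2": Page([Connection("C", 515, "D", 530),
                Connection("B", 520, "D", 525)], next="p3"),
    "p3": Page([Connection("D", 540, "E", 550)], next=None),
}

if __name__ == "__main__":
    print(earliest_arrival(PAGES, "p1", "A", "D", 480))
```

With this example data, a trip from A to D departing at 480 arrives at 525 after fetching three pages, which illustrates the "X requests needed instead of just 1" trade-off from slide 6: the server only serves static, cacheable pages, while the client does the scanning.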