Introduction - Lecture 1 - Advanced Topics in Big Data (4023256FNR)

Beat Signer
February 20, 2024

This lecture forms part of a seminar on Advanced Topics in Big Data given at the Vrije Universiteit Brussel.

Transcript

  1. 2 December 2005 Advanced Topics in Big Data Introduction Prof.

    Beat Signer Department of Computer Science Vrije Universiteit Brussel beatsigner.com
  2. Beat Signer - Department of Computer Science - [email protected] 2

    February 13, 2024 Seminar Organisation ▪ Prof. Beat Signer Vrije Universiteit Brussel PL9.3.60 (Pleinlaan 9) +32 2 629 1239 [email protected] wise.vub.ac.be/beat-signer ▪ Prof. Bas Ketsman Vrije Universiteit Brussel F.10.741 +32 2 629 3480 [email protected] https://www.basketsman.com
  3.

    February 13, 2024 Seminar Organisation … ▪ Prof. Pieter Libin Vrije Universiteit Brussel PL9.3 (Pleinlaan 9) +32 2 629 2964 [email protected] ai.vub.ac.be/team/pieter-libin/ ▪ Dr. Audrey Sanctorum Vrije Universiteit Brussel PL9.3.56 (Pleinlaan 9) +32 2 629 3749 [email protected] wise.vub.ac.be/audrey-sanctorum
  4.

    February 13, 2024 Seminar Organisation … ▪ In addition, we have the following individual supervisors ▪ Dr. Heba Aamer Mohamed ▪ Tim Baccaert ▪ Fernando Hechavarria ▪ Yoshi Malaise ▪ Ingela Rossing ▪ Isaac Valadez ▪ Kushal Soni
  5.

    February 13, 2024 Prerequisites ▪ Students who want to enrol in this course must have passed or be enrolled in Scalable Analytics and Information Visualisation
  6.

    February 13, 2024 Course Goals ▪ In this seminar the student gains insight into recent developments in the field of Big Data systems. They will deepen their knowledge of specific topics in Big Data systems and are required to communicate the outcome to other course participants. The student should be able to critically review the assigned research papers, identify the main contributions and communicate the content in the form of a presentation as well as in a written report. ▪ The student is required to identify the contributions as well as strengths and weaknesses of a given research paper. They should further gain insight into how to evaluate and position a research paper in the context of related work.
  7.

    February 13, 2024 Course Goals ▪ As part of the seminar the student is required to clearly communicate about the assigned research topic. The attendee shows that they can reflect on a given research topic and discuss it with colleagues by asking and answering scientific questions.
  8.

    February 13, 2024 Course Material ▪ All material will be available on Canvas ▪ lecture slides, papers, presentations, links, ... ▪ Make sure that you are subscribed to the Advanced Topics in Big Data course on Canvas ▪ https://canvas.vub.be/courses/34716
  9.

    February 13, 2024 Main Domains of the Seminar ▪ [Diagram] Big Data systems, with the area labels Data Management, DAMA, Human-Data Interaction, and Data Processing and Discovery ▪ Topics shown: scalable data management; advanced query processing (e.g. approximate query processing); large-scale analytical database systems; data integration and interoperability; innovative data storage; exploratory search; complex data exploration and analysis; multimodal information retrieval; visual data discovery; data mining; interactive data processing; data physicalisation; mixed reality and TUIs; cross-media information management and interaction; information visualisation; context-awareness and personalisation; hypermedia and linked data
  10.

    February 13, 2024 Seminar Topics 1. Concurrency in Databases ▪ Developer's Responsibility or Database’s Responsibility? Rethinking Concurrency Control in Databases, Chaoyi Cheng, Spyros Blanas, Mingzhe Han, Michael D. Bond, Nuo Xu and Yang Wang, Proceedings of CIDR 2023, 13th Annual Conference on Innovative Data Systems Research, Amsterdam, The Netherlands, January 2023. [https://www.cidrdb.org/cidr2023/papers/p30-cheng.pdf] 2. Query Languages ▪ A Critique of Modern SQL And A Proposal Towards A Simple and Expressive Query Language, Thomas Neumann and Viktor Leis, Proceedings of CIDR 2024, 14th Annual Conference on Innovative Data Systems Research, Chaminade, USA, January 2024. [https://www.cidrdb.org/cidr2024/papers/p48-neumann.pdf]
  11.

    February 13, 2024 Seminar Topics ... 3. Transactions and Debugging ▪ Transactions Make Debugging Easy, Qian Li, Peter Kraft, Michael Cafarella, Çağatay Demiralp, Goetz Graefe, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh and Matei Zaharia, Proceedings of CIDR 2023, 13th Annual Conference on Innovative Data Systems Research, Amsterdam, The Netherlands, January 2023. [https://www.cidrdb.org/cidr2023/papers/p26-li.pdf] 4. Database Reasoning Over Text ▪ Database Reasoning Over Text, James Thorne, Majid Yazdani, Marzieh Saeidi, Fabrizio Silvestri, Sebastian Riedel and Alon Halevy, Proceedings of ACL/IJCNLP 2021, 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Virtual Event, August 2021. [https://aclanthology.org/2021.acl-long.241.pdf]
  12.

    February 13, 2024 Seminar Topics ... 5. Artificial General Intelligence and Safety ▪ Scalar Reward is Not Enough: A Response to Silver, Singh, Precup and Sutton, Peter Vamplew, Benjamin J. Smith, Johan Källström, Gabriel Ramos, Roxana Rădulescu, Diederik M. Roijers, Conor F. Hayes, Fredrik Heintz, Patrick Mannion, Pieter J.K. Libin, Richard Dazeley and Cameron Foale, Autonomous Agents and Multi-Agent Systems, 36(41), 2022. [https://doi.org/10.1007/s10458-022-09575-5] 6. Individual-based Models ▪ FluTE, a Publicly Available Stochastic Influenza Epidemic Simulation Model, Dennis L. Chao, M. Elizabeth Halloran, Valerie J. Obenchain, Ira M. Longini Jr., PLoS Computational Biology, 6(1), January 2010. [https://doi.org/10.1371/journal.pcbi.1000656]
  13.

    February 13, 2024 Seminar Topics ... 7. Multi-armed Bandits ▪ Using Confidence Bounds for Exploitation-Exploration Trade-offs, Peter Auer, The Journal of Machine Learning Research, 3, March 2003. [https://dl.acm.org/doi/10.5555/944919.944941] 8. Reinforcement Learning Applications ▪ Human-Level Control Through Deep Reinforcement Learning, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg and Demis Hassabis, Nature, 518, February 2015. [https://doi.org/10.1038/nature14236]
  14.

    February 13, 2024 Seminar Topics ... 9. Augmented Reality ▪ XAIR: A Framework of Explainable AI in Augmented Reality, Xuhai Xu, Anna Yu, Tanya R. Jonker, Kashyap Todi, Feiyu Lu, Xun Qian, João Marcelo Evangelista Belo, Tianyi Wang, Michelle Li, Aran Mun, Te-Yen Wu, Junxiao Shen, Ting Zhang, Narine Kokhlikyan, Fulton Wang, Paul Sorenson, Sophie Kim and Hrvoje Benko, Proceedings of CHI 2023, International Conference on Human Factors in Computing Systems, Hamburg, Germany, April 2023. [https://doi.org/10.1145/3544548.3581500] 10. Cross-Media Interfaces ▪ Living Papers: A Language Toolkit for Augmented Scholarly Communication, Jeffrey Heer, Matthew Conlen, Vishal Devireddy, Tu Nguyen and Joshua Horowitz, Proceedings of UIST 2023, 36th Annual ACM Symposium on User Interface Software and Technology, San Francisco, October 2023. [https://doi.org/10.1145/3586183.3606791]
  15.

    February 13, 2024 Seminar Topics ... 11. Data Physicalisation ▪ SensorBricks: A Collaborative Tangible Sensor Toolkit to Support the Development of Data Literacy, Hans Brombacher, Rosa Van Koningsbruggen, Steven Vos and Steven Houben, Proceedings of TEI 2024, 18th International Conference on Tangible, Embedded, and Embodied Interaction, San Francisco, February 2024. [https://doi.org/10.1145/3623509.3633378] 12. Data Extraction ▪ ChartDetective: Easy and Accurate Interactive Data Extraction from Complex Vector Charts, Damien Masson, Sylvain Malacria, Daniel Vogel, Edward Lank and Géry Casiez, Proceedings of CHI 2023, International Conference on Human Factors in Computing Systems, Hamburg, Germany, April 2023. [https://doi.org/10.1145/3544548.3581113]
  16.

    February 13, 2024 Assignment of Topics ▪ Select 6 topics/papers from the presented list and mark them (with A to F) according to your preferences ▪ Send an email with your choices (e.g. 6A, 3B, 2C, 4D, 1E, 5F) to [email protected] no later than February 18 ▪ Each student will be assigned a paper that has to be presented in the seminar and the final seminar schedule will be made available by next week
  17.

    February 13, 2024 Seminar Organisation ▪ Presentation should be 30 minutes long (not longer but also not shorter!) ▪ make use of the available time ▪ have some backup slides/material in case you finish too early and for the Q&A ▪ Structure of your presentation ▪ introduction of topic and problem statement (5-10 mins) ▪ proposed approach (15-20 mins) ▪ review (5 mins) - critical analysis - at least two positive and two negative points about the paper
  18.

    February 13, 2024 Seminar Organisation … ▪ Send a draft of your presentation to your supervisor no later than one week before the presentation and arrange a meeting with your supervisor ▪ you will get feedback about the structure and content of your presentation ▪ Immediately after your presentation, please send us ([email protected]) your slides so that we can make them available to your colleagues on Canvas
  19.

    February 13, 2024 Seminar Organisation … ▪ Each student has to write a report about their presented paper/topic ▪ same structure as presentation - introduction of topic and problem statement - proposed approach - review ▪ no longer than 5 pages ▪ send a draft to your supervisor to get some feedback - arrange a meeting with your supervisor ▪ deadline for final report: May 21
  20.

    February 13, 2024 Seminar Organisation … ▪ Each student will be assigned as a reviewer for two additional papers ▪ hand in a review via the conference system ▪ deadline: at least a week before the paper is presented ▪ Each student is assigned as a metareviewer for one paper ▪ hand in a metareview via the conference system ▪ based on the two reviews and the metareviewer's knowledge ▪ deadline: no later than the Sunday (midnight) before the paper is presented ▪ prepare at least two questions to open the discussion round ▪ template and example (meta)reviews are available on Canvas
  21.

    February 13, 2024 Seminar Organisation … ▪ Each student has to read the papers to be presented each week before the seminar takes place and submit two questions via an online form no later than the Sunday (midnight) before the lecture ▪ https://wise.vub.ac.be/atobi/
  22.

    February 13, 2024 Seminar Organisation … ▪ Final grade is based on ▪ presentation (70%) ▪ written report ▪ reviews and metareview ▪ active participation in the seminar and submitted questions ▪ Everybody is expected to read the papers before the lecture takes place! ▪ after each presentation, there is enough time for questions and a discussion about the topic and content of the paper ▪ Attendance at all presentations is mandatory! ▪ Schedule will be made available on Canvas ▪ first presentations: March 12
  23. 2 December 2005 Next Week ▪ Assign Topics and Answer Questions

    ▪ Some Tips for the Presentation ▪ Conference System