Privacy-Enhancing Data Science (SSI Fellowship 2022)
Presentation for my Software Sustainability Institute (SSI) Fellowship Application to foster the adoption of Privacy-Preserving Data Science tools and methods.
My pronouns are he/him, and I am very pleased to meet you! ☺
I am now in 🇬🇧
Photo credit: Nathan Riley, “Clifton Suspension Bridge, Bristol, United Kingdom”, published on August 14, 2017. Source: https://unsplash.com/photos/iOMkcADNoq8
Research Associate - Fondazione Bruno Kessler (FBK), Trento, Italy
• Reproducible genomics: DNA-Seq to enhance research in precision medicine
• DAP (Data Analysis Pipelines): gitlab.fbk.eu/mpba (/phylogenetic-cnn, /dap, /dapper)
[Figure: DAP workflow. Data splitting into internal training and validation sets; classifier tuning; ranked and selected biomarkers; classification model and prediction; performance evaluation; repeated 10 times with 5-fold CV; random-labels sanity check; best model and selected biomarkers applied to validation data to produce predicted labels.]
• AI for Healthcare Grant
• kubeflow-kale.github.io
• KubeCon 2021 - Keynote
• ML4SE: Machine Learning for Software Engineering. Identification of Code Siblings; Code Identifiers Processing (e.g. drawXORRect => draw|XOR|Rectangle)
• Graphics, Genomics, Histology
• Cloud RSE
Privacy-Preserving Machine Learning (PPML)
• Platform to enable ML algorithms for Mental-Health Data Science in UK birth cohort studies
• Source: “UK Birth Cohorts as a Platform for Ground Truth in Mental Health Data Science”, O. Davis / C. Haworth, ATI Fellowship
• bristol.ac.uk/alspac/
• Awarded by JGI Seed-Corn Funding 2021: jeangoldinginstitute.blogs.bristol.ac.uk/2021/01/07/seed-corn-funding-winner-announcement/
• Member of the Writing/Doc Team & Technical Mentor @ Private AI Series
1. Shared interest in Sustainable Research Software principles and Reproducible Science practice
• Being an SSI Fellow will definitely support me in disseminating these principles among researchers at the University
2. Join a community of peers with whom I wish to collaborate and exchange ideas, and from whom I wish to learn.
to be the Data Science paradigm of the future
• Joint effort of Open Source & ML & Security Communities
• I wish to disseminate the knowledge about these new methods and technologies among researchers
• Focus on Reproducibility of PPML workflows

SSI Fellowship Plans: What I would like to do
• Increase visibility by writing blog posts and short tutorials
• Eventually, submit the material as a proposal for a new Data Carpentry curriculum
• Run at least two Data Carpentry-style workshops on PPML
• Pay for hosting and cloud computing to host and run the teaching materials
• (Ideally) Fund some travel costs and catering for attendees
• (More realistically) Purchase professional recording equipment (e.g. a webcam) and host the bootcamp on remote premises (e.g. gather.town)