sessions and coursework
• Computational exercises: paired with each lecture (due at the end of each computer lab)
• Research challenge: assignment to complete (details after Lecture 9)
Registration of absence or mitigation goes via the student office
An opportunity to develop your practical skills. Goals:
• skills you have picked up so far
• To extend your knowledge through self-study, exploration, and cohort interactions
• To produce annotated code with a comparison to community benchmarks
is to produce an original model for the given classification or regression task.
Some tasks use chemical composition only, while others use composition and structure.
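For the composition-only tasks, a model needs a numerical representation of each chemical formula. The block below is a minimal sketch of one simple option: a vector of element fractions. The helper names `parse_formula` and `fractional_composition` are hypothetical, and the parser only handles plain formulas without brackets. Dedicated materials informatics packages offer richer featurisers, so treat this as an illustration rather than a recommended method.

```python
import re

# One element symbol followed by an optional amount, e.g. "Fe2", "O", "Li0.5"
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_formula(formula: str) -> dict[str, float]:
    """Return element amounts for a simple formula (hypothetical helper).

    Assumption: no brackets, charges, or hydrates, e.g. "Fe2O3" or "Li0.5CoO2".
    """
    tokens = _TOKEN.findall(formula)
    # Reject anything the simple pattern cannot fully account for
    if not tokens or "".join(sym + amt for sym, amt in tokens) != formula:
        raise ValueError(f"Unsupported formula: {formula!r}")
    amounts: dict[str, float] = {}
    for symbol, amount in tokens:
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return amounts


def fractional_composition(formula: str, elements: list[str]) -> list[float]:
    """Return the atomic fraction of each element in `elements`, in that order.

    Elements of the formula that are missing from `elements` are ignored in
    the output vector but still count towards the total number of atoms.
    """
    amounts = parse_formula(formula)
    total = sum(amounts.values())
    return [amounts.get(el, 0.0) / total for el in elements]


elements = ["Fe", "O", "Ti"]  # illustrative element list, not a course requirement
print(fractional_composition("Fe2O3", elements))  # [0.4, 0.6, 0.0]
```

Tasks that also provide structure need descriptors beyond this vector, which the sketch does not attempt.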
Read the matbench paper and the models that have been tested: https://doi.org/10.1038/s41524-020-00406-3
I. Data Preparation
II. Model Selection, Training & Testing
III. Discussion of Results
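The three stages above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions: the data are synthetic placeholders rather than a matbench task, the random forest is an arbitrary example choice, and the split and metric are not taken from the matbench paper, which defines its own evaluation protocol.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# I. Data preparation
# Synthetic placeholder features and target (assumption: stands in for real data)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)

# Hold out a test set that is never used during model selection
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# II. Model selection, training and testing
# Scaling lives inside the pipeline so it is fitted on training folds only
model = make_pipeline(
    StandardScaler(),
    RandomForestRegressor(n_estimators=100, random_state=0),
)

# 5-fold cross-validation on the training set; scores are negated MAE
cv_mae = -cross_val_score(
    model, X_train, y_train, cv=5, scoring="neg_mean_absolute_error"
)
model.fit(X_train, y_train)

# A trivial baseline that always predicts the training mean
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)

# III. Discussion of results
test_mae = mean_absolute_error(y_test, model.predict(X_test))
baseline_mae = mean_absolute_error(y_test, baseline.predict(X_test))
print(f"CV MAE: {cv_mae.mean():.3f} +/- {cv_mae.std():.3f}")
print(f"Test MAE: {test_mae:.3f} (mean baseline: {baseline_mae:.3f})")
```

Reporting a trivial baseline alongside the model gives the discussion a reference point before any comparison with published benchmark results.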
unique solution for a given problem
You may be interested in speed or clarity, but ultimately you want robust code
• Check package manuals, e.g. https://matplotlib.org & https://scikit-learn.org
• Search https://stackexchange.com & https://github.com for ideas
you use an LLM (e.g. GPT, Gemini, Co-Pilot)?
• Specify tasks (e.g. code assistance)
• Were any limitations/biases noted?
• How did you ensure ethical use?
Statement to be included in the submitted notebook
rooms:
Class 9: 14:00-15:00
Class 10: 14:00-15:00
Computer room is also booked on Feb 13th/16th/20th for your own independent research
Submission deadline: 9 March 15:00
notebook (.ipynb) and
2. Recorded presentation* (max 5 min), where you introduce your code and your results on model training, selection, and performance
*Format is flexible: it could be recorded in PowerPoint, as a screenshare on Zoom, or as a plain video
appropriate pre-processing steps
Model Selection, Training and Testing (20 %): Justify model based on the problem, with appropriate validation and testing
Model Analysis and Discussion (20 %): Analysis of model performance, including high-quality plots
Python Code Quality (20 %): Clearly structured code with meaningful annotations
Recorded Presentation (30 %): Clarity and conciseness in model choices, results, limitations
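One common plot for analysing a regression model is a parity plot of predicted against true values. The block below is a minimal sketch with matplotlib, assuming synthetic placeholder values for `y_true` and `y_pred`. The axis labels, figure size, and styling are illustrative choices and not marking requirements.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic placeholder values (assumption: replace with real test-set results)
rng = np.random.default_rng(0)
y_true = rng.uniform(0.0, 5.0, size=100)
y_pred = y_true + rng.normal(scale=0.3, size=100)

# Mean absolute error, shown in the title so the plot is self-explanatory
mae = float(np.mean(np.abs(y_true - y_pred)))

fig, ax = plt.subplots(figsize=(4, 4), dpi=150)
ax.scatter(y_true, y_pred, s=12, alpha=0.7)

# Diagonal line marking perfect agreement between prediction and truth
lims = [min(y_true.min(), y_pred.min()), max(y_true.max(), y_pred.max())]
ax.plot(lims, lims, "k--", linewidth=1, label="ideal")

ax.set_xlabel("True value (arbitrary units)")
ax.set_ylabel("Predicted value (arbitrary units)")
ax.set_title(f"Parity plot, MAE = {mae:.2f}")
ax.set_aspect("equal")
ax.legend()
fig.tight_layout()
```

In a notebook the figure displays inline. Calling `fig.savefig` with a file name of your choice writes it to disk for use in the presentation.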
on decision-making processes
Transparency and Explainability: Interpretation of model predictions
Privacy and Data Protection: Collection, storage, and use of sensitive data
Social Impacts: From productivity increases to job displacements
How do these translate to the materials context?