
Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding (ICLR 2024)


Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding
Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki, Naoya Chiba, Kotaro Saito, Yoshitaka Ushiku, and Kanta Ono
In The Twelfth International Conference on Learning Representations (ICLR 2024)
Paper: https://openreview.net/forum?id=fxQiecl9HB
Project page: https://omron-sinicx.github.io/crystalformer/

Tatsunori Taniai

April 16, 2024


Transcript

  1. Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding
     Tatsunori Taniai (OMRON SINIC X Corporation), Ryo Igarashi (OMRON SINIC X Corporation),
     Yuta Suzuki (Toyota Motor Corporation), Naoya Chiba (Tohoku University),
     Kotaro Saito (Randeft Inc. / Osaka University), Yoshitaka Ushiku (OMRON SINIC X Corporation),
     Kanta Ono (Osaka University)
     The Twelfth International Conference on Learning Representations,
     May 7th through 11th, 2024, at Messe Wien Exhibition and Congress Center, Vienna, Austria
  2. Materials science and crystal structures
     Materials science: explore and develop new materials with useful properties and
     functionalities, such as superconductors and battery materials.
     Crystal structure:
     • The "source code" of a material.
     • An infinitely repeating, periodic arrangement of atoms in 3D space.
     • Described by a minimum repeatable pattern called the unit cell.
     (Figure: crystal structure of NaCl and its unit cell.)
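As background for the following slides, here is a minimal sketch (not taken from the paper or its codebase) of how a crystal structure is commonly represented in code: a lattice matrix, fractional atomic coordinates, and atomic species. The NaCl numbers are approximate and the variable names are illustrative only.

```python
import numpy as np

# FCC primitive cell of rock-salt NaCl (conventional lattice constant ~5.64 Å).
lattice = 2.82 * np.array([[0.0, 1.0, 1.0],     # lattice vectors l1, l2, l3 as rows
                           [1.0, 0.0, 1.0],
                           [1.0, 1.0, 0.0]])
frac_coords = np.array([[0.0, 0.0, 0.0],        # Na
                        [0.5, 0.5, 0.5]])       # Cl
species = ["Na", "Cl"]

cart_coords = frac_coords @ lattice             # Cartesian positions inside the unit cell
# The full crystal is the infinite point set {x + n1*l1 + n2*l2 + n3*l3 : n in Z^3}
# for each unit-cell atom position x.
```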
  3. Material property prediction
     Pipeline: crystal structure (unit cell) → neural network → material properties:
     • Formation energy
     • Total energy
     • Bandgap
     • Energy above hull
     • etc.
     Neural network:
     • Much faster than density functional theory (DFT) calculations.
     • Useful for accelerating material discovery and development processes.
  4. Periodic SE(3)-invariant prediction
     Pipeline: crystal structure → interatomic message-passing layers → material properties
     (formation energy, total energy, bandgap, energy above hull, etc.).
     • The layers evolve the state feature of each unit-cell atom via interatomic interactions.
     • For property prediction, networks need to be invariant to periodic SE(3) transformations
       of atomic positions: rotation, translation, and periodic boundary shifts (a small check
       is sketched below).
     • While not our focus, force prediction requires networks to be SE(3) equivariant.
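A small illustrative check (not from the paper) that distance-based features satisfy these requirements: rotations and translations leave interatomic distances unchanged, and a periodic boundary shift of the fractional coordinates only relabels periodic images. The lattice, atom count, and function name are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
lattice = np.array([[4.0, 0.0, 0.0],
                    [0.0, 5.0, 0.0],
                    [1.0, 0.0, 6.0]])        # arbitrary lattice vectors (rows)
frac = rng.uniform(size=(4, 3))              # fractional coordinates of 4 atoms
pos = frac @ lattice                         # Cartesian positions

def smallest_image_distances(pos, lattice, radius=3, k=100):
    """The k smallest distances from unit-cell atoms to periodic images with |n_k| <= radius."""
    grid = np.arange(-radius, radius + 1)
    shifts = np.stack(np.meshgrid(grid, grid, grid, indexing="ij"), axis=-1).reshape(-1, 3)
    images = pos[None, :, :] + (shifts @ lattice)[:, None, :]            # (S, N, 3)
    d = np.linalg.norm(images[:, None, :, :] - pos[None, :, None, :], axis=-1)
    return np.sort(d.ravel())[:k]

ref = smallest_image_distances(pos, lattice)

# Rotation + translation (an SE(3) transform) leave all distances unchanged.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
assert np.allclose(ref, smallest_image_distances(pos @ R.T + np.array([1.0, -2.0, 0.5]),
                                                 lattice @ R.T))

# A periodic boundary shift (translate in fractional space, then wrap into [0, 1))
# changes which image is "nearest" but keeps the same overall distance set.
assert np.allclose(ref, smallest_image_distances(((frac + 0.25) % 1.0) @ lattice, lattice))
```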
  5. Advances in ML and material representation learning
     Advances in ML: CNNs (2011-) [ResNet, He+ 2015]; GNNs (2015-) [PointNet, Qi+ 2016;
     DeepSets, Zaheer+ 2017; GCN, Kipf & Welling 2017; GIN, Xu+ 2018]; Transformers (2017-)
     [Transformer, Vaswani+ 2017; BERT, Devlin+ 2018; image generation, Parmar+ 2018;
     ViT, Dosovitskiy+ 2020].
     Molecules (3D arrangement of finite atoms): many geometric GNNs [Duvenaud+ 2015;
     Kearnes+ 2016; Gilmer+ 2017]; success of Transformers [Graphormer, Ying+ 2021;
     Equiformer, Liao+ 2023].
     Crystals (3D arrangement of infinite atoms): many geometric GNNs [CGCNN, Xie & Grossman 2018;
     SchNet, Schütt+ 2018; MEGNet, Chen+ 2019]; emergence of Transformers [Matformer, Yan+ 2022].
     Graphormer (2021) showed that fully connected self-attention networks are effective for
     molecules. However, whether such dense attention is applicable to crystal structures is still
     an open question because of their infinite structure sizes.
  6. Atomic state evolution by self-attention
     Molecule: fully connected self-attention for finite elements.
     Atom-wise state features:
     • $q_i$, $k_i$, and $v_i$: linear projections of the input atom-wise state feature $x_i$.
     • $y_i$: output atom-wise state feature.
     Relative position representations $\phi_{ij}$ and $\psi_{ij}$:
     • Encode the relative position between atoms $i$ and $j$.
     • $\phi_{ij}$: scalar bias for the softmax logits.
     • $\psi_{ij}$: vector bias for the value features.
     • Distance-based representations ensure SE(3) invariance.
     (A code sketch follows below.)
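In the paper's notation this finite-element update is $y_i = \frac{1}{Z_i}\sum_j \exp(q_i^\top k_j/\sqrt{d_K}+\phi_{ij})(v_j+\psi_{ij})$. Below is a minimal PyTorch sketch of such attention, assuming a Gaussian distance-decay form for $\phi_{ij}$ and a placeholder form for $\psi_{ij}$; the function name and concrete choices are illustrative, not the authors' implementation.

```python
import torch

def relpos_self_attention(x, pos, sigma=1.0):
    """x: (N, d) atom-wise state features; pos: (N, 3) Cartesian positions."""
    N, d = x.shape
    Wq, Wk, Wv = (torch.nn.Linear(d, d, bias=False) for _ in range(3))
    q, k, v = Wq(x), Wk(x), Wv(x)                       # (N, d) each

    r = torch.cdist(pos, pos)                           # (N, N) interatomic distances
    phi = -r.pow(2) / (2 * sigma**2)                    # scalar bias on the logits (Gaussian decay)
    psi = torch.exp(-r)[..., None].expand(N, N, d)      # vector bias on the values (placeholder form)

    logits = q @ k.t() / d**0.5 + phi                   # (N, N) softmax logits
    attn = torch.softmax(logits, dim=-1)                # attention weights a_ij
    return attn @ v + (attn[..., None] * psi).sum(dim=1)   # y_i = sum_j a_ij (v_j + psi_ij)

y = relpos_self_attention(torch.randn(5, 16), torch.randn(5, 3))   # (5, 16) updated states
```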
  7. Atomic state evolution by self-attention
     Molecule: fully connected self-attention for finite elements.
     Crystal structure: infinitely connected self-attention for periodic elements, where each
     unit-cell atom $j$ has images $j(\mathbf{n})$ under periodic unit-cell shifts $\mathbf{n}$.
     $\phi_{ij(\mathbf{n})}$ and $\psi_{ij(\mathbf{n})}$ encode the relative position
     $r_{ij(\mathbf{n})}$ to reflect the periodic unit-cell shifts $\mathbf{n}$ (written out below).
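Written out, the infinitely connected attention extends the finite sum over atoms $j$ with a sum over all lattice translations $\mathbf{n}$, where $j(\mathbf{n})$ denotes the image of unit-cell atom $j$ shifted by $\mathbf{n}$ and $l_1, l_2, l_3$ are the lattice vectors (following the paper's formulation, up to minor notational differences):

```latex
y_i = \frac{1}{Z_i} \sum_{j=1}^{N} \sum_{\mathbf{n} \in \mathbb{Z}^3}
      \exp\!\left( \frac{q_i^\top k_j}{\sqrt{d_K}} + \phi_{ij(\mathbf{n})} \right)
      \left( v_j + \psi_{ij(\mathbf{n})} \right),
\qquad
Z_i = \sum_{j=1}^{N} \sum_{\mathbf{n} \in \mathbb{Z}^3}
      \exp\!\left( \frac{q_i^\top k_j}{\sqrt{d_K}} + \phi_{ij(\mathbf{n})} \right),
\qquad
p_{j(\mathbf{n})} = p_j + n_1 l_1 + n_2 l_2 + n_3 l_3 .
```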
  8. Interpretation as neural potential summation
     With distance-decay attention, the infinitely connected attention is interpreted as
     interatomic energy calculations in an abstract feature space:
     • $\exp\!\big(q_i^\top k_j/\sqrt{d_K} + \phi_{ij(\mathbf{n})}\big)$: abstract interatomic
       potential between atoms $i$ and $j(\mathbf{n})$, decaying with their distance
       $r_{ij(\mathbf{n})}$.
     • $v_j + \psi_{ij(\mathbf{n})}$: abstract influences on atom $i$ from atom $j(\mathbf{n})$.
     Analogy to potential summation in physics simulations: for example, the electric potential
     at one point produced by many point charges $q_j$ at distances $r_j$ is calculated as
     $\sum_j \frac{1}{4\pi\epsilon_0} \frac{q_j}{r_j}$.
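A compact way to see the analogy (left: the attention update from the previous slide; right: the standard Coulomb potential from electrostatics, which is general physics background rather than text from the slide):

```latex
\underbrace{\; y_i = \frac{1}{Z_i} \sum_{j} \sum_{\mathbf{n}}
    \exp\!\Big( \tfrac{q_i^\top k_j}{\sqrt{d_K}} + \phi_{ij(\mathbf{n})} \Big)
    \big( v_j + \psi_{ij(\mathbf{n})} \big) \;}_{\text{neural potential summation in feature space}}
\quad \longleftrightarrow \quad
\underbrace{\; V_i = \sum_{j} \frac{1}{4\pi\epsilon_0} \, \frac{q_j}{r_{ij}} \;}_{\text{electric potential summation}}
```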
  9. Performed as finite-element self-attention
     Infinitely connected attention can be performed just like standard self-attention for finite
     elements with new position encodings: the periodic spatial encoding $\alpha_{ij}$ (a scalar
     bias on the softmax logits) and the periodic edge encoding $\beta_{ij}$ (a vector bias on the
     value features), which absorb the sums over unit-cell shifts $\mathbf{n}$ (see the sketch below).
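A minimal PyTorch sketch of how such encodings could be computed, assuming a Gaussian distance decay $\phi_{ij(\mathbf{n})} = -r_{ij(\mathbf{n})}^2/2\sigma^2$, a truncated range of unit-cell shifts, and a placeholder form for $\psi$; the function name and details are illustrative, not the released implementation.

```python
import itertools
import torch

def periodic_encodings(pos, lattice, sigma=1.0, R=2, d_psi=16):
    """pos: (N, 3) unit-cell Cartesian positions; lattice: (3, 3) lattice vectors as rows."""
    shifts = torch.tensor(list(itertools.product(range(-R, R + 1), repeat=3)),
                          dtype=pos.dtype)                        # (S, 3) integer shifts n
    offsets = shifts @ lattice                                    # (S, 3) lattice translations
    # r[s, i, j] = distance from atom i to the n_s-shifted image of atom j
    r = (pos[None, None, :, :] + offsets[:, None, None, :]
         - pos[None, :, None, :]).norm(dim=-1)                    # (S, N, N)
    phi = -r.pow(2) / (2 * sigma**2)                              # Gaussian distance decay
    alpha = torch.logsumexp(phi, dim=0)                           # alpha_ij = log sum_n exp(phi_ij(n))
    w = torch.softmax(phi, dim=0)                                 # exp(phi_ij(n)) / sum_n exp(phi_ij(n))
    psi = torch.exp(-r)[..., None].expand(*r.shape, d_psi)        # placeholder edge features psi_ij(n)
    beta = (w[..., None] * psi).sum(dim=0)                        # beta_ij = weighted sum of psi_ij(n)
    return alpha, beta                                            # (N, N) and (N, N, d_psi)

# alpha then replaces the scalar bias and beta the vector bias in an otherwise
# standard finite self-attention over the N unit-cell atoms.
```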
  10. Evaluations on the Materials Project dataset
      MAE results (lower is better):

      Method                            | Form. E. (eV/atom) | Bandgap (eV) | Bulk mod. (log(GPa)) | Shear mod. (log(GPa))
      Train/Val/Test                    | 60000/5000/4239    | 60000/5000/4239 | 4664/393/393      | 4664/392/393
      CGCNN [Xie & Grossman, 2018]      | 0.031   | 0.292 | 0.047  | 0.077
      SchNet [Schütt+, 2018]            | 0.033   | 0.345 | 0.066  | 0.099
      MEGNet [Chen+, 2019]              | 0.03    | 0.307 | 0.06   | 0.099
      GATGNN [Louis+, 2020]             | 0.033   | 0.28  | 0.045  | 0.075
      M3GNet [Chen & Ong, 2022]         | 0.024   | 0.247 | 0.05   | 0.087
      ALIGNN [Choudhary & DeCost, 2021] | 0.022   | 0.218 | 0.051  | 0.078
      Matformer [Yan+, 2022]            | 0.021   | 0.211 | 0.043  | 0.073
      PotNet [Lin+, 2023]               | 0.0188  | 0.204 | 0.04   | 0.065
      Crystalformer                     | 0.0198  | 0.201 | 0.0399 | 0.0692

      Consistently outperforms most of the existing methods in various property prediction tasks,
      while competitive with the GNN-based SOTA, PotNet [Lin+, 2023].
  11. Evaluations on the JARVIS-DFT 3D 2021 dataset
      MAE results (lower is better):

      Method                            | Form. E. (eV/atom) | Total E. (eV/atom) | Bandgap OPT (eV) | Bandgap MBJ (eV) | E hull (eV)
      Train/Val/Test                    | 44578/5572/5572    | 44578/5572/5572    | 44578/5572/5572  | 14537/1817/1817  | 44296/5537/5537
      CGCNN [Xie & Grossman, 2018]      | 0.063  | 0.078  | 0.2   | 0.41  | 0.17
      SchNet [Schütt+, 2018]            | 0.045  | 0.047  | 0.19  | 0.43  | 0.14
      MEGNet [Chen+, 2019]              | 0.047  | 0.058  | 0.145 | 0.34  | 0.084
      GATGNN [Louis+, 2020]             | 0.047  | 0.056  | 0.17  | 0.51  | 0.12
      M3GNet [Chen & Ong, 2022]         | 0.039  | 0.041  | 0.145 | 0.362 | 0.095
      ALIGNN [Choudhary & DeCost, 2021] | 0.0331 | 0.037  | 0.142 | 0.31  | 0.076
      Matformer [Yan+, 2022]            | 0.0325 | 0.035  | 0.137 | 0.3   | 0.064
      PotNet [Lin+, 2023]               | 0.0294 | 0.032  | 0.127 | 0.27  | 0.055
      Crystalformer                     | 0.0319 | 0.0342 | 0.131 | 0.275 | 0.0482

      Consistently outperforms most of the existing methods in various property prediction tasks,
      while competitive with the GNN-based SOTA, PotNet [Lin+, 2023].
  12. Model efficiency comparison
      Our model has high model efficiency compared to the GNN- and Transformer-based SOTA methods,
      PotNet and Matformer. Furthermore, the overall network architecture is simple and closely
      follows the original Transformer encoder architecture (a schematic sketch follows below).

      Method                  | Arch. type  | Train/Epoch | Total train | Test/Mater. | # Params | # Params/Block
      PotNet [Lin+, 2023]     | GNN         | 43 s        | 5.9 h       | 313 ms      | 1.8 M    | 527 K
      Matformer [Yan+, 2022]  | Transformer | 60 s        | 8.3 h       | 20.4 ms     | 2.9 M    | 544 K
      Crystalformer           | Transformer | 32 s        | 7.2 h       | 6.6 ms      | 853 K    | 206 K

      Train and test times are evaluated on the JARVIS-DFT 3D (formation energy) dataset.
      (Architecture figure: a stack of four self-attention blocks, each consisting of multi-head
      attention (Figure 2), a concat linear layer, and a feed-forward layer, followed by pooling
      and a final feed-forward head.)
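As a rough schematic of that architecture description, the sketch below stacks self-attention blocks over unit-cell atoms, pools them, and applies a feed-forward head. The class names are hypothetical, and a plain multi-head attention stands in for the actual blocks, which use the pseudo-finite periodic attention with the $\alpha$/$\beta$ encodings; this is an illustration of the overall layout, not the released model.

```python
import torch
import torch.nn as nn

class AttentionBlock(nn.Module):
    """Transformer-encoder-style block (placeholder for the periodic attention block)."""
    def __init__(self, d=128, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d, d), nn.GELU(), nn.Linear(d, d))
        self.n1, self.n2 = nn.LayerNorm(d), nn.LayerNorm(d)

    def forward(self, x):
        x = self.n1(x + self.attn(x, x, x, need_weights=False)[0])
        return self.n2(x + self.ff(x))

class PropertyPredictor(nn.Module):
    """Embedding -> stacked attention blocks -> pooling -> feed-forward head."""
    def __init__(self, n_species=100, d=128, blocks=4):
        super().__init__()
        self.embed = nn.Embedding(n_species, d)              # atom species -> state feature
        self.blocks = nn.ModuleList(AttentionBlock(d) for _ in range(blocks))
        self.head = nn.Sequential(nn.Linear(d, d), nn.GELU(), nn.Linear(d, 1))

    def forward(self, species):                              # species: (B, N) atomic numbers
        x = self.embed(species)
        for blk in self.blocks:
            x = blk(x)
        return self.head(x.mean(dim=1))                      # pool over unit-cell atoms

pred = PropertyPredictor()(torch.randint(1, 100, (2, 6)))    # (2, 1) predicted property values
```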
  13. Summary and future work
      Crystalformer: a Transformer encoder framework for crystal structures.
      • Simple, with minimal modifications to the original Transformer [Vaswani+ 2017].
      • Good performance with high model efficiency.
      • Outperforms most of the existing methods in various property prediction tasks.
      New interpretation of self-attention as neural potential summation.
      • Key concept that enables dense self-attention between infinitely many atoms.
      • Can more explicitly bridge ML and physics.
      Possible extensions.
      • Incorporate long-range interactions via Fourier-space attention (demo in the paper).
      • Incorporate known interatomic potential forms into the models.
      • Extend to SE(3)-equivariant networks for force prediction.
      Visit our project site at https://omron-sinicx.github.io/crystalformer/