I lead several large-scale open source software and algorithm development efforts in machine learning in science, arts and humanities. The machine learning in science efforts focus on high energy physics, astronomy, planetary science, quantum information science and other fields. I have founded and organized three open source organizations, CERN-HSF, Machine Learning for Science (ML4SCI) and HumanAI, which have continuously taken part in the Google Summer of Code program and together have completed more than 350 software development projects since 2016.
I have developed and made significant contributions to the following open source software packages:
Machine Learning for Science
Deep Learning Analysis and Simulation Framework for Strong Gravitational Lensing (DeepLense)
Authors: Sergei Gleyzer, Michael Toomey, Stephon Alexander, Hanna Parul, Pranath Reddy et al.
Publications:
- P. Reddy, M. Toomey, H. Parul and S. Gleyzer, DiffLense: a Conditional Diffusion Model For Super-Resolution of Gravitational Lensing Data, Mach. Learn. Sci & Tech 5 (2024)
- A. Ojha, S. Gleyzer, M. Toomey and P. Reddy, LensPINN: Physics Informed Neural Network for Learning Dark Matter Morphology in Lensing, NeurIPS2024 Machine Learning and the Physical Sciences (2024)
- P. Reddy, M. Toomey, H. Parul and S. Gleyzer, DiffLense: a Conditional Diffusion Model For Super-Resolution of Gravitational Lensing Data, NeurIPS2024 Machine Learning and the Physical Sciences (2024)
- H. Parul, M. Toomey, P. Reddy and S. Gleyzer, Domain adaptation in application to gravitational lens finding, NeurIPS2024 Machine Learning and the Physical Sciences (2024)
- P. Guan, M. Toomey, and S. Gleyzer, Semi-supervised Super-resolution for Gravitational Lenses with Estimated Degradation Model, NeurIPS2024 Machine Learning and the Physical Sciences (2024)
- A. Shankar, M. Toomey, and S. Gleyzer, Unsupervised Physics-Informed Super-Resolution of Strong Lensing Images for Sparse Datasets, NeurIPS2024 Machine Learning and the Physical Sciences (2024)
- S. Alexander et al., Domain Adaptation for Dark Matter Searches using Strong Gravitational Lensing, The Astrophysical Journal 954 (2023)
- G. Cheeramvelil, M. Toomey and S. Gleyzer, “Equivariant Neural Networks for Signatures of Dark Matter Morphology in Strong Lensing Data”, NeurIPS2023 Machine Learning and the Physical Sciences (2023)
- Y. Deshmukh, K. Sachdev, M. Toomey and S. Gleyzer, “Learning Dark Matter Representation from Strong Lensing Images through Self-Supervision”, NeurIPS2023 Machine Learning and the Physical Sciences (2023)
- J. Velôso de Souza, M. Toomey and S. Gleyzer, “Lensformer: A Physics-Informed Vision Transformer for Gravitational Lensing”, NeurIPS2023 Machine Learning and the Physical Sciences (2023)
- S. Alexander et al., “Decoding Dark Matter Without Supervision”, arXiv:2008.12731
- S. Alexander et al., “Deep Learning the Morphology of Dark Matter Substructure”, arXiv:1909.07346, The Astrophysical Journal 893 (2020)
Google Summer of Code Project(s):
- Learning Representation Through Self-Supervised Learning on Real Gravitational Lensing Images (2024)
- Diffusion Models for Gravitational Lensing Simulation (2024)
- Physics-Guided Machine Learning (2024)
- Resilient Physics-Informed Anomaly Detection and Inference of Lensing Images on Sparse Datasets (2024)
- Learning Representation Through Self-Supervised Learning on Real Gravitational Lensing Images (2024)
- Superresolution for Strong Gravitational Lensing (2024)
- Equivariant Neural Networks for Dark Matter Morphology with Strong Gravitational Lensing (2023)
- Self-Supervised Learning for Strong Gravitational Lensing (2023)
- Lensiformer: A Physics-Informed Vision Transformer Architecture for Dark Matter Morphology (2023)
- Super-Resolution for Strong Gravitational Lensing (2023)
- Self-Supervised Learning for Strong Gravitational Lensing (2023)
- Updating the DeepLense Pipeline (2023)
- Equivariant Transformers for Decoding Dark Matter with Strong Gravitational Lensing (2022)
- Transformers for Dark Matter Morphology with Strong Gravitational Lensing (2022)
- Deep Regression Exploration (2022)
- Transformers for Dark Matter Morphology with Strong Gravitational Lensing (2022)
- Gravitational Lens Finding for Dark Matter Substructure Pipeline (2022)
- Updating the DeepLense Pipeline (2022)
- Domain Adaptation for Decoding Dark Matter with Strong Gravitational Lensing (2021)
- Equivariant Neural Networks for Dark Matter Morphology with Strong Gravitational Lensing (2021)
- Direct Objective Function for Anomaly Detection (2021)
- Building a Python-based Framework for Unsupervised Deep Learning Applications in Strong Lensing Cosmology (2020)
Code: https://github.com/ML4SCI/DeepLense
End-to-End Deep Learning in High-Energy Physics (E2E) Project
Authors: Sergei Gleyzer, Michael Andrews et al.
Publications:
- CMS Collaboration, “Search for Exotic Higgs Boson decays H to AA to 4 gamma with events containing two merged diphotons in proton-proton collisions at sqrt(s) = 13 TeV”, Physical Review Letters 131 (2023)
- CMS Collaboration, “Reconstruction of Decays of Merged Photons using End-to-end Deep Learning with Domain Continuation in the CMS Detector”, Physical Review D 108 (2023)
- M. Andrews et al., “End-to-End Jet Classification of Boosted Top Quarks with the CMS Open Data”, Physical Review D 105 (2022)
- M. Andrews et al., “End-to-End Identification of Quarks and Gluons with the CMS Open Data”, Nuclear Instruments and Methods A 977 (2020)
- M. Andrews, M. Paulini, S. Gleyzer and B. Poczos, “End-to-End Physics Event Classification with the CMS Open Data: Applying Image-based Deep Learning on Detector Data to Directly Classify Collision Events at the LHC”, Computing and Software for Big Science 4 (2020)
Google Summer of Code Project(s):
- Self-Supervised Learning for End-to-End Particle Reconstruction for the CMS Experiment (2024)
- Masked Auto-Encoders for Efficient E2E Particle Reconstruction & Compression for CMS Experiment (2024)
- Masked Auto-Encoders for End-to-End Particle Reconstruction and Compression for the CMS Experiment (2024)
- Vision Transformers for End-to-End Particle Reconstruction for the CMS Experiment (2023)
- Graph Neural Networks for End-to-End Particle Identification with the CMS Experiment (2023)
- Exploring the underlying symmetries in particle physics with equivariant neural networks (2023)
- Vision Transformers for End-to-End Particle Reconstruction for the CMS Experiment (2022)
- Graph Neural Networks for End-to-End Particle Identification with the CMS Experiment (2022)
- End-to-End Deep Learning Reconstruction for CMS Experiment (2022)
- Graph Neural Networks for End-to-End Particle Identification with the CMS Experiment (2022)
- Graph Neural Networks for End-to-End Particle Identification with the CMS Experiment (2021)
- End-to-End Deep Learning Regression for Measurements with the CMS Experiment (2021)
- End-to-End Deep Learning Reconstruction for CMS Experiment (2021)
- End-to-end Deep Learning Reconstruction for the CMS Experiment (2020)
- Accelerating End-to-End Deep Learning Reconstruction using Graph Neural Networks (IRIS-HEP) (2021)
EXXA: Exoplanets with AI
Authors: Sergei Gleyzer, Jason Terry, Jack Mcnish, M. , Mihir Tripathi, Alexandra Murariu, G. Shukla
Publications:
- J. Terry and S. Gleyzer, “Locating Hidden Exoplanets with Machine Learning”, NeurIPS2023 Machine Learning and the Physical Sciences (2023)
- J. Terry, C. Hall, S. Abreu and S. Gleyzer, “Kinematic Evidence of an Embedded Protoplanet in HD 142666 Identified by Machine Learning”, Astrophysical Journal 947 (2023)
- J. Terry, C. Hall, S. Abreu and S. Gleyzer, “Locating Hidden Exoplanets in ALMA Data using Machine Learning”, Astrophysical Journal 941 (2022)
Google Summer of Code Project(s):
- Equivariant Vision Networks for Predicting Planetary Systems’ Architectures (2024)
- Exoplanet Atmosphere Characterization (2024)
- Finding Exoplanets with Astronomical Observations (2023)
- Identifying the Physical Process of Planet Formation (EXXA) (2023)
- Finding Exoplanets with Astronomical Observations (2022)
- Finding Exoplanets with Astronomical Observations (2022)
- Finding Exoplanets with Astronomical Observations (2022)
Falcon: Fast Non-Parametric Detector Simulator
Authors: Sergei Gleyzer, Ali Hariri, Harrison Prosper, Omar Zapata Mesa, Darya Dyachkova, Tom Magorsch
Publications:
- Ali Hariri, S. Gleyzer and D. Dyachkova, “Graph Generative Models for Fast Detector Simulations in High Energy Physics”, arXiv:2104.01725
- S. Gleyzer et al., “Graph Generative Models for Fast Detector Simulation in Particle Physics”, 2020
- S. Gleyzer et al., “Falcon: Towards an Ultra Fast Non-Parametric Detector Simulator”, arXiv:1605.02684, 2016
Google Summer of Code Project(s):
- Non-local GNNs for Jet Classification (2024)
- Diffusion Models for Fast Detector Simulation (2023)
- Anomaly Detection (2022)
- On the potential of graph-based models in High Energy Physics (2021)
- Normalizing Flows for Fast Detector Simulation (2021)
- Graph Generative Models for Fast Detector Simulations in Particle Physics (IRIS-HEP) (2021)
- Fast Simulation with Deep Generative Models (2020)
- Optimize fast detector simulation and multiobjective regression (2018)
- Scaling up Falcon: TMVA implementation of neural networks for multi-jet regression (2017)
Code: Falcon, DeepFalcon, Genie
Quantum Machine Learning for High Energy Physics (QMLHEP) Project
Google Summer of Code Project(s):
- Quple – Quantum GAN (2021)
- Quantum Convolutional Neural Networks for High Energy Physics Analysis at the LHC (2021)
- Quantum Machine Learning for HEP (2020)
Code: Quple (2020)
ROOT/TMVA/SOFIE – The Toolkit for Multivariate Data Analysis
The Toolkit for Multivariate Data Analysis provides a ROOT-integrated machine-learning environment for the processing and parallel evaluation of sophisticated machine learning classification and regression techniques.
From 2015 to 2017, I led a significant upgrade and redesign of TMVA focused on robust GPU-capable deep learning libraries, modularity and parallelization. Since 2017, this effort has been led by Lorenzo Moneta (CERN) and also includes fast inference applications (SOFIE).
Authors: Sergei Gleyzer, Lorenzo Moneta, Omar Zapata, Kim Albertsson et al.
Website: http://www.root.ch/tmva
Publication:
- S. Gleyzer et al., “Machine Learning Developments in ROOT”, in Proceedings of International Conference in High Energy and Nuclear Physics, 2017
Google Summer of Code Project(s):
- TMVA Deep Learning Developments – Inference Code Generation for Batch Normalization (2021)
- 3D Convolutions for GPU (2021)
- ROOT Storage of Deep Learning models in TMVA (2021)
- Inference Code Generation for Recurrent Neural Networks (2021)
- Graph Neural Networks for HEP (2020)
- Development of 3D CNN in TMVA (2020)
- Development of PyTorch Interface in TMVA (2020)
- LSTM and GRU Layers in TMVA (2019)
- Generative Adversarial Networks for Particle Physics Applications (2019)
- Recurrent Neural Networks and LSTMs for Particle Physics Applications (2018)
- Generative Adversarial Networks for Particle Physics Applications (2018)
- Convolutional Neural Networks on GPUs for Particle Physics Applications (2018)
- Variational Auto-encoders on GPUs for Particle Physics Applications (2018)
- Development of Deep Learning Optimization Algorithms (2018)
- Convolutional Neural Networks on GPUs for Particle Physics Applications (2017)
- Recurrent Neural Networks on GPUs for Particle Physics Applications (2017)
- Deep Auto-Encoders for Particle Physics Applications (2017)
- Integration of TMVA and OpenML (2017)
- GPU-Accelerated Deep Neural Networks in TMVA (2016)
- Integrating Machine Learning in Jupyter Notebooks (2016)
- Integration of Spark Parallelization in TMVA (2016)
- Feature engineering in TMVA (2016)
Code: TMVA
Other Google Summer of Code Projects and Software Development:
- Graph Neural Networks for Particle Momentum Estimation in the CMS Trigger System (2021)
- Machine Learning Model for the Albedo of Mercury (2021)
- Machine Learning Model for the Planetary Albedo (2021)
- Background Estimation with Neural Autoregressive Flows (2021)
- Dimensionality Reduction for Studying Diffuse Circumgalactic Medium (2021)
- Uncovering the Enigma of Type-Ia Supernovae: Thermonuclear Supernova Classification via their Nuclear Signatures (2021)
- Deep Learning Algorithms for Momentum Estimation in the CMS Trigger System (2020)
- Cosmic-Ray Imaging Studies via Mission Imagery from Space (2020)
CODER: CMS Open Data Analysis Environment
CODER is a collection of interactive Jupyter notebooks focused on introductory programming concepts and analysis of Open data for K-12 teachers and students.
Authors: Sergei Gleyzer, Omar Zapata
Website: coder.cern.ch
Code: Gallery
PARADIGM: Decision-making Framework for Variable Selection and Reduction in High Energy Physics
Primary Author: Sergei Gleyzer
Publication:
- S. Gleyzer and H. Prosper, “PARADIGM: Decision-Making Framework for Variable Selection and Reduction in High Energy Physics”, in Proceedings of XII International Workshop on Advanced Computing and Analysis Techniques in Physics Research, 2009
Code: partially integrated into TMVA since 2015