Artificial Intelligence

Machine Learning

As described on my research interests page, data science and machine learning are at the very core of my research. I develop novel algorithms and apply them to data-intensive problems in particle physics.

In particular, I am interested in the following areas of algorithm development and applications:

  • End-to-end deep learning
  • Feature extraction
  • Graph Neural Networks
  • Anomaly Detection
  • Physics-inspired deep learning
  • Generative Models for Fast Simulation
  • Quantum Machine Learning


Feature extraction

Feature extraction is one of the most fundamental problems in data science and data analysis. The outcome of the learning process is greatly influenced by the ingredients that the system is able to learn from. Finding the optimal features or building them from provided ingredients is a critical part of achieving a successful outcome for any general machine learning problem.

I have worked on both supervised and unsupervised feature extraction. Among supervised methods, I developed the probabilistic FAST (Feature extraction using Seed Trees) algorithm, which makes probabilistic selections in the parameter space of the input features and extracts feature importance from classifiers built from such “seeds”. More details about the FAST algorithm can be found here.

Code: as of 2015 the FAST algorithm is the default feature importance algorithm in the Toolkit for Multivariate Data Analysis.
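
The seed idea described above can be illustrated with a small sketch. This is not the FAST implementation or the Toolkit for Multivariate Data Analysis code: it assumes a nearest-centroid classifier as a stand-in for any classifier, plain accuracy as the figure of merit, and hypothetical function names (centroid_accuracy, seed_importance). Each "seed" is a random subset of features, and a feature's importance is the average drop in accuracy when it is removed from the seeds that contain it.

```python
import random


def centroid_accuracy(rows, labels, features, train_idx, test_idx):
    """Accuracy of a nearest-centroid classifier restricted to `features`.

    Assumes binary labels (0/1) and that both classes appear in the
    training indices.
    """
    centroids = {}
    for cls in (0, 1):
        members = [rows[i] for i in train_idx if labels[i] == cls]
        centroids[cls] = [sum(r[f] for r in members) / len(members)
                          for f in features]
    correct = 0
    for i in test_idx:
        dist = {cls: sum((rows[i][f] - c) ** 2
                         for f, c in zip(features, centroids[cls]))
                for cls in (0, 1)}
        predicted = 0 if dist[0] <= dist[1] else 1
        correct += int(predicted == labels[i])
    return correct / len(test_idx)


def seed_importance(rows, labels, n_seeds=100, seed=0):
    """Average accuracy loss per feature over randomly drawn feature subsets."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    indices = list(range(len(rows)))
    total = [0.0] * n_features
    count = [0] * n_features
    for _ in range(n_seeds):
        # Draw a random "seed": a subset of at least two features.
        subset = rng.sample(range(n_features), rng.randint(2, n_features))
        # Fresh random train/test split for every seed.
        rng.shuffle(indices)
        half = len(indices) // 2
        train, test = indices[:half], indices[half:]
        base = centroid_accuracy(rows, labels, subset, train, test)
        for f in subset:
            reduced = [g for g in subset if g != f]
            score = centroid_accuracy(rows, labels, reduced, train, test)
            total[f] += base - score  # loss caused by dropping feature f
            count[f] += 1
    return [t / c if c else 0.0 for t, c in zip(total, count)]
```

On a toy dataset in which only one feature separates the two classes, that feature should receive by far the largest importance, while pure-noise features should stay close to zero.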

In unsupervised feature extraction, I have focused on methods applied at the pre-processing stage, similar to convolutional filters in convolutional neural networks. In particular, I apply physical conservation constraints to unsupervised and pre-processing layers to build physically consistent higher-order features.


End-to-End Deep Learning

Together with collaborators, I have been working on end-to-end deep learning applications in particle physics, a combination of feature-extracting deep learning methods and low-level detector data representation. Our current work focuses on multi-dimensional representations of such models and their application to identifying new particles and signatures of new physics in real LHC physics analyses. For more information please take a look at recent publications for particle and event identification using the end-to-end approach.


Physics-inspired Deep Learning

The parameter space of possible machine learning models is immense. However, physical conservation laws provide constraints that can help guide the creation of powerful models and of individual layers within models. I work on introducing physics-inspired model layers that represent real physical quantities that can be formed from the input features. By comparing the resulting layers with known physical quantities of interest, we can infer the validity of such an approach and perhaps find new insights in the data. Building physics knowledge directly into machine learning models may be a promising next step in machine learning applications to particle physics.
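
As one illustration of a physical quantity that can be formed from input features, the sketch below computes the invariant mass of particle combinations from their four-momenta, m² = (ΣE)² − |Σp|² in natural units. The choice of invariant mass, the (E, px, py, pz) input layout, and the names invariant_mass and pairwise_mass_layer are assumptions made for this example, not a description of the layers used in the work above.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

# Assumed input layout: (E, px, py, pz) in natural units (c = 1).
FourVector = Tuple[float, float, float, float]


def invariant_mass(particles: Sequence[FourVector]) -> float:
    """Invariant mass of the system formed by the given particles."""
    e = sum(p[0] for p in particles)
    px = sum(p[1] for p in particles)
    py = sum(p[2] for p in particles)
    pz = sum(p[3] for p in particles)
    m2 = e * e - (px * px + py * py + pz * pz)
    # Clamp small negative values caused by floating-point round-off.
    return math.sqrt(max(m2, 0.0))


def pairwise_mass_layer(particles: Sequence[FourVector]) -> List[float]:
    """Parameter-free 'layer': invariant mass of every particle pair."""
    return [invariant_mass(pair) for pair in combinations(particles, 2)]
```

A layer of this kind has no trainable parameters, so its outputs can be compared directly with known quantities of interest, such as the mass of a resonance decaying to two particles.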


Graph Neural Networks for Particle Physics

I have recently been focusing on developing algorithms and applications of graph networks (geometric deep learning) in the area of detector reconstruction and physics. In particular, I am studying graph-based architectures and models to reconstruct the data from the CMS High Granularity Endcap Calorimeter (HGCAL) to improve particle identification and reduce the backgrounds due to pile-up collisions for the High-Luminosity Large Hadron Collider, as well as novel applications to fast detector simulation and anomaly detection.


Fast Detector Simulation, Multi-objective Regression and Generative Modeling

I am particularly interested in the development of advanced machine learning techniques and models for function estimation, which have many applications in particle physics. In the past, I developed models that estimate a single target function and applied them to the measurement of particle energies, leading to 20-30% improvements over existing methods. More recently I have extended this to deep models and multiple targets, i.e. learning several functions simultaneously.

One application of this technique is in fast detector simulation. Current detector simulation is very time- and resource-consuming, on the order of 1 s/event on a typical CPU. By bypassing this step and learning directly on the output (target) of reconstructed particles, it is possible to achieve a speed-up of many orders of magnitude compared to the current state-of-the-art simulation, without losing the required accuracy. More details can be found here and here. An alternative approach to the same problem is to train a generative model, something we work on as well.


Quantum Machine Learning

Quantum Machine Learning is an emerging area of algorithm development and potential future applications in science. I completed my first quantum computing project in 2001, while working at HYPRES, and have since been working on quantum machine learning algorithm development and applications for particle physics. Our results show the potential for quantum machine learning models to have an advantage compared to their classical counterparts. This work is supported in part by 2023 Director Reserve Allocation and 2024 DOE Mission Science Allocation Awards at NERSC.


Machine Learning Software Development

As modern machine learning is mostly performed and developed in software, I have devoted a significant amount of time to various projects in this area. For more details on specific projects please see my Software Development section. This builds upon open-source machine learning developments in high-energy physics that I presented at the NeurIPS 2018 conference.


Machine Learning in Hardware

As high-energy physics experiments currently discard more than 99.9% of all the data collected due to storage space constraints, they face the significant challenge of making rapid sub-microsecond decisions based on the limited information provided by the detectors. Currently, CMS relies on a hardware-based trigger, which implements decision logic on FPGAs to rapidly select the most interesting events for further analysis.

Trained machine learning algorithms have low inference latency and are therefore attractive for real-time trigger applications. We have implemented the first such application for estimating the muon momentum in the hardware trigger of CMS, based on discrete boosted decision trees implemented on FPGAs. More recently the focus has been on deep-learning based models for this and other trigger use-cases. In a joint project with MathWorks, we are working with graph neural network models for real-time detector applications.
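
To give a feel for why discretized decision trees suit hardware, the sketch below evaluates a small boosted ensemble using only integer comparisons and additions on quantized inputs. It is a software illustration under assumed conventions: the Node layout, the quantize helper, and the bit widths are hypothetical, and it does not describe how the CMS trigger firmware is actually implemented.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of an integer-only decision tree (hypothetical layout)."""
    feature: int = 0                 # index of the quantized input to test
    threshold: int = 0               # integer cut value
    left: Optional["Node"] = None    # taken when x[feature] <= threshold
    right: Optional["Node"] = None   # taken otherwise
    value: int = 0                   # integer score, used only at leaves

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def quantize(value: float, lo: float, hi: float, n_bits: int) -> int:
    """Map a float in [lo, hi] onto an unsigned integer of n_bits bits."""
    levels = (1 << n_bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def evaluate_tree(root: Node, x: Sequence[int]) -> int:
    """Walk one tree using integer comparisons only."""
    node = root
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


def evaluate_ensemble(trees: Sequence[Node], x: Sequence[int]) -> int:
    """Boosted output: the integer sum of the leaf scores of all trees."""
    return sum(evaluate_tree(tree, x) for tree in trees)
```

Because the inputs take finitely many integer values, the response of such an ensemble is a fixed function of a bounded number of input bits, which is the property that makes this kind of model a natural fit for fixed-latency logic.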


Teaching Data Science

Data science is a rapidly growing field. I have developed and taught a number of machine learning lectures and courses, including hands-on tutorials. Please take a look at my teaching page for more details.