This lecture will introduce the broad concept of Machine Learning and its connection to Artificial Intelligence, and will give an overview of the use of ML in High Energy Physics.
This lecture will focus on supervised learning, a setting where the training data set is “labelled”, that is, the target quantity of learning is known. Automatic Differentiation, the technique that powers modern machine learning frameworks, will then be explained in detail, together with its connection to differentiable programming.
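As a small illustration of the idea behind Automatic Differentiation, the sketch below implements forward-mode differentiation with dual numbers in plain Python. It is only a teaching aid: the `Dual` class, the `derivative` helper and the test function `f` are hypothetical names chosen for this example, and production frameworks typically rely on reverse-mode differentiation (backpropagation) rather than the forward mode shown here.

```python
import math


class Dual:
    """A number value + deriv * eps with eps**2 == 0 (hypothetical helper class)."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        # Promote plain numbers to constants, i.e. duals with zero derivative.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def sin(x):
    x = Dual.lift(x)
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(func, x):
    # Seed the input with derivative 1 and read the derivative off the output.
    return func(Dual(x, 1.0)).deriv


def f(x):
    # Ordinary-looking code; differentiation happens through operator overloading.
    return x * x * x + 2 * x + sin(x)


print(derivative(f, 1.5))                   # automatic derivative
print(3 * 1.5 ** 2 + 2 + math.cos(1.5))     # analytic derivative, for comparison
```

The function `f` is written as ordinary code and never manipulated symbolically, which is the link to differentiable programming: any program built from differentiable primitives can be differentiated by propagating derivatives alongside values.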
Classification is a category of supervised learning where the goal is to assign each data point to one of several categories. For the CMS search for the supersymmetric partner of the top quark in the compressed mass scenario, a Boosted Decision Tree (BDT) algorithm was used to distinguish between signal-like and background-like events. In this exercise, a neural network will be implemented to achieve...
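To give a flavour of what a signal-versus-background classifier looks like, here is a minimal sketch of a one-hidden-layer neural network trained by full-batch gradient descent in plain Python. Everything in it is an assumption made for illustration: the two-dimensional Gaussian toy data stand in for real event features, and the network size, learning rate and function names are arbitrary choices, not those of the CMS analysis or of the exercise.

```python
import math
import random

random.seed(0)  # fixed seed so the toy study is reproducible


def make_toy_events(n_per_class):
    """Hypothetical stand-in data: two Gaussian blobs in a 2D feature space."""
    events = []
    for _ in range(n_per_class):
        events.append(([random.gauss(1.0, 0.5), random.gauss(1.0, 0.5)], 1.0))    # "signal"
        events.append(([random.gauss(-1.0, 0.5), random.gauss(-1.0, 0.5)], 0.0))  # "background"
    return events


N_HIDDEN = 4
w1 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [random.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b2 = [0.0]  # one-element list so it can be updated in place


def forward(x):
    """Return hidden activations and the predicted signal probability."""
    hidden = [math.tanh(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j]) for j in range(N_HIDDEN)]
    z = sum(w2[j] * hidden[j] for j in range(N_HIDDEN)) + b2[0]
    return hidden, 1.0 / (1.0 + math.exp(-z))


def mean_loss(events):
    """Binary cross-entropy averaged over the events."""
    eps = 1e-12
    total = 0.0
    for x, y in events:
        _, p = forward(x)
        total -= y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
    return total / len(events)


def train_epoch(events, lr):
    """One full-batch gradient-descent step with hand-written backpropagation."""
    gw1 = [[0.0, 0.0] for _ in range(N_HIDDEN)]
    gb1 = [0.0] * N_HIDDEN
    gw2 = [0.0] * N_HIDDEN
    gb2 = 0.0
    for x, y in events:
        hidden, p = forward(x)
        delta_out = p - y  # gradient of the loss w.r.t. the output pre-activation
        gb2 += delta_out
        for j in range(N_HIDDEN):
            gw2[j] += delta_out * hidden[j]
            delta_h = delta_out * w2[j] * (1.0 - hidden[j] ** 2)  # tanh derivative
            gb1[j] += delta_h
            gw1[j][0] += delta_h * x[0]
            gw1[j][1] += delta_h * x[1]
    n = len(events)
    for j in range(N_HIDDEN):
        w2[j] -= lr * gw2[j] / n
        b1[j] -= lr * gb1[j] / n
        w1[j][0] -= lr * gw1[j][0] / n
        w1[j][1] -= lr * gw1[j][1] / n
    b2[0] -= lr * gb2 / n


def accuracy(events):
    """Fraction of events whose predicted class (threshold 0.5) matches the label."""
    correct = sum(1 for x, y in events if (forward(x)[1] > 0.5) == (y == 1.0))
    return correct / len(events)


events = make_toy_events(100)
loss_before = mean_loss(events)
for _ in range(300):
    train_epoch(events, lr=0.5)
loss_after = mean_loss(events)
train_accuracy = accuracy(events)
print(loss_before, loss_after, train_accuracy)
```

The accuracy here is computed on the training sample itself, so it says nothing about generalisation; a real study would hold out an independent evaluation sample.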
Inductive bias refers to encoding into the learning process some properties of the data that are known a priori: this can happen by manipulating the training data (augmentation), by modifying the structure of the algorithm (e.g. dense vs convolutional networks), or by modifying the learning target (loss function). The exercise will consist in comparing the performance of generic...
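The dense versus convolutional case can be illustrated with a short sketch. A convolution reuses one small kernel at every position, which builds in the assumption that a pattern means the same thing wherever it appears; a dense layer makes no such assumption and needs one weight per input-output pair. The signal, kernel and dense weights below are arbitrary values chosen for the example, and the periodic boundary is an assumption that keeps the shift property exact.

```python
def circular_conv1d(signal, kernel):
    """Slide one shared kernel over the signal (periodic boundary)."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
            for i in range(n)]


def dense(signal, weights):
    """Fully connected layer: every output has its own weight for every input."""
    return [sum(w * s for w, s in zip(row, signal)) for row in weights]


def shift(signal, k):
    """Cyclically shift the signal k positions to the right."""
    k %= len(signal)
    return signal[-k:] + signal[:-k]


signal = [0, 1, 3, 1, 0, 0, 0, 0]
kernel = [1, -2, 1]                                        # 3 shared parameters
weights = [[i * j for j in range(8)] for i in range(8)]    # 8 x 8 arbitrary parameters

n_conv_params = len(kernel)
n_dense_params = sum(len(row) for row in weights)

# Equivariance check: does shifting the input simply shift the output?
conv_equivariant = (circular_conv1d(shift(signal, 2), kernel)
                    == shift(circular_conv1d(signal, kernel), 2))
dense_equivariant = (dense(shift(signal, 2), weights)
                     == shift(dense(signal, weights), 2))

print(n_conv_params, n_dense_params)
print(conv_equivariant, dense_equivariant)
```

The convolution passes the shift check by construction and with far fewer parameters, while a dense layer would have to learn that behaviour from data, if it learns it at all.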
Transformers are the architecture that powers most Large Language Models currently on the market. This lecture will explain the inner structure of a transformer.
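The central building block of a transformer is scaled dot-product attention, sketched below in plain Python. This is a deliberately reduced version: the learned query, key and value projections, multiple heads, positional encodings and feed-forward layers of a full transformer are all omitted, and the three two-dimensional token vectors are made-up values used as queries, keys and values at once.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights


# Hypothetical token embeddings; self-attention uses them as Q, K and V.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to one, so every output vector is a weighted average of the value vectors, with larger weights for the tokens whose keys best match the query.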
When the data set is unlabelled, that is, when the target quantity for learning is not known, traditional supervised learning techniques cannot be used. This lecture will explain the techniques used to train learning algorithms without an explicitly known target, such as reinforcement learning.
The data challenge will consist in solving a machine learning problem on a given data set.
The participants will be provided access to the data set, and skeleton code to set up the study.
Participants will have to submit a series of predictions for an evaluation data set, as well as the code and an explanation of the logic behind it.
The models faring the best in the evaluation...