This course provides an introduction to Machine Learning (ML) by reviewing its fundamental principles and methods. Broadly speaking, ML is the scientific field that aims to build models and infer knowledge by applying algorithms to data. The process therefore involves the (statistical) analysis of data and the design of models, possibly predictive ones. Throughout the course, we will focus on how and when to use the different methods rather than on their mathematical foundations or their actual computer implementations.
This minor is open to students from LIFE and SPECTRUM graduate schools.
Classroom number for the entire fall semester :
Regression is the problem of predicting a response value from a set of explanatory variables. This course will cover the basics of the method, including variable selection and the design of sparse models.
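For illustration only (the course description does not prescribe any software), here is a minimal sketch using scikit-learn on synthetic data: an ordinary least-squares fit keeps every coefficient, while the L1-penalized Lasso yields a sparse model by driving irrelevant coefficients to zero.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso

# Synthetic data: only the first 3 of 20 variables actually influence the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = X[:, 0] - 2 * X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=200)

ols = LinearRegression().fit(X, y)    # dense model: every coefficient is non-zero
lasso = Lasso(alpha=0.1).fit(X, y)    # L1 penalty shrinks irrelevant coefficients to 0

print("non-zero OLS coefficients:  ", int(np.sum(np.abs(ols.coef_) > 1e-8)))
print("non-zero Lasso coefficients:", int(np.sum(np.abs(lasso.coef_) > 1e-8)))
```

The regularization strength (here alpha=0.1) is a made-up value; in practice it would be chosen by cross-validation.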
Topics :
Logistic regression is a supervised classification algorithm used to model the probability that an observation belongs to a given class. To do so, the log-odds of this probability are expressed as a linear function of the variables, whose parameters are estimated from the data.
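As a minimal sketch of what this looks like in practice, assuming scikit-learn and one of its toy data sets (neither is part of the official course material), the classifier fits the parameters of P(y = 1 | x) = sigmoid(w·x + b):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy binary classification problem shipped with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize the features, then fit the weights w and intercept b of
# P(y = 1 | x) = sigmoid(w.x + b).
clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("class probabilities for 3 observations:", clf.predict_proba(X_test[:3]))
```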
Topics :
SVMs are a popular and robust class of models for supervised classification. The main difficulties are dealing with classes that are partially mixed (e.g. due to noise) and whose boundaries have a complex geometry.
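The sketch below is purely illustrative and assumes scikit-learn with made-up parameters: the soft-margin constant C tolerates partially mixed classes, while a non-linear (RBF) kernel accommodates boundaries with a complex geometry.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaved half-moons with label noise: mixed classes, non-linear boundary.
X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C controls the soft margin (tolerance to mixed points);
# the RBF kernel lets the decision boundary be non-linear.
svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)

print("test accuracy:", svm.score(X_test, y_test))
```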
Topics :
This lecture will introduce the foundations of Bayesian statistics and relate them to the so-called Naive Bayes classifier.
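A minimal sketch, assuming scikit-learn's Gaussian Naive Bayes and a toy data set (an illustrative choice, not the course's): the classifier picks the class maximizing P(class | x) ∝ P(x | class) P(class) under the naive conditional-independence assumption.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bayes' rule: P(class | x) is proportional to P(x | class) * P(class).
# The "naive" assumption treats features as conditionally independent given the class;
# here each feature is modelled by a per-class Gaussian.
nb = GaussianNB().fit(X_train, y_train)

print("test accuracy:", nb.score(X_test, y_test))
print("posterior probabilities:", nb.predict_proba(X_test[:2]))
```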
Topics :
Dimensionality reduction methods aim at embedding high-dimensional data into a lower-dimensional space while preserving specific properties such as pairwise distances, the spread of the data, etc. The field originated with the celebrated Principal Component Analysis (PCA); more recent methods focus on data lying on non-linear spaces.
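To make this concrete, here is a short sketch assuming scikit-learn and one of its toy data sets (an illustrative choice): PCA projects 64-dimensional digit images onto the two directions of maximal variance.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# 8x8 digit images, i.e. 64-dimensional observations.
X, _ = load_digits(return_X_y=True)

# Project onto the 2 directions of maximal variance (the first principal components).
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("embedded shape:", X_2d.shape)
print("fraction of variance preserved:", pca.explained_variance_ratio_.sum())
```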
Topics :
In an unsupervised context, clustering aims at grouping the data into homogeneous groups by minimizing the intra-group variance. This fundamental task is surprisingly challenging for several reasons: the (generally) unknown number of clusters, clusters whose boundaries have a complex geometry, overlapping clusters (due to noise), high-dimensional data, etc. This class will present two main clustering techniques.
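As an illustrative sketch only, assuming scikit-learn and synthetic data, k-means groups the points by minimizing the intra-cluster variance; note that the number of clusters has to be supplied by the user, which is one of the difficulties listed above.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data drawn from 3 groups; in practice the group labels are unknown.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

# k-means minimizes the intra-cluster variance (within-cluster sum of squares),
# but the number of clusters k must be chosen by the user.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print("cluster sizes:", [int((km.labels_ == c).sum()) for c in range(3)])
print("inertia (total intra-cluster variance):", km.inertia_)
```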
Topics :
In this 30-minute concluding lecture, we will explore some applications of the subjects presented throughout the course. We will also look at what this course did not cover, and the many reasons why you should continue learning about machine learning.