This course provides an introduction to Machine Learning (ML) by reviewing its fundamental principles and methods. Broadly speaking, ML is the scientific field that aims to build models and infer knowledge by applying algorithms to data. The process therefore involves the (statistical) analysis of data and the design of models, possibly predictive ones. Throughout the course, we will focus on how and when to use the different methods rather than on their mathematical foundations or their actual computer implementations.
This minor is open to students from LIFE and SPECTRUM graduate schools.
Classroom number for the entire fall semester :
Regression is the problem of predicting a response value from a set of explanatory variables. This course will cover the basics of the method, including variable selection and the design of sparse models.
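For illustration only (the course description does not prescribe any software), here is a minimal sketch using scikit-learn on synthetic data: an ordinary least-squares fit keeps every coefficient, while the L1-penalized Lasso yields a sparse model by driving irrelevant coefficients to zero.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso

# Synthetic data: only the first 3 of 20 variables actually influence the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = X[:, 0] - 2 * X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=200)

ols = LinearRegression().fit(X, y)    # dense model: every coefficient is non-zero
lasso = Lasso(alpha=0.1).fit(X, y)    # L1 penalty shrinks irrelevant coefficients to 0

print("non-zero OLS coefficients:  ", int(np.sum(np.abs(ols.coef_) > 1e-8)))
print("non-zero Lasso coefficients:", int(np.sum(np.abs(lasso.coef_) > 1e-8)))
```

The regularization strength (here alpha=0.1) is a made-up value; in practice it would be chosen by cross-validation.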
Topics :
Logistic regression is a supervised classification algorithm used to model the probability that an observation belongs to a given class. To do so, the log-odds of this probability are expressed as a linear function of the variables, whose parameters are estimated from the data.
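As a minimal sketch of what this looks like in practice, assuming scikit-learn and one of its toy data sets (neither is part of the official course material), the classifier fits the parameters of P(y = 1 | x) = sigmoid(w·x + b):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy binary classification problem shipped with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize the features, then fit the weights w and intercept b of
# P(y = 1 | x) = sigmoid(w.x + b).
clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("class probabilities for 3 observations:", clf.predict_proba(X_test[:3]))
```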
Topics :
SVMs are a popular and robust class of models for supervised classification. The main difficulties are dealing with classes that are partially mixed (e.g. due to noise) and whose boundaries have a complex geometry.
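The sketch below is purely illustrative and assumes scikit-learn with made-up parameters: the soft-margin constant C tolerates partially mixed classes, while a non-linear (RBF) kernel accommodates boundaries with a complex geometry.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaved half-moons with label noise: mixed classes, non-linear boundary.
X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C controls the soft margin (tolerance to mixed points);
# the RBF kernel lets the decision boundary be non-linear.
svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)

print("test accuracy:", svm.score(X_test, y_test))
```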
Topics :
This lecture will introduce the foundations of Bayesian statistics and relate them to the so-called Naive Bayes classifier.
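A minimal sketch, assuming scikit-learn's Gaussian Naive Bayes and a toy data set (an illustrative choice, not the course's): the classifier picks the class maximizing P(class | x) ∝ P(x | class) P(class) under the naive conditional-independence assumption.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bayes' rule: P(class | x) is proportional to P(x | class) * P(class).
# The "naive" assumption treats features as conditionally independent given the class;
# here each feature is modelled by a per-class Gaussian.
nb = GaussianNB().fit(X_train, y_train)

print("test accuracy:", nb.score(X_test, y_test))
print("posterior probabilities:", nb.predict_proba(X_test[:2]))
```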
Topics :
Dimensionality reduction methods aim at embedding high-dimensional data into a lower-dimensional space while preserving specific properties such as pairwise distances, the spread of the data, etc. The field originated with the celebrated Principal Component Analysis (PCA); more recent methods focus on data lying on non-linear spaces.
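To make this concrete, here is a short sketch assuming scikit-learn and one of its toy data sets (an illustrative choice): PCA projects 64-dimensional digit images onto the two directions of maximal variance.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# 8x8 digit images, i.e. 64-dimensional observations.
X, _ = load_digits(return_X_y=True)

# Project onto the 2 directions of maximal variance (the first principal components).
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("embedded shape:", X_2d.shape)
print("fraction of variance preserved:", pca.explained_variance_ratio_.sum())
```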
Topics :
In an unsupervised context, clustering aims at grouping the data into homogeneous groups by minimizing the intra-group variance. This fundamental task is surprisingly challenging for several reasons: the (generally) unknown number of clusters, clusters whose boundaries have a complex geometry, overlapping clusters (due to noise), high-dimensional data, etc. This class will present two main clustering techniques.
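As an illustrative sketch only, assuming scikit-learn and synthetic data, k-means groups the points by minimizing the intra-cluster variance; note that the number of clusters has to be supplied by the user, which is one of the difficulties listed above.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data drawn from 3 groups; in practice the group labels are unknown.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

# k-means minimizes the intra-cluster variance (within-cluster sum of squares),
# but the number of clusters k must be chosen by the user.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print("cluster sizes:", [int((km.labels_ == c).sum()) for c in range(3)])
print("inertia (total intra-cluster variance):", km.inertia_)
```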
Topics :
In this 30-minute concluding lecture, we will explore some applications of the subjects presented throughout the course. We will also look at what this course did not cover, and the many reasons why you should continue learning about machine learning.