The first-semester courses cover modern topics in applied mathematics. The goal is to prepare students for the second semester and to give them the keys to research activities.

The core curriculum is taught simultaneously in Saint-Etienne (major "High performance computing and simulation") and in Lyon (all other majors).

Refresher courses in computer science, analysis and statistics are offered at the beginning of the semester. Their content will be adapted to the students' level.

## Deterministic modelling

- From physics to partial differential equations. Newton's second law. Balance of forces (pressure gradient, viscosity, Coriolis, gravity). Continuity equation. Classical equations: Navier-Stokes, Euler.
- Sobolev spaces. Classical theorems: embeddings, Poincaré inequality, Rellich. Extensions, functionals.
- Elliptic variational theory. Lions-Stampacchia and Lax-Milgram lemmas. Dirichlet and Neumann problems. Boundary value problems. Spectral theory. Galerkin method.
- Parabolic problems. Construction of approximate solutions, estimates on approximate solutions and compactness. Passage to the limit.
- Scalar hyperbolic problems. Weak solutions, non-uniqueness. Entropy solutions. Viscosity solutions. Extension to systems of conservation laws.
- Handling boundary conditions for elliptic and parabolic problems.

## Stochastic modelling and statistical learning

- High-dimensional regression: real examples and modelling, the linear model and its extensions (modelling, hypotheses, least squares and likelihood, Fisher and Student tests, …) – Introduction to model selection (Cp/AIC/BIC criteria, oracle inequalities, high-dimensional behaviour) – Ridge method (heuristics, link with Tikhonov regularisation, risk) – Introduction to the LASSO method (construction and heuristics, links with compressed sensing, theoretical properties / oracle inequalities, compatibility condition).
- Supervised classification: real examples and modelling, classical algorithms (kNN, SVM, neural networks, logistic regression, …), theoretical results (concentration inequalities, kernels, …).
- Unsupervised classification: PCA, clustering (k-means, hierarchical methods, …), Gaussian mixture models, spectral clustering.

## Optimisation and machine learning

**Introduction**: Some concrete optimisation problems (Ridge, LASSO, SVM, logistic regression, neural networks, inverse problems, PDE): formulation as a minimisation problem, examples of cost functions.

**Convex generalities**: Study of some algorithms (gradient descent, first- and second-order methods), proximal methods, Lagrange and Fenchel duality, primal/dual formulation, KKT conditions, convergence, strong convexity and rate of convergence

–> TP1: Minimisation of a differentiable problem (PDE / inverse problems with hyperbolic penalisation): gradient method, backtracking, Newton, BFGS
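As an illustration of the gradient method with backtracking mentioned in this session, here is a minimal sketch in plain Python. The function name `gradient_descent_backtracking`, the parameters `t0`, `beta`, `c`, and the quadratic test problem are assumptions chosen for the example; they are not taken from the course material.

```python
def gradient_descent_backtracking(f, grad, x0, t0=1.0, beta=0.5, c=1e-4,
                                  tol=1e-8, max_iter=1000):
    """Minimise a differentiable function f using Armijo backtracking."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        sq_norm = sum(gi * gi for gi in g)
        if sq_norm ** 0.5 < tol:  # stop when the gradient is (almost) zero
            break
        t = t0
        fx = f(x)
        # Shrink the step until the sufficient-decrease (Armijo) condition holds.
        while True:
            x_new = [xi - t * gi for xi, gi in zip(x, g)]
            if f(x_new) <= fx - c * t * sq_norm:
                break
            t *= beta
        x = x_new
    return x


# Hypothetical test problem: f(x, y) = (x - 1)^2 + 10 (y + 2)^2,
# whose unique minimiser is (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def grad_f(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_star = gradient_descent_backtracking(f, grad_f, [0.0, 0.0])
print(x_star)
```

Backtracking avoids having to know the Lipschitz constant of the gradient in advance: the step is simply halved until the objective decreases enough.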

–> TP2: Minimisation of a non-differentiable problem (change detection in the dual and then proximal gradient or Douglas-Rachford, sparse logistic regression for classification)

**Non-convex generalities**: Gradient methods, alternating minimisation methods (Gauss-Seidel, PAM, PALM)

–> TP: sparse logistic regression with non-convex penalty / non-negative matrix factorisation

**Practical approaches**: Linear algebra: computation of eigenvalues and eigenvectors; power and inverse power methods, solution of linear systems, direct methods, direct methods via factorisation (LU, QR), classical iterative methods (Jacobi, Gauss-Seidel and relaxation), preconditioning, solution of large sparse linear and non-linear systems, sparse storage techniques.

–> TP: power method for the computation of the Lipschitz constant in a proximal gradient descent, matrix inversion and application to a sparse regression example.

**Optimisation and stochastic simulation**: forward-backward stochastic gradient descent (main convergence results and rate of convergence), Adam, Adagrad, … and some important topics (learning rate, batch, dropout, …), EM algorithm, Gibbs, Metropolis-Hastings, prox MALA.

–> TP1: Sparse and Ridge regression with stochastic gradient and proximal gradient. Comparison between non-stochastic gradient and proximal gradient. Convergence rates.
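The deterministic proximal-gradient side of this session can be sketched as follows for sparse (LASSO-type) regression, with the step size taken from a power-method estimate of the Lipschitz constant. This is only an illustrative sketch in plain Python: the objective 0.5·‖Ax − b‖² + λ‖x‖₁, the helper names (`matvec`, `rmatvec`, `lipschitz_power_method`, `soft_threshold`, `ista`) and the toy data are assumptions, not course material.

```python
def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Compute A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def lipschitz_power_method(A, n_iter=200):
    """Estimate the largest eigenvalue of A^T A by power iteration.

    This is the Lipschitz constant of the gradient of 0.5 * ||Ax - b||^2.
    """
    v = [1.0] * len(A[0])
    lam = 0.0
    for _ in range(n_iter):
        w = rmatvec(A, matvec(A, v))
        lam = sum(wi * wi for wi in w) ** 0.5
        if lam == 0.0:
            break
        v = [wi / lam for wi in w]
    return lam


def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1 (componentwise shrinkage)."""
    return [max(abs(xi) - tau, 0.0) * (1.0 if xi > 0 else -1.0) for xi in x]


def ista(A, b, lam, n_iter=500):
    """Proximal gradient for 0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    L = lipschitz_power_method(A)
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [ri - bi for ri, bi in zip(matvec(A, x), b)]
        g = rmatvec(A, residual)  # gradient of the smooth part
        x = soft_threshold([xi - gi / L for xi, gi in zip(x, g)], lam / L)
    return x


# Toy separable problem (assumed data): the minimiser is (1.75, 0),
# which can be checked by hand coordinate by coordinate.
A = [[2.0, 0.0], [0.0, 1.0]]
b = [4.0, 0.5]
x_hat = ista(A, b, lam=1.0)
print(x_hat)
```

The second coefficient is set exactly to zero by the soft-thresholding step, which is the sparsity effect that distinguishes the LASSO penalty from the Ridge penalty. A stochastic variant would replace the full gradient `g` with an estimate computed on a random batch of rows.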

–> TP2: Change detection with MCMC and prox MALA on the dual.

**Deep learning**: Automatic differentiation, chain rule, backpropagation, deep network architectures, unrolled algorithms (unrolled forward-backward, LISTA), link between proximal operators and activation functions

–> TP1: automatic differentiation, introduction to PyTorch
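To make the idea of automatic differentiation concrete before using a library, here is a minimal reverse-mode sketch in plain Python. The `Var` class and the `sin` and `backward` helpers are hypothetical teaching aids written for this example; they are not the PyTorch API, which handles tensors and many more operations.

```python
import math


class Var:
    """Scalar node of a computation graph (hypothetical helper, not PyTorch)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs (parent node, local partial derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sin(x):
    """Sine of a Var, recording d(sin x)/dx = cos x."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def backward(output):
    """Backpropagate: apply the chain rule in reverse topological order."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


x = Var(2.0)
y = Var(3.0)
z = x * y + sin(x)
backward(z)
# Analytically: dz/dx = y + cos(x) and dz/dy = x.
print(x.grad, y.grad)
```

Each operation stores only its local derivatives; a single backward pass then accumulates the gradient with respect to every input, which is the principle behind backpropagation.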

–> TP2: construction of a neural network for MNIST classification and/or ODE solving with PyTorch (construction of a feed-forward neural network: affine and non-affine layers, optimisation and comparison of different algorithms)