School of Science
DATA SCIENCE
Course unit
MATHEMATICAL MODELS AND NUMERICAL METHODS FOR BIG DATA
SCP7079406, A.Y. 2018/19

Information concerning the students who enrolled in A.Y. 2017/18

Information on the course unit
Degree course Second cycle degree in DATA SCIENCE
SC2377, Degree course structure A.Y. 2017/18, A.Y. 2018/19
Number of ECTS credits allocated 6.0
Type of assessment Mark
Course unit English denomination MATHEMATICAL MODELS AND NUMERICAL METHODS FOR BIG DATA
Website of the academic structure http://datascience.scienze.unipd.it/2018/laurea_magistrale
Department of reference Department of Mathematics
Mandatory attendance No
Language of instruction English
Branch PADOVA
Single Course unit The Course unit can be attended under the option Single Course unit attendance
Optional Course unit The Course unit can be chosen as Optional Course unit

Lecturers
Teacher in charge Stefano Cipolla MAT/08

ECTS: details
Type Scientific-Disciplinary Sector Credits allocated
Educational activities in elective or integrative disciplines MAT/08 Numerical Analysis 6.0

Course unit organization
Period First semester
Year 2nd Year
Teaching method Face-to-face (frontal lectures)

Type of hours Credits Teaching hours Hours of individual study Shifts
Lecture 6.0 48 102.0 No turn

Calendar
Start of activities 01/10/2018
End of activities 18/01/2019

Examination board
Examination board not defined

Syllabus
Prerequisites: Background on matrix theory. Types of matrices: diagonal, symmetric, normal, positive definite. Matrix canonical forms: diagonal, Schur. Matrix spectrum: kernel, range, eigenvalues, eigenvectors and eigenspaces. Matrix factorizations: LU, Cholesky, QR, SVD.
Target skills and knowledge: Students learn the mathematical and computational foundations of state-of-the-art numerical algorithms that arise in the analysis of big data and in many machine learning applications. Using modern Matlab toolboxes for large and sparse data, students will be guided through the implementation of these methods on real-life problems arising in network analysis and machine learning.
Examination methods: Written exam
Course unit contents:
Numerical methods for large linear systems

◦ Jacobi and Gauss-Seidel methods
◦ Subspace projection (Krylov) methods
◦ Arnoldi method for linear systems (FOM)
◦ (Optional) Sketches of GMRES
◦ Preconditioning: sparse and incomplete matrix factorizations



Numerical methods for large eigenvalue problems

◦ The power method
◦ Subspace iterations
◦ Krylov-type methods: Arnoldi (and sketches of Lanczos + non-Hermitian Lanczos)
◦ (Optional) Sketches of their block implementation
◦ Singular values vs. eigenvalues
◦ Best rank-k approximation



Large scale numerical optimization

◦ Steepest descent and Newton's methods
◦ Quasi-Newton methods: BFGS
◦ Stochastic steepest descent
◦ Sketches of inexact Newton methods
◦ Sketches of limited-memory quasi-Newton methods



Network centrality

◦ Perron-Frobenius theorem
◦ Centrality based on eigenvectors (HITS and PageRank)
◦ Centrality based on matrix functions



Data and network clustering

◦ K-means algorithm
◦ Principal component analysis and dimensionality reduction
◦ Laplacian matrices, Cheeger constant, nodal domains
◦ Spectral embedding
◦ (Optional) Lovász extension, exact relaxations, nonlinear power method (sketches)



Supervised learning

◦ Linear regression
◦ Logistic regression
◦ Multiclass classification
◦ (Optional) Neural networks (sketches)
Planned learning activities and teaching methods: Lectures supported by exercises and lab sessions
Textbooks (and optional supplementary readings)