School of Science
STATISTICAL SCIENCES
Course unit
INFORMATIC METHODS FOR STATISTICS AND DATA SCIENCE
SCP7081820, A.Y. 2019/20

Information concerning the students who enrolled in A.Y. 2019/20

Information on the course unit
Degree course Second cycle degree in STATISTICAL SCIENCES
SS1736, Degree course structure A.Y. 2014/15, A.Y. 2019/20
N0
Number of ECTS credits allocated 9.0
Type of assessment Mark
Course unit English denomination INFORMATIC METHODS FOR STATISTICS AND DATA SCIENCE
Website of the academic structure http://www.stat.unipd.it/studiare/ammissione-laurea-magistrale
Department of reference Department of Statistical Sciences
E-Learning website https://elearning.unipd.it/stat/course/view.php?idnumber=2019-SS1736-000ZZ-2019-SCP7081820-N0
Mandatory attendance No
Language of instruction Italian
Branch PADOVA
Single Course unit The Course unit can be attended under the option Single Course unit attendance
Optional Course unit The Course unit can be chosen as Optional Course unit

Lecturers
Teacher in charge EMANUELE DI BUCCIO ING-INF/05

ECTS: details
Type Scientific-Disciplinary Sector Credits allocated
Educational activities in elective or integrative disciplines ING-INF/05 Data Processing Systems 9.0

Course unit organization
Period Second semester
Year 1st Year
Teaching method Face-to-face lectures

Type of hours   Credits   Teaching hours   Hours of individual study   Shifts
Lecture         9.0       64               161.0                       No shifts

Calendar
Start of activities 02/03/2020
End of activities 12/06/2020

Examination board
Board From To Members of the board
3 Board A.Y. 2019/20 01/10/2019 30/09/2020 DI BUCCIO EMANUELE (Chair)
MELUCCI MASSIMO (Regular member)
MORO MICHELE (Regular member)

Syllabus
Prerequisites: The prerequisites are basic but essential: the fundamentals of data structures (variables, files, vectors, matrices), algorithms, computer science, and database management systems.
Knowledge of a programming language is useful but not strictly necessary. Knowledge of R is discouraged.
Target skills and knowledge: The course aims to give students a working knowledge of computational methods, so that they have greater competence in statistics than an IT specialist and greater competence in computer science than a statistician. Particular emphasis is placed on programming and data management, and on moving beyond the style of software writing encouraged by languages such as R and by pre-packaged software.
Assessment criteria: We will evaluate the student's understanding of the problems and the ability to find and design automated solutions for organizing, managing and analysing data, in order to carry out the tasks described in the course contents and required by the project for the oral examination.
Course unit contents: 1. Introduction to Python: environment, constructs, first examples.
2. Collection, organization and management of large volumes of data: pattern matching, parsing (XML, CSV).
3. Basic data structures: lists, hash tables, graphs, trees.
4. Fundamental algorithms: recursion, searching, sorting.
5. Distributed architectures with MapReduce.
6. Representation and indexing, retrieval and ranking.
7. Networks, links and click-through: WWW, link analysis, HITS, PageRank.
8. Decomposition and dimensionality reduction.
9. Frequent itemsets.
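As an illustration of the kind of program item 7 refers to, the following is a minimal sketch of PageRank computed by power iteration in plain Python. It is not taken from the course material: the function name `pagerank`, the damping factor of 0.85, the fixed number of iterations and the four-page `toy_web` graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {page: [pages it links to]}.

    Illustrative sketch only; parameter values are assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page passes its rank in equal shares to the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy graph: C is linked from A, B and D; nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page with the most incoming links (C) ends up ranked first and the page with none (D) last; the ranks always sum to 1 because the rank of dangling pages is redistributed at every step.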
Planned learning activities and teaching methods: The contents will be covered mainly in laboratory sessions, by developing programs and using software libraries in Python.
Methodological elements will be introduced so that students understand the underlying issues, can design and implement projects, and use the tools with awareness.
Additional notes about suggested reading: Teaching material will be distributed during the lessons in addition to the reference texts. Some texts, especially those on programming and data management, will be indicated at the start of the course.
Textbooks (and optional supplementary readings)