School of Science
PHYSICS OF DATA
Course unit
MANAGEMENT AND ANALYSIS OF PHYSICS DATASET (MOD. B)
SCP8082535, A.A. 2019/20

Information concerning the students who enrolled in A.Y. 2019/20

Information on the course unit
Degree course Second cycle degree in PHYSICS OF DATA
SC2443, Degree course structure A.Y. 2018/19, A.Y. 2019/20
Number of ECTS credits allocated 6.0
Type of assessment Mark
Course unit English denomination MANAGEMENT AND ANALYSIS OF PHYSICS DATASET (MOD. B)
Website of the academic structure http://physicsofdata.scienze.unipd.it/2019/laurea_magistrale
Department of reference Department of Physics and Astronomy
Mandatory attendance No
Language of instruction English
Branch PADOVA

Lecturers
Teacher in charge DONATELLA LUCCHESI FIS/01

Integrated course for this unit
Course unit code Course unit name Teacher in charge
SCP8082533 MANAGEMENT AND ANALYSIS OF PHYSICS DATASET (C.I) DONATELLA LUCCHESI

ECTS: details
Type Scientific-Disciplinary Sector Credits allocated
Core courses FIS/01 Experimental Physics 6.0

Course unit organization
Period Annual
Year 1st Year
Teaching method Face-to-face (lectures)

Type of hours    Credits    Teaching hours    Hours of individual study    Shifts
Lecture          6.0        48                102.0                        No turn

Calendar
Start of activities 30/09/2019
End of activities 20/06/2020

Syllabus

Common characteristics of the Integrated Course unit

Prerequisites: Elements of analysis and algebra.
General physics.
Statistics.
Basic programming elements.
Target skills and knowledge: Fundamental knowledge of Unix operating systems.
Knowledge of distributed computing.
Knowledge of the management of big data on distributed architectures.
Ability to build a cluster with the available hardware.
Data management on the distributed cluster.
Analysis of data on distributed clusters.
Examination methods: Development of a project assigned at the end of the course. Presentation and discussion of the project, questions on the material presented in class.
Assessment criteria: Evaluation of the project delivered: accuracy, completeness and correctness of the work.
Presentation of the assignment: ability to synthesize information, completeness, correctness and accuracy in the presentation.
Evaluation of the answers: correctness, completeness and accuracy.

Specific characteristics of the Module

Course unit contents: Part 1) Distributed computing
Distributed Computing systems and the Grid paradigm
Computing Models
Dask principles
Setup of a cluster with Dask
Data movement and analysis on a Dask cluster
Machine learning on a Dask cluster
Part 2) Data Management
Data Workflows in scientific computing
Storage Models
Data management components:
- Name servers and databases
- Data access protocols
- Reliability
- Availability
- Access control and security
- Cryptography
- Authentication, authorization, accounting
- Scalability
- Cloud storage
- Block storage
- Analytics
- Data replication
- Data caching
- Monitoring, alarms
- Quota
Planned learning activities and teaching methods: Lectures, with several exercises and examples in the computer lab.
Additional notes about suggested reading: The slides used during the lectures will be made available, including references to books and open-access articles.
Textbooks (and optional supplementary readings)