School of Science
DATA SCIENCE
Course unit
FUNDAMENTALS OF INFORMATION SYSTEMS
SCP7078720, A.Y. 2018/19

Information concerning the students who enrolled in A.Y. 2018/19

Information on the course unit
Degree course: Second cycle degree in DATA SCIENCE
SC2377, Degree course structure A.Y. 2017/18, A.Y. 2018/19
N0
Number of ECTS credits allocated: 12.0
Type of assessment: Mark
Course unit English denomination: FUNDAMENTALS OF INFORMATION SYSTEMS
Website of the academic structure: http://datascience.scienze.unipd.it/2018/laurea_magistrale
Department of reference: Department of Mathematics
Mandatory attendance: No
Language of instruction: English
Branch: PADOVA
Single Course unit: The Course unit can be attended under the option Single Course unit attendance
Optional Course unit: The Course unit can be chosen as Optional Course unit

Lecturers
Teacher in charge: GABRIELE TOLOMEI (INF/01)
Other lecturers: ARMIR BUJARI (INF/01), NICOLO' NAVARIN (INF/01)

ECTS: details
Type           Scientific-Disciplinary Sector        Credits allocated
Core courses   INF/01 Computer Science               6.0
Core courses   ING-INF/05 Data Processing Systems    6.0

Course unit organization
Period: First semester
Year: 1st Year
Teaching method: Face-to-face

Type of hours   Credits   Teaching hours   Hours of individual study   Shifts
Lecture         12.0      96               204.0                       No shifts

Calendar
Start of activities: 01/10/2018
End of activities: 18/01/2019

Examination board
Examination board not defined

Syllabus
Prerequisites: The student should have basic computer programming knowledge and problem-solving skills.
Target skills and knowledge: The aim of this class is to teach the concepts, methods, and technologies that any modern data scientist should master.
In particular, the class focuses on the processing and storage of data and big data, which also involves elements of computer networking.
The ability to process data effectively and efficiently will be gained using Python, which is arguably the reference programming language for data scientists. Ultimately, students will acquire the coding skills to collect, clean, visualize, and analyse data, and more generally to tackle any data science/machine learning task.
Concerning storage, the basics of relational databases are introduced, followed by a review of non-relational solutions typically adopted for big data. The basics of systems for storing streams of data are presented as well. The networking submodule provides an introduction to fundamental concepts in the design and implementation of computer communication networks, their protocols, and applications. Topics covered in this part include: layered network architecture, data link protocols, network and transport protocols, and applications. Examples will be drawn from the Internet TCP/IP protocol suite. After that, advanced and emerging networking paradigms aimed at addressing QoS and engineering flexibility of current infrastructure networks are introduced. Topics covered range from software-defined networking to cloud provisioning schemes and data centers.
Examination methods: The student is expected to pass a written and an oral exam.
Assessment criteria: The written and the oral exams will be evaluated on the basis of the following criteria: i) student’s knowledge of the concepts, methods, and technologies at the basis of the topics covered in the course; ii) student’s capacity for synthesis, clarity, and abstraction.
Course unit contents: The course is structured into 3 submodules:
- Python Programming (for Data Science)
This submodule provides students with the foundational coding skills they need as data scientists. First, the basics of the Python programming language are covered (e.g., built-in data types, functions, I/O) along with the environment used throughout the class (i.e., Jupyter Notebook). Afterwards, students will dig into a set of up-to-date data science Python packages: numpy/scipy (for numerical/scientific computing), pandas (for data manipulation), matplotlib/seaborn (for data visualization), and finally scikit-learn (for learning from data). By the end of this submodule, students will be able to implement all the stages of a typical machine learning pipeline: from collecting data to building predictive models for solving either a classification or a regression problem.
- Databases
This submodule is dedicated to data storage, and it covers the following topics:
Introduction to relational databases: data model; relational algebra; SQL; DBMS.
NoSQL technologies: characteristics of NoSQL databases; aggregate data models: key-value stores, document databases, column-family stores, graph databases, others; distribution models: sharding, replication (master-slave, peer-to-peer).
Streams of Data: architectures; data modeling; query processing and optimization.
- Networking
This submodule familiarizes students with computer networking. In particular, it focuses on the following topics:
Networking Fundamentals: network architectures (OSI Model); TCP and UDP transport-layer protocols; IP addressing and routing; link-layer forwarding; DNS and DHCP.
Advanced Networking: Virtual LAN (VLAN) and Virtual eXtensible LAN (VXLAN); Software Defined Networking: control plane, data plane, and virtualization; Cloud Computing concepts: service and deployment models; data center architectures: topologies, addressing, routing, traffic characteristics; Case Study: The Web of Things (IoT standards and protocols).
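As an informal illustration of the "IP addressing and routing" topic listed under the Networking submodule, the following Python sketch performs a longest-prefix-match lookup, the rule IP routers conventionally use to choose a forwarding entry. It is not taken from the course material: the routing table, the next-hop labels, and the function name next_hop are hypothetical and serve only as an example. Only the Python standard library (ipaddress) is used.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next-hop label).
# These entries are made up for illustration only.
ROUTES = [
    ("0.0.0.0/0", "default-gateway"),
    ("10.0.0.0/8", "core-router"),
    ("10.1.0.0/16", "campus-router"),
    ("10.1.2.0/24", "lab-switch"),
]


def next_hop(destination, routes=ROUTES):
    """Return the next hop of the most specific prefix containing destination.

    Returns None when no prefix in the table matches.
    """
    addr = ipaddress.ip_address(destination)
    best = None  # (network, hop) of the longest matching prefix seen so far
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, hop)
    return best[1] if best else None


if __name__ == "__main__":
    for ip in ("10.1.2.7", "10.1.9.9", "10.200.0.1", "8.8.8.8"):
        print(ip, "->", next_hop(ip))
```

The linear scan keeps the sketch short; production routers typically rely on specialised data structures (such as tries) for the same lookup.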
Planned learning activities and teaching methods: The course consists of lectures.
Additional notes about suggested reading: Slides presented during the lectures are made available to students as reference material.
Textbooks (and optional supplementary readings)

Innovative teaching methods: Software or applications used
  • Moodle (files, quizzes, workshops, ...)
  • Jupyter Notebook