Each module is worth 3 ECTS credits. A total of 10 modules (30 ECTS) can be chosen from the following categories:
- 12-15 ECTS credits in technical-scientific modules (TSM)
TSM modules convey profile-specific technical competences and complement the decentralized specialization modules.
- 9-12 ECTS credits in extended theoretical foundations (FTP)
FTP modules deal mainly with theoretical foundations such as mathematics, physics, information theory, chemistry, etc. They broaden the student's scientific competence and help create an important synergy between abstract concepts and application, which is fundamental for innovation.
- 6-9 ECTS credits in context modules (CM)
CM modules convey supplementary competences in areas such as technology management, business administration, communication, project management, patent law, contract law, etc.
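The credit rules above can be checked with a little arithmetic. The following Python sketch is illustrative only: the function name `check_selection` and the dictionary layout are assumptions made for this example, not part of any official planning tool, and it assumes the total must be exactly 30 ECTS.

```python
# Illustrative sketch: names and data layout are assumptions for this example.
ECTS_PER_MODULE = 3
TOTAL_ECTS = 30  # assumed to be an exact requirement (10 modules)

# Allowed ECTS range per category, as listed above.
CATEGORY_RANGES = {"TSM": (12, 15), "FTP": (9, 12), "CM": (6, 9)}


def check_selection(modules_per_category):
    """Return a list of rule violations; an empty list means the selection is valid.

    modules_per_category maps a category code ("TSM", "FTP", "CM")
    to the number of modules chosen in that category.
    """
    problems = []
    total = 0
    for category, (low, high) in CATEGORY_RANGES.items():
        credits = modules_per_category.get(category, 0) * ECTS_PER_MODULE
        total += credits
        if not low <= credits <= high:
            problems.append(f"{category}: {credits} ECTS is outside {low}-{high}")
    if total != TOTAL_ECTS:
        problems.append(f"total is {total} ECTS, expected {TOTAL_ECTS}")
    return problems


# 5 TSM + 3 FTP + 2 CM modules: 15 + 9 + 6 = 30 ECTS
print(check_selection({"TSM": 5, "FTP": 3, "CM": 2}))
# 4 TSM + 4 FTP + 3 CM modules: 12 + 12 + 9 = 33 ECTS, so the total is too high
print(check_selection({"TSM": 4, "FTP": 4, "CM": 3}))
```

Under these assumptions, only three splits of the 10 modules satisfy every range: 5/3/2, 4/4/2 and 4/3/3 (TSM/FTP/CM).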
The module description (download the PDF) lists the language information for each module, divided into the following categories:
- Teaching
- Documentation
- Examination
This course covers Data Engineering and Information Retrieval: methods and technologies for managing, processing and analyzing potentially large and distributed data collections for transactional or analytical use, including multi-model databases and NoSQL stores. It also covers handling data in unstructured form (full-text search). The course consists of four parts: 1. Database Management; 2. Data Warehousing and Data Analytics (Business Intelligence); 3. Data Integration, including Data Synthesis; and 4. Information Retrieval.
- UML Class Diagrams
- Relational Models, Relational Algebra, Normalization
- Relational Database Management Systems (RDBMS), e.g. Codd's 12 rules, ACID, architecture
- Query Optimization, Indexes
- Transaction Processing, Concurrency Control
- Security in (Relational) Database Systems
- Knowledge of an operating system including shell
Learning objectives
This module covers the following important aspects of Data Engineering:
- Students understand the use of modern database technologies for processing and managing potentially large and distributed data collections for transactional or analytical use.
- Students are proficient in modern query languages such as post-relational SQL (SQL:2023 and newer).
- Reaching beyond RDBMS, students learn about data structures (data types) and know which of these to use depending on the requirements and type of data available (polyglot persistence, multi-model databases).
- Students know NoSQL stores and selected cloud data stores.
- Students know methods and tools to integrate, cleanse and synthesize data.
- Students know how to deal with full-text information using databases and search engines (information retrieval; prompt engineering).
- Students can also apply the acquired knowledge in their own working environment.
Module contents
The course is divided into four parts:
- Database Management (DB): New data structures (types) and alternatives to RDBMS. Storing data with post- and non-relational aspects, including NoSQL technologies (especially graph databases), and a selection of advanced topics such as cloud or vector databases.
- Data Warehousing and Data Analytics (DW): Methods and tools for data aggregation and data analytics such as the ones involved in business intelligence.
- Data Integration (DI): Methods and tools for data integration, data cleansing and data synthesis (e.g. for training and testing).
- Information Retrieval (IR): Methods and tools for finding information in full text using databases and (enterprise) search engines, including crawling.
The weighting between the parts will be confirmed at the beginning of the semester. Tentative weighting:
- DB: ~4-6 weeks
- DW: ~2-4 weeks
- DI: ~1-3 weeks
- IR: ~3-5 weeks
Teaching and learning methods
Frontal teaching, case studies, exercises, discussions, (group) work assignments (i.e. laboratory work or mini-project).
Optional literature suggestions (books):
- DB: Advanced Data Management for SQL, NoSQL, Cloud and Distributed Databases. R. Wiese. De Gruyter Textbook. 2015. ISBN 978-3-11-044140-6.
- DB: SQL for Data Scientists: A Beginner's Guide for Building Datasets for Analysis. R. Teate. Wiley. 2021. ISBN 978-1-119-66936-4.
- IR: Introduction to Information Retrieval. C.D. Manning, P. Raghavan, H. Schütze. Cambridge University Press. 2008.
- IR: Search Engines: Information Retrieval in Practice. B. Croft, D. Metzler, T. Strohman. Pearson Education. 2009.