Lecturer: Francesco Civardi
Course name: Basi di dati II e data mining (mn)
Course code: 064204
Degree course: Ing. per l'Ambiente e il Terr. (LS), Ing. Informatica (LS)
Disciplinary field of science: ING-INF/05
The course relates to:
University credits (CFU): 6
Course website: not available
Specific course objectives
The student will learn how to design and implement dimensional schemas with relational and multidimensional (OLAP) technology.
The student will also learn how to apply data mining algorithms for classification, prediction, clustering and association.
Course programme
Database II and Data Mining introduces the design of analytical databases (data warehouses) and their use for decision support, analysis and forecasting within the KDD (Knowledge Discovery from Database) process. The last part of the course focuses on data-driven discovery, or data mining.
Part I: Introduction to data warehousing and dimensional modelling
- The Knowledge Discovery from Database (KDD) process
- Definition of a data warehouse
- Components of a data warehouse: Data Sources, ETL, Staging Area, Star Schema
- Inmon and Kimball approaches
- Star and snowflake schemas, facts and dimensions
Part II: Multidimensional design (OLAP)
- Multidimensional databases
- Thomsen's LC methodology
- MS Analysis Services
Part III: Introduction to MDX
- MDX, the de facto standard query language for OLAP
- Visualization tools: Excel, Reports
Part IV: Data Mining
- Algorithms for Classification
- Algorithms for numerical Prediction
- Algorithms for Clustering
- Algorithms for Association
- Data Mining with MS Analysis Services and Excel
- Data Mining with Weka/KNIME/Orange/RapidMiner
Course entry requirements
Basic knowledge of database concepts
Course structure and teaching
Lectures (hours/year in lecture theatre): 45
Practical class (hours/year in lecture theatre): 0
Practicals / Workshops (hours/year in lecture theatre): 0
Project work (hours/year in lecture theatre): 0
Suggested reading materials
R. Kimball, M. Ross. The Data Warehouse Toolkit, 2nd edition. Wiley, 2002. (Italian translation published by Hoepli: "Data Warehouse: La guida completa".)
E. Thomsen. OLAP Solutions, 2nd edition. Wiley, 2002.
G. Spofford, S. Harinath, C. Webb, D.H. Huang, F. Civardi. MDX Solutions, 2nd edition. Wiley, 2006.
P.N. Tan, M. Steinbach, V. Kumar. Introduction to Data Mining. Pearson International Edition / Addison Wesley, 2005.
Paolo Giudici. Data mining. Modelli informatici, statistici e applicazioni, 2/ed. McGraw-Hill, 2005.
C. Vercellis. Business intelligence. Modelli matematici e sistemi per le decisioni. McGraw-Hill, 2006.
J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann.
Testing and exams
A written test and/or a project