Each module is worth 3 ECTS. You choose a total of 10 modules (30 ECTS) from the following module categories:
- 12-15 ECTS in technical scientific modules (TSM)
TSM modules teach profile-specific specialist skills and supplement the decentralised specialisation modules.
- 9-12 ECTS in fundamental theoretical principles modules (FTP)
FTP modules deal with theoretical fundamentals such as higher mathematics, physics, information theory, chemistry, etc. They teach more detailed, abstract scientific knowledge and help you to bridge the gap between abstraction and application that is so important for innovation.
- 6-9 ECTS in context modules (CM)
CM modules impart additional skills in areas such as technology management, business administration, communication, project management, patent law, contract law, etc.
The module description (PDF download) gives the language information for each module, divided into the following categories:
- instruction
- documentation
- examination
One of the most widely used statistical models for inferential data analysis is the linear regression model. However, it is restricted to a Gaussian-distributed response and to a linear (identity) function linking the linear combination of predictors to the expected response. Generalized Linear and Additive Models (GLM, GAM) relax some of these restrictions by allowing a more general set of response distributions and non-linear link functions. Hence we can analyse a wider variety of real-world phenomena such as counts, binary outcomes, proportions and amounts (i.e. non-negative real-valued data). The aim of this modelling approach is to better understand how the response depends on the predictors, based on the available data, allowing a better-informed interpretation of the phenomenon. The first part of this course provides an overview of the GLM/GAM approach and details its many benefits and a few pitfalls.
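As a rough illustration of what "a more general response distribution and a non-linear link" means in practice, the sketch below fits a Poisson regression with a log link, log(E[y]) = b0 + b1 * x, by iteratively reweighted least squares. It is not course material: the function name `fit_poisson_irls` and the count data are hypothetical, and plain Python is used only to keep the example dependency-free. In R, the environment used in the course, the corresponding call would be `glm(y ~ x, family = poisson)`.

```python
import math


def fit_poisson_irls(x, y, n_iter=25):
    """Fit log(E[y]) = b0 + b1 * x by iteratively reweighted least squares."""
    # Start from slightly shifted counts so that log() is defined for zeros.
    eta = [math.log(yi + 0.5) for yi in y]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]  # inverse of the log link
        w = mu                           # Poisson weights: Var(y) = mu
        # Working response: linearised version of y on the link scale.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Weighted least squares of z on x via the 2x2 normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
        eta = [b0 + b1 * xi for xi in x]
    return b0, b1


# Hypothetical count data, invented for this illustration only.
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 2, 6, 9, 14]
b0, b1 = fit_poisson_irls(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"multiplicative change in expected count per unit of x: {math.exp(b1):.3f}")
```

Because of the log link, the slope acts multiplicatively on the expected count, which is one of the interpretation differences from ordinary linear regression.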
The second part of this course introduces the basic concepts of causality. Many statistical and machine learning methods (including the statistical models covered in the first part of this module) are about association rather than causation. We will take a closer look at how causal effects are defined mathematically and what assumptions about data and model are necessary for drawing causal conclusions (e.g. interventions, instrumental variables, counterfactuals). In a first step, we define causality and introduce graphical models. This enables us, in a second step, to estimate causal effects and to infer causal relationships. In particular, the course introduces structural equation models.
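The gap between association and causation can be made concrete with a small numerical sketch. The counts below are illustrative only, and the example assumes a very simple causal graph in which severity is the single confounder, influencing both the choice of treatment and the outcome. Under that assumption, the effect of an intervention can be estimated by adjusting for severity: P(recovery | do(T = t)) = sum over z of P(recovery | T = t, Z = z) * P(Z = z). The names `crude_rate` and `adjusted_rate` are hypothetical helpers, not part of the course material.

```python
# Illustrative counts: (recovered, total) per treatment and severity level.
counts = {
    "A": {"mild": (81, 87), "severe": (192, 263)},
    "B": {"mild": (234, 270), "severe": (55, 80)},
}


def crude_rate(treatment):
    """Observed recovery rate, ignoring severity (pure association)."""
    recovered = sum(r for r, _ in counts[treatment].values())
    total = sum(n for _, n in counts[treatment].values())
    return recovered / total


def adjusted_rate(treatment):
    """Adjustment formula: sum_z P(Y=1 | T, Z=z) * P(Z=z).

    Valid only under the stated assumption that severity is the
    sole confounder of treatment and recovery.
    """
    n_all = sum(n for strata in counts.values() for _, n in strata.values())
    rate = 0.0
    for z in ("mild", "severe"):
        p_z = sum(counts[t][z][1] for t in counts) / n_all  # P(Z = z)
        r, n = counts[treatment][z]
        rate += (r / n) * p_z
    return rate


for t in ("A", "B"):
    print(f"treatment {t}: crude = {crude_rate(t):.3f}, adjusted = {adjusted_rate(t):.3f}")
```

With these counts the crude comparison favours treatment B, while the adjusted comparison favours treatment A. This reversal is an instance of Simpson's paradox, and which of the two numbers answers the causal question depends on the assumed graph, not on the data alone.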
Prerequisites
- Basic calculus and linear algebra
- Basic knowledge of probability, statistical inference and regression analysis at the level of Devore, Farnum and Doi, “Applied Statistics for Engineers and Scientists”, Cengage Learning, 2014.
- User knowledge of R, MATLAB, Python or any other statistical software.
Learning Objectives
- The students are able to analyse data using Generalized Linear and Additive Models (GLM and GAM) and understand the benefits that these modelling approaches offer for the analysis of normally and non-normally distributed response variables.
- The students understand when causal reasoning is important and what regression and machine learning are actually doing. They understand the importance of the data generating process. They can determine causal effects from observational data using graphical models.
- The students acquire a comprehensive overview of how the open source statistical environment R is used and are able to perform a data analysis on real data sets, applying the techniques introduced in the course.
Contents of Module
First Part (8 weeks):
- Review of the concepts of multiple linear regression analysis with respect to inference, prediction, model evaluation and model building. Introduction to some advanced topics in linear regression modelling. (3 weeks)
- Extension of the linear regression model to generalized linear and additive models, including logistic, Poisson and Gamma regression. Inference, evaluation and variable selection are revisited for such models. (5 weeks)
Second Part (6 weeks):
- Pitfalls of drawing conclusions from observational data; Simpson's paradox and its implications. Causation versus association: when is causal reasoning important? What are regression and machine learning actually doing?
- Introduction to the concepts of instrumental variables and interventions; determining causal effects from observational data using recent approaches such as causal graphical models or structural equation models; estimating total and direct causal effects.
Both parts:
- The contents listed are illustrated with use cases from industrial and scientific fields. The practical work is done with the open source statistical analysis environment R.
Teaching and Learning Methods
Classroom teaching and practical work on the computer with the statistical analysis environment R/RStudio.
Literature
Slides and lecture notes will be available in addition to recommended book chapters.