Introduction to Statistical Modeling for Online Behavior Data
Opal: Link here.
Time: Thursdays, 5th double period (5. DS)
Location: BAR/0I89
Department: Computer Science
Modules: INF-BAS3 (Software- und Web-Engineering)
Language: English
Assessment: oral exam
Description:
Online platforms generate vast amounts of behavioral data, including subjective ratings (e.g., likes, stars, reactions) and objective engagement metrics (e.g., views, clicks, watch time). While machine learning excels at predicting user behavior, statistical modeling—a cornerstone of empirical research—is critical for interpretable analysis. It helps uncover relationships between user behavior and various factors (e.g., how demographics influence engagement patterns) and enables causal inference in experiments (e.g., measuring the impact of different feed-ranking algorithms on user experience). These methods are widely used in UX research, product analytics, and A/B testing.
This course provides a hands-on introduction to statistical modeling in R, focusing on the methods most useful for analyzing behavioral data. We’ll start with key concepts and then progress to choosing a model based on the type of outcome. Topics include logistic regression for binary outcomes (e.g., like/dislike reactions), ordinal regression for rating scales (e.g., star ratings), beta regression for bounded continuous feedback (e.g., sliders), and hierarchical models for nested data (e.g., multiple ratings from the same users). By the end, you’ll know not only which model to use but also how to visualize and communicate your findings clearly and interpretably.
No prior statistical modeling experience is required. If you’re familiar with Python or similar languages, the transition to R should be easy. All scripts and exercises are provided.
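For readers coming from Python, the following is a minimal sketch of the first model family named above: logistic regression for like/dislike reactions. It is not course material. The data, the variable names (watch_minutes, liked) and the helper fit_logistic are hypothetical, and the fit uses plain Newton-Raphson from the standard library rather than a statistics package.

```python
import math

# Hypothetical toy data, invented for illustration only:
# watch time in minutes and whether the user liked (1) or disliked (0) the video.
watch_minutes = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
liked = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    """Inverse logit: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, steps=25):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by maximum likelihood.

    Uses Newton-Raphson updates on the log-likelihood with one predictor,
    so the 2x2 linear system is solved by hand.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood (score vector).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of the information matrix (negative Hessian).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(watch_minutes, liked)

# The slope is on the log-odds scale; exp(b1) is the odds ratio per extra minute.
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, odds ratio = {math.exp(b1):.3f}")
print(f"P(like | 1 minute)  = {sigmoid(b0 + b1 * 1.0):.3f}")
print(f"P(like | 4 minutes) = {sigmoid(b0 + b1 * 4.0):.3f}")
```

In R, the same kind of fit is typically a single call to glm() with family = binomial, which is why the switch from Python tends to be more about syntax than about concepts.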
Schedule:
Part 1: General Introduction
Week 1 Introduction to statistical modeling: goals and principles
Week 2 Linear regression: the foundational model, applications and limitations
Part 2: Selected Generalized Linear Models
Week 1 Logistic regression 1: analysis of likes and dislikes
Week 2 Logistic regression 2: multivariate analysis
Week 3 Ordinal regression 1: stars and other Likert-type ratings
Week 4 Ordinal regression 2: modeling distributional parameters
Week 5 Beta regression 1: slider-type ratings
Week 6 Beta regression 2: modeling distributional parameters
Week 7 Interim summary, additional exercises, and other GLMs
Part 3: Experimental Design & Hierarchical Models
Week 1 Modeling between- and within-individual experimental designs
Week 2 Hierarchical models 1: introduction
Week 3 Hierarchical models 2: logistic and ordinal regression
Week 4 Hierarchical models 3: beta regression
Week 5 Final session, Q&A, discussion and feedback