Introduction to Statistical Modeling for Online Behavior Data

Opal: Link here.

Time: Thursdays, 5th double period (5. DS)

Location: BAR/0I89

Department: Computer Science

Modules: INF-BAS3 (Software- und Web-Engineering)

Language: English

Assessment: oral exam

Description:

Online platforms generate vast amounts of behavioral data, including subjective ratings (e.g., likes, stars, reactions) and objective engagement metrics (e.g., views, clicks, watch time). While machine learning excels at predicting user behavior, statistical modeling—a cornerstone of empirical research—is critical for interpretable analysis. It helps uncover relationships between user behavior and various factors (e.g., how demographics influence engagement patterns) and enables causal inference in experiments (e.g., measuring the impact of different feed-ranking algorithms on user experience). These methods are widely used in UX research, product analytics, and A/B testing.

This course provides a hands-on introduction to statistical modeling in R, focusing on the methods most useful for analyzing behavioral data. We’ll cover key concepts and then progress to choosing a model based on the type of outcome. Topics include logistic regression for binary outcomes (e.g., like/dislike reactions), ordinal regression for rating scales, beta regression for continuous feedback (e.g., sliders), and hierarchical models for nested data (e.g., multiple ratings from the same users). By the end, you’ll know not only which model to use but also how to visualize and communicate your findings clearly.
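As a rough illustration of the first model family mentioned above, logistic regression for binary outcomes, the sketch below fits a one-predictor model by Newton-Raphson in plain Python. It is not course material: the course itself works in R, and the data (watch time in minutes against like/no-like), the function names `inv_logit` and `fit_logistic`, and all variable names are invented for this example.

```python
import math


def inv_logit(eta):
    """Map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(xs, ys, iterations=25):
    """Fit P(y = 1) = inv_logit(b0 + b1 * x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        ps = [inv_logit(b0 + b1 * x) for x in xs]
        # Score: gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Information matrix (negative Hessian) with weights p * (1 - p).
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical toy data: watch time in minutes and whether the user liked the item.
watch_minutes = [1, 2, 3, 4, 5, 6, 7, 8]
liked = [0, 0, 1, 0, 1, 1, 0, 1]

b0, b1 = fit_logistic(watch_minutes, liked)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"P(like | 5 min) = {inv_logit(b0 + b1 * 5):.3f}")
```

The slope is on the log-odds scale: a positive value means longer watch time goes with higher odds of a like in this toy data. In R, the same model would typically be fitted with `glm(liked ~ watch_minutes, family = binomial)`.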

No prior statistical modeling experience is required. If you’re familiar with Python or similar languages, the transition to R should be easy. All scripts and exercises are provided.

Schedule:

Part 1: General Introduction

Week 1 Introduction to statistical modeling: goals and principles

Week 2 Linear regression: the foundational model, applications and limitations

Part 2: Selected Generalized Linear Models

Week 1 Logistic regression 1: analysis of likes and dislikes

Week 2 Logistic regression 2: multivariate analysis

Week 3 Ordinal regression 1: stars and other Likert-type ratings

Week 4 Ordinal regression 2: modeling distributional parameters

Week 5 Beta regression 1: slider-type ratings

Week 6 Beta regression 2: modeling distributional parameters

Week 7 Interim summary, additional exercises, and other GLMs

Part 3: Experimental Design & Hierarchical Models

Week 1 Modeling between- and within-individual experimental designs

Week 2 Hierarchical models 1: introduction

Week 3 Hierarchical models 2: logistic and ordinal regression

Week 4 Hierarchical models 3: beta regression

Week 5 Final session, Q&A, discussion and feedback