Text Analysis Course

This page hosts material for the 5-day course in text analysis at the Max Planck Summer School.

Syllabus
Code Base (zip file; scripts are numbered to match the accompanying slides)
Course Assignment

Anaconda Installation Instructions

Windows Instructions

MacOS Instructions

Acknowledgement

Thanks to Chris Bail, Brandon Stewart, Piero Molino, and Michael McMahon for their useful slide decks, on which some of these lectures are based.

Part 0 — Introduction

Slides – 00 – Introduction

Part 1 — From Documents to Features

Slides – 01 – Introducing Corpora

Slides – 02 – Features

Part 2 — Describing the Feature Matrix

Slides – 03 – Topic Models

Slides – 04 – Word Embeddings

Slides – 05 – Similarity and Clustering

Part 3 — Supervised Learning with Text Data

Slides – 06 – Regression

Slides – 07 – Classification

Part 4 — Research Design with Text Data

Slides – 08 – Research Design

Slides – 09 – Course Recap