Text as Data

The UM campus was lucky to have one of the developers of the Structural Topic Model (stm) come for a 1-day workshop on this R package. Participants got access to more than can be publicly posted on the PSC Info Blog, but there is still plenty to explore below. If I find I can post more of the workshop materials, I will update this post.

stm Vignette
This vignette is an annotated guide to using the stm package. Workshop participants got a more interactive vignette, which didn’t assume as much knowledge of R. If you are already familiar with R, this vignette walks you through stm.

Publications based on stm
[scroll about halfway down for methods papers and about two thirds of the way down for substantive papers]

The stm package assumes that several other packages have been installed. Here are a few:
Getting started with quanteda – An R package for managing and analyzing text

Additional quanteda vignettes
These are additional vignettes. The more of them you work through, the more you learn. Even better is to substitute your own data.

Getting started with tidyverse in R

Corpus – an R package for managing text

New Directions in Analyzing Text as Data Workshop
Materials for a 2-day workshop by Ken Benoit

Quantitative Text Analysis
Materials and exercises for a 4-session short course by Ken Benoit

