
PDHP Workshops Series:

Principles of Text Analysis

a Workshop

Patrick van Kessel (Pew Research Center)

Wednesday, 11/18/2020, 9:00am to 1:00pm

Location: Zoom

PDHP resumes our 2020 workshop series on Nov. 18th with a workshop entitled Principles of Text Analysis, presented by Patrick van Kessel, senior data scientist at Pew Research Center. This half-day workshop is geared toward data analysts who work with unstructured text data (e.g., open-ended survey responses or web-curated text). It will provide a tutorial on cleaning, processing, and analyzing data from text-based sources using state-of-the-art text analytics techniques, primarily in Python, with some examples also provided in R. Experience with either language is recommended but not required.

Topics include:

* Preprocessing and cleaning messy text data
* Feature extraction using TF-IDF vectorization
* Text analytics techniques including topic modelling and unsupervised clustering methods
* Software demonstration featuring the scikit-learn library for Python

Related Material:

Registration Required


Patrick van Kessel is a senior data scientist at Pew Research Center, specializing in computational social science research and methodology. He is the author of studies that have used natural language processing and machine learning to measure negative political discourse and news sharing behavior by members of Congress on social media, and is involved in the ongoing development of best practices for the application of data science methods across the Center. Van Kessel received his master's degree in social science from the University of Chicago, where he focused on open-ended survey research and text analytics. He holds bachelor's degrees in economics and political science from the University of Texas at Austin. Prior to joining Pew Research Center, he worked at NORC at the University of Chicago as a data scientist and technical advisor on a variety of research projects related to health, criminal justice and education.
