Introduction to Text as Data
Instructor: June Yang
Text data has gained popularity over the last decade owing to increased data availability, the emergence of new methods, and the falling cost of computational resources. Based on the book Text as Data: A New Framework for Machine Learning and the Social Sciences, this workshop introduces methods for selecting and representing text, making research discoveries, and building measurements from text data.
We will briefly review the principles, survey the methods in each section, and take a deep dive into one or two of the most common methods using Python. The workshop is designed for researchers in social science and demography with no prior experience working with text.
Prerequisites
Experience working with text is not required, but some understanding of Python and writing functions would be helpful.
Materials
Tutorial files: https://github.com/jyang32/TAD_workshop
Google Colab playground: https://colab.research.google.com/drive/1RWUytojPxMkMQs2pDo7EYKwMBRP76IUo?usp=sharing