Text Cleaning and Preparation for Natural Language Processing
This workshop will review the process of preparing, cleaning, and formatting text for natural language processing projects. Topics will include stop words, n-grams, stemming, lemmatization, and other techniques for pre-processing text. Although we will populate and evaluate a machine learning text classification model, the emphasis will be on the programming work involved in preparing text to build and populate the model rather than on algorithms or analysis.
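As a rough illustration of the pre-processing steps named above, here is a minimal sketch in plain Python. It is not taken from the workshop materials: the stop-word list, the sample sentence, and the helper names (`tokenize`, `remove_stop_words`, `ngrams`, `naive_stem`) are assumptions made for illustration, and the stemmer is a toy suffix stripper rather than a real stemming algorithm such as Porter's.

```python
import re

# Tiny illustrative stop-word list (an assumption; real projects use larger lists).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for"}


def tokenize(text):
    """Lowercase the text and split it into runs of alphabetic characters."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


def ngrams(tokens, n=2):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def naive_stem(token):
    """Toy stemmer: strip one common suffix if enough of the word remains."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[:-len(suffix)]
    return token


# Hypothetical sample sentence, used only to exercise the helpers.
sample = "The patients reported cleaning of the wounds"
tokens = remove_stop_words(tokenize(sample))
stems = [naive_stem(t) for t in tokens]
bigrams = ngrams(tokens, n=2)
```

In practice, libraries such as NLTK or spaCy provide curated stop-word lists, stemmers, and lemmatizers; the sketch only shows the shape of each step.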
This workshop will take place online over UCSF Zoom. Registered participants will receive an email through LibCal with connection information the day before the workshop.
Prerequisites: Familiarity with Python.
Geoffrey Boushey with the UCSF Library Data Science Initiative
- Friday, Dec 18 2020
- 9:00am - 12:00pm
- Time Zone: Pacific Time - US & Canada