
Text Cleaning and Preparation for Natural Language Processing


This workshop will review the process of preparing, cleaning, and formatting text for natural language processing projects. Topics will include stop words, n-grams, stemming, lemmatization, and other techniques for pre-processing text. Although we will populate and evaluate a machine learning text classification model, the emphasis will be on the programming work involved in preparing text to build and populate the model rather than on algorithms or analysis.
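As a rough illustration of two of the listed topics, the sketch below removes stop words and builds n-grams using only the Python standard library. It is not taken from the workshop materials: the function names, the sample sentence, and the tiny stop-word list are all assumptions made for this example. Stemming and lemmatization are left out because they usually rely on a library such as NLTK or spaCy.

```python
import re

# Tiny illustrative stop-word list (an assumption for this sketch;
# real projects typically use a much larger, curated list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample sentence, used only to exercise the functions.
tokens = remove_stop_words(tokenize("The cleaning of text is a step in the pipeline."))
bigrams = ngrams(tokens, 2)
print(tokens)
print(bigrams)
```

Cleaned tokens and n-grams like these are the kind of features that would then be passed to a text classification model.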

This workshop will take place online over UCSF Zoom. Registered participants will receive an email through LibCal with connection information the day before the workshop.


Prerequisite: familiarity with Python.


Geoffrey Boushey with the UCSF Library Data Science Initiative

Friday, Feb 26 2021
9:00am - 12:00pm
Time Zone: Pacific Time - US & Canada

Registration is required. There are 30 seats available.

Event Organizer

Geoffrey Boushey