Advanced computational methods with UCSF clinical data on Information Commons (In-Person)

Workshop Overview

Getting ready to apply machine learning and other advanced computational methods to your research? You can do it with UCSF Information Commons, a high-performance computing environment powered by an Apache Spark cluster on AWS. In this hands-on workshop, we will work through a real case study, exploring de-identified UCSF electronic health records on Information Commons. You will learn how to query UCSF clinical data and gain some of the skills you need to build your own computational models in this environment.

Learning Objectives

In this workshop, you will learn how to do the following on Information Commons:

Run SQL queries to extract de-identified clinical data of interest
Manage your files on the cluster
Launch JupyterHub and run Jupyter notebooks with Python, R or SparkSQL code
Train a machine learning model using Spark-based tools

Prerequisites

To benefit from this workshop, you must have an Information Commons account (see Accessing Information Commons) and permission to access UCSF de-identified clinical data (see Research Data and Tools Access Request). Please complete both steps by January 20, as the process can take up to 2 weeks.

We also strongly recommend that you be comfortable with Unix shell scripting, SQL, and Jupyter notebooks. Familiarity with AWS S3 commands, Python, and machine learning concepts will also be helpful. Tutorials are available on the Information Commons Wiki.

Be sure to bring your laptop to the workshop!

Instructors

Geoff Boushey is an Application Developer for the Data Science Initiative and the Center for Knowledge Management in the UCSF Library.

Angelo Pelonero is an Instructional Designer for the Data Science Initiative in the UCSF Library and for the Bakar Computational Health Sciences Institute at UCSF.

And others from the Bakar Computational Health Sciences Institute and Library

Date: Thursday, Feb 6 2020
Time: 3:00pm - 5:00pm
Time Zone: Pacific Time - US & Canada
Location: BH215
Campus: Mission Bay
Categories: Data Science

Registration has closed.

Event Organizer

Karla Lindquist

karla.lindquist@ucsf.edu