The Center for Computational Thinking’s Ph.D. Computational Fellowship provides Ph.D. students in any program who lack summer funding with foundational classes in data science to accelerate their own research. The fellowship offers project- and team-based learning to better prepare students for data-centric research problems, and aims to create a pipeline of diverse, curious, and savvy individuals who will be future leaders in computation. Throughout the program, students develop their emerging programming skills and apply them to academic research questions. During this summer-long fellowship, students apply their new computational skills to research problems relevant to their fields, from the humanities to the social sciences to the sciences. At the conclusion of the summer, students prepare a short presentation and a record of their project to share broadly.

Introduction to Data Science Boot Camp 

The CCT fellowship includes an introduction to data science at the beginning of the summer. 

This workshop provides an introduction to the emerging field of data science, including data analysis and visualization. Students will be provided with datasets and introduced to the packages and code used to examine data. The first half of each class consists of lectures on methods and demonstrations; in the second half, students will use tools to analyze real data. Laptop computers are required. Methods for filtering, sorting, and transforming data will be discussed, along with visualization tools and options. Particular attention will be paid to code interpretation and data provenance, with students learning to generate reproducible data output files. As a final project for the boot camp, students will present their research question and exploratory data analysis, sharing their findings with the class in a short oral presentation. Although specific pedagogical datasets will be used for analysis in class, this workshop will provide broadly applicable tools to reproducibly analyze and visualize data across domains.

For more information, contact