All MSDS students complete a capstone project. Capstone projects challenge students to acquire and analyze data to solve real-world problems. Project teams consist of two to four MSDS students and one or more faculty advisors. These teams generally work with sponsoring organizations to provide valuable recommendations to address strategic and operational issues. Depending on the needs of the sponsor, teams may develop web-based applications that can support ongoing decision making. The capstone project is conducted over the course of two semesters and culminates in a paper and presentation. You may wish to review previous examples of capstone projects.

Our foremost priority when considering capstone projects is that they give our students the opportunity to achieve the primary learning objectives associated with this culminating experience. These learning outcomes include:

  • Synthesizing the concepts you have learned in courses throughout the program (this requires that the question posed by the project be complex enough to call for the analytical approaches learned in the program and that the available data be large enough to qualify as ‘big’)
  • Gaining experience with ‘raw’ data, which exposes you to the data pipeline process you are likely to encounter in the ‘real world’
  • Demonstrating oral and written communication skills through a formal paper and presentation of project outcomes
  • Acquiring team-building skills on a long-term, complex data science project

What does the process look like?

  1. The School of Data Science periodically puts out a Call for Proposals. Prospective project sponsors complete applications and work with the SDS Associate Director for Research Development to finalize the materials and ensure that projects meet the requirements.
  2. Finalized lists of projects are then presented to students in a ‘capstone open house’ near the start of the term in which their capstone project work begins (fall term for residential MSDS students, term 4 for online MSDS students). During the ‘capstone open house’, project sponsors describe their projects and students have the opportunity to ask questions.
  3. Students then individually rank their top project choices using a mechanism described to them by an SDS faculty member. An algorithm then sorts students into groups based on their rankings and the desired number of groups, which depends on the total number of students enrolled that semester and a target group size of approximately 3 to 4 students.
  4. Faculty communicate the project assignments. Each group is assigned a faculty mentor, who meets with approximately 4 groups each week in a seminar-style format.
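
The grouping in step 3 can be illustrated with a minimal Python sketch. This is not the School's actual algorithm, which is not described here. It assumes a simple greedy approach: derive the number of groups from enrollment and a target group size, keep the most popular projects by a Borda-style score, then place each student in their highest-ranked project that still has room. The function name, parameters, student names, and project names are all hypothetical.

```python
import math


def assign_groups(rankings, target_size=4, max_size=4):
    """Greedy illustration only; not the School's actual method.

    rankings: dict mapping student -> list of projects, most preferred first.
    Returns a dict mapping each selected project -> list of assigned students.
    """
    n_students = len(rankings)
    # Number of groups follows from enrollment and the target group size.
    n_groups = math.ceil(n_students / target_size)

    # Borda-style score: a student's first choice earns the most points.
    scores = {}
    for prefs in rankings.values():
        for position, project in enumerate(prefs):
            scores[project] = scores.get(project, 0) + len(prefs) - position

    # Keep the highest-scoring projects; ties are broken by project name.
    selected = sorted(scores, key=lambda p: (-scores[p], p))[:n_groups]
    if len(selected) * max_size < n_students:
        raise ValueError("not enough projects for every student")

    groups = {project: [] for project in selected}
    # Students are processed alphabetically, an arbitrary assumption;
    # a real process might randomize the order or optimize globally.
    for student in sorted(rankings):
        ranked = [p for p in rankings[student] if p in groups]
        unranked = [p for p in selected if p not in ranked]
        for project in ranked + unranked:
            if len(groups[project]) < max_size:
                groups[project].append(student)
                break
    return groups


# Hypothetical rankings for seven students and three proposed projects.
example = {
    "ana": ["A", "B", "C"],
    "ben": ["A", "B", "C"],
    "cy": ["A", "C", "B"],
    "dee": ["A", "B", "C"],
    "eli": ["A", "B", "C"],
    "fay": ["B", "A", "C"],
    "gus": ["C", "B", "A"],
}
groups = assign_groups(example)
for project, members in groups.items():
    print(project, members)
```

A greedy pass like this favors whoever is processed first, so a fairer version would likely randomize the order or solve the assignment as an optimization problem.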

What is the seminar approach to mentoring capstones?

We use a seminar approach to managing capstones to provide faculty mentorship and streamlined logistics. In this approach, one mentor supervises three to four loosely related projects and meets with these groups on a regular basis. Project teams often encounter similar roadblocks and issues, so meeting together to share information and report on progress toward key milestones is highly beneficial.

Frequently Asked Questions (FAQs)

Not necessarily. Generally, in our capstone program each group works with a sponsor from outside the School of Data Science. Some sponsors are corporate, some are from nonprofit and governmental organizations, and some are professors in other departments at UVA. As with everything we do at the School of Data Science, we are constantly evolving and looking for ways to improve. The challenges we continue to encounter when curating capstone projects with external sponsors include scoping and defining a question of sufficient depth for our students, obtaining data of sufficient size, obtaining access to the data early enough for adequate analysis to be performed, and navigating a range of legal issues (including conflicts of interest). While we continue to strive to use sponsored projects and work to solve these issues, we also look for ways to leverage openly available data to solve interesting societal problems, which allows students to apply the skills learned throughout the program. While not all capstones have sponsors, all capstones have clients. That is, the work is being done for someone who cares about and is invested in the outcome.

Because data science is a team sport! All of our capstones, online and residential, are group projects. While group projects require more coordination and collaboration than individual work, they also have benefits. These are big projects, and it would be too much to ask of one person. Also, most data science jobs involve a high degree of group work and, as a result, building this capability in our students is one of our core learning objectives for the capstone project.

First, remember that the point of the capstone projects isn’t the subject matter, it’s the data science. For example, if you couldn’t care less about political speeches, maybe you can appreciate the challenge of building a document store and running Latent Dirichlet Allocation on cloud computing. You might not care enough about the election to bother to vote, but you can still get a lot out of learning how to generate causal inferences from the vote-by-mail natural experiment. You might hate social media, but you might need to learn how to wrangle a tough API like Twitter’s and how to run recurrent neural networks on the time series output. Professional data scientists often find themselves working on topics they find boring while using methods they enjoy. That said, there are many ways to tackle a subject, and we are more than happy to work with you to find an approach to the work that best aligns with your interests.

You can influence which project you work on in two ways: through the ranking process after the ‘capstone open house’, and by encouraging your company or department to submit a proposal during the Call for Proposals. At a minimum, it takes several months to work with a sponsor to adequately scope a project, confirm access to the data, and put the appropriate legal agreements into place. Before you ever see a project presented at the capstone open house, a lot of work has taken place to get it to that point!

Each spring, we put forward a public call for capstone projects. You are encouraged to share this call widely with your community, including your employer, non-profit organizations, or any entity that might have a big data problem that we can help solve. As a reminder, capstone projects are group projects, so the project would require sufficient student interest after the ‘capstone open house’. In addition, you (the student) cannot serve as the project sponsor (someone else within your employer organization must serve in that capacity).

If my project doesn’t have a corporate sponsor, am I losing out on a career opportunity?

Completing a capstone that produces good results, presented in a paper and through code on GitHub, will provide more career opportunities than the identity of the project sponsor. Although it does happen from time to time, it is rare for a capstone to lead to a direct job offer with the capstone sponsor’s company. The purpose of the capstone is to provide you with the opportunity to do relevant, quality work (as described above in the learning objectives) that can be included on a resume and discussed during job interviews. We have an excellent career services team led by Myra Blanchard and Reggie Leonard. Capstone projects are just one networking opportunity available to you in the program.

The SIEDS conference takes place annually in the spring. Traditionally, graduates of the residential MSDS program submit papers to it for publication and, if accepted, present their results at the conference. For the online program, we will make publication opportunities available for the final papers and also provide an opportunity to present results to an online audience. At a minimum, we anticipate that online students will present at School of Data Science venues.