PDSG Team Project Competition

For the Spring 2017 semester, we’re holding a team project competition. Each member of a winning team (excluding PDSG board members) will receive a $50 Amazon gift card, and there will be at least 2 winning teams. To be eligible for the Amazon gift cards, here’s what you have to do:

  1. Form a team of 2-4 PDSG members.
  2. Let us know your team members and the dataset(s) you want to work with by posting on our Piazza account by February 19th.
  3. Work on your project. Any way you want to extract some value or insight from data is valid for this competition. However, you will not be eligible for an Amazon gift card if your project is a class assignment.
  4. Write a short blog post-style report on your project by April 6th.
  5. Give a 10-15 minute presentation on your project in front of a panel of judges around April 20th (actual date TBD).

Judging Criteria

  1. Novelty: Is this a new analysis, or something mostly from a tutorial?
  2. Validity: Was the analysis meaningful and done correctly?
  3. Clarity: Was the project presented clearly and effectively?
  4. Impact: How does/could this project contribute to society?

Frequently Asked Questions

Below are some frequently asked questions. If you have a question that isn’t answered here, please post it on Piazza. Other people probably have the same question.

What do I get out of this if I don’t win a $50 Amazon gift card?

In addition to the experience you gain, people you meet, and portfolio content you generate, the more you participate, the higher your chance of winning smaller prizes (like $5 Starbucks gift cards) at the end of the semester. Throughout the semester, you can earn raffle tickets. At the end of the semester, we’ll draw one raffle ticket for each prize we have to give away and give the prize to the owner of that raffle ticket. Here’s how to earn raffle tickets:

  • 2 raffle tickets for leading a workshop. Email us (board@penndsg.com) if you have a tool or technique you want to share with the group.
  • 1 raffle ticket for submitting a blog post-style project report for us to post on the website.
  • 1 raffle ticket for giving a 10-15 minute talk at a PDSG event. Possible topics include internship experiences, research experiences, or sharing project results. For team project presentations, each team member who presents earns a raffle ticket.
  • 1 raffle ticket for introducing the PDSG board, via email or in person, to a data professional who ends up giving a career talk for PDSG.
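
As a rough illustration of how the raffle rewards participation, here is a minimal Python sketch. The ticket values come from the list above; the activity labels, function names, and member names are hypothetical, and the sketch assumes a drawn ticket is not put back into the pool, which the rules above do not specify.

```python
import random

# Ticket values from the list above; the activity keys are hypothetical labels.
TICKETS_PER_ACTIVITY = {
    "workshop": 2,
    "blog_post": 1,
    "talk": 1,
    "speaker_referral": 1,
}


def count_tickets(activities):
    """Total raffle tickets earned for a list of activity labels."""
    return sum(TICKETS_PER_ACTIVITY[a] for a in activities)


def draw_winners(tickets_by_member, num_prizes, seed=None):
    """Draw one ticket per prize and return the ticket owners in draw order.

    Assumption: a drawn ticket is not returned to the pool, so a member can
    win at most as many prizes as they hold tickets.
    """
    rng = random.Random(seed)
    # One pool entry per ticket, so more tickets means a higher chance per draw.
    pool = [member for member, n in tickets_by_member.items() for _ in range(n)]
    return rng.sample(pool, k=min(num_prizes, len(pool)))


if __name__ == "__main__":
    tickets = {
        "member_a": count_tickets(["workshop", "blog_post", "talk"]),  # 4 tickets
        "member_b": count_tickets(["blog_post"]),  # 1 ticket
    }
    print(draw_winners(tickets, num_prizes=2))
```

In this example, member_a holds 4 of the 5 tickets in the pool, so they have a 4 in 5 chance of owning the first ticket drawn.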

How can I find people to work with on a team project?

On our Piazza account, there’s a “Search for Teammates” widget on the top left within the “Q&A” section. You can either respond to other people’s posts or create your own. You could create a post with a specific project idea, or just say you want to work on a project but haven’t decided what to do.

To gain access to the Piazza account, join our group and we’ll add you. If you’ve already joined and still don’t have access, email us at board@penndsg.com to let us know.

I don’t have a project idea. What should I do?

Our resources page has some links to public datasets and tips about building a data science portfolio. You could look through those, pick a dataset that looks interesting, and decide what you want to do with it.

We highly recommend working on a completed or ongoing competition from the Kaggle website. Kaggle competitions have a few key features that are great for learning data science:

  • Interactive Python or R scripts posted by other users that you can use to get started
  • Forums to ask questions and get help with your project
  • Clean datasets and clear objectives
  • Immediate scoring of every prediction you submit

We recommend choosing one of these competitions:

  • Titanic: Machine Learning from Disaster. This is an ongoing learning competition based on the classic Titanic dataset, which contains information on all of the passengers of the Titanic, including whether or not they survived. This is a great project for beginners since there are detailed tutorials that introduce machine learning techniques in a straightforward way.
  • Digit Recognizer. This is an ongoing learning competition based on the classic MNIST dataset of images of handwritten digits. Since this dataset has been so extensively studied, many useful scripts and resources are available. This makes it a great option for easing into image-based machine learning.
  • Shelter Animal Outcomes. This competition is already complete, but teams can still create submissions and see how well different models score compared to everyone else. Predict the outcome for each animal from the Austin Animal Center based on intake information including breed, color, sex, and age.
  • TalkingData Mobile User Demographics. This competition is also already complete but still scores your submissions. Build a model predicting users’ demographic characteristics based on their app usage, geolocation, and mobile device properties.
  • Predicting Facebook Check-ins. This is another completed competition that still scores your submissions. Predict where people like to check in on Facebook based on fictitious data similar to data Facebook employees use.

I’m new to data science. How can I get help with my project?

The best way to get help is to choose a project that already has lots of online resources, like the Titanic Kaggle competition. Additionally, you can post questions either as yourself or anonymously to our Piazza group. Either a fellow group member or a PDSG board member will answer your question or try to point you to someone/something else that can help you.

I’m doing a data science project as part of a class. Can I use this project for the Team Project Competition?

If you do, you won’t be eligible for the $50 Amazon gift cards, but you’ll be eligible for everything else. This means you will be able to get help for your project on Piazza, and you will be able to receive raffle tickets for blog posts and presentations regarding your project.