The goal of the Final Project is to prove or disprove a hypothesis using the techniques learned in this class, and to demonstrate your understanding of those techniques by explaining them to others. It's open-ended: you decide what you're investigating. We're looking for you to be creative, and just the right amount of ambitious.
- General assignment information
- Create a new notebook to do the actual analysis; that is what you'll turn in.{% if id == "nyu" %} To create one, click File > New notebook > Python [conda env:python-public-policy].
{% endif %}
- Go back and find any information that's available around the data, to get a better understanding of what it contains and means.
- Might include a data dictionary
- Might involve poking around a government agency's web site to understand their processes
- Understand what all the different columns and values represent
- If you end up answering your initial research question easily (that is, before you've met the requirements below), ask and answer follow-up question(s).
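
As a starting point for understanding what the columns and values represent, a quick inspection along these lines may help. This is only a sketch: the DataFrame and its column names (`complaint_type`, `borough`, `response_days`) are made up for illustration and stand in for whatever dataset you load.

```python
import pandas as pd

# Hypothetical sample standing in for a real dataset; the column names are made up.
df = pd.DataFrame(
    {
        "complaint_type": ["Noise", "Heat", "Noise", None],
        "borough": ["BRONX", "QUEENS", "Unspecified", "BRONX"],
        "response_days": [1.5, 3.0, None, 2.0],
    }
)

# Column names, dtypes, and non-null counts in one summary
df.info()

# Distinct values in a categorical column, including missing ones
borough_counts = df["borough"].value_counts(dropna=False)
print(borough_counts)

# Share of missing values in each column
missing_share = df.isna().mean()
print(missing_share)
```

Unexpected values (such as the placeholder "Unspecified" in this made-up data) are the kind of thing a data dictionary or the agency's web site can help explain.
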
In addition to the applicable general assignment requirements, your submission should:
- Read like a blog post - 35 points
- Pretend you're explaining to a peer who hasn't taken this class. You don't need to teach them to code, but they should be able to follow what's going on.
- Re-state the question, hypothesis, and data source(s) with link(s)
- Walk the reader through what you're doing in every step and what they should be taking away from it.
- You are more than welcome to inject personality in there; it doesn't need to be dry.
- Use text cells with Markdown for formatting.
- You'll need to change the cell type to Markdown.
- If you hit any dead ends in your analysis, leave them in.
- For example, include charts that you generate that may not show anything interesting and explain what you are choosing to look at instead.
- You should still be cleaning up unused/broken code to make your notebook readable.
- You may need to tweak your research question as you go. Show what changed and explain why.
- Have a conclusion that speaks to your question and hypothesis.
- Use pandas - 15 points
- Not be trivial - 35 points - requiring:
- At least 40 lines of code to come to a conclusion
- That code should be relevant to answering your question. In other words, having 40 lines of `print("hello world")` wouldn't count.
- If you meet all the other requirements, you will likely be well over this number.
- How to count them automatically
- Transforming data through grouping, merging, and/or reshaping of DataFrames
- Operations that aren't easily done in a spreadsheet.
- Have a visualization (chart or map) of some kind - 15 points
- Follow best practices
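
To make the "grouping, merging, and/or reshaping" requirement more concrete, here is a minimal sketch that does one of each. The two DataFrames, their column names, and the population figures are all invented for illustration; your own analysis will look different.

```python
import pandas as pd

# Hypothetical data: both tables and all values are made up for illustration.
complaints = pd.DataFrame(
    {
        "borough": ["BRONX", "BRONX", "QUEENS", "QUEENS", "QUEENS"],
        "year": [2022, 2023, 2022, 2023, 2023],
        "complaint_id": [1, 2, 3, 4, 5],
    }
)
population = pd.DataFrame(
    {
        "borough": ["BRONX", "QUEENS"],
        "population": [1_000_000, 2_000_000],
    }
)

# Grouping: count complaints per borough and year
counts = (
    complaints.groupby(["borough", "year"])
    .size()
    .reset_index(name="complaints")
)

# Merging: attach population so the counts can be normalized
merged = counts.merge(population, on="borough", how="left")
merged["per_100k"] = merged["complaints"] / merged["population"] * 100_000

# Reshaping: one row per borough, one column per year
wide = merged.pivot(index="borough", columns="year", values="per_100k")
print(wide)
```

Normalizing by population before comparing groups is the sort of step that is tedious in a spreadsheet and short in pandas.
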
If you answer the first question easily, that's fine; dig into / build off of it. Go deep, not broad.
- DO NOT WAIT UNTIL THE LAST MINUTE TO SUBMIT. Leave yourself time to deal with any issues that come up while submitting, such as your computer crashing.
- Please try to preserve anonymity.
- Keep your name/username out of the notebook title, text cells, file paths, etc. {% if id == "columbia" -%}
- Hold off on responding to comments on your notebook until you get your Project grade. {%- endif %}
- Don't leave any sensitive information in the notebook, such as:
- API keys
- Personally-identifiable information (PII)
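
One common way to keep an API key out of the notebook is to read it from an environment variable rather than pasting it into a cell. The sketch below assumes a variable named `MY_API_KEY`, which is a made-up name; use whatever name fits your data source.

```python
import os


def get_api_key(var_name="MY_API_KEY"):
    """Read a secret from an environment variable instead of hardcoding it.

    The default variable name is hypothetical.
    """
    key = os.environ.get(var_name)
    if key is None:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running this cell."
        )
    return key


# Usage, once the variable has been set outside the notebook:
# api_key = get_api_key()
```

If setting environment variables is awkward in your notebook environment, the standard library's `getpass.getpass()` is another option: it prompts for the value at run time, so the value is not stored in the cell's source. Either way, avoid printing the key, since cell output is saved with the notebook.
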
Because it's the end of the course and your peers are doing the reviews, there will be no extensions{% if id == "nyu" %} or resubmissions{% endif %}.
The instructor and {{assistant_name}} don't have bandwidth to review everyone's full notebooks. Therefore, to be fair to everyone, we will deny any requests to have notebooks reviewed end to end, aside from appeals to the peer grade. In other words, please don't ask us "I think I'm done — can you make sure my Final Project is ok?" That said, we are more than happy to answer specific questions and help troubleshoot specific sections.
To confirm you meet the requirements prior to submitting, you can:
- Take a pass through your own notebook, pretending you are grading someone else
- Ask someone else in the class to do so
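
When checking your own notebook against the 40-line requirement, a rough count can be scripted. The sketch below is not the course's official counting method, just one possible approach: it treats a notebook as the JSON that `.ipynb` files contain and counts non-blank, non-comment lines in code cells. The function names and the file name in the usage comment are made up.

```python
import json


def count_code_lines(notebook):
    """Count non-blank, non-comment lines across the code cells of a parsed .ipynb."""
    total = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # A cell's source is stored as either one string or a list of strings
        if isinstance(source, list):
            source = "".join(source)
        for line in source.splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                total += 1
    return total


def count_code_lines_in_file(path):
    """Load a notebook file from disk and count its code lines."""
    with open(path, encoding="utf-8") as f:
        return count_code_lines(json.load(f))


# Tiny in-memory notebook used to demonstrate the count
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# My analysis"]},
        {
            "cell_type": "code",
            "source": ["import pandas as pd\n", "\n", "# load data\n", "df = pd.DataFrame()\n"],
        },
        {"cell_type": "code", "source": "print(1)\nprint(2)"},
    ]
}
example_count = count_code_lines(example)
print(example_count)

# Usage with a real file (hypothetical file name):
# count_code_lines_in_file("final_project.ipynb")
```

A count like this says nothing about whether the code is relevant to your question, so treat it as a sanity check rather than proof that the requirement is met.
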
{% if id == "columbia" %}
Make sure the notebook will be visible to other students for peer grading:
- Open the Sharing settings.
- Under General access, change to LionMail (or Anyone with the link), then Viewer. {% endif %}