First, find a publicly available dataset that interests you. You can revisit activity 3.4 for ideas on where to find datasets. You will then need to describe the dataset. Specifically, describe:
- How was it collected?
- What does one row represent?
- What does each variable describe?
If that information is not provided, do your best to infer it, but be clear about where your information comes from.
- What questions do you have about the variables?
- Is any data missing?
- Is the data binned?
- Is this dataset longitudinal or cross-sectional?
- Why do you think these data points were chosen to be collected?
- Are you satisfied with them?
- What other elements would you want the dataset to include?
- Are any of the variables proxies for other information? If so, are they appropriate proxies?
- Do the proxy variables introduce any bias?
- What questions would you want to ask of this dataset?
- What relationships interest you?
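If you want a quick way to answer a few of these questions (what one row represents, what the variables are, and whether any data is missing), a short script can help. The sketch below is only one possible approach, and it is optional for this assignment. It uses Python's built-in `csv` module; the sample data, the column names, and the `summarize` function are all made up for illustration, so swap in the contents of your own file.

```python
import csv
import io

# Hypothetical sample data standing in for a downloaded CSV file.
SAMPLE_CSV = """city,year,population
Springfield,2020,116000
Shelbyville,2020,
Springfield,2021,117500
"""

def summarize(csv_text):
    """Return (row count, missing-value count per column) for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    rows = 0
    for row in reader:
        rows += 1
        for name in reader.fieldnames:
            value = row[name]
            # Treat absent or blank cells as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return rows, missing

row_count, missing_counts = summarize(SAMPLE_CSV)
print("rows:", row_count)
print("missing per column:", missing_counts)
```

A count like this only catches blank cells. Some datasets mark missing values with placeholders such as "NA" or -1, so check the dataset's documentation before trusting the result.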
This essay should be 1–2 pages in length, double-spaced, and grammatically correct, but it does not have to be formally or academically written. Write it in a “business-casual” tone. Once you are satisfied with your work, click Start Assignment at the top-right of this page, upload your document as a PDF, and click the Submit button. Then, click Next to wrap up Module 3. Outstanding stuff!