1. Data Description and Curiosity Questions about the data:
Background or context of the selected data: sources, how it was collected, the time period it represents, and the context in which it was collected, if available.
Reason(s) why you selected it.
Description of the data: how big is it (number of observations and variables),
how many numeric variables,
how many categorical variables,
description of the variables, if available
Are there any missing values?
Any duplicate rows?
Compute summary statistics (mean, median, mode, standard deviation, variance, range).
Select one categorical variable and compute the same statistics for a numeric variable within each of its categories.
Record your observations. What did you find most fascinating in your descriptive analysis?
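The checks in section 1 could be sketched in Python with pandas roughly as follows. This is only an illustration under assumptions: the real data has not been provided yet, so the column names (`group`, `value`, `units`) and the values are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical toy data standing in for the real file,
# e.g. df = pd.read_csv("your_data.csv") once the data is available.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b", "a"],
    "value": [1.0, 2.0, 3.0, 4.0, None, 1.0],
    "units": [10, 20, 30, 40, 50, 10],
})

# Size of the data and variable types.
n_rows, n_cols = df.shape
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

# Missing values per column and fully duplicated rows.
missing_per_column = df.isna().sum()
n_duplicates = int(df.duplicated().sum())


def first_mode(s):
    """Smallest of the most frequent values (a column can have several modes)."""
    return s.mode().iloc[0]


def value_range(s):
    """Difference between the largest and smallest value."""
    return s.max() - s.min()


# Summary statistics for every numeric column (one row per column).
overall = df[numeric_cols].agg(["mean", "median", "std", "var"]).T
overall["mode"] = df[numeric_cols].mode().iloc[0]
overall["range"] = df[numeric_cols].max() - df[numeric_cols].min()

# The same statistics for one numeric column, grouped by one categorical column.
by_group = df.groupby("group")["value"].agg(
    mean="mean",
    median="median",
    mode=first_mode,
    std="std",
    var="var",
    range=value_range,
)

print(f"{n_rows} rows, {n_cols} columns, {n_duplicates} duplicate row(s)")
print(missing_per_column)
print(overall)
print(by_group)
```

pandas skips missing values in these statistics by default. `first_mode` would fail on a group whose values are all missing, so that case needs separate handling if it occurs in the real data.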
2. Descriptive Statistics and Visualization (at least two out of the four listed below)
Relationship between variables
Distribution of the variable(s)
Spatial data representation
Comparison of summary statistics across categories
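Three of the four options listed above (distribution, relationship, comparison across categories) could be drawn with matplotlib along these lines. This is a sketch under assumptions: the data and the column names `group`, `x` and `y` are hypothetical. Spatial plots are left out because they depend on whether the real data contains coordinates.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data; replace with the DataFrame loaded from the real file.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.1, 3.9, 6.2, 7.8, 10.1, 12.2],
})

fig, (ax_hist, ax_scatter, ax_box) = plt.subplots(1, 3, figsize=(12, 4))

# Distribution of one numeric variable.
ax_hist.hist(df["y"].dropna().to_numpy(), bins=5)
ax_hist.set(title="Distribution of y", xlabel="y", ylabel="count")

# Relationship between two numeric variables.
ax_scatter.scatter(df["x"], df["y"])
ax_scatter.set(title="y vs. x", xlabel="x", ylabel="y")

# Comparison across categories: one box per group.
labels = sorted(df["group"].unique())
samples = [df.loc[df["group"] == g, "y"].dropna().to_numpy() for g in labels]
ax_box.boxplot(samples)
ax_box.set_xticks(range(1, len(labels) + 1))
ax_box.set_xticklabels(labels)
ax_box.set(title="y by group", xlabel="group", ylabel="y")

fig.tight_layout()
# fig.savefig("eda_plots.png")  # or plt.show() in an interactive session
```

Each panel maps to one item in the list, so any two of them would meet the "at least two" requirement.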
3. Generate at least one hypothesis and perform a hypothesis test.
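One possible shape for this step, assuming the hypothesis compares the mean of a numeric variable between two categories, is a Welch two-sample t-test with SciPy. The two samples and the 0.05 significance level below are hypothetical, and the right test depends on the real data (for example, a chi-square test for two categorical variables).

```python
from scipy import stats

# Hypothetical measurements for two categories; replace with the real groups.
group_a = [5.1, 4.9, 5.6, 5.8, 5.2]
group_b = [6.9, 7.1, 6.5, 7.4, 6.8]

# H0: the two groups have the same mean. H1: the means differ.
alpha = 0.05  # assumed significance level

# equal_var=False selects Welch's t-test, which does not assume equal variances.
result = stats.ttest_ind(group_a, group_b, equal_var=False)
reject_null = result.pvalue < alpha

print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}, reject H0: {reject_null}")
```

A p-value below the chosen level is evidence against equal means. It does not by itself say how large or how important the difference is.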
4. Summarize your observations
Please add as many observations and comments as you can. I will provide the data with the files.