Duration: ~30 Minutes
This exercise uses the tidyverse package. If you need help, refer to the Tidyverse practice slides.
In this section, you will create a second dataset that summarizes people based on their survival.
In a workflow, you might write a new script for this section so that each new dataset has its own script file.
Create the variable SURVIVAL, where:
SURVIVAL is "survived" if INJ_SEV does not equal "fatal"
SURVIVAL is "died" if INJ_SEV equals "fatal"
new_fars <- my_fars %>%
  mutate(SURVIVAL = ifelse(INJ_SEV == "fatal", "died", "survived"))
Group the data by your new variable
summary <- new_fars %>% group_by(SURVIVAL)
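Before summarizing, it can be worth checking the SURVIVAL recode on a tiny made-up data frame. The sketch below uses a hypothetical table called toy_fars (invented for illustration, not part of the FARS data) and assumes dplyr is installed. It shows that ifelse() leaves SURVIVAL as NA when INJ_SEV is missing, rather than labelling that person "survived".

```r
library(dplyr)

# Hypothetical toy data, invented only for illustration
toy_fars <- data.frame(
  INJ_SEV = c("fatal", "minor", "none", NA, "fatal"),
  stringsAsFactors = FALSE
)

# Same recode as in the exercise
toy_recoded <- toy_fars %>%
  mutate(SURVIVAL = ifelse(INJ_SEV == "fatal", "died", "survived"))

# Count rows per category, including any missing values
survival_counts <- table(toy_recoded$SURVIVAL, useNA = "ifany")
print(survival_counts)
```

If your own data has missing INJ_SEV values, you may want to decide explicitly how those rows should be treated before grouping.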
Summarize by total number of people, mean age, median age, number of males, and number of females:
summary <- summary %>%
  summarize(
    total = n(),
    age_mean = mean(AGE, na.rm = TRUE),
    age_median = median(AGE, na.rm = TRUE),
    males = sum(SEX == "male"),
    females = sum(SEX == "female")
  )
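The males and females columns rely on the fact that summing a logical vector counts its TRUE values. The sketch below illustrates this with a hypothetical vector called sex (invented for illustration) and uses base R only. It also shows one thing to watch for: if any value is missing, sum() returns NA unless na.rm = TRUE is set.

```r
# Hypothetical values, invented only for illustration
sex <- c("male", "female", "male", NA)

# Comparing to "male" gives a logical vector: TRUE FALSE TRUE NA
is_male <- sex == "male"

# Without na.rm the missing value makes the whole count NA
males_strict <- sum(is_male)

# With na.rm = TRUE only the known values are counted
males_known <- sum(is_male, na.rm = TRUE)

print(males_strict)
print(males_known)
```

Whether to add na.rm = TRUE to the counts in the exercise depends on whether SEX has missing values in your data, so it is worth checking first.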
Chain these steps together, starting from new_fars: group by SURVIVAL, summarize, then ungroup.
fars_summary <- new_fars %>%
  group_by(SURVIVAL) %>%
  summarize(
    total = n(),
    age_mean = mean(AGE, na.rm = TRUE),
    age_median = median(AGE, na.rm = TRUE),
    males = sum(SEX == "male"),
    females = sum(SEX == "female")
  ) %>%
  ungroup()
View your summarized dataset.
fars_summary
head(fars_summary)
Save your data in the folder data.
write.csv(fars_summary, "data/fars2015nys_Albany_person_bysurvival.csv",
row.names = FALSE)
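One way to confirm that the file was written as expected is to read it back and compare it with the table in memory. The sketch below is only an illustration: toy_summary is a hypothetical stand-in for fars_summary, and it writes to a temporary file rather than the data folder so that it runs anywhere.

```r
# Hypothetical stand-in for fars_summary, invented only for illustration
toy_summary <- data.frame(
  SURVIVAL = c("died", "survived"),
  total = c(2L, 3L),
  stringsAsFactors = FALSE
)

# Write to a temporary file instead of data/ so the sketch runs anywhere
out_path <- tempfile(fileext = ".csv")
write.csv(toy_summary, out_path, row.names = FALSE)

# Read the file back and compare it with the original table
check <- read.csv(out_path, stringsAsFactors = FALSE)
same_contents <- isTRUE(all.equal(toy_summary, check))
print(same_contents)
```

Note that write.csv() fails if the target folder does not exist; dir.create("data", showWarnings = FALSE) creates it if needed.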
Enjoy R and please fill out our evaluation.