Case Study: U.S. Births

(Section 8.3)

U.S. Births

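library(mosaicData)  # provides the Births78 data set
library(ggplot2)     # needed for the ggplot() calls that follow
?Births78            # open the help page for the data set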

Old Scatter Plot

ggplot(Births78, aes(x = date, y = births)) + 
  geom_point() +
  labs(x = "Day of the Year", y = "Number of U.S. Births",
       title = "Daily U.S. Births in 1978")

Some days have noticeably fewer births than the rest. What’s going on?

A Hunch

ggplot(Births78, aes(x = wday, y = births)) + 
  geom_violin(fill = "burlywood") +
  geom_jitter()

There are fewer births on weekends!
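
To put numbers on what the violin plot suggests, one option is to compare the mean number of births for each day of the week. This is a minimal sketch; the name mean_births is ours and is not part of the original analysis.

```r
library(mosaicData)  # provides Births78

# Mean number of births for each day of the week.
# mean_births is a hypothetical name introduced for this illustration.
mean_births <- tapply(Births78$births, Births78$wday, mean)
round(mean_births)
```

If the hunch is right, the entries for Sat and Sun should be clearly smaller than those for the other five days.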

New Variable

This new variable records whether each day falls on a weekend (Saturday or Sunday) or on a weekday:

weekend <- ifelse(
  Births78$wday %in% c("Sat","Sun"),
  "weekend", 
  "weekday"
)

Births78$weekend <- weekend
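
A quick sanity check on a recoded variable is to cross-tabulate it against the variable it was built from. The sketch below rebuilds the weekend column so that it runs on its own; the name check is ours.

```r
library(mosaicData)  # provides Births78

# Rebuild the weekend variable so this block is self-contained
Births78$weekend <- ifelse(
  Births78$wday %in% c("Sat", "Sun"),
  "weekend",
  "weekday"
)

# Rows are days of the week, columns are the new labels.
# check is a hypothetical name introduced for this illustration.
check <- table(Births78$wday, Births78$weekend)
check
```

Each row should have a nonzero count in exactly one column: Sat and Sun under "weekend", every other day under "weekday".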

New Graph

ggplot(Births78, aes(x = date, y = births)) + 
  geom_point(aes(color = weekend)) +
  labs(
    x = "Day of the Year", 
    y = "Number of U.S. Births"
  )

Investigate the Exceptions

subset(
  Births78, 
  weekend != "weekend" & births <= 8500,
  select = c(wday, month, day_of_month)
)
    wday month day_of_month
2    Mon     1            2
149  Mon     5           29
185  Tue     7            4
247  Mon     9            4
327  Thu    11           23
359  Mon    12           25

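They are all major U.S. holidays: New Year's Day (observed on Monday, January 2), Memorial Day, Independence Day, Labor Day, Thanksgiving, and Christmas!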
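
One possible follow-up is to give the holidays their own label and color them separately in the scatter plot. This is only a sketch: the variable day_type is our invention, and the rule for spotting a holiday simply reuses the 8500-birth cutoff from the subset above rather than an actual holiday calendar.

```r
library(mosaicData)  # provides Births78
library(ggplot2)

is_weekend <- Births78$wday %in% c("Sat", "Sun")

# Assumption: a weekday with at most 8500 births is one of the holidays
# found above. This reuses the cutoff from the subset() call; it is not
# a general rule for detecting holidays.
is_holiday <- !is_weekend & Births78$births <= 8500

# day_type is a hypothetical variable name introduced for this illustration
Births78$day_type <- ifelse(
  is_weekend, "weekend",
  ifelse(is_holiday, "holiday", "weekday")
)

ggplot(Births78, aes(x = date, y = births)) +
  geom_point(aes(color = day_type)) +
  labs(
    x = "Day of the Year",
    y = "Number of U.S. Births"
  )
```

With three colors, the low weekday points should stand apart from both the ordinary weekdays and the weekends.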