---
title: "Mileage of American Cars"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set()
# need ggplot2 for dataset and graphing functions
library(ggplot2)
# setting the ggplot theme
theme_set(theme_classic())
# setting colors to use throughout
fill <- scale_fill_brewer(type="qual", palette=3, direction=-1)
color <- scale_color_brewer(type="qual", palette=3, direction=-1)
```

## Some data management

First, we will subset to just American cars. We will also make a "long" version of the data, where city and highway mileage are combined into a single column. We show the first few rows of the long dataset.

```{r data-management, echo=FALSE}
# subsetting to American manufacturers
mpg_am <- subset(mpg, (mpg$manufacturer %in% c("chevrolet", "dodge", "ford",
                                               "jeep", "lincoln", "mercury",
                                               "pontiac")))
# creating a single column of both mileage types,
# reshaping the American subset (not the full mpg data) to long format
mpg_am_long <- reshape(as.data.frame(mpg_am),
                       varying=list(c("cty", "hwy")),
                       v.names="mpg",
                       direction="long",
                       timevar="mpg_type",
                       times=c("cty", "hwy"))
# first 10 rows of the long dataset
knitr::kable(head(mpg_am_long, n=10), align='c', row.names=FALSE)
```

## The sample of cars

Let's see what the sample consists of in terms of manufacturer and class:

```{r sample}
# sample by manufacturer and class
ggplot(mpg_am, aes(x=manufacturer, fill=class)) +
  geom_bar() +
  fill
```

We have quite a few SUVs!

## Mileage graphs

Let's look at mean city and highway mileage, with standard error bars, across various factors:

```{r mileage-graphs}
# mileage by class
ggplot(mpg_am_long, aes(x=class, y=mpg, color=mpg_type)) +
  stat_summary(fun.data="mean_se") +
  color
# mileage by cylinders and trans
ggplot(mpg_am_long, aes(x=trans, y=mpg, color=factor(cyl))) +
  stat_summary(fun.data="mean_se") +
  color
# mileage changes over years, by class
ggplot(mpg_am_long, aes(x=factor(year), y=mpg, color=class, group=class)) +
  stat_summary(fun.data="mean_se") +
  stat_summary(fun="mean", geom="line") +
  facet_wrap(~mpg_type) +
  color
```

As expected, the smallest cars have the best mileage.

```{r subgroup-plot, eval=FALSE, results='asis',echo=FALSE}
# Notice the use of Markdown syntax inside of an R code chunk
# This works because knitr converts the output of each
# code chunk to Markdown and text (or figures)
# Requires the use of knitr option results='asis' to print the results
# of cat() without any further markup from knitr
cat("## Plot of city vs highway mileage for manufacturer ", params$manufacturer)
# subset to manufacturer specified in header
subgroup <- subset(mpg_am, mpg_am$manufacturer==params$manufacturer)
# plot city vs highway, colored by class, with best fit lm line
ggplot(subgroup, aes(x=cty, y=hwy, color=class)) +
  geom_point() +
  geom_smooth(aes(color=NULL), method="lm", se=FALSE) +
  color
```
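
The chunk above reads `params$manufacturer`, which R Markdown fills in from a `params` field in the YAML header. The header of this document does not declare one, so the following is only a minimal sketch of what such a header might look like. The default value `"ford"` is an assumption chosen for illustration, not something stated in the original.

```yaml
---
title: "Mileage of American Cars"
output: html_document
params:
  manufacturer: "ford"  # assumed default, chosen only for illustration
---
```

With a header like this in place, a different manufacturer could be supplied at render time through the `params` argument of `rmarkdown::render()`, for example `params = list(manufacturer = "dodge")`. The chunk is also set to `eval=FALSE`, so it would need `eval=TRUE` before it produces any output.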