In [6]:

```
data(ToothGrowth)
str(ToothGrowth)
library(ggplot2)
```

In [9]:

```
message('Table 1: Sample Size')
size=list()
size$dose0.5=sum(ToothGrowth$dose==0.5)
size$dose1=sum(ToothGrowth$dose==1)
size$dose2=sum(ToothGrowth$dose==2)
size$OJ=sum(ToothGrowth$supp=="OJ")
size$VC=sum(ToothGrowth$supp=="VC")
size=as.data.frame(size)
row.names(size)=c("Sample Size")
# Labels follow the order in which the list was filled: three doses, then OJ, then VC
names(size)=c('0.5 mg/day','1 mg/day','2 mg/day','orange juice (OJ)','ascorbic acid (VC)')
size
```

As shown in Table 1, the sample sizes are balanced: 20 observations at each dose level and 30 for each supplement type.
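The counts above are marginal totals. As a complementary check, the short sketch below cross-tabulates supplement against dose to show how many observations fall in each combination. The names `tg` and `counts` are illustrative and not part of the original notebook; a separate copy of the dataset is used so that `ToothGrowth` itself is left untouched.

```r
# Work on a copy of the built-in dataset (hypothetical name `tg`)
tg <- datasets::ToothGrowth

# Cross-tabulate supplement type against dose level
counts <- table(supp = tg$supp, dose = tg$dose)
print(counts)

# Add row and column totals to recover the marginal sample sizes
print(addmargins(counts))
```

A balanced design like this one keeps the group comparisons later in the analysis on an equal footing.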

Let's visualize the data using **ggplot2**.

In [7]:

```
# Treat dose as a categorical variable so each level gets its own box
ToothGrowth$dose = as.factor(ToothGrowth$dose)
ggplot(ToothGrowth, aes(x = dose, y = len)) +
  geom_boxplot(aes(fill = dose)) +
  facet_grid(. ~ supp) +   # one panel per supplement type
  ggtitle('Fig.1. Tooth Growth Dependence on Dose and Supplement') +
  theme(axis.title.y = element_text(colour = "gray20", size = 12, angle = 90, hjust = .5, vjust = 1),
        axis.title.x = element_text(colour = "gray20"),
        plot.title = element_text(vjust = 1.5, size = 12, colour = "purple"),
        axis.text.x = element_text(colour = "red", size = 10, angle = 45, hjust = .5, vjust = .5))
```

In [10]:

```
message('Table 2: 95% confidence interval for mean tooth\n length for each dose amount and supplement')
conf_int95=list()
conf_int95$len_dose0.5=round(t.test(ToothGrowth$len[ToothGrowth$dose==0.5])$conf.int,2)
conf_int95$len_dose1=round(t.test(ToothGrowth$len[ToothGrowth$dose==1.0])$conf.int,2)
conf_int95$len_dose2=round(t.test(ToothGrowth$len[ToothGrowth$dose==2.0])$conf.int,2)
conf_int95$OJ=round(t.test(ToothGrowth$len[ToothGrowth$supp=='OJ'])$conf.int,2)
conf_int95$VC=round(t.test(ToothGrowth$len[ToothGrowth$supp=='VC'])$conf.int,2)
conf_int95=as.data.frame(conf_int95)
row.names(conf_int95)=c('Lower Limit','Upper Limit')
names(conf_int95)=c('0.5 mg/day','1.0 mg/day','2.0 mg/day','OJ','VC')
conf_int95
```

In [8]:

```
message('Table 3: 95% confidence interval for mean\n tooth length for each supplement and dose combination')
conf_int95=list()
conf_int95$OJ0.5=round(t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==0.5])$conf.int,2)
conf_int95$OJ1=round(t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==1])$conf.int,2)
conf_int95$OJ2=round(t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==2])$conf.int,2)
conf_int95$VC0.5=round(t.test(ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==0.5])$conf.int,2)
conf_int95$VC1=round(t.test(ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==1])$conf.int,2)
conf_int95$VC2=round(t.test(ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==2])$conf.int,2)
conf_int95=as.data.frame(conf_int95)
row.names(conf_int95)=c('Lower Limit','Upper Limit')
names(conf_int95)=c('OJ and 0.5 mg/day','OJ and 1 mg/day','OJ and 2 mg/day','VC and 0.5 mg/day','VC and 1 mg/day','VC and 2 mg/day')
conf_int95
```

In [11]:

```
message("Table 4: P-values for Welch's Two Sample T-test of\n tooth length between dose levels and between supplements")
p_values=list()
p_values$dose0.5vs1=t.test(ToothGrowth$len[ToothGrowth$dose==0.5],ToothGrowth$len[ToothGrowth$dose==1])$p.value
p_values$dose0.5vs2=t.test(ToothGrowth$len[ToothGrowth$dose==0.5],ToothGrowth$len[ToothGrowth$dose==2])$p.value
p_values$dose1vs2=t.test(ToothGrowth$len[ToothGrowth$dose==1.0],ToothGrowth$len[ToothGrowth$dose==2])$p.value
p_values$VCvsOJ=t.test(ToothGrowth$len[ToothGrowth$supp=="VC"],ToothGrowth$len[ToothGrowth$supp=="OJ"])$p.value
p_values=as.data.frame(p_values)
row.names(p_values)=c('P-value')
names(p_values)=c('0.5 mg/day vs 1.0 mg/day','0.5 mg/day vs 2.0 mg/day','1.0 mg/day vs 2.0 mg/day','VC vs OJ')
p_values
```

In [12]:

```
message("Table 5: P-values for Welch's Two Sample T-test of\n tooth length between supplements at each dose")
p_values=list()
p_values$OJ_VC_0.5 = t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==0.5],ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==0.5])$p.value
p_values$OJ_VC_1 = t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==1],ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==1])$p.value
p_values$OJ_VC_2 = t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==2],ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==2])$p.value
p_values=as.data.frame(p_values)
row.names(p_values)=c('P-value')
names(p_values)=c('VC vs OJ with 0.5 mg/day','VC vs OJ with 1.0 mg/day','VC vs OJ with 2.0 mg/day')
p_values
```

In this analysis, basic statistical inference is used to investigate the effects of vitamin C dose (0.5, 1.0, and 2.0 mg/day) and supplement type (orange juice and ascorbic acid) on tooth length. Welch's two-sample t-test, which does not assume equal variances, is used for the hypothesis tests; dropping the equal-variance assumption is the safer choice when there is no evidence that the group variances match. We also assume that the groups are independent samples. The basic conclusions from this analysis are:

**. Dose has a significant effect on tooth length: as the dose increases, tooth length increases.**

**. Pooled over all doses, the difference between the mean tooth lengths for the two supplements is not significantly different from zero at the 5% significance level (95% confidence).**

**. Supplement type has a significant effect on tooth length at doses of 0.5 and 1.0 mg/day. At these doses, mean tooth length with orange juice is significantly greater than with ascorbic acid.**

**. However, at a dose of 2.0 mg/day, the mean tooth lengths for the two supplements are not significantly different from each other. Because the p-value is above the 0.05 significance level, we do not reject the null hypothesis that the difference of the true means is zero.**
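To make the test behind these conclusions concrete, the sketch below recomputes Welch's statistic by hand for one comparison (OJ vs VC at 2 mg/day) and checks it against `t.test()`. It uses the standard Welch formulas: the difference in means divided by the unpooled standard error, with degrees of freedom from the Welch-Satterthwaite approximation. The variable names (`tg`, `oj`, `vc`, `t_stat`, `df`, `p_manual`, `welch`) are illustrative and do not appear in the original notebook.

```r
# Work on a copy of the built-in dataset (hypothetical name `tg`)
tg <- datasets::ToothGrowth

# Tooth lengths for the two supplements at the 2 mg/day dose
oj <- tg$len[tg$supp == "OJ" & tg$dose == 2]
vc <- tg$len[tg$supp == "VC" & tg$dose == 2]

# Squared standard error of each sample mean (variances are NOT pooled)
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)

# Welch t statistic: difference in means over the unpooled standard error
t_stat <- (mean(oj) - mean(vc)) / sqrt(se2_oj + se2_vc)

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_stat), df)

# t.test() defaults to var.equal = FALSE; it is spelled out here for clarity
welch <- t.test(oj, vc, var.equal = FALSE)

# The manual and built-in p-values should agree
print(c(manual = p_manual, t.test = welch$p.value))
```

Setting `var.equal = TRUE` would switch to the pooled-variance Student's t-test, which is the stronger assumption this analysis avoids.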