# Fisseha Berhane, PhD

#### Data Scientist

443-970-2353 fisseha@jhu.edu

## The Effect of Vitamin C on Tooth Growth in Guinea Pigs

### Introduction

In this analysis, inferential statistics are used to investigate whether the length of odontoblasts (cells responsible for tooth growth) in guinea pigs is influenced by the dose of vitamin C (0.5, 1, or 2 mg/day). The effect of the delivery method (orange juice or ascorbic acid) on odontoblast length is also studied.

### Exploratory Data Analysis

```r
data(ToothGrowth)
str(ToothGrowth)
library(ggplot2)
```

```
'data.frame':	60 obs. of  3 variables:
 $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
```

We see that the data frame has 60 observations of 3 variables: len, supp and dose. From the documentation, "len" is the length of odontoblasts, "supp" is the supplement type (ascorbic acid, coded as VC, or orange juice, coded as OJ) and "dose" is the dose in milligrams/day.

#### Sample size with each supplement type and dose amount

```r
message('Table 1: Sample Size')
size=list()
# Count observations per dose level and per supplement type
size$dose0.5=sum(ToothGrowth$dose==0.5)
size$dose1=sum(ToothGrowth$dose==1)
size$dose2=sum(ToothGrowth$dose==2)
size$VC=sum(ToothGrowth$supp=="VC")
size$OJ=sum(ToothGrowth$supp=="OJ")
size=as.data.frame(size)
row.names(size)=c("Sample Size")
# Column labels follow the order in which the counts were added above
names(size)=c('0.5 mg/day','1 mg/day','2 mg/day','ascorbic acid (VC)','orange juice (OJ)')
size
```

Table 1: Sample Size

| | 0.5 mg/day | 1 mg/day | 2 mg/day | ascorbic acid (VC) | orange juice (OJ) |
|---|---|---|---|---|---|
| Sample Size | 20 | 20 | 20 | 30 | 30 |

As shown in Table 1, the sample sizes are equal across dose levels and across supplement types. Let's visualize the data using ggplot2.

```r
# Treat dose as a categorical variable for plotting
ToothGrowth$dose=as.factor(ToothGrowth$dose)
ggplot(ToothGrowth, aes(x=dose,y=len))+
  geom_boxplot(aes(fill = dose))+
  ggtitle('Fig.1. Tooth Growth Dependence on Dose and Supplement ')+
  facet_grid(.~supp)+
  theme(axis.title.y = element_text(colour="gray20",size=12,angle=90,hjust=.5,vjust=1),
        axis.title.x = element_text(colour="gray20"),
        plot.title = element_text(vjust=1.5,size = 12,colour="purple"),
        axis.text.x = element_text(colour="red",size=10,angle=45,hjust=.5, vjust=.5))
```

From Fig. 1, we observe that tooth length increases with dose for both supplement types. Further, the supplement type appears to influence tooth length when the dose is 0.5 or 1.0 mg/day, whereas at 2.0 mg/day the tooth lengths for the two supplement types seem similar. We will use hypothesis tests to check whether these observations from Fig. 1 are statistically significant.

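
The visual impression from Fig. 1 can also be checked numerically before any testing. The following is a minimal sketch, not part of the original notebook: it tabulates the mean length and the number of animals in each supplement/dose cell. The names `tg`, `cell_means` and `cell_counts` are illustrative, and a fresh copy of the data set is used so that the `ToothGrowth` object from the cells above is left untouched.

```r
# Work on a fresh copy so the ToothGrowth object used elsewhere is not modified
tg <- datasets::ToothGrowth

# Mean odontoblast length for every supplement/dose combination
cell_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
cell_means

# Number of animals in every supplement/dose combination
cell_counts <- table(tg$supp, tg$dose)
cell_counts
```

If the pattern in the boxplots holds, the means should rise with dose within each supplement type.
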
### Hypothesis Test

Let's use hypothesis tests to explore the significance of the impacts of dose amount and supplement type on tooth growth. We assume unequal variances because that is the safer choice when in doubt. Moreover, we assume that the samples are independent. So, we will use the Welch two-sample t-test. The null hypothesis of this test is that there is no difference between the population means. If the p-value of this test is smaller than 0.05, we will reject the null hypothesis in favor of the alternative hypothesis, which states that the two means are different.

### Confidence intervals for the mean

```r
message('Table 2: 95% confidence interval for mean teeth \n length for each dose amount and supplement')
conf_int95=list()
# One-sample t intervals for the mean length within each group
conf_int95$len_dose0.5=round(t.test(ToothGrowth$len[ToothGrowth$dose==0.5])$conf.int,2)
conf_int95$len_dose1=round(t.test(ToothGrowth$len[ToothGrowth$dose==1.0])$conf.int,2)
conf_int95$len_dose2=round(t.test(ToothGrowth$len[ToothGrowth$dose==2.0])$conf.int,2)
conf_int95$OJ=round(t.test(ToothGrowth$len[ToothGrowth$supp=='OJ'])$conf.int,2)
conf_int95$VC=round(t.test(ToothGrowth$len[ToothGrowth$supp=='VC'])$conf.int,2)
conf_int95=as.data.frame(conf_int95)
row.names(conf_int95)=c('Lower Limit','Upper Limit')
names(conf_int95)=c('0.5 mg/day','1.0 mg/day','2.0 mg/day','OJ','VC')
conf_int95
```

Table 2: 95% confidence interval for mean teeth length for each dose amount and supplement

| | 0.5 mg/day | 1.0 mg/day | 2.0 mg/day | OJ | VC |
|---|---|---|---|---|---|
| Lower Limit | 8.50 | 17.67 | 24.33 | 18.20 | 13.88 |
| Upper Limit | 12.71 | 21.80 | 27.87 | 23.13 | 20.05 |

From the 95% confidence intervals for the means shown in Table 2, we see that the intervals for the three dose amounts do not overlap, i.e., as the dose increases, length increases significantly. For the supplement types, however, the intervals overlap. This suggests that length is not significantly different for the two supplement types, although overlap alone does not establish this; the formal test is given in the P-values section.

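
To make clear what the intervals in Table 2 represent, the sketch below recomputes one of them by hand from the usual formula, mean ± t-quantile × standard error, and compares it with the interval returned by `t.test`. This is an illustration rather than part of the original analysis; `tg`, `x`, `manual_ci` and `builtin_ci` are names introduced here.

```r
# Fresh copy of the data; dose is numeric here
tg <- datasets::ToothGrowth
x <- tg$len[tg$dose == 0.5]        # lengths at the lowest dose

n <- length(x)
se <- sd(x) / sqrt(n)              # standard error of the mean
t_crit <- qt(0.975, df = n - 1)    # two-sided 95% critical value
manual_ci <- mean(x) + c(-1, 1) * t_crit * se

# Interval reported by the built-in one-sample t-test
builtin_ci <- as.numeric(t.test(x)$conf.int)

round(manual_ci, 2)
round(builtin_ci, 2)
```

Both intervals should agree with each other and, after rounding, with the 0.5 mg/day column of Table 2.
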
The effect of supplement type does, however, differ across doses, as shown in Table 3. The supplement type has a significant impact on tooth length at doses of 0.5 and 1.0 mg/day but not at 2.0 mg/day. At doses of 0.5 and 1.0 mg/day, tooth length with orange juice is significantly greater than with ascorbic acid.

```r
message('Table 3: 95% confidence interval for mean\n teeth length for each supplement and dose combination')
conf_int95=list()
# Orange juice (OJ) at each dose
conf_int95$OJ0.5=round(t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==0.5])$conf.int,2)
conf_int95$OJ1=round(t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==1])$conf.int,2)
conf_int95$OJ2=round(t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==2])$conf.int,2)
# Ascorbic acid (VC) at each dose
conf_int95$VC0.5=round(t.test(ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==0.5])$conf.int,2)
conf_int95$VC1=round(t.test(ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==1])$conf.int,2)
conf_int95$VC2=round(t.test(ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==2])$conf.int,2)
conf_int95=as.data.frame(conf_int95)
row.names(conf_int95)=c('Lower Limit','Upper Limit')
names(conf_int95)=c('OJ and 0.5 mg/day','OJ and 1 mg/day','OJ and 2 mg/day','VC and 0.5 mg/day','VC and 1 mg/day','VC and 2 mg/day')
conf_int95
```

Table 3: 95% confidence interval for mean teeth length for each supplement and dose combination

| | OJ and 0.5 mg/day | OJ and 1 mg/day | OJ and 2 mg/day | VC and 0.5 mg/day | VC and 1 mg/day | VC and 2 mg/day |
|---|---|---|---|---|---|---|
| Lower Limit | 10.04 | 19.9 | 24.16 | 6.02 | 14.97 | 22.71 |
| Upper Limit | 16.42 | 25.5 | 27.96 | 9.94 | 18.57 | 29.57 |

### P-values

```r
message("Table 4: P-values for Welch's Two Sample T-test\n teeth length for each supplement and dose combination")
p_values=list()
# Pairwise comparisons between dose levels
p_values$dose0.5vs1=t.test(ToothGrowth$len[ToothGrowth$dose==0.5],ToothGrowth$len[ToothGrowth$dose==1])$p.value
p_values$dose0.5vs2=t.test(ToothGrowth$len[ToothGrowth$dose==0.5],ToothGrowth$len[ToothGrowth$dose==2])$p.value
p_values$dose1vs2=t.test(ToothGrowth$len[ToothGrowth$dose==1.0],ToothGrowth$len[ToothGrowth$dose==2])$p.value
# Comparison between supplement types, pooled over doses
p_values$VCvsOJ=t.test(ToothGrowth$len[ToothGrowth$supp=="VC"],ToothGrowth$len[ToothGrowth$supp=="OJ"])$p.value
p_values=as.data.frame(p_values)
row.names(p_values)=c('P-value')
names(p_values)=c('0.5 mg/day vs 1.0 mg/day','0.5 mg/day vs 2.0 mg/day','1.0 mg/day vs 2.0 mg/day','VC vs OJ')
p_values
```

Table 4: P-values for Welch's Two Sample T-test teeth length for each supplement and dose combination

| | 0.5 mg/day vs 1.0 mg/day | 0.5 mg/day vs 2.0 mg/day | 1.0 mg/day vs 2.0 mg/day | VC vs OJ |
|---|---|---|---|---|
| P-value | 1.268301e-07 | 4.397525e-14 | 1.90643e-05 | 0.06063451 |

```r
message("Table 5: P-values for Welch's Two Sample T-test\n teeth length for each supplement and dose combination")
p_values=list()
# OJ vs VC within each dose level
p_values$OJ_VC_0.5 = t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==0.5],
                            ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==0.5])$p.value
p_values$OJ_VC_1 = t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==1],
                          ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==1])$p.value
p_values$OJ_VC_2 = t.test(ToothGrowth$len[ToothGrowth$supp=="OJ" & ToothGrowth$dose==2],
                          ToothGrowth$len[ToothGrowth$supp=="VC" & ToothGrowth$dose==2])$p.value

p_values=as.data.frame(p_values)
row.names(p_values)=c('P-value')
names(p_values)=c('VC vs OJ with 0.5 mg/day','VC vs OJ with 1.0 mg/day','VC vs OJ with 2.0 mg/day')
p_values
```

Table 5: P-values for Welch's Two Sample T-test teeth length for each supplement and dose combination

| | VC vs OJ with 0.5 mg/day | VC vs OJ with 1.0 mg/day | VC vs OJ with 2.0 mg/day |
|---|---|---|---|
| P-value | 0.006358607 | 0.001038376 | 0.9638516 |
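
The three comparisons in Table 5 can also be written more compactly with the formula interface of `t.test`, which runs Welch's test by default (`var.equal = FALSE`). The following is a sketch under that assumption and is not the code used to produce the table; `tg`, `doses` and `p_by_dose` are illustrative names.

```r
# Fresh copy of the data; dose is numeric here
tg <- datasets::ToothGrowth
doses <- c(0.5, 1, 2)

# One Welch two-sample t-test of len by supplement type within each dose level
p_by_dose <- sapply(doses, function(d) {
  t.test(len ~ supp, data = tg[tg$dose == d, ])$p.value
})
names(p_by_dose) <- paste0("OJ vs VC at ", doses, " mg/day")
p_by_dose
```

Because the test is two-sided, the order of the two groups does not affect the p-value, so these values should match the entries of Table 5.
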

Consistent with the confidence intervals for the means shown in Table 2, the p-values for all pairwise dose comparisons are very small (well below 0.05; Table 4), so we reject the null hypothesis of equal means for each pair of doses. Together with the intervals in Table 2, this shows that mean tooth length increases with dose. The p-value for the supplement types in Table 4 is greater than 0.05, so we cannot reject the null hypothesis that the two supplements give the same mean length when all doses are pooled. From Table 5, however, we observe that supplement type has a significant impact on tooth length at doses of 0.5 mg/day and 1.0 mg/day but not at 2.0 mg/day.
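
For readers who want to see what the Welch test computes, the sketch below reproduces the pooled OJ vs VC comparison by hand: the t statistic uses the two sample variances separately, and the degrees of freedom come from the Welch-Satterthwaite approximation. It is an illustration added here, not part of the original notebook, and all variable names are introduced for the example.

```r
# Fresh copy of the data
tg <- datasets::ToothGrowth
a <- tg$len[tg$supp == "OJ"]
b <- tg$len[tg$supp == "VC"]

# Variance of each sample mean (no pooling of variances)
va <- var(a) / length(a)
vb <- var(b) / length(b)

# Welch t statistic
t_stat <- (mean(a) - mean(b)) / sqrt(va + vb)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (va + vb)^2 / (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))

# Two-sided p-value
p_manual <- 2 * pt(-abs(t_stat), df = df_welch)

c(t = t_stat, df = df_welch, p = p_manual)
```

The resulting p-value should agree with `t.test(a, b)$p.value`, i.e. with the "VC vs OJ" entry of Table 4.
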

### Conclusions

In this analysis, basic statistical inference is used to investigate the impacts of different doses of vitamin C (0.5, 1.0, and 2.0 mg/day) and of the supplement type (orange juice or ascorbic acid) on tooth length. Welch's two-sample t-test, which does not assume equal variances, is used for the hypothesis tests because unequal variance is the safer assumption when in doubt. Moreover, we assume that the samples are independent. The basic conclusions from this analysis are:

- Dose amount has a significant impact on tooth length. As the dose increases, tooth length increases.

- With all doses pooled, the difference between the mean tooth lengths for the two supplements is not significantly different from zero at the 5% significance level.

- The supplement type has a significant impact on tooth length at doses of 0.5 and 1.0 mg/day. At these doses, tooth length with orange juice is significantly greater than with ascorbic acid.

- However, at a dose of 2.0 mg/day, the mean tooth lengths with the two supplements are not significantly different from each other. Because the p-value is above the 0.05 significance level, we do not reject the null hypothesis, which states that the difference of the true means is zero.