In [1]:

```
noquote(paste('Mean of simulation =',round(mean(rexp(1000,rate=0.2)),2)))
noquote(paste('Variance of simulation =',round((sd(rexp(1000,rate=0.2)))^2,2)))
```

Out[1]:

Out[1]:

Now, let's simulate 40 exponentials 1000 times and draw the histogram of the means of each 40 exponentials and demonstrate the central limit theorem. Fig.2 shows the histogram of the means of 1000 fourty exponentials.

The CLT theorem states that the sampling mean is approximately normally distributed with mean of the population mean (five in this case, because lambda=0.2 and for exponential distribution mean =1/lambda=1/0.2=5) and variance of of the population variance divided by the sample size (40 in our case). In short the sampling distribution is N(Î¼,Ïƒ2/n). Figure 2 clearly demonstartes the CLT theorem because it is bell-shaped and it is approximately centrered at 5. Let's callculate mean and variance of a sampling ditribution of 1000 means from exponenetials of size 40 and compare it with what the CLT tells us.

In [2]:

```
means = NULL
for (i in 1 : 1000) {means = c(means, mean(rexp(40,rate=0.2)))}
noquote(paste('Mean of sampling distribution =',round(mean(means),2)))
noquote(paste('Variance of sampling distribution =',round(sd(means)^2,2)))
```

Out[2]:

Out[2]:

We can also compare the sampling distribution with a standard normal (a normal distribution with mean=0 and std=1). Our interest is to check that $\frac{\bar X_n - \mu}{\sigma / \sqrt{n}}$ is approximately a standard normal . Here $\bar X_n$ is the mean of each 40 simulations, $\mu$ is the population mean (five in our case since lambda=0.2 and mean=1/lambda for exponential distribution) and $\sigma$ is the population standard deviation (which is also five for an exponential distribution with lambda=0.2).

Fig. 3. shows a standardized sampling distribution. As the CLT states, we see that it is bell-shaped and cental at about zero (hence a standard normal). Let's calculate mean and variance of a standardized sampling ditribution of 1000 means from exponenetials of size 40 and compare it with what the CLT tells us.

In [3]:

```
Z = NULL
for (i in 1 : 1000) {
se <-5/sqrt(40)
x= mean(rexp(40,rate=0.2))
x=(x-5)/se
Z = c(Z,x )}
noquote(paste('Mean of standardized sampling distribution =',round(mean(Z),2)))
noquote(paste('Variance of standardized sampling distribution =',round(sd(Z)^2,2)))
```

Out[3]:

Out[3]:

In [11]:

```
Z= rexp(1000, rate=0.2)
hist(Z, breaks=30, col="skyblue", xlab=" ",
main="Fig. 1. Sample Distribution",cex.main=1)
abline(v=5,col='purple',lwd=2)
abline(v=mean(Z),col='blue')
```

In [14]:

```
means = NULL
for (i in 1 : 1000) {means = c(means, mean(rexp(40,rate=0.2)))}
h<-hist(means, breaks=20, col="skyblue", xlab="Sample Mean", main="Fig. 2. Sampling Distribution",cex.main=1)
xfit<-seq(min(means),max(means),length=40)
yfit<-dnorm(xfit,mean=5,sd=sd(means)) #sd here is the standard error
yfit <- yfit*diff(h$mids[1:2])*length(means) # getting a multiplier from the hist object
lines(xfit, yfit, col="red", lwd=2)
abline(v=5,col='purple',lwd=2)
abline(v=mean(means),col='blue',lwd=2)
```

In [15]:

```
Z = NULL
for (i in 1 : 1000) {
se <-5/sqrt(40)
x= mean(rexp(40,rate=0.2))
x=(x-5)/se
Z = c(Z,x )}
h<-hist(Z, breaks=20, col="skyblue", xlab="Sample Mean", main="Fig. 3. Standardized Sampling Distribution ",cex.main=1)
xfit<-seq(min(Z),max(Z),length=1000)
yfit<-dnorm(xfit,mean=0,sd=sd(Z))
yfit <- yfit*diff(h$mids[1:2])*length(Z)
lines(xfit, yfit, col="red", lwd=2)
abline(v=0,col='purple',lwd=2)
abline(v=mean(Z),col='blue')
```