Chapter 6 part III

Problem 1

For generating \(Y \sim ~N(0,1)\) using an Accept/Reject algorithm, we could generate \(U \sim uniform\) and \(V \sim exponential ( \lambda)\), and attach a random sign to V(+/- each with eqaul probability). The resulting V is the Double exponential familiy with density \(g(x) = \lambda e^{-\lambda\mid x \mid}/2\) for \(- \infty < x < \infty\).

a) Show that for any \(\lambda> 0\) the double exponential density and the standard normal density intersect at a value of x >0 and a value of x < 0. This implies that the double exponential family does not dominate the standard normal density (By symmerty, it is sufficient to show this for x >0).

Suppose X > 0, set the standard normal and the double exponential equal to each other to find the X, at which they interesect.

\[ (2\pi)^{-1/2}e^{-x^2/2} = \frac{\lambda e^{-\lambda x}}{2} \]

Take the natural log of both sides and move everything to one side.

\[ \frac{x^2}{2} - \lambda x+ ln(\lambda) - ln(2) - \frac{ln(2 \pi)}{2} = 0 \]

Multiply by 2 and simplify.

\[ x^2 - 2 \lambda x+ ln(\frac{\lambda ^{2} \pi}{2}) = 0 \]

Applying the quadratic formula, the solution to x is,

\[ x = \lambda \pm \sqrt{\left( \lambda^2 - ln(\frac{\lambda \pi}{2})\right)} \]

The inside of the square root is greater than or equal to 0 for all \(\lambda\). This implies that for x>0 there exist at least one point for which the standard normal and the double exponential are equal. When the inside of the square root is non-zero, then this implies that the two functions intersect at two different points. By symmetery, this result applys to x < 0.

b) In light of part(a), to have an envelope we need to set \(e(x) = cg(x)\), for some c > 1. Find an optimal c for \(\lambda = 1\) that minimizes the number of rejections.

In order to minimize the number of rejections we much find a \(c\), such that, \(f(x) = cg(x)\) only has two solutions for x, for \(- \infty < x < \infty\). That is, the cg(x) sits on top of f(x) and is only tangant to f(x) at two points, so that \(f(x) \leq g(x)\) for \(- \infty < x < \infty\).

We do this by equating, \(f(x) = cg(x)\), for \(\lambda = 1\). We’ll compute c for x>0.

\[ (2\pi)^{-1/2}e^{-x^2/2} = \frac{c e^{-x}}{2} \] Take the natural log of both sides and move everything to one side.

\[ \frac{x^2}{2} - x+ ln(c) - ln(2) - \frac{ln(2 \pi)}{2} = 0 \]

Multiply by 2 and simplify.

\[ x^2 - 2x+ ln(\frac{c \pi}{2}) = 0 \]

Apply the quadratic equation and simplify.

\[ x = 1 \pm \sqrt{\left(1- ln(\frac{c^2 \pi}{2}) \right)} \]

In order to minimize the number of rejections, we must choose c so that x = 1. In order to do this, we must solve the bottom equation.

\[ 1-ln(\frac{c^2 \pi}{2}) = 0 \]

Solving for c in this equation yields,

\[ c = \sqrt{\left(\frac{2 e^{1}}{\pi}\right)} \]

c) Apply the Accept/Reject algorithm to generate X from the standard normal distribution, by generating \(U \sim uniform(0,1)\) and using

An envelope of Cauchy random variable
An envelope of double exponential random variable.
Compare the algorithms. Which one do you recommend?

Here is a plot of a standard normal distribution with a envelope of 1.6*cauchy(0,1)

test = seq(-6,6,by =.001) 
stdnorm = dnorm(test,0,1)
cauchy = 1.6*dcauchy(test)

We’ll apply the Accept/Reject algorithm with 10,000 samples from cauchy(0,1)

n = 10000
samp.chy = rcauchy(n)
u = runif(n)

f = dnorm(x = samp.chy) 
g = 1.6*dcauchy(samp.chy) #envelope
X = samp.chy[which(u <(f/g))] # accepting 
accepted.chy = length(X)# number of accepted points

The number of accepted points under cauchy(0,1) is: 6286

Now we’ll calculate an envelope of double exponential. We will use dlaplace from the rmulti package for our double exponential distribution.

test = seq(-6,6,by =.001) 
library(rmutil)
# testing our dbexp with dlaplace
y = dlaplace(test)
c = sqrt((2*exp(1))/pi)
z = c*y

n = 10000
samp.dbexp = rlaplace(n)
u= runif(n)

f = dnorm(x = samp.dbexp)
c = sqrt((2*exp(1))/pi) # optimal c
g = c*dlaplace(samp.dbexp) # envelope
X = samp.dbexp[which(u <(f/g))] # accepting 
accepted.dbexp = length(X) # number of accepted points

The number of accepted points under double exponential is: 7614

Upon inspection, I would recommend the use of the double exponential because it rejected less sample points and was therefore more efficient.

	Accepted
Cauchy	6286
Double Accepted	7614

Problem 2: Do problem 6.4 a) of the text.

Below is the code for obtaining the posterior distribution through SIR.

coal = read.table("/Users/josegonzalez/Documents/R/coal.dat", header = TRUE)
m = 10000 # number of draws from prior
theta = sample(c(1:111), size = m, replace = TRUE)
a1 = rgamma(m,10,10); a2 = rgamma(m,10,10);
lam1 = rgamma(m,3,a1); lam2 = rgamma(m,3,a2)

parm = cbind(theta, lam1, lam2)


like = function(par, X = coal$disasters){
  par = as.numeric(par)
  # par is a row containing: theta_i, lam1_i, lam2_i
  t = par[1]; l1 = par[2]; l2 = par[3];
  temp1 = X[1:t]; temp2 = X[(t+1):112];
  l = -t*l1 + log(l1)*sum(temp1) - (112-t)*l2 + log(l2)*sum(temp2) # loglike
  exp(l) #likelihood
}

l = apply(parm, MARGIN = 1, function(x) like(x))


weights = l/sum(l)
resamp = sample(c(1:m), m, replace = TRUE, prob = weights)
post = parm[resamp, ]

The posterior mean’s for theta, lambda1, and lambda2 are:

# post mean for parameters, histogram of theta, and credible interval for theta
post_mean = apply(post, 2, mean)

	Posterior Mean
theta	40.3984000
lam1	3.0942197
lam2	0.9900696

Below is a histogram of the posterior distribution of theta.

The 95% credible interval is found using quantile. The interval is in between the following two numbers.

	Theta
2.5%	36
97.5%	46

The original sampled points for lamba1 and lambda2 are plotted in black against each other, while the resampled points are plotted in red.

The initial and resampling sample sizes are the same, 10000. The number of unique points, tables of counts, and highest observed frequency in resample are as follows:


Number of Unique Points	19

Observation.Var1	Observation.Freq
30	3
32	3
33	60
34	40
35	30
36	604
37	1638
38	314
39	90
40	1276
41	3708
42	609
43	341
44	481
45	17
46	611
47	141
48	21
49	13

	Observation.with.Maximum.Frequency
41	0.3708