A scientist studied the effects of a very strong storm on trees in a Minnesota forest as a function of tree diameter and the severity of the winds at each tree's location. The data for \(n = 659\) trees are given at: http://mathfaculty.fullerton.edu/mori/data/math534/blowBF.txt
\[ \begin{aligned} D &= \text{Diameter of the tree.}\\ S &= \text{Severity of the storm at the location of the tree, on a scale of 0 to 1, with 1 being most severe.} \\ y &= \text{0 if the tree survived and 1 if the tree was blown down.}\\ \end{aligned} \]
Here, \(y_1, \dots, y_n\) is a set of \(n\) observations with \(y_i \sim \text{Bernoulli}(\pi_i)\). We are to estimate \(\pi_i = P(y_i = 1)\), the probability that a tree is blown down, as a function of \(D\) and \(S\). In particular, we parameterize this probability by the logit function:
\[ \pi_i(\beta) = \frac{\exp(x_{i}^{T}\beta)}{1 + \exp(x_{i}^{T}\beta)} \quad \text{for } i = 1,\dots, n \]
where \(x_{i}^{T} = (x_{i1}, x_{i2}, x_{i3})\), with \(x_{i1} = 1\) the intercept, \(x_{i2} = \log(D)\), and \(x_{i3} = S\), and \(\beta = (\beta_1,\beta_2, \beta_3)^{T}\) is the vector of 3 parameters to be estimated.
The log-likelihood of the observed data is \[ \begin{aligned} \ell(\beta) &= \sum_{i=1}^{n} y_i\log(\pi_i(\beta)) + (1-y_i)\log(1 - \pi_i(\beta)) \\ &= \sum_{i=1}^{n} y_i x_{i}^{T}\beta - \log(1 + \exp(x_{i}^{T}\beta)), \end{aligned} \] since \(\log \pi_i(\beta) = x_{i}^{T}\beta - \log(1 + \exp(x_{i}^{T}\beta))\) and \(\log(1 - \pi_i(\beta)) = -\log(1 + \exp(x_{i}^{T}\beta))\).
First differential of \(\pi_i(\beta)\): \[ d\pi_i(d\beta) = \frac{(x_{i}^{T}d\beta)\, e^{x_{i}^{T}\beta}}{(1 + e^{x_{i}^{T}\beta})^{2}} \]
Second differential of \(\pi_i(\beta)\): \[ dd\pi_i(d\beta,d\beta) = \frac{(x_{i}^{T}d\beta)^{2}\, e^{x_{i}^{T}\beta}}{(1 + e^{x_{i}^{T}\beta})^{2}} - \frac{2\,(x_{i}^{T}d\beta)^{2}\, e^{2x_{i}^{T}\beta}}{(1 + e^{x_{i}^{T}\beta})^{3}} \]
\[ \begin{aligned} d\ell(d\beta) &= \sum_{i = 1}^{n} \left[y_i\frac{1}{1+e^{x_{i}^{T}\beta}}x_{i}^{T}d\beta - (1-y_i)\frac{e^{x_{i}^{T}\beta}}{1+e^{x_{i}^{T}\beta}}x_{i}^{T}d\beta\right]\\ &= \sum_{i = 1}^{n} \left(y_i - \frac{e^{x_{i}^{T}\beta}}{1+e^{x_{i}^{T}\beta}}\right)x_{i}^{T}d\beta\\ &= \sum_{i =1}^{n} (y_i - \pi_i(\beta))x_{i}^{T}d\beta = V d\beta \end{aligned} \] where \(V = \sum_{i =1}^{n} (y_i - \pi_i(\beta))x_{i}^{T}\) is a \(1 \times 3\) row vector; in matrix form, \(V^{T} = X^{T}(y - \pi(\beta))\). This implies that the partial derivatives \(\partial\ell(\beta)/\partial\beta_i\) are \[ \frac{\partial\ell(\beta)}{\partial\beta_i} = V_i \quad \text{for } i = 1,2,3. \]
datat <- read.table("http://mathfaculty.fullerton.edu/mori/data/math534/blowBF.txt", header = TRUE)
attach(datat)
n = length(datat$y)
X = cbind(rep(1, n), log(datat$D), datat$S)
Y = matrix(datat$y, nrow = n, ncol = 1)
# likelihood
likemvn <- function(datat, B){
  n = length(datat$y)
  X = cbind(rep(1, n), log(datat$D), datat$S)
  Y = matrix(datat$y, nrow = n, ncol = 1)
  sum = 0
  for(i in 1:n){
    ex <- as.numeric(exp(X[i,] %*% B)) # compute exp(x_i^T B) once per observation
    sumnd <- Y[i]*log(ex/(1+ex)) + (1-Y[i])*log(1 - ex/(1+ex))
    sum = sum + sumnd
  }
  l = sum
  grad = grad(X, Y, B, n) # gradient from the grad() function defined below
  list(l = l, grad = grad)
}
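The loop above mirrors the summation in the derivation. As a cross-check, here is an equivalent vectorized version (a sketch, not used by the routines below); it computes \(\pi_i\) with R's built-in plogis(), which equals \(\exp(x_{i}^{T}\beta)/(1+\exp(x_{i}^{T}\beta))\) but avoids overflow in exp():

loglik_vec <- function(X, Y, B){
  p <- as.vector(plogis(X %*% B)) # pi_i(beta) for all i at once
  sum(Y*log(p) + (1 - Y)*log(1 - p))
}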
# gradient
grad = function(X, Y, B, n){
  sumg = 0
  for(i in 1:n){
    ex <- as.numeric(exp(X[i,] %*% B))
    sumndg <- (Y[i] - ex/(1+ex))*X[i,] # (y_i - pi_i) x_i^T
    sumg = sumg + sumndg
  }
  sumg <- matrix(sumg, 3, 1)
  return(sumg)
}
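As a quick sanity check on the analytic gradient, we can compare it against a central finite difference of the log-likelihood. This is a sketch; B_test is an arbitrary test point, not anything from the original analysis:

B_test <- matrix(c(-1, 0.5, 0.5), 3, 1) # arbitrary test point
num_grad <- sapply(1:3, function(j){
  h <- 1e-6; e <- matrix(0, 3, 1); e[j] <- h
  (likemvn(datat, B_test + e)$l - likemvn(datat, B_test - e)$l)/(2*h)
})
cbind(analytic = as.vector(likemvn(datat, B_test)$grad), numeric = num_grad)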
# steepest ascent
optmstp <- function(datat, B, maxit){
it = 0
condition = FALSE
res <- list() # creating list to store values
while(it < maxit & condition == FALSE){
like = rep("NA", 22)
grad_norm = rep("NA", 22)
it = it + 1
halve = 0
a = likemvn(datat, B)
like[1] <- formatC(a$l, digits = 4, format = "f") # formatted log-likelihood at the current B
grad_norm[1] <- format(norm(a$grad, type = "2"), scientific = TRUE, digits = 2) # formatted gradient norm at the current B
B1 = B + a$grad # Steepest Ascent
atmp = likemvn(datat, B1)
# Following prevents the function from stopping if it approaches NaNs
while(is.nan(atmp$l) == TRUE & halve <20){
halve = halve + 1
B1 = B + a$grad/2^halve
atmp = likemvn(datat, B1)
}
like[halve+2] <- formatC(atmp$l, digits =4, format = "f")
grad_norm[halve+2] <- format(norm(atmp$grad, type = "2"), scientific = TRUE, digits = 2)
# Step-halving
while(atmp$l < a$l & halve <20){
halve = halve + 1
B1 = B + a$grad/2^halve
atmp = likemvn(datat, B1)
like[halve+2] <- formatC(atmp$l, digits =4, format = "f")
grad_norm[halve+2] <- format(norm(atmp$grad, type = "2"),scientific = TRUE, digits = 2)
}
mre_norm <- norm((B1-B)/pmax(1,abs(B1)), type = "I") # maximum relative error (MRE) for this iteration
if(halve >= 20){
print(paste("Step Halving failed after 20 atempts", it))
condition = TRUE
}
if(mre_norm <= 10^(-4)){
print(paste('Convergence Criteria Met', mre_norm, it))
condition = TRUE
}
values <- cbind(rep(it, halve + 2), c("NA", seq.int(0, halve)),
like[1:(halve + 2)], grad_norm[1:(halve + 2)], c(rep("NA", halve + 1), mre_norm))
dimnames(values) <- list(rep("", (halve+2)),c("IT", "Halving","Log-Likelihood","||gradient||", "MRE"))
res[[it]] <- values
B = B1
}
print(res[1:5], quote = FALSE) # prints first 5
print(res[(length(res)-2):length(res)], quote = FALSE) # prints last 3
mleB <<- B1 # save the MLE in the global environment for later use
}
B <- matrix(0,3,1)
maxit <- 1200
optmstp(datat, B, maxit)
## [1] "Convergence Criteria Met 8.30780454975805e-05 607"
## [[1]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 1 NA -456.7840 1.7e+02 NA
## 1 0 -104399.5101 6.3e+02 NA
## 1 1 -52199.7551 6.3e+02 NA
## 1 2 -26099.8775 6.3e+02 NA
## 1 3 -13049.9388 6.3e+02 NA
## 1 4 -6524.9694 6.3e+02 NA
## 1 5 -3262.4913 6.3e+02 NA
## 1 6 -1632.9088 6.3e+02 NA
## 1 7 -845.4744 5.7e+02 NA
## 1 8 -533.3509 3.7e+02 NA
## 1 9 -451.1653 1.4e+02 0.274532877479257
##
## [[2]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 2 NA -451.1653 1.4e+02 NA
## 2 0 NA NA NA
## 2 1 NA NA NA
## 2 2 NA NA NA
## 2 3 NA NA NA
## 2 4 -8034.4810 9.7e+02 NA
## 2 5 -3858.5437 9.7e+02 NA
## 2 6 -1779.9833 9.5e+02 NA
## 2 7 -826.6247 7.5e+02 NA
## 2 8 -509.8254 3.8e+02 NA
## 2 9 -444.1664 1.2e+02 0.264789493142006
##
## [[3]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 3 NA -444.1664 1.2e+02 NA
## 3 0 -71781.6175 6.3e+02 NA
## 3 1 -35905.0973 6.3e+02 NA
## 3 2 -17966.8372 6.3e+02 NA
## 3 3 -8997.7071 6.3e+02 NA
## 3 4 -4513.1422 6.3e+02 NA
## 3 5 -2271.0244 6.3e+02 NA
## 3 6 -1158.2708 6.2e+02 NA
## 3 7 -653.5970 4.9e+02 NA
## 3 8 -480.2722 2.8e+02 NA
## 3 9 -440.1489 1e+02 0.184019118125658
##
## [[4]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 4 NA -440.1489 1e+02 NA
## 4 0 NA NA NA
## 4 1 NA NA NA
## 4 2 NA NA NA
## 4 3 NA NA NA
## 4 4 -5545.0425 9.7e+02 NA
## 4 5 -2635.5846 9.6e+02 NA
## 4 6 -1214.8275 8.9e+02 NA
## 4 7 -632.0816 6e+02 NA
## 4 8 -467.2475 2.7e+02 NA
## 4 9 -435.7997 9e+01 0.190815471290533
##
## [[5]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 5 NA -435.7997 9e+01 NA
## 5 0 -51165.8741 6.3e+02 NA
## 5 1 -25605.5854 6.3e+02 NA
## 5 2 -12825.4411 6.3e+02 NA
## 5 3 -6435.3689 6.3e+02 NA
## 5 4 -3240.3379 6.3e+02 NA
## 5 5 -1644.1610 6.3e+02 NA
## 5 6 -868.5748 5.8e+02 NA
## 5 7 -548.9491 4.1e+02 NA
## 5 8 -452.6562 2.1e+02 NA
## 5 9 -432.5543 7.9e+01 0.126895198259121
##
## [[1]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 605 NA -281.9534 9.7e-02 NA
## 605 0 -282.2357 1.8e+01 NA
## 605 1 -282.0219 9e+00 NA
## 605 2 -281.9694 4.5e+00 NA
## 605 3 -281.9568 2.2e+00 NA
## 605 4 -281.9540 1.1e+00 NA
## 605 5 -281.9534 5.4e-01 0.000304229562372205
##
## [[2]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 606 NA -281.9534 5.4e-01 NA
## 606 0 -363.5547 3e+02 NA
## 606 1 -302.3803 1.5e+02 NA
## 606 2 -287.0193 7.6e+01 NA
## 606 3 -283.1999 3.8e+01 NA
## 606 4 -282.2556 1.9e+01 NA
## 606 5 -282.0243 9e+00 NA
## 606 6 -281.9688 4.3e+00 NA
## 606 7 -281.9561 1.9e+00 NA
## 606 8 -281.9535 6.7e-01 NA
## 606 9 -281.9531 1.1e-01 0.000314638923434959
##
## [[3]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 607 NA -281.9531 1.1e-01 NA
## 607 0 -283.0717 3.6e+01 NA
## 607 1 -282.2311 1.8e+01 NA
## 607 2 -282.0213 8.9e+00 NA
## 607 3 -281.9695 4.4e+00 NA
## 607 4 -281.9569 2.2e+00 NA
## 607 5 -281.9539 1.1e+00 NA
## 607 6 -281.9532 5.1e-01 NA
## 607 7 -281.9531 2.3e-01 8.30780454975805e-05
|    | MLE |
|----|-----------|
| b0 | -9.508914 |
| b1 | 3.177876 |
| b2 | 4.486720 |

These estimates differ slightly from the Fisher-scoring estimates below because steepest ascent was stopped at the looser MRE tolerance of \(10^{-4}\).
The second differential of \(\ell(\beta)\) is \[ dd\ell(d\beta,d\beta) = \sum_{i =1}^{n} -x_{i}^{T}d\beta\left(\frac{e^{x_{i}^{T}\beta}}{(1+e^{x_{i}^{T}\beta})^2}\right)x_{i}^{T}d\beta \] Let \(W_i=\frac{e^{x_{i}^{T}\beta}}{(1+e^{x_{i}^{T}\beta})^2}\), a scalar. Then \(-E[dd\ell(d\beta,d\beta)]\) is \[ -E[dd\ell(d\beta,d\beta)] = \sum_{i =1}^{n} (x_{i}^{T}d\beta)\, W_i\, (x_{i}^{T}d\beta) \]
Furthermore, for \(k = l\), \[ \frac{\partial^{2}\ell}{\partial \beta_{k}\partial \beta_{l}} = -\sum_{i =1}^{n} x_{ik}\, W_i\, x_{ik} \] For \(k \neq l\), \[ \frac{\partial^{2}\ell}{\partial \beta_{k}\partial \beta_{l}} = -\sum_{i =1}^{n} x_{ik}\, W_i\, x_{il} \] This implies that the entries of the Fisher information matrix are, for both \(k = l\) and \(k \neq l\), \[ - E\left[\frac{\partial^{2}\ell}{\partial \beta_{k}\partial \beta_{l}}\right] = \sum_{i =1}^{n} x_{ik}\, W_i\, x_{il} \] We compute the Fisher information matrix from this single formula, since it covers both cases \(k \neq l\) and \(k = l\).
# Fisher Information Matrix
fish <- function(datat, B){
  n = length(datat$y)
  X = cbind(rep(1, n), log(datat$D), datat$S)
  f_mat <- matrix(0, 3, 3)
  for(k in 1:3){
    for(l in 1:3){
      for(i in 1:n){
        ex <- as.numeric(exp(X[i,] %*% B))
        sumnd <- X[i,k]*(ex/(1+ex)^2)*X[i,l] # x_ik W_i x_il with W_i = e^(x'B)/(1+e^(x'B))^2
        f_mat[k,l] <- f_mat[k,l] + sumnd
      }
    }
  }
  return(f_mat)
}
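Equivalently, the triple loop computes \(I(\beta) = X^{T} W X\) with \(W = \mathrm{diag}(W_1, \dots, W_n)\) and \(W_i = \pi_i(1-\pi_i)\). A vectorized sketch of the same matrix, useful for cross-checking fish():

fish_vec <- function(X, B){
  p <- as.vector(plogis(X %*% B)) # pi_i(beta)
  crossprod(X, p*(1 - p)*X)       # t(X) %*% diag(p(1-p)) %*% X without forming diag
}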
# Fisher-Scoring
optmfish <- function(datat, B, maxit){
it = 0
condition = FALSE
res <- list()
while(it < maxit & condition == FALSE){
like = rep("NA", 22)
grad_norm = rep("NA", 22)
it = it + 1
halve = 0
a = likemvn(datat, B)
like[1] <- formatC(a$l, digits =4, format = "f")
grad_norm[1] <- format(norm(a$grad, type = "2"), scientific = TRUE, digits = 2) # formatted gradient norm at the current B
# Fisher Info
I = fish(datat,B)
invI = solve(I)
B1 = B + invI%*%a$grad
atmp = likemvn(datat, B1)
while(is.nan(atmp$l) == TRUE & halve <20){
halve = halve + 1
B1 = B + invI%*%a$grad/2^halve # Fisher Scoring
atmp = likemvn(datat, B1)
}
like[halve+2] <- formatC(atmp$l, digits =4, format = "f")
grad_norm[halve+2] <- format(norm(atmp$grad, type = "2"), scientific = TRUE, digits = 2)
while(atmp$l < a$l & halve <20){
halve = halve + 1
B1 = B + invI%*%a$grad/2^halve
atmp = likemvn(datat, B1)
like[halve+2] <- formatC(atmp$l, digits =4, format = "f")
grad_norm[halve+2] <- format(norm(atmp$grad, type = "2"),scientific = TRUE, digits = 2)
}
mre_norm <- norm((B1-B)/pmax(1,abs(B1)), type = "I")
if(halve >= 20){
print(paste("Step Halving failed after 20 atempts", it))
condition = TRUE
}
if(mre_norm <= 10^(-6)){
print(paste('Convergence Criteria Met', mre_norm, it))
condition = TRUE
}
values <- cbind(rep(it, halve + 2), c("NA", seq.int(0, halve)),
like[1:(halve + 2)], grad_norm[1:(halve + 2)], c(rep("NA", halve + 1), mre_norm))
dimnames(values) <- list(rep("", (halve+2)),c("IT", "Halving","Log-Likelihood","||gradient||", "MRE"))
res[[it]] <- values
B = B1
}
print(res[1:5], quote = FALSE)
print(res[(length(res)-2):length(res)], quote = FALSE)
mleB <<- B1 # save the MLE in the global environment for later use
}
B <- matrix(0,3,1)
maxit <- 1200
optmfish(datat,B,maxit)
## [1] "Convergence Criteria Met 9.74061166215013e-07 78"
## [[1]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 1 NA -456.7840 1.7e+02 NA
## 1 0 -347.7468 9e+01 1
##
## [[2]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 2 NA -347.7468 9e+01 NA
## 2 0 -314.6320 5.5e+01 0.348326346654852
##
## [[3]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 3 NA -314.6320 5.5e+01 NA
## 3 0 -300.6367 3.7e+01 0.182343471553454
##
## [[4]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 4 NA -300.6367 3.7e+01 NA
## 4 0 -293.4913 2.7e+01 0.114705559049553
##
## [[5]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 5 NA -293.4913 2.7e+01 NA
## 5 0 -289.4290 2e+01 0.0794840532645471
##
## [[1]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 76 NA -281.9505 4.3e-04 NA
## 76 0 -281.9505 3.8e-04 1.29565165419555e-06
##
## [[2]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 77 NA -281.9505 3.8e-04 NA
## 77 0 -281.9505 3.3e-04 1.12340704889714e-06
##
## [[3]]
## IT Halving Log-Likelihood ||gradient|| MRE
## 78 NA -281.9505 3.3e-04 NA
## 78 0 -281.9505 2.8e-04 9.74061166215013e-07
|    | MLE |
|----|-----------|
| b0 | -9.562026 |
| b1 | 3.197544 |
| b2 | 4.508565 |
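As a sanity check (not part of the original routines), R's built-in glm() maximizes the same likelihood by iteratively reweighted least squares, so its coefficients should closely match mleB; approximate standard errors come from the inverse of the Fisher information evaluated at the MLE:

fit <- glm(y ~ log(D) + S, family = binomial, data = datat)
coef(fit)                            # compare to mleB
sqrt(diag(solve(fish(datat, mleB)))) # approximate standard errors of the MLEs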
s = seq(0,1, length.out = 1000) # range for wind severity
p = rep(NA, 1000) # vector for probabilities
for(i in 1:length(s)){
x = matrix(c(1, log(10), s[i]), 1, 3) # diameter fixed at 10; severity ranges over [0, 1]
ex <- exp(x%*%mleB) # mleB is obtained from Fisher scoring
p[i] = ex/(1+ex) # This is P(y_i = 1)
}
plot(s,p, type = 'l', xlab = "Wind Severity", ylab = "Probability Tree is Blown Down")
The plot shows an almost linear relationship between wind severity and the probability that a tree of diameter 10 is blown down, as is logically expected. Further evidence for this can be seen by taking the mean value of \(y\) over the observed trees with diameter 10 and severity of at least .7; this empirical estimate is similar to the probabilities on the plot.
d10 <- datat[which(datat$D == 10),]
pred = mean(d10[which(d10$S>=.7),]$y)
pred = matrix(pred, 1, 1, dimnames = list(c('Probability 10 Diameter Tree Falls'), c("Prediction")))
library(knitr)      # kable()
library(kableExtra) # kable_styling() and the %>% pipe
pred %>%
kable() %>%
kable_styling(full_width = F)
|  | Prediction |
|---|---|
| Probability 10 Diameter Tree Falls | 0.8333333 |
d = seq(5,30)
p = rep(NA, length(d)) # range of tree diameter
for(i in 1:length(d)){
x = matrix(c(1, log(d[i]), .4), 1, 3) # severity held constant at .4; diameter varies
ex <- exp(x%*%mleB) # mleB is obtained from Fisher scoring
p[i] = ex/(1+ex) # This is P(y_i = 1)
}
plot(d,p, type = 'l', xlab = "Tree Diameter", ylab = "Probability Tree is Blown Down")
The relationship between tree diameter and the probability of being blown down is much steeper than the one found in part (h). Once the diameter of a tree reaches about 10, its chance of being blown down increases at a much higher rate. This is logically sound, as wider trees have more surface area, which exposes them to more wind damage. The efficacy of this graph in explaining the data can also be demonstrated by calculating, from our data, the average diameter of a blown-down tree at severities near .40; it coincides with the diameter at which the fitted probability on the plot reaches .5 (roughly 12).
s40 <- datat[which(datat$S>= .39 & datat$S<=.41),]
pred = mean(s40[which(s40$y ==1),]$D)
pred = matrix(pred, 1,1, dimnames = list(c('Avg. Diameter Fallen Tree for Severity .40'), c("Prediction")))
pred %>%
kable() %>%
kable_styling(full_width = F)
|  | Prediction |
|---|---|
| Avg. Diameter Fallen Tree for Severity .40 | 11.4 |
d = seq(5,30)
s = seq(0,1,length.out = length(d))
prob <- function(d, s){
  len = length(d)
  r = numeric(len) # outer() passes d and s as equal-length vectors
  mleB <- c(-9.562026, 3.197544, 4.508565) # MLE from Fisher scoring above
  for(i in 1:len){
    x = matrix(c(1, log(d[i]), s[i]), 1, 3)
    ex <- exp(x%*%mleB)
    r[i] <- ex/(1+ex)
  }
  r
}
threed <- outer(d, s, prob) # rows indexed by diameter, columns by severity
library(plot3D)
# transpose threed so severity is on the x-axis and diameter on the y-axis, matching the labels
persp3D(x = s, y = d, z = t(threed), col = "red", alpha = .5, xlab = "Severity", ylab = "Diameter", zlab = "Probability Y = 1", main = "")
contour(x = s, y = d, z = t(threed), xlab = "Severity", ylab = "Diameter")
Although the 3D plot provides a unique visual of the joint effect of diameter and severity on the probability of a tree being blown down, the constant-value contours display their relationship more precisely. As the earlier single-variable plots demonstrated, a change in diameter has a more prominent effect on the probability that \(y = 1\) than a change in severity, and the probability responds to changes in severity much more sharply at larger diameters. For example, the probability that a tree of diameter 10 is blown down in severity-.8 winds is roughly equal to the probability that a tree of diameter about 23 is blown down in severity-.2 winds: both are approximately .8. Again, as larger trees have more surface area, changes in severity have much larger effects on their survival probabilities.
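That reading of the contours can be checked directly with the fitted coefficients. This is a sketch: invlogit is a small helper defined here (not from the original code), and mleB is the Fisher-scoring MLE from above:

invlogit <- function(z) exp(z)/(1 + exp(z))
invlogit(sum(c(1, log(10), 0.8)*mleB)) # roughly 0.80
invlogit(sum(c(1, log(23), 0.2)*mleB)) # roughly 0.80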