# Nonlinear Regression in R

The examples in this post is taken from a class I took at the University of Missouri. These examples are originally provided in SAS and I translated them to R.

# Non-linear Regression with Available Starting Values

The velocity of a chemical reaction ($$y$$) is modeled as a function of the concentration of the chemical ($$x$$). There are a total of 18 observations. The desired model is

$y_i = \frac{\theta_0 x_i}{\theta_1 + x_i} + \epsilon_i.$

To find reasonable starting values for $$\theta_0$$ and $$\theta_1$$, we take the inverse of the model expression (ignoring the error term) and fit the ordinary least-squares regression:

$\frac{1}{y_i} = \frac{\theta_1 + x_i}{\theta_0 x_i} = \frac{1}{\theta_o} + \frac{\theta_1}{\theta_0} (\frac{1}{x_i}).$

## Descriptive analysis

enzyme <- read.table("datasets/enzyme.dat")
names(enzyme) <- c('y', 'x')
y x
1 2.1 1.0
2 2.5 1.5
3 4.9 2.0
4 5.5 3.0
5 7.0 4.0
6 8.4 5.0
7 9.6 6.0
8 10.2 7.5
9 11.4 8.5
10 12.5 10.0
11 13.1 12.5
12 14.6 15.0
13 17.0 17.5
14 16.8 20.0
15 18.6 25.0
16 19.7 30.0
17 21.3 35.0
18 21.6 40.0
# plot the data
require(ggplot2)
p <- ggplot(data = enzyme, aes(x, y)) + geom_point() +
labs(y = "Reaction Velocity", x = "Concentration")
p

## Perform an OLS regression to find starting values

ols_fit <- lsfit(1/enzyme$x, 1/enzyme$y)
ols_fit$coefficients ## Intercept X ## 0.03375868 0.45401397 So the starting values for $$\theta_0$$ and $$\theta_1$$ are given by: $\theta_0^{(0)} = 1 / \beta_0 = \frac{1}{0.03375868} = 29.62$ $\theta_1^{(0)} = \beta_1 / \beta_0 = \frac{0.45401397}{0.03375868} = 13.45$ ## Perform a non-linear regression with Gauss-Newton method ### Fit the model The nls() function in stats package performs nonlinear (weighted) least-square estimates of the parameter of a nonlinear model. It can use a formula object to specify a model and any user-specified function can be used in the model. Because nonlinear models are sometimes complicated, here we showed how to use a customized nlmodel() function to specify the model. To match the SAS output, trace = T is set to print out iteration history. Note that the first column corresponding to the objective function and the other columns corresponding to the parameter estimates for each iteration. nlmodel <- function(x, theta0, theta1) theta0 * x / (theta1 + x) nls_fit <- nls(y ~ nlmodel(x, theta0, theta1), data = enzyme, start = list(theta0 = 29.62, theta1 = 13.45), trace = T) ## 6.57116 : 29.62 13.45 ## 4.303542 : 28.14228 12.59796 ## 4.302271 : 28.13785 12.57534 ## 4.302271 : 28.13708 12.57449 ## 4.302271 : 28.13705 12.57445 nls_fit ## Nonlinear regression model ## model: y ~ nlmodel(x, theta0, theta1) ## data: enzyme ## theta0 theta1 ## 28.14 12.57 ## residual sum-of-squares: 4.302 ## ## Number of iterations to convergence: 4 ## Achieved convergence tolerance: 4.304e-07 ### Parameter estimates Parameter estimates and other model summaries can be obtained using the summary() function. sm <- summary(nls_fit, correlation = T) sm$coefficients  # coefficients and their significance
##        Estimate Std. Error  t value     Pr(>|t|)
## theta0 28.13705  0.7279790 38.65091 3.137221e-17
## theta1 12.57445  0.7630534 16.47913 1.850253e-11
sm$correlation # correlation matrix of parameters ## theta0 theta1 ## theta0 1.0000000 0.9366248 ## theta1 0.9366248 1.0000000 To get individual confidence intervals on parameter estimates, we need to know the standard error of the estimates as well as the degree of freedom of the estimates of $$\sigma^2$$. That can be obtained from df.residual(). rdf <- df.residual(nls_fit) # C.I. for theta0 sm$coefficients[1, 1] + qt(c(.025, .975), rdf) * sm$coefficients[1, 2] ## [1] 26.5938 29.6803 # C.I. for theta1 sm$coefficients[2, 1] + qt(c(.025, .975), rdf) * sm$coefficients[2, 2] ## [1] 10.95685 14.19205 ### Leverage Values To compute the leverage, we need $$F$$ and $$(F'F)^{-1}$$. The tangent plane hat matrix is $$H = F(F'F)^{-1}F'$$. The leverage values are the diagonal elements of the hat matrix. cf <- nls_fit$m$gradient() # cf: capital f matrix sm$cov.unscaled  # (F'F)^(-1)
##          theta0   theta1
## theta0 1.970879 1.934914
## theta1 1.934914 2.165370
ch <- cf %*% sm$cov.unscaled %*% t(cf) # diagonal of hat matrix kable(diag(ch), row.names = T, col.names = c("LEV")) LEV 1 0.0176537 2 0.0328108 3 0.0484048 4 0.0759531 5 0.0956214 6 0.1072992 7 0.1124450 8 0.1117837 9 0.1080296 10 0.1003662 11 0.0882287 12 0.0818877 13 0.0831978 14 0.0919111 15 0.1271639 16 0.1783176 17 0.2379108 18 0.3010149 ### Studentized residual plots The studentized residual is computed as $r_i = \frac{e_i}{s \sqrt{1 - \hat{H}_{ii}} }$ rstudent <- residuals(nls_fit) / (sm$sigma * sqrt(1 - diag(ch)))
enzyme <- data.frame(enzyme, rstudent)
qplot(y, rstudent, data = enzyme, geom = "point", xlab = "Reaction Velocity")

qplot(x, rstudent, data = enzyme, geom = "point", xlab = "Concentration")

### Confidence intervals for $$Y$$

The confidence interval is computed by $\hat{Y}_0 \pm t_{(n-p, 1-\alpha/2)} s [f_0'(F'F)^{-1}f_0]^{1/2} .$ Note that if we want confidence interval at original $$x$$ values $$x_i$$, the definition for $$f_0$$ here is just the transpose of $$F(\theta, x_i)$$.

y0 <- fitted(nls_fit)
se <- apply(t(cf), 2, function(f0) {
sm$sigma * {t(f0) %*% sm$cov.unscaled %*% f0}^(.5)
})

ll <- fitted(nls_fit) + qt(.025, df.residual(nls_fit)) * se
ul <- fitted(nls_fit) + qt(.975, df.residual(nls_fit)) * se
ci <- data.frame(enzyme, ll, ul)

p + stat_function(fun = nlmodel, args = as.list(coef(nls_fit))) +
geom_smooth(aes(ymin = ll, ymax = ul), data = ci, stat="identity")

# prediction intervals are similar
se <- apply(t(cf), 2, function(f0) {
sm$sigma * {1 + t(f0) %*% sm$cov.unscaled %*% f0}^(.5)
})

ll <- fitted(nls_fit) + qt(.025, df.residual(nls_fit)) * se
ul <- fitted(nls_fit) + qt(.975, df.residual(nls_fit)) * se
pi <- data.frame(enzyme, ll, ul)

p + stat_function(fun = nlmodel, args = as.list(coef(nls_fit))) +
geom_smooth(aes(ymin = ll, ymax = ul), data = ci, stat="identity") +
geom_smooth(aes(ymin = ll, ymax = ul), data = pi, stat="identity")