Smart Gradient is a technique to improve the accuracy of numerical gradient computations, especially in iterative optimization settings.
Abstract: Computing the gradient of a function provides fundamental information about its behavior. This information is essential for several applications and algorithms across various fields. One common application that requires gradients is optimization, through techniques such as stochastic gradient descent, Newton’s method, and trust-region methods. However, these methods often compute gradients numerically at every iteration, which is prone to numerical error.
We propose a simple limited-memory technique for improving the accuracy of these gradients by exploiting: (1) a coordinate transformation of the gradient, and (2) the history of previously taken descent directions. The method is implemented both in C++ and as an R package, and is verified empirically on benchmark test functions and real data applications.
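The two ingredients above can be illustrated with a minimal sketch. It is not the package's implementation: the helper names `basis_from_history` and `directional_grad` are hypothetical, and the orthonormal basis is built here with a QR decomposition as an assumed stand-in for whatever orthogonalization the package actually uses. Finite differences are taken along the basis directions, and the result is mapped back to the original coordinates.

```r
# Hypothetical helper (not part of smartGrad): orthonormal basis of R^n.
# When the columns of `steps` are linearly independent, the leading columns
# of the basis span them; identity columns are appended to complete the basis.
basis_from_history <- function(steps) {
  n <- nrow(steps)
  qr.Q(qr(cbind(steps, diag(n))))
}

# Hypothetical helper: central differences along the columns of an
# orthonormal matrix G, transformed back to the original coordinates.
directional_grad <- function(fun, x, G, h = 1e-3) {
  d <- numeric(ncol(G))
  for (i in seq_len(ncol(G))) {
    d[i] <- (fun(x + h * G[, i]) - fun(x - h * G[, i])) / (2 * h)
  }
  # For orthonormal G, the gradient in the original coordinates is G %*% d
  as.vector(G %*% d)
}

# Toy quadratic whose exact gradient at c(1, 2, 3) is c(4, 5, 6)
f <- function(x) sum(x^2) + x[1] * x[2]
x <- c(1, 2, 3)
steps <- cbind(c(1, 0, 1), c(0, 1, 1))  # two made-up previous steps
G <- basis_from_history(steps)
g <- directional_grad(f, x, G)
```

Because `G` is orthonormal, transforming back only requires a matrix-vector product. In an optimizer, `steps` would be refreshed with the most recent differences between iterates, which is an assumption about how the history is used rather than a description of the package internals.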
Below is an example showing how to use smartGrad in R. First, install the package from GitHub:
library("devtools")
install_github("esmail-abdulfattah/Smart-Gradient",
               subdir = "smartGrad")
We use the Extended Rosenbrock function as an example:
# Extended Rosenbrock function: sum over consecutive coordinate pairs
myfun <- function(x) {
  res <- 0.0
  for (i in 1:(length(x) - 1)) {
    res <- res + 100 * (x[i + 1] - x[i]^2)^2 + (1 - x[i])^2
  }
  return(res)
}
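# Central-difference gradient of fun at x, one coordinate at a time
mygrad <- function(fun, x) {
  h <- 1e-3                      # finite-difference step size
  grad <- numeric(length(x))
  for (i in seq_along(x)) {
    e <- numeric(length(x))      # i-th unit vector
    e[i] <- 1
    grad[i] <- (fun(x + h * e) - fun(x - h * e)) / (2 * h)
  }
  return(grad)
}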
library("stats")
library("smartGrad")
x_dimension <- 5
x_initial <- rnorm(x_dimension)
result <- optim(par = x_initial,
                fn = myfun,
                gr = makeSmart(fn = myfun, gr = mygrad),
                method = "BFGS")
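To get a feel for the numerical error that motivates the method, the plain central-difference gradient can be compared against the analytic gradient of the Extended Rosenbrock function. The sketch below is self-contained and does not use smartGrad; the names `rosen`, `rosen_grad`, and `central_grad` are hypothetical, and the evaluation point `x0` is an arbitrary choice.

```r
# Extended Rosenbrock function, vectorized form of the loop used above
rosen <- function(x) {
  n <- length(x)
  sum(100 * (x[-1] - x[-n]^2)^2 + (1 - x[-n])^2)
}

# Analytic gradient, derived term by term from the sum
rosen_grad <- function(x) {
  n <- length(x)
  g <- numeric(n)
  d <- x[-1] - x[-n]^2
  g[-n] <- -400 * x[-n] * d - 2 * (1 - x[-n])  # derivative w.r.t. x[i]
  g[-1] <- g[-1] + 200 * d                     # derivative w.r.t. x[i + 1]
  g
}

# Central differences along the coordinate axes
central_grad <- function(fun, x, h = 1e-3) {
  vapply(seq_along(x), function(i) {
    e <- numeric(length(x))
    e[i] <- 1
    (fun(x + h * e) - fun(x - h * e)) / (2 * h)
  }, numeric(1))
}

x0 <- c(-1.2, 1, -1.2, 1, -1.2)
# Largest absolute deviation between numerical and analytic gradient
err <- max(abs(central_grad(rosen, x0) - rosen_grad(x0)))
```

The truncation error of central differences scales with h^2, so `err` is small but not zero at h = 1e-3. The same comparison could be repeated with the function returned by `makeSmart` once the package is installed.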