Issue
I am currently working on a global optimization problem, and to speed up the calculations I want to include Jacobian information in the local optimization. My function is not simple, but it should be differentiable, so I want to use PyTorch for the automatic differentiation, much like it is done here: https://bnikolic.co.uk/blog/pytorch/python/2021/02/28/pytorch-scipyminim.html
import numpy
import scipy, scipy.optimize
import torch

def minim(obs, f, p0):
    """Fit function f to observations "obs" starting at p0"""
    def fitfn(pars):
        # NB the requires_grad parameter specifies that we want to
        # differentiate with respect to the parameters
        pars = torch.tensor(pars,
                            requires_grad=True)
        y = f(pars)
        # Simple least-squares fitting
        res = ((obs - y)**2).sum()
        res.backward()
        # Note that the gradient is taken from the "pars" variable
        return res.data.cpu().numpy(), pars.grad.data.cpu().numpy()
    res = scipy.optimize.minimize(fitfn,
                                  p0,
                                  method="BFGS",
                                  jac=True)  # NB: fitfn also returns the jacobian
    return res

# Points on which observations are made
X = torch.arange(100)

def exfn(p):
    y = torch.sin(p[0]*X) + torch.cos(p[1]*X)
    return y

# Sample observations
yp = exfn(torch.tensor([0.3, 0]))

# Sample run
minim(yp, exfn, numpy.array([0.34, 0.01]))
The above works by passing the keyword jac=True to the minimizer, which tells it that the function returns not only the objective value but also the Jacobian when evaluated. The problem is that I need to use a global optimizer, and the framework does not seem to support this: I can't see any way to tell the global optimizer that my function returns not just the value but also the Jacobian.
With dual annealing it would look something like this:
dual_annealing(f, bounds, args=(x, y0), minimizer_kwargs={'jac': True})
but this will not work, since minimizer_kwargs only tells the local minimizer that the Jacobian is included; the global optimizer that is wrapped around the local minimizer still expects the function to return a plain scalar.
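For context (an editorial sketch, not part of the original question): dual_annealing only accepts an objective that returns a plain scalar, so gradient information can only reach the local minimizer, for example as a separate callable passed through minimizer_kwargs. A minimal sketch of that workaround, reusing the toy model from above (the bounds, method and maxiter are chosen purely for illustration):

import numpy
import scipy.optimize
import torch

# Same toy model as in the question
X = torch.arange(100, dtype=torch.float64)

def exfn(p):
    return torch.sin(p[0] * X) + torch.cos(p[1] * X)

obs = exfn(torch.tensor([0.3, 0.0], dtype=torch.float64))

def value(pars):
    # Scalar least-squares objective for the global phase
    pars = torch.tensor(pars, dtype=torch.float64)
    return float(((obs - exfn(pars)) ** 2).sum())

def grad(pars):
    # Gradient for the local minimizer, computed with autograd
    pars = torch.tensor(pars, dtype=torch.float64, requires_grad=True)
    res = ((obs - exfn(pars)) ** 2).sum()
    res.backward()
    return pars.grad.numpy()

bounds = [(0.0, 1.0), (-0.5, 0.5)]
result = scipy.optimize.dual_annealing(
    value,
    bounds,
    maxiter=100,  # keep the example run short
    minimizer_kwargs={"method": "L-BFGS-B",
                      "jac": grad,
                      "bounds": bounds},
)
print(result.x, result.fun)

The drawback is that grad repeats the forward pass that value has already done at the same point, so some work is duplicated; the caching idea mentioned in the solution below addresses exactly that.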
Is there any workaround for this problem? Or does anyone have any alternative frameworks that might work?
Solution
I ended up having a good discussion about this issue with one of the SciPy contributors on GitHub.
The thread can be seen here: https://github.com/scipy/scipy/issues/19305
The thread also brings up Nick ODell's suggestion, and it shows a neat trick of caching the derivative results, which in some cases can be handy (a rough sketch of the idea is given below).
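As an illustration of that caching idea (a rough sketch, not the exact code from the GitHub thread), the value and gradient can be computed together in a single forward/backward pass and cached for the last parameter vector, so the paired fun/jac calls made by the local minimizer do not repeat the forward pass:

import numpy
import scipy.optimize
import torch

# Toy model from the question
X = torch.arange(100, dtype=torch.float64)

def exfn(p):
    return torch.sin(p[0] * X) + torch.cos(p[1] * X)

obs = exfn(torch.tensor([0.3, 0.0], dtype=torch.float64))

class CachedObjective:
    """Compute value and gradient together and cache them for the last point."""

    def __init__(self):
        self._x = None
        self._val = None
        self._grad = None

    def _compute(self, x):
        # Only recompute when x differs from the cached point
        if self._x is None or not numpy.array_equal(x, self._x):
            pars = torch.tensor(x, dtype=torch.float64, requires_grad=True)
            res = ((obs - exfn(pars)) ** 2).sum()
            res.backward()
            self._x = numpy.array(x, copy=True)
            self._val = float(res)
            self._grad = pars.grad.numpy().copy()

    def fun(self, x):
        # Scalar value, used by both the global phase and the local minimizer
        self._compute(x)
        return self._val

    def jac(self, x):
        # Gradient, served from the cache when fun(x) was just evaluated
        self._compute(x)
        return self._grad

objective = CachedObjective()
bounds = [(0.0, 1.0), (-0.5, 0.5)]
result = scipy.optimize.dual_annealing(
    objective.fun,
    bounds,
    maxiter=100,  # keep the example run short
    minimizer_kwargs={"method": "L-BFGS-B",
                      "jac": objective.jac,
                      "bounds": bounds},
)
print(result.x, result.fun)

Note the trade-off: the cache only pays off when fun and jac are requested at the same point, which is what the local minimizer does, while the purely global moves only need the value and therefore pay for a backward pass they never use. That is exactly the kind of wasted calculation discussed in the conclusion below.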
But overall the conclusion is that the approach you should use depends on the specific global and local optimizers you are using: in some cases you can write something with no wasted calculations, while in other cases there will be some wasted forward or derivative evaluations.
Answered By - tbolind