octave-maintainers

Re: Contribution to the optimization toolbox


From: Michael Creel
Subject: Re: Contribution to the optimization toolbox
Date: Fri, 4 Sep 2009 16:13:23 +0200

On Fri, Sep 4, 2009 at 2:53 PM, Jaroslav Hajek <address@hidden> wrote:
>
> Here is what I get testing your script, using optim-1.0.6 and
> development octave.
>
> ans =
>
>   0.131749   0.999000   0.055406   0.970000
>
> Since the 5D Rosenbrock function has two local optima, I'm not exactly
> surprised about the 3% failures for fminunc.
> I'd rather say it's interesting that bfgsmin is always successful,
> despite the local optimum.

That's a good point. bfgsmin occasionally does a Hessian reset when it
runs into trouble, so it sometimes switches to steepest descent. Maybe
that's the reason.
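
To illustrate what I mean (just a sketch of the idea, not the actual
bfgsmin code; H_inv and g here stand for the running inverse-Hessian
approximation and the current gradient):

# sketch only, not bfgsmin internals
g = [2; -1; 0.5];          # gradient at the current iterate
H_inv = eye(length(g));    # inverse-Hessian approximation after a reset
d = -H_inv * g;            # quasi-Newton search direction
# with H_inv equal to the identity, d is just -g, a steepest descent step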

>
> Another interesting thing is that I got the times reversed. Maybe you
> used a newer version of bfgsmin?

I'm using the latest development version of bfgsmin, which hasn't had
changes recently. I don't know when optim-1.0.6 was released, but I
doubt that there is anything different - certainly there have been no
major changes. I would guess that the changes in the relative timings
have more to do with differences between the Octave 3.2.3 release
candidate and the development version. I'm impressed with the timings
for fminunc using the development version! The fact that an .m file
implementation can be so fast is very interesting.

>
> Does it do any checking against local optima?

No, just the Hessian reset when the approximate Hessian is not
sufficiently positive definite.
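
Something along these lines, I mean (again just a sketch, not the
actual bfgsmin source; H stands for the current Hessian approximation
and the tolerance is made up):

# sketch: detect an approximate Hessian that is not (sufficiently)
# positive definite and fall back to the identity
H = [1 2; 2 1];            # an indefinite example
[R, p] = chol(H);          # p > 0 if the Cholesky factorization fails
if p > 0 || rcond(H) < 1e-12
        H = eye(rows(H));  # reset => the next step is steepest descent
endif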

> Also, I think bfgsmin somehow attempts to auto-discover the analytic
> gradient, doesn't it? If so, does it happen in this case?

I think that is the case here; I had forgotten about that. The
following code causes bfgsmin to use numeric gradients:

######################## new version ##################
1;
# example obj. fn.: returns only the objective value, no analytic
# gradient, so bfgsmin falls back to numeric differentiation
function obj_value = obj(x)
        obj_value = rosenbrock(x);
endfunction

dim = 5;
replications = 1000;
results = zeros(replications,4); # columns: bfgsmin time, bfgsmin success,
                                 #          fminunc time, fminunc success
ub = 2;
lb = 0;
control = {100,0};               # bfgsmin controls: iteration limit 100, verbosity 0

for i = 1:replications
        x = (ub-lb).*rand(dim,1) + lb;
        tic;
        [theta, obj_value, convergence] = bfgsmin("obj", {x}, control);
        results(i,1) = toc;
        results(i,2) = norm(theta - ones(dim,1)) < 1e-5;

        tic;
        [theta, obj_value] = fminunc("obj", x);
        results(i,3) = toc;
        results(i,4) = norm(theta - ones(dim,1)) < 1e-5;
endfor

mean(results)
######################## end new version ##################


With this, I get
octave:1> compare
ans =

   0.030752   1.000000   0.082802   0.970000

octave:2> compare
ans =

   0.029723   1.000000   0.081364   0.964000




>
> Interesting stuff starts to happen if I shift the optimum of the
> objective so that x_opt(1) = 100:

That is a pretty brutal thing to do to an optimization algorithm! The
problem would be detected by looking at the gradient at the start
value, and proper scaling would solve it, I believe. Poor scaling is a
user error rather than a problem with the algorithm, I would say.
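
To illustrate what I mean by rescaling (a sketch only, nothing from
bfgsmin or fminunc; the exact transformation used to move the optimum
isn't shown above, so obj_shifted below is just a stand-in for a badly
scaled problem with its optimum at [100 1 1 1 1]'):

########## scaling sketch ##########
1;
# assumed badly scaled objective (for illustration only)
function v = obj_shifted(x)
        v = rosenbrock([x(1)/100; x(2:end)]);
endfunction

# same objective over rescaled variables y = x ./ s, so every component
# of the optimum is O(1) again
function v = obj_scaled(y, s)
        v = obj_shifted(y .* s);
endfunction

dim = 5;
s = [100; ones(dim-1,1)];        # guessed characteristic scale of each variable
x0 = [50; 0.5*ones(dim-1,1)];    # a start value in the original units
y0 = x0 ./ s;                    # start value in scaled units
[y, v, conv] = bfgsmin("obj_scaled", {y0, s}, {100, 0});
x = y .* s                       # map the minimizer back to original units
########## end scaling sketch ##########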

>
> ans =
>
>   0.515509   0.010000   0.191053   0.060000
>
>
> Clearly, neither code reacts well. Hmmm. Perhaps I can improve fminunc
> a bit in this regard.

Great. But to facilitate testing and to avoid having redundant
algorithms, it would be nice to have a set of test problems to work
with. For example, this little test shows that fminunc has become a
lot faster with the development version of Octave, and it could
probably be made more robust with a little work. I'll have a look at
the fminunc code and see if I can make some suggestions. If it becomes
as robust as bfgsmin, and is faster, then bfgsmin will probably need
to be retired. Of course, some other test problems would need to be
tried, too.

Cheers, Michael


