|
|
The optimization methods described in this chapter attempt to minimize a given
objective function. There are several ways to do this and COPASI
supports many different methods for the minimization of an objective
function. 4.1. Evolutionary Programming
Evolutionary programming (EP)
[Fogel92][Baeck93][Baeck97]
is a computational technique that mimics evolution and is based on
reproduction and selection. An EP algorithm is composed of individuals
that reproduce and compete, each one is a potential solution to the
(optimization) problem and is represented by a "genome" where each
gene corresponds to one adjustable parameter. At each generation of
the EP, each individual reproduces asexually, i.e. divides into two
individuals. One of these contains exactly the same "genome" as the
parent while the other suffers some mutations (the parameter values of
each gene change slightly). At the end of the generation, the
algorithm has double the number of individuals. Then each of the
individuals is confronted with a number of others to count how many
does it outperform (the number of wins is the number of these
competitors that represent worse solutions than itself). All the
individuals are ranked by their number of wins, and the population is
again reduced to the original number of individuals by eliminating
those which have worse fitness (solutions).
Options for Evolutionary Programming - Number of Generations
The parameter is a positive integer value to determine the number of
generations the algorithm shall evolve the population. The default is
'200'.
- Population Size
The parameter is a positive integer value to determine the size of the
population, i.e., the number of individuals that survive after each
generation. The default is '20'.
- Random Number Generator
The parameter is an enumeration value to determine which random number
generator this method shall use. COPASI provides two random number
generators R250 [Maier91]
(selected through the value 0) and the
Mersenne
Twister [Matsumoto98]
(selected through the value 1 (default)).
- Seed
The parameter is a positive integer value to determine the seed for the
random number generator. A value of zero instructs COPASI to select a
"random" value.
4.2. Evolutionary Strategy (SRES)
Evolutionary Strategies with Stochastic Ranking (SRES)
[Runarsson00] is similar to Evolutionary Programming. However, a parent has multiple
offsprings during each generation. Each offspring will contain a
recombination of genes with another parent and additional
mutations. The algorithm assures that each parameter value will be
within its boundaries. But constraints to the solutions may be
violated. Whenever this happens the square of the size of the
violation is summed up, i.e., we calculate
where the constraints are given by
.
The value
is used within
the selection, which is performed as described in Genetic Algorithm SR
Options for Evolutionary Strategy (SRES) - Number of Generations
The parameter is a positive integer value to determine the number of
generations the algorithm shall evolve the population. The default is
'200.'
- Population Size
The parameter is a positive integer value to determine the size of the
population, i.e., the number of individuals that survive after each
generation. The default is '20'.
- Random Number Generator
The parameter is an enumeration value to determine which random number
generator this method shall use. COPASI provides two random number
generators R250 [Maier91]
(selected through the value 0) and the
Mersenne
Twister [Matsumoto98]
(selected through the value 1 (default)).
- Seed
The parameter is a positive integer value to determine the seed for the
random number generator. A value of zero instructs COPASI to select a
"random" value.
- Pf
This parameter is a numerical value in the interval (0, 1) determining
the chance that individuals either outside the parameter boundaries or
violating the constraints are compared during the selection. The
default is '4.75'.
The genetic algorithm (GA)
[Baeck97][Baeck93][Michalewicz94][Mitchell95]
is a computational technique that mimics
evolution and is based on reproduction and selection. A GA is composed
of individuals that reproduce and compete, each one is a potential
solution to the (optimization) problem and is represented by a
"genome" where each gene corresponds to one adjustable
parameter. At each generation of the GA, each individual is paired
with one other at random for reproduction. Two offspring are produced
by combining their genomes and allowing for "cross-over",
i.e., the two new individuals have genomes that are formed from a
combination of the genomes of their parents. Also each new gene might
have mutated, i.e. the parameter value might have changed slightly. At
the end of the generation, the algorithm has double the number of
individuals. Then each of the individuals is confronted with a number
of others to count how many does it outperform (the number of wins is
the number of these competitors that represent worse solutions than
itself). All the individuals are ranked by their number of wins, and
the population is again reduced to the original number of individuals
by eliminating those which have worse fitness (solutions).
Many features of a GA may be varied. The details of this particular
implementation of the GA for optimization of biochemical kinetics are:
Parameters are encoded in genes using floating-point representation,
rather than the more usual binary representation.
Mutation is carried out by adding to the gene a random number drawn
from a normal distribution with zero mean and a standard deviation of
10% of the parameter value. Whenever this makes the parameter (gene)
exceed one boundary, it is set to that boundary value.
Cross-over is always performed at gene boundaries so that no gene is
ever disrupted. The number of cross-over points is a random number
between zero and half the number of adjustable parameters (uniform
distribution).
Selection is done by a tournament where each individual competes with
a number of others equal to 20% the population size. The competitors
are chosen at random.
The initial population contains one individual whose genes are the
initial parameter values, the genes of all other individuals are
initialized to a random value between their boundaries. If the
boundaries span two orders of magnitude or more, the random
distribution is exponential, otherwise normal.
Whenever the fittest individual has not changed for the last 10
generations, the 10% less fit individuals are replaced by individuals
with random genes. When the fittest individual has not changed for 30
generations, the worse 30% are substituted by individuals with random
genes. When the fittest individual has not changed for 50 generations,
the worse 50% are substituted by individuals with random genes. This
procedure helps the algorithm escape local minima and is somewhat
equivalent to increasing the mutation rate when the population has
become uniform.
Options for Genetic Algorithm - Number of Generations
The parameter is a positive integer value to determine the number of
generations the algorithm shall evolve the population. The default is
'200'.
- Population Size
The parameter is a positive integer value to determine the size of the
population, i.e., the number of individuals that survive after each
generation. The default is '20'.
- Random Number Generator
The parameter is an enumeration value to determine which random number
generator this method shall use. COPASI provides two random number
generators R250 [Maier91]
(selected through the value 0) and the
Mersenne
Twister [Matsumoto98]
(selected through the value 1 (default)).
- Seed
The parameter is a positive integer value to determine the seed for the
random number generator. A value of zero instructs COPASI to select a
"random" value.
4.4. Genetic Algorithm SR
The genetic algorithm with stochastic ranking is very similar to the
before described Genetic Algorithm with tournament
selection. With two exception which are the mutations are not forced
to be within the boundaries and the selection is done through a bubble
sort with a random factor as described in
[Runarsson00].
Parameters are encoded in genes using floating-point representation,
rather than the more usual binary representation.
Mutation is carried out by adding to the gene a random number drawn
from a normal distribution with zero mean and a standard deviation of
10% of the parameter value. Parameters may exceed boundaries. Whenever
this happens or a constraint to the solution is violated the square of
the size of the violation is summed up, i.e., we calculate
where the parameters are given by
and the constraints by
.
The value
is used within
the selection.
Cross-over is always performed at gene boundaries so that no gene is
ever disrupted. The number of cross-over points is a random number
between zero and half the number of adjustable parameters (uniform
distribution).
Selection is done by the bubble sort described in
[Runarsson00].
This sort incorporates a probability to compare
objective values for individuals with a
. The pseudo code for the sort
is:
// Here sweepNum is optimal number of sweeps from paper, i.e., TotalPopulation
for (i = 0; i < sweepNum; i++)
{
wasSwapped = false;
for (j = 0; j < TotalPopulation - 1; j++)
{
// within bounds or random chance
if ((phi(j) == 0 and phi(j + 1) == 0) or UniformRandom(0, 1) < Pf)
{
// compare objective function values
if (Value(j) > Value(j + 1))
{
swap(j, j + 1);
wasSwapped = true;
}
}
else // phi != 0
{
// individual j further outside then j + 1
if (phi(j) > phi(j + 1))
{
swap(j, j + 1);
wasSwapped = true;
}
}
}
// if no swap then break
if (wasSwapped == false) break;
}
The initial population contains one individual whose genes are the
initial parameter values, the genes of all other individuals are
initialized to a random value between their boundaries. If the
boundaries span two orders of magnitude or more, the random
distribution is exponential, otherwise normal.
Whenever the fittest individual has not changed for the last 10
generations, the 10% less fit individuals are replaced by individuals
with random genes. When the fittest individual has not changed for 30
generations, the worse 30% are substituted by individuals with random
genes. When the fittest individual has not changed for 50 generations,
the worse 50% are substituted by individuals with random genes. This
procedure helps the algorithm escape local minima and is somewhat
equivalent to increasing the mutation rate when the population has
become uniform.
Options for Genetic Algorithm SR - Number of Generations
The parameter is a positive integer value to determine the number of
generations the algorithm shall evolve the population. The default is
'200'.
- Population Size
The parameter is a positive integer value to determine the size of the
population, i.e., the number of individuals that survive after each
generation. The default is '20'.
- Random Number Generator
The parameter is an enumeration value to determine which random number
generator this method shall use. COPASI provides two random number
generators R250 [Maier91]
(selected through the value 0) and the
Mersenne
Twister [Matsumoto98]
(selected through the value 1 (default)).
- Seed
The parameter is a positive integer value to determine the seed for the
random number generator. A value of zero instructs COPASI to select a
"random" value.
- Pf
This parameter is a numerical value in the interval (0, 1) determining
the chance that individuals either outside the parameter boundaries or
violating the constraints are compared during the selection. The
default is '4.75'.
The method of Hooke and Jeeves
[Bell66][Hooke61][Kaupe63][Swann72]
is a direct search algorithm that
searches for the minimum of a nonlinear function without requiring (or
attempting to calculate) derivatives of the function. Instead it is
based on a heuristic that suggests a descent direction using the
values of the function calculated in a number of previous iterations.
Options for Hooke & Jeeves - Iteration Limit
This parameter is positive integer determining the maximum number of
iterations the algorithm shall perform. The default is '50'.
- Tolerance
This parameter is a positive value determining the tolerance with
which the solution shall be determined. If the improvement between two
steps is less than the tolerance the algorithm stops. The default is
'
'.
- Rho
This parameter is a value in (0, 1) determining the factor with which
the steps size is reduced between iterations. The default is '0.2'.
4.6. Levenberg - Marquardt
Levenberg-Marquardt [Levenberg44][Marquardt63] is a gradient descent method. It is a hybrid
between the steepest descent and the Newton methods.
The Newton optimization method searches for the minimum of a nonlinear
function by following descent directions determined from the
function's first and second partial derivatives. The steepest descent
method searches for a minimum based only on the first derivatives of
the function. While the Newton method converges quadratically towards
the minimum in its vicinity, it may not converge at all if it is far
away from it. On the other hand the steepest descent method only
converges linearly but is guaranteed to converge.
Levenberg first suggested an improvement to the Newton method in order
to make it more robust, i.e. to overcome the problem of
non-convergence. His suggestion was to add a factor to the diagonal
elements of the Hessian matrix of second derivatives when not close to
the minimum (this can be judged by how positive definite the matrix
is). The effect when this factor is large compared to the elements of
Hessian is that the method then becomes the steepest descent
method. Later Marquardt suggested that the factor should be
multiplicative rather than additive and also defined a heuristic to
make this factor increase or decrease. The method known as
Levenberg-Marquardt is thus an adaptive method that effectively
changes between the steepest descent to the Newton method.
The original suggestions of Levenberg and Marquardt were effective
to enhance the Gauss-Newton method, a variant of the Newton method
specifically for minimizing least-squares functions. In this case the
advantage is also that the second derivatives do not need to be
calculated as they are estimated from the gradient of the
residuals. Subsequently Goldfeld et
al. [Goldfeld66] extended the method to the
case of general non-linear functions.
Options for Levenberg - Marquardt - Iteration Limit
This parameter is positive integer determining the maximum number of
iterations the algorithm shall perform. The default is '200'.
- Tolerance
This parameter is a positive value determining the tolerance with
which the solution shall be determined. If the improvement between two
steps is less than the tolerance the algorithm stops. The default is
'
'.
This method also known as the simplex method is due to Nelder and Mead
[Nelder65]. A simplex is a polytope of
vertices in
dimensions. The objective function is
evaluated at each vertex. Dependent on these calculated values a new
simplex is constructed. The simplest step is to replace the worst
point with a point reflected through the centroid of the remaining
points. If this point is better than the best current point, then we
can try stretching exponentially out along this line. On the other
hand, if this new point isn't much better than the previous value then
we are stepping across a valley, so we shrink the simplex towards the
best point.
Options for Nelder - Mead - Iteration Limit
This parameter is a positive integer to determine the maximum number of
iterations the method is to perform. The default value is '200'.
- Tolerance
This parameter is a positive number and provides an alternative
termination criteria. If the variance of the values of the objective
function at the vertices of the current simplex is smaller than the
tolerance the algorithm stops. The default is '
'.
- Scale
This parameter is a positive number and determines the size of the
initial simplex. The edges of the polytope are inverse proportional to
the scale, i.e., a larger value makes the initial simplex smaller.
The default is '10'.
The particle swarm optimization method suggested by Kennedy and
Eberhart [Kennedy95] is inspired by a flock of
birds or a school of fish searching for food. Each particle has a
position
and a velocity
in the parameter space. Additionally, it remembers its best achieved
objective value
and position
. Dependent on its own information and the
position of its best neighbor (a random subset of particles of the
swarm) a new velocity is calculated. With this information the
position is updated. The pseudo code for the algorithm for each
particle is:
N = best neighbor's position
for i = 0 to number of parameters do
R1 = uniform random number in [0, 1]
R2 = uniform random number in [0, 1]
V[i]= w * V[i]
+ C * R1 * (M[i]- X[i])
+ C * R2 * (N[i]- X[i])
X[i] = X[i] + V[i]
enddo
current_O = evaluate objective function
if current_O < O then do
O = current_O
M = X
enddo
Options for Particle Swarm - Iteration Limit
This parameter is a positive integer to determine the maximum number of
iterations the method is to perform. The default value is '2000'.
- Swarm Size
This parameter is a positive integer specifying the number of
particles in the swarm. The default value is '50'.
- Std. Deviation
This parameter is a positive number and provides an alternative
termination criteria. If the standard deviation of the
values of the objective function of each particle and the standard
deviation of the best positions is smaller than the provided value the
algorithm stops. The default value is '
'.
- Random Number Generator
The parameter is an enumeration value to determine which random number
generator this method shall use. COPASI provides two random number
generators R250 [Maier91]
(selected through the value 0) and the
Mersenne
Twister [Matsumoto98]
(selected through the value 1 (default)).
- Seed
The parameter is a positive integer value to determine the seed for the
random number generator. A value of zero instructs COPASI to select a
"random" value.
Praxis is a direct search method (for a review see [Swann72]) that searches for the minimum of a nonlinear function without requiring (or attempting to calculate) derivatives of that function. Praxis was developed by Brent [Brent72] after the method proposed by Powell [Powel64]. The inspiration for Praxis was the well-known method of minimising each adjustable parameter (direction) at a time - the principal axes method. In Praxis directions are chosen that do not coincide with the principal axes, in fact if the objective function is quadratic then these will be conjugate directions, assuring a fast convergence rate.
This implementation of the Praxis method was originally written in FORTRAN at the Stanford Linear Accelerator Center (dated 3/1/73). The original FORTRAN code is available from http://www.netlib.org/opt/praxis.
The original code was translated automatically by the F2C program from AT&T Bell Labs (available at http://www.netlib.org/f2c/). The C code was then enhanced to suite COPASI's reentrant optimization method interface.
Options for Praxis - Tolerance
Convergence stopping criterium: the method stops when two consecutive estimates differ by less than Tolerance. This parameter is a positive integer to determine the number of parameter sets to be drawn before the algorithm stops. The default value is '
'.
Random search is an optimization method that attempts to find the
optimum by testing the objective function's value on a series of
combinations of random values of the adjustable parameters. The random
values are generated complying with any boundaries selected by the
user, furthermore, any combinations of parameter values that do not
fulfill constraints on the variables are excluded. This means that the
method is capable of handling bounds on the adjustable parameters and
fulfilling constraints.
For infinite number of iterations this method is guaranteed to find the
global optimum of the objective function. In general one is interested
in processing a very large number of iterations.
Options for Random Search - Number of Iterations
This parameter is a positive integer to determine the number of
parameter sets to be drawn before the algorithm stops. The default
value is '100000'.
- Random Number Generator
The parameter is an enumeration value to determine which random number
generator this method shall use. COPASI provides two random number
generators R250 [Maier91]
(selected through the value 0) and the
Mersenne
Twister [Matsumoto98]
(selected through the value 1 (default)).
- Seed
The parameter is a positive integer value to determine the seed for the
random number generator. A value of zero instructs COPASI to select a
"random" value.
4.11. Simulated Annealing
Simulated annealing is an optimization algorithm first proposed by
Kirkpatrick et al. [Kirkpatrick83] and was inspired
by statistical mechanics
and the way in which perfect crystals are formed. Perfect crystals are
formed by first melting the substance of interest, and then cooling it
very slowly. At large temperatures the particles vibrate with wide
amplitude and this allows a search for global optimum. As the
temperature decreases so do the vibrations until the system settles to
the global optimum (the perfect crystal).
The simulated annealing optimization algorithm uses a similar concept:
the objective function is considered a measure of the energy of the
system and this is maintained constant for a certain number of
iterations (a temperature cycle). In each iteration, the parameters
are changed to a nearby location in parameter space and the new
objective function value calculated; if it decreased, then the new
state is accepted, if it increased then the new state is accepted with
a probability that follows a Boltzmann distribution (higher
temperature means higher probability of accepting the new
state). After a fixed number of iterations, the stopping criterion is
checked; if it is not time to stop, then the system's temperature is
reduced and the algorithm continues.
Simulated annealing is a stochastic algorithm that is guaranteed to
converge if ran for an infinite number of iterations. It is one of the
most robust global optimization algorithms, although it is also one of
the slowest. (Be warned that simulated annealing can run for hours or
even days!).
This implementation of simulated annealing is based on the code of
Corana et al. [Corana87]. This implementation has
tuned some of the parameters of the original implementation as
follows:
Options for Simulated Annealing - Start Temperature
Initial temperature of the system. The higher the temperature, the
larger the probability that a global optimum is found. Note that the
temperature should be very high in the beginning of the method (the
system should be above the "melting" temperature). This value
has the same units as the objective function, so what represents
"high" is different from problem to problem. The default is '1'.
- Cooling Factor
Rate by which the temperature is reduced from one cycle to the next,
given by the formula:
.
The simulated annealing algorithm works best if the temperature is
reduced at a slow rate, so this value should be close to 1. (but
values closer to 1 will also cause the algorithm to run longer). The
default is '0.85'.
- Tolerance
Convergence stopping criteria: the method stops when the change in
the objective function has been smaller than this value in the last
two temperature steps. The default is
'
'. (This is an absolute tolerance)
- Random Number Generator
The parameter is an enumeration value to determine which random number
generator this method shall use. COPASI provides two random number
generators R250 [Maier91]
(selected through the value 0) and the
Mersenne
Twister [Matsumoto98]
(selected through the value 1 (default)).
- Seed
The parameter is a positive integer value to determine the seed for the
random number generator. A value of zero instructs COPASI to select a
"random" value.
Steepest descent [Fogel92] is an optimization
method that follows the direction of steepest descent on the
hyper-surface of the objective function to find a local minimum. The
direction of steepest descent is defined by the negative of the
gradient of the objective function.
Options for Steepest Descent - Iteration Limit
This parameter is positive integer determining the maximum number of
iterations the algorithm shall perform. The default is '100'.
- Tolerance
This parameter is a positive value determining the tolerance with
which the solution shall be determined. If the improvement between two
steps is less than the tolerance the algorithm stops. The default is
'
'.
The Truncated Newton method is a sophisticated variant of the Newton optimisation method. The Newton optimisation method searches for the minimum of a nonlinear function by following descent directions determined from the function’s first and second partial derivatives. The Truncated Newton method does an incomplete (truncated) solution of a system of linear equations to calculate the Newton direction. This means that the actual direction chosen for the descent is between the steepest descent direction and the true Newton direction. A more complete description of the method can be found in [Gill81] and [Nash84].
The particular implementation of the Truncated Newton method used here takes into account simple bounds on the optimisation parameters.
Like the Newton method itself, the Truncated Newton variant converges quadratically when in the vicinity of the minimum. On average this method is one of the best gradient descent methods.
This implementation of the Truncated Newton method was originally written in FORTRAN by Stephen Nash (Operations Research and Applied Statistics Dept., George Mason University, Fairfax, VA 22030, USA). The original code is available from http://www.netlib.org/opt/tn). The original code was translated automatically to C by the F2C program from AT&T Bell Labs (available at http://www.netlib.org/f2c/ ). The C code was then enhanced to suite COPASI's reentrant optimization method interface.
Options for Truncated Newton - None
The algorithm is not tunable.
|