Latest Versions


Stable:
COPASI 4.6 (Build 32)
Development:
COPASI 4.5.31 (development)

VT Tribute

4. Optimization Methods

The optimization methods described in this chapter attempt to minimize a given objective function. There are several ways to do this and COPASI supports many different methods for the minimization of an objective function.

4.1. Evolutionary Programming

Evolutionary programming (EP) [Fogel92][Baeck93][Baeck97] is a computational technique that mimics evolution and is based on reproduction and selection. An EP algorithm is composed of individuals that reproduce and compete, each one is a potential solution to the (optimization) problem and is represented by a "genome" where each gene corresponds to one adjustable parameter. At each generation of the EP, each individual reproduces asexually, i.e. divides into two individuals. One of these contains exactly the same "genome" as the parent while the other suffers some mutations (the parameter values of each gene change slightly). At the end of the generation, the algorithm has double the number of individuals. Then each of the individuals is confronted with a number of others to count how many does it outperform (the number of wins is the number of these competitors that represent worse solutions than itself). All the individuals are ranked by their number of wins, and the population is again reduced to the original number of individuals by eliminating those which have worse fitness (solutions).

Options for Evolutionary Programming

Number of Generations

The parameter is a positive integer value to determine the number of generations the algorithm shall evolve the population. The default is '200'.

Population Size

The parameter is a positive integer value to determine the size of the population, i.e., the number of individuals that survive after each generation. The default is '20'.

Random Number Generator

The parameter is an enumeration value to determine which random number generator this method shall use. COPASI provides two random number generators R250 [Maier91] (selected through the value 0) and the Mersenne Twister [Matsumoto98] (selected through the value 1 (default)).

Seed

The parameter is a positive integer value to determine the seed for the random number generator. A value of zero instructs COPASI to select a "random" value.

4.2. Evolutionary Strategy (SRES)

Evolutionary Strategies with Stochastic Ranking (SRES) [Runarsson00] is similar to Evolutionary Programming. However, a parent has multiple offsprings during each generation. Each offspring will contain a recombination of genes with another parent and additional mutations. The algorithm assures that each parameter value will be within its boundaries. But constraints to the solutions may be violated. Whenever this happens the square of the size of the violation is summed up, i.e., we calculate

Equation 7.10. 


where the constraints are given by . The value is used within the selection, which is performed as described in Genetic Algorithm SR

Options for Evolutionary Strategy (SRES)

Number of Generations

The parameter is a positive integer value to determine the number of generations the algorithm shall evolve the population. The default is '200.'

Population Size

The parameter is a positive integer value to determine the size of the population, i.e., the number of individuals that survive after each generation. The default is '20'.

Random Number Generator

The parameter is an enumeration value to determine which random number generator this method shall use. COPASI provides two random number generators R250 [Maier91] (selected through the value 0) and the Mersenne Twister [Matsumoto98] (selected through the value 1 (default)).

Seed

The parameter is a positive integer value to determine the seed for the random number generator. A value of zero instructs COPASI to select a "random" value.

Pf

This parameter is a numerical value in the interval (0, 1) determining the chance that individuals either outside the parameter boundaries or violating the constraints are compared during the selection. The default is '4.75'.

4.3. Genetic Algorithm

The genetic algorithm (GA) [Baeck97][Baeck93][Michalewicz94][Mitchell95] is a computational technique that mimics evolution and is based on reproduction and selection. A GA is composed of individuals that reproduce and compete, each one is a potential solution to the (optimization) problem and is represented by a "genome" where each gene corresponds to one adjustable parameter. At each generation of the GA, each individual is paired with one other at random for reproduction. Two offspring are produced by combining their genomes and allowing for "cross-over", i.e., the two new individuals have genomes that are formed from a combination of the genomes of their parents. Also each new gene might have mutated, i.e. the parameter value might have changed slightly. At the end of the generation, the algorithm has double the number of individuals. Then each of the individuals is confronted with a number of others to count how many does it outperform (the number of wins is the number of these competitors that represent worse solutions than itself). All the individuals are ranked by their number of wins, and the population is again reduced to the original number of individuals by eliminating those which have worse fitness (solutions).

Many features of a GA may be varied. The details of this particular implementation of the GA for optimization of biochemical kinetics are:

  • Parameters are encoded in genes using floating-point representation, rather than the more usual binary representation.

  • Mutation is carried out by adding to the gene a random number drawn from a normal distribution with zero mean and a standard deviation of 10% of the parameter value. Whenever this makes the parameter (gene) exceed one boundary, it is set to that boundary value.

  • Cross-over is always performed at gene boundaries so that no gene is ever disrupted. The number of cross-over points is a random number between zero and half the number of adjustable parameters (uniform distribution).

  • Selection is done by a tournament where each individual competes with a number of others equal to 20% the population size. The competitors are chosen at random.

  • The initial population contains one individual whose genes are the initial parameter values, the genes of all other individuals are initialized to a random value between their boundaries. If the boundaries span two orders of magnitude or more, the random distribution is exponential, otherwise normal.

  • Whenever the fittest individual has not changed for the last 10 generations, the 10% less fit individuals are replaced by individuals with random genes. When the fittest individual has not changed for 30 generations, the worse 30% are substituted by individuals with random genes. When the fittest individual has not changed for 50 generations, the worse 50% are substituted by individuals with random genes. This procedure helps the algorithm escape local minima and is somewhat equivalent to increasing the mutation rate when the population has become uniform.

Options for Genetic Algorithm

Number of Generations

The parameter is a positive integer value to determine the number of generations the algorithm shall evolve the population. The default is '200'.

Population Size

The parameter is a positive integer value to determine the size of the population, i.e., the number of individuals that survive after each generation. The default is '20'.

Random Number Generator

The parameter is an enumeration value to determine which random number generator this method shall use. COPASI provides two random number generators R250 [Maier91] (selected through the value 0) and the Mersenne Twister [Matsumoto98] (selected through the value 1 (default)).

Seed

The parameter is a positive integer value to determine the seed for the random number generator. A value of zero instructs COPASI to select a "random" value.

4.4. Genetic Algorithm SR

The genetic algorithm with stochastic ranking is very similar to the before described Genetic Algorithm with tournament selection. With two exception which are the mutations are not forced to be within the boundaries and the selection is done through a bubble sort with a random factor as described in [Runarsson00].

  • Parameters are encoded in genes using floating-point representation, rather than the more usual binary representation.

  • Mutation is carried out by adding to the gene a random number drawn from a normal distribution with zero mean and a standard deviation of 10% of the parameter value. Parameters may exceed boundaries. Whenever this happens or a constraint to the solution is violated the square of the size of the violation is summed up, i.e., we calculate

    Equation 7.11. 


    where the parameters are given by and the constraints by . The value is used within the selection.

  • Cross-over is always performed at gene boundaries so that no gene is ever disrupted. The number of cross-over points is a random number between zero and half the number of adjustable parameters (uniform distribution).

  • Selection is done by the bubble sort described in [Runarsson00]. This sort incorporates a probability to compare objective values for individuals with a . The pseudo code for the sort is:

      // Here sweepNum is optimal number of sweeps from paper, i.e., TotalPopulation
      for (i = 0; i < sweepNum; i++)
        {
          wasSwapped = false;
    
          for (j = 0; j < TotalPopulation - 1; j++)
            {
              // within bounds or random chance 
              if ((phi(j) == 0 and phi(j + 1) == 0) or UniformRandom(0, 1) < Pf)
                {
                  // compare objective function values
                  if (Value(j) > Value(j + 1))
                    {
                      swap(j, j + 1);
                      wasSwapped = true;
                    }
                }
              else // phi != 0 
                {
                  // individual j further outside then j + 1
                  if (phi(j) > phi(j + 1))
                    {
                      swap(j, j + 1);
                      wasSwapped = true;
                    }
                }
            }
    
          // if no swap then break
          if (wasSwapped == false) break;
        }
    

  • The initial population contains one individual whose genes are the initial parameter values, the genes of all other individuals are initialized to a random value between their boundaries. If the boundaries span two orders of magnitude or more, the random distribution is exponential, otherwise normal.

  • Whenever the fittest individual has not changed for the last 10 generations, the 10% less fit individuals are replaced by individuals with random genes. When the fittest individual has not changed for 30 generations, the worse 30% are substituted by individuals with random genes. When the fittest individual has not changed for 50 generations, the worse 50% are substituted by individuals with random genes. This procedure helps the algorithm escape local minima and is somewhat equivalent to increasing the mutation rate when the population has become uniform.

Options for Genetic Algorithm SR

Number of Generations

The parameter is a positive integer value to determine the number of generations the algorithm shall evolve the population. The default is '200'.

Population Size

The parameter is a positive integer value to determine the size of the population, i.e., the number of individuals that survive after each generation. The default is '20'.

Random Number Generator

The parameter is an enumeration value to determine which random number generator this method shall use. COPASI provides two random number generators R250 [Maier91] (selected through the value 0) and the Mersenne Twister [Matsumoto98] (selected through the value 1 (default)).

Seed

The parameter is a positive integer value to determine the seed for the random number generator. A value of zero instructs COPASI to select a "random" value.

Pf

This parameter is a numerical value in the interval (0, 1) determining the chance that individuals either outside the parameter boundaries or violating the constraints are compared during the selection. The default is '4.75'.

4.5. Hooke & Jeeves

The method of Hooke and Jeeves [Bell66][Hooke61][Kaupe63][Swann72] is a direct search algorithm that searches for the minimum of a nonlinear function without requiring (or attempting to calculate) derivatives of the function. Instead it is based on a heuristic that suggests a descent direction using the values of the function calculated in a number of previous iterations.

Options for Hooke & Jeeves

Iteration Limit

This parameter is positive integer determining the maximum number of iterations the algorithm shall perform. The default is '50'.

Tolerance

This parameter is a positive value determining the tolerance with which the solution shall be determined. If the improvement between two steps is less than the tolerance the algorithm stops. The default is ' '.

Rho

This parameter is a value in (0, 1) determining the factor with which the steps size is reduced between iterations. The default is '0.2'.

4.6. Levenberg - Marquardt

Levenberg-Marquardt [Levenberg44][Marquardt63] is a gradient descent method. It is a hybrid between the steepest descent and the Newton methods.

The Newton optimization method searches for the minimum of a nonlinear function by following descent directions determined from the function's first and second partial derivatives. The steepest descent method searches for a minimum based only on the first derivatives of the function. While the Newton method converges quadratically towards the minimum in its vicinity, it may not converge at all if it is far away from it. On the other hand the steepest descent method only converges linearly but is guaranteed to converge.

Levenberg first suggested an improvement to the Newton method in order to make it more robust, i.e. to overcome the problem of non-convergence. His suggestion was to add a factor to the diagonal elements of the Hessian matrix of second derivatives when not close to the minimum (this can be judged by how positive definite the matrix is). The effect when this factor is large compared to the elements of Hessian is that the method then becomes the steepest descent method. Later Marquardt suggested that the factor should be multiplicative rather than additive and also defined a heuristic to make this factor increase or decrease. The method known as Levenberg-Marquardt is thus an adaptive method that effectively changes between the steepest descent to the Newton method.

The original suggestions of Levenberg and Marquardt were effective to enhance the Gauss-Newton method, a variant of the Newton method specifically for minimizing least-squares functions. In this case the advantage is also that the second derivatives do not need to be calculated as they are estimated from the gradient of the residuals. Subsequently Goldfeld et al. [Goldfeld66] extended the method to the case of general non-linear functions.

Options for Levenberg - Marquardt

Iteration Limit

This parameter is positive integer determining the maximum number of iterations the algorithm shall perform. The default is '200'.

Tolerance

This parameter is a positive value determining the tolerance with which the solution shall be determined. If the improvement between two steps is less than the tolerance the algorithm stops. The default is ' '.

4.7. Nelder - Mead

This method also known as the simplex method is due to Nelder and Mead [Nelder65]. A simplex is a polytope of vertices in dimensions. The objective function is evaluated at each vertex. Dependent on these calculated values a new simplex is constructed. The simplest step is to replace the worst point with a point reflected through the centroid of the remaining points. If this point is better than the best current point, then we can try stretching exponentially out along this line. On the other hand, if this new point isn't much better than the previous value then we are stepping across a valley, so we shrink the simplex towards the best point.

Options for Nelder - Mead

Iteration Limit

This parameter is a positive integer to determine the maximum number of iterations the method is to perform. The default value is '200'.

Tolerance

This parameter is a positive number and provides an alternative termination criteria. If the variance of the values of the objective function at the vertices of the current simplex is smaller than the tolerance the algorithm stops. The default is ' '.

Scale

This parameter is a positive number and determines the size of the initial simplex. The edges of the polytope are inverse proportional to the scale, i.e., a larger value makes the initial simplex smaller. The default is '10'.

4.8. Particle Swarm

The particle swarm optimization method suggested by Kennedy and Eberhart [Kennedy95] is inspired by a flock of birds or a school of fish searching for food. Each particle has a position and a velocity in the parameter space. Additionally, it remembers its best achieved objective value and position . Dependent on its own information and the position of its best neighbor (a random subset of particles of the swarm) a new velocity is calculated. With this information the position is updated. The pseudo code for the algorithm for each particle is:

  N =  best neighbor's position

  for i = 0 to number of parameters do
    R1 = uniform random number in [0, 1]
    R2 = uniform random number in [0, 1]

    V[i]= w * V[i]
          + C * R1 * (M[i]- X[i])
          + C * R2 * (N[i]- X[i])
    X[i] = X[i] + V[i]
  enddo

  current_O = evaluate objective function

  if current_O < O then do
    O = current_O
    M = X
  enddo

Options for Particle Swarm

Iteration Limit

This parameter is a positive integer to determine the maximum number of iterations the method is to perform. The default value is '2000'.

Swarm Size

This parameter is a positive integer specifying the number of particles in the swarm. The default value is '50'.

Std. Deviation

This parameter is a positive number and provides an alternative termination criteria. If the standard deviation of the values of the objective function of each particle and the standard deviation of the best positions is smaller than the provided value the algorithm stops. The default value is ' '.

Random Number Generator

The parameter is an enumeration value to determine which random number generator this method shall use. COPASI provides two random number generators R250 [Maier91] (selected through the value 0) and the Mersenne Twister [Matsumoto98] (selected through the value 1 (default)).

Seed

The parameter is a positive integer value to determine the seed for the random number generator. A value of zero instructs COPASI to select a "random" value.

4.9. Praxis

Praxis is a direct search method (for a review see [Swann72]) that searches for the minimum of a nonlinear function without requiring (or attempting to calculate) derivatives of that function. Praxis was developed by Brent [Brent72] after the method proposed by Powell [Powel64]. The inspiration for Praxis was the well-known method of minimising each adjustable parameter (direction) at a time - the principal axes method. In Praxis directions are chosen that do not coincide with the principal axes, in fact if the objective function is quadratic then these will be conjugate directions, assuring a fast convergence rate.

This implementation of the Praxis method was originally written in FORTRAN at the Stanford Linear Accelerator Center (dated 3/1/73). The original FORTRAN code is available from http://www.netlib.org/opt/praxis. The original code was translated automatically by the F2C program from AT&T Bell Labs (available at http://www.netlib.org/f2c/). The C code was then enhanced to suite COPASI's reentrant optimization method interface.

Options for Praxis

Tolerance

Convergence stopping criterium: the method stops when two consecutive estimates differ by less than Tolerance. This parameter is a positive integer to determine the number of parameter sets to be drawn before the algorithm stops. The default value is ' '.

4.10. Random Search

Random search is an optimization method that attempts to find the optimum by testing the objective function's value on a series of combinations of random values of the adjustable parameters. The random values are generated complying with any boundaries selected by the user, furthermore, any combinations of parameter values that do not fulfill constraints on the variables are excluded. This means that the method is capable of handling bounds on the adjustable parameters and fulfilling constraints.

For infinite number of iterations this method is guaranteed to find the global optimum of the objective function. In general one is interested in processing a very large number of iterations.

Options for Random Search

Number of Iterations

This parameter is a positive integer to determine the number of parameter sets to be drawn before the algorithm stops. The default value is '100000'.

Random Number Generator

The parameter is an enumeration value to determine which random number generator this method shall use. COPASI provides two random number generators R250 [Maier91] (selected through the value 0) and the Mersenne Twister [Matsumoto98] (selected through the value 1 (default)).

Seed

The parameter is a positive integer value to determine the seed for the random number generator. A value of zero instructs COPASI to select a "random" value.

4.11. Simulated Annealing

Simulated annealing is an optimization algorithm first proposed by Kirkpatrick et al. [Kirkpatrick83] and was inspired by statistical mechanics and the way in which perfect crystals are formed. Perfect crystals are formed by first melting the substance of interest, and then cooling it very slowly. At large temperatures the particles vibrate with wide amplitude and this allows a search for global optimum. As the temperature decreases so do the vibrations until the system settles to the global optimum (the perfect crystal).

The simulated annealing optimization algorithm uses a similar concept: the objective function is considered a measure of the energy of the system and this is maintained constant for a certain number of iterations (a temperature cycle). In each iteration, the parameters are changed to a nearby location in parameter space and the new objective function value calculated; if it decreased, then the new state is accepted, if it increased then the new state is accepted with a probability that follows a Boltzmann distribution (higher temperature means higher probability of accepting the new state). After a fixed number of iterations, the stopping criterion is checked; if it is not time to stop, then the system's temperature is reduced and the algorithm continues.

Simulated annealing is a stochastic algorithm that is guaranteed to converge if ran for an infinite number of iterations. It is one of the most robust global optimization algorithms, although it is also one of the slowest. (Be warned that simulated annealing can run for hours or even days!).

This implementation of simulated annealing is based on the code of Corana et al. [Corana87]. This implementation has tuned some of the parameters of the original implementation as follows:

  • Each temperature cycle takes random steps, with being the number of optimization parameters.

  • At each step, a new candidate solution is accepted if: a) it reduced the objective function value, or b) with a probability equal to , where is the increase in objective function value, and is the current temperature.

  • The stopping criterion is applied to the last two temperature cycles (i.e. the change in objective function between this temperature and the previous must have been smaller than the Tolerance in the last two cycles).

Options for Simulated Annealing

Start Temperature

Initial temperature of the system. The higher the temperature, the larger the probability that a global optimum is found. Note that the temperature should be very high in the beginning of the method (the system should be above the "melting" temperature). This value has the same units as the objective function, so what represents "high" is different from problem to problem. The default is '1'.

Cooling Factor

Rate by which the temperature is reduced from one cycle to the next, given by the formula: . The simulated annealing algorithm works best if the temperature is reduced at a slow rate, so this value should be close to 1. (but values closer to 1 will also cause the algorithm to run longer). The default is '0.85'.

Tolerance

Convergence stopping criteria: the method stops when the change in the objective function has been smaller than this value in the last two temperature steps. The default is ' '. (This is an absolute tolerance)

Random Number Generator

The parameter is an enumeration value to determine which random number generator this method shall use. COPASI provides two random number generators R250 [Maier91] (selected through the value 0) and the Mersenne Twister [Matsumoto98] (selected through the value 1 (default)).

Seed

The parameter is a positive integer value to determine the seed for the random number generator. A value of zero instructs COPASI to select a "random" value.

4.12. Steepest Descent

Steepest descent [Fogel92] is an optimization method that follows the direction of steepest descent on the hyper-surface of the objective function to find a local minimum. The direction of steepest descent is defined by the negative of the gradient of the objective function.

Options for Steepest Descent

Iteration Limit

This parameter is positive integer determining the maximum number of iterations the algorithm shall perform. The default is '100'.

Tolerance

This parameter is a positive value determining the tolerance with which the solution shall be determined. If the improvement between two steps is less than the tolerance the algorithm stops. The default is ' '.

4.13. Truncated Newton

The Truncated Newton method is a sophisticated variant of the Newton optimisation method. The Newton optimisation method searches for the minimum of a nonlinear function by following descent directions determined from the function’s first and second partial derivatives. The Truncated Newton method does an incomplete (truncated) solution of a system of linear equations to calculate the Newton direction. This means that the actual direction chosen for the descent is between the steepest descent direction and the true Newton direction. A more complete description of the method can be found in [Gill81] and [Nash84].

The particular implementation of the Truncated Newton method used here takes into account simple bounds on the optimisation parameters. Like the Newton method itself, the Truncated Newton variant converges quadratically when in the vicinity of the minimum. On average this method is one of the best gradient descent methods.

This implementation of the Truncated Newton method was originally written in FORTRAN by Stephen Nash (Operations Research and Applied Statistics Dept., George Mason University, Fairfax, VA 22030, USA). The original code is available from http://www.netlib.org/opt/tn). The original code was translated automatically to C by the F2C program from AT&T Bell Labs (available at http://www.netlib.org/f2c/ ). The C code was then enhanced to suite COPASI's reentrant optimization method interface.

Options for Truncated Newton

None

The algorithm is not tunable.