
Appendix C - Benchmark Problems

Explicit Equations
NIST Regressions
Differential Equations


For all of these, add links to

  1. images
  2. data files
  3. notebooks

This appendix details the data sets and explains why we chose each problem.


Explicit Equations

Explicit equations are drawn from the symbolic regression (SR) literature. The main source is the paper “Genetic Programming Needs Better Benchmarks.” We selected the subset of its problems that appear most often in published work. In several cases we added parameters to the equations, for two reasons. The first is to bring the scale of the data into a reasonable range, so that we could add percentage noise based on the variance of the data. The second is to make the equations more complex, and the regression tasks correspondingly more difficult.
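The noise step described above can be sketched as follows. This is a minimal illustration, not the procedure used to build the benchmark files: it assumes that "percentage noise" means zero-mean Gaussian noise whose variance is a fixed fraction of the variance of the clean data, and the sampling range, sample count, noise level, and function names are hypothetical.

```python
import math
import random
import statistics


def koza_01(x):
    """Koza 01 as listed in this appendix: 2.3x + 1.4x^2 + 0.5x^3 + 0.2x^4."""
    return 2.3 * x + 1.4 * x**2 + 0.5 * x**3 + 0.2 * x**4


def add_percentage_noise(ys, fraction, rng):
    """Add zero-mean Gaussian noise whose variance is `fraction` of var(ys).

    Assumption: the noise variance is a fixed fraction of the clean
    signal's variance. Other readings of "percentage noise" are possible.
    """
    sigma = math.sqrt(fraction * statistics.pvariance(ys))
    return [y + rng.gauss(0.0, sigma) for y in ys]


rng = random.Random(0)                            # fixed seed for repeatability
xs = [-2.0 + 4.0 * i / 199 for i in range(200)]   # hypothetical range [-2, 2]
clean = [koza_01(x) for x in xs]
noisy = add_percentage_noise(clean, 0.05, rng)    # 5% noise, illustrative only
```

Scaling the noise by the variance of the data is what makes the added coefficients useful: an equation whose outputs span many orders of magnitude would otherwise receive noise that swamps most of its range.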

Single Variable
Koza 01
$$ 2.3x + 1.4x^{2} + 0.5x^{3} + 0.2x^{4} $$
Koza 02
$$ 1.5x + 0.07x^{5} - 2x^{3} $$
Koza 03
$$ 1.3x^{2} + 0.3x^{6} - 2x^{4} $$
Lipson 01
$$ 1.5x^2 - x^3 $$
Lipson 02
$$ 0.1e^{|x|}\sin(x) $$
Lipson 03
$$ x^2 e^{\sin(x)} + x + \sin(\frac{\pi}{4}-x^3) $$
Nguyen 01
$$ x + x^{2} + x^{3} $$
Nguyen 02
$$ x + x^{2} + x^{3} + x^{4} $$
Nguyen 03
$$ -x - 3.4x^{2} + 1.2x^{3} + 0.3x^{4} - 0.05x^{5} $$
Nguyen 04
$$ x + 3.2x^{2} + 0.8x^{3} - 0.4x^{4} + 0.2x^{5} + 0.04x^{6} $$
Nguyen 05
$$ 3.3\sin(x^{2})\cos(x) - 1 $$
Nguyen 06
$$ 3.2\sin(x) + 1.4\sin( x + x^{2} ) $$
Nguyen 07
$$ \log( 1 + x ) + \log( 1 + x^{2} ) $$
Nguyen 08
$$ \sqrt{x} $$
Two Variable
Nguyen 09
$$ \sin(x) + \sin(y^{2}) $$
Nguyen 10
$$ 2 \sin(x) \cos(y) $$
Nguyen 11
$$ x^{y} $$
Nguyen 12
$$ 0.09x^{4} + 0.3x^{3} + 0.1y^{2} - y $$
Five Variable
Korns 01
$$ 1.57 + 24.30v $$
Korns 02
$$ 0.23 + \frac{ y + v }{w} $$
Korns 03
$$ -5.41 + \frac{ v - x + \frac{y}{w} }{w} $$
Korns 04
$$ -2.30 + 0.13 \sin(z) $$
Korns 05
$$ 3.00 + 2.13 \log(w) $$
Korns 06
$$ 1.30 + 0.13 \sqrt{x} $$
Korns 07
$$ 213.81 ( 1 - e^x ) $$
Korns 08
$$ 6.87 + 11\sqrt{7.23xvw} $$
Korns 09
$$ \frac{e^z \sqrt{x}}{\log(y) v^{2}} $$
Korns 10
$$ 0.81 + \frac{24.3 ( 2y + 3z^{2} )}{ 4v^{3} + 5w^{4} } $$
Korns 11
$$ 6.87 + 11 \cos(7.23x^{3}) $$
Korns 12
$$ 2.00 -2.1 \sin(1.30w)\cos(9.80x) $$
Korns 13
$$ 32.00 - \frac{3\tan(x)\tan(z)}{\tan(y)\tan(v)} $$
Korns 14
$$ 22.0 - 4.2 (\cos(x)-\tan(y)) \frac{\tanh(z)}{\sin(v)} $$
Korns 15
$$ 12.0 - 6.0 \frac{\tan(x)}{e^{y}} (\ln(z)-\tan(v)) $$


NIST Regressions

NIST problems are taken from the NIST website. These benchmarks are used to test linear and non-linear regression techniques. Here we apply PGE to these same problems.

NIST website

Linear Problems
Norris
$$ y = B_0 + B_1 x + e $$
observed
Pontius
$$ y = B_0 + B_1 x + B_2 x^{2} $$
observed
Filip
$$ y = B_0 + B_1 x + B_2 x^{2} + \cdots + B_9 x^{9} + B_{10} x^{10} + e $$
observed
Longley
$$ y = B_0 + B_1 x_1 + B_2 x_2 + B_3 x_3 + B_4 x_4 + B_5 x_5 + B_6 x_6 + e $$
observed
Wampler1
$$ y = B_0 + B_1 x + B_2 x^{2} + B_3 x^{3} + B_4 x^{4} + B_5 x^{5} $$
generated
Wampler2
$$ y = B_0 + B_1 x + B_2 x^{2} + B_3 x^{3} + B_4 x^{4} + B_5 x^{5} $$
generated
Wampler3
$$ y = B_0 + B_1 x + B_2 x^{2} + B_3 x^{3} + B_4 x^{4} + B_5 x^{5} $$
generated
Wampler4
$$ y = B_0 + B_1 x + B_2 x^{2} + B_3 x^{3} + B_4 x^{4} + B_5 x^{5} $$
generated
Wampler5
$$ y = B_0 + B_1 x + B_2 x^{2} + B_3 x^{3} + B_4 x^{4} + B_5 x^{5} $$
generated
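For orientation, the simplest linear form above (the Norris model, a straight line with intercept B0 and slope B1) can be fit with the closed-form ordinary least squares estimate. The sketch below is a conventional baseline, not PGE, and it runs on synthetic noise-free stand-in data rather than the NIST observations; the function name and the data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = B0 + B1*x; returns (B0, B1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


# Synthetic, noise-free stand-in data (NOT the NIST Norris observations).
xs = [float(i) for i in range(10)]
ys = [1.5 + 0.25 * x for x in xs]
b0, b1 = fit_line(xs, ys)
```

The higher-degree polynomial problems in this list need a general linear solver rather than this two-parameter formula, and likely more numerical care.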
Nonlinear Problems
Misra1a
$$ y = b_1 \left( 1 - \exp(-b_2 x) \right) + e $$
observed
Chwirut2
$$ y = \frac{\exp(-b_1 x)}{b_2 + b_3 x} + e $$
observed
Chwirut1
$$ y = \frac{\exp(-b_1 x)}{b_2 + b_3 x} + e $$
observed
Lanczos3
$$ y = b_1 \exp(-b_2 x) + b_3 \exp(-b_4 x) + b_5 \exp(-b_6 x) + e $$
generated
Gauss1
$$ y = b_1 \exp(-b_2 x) + b_3 \exp\left( -\frac{(x-b_4)^{2}}{b_5^{2}} \right) + b_6 \exp\left( -\frac{(x-b_7)^{2}}{b_8^{2}} \right) + e $$
generated
Gauss2
$$ y = b_1 \exp(-b_2 x) + b_3 \exp\left( -\frac{(x-b_4)^{2}}{b_5^{2}} \right) + b_6 \exp\left( -\frac{(x-b_7)^{2}}{b_8^{2}} \right) + e $$
generated
DanWood
$$ y = b_1 x^{b_2} + e $$
observed
Misra1b
$$ y = b_1 \left( 1 - \left( 1 + \frac{b_2 x}{2} \right)^{-2} \right) + e $$
observed
Kirby2
$$ y = \frac{b_1 + b_2 x + b_3 x^{2}}{1 + b_4 x + b_5 x^{2}} + e $$
observed
Hahn1
$$ y = \frac{b_1 + b_2 x + b_3 x^{2} + b_4 x^{3}}{1 + b_5 x + b_6 x^{2} + b_7 x^{3}} + e $$
observed
Nelson
$$ \log(y) = b_1 - b_2 x_1 \exp(-b_3 x_2) + e $$
observed
MGH17
$$ y = b_1 + b_2 \exp(-b_4 x) + b_3 \exp(-b_5 x) + e $$
generated
Lanczos1
$$ y = b_1 \exp(-b_2 x) + b_3 \exp(-b_4 x) + b_5 \exp(-b_6 x) + e $$
generated
Lanczos2
$$ y = b_1 \exp(-b_2 x) + b_3 \exp(-b_4 x) + b_5 \exp(-b_6 x) + e $$
generated
Gauss3
$$ y = b_1 \exp(-b_2 x) + b_3 \exp\left( -\frac{(x-b_4)^{2}}{b_5^{2}} \right) + b_6 \exp\left( -\frac{(x-b_7)^{2}}{b_8^{2}} \right) + e $$
generated
Misra1c
$$ y = b_1 \left( 1 - (1 + 2 b_2 x)^{-0.5} \right) + e $$
observed
Misra1d
$$ y = b_1 b_2 x (1 + b_2 x)^{-1} + e $$
observed
Roszman1
$$ y = b_1 - b_2 x - \frac{\arctan\left( b_3 / (x - b_4) \right)}{\pi} + e $$
observed
ENSO
$$ $$
omitted, missing data
MGH09
$$ y = \frac{b_1 (x^{2} + b_2 x)}{x^{2} + b_3 x + b_4} + e $$
generated
Thurber
$$ y = \frac{b_1 + b_2 x + b_3 x^{2} + b_4 x^{3}}{1 + b_5 x + b_6 x^{2} + b_7 x^{3}} + e $$
observed
BoxBOD
$$ y = b_1 \left( 1 - \exp(-b_2 x) \right) + e $$
observed
Rat42
$$ y = \frac{b_1}{1 + \exp(b_2 - b_3 x)} + e $$
observed
MGH10
$$ y = b_1 \exp\left( \frac{b_2}{x + b_3} \right) + e $$
generated
Eckerle4
$$ y = \frac{b_1}{b_2} \exp\left( -0.5 \left( \frac{x - b_3}{b_2} \right)^{2} \right) + e $$
observed
Rat43
$$ y = \frac{b_1}{\left( 1 + \exp(b_2 - b_3 x) \right)^{1/b_4}} + e $$
observed
Bennett5
$$ y = b_1 (b_2 + x)^{-1/b_3} + e $$
observed
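To show what a conventional nonlinear regression technique does with one of these models, here is a small Gauss-Newton sketch with step halving for the two-parameter form shared by Misra1a and BoxBOD. It is a baseline illustration, not PGE. The data are synthetic and noise-free (generated from hypothetical parameter values 2.0 and 0.5), and the starting point is likewise an assumption; none of it comes from the NIST files.

```python
import math


def model(x, b1, b2):
    """Misra1a / BoxBOD form: b1 * (1 - exp(-b2 * x))."""
    return b1 * (1.0 - math.exp(-b2 * x))


def sse(xs, ys, b1, b2):
    """Sum of squared residuals for the given parameters."""
    return sum((y - model(x, b1, b2)) ** 2 for x, y in zip(xs, ys))


def gauss_newton(xs, ys, b1, b2, iterations=50):
    """Gauss-Newton with step halving for the two-parameter model above."""
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (J^T J) d = J^T r.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            decay = math.exp(-b2 * x)
            j1 = 1.0 - decay          # d(model)/d(b1)
            j2 = b1 * x * decay       # d(model)/d(b2)
            r = y - b1 * j1           # residual at the current parameters
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        # Solve the 2x2 system by Cramer's rule.
        d1 = (g1 * a22 - a12 * g2) / det
        d2 = (a11 * g2 - a12 * g1) / det
        # Halve the step until the fit improves; stop if it never does.
        current = sse(xs, ys, b1, b2)
        step = 1.0
        while step > 1e-10 and sse(xs, ys, b1 + step * d1, b2 + step * d2) >= current:
            step *= 0.5
        if step <= 1e-10:
            break
        b1, b2 = b1 + step * d1, b2 + step * d2
    return b1, b2


# Synthetic, noise-free stand-in data (NOT the NIST observations).
xs = [float(i) for i in range(1, 11)]
ys = [model(x, 2.0, 0.5) for x in xs]
b1, b2 = gauss_newton(xs, ys, 1.5, 0.4)   # hypothetical starting point
```

Unlike symbolic regression, this approach is handed the functional form and only estimates the coefficients, which is why the NIST problems make a useful point of comparison.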


Differential Equations

For each of these, additionally add

  1. description, external link
  2. parameters / constants
Bacteria Respiration
$$ \dot{x} = 20 - x - \frac{xy}{1+0.5x^2} $$
$$ \dot{y} = 10 - \frac{xy}{1+0.5x^2} $$
Bar Magnets
$$ \dot{\theta}_1 = 0.5\sin(\theta_1-\theta_2)-\sin(\theta_1) $$
$$ \dot{\theta}_2 = 0.5\sin(\theta_2-\theta_1)-\sin(\theta_2) $$
Glider
$$ \dot{v} = -0.05v^2 - \sin(\theta) $$
$$ \dot{\theta} = v - \frac{\cos(\theta)}{v} $$
E. coli Lac Operon
$$ \dot{G} = \frac{L^2}{1+L^2}-0.01G+0.001 $$
$$ \dot{A} = G \left( \frac{L}{1+L} - \frac{A}{1+L} \right) $$
$$ \dot{L} = \frac{-GL}{1+L} $$
Lorenz
$$ \dot{x} = 10(y-x) $$
$$ \dot{y} = x(C-z)-y $$
$$ \dot{z} = xy - \frac{8}{9}z $$
Shear Flow
$$ \dot{\theta} = \cot(\phi)\cos(\theta) = \frac{\cos\phi}{\sin\phi}\cos\theta $$
$$ \dot{\phi} = (\cos^2\phi - 0.1\sin^2\phi)\sin(\theta) $$
Van der Pol
$$ \dot{x} = 10 \left( y-\left(\frac{1}{3}x^3 - x \right)\right) $$
$$ \dot{y} = -0.1x $$
Predator-Prey 1
$$ \dot{x} = x \left( 4-x-\frac{y}{1+x} \right) $$
$$ \dot{y} = y \left( \frac{x}{1+x} -0.075y \right) $$
Predator-Prey 2
$$ \dot{x} = (-0.2+0.001y)x $$
$$ \dot{y} = (0.1 - 0.001x)y $$
Lotka-Volterra
$$ \dot{x} = 3x - 2xy - x^2 $$
$$ \dot{y} = 2y - xy - y^2 $$
Pendulum
$$ \dot{\theta} = v $$
$$ \dot{v} = \frac{-g}{L}\sin \theta $$
Double Pendulum
$$ n/a $$
$$ n/a $$
Single Spring
$$ \dot{x} = v $$
$$ \dot{v} = - \frac{k}{m}x - \frac{b}{m}v $$
Double Spring
$$ \dot{v}_1 = -\frac{k_1}{m_1}(x_1-R_1) + \frac{k_2}{m_1}(x_2-x_1-w_1-R_2) $$
$$ \dot{v}_2 = -\frac{k_2}{m_2}(x_2-x_1-w_1-R_2) $$
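Data for systems like these is typically produced by numerical integration. The sketch below integrates the second predator-prey system from the list above with a classical fourth-order Runge-Kutta step. It is an illustration under stated assumptions, not the procedure used to build the benchmark files: the initial condition, step size, step count, and function names are all hypothetical.

```python
def predator_prey_2(state):
    """Right-hand side of the second predator-prey system listed above."""
    x, y = state
    return ((-0.2 + 0.001 * y) * x, (0.1 - 0.001 * x) * y)


def rk4_step(f, state, dt):
    """Advance `state` by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(f, state, dt, steps):
    """Return the trajectory [(t, state), ...] including the initial point."""
    trajectory = [(0.0, state)]
    for i in range(1, steps + 1):
        state = rk4_step(f, state, dt)
        trajectory.append((i * dt, state))
    return trajectory


# Hypothetical initial condition and step size, chosen only for illustration.
trajectory = simulate(predator_prey_2, (120.0, 180.0), dt=0.01, steps=5000)
```

Because `rk4_step` takes the right-hand side as an argument, the other autonomous systems in this section can be simulated by swapping in a different function of the state.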


Additional Datasets

Yeast
Real World

For each of these, additionally add

  1. description, external link
  2. will not have equations upfront

SymbolicRegression.com benchmarks Lake Data

