Categories: Mathematics, Numerical methods.

Runge-Kutta method

A Runge-Kutta method (RKM) is a popular approach to numerically solving systems of ordinary differential equations. Let $\vb{x}(t)$ be the vector we want to find, governed by $\vb{f}(t, \vb{x})$:

$$\begin{aligned} \vb{x}'(t) = \vb{f}\big(t, \vb{x}(t)\big) \end{aligned}$$

As in all numerical methods, the $t$-axis is split into discrete steps. If a step has size $h$, then as long as $h$ is small enough, we can make the following approximation:

$$\begin{aligned} \vb{x}'(t) + a h \vb{x}''(t) &\approx \vb{x}'(t \!+\! a h) \\ &\approx \vb{f}\big(t \!+\! a h,\, \vb{x}(t \!+\! a h)\big) \\ &\approx \vb{f}\big(t \!+\! a h,\, \vb{x}(t) \!+\! a h \vb{x}'(t) \big) \end{aligned}$$
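As a quick numerical sanity check of this chain of approximations, consider the illustrative test function $\vb{f}(t, x) = x^2$ (my choice, not part of the argument), whose exact solution with $x(0) = 1$ is $x(t) = 1/(1 \!-\! t)$:

```python
# Sanity check of:  x'(t) + a h x''(t)  ≈  f(t + ah, x(t) + ah x'(t))
# for the illustrative test equation x' = f(t, x) = x^2 with x(0) = 1,
# whose exact solution is x(t) = 1/(1 - t).
f = lambda t, x: x**2

t, a, h = 0.0, 0.5, 1e-3
x   = 1 / (1 - t)     # x(t)
dx  = x**2            # x'(t)  = x^2
ddx = 2 * x**3        # x''(t) = 2 x x'

lhs   = dx + a*h*ddx               # x'(t) + a h x''(t)
rhs   = f(t + a*h, x + a*h*dx)     # f(t + ah, x(t) + ah x'(t))
exact = 1 / (1 - (t + a*h))**2     # the true x'(t + ah)

# All three values agree up to O(h^2) corrections.
```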

For sufficiently small $h$, higher-order derivatives can also be included, with $\vb{f}$ still evaluated at $t \!+\! a h$:

$$\begin{aligned} \vb{x}'(t) + a h \vb{x}''(t) + b h^2 \vb{x}'''(t) &\approx \vb{f}\big(t \!+\! a h,\, \vb{x}(t) \!+\! a h \vb{x}'(t) \!+\! b h^2 \vb{x}''(t) \big) \end{aligned}$$

Although these approximations might seem innocent, they actually make it quite complicated to determine the error order of a given RKM.

Now, consider a Taylor expansion around the current $t$, truncated at a chosen order $n$:

$$\begin{aligned} \vb{x}(t \!+\! h) &= \vb{x}(t) + h \vb{x}'(t) + \frac{h^2}{2} \vb{x}''(t) + \frac{h^3}{6} \vb{x}'''(t) + \:...\, + \frac{h^n}{n!} \vb{x}^{(n)}(t) \\ &= \vb{x}(t) + h \bigg[ \vb{x}'(t) + \frac{h}{2} \vb{x}''(t) + \frac{h^2}{6} \vb{x}'''(t) + \:...\, + \frac{h^{n-1}}{n!} \vb{x}^{(n)}(t) \bigg] \end{aligned}$$

We are free to split the terms as follows, choosing real factors $\omega_{mj}$ subject to $\sum_{j} \omega_{mj} = 1$:

$$\begin{aligned} \vb{x}(t \!+\! h) &= \vb{x} + h \bigg[ \sum_{j = 1}^{N_1} \omega_{1j} \, \vb{x}' + \frac{h}{2} \sum_{j = 1}^{N_2} \omega_{2j} \, \vb{x}'' + \:...\, + \frac{h^{n-1}}{n!} \sum_{j = 1}^{N_n} \omega_{nj} \, \vb{x}^{(n)} \bigg] \end{aligned}$$

Where the integers $N_1, ..., N_n$ can also be freely chosen, but for reasons that will become clear later, the most general choice for an RKM is $N_1 = n$, $N_n = 1$, and:

$$\begin{aligned} N_{n-1} = N_n \!+\! 2 ,\quad \cdots ,\quad N_{n-m} = N_{n-m+1} \!+\! m \!+\! 1 ,\quad \cdots ,\quad N_{2} = N_3 \!+\! n \!-\! 1 \end{aligned}$$

In other words, $N_{n-m}$ is the $(m \!+\! 1)$th triangular number. This is not so important, since this is not a practical way to describe RKMs, but it helps to understand how they work.

Example derivation

For example, let us truncate at $n = 3$, such that $N_1 = 3$, $N_2 = 3$ and $N_3 = 1$. The following derivation is very general, except it requires all $\alpha_j \neq 0$. Renaming $\omega_{mj}$, we start from:

$$\begin{aligned} \vb{x}(t \!+\! h) &= \vb{x} + h \bigg[ (\alpha_1 + \alpha_2 + \alpha_3) \, \vb{x}' + \frac{h}{2} (\beta_2 + \beta_{31} + \beta_{32}) \, \vb{x}'' + \frac{h^2}{6} \gamma_3 \, \vb{x}''' \bigg] \\ &= \vb{x} + h \bigg[ \alpha_1 \vb{x}' + \Big( \alpha_2 \vb{x}' + \frac{h}{2} \beta_2 \vb{x}'' \Big) + \Big( \alpha_3 \vb{x}' + \frac{h}{2} (\beta_{31} + \beta_{32}) \vb{x}'' + \frac{h^2}{6} \gamma_3 \vb{x}''' \Big) \bigg] \end{aligned}$$

As discussed earlier, the parenthesized expressions can be approximately rewritten using $\vb{f}$:

$$\begin{aligned} \vb{x}(t \!+\! h) = \vb{x} + h &\bigg[ \alpha_1 \vb{f}(t, \vb{x}) + \alpha_2 \vb{f}\Big( t \!+\! \frac{h \beta_2}{2 \alpha_2}, \; \vb{x} \!+\! \frac{h \beta_2}{2 \alpha_2} \vb{x}' \Big) \\ & + \alpha_3 \vb{f}\Big( t \!+\! \frac{h (\beta_{31} \!\!+\!\! \beta_{32})}{2 \alpha_3}, \; \vb{x} \!+\! \frac{h \beta_{31}}{2 \alpha_3} \vb{x}' \!+\! \frac{h \beta_{32}}{2 \alpha_3} \vb{x}' \!+\! \frac{h^2 \gamma_3}{6 \alpha_3} \vb{x}'' \Big) \bigg] \\ = \vb{x} + h &\bigg[ \alpha_1 \vb{k}_1 + \alpha_2 \vb{f}\Big( t \!+\! \frac{h \beta_2}{2 \alpha_2}, \; \vb{x} \!+\! \frac{h \beta_2}{2 \alpha_2} \vb{k}_1 \!\Big) \\ & + \alpha_3 \vb{f}\Big( t \!+\! \frac{h (\beta_{31} \!\!+\!\! \beta_{32})}{2 \alpha_3}, \; \vb{x} \!+\! \frac{h \beta_{31}}{2 \alpha_3} \vb{k}_1 \!+\! \frac{h \beta_{32}}{2 \alpha_3} \vb{f}\Big( t \!+\! \frac{h \gamma_3}{3 \beta_{32}}, \; \vb{x} \!+\! \frac{h \gamma_3}{3 \beta_{32}} \vb{k}_1 \!\Big) \!\Big) \bigg] \end{aligned}$$

Here, we can see an opportunity to save some computational time by reusing an evaluation of $\vb{f}$. Technically, this is optional, but it would be madness not to, so we choose:

$$\begin{aligned} \frac{\beta_2}{2 \alpha_2} = \frac{\gamma_3}{3 \beta_{32}} \end{aligned}$$

Such that the next step of $\vb{x}$'s numerical solution is as follows, recalling that $\sum_{j} \alpha_j = 1$:

$$\begin{aligned} \boxed{ \vb{x}(t \!+\! h) = \vb{x}(t) + h \Big( \alpha_1 \vb{k}_1 + \alpha_2 \vb{k}_2 + \alpha_3 \vb{k}_3 \Big) } \end{aligned}$$

Where $\vb{k}_1$, $\vb{k}_2$ and $\vb{k}_3$ are different estimates of the average slope $\vb{x}'$ between $t$ and $t \!+\! h$, whose weighted average is used to make the $t$-step. They are given by:

$$\begin{aligned} \boxed{ \begin{aligned} \vb{k}_1 &\equiv \vb{f}(t, \vb{x}) \\ \vb{k}_2 &\equiv \vb{f}\bigg( t + \frac{h \beta_2}{2 \alpha_2}, \; \vb{x} + \frac{h \beta_2}{2 \alpha_2} \vb{k}_1 \bigg) \\ \vb{k}_3 &\equiv \vb{f}\bigg( t + \frac{h (\beta_{31} \!\!+\!\! \beta_{32})}{2 \alpha_3}, \; \vb{x} + \frac{h \beta_{31}}{2 \alpha_3} \vb{k}_1 + \frac{h \beta_{32}}{2 \alpha_3} \vb{k}_2 \bigg) \end{aligned} } \end{aligned}$$
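To make this concrete, here is a minimal sketch (my own illustration, not part of the derivation) with one specific choice of coefficients satisfying all of the above constraints; as it happens, this choice reproduces Kutta's well-known third-order method:

```python
from math import exp, isclose

# One choice of coefficients obeying the constraints derived above
# (weights summing to 1, plus the reuse condition); this particular
# choice reproduces Kutta's classic third-order method.
a1, a2, a3 = 1/6, 2/3, 1/6      # alpha_1, alpha_2, alpha_3
b2, b31, b32 = 2/3, -1/3, 2/3   # beta_2, beta_31, beta_32
g3 = 1.0                        # gamma_3

assert isclose(a1 + a2 + a3, 1.0)
assert isclose(b2 + b31 + b32, 1.0)
assert isclose(b2 / (2*a2), g3 / (3*b32))  # reuse an f-evaluation

def step(f, t, x, h):
    """Advance x(t) to x(t + h) with the boxed three-stage scheme."""
    k1 = f(t, x)
    k2 = f(t + h*b2/(2*a2), x + h*b2/(2*a2)*k1)
    k3 = f(t + h*(b31 + b32)/(2*a3),
           x + h*b31/(2*a3)*k1 + h*b32/(2*a3)*k2)
    return x + h*(a1*k1 + a2*k2 + a3*k3)

# Integrate x' = x from t = 0 to t = 1; the exact answer is e.
t, x, h = 0.0, 1.0, 0.01
for _ in range(100):
    x = step(lambda t, x: x, t, x, h)
    t += h
```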

Despite the constraints on the $\alpha$ and $\beta$ coefficients, there is an enormous freedom of choice here, all leading to valid RKMs, although not necessarily good ones.

General form

A more practical description goes as follows: in an $s$-stage RKM, a weighted average is taken of up to $s$ slope estimates $\vb{k}_j$ with weights $b_j$. Let $\sum_{j} b_j = 1$, then:

$$\begin{aligned} \boxed{ \vb{x}(t \!+\! h) = \vb{x}(t) + h \sum_{j = 1}^{s} b_j \vb{k}_j } \end{aligned}$$

Where the estimates $\vb{k}_1, ..., \vb{k}_s$ depend on each other, and are calculated one by one as:

$$\begin{aligned} \boxed{ \vb{k}_m = \vb{f}\bigg( t + h c_m,\; \vb{x} + h \sum_{j = 1}^{m - 1} a_{mj} \vb{k}_j \bigg) } \end{aligned}$$

With $c_1 = 0$ and $\sum_{j = 1}^{m-1} a_{mj} = c_m$. Writing this out for the first few $m$, the pattern is clear:

$$\begin{aligned} \vb{k}_1 &= \vb{f}(t, \vb{x}) \\ \vb{k}_2 &= \vb{f}\big( t + h c_2,\; \vb{x} + h a_{21} \vb{k}_1 \big) \\ \vb{k}_3 &= \vb{f}\big( t + h c_3,\; \vb{x} + h (a_{31} \vb{k}_1 + a_{32} \vb{k}_2) \big) \\ \vb{k}_4 &= \:... \end{aligned}$$
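These formulas translate almost directly into code. Below is a minimal sketch (names and structure are my own) of a generic explicit RKM stepper driven by its coefficient arrays:

```python
def rk_step(f, t, x, h, a, b, c):
    """One step of an explicit RKM: a[m][j] are the stage coefficients,
    b the weights (summing to 1), c the nodes."""
    k = []
    for m in range(len(b)):
        # Each k_m evaluates f using the previously computed slopes.
        xm = x + h * sum(a[m][j] * k[j] for j in range(m))
        k.append(f(t + c[m] * h, xm))
    return x + h * sum(bj * kj for bj, kj in zip(b, k))

# Example coefficients: the classic fourth-order "RK4" method.
a = [[],
     [1/2],
     [0, 1/2],
     [0, 0, 1]]
b = [1/6, 1/3, 1/3, 1/6]
c = [0, 1/2, 1/2, 1]

# Integrate x' = x from t = 0 to t = 1; the exact answer is e.
t, x, h = 0.0, 1.0, 0.1
for _ in range(10):
    x = rk_step(lambda t, x: x, t, x, h, a, b, c)
    t += h
```

For vector-valued problems, the same function works unchanged if `x` is a NumPy array.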

The coefficients of a given RKM are usually compactly represented in a Butcher tableau:

$$\begin{aligned} \begin{array}{c|ccccc} 0 \\ c_2 & a_{21} \\ c_3 & a_{31} & a_{32} \\ \vdots & \vdots & \vdots & \ddots \\ c_s & a_{s1} & a_{s2} & \cdots & a_{s,s-1} \\ \hline & b_1 & b_2 & \cdots & b_{s-1} & b_s \end{array} \end{aligned}$$
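For example, the classic fourth-order method ("RK4"), probably the most widely used RKM, has $s = 4$ and the following tableau:

$$\begin{aligned} \begin{array}{c|cccc} 0 \\ 1/2 & 1/2 \\ 1/2 & 0 & 1/2 \\ 1 & 0 & 0 & 1 \\ \hline & 1/6 & 1/3 & 1/3 & 1/6 \end{array} \end{aligned}$$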

Each RKM has an order $p$, such that the global truncation error is $\mathcal{O}(h^p)$, i.e. the accumulated difference between the numerical and the exact solutions is proportional to $h^p$.
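This can be checked empirically: halving $h$ should shrink the global error of an order-$p$ method by a factor of roughly $2^p$. A small sketch (my own) using the classic fourth-order method, where the ratio should come out near $2^4 = 16$:

```python
from math import exp

def rk4_solve(f, x0, t_end, n):
    """Integrate x' = f(t, x) from t = 0 to t_end in n classic-RK4 steps."""
    h = t_end / n
    t, x = 0.0, x0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h/2, x + h/2*k1)
        k3 = f(t + h/2, x + h/2*k2)
        k4 = f(t + h, x + h*k3)
        x += h/6 * (k1 + 2*k2 + 2*k3 + k4)
        t += h
    return x

f = lambda t, x: x                           # exact solution: x(t) = e^t
err1 = abs(rk4_solve(f, 1.0, 1.0, 50) - exp(1))
err2 = abs(rk4_solve(f, 1.0, 1.0, 100) - exp(1))
ratio = err1 / err2                          # near 2^4 = 16 for p = 4
```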

The surprise is that $p$ need not be equal to the Taylor expansion order $n$, nor to the stage count $s$. Typically $s = n$ for computational efficiency, but $s > n$ is also possible in theory.

The order $p$ of a given RKM is determined by a complicated set of equations on the coefficients, and the lowest possible $s$ for a desired $p$ is in fact only partially known. For $p \le 4$, methods with $s = p$ exist. For $p = 5$ and $p = 6$, at least $s = p \!+\! 1$ stages are needed, and for $p \ge 7$ at least $s = p \!+\! 2$; the exact minimum $s$ is not known for large $p$.

If you need an RKM with a certain order, look it up. There exist many efficient methods for $p \le 4$ where $s = p$, and although less popular, higher-$p$ methods are also available.

References

  1. J.C. Butcher, Numerical methods for ordinary differential equations, 3rd edition, Wiley.