Categories: Fiber optics, Mathematics, Nonlinear optics, Physics.

Nonlinear Schrödinger equation

The nonlinear Schrödinger (NLS) equation is a nonlinear 1+1D partial differential equation that appears in many areas of physics. It is often given in its dimensionless form, where it governs the envelope u(z,t)u(z, t) of an underlying carrier wave, with tt the transverse coordinate, and r=±1r = \pm 1 a parameter determining which of two regimes the equation is intended for:

iuz+2ut2+ru2u=0\begin{aligned} \boxed{ i \pdv{u}{z} + \pdvn{2}{u}{t} + r |u|^2 u = 0 } \end{aligned}

Many variants exist, depending on the conventions used by authors. The NLS equation is used to describe pulses in fiber optics (as derived below), waves over deep water, local opening of DNA chains, and much more. Very roughly speaking, it is a valid description of “all” weakly nonlinear, slowly modulated waves in physics.

It exhibits an incredible range of behaviors, from “simple” effects such as dispersive broadening, self-phase modulation and first-order solitons, to weirder and more complicated phenomena like modulational instability, optical wave breaking and periodic higher-order solitons. It is also often modified to include additional physics, further enriching its results with e.g. self-steepening and soliton self-frequency shifting.

We only consider fiber optics here; the NLS equation can be derived in many other ways. We start from the most general form of the electromagnetic wave equation, after assuming the medium cannot be magnetized (μr=1\mu_r = 1):

×(×E)=μ0ε02Et2μ02Pt2\begin{aligned} \nabla \cross \big( \nabla \cross \vb{E} \big) = - \mu_0 \varepsilon_0 \pdvn{2}{\vb{E}}{t} - \mu_0 \pdvn{2}{\vb{P}}{t} \end{aligned}

Using the vector identity ×(×E)=(E)2E\nabla \cross (\nabla \cross \vb{E}) = \nabla (\nabla \cdot \vb{E}) - \nabla^2 \vb{E} and Gauss’s law E=0\nabla \cdot \vb{E} = 0, and splitting the polarization P\vb{P} into linear and nonlinear contributions PL\vb{P}_\mathrm{L} and PNL\vb{P}_\mathrm{NL}:

2Eμ0ε02Et2=μ02PLt2+μ02PNLt2\begin{aligned} \nabla^2 \vb{E} - \mu_0 \varepsilon_0 \pdvn{2}{\vb{E}}{t} &= \mu_0 \pdvn{2}{\vb{P}_\mathrm{L}}{t} + \mu_0 \pdvn{2}{\vb{P}_\mathrm{NL}}{t} \end{aligned}

In general, PL\vb{P}_\mathrm{L} is given by the convolution of E\vb{E} with a second-rank response tensor χ(1)\chi^{(1)}:

PL(r,t)=ε0χ(1)(tt)E(r,t)dt\begin{aligned} \vb{P}_\mathrm{L}(\vb{r}, t) = \varepsilon_0 \int_{-\infty}^\infty \chi^{(1)}(t - t') \cdot \vb{E}(\vb{r}, t') \dd{t'} \end{aligned}

In PNL\vb{P}_\mathrm{NL} we only include third-order nonlinearities, since higher orders are usually negligible, and second-order nonlinear effects only exist in very specific crystals. So we “only” need to deal with a fourth-rank response tensor χ(3)\chi^{(3)}:

PNL(r,t)=ε0χ(3)(t ⁣ ⁣t1,t ⁣ ⁣t2,t ⁣ ⁣t3)E(r,t1)E(r,t2)E(r,t3)dt1dt2dt3\begin{aligned} \vb{P}_\mathrm{NL}(\vb{r}, t) = \varepsilon_0 \iiint_{-\infty}^\infty \chi^{(3)}(t \!-\! t_1, t \!-\! t_2, t \!-\! t_3) \:\vdots\: \vb{E}(\vb{r}, t_1) \vb{E}(\vb{r}, t_2) \vb{E}(\vb{r}, t_3) \dd{t_1} \dd{t_2} \dd{t_3} \end{aligned}

In practice, two phenomena contribute to χ(3)\chi^{(3)}: the Kerr effect due to electrons’ response to E\vb{E}, and Raman scattering due to nuclei’s response, which is slower because of their mass. But if the light pulses are sufficiently long (>1ps in silica), both effects can be treated as fast, so:

χ(3)(t ⁣ ⁣t1,t ⁣ ⁣t2,t ⁣ ⁣t3)=χ(3)δ(tt1)δ(tt2)δ(tt3)\begin{aligned} \chi^{(3)}(t \!-\! t_1, t \!-\! t_2, t \!-\! t_3) &= \chi^{(3)} \delta(t - t_1) \delta(t - t_2) \delta(t - t_3) \end{aligned}

Where δ\delta is the Dirac delta function. To keep things simple, we consider linearly xx-polarized light E=x^E\vb{E} = \vu{x} |\vb{E}|, such that the tensor can be replaced with its scalar element χxxxx(3)\chi^{(3)}_{xxxx}. Then:

PNL=ε0χxxxx(3)(EE)E\begin{aligned} \vb{P}_\mathrm{NL} = \varepsilon_0 \chi^{(3)}_{xxxx} \big( \vb{E} \cdot \vb{E} \big) \vb{E} \end{aligned}

For the same reasons, the linear polarization is reduced to:

PL=ε0χxx(1)E\begin{aligned} \vb{P}_\mathrm{L} &= \varepsilon_0 \chi^{(1)}_{xx} \vb{E} \end{aligned}

Next, we decompose E\vb{E} as follows, consisting of a carrier wave eiω0te^{-i \omega_0 t} at a constant frequency ω0\omega_0, modulated by an envelope EE that is assumed to be slowly-varying compared to the carrier, plus the complex conjugate Eeiω0tE^* e^{i \omega_0 t}:

E(r,t)=x^12(E(r,t)eiω0t+E(r,t)eiω0t)\begin{aligned} \vb{E}(\vb{r}, t) &= \vu{x} \frac{1}{2} \Big( E(\vb{r}, t) e^{- i \omega_0 t} + E^*(\vb{r}, t) e^{i \omega_0 t} \Big) \end{aligned}

Note that no generality has been lost in this step. Inserting it into the polarizations:

PL=x^12ε0χxx(1)(E(r,t)eiω0t+E(r,t)eiω0t)PNL=x^18ε0χxxxx(3)(Eeiω0t+Eeiω0t)3=x^18ε0χxxxx(3)(E3ei3ω0t+3E2Eeiω0t+3E(E)2eiω0t+(E)3ei3ω0t)\begin{aligned} \mathrm{P}_\mathrm{L} &= \vu{x} \frac{1}{2} \varepsilon_0 \chi^{(1)}_{xx} \Big( E(\vb{r}, t) e^{- i \omega_0 t} + E^*(\vb{r}, t) e^{i \omega_0 t} \Big) \\ \vb{P}_\mathrm{NL} &= \vu{x} \frac{1}{8} \varepsilon_0 \chi^{(3)}_{xxxx} \Big( E e^{- i \omega_0 t} + E^* e^{i \omega_0 t} \Big)^{3} \\ &= \vu{x} \frac{1}{8} \varepsilon_0 \chi^{(3)}_{xxxx} \Big( E^3 e^{- i 3 \omega_0 t} + 3 E^2 E^* e^{- i \omega_0 t} + 3 E (E^*)^2 e^{i \omega_0 t} + (E^*)^3 e^{i 3 \omega_0 t} \Big) \end{aligned}

The terms with 3ω03 \omega_0 represent third-harmonic generation, and only matter if the carrier is phase-matched to the tripled wave, which is generally not the case, so they can be ignored. Now, if we decompose the polarizations in the same was as E\vb{E}:

PL(r,t)=x^12(PL(r,t)eiω0t+PL(r,t)eiω0t)PNL(r,t)=x^12(PNL(r,t)eiω0t+PNL(r,t)eiω0t)\begin{aligned} \vb{P}_\mathrm{L}(\vb{r}, t) &= \vu{x} \frac{1}{2} \Big( P_\mathrm{L}(\vb{r}, t) e^{- i \omega_0 t} + P_\mathrm{L}^*(\vb{r}, t) e^{i \omega_0 t} \Big) \\ \vb{P}_\mathrm{NL}(\vb{r}, t) &= \vu{x} \frac{1}{2} \Big( P_\mathrm{NL}(\vb{r}, t) e^{- i \omega_0 t} + P_\mathrm{NL}^*(\vb{r}, t) e^{i \omega_0 t} \Big) \end{aligned}

Then it is straightforward to see that their envelope functions are given by:

PL=ε0χxx(1)EPNL=34ε0χxxxx(3)E2E\begin{aligned} P_\mathrm{L} &= \varepsilon_0 \chi^{(1)}_{xx} E \\ P_\mathrm{NL} &= \frac{3}{4} \varepsilon_0 \chi^{(3)}_{xxxx} |E|^2 E \end{aligned}

The forward carrier eiω0te^{- i \omega_0 t} and the backward carrier eiω0te^{i \omega_0 t} can be regarded as separate channels, which only interact via PNLP_\mathrm{NL}. From now on, we only consider the forward-propagating wave, so all terms containing eiω0te^{i \omega_0 t} are dropped; by taking the complex conjugate of the resulting equations, the backward-propagating counterparts can always be recovered, so no information is really lost. Therefore, the main wave equation becomes:

0=(2Eμ0ε02Et2μ02PLt2μ02PNLt2)eiω0t(2E(1+χxx(1)+34χxxxx(3)E2)μ0ε02Et2)eiω0t\begin{aligned} 0 &= \bigg( \nabla^2 E - \mu_0 \varepsilon_0 \pdvn{2}{E}{t} - \mu_0 \pdvn{2}{P_\mathrm{L}}{t} - \mu_0 \pdvn{2}{P_\mathrm{NL}}{t} \bigg) e^{-i \omega_0 t} \\ &\approx \bigg( \nabla^2 E - \Big( 1 + \chi^{(1)}_{xx} + \frac{3}{4} \chi^{(3)}_{xxxx} |E|^2 \Big) \mu_0 \varepsilon_0 \pdvn{2}{E}{t} \bigg) e^{-i \omega_0 t} \end{aligned}

Where we have used our assumption that EE is slowly-varying to treat E2|E|^2 as a constant, in order to move it outside the tt-derivative. We thus arrive at:

0=(2Eεrc22Et2)eiω0t\begin{aligned} 0 &= \bigg( \nabla^2 E - \frac{\varepsilon_r}{c^2} \pdvn{2}{E}{t} \bigg) e^{-i \omega_0 t} \end{aligned}

Where c=1/μ0ε0c = 1 / \sqrt{\mu_0 \varepsilon_0} is the phase velocity of light in a vacuum, and the relative permittivity εr\varepsilon_r is defined as shown below. Note that this is a mild abuse of notation, since the symbol εr\varepsilon_r is usually reserved for linear materials:

εr1+χxx(1)+34χxxxx(3)E2\begin{aligned} \varepsilon_r \equiv 1 + \chi^{(1)}_{xx} + \frac{3}{4} \chi^{(3)}_{xxxx} |E|^2 \end{aligned}

Next, we take the Fourier transform tωt \to \omega of the wave equation, again treating E2|E|^2 (inside εr\varepsilon_r) as a constant. The constant s=±1s = \pm 1 is included here to deal with the fact that different authors use different sign conventions:

0=F^{(2Eεrc22Et2)eiω0t}=(2Eεrc22Et2)eis(ωω0)tdt=2E+s2(ωω0)2εrc2E\begin{aligned} 0 &= \hat{\mathcal{F}}\bigg\{ \bigg( \nabla^2 E - \frac{\varepsilon_r}{c^2} \pdvn{2}{E}{t} \bigg) e^{-i \omega_0 t} \bigg\} \\ &= \int_{-\infty}^\infty \bigg( \nabla^2 E - \frac{\varepsilon_r}{c^2} \pdvn{2}{E}{t} \bigg) e^{i s (\omega - \omega_0) t} \dd{t} \\ &= \nabla^2 E + s^2 (\omega - \omega_0)^2 \frac{\varepsilon_r}{c^2} E \end{aligned}

We use s2=1s^2 = 1 and define Ωωω0\Omega \equiv \omega - \omega_0 as the frequency shift relative to the carrier wave:

0=2E+Ω2εrc2E\begin{aligned} 0 &= \nabla^2 E + \frac{\Omega^2 \varepsilon_r}{c^2} E \end{aligned}

This is a so-called Helmholtz equation in 3D, which we will solve using separation of variables, by assuming that its solution can be written as:

E(r,Ω)=F(x,y)A(z,Ω)eiβ0z\begin{aligned} E(\vb{r}, \Omega) &= F(x, y) \: A(z, \Omega) \: e^{i \beta_0 z} \end{aligned}

Where β0\beta_0 is the wavenumber of the carrier, which will be determined later. Inserting this ansatz into the Helmholtz equation yields:

0=(2Fx2+2Fy2)Aeiβ0z+2z2(Aeiβ0z)F+Ω2εrc2FAeiβ0z=((2Fx2+2Fy2)A+(2Az2+2iβ0Azβ02A)F+Ω2εrc2FA)eiβ0z\begin{aligned} 0 &= \Big( \pdvn{2}{F}{x} + \pdvn{2}{F}{y} \Big) A e^{i \beta_0 z} + \pdvn{2}{}{z} \Big( A e^{i \beta_0 z} \Big) F + \frac{\Omega^2 \varepsilon_r}{c^2} F A e^{i \beta_0 z} \\ &= \bigg( \Big( \pdvn{2}{F}{x} + \pdvn{2}{F}{y} \Big) A + \Big( \pdvn{2}{A}{z} + 2 i \beta_0 \pdv{A}{z} - \beta_0^2 A \Big) F + \frac{\Omega^2 \varepsilon_r}{c^2} F A \bigg) e^{i \beta_0 z} \end{aligned}

We divide by FAeiβ0zF A \: e^{i \beta_0 z} and rearrange the terms in a specific way:

(2Fx2+2Fy2)1F+Ω2εrc2=2iβ0Az1A+β02\begin{aligned} \Big( \pdvn{2}{F}{x} + \pdvn{2}{F}{y} \Big) \frac{1}{F} + \frac{\Omega^2 \varepsilon_r}{c^2} &= - 2 i \beta_0 \pdv{A}{z} \frac{1}{A} + \beta_0^2 \end{aligned}

Now all the xx- and yy-dependence is on the left, and the zz-dependence is on the right. We have placed the εr\varepsilon_r-term on the left too because it depends relatively strongly on (x,y)(x, y) to describe the fiber’s internal structure, and weakly on zz due to nonlinear effects. Meanwhile, β0\beta_0 is on the right because that will lead to a nicer equation for AA later.

Note that both sides are functions of Ω\Omega. Based on the aforementioned dependences, in order for this equation to have a solution for all (x,y,z)(x, y, z), there must exist a quantity β(Ω)\beta(\Omega) that is constant in space, such that we obtain two separated equations for FF and AA:

β(ω)=(2Fx2+2Fy2)1F+ω2εrc2β(Ω)=2iβ0Az1A+β02\begin{aligned} \beta(\omega) &= \bigg( \pdvn{2}{F}{x} + \pdvn{2}{F}{y} \bigg) \frac{1}{F} + \frac{\omega^2 \varepsilon_r}{c^2} \\ \beta(\Omega) &= - 2 i \beta_0 \pdv{A}{z} \frac{1}{A} + \beta_0^2 \end{aligned}

Note that we replaced Ω\Omega with ω\omega in FF’s equation (and redefined β\beta and εr\varepsilon_r accordingly). This is not an innocent detail: the idea is that ωεr/c\omega \sqrt{\varepsilon_r} / c would be the light’s wavenumber if it had not been trapped in a waveguide, and that β\beta is the confined wavenumber, also known as the propagation constant. If we had kept Ω\Omega, the meaning of β\beta would not be so straightforward.

The difference between β(ω)\beta(\omega) and β0\beta_0 is simply that β0β(ω0)\beta_0 \equiv \beta(\omega_0). Our ansatz for separating the variables contained β0\beta_0, such that the full carrier wave eiβ0ziω0te^{i \beta_0 z - i \omega_0 t} was represented (with eiω0te^{- i \omega_0 t} now hidden inside the Fourier transform). But later, to properly describe how light behaves inside the fiber, the full dispersion relation β(ω)\beta(\omega) will be needed.

Multiplying by FF and AA, we get the following set of equations, implicitly coupled via β\beta:

0=2Fx2+2Fy2+(ω2εrc2β2)F0=2iβ0Az+(β2β02)A\begin{aligned} \boxed{ \begin{aligned} 0 &= \pdvn{2}{F}{x} + \pdvn{2}{F}{y} + \bigg( \frac{\omega^2 \varepsilon_r}{c^2} - \beta^2 \bigg) F \\ 0 &= 2 i \beta_0 \pdv{A}{z} + \big( \beta^2 - \beta_0^2 \big) A \end{aligned} } \end{aligned}

The equation for FF must be solved first. To do so, we treat the nonlinearity as a perturbation to be neglected initially. In other words, we first solve the following eigenvalue problem for β2\beta^2, where n(x,y)n(x, y) is the linear refractive index, with n2=1+Re{χxx(1)}εrn^2 = 1 + \Real\{\chi^{(1)}_{xx}\} \approx \varepsilon_r:

2Fx2+2Fy2+(ω2n2c2β2)F=0\begin{aligned} \pdvn{2}{F}{x} + \pdvn{2}{F}{y} + \bigg( \frac{\omega^2 n^2}{c^2} - \beta^2 \bigg) F = 0 \end{aligned}

This gives us the allowed values of β\beta; see step-index fiber for an example solution. Now we add the small index change Δn(x,y)\Delta{n}(x, y) due to nonlinear effects:

εr=(n+Δn)2n2+2nΔn\begin{aligned} \varepsilon_r = (n + \Delta{n})^2 \approx n^2 + 2 n \: \Delta{n} \end{aligned}

Then it can be shown using first-order perturbation theory that the eigenfunction FF is not really affected, and the eigenvalue β2\beta^2 is shifted by Δ(β2)\Delta(\beta^2), given by:

Δ(β2)=2ω2c2nΔnF2dxdyF2dxdy\begin{aligned} \Delta(\beta^2) = \frac{2 \omega^2}{c^2} \frac{\displaystyle \iint_{-\infty}^\infty n \: \Delta{n} \: |F|^2 \dd{x} \dd{y}} {\displaystyle \iint_{-\infty}^\infty |F|^2 \dd{x} \dd{y}} \end{aligned}

But we are more interested in the wavenumber shift Δβ\Delta{\beta} than the eigenvalue shift Δ(β2)\Delta(\beta^2). They are related to one another as follows:

β2+Δ(β2)=(β+Δβ)2β2+2βΔβ\begin{aligned} \beta^2 + \Delta(\beta^2) = (\beta + \Delta{\beta})^2 \approx \beta^2 + 2 \beta \Delta{\beta} \end{aligned}

Furthermore, we assume that the fiber only consists of materials with similar refractive indices, or in other words, that it confines the light using only a small index difference, in which case we can treat nn as a constant and move it outside the integral. Then Δβ\Delta{\beta} becomes:

Δβ=ω2nβc2ΔnF2dxdyF2dxdy\begin{aligned} \Delta{\beta} = \frac{\omega^2 n}{\beta c^2} \frac{\displaystyle \iint_{-\infty}^\infty \Delta{n} \: |F|^2 \dd{x} \dd{y}} {\displaystyle \iint_{-\infty}^\infty |F|^2 \dd{x} \dd{y}} \end{aligned}

Recall that β\beta is the wavenumber of the confined mode: by solving the unperturbed FF-equation, it can be shown that β\beta’s value is somewhere between the bulk wavenumbers of the fiber materials. Since we just approximated nn as a constant, this means that ωn/cβ\omega n / c \approx \beta, leading us to the general “final” form of Δβ\Delta{\beta}, with all the arguments shown for clarity:

Δβ(ω)=ωcAmodeΔn(x,y,ω)F(x,y)2dxdy\begin{aligned} \boxed{ \Delta{\beta}(\omega) = \frac{\omega}{c \mathcal{A}_\mathrm{mode}} \iint_{-\infty}^\infty \Delta{n}(x, y, \omega) \: |F(x, y)|^2 \dd{x} \dd{y} } \end{aligned}

Where we have defined the mode area Amode\mathcal{A}_\mathrm{mode} as shown below. In order for Amode\mathcal{A}_\mathrm{mode} to be in units of area, FF must be dimensionless, and consequently AA has (SI) units of an electric field.

AmodeF2dxdy\begin{aligned} \mathcal{A}_\mathrm{mode} \equiv \iint_{-\infty}^\infty |F|^2 \dd{x} \dd{y} \end{aligned}

Now we finally turn our attention to the equation for AA. Before perturbation, it was:

0=2iβ0Az+(β2β02)A\begin{aligned} 0 &= 2 i \beta_0 \pdv{A}{z} + \big( \beta^2 - \beta_0^2 \big) A \end{aligned}

Where ββ0\beta \approx \beta_0, so we can replace β2β02\beta^2 - \beta_0^2 with 2β0(ββ0)2 \beta_0 (\beta - \beta_0). Also including Δβ\Delta{\beta}, we get:

0=iAz+(β+Δββ0)A\begin{aligned} 0 &= i \pdv{A}{z} + \big( \beta + \Delta{\beta} - \beta_0 \big) A \end{aligned}

Usually, we do not know a full expression for β(ω)\beta(\omega), so it makes sense to expand it around the carrier frequency ω0\omega_0 as follows, where βn=dnβ/dωnω=ω0\beta_n = \idvn{n}{\beta}{\omega} |_{\omega = \omega_0}:

β(ω)=β0+(ωω0)β1+(ωω0)2β22+(ωω0)3β36+...\begin{aligned} \beta(\omega) &= \beta_0 + (\omega - \omega_0) \beta_1 + (\omega - \omega_0)^2 \frac{\beta_2}{2} + (\omega - \omega_0)^3 \frac{\beta_3}{6} + \: ... \end{aligned}

Spectrally, the broader the light pulse, the more terms must be included. Recall that earlier, in order to treat χ(3)\chi^{(3)} as instantaneous, we already assumed a temporally broad (spectrally narrow) pulse. Hence, for simplicity, we can cut off this Taylor series at β2\beta_2, which is good enough in many cases. Inserting the expansion into AA’s equation:

0=iAz+iβ1s(isΩ)Aβ22s2(isΩ)2A+Δβ0A\begin{aligned} 0 &= i \pdv{A}{z} + i \frac{\beta_1}{s} (-i s \Omega) A - \frac{\beta_2}{2 s^2} (- i s \Omega)^2 A + \Delta{\beta}_0 A \end{aligned}

Which we have rewritten in preparation for taking the inverse Fourier transform, by introducing ss and by replacing Δβ(ω)\Delta{\beta}(\omega) with Δβ0Δβ(ω0)\Delta{\beta_0} \equiv \Delta{\beta}(\omega_0) in order to remove all explicit dependence on ω\omega, i.e. we only keep the first term of Δβ\Delta{\beta}’s Taylor expansion. After transforming and using s2=1s^2 = 1, we get the following equation for A(z,t)A(z, t):

0=iAz+isβ1Atβ222At2+Δβ0A\begin{aligned} 0 &= i \pdv{A}{z} + i s \beta_1 \pdv{A}{t} - \frac{\beta_2}{2} \pdvn{2}{A}{t} + \Delta{\beta}_0 A \end{aligned}

The next step is to insert our expression for Δβ0\Delta{\beta}_0, for which we must first choose a specific form for Δn\Delta{n} according to which effects we want to include. Earlier, we approximated εrn2\varepsilon_r \approx n^2, so if we instead say that εr=(n ⁣+ ⁣Δn)2\varepsilon_r = (n \!+\! \Delta{n})^2, then Δn\Delta{n} should include absorption and nonlinearity. The most commonly used form for Δn\Delta{n} is therefore:

Δn(x,y,ω)=n2(ω)I(x,y,ω)+icα(ω)2ω\begin{aligned} \Delta{n}(x, y, \omega) = n_2(\omega) \: I(x, y, \omega) + i \frac{c \alpha(\omega)}{2 \omega} \end{aligned}

Where II is the intensity (i.e. power per unit area) of the light, n2n_2 is the material’s Kerr coefficient in units of inverse intensity, and α\alpha is the attenuation coefficient consisting of linear and nonlinear contributions (see multi-photon absorption). Specifically, they are given by:

n2=3Re{χxxxx(3)}4ε0cn2α=ωIm{χxx(1)}cn+3ωIm{χxxxx(3)}2ε0c2n2II=ε0cn2F2A2\begin{aligned} n_2 = \frac{3 \Real\{\chi^{(3)}_{xxxx}\}}{4 \varepsilon_0 c n^2} \qquad \alpha = \frac{\omega \Imag\{\chi^{(1)}_{xx}\}}{c n} + \frac{3 \omega \Imag\{\chi^{(3)}_{xxxx}\}}{2 \varepsilon_0 c^2 n^2} I \qquad I = \frac{\varepsilon_0 c n}{2} |F|^2 |A|^2 \end{aligned}

For simplicity we set Im{χxxxx(3)}=0\Imag\{\chi^{(3)}_{xxxx}\} = 0, which is a good approximation for silica fibers. Inserting this form of Δn\Delta{n} into Δβ0\Delta{\beta_0} and neglecting the (x,y)(x, y)-dependence of Δn\Delta{n} yields:

Δβ0=iα2AmodeAmode+ω0ε0cnn22cAmodeA2F4dxdy=iα2+γ0ε0cn2AmodeA2\begin{aligned} \Delta{\beta}_0 &= i \frac{\alpha}{2} \frac{\mathcal{A}_\mathrm{mode}}{\mathcal{A}_\mathrm{mode}} + \frac{\omega_0 \varepsilon_0 c n n_2}{2 c \mathcal{A}_\mathrm{mode}} |A|^2 \iint_{-\infty}^\infty |F|^4 \dd{x} \dd{y} \\ &= i \frac{\alpha}{2} + \gamma_0 \frac{\varepsilon_0 c n}{2} \mathcal{A}_\mathrm{mode} |A|^2 \end{aligned}

Where we have defined the parameter γ0γ(ω0)\gamma_0 \equiv \gamma(\omega_0) like so, involving the effective mode area Aeff\mathcal{A}_\mathrm{eff}, which contains all information about FF needed for solving AA’s equation:

γ(ω)ωn2(ω)cAeff(ω)Aeff(ω)(F2dxdy)2F4dxdy\begin{aligned} \boxed{ \gamma(\omega) \equiv \frac{\omega n_2(\omega)}{c \mathcal{A}_\mathrm{eff}(\omega)} } \qquad \qquad \boxed{ \mathcal{A}_\mathrm{eff}(\omega) \equiv \frac{\displaystyle \bigg( \iint_{-\infty}^\infty |F|^2 \dd{x} \dd{y} \bigg)^2} {\displaystyle \iint_{-\infty}^\infty |F|^4 \dd{x} \dd{y}} } \end{aligned}

Note the ω\omega-dependence of AeffA_\mathrm{eff}: so far we have conveniently ignored that FF also depends on ω\omega, because it is a parameter in its eigenvalue equation. This is valid for spectrally narrow pulses, so we will stick with it. Just beware that some people make the ad-hoc generalization γ0γ(ω)\gamma_0 \to \gamma(\omega), which is not correct in general (this is an advanced topic, see Lægsgaard).

Substituting Δβ0\Delta{\beta_0} into the main problem yields a prototype of the NLS equation:

0=iAz+isβ1Atβ222At2+iα2A+γ0ε0cn2AmodeA2A\begin{aligned} 0 &= i \pdv{A}{z} + i s \beta_1 \pdv{A}{t} - \frac{\beta_2}{2} \pdvn{2}{A}{t} + i \frac{\alpha}{2} A + \gamma_0 \frac{\varepsilon_0 c n}{2} \mathcal{A}_\mathrm{mode} |A|^2 A \end{aligned}

The factor ε0cn/2\varepsilon_0 c n / 2 looks familiar from the intensity II. This, combined with Amode\mathcal{A}_\mathrm{mode} and the fact that AA is an electric field, suggests that we can redefine AAA \to A' such that A2|A'|^2 is the optical power in watts. Hence we make the following transformation:

ε0cn2AmodeA2A2\begin{aligned} \frac{\varepsilon_0 c n}{2} \mathcal{A}_\mathrm{mode} |A|^2 \:\:\to\:\: |A|^2 \end{aligned}

We can divide away the transformation factors from all other terms in the equation, since they are linear, leading to the full nonlinear Schrödinger equation:

0=iAz+isβ1Atβ222At2+iα2A+γ0A2A\begin{aligned} \boxed{ 0 = i \pdv{A}{z} + i s \beta_1 \pdv{A}{t} - \frac{\beta_2}{2} \pdvn{2}{A}{t} + i \frac{\alpha}{2} A + \gamma_0 |A|^2 A } \end{aligned}

This can be reduced by switching to a coordinate system where the time axis slides along the propagation axis at a speed svs v, so we define ZzZ \equiv z and Ttsz/vT \equiv t - s z / v such that:

Az=AZZz+ATTz=AZsvATAt=AZZt+ATTt=AT\begin{aligned} \pdv{A}{z} &= \pdv{A}{Z} \pdv{Z}{z} + \pdv{A}{T} \pdv{T}{z} = \pdv{A}{Z} - \frac{s}{v} \pdv{A}{T} \\ \pdv{A}{t} &= \pdv{A}{Z} \pdv{Z}{t} + \pdv{A}{T} \pdv{T}{t} = \pdv{A}{T} \end{aligned}

We insert this and set v=vgv = v_g, where vg=1/β1v_g = 1 / \beta_1 is the light’s group velocity:

0=iAZβ222AT2+iα2A+γ0A2A\begin{aligned} \boxed{ 0 = i \pdv{A}{Z} - \frac{\beta_2}{2} \pdvn{2}{A}{T} + i \frac{\alpha}{2} A + \gamma_0 |A|^2 A } \end{aligned}

The NLS equation’s name is due to its similarity to the Schrödinger equation of quantum physics, if you set α=0\alpha = 0 and treat γ0A2\gamma_0 |A|^2 as a potential. In fiber optics, the equation is usually rearranged to highlight that ZZ (or zz) is the propagation direction:

AZ=iβ222AT2α2A+iγ0A2A\begin{aligned} \pdv{A}{Z} = - i \frac{\beta_2}{2} \pdvn{2}{A}{T} - \frac{\alpha}{2} A + i \gamma_0 |A|^2 A \end{aligned}

Next, we want to reduce the equation to its dimensionless form. To do so, we make the following coordinate transformation, where A~\tilde{A}, Z~\tilde{Z} and T~\tilde{T} are unitless, and AcA_c, ZcZ_c and TcT_c are dimensioned scale parameters to be determined later:

A~(Z~,T~)=A(Z,T)AcZ~=ZZcT~=TTc\begin{aligned} \tilde{A}(\tilde{Z}, \tilde{T}) = \frac{A(Z, T)}{A_c} \qquad\qquad \tilde{Z} = \frac{Z}{Z_c} \qquad\qquad \tilde{T} = \frac{T}{T_c} \end{aligned}

We insert this into the NLS equation, after setting α=0\alpha = 0 according to convention:

0=iAcZcA~Z~β22AcTc22A~T~2+γ0Ac3A~2A~\begin{aligned} 0 = i \frac{A_c}{Z_c} \pdv{\tilde{A}}{\tilde{Z}} - \frac{\beta_2}{2} \frac{A_c}{T_c^2} \pdvn{2}{\tilde{A}}{\tilde{T}} + \gamma_0 A_c^3 \big|\tilde{A}\big|^2 \tilde{A} \end{aligned}

Multiplying by Zc/AcZ_c / A_c to make all terms dimensionless leads us to:

0=iA~Z~β2Zc2Tc22A~T~2+γ0Ac2ZcA~2A~\begin{aligned} 0 = i \pdv{\tilde{A}}{\tilde{Z}} - \frac{\beta_2 Z_c}{2 T_c^2} \pdvn{2}{\tilde{A}}{\tilde{T}} + \gamma_0 A_c^2 Z_c \big|\tilde{A}\big|^2 \tilde{A} \end{aligned}

The goal is to remove those constant factors. In other words, we demand:

β2Zc2Tc2=1γ0Ac2Zc=r\begin{aligned} \frac{\beta_2 Z_c}{2 T_c^2} = -1 \qquad\qquad \gamma_0 A_c^2 Z_c = r \end{aligned}

Where r±1r \equiv \pm 1, whose sign choice will be explained shortly. Note that we have two equations for three unknowns (AcA_c, ZcZ_c and TcT_c), so one of the parameters needs to fixed manually. For example, we could choose our “input power” Ac1WA_c \equiv \sqrt{1\:\mathrm{W}}, and then:

Zc=2Tc2β2Tc2=rβ22γ0Ac2    Zc=rγ0Ac2Tc=rβ22γ0Ac2\begin{aligned} Z_c = - \frac{2 T_c^2}{\beta_2} \qquad T_c^2 = -\frac{r \beta_2}{2 \gamma_0 A_c^2} \qquad\implies\qquad Z_c = \frac{r}{\gamma_0 A_c^2} \qquad T_c = \sqrt{ -\frac{r \beta_2}{2 \gamma_0 A_c^2} } \end{aligned}

Because TcT_c must be real, we should choose rsgn(γ0β2)r \equiv - \sgn(\gamma_0 \beta_2). We thus arrive at:

0=iA~Z~+2A~T~2+rA~2A~\begin{aligned} \boxed{ 0 = i \pdv{\tilde{A}}{\tilde{Z}} + \pdvn{2}{\tilde{A}}{\tilde{T}} + r \big|\tilde{A}\big|^2 \tilde{A} } \end{aligned}

In fiber optics, γ0>0\gamma_0 > 0 for all materials, meaning rr represents the dispersion regime, so r=1r = 1 is called anomalous dispersion and r=1r = -1 normal dispersion. In some other fields, where β2<0\beta_2 < 0 always, r=1r = 1 is called a focusing nonlinearity and r=1r = -1 a defocusing nonlinearity. The famous bright solitons only exist for r=1r = 1, so many authors only show that case.

References

  1. G.P. Agrawal, Nonlinear fiber optics, 6th edition, Elsevier.
  2. O. Bang, Nonlinear mathematical physics: lecture notes, 2020, unpublished.
  3. J. Lægsgaard, Mode profile dispersion in the generalized nonlinear Schrödinger equation, 2007, Optica.