Chain Rule

When we have one function inside another function (this is called a composite function), we differentiate using the chain rule. An example of a composite function is \(y=\sin(x^2 + 1)\). There are two functions \(\sin(\ldots)\) and \(x^2+1\). We are applying the \(\sin\) function to \(x^2+1\), thus making it a function inside a function, or a composite function.

The general form of a composite function is:

\[ y = f[g(x)], \]

where \(f\) and \(g\) are both functions. In the above example \(y=\sin(x^2+1)\), we would have \(f[g(x)] = \sin[g(x)]\) and \(g(x) = x^2+1\).

To use the chain rule, we have the following steps:

  1. Introduce a new variable \(u\) to be equal to \(g(x)\).

  2. Substitute \(u=g(x)\) into the expression \(y=f[g(x)]\), so that \(y=f(u)\).

  3. Find \(\frac{dy}{du}\) and \(\frac{du}{dx}\).

  4. The derivative \(\frac{dy}{dx}\) can be found by this equation

    \[ \frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx} \]

    and then substitute \(u=g(x)\) back in.

The chain rule can be summarised as:

\[ \text{If }y=f[g(x)]\text{ let }u=g(x)\text{ hence }y=f[g(x)] = f(u)\text{ then }\frac{dy}{dx}=\frac{dy}{du}\times\frac{du}{dx} \]

Below, we follow the steps of the chain rule to differentiate \(y = \sin(x^2 + 1)\).

  1. Introduce a new variable \(u\) to be equal to \(g(x)\). For this example, \(u = x^2+1\).

  2. Substitute \(u=g(x)\) into the expression \(y=f[g(x)]\), so that \(y=f(u)\). As \(u=x^2+1\), this would make \(y = \sin(u)\).

  3. Find \(\frac{dy}{du}\) and \(\frac{du}{dx}\). In this example, we have:

    \[ y = \sin(u) \Rightarrow \frac{dy}{du} = \cos(u) \]

    and

    \[ u = x^2 + 1 \Rightarrow \frac{du}{dx} = 2x \]
  4. The derivative \(\frac{dy}{dx}\) can be found from the equation \(\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}\), after which we substitute \(u=g(x)\) back in.

    \[ \frac{dy}{dx} = \cos(u)\times 2x = 2x\cos(x^2+1) \]

Python's sympy library applies the chain rule automatically when differentiating.

from sympy import symbols, diff, sin

x = symbols('x')

diff(sin(x ** 2 + 1))
\[\displaystyle 2 x \cos{\left(x^{2} + 1 \right)}\]
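
The four steps above can also be followed one at a time in sympy. This is only a rough sketch: the names `g`, `y_of_u`, `dy_du`, `du_dx`, `dy_dx` and `difference` are illustrative choices made here, not part of sympy.

```python
from sympy import symbols, diff, sin, simplify

x, u = symbols('x u')

# Step 1: name the inner function g(x) as u.
g = x ** 2 + 1

# Step 2: write y as a function of u alone.
y_of_u = sin(u)

# Step 3: differentiate each piece separately.
dy_du = diff(y_of_u, u)
du_dx = diff(g, x)

# Step 4: multiply the two derivatives, then substitute u = g(x) back in.
dy_dx = (dy_du * du_dx).subs(u, g)

# The difference from sympy's direct derivative should simplify to zero.
difference = simplify(dy_dx - diff(sin(x ** 2 + 1), x))
```

If `difference` is zero, the step-by-step result agrees with the direct call to `diff`.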

Example

Differentiate \(y=(x^2 + 2)^3\) using the chain rule.

Solution: We take \(u = x^2 + 2\) and therefore \(y=u^3\).

Then we differentiate each of those to find:

\[ \frac{du}{dx} = 2x \;\;\;\;\; \frac{dy}{du} = 3u^2 \]
  • From the chain rule: \(\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}\)

  • Substitute in what we know for \(\frac{dy}{du}\) and \(\frac{du}{dx}\): \(\frac{dy}{dx} = 3u^2 \times 2x\)

  • Substitute that \(u\) is \(x^2+2\): \(\frac{dy}{dx} = 6x(x^2+2)^2\).
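
As a cross-check, which is practical here only because the power is small enough to expand by hand, the same derivative can be found without the chain rule:

\[ y = (x^2+2)^3 = x^6 + 6x^4 + 12x^2 + 8 \Rightarrow \frac{dy}{dx} = 6x^5 + 24x^3 + 24x = 6x(x^4 + 4x^2 + 4) = 6x(x^2+2)^2 \]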

Example

Given \(y=\ln(x^3)\), find \(\frac{dy}{dx}\).

Solution: We take \(u=x^3\), which means that \(y = \ln(u)\).

Then differentiate to find that:

\[ \frac{du}{dx} = 3x^2 \;\;\;\;\; \frac{dy}{du} = \frac{1}{u} \]
  • From the chain rule: \(\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}\)

  • We can see that: \(\frac{dy}{dx} = \frac{1}{u} \times 3x^2\)

  • Substitute that \(u\) is \(x^3\): \(\frac{dy}{dx} = \frac{1}{x^3} \times 3x^2\)

  • Simplify further: \(\frac{dy}{dx} = \frac{3}{x}\)
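
As a cross-check, assuming \(x>0\) so that the logarithm is defined, the log law \(\ln(x^3) = 3\ln(x)\) gives the same result without the chain rule:

\[ y = 3\ln(x) \Rightarrow \frac{dy}{dx} = \frac{3}{x} \]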

Example

Differentiate \(y=\cos(e^{2x})\).

Solution: Let \(u=e^{2x}\), therefore, \(y=\cos(u)\).

Differentiate to find:

\[ \frac{du}{dx} = 2e^{2x} \;\;\;\;\; \frac{dy}{du} = -\sin(u) \]
  • Using the chain rule: \(\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}\)

  • We can show that: \(\frac{dy}{dx} = -\sin(u) \times 2e^{2x}\)

  • Substitute \(u=e^{2x}\): \(\frac{dy}{dx} = -\sin(e^{2x}) \times 2e^{2x}\)

  • Simplify further: \(\frac{dy}{dx} = -2e^{2x}\sin(e^{2x})\).


We can compute the three examples above with Python as follows:

diff((x ** 2 + 2) ** 3)
\[\displaystyle 6 x \left(x^{2} + 2\right)^{2}\]
from sympy import log

diff(log(x ** 3))
\[\displaystyle \frac{3}{x}\]
from sympy import cos, exp

diff(cos(exp(2 * x)))
\[\displaystyle - 2 e^{2 x} \sin{\left(e^{2 x} \right)}\]
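
The hand-derived results can also be checked numerically, independently of sympy. The sketch below compares each derivative against a central finite difference at one test point. The helper `central_difference`, the step size `h` and the test point `x0` are choices made here for illustration only.

```python
import math


def central_difference(f, x0, h=1e-6):
    """Approximate f'(x0) with a symmetric finite difference."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)


# Each entry pairs a function with the derivative found by the chain rule above.
examples = [
    (lambda x: (x ** 2 + 2) ** 3,
     lambda x: 6 * x * (x ** 2 + 2) ** 2),
    (lambda x: math.log(x ** 3),
     lambda x: 3 / x),
    (lambda x: math.cos(math.exp(2 * x)),
     lambda x: -2 * math.exp(2 * x) * math.sin(math.exp(2 * x))),
]

x0 = 0.7  # arbitrary positive test point (the logarithm needs x > 0)

# Absolute gap between the numerical estimate and the hand-derived formula.
errors = [abs(central_difference(f, x0) - df(x0)) for f, df in examples]
```

Each entry of `errors` should be very small. A large value would suggest a slip in the corresponding derivative, although a single test point cannot prove a formula correct.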

Maxwell Boltzmann Distribution

The Maxwell Boltzmann distribution is a probability distribution for finding particles at a certain speed \(v\) in three dimensions. It has the form:

\[ f(v) = Av^2e^{-Bv^2}, \]

where \(A\) and \(B\) are positive constants. Using the chain rule and the product rule, find \(\frac{d}{dv}f(v)\).

Solution: This example involves two steps. First, the product rule is required because \(f(v)\) is a product of two functions. Second, the chain rule is needed to differentiate the exponential. Let \(P = Av^2\) and \(Q = e^{-Bv^2}\), so that \(f(v) = P\times Q\). From the product rule:

\[ \frac{d}{dv}f(v) = P\frac{dQ}{dv} + Q\frac{dP}{dv} \]

We have \(\frac{dP}{dv} = 2Av\), but we will need to use the chain rule to find \(\frac{dQ}{dv}\).

We let \(u=-Bv^2\), making \(Q = e^u\). We can then calculate that \(\frac{du}{dv} = -2Bv\) and that \(\frac{dQ}{du} = e^u\).

  • From here, the chain rule tells us that: \(\frac{dQ}{dv} = \frac{dQ}{du} \times \frac{du}{dv}\)

  • We put in what we have for \(\frac{dQ}{du}\) and \(\frac{du}{dv}\): \(\frac{dQ}{dv} = e^u \times (-2Bv)\)

  • Substitute \(u = -Bv^2\): \(\frac{dQ}{dv} = -2Bve^{-Bv^2}\)

This can then be substituted into the previous equation.

  • This gives us: \(\frac{d}{dv}f(v) = P\frac{dQ}{dv} + Q\frac{dP}{dv}\)

  • Substitute in what we have found: \(\frac{d}{dv}f(v) = Av^2(-2Bv\times e^{-Bv^2}) + e^{-Bv^2} \times 2Av\)

  • This can be simplified: \(\frac{d}{dv}f(v) = -2 ABv^3e^{-Bv^2} + 2Ave^{-Bv^2}\)

  • Take a factor of \(2Ave^{-Bv^2}\) out: \(\frac{d}{dv}f(v) = 2Ave^{-Bv^2}(1-Bv^2)\)


With Python, the solution to this rather complex example is simple to obtain.

from sympy import simplify

A, B, v = symbols('A B v')

simplify(diff(A * v ** 2 * exp(-B * v ** 2), v))
\[\displaystyle 2 A v \left(- B v^{2} + 1\right) e^{- B v^{2}}\]
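
The intermediate steps of the solution can be reproduced in sympy as well, which makes it easier to spot a sign slip in the hand calculation. This is a sketch only: the names `u_of_v`, `Q_of_u`, `dP_dv`, `dQ_dv`, `df_dv` and `hand_result` are illustrative and mirror the symbols used in the working above.

```python
from sympy import symbols, diff, exp, simplify

A, B, v, u = symbols('A B v u')

P = A * v ** 2
u_of_v = -B * v ** 2   # inner function of the exponential
Q_of_u = exp(u)        # Q written in terms of u

# dP/dv follows directly from the power rule.
dP_dv = diff(P, v)

# Chain rule: dQ/dv = dQ/du * du/dv, then substitute u back in.
dQ_dv = (diff(Q_of_u, u) * diff(u_of_v, v)).subs(u, u_of_v)

# Product rule: d/dv (P * Q) = P * dQ/dv + Q * dP/dv.
Q = Q_of_u.subs(u, u_of_v)
df_dv = P * dQ_dv + Q * dP_dv

# Compare with the factorised result obtained by hand.
hand_result = 2 * A * v * exp(-B * v ** 2) * (1 - B * v ** 2)
difference = simplify(df_dv - hand_result)
```

A `difference` of zero indicates that the assembled product-rule expression matches the factorised answer.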