# Synoptic Exercise: Geometry Optimisation of a Lennard-Jones Potential

The Lennard-Jones potential describes the interaction between a pair of non-bonded atoms:

\[U_\mathrm{LJ}(r) = \frac{A}{r^{12}} - \frac{B}{r^6}\]

where \(r\) is the interatomic separation. The repulsive term (\(r^{-12}\)) represents electron cloud overlap at short distances, whilst the attractive term (\(r^{-6}\)) represents van der Waals interactions.

In this exercise, you will use both gradient descent and Newton-Raphson methods to find the equilibrium separation for a Lennard-Jones pair, compare their performance, and explore how the choice of starting position affects convergence.

We will use the parameters \(A = 1 \times 10^5~\mathrm{eV\,Å^{12}}\) and \(B = 40~\mathrm{eV\,Å^6}\).

The derivatives of the Lennard-Jones potential are:

First derivative:

\[U'_\mathrm{LJ}(r) = -12 \frac{A}{r^{13}} + 6\frac{B}{r^7}\]

Second derivative:

\[U''_\mathrm{LJ}(r) = 156\frac{A}{r^{14}} - 42\frac{B}{r^8}\]
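
Setting \(U'_\mathrm{LJ}(r) = 0\) and solving for \(r\) gives a closed-form result that is worth keeping to hand as a check on the numerical answers you obtain later:

\[r_\mathrm{min} = \left(\frac{2A}{B}\right)^{1/6} \approx 4.14~\text{Å}, \qquad U_\mathrm{LJ}(r_\mathrm{min}) = -\frac{B^2}{4A} = -0.004~\text{eV}\]

using the parameters above.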

## Part A: Visualising the Potential Energy Surface

Before attempting optimisation, it is important to understand the shape of the potential energy surface.

  1. Write a function `lennard_jones(r, A, B)` that calculates \(U_\mathrm{LJ}(r)\) for a given separation \(r\) and parameters \(A\) and \(B\).

  2. Create an array of \(r\) values from 3.6 Å to 8.0 Å using `np.linspace()` with at least 200 points.

  3. Plot the potential energy as a function of \(r\) for the given parameters. Label your axes appropriately and include a title (a sketch implementing steps 1–3 follows this list).

  4. By examining your plot:

    • Estimate the location of the minimum (the equilibrium bond length)

    • Identify the approximate energy at the minimum

    • Note where the potential energy crosses zero (where the repulsive and attractive contributions to the energy cancel exactly; note that the *forces* balance at the minimum, not here)
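
A minimal sketch of steps 1–3, assuming NumPy and Matplotlib are available; the function signature follows step 1, and the plot styling is illustrative:

```python
import numpy as np
import matplotlib.pyplot as plt

# Parameters given in the exercise
A = 1e5   # eV Å^12
B = 40.0  # eV Å^6

def lennard_jones(r, A, B):
    """Lennard-Jones potential U(r) = A/r^12 - B/r^6, in eV."""
    return A / r**12 - B / r**6

# Sample the requested range: 3.6 Å to 8.0 Å with at least 200 points
r_values = np.linspace(3.6, 8.0, 200)

fig, ax = plt.subplots()
ax.plot(r_values, lennard_jones(r_values, A, B))
ax.axhline(0.0, color="grey", linewidth=0.5)  # guide line marking U = 0
ax.set_xlabel("r / Å")
ax.set_ylabel(r"$U_\mathrm{LJ}(r)$ / eV")
ax.set_title("Lennard-Jones potential")
plt.show()
```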

## Part B: Gradient Descent Optimisation

Now we will use gradient descent to find the minimum. The learning rate \(\alpha\) is fixed, but the effective step size adapts automatically: each step is proportional to the local gradient, so the steps shrink as the gradient flattens near the minimum.

  1. Write a function `lennard_jones_gradient(r, A, B)` that calculates the first derivative \(U'_\mathrm{LJ}(r)\).

  2. Implement a gradient descent loop that (see the sketch after this list):

    • Starts from \(r = 5.0\) Å

    • Uses the update rule: \(r_{i+1} = r_i - \alpha U'(r_i)\)

    • Uses \(\alpha = 100\) as the learning rate

    • Runs for a maximum of 50 iterations

    • Stops early (using `break`) when \(|U'(r)| < 0.001\)

    • Stores the position and gradient at each iteration in lists (for later analysis)

  3. After your loop completes, print:

    • The final predicted equilibrium separation

    • The number of iterations required

    • The final gradient value

  4. Create a plot showing how the position \(r\) changes with iteration number. This visualises the convergence path.
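
One possible implementation of the loop described in the steps above; a sketch, which assumes `A`, `B`, and the imports from the Part A sketch are already defined:

```python
def lennard_jones_gradient(r, A, B):
    """First derivative U'(r) = -12A/r^13 + 6B/r^7, in eV/Å."""
    return -12 * A / r**13 + 6 * B / r**7

alpha = 100.0        # learning rate
tolerance = 0.001    # stop when |U'(r)| falls below this (eV/Å)
max_iterations = 50

r_i = 5.0            # starting separation (Å)
positions = [r_i]
gradients = [lennard_jones_gradient(r_i, A, B)]

for iteration in range(1, max_iterations + 1):
    gradient = lennard_jones_gradient(r_i, A, B)
    if abs(gradient) < tolerance:
        break                      # converged: gradient is effectively zero
    r_i = r_i - alpha * gradient   # gradient descent update
    positions.append(r_i)
    gradients.append(lennard_jones_gradient(r_i, A, B))

print(f"Equilibrium separation: {r_i:.4f} Å")
print(f"Iterations: {iteration}")
print(f"Final gradient: {gradients[-1]:.2e} eV/Å")

# Convergence path: position against iteration number
fig, ax = plt.subplots()
ax.plot(range(len(positions)), positions, marker="o")
ax.set_xlabel("Iteration")
ax.set_ylabel("r / Å")
plt.show()
```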

Questions to consider:

  • How many iterations were required to converge?

  • Does the convergence appear smooth or does the algorithm oscillate?

  • Look at your convergence plot: does the step size decrease as you approach the minimum? (Remember: the step size is \(\alpha \times |U'(r)|\), and \(U'(r)\) should decrease as you approach the minimum.)
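
One way to check the last point directly, assuming the `positions` and `gradients` lists stored by the sketch above:

```python
# The step taken at each iteration is alpha * |U'(r)|; it should shrink
# as the gradient flattens on the approach to the minimum
for i, g in enumerate(gradients):
    print(f"iteration {i}: |step| = {alpha * abs(g):.4f} Å")
```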

## Part C: Exploring Different Starting Positions with Gradient Descent

The choice of starting position can significantly affect optimisation performance.

  1. Test your gradient descent code with three different starting positions (a sketch follows this list).

    For this comparison, use \(\alpha = 10\), a tolerance of 0.0005 (i.e., stop when \(|U'(r)| < 0.0005\)), and `max_iterations = 500`.

    • \(r = 3.2\) Å (below the minimum, on the steep repulsive wall)

    • \(r = 4.4\) Å (slightly above the minimum)

    • \(r = 6.0\) Å (far from the minimum, approaching from above)

    For each starting position, record:

    • The final optimised position

    • The number of iterations required

    • Whether the algorithm converged successfully (i.e., did it reach \(|U'(r)| < 0.0005\) within 500 iterations?)

  2. Summarise your results, showing the three starting positions and their outcomes.
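
Wrapping the Part B loop in a function makes the comparison straightforward. A sketch using the Part C settings; the helper name `gradient_descent` is illustrative, and the functions and parameters from earlier sketches are assumed:

```python
def gradient_descent(r_start, alpha, tolerance, max_iterations):
    """Minimise the LJ potential from r_start.

    Returns (final position, iterations used, converged flag).
    """
    r_i = r_start
    for i in range(max_iterations):
        gradient = lennard_jones_gradient(r_i, A, B)
        if abs(gradient) < tolerance:
            return r_i, i, True
        r_i -= alpha * gradient
    return r_i, max_iterations, False

# Part C settings: alpha = 10, tolerance = 0.0005, max_iterations = 500
for r_start in (3.2, 4.4, 6.0):
    r_final, n_iter, converged = gradient_descent(r_start, 10.0, 0.0005, 500)
    print(f"start {r_start:.1f} Å -> final {r_final:.4f} Å, "
          f"{n_iter} iterations, converged: {converged}")
```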

Questions to consider:

  • Do all starting positions converge to the same minimum?

  • Which starting position converges most quickly? Can you explain why based on your plot from Part A?

  • Did any starting positions fail to converge within 500 iterations?

  • How would the Part B learning rate \(\alpha = 100\) have performed at these starting positions? Based on the gradient magnitudes at different \(r\) values, does a single \(\alpha\) seem appropriate everywhere, or only in certain regions?
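
To gather evidence for the last question, it can help to tabulate the gradient, and hence the implied step size, at each starting position; a sketch, assuming the earlier definitions:

```python
# A step of alpha * |U'(r)| that is large compared with the distance to the
# minimum (~4.1 Å away at most) suggests alpha is too big in that region
for r_start in (3.2, 4.4, 6.0):
    g = lennard_jones_gradient(r_start, A, B)
    print(f"U'({r_start:.1f} Å) = {g:+.4f} eV/Å | "
          f"step with alpha=10: {10 * abs(g):.3f} Å | "
          f"step with alpha=100: {100 * abs(g):.3f} Å")
```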

## Part D: Newton-Raphson Optimisation

The Newton-Raphson method uses both first- and second-derivative information to converge more rapidly.

  1. Write a function `lennard_jones_second_derivative(r, A, B)` that calculates \(U''_\mathrm{LJ}(r)\).

  2. Implement a Newton-Raphson loop that (see the sketch after this list):

    • Starts from \(r = 5.0\) Å

    • Uses the update rule: \(r_{i+1} = r_i - \frac{U'(r_i)}{U''(r_i)}\)

    • Runs for a maximum of 50 iterations

    • Stops early when \(|U'(r)| < 0.001\)

    • Stores the position and gradient at each iteration

  3. After your loop completes, print the final equilibrium separation and number of iterations required.

  4. Create a plot comparing the convergence paths of gradient descent (from Part B) and Newton-Raphson, both starting from \(r = 5.0\) Å. Plot both curves on the same axes showing position vs. iteration number.
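
A sketch of the Newton-Raphson loop and the comparison plot, assuming the functions and the `positions` list from earlier sketches. One caveat worth keeping in mind when interpreting your results: the update divides by \(U''(r)\), so wherever the curvature is negative the step points away from the minimum.

```python
def lennard_jones_second_derivative(r, A, B):
    """Second derivative U''(r) = 156A/r^14 - 42B/r^8, in eV/Å^2."""
    return 156 * A / r**14 - 42 * B / r**8

r_i = 5.0
nr_positions = [r_i]

for nr_iteration in range(1, 51):
    gradient = lennard_jones_gradient(r_i, A, B)
    if abs(gradient) < 0.001:
        break
    # Newton-Raphson update; note the step changes sign with U''(r)
    r_i = r_i - gradient / lennard_jones_second_derivative(r_i, A, B)
    nr_positions.append(r_i)

print(f"Newton-Raphson: {r_i:.4f} Å after {nr_iteration} iterations")

# Compare convergence paths (positions is the gradient-descent list from Part B)
fig, ax = plt.subplots()
ax.plot(range(len(positions)), positions, marker="o", label="Gradient descent")
ax.plot(range(len(nr_positions)), nr_positions, marker="s", label="Newton-Raphson")
ax.set_xlabel("Iteration")
ax.set_ylabel("r / Å")
ax.legend()
plt.show()
```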

Questions to consider:

  • How does the convergence speed of Newton-Raphson compare to gradient descent?

  • Why doesn’t Newton-Raphson converge in a single step for the Lennard-Jones potential, unlike the harmonic potential?

  • Which method required fewer function evaluations? (Remember: gradient descent needs one derivative per step, whilst Newton-Raphson needs two.)

## Part E: Comparing Methods Across Different Starting Positions

  1. Test Newton-Raphson with the same three starting positions you used for gradient descent in Part C (a sketch follows this list):

    • \(r = 3.2\) Å

    • \(r = 4.4\) Å

    • \(r = 6.0\) Å

    For each starting position, record the final position and the number of iterations required.
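
Reusing the functions from Parts B and D, a comparison loop might look like the following sketch; the helper name `newton_raphson` is illustrative. Note that the stopping test only checks \(|U'(r)|\), so it is worth inspecting the final position as well as the iteration count: far from the minimum the gradient is also small, and the test can trigger in the wrong place.

```python
def newton_raphson(r_start, tolerance=0.001, max_iterations=50):
    """Newton-Raphson minimisation; returns (final position, iterations used)."""
    r_i = r_start
    for i in range(max_iterations):
        gradient = lennard_jones_gradient(r_i, A, B)
        if abs(gradient) < tolerance:
            return r_i, i
        r_i -= gradient / lennard_jones_second_derivative(r_i, A, B)
    return r_i, max_iterations

for r_start in (3.2, 4.4, 6.0):
    r_final, n_iter = newton_raphson(r_start)
    print(f"start {r_start:.1f} Å -> final {r_final:.4f} Å, {n_iter} iterations")
```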

Questions to consider:

  • Which method is more robust to different starting positions?

  • For which starting position(s) does Newton-Raphson show the greatest advantage?

  • Based on your results, which method would you recommend for optimising a Lennard-Jones potential? What factors influenced your decision?