.. _dual-doc: .. ipython:: python :suppress: import numpy as np ************ Dual Numbers ************ The ``rateslib.dual`` module creates dual number datatypes that provide this library with the ability to perform automatic differentiation (AD). The implementation style here is known as *forward mode*, as opposed to *reverse mode* (otherwise called *automatic adjoint differentiation* or *back propagation*). Summary ******* Classes ------- .. autosummary:: rateslib.dual.Dual rateslib.dual.Dual2 Methods ------- .. autosummary:: rateslib.dual.gradient rateslib.dual.dual_exp rateslib.dual.dual_log rateslib.dual.dual_solve Example ******* First Derivatives ----------------- .. math:: f(x, y, z) = x^6 + e^{\frac{x}{y}} + \ln {z} \\ f(2, 1, 2) = 72.0822... .. ipython:: python from rateslib.dual import * def func(x, y, z): return x**6 + dual_exp(x/y) + dual_log(z) func(2, 1, 2) For extracting only first derivatives it is more efficient to use :class:`~rateslib.dual.Dual`: .. math:: \frac{\partial f}{\partial x} &= \left . 6 x^5 + \frac{1}{y} e^{\frac{x}{y}} \right |_{(2,1,2)} = 199.3890... \\ \frac{\partial f}{\partial y} &= \left . -\frac{x}{y^2} e^{\frac{x}{y}} \right |_{(2,1,2)} = -14.7781... \\ \frac{\partial f}{\partial z} &= \left . \frac{1}{z} \right |_{(2,1,2)} = 0.50 \\ .. ipython:: python x, y, z = Dual(2, ["x"], []), Dual(1, ["y"], []), Dual(2, ["z"], []) func(x, y, z) gradient(func(x, y, z), ["x", "y", "z"]) Second Derivatives ------------------ For extracting second derivatives we must use :class:`~rateslib.dual.Dual2`: .. ipython:: python x, y, z = Dual2(2, ["x"], [], []), Dual2(1, ["y"], [], []), Dual2(2, ["z"], [], []) func(x, y, z) gradient(func(x, y, z), ["x", "y", "z"]) gradient(func(x, y, z), ["x", "y"], order=2) The ``keep_manifold`` argument is also exclusively available for :class:`~rateslib.dual.Dual2`. When extracting a first order gradient from a :class:`~rateslib.dual.Dual2` this is will use information about second order and transfer it to first order thus representing a linear manifold of the gradient. This is useful for allowing composited automatic calculation of second order gradients. For example consider the following functions, :math:`g(x)=x^2` and :math:`h(y)=y^2`, evaluated at the points :math:`x=2` and :math:`y=4`. This creates the quadratic manifolds centered at those points expressed in the following :class:`~rateslib.dual.Dual2` numbers: .. ipython:: python g = Dual2(4, ["x"], [4], [1]) # g(x=2) h = Dual2(16, ["y"], [8], [1]) # h(y=4) If we wish to multiply these two functions and evaluate the second order derivatives at (2, 4) we can simply do, .. ipython:: python gradient(g*h, order=2) And observe that, say, :math:`\frac{\partial (gh)}{\partial x \partial y} = 4xy|_{(2, 4)} = 32`, as shown in the above array. But, we can also use the product rule of differentiation to assert that, .. math:: d_{x\zeta}^2(gh) = d_x \left ( d_\zeta(g)h + gd_\zeta(h) \right ) \\\\ d_{y\zeta}^2(gh) = d_y \left ( d_\zeta(g)h + gd_\zeta(h) \right ) \\\\ which we express in our dual language as, .. ipython:: python gradient(g, ["x", "y"], keep_manifold=True) * h + g * gradient(h, ["x", "y"], keep_manifold=True) If the manifold is not maintained the product rule fails because information that is required to ultimately determine that desired second derivative is discarded. .. ipython:: python gradient(g, ["x", "y"]) * h + g * gradient(h, ["x", "y"]) More specifically, .. ipython:: python gradient(g, ["x", "y"], keep_manifold=True) while, .. ipython:: python gradient(g, ["x", "y"]) Implementation *************** Forward mode AD is implemented using operating overloading and custom compatible functions. The operations implemented are; - addition (+), - subtraction and negation (-), - multiplication (*), - division and inversion (/) (\*\*-1), - n'th power where n is an integer or a float (\*\*n), - exponential and logarithms (which require the specific methods below), - equality of dual numbers with integers and floats and with each other. .. warning:: :class:`~rateslib.dual.Dual` and :class:`~rateslib.dual.Dual2` are not designed to operate with each other. The purpose for this is to avoid miscalculation of second derivatives. :class:`~rateslib.dual.Dual` should always be replaced by :class:`~rateslib.dual.Dual2` in this instance. ``TypeErrors`` will be raised otherwise. Compatability with NumPy ************************ To enable this library to perform its calculations in a vectorised way we need to leverage NumPy's array calculations. NumPy arrays containing dual numbers are forced to have an ``object`` dtype configuration. This is imposed by NumPy and means that certain functions may not be compatible, for example ``np.einsum`` (although, support for ``object`` dtypes was added to ``np.einsum`` as of version 1.25.0). However, many functions are compatible. Broadcasting ------------ Operations of :class:`~rateslib.dual.Dual` and :class:`~rateslib.dual.Dual2` with ``int`` and ``float`` dtypes permit the NumPy versions; np.int8, np.int16, np.int32, np.int64, np.float16, np.float32, np.float64, and np.float128. Broadcasting of arrays has been implemented so that the following operations work as expected. .. ipython:: python np_arr = np.array([1, 2]) Dual(3, ["x"], []) * np_arr np_arr / Dual(4, ["y"], []) Dual(4, ["x"], []) ** np_arr Elementwise Operations ---------------------- Simple operations on tensors also work as expected. .. ipython:: python x = np.array([Dual(1, ["x"], []), Dual(2, ["y"], [])]) y = np.array([Dual(3, ["x"], []), Dual(4, ["y"], [])]) x + y x * y x / y Linear Algebra -------------- Common linear algebraic operations are also available, such as: - ``np.matmul`` - ``np.inner`` - ``np.dot`` - ``np.tensordot`` .. ipython:: python np.dot(x, y) np.inner(x, y) np.matmul(x[:, np.newaxis], y[np.newaxis, :]) np.tensordot(x[np.newaxis, :, np.newaxis], y[np.newaxis, :], (1, 1)) Solving the linear system, :math:`Ax=b`, is not not directly possible from NumPy, thus a custom solver, :meth:`~rateslib.dual.dual_solve`, has been implemented using the Doolittle algorithm with partial pivoting. .. ipython:: python A = np.array([ [1, 0], [Dual(2, ["z"], []), 1] ], dtype="object") b = np.array([Dual(2, ["y"], []), Dual(5, ["x", "y"], [])])[:, np.newaxis] x = dual_solve(A, b) x np.matmul(A, x)