Tensor product spaces

Now that we know how to solve problems in one dimension, it is time to move on to more challenging tasks. Consider again the Poisson equation, but now in possibly more than one dimension $$ \begin{equation} -\nabla^2 u(\bs{x}) = f(\bs{x}) \quad \text{for }\bs{x} \in \Omega. \tag{42} \end{equation} $$ Lets first consider 2 dimensions, with Dirichlet boundary conditions in the first direction and with periodicity in the second. Let $ \Omega $ be the domain $ [-1, 1] \times [0, 2 \pi] $, and $ T(x,y) = V^C(x) \otimes V^p(y) $ be the tensor product function space. We can solve this problem for some suitable function $ f(\bs{x}) $ in shenfun by constructing a few more classes than were required in 1D

from shenfun.chebyshev.bases import ShenDirichletBasis
from shenfun.fourier.bases import FourierBasis
import numpy as np
from shenfun import Function, TensorProductSpace
from mpi4py import MPI

Now the TensorProductSpace class is used to construct $ T $, whereas Function is a subclass of numpy.ndarray used to hold solution arrays. The MPI communicator, on the other hand, is used for distributing the tensor product grids on a given number of processes

comm = MPI.COMM_WORLD
N = (32, 33)

K0 = ShenDirichletBasis(N[0])
K1 = FourierBasis(N[1], dtype=np.float)
T = TensorProductSpace(comm, (K0, K1))

# Alternatively, switch order for periodic in first direction instead
# T = TensorProductSpace(comm, (K1, K0), axes=(1, 0))

Under the hood, within the TensorProductSpace class, the mesh is distributed, both in real, physical space, and in spectral space. In the real space the mesh is distributed along the first index, whereas in spectral space the wavenumbermesh is distributed along the second dimension. This is the default behaviour of TensorProductSpace. However, the distribution may also be configured specifically by the user, e.g., as shown in the commented out text, where the Dirichlet basis is found along the second axis. In this case the order of the axes to transform over has been flipped, such that in spectral space the data is distributed along the first dimension and aligned in the second. This is required for solving the linear algebra system that arises for the Dirichlet basis. The arrays created using Function are distributed, and no further attention to MPI is required. However, note that arrays may have different type and shape in real space and in spectral space. For this reason Function has a keyword argument forward_output, that is used as w_hat = Function(T, forward_output=True) to create an array consistent with the output of T.forward (solution in spectral space), and w = Function(T, forward_output=False) to create an array consistent with the input (solution in real space). Furthermore, uh = np.zeros_like(w_hat) and w_hat = Function(T, buffer=uh) can be used to wrap a Function instance around a regular Numpy array uh. Note that uh and w_hat now will share the same data, and modifying one will naturally modify also the other.

The solution of a complete Poisson problem in 2D is shown below. Very similar code is required to solve the Poisson problem with the Legendre basis. The main difference is that for Legendre it is natural to integrate the weak form by parts and use matrices = inner(grad(v), grad(u))

from shenfun.chebyshev.la import Helmholtz as Solver

# Create a solution that satisfies boundary conditions
x, y = symbols("x,y")
ue = (cos(4*y) + sin(2*x))*(1-x**2)
fe = ue.diff(x, 2) + ue.diff(y, 2)

# Lambdify for faster evaluation
ul = lambdify((x, y), ue, 'numpy')
fl = lambdify((x, y), fe, 'numpy')

X = T.local_mesh(True)
u = TrialFunction(T)
v = TestFunction(T)

# Get f on quad points
fj = fl(X[0], X[1])

# Compute right hand side of Poisson equation
f_hat = inner(v, fj)

# Get left hand side of Poisson equation
matrices = inner(v, div(grad(u)))

# Create Helmholtz linear algebra solver
H = Solver(**matrices)

# Solve and transform to real space
u_hat = Function(T)           # Solution spectral space
u_hat = H(u_hat, f_hat)       # Solve
u = T.backward(u_hat)

The test functions and function spaces require a bit more attention. Test functions for space $ T(x, y)=V^C(x) \otimes V^p(y) $ are given as $$ \begin{equation} \phi_{\ts{k}}(x, y) = v_l(x) e^{imy}, \tag{43} \end{equation} $$ which introduces the sans serif Cartesian product wavenumber mesh $ \ts{k} = \bs{l}^D \times \bs{l} $ $$ \begin{equation} \ts{k} = \{ (l, m) | l \in \bs{l}^D \text{ and } m \in \bs{l}\}. \tag{44} \end{equation} $$ Similarly there is a Cartesian product grid $ \ts{x} = \bs{x} \times \bs{y} $, where $ \bs{y} = \{y_k\}_{k=0}^{M-1} = 2 \pi k /M $ $$ \begin{equation} \ts{x} = \{ (x_j, y_k) | j=0,1,\ldots, N-1 \text{ and } k=0,1,\ldots, M-1\}. \tag{45} \end{equation} $$ Note that for computing on the Cartesian grids using Numpy arrays with vectorization, the mesh and wavenumber components need to be represented as 2D arrays. As such we create $$ \begin{equation} \bs{\textsf{x}} = (\bs{x}, \bs{y}) = \Big(\{x_i\}_{i=0}^{N-1} \times I^M, I^N \times \{y_j\}_{j=0}^{M-1} \Big), \tag{46} \end{equation} $$ where $ I^N $ is an N-length vector of ones. Similarly $$ \begin{equation} \bs{\textsf{k}} = (\bs{l}, \bs{m}) = \Big(\{ l \}_{l=0}^{N-1} \times I^M, I^N \times \{ m \}_{m=0}^{M/2} \Big). \tag{47} \end{equation} $$ Such Cartesian grids can be very efficiently stored with Numpy arrays, using no more space than the two vectors used to create them. The key to this efficiency is broadcasting. We store $ \ts{k} $ as a list of two numpy arrays, $ \bs{l} $ and $ \bs{m} $, corresponding to the two 1D wavenumber meshes $ \{ l \}_{l=0}^{N-1} $ and $ \{ m \}_{m=0}^{M/2} $. However, $ \bs{l} $ and $ \bs{m} $ are now stored as 2D arrays of shape $ (N, 1) $ and $ (1, M/2+1) $, respectively. And broadcasting takes care of the additional dimension, such that the two arrays work just like if they were stored as $ (N, M/2+1) $ arrays. We can look up $ \bs{l}(l, m) $, just like a regular $ (N, M/2+1) $ array, but the storage required is still only one single vector. The same goes for $ \ts{x} $, which is stored as a list of two arrays $ \bs{x} $, $ \bs{y} $ of shape $ (N, 1) $ and $ (1, M) $ respectively. This extends straightforward to even higher dimensions.

Assembling a weak form like $ (v, \nabla^2 u)_w^N $ leads to two non-diagonal matrices, both the stiffness and mass matrix, since it expands like $$ \begin{equation} (v, \nabla^2 u)_w^N = \left(v, \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right)_w^N. \tag{48} \end{equation} $$ Inserting for test function $ v = \phi_{\ts{k}} (= \phi_{l, m} =v_l(x) e^{imy}) $ and trial function $ u = \sum_{(q,r)\in \ts{k}} \hat{u}_{q, r} \phi_{q,r} $, we obtain $$ \begin{align} (v, \nabla^2 u)_w^N &= \left(\phi_{l, m}, \frac{\partial^2}{\partial x^2} \sum_{(q, r) \in \ts{k}} \hat{u}_{q, r} \phi_{q, r} + \frac{\partial^2}{\partial y^2} \sum_{(q,r) \in \ts{k}} \hat{u}_{q, r} \phi_{q, r} \right)_w^N, \tag{49}\\ &= 2\pi \left(\sum_{(q, r) \in \ts{k}} A_{lq} \delta_{rm} \hat{u}_{q,r} - \sum_{(q, r) \in \ts{k}} {r}^2 B_{lq} \delta_{rm} \hat{u}_{q,r}\right), \tag{50}\\ &= 2\pi \left(\sum_{q\in \bs{l}^D} A_{lq} \hat{u}_{q,m} - {m}^2 \sum_{q\in \bs{l}^D} B_{lq} \hat{u}_{q,m}\right) \quad \forall (l, m) \in \bs{l}^D \times \bs{l}. \tag{51} \end{align} $$ As can be seen from Eq.(51), the linear system of equations is set up to act along the Dirichlet direction, whereas for the periodic direction the matrices are diagonal and no additional work is required. The system of equations correspond to a series of 1D Helmholtz problems, that need to be solved once for each $ m \in \bs{l} $. This is what goes on under the hood with the Helmholtz solver imported through from shenfun.chebyshev.la import Helmholtz as Solver.

The right hand side of the Poisson problem is computed as $$ \begin{align} (v, f)_w^N &= 2\pi \underbrace{\sum_{j}\underbrace{\frac{1}{N} \sum_{k} f(x_j, y_k) e^{imy_k} }_{\mathcal{F}_m} v_l(x_j) w_j}_{\mathcal{S}_l} \quad \forall (l, m) \in \bs{l}^D \times \bs{l}, \notag \tag{52}\\ &= 2\pi \mathcal{S}(f) = 2 \pi \mathcal{S}_l(\mathcal{F}_m(f)). \tag{53} \end{align} $$ The TensorProductSpace class can take any number of Fourier bases. A 3 dimensional tensor product space can be created as

N = (32, 33, 34)
K0 = ShenDirichletBasis(N[0])
K1 = C2CBasis(N[1])
K2 = R2CBasis(N[2])
W = TensorProductSpace(comm, (K0, K1, K2))

Here the default behaviour of TensorProductSpace is to distribute the first 2 indices in real space using two subcommunicators, with a decomposition often referred to as pencil decomposition. In spectral space the last two indices will be distributed. For example, using 4 CPUs, a subprocessor mesh of size $ 2 \times 2 $ will be created, and 2 subprocessors share the first index and the other two share the second index. If the program is run with 3 processors, then only the first index will be distributed and the subprocessormesh will be $ 3 \times 1 $. It is also possible to configure TensorProductSpace to run with 4 CPUs and a $ 4 \times 1 $ subprocessormesh, or 40,000 CPUs with a $ 200 \times 200 $ processormesh. The latter requires that the mesh is big enough, though, but otherwise it is just a matter of acquiring computing power. The biggest simulations tested thus far used 64,000 CPUs.

Solving a biharmonic problem is just as easy as the Poisson problem. Consider the fourth order biharmonic PDE in 3-dimensional space $$ \begin{align} \nabla^4 u(\bs{x}) &= f(\bs{x}), \quad \bs{x} \in \Omega \tag{54}\\ u(x=\pm1, y, z) &= \frac{\partial u}{\partial x} (x=\pm 1, y, z) = 0 \tag{55}\\ u(x, y+2\pi, z) &= u(x, y, z), \tag{56}\\ u(x, y, z+2\pi) &= u(x, y, z). \tag{57} \end{align} $$ that is periodic in $ y- $ and $ z- $ directions and with clamped boundary conditions at $ x=\pm 1 $. The problem may be solved using either one of these two bases: $$ \begin{align} V^C &= \text{span}\{T_l - \frac{2(l+2)}{l+3}T_{l+2} + \frac{l+1}{l+3}T_{l+4} , l \in \bs{l}^B\}, \tag{58} \\ V^L &= \text{span}\{L_l - \frac{2(2l+5)}{2l+7}L_{l+2} + \frac{2l+3}{2l+7}, l \in \bs{l}^B\}, \tag{59} \end{align} $$ where $ \bs{l}^B = 0, 1, \ldots, N-5 $. A tensor product space may be constructed as $ W(x,y,z) = V^C(x) \otimes V^p(y) \otimes V^p(z) $, and the variational problem $$ \begin{equation} (v, \nabla^4 u)^N_w = (v, f)^N_w, \tag{60} \end{equation} $$ where $ u $ and $ v $ are trial and test functions in $ W $, may be implemented in shenfun as shown below

from shenfun.chebyshev.bases import ShenBiharmonicBasis, Basis
from shenfun.fourier.bases import R2CBasis, C2CBasis
from shenfun.chebyshev.la import Biharmonic as Solver
from shenfun import inner , div , grad , TestFunction, TrialFunction, project, Dx
from shenfun import Function, TensorProductSpace
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD

N = (32, 33, 34)
K0 = ShenBiharmonicBasis(N[0])
K1 = C2CBasis(N[1])
K2 = R2CBasis(N[2])
W = TensorProductSpace(comm, (K0, K1, K2))
u = TrialFunction(W)
v = TestFunction(W)
matrices = inner(v, div(grad(div(grad(u)))))
fj = Function(W, False)
fj[:] = np.random.random(fj.shape)

f_hat = inner(v, fj) # Some right hand side
B = Solver(**matrices)

# Solve and transform to real space
u_hat = Function(W) # Solution spectral space
u_hat = B(u_hat , f_hat) # Solve
u = Function(W, False)
u = W.backward(u_hat, u)

Note that C2CBasis is used for the Fourier basis along axis 1, whereas R2CBasis is used for the last axis. This is because the right hand side of the Poisson equation, and thus the input data, is real and not complex. Yet, taking the Fourier transform along the last axis, results in a complex array. Hence we need real-to-complex transforms for axis 2 and complex-to-complex (C2CBasis) transforms for axis 1.