A better way to write convolution

Normally discrete convolution is written as follows:

$$(f * g)(n) = \sum_{k=0}^{n} f(k)\,g(n-k)$$

It is not immediately obvious from this that $f * g = g * f$. Consider this alternate representation:

$$(f * g)(n) = \sum_{j+k=n} f(j)\,g(k)$$

This formula is obviously symmetric in $f$ and $g$, and it clarifies an important invariant. For example, to multiply two polynomials (or formal power series) we convolve their coefficients. This looks like:

Let $P(x) = \sum_k a_k x^k$ and $Q(x) = \sum_k b_k x^k$, then

$$P(x)\,Q(x) = \sum_n \Big(\sum_{j+k=n} a_j b_k\Big) x^n$$

This emphasizes that the $n$th coefficient is the sum of the products of coefficients whose degrees sum to $n$. (You may recognize this as the convolution theorem.)
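
To make the $j + k = n$ indexing concrete, here’s a small sketch of polynomial multiplication on coefficient lists (my illustration, not from the original post; polyMul is a made-up name):

-- Multiply coefficient lists by summing a_j * b_k over all j + k == n.
polyMul :: Num a => [a] -> [a] -> [a]
polyMul as bs =
  [ sum [ a * b | (j, a) <- zip [0 ..] as
                , (k, b) <- zip [0 ..] bs
                , j + k == (n :: Int) ]
  | n <- [0 .. length as + length bs - 2] ]

-- polyMul [1, 1] [1, 1] == [1, 2, 1], i.e. (1 + x)^2 = 1 + 2x + x^2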

There is one subtle difference. In this alternate representation, it very much matters what the domains of $j$ and $k$ are. If $j$ and $k$ are naturals, we get what we started with. But if they are integers, then we have:

$$\sum_{j+k=n} f(j)\,g(k) = \sum_{k=-\infty}^{\infty} f(k)\,g(n-k)$$

In many combinatorial situations this makes no difference and may even be preferable. If, however, you are doing something like a short-time Fourier transform, then you probably don’t want to include an infinite number of entries (but you probably aren’t indexing by $\mathbb{Z}$). Personally, I view this as a feature; you should be clear about the domain of your variables.
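
If you do index by $\mathbb{Z}$, one way to keep the sum finite is to restrict to finitely supported functions. Here’s a sketch where such functions are represented as maps and absent keys are read as zero (the representation is my choice for illustration):

import qualified Data.Map as Map

-- Convolve finitely supported functions on the integers; keys not
-- present in the map are implicitly zero, so the sum is finite.
zconvolve :: Num v => Map.Map Integer v -> Map.Map Integer v -> Map.Map Integer v
zconvolve f g = Map.fromListWith (+)
  [ (j + k, a * b) | (j, a) <- Map.toList f, (k, b) <- Map.toList g ]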

Let’s prove that $*$ is associative, i.e. $((f * g) * h)(n) = (f * (g * h))(n)$. Expanding the left-hand side we get:

$$\sum_{m+k=n} \sum_{i+j=m} f(i)\,g(j)\,h(k) = \sum_{i+j+k=n} f(i)\,g(j)\,h(k) = \sum_{i+m=n} \sum_{j+k=m} f(i)\,g(j)\,h(k)$$

and we’re done. It’s clear the associativity of convolution comes from the associativity of addition and distributivity. In fact, the commutativity also came from the commutativity of addition. This suggests a simple but broad generalization: simply replace the $+$ in the indexing with an arbitrary binary operation $\oplus$, i.e.:

$$(f * g)(n) = \sum_{j \oplus k = n} f(j)\,g(k)$$

Usually $\oplus$ is taken to be the operation of a group, and this makes sense when you need to support the equivalent of $n - k$, but there’s really nothing keeping it from being a monoid, and the unit isn’t doing anything so why not a semigroup, and we only really need associativity if we want the convolution to be associative, and nothing requires $n$ to be the same type as $j$ or $k$, or for them to be the same type as each other for that matter… Of course, having a group structure reduces the often sticky problem of factoring $n$ into all the pairs of $j$ and $k$ such that $j \oplus k = n$ to the problem of “simply” enumerating the group.

As an update: Conal Elliott uses a similarly generalized notion of convolution in Generalized Convolution and Efficient Language Recognition. In particular, the generalization using an arbitrary binary function allows things like heterogeneous index types. For example, if we want to convolve size-indexed arrays, we have a problem, but we can finesse it by viewing Vec n x as Fin n -> x. With traditional convolution, we’d still have a problem, since (+) requires its arguments and its result to all have the same type, so not Fin n -> Fin m -> Fin (m + n - 1). This isn’t a problem for $\oplus$ though.

Here’s some code for the free monoid (i.e. lists) case:

import Data.Monoid
import Data.List

-- Marker class: instances promise that mappend is commutative.
class Monoid m => CommutativeMonoid m
instance Num a => CommutativeMonoid (Sum a)

-- Overly generalized convolve: combine mul (f j) (g k) over every
-- factorization (j, k) of n produced by factor.
ogconvolve :: (CommutativeMonoid m) => (a -> b -> m) -> (n -> [(j, k)]) -> (j -> a) -> (k -> b) -> n -> m
ogconvolve mul factor f g n = mconcat (map (uncurry (\j k -> mul (f j) (g k))) (factor n))

-- Specialize to numeric pointwise multiplication via the Sum monoid.
gconvolve :: Num m => (n -> [(n, n)]) -> (n -> m) -> (n -> m) -> n -> m
gconvolve factor f g n = getSum (ogconvolve (\a b -> Sum (a*b)) factor f g n)

-- The free monoid case: factor a list into all prefix-suffix splits.
lconvolve :: Num m => ([a] -> m) -> ([a] -> m) -> [a] -> m
lconvolve = gconvolve (\as -> zip (inits as) (tails as))
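
As a quick sanity check (my example, with splitExample a made-up name), lconvolve sums f of the prefix times g of the suffix over every way of splitting the list in two:

-- genericLength comes from Data.List, imported above.
splitExample :: Integer
splitExample = lconvolve genericLength genericLength [1, 2, 3]
-- == 0*3 + 1*2 + 2*1 + 3*0 == 8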

You may have noticed that we never actually use $\oplus$ anywhere. Its only significance is to define the valid factorizations. That is, there’s an implicit constraint in the above code that

all (\(j,k) -> n == j `oplus` k) (factor n)

and also that

factor n == nub (factor n)

i.e. no duplicates. (Conceptually, factor spits out a set, indeed the graph of the relation $j \oplus k = n$.)
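
For the free monoid case above, where $\oplus$ is list concatenation, we can spell out both constraints directly (a sketch of mine; factorOk is a made-up name):

-- Check both implicit constraints for oplus = (++) and
-- factor as = zip (inits as) (tails as).
factorOk :: Eq a => [a] -> Bool
factorOk n = all (\(j, k) -> n == j ++ k) splits && splits == nub splits
  where splits = zip (inits n) (tails n)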

The commutative monoid constraint and mul function were just for the heck of it, but why a commutative monoid and not an arbitrary monoid? Well, we don’t want the result to depend on the order in which factor spits out the factors. In other words, if factor actually did return a set, then we couldn’t depend on the order of the elements in that set if we want a well-defined function.
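
To see the order leak, we can lie with a deliberately bogus instance (my example): lists form a monoid, but not a commutative one, and then the enumeration order of factor shows up in the result.

instance CommutativeMonoid [a]  -- bogus: (++) is not commutative

-- Collecting the factor pairs themselves exposes the dependence:
-- ogconvolve (\j k -> [(j, k)]) splits id id "ab"
--   == [("","ab"),("a","b"),("ab","")]
-- where splits as = zip (inits as) (tails as); swapping in
-- (reverse . splits) reverses the answer, so the result depends
-- on the order in which factor enumerates the "set".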

Here’s where I tell you that lconvolve is an important function for something. I suspect it or a variant probably is, but I don’t know. Here’s another monoid case, commutative this time, that definitely is important and interesting: Dirichlet convolution.

Here’s the typical bad definition:

$$(f * g)(n) = \sum_{d \mid n} f(d)\,g(n/d)$$

where $d \mid n$ means $d$ evenly divides $n$.
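
In the generalized scheme this is just convolution with $\oplus$ as multiplication, so it drops right out of gconvolve once factor enumerates complementary divisor pairs. A sketch (divisorPairs and dconvolve are my names):

-- Factor n into all pairs of complementary divisors (d, n / d).
divisorPairs :: Integer -> [(Integer, Integer)]
divisorPairs n = [ (d, n `div` d) | d <- [1 .. n], n `mod` d == 0 ]

dconvolve :: Num m => (Integer -> m) -> (Integer -> m) -> Integer -> m
dconvolve = gconvolve divisorPairs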

The values $(f * g)(n)$ are the coefficients of the product of two Dirichlet series. For example,

$$\Big(\sum_{j=1}^{\infty} \frac{f(j)}{j^s}\Big) \Big(\sum_{k=1}^{\infty} \frac{g(k)}{k^s}\Big) = \sum_{n=1}^{\infty} \frac{(f * g)(n)}{n^s}$$

We again see a situation where the correspondence doesn’t just jump out at you (or at least it didn’t to me originally) until you realize the above sum can be written:

$$(f * g)(n) = \sum_{jk=n} f(j)\,g(k)$$
then it’s pretty easy to see that it is right: multiplying term by term, $\frac{f(j)}{j^s} \cdot \frac{g(k)}{k^s} = \frac{f(j)\,g(k)}{(jk)^s}$. We can start doing a lot of number theory with the above combined with the Euler product formula:

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \prod_p \sum_{k=0}^{\infty} \frac{f(p^k)}{p^{ks}}$$

where $f$ is multiplicative, which means $f(jk) = f(j)\,f(k)$ when $j$ and $k$ are coprime. (The real significance of being multiplicative is that it is equivalent to $f$ being completely determined by its values on prime powers.) It turns out it is surprisingly easy and fun to derive various arithmetic functions and identities between them. Here’s a brief teaser.

The constant function $\mathbf{1}(n) = 1$ is a multiplicative function and this gives:

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \sum_{k=0}^{\infty} \frac{1}{p^{ks}} = \prod_p \frac{1}{1 - p^{-s}}$$

where we’ve used the formula for the sum of a geometric series.

So $\zeta(s)^{-1} = \prod_p (1 - p^{-s})$. The question is now, what arithmetic function $\mu$ has this as its Dirichlet series? Well, all we need to do is say what it is on prime powers, and comparing the Euler product formula to the above we see, for prime $p$, $\mu(p) = -1$ and $\mu(p^k) = 0$ for $k > 1$. This is the Möbius function. Because $\zeta(s)\,\zeta(s)^{-1} = 1$, we have $\sum_{d \mid n} \mu(d) = 0$ for $n > 1$. Continuing in this vein leads to analytic number theory and the Riemann hypothesis, though you may want to pick up the Mellin transform along the way.
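
To make the teaser concrete, here’s a trial-division Möbius function that lets us check $\sum_{d \mid n} \mu(d) = 0$ for $n > 1$ numerically with the dconvolve sketch from above (my code, not from the post):

-- The Möbius function by trial division: 0 if n has a squared
-- prime factor, otherwise (-1)^(number of distinct prime factors).
mu :: Integer -> Integer
mu 1 = 1
mu n = go 2 n 1
  where
    go p m acc
      | m == 1                   = acc
      | p * p > m                = negate acc     -- leftover prime factor
      | m `mod` p /= 0           = go (p + 1) m acc
      | (m `div` p) `mod` p == 0 = 0              -- p^2 divides n
      | otherwise                = go (p + 1) (m `div` p) (negate acc)

-- map (dconvolve mu (const 1)) [1 .. 10] == [1,0,0,0,0,0,0,0,0,0]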