# Combinatorial species

In combinatorial mathematics, the theory of combinatorial species is an abstract, systematic method for analysing discrete structures in terms of generating functions. Examples of discrete structures are (finite) graphs, permutations, trees, and so on; each of these has an associated generating function which counts how many structures there are of a certain size. One goal of species theory is to be able to analyse complicated structures by describing them in terms of transformations and combinations of simpler structures. These operations correspond to equivalent manipulations of generating functions, so producing such functions for complicated structures is much easier than with other methods. The theory was introduced, carefully elaborated and applied by a group of Canadian researchers around André Joyal.

The power of the theory comes from its level of abstraction. The "description format" of a structure (such as adjacency list versus adjacency matrix for graphs) is irrelevant, because species are purely algebraic. Category theory provides a useful language for the concepts that arise here, but it is not necessary to understand categories before being able to work with species.

The category of species is equivalent to the category of symmetric sequences in finite sets.

## Definition of species

Any structure — an instance of a particular species — is associated with some set, and there are often many possible structures for the same set. For example, it is possible to construct several different graphs whose node labels are drawn from the same given set. At the same time, any set could be used to build the structures. The difference between one species and another is that they build a different set of structures out of the same base set.

This leads to the formal definition of a combinatorial species. Let $\mathcal{B}$ be the category of finite sets, with the morphisms of the category being the bijections between these sets. A species is a functor

:$F\colon \mathcal{B} \to \mathcal{B}.$

For each finite set A in $\mathcal{B}$, the finite set F[A] is called the set of F-structures on A, or the set of structures of species F on A. Further, by the definition of a functor, if φ is a bijection between sets A and B, then F[φ] is a bijection between the sets of F-structures F[A] and F[B], called transport of F-structures along φ.

For example, the "species of permutations" maps each finite set A to the set of all permutations of A, and each bijection from A to another set B naturally induces a bijection from the set of all permutations of A to the set of all permutations of B. Similarly, the "species of partitions" can be defined by assigning to each finite set the set of all its partitions, and the "power set species" assigns to each finite set its power set. The adjacent diagram shows a structure on a set of five elements: arcs connect the structure (red) to the elements (blue) from which it is built.
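Transport of structures can be made concrete for the species of permutations. The following is a minimal sketch, not a reference implementation: the helper names `perm_structures` and `transport_perm` are hypothetical, and permutations are stored as dictionaries purely for illustration. Transporting a permutation σ of A along a bijection φ: A → B gives the permutation φ ∘ σ ∘ φ⁻¹ of B, and transporting along two bijections in turn agrees with transporting along their composite (functoriality).

```python
from itertools import permutations


def perm_structures(A):
    """All permutations of the finite set A, each stored as a dict a -> sigma(a)."""
    elems = sorted(A)
    return [dict(zip(elems, image)) for image in permutations(elems)]


def transport_perm(phi, sigma):
    """Transport the permutation sigma of A along the bijection phi: A -> B.

    The result is phi o sigma o phi^{-1}, which is a permutation of B.
    """
    return {phi[a]: phi[sigma[a]] for a in sigma}


A = {1, 2, 3}
phi = {1: "a", 2: "b", 3: "c"}        # a bijection A -> B
psi = {"a": "x", "b": "y", "c": "z"}  # a bijection B -> C

sigma = {1: 2, 2: 3, 3: 1}            # the 3-cycle 1 -> 2 -> 3 -> 1 on A
transported = transport_perm(phi, sigma)
print(transported)                    # {'a': 'b', 'b': 'c', 'c': 'a'}

# Functoriality: transporting along phi and then psi equals transporting along psi o phi.
psi_phi = {a: psi[phi[a]] for a in phi}
print(transport_perm(psi, transported) == transport_perm(psi_phi, sigma))  # True
```

Relabelling the ground set relabels the structure without changing its shape, which is why the number of structures depends only on the size of the set.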

Because a bijection exists between two finite sets if and only if the two sets have the same cardinality (the number of elements), for each finite set A, the cardinality of $F[A]$, which is finite, depends only on the cardinality of A. (This follows from the formal definition of a functor.) In particular, the exponential generating series F(x) of a species F can be defined:

:$F(x) = \sum_{n \ge 0} \operatorname{Card} F[n] \frac{x^n}{n!}$

where $\operatorname{Card} F[n]$ is the cardinality of $F[A]$ for any set A having n elements; e.g., $A = \{1, 2, \dots, n\}$.

Some examples: writing $f_n = \operatorname{Card} F[n]$,
• The species of sets (traditionally called E, from the French "ensemble", meaning "set") is the functor which maps A to {A}. Then $f_n = 1$, so $E(x) = e^x$.
• The species S of permutations, described above, has $f_n = n!$, so $S(x) = 1/(1 - x)$.
• The species $T_2$ of pairs (2-tuples) is the functor taking a set A to $A^2$. Then $f_n = n^2$ and $T_2(x) = x(x+1)e^x$.
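The three counting sequences above can be checked by brute-force enumeration on small sets. This is a sketch under the assumption that each species is modelled as a plain Python function from a finite set to a list of structures; the function names are illustrative only. It also compares the pair counts with the coefficients of $x(x+1)e^x = x e^x + x^2 e^x$.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def sets_on(A):
    """E[A] = {A}: exactly one structure on every set."""
    return [frozenset(A)]


def permutations_on(A):
    """S[A]: all permutations of A, written as tuples of images."""
    return list(permutations(sorted(A)))


def pairs_on(A):
    """T_2[A] = A x A: all ordered pairs of elements of A."""
    return list(product(sorted(A), repeat=2))


def counts(species, max_n):
    """f_0, ..., f_max_n, obtained by enumerating structures on {1, ..., n}."""
    return [len(species(set(range(1, n + 1)))) for n in range(max_n + 1)]


def egf_coefficients(count_list):
    """Coefficients f_n / n! of the exponential generating series."""
    return [Fraction(c, factorial(n)) for n, c in enumerate(count_list)]


def pairs_closed_form_coefficient(n):
    """Coefficient of x^n in x(x+1)e^x = x e^x + x^2 e^x."""
    coeff = Fraction(0)
    if n >= 1:
        coeff += Fraction(1, factorial(n - 1))
    if n >= 2:
        coeff += Fraction(1, factorial(n - 2))
    return coeff


print(counts(sets_on, 5))          # [1, 1, 1, 1, 1, 1]
print(counts(permutations_on, 5))  # [1, 1, 2, 6, 24, 120]
print(counts(pairs_on, 5))         # [0, 1, 4, 9, 16, 25]
```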

## Calculus of species

Arithmetic on generating functions corresponds to certain "natural" operations on species. The basic operations are addition, multiplication, composition, and differentiation; it is also necessary to define equality on species. Category theory already has a way of describing when two functors are equivalent: a natural isomorphism. In this context, it just means that for each A there is a bijection between F-structures on A and G-structures on A, which is "well-behaved" in its interaction with transport. Species with the same generating function might not be isomorphic, but isomorphic species do always have the same generating function.

Addition of species is defined by the disjoint union of sets, and corresponds to a choice between structures. For species F and G, define (F + G)[A] to be the disjoint union (also written "+") of F[A] and G[A]. It follows that (F + G)(x) = F(x) + G(x). As a demonstration, take $E_+$ to be the species of non-empty sets, whose generating function is $E_+(x) = e^x - 1$, and 1 the species of the empty set, whose generating function is 1(x) = 1. It follows that $E = 1 + E_+$: in words, "a set is either empty or non-empty". Equations like this can be read as referring to a single structure, as well as to the entire collection of structures.

The original definition of the species inspired three directions of investigation.

- On the categorical side, one needs a larger framework that contains both the product and the coproduct. The price is the loss of the cycle index.

- Another approach brings in Burnside rings or rigs. The Burnside sum of representations is a formal notation used in developing the theory of tables of marks.

- Finally, the usual definition does not take into account functoriality and the fact that a species, even when regarded as a rule, is unique: for a rule F there is no second rule F with which to form a disjoint sum F + F. In this approach the definition of the sum is really a definition by example. The advantage is that the cycle index enters naturally as a powerful tool.

### Multiplication

Multiplying species is slightly more complicated. It is possible to just take the Cartesian product of sets as the definition, but the combinatorial interpretation of this is not quite right. (See below for the use of this kind of product.) Rather than putting together two unrelated structures on the same set, the multiplication operator uses the idea of splitting the set into two components, constructing an F-structure on one and a G-structure on the other.

:$(F \cdot G)[A] = \sum_{A=B+C} F[B] \times G[C].$

This is a disjoint union over all ways of splitting A into an ordered pair of disjoint subsets B and C, either of which may be empty. It is straightforward to show that multiplication is associative and commutative (up to isomorphism), and distributive over addition. As for the generating series, (F · G)(x) = F(x)G(x).

The diagram below shows one possible (F · G)-structure on a set with five elements. The F-structure (red) picks up three elements of the base set, and the G-structure (light blue) takes the rest. Other structures will have F and G splitting the set in a different way. The set (F · G)[A], where A is the base set, is the disjoint union of all such structures.

The addition and multiplication of species are the most comprehensive expression of the sum and product rules of counting.
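The product can be enumerated directly from its definition. The sketch below is illustrative only (all names are hypothetical, and a species is again modelled as a function from a collection of labels to a list of structures). Each product structure is recorded as a triple (B, f, g), where B is the part handed to F, so structures coming from different splits stay distinct. Multiplying exponential generating series corresponds to the binomial convolution of the counting sequences, which `product_count` computes for comparison.

```python
from itertools import combinations, permutations
from math import comb, factorial


def splits(A):
    """All ordered pairs (B, C) of disjoint tuples with B and C together making up A."""
    elems = sorted(A)
    for k in range(len(elems) + 1):
        for B in combinations(elems, k):
            C = tuple(x for x in elems if x not in B)
            yield B, C


def product_structures(F, G, A):
    """Enumerate (F . G)[A] as triples (B, f, g): f is an F-structure on B, g a G-structure on the rest."""
    return [(B, f, g) for B, C in splits(A) for f in F(B) for g in G(C)]


def product_count(f, g, n):
    """Coefficient-level product rule: sum over k of C(n, k) * f_k * g_{n-k}."""
    return sum(comb(n, k) * f(k) * g(n - k) for k in range(n + 1))


def sets_on(A):
    return [frozenset(A)]


def linear_orders_on(A):
    return list(permutations(A))


A = {1, 2, 3, 4}
# E . E: choose which elements go to the first set; there are 2^4 ways.
print(len(product_structures(sets_on, sets_on, A)))                    # 16
# L . L: an ordered pair of lists that together use every element once.
print(len(product_structures(linear_orders_on, linear_orders_on, A)))  # 120
print(product_count(factorial, factorial, 4))                          # 120
```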

### Composition

Composition, also called substitution, is more complicated again. The basic idea is to replace components of F with G-structures, forming (F ∘ G). As with multiplication, this is done by splitting the input set A; the disjoint subsets are given to G to make G-structures, and the set of subsets is given to F, to make the F-structure linking the G-structures. For composition to work, G must map the empty set to the empty set, that is, there must be no G-structure on the empty set. The formal definition is:

:$(F \circ G)[A] = \sum_{\pi \in P[A]} \left(F[\pi] \times \prod_{B \in \pi} G[B]\right).$

Here, P is the species of partitions, so P[A] is the set of all partitions of A. This definition says that an element of (F ∘ G)[A] is made up of an F-structure on some partition of A, and a G-structure on each component of the partition. The generating series is $(F \circ G)(x) = F(G(x))$.

One such structure is shown below. Three G-structures (light blue) divide up the five-element base set between them; then, an F-structure (red) is built to connect the G-structures.

These last two operations may be illustrated by the example of trees. First, define X to be the species "singleton" whose generating series is X(x) = x. Then the species Ar of rooted trees (from the French "arborescence") is defined recursively by Ar = X · E(Ar). This equation says that a tree consists of a single root and a set of (sub-)trees. The recursion does not need an explicit base case: it only generates trees in the context of being applied to some finite set. One way to think about this is that the Ar functor is being applied repeatedly to a "supply" of elements from the set: each time, one element is taken by X, and the others distributed by E among the Ar subtrees, until there are no more elements to give to E. This shows that algebraic descriptions of species are quite different from type specifications in programming languages like Haskell.
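At the level of generating series, the equation Ar = X · E(Ar) becomes T(x) = x · exp(T(x)), and its coefficients can be found by iterating the equation on truncated power series. The following is only a sketch: the helper names are hypothetical, series are plain coefficient lists, and the simple fixed-point iteration is chosen for clarity rather than speed. Starting from T = 0, each pass determines at least one more coefficient, mirroring the "no explicit base case" remark above.

```python
from fractions import Fraction
from math import factorial


def multiply(p, q, order):
    """Product of two power series (coefficient lists), truncated after x^order."""
    result = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > order:
                break
            result[i + j] += a * b
    return result


def exponential(p, order):
    """exp(p(x)) truncated after x^order, assuming p has zero constant term."""
    result = [Fraction(0)] * (order + 1)
    result[0] = Fraction(1)
    term = result[:]  # holds p^k / k!, starting with k = 0
    for k in range(1, order + 1):
        term = [c / k for c in multiply(term, p, order)]
        result = [r + t for r, t in zip(result, term)]
    return result


def rooted_tree_counts(order):
    """Numbers of rooted labelled trees on 0..order nodes, read off from T = x * exp(T)."""
    T = [Fraction(0)] * (order + 1)
    x = [Fraction(0), Fraction(1)]
    for _ in range(order):  # after k passes, coefficients up to x^k are final
        T = multiply(x, exponential(T, order), order)
    return [int(c * factorial(n)) for n, c in enumerate(T)]


print(rooted_tree_counts(5))  # [0, 1, 2, 9, 64, 625]
```

The values agree with the classical count n^(n−1) of rooted labelled trees on n nodes.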

Likewise, the species P can be characterised as $P = E(E_+)$, where $E_+$ is the species of non-empty sets: "a partition is a pairwise disjoint set of nonempty sets (using up all the elements of the input set)". The exponential generating series for P is $P(x) = e^{(e^x - 1)}$, which is the series for the Bell numbers.
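The definition of composition can be turned into a direct enumeration: choose a partition of A, an F-structure on the set of blocks, and a G-structure on each block. The sketch below uses hypothetical helper names and models a species as a function returning a list of structures. Composing the species of sets with the species of non-empty sets reproduces the Bell numbers, as the characterisation of partitions above predicts.

```python
from itertools import product


def set_partitions(elems):
    """Yield every partition of the list elems as a list of blocks (lists)."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def compose_structures(F, G, A):
    """Enumerate (F o G)[A] as triples (blocks, f, gs).

    `blocks` is a partition of A, f an F-structure on the set of blocks,
    and gs holds one G-structure per block.
    """
    structures = []
    for partition in set_partitions(sorted(A)):
        blocks = tuple(tuple(block) for block in partition)
        for f in F(blocks):
            for gs in product(*(G(block) for block in blocks)):
                structures.append((blocks, f, gs))
    return structures


def sets_on(A):
    return [frozenset(A)]


def nonempty_sets_on(A):
    """No structure on the empty set, as composition requires."""
    return [frozenset(A)] if A else []


bell = [len(compose_structures(sets_on, nonempty_sets_on, range(1, n + 1)))
        for n in range(6)]
print(bell)  # [1, 1, 2, 5, 15, 52]
```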

### Differentiation

Differentiation of species intuitively corresponds to building "structures with a hole", as shown in the illustration below. Formally,

:$(F')[A] = F[A \uplus \{\star\}],$

where $\star$ is some distinguished new element not present in $A$.

To differentiate the associated exponential series, the sequence of coefficients needs to be shifted one place to the "left" (losing the first term). The definition for species does exactly this: F′[A] = F[A + {*}], where {*} is a singleton set and "+" is disjoint union, so the number of F′-structures on a set of n elements is the number of F-structures on a set of n + 1 elements. The more advanced parts of the theory of species use differentiation extensively, to construct and solve differential equations on species and series. The idea of adding (or removing) a single part of a structure is a powerful one: it can be used to establish relationships between seemingly unconnected species.

For example, consider a structure of the species L of linear orders—lists of elements of the ground set. Removing an element of a list splits it into two parts (possibly empty); in symbols, this is L' = L·L. The exponential generating function of L is L(x) = 1/(1 − x), and indeed:

:$\frac{d}{dx} (1-x)^{-1} = (1-x)^{-2}.$

The species C of cyclic permutations takes a set A to the set of all cycles on A. Removing a single element from a cycle reduces it to a list: C' = L. We can integrate the generating function of L to produce that for C.

:$C(x) = \int_0^x \frac{dt}{1-t} = \log \frac{1}{1-x}.$
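Both identities can be checked on a small set. In this sketch (hypothetical names; the marker `"*"` stands in for the new element, and cycles are written as tuples starting from a fixed element) the derivative is implemented literally as "build the structure on A plus one extra point". For C′ = L the bijection is explicit: cut the cycle at the extra point and read off the remaining elements in order.

```python
from itertools import permutations
from math import comb, factorial

STAR = "*"  # the distinguished new element; assumed not to be a label in A


def derivative(F):
    """Return the species F' with F'[A] = F[A + {*}]."""
    def F_prime(A):
        assert STAR not in A
        return F(list(A) + [STAR])
    return F_prime


def linear_orders_on(A):
    """L[A]: all lists that use each element of A exactly once."""
    return list(permutations(A))


def cycles_on(A):
    """C[A]: cyclic permutations of A, each written starting from the first element."""
    elems = list(A)
    if not elems:
        return []
    return [(elems[0],) + rest for rest in permutations(elems[1:])]


def cycle_to_list(cycle):
    """Cut a cycle at STAR: the elements following STAR, read cyclically, form a list."""
    i = cycle.index(STAR)
    return cycle[i + 1:] + cycle[:i]


A = [1, 2, 3]

# C' = L: cutting each cycle on A + {*} at * yields every list on A exactly once.
cut_cycles = [cycle_to_list(c) for c in derivative(cycles_on)(A)]
print(sorted(cut_cycles) == sorted(linear_orders_on(A)))  # True

# L' = L . L: compare the number of L'-structures with the product-rule count.
n = len(A)
product_rule = sum(comb(n, k) * factorial(k) * factorial(n - k) for k in range(n + 1))
print(len(derivative(linear_orders_on)(A)), product_rule)  # 24 24
```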

A nice example of integration of a species is the completion of a line (coordinatized by a field) by adding a point at infinity, which yields a projective line.

### Further operations

There are a variety of other manipulations which may be performed on species. These are necessary to express more complicated structures, such as directed graphs or bigraphs.

Pointing selects a single element in a structure. Given a species F, the corresponding pointed species $F^\bullet$ is defined by $F^\bullet[A] = A \times F[A]$. Thus each $F^\bullet$-structure is an F-structure with one element distinguished. Pointing is related to differentiation by the relation $F^\bullet = X \cdot F'$, so $F^\bullet(x) = x F'(x)$. The species of pointed sets, $E^\bullet$, is particularly important as a building block for many of the more complex constructions.

The Cartesian product of two species is a species which can build two structures on the same set at the same time. It is different from the ordinary multiplication operator in that all elements of the base set are shared between the two structures. An (F × G)-structure can be seen as a superposition of an F-structure and a G-structure. Bigraphs could be described as the superposition of a graph and a set of trees: each node of the bigraph is part of a graph, and at the same time part of some tree that describes how nodes are nested. The generating function (F × G)(x) is the Hadamard or coefficient-wise product of F(x) and G(x).

The species $E^\bullet \times E^\bullet$ can be seen as making two independent selections from the base set. The two points might coincide, unlike in X·X·E, where they are forced to be different.

As functors, species F and G may be combined by functorial composition: $(F \,\Box\, G)[A] = F[G[A]]$ (the box symbol is used, because the circle is already in use for substitution). This constructs an F-structure on the set of all G-structures on the set A. For example, if F is the functor taking a set to its power set, a structure of the composed species is some subset of the G-structures on A. If we now take G to be $E^\bullet \times E^\bullet$ from above, we obtain the species of directed graphs, with self-loops permitted. (A directed graph is a set of edges, and edges are pairs of nodes: so a graph is a subset of the set of pairs of elements of the node set A.) Other families of graphs, as well as many other structures, can be defined in this way.
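Functorial composition is the easiest operation to sketch in code, because it is literally function composition on sets of structures. The example below is illustrative (the names are hypothetical): it takes G to be the species of ordered pairs of points, that is, two independent selections from the base set, and F to be the power set species, and so enumerates directed graphs with self-loops allowed as sets of edges.

```python
from itertools import chain, combinations, product


def pairs_on(A):
    """Two independent selections from A: all ordered pairs, equal entries allowed."""
    return list(product(sorted(A), repeat=2))


def power_set_on(S):
    """All subsets of the collection S, as frozensets."""
    items = list(S)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(subset) for subset in subsets]


def functorial_composition(F, G):
    """(F box G)[A] = F[G[A]]: build an F-structure on the set of G-structures on A."""
    return lambda A: F(G(A))


digraphs_on = functorial_composition(power_set_on, pairs_on)

# A set of n nodes has n^2 possible edges, hence 2^(n^2) directed graphs.
print(len(digraphs_on({1, 2})))     # 16
print(len(digraphs_on({1, 2, 3})))  # 512
```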

## Software

Operations with species are supported by SageMath and, using a special package, also by Haskell.

## Variants

• A species in k sorts is a functor $\mathcal{B}^k \rightarrow \mathcal{B}$. Here, the structures produced can have elements drawn from distinct sources.
• A functor to $\mathcal{B}_R$, the category of R-weighted sets for R a ring of power series, is a weighted species.

If “finite sets with bijections” is replaced with “finite vector spaces with linear transformations”, then one gets the notion of polynomial functor (after imposing some finiteness condition).