The Greatest Theorem

This article is part of a series on The Structure of Theorems. See also:
1. Theorems’ Almanack; 2. The Greatest Theorem; 3. Game of Theorems

Ordinary words can take on new meanings within math. Trivial, for example – though hardly part of our daily discourse – appears frequently, describing something subtly in-between “true because of definitions” and “true for reasons simpler than befitting the seriousness of this situation”. A priori is perhaps more interesting. Though it’s traditionally a philosophical term – describing facts knowable without appeal to the senses but rather to logic alone – in math, further logical investigation fills empiricism’s vacant role. “A priori, we do not know whether the following property holds,” a mathematician might explain. “But upon appeal to further (i.e., logical!) mathematical arguments, we’ll soon find out that it does.”

It should be no surprise, then, that the apparently innocuous words weaker and stronger will prove sufficiently interesting to occupy our attention for the remainder of this essay.

Strictly speaking, that proposition A is stronger than B means that A implies B. That A is weaker than B means the opposite thing: that B implies A.

The statement “You’re reading this post” is weaker than the statement “You’re enjoying this post“.

Why? Because if you’re enjoying this post, you’re definitely reading it too.

You’ll make it to the end of this post” is the strongest of all.

Unless you’re very generous.

Entailments among constituents induce entailments among theorems

Entailment facts about the constituent propositions of theorems can carry over into entailment facts between the theorems themselves. The relationship is tricky. Suppose that A is stronger than B and P is stronger than Q.

When P is stronger than Q, it follows that the implication A P is stronger than the implication A Q. Why? Because the former implies the latter! Intuitively, we’ve established a stronger result using the same hypothesis. More explicitly, we can produce the second implication from the first by appending Q to its end, yielding a chain A P Q. We might say that (P Q) induces (A P) → (A Q).

We can also chain up on the other side. Though A is stronger than B, the implication A P is actually weaker than the implication B P. After all, concluding the result P using the weaker hypothesis B alone is a heftier feat than concluding it using the full power of A. Now, the first implication can be obtained from the second by appending A to its beginning, yielding a chain A BP. We can say that (AB) induces (BQ) → (AQ).

These phenomena can be understood through a simple idea from category theory. When we speak of entailments between conclusions, direction is preserved. Fixing some hypothesis A, for example, entailments between conclusions like PQ carry over to larger entailments like (A P) → (A Q), and P and Q appear in their original order. This transformation could be called covariant. On the other hand, when we fix a conclusion and speak of entailments between hypotheses, direction is reversed. Fixing some conclusion P, the entailment relation A B carries over into the entailment (B P) → (A P), and A and B appear in the opposite order. This relationship could be called contravariant.

Understanding theorems as larger trees in the sense of a previous article, the result gets still more complicated. How do entailment relations among constituent propositions induce entailment relationships among theorems? This is quite complex – and I’m not yet sure whether I fully understand it – so I’ll keep things brief.

We’ll start first with a simple chain. Suppose that X Y. Then the two chains AX P and AY P are both stronger than the disjointed theorem consisting of the two chains AY and X P. Indeed, supposing either of the former, we can prove the latter: in either case one of the two implications is given and the other can be attained using a simple affixation. In short, we’ve either proven something weaker or supposed something stronger. On the other hand, both of these chains are weaker than the disjointed theorem consisting of the two chains A X and YP. The formal reasoning is similar. In this latter case we’ve either proven something stronger or supposed something weaker.

Interestingly, in general there is no entailment relation between the two chains AX P and AY P. Between them, we see a non-monotonic shift, gaining strength in one place and losing it in another.

This extends our previous comments. Removing A or P, we recover the covariant and contravariant examples as special cases.

This idea can be extended to the setting of arbitrary theorem trees. Given a node in a tree and an entailment relation concerning that node, we can produce stronger or weaker trees by severing either the link to its parent or the links to its children and adding to the severed link or to each of the severed links a stronger or weaker node (respectively, respectively, in various orders).

Most generally, consider any connected subgraph of the tree – which is necessarily (but not trivially?…) also a tree – and an implication relation between this sub-theorem and any other theorem. Removing this subgraph, we sever a number of paths between, on the one hand, the parent of the subgraph’s root, and, on the other, the children of its leaves. We can produce stronger or weaker trees, respectively, by pre-appending to the subgraph’s parent or post-appending to each of the subgraph’s children a stronger or weaker theorem (again, respectively, respectively).

If we have multiple entailment relations, we can append different objects to the different children. Furthermore, these constructions all have dual versions in the dualized CNF tree (see this comment.) Finally, it should be noted that all of these modifications result in disconnected trees. The ultimate theorem, in each case, is these trees’ conjunction.

Understanding entailment relations as subset relations

There is another way to view many of these ideas related to strength and weakness. Theorems often engage in the act of drawing subsets. A theorem asserting that condition A implies result P could be understood, alternatively, as the statement that in some certain of objects, the subset consisting of those which satisfy condition A is itself a subset of the subset consisting of those for which result P holds.

I’ve distinguished between propositions which I’ve called conditions (A, B) and propositions which I’ve called results (P, Q). The distinction is instructive. The former are understood as assumptions, and the latter are seen as consequences. In mathematics many theorems assert things like “all objects satisfying the following condition enjoy the following result.” The words necessary and sufficient, in fact – ubiquitous, especially in older texts – conceal an implicit designation of precedence: condition A is necessary and sufficient for result P. The other way around would make logical sense, but would violate standard English usage rules, or this choice of precedence (but not both!). This is a fruitful way to understand mathematics.

Nonetheless, the distinction is somewhat arbitrary. More precisely, it evaporates if we consider logic alone. The chain A B P Q means the same thing, after all, whether we place our imagined partition down the middle or elsewhere.

This subset viewpoint well incorporates our new agnostic outlook. Theorems assert that subset the for which one property holds is a subset of the subset for which another property holds, and we need not distinguish between conditions and results. Necessary-and-sufficient statements argue that the boundaries of two propositions’ corresponding subsets fall identically.

The strongest theorems are those that establish the most containment relations. Suppose as before that A is stronger than (contained in) B and P is stronger than (contained in) Q. In the covariant case, our stronger result asserted not that the subset Q contains the fixed subset A, but rather that the even smaller subset P does, yielding P ⊂ Q. In the contravariant case, our stronger result asserted not that A is contained inside the fixed subset P, but rather that the even larger subset B is, yielding A ⊂ B ⊂ P. These two cases are of course indistinguishable from the logical point of view. (The covariance and contravariance results from fixing one side of the implication and permitting the other to induce entailments.)

In more complex theorems, unions, intersections, and complements play the role of disjunctions, conjunctions and negations, respectively. I’ll leave the reader to imagine what a more complicated theorem might “look like”.

Subtly, this subset visualization technique only works on theorems whose propositions all concern the same set of objects. In particular, theorems involving constituent propositions which are themselves theorems generally can’t be visualized in this way. After all, a theorem about a relationship between properties of objects (for example, A P) is itself not a property of these objects, but rather of the world, or of logic, and it can’t be drawn in as some subset of these objects. (Meta-levels can exist. When each member of one collection of objects gives rise to a further collection, which in turn features properties A and P, we could consider the subset of this first collection of objects consisting of those for which the theorem A → P holds. This is indeed a property of the first collection of objects, and thus could be compared to other properties.) Thus this subset-comparison technique can only be applied in the somewhat restrictive case of statements that are all properties of the same set of objects.

Nonetheless, it is valuable for mathematical understanding.

What mathematicians are after

Mathematicians seek to say as many things as possible about as many things as possible. On the one hand, this means proving the theorems which imply the most other theorems. On the other hand, this means describing in as much detail as possible where exactly various subsets’ boundaries fall.

The previous remarks demonstrate that these two are one in the same.

What makes certain properties’ boundaries especially worth demarcating? Which particular strong statements are valuable, and why? One answer is just that mathematicians simply seem to have an idea of which further pieces of information about a mathematical system would best contribute to the better and fuller understanding of that system. These gaps in understanding invite other projects, which beget other projects, and so on, in a hierarchical network of dependency.

But this begs the question. Which factors motivate this original need for understanding? Where does the arc of mathematical progress come from?

I’m unable to satisfyingly answer this question, and so I’ll rest on the old adage: Truth, Beauty, and Applications.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s