This is (a slightly modified version of) a text message exchange which recently occurred between my violin teacher and me.
- (1) Teacher: “Can we meet today instead of tomorrow?”
- (2) Me: “That’d be great!”
- (3) Teacher: “Cool, see you this afternoon.”
- (4) Me: “Ok.”
It would not have been acceptable for me to fail to respond to my teacher’s message (1). If I didn’t respond, my teacher would have no way to know whether I ever received her message – and, hence, whether to come today or tomorrow.
Neither would it have been ok, for that matter, for my teacher to let the conversation end at message (2). Until I receive her confirmation (3), I can in no way be sure whether she has seen or acknowledged my message (2). In other words, with her message (3) unsaid, it could remain the case, for all I know, that my teacher, as yet unaware of my response (2), imagines me unaware of (1) and still intent on coming tomorrow.
Even after I received my teacher’s message (3), though, it was important for me to send the further message (4). After all, until she receives my message (4), my teacher may well imagine me unaware of her message (3). In that situation — her thought process might go — I would, unaware of her confirmation (3), be liable to suspect her unaware of my response (2), and hence unsure of my receipt of (1), and so liable to come tomorrow.
Why doesn’t this continue?
- Teacher: “Great.”
- Me: “Yep.”
- Teacher: “Indeed.”
The situation is very complicated.
Significance in game theory
This sort of situation occurs, to begin with, in the presence of a coordination game where the cost of failure to coordinate is high. (Who wants to take the bus all the way down to Peabody for nothing?) In this situation, though, the interesting behavior is rooted in communication. In a very subtle way, it is difficult for both parties to agree upon a change in strategy. Observe, for example, that this phenomenon is particularly pronounced whenever communication is unreliable, opaque, or intermittent. Indeed, the latency of communication is key.
| |Today|Tomorrow|
|---|---|---|
|Today|10, 10|0, 0|
|Tomorrow|0, 0|8, 8|
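To make the structure of this game concrete, here is a minimal Python sketch that checks which pairs of choices are stable, in the sense that neither of us would gain by changing our mind alone. The names `PAYOFFS`, `ACTIONS`, and `pure_nash_equilibria` are my own, and I assume that the first coordinate of each pair is my choice and the second is my teacher's (the payoffs are symmetric, so nothing hinges on this).

```python
from itertools import product

ACTIONS = ["Today", "Tomorrow"]

# PAYOFFS[(my_action, teacher_action)] = (my_payoff, teacher_payoff),
# copied from the payoff table. Which player is "row" is an assumption.
PAYOFFS = {
    ("Today", "Today"): (10, 10),
    ("Today", "Tomorrow"): (0, 0),
    ("Tomorrow", "Today"): (0, 0),
    ("Tomorrow", "Tomorrow"): (8, 8),
}


def pure_nash_equilibria(payoffs, actions):
    """Return the action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for mine, hers in product(actions, repeat=2):
        my_payoff, her_payoff = payoffs[(mine, hers)]
        # Hold her action fixed and try every alternative for me.
        i_cannot_improve = all(
            payoffs[(alt, hers)][0] <= my_payoff for alt in actions
        )
        # Hold my action fixed and try every alternative for her.
        she_cannot_improve = all(
            payoffs[(mine, alt)][1] <= her_payoff for alt in actions
        )
        if i_cannot_improve and she_cannot_improve:
            equilibria.append((mine, hers))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFFS, ACTIONS)
print(equilibria)
```

Both "meet today" and "meet tomorrow" come out as stable, which is exactly why moving from one to the other requires each of us to trust that the other is moving too.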
Before either of us (my teacher or me) is willing to change times, that person must be sure that the other party intends to do the same. The first person’s confidence that the other party will do so, however, depends in a crucial way on the first person’s conviction that the other party believes that the first person will change. This imagined belief on the part of the other party, in turn, depends on the imagined presence of a further corresponding belief in the first person, and so on. Each level of confidence depends on the next, and the regress never terminates.
Is this just idle speculation? Probably. On the other hand, if we set the stakes high enough, then this phenomenon, I contend, can become very real. Let’s imagine, indeed, that our coordination game imposes very high costs on acting alone. I don’t want to show up when my teacher is not there; suppose in fact that the shuttle to Peabody requires tickets, of which I have only one. I certainly don’t want my teacher to arrive without me; that would be rude, and makeup lessons aren’t available, as it turns out. Finally, suppose that my recital is this weekend.
| |Today|Tomorrow|
|---|---|---|
|Today|10, 10|-1000, -1000|
|Tomorrow|-1000, -1000|8, 8|
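A rough way to see why the higher stakes matter is to compute the expected payoff of showing up today when I am almost, but not quite, sure that my teacher will do the same. The sketch below is illustrative only: the function `expected_payoff` and the confidence level `q = 0.99` are assumptions of mine, and the calculation ignores the payoff of the alternative (staying with tomorrow).

```python
def expected_payoff(q, match, mismatch):
    """Expected payoff of showing up today, given belief q that the other party will too."""
    return q * match + (1 - q) * mismatch


q = 0.99  # hypothetical confidence that my teacher will also come today

# Payoffs taken from the two tables: 10 if we both come today,
# 0 (low stakes) or -1000 (high stakes) if only one of us does.
low_stakes = expected_payoff(q, match=10, mismatch=0)
high_stakes = expected_payoff(q, match=10, mismatch=-1000)

print(low_stakes, high_stakes)
```

With the original payoffs, a one percent doubt barely matters (the expected payoff is roughly 9.9). With the high-stakes payoffs, the same one percent doubt drags the expected payoff to roughly -0.1, so one more confirmation starts to look worthwhile.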
To emphasize the role of communication, furthermore, let’s also assume that my violin teacher is often unable to check her texts, and that she, due to her musical obligations, frequently turns her phone on silent. (None of this is true.)
In this situation, we might imagine that the conversation initially described above could continue well past stage (4).
We’ll see that this phenomenon can be explicated in a quite transparent, formal way.
We’ve seen that to act with confidence requires an infinite chain of affirmations on the part of the two players. This chain, however, can only ever be extended to finite length. The first absent element will create cause for uncertainty.
Take a sequence of messages (1), (2), …, (n), as above, consisting of a proposal for change followed by a series of affirmations. The inevitable result is that:
The recipient of (n) fears that the recipient of (n-1) fears that the recipient of (n-2) fears that… the recipient of (1) fears that the sender of (1) worries that (1) was never delivered.
Here is an outline of the inductive process that leads to the above belief.
- To begin with, the recipient of (n) fears that the sender of (n) worries that (n) was never delivered.
- In fact, suppose that the recipient of (n) fears that the recipient of (n-1) fears that… the recipient of (i) fears that the sender of (i) worries that (i) was never delivered.
- Because message (i) serves to confirm the receipt of message (i-1), and because the recipient of (i) fears that the message’s sender doubts its successful delivery, the recipient of (i) begins, according to this nested chain of beliefs, to postulate an additional fear, whereby the sender of (i) — who, incidentally, received message (i-1) — fears that the sender of (i-1) in turn doubts message (i-1)’s successful delivery.
- In other words, the recipient of (n) fears that the recipient of (n-1) fears that… the recipient of (i-1) fears that the sender of (i-1) worries that (i-1) was never delivered.
- Eventually, we deduce from the initial hypothesis that, unfortunately, the recipient of (n) fears that the recipient of (n-1) fears that… the recipient of (1) fears that the sender of (1) worries that (1) was never delivered.
Hence the eventual transmission of message (n+1). The result of this transmission is, of course, that
The sender of (n) knows that the sender of (n-1) knows that… the sender of (1) knows that (1) was delivered.
Indeed, after message (n+1) is received,
- The sender of (n) now knows that (n) was delivered.
- In fact, suppose that the sender of (n) knows that the sender of (n-1) knows that… the sender of (i) knows that (i) was delivered.
- Because message (i) serves to confirm the delivery of message (i-1), in this nested chain of beliefs, the sender of (i) now also knows that the sender of (i-1) knows that (i-1) was delivered.
- In other words, the sender of (n) knows that the sender of (n-1) knows that … the sender of (i-1) knows that (i-1) was delivered.
- By induction, we deduce that the sender of (n) knows that the sender of (n-1) knows that… the sender of (1) knows that (1) was delivered.
But the transmission of (n+1) just adds to this regression one further level of depth! Here’s the problem:
- The recipient of (n+1) fears that the sender of (n+1) worries that (n+1) was never delivered.
Now the first induction begins again in the mind of the recipient of (n+1).
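The two inductions above can be unrolled mechanically. The following Python sketch (the function names `fear_chain` and `knowledge_chain` are hypothetical, chosen for this illustration) simply builds the nested statements for a given n, which makes it plain that each additional message adds exactly one link and that the chain is finite for every n.

```python
def fear_chain(n):
    """Nested fear held by the recipient of message (n) before (n+1) is sent."""
    links = [f"the recipient of ({k}) fears that" for k in range(n, 0, -1)]
    sentence = " ".join(links) + " the sender of (1) worries that (1) was never delivered."
    return sentence[0].upper() + sentence[1:]


def knowledge_chain(n):
    """Nested knowledge held by the sender of message (n) once (n+1) has arrived."""
    links = [f"the sender of ({k}) knows that" for k in range(n, 0, -1)]
    sentence = " ".join(links) + " (1) was delivered."
    return sentence[0].upper() + sentence[1:]


for n in range(1, 4):
    print(fear_chain(n))
    print(knowledge_chain(n))
```

However large n is made, `fear_chain(n + 1)` is always available as the next worry, which is the regress in miniature.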
Practical factors in achieving certainty
How is certainty achieved? It’s not clear that it ever is – through text message, at least. Confidence does grow as the conversation lengthens, converging, in some sense, to surety. In practice, the two communicators will simply achieve a level of certainty sufficient for them to stop. (For us, this occurred at message (4).)
This convergence occurs for a number of reasons. Firstly, as the number of messages exchanged increases, the relevant fears become more intricate and consequently less plausible. Furthermore, an increase in conversation length – and more importantly, response speed – renders the fear of missed messages less compelling. If a number of messages are exchanged in short succession, each party may reasonably take the other to have his/her phone on hand, and the probability that the most recent message awaits unread becomes quite low.
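As a toy illustration of this convergence, suppose (purely as an assumption of mine, not something established above) that each additional level of nested fear is only a fixed fraction `doubt` as plausible as the level before it. The plausibility of the full fear after n messages is then `doubt ** n`, and we can ask how many messages it takes before that falls below some tolerance at which both parties are content to stop. The function name and the parameter values are hypothetical.

```python
def messages_until_confident(doubt, tolerance):
    """Smallest n for which doubt ** n drops below tolerance.

    Toy model: `doubt` is the assumed plausibility of each extra level of
    nested fear, and `tolerance` is the residual worry both parties accept.
    """
    if not 0 < doubt < 1:
        raise ValueError("doubt must lie strictly between 0 and 1")
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    n = 1
    plausibility = doubt
    while plausibility >= tolerance:
        n += 1
        plausibility *= doubt
    return n


# Hypothetical numbers: each level of fear is 15% as plausible as the last,
# and we stop worrying once the nested fear is below one in a thousand.
print(messages_until_confident(doubt=0.15, tolerance=0.001))
```

Under this model the conversation always stops after finitely many messages, but never with the plausibility of the fear at exactly zero; a less reliable channel (a larger `doubt`) simply pushes the stopping point further out.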
Another solution – from outside, so to speak – is the read receipts featured in many more sophisticated messaging clients; these indicate to a message’s sender that the message has been read. The fact that a message has been read does not provide conclusive information about the reader’s response. In certain situations, though, it can provide crucial clues. In particular, if, in a conversation such as that described above – and, in particular, in one of its later stages – the recipient of message (n) understands its sender to have received a read receipt, then he/she may directly bypass the cycle of worry which led to the transmission of message (n+1). (If a change of plans were afoot, for one, the recipient would have spoken up, and a read receipt in this special case can be understood as an acknowledgement.)
There’s one crucial flaw, though. The message’s recipient might worry that the message’s sender hasn’t actually seen the read receipt. This, in fact, is the same sort of worry which set off the above cascade to begin with. Read receipts offer no solution at all, and actually just recast the earlier problem in a different form. We would need receipts of the receipt of read receipts, and so on.
Even setting this aside (assuming, that is, that all receipts are seen), there are further difficulties. What if the read receipt were generated accidentally or incorrectly? This prospect would decrease the faith on the part of the message’s recipient that its sender, upon receiving the receipt, would cease to worry about the message’s successful delivery. What if one party took the other to be the sort of rude person who, despite an intent to change or cancel plans, might read a message and fail to respond to it? This prospect, the recipient might fear, could impair the disposition of the sender to accept the read receipt as a confirmation by situational inference alone. These uncertainties are of a different sort, though, and in any case read receipts could speed the process of convergence.
One final possibility is in-person communication. The process of eye contact shortens, in effect, the messages’ latency time to 0, and instantly creates an infinite chain of certainty.
Handshaking in computer science
In computer science, a client and a server (the host, below) seeking to open a reliable connection must first undergo a process called handshaking. In the TCP (Transmission Control Protocol) three-way handshake, for example:
- The client sends a synchronize message with a random number A.
- The host sends back the synchronize-acknowledgement number, A+1, together with a second random number, B.
- The client sends back the acknowledgement number, B+1, to which the host need not reply.
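The three steps above can be sketched as a small in-memory simulation. This is not real networking code: the `Segment` class and the `three_way_handshake` function are hypothetical names for this illustration, nothing is ever lost or delayed, and actual TCP segments carry more fields than the two shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    sender: str
    kind: str
    number: Optional[int]  # the sender's own random number, if it sends one
    ack: Optional[int]     # acknowledgement of the other party's number, if any


def three_way_handshake(a: int, b: int) -> List[Segment]:
    """Return the three segments exchanged, given the random numbers A and B."""
    # Step 1: the client proposes its number A.
    syn = Segment("client", "SYN", number=a, ack=None)
    # Step 2: the host acknowledges A with A+1 and proposes its own number B.
    syn_ack = Segment("host", "SYN-ACK", number=b, ack=syn.number + 1)
    # Step 3: the client acknowledges B with B+1; the host does not reply.
    ack = Segment("client", "ACK", number=None, ack=syn_ack.number + 1)
    return [syn, syn_ack, ack]


# Hypothetical values for A and B; real implementations choose them unpredictably.
for segment in three_way_handshake(a=100, b=300):
    print(segment)
```

Note that the list has exactly three entries by construction: the protocol, not the parties' confidence, decides where the exchange stops.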
The numbers A and B, by analogy, might represent the two parties’ respective strategies in some sort of large coordination game. And yet if the proposed channel represented a change from a previous pair of numbers — or from no channel at all — we might ask why the problems discussed above wouldn’t continue to exist here.
The three-way handshake offers a subtle perspective on our problem. The computers declare in advance, as a matter of norm, to cut short the conversation after three steps. (Though the number three here is essentially arbitrary, it permits the communication of both parties’ parameters as well as the recognition on the part of each party that the other party received its parameter.) If things were to proceed as our earlier examples did, then the host might now begin to wonder whether the client is fretting about the successful delivery of his acknowledgement B+1 (and so on). With the handshake norm in place, though, the host need not entertain any such worry — the host understands that, by protocol, he is not expected to reply.
This might seem like an artificially imposed solution. The fact that the host is forbidden from responding does nothing, in the host’s mind, to allay his worries that the client, doubting the host’s successful receipt of the acknowledgement B+1 (and so on), fears the latter’s failure to consider the handshake complete. After all, in the client’s mind (the host imagines), the host’s failure to respond could be a result of protocol or of a failure to receive the message in the first place.
Here’s the thing. The host can instead open the connection immediately after receiving the client’s initial request A. Though the host, as we’ve already mentioned, might imagine, after receiving the final confirmation B+1, that the client wonders whether the confirmation was received, such wondering, if it existed, would be inconsequential on the part of the client, who now has the right to expect — by protocol — that the host has already opened the connection by the time the first confirmation A+1 arrives.
Now it might appear that we’ve just traded this problem for a simpler one. After all, to demand that the host open the connection immediately after receiving the message A is to ask that the host open this connection at a time when, for all the host knows, the client has yet to receive A+1 and B at all and might never receive them. In other words, we’re demanding that the host open the connection using A before being sure that the client will open the connection using B. This seems to violate the “coordination game” philosophy we’ve developed — this demand, according to the standard analysis, represents a significant risk for the host.
The solution might be a disappointing one. This really is not a coordination game. Indeed, we can safely suppose that the host, in offering a connection which might ultimately go unused, has little to lose. The crucial problem hasn’t gone anywhere! (It’s possible that I’m understanding the handshake incorrectly. As far as I can tell, in standard TCP the host sets aside state for the connection upon receiving the initial request in step 1, but treats the connection as established only once the final acknowledgement of step 3 arrives. In any case, each of these strategies has issues, as I describe above.)
This seems to be an unavoidable difficulty in the theory of coordination games.