The World-Builders

This article is part of a series entitled The Unlimited Mind. See also:
1. On Memory; 2. The Genius Within; 3. The World-Builders

I’ve been fascinated with expertise since childhood. And it started over the chessboard. My dad would beat me—swiftly, crushingly, and above all, effortlessly—time and time again. He understood lines and positions in a way that I just couldn’t, and, as it seemed to me, would never be able to. My first question at the end of most games was: “where did I go wrong?”


Chess has served as a popular topic of study for those seeking to understand expertise.

Almost more unnerving than my dad’s ability was the fact that there were people out there who could, just as easily, beat him. “In college in Russia, I played a classmate of mine, who was a master,” he told me once. “I would think all night about my move, and then the next day in class, he’d move right away. Still, he beat me easily.”

Thus my interest in expertise was born. It seemed that some just had some sort of divine gift, which beckoned them onto a higher plane of understanding. For me to attempt to reach those heights would be futile. I could only watch in awe from below.

As I grew older, my skills improved. My games with my dad grew stricter and cleaner, until, one day, I beat him. In time, whether I won or lost, I was always able to give him a fair fight. I came to appreciate chess as an incredibly rich and rewarding game.

But my view of expertise—now that I had a taste of it—had lost a bit of its sparkle. We’ve all heard that 10,000 hours of practice can make an expert (1). I myself was approaching expertise, and it wasn’t by virtue of some sort of visitation. Expertise was, it seemed, simply a matter of paying dues.

Is 10,000 hours all there is to it? I sought to understand expertise further—even at the risk of growing to admire it even less.

Practice time certainly does matter. In his seminal paper The Role of Deliberate Practice in the Acquisition of Expert Performance, Ericsson studies violinists at the Music Academy of West Berlin. Students from a single department were classified as “best violinists” and “good violinists” by their professors.


Students training to become music teachers, from the department of music education, were also included in the study, as were professional violinists from one of Berlin’s top two symphony orchestras.

There’s a clear correspondence between ability and hours practiced. By 18, the best violinists had practiced 7,410 hours on average, compared to the good violinists’ 5,310 (1). By age 20, the best students had largely reached the revered 10,000 hours. Maybe practice really is all there is to perfection.

The 10,000 hours figure shows up elsewhere. Chase and Simon estimate that chess masters have spent between 10,000 and 50,000 hours studying chess positions (2). What’s being achieved during these many hours? Chase and Simon go on to claim that chess masters have amassed stores of roughly 50,000 patterns. Experts are studying long hours to build massive vocabularies.

This finding, though, leads to what Ericsson and Staszewski (3) call “the paradox of expertise”: how do experts so quickly and effectively process the massive amounts of information they store? My dad’s college classmate moved instantly. If he did in fact store ten times the information, one would expect him to take ten times longer to find a move. Experts, though, aren’t just more effective than novices. They’re also quicker.

Ericsson and Staszewski propose that experts don’t just amass memory; they build skilled memory, where knowledge is stored in complex hierarchical structures. They study a runner named DD, who, after several years of practice, attained a digit span of 106 digits, the highest ever recorded. In examining the strategies he used to memorize this many digits, they found that his storage schemes were not linear, but rather strictly hierarchical.


A 100-digit string was broken into supergroups, each of which contained several small strings. Each small string was coded as a running time, age, or date. Each supergroup, then, might be coded as a certain runner, of a certain age, who achieved a given time on a given date.  In this way, strings of random numbers were given meaning. Only by drawing meaning from disorder was DD able to recall a hundred digits at a time.

Of course, DD’s meanings were meaningful in a superficial sense. Each time a new 100-digit string was assigned, DD produced new runners, ages and dates. But in the case of chess, the same patterns are seen over and over, and eventually they develop nontrivial meaning. And once patterns develop meaning, they’re organized into the hierarchical structure according to their meaning and significance.

The difficulty of the task may lie in knowing what’s meaningful in the first place, and in knowing which hierarchical arrangements are instructive and which ones aren’t. But, once a simple scheme is built, the expert-in-training may use this scheme to develop a better understanding of the terrain, which he can then use to build more-sophisticated schemes. The process can then repeat itself. Through this procedure, the chess player builds much larger, deeper, and more complicated structures than the mnemonist. As DD learned to memorize 100 meaningless numbers, so the chess player memorizes 50,000 meaningful patterns.

This hierarchical tree allows the chess player to make decisions easily and quickly. In a game, an expert might see a position, which he would then associate with a given node in his tree. This node might correspond, for example, to the move “white moves pawn to f5” in the context of black’s Sicilian Dragon opening. An expert could quickly accept or disqualify that move, because he has easy access to the information below that node.

Underneath the expert’s f5 node, of course, lies much more detailed information: the fact that f5 unduly gives up control of e5, which is held by black’s black-square bishop in the Dragon defense, for example. Or the fact that white, by playing f5, allows black the opportunity to open up the f-file, if he so pleases, which extends the range of his own rook, which is favorable for black. And under each of these nodes lies more detailed, more fundamental information. Why is it bad for white if black opens up his rook file? Because the rook covers more squares, and can target pieces on white’s back rank. Under these nodes lies information even more fundamental. Why is it good for a rook to cover more squares? Because the piece has better control over the board. Perhaps we’ve descended to the level of the axiom, here. Of course, the expert doesn’t concern himself with any of this information. It’s modularized under the much-higher “pawn to f5 is bad for white if black played the Sicilian Dragon” node. The expert works largely with the f5 node alone, only quickly dipping into what’s underneath it if he needs to. And herein lies the power of working with a well-built skilled memory tree: the expert can apply his short-term memory towards larger, more abstract tasks, in effect automating the smaller, more rote tasks.

And nodes exist even higher than the f5 node. Above this, for example, might be the much-larger idea that the Sicilian Dragon is a battle for the black squares and for the long, a1-h8 diagonal. In fact, it’s probably this node–the highest node we can come up with–that the expert truly works with. This way, instead of having thousands of things on his mind, he has perhaps just one. He reduces the entire, complicated game of chess to a single feeling.

This phenomenon explains why Capablanca, when asked how many moves he thinks ahead, replied:

I see only one move ahead, but it is always the correct one.

It also explains why the violinist can pour his heart out into a solo piece, or how the athlete can act largely on instinct. The smaller tasks (like putting one’s fingers in the right place on the fingerboard, or grasping the football along the laces) are automated, living in nodes far below the one with which the violinist or athlete is working. Instead, an expert works with much larger ideas. “Mozart intended for this piece to feel dance-like, and I want to convey that to the audience,” the violinist might say. Or, “I’ll be able to hit my receiver just past the 70-yard line, assuming my guards protect me in the pocket.” Experts and novices likely process information equally fast, but experts are working with much larger ideas, at the tops of their masterfully built skilled memory trees.

Skilled memory allows the expert to overcome the paradox of expertise.

Skilled memory, then, is at the heart of what it means to be an expert. Experts aren’t just committing to memory lists of 50,000 units. They’re building massive hierarchical structures, where one travels not just forward and back, but, over, down, around, past, through, and, most importantly, up. The expert’s skilled memory tree isn’t a list; it’s an entire world, writhing with texture and structure, and the expert sits at the top of it. Experts aren’t list-learners; they’re world-builders. And the worlds they build are fascinating indeed.

[Figure: chess positions, panels a and b]

Only an expert can fully grasp the meaning of the position shown in panel a. To a novice, panels a and b are equally nonsensical.
In this video, grandmaster and two-time United States champion Patrick Wolff performs, and describes how he performs, the chess position memorization task.

In a way, the idea of world-building puts the mystique back into expertise. Sure: only through long, strenuous practice can an expert amass his knowledge. But this doesn’t mean that his practice is rote and mindless. Instead, he’s entering unknown terrain, searching for meaning, and using the meaning he finds to build a complex world that only he and fellow experts can access. I’ve got a bit of a chess world built up inside my own brain, and that’s something special. Meanwhile, I may be able to see the hours the grandmasters spend, but I certainly can’t see the worlds they build. Just because expertise is built with time doesn’t mean it’s not fascinating.

We can hope that, by practicing skills in world-building, we can more easily achieve expertise. In my graduate medical studies, my classmates and I are tasked with learning massive volumes of information. Surprisingly, though, I’ve heard very little talk about how best to assimilate, understand, and eventually commit to memory that information. And I have encountered quite striking instances of my classmates memorizing information using a list-learning approach, where that information could be much better and more easily understood with a world-building approach. I hope to focus more on world-building in the context of my studies, and to share my ideas with my classmates when possible, so that we might all more easily attain the expertise required to practice medicine.

And, apart from expertise, we can use world-building to make the task at hand more enjoyable. I, for one, would much rather head to the library to build a world than to study a list.

  1. The Role of Deliberate Practice in the Acquisition of Expert Performance by Ericsson et al.
  2. Skill in Chess by Chase and Simon
  3. Skilled Memory and Expertise: Mechanisms of Exceptional Performance by Ericsson and Staszewski

9 comments on “The World-Builders”

  1. Richard says:

    Fascinating metaphors, Josh. I think the image of world-building seems apt, though maybe for reasons less metaphorical than the ones you give. I’ve read some stuff on memorization techniques and I suppose we’ve all encountered the notion of a mind map – certainly a rather hierarchical way of patterning one’s knowledge. But I’ve also heard of visualization techniques for memorizing less structured/connected information than that of a chess game. For example, some people can memorize upon command the order of a randomly shuffled set of playing cards by associating each card with features of some imagined or real physical space familiar to them (e.g. one’s house or school). Ironically, I suppose it’s easier to see how that sort of memorization resembles ‘world-building’.

    The point I think you are most interested in pressing in this piece is that the glamour/mystique of chess-memory is not reduced simply because it is something built out of the repetitive (perhaps dull) task of repeated practice. To this I profess only reservation. You say: “Experts aren’t list-learners”. But I think all you mean is that experts aren’t linear learners. And yet, a list, presented linearly, may still encode hierarchically structured information. The question is: why is a list whose lines are themselves lists any less of a list for that?

    Eventually, the expert’s familiarity with the rote aspects of the game might be outsourced to ‘automatic’ cognitive processes, but the rote aspects are no less a part of the overall expert-process for that. Nor are they any less learned. Let me throw you a few metaphors of my own. The tree-structure diagrams are instructive, for if real trees grew upside-down, they would not stand without their uppermost branches. So too the hierarchical structures of a chess-memory would be ‘meaningless’ without their terminal nodes – established through the hum-drum specifics of rote and routine. Contrast photography. Photographic art is no less art just because it is a product of the rather boring dots of colour we call ‘pixels’. Why? Because the precise, non-artistic vocabulary of individual pixels is not at the same ‘level’ of description as artistic language – and so the artist is not held accountable for that vocabulary. However, with a game of chess, as you seem to have demonstrated, an expert may, in principle, be expected to account for the quality of his work all the way down to the pixelated specifics of the game. This rudimentary habit-like understanding is the foundation of his expertise. But the photographer need not know optics.

    Maybe there is no in-principle difference between artistic talent, like that found in photography (or perhaps even mathematics), and expertise of the sort we find in chess. But – at the moment at least – machines cannot out-perform us in art (or in advanced mathematics, for that matter).

    There’s a bit of conceptual engineering that could be useful here, in general. Epistemologists commonly distinguish knowledge-how from knowledge-that (procedural knowledge from propositional knowledge). If you could successfully argue that the lower level rote-learned stuff is outsourced to procedural knowledge, so that the chess expert may not always be fully expected to account for his moves (the way a football player is not expected to know the mechanics of his successful throw/catch), then perhaps you could put some of the mystique back into the game – even though hours of practice would still be essential for expertise. However, some philosophers have argued that the know-how/know-that distinction is misleading and that all knowledge is really propositional.

    (See this, in particular:

    http://www.thatmarcusfamily.org/philosophy/Course_Websites/Readings/Stanley%20and%20Williamson%20-%20Knowing%20How.pdf)

    The general spirit of what you’ve written is heartening though, and I agree to whatever extent possible. That a thing is simple or built out of an unglamorous process of repeating structures does not *have to* slight its wonder. Yet, in general, it does. Maybe there are ways of looking at the world where nothing is boring. But, as a mere human, I struggle to imagine them clearly. ‘Omne ignotum pro magnifico’ – most people interpret that as entailing that the relatively straightforward or understandable is somehow less wonderful. But, then, I suppose most people are sheep. “Baa”

    • Josh says:

      Richard, thank you very much for your thoughtful comment. It’s always nice to know that someone is reading these.

      I agree that my statement that “experts aren’t list-learners” is problematic. Rather, they are learning lists, but the structure of their memory is hierarchical, or, their memory is, at least, imbued with meaning beyond that which is apparent to a non-expert who views the same list. In other words, when I and an expert view the same list, I see a list, but the expert sees a world.

      Your comments on the “terminal nodes” are instructive, but, as I’ll point out, perhaps beside the point. In chess, I think it is true that the expert is accountable for, and knowledgeable of, the terminal nodes. It’s also true, though, that the artist need not know optics. To employ another example, the expert in medicine need not know physics. I study concepts in medicine, which are grounded in biology, which in turn are grounded in chemistry, which are in turn grounded in physics. Still, though, I don’t concern myself much with physics. I certainly am not knowledgeable of quarks and gluons, although I do have some basic knowledge in protons and electrons. Nevertheless, my terminal nodes tend to lie somewhere in the realm of biochemistry. And, this does just fine for me. So, in this case, just like in photography, the tree “stands” without its terminal nodes.

      And this does tell us a lot. The important factor, I think, is not whether the chess expert does or does not grasp concepts low in the tree. It’s that the chess expert, much like the artist or the skilled physician, does grasp concepts high in the tree. It’s these concepts that are truly valuable, and interesting, and that should be targets of envy for the non-expert.

      I actually agree that the knowing that/knowing how distinction is probably without substance. Actually, a few years ago I was convinced of this fact by this article, which was written by the same Jason Stanley, along with a neurologist. I’ll read Stanley and Williamson’s paper as soon as I get the chance.

      But even if we abandon the distinction between that and how, we should still revere the expert, not because he does or does not understand rote ideas (which may be matters of that, how, neither or both), but because he does understand lofty ideas. For the same reason, we should find interest in those fields in which we ourselves can claim some degree of expertise. If I myself become bored in a given field of study, I try to look away from the low nodes, and towards the high ones. This always helps spark some degree of interest.

  2. Ben says:

    I’ll try to describe a potential objection to Ericsson’s paper that we’ve discussed over the phone. I’m no statistician here, and I’m not firm on the math; neither am I an expert in expertise. Nonetheless, here goes. Let’s first assume, as I think is safe, that expertise follows a “power law” distribution, in the sense that given any pool of contenders (chess players, violinists, athletes, etc.) the graph which associates to the ordinal ranking of each contender that contender’s skill somehow measured is exponentially decreasing and self-similar at all scales. This means that expertise is concentrated in a “multiplicative” way among exponentially smaller subsets of the population. 10% are very good, 1% are extremely good, 0.1% are experts, etc.; also, the respective restrictions of the original graph to each of these subsets look the same, after rescaling.

    It follows from this that given any partitioning of the population into groups (recall Ericsson partitioned violinists into “good” and “best”), each of these groups will, within itself, be dominated proportionally by members who are (relatively) less-expert. When we then take the group’s average, the result will reflect the characteristics mainly of the group’s comparative non-experts.

    This could make Ericsson’s averages misleading. Among the ten “best” violinists, perhaps one is a truly fine violinist. The average among the group’s practice times, though, will proportionally reflect rather those of the nine remaining violinists who, though very good, must, like the rest of us, practice.

    To quote Taleb’s The Black Swan:

    In Extremistan, inequalities are such that one single observation can disproportionately impact the aggregate, or the total.

    The result is that Ericsson’s study is really an extreme-observation-smoothing artifice. Take the average, a linear function, over a set of values which varies exponentially. Then see what you get. Taleb’s book warned me against such things years ago.

    It would seem that natural disposition to expertise follows a power law distribution, and once you partition a group, practice times “follow” in such a way as to give the false impression that increased practice times are causing the differences in skill.

    Witness Wei Yi (the 16-year-old Chinese grandmaster)’s recent stunning victory over Lázaro Bruzón Batista in the Danzhou Super-GM tournament. Wei Yi plays a game which led Josh and me to characterize him as a “cyborg”. Wei Yi became a grandmaster at 13. At 8, he drew grandmaster Zhou Jianchao.

    Has Wei Yi, who will probably become a world champion, spent many hours practicing? Probably.

    • Josh says:

      I’ve done an interesting experiment to try to figure out what’s really going on here in this nature/nurture dilemma.

      I created my own “music school”, with 100 students. These students are named by i = 1, 2, …100. To each of these students, I assigned an innate ability, given by innate ability = exp(i/10).

      I then assigned each student a random weekly practice time, between 1 and 10 hours per week.

      I then calculated each student’s manifest ability, which I found with manifest ability = innate ability * weekly practice time.

      I then plotted the role of both practice time and innate ability, respectively, on manifest skill. We can see that both play a role.

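      [Figure: manifest skill plotted against practice time and against innate ability, for innate ability = exp(i/10)]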

      I can reproduce Ericsson’s finding, too. Among the 10 students with greatest manifest skill, average weekly practice time is 8.10 hours, compared to 5.69 hours among the bottom 90. Innate ability matters as well. Among the top 10 students, mean innate ability is ~37,900, compared to 2,800 among the bottom 90.
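      A minimal sketch of the experiment described above, for anyone who wants to rerun it. It assumes practice times are whole hours drawn uniformly from 1 to 10 and uses a fixed seed; the names simulate_school and split_top are illustrative, not the original code, and the printed averages will vary with the seed.

```python
import math
import random
from statistics import mean


def simulate_school(n_students=100, seed=0):
    """Return one record per student: (i, innate, practice, manifest)."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    students = []
    for i in range(1, n_students + 1):
        innate = math.exp(i / 10)      # innate ability = exp(i/10)
        practice = rng.randint(1, 10)  # assumed: whole hours, uniform on 1..10
        manifest = innate * practice   # manifest ability = innate * practice
        students.append((i, innate, practice, manifest))
    return students


def split_top(students, k=10):
    """Sort by manifest ability and split into the top k and the rest."""
    ranked = sorted(students, key=lambda s: s[3], reverse=True)
    return ranked[:k], ranked[k:]


students = simulate_school()
top, rest = split_top(students)
print("mean weekly practice, top 10:   ", mean(s[2] for s in top))
print("mean weekly practice, bottom 90:", mean(s[2] for s in rest))
print("mean innate ability, top 10:    ", mean(s[1] for s in top))
print("mean innate ability, bottom 90: ", mean(s[1] for s in rest))
```

      Swapping the line that sets innate for another allocation scheme, such as i ** 2 or i, is enough to compare schemes.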

      An important note is that my choice of innate ability = exp(i/10) was a bit arbitrary.

      I assumed, as Ben did, that innate ability might be allotted according to an exponential function. However, innate ability = exp(i) doesn’t work. The function grows so quickly that each student’s innate ability is much greater than the last’s. So, no amount of practice was enough to produce a change in the ranks. Manifest skill correlated nearly 1:1 with innate ability. innate ability = exp(i/10) works better.

      Still, there were plenty of other functions I could have used. In fact, I tried power functions, such as innate ability = i^2, and even the linear function innate ability = i. And the behavior of these functions isn’t too different from exp(i/10). Both practice and innate ability play a role in the development of expertise. Ericsson’s finding is reproduced in each case.

      innate ability = i, for example, is shown below.

      [Figure: the same plots for innate ability = i]
      Ericsson’s phenomenon manifests: average practice is 5.61 vs 8.80 in the bottom 90 vs. top 10.

      It appears, then, that Ericsson’s result tells us very little. His result shows that those who are better tend to practice more. But this is the case under virtually every innate ability allocation scheme. Only in a case like innate ability = exp(i) does practice have no effect. So Ericsson can help us rule that out. But he can’t help us choose between:

      innate ability = exp(i/10)
      innate ability = i^2
      innate ability = i
      innate ability = 1

      Or many conceivable others. We already knew that practice helps. What we don’t know is the role that other factors, including innate ones, play. And Ericsson doesn’t help us answer that. He doesn’t even address this question. Instead, he ignores all factors besides practice. And the result is that his readers do too.

      Ericsson’s readers may thus come away believing that practice is the only factor. This may or may not be the case, but I think that the manner by which innate ability, if it exists, is allocated, merits a discussion, which Ericsson doesn’t provide.

      I’ll initiate such a discussion myself. The key difference between exponential allocation schemes, like exp(i/10), and others, is that, in the former case, true expertise is relatively much more rare. In the former case, only 7% have normalized manifest skill greater than 0.5, as compared to 23% in innate ability = i.

      I think the former case seems to be more correct. In my impression, there are always just a very few people (Bach, Gauss, Fischer, etc.) whose skill is astronomically-greater than that of most of their peers. I think this approaches Ben’s point. Difficulty in making this assessment, though, comes from the fact that it’s difficult to absolutely quantify expertise.

      We can see from this table that, for example, only 6% of the population of chess players are Class A or better. However, there’s no way to say that a Class A player is “twice as good” as a Class F player. His rating is twice as high, but this certainly doesn’t mean that the Class A player is twice as likely to win. Based on the way the chess rating system is built (assuming the standard Elo expected-score curve), an 1800 is expected to score about 64% against a 1700, and a gap of about 200 points is enough for three points out of four. Nor does it mean that his grasp of chess is twice as strong, since, once again, this can’t be measured.
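      To make the rating arithmetic concrete, here is a small sketch of the expected-score calculation. It assumes the standard Elo logistic curve with a 400-point scale; a federation’s actual system may differ in details, and expected_score is an illustrative name. Expected score counts a draw as half a point, so it is not quite the same thing as a win probability.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


print(round(expected_score(1800, 1700), 2))  # 100-point gap -> 0.64
print(round(expected_score(1800, 1600), 2))  # 200-point gap -> 0.76
print(round(expected_score(1800, 900), 3))   # rating twice as high -> 0.994
```

      Under this curve the expected score depends only on the difference between two ratings, never on their ratio, which is why “twice the rating” has no direct meaning.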

    • Ben says:

      I’ll add an additional reply to this comment, though, to recall that this discussion was not the primary concern of your essay. Your article really explores expertise’s nature, or inner characteristics, and not its determinants or preconditions. I think the former is more interesting, in any case. Whether or not I’m cut out for expertise in some given field, and whether or not such a thing is possible, I’d like to see what goes on in there. How else will I have any hope of getting inside?

    • Josh says:

      As for your theory that Ericsson’s method smooths out differences: this may be supported by my data.

      In histograms below, I’ve plotted the frequency distributions of innate ability and practice time only of those 10 students with the highest manifest skill. Innate ability was allotted by innate ability = exp(i/10).

      [Figure: histograms of innate ability and practice time among the top 10 students]

      As you can see, the distribution of practice is negatively-skewed, but the distribution of innate ability is positively-skewed. This means that the top 10 students are predominated by students who have less innate ability, but greater practice time. Meanwhile, one or two rare students might have had greater innate ability but less practice time; these students, though, don’t contribute as heavily to the average.

      One simple explanation for this is just that the distribution of innate ability, itself, is positively-skewed. So, the distribution of innate ability after sorting by manifest skill will tend to be skewed in the same direction, as long as the ranking of students by manifest skill remains similar to the ranking of students by innate ability.

      In other words, the more unlikely it is for a lesser student to surpass a peer with greater innate ability, the more the innate ability distribution after sorting will resemble the innate ability distribution before sorting.

      In cases like the one above, where students are unlikely to surpass their peers, it might be said that “passes” are rare.

      Consider the scheme innate ability = exp(i).
      Now, consider two consecutively ranked students p and q: identifying each student with his i value, q − p = 1.
      So, their innate abilities are exp(p) and exp(q), respectively.
      So, q is e times better than p.
      Consider the set of practice times that might be allotted to both p and q. Each student might be given one of 10 practice times, so the probability space corresponding to the number of practice times they might both receive consists of 100 pairs of practice times.
      How many of those 100 pairs will produce a pass? Well, as we mentioned, q is e times better than p. So, in order for p to pass q, p would have to practice e times more than q.
      So, of 100 pairs, only a few have the former student practicing e or more times the latter. We have:
      (3,1)(4,1)(5,1)(6,1)(7,1)(8,1)(9,1)(10,1)
      (6,2)(7,2)(8,2)(9,2)(10,2)
      (9,3)(10,3)
      To make 15 in all. So, if innate ability = exp(i), the chance of a pass is 15%. Passes are relatively rare. The result of this is that, when students are sorted by manifest skill, reordering is minimal. So, the distribution of innate ability, when students are sorted by manifest skill, resembles the distribution of innate ability, when students are sorted by innate ability. Both are positively-skewed.
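      The count above can be checked by brute force. This is a sketch under the same assumptions as the argument (each student receives a whole number of hours from 1 to 10, and a pass requires p to practice at least the ability ratio times as much as q); passing_pairs is an illustrative name.

```python
import math


def passing_pairs(ratio, max_hours=10):
    """Pairs (a, b) of practice times where the weaker student p, practicing
    a hours, matches or overtakes the stronger student q, practicing b hours."""
    return [
        (a, b)
        for b in range(1, max_hours + 1)
        for a in range(1, max_hours + 1)
        if a >= ratio * b
    ]


pairs = passing_pairs(math.e)  # innate ability = exp(i): q is e times better
print(pairs)
print(len(pairs), "of 100 pairs produce a pass")  # 15 of 100

# For comparison, innate ability = exp(i/10) makes q only exp(0.1) times better.
print(len(passing_pairs(math.exp(0.1))))  # 45
```

      So between consecutively ranked students, passes go from 15% of pairs under exp(i) to 45% under exp(i/10).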

      In sum: your theory was that true experts are relatively-rare, and artificially bring up the skill level of the top 10 as a whole. I clarify that this is true only so long as reordering of the rankings between innate and manifest skill is rare.

      As the innate ability allocation scheme becomes more egalitarian, practice plays a greater role. It starts getting advanced here, but, as the chances of a pass increase: the practice distribution among the top 10 becomes more negatively skewed (this is obvious) but the innate ability distribution becomes more negatively-skewed as well. Imagine that innate ability = exp(i/100). Passes would be very common, and so the top 10 would be exclusively dominated by those who practiced the most. Further, though, the top 10 would also be stacked towards those with greater innate ability, as opposed to less innate ability. If differences in innate ability are so small, then every last bit of innate ability is useful to achieve an advantage. It would be less likely for someone to skate by with sky-high innate ability and relatively-low practice.

  3. Richard says:

    Nice work lads. Alas, the qualitative features manifest in the making of expert judgments might be beyond our skill to model formally. To the truly gifted, I suspect their great intellectual or artistic gifts are equally mysterious. But, to rephrase my point a little: although, as you say, an artist (or thinker of some stripe) may (having mastered the ‘higher branches’) produce something marvelous, which leads him to wonder at his own ability, it still seems to me that a chess expert is in a slightly different position. He can, in principle, retro-engineer the thought processes behind his expert judgments, accounting for his brilliance in grasping complex chess patterns as a function of mastering basic patterns all the way down.

    There may, of course, be some qualitative, first-personal ‘gap’ between the feelings of making higher level judgments (which earn him our praise) and lower level judgments associated with the underlying accumulated basics (which reduce his art to mere mechanism); but he can bridge this gap. He can, in theory ‘explain it all away’. Must Gauss or Bach say that through mere (though admittedly talent-driven) mastery of basics, they have achieved their wonders? In a chess scenario, isn’t there a decidable best move (/set of moves) to play? And is this the case in mathematics and music? Does it even *look* like it may be so? (Are the two even comparable? If not what does this mean for our understanding of expertise?) If not, then I think chess expertise is something substantially different to the ‘expertise’ of geniuses like Gauss etc. In fact, the word ‘expert’ seems ill-befitting the dignity of Bach and Gauss. Perhaps I’m suffering some biasing blind spot here.

    • Ben says:

      I like the distinction you’ve made — that is, between chess and music/math — and I think the criterion on which it’s based is instructive. I don’t think the distinction pans out very well, though. What I’ll offer here is not a clean theory like you’ve given, but rather some observations that push “each side towards the middle”.

      I would avoid letting chess’s apparent rigidity seduce you into demystifying chess thought. I would suspect, and a few personal experiences (with my dad, say) would seem to confirm, that much of chess “expertise” is just as mysterious to the player as that of music or math. Haven’t you heard someone say “I found this move as if by magic”, or, “I’ve never seen a position like this before”? (Recall Honinbo Jowa’s “ghost moves” in the famous Blood-Vomiting Game of, alas, not chess but Go.)

      On the other hand, even in the case of math, I can push a bit in the opposite direction. I’ll put it mildly by saying that I’m no Gauss. But I’ve surely had a few breakthroughs on deep exercises that have been really thrilling. Nonetheless, even in these, I can in some sense “trace things all the way down” (almost more than I imagine that a chess player sometimes could), in the sense that I got there because I was looking in the right way. Mathematical solutions rarely appear from nowhere. Often, the way to do well in math is to acutely understand the deep structural features of a mathematical environment, and then start creating solutions progressively, starting from the largest-scale movements of this system and moving to narrower technical details. When I find solutions, it’s mainly because I understand the deep features well enough to start looking in the right place. Once you’re in that position, nothing is magic. As A. Gorinov said about Mikhail Khovanov’s Khovanov Homology, “he got there because he was looking for something”.

      Of course, one could argue that it’s in the preliminary understanding of deep structural characteristics that math genius’s “mystery” lies. Perhaps. One could also argue that miraculous solutions can be found which do seem to spring from nowhere, and not as predictable consequences of deeper understanding. Gauss’s most elementary and widely familiar proof of quadratic reciprocity (one of the eight he produced, six published in his lifetime and two found in his posthumous papers) is perhaps an example of this.

      As for music genius, man. I have no way of understanding how that could come from anywhere other than magic. I’ve never learned music theory, but I have heard that it identifies particular patterns of chord progressions that are the most pleasing. Maybe there’s some way to systematically codify such knowledge “from the bottom up”? I have no idea.

      • Josh says:

        I’ll add a few more words.

        Perhaps what I’m describing is really just a feeling that comes to an expert when he observes content within his domain. When I observe the figure labelled “A Grandmaster’s Memory”, I’m, within seconds, struck by ideas, patterns, feelings, and memories, as my eyes dart over the position. A non-chess player wouldn’t experience these things. Maybe my claims are that simple. It doesn’t matter why or how these feelings come to me. It matters that they come to me. The manner by which content within an expert’s domain is produced shouldn’t matter.

        When computer chess is mentioned to a layperson, that person will probably think of Kasparov vs. Deep Blue in 1997. Deep Blue was the first computer to beat a reigning world champion in a match; Kasparov’s loss rocked the chess world.

        But chess experts seem to think more highly of the games played between Kramnik and Deep Fritz in 2006. Deep Fritz won the six-game match 4-2. My chess teacher said that Deep Fritz’s winning performance in game 6 was “eerily human in its beauty”.

        Of course, the computer worked only by computation, and not by intuition. The best description of a computer’s ability, if we were to anthropomorphize, would probably be to say that the computer plays only at the terminal node. That doesn’t mean that an expert can’t view the results of the computer’s play in awe. (I, on the other hand, couldn’t make any sense out of the game.)

        So whether or not an expert sees all the way to the bottom shouldn’t matter. Either way, it can’t be disputed that an expert sees things that a novice doesn’t. And I maintain that this is the extent of my claim, even if my claim is now weakened.
