Language Machines


A Russian grammatical table.

Each language supports an array of grammatical machinery that functions alongside its words and phrases. The purpose of this machinery is elusive: while it does indeed convey information about the speaker, the subject, the tense, etc., this information is often redundant; English, to name an example, gets along largely without it. Alternatively, though, perhaps this machinery tells us about the language’s speakers: their priorities, their culture, and their attention to detail. Languages, after all, are organic, and these structures came from somewhere.

The machinery’s purpose and origin, alas, are not my areas of expertise. Instead, I discuss the grammatical organization of a series of languages.

Italian (1)

Italian features a small, well-oiled, clean grammatical machine. It has few exceptions and errors; it’s efficient, avoids bulkiness, and rolls off the tongue with a little training and little effort. More broadly, it effectively organizes verb conjugations, tenses, and noun gender in a very “compact” way.

Each verb conjugates according to its subject. English verbs, perfunctorily, also conjugate – recall I have, you have, he has – but the conjugation only changes in the “he/she/it” case, also known as third person singular. Italian verbs, meanwhile, feature different endings for all six scenarios: first (I), second (you), and third (he/she/it) persons, in either singular (I, you, he/she/it) or plural (we, you, they) number. The subject, then, can be identified simply by the ending of the verb; as a consequence, Italians often omit subject pronouns before verbs (“he walks” would become “walks”, with no information lost).
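To make the point about dropped pronouns concrete, here is a minimal Python sketch. It assumes a regular -are verb (camminare, “to walk”) in the present tense; the function names and table layout are illustrative choices of mine, and irregular verbs and the -ere/-ire classes are not handled.

```python
# Present-tense endings for regular -are verbs, keyed by (person, number).
PRESENT_ARE = {
    (1, "singular"): "o",
    (2, "singular"): "i",
    (3, "singular"): "a",
    (1, "plural"): "iamo",
    (2, "plural"): "ate",
    (3, "plural"): "ano",
}


def conjugate(infinitive, person, number):
    """Conjugate a regular -are verb in the present tense."""
    if not infinitive.endswith("are"):
        raise ValueError("this sketch only handles regular -are verbs")
    stem = infinitive[:-3]
    return stem + PRESENT_ARE[(person, number)]


def identify_subject(form, infinitive):
    """Recover (person, number) from a conjugated form by its ending."""
    stem = infinitive[:-3]
    for subject, ending in PRESENT_ARE.items():
        if form == stem + ending:
            return subject
    return None


print(conjugate("camminare", 3, "singular"))     # cammina
print(identify_subject("cammina", "camminare"))  # (3, 'singular')
```

Because the six endings are all distinct, the lookup in `identify_subject` never has to guess, which is the sense in which the pronoun carries no extra information.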

Italian nouns each have a gender. A door (porta) is feminine, a tree (albero) is masculine, a table (tavola) is feminine. These genders are denoted by the vowel ending the noun – as you might notice, feminine nouns end in “a” and masculine ones end in “o” (there are exceptions, notably the many nouns ending in “e”). Adjectives which describe nouns, then, must “agree” in gender: they must use the same ending as the noun does. (Red door “porta rossa”, tall tree “albero alto”, beautiful table “bella tavola”.)

Verb tenses, also, are handled very compactly. Different verb tenses such as imperfect (I was cleaning), conditional (I would clean), imperative (clean!) and future (I will clean) each operate by carrying their own set of six conjugational endings. These “modified endings” convey the typical conjugation (person, number) while also, because of their modification, conveying tense. These tensed endings are typically achieved by applying a systematic modification (such as an extra syllable or an accent) to the usual six endings. Other tenses such as past (I have cleaned, I cleaned) are achieved using an auxiliary verb (e.g. to have), which is also conjugated.
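The “one set of six endings per tense” idea can be sketched as a small lookup table. This is an illustration under assumptions: it covers only regular -are verbs, uses parlare (“to speak”) as an example verb of my choosing, and the names `ENDINGS`, `conjugate`, and `parse` are hypothetical.

```python
# Endings for regular -are verbs, in the order: io, tu, lui/lei, noi, voi, loro.
SUBJECTS = ["io", "tu", "lui/lei", "noi", "voi", "loro"]
ENDINGS = {
    "present":   ["o", "i", "a", "iamo", "ate", "ano"],
    "imperfect": ["avo", "avi", "ava", "avamo", "avate", "avano"],
    "future":    ["erò", "erai", "erà", "eremo", "erete", "eranno"],
}


def conjugate(infinitive, tense):
    """Return all six forms of a regular -are verb in the given tense."""
    stem = infinitive[:-3]
    return {subj: stem + end for subj, end in zip(SUBJECTS, ENDINGS[tense])}


def parse(form, infinitive):
    """Recover (tense, subject) from a conjugated form, or None if unknown."""
    stem = infinitive[:-3]
    for tense, endings in ENDINGS.items():
        for subj, end in zip(SUBJECTS, endings):
            if form == stem + end:
                return tense, subj
    return None


print(conjugate("parlare", "imperfect")["noi"])  # parlavamo
print(parse("parlerà", "parlare"))               # ('future', 'lui/lei')
```

A single ending thus encodes person, number, and tense at once; tenses built with an auxiliary verb are outside the scope of this sketch.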

Italian, then, efficiently conveys subject, tense, and noun gender through systematic use of sets of verb and noun endings.

Russian (2)

Russian is a vast, bulky machine, with many, many small moving parts which seem (in my stage of learning, at least) to be quite difficult to keep track of. The language, in any case, conveys a wealth of information through its grammatical machinery, including, in addition to the basic conjugation and gender, noun cases, verb prefixes indicating circumstance, and, in general, a much finer degree of subdivision.

Russian, like Italian, features six conjugational endings for each verb, with modified ending packages to convey tenses. It also features noun gender, including the typical masculine and feminine, as well as a third neuter. Russian, however, very quickly ramps up the complexity.

Russian also features six noun cases, which, through noun endings, convey the relation of that noun to the rest of the sentence. The six noun cases loosely partition the various functions a noun may serve in a sentence. Nominative is used on the subject of a sentence: in “I walk the dog”, I is in nominative. Genitive is (mainly) used to describe ownership or possession: in “the family’s car is green” (think “the car of the family is green”), car is in nominative and family is in genitive. Accusative refers to a sentence’s direct object: in “I love you”, I is in nominative and you is in accusative. Dative is used with indirect objects: in “I give the gift to her”, gift is in accusative (direct object), while her is in dative (indirect object). Instrumental is used, most commonly, with prepositions like with and using: in “I went to the movies with my brother”, my brother is in the instrumental case. Finally, prepositional is usually used in prepositional relations: in “I am in Moscow”, Moscow is in the prepositional case. Because of cases, many prepositions (of, to, with, etc.) may be omitted in Russian.
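As a rough illustration of how a case is marked by the noun’s ending, here is a Python sketch for a single declension pattern: singular feminine nouns ending in a hard consonant plus “а”, using машина (mashina, “car”) as the example. The table and the `decline` function are my own illustrative names; spelling rules (for instance after к, г, х), other declension classes, and plurals are deliberately left out.

```python
# Singular case endings for feminine nouns in hard consonant + "а"
# (one declension pattern only; an assumption of this sketch).
FEMININE_A = {
    "nominative": "а",
    "genitive": "ы",
    "dative": "е",
    "accusative": "у",
    "instrumental": "ой",
    "prepositional": "е",
}


def decline(noun, case):
    """Swap the nominative ending "а" for the ending of the given case."""
    if not noun.endswith("а"):
        raise ValueError("this sketch only handles nouns ending in 'а'")
    return noun[:-1] + FEMININE_A[case]


for case in FEMININE_A:
    print(case, decline("машина", case))
```

Note that dative and prepositional share an ending in this pattern, so the six cases do not always yield six distinct forms.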

I remind the reader that the ending of a noun must convey both its gender and its case. You might imagine that with three genders and six cases, we’ve already accrued a large possible array of noun endings. Furthermore, an adjective modifying a noun must agree with that noun in both gender and case. Finally, and staggeringly, the set of adjective endings is different from the corresponding set of noun endings, even amongst pairs which agree! This makes proper case use an imposing challenge. I also note that personal pronouns (I, you, her), possessive pronouns (my, his, their) and even interrogative pronouns (who, what) each, themselves, have a full set of cased versions.

Russian also features finer subdivision than that seen in English. The English verb “to study” might be translated into three Russian verbs: учиться (uchit’sya), to study at a university, as in “I study at UNC”; изучать (izuchat’), to study a subject, as in “I study math”; and заниматься (zanimat’sya), to do schoolwork, as in “I study in the library”. The English word “to go” has a ridiculous number of possible translations, including идти (idti) to walk with destination (I walk to the store), ехать (yekhat’) to drive with destination, ходить (khodit’) to walk without destination (I walk in the park), ездить (yezdit’) to drive without destination, and so on. Finally, each of these verbs of motion can be changed to their “coming” equivalents by adding the prefix при (pri), or to “departure” equivalents by adding the prefix по (po) or у (u). Prefixes exist for “coming near to something”, “motion from the inside to the outside of something”, “motion around something”, and even “motion to the inner part of something for a short while, or movement round the corner of an object or building”. (3)

Russian, at my stage of learning, seems quite a daunting challenge, and I marvel at native speakers (and at the human brain, which makes their fluency seem effortless). The Russian grammatical machinery, like a vastly complex work of engineering, runs through every strand of the language, organizing it into fluid perfection.

Chinese (Mandarin) (4)

Chinese is a language composed of characters, “building blocks” of language which associate a symbol, a one-syllable pronunciation, and a meaning into one package. There is no alphabet, nor are there words in the traditional sense, only collections of characters. Many characters mean a full, well-defined word, such as 狗, gǒu, dog. Some characters mean “part of a word” or a more abstract idea: 么 (me), on its own, is translated by Google to the vague “suffix of interrogative and relational pronouns”; much more commonly, however, it’s paired with 什 (shén) (which, coincidentally, also doesn’t have much meaning on its own) to form 什么 (shén me), the ubiquitous question word what. Other characters, finally (see below), mean nothing like a word in the English sense.

While this is a purely phonetic observation, it has strong grammatical consequences. Namely, Chinese cannot feature noun and verb endings like other languages. There’s no way to add an ending to a character. Characters are “fixed”, so to speak, and without letters in the phonetic sense, they may not be modified within sentences. Chinese, then, achieves its desired grammatical constructions by using an interesting array of auxiliary words, or “particles”, which, lacking any feasible translation into English, serve to effect grammatical constructions.

The particle 吗 (ma) is placed at the end of a sentence to indicate a question; the particle 吧 (ba) is placed at the end of a sentence to indicate a suggestion (think “let’s go”). The word 的 (de), when following a pronoun, indicates possession: 我的狗 (wǒ de gǒu) means roughly I + 的 + dog, or “my dog”. The particle 得 (de), pronounced the same way, follows a verb and precedes an adverb of degree or potential, which then applies to the verb. The complexity continues to rise.
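Since particles are added next to unchanged characters rather than fused onto them, the constructions above amount to simple concatenation. A minimal Python sketch, with hypothetical function names and with 你好 (“hello”, literally “you good”) as an example phrase of my choosing:

```python
def make_question(statement):
    """Turn a statement into a yes/no question by appending the particle 吗."""
    return statement + "吗"


def make_suggestion(statement):
    """Soften a statement into a suggestion by appending the particle 吧."""
    return statement + "吧"


def possessive(owner, thing):
    """Mark possession by placing 的 between owner and thing."""
    return owner + "的" + thing


print(possessive("我", "狗"))    # 我的狗  ("my dog")
print(make_question("你好"))     # 你好吗  ("how are you?")
```

None of the original characters change form; all of the grammatical work is done by the added particle.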

In English, we might say “a pound of bananas” or “a bag of rice” or “a case of books”. Chinese generalizes the role played by these English words – pound, bag, case – into the phenomenon of “measure words”. Any description of quantity in Chinese must use a measure word: the phrases “three people”, “two fish”, and “five minutes” all use different measure words (which, in these cases, are absent in English). Measure words tend each to apply to a systematic category of nouns.
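The numeral + measure word + noun pattern can be sketched in a few lines of Python. The small dictionaries below are illustrative assumptions covering only three nouns (“book” is my own addition), and `count_phrase` is a hypothetical helper, not a general solution.

```python
# Noun -> measure word (a tiny, hand-picked sample).
MEASURE_WORDS = {
    "人": "个",  # person -> general-purpose measure word
    "鱼": "条",  # fish   -> measure word for long, flexible things
    "书": "本",  # book   -> measure word for bound volumes
}

# Numerals as used before a measure word (note 两, not 二, for "two").
NUMERALS = {1: "一", 2: "两", 3: "三", 4: "四", 5: "五"}


def count_phrase(n, noun):
    """Build numeral + measure word + noun, e.g. 'three people'."""
    return NUMERALS[n] + MEASURE_WORDS[noun] + noun


print(count_phrase(3, "人"))  # 三个人
print(count_phrase(2, "鱼"))  # 两条鱼
print(count_phrase(1, "书"))  # 一本书
```

The lookup by noun mirrors the way each measure word attaches to a category of nouns rather than to a quantity.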

Why do different languages feature different grammars? Do they develop by pure chance, or do they say something about the people behind the language? Why do languages use grammar at all, when it seems only to increase complexity? If languages don’t tend automatically to simplicity, which factors and characteristics do shape a language’s evolution? These are questions for which, unfortunately, I don’t have ready answers. (Comments appreciated!) I do know, for one, that vowels often feature maximally distinct spectra, a sign of linguistic evolution. Is grammar too shaped by evolutionary factors?

  1. Spoken Italian
  2. Spoken Russian
  3. Russian Verbs of motion with prefixes
  4. Spoken Chinese

3 comments on “Language Machines”

  1. Josh says:

    I’m not sure why grammar exists, but here’s an interesting point: grammar necessarily arises in new speakers of a grammarless language.

    When two or more groups meet who do not have a language in common, the resulting language is called a pidgin. Pidgin languages are a rudimentary, ad-hoc means of communication; they contain words from all the languages involved, but the grammar of none. For example, in the 1830s, immigrants flooded Hawaii to service its booming sugarcane industry. Workers from Portugal, China, Japan, the Philippines, Korea, Russia, Spain, and elsewhere all needed a means of communication; the result was a pidgin that borrowed words from each language involved.

    However, when the migrant workers started to have kids, the kids added their own grammar! Children are literally incapable of learning a grammarless language, and so they invent their own: the result is a creole, which contains the shared words from the pidgin, but also an entirely-new grammar system. Hawaiian Pidgin evolved into Hawaiian Creole, which is now one of Hawaii’s most-spoken languages.

    The emergence of grammar among native speakers of pidgins suggests strongly that, for some reason or another, grammar is hardwired into the brain.

  2. Richard says:

    “Italians often omit subject pronouns before verbs (“he walks” would become “walks”, with no information lost).” Something quite similar is true of Gaelic languages. Same goes for the use of different suffixes for different tenses. Nouns are also gendered, like in Italian and French, and there are some cases, though not as many as Russian, I think. I studied Latin in the past, and doing so certainly is a good way to come to terms with traditional grammatical classifications. Latin has seven noun cases: all the ones you mentioned for Russian (though prepositional case is usually called “ablative” case), plus a vocative case (addressing someone or something using the noun or pronoun) and a locative case (though this was redundant later, due to the ablative and dative cases taking over its work). Finnish, though not gendered, has FIFTEEN cases, as far as I remember.

    I think the comments about why certain grammars are developed rather than others are interesting and, doubtless, grammatical difference is a product of historical environmental influences. However, I also suspect there is a large dose of chance involved too. Josh’s comment in the reply that “the emergence of grammar among native speakers of pidgins suggests strongly that, for some reason or another, grammar is hardwired into the brain” should be alerting us to the need to specify that, when Ben is talking about ‘grammar’ what he seems to mean is what’s sometimes called ‘surface structure’ grammar, rather than ‘deep structure’ grammar. The latter, which is part of a scientific hypothesis found in the work of Noam Chomsky and those who work in his paradigm, is supposed to be definitive of Universal Grammar (or “UG”). UG is hypothesized to be the same for all humans, due to an innate language faculty which is, in some sense, ‘hardwired’. And a big part of theoretical linguistics concerns the construction of algorithms, or rule sets, specifying transformational procedures which can take you from a syntactic hierarchy/arrangement of universal grammatical categories (subject, verb, object, etc.) to the specific syntactical output of the individual language. A common, and I think fair, assumption is that these procedures must be recursive, allowing you to generate infinitely long and complex sentences from basic units. One idea is that despite superficial differences due to divergent histories, the underlying recursive grammatical mechanisms are the same.

    Ben asks: “Why do languages use grammar at all, when it seems only to increase complexity?” On a minimal conception of grammar, it’s just some structure defined on a set of elements, the linguistic items, be they words, morphemes, characters, or whatever. A language is then, in some sense, just an algebra, a structure in the technical sense. This minimal conception would apply both to deep and surface structure. So imagining a language without a grammar component is, by definition, incoherent; it just wouldn’t be a language. The real question is twofold: (i) why do languages use surface grammars at all and (ii) why are there many varieties of these surface grammars. Answering part (i) is interesting. Chomsky argues that it is because the essential function of language is thought and not communication that surface structure arises. The structures present in the surface grammar of a particular natural language, like French or Italian, are linear in a way that the deep structures are not thought to be. The deep structures are thought to be hierarchical and branching.

    Chomsky (in his most recent discussions) believes something like this: we developed the capacity for language in such a way that it became more convenient for our brain to match the eventual external linguistic output (Italian, French etc.) with our capacity for linear representation than to attempt some sort of instantaneous representation of a hierarchical branching structure (imagine if you could communicate your thoughts in the form of a structured ‘picture’ whose entire content would be grasped instantaneously following successful communication!). That we are incapable of speaking or communicating using the hierarchical deep grammar is a matter of the convenience of linear representation over non-linear instantaneous representation. When the hierarchy of deep structure is externalized, our brain imposes different structure. Then, addressing part (ii) is just a matter of realizing that this externalization and imposing of surface structure opens the way to incidental environmental influences. Part (ii), then, is an object of study in socio-linguistics or anthropology.

    Relatedly, there has been no successful identification of a ‘language gene’. They thought they had one for a while (called it the FOXP2 gene) but it turned out to be an overestimate. Language may have evolved as the happy result of the interaction of genes which were selected for other properties (maybe for the exact combination of phenotypes they generate) rather than because of selections of some specific genotypes. It’s an interesting area of research!

    • Ben says:

      The functionality handled in Latin by the ablative case is actually, in Russian, split across the instrumental and prepositional cases, though it resides primarily within the instrumental case. Prepositional, in Russian, seems concerned only with “in” and “on”, as well as, interestingly, “about”. (Certain variants suggesting movement, such as “into” or “to”, are, like in Latin, spoken and written identically but use the accusative case, while others remain in prepositional.) Instrumental, in Russian, is used with “with” as well as some of the more abstract notions typical of the Latin ablative case (“by means of”, “using”, etc.). Other functions handled by the Latin ablative, notably those related to the “Ablative proper” (I’m reading Wikipedia here), might be shifted over to genitive or delegated to verb constructions.

      The case of Finnish, though fascinating, might be a bit misleading. I looked into this once, and if I remember right, though many Finnish noun cases are certainly interesting, at least a few are used virtually exclusively with a single preposition (“into”, for example) and almost act as prepositions themselves. Though the existence of a particular case (and a particular set of noun affixes) for a single preposition certainly increases a language’s difficulty, it might not add a whole lot of linguistic intrigue. Studying the difference between the cases for “under” and “over” in Finnish might be less interesting than, say, the difference between accusative and dative in Russian (dative in Russian is very interesting), or the ablative in Latin.

      I now recognize that this article is concerned with grammar in a somewhat superficial sense — a sense which depends more on the contingencies of history than on fundamental facts about language. Hopefully, a sequence of posts I’ve written more recently — the series is called Redividing Linguistics — might address grammar in some more fundamental ways. I’m looking forward to seeing what you think about those.

      On the other hand, studying surface grammars could be insightful, perhaps not because of the details of these languages in particular but because of the light the study could shed on the diversity of possible surface grammars as a collective. I might be less interested in the surface grammars of Russian and Chinese, for instance, than I am about the differences between these surface grammars, about the fact that they’re so different in the first place, and about what the study of them could tell us about the possible diversity of surface grammars in general. (Ok, bad example — I am interested in both of these surface grammars.)

      In particular, studying surface grammars could help lead us to the sorts of insights which you describe regarding hierarchical and linear representations. These are very interesting comments and I’ll have to look into them more.

      One additional fact is interesting. Understanding deep structures as hierarchical and surface structures as linear, it would follow that each language must feature some certain “tree traversal” algorithm by which it transitions from one to the other. (These are an important topic in computer science.) It would seem that different languages feature highly different tree traversal algorithms. Consider a sentence’s syntax tree “ordered” in its usual way, that is, displayed so that its leaves at the bottom, the sentence’s words, appear in their spoken order. In English, syntax trees tend to be very left-skewed. The subject, usually a bare noun or at most a small determiner phrase, is followed by a handful of extensive verbal, adverbial, and subordinate phrases. In Turkish, the situation appears to be reversed: its trees are very right-skewed. (I’m gathering this from superficial observations in Fromkin et al.’s Linguistics.) These tree traversal methods could be an interesting thing to study.
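The idea in the last comment, that one hierarchical structure can be flattened into different word orders by different traversals, can be sketched in Python. This is a toy model under strong assumptions: every phrase is just a head plus one complement, and a single `head_first` switch decides the order everywhere. The `Phrase` class and `linearize` function are hypothetical names, and the example is not a serious analysis of English or Turkish syntax.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Phrase:
    """A binary phrase: a head and its complement (either may be a phrase)."""
    head: Union[str, "Phrase"]
    complement: Union[str, "Phrase"]


def linearize(node, head_first=True) -> List[str]:
    """Flatten a tree into a word sequence by a depth-first traversal.

    head_first=True visits each head before its complement;
    head_first=False visits the complement first.
    """
    if isinstance(node, str):
        return [node]
    head = linearize(node.head, head_first)
    comp = linearize(node.complement, head_first)
    return head + comp if head_first else comp + head


tree = Phrase(head="read", complement=Phrase(head="about", complement="languages"))

print(linearize(tree, head_first=True))   # ['read', 'about', 'languages']
print(linearize(tree, head_first=False))  # ['languages', 'about', 'read']
```

The same tree yields two mirror-image word orders, which is the sense in which a language’s surface order could be described as a choice of traversal.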
