Indo-European languages

From Citizendium
This article is about the Indo-European family of languages. For other uses of the term Indo-European, please see Indo-European (disambiguation).
This animated map shows an account of the spread of early Indo-European languages

The family of Indo-European languages is a collection of several hundred languages, including the majority of languages spoken in Europe, the plateau of Iran and the subcontinent of India, that share a considerable common vocabulary and linguistic features. These shared traits have led many scholars to believe that these languages derive from a common ancestor, usually designated Indo-European or Proto-Indo-European (or PIE). Among the most famous languages that belong to this group are English, French, German, Greek, Hindi-Urdu, Italian, Latin, Persian (Farsi), Portuguese, Russian, Sanskrit and Spanish.


The exact native name of the first Indo-European population and language remains unknown.

A general name of the language family, accepted by nearly all scholars, is Indo-European, since this family used to cover, during Antiquity and the Middle Ages, a vast territory stretching from India to Europe.

Another name is Indo-Germanic, used mostly by German scholars during the 19th and 20th centuries but largely obsolete since the second half of the 20th century. The name reflects the important role German scholars played in the development of Indo-European studies.

An alternative proposal has been Indo-Hittite,[1] which stresses that the Anatolian branch of Indo-European (including the Hittite language) was a very early offshoot from the Indo-European motherland. This name has not found wide acceptance among scholars.

The name Aryan was used as a synonym for Indo-European by several authors during the 19th century and the beginning of the 20th century. Strictly speaking, however, Aryan (from Sanskrit Arya) designates chiefly the Indo-Iranian branch of Indo-European rather than the family as a whole, and the use of Arya as a native name of the original Indo-European people is only a hypothesis. In addition, some racist authors of the 19th century, and later Nazi ideology, misappropriated the term Aryan to express the absurd idea of the supremacy of a so-called European “race”. After the massive crimes committed by the Nazis during the Second World War, scholars abandoned the term Aryan as a synonym of Indo-European. It is still accepted in its attested Sanskrit sense, as a synonym of the Indo-Iranian branch.


Classic list of branches

The family of Indo-European languages is subdivided into a number of subgroups. These are, according to many classical descriptions:

  1. Indo-Iranian languages or Aryan languages, comprising two close subfamilies: Indo-Aryan and Iranian.
    1. Indo-Aryan languages. These languages are now spoken in the modern countries of India, Bangladesh, Pakistan and Sri Lanka. The oldest literary texts preserved in any Indo-European language are the Vedas, the oldest of which date to around 1500 BC and are composed in an early form of Sanskrit. The intervening period, known as Middle Indo-Aryan, includes Pali, the language of the Pali Canon. Among the modern languages belonging to this subgroup are Hindi, Urdu, Bengali and Punjabi.
    2. Iranian languages. These languages are spoken on the plateau of Iran. There are close affinities between the Iranian and Indian languages, suggesting that the peoples who speak them lived in close proximity to each other for a long time. Many historical linguists believe that both Indian and Iranian descend from a common ancestor, Proto-Indo-Iranian. The Iranian languages are divided into an eastern and a western branch. Modern Persian (or Farsi) is the main representative of the Iranian languages, and it belongs to the western branch. Other Iranian languages are Pashto (also called Afghan or Pushtu) and Baluchi, both spoken in parts of Afghanistan, and Kurdish, which is spoken in an area covering northern Iraq, eastern Turkey, and northwestern Iran.
  2. Armenian. Armenian is somewhat isolated within Indo-European, since it does not appear to be linked to any other group by shared linguistic (grammatical) features, though its vocabulary contains numerous items borrowed from Farsi as a result of many centuries of Persian domination. Other lexical items found in Armenian come from Semitic languages, Greek, and Turkish.
  3. Greek or Hellenic. The Greek people (or Hellenes) entered the area now known as Greece around 2000 BC, where they displaced numerous other peoples. The early flowering of Greek culture produced a number of masterpieces, including the Homeric poems, the Iliad and the Odyssey. In classical Antiquity the Greek language comprised the following notable dialects: Ionic, Aeolic, Arcadian-Cyprian, Doric, and Northwest Greek. The inclusion of Ancient Macedonian in Greek is debated. The most prestigious dialect was Attic, the dialect of Ancient Athens, which belonged to the Ionic group. Attic attained supremacy in the fifth century BC through the dominant political and commercial position of Athens. Attic formed the basis of a koiné or lingua franca, that is, a mixture of several dialects that facilitated communication between different parts of the Greek world and served as a unified standard in foreign commerce and diplomacy. Modern Greek, or Demotic, is ultimately descended from koiné Greek.
  4. Albanian. Albanian is an independent member of the Indo-European family, but this has been recognized only since the early twentieth century because the language is permeated with influences from Latin, Greek, Turkish, and Slavic (or Slavonic). Records for Albanian only go back to the fifteenth century AD.
  5. Italo-Celtic languages, comprising three close subfamilies: Italic, Ancient Ligurian and Celtic.
    1. Italic languages (including the Romance languages). This group includes numerous languages now extinct, such as Faliscan and Umbrian, but the main historical representative of this group is Latin, originally the language of Latium (the area around Rome). Vulgar dialects of Latin spread throughout the Balkans, the Mediterranean and Western Europe, and over time these developed into the Romance languages, which are, from east to west: Romanian, Italian proper and Northern Italian, Sardinian, Corsican, Friulian, Ladin, Romansh, French, Francoprovençal, Occitan, Catalan, Aragonese, Spanish, Asturian-Leonese and Galician-Portuguese.
    2. Ancient Ligurian language. This language was intermediary between the Italic and the Celtic languages.[2] It was spoken in Antiquity in what are now Provence and Liguria.
    3. Celtic languages. These languages were once spoken throughout Western and Central Europe, but are now confined to the British Isles and Brittany. There are two branches: Goidelic or Gaelic and Brythonic or Britannic. The former is represented by the modern languages Irish Gaelic, Scottish Gaelic, and Manx. The latter includes Welsh, Cornish and Breton. The prospects of survival for the remaining Celtic languages are poor, as all have declined sharply in favor of English or French.
  6. Balto-Slavic languages fall into two main close groups: Baltic and Slavic (or Slavonic).
    1. The Baltic languages have three representatives: Latvian (sometimes called Lettish), Lithuanian, and the now extinct Old Prussian. Lithuanian is one of the most conservative Indo-European languages still spoken and is therefore of great interest to historical linguists.
    2. The Slavic languages or Slavonic languages are further subdivided into three groups: East Slavic, which includes Russian (formerly also known as "Great Russian"), Belarusian (formerly known as "White Russian"), and Ukrainian (formerly also known as "Little Russian"); West Slavic, which includes Polish, Czech, and Slovak; and South Slavic, which includes Bulgarian, Slovenian, and Serbo-Croatian. The oldest texts we have in Slavic are fragments of the Bible and other liturgical texts written by St. Cyril in the ninth century in a language usually referred to as Old Church Slavonic.
  7. Germanic languages. The Germanic languages differ from other Indo-European languages by the First or Germanic Consonant Shift (described as Grimm's Law). The common ancestor for the Germanic languages is called either Germanic or Proto-Germanic. This subgroup has three branches:
    1. East Germanic: This branch is now extinct, but it is relatively well known through the fragments of Wulfila's Gothic Bible, which dates to the fourth century AD.
    2. North Germanic: This branch comprises the Scandinavian languages Swedish, Norwegian, Danish, Icelandic, and Faroese.
    3. West Germanic: This branch includes English, German, Dutch, and Frisian.
  8. Tocharian, more exactly called Agni-Kuchi. This is the most obscure branch of Indo-European since it has been extinct since at least the ninth century AD and because we have virtually no data for it. We know of two (or perhaps three) different languages belonging to this branch, usually referred to as Tocharian A (Agni) and Tocharian B (Kuchi).
  9. Anatolian. Although this most ancient branch of Indo-European has been extinct since ca. 1100 BC, a good deal is known about it thanks to the discovery, in the early twentieth century, of cuneiform tablets with inscriptions in Hittite, the main representative of this branch.

Sergent's classification

A comprehensive and detailed classification was proposed in 1995 and revised in 2005 by Bernard Sergent in his extensive synthesis of the Indo-European question, which draws on a large body of previous work.[3] This classification does not contradict the classical list of branches; rather, it is a comprehensive update of it.

I. Northwest group

  1. Italo-Celtic (Western Europe)
    1. Macro-Celtic (Western Europe)
      1. Celtic (chiefly Western Europe)
        1. Gaelic or Goidelic (British Isles), including Irish, Manx, Scottish Gaelic.
        2. Brythonic (British Isles and mainland Western Europe), including Welsh, Cornish, Breton, most varieties of Gaulish (extinct).
        3. Lepontic (extinct) (Northern Italy)
        4. Celtiberian (extinct) (Iberian Peninsula)
      2. Ancient Asturian (extinct) (Iberian Peninsula)
      3. Ancient Ligurian (extinct; intermediate between Celtic and Italic) (Provence, Liguria)
    2. Italic or Macro-Italic (Western Europe)
      1. Osco-Umbrian (extinct) (Italy), including Umbrian and the Sabellic languages (Sabinian, Samnite, Oscan, Pelignian, Volscan, Marse, Marrucine, Vestinian…).
      2. Latino-Faliscan (Italy), including Faliscan (extinct) and Latin.
        1. Deriving from Latin: the Romance languages (Southern, Western and Central Europe), including Galician-Portuguese, Asturian-Leonese, Spanish, Aragonese, Catalan, Occitan, French, Francoprovençal, Romansh, Ladin, Friulian, Northern Italian, Italian, Corsican, Sardinian, Romanian.
      3. North Adriatic (extinct) (around Venetia), including Venetic.
      4. Dalmato-Pannonian (extinct) (from Dalmatia to Hungary)
      5. possibly: Rhaetic (extinct) (central Alps)
      6. Siculian-Elymian (extinct) (Sicily)
      7. Northwest block or Belgian (extinct) (around Belgium, including parts of Netherlands, Germany and France)
  2. Germanic (chiefly Central and Northern Europe)
    1. East Germanic (extinct), including Gothic, Burgundian, Vandal, Rugian, Gepid, Taifal.
    2. North Germanic or Scandinavian, becoming Old Norse at an early stage, then giving rise to Danish, Swedish, Norwegian, Faroese, Icelandic.
    3. West Germanic, including English, Frisian, Low German, Dutch, Afrikaans, German proper (or High German), Yiddish.
  3. Balto-Balkanic (chiefly Central and Eastern Europe)
    1. Macro-Baltic (a better name than Balto-Slavic) (chiefly Central and Eastern Europe)
      1. Baltic (chiefly east of the Baltic Sea), including Old Prussian (extinct), Latvian, Lithuanian.
        1. Slavic or Slavonic (Central and Eastern Europe), in fact a particular, southern offshoot of Baltic, including Old Church Slavonic (extinct), Polish, Sorbian, Kashubian, Czech, Slovak, Slovene, Serbo-Croatian, Bulgarian (with Slavomacedonian), Russian, Belarusian, Ukrainian.
    2. Balkanic (Balkans)
      1. Daco-Thracian (Balkans)
        1. Dacian or Daco-Mysian or Getic (around Romania and Central Balkans), including Dardanian, Moesian (extinct), Mysian (extinct).
          1. Albanian (spread around Albania), probably descending from Dardanian.
        2. Thracian (around Bulgaria, Northern Greece, Northwest Turkey), including Thracian proper (extinct), Thynian (extinct) and Bithynian (extinct).
          1. Armenian (Armenia), a far offshoot of Thracian (but developing a close contact with Helleno-Phrygian).
      2. Illyro-Messapian (extinct) (both sides of the Adriatic Sea), including Illyrian and Messapian.
  4. South Italic (extinct; hard to classify) (Southern Europe)
  5. Philistine, maybe the same language as Pelasgian (extinct; hard to classify, possibly a branch of Macro-Italic) (chiefly spread to Greece and Palestine).
  6. Agni-Kuchi (extinct) (Central Asia, chiefly Xinjiang), often called improperly Tocharian, including Agni (or Tocharian A) and Kuchi (or Tocharian B).

II. Southeast group

  1. Helleno-Phrygian (around Greece and Turkey)
    1. Greek or Hellenic (around Greece), including probably Ancient Macedonian and Aetolian.
    2. Phrygian (extinct) (Turkey)
    3. (Armenian, a far offshoot of Thracian, but developing a close contact with Helleno-Phrygian)
  2. Aryan or Indo-Iranian (from Ukraine to Southern Asia)
    1. Iranian (initially stretched from Ukraine to Central Asia, Iran, Afghanistan and part of Pakistan), including:
      1. Extinct languages such as Cimmerian, Old Persian, Avestan, Scythian/Saka (with Sarmatian, Alanian, Parthian, Median), Pahlavi.
      2. Current languages such as Modern Persian (including Tajik), Ossetian (which comes from Alanian, initially a variety of Scythian), Afghan (or Pashto), Baluchi, Kurdish, Zaza, Lur, Gorani, Mazandarani, Gilani, various languages of Pamir.
    2. Indo-Aryan (Southern Asia, especially part of Pakistan, Northern and Central India, Nepal, Bangladesh, part of Sri Lanka, Maldives), including Sanskrit (extinct), the various Dardic languages (including Kashmiri), Nuristani, Lahnda, Sindhi, Gujarati, Marathi, Bhili, Rajasthani, Punjabi, the various Pahari languages (including Nepalese), Hindi-Urdu, Oriya, Bengali, Bihari, Assamese, Sinhalese, Divehi, Romany.

III. Anatolian (extinct) (chiefly Turkey), including Hittite, Palaic, Luwian, Hieroglyphic Luwian, Lycian, Sidetic, Lydian, Pisidian, Carian (possibly).

IV. Indo-European languages with undetermined status

  1. Lusitanian (around Portugal)
  2. Alteuropäisch (“Old European”) (large parts of Europe)
  3. Prehellenic A (possibly belonging to the Anatolian group) (Greece)
  4. Prehellenic B (possibly belonging to the Balto-Balkanic group) (Greece)

V. Hypothetically Indo-European languages

  1. Tartessian (Southern Spain)
  2. North Picenian (Central Eastern Italy)
  3. Etruscan (possibly close to the Anatolian group) (chiefly Central Italy)


The origins of the Indo-European family have been explained by various hypotheses.

Kurgan hypothesis

The most widely accepted explanation is, by far, the kurgan hypothesis. According to it, all Indo-European languages come from a common mother tongue, called Indo-European or Proto-Indo-European (PIE), that was spoken during the mid or late Neolithic by a people of pastoralists—the Proto-Indo-Europeans—who lived across the Pontic-Caspian Steppe (that is: Ukraine, south Russia and west Kazakhstan). Some famous archeological remnants of this Indo-European people are the “kurgans” (a type of tumulus or burial mound).

The hard lifestyle of the Indo-European pastoralists in the Pontic-Caspian Steppe led them to invade, in several waves during the 5th, 4th and 3rd millennia BC, countries with more advanced agriculture and craft industries, especially Danubian Europe. Indo-European language and culture thus spread to the conquered lands. These waves of expansion were favored by an early mastery of the horse and the wheel and by a warlike culture. They would have split the Indo-European mother tongue and created new languages and cultures that kept essentially Indo-European features, mixed with remnants of the dwindling languages and cultures of the conquered peoples.

Several variants of the kurgan hypothesis suppose that the Pontic-Caspian Steppe was a secondary, late motherland, preceded by a first motherland located somewhere east of the Caspian Sea, especially around the archeological site of Dzhebel (or possibly south of the Caucasus or somewhere not far from the Near East). Such a location could help to explain some common features shared by Indo-European, the Semitic languages and the Kartvelian languages of the south Caucasus (for instance, the word for seven: Indo-European *septm- resembles Semitic *sab‘-at-u-m).

Among the numerous scholars who support the kurgan hypothesis (linguists, archeologists, historians, specialists in religion, anthropologists), the syntheses of archeologists Marija Gimbutas and J.P. Mallory and of historian Bernard Sergent are especially notable.

Anatolian hypothesis

A minority of scholars hold that Indo-European arose from the slow spread of languages and cultures brought by peoples who were expanding agriculture out of Anatolia from the 7th millennium BC onward. This scenario is supported especially by archeologist Colin Renfrew.

Paleolithic continuity theory

A very small minority, whose main proponent is linguist Mario Alinei, holds that the Indo-European family has existed in Europe since the Paleolithic, which implies a very old continuity.[4] According to Alinei, many boundaries of current Indo-European languages are very old, even if some former Indo-European languages enclosed within those boundaries have been replaced several times by new Indo-European languages. This theory emphasizes continuity chiefly in Europe but does not give a detailed explanation of the presence of Indo-European languages in Asia.



  1. STURTEVANT Edgar Howard (1929) “The Relationship of Hittite and Indo-European”, Transactions and Proceedings of the American Philological Association, 60: 25-37.
  2. SERGENT Bernard (2005 [1995]) Les Indo-Européens: histoire, langues, mythes, Paris: Payot, pp. 76-77.
  3. SERGENT Bernard (2005 [1995]) Les Indo-Européens: histoire, langues, mythes, Paris: Payot, pp. 65-150.
  4. Continuitas, a website dedicated to the Paleolithic continuity theory.