
More On The Hardest Languages To Learn – Non-Indo-European Languages

Note: Unbelievably, the PC nutjobs have accused this post, a Linguistics post of all things, of racism. See here for my position statement on racism.

Caution: This post is very long. It runs to 75 pages on the Net.

This is a continuation of the earlier post. I split it up into two parts because it had gotten too long.

The post refers to which languages are the hardest for English speakers to learn, though to some extent, the ratings are applicable across languages. Most Chinese speakers would recognize Spanish as being an easy language, despite its alien nature. And even most Chinese, Navajo, Poles or Czechs acknowledge that their languages are hard to learn. To a certain extent, difficulty is independent of linguistic starting point. Some languages are just harder than others, and that’s all there is to it.

Method, Results and Conclusion. See here.

Ratings: Languages are rated 1-5, easiest to hardest. 1 = easiest, 2 = moderately easy to average, 3 = average to moderately difficult, 4 = very to extremely difficult, 5 = most difficult of all.

Time needed: Time needed to learn the language “reasonably well”: Level 1 languages = 3 months-1 year. Level 2 languages = 6 months-1 year. Level 3 languages = 1-2 years. Level 4 languages = 2 years. Level 5 languages = 3-4 years, but some may take longer.

NE Caucasian, NW Caucasian and Kartvelian

Of course the Caucasian languages like Tsez, Tabasaran, Georgian, Chechen, Ingush, Abkhaz and Circassian are some of the hardest languages on Earth to learn. Chechen, Circassian, Ingush and Abkhaz are rated 5, hardest of all.

NE Caucasian

Tsez has 64-126 different cases, depending on how they are counted, which gives it by far the most complex case system on Earth! It is said that even native speakers sometimes have a hard time picking the correct inflection to use.

Tabasaran is rated the 3rd most complex grammar in the world, with 48 different noun cases.

Tsez and Tabasaran are rated 5, hardest of all.

Kartvelian

One problem with Georgian is the strange alphabet: ქართულია ერთ ერთი რთული ენა. It also has lots of glottal stops that are hard for many foreigners to speak. A single verb can have up to 12 different parts, similar to Polish. Consonant clusters can be huge, with up to eight consonants stuck together, and many consonant sounds are strange. There are six cases and six tenses. In addition, Georgian is both highly agglutinative and highly irregular, which is the worst of two worlds. Georgian is one of the hardest languages on Earth to pronounce.

On the plus side, Georgian has borrowed a great deal of Latinate foreign vocabulary, so that will help anyone coming from a Latinate or Latinate-heavy language background.

Georgian is rated 5, hardest of all.

NW Caucasian

Ubykh, a Caucasian language of Turkey, is now extinct, but there is one second language speaker. It has more consonants than any language on Earth – 78 consonant sounds in all. Combine that with only 2 vowel sounds and a highly complex grammar, and you have one tough language. However, it does lack the convoluted case systems of the Caucasian languages next door.

Ubykh is rated 5, hardest of all.

American Indian Languages

American Indian languages are also notoriously difficult, though few try to learn them in the US anyway. In the rest of the continent, they are still learned by millions in many different nations. You almost need to learn these as a kid. It’s going to be quite hard for an adult to get full competence in them.

One problem with these languages is the multiplicity of verb forms. For instance, the standard paradigm for the overwhelming number of regular English verbs is a maximum of five forms: steal, steals, stealing, stole, stolen. Many Amerindian languages have over 1000 forms of each verb in the language.

Dene-Yeniseian

Na-Dene

Navajo has long, short and nasal vowels, a tone system, and a grammar totally unlike anything in Indo-European. A stem of only four letters or so can take enough affixes to fill a whole line of text. Some Navajo dictionaries have thousands of entries of verbs only, with no nouns. A verb has no particular form like in English – to walk. Instead, it assumes various forms depending on whether or not the action is completed, incomplete, in progress, repeated, habitual, one time only, instantaneous, or simply desired.

For instance, the verb ndideesh means to pick up or to lift up. But it varies depending on what you are picking up.

For instance:
ndideeshtiil – to pick up a slender stiff object (key, pole)
ndideeshleel – to pick up a slender flexible object (branch, rope)
ndideesh’aal – to pick up a roundish or bulky object (bottle, rock)
ndideeshgheel – to pick up a compact and heavy object (bundle, pack)
ndideeshjol – to pick up a non-compact or diffuse object (wool, hay)
ndideeshteel – to pick up something animate (child, dog)
ndideeshnil – to pick up a few small objects (a couple of berries, nuts)
ndideeshjih – to pick up a large number of small objects (a pile of berries, nuts)
ndideeshtsos – to pick up something flexible and flat (blanket, piece of paper)
ndideeshjil – to pick up something I carry on my back
ndideeshkaal – to pick up anything in a vessel
ndideeshtloh – to pick up mushy matter (mud).

But picking up is only one way of handling the 12 different consistencies. One can also bring, take, hang up, keep, carry around, turn over, etc. objects. There are about 28 different verbs one can use for handling objects. If we multiply these verbs by the consistencies, there are over 300 different verbs used just for handling objects.
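The multiplication behind that figure can be spelled out in a few lines. This is only a sketch of the arithmetic: the list of endings is copied from the pick-up forms above, the count of 28 handling verbs is the approximate figure given in the text, and the names `OBJECT_CLASS_ENDINGS`, `HANDLING_VERBS` and `count_handling_forms` are made up for illustration.

```python
# Endings of the 12 "pick up" forms listed above, one per object class.
OBJECT_CLASS_ENDINGS = [
    "tiil", "leel", "'aal", "gheel", "jol", "teel",
    "nil", "jih", "tsos", "jil", "kaal", "tloh",
]

# "About 28" handling verbs (bring, take, hang up, ...), per the text.
HANDLING_VERBS = 28


def count_handling_forms(n_classes: int, n_verbs: int) -> int:
    """Every handling verb needs one form per object class."""
    return n_classes * n_verbs


total = count_handling_forms(len(OBJECT_CLASS_ENDINGS), HANDLING_VERBS)
print(total)  # 12 * 28 = 336, i.e. "over 300"
```

Since the 28 is only approximate, the product should be read as a ballpark figure rather than an exact count.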

In Navajo textbooks, there are conjugation tables for inflecting words, but it’s pretty hard to find a pattern there. One of the most frustrating things about Navajo is that every little morpheme you add to a word seems to change everything else around it, even in both directions.

It is even said that Navajo children have a hard time learning Navajo as compared to children learning other languages, but Navajo kids definitely learn the language.

As with Hopi below, even linguists find the best Navajo grammars difficult or even impossible to understand.

Navajo is rated 5, hardest of all.

Hopi is so difficult that even grammars describing the language are almost impossible to understand.

Hopi is rated 5, hardest of all.

Slavey, a Na-Dene language of Canada, is hard to learn. It is similar to Navajo and Apache. Verbs take up to 15 different prefixes. It also uses a completely different alphabet, a syllabic one designed for Canadian Indians.

Slavey is rated 5, hardest of all.

Burushaski

Burushaski is often thought to be a language isolate, related to no other languages. However, I think it is Dene-Caucasian. It is spoken in the Himalaya Mountains of far northern Pakistan in an area called the Hunza. Its verb conjugation is complex, it has a lot of inflections, there are complicated ways of making sentences depending on many factors, and it is an ergative language, which is hard to learn for speakers of non-ergative languages. In addition, there are very few to no cognates for the vocabulary.

Haida

Haida is often thought to be a Na-Dene language, but proof of its status is lacking. If it is Na-Dene, it is the most distant member of the family. Haida is in the competition for the most complicated language on Earth, with 70 different suffixes.

Salishan

The Salishan languages spoken in the Northwest have a long reputation for being hard to learn, in part because of long strings of consonants, in one case 11 consonants long. The Salish languages are, like Chukchi, polysynthetic. Some translations treat all Salish words as either verbs or phrases. Some say that Salish languages do not contain nouns, though this is controversial. Many of the vowels and consonants are not present in most widely spoken languages.

Nuxálk is a notoriously difficult Salishan Amerindian language spoken in British Columbia. It is famous for having some really wild words and even sentences that don’t seem to have any vowels in them at all. For instance, xłp̓x̣ʷłtłpłłskʷc̓ – he had a bunchberry plant.

The Salishan languages are rated 5, hardest of all.

Kootenai

Yet the Salishans always considered the neighboring language Kootenai to be too hard to learn. Kootenai is an isolate spoken in Idaho.

Kootenai is rated 5, hardest of all.

Algonquian

Central Algonquian

Ojibwa and Cree are very hard to learn. They are written in a variety of different ways with different alphabets and syllabic systems, complicating matters even further. They are both polysynthetic and have long, short and nasal vowels and aspirated and unaspirated voiceless consonants. Words are divided into metrical feet, the rules for determining stress placement in words are quite complex and there is lots of irregularity. Vowels fall out a lot, or syncopate, within words.

Cree adds noun classifiers to the mix, and both nouns and verbs are marked as animate or inanimate. Verbs are also marked for transitive and intransitive, and they get different affixes depending on whether they occur in main or subordinate clauses.

Cree and Ojibwa are rated 5, hardest of all.

Plains Algonquian

Cheyenne is well-known for being a hard Amerindian language to learn. Like many polysynthetic languages, it can have very long words.

náohkêsáa’oné’seómepêhévetsêhésto’anéhe – I truly don’t know Cheyenne very well.

Cheyenne is rated 5, hardest of all.

Uto-Aztecan

Numic

Comanche is legendary for being one of the hardest Indian languages of all to learn. Reasons are unknown, but all Amerindian languages are quite difficult. I doubt if Comanche is harder than other Numic languages.

Bizarrely enough, Comanche has very strange sounds called voiceless vowels, which seems to be an oxymoron, as vowels would seem to be inherently voiced. English has something akin to voiceless vowels in the words particular and peculiar, where the vowel of the first syllable acts something like a voiceless vowel.

Comanche was used for a while by the codespeakers in World War 2 – not all codespeakers were Navajos. Comanche was specifically chosen because it was hard to figure out. The Japanese were never able to break the Comanche code.

Comanche is rated 5, hardest of all.

Quechuan

Quechua is controversial; some say it is very hard to learn, but others disagree. One argument is that there is a lot of dialectal divergence and a lack of learning materials.

On the difficulty side, some say that Quechua speakers spend their whole lives learning the language. Quechua is a controversial case, but I can’t imagine any Amerindian language getting lower than a 5.

Quechua is rated 5, hardest of all.

Oto-Manguean

Chinantec, an Indian language of southwest Mexico, is very hard for non-Chinantecs to learn. The tone system is maddeningly complex, and the syntax and morphology are very intricate.

Chinantec is rated 5, hardest of all.

Iroquoian

Cherokee is very hard to learn. In addition to everything else, it has a completely different alphabet. It’s polysynthetic, to make matters worse. It is possible to write a Cherokee sentence that somehow lacks a verb. There are five categories of verb classifiers. Verbs needing classifiers must use one. Each regular verb can have an incredible 21,262 inflected forms! All verbs contain a verb root, a pronominal prefix, a modal suffix and an aspect suffix. In addition, verbs inflect for singular, plural and also dual. Number is marked for inclusive vs. exclusive.

Cherokee also has lexical tone, with complex rules about how tones may combine with each other. Tone is not marked in the orthography.

Cherokee is rated 5, most difficult of all.

Nambikwaran

This is actually a series of closely related languages as opposed to one language, but the Nambikwara language is the most well-known of the family, with 1,200 speakers in the Brazilian Amazon.

Phonology is complex. Consonants distinguish between aspirated, plain and glottalized, which is common in the Americas. There are strange sounds like prestopped nasals and glottalized fricatives. There are nasal vowels and three different tones. All vowels except one have nasal, creaky-voiced and nasal-creaky counterparts, for a total of 19 vowels.

The grammar is polysynthetic with a complex evidential system.

Reportedly, Nambikwara children do not pick up the language fully until age 10 or so, one of the latest recorded ages for full competence. Nambikwara is sometimes said to be the hardest language on Earth to learn, but it has some competition.

Nambikwara definitely gets a 5 rating, hardest of all!

Witotoan

Bora, a Witotoan language spoken in Peru and Colombia near the border between the two countries, has a mind-boggling 350 different noun classes.

Bora gets a 5 rating, hardest of all.

Tucanoan

Tuyuca is a Tucanoan language spoken by 450 people in the department of Vaupés in Colombia. An article in The Economist magazine concluded that it was the hardest language on Earth to learn.

It has a simple sound system, but it’s agglutinative, and agglutinative languages are pretty hard. For instance, hóabãsiriga means I don’t know how to write. It has two forms of 1st person plural, I and you (inclusive) and I and the others (exclusive). It has between 50 and 140 noun classes, including strange ones like bark that does not cling closely to a tree, which can be extended to mean baggy trousers or wet plywood that has begun to fall apart.

Like Yamana, a nearly extinct Amerindian language of Chile, Tuyuca marks for evidentiality, that is, how it is that you know something. For instance:

Diga ape-wi – The boy played soccer (I saw him playing).
Diga ape-hiyi – The boy played soccer (I assume, though I did not see it firsthand).

Evidential marking is obligatory on all Tuyuca verbs and it forces you to think about how you know whatever it is you know.

Tuyuca definitely gets a 5 rating!

Australian

Australian Aborigine languages are some of the hardest languages on Earth to learn, like Amerindian or Caucasian languages.

All Australian languages are rated 5, most difficult of all.

Papuan

Tor-Kwerba

Berik is a Tor-Kwerba language spoken in the Indonesian colony of Irian Jaya in New Guinea.

Verbs take many strange endings, in many cases mandatory ones, that indicate what time of day something happened, among other things.

Telbener – He drinks in the evening.

Where a verb takes an object, it will not only be marked for time of day but for the size of the object.

Kitobana – He gives three large objects to a man in the sunlight.

Verbs may also be marked for where the action takes place in reference to the speaker.

Gwerantena – To place a large object in a low place nearby.

Berik is rated 5 - hardest of all.

Trans New Guinea

Amele is the world’s most complex language as far as verb forms go, with 69,000 finite and 860 infinitive forms.

Amele is rated 5 - hardest of all.

Afroasiatic

Semitic

Arabic has some very irregular manners of noun declension, even in the plural. For instance, the word girls changes in an unpredictable way when you say one girl, two girls and three girls, and there are two different ways to say two girls depending on context. Two girls is marked with the dual, but different dual forms can be used. All languages with duals are relatively difficult for most speakers that lack a dual in their native language.

Further, it is full of irregular plurals similar to octopus and octopi in English, whereas these forms are rare in English. When you say I love you to a man, you say it one way, and when you say it to a woman, you say it another way. On and on.

There are 28 different symbols in the alphabet and three different ways to write each symbol depending on its place in the word. Consonants are written in different ways depending on where they appear in a word. An h is written differently at the beginning of a word than you would write it at the end of a word. However, one simple aspect of it is that the medial form is always the same as the initial form.

The laryngeals, uvulars and glottalized sounds are hard for many foreigners to make and nearly impossible for them to get right.

Arabic is at least as idiomatic as French or English, so in order to speak it right you have to learn all of the expressionistic nuances.

One of the worst problems with Arabic is the dialects, which in many cases are separate languages altogether. If you learn Arabic, you often have to learn one of the dialects along with classical Arabic. All Arabic speakers speak both an Arabic dialect and Classical Arabic.

To attain anywhere near native speaker competency in Egyptian Arabic, you probably need to live in Egypt for 10 years, but Arabic speakers say that few if any second language learners ever come close to native competency. There is a huge vocabulary, and most words have a wealth of possible meanings.

Adding weight to the commonly held belief that Arabic is hard to learn is research done in Germany in 2005 which showed that Turkish children learn their language at age 2-3, German children at age 4-5, but Arabic kids did not get Arabic until age 12.

Arabic is rated 4, extremely difficult.

Maltese is a strange language, basically an Arabic language that has very heavy influence from non-Arabic tongues. It shares the problem of Gaelic that often words look one way and are pronounced another.

Maltese is rated 4, extremely difficult.

Hebrew is hard to learn according to a number of Israelis. Part of the problem may be the abjad writing system, which often leaves out vowels. Also, other than borrowings, the vocabulary is Afroasiatic, hence mostly unknown to speakers of IE languages. There are also difficult consonants as in Arabic such as pharyngeals and uvulars.

Hebrew gets a 4 for extremely difficult.

Dravidian

Malayalam, a Dravidian language of India, was recently rated the hardest language of all to learn by the World Language Research Foundation.

Malayalam words are often even hard to look up in a Malayalam dictionary.

For instance, adiyAnkaLAkkikkoNDirikkukayumANello is a word in Malayalam. It means something like “I, your servant, am sitting and mixing (which is why I cannot do what you are asking of me)”. The part in parentheses is an example of the type of sentence where it might be used.

The word is composed of many different morphemes, including conjunctions and other affixes, with sandhi going on with some of them so they are eroded away from their basic form. There doesn’t seem to be any way to look that word up, or to write a Malayalam dictionary that lists all the possible forms, including forms like the word above. It would probably be way too huge of a book.

Tamil, a Dravidian language, is probably close to Malayalam in difficulty. Tamil has an incredible 247 characters in its alphabet. In addition, as with other languages, words are written one way and pronounced another.
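For readers wondering where 247 comes from, here is a minimal sketch of the usual tally. The breakdown itself is not given in this post: it is the standard textbook count of 12 vowels, 18 consonants, one compound letter for every consonant-vowel pairing, and one special character (the aytham). The variable names are just labels chosen for this example.

```python
# Standard textbook breakdown of the Tamil script (an assumption here,
# since the post only gives the total of 247).
VOWELS = 12        # independent vowel letters
CONSONANTS = 18    # basic consonant letters
AYTHAM = 1         # the single special character

# Each consonant combines with each vowel to give a compound letter.
compound_letters = VOWELS * CONSONANTS

total_characters = VOWELS + CONSONANTS + compound_letters + AYTHAM
print(compound_letters, total_characters)
```

Seen this way, most of the 247 characters are predictable combinations rather than independent symbols, although the learner still has to recognize each written shape.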

Tamil has two completely different registers for written and spoken speech. Both Tamil and Malayalam are very hard to pronounce, are spoken very fast and have extremely complicated, nearly impenetrable scripts. If Westerners try to speak a Dravidian language in south India, more often than not the Dravidian speaker will simply address them in English rather than try to accommodate them.

Malayalam and Tamil are rated 5, most difficult of all.

Altaic

Most agree that Korean is a hard language to learn.

The alphabet, Hangul, at least is reasonable; in fact, it is quite elegant. But there are four different Romanizations – Lukoff, Yale, Horne, and McCune-Reischauer – which is preposterous. It’s best to just blow off the Romanizations and dive straight into Hangul. This way you can learn a Romanization later, and you won’t mess up your Hangul with spelling errors, as can occur if you go from Romanization to Hangul. Hangul can be learned very quickly, but learning to read Korean books and newspapers fast is another matter altogether.

Bizarrely, there are two different numeral sets used, but one is derived from Chinese so should be familiar to Chinese, Japanese or Thai speakers who use similar or identical systems.

Korean has a problem similar to that of Japanese, that is, if you mess up one vowel in a sentence, you render it incomprehensible. Korean has a wealth of homonyms, and this is one of the tricky aspects of the language. Any given combination of a couple of characters can have multiple meanings.

One problem is that the b, p, j, ch, t and d are pronounced differently than their English counterparts. The consonants, the pachim system and the morphing consonants at the end of the word that slide into the next word make Korean harder to pronounce than any major European language. The vocabulary is very difficult for an English speaker who does not have knowledge of either Japanese or Chinese. Japanese or Chinese will help you a lot with Korean.

Korean is agglutinative and has a subject-topic discourse structure, and the logic of these systems is difficult for English speakers to understand.

Meanwhile, Korean has an honorific system that is even wackier than that of Japanese. However, the younger generation is not using the honorifics so much, and a foreigner isn’t expected to know the honorific system anyway. Speakers of Korean can learn Japanese fairly easily.

Korean is rated by language professors as being one of the hardest languages to learn.

Korean is rated 5, hardest of all.

Japonic

Japanese also uses a symbolic alphabet, but the symbols themselves are sometimes undecipherable, in that even Japanese speakers will sometimes encounter written Japanese and will say that they don’t know how to pronounce it. I don’t mean that they mispronounce it; that would make sense. I mean they don’t have the slightest clue how to say the word! This problem is essentially nonexistent in a language like English.

There are over 2,000 frequently used characters in three different symbolic alphabets that are frequently mixed together in confusing ways. Due to the large number of frequently used symbols, it’s said that even Japanese adults learn a new symbol a day well into adulthood.

The Japanese writing system is probably crazier than the Chinese writing system. Japanese borrowed Chinese characters. But then they gave each character several pronunciations, and in some cases as many as 24. Next they made two syllabaries using another set of characters, then over the next millennia came up with all sorts of contradictory and often senseless rules about when to use the syllabaries and when to use the character set. Later on they added a Romanization to make things even worse.

Chinese uses 5-6,000 characters regularly, while Japanese only uses around 2,000. But in Chinese, each character has only one or maybe two pronunciations. In Japanese, there are complicated rules about when and how to combine the hiragana with the characters. These rules are so hard that many native speakers still have problems with them. There are also personal and place names (proper nouns) which are given completely arbitrary pronunciations often totally at odds with the usual pronunciation of the character.

Speaking Japanese is not as difficult as everyone says, and many say it’s fairly easy. However, there is a problem similar to English in that one word can be pronounced in multiple ways, like read and read in English.

There is also a class of Japanese called “honorifics” that is quite hard to master. These typically affect verbs. Honorifics vary depending on who you are and who you are talking to. In addition, gender comes into play. One wild thing about Japanese is counting forms. You actually use different numeral sets depending on what it is you are counting! There are dozens of different ways of counting things.

Japanese grammar is often said to be simple, but that does not appear to be the case on closer examination. Particles are especially vexing. Verbs engage in all sorts of wild behavior, and adverbs often act like verbs. Meanwhile, honorifics change the behavior of all words. There are particles like ha and ga that have many different meanings. One problem is that all noun modifiers, even phrases, must precede the nouns they are modifying.

It’s often said that Japanese has no case, but this is not true. Actually, there are seven cases in Japanese. The aforementioned ga is a clitic meaning nominative, made is terminative case, -no is genitive and -o is accusative.

In this sentence:

The plane that was supposed to arrive at midnight, but which had been delayed by bad weather, finally arrived at 1 AM.

The whole relative clause (that was supposed to arrive at midnight, but which had been delayed by bad weather) must precede the noun plane:

Was supposed to arrive at midnight, but had been delayed by bad weather, the plane finally arrived at 1 AM.

Speaking Japanese is one thing, but reading and writing it is a whole new ballgame. It’s perfectly possible to know the meaning of every kanji and the meaning of every word in a sentence, but you still can’t figure out the meaning of the sentence because you can’t figure out how the sentence is stuck together in such a way as to create meaning.

However, Japanese grammar has the advantage of being quite regular. For instance, there are only four frequently used irregular verbs.

Like Chinese, the nouns are not marked for number or gender. However, while Chinese is forgiving of errors, if you mess up one vowel in a Japanese sentence, you may end up with incomprehension.

The real problem is that the Japanese you learn in class is one thing, and the Japanese of the street is another. One problem is that in street Japanese, the subject is typically not stated in a sentence. Instead it is inferred through such things as honorific terms or the choice of words you used in the sentence. Probably no one goes crazier on negatives than the Japanese. Particularly in academic writing, triple and quadruple negatives are common, and can be quite confusing.

Yet there are problems with the agglutinative nature of Japanese. It’s a completely different syntactic structure than English. Often if you translate a sentence from Japanese to English it will just look like a meaningless jumble of words. Although many Japanese learners feel it’s fairly easy to learn, surveys of language professors continue to rate Japanese as one of the hardest languages to learn. However, it’s generally agreed that Japanese is easier to learn than Korean. Japanese speakers are able to learn Korean pretty easily.

Japanese is rated 5, hardest of all.

Turkic

Turkish is often considered to be hard to learn, and it’s rated one of the hardest in surveys of language teachers. However, it’s probably easier than its reputation makes it out to be. It is agglutinative, so you can have one long word where in English you might have a sentence of shorter words. One word is Çekoslovakyalılaştıramadıklarımızdanmışsınız?, meaning, Were you one of those people whom we could not make into a Czechoslovakian? Many words have more than one meaning.

There is no verb to be, which is hard for many foreigners. Instead, the concept is wrapped onto the subject of the sentence as a -dim or -im suffix. Turkish is an imagery-heavy language, and if you try to translate straight from a dictionary, it often won’t make sense. However, the suffixation in Turkish, along with the vowel harmony, are both very precise, and there are few if any exceptions.

Turkish is a language of precision in other ways. For instance, there are eight different forms of subjunctive mood that describe various degrees of uncertainty that one has about what one is talking about. This relates to the evidentiality discussed under Tuyuca above. On Turkish news, verbs are generally marked with miş, which means that the announcer believes it to be true though he has not seen it firsthand.

The Roman alphabet and almost mathematically precise grammar really help out. A suggestion that Turkish may be easier to learn than many think is the research that shows that Turkish children attain basic grammatical mastery of Turkish at age 2-3, as compared to 4-5 for German and 12 for Arabic. The research was conducted in Germany in 2005.

In addition, Turkish has a phonetic orthography.

However, Turkish is hard for an English speaker to learn for a variety of reasons. It is agglutinative like Japanese, and all agglutinative languages are difficult for English speakers to learn. As in Japanese, you start your Turkish sentence the way you would end your English sentence. As in the Japanese example above, the subordinate clause must precede the subject, whereas in English, the subordinate clause must follow the subject. In the example below, that he will be on time is a subordinate clause.

In English, we say, “I hope that he will be on time.”

In Turkish, the sentence would read, “That he will be on time I hope.”

Turkish is rated 3, or average to moderately difficult.

Finno-Ugric

Finnic

Finnish is very hard to learn, and even long-time learners often still have problems with it. You have to know exactly which grammatical forms to use where in a sentence. In addition, Finnish has 15 cases in the singular and 16 in the plural. This is hard to learn for speakers coming from a language with little or no case.

For instance:
talo – the house
talon – house’s
taloa – some of the house
taloksi – into/as the house
talossa – in the house
talosta – from inside the house
taloon – into the house
talolla – on/at the house
talolta – from beside the house
talolle – to the house
taloista – from the houses
taloissa – in the houses.
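The forms in this list are all built by attaching an ending to the same stem, which can be sketched in a few lines of code. This is only an illustration for the single word talo: the case names are the usual grammar-book labels rather than anything given in the list, the function `inflect` and the tables are hypothetical, and real Finnish needs far more machinery (vowel harmony, consonant gradation, 20-odd declension classes) than this toy lookup.

```python
STEM = "talo"

# Singular endings for talo, read off the forms listed above.
# Case labels are standard grammar terms, added here for illustration.
SINGULAR_ENDINGS = {
    "nominative": "",
    "genitive": "n",
    "partitive": "a",
    "translative": "ksi",
    "inessive": "ssa",
    "elative": "sta",
    "illative": "on",   # specific to this stem: final vowel doubled + n
    "adessive": "lla",
    "ablative": "lta",
    "allative": "lle",
}

# Only the two plural forms from the list are covered; other plural
# cases do not follow this simple stem + i + ending pattern.
PLURAL_CASES_COVERED = {"inessive", "elative"}


def inflect(stem: str, case: str, plural: bool = False) -> str:
    """Attach a case ending to the stem (toy model for talo only)."""
    ending = SINGULAR_ENDINGS[case]
    if not plural:
        return stem + ending
    if case not in PLURAL_CASES_COVERED:
        raise ValueError(f"plural {case} is not covered by this sketch")
    return stem + "i" + ending


for case in SINGULAR_ENDINGS:
    print(case, inflect(STEM, case))
print("plural elative", inflect(STEM, "elative", plural=True))
print("plural inessive", inflect(STEM, "inessive", plural=True))
```

Even in this tiny model one ending had to be special-cased for the stem, which hints at why the full system is so much work for learners.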

It gets much worse than that. This web page shows that the noun kauppa (shop) can have 2,253 forms.

A simple adjective + noun type of noun phrase of two words can be declined in up to 100 different ways.

Adjectives and nouns belong to 20 different classes. The rules governing their case declension depend on what class the substantive is in.

As with Hungarian, words can be very long. For instance, lentokonesuihkuturbiinimoottoriapumekaanikkoaliupseerioppilas which means a non-commissioned officer cadet learning to be an assistant mechanic for airplane jet engines.

Finnish, oddly enough, always puts the stress on the first syllable. Finnish vowels will be hard to pronounce for most foreigners.

However, Finnish has the advantage of being pronounced precisely as it is written. This is also part of the problem though, because if you don’t say it just right, the meaning changes. So, as with Polish, when you mangle the language, you will only achieve incomprehension. With English, by contrast, if a foreigner mangles the language, you can often winnow some sense out of it.

However, despite the fact that written Finnish can be easily pronounced, when learning Finnish, as in Korean, it is as if you must learn two different languages – the written language and the spoken language. A better way to put it is that there is “one language for writing and another for speaking.” You use different forms depending on whether you are conversing or putting something on paper.

Nevertheless, some pronunciation is difficult, especially the contrast between short and long vowels and consonants. Check out these minimal pairs:

sydämellä / sydämmellä and jollekin / jollekkin

One easy aspect of Finnish is the way you can build many forms from a base root. From kirj-, you can build:
kirja – book
kirje – letter
kirjoittaa – to write
kirjailija – writer.

Finnish verbs are very regular. The irregular verbs can almost be counted on one hand – juosta, käydä, olla, nähdä, tehdä, and a few others. In fact, Finnish in general is very regular.

As in many Asian languages, there are no masculine or feminine pronouns. One redeeming feature of Finnish is a complete lack of consonant clusters.

Finnish is rated 5, hardest of all.

Estonian has similar difficulties to Finnish, since they are closely related. Estonian has 14 cases, including strange cases such as the abessive, adessive, elative and inessive. It also has three different degrees of length, which is strange among the world’s languages. There are short, long and extra-long vowels and consonants.

linalinen – short n
linnathe town’s – long n, written as nn
`linnainto the town – extra-long n, not written out!

There are differences in the pronunciation of the three forms above, but in rapid speech, they are hard to hear, though native speakers can make them out. Difficulties are further compounded in that extra-long sonorants (m, n, ng, l, and r) and vowels are not written out. All in all, phonemic length can be a problem in Estonian, and foreigners never seem to get it completely down.

Estonian is rated 5, hardest of all.

Ugric

It’s widely agreed that Hungarian is one of the hardest languages on Earth to learn. Even language professors agree. For one thing, there are many different forms for a single word via word modification. This enables the speaker to make his intended meaning very precise.

Hungarian is said to have an incredible 35 different cases, but the actual number is probably just 18. Verbs change depending on whether the object is definite or indefinite. There are five different types of verb conjugations. Nearly everything in Hungarian is inflected, similar to Lithuanian or Czech.

The case distinctions alone can create many different words out of one base form. For the word house, we end up with 31 different words using case forms.

házba – into the house
házban – in the house
házból – from [within] the house
házra – onto the house
házon – on the house
házról – off [from] the house
házhoz – to the house
házig – until/up to the house
háznál – at the house
háztól – [away] from the house
házzá – Translative case, where the house is the end product of a transformation, such as They turned the cave into a house.
házként – as the house, which could be used if you acted in your capacity as a house, or disguised yourself as one. He dressed up as a house for Halloween.
házért – for the house, specifically things done on its behalf, or done to get the house. They spent a lot of time fixing things up (for the house).
házul – Essive-modal case. Something like “house-ly” or “in the way/manner of a house.” The tent served as a house (in a house-ly fashion).

And we do have some basic cases:
ház – nominative. The house is down the street.
házat – accusative. The ball hit the house.
háznak – dative. The man gave the house to Mary.
házzal – Similar to instrumental, but more similar to English with. Refers to both instruments and companions.
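
The forms listed above can be sketched as simple suffixation onto the stem. This is only an illustration: the suffixes below are copied from the forms shown for ház, the dictionary keys are just the English glosses from the list, and the names `STEM`, `SUFFIXES` and `decline` are made up for this sketch. Plain concatenation is an assumption that holds for this one word; other Hungarian words need vowel harmony and other adjustments that this does not attempt.

```python
# Suffixes as they appear on "ház" in the list above.
# Assumption: simple concatenation reproduces these forms for this word only.
STEM = "ház"

SUFFIXES = {
    "nominative": "",
    "accusative": "at",
    "dative": "nak",
    "with": "zal",
    "into": "ba",
    "in": "ban",
    "from within": "ból",
    "onto": "ra",
    "on": "on",
    "off from": "ról",
    "to": "hoz",
    "up to": "ig",
    "at": "nál",
    "away from": "tól",
    "translative": "zá",
    "as": "ként",
    "for": "ért",
    "essive-modal": "ul",
}


def decline(stem: str, suffixes: dict[str, str]) -> dict[str, str]:
    """Attach every suffix to the stem and return gloss -> word form."""
    return {gloss: stem + suffix for gloss, suffix in suffixes.items()}


forms = decline(STEM, SUFFIXES)

if __name__ == "__main__":
    for gloss, form in forms.items():
        print(f"{form:8} {gloss}")
```

Every form is built from one unchanged stem plus one ending, which is why a single base word fans out into so many surface words.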

The genitive takes 12 different declensions, depending on person and number.
házam – my house
házaim – my houses
házad – your house
házaid – your houses
háza – his/her/its house
házai – his/her/its houses
házunk – our house
házaink – our houses
házatok – your (you all) house
házaitok – your (you all) houses
házuk – their house
házaik – their houses
egyház (literally one-house) means church, as in the Catholic Church.

There are also very long words such as megszentségteleníthetetlenségeskedéseitekért. Because Hungarian is an agglutinative language, that word is made up of many small parts of words, or morphemes. It means something like “for your (you all possessive) repeated pretensions at being impossible to desecrate.”

The preposition is stuck onto the word in this language, and this will seem strange to speakers of languages with free prepositions.

Hungarian is full of synonyms, similar to English.

For instance, there are 78 different words that mean to move: halad, jár, megy, dülöngél, lépdel, botorkál, kódorog, sétál, andalog, rohan, csörtet, üget, lohol, fut, átvág, vágtat, tipeg, libeg, biceg, poroszkál, vágtázik, somfordál, bóklászik, szedi a lábát, kitér, elszökken, betér, botladozik, őgyeleg, slattyog, bandukol, lófrál, szalad, vánszorog, kószál, kullog, baktat, koslat, kaptat, császkál, totyog, suhan, robog, rohan, kocog, cselleng, csatangol, beslisszol, elinal, elillan, bitangol, lopakodik, sompolyog, lapul, elkotródik, settenkedik, sündörög, eltérül, elódalog, kóborol, lézeng, ődöng, csavarog, lődörög, elvándorol, tekereg, kóvályog, ténfereg, özönlik, tódul, vonul, hömpölyög, ömlik, surran, oson, lépeget, mozog and mozgolódik.

Only about five of those terms are archaic and seldom used; the rest are in current use.

In addition, while most languages have names for countries that are pretty easy to figure out, in Hungarian even the names of nations are hard because they have changed the names so much. Italy becomes Olaszország, Germany becomes Németország, etc.

As in Russian and Serbo-Croatian, word order is relatively free in Hungarian. Further, there are quite a few dialects in Hungarian. Native speakers can pretty much understand them, but foreigners often have a lot of problems. Accent is very difficult in Hungarian due to the bewildering number of rules to determine accent. In addition, there are exceptions to all of these rules. Nevertheless, Hungarian is probably more regular than Polish. Hungarian spelling is also very strange for non-Hungarians, but at least the orthography is phonetic.

There are many irregularities in inflections, and even Hungarians have to learn how to spell these in school and have a hard time learning this. Hungarian phonetics is also strange, and to make matters worse, there is a ton of slang.

One of the problems with Hungarian phonetics is vowel harmony. Since you stick morphemes together to make a word, the vowels that you have used in the first part of the word will influence the vowels that you will use to make up the morphemes that occur later in the word. The vowel harmony gives Hungarian the “singing effect” when it is spoken. The gy sound is hard for many foreigners to make.

It’s hard to say, but Hungarian is probably harder to learn than even the hardest Slavic languages like Czech, Serbo-Croatian and Polish.

Hungarian is rated 5, hardest of all.

Sino-Tibetan

Sinitic

It’s fairly easy to learn to speak Mandarin at a basic level, though the tones can be tough. This is because the grammar is very simple. Short words, no case, gender, verb inflections or tense. But with Japanese, you can keep learning, and with Chinese, you sort of hit a wall, often because the syntactic structure is so strangely different from English (isolating).

Actually, the grammar is harder than it seems. At first it seems simple, like a simplified English with no tense or articles. But the simplicity makes it difficult. No tense means there is no easy way to mark time in a sentence. Furthermore, tense is not as easy as it seems. Sure, there are no verb conjugations, but instead you must learn some particles and special word order that are used to mark tense.

Once you start digging into Chinese, there is a complex layer under all the surface simplicity. There is aspect, serial verbs, a complex classifier system, syntax marked by something called topic-prominence, a strange form called the detrimental passive, preposed relative clauses, use of verbs rather than adverbs to mark direction, and all sorts of strange stuff.

The alphabet uses symbols, so it’s not even a real alphabet. There are at least 85,000 symbols and actually many more, but you only need to know about 3-5,000 of them, and many Chinese don’t even know 1,000. To be highly proficient in Chinese, you need 10,000 characters, and probably less than 5% of Chinese know that many.

Even leaving the characters aside, the stylistic and literary constraints required to write Chinese in an eloquent or formal (literary) manner would make your head swim. And just because you can read Chinese does not mean that you can read Classical Chinese prose. It’s as if it’s written in a different language.

It’s a real problem when you encounter a symbol you don’t know because there is no way to sound out the word. You are really and truly lost and screwed. You need to learn quite a bit of vocabulary just to speak simple sentences.

The tones are often quite difficult for a Westerner to pick up. If you mess up the tones, you have said a completely different word. Often foreigners who know their tones well nevertheless do not say them correctly, and hence, they say one word when they mean another.

A major problem with Chinese is homonyms. To some extent, this is true in many tonal languages. Since Chinese uses short words and is either monosyllabic or disyllabic, there is a limited repertoire of sounds that can be used. At a certain point, all of the sounds are used up, and you are into the realm of homophones.

Tonal distinctions are one way that monosyllabic and disyllabic languages attempt to deal with the homophone problem, but it’s not good enough, since Chinese still has many homophones, and meaning is often discerned by context. Chinese, like French and English, is heavily idiomatic.

It’s little known, but Chinese also uses different forms to count different things, like Japanese. Many agree that Chinese is the hardest to learn of all of the major languages. Language professors have rated Chinese as the hardest language on Earth to learn.

It gets a 5 rating for hardest of all.

However, Cantonese and Min Nan (Taiwanese) are even harder to learn than Mandarin. Cantonese has nine tones to Mandarin’s four, and in addition, they continue to use a lot of the older traditional Chinese characters that were superseded when China moved to a simplified script in the 1950’s. In addition, Cantonese has verbal aspect, possibly up to 20 different varieties. Furthermore, since non-Mandarin characters are not standardized, Cantonese cannot be written down as it is spoken.

Min Nan also has a more complex tone system than Mandarin, with eight tones. Even many Taiwanese natives don’t seem to get it right these days, as it is falling out of favor and many fewer children are being raised speaking it than before.

Cantonese and Min Nan get 5 ratings, hardest of all.

Austroasiatic

Mon-Khmer

Vietnamese is also hard to learn because to an outsider, the tones seem hard to tell apart. Therefore, foreigners often make themselves difficult to understand by not getting the tone precisely correct. It also has “creaky-voiced” tones, which are very hard for foreigners to get a grasp on. Vietnamese grammar is fairly simple, and reading Vietnamese is pretty easy once you figure out the tone marks. Words are short as in Chinese. However, the simple grammar is relative, as you can have 25 or more forms just for I, the 1st person singular pronoun.

Vietnamese gets 4, extremely difficult.

Khmer has a reputation for being hard to learn. I understand that it has one of the most complex honorifics systems of any language on Earth. Over a dozen different words mean to carry depending on what one is carrying. There are several different words for slave depending on who owned the slave and what the slave did. There are 28-30 different vowels, including sets of long and short vowels and long and short diphthongs. The vowel system is so complicated that there isn’t even agreement on exactly what it looks like.

Speaking it is not so bad, but reading and writing it is pretty difficult. For instance, you can put up to five different symbols together in one complex symbol.

Khmer gets a 5 rating, hardest of all.

Sedang, a language of Vietnam, has the highest number of vowel sounds of any language on Earth, at 55 distinct vowel sounds.

Sedang gets a 5 rating, hardest of all.

Hmong-Mien

Hmong is widely spoken in this part of California, but it’s not easy to learn. There are eight tones, and they are not easy to figure out. It’s not obviously related to any other major language but the obscure Mien.

It has some very strange consonants called voiceless nasals. We have them in English as allophones – the m in small is voiceless, but in Hmong, they put them at the front of words – the m in the word Hmong is voiceless. These can be very hard to pronounce.

Hmong gets a 5 rating, hardest of all.

Austro-Tai

Austronesian

Malayo-Polynesian

Bahasa Indonesia and the related Malaysian are fairly easy languages to learn. For one thing, the grammar is dead simple. Verbs are not marked for tense at all. And the sound system of these languages, in common with Austronesian in general, is one of the simplest on Earth. Bahasa Indonesia has few homonyms, homophones, homographs, heteronyms, etc. Words in general have only one meaning. Though the orthography is not completely phonetic, it only has a small number of exceptions. The system for converting words into nouns or verbs is regular.

Bahasa Indonesia and Malaysian get a 1 rating for very easy.

However, Tagalog is considerably harder. Tagalog is an ergative-absolutive language, not a nominative-accusative language. In the former, phrases are marked not according to subject or object as in the latter, but according to whether the verb is transitive or intransitive. The subject of a transitive verb is marked one way, and the subject of an intransitive verb and object of a transitive verb are marked a second way.

Compared to many European languages, Tagalog syntax, morphology and semantics are often quite different. Unlike Malay, verbs conjugate quite a bit in Tagalog. However, articles and the creation of adjectives from nouns are very easy. Compare ganda = beauty (noun) and maganda = beautiful (adjective).

Tagalog gets a 3 rating, average to moderately difficult.

Maori and other Polynesian languages have a reputation for being quite hard to learn, but others say they are not that hard at all, so the situation is confused. The pronunciation is simple, and there is no gender. The main problem for English speakers is that the sentence structure is backwards compared to English. In addition, macrons can cause problems.

Maori gets a 3 rating, average to moderately difficult.

Kwaio is an Austronesian language spoken in the Solomon Islands. It has four different forms of number to mark pronouns – not only the usual singular and plural, but also the rarer dual and the very rare paucal.

For instance:

1 dual inclusive (you and I)
1 dual exclusive (I and someone else, not you)

1 paucal inclusive (you, I and a few others)
1 paucal exclusive (I and a few others)

1 plural inclusive (I, you and many others)
1 plural exclusive (I and many others)

Pretty wild!
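
The first-person paradigm above is just the cross of number and inclusive/exclusive, which a few lines of code can enumerate. This is a rough sketch using the labels from the list; the function name `first_person_forms` is invented here, and it assumes the singular has no inclusive/exclusive split (since “inclusive” needs at least a “you” plus an “I”).

```python
from itertools import product

# The four numbers named in the text; the singular is handled separately.
NUMBERS = ["singular", "dual", "paucal", "plural"]
CLUSIVITY = ["inclusive", "exclusive"]


def first_person_forms() -> list[str]:
    """List every first-person pronoun slot implied by the description."""
    forms = ["1 singular"]  # assumption: no inclusive/exclusive in the singular
    for number, clusivity in product(NUMBERS[1:], CLUSIVITY):
        forms.append(f"1 {number} {clusivity}")
    return forms


if __name__ == "__main__":
    for form in first_person_forms():
        print(form)
```

That comes to seven separate slots for what English covers with just I and we.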

Kwaio gets a 5, hardest of all.

Tai-Kadai

Thai is a pretty hard language to learn. There are 75 symbols in the strange script, there are no spaces between words in the script, and vowels can come before, after, above or below consonants in any given syllable. There are five tones, including a neutral tone. Tones are determined by a variety of complex things, including a combination of tone marks, the class of consonants, if the syllable ends in a sonorant or a stop, and what the tone of the preceding syllable was.

There is a system of noun classifiers for counting various things, similar to Japanese. In addition, common to many Asian languages, there is a complicated honorifics system. The vowels are different than in many languages, and there are some unusual diphthongs: eua, euai, aui and uu. There is a contrast between aspirated and unaspirated consonants.

Consonant pronunciations vary depending on the location of the syllable in the word – for instance, s can change to t. There are many vowels which are spoken but not written. There are many consonants that are pronounced the same – for instance, there are six different t’s, not counting the s’s that turn into t’s. The Thai script is definitely one of the most difficult phonetic scripts. Nevertheless, the Thai script is easier to learn than the Japanese or Chinese character sets. In spite of all of that, the syntax is simple, like Chinese.

Thai gets a 4 rating, extremely hard to learn.

Niger-Kordofanian

Niger-Congo

Bantu

Bakjalukasha, a Bantu language spoken in Ivory Coast, is hard to learn. Many of these African languages are tonal and can be quite complex. They also divide nouns into different categories (noun classes) like Caucasian languages do. Further, they are often seriously inflected.

Bakjalukasha gets a 5 rating, hardest of all.

Nguni and Xhosa, two languages of South Africa, are quite difficult, with up to nine click sounds in both. Clicks only exist in one language outside of Africa, an Australian language, and are extremely difficult to learn. Even native speakers mess up the clicks sometimes. Nelson Mandela said he had problems making some of the click sounds in Xhosa.

Nguni and Xhosa get 5 ratings, hardest of all.

Zulu and Ndebele also have these impossible click sounds. These languages also make plurals by changing the prefix of the noun, and the manner varies according to the noun class. If you want to look up a word in the dictionary, first of all you need to discard the prefix. For instance, in Ndebele,

river = umfula
rivers = imifula

but stone = ilitshe
stones = amatshe

yet tree = isihlahla
trees = izihlahla.
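
The prefix swapping and the dictionary lookup problem can be sketched in a few lines. This is a toy illustration only: the table covers just the three singular/plural prefix pairs visible in the examples above, and the names `PLURAL_PREFIX`, `split_prefix`, `pluralize` and `dictionary_stem` are invented for this sketch. Real Ndebele has more noun classes than this.

```python
# Singular prefix -> plural prefix, taken only from the three examples above.
PLURAL_PREFIX = {"um": "imi", "ili": "ama", "isi": "izi"}

# Try longer prefixes first so a short prefix never shadows a longer one.
ALL_PREFIXES = sorted(
    set(PLURAL_PREFIX) | set(PLURAL_PREFIX.values()), key=len, reverse=True
)


def split_prefix(noun: str) -> tuple[str, str]:
    """Split a noun into (class prefix, stem) using the toy prefix table."""
    for prefix in ALL_PREFIXES:
        if noun.startswith(prefix):
            return prefix, noun[len(prefix):]
    raise ValueError(f"no known prefix on {noun!r}")


def pluralize(noun: str) -> str:
    """Swap a known singular prefix for its plural counterpart."""
    prefix, stem = split_prefix(noun)
    if prefix not in PLURAL_PREFIX:
        raise ValueError(f"{noun!r} does not carry a known singular prefix")
    return PLURAL_PREFIX[prefix] + stem


def dictionary_stem(noun: str) -> str:
    """Discard the prefix, as you must before looking the word up."""
    return split_prefix(noun)[1]


if __name__ == "__main__":
    for word in ["umfula", "ilitshe", "isihlahla"]:
        print(word, "->", pluralize(word), "| look up under:", dictionary_stem(word))
```

Note that the singular and the plural of the same noun end up under the same stem, which is exactly why you cannot look a word up by its first letters.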

Zulu has pitch accent, tones and clicks. There are nine different pitch accents, four tones and three clicks, but each click can be pronounced in five different ways. However, tones are not marked in writing, so it’s hard to figure out when to use them. Zulu also has depressor consonants, which lower the tone in the vowel in the following syllable. In addition, Zulu has multiple gender – 15 different genders. And some nouns behave like verbs.

Zulu and Ndebele both get 5 ratings, hardest of all.

The African Bantu language Ga has a bad reputation for being a tough nut to crack. It is spoken in Ghana by about 600,000 people. It has two tones and engages in a strange behavior called tone terracing that is common to many West African languages. It also has many sounds that are not in any Western languages.

Ga gets a 5 rating, hardest of all.

Ndali is a Bantu language with 150,000 speakers spoken in Malawi and Tanzania. It has many strange tense forms. For instance, in the past tense:

Past tense A: He went just now.
Past tense B: He went sometime earlier today.
Past tense C: He went yesterday.
Past tense D: He went sometime before yesterday.

Future tense is marked similarly:

Future tense A: He’s going to go right away.
Future tense B: He’s going to go sometime later today.
Future tense C: He’s going to go tomorrow.
Future tense D: He’s going to go sometime after tomorrow.

Ndali gets a 5, hardest of all.

For unknown reasons, Swahili is generally considered to be an easy language to learn. The US military ranks it 1, among the easiest of all languages to learn. This seems to be the typical perception. Why Swahili is so easy to learn, I am not sure. It’s a trade language, and trade languages are often fairly easy to learn. There’s also a lot of controversy about whether or not Swahili can be considered a creole, but that has not been proven. For the moment, the reasons why Swahili is so easy to learn will have to remain mysterious.

Swahili gets a 1 rating, easiest of all.

Khoisan

!Xóõ (Taa), spoken by only 4,200 Bushmen in Botswana and Namibia, is a notoriously difficult Khoisan language replete with the notoriously impossible to comprehend click sounds. Taa has anywhere from 130 to 164 consonants, possibly the largest phonemic inventory of any language. Of this vast wealth of sounds, there are anywhere from 30-64 different click sounds.

In addition, there are four types of vowels: plain, pharyngealized, breathy-voiced and strident. On top of that, there are four tones. Speakers develop a lump on their larynx from making the click sounds.

Taa gets a 5 rating, hardest of all.

Eskimo-Aleut

Inuktitut is extremely hard to learn. Inuktitut is polysynthetic-agglutinative, and roots can take many suffixes, in some cases up to 700. Verbs have 63 present indicative forms, and conjugation involves 252 different inflections. However, suffixation is extremely regular. In a typical long Inuktitut text, 92% of words will occur only once. This is quite different from English and many other languages where certain words occur very frequently or at least frequently. Certain fully inflected verbs can be analyzed both as verbs and as nouns. Words can be very long.

Inuktituusuungutsialaarungnanngittuaraaluuvunga – I truly don’t know how to speak Inuktitut very well.

Inuktitut is also rated by linguists as one of the hardest languages on Earth to pronounce. Inuktitut may be as hard to learn as Navajo.

Inuktitut is rated 5, hardest of all.

Paleosiberian

Chukchi is a polysynthetic language, so clearly it must be hard to learn. In polysynthetic languages, very long words can denote an entire sentence, and it’s quite hard to take the word apart into its parts and figure out exactly what they mean and how they go together.

Chukchi gets a 5 rating, hardest of all.

Basque

Basque, of course, is just a wild language altogether. There is an old saying that the Devil tried to learn Basque, but after seven years, he only learned how to say Hello and Goodbye. There are 24 cases, and the verbs are quite complex. This is because it is an ergative language, so verbs vary according to the number of subjects and the number of objects and if any third person is involved.

If you don’t grow up speaking Basque, it’s hard to attain native speaker competence. It’s quite a bit easier to write in Basque than to speak it. Nevertheless, Basque verbs are quite regular. In fact, the entire language is quite regular. In addition, most words above the intermediate level are borrowings from large languages, so once you reach intermediate Basque, the rest is not that hard. In addition, on the plus side, pronunciation is straightforward.

Basque is rated 5, hardest of all.
