A Reworking of Chinese Language Classification

Updated December 3, 2014. This post runs to 112 pages so far. On March 6, 2011, Sinologist Victor Mair took on the question of Mutual Intelligibility of Sinitic Languages.

The Chinese languages have undergone a lot of reclassification lately (Mair 1991), from one Chinese language a couple of decades ago up to 14 Chinese languages today according to the latest Ethnologue.

However, Jerry Norman, one of the world’s top experts on Chinese, says that based on mutual intelligibility, there are 350-400 separate languages within Chinese (Mair 1991). According to Gong Xun, a Sichuan Mandarin speaker in Deyang, China, by my criteria of distinguishing between language and dialect, there would be 300-400 separate languages in Fujian alone.

So far, 2,500 dialects of the Chinese language have been identified, and a number of them are separate languages.

I have been doing research on this issue recently. Based on the criterion of mutual intelligibility, I have expanded the 14 Chinese languages into 365 separate languages.

There are different ways of drawing the line with mutual intelligibility. I decided to put it at 90%, with >90% being dialect and <90% being a separate language. This is based on what appears to be Ethnologue’s criteria for establishing the line between a dialect and a language.
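The 90% cutoff can be written down as a small rule. The following is only an illustrative sketch in Python: the function name `classify_pair` and the constant `DIALECT_THRESHOLD` are hypothetical labels, and a score of exactly 90%, which the rule leaves undefined, is arbitrarily treated here as a separate language.

```python
# Illustrative sketch of the 90% rule; all names here are hypothetical.
DIALECT_THRESHOLD = 90.0  # percent mutual intelligibility


def classify_pair(intelligibility_pct: float) -> str:
    """Return 'dialect' above the threshold, 'separate language' otherwise.

    The rule only defines >90% and <90%; treating exactly 90% as a
    separate language is an assumption made for this sketch.
    """
    if not 0.0 <= intelligibility_pct <= 100.0:
        raise ValueError("intelligibility must be a percentage from 0 to 100")
    if intelligibility_pct > DIALECT_THRESHOLD:
        return "dialect"
    return "separate language"


# Two figures quoted in this post:
print(classify_pair(65))  # Fuzhou vs. Fuqing
print(classify_pair(51))  # Xiamen vs. Teochew
```

Both pairs fall well under the cutoff, so the rule treats them as separate languages.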

In the cases below where I had intelligibility data available, a number of Chinese languages had no more than 65% intelligibility between them (Cheng 1991).

Intelligibility is hard to determine. I am not interested in typological studies of lects involving either lexicon, phonology or tones, unless this can be quantified in terms of intelligibility in a scientific way (see Cheng 1991). For the most part, what I am interested in is, “Can they understand each other?”

Reasonable, fair-minded and professional comments, additions, criticisms, elaborations, presentations of evidence, etc. are highly encouraged, as long as politics and emotions are left out of it. The purpose of the classification below is more to stimulate academic interest and sprout new thinking and theory. It is not intended to be an end-all or be-all statement on the subject; in fact, it is quite the opposite.

Interested scholars, observers or speakers of Chinese languages are encouraged to contribute any knowledge that they may have to add to or criticize this data below. So far as I know, this is the first real attempt to split Chinese beyond the 14 languages elucidated by Ethnologue.

There are lapses in the data below. I mean to present this data in outline form to make it more readable.

There are also problems with the data below. In many cases, “separate language” just means that the lect is not intelligible with Putonghua. Unfortunately, I currently lack intelligibility data within the major language groups such as Gan, Xiang, Wu and the branches of Mandarin. There is probably quite a bit of lumping still to be done below. Where lects are mutually intelligible below, I have tried to lump them into one language with various dialects.
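The lumping step amounts to grouping lects into one cluster wherever a pair is mutually intelligible. Here is a minimal sketch, assuming pairwise scores are available and using a simple union-find. The function name `lump` is hypothetical, and every score in the example is an invented placeholder, not data from this post.

```python
# Sketch: lump lects into one language when a pair scores above 90%.
# Function and variable names are hypothetical; all scores below are
# invented placeholders for illustration only.
from typing import Dict, List, Tuple


def lump(lects: List[str], scores: Dict[Tuple[str, str], float],
         threshold: float = 90.0) -> List[List[str]]:
    parent = {lect: lect for lect in lects}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), pct in scores.items():
        if pct > threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters: Dict[str, List[str]] = {}
    for lect in lects:
        clusters.setdefault(find(lect), []).append(lect)
    return sorted(sorted(group) for group in clusters.values())


lects = ["Chaoyang", "Jieyang", "Shantou", "Xiamen"]
scores = {
    ("Chaoyang", "Jieyang"): 95.0,  # placeholder
    ("Chaoyang", "Shantou"): 92.0,  # placeholder
    ("Chaoyang", "Xiamen"): 51.0,   # placeholder
}
print(lump(lects, scores))
```

One caveat with this design: union-find merges chains, so if A is intelligible with B and B with C, all three end up in one language even if A and C cannot understand each other. In a dialect continuum that may lump too much.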

It is reasonable to ask what background and expertise I have to write such a post. I have a Master’s degree in Linguistics and have been employed as a salaried linguist for a US Indian tribe. I also sit on a peer review board for a linguistics journal and will soon publish my first work in book form, a chapter in a forthcoming book on Turkic languages.

I assume it will be controversial. Keep in mind that this work is extremely tentative and should not be taken as the last word on the subject by a long shot. Some have charged that this study claims to be “accurate and precise.”

In truth, it claims nothing of the sort. Initial studies, which is what this is, are de facto never “accurate and precise,” and you can take an extreme argument from scientific philosophy that no science is really “accurate and precise” but is simply “correct for now” or “correct until proven otherwise.”

Gan is a separate language, already identified as such. Many individual Gan lects are unintelligible to other Gan lects. In fact, it is possible that all Gan lects are unintelligible with each other, but that remains to be proven.

Outside of Gan Proper, Leping, while very diverse, is nevertheless intelligible with nearby Gan lects and with Nanchang (Campbell January 2009).

Nanchang and Anyi are apparently separate languages within Gan based on a 200 word Swadesh test (Ben Hamed 2005). Nanchang has a great deal of dialectal diversity, with several dialects covering different cities and the rural areas. Intelligibility is not known.

Jiangyu, spoken in Hubei, is very strange and at least unintelligible to Putonghua speakers, as is Huarong (evidence). Huarong is surely a separate language.

Similarly, Wanzai must surely be a separate language, as must Yichun, Ji’an, Wanan, Fuzhou, Yingtan, Leiyang, Huaining and Dongkou.

Nanchang and Anyi are within the Changjing Group of Gan, which has 15 different lects. Yingtan and Leping are members of the Yingyi Group, which has 12 lects. Jiangyu and Huarong are members of the Datong Group of Gan, which has 13 lects. Yichun is a member of the Yiliu Group of Gan, which has 11 lects. Wanzai is a member of the Yiping Group of Gan, of which it is the only member.

Leiyang is a member of the Leizi Group of Gan, which has 5 lects. Wanan is a member of the Jilian Group of Gan, of which it is the only member. Ji’an is a member of the Jicha Group of Gan, which has 15 lects. Huaining is a member of the Huaiyue Group of Gan, which has 9 lects. Fuzhou is a member of the Fuguang Group of Gan, which has 15 lects. Dongkou is a member of the Dongsui Group of Gan, which has 5 lects.

Gan has 102 separate lects in it. There are 30 million speakers of the Gan languages.
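The total of 102 can be checked against the group sizes listed above. A quick sketch in Python; the dictionary simply restates the counts given in the text, with the two single-member groups counted as 1:

```python
# Lect counts per Gan group, as listed in the text above.
gan_groups = {
    "Changjing": 15, "Yingyi": 12, "Datong": 13, "Yiliu": 11,
    "Yiping": 1, "Leizi": 5, "Jilian": 1, "Jicha": 15,
    "Huaiyue": 9, "Fuguang": 15, "Dongsui": 5,
}
total = sum(gan_groups.values())
print(total)  # 102
```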

Within the Min group, Northern Min (Min Bei) and Central Min (Sanminghua) have already been identified as separate languages. There are 50 million speakers of all of the Min languages (Olson 1998). Northern Min has only 0-20% intelligibility with Min Nan.

Central Min has three lects, Shaxian, Sanming and Yongan, but we don’t know if there are languages among them. Central Min has 3.5 million speakers.

Northern Min is said to be a single language, although it has 9 separate lects. Most dialects are said to be mutually intelligible, but Jianyang and Jian’ou have only about 75% intelligibility. Northern Min has 10 million speakers.

The standard dialect of Min Dong or Eastern Min is Fuzhou.

Eastern Min has only 0-20% intelligibility with Min Nan.

Chengguan, Yangzhong and Zhongxian are separate languages, all spoken in Youxi County (Zheng 2008).

Beyond that, Eastern Min is reported to have several other mutually unintelligible languages. One of them is Fuqing, located near Fuzhou but, according to Wikipedia, not intelligible with it. Others say the two are mutually intelligible, and speakers are divided on the question.

It appears that Fuzhou speakers may understand Fuqing speakers better than the other way around. Fuzhou and Fuqing are about 65% intelligible in praxis, and it is about the same with the rest of the Houguan Group (Ngù 2009).

Ningde, Fuding and Nanping are probably other languages in this family (evidence). Of these three, Ningde is definitely a separate language. According to George Ngù, a passionate proponent of Fuzhou, “Fuzhou is not intelligible even within its many varieties.”

It’s not clear if that applies to all of Eastern Min, but it appears that it does. Therefore, Changle, Gutian, Lianjiang, Luoyuan, Minhou, Minqing, Pingnan, Pingtan, Yongtai, Fuan, Fuding, Shouning, Xiapu, Zherong and Zhouning are all separate languages.

There are two other lects lumped in with Eastern Min. Manjiang is spoken in the central part of Taishun County, and Manhua is spoken in the eastern part of Cangnan County. Both of these names mean “barbarian speech.”

Both are probably mixtures of Southern Wu (Wenzhou etc.), Eastern Min, Northern Min, and maybe even pre-Sinitic languages. Manhua and Manjiang are not intelligible with Fuzhou. However, Manjiang has affinity with Shouning in phonology, vocabulary and grammar. Whether or not it is intelligible with Shouning is not known. Min Nan speakers who have looked at Manjiang data say that it doesn’t even look like a Sinitic language.

Manhua is best dealt with as a form of Wu. I discuss it further below under Wu.

Fuding, Fuan, Shouning, Xiapu, Zherong and Zhouning are in the Funing Group of Eastern Min, which has 6 lects.

Fuzhou, Fuqing, Chengguan, Yangzhong, Zhongxian, Ningde, Changle, Gutian, Lianjiang, Luoyuan, Minhou, Minqing, Pingnan, Pingtan, Yongtai and Nanping are in the Houguan Group of Eastern Min, which has 16 lects.

Eastern Min contains 23 separate lects.

Within Min Nan, Xiamen and Teochew are separate languages (evidence). There is even a proposal to split Xiamen, Qiongwen and Teochew into three separate languages before SIL.

Amoy, Taiwanese, Jinjiang, Zhangzhou, Tainan, Taibei, Yilan, Taichung, Quanzhou and Lufeng are part of the Xiamen group.

Jinmen is apparently a separate language, as it has poor intelligibility with Taiwanese.

A much better name for Xiamen according to the Chinese literature is Quanzhang (Campbell January 2009).

Quanzhang is a combination of Quanzhou and Zhangzhou, two of the most important dialects in the language. Xiamen has only 51% intelligibility with Teochew. Whether or not Zhangzhou and Quanzhou are intelligible in China itself is still somewhat of an open question.

Nevertheless, Quanzhou speakers in Singapore can no longer understand Taiwanese or Xiamen well, though they have partial understanding of them. They have only 30-40% intelligibility with Yilan. Nevertheless, they have good understanding of Zhangzhou. This implies that much of the understanding between at least some of the Xiamen lects was due to bilingual learning.

The Yilan dialect on Taiwan is so different that it alone has posed serious problems for the task of standardizing Taiwanese Min Nan, yet it is intelligible with the rest of Taiwanese (Campbell January 2009). Lugang is also very different but is also intelligible with Taiwanese (Campbell 2009).

There are some communication problems for Tainan speakers hearing Taipei, but it appears that they are still intelligible with each other (Campbell January 2009).

Jieyang, Raoping, Chaoyang, Shantou (Swatow) and Hailok’hong (Haklau) are lects in the Teochew Group (evidence) of Teochew. Teochew (Chaozhou) is the prestige version of Teochew. Chaoyang speakers can understand Jieyang, Raoping (evidence) and Shantou, but intelligibility is difficult with Haifeng and Lufeng. Shantou, Raoping, and Jieyang are then dialects of Chaoyang.

Zhangzhou and Quanzhou have marginal intelligibility with Teochew varieties. They are both spoken in Taipei, Taiwan. After all, Taiwanese itself is just a mixture between Zhangzhou and Quanzhou. The situation in Taipei was interesting. The dialects of the city were a mix of Zhangzhou and Quanzhou. The dialect of the center of the city was mixed between the two, with a slight Quanzhou lean to it. In Sulim (Shilin), people spoke with a dialect that heavily favored Zhangzhou. Other districts spoke a Tang’oann-type dialect, which is just Quanzhou mixed with a bit of Zhangzhou.

All these conditions are more common with the older generation, because the younger generation either favors the mixed Zhangzhou-leaning “Southern” style favored in the media or does not speak the language at all. Hailok’hong (Haklau) is spoken down the coast between the Teochew zone and the Hong Kong area. It has marginal intelligibility with other Teochew lects. Nevertheless, Taiwanese speakers can no longer understand the pure Quanzhou spoken in the Chinese city of that name.

On the other hand, Chaoyang itself is unintelligible to some other Teochew lects. Shantou speakers cannot understand some of the other Teochew lects, and speakers of other lects often find Shantou hard to understand.

Sources report that Teochew lects can vary greatly in the pronunciation of even single words, and the tones can be quite different too.

There are claims that Teochew is intelligible with Zhangzhou and Quanzhou, but these claims appear to be incorrect (see above). That might make some sense, as the Teochew are a group of Min speakers who broke off from Zhangzhou Min about 600-1,100 years ago. They moved down to northeast Guangdong, and after hundreds of years, a heavy dose of Cantonese went in, producing modern Teochew.

[Image: Chinese language map]

Teochew has only 51% intelligibility with Xiamen.

Haifeng and Shanwei are members of the Luhai Teochew subgroup of Teochew, which differs markedly from Teochew and may be a separate language. Luhai is said to be halfway between Teochew and Zhangzhou. Luhai probably represents a later move from Zhangzhou towards northeast Guangdong by the same group that formed Teochew. This move may have occurred around 400 years ago.

Lufeng is said to have over 90% intelligibility with Xiamen, but if it is really halfway between, it should have 75% intelligibility. Intelligibility testing may be needed.
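The 75% figure appears to come from simple interpolation. As a sketch, assuming that “halfway” means the midpoint between Teochew’s 51% intelligibility with Xiamen and the 100% that a lect identical to Xiamen would score (an assumed reading, not something the sources state):

```python
teochew_vs_xiamen = 51.0  # percent, the figure given in the text
identical_lect = 100.0    # assumed upper bound for this sketch
midpoint = (teochew_vs_xiamen + identical_lect) / 2
print(midpoint)  # 75.5
```

That is well short of the reported figure of over 90%, which is why the two claims are hard to reconcile without testing.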

The Teochew spoken in Indochina, in particular in Vietnam and Cambodia (Indochinese Teochew), may be a separate language. Some Indochinese Teochew speakers who have returned to their family villages say they could only understand 70% of the speech there.

Furthermore, intelligibility is difficult between Malay Teochew and other Teochew, such as SE Asian Teochew and Teochew on the mainland. Malay Teochew is spoken in Malaysia, Singapore and Indonesia.

The Teochew variant spoken in Malaysia is composed of many highly variant lects. Whether or not they are mutually intelligible with each other is not known. The variety spoken in Medan, Indonesia is particularly interesting. It has heavy Malay and Cantonese influence and cannot be understood by other Teochew speakers. Teochew has 10 million speakers.

Zhangping, though close to Xiamen, is a separate language according to a 200 word Swadesh test (Ben Hamed 2005).

Sanjiang appears to be a separate language.

Datian, in Fujian, is also a separate language.

A version of Hokkien called Malay Hokkien is spoken in Malaysia and in Indonesia in Sumatra and Kalimantan. In Indonesia, it is spoken in the city of Medan, the state of Riau, the city of Bagansiapiapi on Sumatra and in a few places on Kalimantan, such as Kuching and especially in Brunei. Malay Hokkien is heavily laced with Teochew.

Northern Malay Hokkien is spoken along the coast from Taiping, formerly all the way to Phuket but now only as far as Pedang, in Malaysia. In Indonesia, it is spoken in the city of Medan, the state of Riau, the city of Bagansiapiapi on Sumatra and in a few places on Kalimantan, such as Kuching and especially in Brunei. Speakers of Northern Malay Hokkien have a hard time understanding the Southern Malay Hokkien (see Singapore Hokkien below) spoken in Kelang, Malacca and Singapore. Northern Malay Hokkien is creolized, with Malay and Thai embedded deeply in the language.

Southern Malay Hokkien is less creolized, if at all. Singapore Hokkien lies between Northern Malay Hokkien and Taiwanese on the continuum. A very pure variety of Hokkien is spoken in the Indonesian city of Bagansiapiapi. It has avoided the Mandarinization of Hokkien that is occurring elsewhere. They speak like the Hokkien speakers of Tang’oann (Tong’an), China.

Kelantan Hokkien is spoken in the Malay state of Kelantan. It is wildly creolized with Malay and is probably not intelligible with any other form of Hokkien.

The version of Hokkien spoken in the Philippines, often called Binamhue, Banlamhue or Minanhua (Philippines Hokkien) by speakers, derives from a dialect on the outskirts of Quanzhou, and it may have drifted into a separate language. At present, it is sometimes not intelligible with Quanzhou or Xiamen. That is, some Philippines Hokkien speakers claim that they can only understand about 70% of Taiwanese television.

The version of Min Nan, Singapore Hokkien (Southern Malay Hokkien), spoken in Singapore, Kelang and Malacca is similar to that spoken in Taiwan, but many Singapore Hokkien speakers have a hard time understanding Taiwanese Hokkien, while others can understand it just fine. Older Singapore Hokkien speakers can understand Taiwanese Hokkien better than younger ones. This is due to bilingual learning more than anything else because younger Singapore Hokkien speakers are no longer good at understanding other Min Nan dialects due to lack of exposure to them.

The reason that Taiwanese speakers seem to communicate well with Singapore Hokkien speakers is that they are using a simpler vocabulary. A Singapore Hokkien speaker, if immersed in Taiwan, could pick up Taiwanese fairly quickly, within say 3 months.

An umbrella term covering Malay Hokkien, Singapore Hokkien and Philippines Hokkien may be Nusantaran Hokkien.

Another language in the same group is best called Wan’an, comprising a number of dialects and possibly languages in Wan’an County of Fujian (Branner 2008).

Zhaoan, Pinghe and Yunxiao, also of Fujian, are separate languages.

Wan’an and Longyan are not mutually intelligible (Branner 2008). Longyan seems to have about 85% intelligibility with Taiwanese. Koongfu and Shizhong are apparently dialects of Longyan Min and are probably intelligible with it. Koongfu is spoken in Kanshi Township in Yongding County. Shizhong is spoken in southern Longyan County.

There are many varieties of Southern Min spoken in Western Fujian that may or may not be independent languages.

Liancheng Gutyan Junbao, Longyan Wan’an Wuzhai, Longyan Wan’an Songyang, Longyan Wan’an Tutuan, Longyan Baisha Youshui, Shiahtsuen Buhyun Liling, Shanghang Buhyun Liling, Liancheng Xuanhe Shengxing, Shanghang Gutian Laifang, Liancheng Xinquan Linguo, Liancheng Xinquan Lelian, Liancheng Pengkou Wangcheng, Liancheng Miaoqian Zhixi, Liancheng Gechuan Zhuyu, Liancheng Miaoqian Jiangshe, Liancheng Sibao Shangjian Zhenbian, Liancheng Juxi Gaoding, Liancheng Tangqian Dikeng, Liancheng Wencheng Hengming, Liancheng Xinquan Dongnancun, Liancheng Quxi Puxi Dongxiduan, Liancheng Quxi Qiaotou and Liancheng Liwu Nanban Zhangwu are spoken in Western Fujian. Shiahtsuen is spoken in Laiyuan Township in southeastern Liancheng County (Branner 2000).

Whether these lects are dialects or separate languages is difficult to say. Speakers of many of these lects don’t understand each other at first, but after they talk to each other for a while, they start to figure out the other lect (Branner 2008). Intelligibility testing needs to be done for these lects.

Quanzhou, Zhangzhou, Singapore Hokkien, Philippines Hokkien, Xiamen, Amoy, Yilan, Tainan, Taipei, Taichung, Taiwanese, Jinjiang, Lufeng, Lugang, Jinmen, Zhangping, Koongfu, Shizhong, Nanjing, Zhaoan, Pinghe, Yunxiao, Longyan, Wan’an, Liancheng Gutyan Junbao, Longyan Wan’an Wuzhai, Longyan Wan’an Songyang, Longyan Wan’an Tutuan, Longyan Baisha Youshui, Shiahtsuen, Shanghang Buhyun Liling, Liancheng Xuanhe Shengxing, Shanghang Gutian Laifang, Liancheng Xinquan Linguo, Liancheng Xinquan Lelian, Liancheng Pengkou Wangcheng, Liancheng Miaoqian Zhixi, Liancheng Gechuan Zhuyu, Liancheng Miaoqian Jiangshe, Liancheng Sibao Shangjian Zhenbian, Liancheng Juxi Gaoding, Liancheng Tangqian Dikeng, Liancheng Wencheng Hengming, Liancheng Xinquan Dongnancun, Liancheng Quxi Puxi Dongxiduan, Liancheng Quxi Qiaotou and Liancheng Liwu Nanban Zhangwu are all members of the Quanzhang Group of Min Nan, which has 50 lects.

Teochew, Shantou, Lufeng, Haifeng, Chaoyang, Jieyang, SE Asian Teochew and Malaysian Teochew are members of the Chaoshan Group of Min Nan, which has 12 lects.

Datian is in its own group in Min Nan.

Min Nan consists of 68 separate lects. Clearly, the dialectal relationships of Min Nan are confusing, as many of the lects are very closely related, if not fully intelligible. Intelligibility testing may be needed to sort out some of these issues. There are 30 million speakers of Southern Min.

Zhenan Min, spoken in Zhejiang Province around Pingyang and Cangnan and in the Zhoushan Islands, is a separate language. Zhenan Min contains 4 lects, Pingyang, Cangnan, Dongtou and Yuhuan, which may or may not be languages. Zhenan Min has 574,000 speakers. Zhenan Min is influenced by Eastern and Northern Min.

Qiongwen (Hainanese) is a separate language with 8 million speakers. It has the lowest intelligibility with the rest of Southern Min of any Min Nan lect. Qiongwen itself has 14 separate lects, all spoken on Hainan. Whether or not any of them are separate languages is not known.

Longyan (Branner 2008) is a separate language, apart from Southern Min. It is spoken in Longyan City’s Xinluo District and Zhangping City and has 740,000 speakers. It has heavy Hakka influence due to the large number of Hakka speakers in the surrounding areas.

Another split in Min is Leizhou. Leizhou Min is a separate language and is now recognized by some as a separate branch of Min altogether, along the lines of Southern and Northern Min. Leizhou consists of 7 different lects. Haikang appears to be a dialect of Leizhou.

However, at least some of the other 6 Leizhou lects are very different in phonology and lexicon. Intelligibility data is not known, but they may be mutually intelligible. Leizhou Min, with 4 million speakers, has low intelligibility with Min Nan lects and has only 50% intelligibility with Hainanese.

Shaojiang Min, or Min Gan, is said to be a completely separate high-level division of the Min language like Leizhou Min. It has four lects – Shaowu, Guangze, Jiangle and Shunchang – that are said to be mutually intelligible. There are subdialects within these larger lects. The substratum of Shaojiang is not Min, Gan or Hakka – instead, it is the ancient Baiyue language.

Puxian Min has already been identified as a separate language. Puxian has 3 separate lects. There are minor differences between these lects.

However, there is a form of Puxian Min spoken in Singapore, Hinghwa, that presently lacks full intelligibility with Puxian Min proper. Puxian speakers are a minority in Singapore, and their language has mixed a lot with Singapore Hokkien, Malay, English and other languages spoken in Singapore, resulting in a separate language.

A Min language called Longdu, located in Guangdong, is not only a separate language (evidence here and here) but seems to be in another Min category from Southern Min. It is spoken in the southwest corner of Zhongshan City in Shaxi and Dayong.

In Guangdong Province, there are other divergent lects of Min Nan. Two of these, Nanlang (also spoken in Zhongshan) and Sanxiang, are also separate languages. Nanlang is spoken 10 miles southeast of Zhongshan in Cuiheng. It is also spoken in Nanlang and Zhangjiabian. Sanxiang is spoken to the south of Zhongshan in the hilly rural areas.

In Chinese, Longdu, Nanlang and Sanxiang are referred to as All-Lung, South Gourd and Three Rural, respectively. Sources give Longdu and Nanlang 100,000 speakers and Sanxiang 30,000 speakers. 14% of the population of Zhongshan speaks Min. Nanlang now has mostly elderly speakers.

All of these seem to be in the same group, Zhongshan Min, and all are spoken in the Pearl River Delta near Hong Kong. Zhongshan Min has 150,000 speakers.

This group is possibly a Northern or Eastern Min group stranded way down in Guangdong. They are sometimes referred to in old literature as “Northeastern Min”. That’s not really a category. It often means Northern Min, but sometimes it means Eastern Min. These languages have all borrowed extensively from the type of Cantonese spoken in the Pearl River Delta.

Looking at the whole picture, it appears that various immigrants speaking Puxian Min, Northern Min and Southern Min all settled around Zhongshan. These various Min elements, along with a hefty dose of Cantonese, have gone into the creation of Zhongshan Min.

Sanxiang, Nanlang and Longdu are apparently not mutually intelligible, although Nanlang is close to Longdu. Sanxiang is more divergent. Further, there are more dialects within these three languages, and dialectal divergence is considerable, with possible communication difficulties among them.

Sanxiang has at least two dialects, Phao and Tiopou. Phao is fairly uniform across a number of villages, but Tiopou is quite different. Nevertheless, there is near-full intelligibility between Phao and Tiopou. For now, we will just list Sanxiang, Nanlang and Longdu as separate languages, with possible dialects Phao and Tiopou (Sanxiang); Nanlang A and Nanlang B; and Longdu A and Longdu B, among them.

A very strange lect is spoken by the She people in Zhejiang, Fujian and Guangdong. The She language was originally Hmong-Mien, then added a Cantonese layer, then a Hakka layer, next a Min layer, and in Zhejiang, a Wu layer. It is best described as a Hmong-Mien language that has been Sinicized. There are probably 200,000 speakers of this language.

There is also an original She language that is non-Sinitic (Hmong-Mien) and is spoken by only about 1,000 people in Guangdong.

In Eastern Guangdong, the She speak the Chaoshan She language. They live in the Phoenix Mountains in Chao’an County in Chaozhou prefecture. It has had heavy contact with the Chaoshan (Teochew) Min group. This is probably a separate language, unintelligible with other She languages and also with Chaoshan Min.

Within Hakka, besides Hakka Proper (Meixian), Tingzhou is a separate language (evidence). Wuhua Hakka is intelligible with Meixian.

Fangcheng and Dabu are close to Meixian, but intelligibility data is lacking. Fangcheng has five different lects within it, but intelligibility data is not known. Hong Kong Hakka is not intelligible with the Hakka spoken on Taiwan, nor with Dabu.

Speakers of Dongguan, spoken near Hong Kong, can understand Meixian, but Meixian speakers cannot understand Dongguan.
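Cases like this one show that intelligibility can run in one direction only, so a pair should count as mutually intelligible only if both directions clear the bar. A small sketch, assuming directional scores are available; the names are hypothetical and the numbers are invented placeholders, since no figures are given for Dongguan and Meixian:

```python
# Hypothetical directional scores: (listener, speaker) -> percent understood.
understands = {
    ("Dongguan", "Meixian"): 95.0,  # placeholder
    ("Meixian", "Dongguan"): 40.0,  # placeholder
}


def mutually_intelligible(a: str, b: str, threshold: float = 90.0) -> bool:
    """True only if each side understands the other above the threshold."""
    return min(understands[(a, b)], understands[(b, a)]) > threshold


print(mutually_intelligible("Dongguan", "Meixian"))  # False
```

Taking the lower of the two directions is one possible convention; averaging them would be another, but it could hide exactly the kind of asymmetry described here.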

Taipu or Taipo is spoken in the village of the same name in Hong Kong and is not intelligible with Meixian, nor is Wakia, also spoken in Hong Kong.

A variety of Hakka spoken in a part of Hong Kong called Shataukok is called variously Satdiugok, Sathewkok, Shataukok or Satdiukok. It is said to be different from other Hakka, and evidence indicates that Shataukok may indeed be a separate language. Shataukok has dialects within it, and they are different, but they are generally mutually intelligible.

All three of these are dialects of a more or less intelligible language called Hong Kong Hakka.

Located near Hong Kong, Shenzhen/Bao’an is a separate language.

Haifeng and Lufeng, located near each other in Guangdong, appear to be dialects of a separate language called Hailufeng.

Longchuan in northeastern Guangdong is a separate language (evidence), with poor intelligibility with other Hakka lects. Longchuan has four different dialects, Huangbu, Sidu, Chetian and Tuocheng. Sidu and Tuocheng are close and are probably dialects of Longchuan. Sidu Longchuan has 18,000 speakers.

Boluo and Heyuan are separate languages, not mutually intelligible.

Longchuan, Boluo and Heyuan are quite distant from other Hakka. Heyuan is spoken in central Guangdong.

Huizhou is mutually intelligible with Longchuan and also with Meixian and Dabu.

Sanxiang, spoken in Zhongshan prefecture, is different from all other Hakka, but intelligibility data is lacking.

It is possible that in northern Guangdong, there may be many different Hakka languages, since dialects tend to differ from village to village, and in many cases, communication is difficult.

The Hakka spoken in Kuching, Sarawak, in Malaysia, known as Ho Po Hak, is a separate language.

It is very different from the Hakka spoken in Sabah, Malaysia, and it is similar to Hopo, spoken in Hopo, near Meizhou. Hopo is not intelligible with Dabu, Hailu or Meixian. Hopo appears to be a dialect of Jiaoling. Hopo has deep influence from Teochew Min, because it is located right next to the Teochew area.

The Gannan Group (or Ninglong Group) from Southern Jiangxi, Mingxi from Western Fujian, and the Yuemin Group from Southern Fujian and Southeastern Guangdong are separate languages.

In the Gannan Group are multiple lects. One of them is Xingguo, spoken in Xingguo County in Ganzhou Prefecture (evidence).

The Gannan Group is extremely diverse compared to the Hakka of Guangdong and Fujian. Gannan lects differ even from village to village.

With Gannan Hakka, we may be dealing with a situation of many different languages, as with Wu, Hui, Tuhua and Xiang. In fact, it is quite possible that with Jiangxi Hakka, we may be dealing with every Hakka lect being a separate language, but that remains to be proven.

In Fujian Province, there is the wildly diverse Tingzhou Hakka Group mentioned above. Even within this group, there are separate languages, including Yongding, Liancheng, Changting, Xinquan, Qingliu, Mingxi, Ninghua and Shanghang (evidence). Gucheng is probably also a member of Tingzhou.

Sources say that each Hakka village in Fujian speaks its own lect, and that the lects are far enough apart to make communication from village to village very difficult.

Therefore, in addition to the above, we add Wuping, Longyan, Zhaoan, Yunxiao, Shangsixiang, Fuding, Fuan, Gucheng and Nanjing Qujiang as separate languages.

Luoyuan She Hakka is spoken in Fujian. It is an extremely diverse form of Hakka that differs from all other Hakka. It must surely be a separate language.

Chengdu is spoken in Chengdu, Sichuan. It is quite different from other forms of Hakka and has poor intelligibility with other forms.

On Taiwan, the Miaoli (Four Counties), Dongshi (Dapu) and Xinzhu (Hailu) lects are not mutually intelligible, nor is the mixed Gaoxiong lect created in order that these three lects could communicate with each other.

Kunbei (Zhaoan) is very different and must be a separate language. Raoping may well be a separate language, but intelligibility data is lacking. In general, speakers of other kinds of Hakka find Taiwan Hakka to be hard to understand, possibly due to Southern Min influence.

Bangka Island Indonesian Hakka, spoken on Bangka Island in Indonesia, has diverged so radically with its tones that it is now a separate language. That is, speakers of other Indonesian Hakka lects say that they cannot understand Bangka Island speakers. It’s actually said to be a Hakka creole more than anything else.

In Indonesia, two other Hakka languages are spoken, Kun Dian Indonesian Hakka, spoken in Borneo, and Belitung (Ngion Voi) Indonesian Hakka. Kun Dian Hakka is the largest Hakka group in Indonesia. Most live at Pontianak and Singkawang, where they speak two different mutually intelligible lects, but they have spread all over Indonesia. Kun Dian Hakka is a dialect of Meixian.

Belitung Hakka is spoken mostly on Sumatra and Borneo and is characterized by a soft way of speaking. Belitung Hakka and Bangka Hakka speakers say they cannot understand Kun Dian Hakka, but Kun Dian speakers say they can understand the other two for the most part. East Timor Hakka is a dialect of Meixian.

Jiexi is spoken in southeast Guangdong. Dayu is spoken in southern Guangxi. Liannan is spoken in northwest Guangdong. Dongguan Qingxi is spoken in south-central Guangdong. Wengyuan is spoken in northern Guangdong. Ningdu is spoken in Jiangxi. Mengshan Xihe is spoken in eastern Guangxi. Hong Kong Hakka is spoken in Hong Kong.

Zhaoan Xiuzhuan is spoken in southern Fujian.

Shanghang Pengxin, Basel Mission and Shanghang Guanzhuang Shangzhuo are spoken in West Fujian (Branner 2000).

Dayu, spoken in Jiangxi, is a separate language, not intelligible at least to Central, or Meixian, Hakka speakers.

Meixian, Wuhua and Bao’an are members of the Yuetai Group of Hakka, which has 23 lects. Within Yuetai, Wuhua and Dabu are members of the Xinghua subgroup, which has 5 lects. Xinghua has 3.4 million speakers. Bao’an and Lufeng are in the Xinhui subgroup of Yuetai, which has 9 lects. Xinhui has 2.4 million speakers.

Gaoxiong, Xinzhu, Dongshi and Miaoli are members of the Jiaying Group of Hakka, which has 7 lects.

Tingzhou, Yongding, Liancheng, Changting, Xinquan, Shanghang, Basel Mission, Shanghang Pengxin, Wuping, Ninghua, Qingliu and Mingxi are all part of the diverse Tingzhou Group of Hakka. All told, Tingzhou has 12 lects, all of which are separate languages.

Longchuan, Boluo and Heyuan are members of the Yuezhong Group of Hakka, which has 5 lects.

Huizhou is in its own subgroup of Hakka.

Xingguo and Ningdu are in the Ninglong Group of Hakka, which has 13 lects. This group is said to be very diverse, with lects differing from village to village.

Liannan and Wengyuan are members of the Yuebei Group of Hakka, which has 11 lects and must surely be a separate language.

Dayu is a member of the Yugui Group of Hakka, which has 43 lects.

Ho Po Hak, Bangka Island, Nanjing Qujiang, Jiexi, Dayu, Hong Kong, Mengshan Xihe, Zhaoan Xiuzhuan, Fuan, Fuding and Haifeng are unclassified.

There are 12 major Hakka lects and 210 Hakka lects altogether. Others claim that there are over 1000 Hakka lects spoken in China. There are 30 million speakers of the various Hakka languages. The dialect situation with Hakka, as with Min Nan, is quite confused and somewhat contradictory. Intelligibility testing could clear up some of the confusion. Some speakers report adequate intelligibility between lects, while others report difficulty.

Putonghua is Standard Mandarin, based on the Beijing dialect as of 1949, but it has since diverged wildly and many Putonghua speakers today cannot understand Beijing. Putonghua is being promoted as the national language of China. In addition to Putonghua, there are 1,500 other dialects of Mandarin spoken in China. In general, other Mandarin dialects are not intelligible to Putonghua speakers (Campbell April 2009).

However, the Northeastern dialects and the dialects around Beijing may be more intelligible than the Mandarin dialects in the rest of the country. The implication is that there may be as many as 1,500 Mandarin languages in China. However, many of these Mandarin dialects are intelligible with at least some other Mandarin dialects. Hence, despite the lack of intelligibility with Putonghua, there is a lot of potential lumping within Mandarin.

The degree to which Mandarin dialects are intelligible to each other is very much an open question and in general is poorly investigated.

Within Mandarin, besides Putonghua, the main branch, Jinan (New Jinan), Beijing and Tianjin (evidence and here) are not intelligible with Putonghua. Tianjin may be intelligible with Beijing; on the other hand, Tianjin is looking more and more like a separate language.

For one thing, Tianjin’s tones are quite different from Putonghua’s, and its tone sandhi is much more complicated. It is also more closely related to lects 150-500 miles away, since Tianjin speakers originally came from Anhui (Lee 2002). Some reports say that Tianjin is intelligible with Putonghua, so intelligibility testing may be needed.

Jinan is not intelligible with Putonghua, but may be learned over a period of weeks to possibly months, as it is close enough. Jinan is only 65% intelligible with Beijing.

Since Beijing, Tianjin, Nanjing City, Hebei and all of NE Mandarin may be intelligible, I am just going to make a language called Northeast Mandarin and call Beijing, Tianjin, Hebei and Nanjing City dialects of NE Mandarin for now. Beijing has low intelligibility with other branches of Mandarin: 72% intelligible with Southwest Mandarin, 64% intelligible with Jilu Mandarin and Zhongyuan Mandarin and 55% intelligible with Jiaoliao Mandarin.
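The 90% cutoff used in this post can be applied mechanically to the Beijing figures just quoted. The following is only a minimal sketch. The percentages are the ones given above; the function name `classify` and the dictionary are hypothetical names chosen for illustration. The post does not say how a score of exactly 90% should be treated, so counting it as a separate language here is an assumption.

```python
# Sketch: apply the 90% rule to the Beijing figures quoted above.
# THRESHOLD follows the criterion stated at the start of this post.
THRESHOLD = 90

beijing_intelligibility = {
    "Southwest Mandarin": 72,
    "Jilu Mandarin": 64,
    "Zhongyuan Mandarin": 64,
    "Jiaoliao Mandarin": 55,
}

def classify(percent, threshold=THRESHOLD):
    """Return 'dialect' above the threshold, else 'separate language'.

    Assumption: exactly 90% is treated as a separate language.
    """
    return "dialect" if percent > threshold else "separate language"

for branch, pct in beijing_intelligibility.items():
    print(f"Beijing vs. {branch}: {pct}% -> {classify(pct)}")
```

By this rule, every branch listed comes out as a separate language from Beijing, since none of the figures comes close to 90%.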

However, many Putonghua speakers claim that Beijinghua is not inherently intelligible with Putonghua. Complaints about unintelligible taxi drivers in Beijing are legendary. At the very least, competing views of the intelligibility of Beijinghua and Putonghua deserve investigation.

On the other hand, Beijinghua may be intelligible with Hebei and Nanjing City. I think that Hebei is clearly a dialect of Beijing. The lect of Beijing’s hutongs and taxi drivers is legendary for being hard to understand. It would be interesting to see whether Tianjin and Hebei speakers can understand it. Tianjin may be a separate language, since it is not intelligible with Beijinghua.

What probably happened was that Beijinghua and Putonghua have taken separate trajectories. This has also occurred in Italian, where, though Standard Italian was based on Tuscan, Standard Italian and Tuscan have taken separate trajectories since. It is said that if you see old Tuscan men on TV in Italy, a speaker of Standard Italian from southern Italy would need subtitles to understand them, but one from northern Italy would not.

Others say that Putonghua was based on the language of the Beijing suburbs, not the city itself.

For whatever reason, Beijinghua often seems to have less than 90% intelligibility with Putonghua, though the question needs further research. Beijinghua, in its pure and least mutually intelligible form, seems to be spoken mostly in the innermost hutongs and among taxi drivers and other low income and working class people. The lect of people with more education and money is probably a lot more comprehensible.

I would describe the real, pure, Putonghua as “CCTV speech”, the lect you hear on Chinese state television. Evidence that Beijinghua lacks full intelligibility with Putonghua is here, here, here, here, here, here, here and here.

The question of whether or not Beijinghua is a separate language from Putonghua is sure to be highly controversial. Perhaps intelligibility testing could settle the question.

Beijing is in a group all of its own called the Beijing Group. It contains 43 separate lects, and may contain more than one language.

We should also note here that even Putonghua, the language that was meant to tie the nation together, seems to be evolving into regional languages.

Guangdong Putonghua is not fully intelligible to speakers of the Putonghuas of Northern China and hence is probably a separate language.

There are also varieties of Putonghua that are spoken in Singapore and Taiwan. Taiwanese Mandarin is about 80-85% intelligible with Putonghua and is a separate language. Claims that Taiwan Mandarin is fully intelligible with Putonghua are incorrect.

Shanghai Putonghua is often not intelligible with Putonghua from other regions. It has heavy interference from Shanghaihua, which seriously affects the Putonghua accent. Even after four years of exposure to it, Standard Putonghua speakers often have problems with it.

In addition, Jianghuai Mandarin Putonghua and Zhengcao Mandarin Putonghua are not intelligible with Putonghua from other areas (Campbell April 2009). These varieties of Mandarin cause a particular interference with Putonghua Mandarin that results in a severe dialectal disturbance in their Putonghua.

These Putonghuas are spoken in the regions native to the Jianghuai and Zhengcao dialects of Mandarin. Jianghuai is spoken in Anhui, Jiangsu, Hubei and to a much lesser extent Zhejiang Provinces. Zhengcao is spoken in Anhui, Henan, Shandong and Jiangsu, with one dialect spoken in Hebei.

Although it is different, Singapore Putonghua is still intelligible with Putonghua. Malay Mandarin is said to be quite different but nevertheless intelligible. However, Malay Mandarin speakers say they have to adjust their speech with Chinese speakers; otherwise, their speech is poorly intelligible. This implies that Malay Mandarin is indeed a separate language.

Yunnan Putonghua is intelligible with Putonghua from other regions (Campbell January 2009).

Cangzhou, spoken in southeastern Hebei, is a separate language. It is only partly intelligible with Putonghua. Renqiu, Huanghua, Hejian, Cangxian, Qingxian, Xianxian, Dongguang, Haixing, Yanshan, Suning, Nanpi, Wuqiao and Mengcun, all spoken in Cangzhou prefecture, are all dialects of Cangzhou.

Cangzhou shares some similarities with Tianjin, but it is only partly intelligible with it.

Jinan is a member of the Liaotai Group of the larger Jilu Group, which has 37 lects.

The Baotang Group of Jilu has 52 lects. Tianjin forms its own subgroup within Baotang. Cangzhou, Renqiu, Huanghua, Hejian, Cangxian, Qingxian, Xianxian, Dongguang, Haixing, Yanshan, Suning, Nanpi, Wuqiao, and Mengcun are members of the Huangle subgroup of Baotang, which has 25 lects.

Jilu itself consists of 170 lects.

Taiwanese Mandarin, while different from Putonghua, is intelligible with it. Singapore Mandarin has fewer differences than Taiwanese. Both are dialects of Putonghua.

Luoyang, Kaifeng, Changyuan and Zhengzhou, all in Henan Province, are not intelligible with Putonghua. However, all four are mutually intelligible with each other, so they are dialects of a single language, Henan Mandarin.

Xinyang, also spoken in Henan, is a separate language and cannot be understood by Luoyang speakers.

Nanyang has high but not complete intelligibility with Luoyang. After a few weeks of close contact, Luoyang speakers can understand Nanyang, but initially, comprehension is poor due to different tones. Nanyang has 15 million speakers.

Luoyang and Gushi are unintelligible with Putonghua. In addition, Gushi is different from Nanyang and may not be intelligible with it. Intelligibility between Xinyang, Gushi and Nanyang is not known. In general, intelligibility between many lects in Henan is not good, but after a week or two of close contact, they can start to understand each other.

In Shaanxi, Yanan, Xian, Huxian (evidence), Zhouzhi (evidence), and Hanzhong are not intelligible with Putonghua. Let us call this language Shaanxi Mandarin. Xi’an, for instance, is about 65% intelligible with other Mandarin groups.

Xining, spoken in Qinghai, seems to be very different from other Shaanxi lects, and is probably a separate language altogether (evidence here and here).

In Gansu Province, Tongwei is not intelligible with Putonghua, and Gansu Mandarin seems to be very different from other forms of Mandarin. Gansu Mandarin appears to be a separate language.

However, within Gansu, there are divergent lects, such as Sale, which are unintelligible with other Gansu lects.

Bozhou (evidence), Yingshang (evidence), and Fuyang (evidence), spoken in Anhui, are at least unintelligible with Putonghua. Fuyang is very different. The lect spoken 300 km south of Jinan, around Mengcheng in rural Anhui, is said to be completely unintelligible with Putonghua, Tianjin and Beijinghua. For the time being, we will refer to this as one language, Anhui Mandarin. Intelligibility between lects of Anhui Mandarin is not known.

Anhui Mandarin Putonghua has poor intelligibility with Standard Putonghua due to its phonology. Therefore, it is a separate language.

Xian, Huxian and Zhouzhi are members of the Guanzhong Group of Zhongyuan, which has 45 lects.

Yanan, Hanzhong and Xining are members of the Qinlong Group of Zhongyuan, which has 67 lects.

Luoyang is a member of the Luoxu Group of Zhongyuan, which has 28 lects.

Kaifeng, Nanyang, Zhengzhou, Changyuan, and Bozhou are members of the Zhengcao Group of Zhongyuan. The Zhengcao Group has 93 lects.

Xinyang and Gushi are in the Xinbeng subgroup of Zhongyuan, which has 20 lects.

Tongwei and Sale are part of the Longzhong Group of Zhongyuan, which has 25 lects.

Yingshang is a member of the Cailu Group of Zhongyuan, which has 30 lects.

The Mandarin spoken in Qinghai is very different from that spoken in Gansu, but it’s not known if it is a separate language. Both are usually classified as types of Zhongyuan Mandarin.

Zhongyuan has a shocking 388 lects. Zhongyuan Mandarin is not fully intelligible with Putonghua. Zhongyuan Mandarin has 130 million speakers (Olson 1998).

Yichang (evidence), Longchang (evidence), Chengdu, Chongqing (evidence), Guilin, Nanping (spoken near Mt. Wuyi; evidence), Longcheng (evidence), Luocheng (evidence), Luzhou (evidence here and here), Lingui (evidence), Jiuzhaigou (evidence), Xindu, Wenshan (evidence), Mianzhu (evidence here and here), Yangshuo (evidence), Wuhan (evidence) and Leshan (evidence) are all unintelligible with Putonghua.

Guilin is not intelligible with general Southwest Mandarin speech. Wenshan at least is not intelligible with other Southwestern varieties (Johnson 2010).

Chengdu is part of a Sichuan Mandarin koine that is spoken in many of the larger cities in Yunnan. It includes Kunming, Bazhong, Dazhou, Neijiang, Zigong, Yibin, Luzhou, Chengdu, Mianyang, Deyang and Guiyang and is broadly intelligible (Xun 2009). Ziyang is intelligible with the koine but has a heavy accent (Xun 2009). Leshan is unintelligible with the koine, but it can be learned in a few weeks of exposure (Xun 2009).

Dali is also not intelligible with Putonghua, but that is because Tibetan Mandarin has heavy Tibetan admixture.

Chongqing speakers cannot understand Chengdu or Luzhou speakers. The many small lects around Mt. Emei are not intelligible with Chengdu, appear to be very different, and may be one or more separate languages.

Wuhan is not intelligible to speakers of Southwest Mandarin from other provinces. For instance, it is only 80% intelligible with Chengdu. The intelligibility of Wuhan and Yichang is not known.

Dahua, spoken in and around Dahua village on the Puduhe River near Dongchuan in Yunnan Province, is apparently a separate language.

Lanping may be a separate language. Kunming is not intelligible with Tuoyuan, so Tuoyuan may be a separate language also. The language spoken in Kunming is part of the Sichuan Mandarin koine that includes Kunming, Bazhong, Dazhou, Neijiang, Zigong, Yibin, Luzhou, Chengdu, Mianyang, Deyang and Guiyang.

Chuanlan is a little-known language spoken by the Tunbao people of Guangxi Province.

Yingshan is a separate language based on a 200 word Swadesh test (Ben Hamed 2005).

Menghai (evidence) may well be a completely separate language. The mutual intelligibility of Menghai, Guiyang and Kunming is not known. Guiyang is at least not intelligible with Putonghua. Guiyang is evolving into the Sichuan Mandarin koine, which is broadly intelligible with Kunming, Bazhong, Dazhou, Neijiang, Zigong, Yibin, Luzhou, Chengdu, Mianyang and Deyang.

Shaoshan, apparently Mao Zedong’s lect, spoken in Hunan Province, is a separate language. It was said that although Mao had a secretary who could understand him well, not many others could.

Another language spoken in Hunan, in Zhangjiajie County, is called Zhangjiajie Maoxi. The Maoxi are a tribal group there that speak a strange variety of Mandarin.

Tuoyuan in Hunan is not fully intelligible with other Southwest Mandarin lects, or at least not with Kunming.

Junhua, or military language, is a language spoken by an ethnic group on Hainan in the city of Zonghe. It is said to be “Old Mandarin” and is probably not intelligible with other lects. It is a form of Southwest Mandarin known as the Junhua Group, which contains 4 lects.

Guilin, Luocheng, Yangshuo, Liuzhou and Lingui are members of the Guiliu Group of Southwest Mandarin, which has 57 lects. Guiliu Southwest Mandarin is at least not comprehensible with Putonghua or Chengyu Southwest Mandarin.

Leshan and Longchang are members of the Guanchi Group of Southwest Mandarin, which has 85 lects. Within Guanchi, Longchang is a member of the Renfu Group, which has 13 lects.

Yichang, Chengdu, Chongqing and Yingshan are members of the Chengyu Group of Southwest Mandarin, which has 113 lects. Chengyu Southwest Mandarin is not comprehensible with Putonghua or Guiliu Southwest Mandarin.

Menghai, Kunming, Wenshan and Guiyang are members of the Kungui Group of Southwest Mandarin. The Kungui Group itself has an incredible 95 lects.

Lanping is in the Dianxi Group of Southwest Mandarin, which has 36 lects. Within Dianxi, it is a member of the Baolu subgroup, which has 21 lects.

Taoyuan is in the Changhe Group of Southwest Mandarin, which has 14 lects.

Wuhan is a member of the Wutian Group of Southwest Mandarin, which has 9 lects.

Dali is a member of the Dianxi Group of Mandarin, which has 36 members. Within Dianxi, Dali is a member of the Yaoli Group, which has 15 members.

Nanping, Chuanlan, Shaoshan, Jiuzhaigou, Zhangjiajie Maoxi and Dahua are unclassified.

Southwest Mandarin itself has a stunning 519 lects and is not fully intelligible with Putonghua. There are 240 million speakers of Southwest Mandarin (Olson 1998).

Jianghuai Mandarin is a separate language.

Yangzhou is considered to be a separate language by a 200 word Swadesh test (Ben Hamed 2005). Yangzhou has about 52% intelligibility with the other branches of Mandarin.

Nanjing (evidence and here) is also a separate language – now mostly spoken in the suburbs, as city speech is not a separate language anymore. The city language is said to be intelligible with the general northeastern China lect spoken in Beijing and Hebei.

So I will call Nanjing Suburbs a separate language.

Lianyungang is a separate language, as are Yancheng and Huaian (evidence for both).

Nantong, a very strange variety of Mandarin on the border of Wu and Mandarin that shares many features with Wu languages, is a separate language, as is its sister language, Tongdong. Jinsha is a dialect of Nantong.

Rugao, next to Nantong, is also a separate language.

Also within Jianghuai, Hefei is considered to be a separate language by a 200 word Swadesh list (Ben Hamed 2005).

Rudong is at least not intelligible with Putonghua.

Anqing, in Anhui Province, is also not intelligible with Putonghua.

In 1933, there were three different languages spoken in Tongcheng, Anhui – East Tongcheng, West Tongcheng and Tongcheng Wenli. Tongcheng Wenli was the classical-based language spoken by the educated elite of the city. Whether these three languages still exist is not known, but surely some of the speakers in 1933 are still alive.

Chuzhou, spoken in Anhui, is not intelligible with Putonghua, although it is said to be close to Nanjing. Dangtu, also spoken in Anhui, is not intelligible with Putonghua.

Dongtai is a separate language (evidence).

The lects spoken in Dafeng, Taizhou, Xinghua and Haian are said to be similar to Dongtai, so for the time being, we will list them as dialects of Dongtai.

Jiujiang, spoken in Jiangxi Province, is a separate language, as is Xingzi, located close by.

Intelligibility between Rudong, Anqing, Chuzhou, Dafeng, Taizhou, Xinghua, Haian and Dangtu is not known.

Yangzhou, Lianyungang, Yancheng, Huaian, Nanjing, Hefei, Anqing, the Tongchengs, Chuzhou, and Dangtu are in the Hongchao Group of Jianghuai, which has 82 lects.

Dongtai, Dafeng, Taizhou, Haian, Xinghua, Jinsha, Nantong, Tongdong, Rudong, and Rugao are in the Tairu Group of Jianghuai. Tairu has 11 different lects.

Jiujiang and probably Xingzi are members of the Huangxiao Group of Jianghuai, which has 20 lects.

Jianghuai is composed of an incredible 120 lects and is not fully intelligible with Putonghua. Some suggest that all of the lects of Jianghuai are mutually unintelligible, but that remains to be proven. Jianghuai Mandarin has 65 million speakers (Olson 1998).

Northeastern (Dongbei) Mandarin is a separate language. Within Northeast, Shenyang is a separate language according to a 200 word Swadesh list (Ben Hamed 2005). Harbin is often listed as intelligible with Putonghua, but some Putonghua speakers can barely understand a word of it. Harbin may be a separate language. That classification is sure to be controversial, so intelligibility testing may be required to sort it out.

Shenyang is a member of the Jishen Group of Northeastern Mandarin, which has 44 dialects. Within Jishen, Shenyang is a member of the Tongxi Group, which has 24 dialects.

Harbin is a member of the Hafu Group of Northeastern Mandarin, which has 64 lects. Within Hafu, it is a member of the Zhaofu Group, which has 18 lects.

Lanyin Mandarin in the far northwest is also a separate language (Campbell 2004). Though Lanyin is said to be intelligible with Putonghua, that does not appear to be the case. Minqin (evidence) and Lanzhou (evidence) in Gansu are not fully intelligible with Putonghua, nor is Yinchuan (evidence) in Ningxia.

Intelligibility within Lanyin is not known, but Jiuquan at least appears to be a completely separate language inside Lanyin.

Jiuquan is a member of the Hexi Group of Lanyin, which has 18 lects.

Yinchuan is a member of the Yinwu Group of Lanyin, which has 12 lects.

Lanzhou is a member of the Jincheng Group of Lanyin, which has 4 lects.

Lanyin is composed of 57 separate lects. Lanyin Mandarin has 9 million speakers (Olson 1998).

The Jiaoliao Mandarin spoken in Shandong contains lects such as Qingdao (evidence here and here) and Weihai (evidence) which are not fully intelligible with Putonghua. Dalian is quite different from Putonghua. Intelligibility between Qingdao, Weihai and Dalian is not known.

Weihai and Dalian are members of the Denglian Group of Jiaoliao, which has 23 lects.

Qingdao is a member of the Qingzhou Group of Jiaoliao, which has 16 lects.

Jiaoliao is composed of 45 lects. Jiaoliao is not fully intelligible with Putonghua. Intelligibility inside of Jiaoliao is not known, but there may be multiple languages inside of it, because some Shandong Peninsula lects sound very strange even to speakers used to hearing Shandong Mandarin.

Karamay is an unclassified Mandarin language spoken in Xinjiang.

The Mandarin spoken around Tiantai in Zhejiang is not intelligible with Putonghua and may be a separate language. It is also unclassified.

Mandarin has 873 million speakers. There are an incredible 1,526 lects of Mandarin.

Although it is related to Mandarin, Jin is a completely separate language. Besides the main Jin branch, Baotou is apparently a separate language (evidence), as possibly is Taiyuan (evidence).

Within Hohhot Jin, there are two separate languages.

One is Hohhot Xincheng Jin, a combination of Hebei Jin, Northeastern Mandarin and the Manchu language.

The other is Jiucheng Hohhot Jin, spoken by the Muslim Hui minority in the city. It is related to other forms of Jin in Shanxi Province.

Yuci is a separate language from Taiyuan on a 200 word Swadesh test (Ben Hamed 2005).

Fenyang, the language used in Chinese director Jia Zhangke’s movie Xiao Shan Going Home, is not intelligible with Putonghua.

Jingbian, in Shanxi, is a separate language.

Yulin is also a separate language.

Hohhot is a member of the Zhanghu Group of Jin, which has 29 lects.

Baotou and Yulin are members of the Dabao Group of Jin, which has 29 lects.

Taiyuan and Yuci are members of the Bingzhou Group of Jin, which has 16 lects.

Fenyang is a member of the Luliang Group of Jin, which has 17 lects.

Jingbian is a member of the Wutai Group of Jin, which has 30 lects.

Jin is composed of 171 lects, and some of them are separate languages. Jin has 48 million speakers (Olson 1998).

Besides Xiang Proper, assuming there even is such a thing, Shuangfeng and Changsha are separate languages, having only 47% intelligibility.

In fact, Changsha itself is divided into multiple languages in the city itself. We do not know how many there are, but we know that they exist. For the moment, we shall just add one lect to Changsha, and divide it into Changsha A and Changsha B, but there may be more. Furthermore, there are significant differences within the Changsha spoken in Changsha City and in the surrounding countryside.

Shuangfeng is also very different within itself, as the vocabulary changes every 10 miles or so. Intelligibility data is lacking.

Mao Zedong spoke Xiangtan, a notoriously difficult Xiang language in Hunan, about which it is said, “No one can understand it.” Xiangtan itself is internally diverse, with differences between the dialect of the city and rural areas, but intelligibility data is lacking.

Hengyang is apparently a separate language, as is Jishou (evidence). There is significant dialectal diversity in Hengyang, but intelligibility data is lacking.

Liuyang is a separate language, actually a macrolanguage, spoken in Liuyang county-level city in Changsha prefecture in Hunan. Liuyang is split into 5 divisions – Liuyang North, Liuyang South, Liuyang West, Liuyang East and Liuyang City.

Liuyang South and Liuyang East are separate languages, mutually unintelligible with the others. Liuyang City has recently arisen as a sort of a Liuyang “Putonghua” that is understandable to speakers of all Liuyang lects. So within Liuyang, we have three dialects – Liuyang City, Liuyang North and Liuyang West. Outside of Liuyang Proper, there are also two separate languages – Liuyang South and Liuyang East. None of the three Liuyang languages is intelligible with Changsha.
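The Liuyang breakdown above can be tallied as a simple mapping from each of the five divisions to the language it is assigned to. This is only a sketch of the bookkeeping. The assignments follow the paragraph above, while the dictionary layout and the label "Liuyang Proper" as a key are illustrative choices.

```python
# Sketch: the five Liuyang divisions and the language each falls under,
# following the classification described in the paragraph above.
liuyang_lects = {
    "Liuyang City": "Liuyang Proper",
    "Liuyang North": "Liuyang Proper",
    "Liuyang West": "Liuyang Proper",
    "Liuyang South": "Liuyang South",
    "Liuyang East": "Liuyang East",
}

# Count the distinct languages the five lects collapse into.
languages = sorted(set(liuyang_lects.values()))
print(len(liuyang_lects), "lects,", len(languages), "languages")
print(languages)
```

Five lects collapse into three languages, which matches the count of three Liuyang languages given above.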

Even within this classification, each of the 5 Liuyang lects has multiple dialects. Each village is said to have its own lect in Liuyang.

Hengshan (evidence) is a separate language with vast dialectal divergence divided by Mount Hengshan.

There are two Xiang Hengshan lects on either side of the mountain – Qianshan and Houshan – that are very different and must be separate languages. Huayuan (evidence) is at least not intelligible with Putonghua.

In the city of Yiyang, Henan Province, 3 lects are spoken. One is a Yiyang Changyi Xiang lect, another is a Yiyang Luoshao Xiang lect, and a third is Luoyang Southwest Mandarin, a dialect of Henan Mandarin, described above. All appear to be separate languages.

We will call the two Xiang lects Yiyang Changyi and Yiyang Luoshao.

Baojing at least is not intelligible with Putonghua, yet it is said to be intelligible with Chengdu Southwest Mandarin.

Lingshuijiang, also spoken in Hunan by 300,000 people, may well be a separate language.

Ningxiang is said to be very different from Changsha. Given the dramatic divergence present even as background in Xiang, this must mean that Ningxiang is at least not intelligible with Changsha.

According to good sources, there is a tremendous amount of lect diversity in Western Hunan, and most of it probably involves Xiang lects, while most or all of these lects are not mutually intelligible. But until we get more data, we cannot carve any languages out of this mess yet.

Shuangfeng and Lingshuijiang are members of the Luoshao Group of Xiang, which has 21 lects.

The Changshas, Hengyang, Xiangtan, Hengshan, Ningxiang and the Liuyangs are members of the Changyi Group of Xiang, which has 32 lects.

Baojing, Jishou and Huayuan are members of the Jixu Group of Xiang, which has 8 lects.

Xiang is composed of 74 lects. Many, or possibly all, of them are separate languages. The various languages of Xiang have 50 million speakers.

Wu is a major group of diverse Chinese languages that is often divided into Northern Wu and Southern Wu. Northern Wu and Southern Wu are definitely mutually unintelligible languages. Southern Wu has 18 million speakers. In general, the list below just lists Wu lects that are utterly unintelligible with Putonghua. My opinion is that in general, the Wu lects are mostly separate languages; however, some are merely dialects of other Wu lects.

A good general rule for Zhejiang lects is that people say they can sort of understand the next city over, but two cities away is incomprehensible. For instance, in the Taizhou prefecture region, there are 4-5 unintelligible dialects across a 12 mile area. In Zhejiang, the mountains go all the way down to the sea, so there are few flat areas where language can spread out and become comprehensible.

Suzhou, Shanghaiese, Wuxi (evidence), Huzhou (evidence), Changzhou (evidence), Xiaoshan (evidence), Songjiang (evidence), Jiaxing, Hangzhou (evidence), Kunshan (evidence), Ningbo and Yixing (evidence) are separate languages.

Tongxiang also appears to be a separate language, as do Yuyao (evidence) and Zhoushan.

Qidong, spoken in the city of Qidong, is a separate language.

Lvsi, Qisi or Tongdong, spoken in the nearby town of Qisi, is a separate language from Qidong. Qidong is said to be very close to Chongming, so for the time being, we will list Chongming as a dialect of Qidong.

Haimen also appears to be a dialect of Qidong. However, there are 2 lects spoken in Haimen, and they are apparently not mutually intelligible. We will leave Haimen A as a dialect of Qidong, while we will set Haimen B as a separate language as it is not intelligible with Haimen A.

There are differences between Chongming and Haimen A, but the degree of them is not known. Changyinsha is very similar to Haimen, Chongming and Qidong, so it is probably a dialect of Qidong also. Another name for Qidong is Qihai, which refers to the speech of Qidong, Haimen and Tongzhou. For the time being, we will list Haimen A, Changyinsha and Chongming as dialects of Qidong. Chongming, and hence Qidong, is not intelligible with Shanghaiese.

Zhangjiagang, Changshu and Kunshan may be intelligible with Suzhou, but data is lacking. Suzhou is only 43% intelligible with Wenzhou. None of these lects is intelligible with Shanghaiese.

Ningbo has good intelligibility with Shanghaiese, but not vice versa.

Reports vary on the intelligibility of Shanghaiese and Suzhou. Some say they understand each other well, but that is probably not the case at first due to serious differences in tones. Intelligibility testing is needed.

Pudong, the older form of the Shanghai language, is still spoken in the Pudong District of the city, but it is dying out. There is a question of whether or not it is mutually intelligible with Shanghaiese, but Shanghaiese speakers seem to feel it is not mutually intelligible (Gilliland 2006).

Several lects are spoken in the suburbs of Shanghai. Reports vary, but Shanghai residents generally report that these lects are not mutually intelligible with Shanghaiese (Gilliland 2006).

They are Baoshan, Fengxian, Nanhui, Jiading, Jinshan, Pudong (or Chuansha) and Qingpu.

Hangzhou is reportedly much different from the lects of Shanghaiese, Ningbo, etc. to the northeast, and is not intelligible with Shanghaiese, nor with Suzhou. Hangzhou has 1.2 million speakers.

Changzhou and Wuxi are not intelligible with Shanghaiese or Suzhou. Changzhou and Wuxi have high, but not full, intelligibility. Changzhou and Wuxi are part of a dialect chain in which eastern Changzhou speakers can communicate with western Wuxi speakers, but as one moves further east into Wuxi or west into Changzhou, intelligibility drops off. As with Czech and Slovak, it is best to split Wuxi and Changzhou into separate languages.
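A dialect chain like this one means that "can communicate with" does not carry over from one link to the next. The sketch below illustrates the idea only. The four points along the chain and the "at most one step apart" rule are simplifying assumptions, not measured data, and the names `chain` and `can_communicate` are hypothetical.

```python
# Sketch of a dialect chain. The four points and the one-step rule are
# simplifying assumptions for illustration, not intelligibility data.
chain = ["West Changzhou", "East Changzhou", "West Wuxi", "East Wuxi"]

def can_communicate(a, b, max_steps=1):
    """Assume speakers understand each other only if at most max_steps apart."""
    return abs(chain.index(a) - chain.index(b)) <= max_steps

# Neighbors across the Changzhou/Wuxi border understand each other...
print(can_communicate("East Changzhou", "West Wuxi"))  # True
# ...but the two ends of the chain do not.
print(can_communicate("West Changzhou", "East Wuxi"))  # False
```

Because every neighboring pair passes the test while the ends fail it, there is no natural place to draw the line, which is why the split has to be made by convention.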

Changzhou itself has considerable dialectal divergence, though apparently all dialects are intelligible. Changzhou has 3 million speakers.

Yixing, near Changzhou, is not intelligible with Shanghaiese.

Jiangyin is spoken in Jiangyin city. It is related to Changzhou and has high intelligibility with Changzhou and Wuxi.

All of the above are in the Taihu Group.

Taizhou, centered around the city of Taizhou in Eastern Zhejiang, is composed of 11 separate lects, all of which are separate languages: Huangyan (evidence), Jiaojiang, Linhai, Sanmen, Tiantai (evidence), Wenling (evidence), Ninghai (evidence), Xianju, Leqing (evidence), Yubei and Yuhuan (evidence). (Evidence for all).

A single subgroup of Wuzhou, Yiwu, contains 18 separate languages, all mutually unintelligible. We will call them Yiwu A, Yiwu B, Yiwu C, Yiwu D, Yiwu E, Yiwu F, Yiwu G, Yiwu H, Yiwu I, Yiwu J, Yiwu K, Yiwu L, Yiwu M, Yiwu N, Yiwu O, Yiwu P, Yiwu Q and Yiwu R for the time being.

Pucheng is a separate language. Pucheng has 2 dialects, Nampo and North Dabei. Intelligibility between them is not known. Pucheng is so diverse that some say it is a language isolate and is not even a part of Wu (Norman 1988).

There are two groups of Southern Wu which are said to be both highly divergent and to have very low intelligibility internally. These groups are sometimes called Jinqu and Shangli.

Jinqu consists of at least 30 languages: Jinhua, Jinhua Xiaohuang, Tangxi, Lanxi, Pujiang, Yiwus A-R, Dongyang, Pan’an, Yongkang (evidence), Wuyi (evidence), Quzhou (evidence), Longyou and Jinyun. Lanxi has 660,000 speakers (Rickard 2006). Quzhou is apparently not intelligible with Wenzhou. Jinqu is roughly equivalent to the Wuzhou Group.

Shangli contains at least 18 languages: Shangrao City, Shangrao County, Guangfeng, Yushan, Kaihua, Changshan, Jiangshan, Lishui (evidence), Suichang, Songyang, Xuanping, Qingtian (evidence here and here), Yunhe, Jingning, Longquan, Qingyuan, Taishun and Pucheng.

This group is roughly equivalent to the Longqu and Chuzhou Groups of Chuqu. Some members of this group extend beyond Zhejiang and into northeastern Jiangxi and northern Fujian.

We are going to cautiously classify all of these lects as separate languages since they are said to be much more divergent and much less mutually intelligible than Taihu, and Taihu itself seems to have pretty low internal intelligibility.

Wenzhou (evidence) is a separate language.

Ouhai, Yongjia and Ruian appear to be dialects of Wenzhou, but all of them are probably separate languages, since if you go 5 miles in any direction in Wenzhou, there’s a new dialect, and it’s hard to understand people.

Wenzhou is 43% intelligible with Suzhou.

Wencheng (evidence) appears to be a separate language.

Wenxi is a separate language within Oujiang, not intelligible with Wenzhou. It is spoken in one town in Qingtian County.

Jinxiang also has its own Wu lect, with Mandarin influences. This is a Taihu (Northern Wu) outlier.

In addition, in Taishun County, there is also an aberrant Wu lect spoken in the town of Luoyang, influenced by both Manjiang and Oujiang Wu.

There is another Wu lect similar to Manjiang Eastern Min spoken in the town of Hedi in Qingyuan County in Lishui.

Manhua is quite different. There is a controversy over whether or not Manhua is Macro-Min or Macro-Wu. It is probably Macro-Wu based on phonology and it also shares some similar Min-like traits with other Wu lects such as those in the Chuqu group.

Within Manhua, there is a northern group spoken in the town of Yishan and a southern group spoken in the towns of Qianku and Jinxiang. Qianku is the standard for Manhua. The northern/southern divide may impede intelligibility, but we have no information yet.

Wuhu is a separate language, unintelligible with Shanghaihua.

Nanjing Wu is a separate language.

Jiaxing, Shanghaiese, Suzhou, Wuxi, Songjiang, Tongxiang, Qidong, Lvsi, Yunhe and Kunshan are all in the Hujia Group of Taihu. The Hujia Group contains 32 lects.

Changzhou, Yixing, Jiangyin and Haimen are in the Piling Group of Taihu. Piling has 12 lects. Piling has 8 million speakers.

Wenzhou, Ouhai, Yongjia, Ruian and Wencheng are in the Oujiang Group of Taihu, which also contains 12 lects.

Hangzhou has its own group, the Hangzhou Group of Taihu.

Shaoxing, Fuyang, Xiaoshan, Linan, Yuyao and Zhuji are in the Linshao Group of Taihu which also contains 12 lects.

Fenghua and Zhoushan are in the Yongjiang Group of Taihu. The Yongjiang Group contains 11 lects and has 4 million speakers.

Changxing is in the Tiaoxi Group of Taihu, which has 5 lects.

The Taihu Group is composed of 75 separate lects, many or all of which are separate languages. Taihu has 47 million speakers.

Lishui, Qingyuan, Jingning, Jinyun and Taishun are in the Chuzhou group of Chuqu, which contains 9 lects. Chuzhou has 1.5 million speakers. Chuqu itself contains 35 separate lects.

Pucheng, Shangrao County, Shangrao City, Jiangshan, Songyang, Guangfeng, Longquan, Kaihua, Changshan, Suichang, Longyou, Yushan and Quzhou are members of the Longqu Group of Chuqu, which has 14 lects and 5 million speakers (Olson 1998).

The Yiwu languages, Dongyang, Jinhua, Jinhua Xiaohuang, Lanxi, Tangxi, Wuyi, Pan’an, Pujiang and Yongkang are all members of the Wuzhou Group, which contains 27 separate languages. Wuzhou has 4 million speakers.

Nanjing Wu is unclassified.

The various Wu languages have 85 million speakers.

Within Hui, there are at least six separate languages (Hirata 1998). Actually, there are many more.

Xidi, spoken in a village at the foot of Huangshan Mountain, is a separate language. Xidi is unintelligible even to villages a few miles away.

Tunxi, Wuyuan and Xiuning are separate languages. Tunxi and Xiuning are spoken in Anhui, but Wuyuan is spoken in Jiangxi Province.

Within the Jingzhan Group of Hui, Jingde, Ningguo, Qimen, Chilingkou (spoken in Chiling, Qimen County), Meixi Xiang and Shitai are separate languages.

Within Qimen County itself, there are 6 different Hui lects, with low intelligibility between them. It is quite possible that we are talking about 6 different languages here. One of them appears to be Chilingkou above. The others we will just call Qimen A, Qimen B, Qimen C, Qimen D and Qimen E. All except Meixi are spoken in Anhui Province. Meixi is spoken in Meixi, Jiangxi.

Jixi, Hongmen and Shexian are separate languages.

Within Shexian, there are two different languages that we will only call Shexian A and Shexian B for now. Jixi and the Shexian languages are spoken in Anhui.

Dexing and Dongzhi are separate languages, the first spoken in Jiangxi and the second spoken in Anhui.

In the Yanzhou Group of Hui, Jiande and Chunan are separate languages.

There are two other lects in the group, Suian and Shouchang. Chunan and Suian are very diverse and are in all probability separate languages. Shouchang is also extremely diverse, and Jiande has some differences with Shouchang.

The Yanzhou languages are interesting because there is controversy whether they are Wu or Hui languages. Careful examination reveals that they cannot be subsumed under Southern Wu due to their great divergence, despite having some similarities with Wu. Some authors feel that they are Hui-Wu merged lects, and their similarity with both is given as a reason for merging Wu and Hui into a supergroup.

While it is best to classify them as Hui, they are much different from most Hui lects. All are spoken in western Zhejiang. The Yanzhou Group has four languages. Discussion here.

Huangshan, Tunxi, Wuyuan and Xiuning are members of the Xiuyi Group of Hui, which has 6 lects.

Meixi, the Qimens, Chilingkou, Shitai, Ningguo and Jingde are members of the Jingzhan Group of Hui. Jingzhan has 12 lects, all of which are separate languages.

Jixi, Hongmen and the Shexians are members of the Jishe Group of Hui. The Jishe Group has 6 lects.

Dexing and Dongzhi are members of the Qide Group of Hui. The Qide Group has 5 lects.

Xidi is unclassified.

The various Hui languages have 3.2 million speakers. There are 34 different Hui lects, at least 24 of which are separate languages. There is a possibility that all Hui lects are separate languages, but that remains to be proven.

Cantonese is a major language spoken in the south of China. Its speakers are said to be a mix between the Yue people and the Han. They have great pride in their speech, which appears to be closer to ancient Chinese than Mandarin is. When Sun Yat-Sen was President of Republican China, a vote was held on which language to base Standard Chinese on. Cantonese lost to Mandarin by only one vote.

Some Cantonese activists denounce Mandarin as a pidgin language spoken by Manchu and Mongol invaders who glommed onto the Chinese of the people they conquered.

Attempts to determine intelligibility through the use of complex lexical, tonal, grammatical and phonological formulae produce results that are excessively high in terms of percentage of intelligibility. A better method is presented in Szeto 2000, in which sentences in other lects are played to speakers of Lect A, and speakers of Lect A are asked to give the basic meaning of the sentences played to them. A sentence is recorded as correct if the basic meaning was ascertained.

By this better method, Standard Cantonese has only 31.3% intelligibility of Siyi, 7.2% of Hakka, 2.7% of Teochew and 2.5% of Xiamen. This paper also highlights the very important role morphological and syntactic differences play in intelligibility, even apart from phonology and other factors.
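The sentence-level scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure from Szeto 2000: the function name and the sample judgments are hypothetical, and each judgment is simply True if the listener got the basic meaning of a sentence and False otherwise.

```python
def intelligibility_score(judgments):
    """Return the percentage of test sentences whose basic meaning
    the listener ascertained.

    `judgments` is a list of booleans, one per sentence played.
    (Hypothetical helper for illustration only.)
    """
    if not judgments:
        raise ValueError("need at least one judged sentence")
    return 100.0 * sum(judgments) / len(judgments)


# Made-up example: a listener gets 3 of 10 sentences right.
sample = [True] * 3 + [False] * 7
print(f"{intelligibility_score(sample):.1f}% intelligibility")
```

A real test would average such scores over many listeners and many sentences per lect.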

In contrast, the more complex method not relying on actual informants gives false positives. By this method, Cantonese has 54.7% intelligibility of Hakka, 47.4% of Xiamen and 43.5% of Teochew. This method overestimates the intelligibility of Hakka by a factor of 7.6, of Teochew by a factor of 16.1 and of Xiamen by a factor of 19.
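The overestimation factors are simply the formula-based figure divided by the informant-based figure for each lect. A quick check in Python, using only the percentages quoted in this post (the variable names are my own):

```python
# Informant-based intelligibility of each lect for Standard Cantonese
# speakers, in percent (figures quoted above from Szeto 2000).
informant = {"Hakka": 7.2, "Teochew": 2.7, "Xiamen": 2.5}

# Formula-based figures for the same lects, in percent.
formula = {"Hakka": 54.7, "Teochew": 43.5, "Xiamen": 47.4}

# Overestimation factor = formula-based figure / informant-based figure.
ratios = {lect: round(formula[lect] / informant[lect], 1) for lect in informant}

for lect, factor in ratios.items():
    print(f"{lect}: overestimated by a factor of {factor}")
```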

Cantonese is traditionally said to have nine tones, but phonemically, there are only six tones, since the last three are just three of the first six with a voiceless stop consonant on the end. These are often called entering tones in traditional Chinese scholarship.

Entering tones have disappeared from most Mandarin lects, probably about 800 years ago due to the influence of invading Mongols speaking Turkic languages, but are still present in Cantonese, Hakka and Min. The original entering tones of Middle Chinese have merged into one or another of Mandarin’s four tones.

Traditional Chinese tones or contour tones end in a vowel or a nasal. However, in Cantonese, the entering tone has retained its original short and sharp character from Middle Chinese, so in a sense, it has a different sound quality.

Besides Standard Cantonese (the Guangzhou lect in the Yuehai Group), there is Siyi, or Sze Yup, a separate language. Siyi has 8 dialects. However, there are reports of intelligibility problems within the Siyi lects.

In particular, Enping speakers cannot understand some other dialects. Therefore, Enping is a separate language.

Kaiping, or Chikan, is not fully intelligible with Enping until speakers get used to each other’s sounds. Kaiping is so different from Taishan that it is hard to imagine how they can communicate well, though there is partial intelligibility.

In Xinhui, there is a dialect called Hetang that is very divergent and has many strange features not found in other dialects. Doubtless it is less than fully intelligible with other Siyi lects.

Actually, there seem to be many more than 8 dialects of Siyi. In Taishan County alone, there are 20 townships, and there may be a different lect in each one. For certain, there are at least three distinct dialects of Taishan: Taishan A, Taishan B and Taishan C. Even the lects in Taishan County can be quite different. However, all lects in Taishan County appear to be mutually intelligible.

Xinhui is somewhat different from Taishan, but appears to be intelligible. Heshan is said to be intelligible with Xinhui and Taishan.

Nevertheless, there are calls from Taishan speakers to split their lect off from the rest of Siyi. If Taishanese is unintelligible with the rest of Siyi, this would make sense, but that does not appear to be the case.

150 years ago, there was less, but still significant, difference between Siyi and Sanyi (Standard Cantonese), but Siyi was disparaged as a “hill dialect” of poor farmers, while Sanyi was elevated as the prestige lect of the cultured and cosmopolitan. This is why Sanyi became the Standard Cantonese lect. Siyi speakers incorporated this negative view into their self-image, even to the point where they held their overseas meetings in Sanyi speech.

There are 3.6 million speakers of Siyi.

Vietnamese Cantonese is quite different from Standard Cantonese, but it is nevertheless intelligible with it. Malay Cantonese is also quite different from Standard Cantonese. Intelligibility data between Malay Cantonese and Standard Cantonese is not known. Both are dialects of Cantonese.

Hong Kong is a dialect of Guangzhou. Foshan and Nanhai are close to Guangzhou and may be intelligible with it. Nanhai and Shunde are mutually intelligible.

Some say that Shunde and Zhongshan are intelligible with Standard Cantonese, but others disagree. This requires further study, as they are obviously close. At the same time, however, both are said to be quite different from Standard Cantonese.

Even within Yuehai, Panyu is said to be a separate language (Chan 1981).

Namlong, a poorly understood lect from the Pearl River area, is also a separate language, or at least it was one in 1949. Whether it still exists is not certain, but speakers must still be alive. Yuehai itself has 31 separate lects.

Danija, the Cantonese lect of the Tanka fisherpeople who live on boats off the coast of Guangdong, Guangxi and Hainan, may well be a separate language.

In Hong Kong, another Cantonese language, Gashiau, is spoken by a group of fisherpeople related to the Danija. This language is related to Danija but apparently not intelligible with it.

Maihua, a Cantonese lect spoken on Hainan, may well be a separate language also.

Nanning is a dialect of Cantonese, easily understandable by a Standard Cantonese speaker.

However, Lizhou is a separate language, with difficult intelligibility with Standard Cantonese.

Dongguan and Zhanjiang (evidence), are separate languages.

Shiqi, spoken in Guangdong, is a separate language. Speakers of Standard Cantonese cannot necessarily understand Shiqi, but Shiqi people can understand Guangzhou. Shiqi is spoken in the urban part of Zhongshan City.

Huazhou is a very divergent Cantonese lect that is very hard even for other Cantonese speakers to understand. It is surely a separate language (evidence here and here).

Maoming is an extremely diverse Cantonese lect that must also be a separate language.

Beihai and Hepu are reported to be very different, but intelligibility data is not known, nor is it known to what extent these two lects differ from other Cantonese.

But the Qinlian Group of which they are members must surely be a separate language.

One division holds that the Standard Cantonese (Guangzhou), Siyi, Zhongshan, Gaoyang and Guangfu groups are mutually unintelligible.

The Goulou Group of Cantonese appears to be a separate language from all of the rest of Cantonese, and is probably in a group of its own away from the rest of Cantonese, and linked with Pinghua and Tuhua. Yulin is a representative lect in Goulou, and is said to be the present form of Chinese that is closest to Old Chinese.

Siyi has at least 11 dialects, including the famous Taishanese (Taishan A, Taishan B and Taishan C), along with Heshan, Jiangmen, Siqian, Doumen, Xinhui, Enping and Kaiping.

Nanning is in the Yongxun Group of Cantonese, which has 12 lects.

Zhanjiang and Maoming are members of the Gaoyang Group of Cantonese, which has 10 lects. Gaoyang has 5.4 million speakers.

Dongguan, Shunde, Foshan, Zhongshan, Nanhai, Panyu and Hong Kong are members of the Guangfu Group of Cantonese, which has 31 lects. Guangfu has 13 million speakers.

Shiqi is a member of the Zhongshan Group of Cantonese, which contains at least 3 lects.

Huazhou is a member of the Wuhua Group of Cantonese, which has 2 lects.

Beihai and Hepu are members of the Qinlian Group of Cantonese, which has 6 lects.

Namlong is unclassified.

There are 100 lects of Cantonese, and Cantonese has 64 million speakers.

Pinghua, now recognized as a major split off from Cantonese, is composed of Guinan and Guibei, which are separate languages. The Guibei lects are very different, but we don’t have any intelligibility data.

Guinan has 22 lects, and Guibei has 8 lects.

There is one Pinghua lect that is unclassified.

Pinghua has 31 separate lects. Ping has 2 million speakers.

Tuhua is a separate branch of Chinese spoken in Guangdong and Hunan Provinces. It has 26 separate lects.

In addition to Tuhua Proper, the best known of the Tuhua lects is Shaozhou, referred to here as Shaozhou Proper. Shaozhou is said to be very different from other Chinese lects. Shaozhou itself consists of many different lects which are often strikingly different from the others. Some say that Shaozhou is a branch of Min Nan, while others say it is related to Hakka.

In Lechang prefecture, there are five separate languages, which are not fully intelligible with each other: Lechang Tuhua 1, Lechang Tuhua 2, Lechang Tuhua 3, Lechang Tuhua 4 and Lechang Tuhua 5.

Additionally, many Tuhua lects are starting to splinter recently as influences from Hakka, Cantonese and Southwest Mandarin begin to affect the younger speakers such that the language of the youngest speakers is quite a bit different from the language of the older speakers.

One of the Shaozhou Tuhua lects, Longgui Tuhua, spoken in Qujiang County in Guangdong, is a separate language. Longgui Tuhua has 2,000 speakers.

Actually, Tuhua is not really a language group, but a wastebasket group for various lects derisively referred to as “tuhua” – or “farmer’s language.”

Xianghua, said to be an unclassified Chinese lect, is actually a branch of Tuhua that contains 6 lects of its own. Xianghua is a completely separate and highly diverse language that is spoken in Western Hunan.

Jiahe Tuhua is a completely separate language, unintelligible with other lects. Furthermore, there are huge dialectal differences within Jiahe Tuhua that may or may not constitute separate languages.

Jiangyong Tuhua is divided into two mutually unintelligible languages, North Jiangyong Tuhua and South Jiangyong Tuhua (Liming 2004). It is spoken in the rural areas of Jiangyong County in Hunan Province. There are multiple lects within these two languages, which have considerable distance between them.

A subdialect of North Jiangyong Tuhua, the suburban or “upper street language” dialect, was the basis for the famous nüshu, or “women’s script”, a secret language of women originating from the Shangjiangxu (Xiao River) region of northeastern Jiangyong County in Hunan Province, of which much has been written lately.

Also in Hunan, in Guiyang County, another Tuhua language is spoken – Guiyang Tuhua. This is apparently a separate language, and the northern and southern variants are so divergent that they are separate languages also – Northern Guiyang Tuhua and Southern Guiyang Tuhua. In addition, there are a lot of diverse dialects within the two Guiyang Tuhua languages, but intelligibility data is lacking.

Yantang Tuhua, one of these dialects, may well be a separate language, as may Yangshi Tuhua. Jiangyong and Guiyang are in the Tuhua branch of Tuhua. Yantang and Yangshi are unclassified.

Furthermore, initial examination suggests a number of things.

First of all, the Tuhua lects, especially those of Southern Hunan, are very diverse, possibly as diverse as Wu, Xiang and Hui. Many or all of them may well be separate languages. Further, they are poorly studied and dialectally very diverse. There are many dialects inside the known Tuhua lects, and these dialects are often very different. So there appear to be languages inside even the known Tuhua lects.

Further, there appear to be links with the Tuhua lects of Southern Hunan, the Tuhua lects of northern Guangdong and the Ping lects of northern Guangxi, which border each other. They all appear to be related, and to have descended from a common ancestor.

Danzhou is a separate language. Danzhou is spoken in the northwest of Hainan, and Hainanese speakers cannot understand it. It is either related to the language spoken by the Lingao or is the same language. Yet the Danzhou people speak 9 different lects, including lects described as Hakka, others described as Cantonese and others described as Mandarin.

Maojiahua is a form of Chinese spoken by 20,000 Hmong in the southwest of Hunan, in the northeast of Guangxi and in some areas of Hubei. It is a separate language already recognized by Ethnologue, which incorrectly lumps it in with the Hmong languages.

Linghua is an unclassified Chinese lect spoken in Yongzhou in Hunan. Linghua is a separate language. It is apparently the same as the Yongzhou Tuhua dialect.

However, the Yongzhou Tuhua language has 17 different dialects: Yongzhou Tuhua A, Yongzhou Tuhua B, Yongzhou Tuhua C, Yongzhou Tuhua D, Lanshan Tushi Tuhua, Lanjiaoshan Tuhua, Xintian Southern Rural Tuhua, Xintian Northern Rural Tuhua, Ningyuan Zhangjia Tuhua, Ningyuan Pinghua, Lanshan Shangdong Tuhua, Lanshan Taiping Tuhua, Daoxian Xianglinpu Tuhua, Daoxian Xiaojia Tuhua, Shuangpai Lijiaping Tuhua and Jianghua Baimangying Tuhua.

Of these, Lanshan Tushi Tuhua may well be a separate language.

Intelligibility between lects is not known, but dialectal divergence within Tuhua lects is typically great, and some or all of the above may be separate languages.

Pingde Yahua or Kim Mun, incorrectly classed as an unclassified Chinese lect, is actually one of the Mien languages. It is not a Sinitic language.

Wutun, or Wutunhua, is a Chinese-Mongolian-Tibetan mixed language spoken by 2,000 Tu in Qinghai Province. Whether it is a form of Chinese is controversial. Until it is proven to be Sinitic, we will not list it here.


Ben Hamed, Mahé. 2005. Neighbour-nets Portray the Chinese Dialect Continuum and the Linguistic Legacy of China’s Demic History. Proc. R. Soc. B 272:1015–1022.

Bodman, Nicholas C. 1988. Two Divergent Southern Min Dialects of the Sanxiang District, Zhongshan, Guangdong. BIHP 59 (2): 401-423.

Branner, David. 2000. Problems in Comparative Chinese Dialectology. The Classification of Miin and Hakka. Berlin: Walter de Gruyter.

Branner, David. 2008. Personal communication.

Campbell, Hilary. 2004. Chinese Grammar – Synchronic and Diachronic Perspectives. Oxford, UK: Oxford University Press.

Campbell, James Michael. Putonghua and Taiwanese Min Nan speaker. Taipei, Taiwan. January 2009. Personal communication.

Campbell, James Michael. Putonghua and Taiwanese Min Nan speaker. Taipei, Taiwan. April 2009. Personal communication.

曹志耘 (Cao, Zhiyun). 2002. 南部吴语语音研究 (Southern Wu Phonology Research). Beijing: Commercial Press (In Chinese).

Chan, Marjorie K.M., Lee, Douglas W. 1981. Chinatown Chinese: A Linguistic and Historical Re-evaluation. Amerasia Journal, Volume 8, Number 1.

Cheng, Chin-Chuan. 1997. Measuring Relationship Among Dialects: DOC and Related Resources. Computational Linguistics & Chinese Language Processing 2.1:41-72.

Cheng, Chin-Chuan. 1998. Extra-Linguistic Data for Understanding Dialect Mutual Intelligibility. Taipei, Taiwan: Paper delivered at the 1998 Annual Conference of the Pacific Neighborhood Consortium.

Gilliland, Joshua. 2006. Language Attitudes and Ideologies In Shanghai, China. MA Thesis. Columbus, OH: Ohio State University.

Hirata, Shoji. 1998. Aspect: A General System and its Manifestation in Mandarin Chinese. Taipei: Student Book Company.

Johnson, Eric. 2010. SIL Electronic Survey Reports 2010-027. A Sociolinguistic Introduction to the Central Taic languages of Wenshan Prefecture, China. Dallas, Texas: SIL.

Lee, Kent A. 2002. Chinese Tone Sandhi and Prosody. MA Thesis. Urbana, IL: University of Illinois at Urbana-Champaign.

Lien, Chinfa. August 17-19, 1998. Denasalization, Vocalic Nasalization and Related Issues in Southern Min: A Dialectal and Comparative Perspective. International Symposium on Linguistic Change and the Chinese Dialects Dedicated to the Memory of the Late Professor Li Fang-kuei in Seattle Washington.

Liming, Zhao. The Women’s Script of Jiangyong: An Invention of Chinese, Chapter 4. In Tao, Jie, Zheng, Bijun, Mow, Shirley L., editors. 2004. Holding Up Half the Sky: Chinese Women Past, Present, and Future. New York: Feminist Press at the City University of New York.

Mair, Victor H. 1991. What Is a Chinese ‘Dialect/Topolect’? Sino-Platonic Papers 29.

McKeown, Adam. 2001. Chinese Migrant Networks and Cultural Change: Peru, Chicago, Hawaii, 1900-1936. Chicago, IL: University of Chicago Press.

Ngù, George. Eastern Min speaker. 2009. Personal communication.

Olson, James Stuart. 1998. An Ethnohistorical Dictionary of China. Westport, CT: Greenwood Publishing Group.

Rickard, Kristine. 2006. A Linguistic-phonetic Description of Lanqi Citation Tones. Proceedings of the 11th Australian International Conference on Speech Science & Technology, pp. 349-353. Edited by Paul Warren & Catherine I. Watson. University of Auckland, New Zealand. December 6-8, 2006. Auckland, NZ: Australian Speech Science & Technology Association Inc.

Szeto, Cecilia. 2000. Testing Intelligibility Among Sinitic Dialects. Proceedings of ALS2K, the 2000 Conference of the Australian Linguistic Society.

Thurgood, Graham. 2006. Sociolinguistics and Contact-induced Language Change: Hainan Cham, Anong, and Phan Rang Cham. Tenth International Conference on Austronesian Linguistics, 17-20 January 2006, Palawan, Philippines. Linguistic Society of the Philippines and SIL International.

Xun, Gong. Sichuan Mandarin and Putonghua speaker. Deyang, Sichuan, China. Personal communication. September 2009.

Zheng, Rongbin. 2008. The Zhongxian Min Dialect: A Preliminary Study of Language Contact and Stratum-Formation, pp. 517-526. Edited by Chan, Marjorie K.M. and Kang, Hana. Proceedings of the 20th North American Conference on Chinese Linguistics (NACCL-20). Volume 1. Columbus, Ohio: The Ohio State University.

This research takes a lot of time, and I do not get paid anything for it. If you think this website is valuable to you, please consider a contribution to support more of this valuable research.



98 responses to “A Reworking of Chinese Language Classification”

  1. James Schipper

    Dear Robert
    My understanding is that Mandarin is to the Sinic languages what German is to the Germanic languages without English and what Russian is to the Slavic languages. About 2/3 of those who speak a Germanic language other than English speak German and about 2/3 of those who speak a Slavic language speak Russian. I don’t really consider English a Germanic language anymore because it has been so heavily Romanized. We should then speak of Mandarin and all the other Sinic languages.
    I suppose that all speakers of a Sinic language are schooled in Mandarin. If that is the case, are speakers of a Sinic language other than Mandarin in the same situation as Welsh-speakers and Frisian-speakers? Every Welsh-speaker is also an English-speaker and every Frisian-speaker is also a Dutch-speaker. Moreover, even fluent Welsh-speakers and Frisian-speakers are likely to write respectively in English or Dutch rather than in their respective mother tongues.
    Have a wonderful 2009. James Schipper

  2. Dragon Horse


    You are correct, that everywhere but Hong Kong and Macao (to my knowledge) Chinese people (Taiwan too) are educated in Mandarin. So most Chinese outside of the Mandarin heartland (North of the Yangtse) are bi-lingual to a certain extent. It is how Swiss German speakers are educated in High German, but rarely speak it at home. There is no written standard to Swiss German, just like most Chinese dialects (languages) have no written standard (but maybe Cantonese).

    What I’ve seen is that, similar to Swiss Germans, most non-Native Mandarin speakers in China speak Mandarin with varying accents and often mix up tones, but can still be understood by everyone else. I do not see the dialects disappearing though, well not most of them. Like Swiss German, most Chinese are proud to speak their dialects, and in places such as Shanghai, even discriminate against those who do not speak their dialect (as backward outsiders). I’ve witnessed this often.

  3. Hi James, in a sense you are probably right, however the differences between the Chinese languages are probably much greater. English and German have 60% lexical similarity. English and French have about 25% and English has about 29% with Russian (I think!). I need to look at some charts here. It’s not uncommon for Chinese lects to have 5-30% lexical similarity. Further, there are deep differences in tones, and even grammar and structure. Even the pronouns can differ. But clearly they are all related to German and they all derived from Chinese. So yes, your analogy with Russian and German as super-languages on top of their families is correct, but it is important to note the vast differences in the lects. It was said that no one could understand Chairman Mao’s dialect, Xiang Nan (Mandarin dialect). Apparently his secretary could understand him, but few others could. I’m not sure how he got his points across.

    Further, at this point probably most speakers of the Sinitic languages for sure speak Putonghua, which is the Standard Mandarin. It’s a standard the same way that High German is Standard German and Standard Italian is the standard for that language. However, overseas, many do not speak Putonghua, and in the Cantonese area, I believe many still do not speak Putonghua. English is a Germanic language.

    Look at the vocabulary – closest language is Frisian with 64%. Dutch is 62% and German is 60%. French is 25%. English is clearly a Germanic language. There are similar cases in the Chinese languages. Some of them have heavy layers of non Sinitic tongues like Zhuang or Hmong.

    Besides Putonghua, you are correct that the vast majority of Sinitic speakers are native speakers of some kind of Mandarin.
    Um, I believe that a lot of the older folks do not have very good Mandarin and may be monolinguals of their Sinitic tongue, but I’m not sure. The government has been pushing Putonghua very hard, almost too hard. It’s been killing the smaller tongues. So it’s not quite the same way with Frisian and Welsh yet. I believe it’s pretty common in the South to find Cantonese speakers who don’t speak Mandarin, and it’s for sure the case overseas.

    As far as writing, I don’t believe it’s a problem. An ideographic system was perfect for Chinese as it was the one way that all of the speakers of the various Chinese lects could communicate. My father was in China in 1946 and he said that the rickshaw drivers often could not understand each other, but they could all write Chinese, so they would communicate by writing notes. All Chinese can write to each other, no matter what language they speak, assuming they are literate. A decade ago in a college in Henan, a professor said that the students would come to the college from all over the province and for the first month would communicate by writing notes to each other, so they all wrote a common language. In that province, every county has its own language, and there are even separate languages within counties. It took them about a month or so before they could start working out each other’s languages.

    Some comment that the Chinese languages are like a Cockney accent of English. On a website, a commenter said that that’s not true. He said he can understand Cockney, but they had a speaker of an Anhui Mandarin lect as a professor at the university and no one could understand what he was talking about. So it’s quite common for the various Chinese lects to be pretty much incomprehensible to each other.

    This is especially true in the center and south of the country. In Anhui, Fujian, Henan, Hunan, Jiangsu and Zhejiang there is an incredible diversity of tongues. It is said in Fujian that every 3 miles the culture changes and every 6 miles the language changes. In these parts of China, there are lots of mountains and it is very rural. Many people never left their home village to go over the mountain to talk to the people over there, so a multitude of tongues arose. I understand that in this part of China there are even incomprehensible tongues inside major cities where the downtowners can’t understand the suburbs.

  4. Goytá

    Robert, I am NOT a linguist (though I *am* a translator) and can’t speak a word of any of the Chinese languages, but languages are a subject that fascinates me. In the case of Chinese, it has always been evident to me (especially from what the ethnic Chinese I know have always told me, that they can’t understand each other when coming from areas sometimes not so far away) that the expression “Chinese language” was an abstraction at best, fiction at worst. But other countries’ cultural insensitivities and central Chinese government political sensitivities (from the first dynasty to today’s PRC and Taiwan) have made that abstraction convenient to their interests.

    However, it seems to me that the concept of mutual intelligibility is more complex than it seems at first. Being Brazilian, I find it much easier to understand Spanish than European Portuguese, especially when spoken too fast, or by rural or uneducated speakers, or through low-quality audio devices. (Here I must explain to less familiar readers that for several reasons, and contrary to what one might think, Brazil has always sort of turned its back to other South American countries, and except in the far South, cultural contact with neighboring, Spanish-speaking countries has always been too negligible to be of any significant influence in Brazilian Portuguese. So, that is not the reason.) The Portugal variant requires the most of our attention to be understood.

    However, in spite of many lexical and sometimes grammatical differences (much more so than between the UK and the US, for example), the Portuguese clearly speak the same language as we do. This is evident when they speak under ideal conditions (slowly, by educated urban speakers) or when they write. By contrast, Spanish is clearly a different language, yet its intelligibility is often greater for us.

    Additionally, mutual intelligibility is not always symmetrical. Spanish speakers have a hard time understanding Portuguese. On the other hand, Italian is not so easy for us to understand (though it still can be understood to a significant extent), but Italians can understand Portuguese surprisingly well.

    Accent and local usage apparently play a role, too. A Taiwanese traditional physician living in Brazil (and speaking perfect, accent-free Brazilian Portuguese) once told me of when a newly-arrived Shanghainese woman was referred to him. They were both educated persons and could speak Standard Mandarin, but they still had a hard time understanding each other. In the end, the woman’s daughter, who had been in Brazil for a longer time, had to translate what her mother said into Portuguese for the doctor to understand!

    All this makes me think: is mutual intelligibility really a good criterion to classify a regional form of speech as a dialect or a separate language, when considered alone? In cases where the degree of separation is too great (considering lexicon, pronunciation, grammar, etc., all at once), certainly yes, because then it is a blatant sign of difference. But I suppose there must be a significant “grey zone” in many cases. What do you think?

  5. Hello Goyta, the question of when a language starts and a dialect ends or vice versa is something that we go round and round about.

    An example you give is if people talk slowly or are better educated, you can understand E Portuguese.

    In the Chinese examples above, as a general rule, these people simply *cannot* understand each other. They can’t understand each other no matter how fast, slow or whatever they speak. Due to the very strange nature of the Chinese language with tones, etc. the Chinese dialects often show incredible amounts of differentiation.

    In your post, you note that even speakers of heavily accented Mandarin have a hard time understanding. Please understand the difference between Mandarin and Putonghua. Mandarin itself is actually a collection of many different languages, many of which are not mutually intelligible.

    However, Putonghua is like US English in that it is supposed to be relatively uniform. So if the doctor and the patient really were trying to communicate in regional Mandarin languages as opposed to Putonghua, there indeed could have been quite a few problems.

    As a general rule though, Putonghua speakers can probably communicate with each other, especially if they speak slowly, in the same way that most, if not all, US English speakers can communicate with each other. So I suspect that these two were attempting to communicate in Mandarin languages not in Putonghua.

    I assume that the distance between E Portuguese and Br Portuguese has been tested experimentally and it has been determined that they are not separate languages.

    As an English speaker, I assure you that there are many English lects that I can barely make heads or tails of. The Cockney accent in England, the Irish accent in Ireland, some Black accents in the US – especially around Memphis, Tennessee, the Cajuns in Louisiana, and some Deep South Whites as in Alabama. In general, if you ask these folks to slow down a bit, you can understand them.

    Further, I am convinced that many regional Englishes are best seen as foreign languages. The English spoken in India and West Africa is barely intelligible to the rest of us. I honestly feel that these need to be set aside as separate languages.

    Do you need subtitles on the screen if you are watching someone speak the lect? That is a clue that you are dealing with a language and not a dialect. We US California English speakers usually do not need subtitles on any variety of English, however, we do need it for Scots and Jamaican English. As you might guess, Ethnologue has already thrown those over to separate languages.

    I am just throwing out some ideas here. The questions you raise in your comment are things that we linguists never stop debating and fighting about. Many journal articles, book chapters and even books have been written about where a dialect ends and a language begins. A hint – much of it has to do with politics and sociolinguistics.

    I do agree with you that Brazilians are isolated. I met some Brazilians on the web and they had little exposure to Spanish. I communicated with them in Spanish and they responded in English or Portuguese. In some cases, their English was better than their Spanish. Brazilians seem to have little interest in learning Spanish. I would add that the neighboring Spanish countries have zero interest in learning Portuguese.

    This is unfortunate. Spanish and Portuguese are close enough so that if one speaks one, one can easily pick up the other. I had a Brazilian gf once. She spoke Portuguese to me and I spoke Spanish and English to her. After a few days, I was already learning Portuguese.

    In recent days, Ethnologue has been splitting off languages based on whether or not they are “structurally separate languages”. Think about that one for a bit. I doubt if that applies to the Portuguese lects. If you ever hear some Galician, see how much of it you can make out. It’s been determined that Galician is far enough from E Portuguese that it is a separate language.

  6. Goytá

    Hello, Robert! Very interesting reply, thank you! Now, a few further remarks:

    In the Taiwanese doctor’s story, the way he told it, it appeared that he meant they were speaking Putonghua, but I can’t be sure and it might well be possible that they were speaking different flavors of “Mandarin”.

    I don’t know if the distance between European and Brazilian Portuguese has been tested (it probably has), but a native speaker of either doesn’t need that – it *is* the same language, even though there are numerous vocabulary and spelling differences and a few significant grammar and usage differences, too. Computer terminology, in particular, has evolved separately in both countries, and computer and IT texts are almost completely incomprehensible in the other country. Microsoft has separate versions of Windows and Office for Brazil and Portugal, and I can only read a computer magazine from Portugal with difficulty and guessing many terms. However, when I read a newspaper article from Portugal about a more general subject, I have no problem at all – coincidence is over 99% in most cases.

    Unless you stumble in some very typically British expression or spelling (a “lorry” for a truck, a “tin” for a can, a “colour telly”), I bet it would be theoretically possible for you to read several pages of an English text without knowing from which side of the pond the author is. Not so in Portuguese. It only takes a couple of phrases, at most a paragraph, to know whether the author is from Portugal or Brazil. In other words, there seem to be well-established national “styles”.

    Yet, except for some lexical differences and preference for some verb tense constructions and pronouns instead of others (but existent and acceptable in both countries nonetheless), there is no doubt to any native speaker in either country that it is the same language. If one is careful, it is also possible to concoct a lengthy text that could have been written by someone in either country. This should not be surprising, since the Brazilian Academy of Letters and the Lisbon Academy of Sciences work jointly to standardize the language, allowing for national variations but keeping the basic unity of the standard. A new spelling reform is to come into effect in 2009 bringing the two written variants even closer.

    In recent years, Brazilian Portuguese has influenced the European variant a lot, too, chiefly for two reasons: Brazilian soap operas are wildly popular in Portugal (and broadcast in the original version), and Brazilian Internet services and sites are more numerous and complete, due to the much larger population. Many Portuguese users have “@yahoo.com.br” e-mail addresses, for example. As a result, for example, the informal addressing pronoun “você” (“you”, but conjugated in the third person and used only with close friends and family), which is standard in Brazil but used to be rarely employed and considered somewhat gross in Portugal, is now commonplace there.

    By contrast, little Portuguese media comes to Brazil, but when it does, we do wish it had subtitles (it never has). The problem with spoken language seems to be that in European pronunciation they omit unstressed vowels almost completely, usually alternating a syllable where the vowel is pronounced with one where it’s not. In Brazilian Portuguese, all vowels are clearly pronounced. So are they in Spanish, and this probably explains why the latter is often easier to understand. But the words and phrases being spoken are still exactly the same in European and Brazilian Portuguese in most cases. I have also met many Portuguese immigrants who had been in Brazil for a very long time, and I only learned they were Portuguese after several meetings – they can catch the Brazilian accent almost perfectly after some time, even when they come here as adults.

    I have never heard Galician, but I have seen it written – it is very heavily influenced by Spanish syntax, but the lexicon and spelling are 95% Portuguese, I’d guess. In my layman’s opinion, it wouldn’t be difficult to consider it a Portuguese dialect (and I know historically it has been considered so), though they’d certainly burn me alive if I said that in Galiza!

    As for (not) learning Spanish in Brazil, this is starting to change because of the Mercosul/Mercosur economic community joining Brazil, Uruguay, Paraguay and Argentina. Brazil and Argentina are by far each other’s greatest commercial partners now, and nearly all corporations in both countries have large binational operations now. So, knowledge of Spanish is increasingly a requirement for good jobs here, and people are rushing to learn it. The same is happening across the border with Portuguese. This is even more true in Paraguay, whose economy has always been heavily dependent on Brazil: many Paraguayans are trilingual (Spanish, Guarani and Portuguese – not rarely, so good as to be accent-free, or nearly so). In border areas of Paraguay there are also many Brazilian farmers, and influence of Portuguese is heavy.

    However, we still can communicate wonderfully using either language, or a curious improvised mix of both we call “Portunhol” or “Portuñol”. But the very similarity of the languages is treacherous and one almost inevitably ends up speaking “Portunhol” even if one has learned the other language properly. My own Spanish is horrible and I feel very uncomfortable speaking it, even more so writing it. In Internet chats, for example, I have often communicated in English with Spanish speakers because we both were more comfortable that way than trying to use the other’s language.

    I am relieved that you, an American, can’t understand some accents easily… I understand some Americans better than others. Chicago was where I had the least problems, I could understand everyone easily. By contrast, in nearby Michigan they don’t speak, they mumble, and I often had to ask them to repeat what they said. California is OK but you speak very fast, I have to pay attention closely.

    Dallas was fine, and I even wondered “is this the famous Texan accent they say is so difficult to understand?” But I soon discovered that Dallas is a more cosmopolitan place, and when I went to Austin I found that actual Texan *is* hard! I also met an Oklahoman woman who seemed to speak Old Tibetan. And I never understood a word of what Sylvester Stallone says – I was relieved when Brenda Blethyn’s character in “Secrets and Lies” said exactly the same thing! (That movie is a great lesson on British social class accents, by the way.) I’d say the masters of vocalization in Hollywood were John Wayne and Richard Burton – foreign speakers wish everyone spoke like them…

    Enough diversion… We were talking about “Chinese” and its proposed linguistic divisions, and I was asking to what extent mutual intelligibility is a defining criterion by itself for that purpose. It seems cultural specificities make that decision even more difficult in China’s case. And of course, politics is sensitive. In the West, Catalan/Valencian and Romanian/Moldovan come to mind, but what about sometimes remote provinces of a huge country that needs to consolidate central power?

  7. Hello Goyta, this is very interesting.

    We can always understand British English no matter who is writing it. Same with understanding spoken Australian and New Zealand (Kiwi) English. British English is often written a bit differently in slang expressions, but we pick them up. The formal writing is totally understandable.

    There have been huge fights on Wikipedia between Br English and US English speakers with complaints from the Brits of bullying by the Americans. There was an attempt to fork the English Wiki into Br and US versions but it failed.

    Wikipedia demands that you have an ISO code in order to get a Wikipedia and ISO codes only come from SIL, who publishes Ethnologue. I petitioned for a few new languages a couple of years ago and they all got shot down.

    There is an ongoing war between P Portuguese and Br Portuguese on the Portuguese Wikipedia with complaints from the P Portuguese of bullying on the part of the Brazilians. I am sorry you could not read Portuguese IT materials. This is unfortunate.

    All written British is intelligible to us. We can read anything written in the UK, though most of our reading material here is from the US. I can read The Economist and The New Standard and The Spectator with no problems at all.

    As a Californian, I speak completely normally, of course, and have no accent whatsoever! Haha. We can understand the Midwest accent perfectly, though it can be different. It sounds “flat”. They also insert a rhotic consonant before some consonants at the end of a word and raise the preceding vowel – “wash” becomes “worsh”.

    The Oklahoma accent is different and sometimes it can be hard to understand. I heard some people speaking Oklahoman in the doctor’s office the other day, and for a minute or so I thought they were speaking a foreign language! Of course they were mumbling too. Then I asked them where they were from and they said Oklahoma. At that point, I had caught onto their accent and could understand them perfectly.

    I do not know why the Texan accent is said to be hard to understand. We understand it perfectly, but it sounds funny. We make a lot of jokes about it. George Bush has a strong Texan accent. There is also an Arkansas accent (Arkies) that is different but understandable. This is also the source of jokes.

    In this part of California there are many Whites who still speak Arkie and Okie. They are the descendants of those who came out here from the Dust Bowl in the 1930’s. Steinbeck wrote a book about this called The Grapes of Wrath. Other than that, there are no accents in the West.

    There is some sort of a Kentucky-Tennessee accent, but I am not sure if they differ. This is also a source of jokes.
    It’s sort of a general Appalachian accent, and it’s the source of jokes about inbred hillbillies and whatnot.

    The Southern accent is well-known but usually understandable. My brother went to live in Alabama though and he said that the workers in the factory he worked at were often completely unintelligible. The Blacks were worse than the Whites and they had separate accents. He has imitated their incomprehensible accent to me and it’s pretty hilarious.

    I have heard poor Blacks from Memphis on the Cops show who were completely unintelligible to me. People with more money and status tended to be more comprehensible. I sometimes have a hard time understanding a Mississippi or Alabama accent, but it’s generally no problem. Our Southern politicians all have thick Southern accents.

    Cajun English from Louisiana is often unintelligible to us, but the people with more money and status are quite intelligible.

    There is also a Black accent from the coast of South Carolina called Gullah that is hard to understand. The Blacks from around there speak something like it and you can pick it out if you are sharp. It has a pretty, lilting sound to it. It’s different from the standard Southern accent and is sort of charming.

    Moving up the coast, there is a Virginia accent that is softer, pleasant and charming.

    There is the famous New York accent, which to us laid back Californians sounds horribly rude, obnoxious, loud and belligerent. Some forms of it also sound ignorant – these tend to be associated with working class Whites in Brooklyn and the Bronx. One thing they do is to glide and lengthen rhotic consonants – “New York” becomes “New Yawwk” and “Brooklyn” becomes “Bwwoklyn”.

    There is a Boston accent which is completely understandable. Ted Kennedy speaks that. It involves the lenition of hard consonants into glides at the end of a word – “car” becomes “caw”. I believe there is a sort of a slow drawl from Vermont and New Hampshire too. Those people, especially the older men, are known for not talking much. Men of few words.

    Some Blacks around here still talk with thick Black accents that sound Southern even though they were born in the Central Valley.

    There is also an “Ebonics” English (for lack of a better word) that is spoken here by sort of ghettoish or semi-ghettoish Blacks. It is frankly almost completely unintelligible. They seem like they are talking with their mouths full, mumbling and speaking extremely fast, running all of the sounds together.

    Everyone who talks like this can also speak Standard English thank God, and they can quickly move in and out of that Ebonics talk when you talk to them. It’s sort of a language for them to talk so that we can’t understand them, I think. To us, it sounds sloppy, low class and ghetto, but it is reportedly a full-fledged language.

    The Blacks in the Caribbean do not speak English! That makes me feel good because I can hardly understand a word they say. Each island has its own form of Creole English which is a completely separate language.

    As I noted before, I think that Indian English (Chichi derogatively) and West African English need to be split into separate languages because they are often incomprehensible to us. This is a case of regional Englishes evolving on their own.

    Further, West African English often differs a lot in its written form. Indian English is often so mangled in its written form that it is incomprehensible, but more educated writers are comprehensible. The tendency to drop articles is very annoying and makes written Indian English sound ignorant to us. Don’t mess with our damned useless articles!

    Reading about the Chinese languages, there are efforts underway to get speakers to speak proper Putonghua, whatever that means. Speakers from different parts of China still speak Putonghua with an accent that can be heavy at times.

    Here in the US, we do not have this problem. Even our politicians still speak in heavy regional accents, and no one cares. We can always understand them.

    There is no national effort to get everyone to speak proper English that involves wiping out regional accents, though I understand that in the corporate world, they are offering classes to help people get rid of Southern accents, which are stereotyped as sounding backwards, ignorant and racist. I think this is sad. Our regional accents are what makes this country great.

    When I was dealing with them 5-10 years ago, most Brazilians did not speak much Spanish (They acted like it was extremely low on their list of priorities) and Spanish speakers had zero interest in learning Portuguese (In fact, they regarded the suggestion as offensive and preposterous!)

    You note that with regional integration, more Brazilians are learning Spanish and more Spanish speakers from nearby countries are learning Portuguese. Spanish is becoming a prerequisite to getting a good job in Brazil. This is good as it’s good to see Latin Americans getting together.

  8. Pingback: Englishes, Portugueses, and Chineses « Robert Lindsay

  9. Goytá

    Very interesting, Robert! As a foreigner, my ears are not as sharply trained to distinguish many subtleties of American accents, but I do notice some. For example, New York accent doesn’t exactly sound “rude” (it would take more cultural references to say that, I suppose), but it does sound a bit blunt to me, I’d say. It has some too sudden loudness shifts, it seems, that make it sound somewhat aggressive.

    Here in Brazil there are many marked regional accents, but as in the U.S. (probably more, from what you wrote), they are all perfectly intelligible. The one possible exception is “gaúcho” accent (from Rio Grande do Sul in the far South), which has a heavy Spanish influence and a large number of unique words, so it’s sometimes a bit hard to understand, but in the end we always do. (To me, the chief problem with “gaúcho” is that their characteristic “musicality” makes them always sound as if they are making a question, and since Portuguese relies solely on intonation for that, with no special interrogative structures, sometimes I don’t know when they are really asking something or not…) Given Brazil’s huge inequality of wealth and education opportunities, we resemble Britain in that social class is often more decisive in determining the intelligibility of one’s speech than the place where one comes from.

    We also have a “most hated” accent – the one from Rio de Janeiro. Many Brazilians from other regions (and virtually all from São Paulo) can’t stand it for more than a few minutes without getting irritated – even though Brazil’s largest TV production center is still Rio and a large proportion of TV stars and journalists have a Rio accent. A friend of mine once gave up on a guy she was interested in because he had too strong a Rio accent and it got on her nerves… The Rio accent is very nasal, with strongly rounded vowels, a final “s” (as in plurals) converted to a loud hissing “sh”, and a very long, loud aspirated “r” that sounds more like English “h”. It has a certain tone that sounds like contempt or scorn in other regions, and then there are also negative cultural references about Rio and its peculiar lifestyle.

    There are also two discriminated regional accents. The first is the Northeastern one (very sharp vowels, sometimes reminiscent of French, “t” and “d” pronounced with the tip of the tongue – in most other Brazilian accents, usually softened to “tch” and “dj” – and a characteristic “musical” cadence – people often say that Northeasterners “speak singing”). The Northeast is Brazil’s poorest and most socially backwards region, and hordes of uneducated, unskilled Northeasterners migrate to the richer Southern states (President Lula was once one of them, though he lost most of his accent after many years in São Paulo). They are sort of our Puerto Ricans, only they speak the same language. When they come to a more sophisticated urban society, the cultural clash is inevitable and leads to a lot of discrimination – and ungratefulness, because it was their hard work that built much of the South’s wealth.

    The other discriminated one is “caipira”, a rural (or originally rural) accent of the Southeast, often wrongly identified with the state of Minas Gerais but actually characteristic of the interior of São Paulo state. This is our hillbilly accent, used in jokes when one wants to impersonate a rural ignorant person. The chief characteristic is a more strongly rhotic “r” than any American will ever be able to do in his or her life, as well as an “lh” (similar to Spanish “ll”) that is converted to “y” (“palha” = “straw” becomes “paia”). The actual Minas Gerais accent has a completely different pronunciation of “r”, akin to Rio’s aspirated one (albeit much softer and shorter).

    I am from Minas Gerais myself but live in São Paulo, and it’s probably because of my aspirated “r” that I am often mistaken for a “carioca” (person from Rio), even though my accent lacks all other features of “carioca”. Minas accent also tends to omit and/or fuse the last syllables of words, so Brazil’s national date, September 7 – “sete de setembro” in Portuguese – becomes “setsetembro”, “sessetembro” or even “sessetemb”.

    And the accent of São Paulo city (very different from that of the state’s hinterland) has been heavily influenced by Italian, for São Paulo is the world’s second largest “Italian” city (after Rome, but with more people with Italian ancestry than Milan). A foreigner who doesn’t understand Portuguese can easily think he or she is hearing Italian. For some reason, women’s accent is even stronger, but although born in Minas Gerais, Pelé lived later in Santos, the port city for São Paulo, and he has one of the strongest São Paulo accents I’ve ever heard.

    São Paulo’s “r” is a trill, some vowels are over- or undernasalized in comparison with other accents, and among uneducated people it is usual to omit the “s” in plurals (again, Italian influence, since Italian plurals don’t end with an “s”). The “t” and “d” used to be pronounced with the tip of the tongue (Pelé does), but among younger generations they are quickly being replaced with the more usual softened “tch” and “dj” of other Brazilian accents.

    Does this sound like a lot of difference? Actually, it isn’t. Brazilian Portuguese is remarkably uniform for the most part, and communication problems are extremely rare and minor.

  10. You wrote: “Although Putonghua is within the Beijing Group, Beijing itself is a separate language, unintelligible to Putonghua speakers. Furthermore, Beijing is in a group all of its own called the Beijing Group. It contains 43 separate dialects, and may contain more than one language.”

    1. This is absurd. Beijing is intelligible with Putonghua, in fact, Putonghua is based on the Beijing dialect. The dialect itself has words that are not standard in Putonghua which is how we identify it as a dialect.
    2. Your last sentence. This “dialect group” may contain more than one language? WTF? How do languages appear inside of dialect groups? Go back to Linguistics 101 bro. Think about what the term 方言 actually means, morphemically.

  11. Hello Mr. Campbell. I certainly thought that Beijinghua would be intelligible with Putonghua also until I did a lot of reading and found out otherwise.

    In my research, I determined that many people find the Beijing dialect to be unintelligible, at least that spoken by taxi drivers, etc. Obviously, Putonghua and Beijing have diverged at some point. Others say that Putonghua was based on the Beijing suburbs, and not on Beijing city itself. Certainly there are plenty of folks claiming that they understand Beijinghua at less than 90% intelligibility. In this work, I am setting 90% as the point at which I split language from dialect. Over 90% intelligible, dialect. Below 90% intelligible, language. You are welcome to disagree.

    A similar thing has occurred in Italy, where Standard Italian was based on Tuscan, yet has diverged so much from Tuscan that now if you see old men from Tuscany on TV, you need subtitles to understand them.

    I am using dialects in place of the word “lects” which I ought to use. Lects sounds silly, so I’m using dialects instead. You’re welcome to disagree. I assure you I know the difference between the two.

    I was hoping I could cooperate with you on this project, but I guess not.

    Further, I will not accept any more of your comments until you change your tone.


  12. Pingback: Standard Languages Versus Their Parents « Robert Lindsay

  13. Pingback: Mutual Intelligibility as a Linguistic Concept « Robert Lindsay

  14. Pingback: Been Working on the Chinese Language Reclassification « Robert Lindsay

  15. Pingback: Definition of Chinese Dialect : Glossika Linguistics

  16. Pingback: Response To Mike Campbell on Chinese Language Classification « Robert Lindsay

  17. You can erase this and just update your references.

    曹志耘 is the name of the author. In Pinyin, that’s Cao Zhiyun. So the author is now “known”.

  18. Pingback: Linguists Know Lots of Languages « Robert Lindsay

  19. Pingback: A Reworking of German Language Classification « Robert Lindsay

  20. ren

    Pretty absurd indeed..
    I speak NE Mandarin and not only can I understand Beijing Mandarin and Putonghua just fine, but I am also able to communicate with SW Mandarin speakers in Yunnan, if speaking slowly and repetitively.

    My ex was a Xiamen Minnan speaker who could communicate with Quanzhou and Zhangzhou Minnan speakers just fine. (I’ve seen her do it myself.)

    By your standards, New England and New York English are separate languages. (I’m sure some Bostonian would even agree that they can’t understand a thing being said by a New Yorker.)

  21. Pingback: German Language Reclassification | Free Media Productions - Editorials

  22. Ren, that’s why we need intelligibility testing. There’s a strong suggestion that other NE Mandarin speakers might be able to understand Beijing, but people from the other parts of the country seem to have the damnedest time understanding the taxi drivers and hutong residents. I’m leaving whether or not Beijing is a separate language completely up in the air.

    The fact that you can understand Yunnan if spoken very slowly and repetitively is no good. You have to be able to understand it spoken at a normal speed and no repeating allowed. We have tons of reports of even other SW Mandarin speakers who can’t even understand one city from the other in Yunnan.

    For something to be a dialect and not a language, you need to understand over 90% of it on a tape recorder or video spoken at normal speed with no repetitions. If you can’t, it’s a separate language.
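
    The cutoff above can be written down as a minimal sketch in Python, assuming an intelligibility score between 0 and 1 has already been measured by playing a recording at normal speed with no repetitions. The names `DIALECT_THRESHOLD` and `classify_lect` are hypothetical and only illustrate the rule; a score of exactly 90% falls on the "separate language" side because the rule asks for over 90%.

```python
# Hypothetical illustration of the 90% rule; the names are not from any library.
DIALECT_THRESHOLD = 0.90


def classify_lect(intelligibility: float) -> str:
    """Label a lect by how much of it a listener understands (0.0 to 1.0).

    Over 90% understood counts as a dialect; anything else counts as a
    separate language.
    """
    if not 0.0 <= intelligibility <= 1.0:
        raise ValueError("intelligibility must be between 0.0 and 1.0")
    if intelligibility > DIALECT_THRESHOLD:
        return "dialect"
    return "separate language"


if __name__ == "__main__":
    # Example scores only; they are not measurements of any real lect.
    for score in (0.95, 0.90, 0.65):
        print(f"{score:.0%} understood: {classify_lect(score)}")
```

    The sketch takes a single score per pair of lects. Since intelligibility is not always symmetrical, a fuller version would have to decide how to handle a pair that passes in one direction and fails in the other.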

    Xiamen, Zhangzhou and Quanzhou are very close, and it is definitely an open question whether or not and to what extent they can understand each other. Once again, intelligibility testing would be in order to sort all of this out.

    I don’t think you understand US English at all. You guys have dialects, we just have accents. I live in California. Do you know how many times in my life I meet another US English speaker who speaks another dialect of US English that I can’t understand well? Never! We all understand each other! I live in a tourist town and people come from all over the US to here. We understand 100% of them.

    There are a few really hardcore dialects in the US. There are a few super hardcore speakers of NY English that are hard to understand, but we understand over 99% of the ones we meet. Some Black English speakers are hard to understand. Some Louisiana English speakers are hard to understand. Some Deep South English speakers are hard, and some Appalachian English speakers are hard to understand. Other than Blacks and NY, we never meet any such folks in real life. I only say we have a hard time with them since we hear them on the radio or see them on TV and they are hard to understand.

  23. I’m on the same page with Ren above.

    This is in response to Quanzhou, Zhangzhou and Xiamen. These are all accents of the Quanzhang dialect. Actually Xiamen is a mixture of both (some words retain Quanzhou pronunciation, some words Zhangzhou) and this is how it more or less entered Taiwan. Actually Quanzhou and Zhangzhou are cities, and therefore are TOPOLECTS, and are abbreviated together as Quanzhang to refer to the dialect. This dialect is the standard dialect when referring to Southern Min and it includes the Taiwanese accents too.

    Books that I have read refer to Quanzhou 腔 (qiang1 = accent) and Zhangzhou qiang (e.g. q.v. Kho Kektun, Taiwan-yu Gailun, 1998, ISBN: 9578011989, Chapter 5 Part 4 Quanzhou-qiang yu3 Zhangzhou-qiang, Sections 2 and 3 on pages 103-107)

    Rounding before back vowels under certain conditions is common in a lot of dialects around the world. The Northeastern American dialects do this as well, for example, ‘talk’ becomes ‘twahk’ (is it possible to type IPA here or is it allowed on this site?) IPA: tu̯ɒk. It is also common in Russian and Bavarian.

    This rounding occurs (on/off) in Quanzhang Southern Min.

    Although I learned to speak Southern Min as a foreigner, whenever I learn a foreign language I like to focus on one topolect and get it right without moving around and mixing up the way I speak too much. Later, I can learn to code-switch with other accents. The topolect I learned is of Taibei, in northern Taiwan. It wasn’t until I researched ‘what are the Quanzhou and Zhangzhou pronunciations’ that I learned I’m mixing a lot of them. This is precisely because of the Xiamen influence on Taiwanese speech.

    As far as consonants are concerned, there is a voiced sound that tends to vary from accent to accent, so although we say /l/ in Taipei, it sounds more like [zh] (IPA /ʑ/ or /ʒ/) in central Taiwan near Zhanghua (Changhwa). It also surfaces as /g/ (implosive, IPA /ɠ/) in Eastern Taiwan.

    So with that consonant alone, it’s quite easy to code-switch. Saying the number 2, instead of saying [li], I just say [zhi] and I sound like I’m from Central Taiwan. Everybody understands it anyway. So if you go read the ban-lam-gi version of Wikipedia, all the [j]’s are pronounced as [l] by me and the rest of Northern Taiwan speakers.

    The problem with learning a non-standardized language (a collection of dialects) is that you tend to hear all kinds of different pronunciations from different people and you’re unsure of which one to learn. And then when you do learn one, people associate your pronunciation with a particular location and if you use a different pronunciation suddenly they will tell you, oh you say it wrong, or you say it like a southerner or a northerner, so it’s really important for me to focus on getting one topolect right and always consult somebody who has always lived and spoken at that location. Otherwise it’s easy to get confused, especially with vowels, on what to use and where. Not only that but every character in Southern Min has two or more pronunciations depending on if it’s ‘baihua’ or ‘wenyanwen’ (i.e. colloquial or literary) in which vowels vary a lot.

    Because of all the variations even in one kind of speech, there is a very high level of comprehension when it comes to hearing other varieties. The bottom line is, some of them just sound funny ^__^ like Yilan’s -uiN endings, but that doesn’t mean it impedes comprehension.

    That -uiN ending is where we get the name for Amoy (it’s supposed to be nasal at the end) which would be written as e-muiN in peh-oe-ji which is the romanization used on Wikipedia. But in my version of Southern Min we refer to Amoy as e-mng, same as Amoy speakers themselves.

    So this rounding–sometimes you get it and sometimes you don’t in either Quanzhou or Zhangzhou. I thought that most of the time I was using a Quanzhou pronunciation, then after looking at the reference book I mentioned above, I realized that probably half the time I’m using Zhangzhou pronunciations, so now I’m more or less in the dark as to which ones are which.

    For accent, Mandarin is ‘qiang’. In Southern Min some people say khioN (Zhangzhou pronunciation), but I say khiuN (Quanzhou pronunciation). That’s the different Quanzhou/Zhangzhou pronunciations coming into play.

    If you were going to go as far as to separate the two into distinct dialects or languages, good luck, because even we speakers can’t tell one from the other–they’re just too similar and the dialect lines criss-cross too much.

    Here are some examples:
    Quanzhou -e = Zhangzhou -ue as in fire (he/hue), fruit (ke/kue). Funny, because I say the Zhangzhou pronunciation and if I hear the Quanzhou one, I immediately think they’re from somewhere else.
    Quanzhou -ue = Zhangzhou -e as in buy / sell (bue/be), to return something (thue/the). Again, I say the Zhangzhou pronunciation!

    Quanzhou -i = Zhangzhou -e as in ‘at home’ (ti tshu / te tshu). Here I say the Quanzhou and I’ve heard the other one quite a lot.
    Quanzhou -un = Zhangzhou -uan as in ‘we’ (gun / guan). In Quanzhou we actually pronounce it as gu-un, or gwun, so it’s more or less how the -a- is pronounced becoming schwa in Quanzhou. The rounding is always there in front.
    Quanzhou for pig is ‘ti’ and language is ‘gi’, Zhangzhou for pig is ‘tu’ and language is ‘gu’ (because the voiced velar is slightly implosive resulting in a bit of creaky voice, this consonant is not normally detected by foreigners). I say ‘ti’ and ‘gi’. Language is ‘ü’ in Mandarin.
    This is just like German ‘ü’ being pronounced as ‘ee’ in English (füsse / feet). So, whereas four rhyme grades are delineated for all Sinitic languages, Southern Min is actually lacking the fourth grade resulting in fewer possible semivowel combinations for rhymes and this gives rise to the various surface representations found in Quanzhou and Zhangzhou and the mixing into Xiamen and Taiwan.
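
    The correspondences listed above can be put into a small lookup to show how one speaker's everyday forms split between the two sources. This is a rough sketch: the table only holds the examples given in this comment, the spellings are the informal romanization used here, and the function names are made up for illustration.

```python
from collections import Counter

# Quanzhou vs. Zhangzhou forms, copied from the examples in this comment.
CORRESPONDENCES = {
    "fire": {"Quanzhou": "he", "Zhangzhou": "hue"},
    "fruit": {"Quanzhou": "ke", "Zhangzhou": "kue"},
    "buy/sell": {"Quanzhou": "bue", "Zhangzhou": "be"},
    "return (something)": {"Quanzhou": "thue", "Zhangzhou": "the"},
    "at home": {"Quanzhou": "ti tshu", "Zhangzhou": "te tshu"},
    "we": {"Quanzhou": "gun", "Zhangzhou": "guan"},
    "pig": {"Quanzhou": "ti", "Zhangzhou": "tu"},
    "language": {"Quanzhou": "gi", "Zhangzhou": "gu"},
}


def source_of(gloss: str, form: str) -> str:
    """Return which city's pronunciation a form matches, or 'unknown'."""
    for variety, known_form in CORRESPONDENCES[gloss].items():
        if known_form == form:
            return variety
    return "unknown"


def tally(speaker_forms: dict) -> Counter:
    """Count how many of a speaker's forms come from each source."""
    return Counter(source_of(gloss, form) for gloss, form in speaker_forms.items())


# Forms the commenter reports using in Taipei speech.
TAIPEI_SPEAKER = {
    "fire": "hue",
    "buy/sell": "be",
    "at home": "ti tshu",
    "pig": "ti",
    "language": "gi",
}

print(tally(TAIPEI_SPEAKER))
```

    On these five words the tally comes out split between the two cities, which is the mixing described in this comment.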

    There is also a difference between Zhangzhou accent and Zhangzhou pronunciation of a character. For example, Z-accent for ‘song’ is IPA /kɤ/ vs. Z-pron. of ‘song’ is IPA /kɔ/. I only use one kind when speaking and without this reference book I wouldn’t know the difference.

    Not to complicate things anymore, but just to show the variation in pronunciations that people use regularly, here is the pronunciation of Mandarin 在 zài (IPA tsai51) in the Southern Min that I speak. (Tones are written on five-scale, five=high, one=low, contours represented by the difference between the two numbers–I have written all tones as surface, after sandhi.)

    1. Literary: tsai33 as in tsai21 hak5 (studying), or tsai21 se21 (living), tsun33 tsai33 (exist).
    2. Colloquial: tshai33 as in tshai21 le21 m21 tin55 tang33 (sit and don’t move).
    3. Common: ti33 as in ti33 kin33 ni35 (this year), ti33 gua21 khau53 (be outside), ti33 tshu53 lin21 (be at home), ti33 tai33 pak2 (in Taipei), ti33 tshia33 thau35 (at the train station).
    4. Common: tua53 as in tua55 tshia33 thau33 sio33 tan53 (to wait opposite the train station), gua55 tua53 tsia55 (I’m here).
    5. Common: teh2 + verb (continuous tense) as in i33 teh2 khuaN53 tsheh2 (he’s reading a book), li55 teh2 tshong53 siaN53? (what are you doing?), khia33 teh2 khuaN21 (stand and watch).
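
    The tone notation used in the list above (syllable followed by one or two digits on a five-level scale) can be read mechanically. The sketch below is only an illustration: the function name and the contour labels are assumptions, and a single digit is simply reported as a single pitch level.

```python
import re

# A romanized syllable followed by one or two pitch digits (1 = low, 5 = high).
SYLLABLE_PATTERN = re.compile(r"([A-Za-z]+)([1-5]{1,2})")


def parse_syllable(text: str):
    """Split e.g. 'tsai51' into (base, pitch levels, contour label)."""
    match = SYLLABLE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not in syllable+tone form: {text!r}")
    base, digits = match.groups()
    levels = tuple(int(d) for d in digits)
    if len(levels) == 1:
        contour = "single level"
    elif levels[0] == levels[1]:
        contour = "level"
    elif levels[0] > levels[1]:
        contour = "falling"
    else:
        contour = "rising"
    return base, levels, contour


for syllable in ["tsai51", "ti33", "ni35", "hak5"]:
    print(parse_syllable(syllable))
```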

    A lot of words have this many different pronunciations, so you have to gain a feel for the language, and you have to realize there are many ways to say the same thing, and that’s why hearing different accents is very common and that words and sayings get borrowed back and forth all the time.

    Finally, to draw a line between Zhangzhou and Quanzhou as spoken by an individual would be like trying to separate the ingredients of a cake with a knife. You really think it can be done? For me and the rest of Taipei, you’d have to cut our tongues into many pieces!

    And since everybody uses each other’s pronunciations, where’s the evidence that they’re different enough to impede comprehension? I understand Southern Taiwan, Xiamen, Zhangzhou/Quanzhou just fine. Even the weird Yilan -uiN is just fine. Try throwing some uiN’s into your speech and watch people’s reaction! — *where* did you learn to speak like that, that’s the weirdest thing! You sound like my cousin in Yilan!

  24. Thank you very much both of you. There are all sorts of folks out there who say, “Everyone can understand each other.” We have so many tons of evidence that this is not the case that I just throw out all of the armies of people who come at me with that one.

    But once you accept my premise that a lot of the Chinese dialects are not inherently intelligible, we can start to work from there. This is exactly the sort of evidence based input that I am looking for.

    As per your criticism, I demoted Zhangzhou and Quanzhou to dialects of Xiamen.

    About Beijing, I am very confused. It seems that at least 20 years ago, everyone outside of NE China found the hardcore Beijinghua impossible to understand. OTOH, I am willing to consider that residents of the surrounding region of NE China (NOT including Shandong) didn’t really have any problems with Beijinghua.

    Therefore, Beijinghua is part of a larger language that stretches across many 100’s of dialects across NE China. If this is the case, then let me know.

    But you need first of all to accept my premise that many folks find hardcore Beijinghua to be difficult. I assume that these folks are from the South, correct? Many are also 2nd language learners. You may wish to dismiss this, but what are we to make of folks who carry on fine in Putonghua and get to Beijing and can’t figure out a word of taxidriverhua? If they understand Putonghua, why not Beijing?

    What I am saying is that we need to come up with some evidence based theories to explain all of these facts.

    A few things. The theory of mutual intelligibility has been developing for over 50 years now. It is now as good a science as anything else in linguistics. First off, inherent intelligibility means no bilingual learning. You need “virgin ears.” You get a tape recording of a Lect A and give it to speakers of Lect B and see how much they get. If they understand Lect A because they have been hearing it for years, it’s essential to note that this does not count!

    Fancy intelligibility algorithms such as James showcases on his site are very interesting, but produce huge false positive results. For instance they show Cantonese speakers with 50-60% intelligibility of Mandarin, Wu, Hakka and Min. In studies using tape recorders and virgin ears with Cantonese speakers, Cantonese speakers understand ~5% of the 4 languages above. Further, Cantonese speakers only understood 31% of Taishan.

    Spanish and Portuguese have 50-58% intelligibility. Constant pronouncements that the major Chinese languages are on the order of Romance tongues like the Iberian ones are just wrong. Even internal differences in Cantonese are much worse than Iberian.

    We also have many cases of Mandarin speakers who have lived in Taiwan for 25 years and barely understand a single word of Taiwanese. We have other cases of Mandarin speakers who lived in Hong Kong for 25 years and barely understand one word of Cantonese.

    To me, this shows that the differences between the major Chinese tongues are vast, incredibly vast. That doesn’t sound like “the Romance languages” to me. More on the order of English and German/Dutch or English and Russian. As an English speaker, I have heard Frisian, German and Dutch videos and I barely got a word. The occasional word I got was because I have some bilingual knowledge of Dutch.

    Frisian has 60% lexical similarity with English and after a 5 minute video, I did not have even one word. So 60% lexical similarity in the real world barely gives you a hill of beans. Frisian is closer to English than any other language on Earth save Scots. Onto Scots. I have heard Scots tapes. I get about 25% of that. Abysmal. Lexical similarity is a red herring. Spanish and Italian have 87% similarity, but Spanish speakers hear maybe 30% of Italian if that.

    These are some things that the “everyone can understand each other” crowd may wish to think about.

  25. Robert,
    Quanzhou and Zhangzhou are the geographical sources of these pronunciations (i.e. -iuN vs. -ioN rhymes) and are not dialects of Xiamen. Rather, Xiamen should be an accent that mixes both.

    By demoting them to dialects of Xiamen, what you’re saying is that within Xiamen City, there are two separate dialects being spoken by two groups of people and that is not the case.

    Put it this way: the situation of Xiamen is very similar to Taipei where I live. There are not two separate groups of people in the city–one speaking Zhangzhou and one speaking Quanzhou. Instead, people are speaking words that originated from both locations. That’s why I said in my earlier post that I was surprised to learn that some of the words I was saying were actually from Zhangzhou, and the others were from Quanzhou. I didn’t originally know that I was speaking a mixture of those two pronunciations. But that’s precisely what Taiwanese and Xiamen are.

    It’s like saying that in my English I use both “look out” and “attention” and therefore, there are two dialects of English: one being Germanic-based, the other being Latin-based.

    By using your logic, if I use “look out” in my vocabulary of English, you’re demoting my dialect to Germanic English. And then if I say “attention” you’re demoting my dialect to Latinate English. When in actuality, everybody uses both terms from both sources. You can’t split English into Latin-based and Germanic-based and say there are two dialects there.

    In your writing above you used the word “algorithm” so maybe you’re speaking Arabic-based English, a completely separate dialect of English. It’s the same logic: everybody goes around saying they understand this and that, but you’re claiming that what they claim to understand are really separate dialects and languages that they truly can’t comprehend.

    Yes, languages can have borrowings and often do have large strata of vocabulary. English is a great example. But that doesn’t mean we split those strata into separate languages based on the vocabulary source.

    By your logic, why do you use Arabic-, Latin-, Greek- and Germanic-based words in your writing, when by that same logic none of us should be able to comprehend more than one kind?

    That’s a weird analogy, I know, but that’s the logic you’re positing here. And to me, it doesn’t make sense.

    Colloquial English is based largely on the phrasal verb structure originating from Germanic. But there is also matching Latin-based vocabulary in English. So from the point of view of a child who only knows colloquial phrasal English with very limited Latin-based vocabulary, listening to a speech heavy with Latin-based vocabulary would be challenging and difficult to understand–YES. But that doesn’t mean these are two different dialects or languages.

    Both of my responses today are completely unrelated to any other topic, such as living in Hong Kong or Taiwan or those languages that you mention, and only focus specifically on the topolects of Quanzhang Dialect of Southern Min language, and my own analogy to English etymology.

    • I am from Singapore, 3rd generation here. Basically we get all sorts of Min Nan dialects here. My grandfather is from north of Quanzhou, a village called Hui-An. I believe we are in the same situation as Taiwan.

      I am able to understand Xiamen and Zhangzhou, mostly because our Min Nan language is an aggregation of various Min Nan branches. I have friends whose ancestors are from Zhangzhou and Xiamen. But nevertheless, we use a simpler lexicon.

      My Min Nan is better than that of the average population in Singapore. Taiwanese for me is easy to understand. They have much more breadth than us. They are richer. A Taiwanese can understand us easily. If they choose a simpler vocab, we are able to communicate. For the general Singaporean, communicating with Taiwanese in their way is not so easy.

      Yilan poses a difficulty for me at normal speed. Nevertheless, I was still able to get quite a fair bit when I tried my virgin ear at Yilan a few years back. But at slow speed I think mutual intelligibility is high.

  26. Hello Mike. Ren says that Xiamen speakers understand speakers of the real Zhangzhou and Quanzhou. There are real speakers of these languages in these locations in China. I split them because I had some data indicating that speakers of the pure Quanzhou and Zhangzhou have a hard time understanding each other in China when they meet each other. Now, is that the case or is it not?

    If Xiamen, Zhangzhou and Quanzhou speakers can all get together and talk and pretty much understand each other without bilingual learning, then all 3 are dialects of a single language. Lects that understand each other are related dialects, bottom line.

  27. Could you please post the data you say you have? Otherwise I feel that I’m the only one posting data and that’s hardly sufficient for a comparison.

    If you follow the simple rules of logic, then the answer is yes, they understand. We have established that:

    X includes Z
    X includes Q
    X and Z are intelligible meaning:
    Z understands X(Q) and X(Z)
    X and Q are intelligible meaning:
    Q understands X(Q) and X(Z)
    Z and Q are intelligible.

    I’m curious to see data that shows otherwise.
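
    The chain of reasoning above can be checked with sets of pronunciations. This is a toy model and only a sketch: the word forms are the few examples from my earlier comment, the function name is made up, and it assumes that "understanding" a lect means recognizing every form it uses.

```python
def argument_holds(x_forms: set, z_forms: set, q_forms: set) -> bool:
    """Check the syllogism: if Z and Q speakers both understand X,
    do they also understand each other?

    Assumption: a speaker recognizes their own forms plus every form
    used in X (the premise that X is intelligible to both groups).
    """
    z_understood = z_forms | x_forms
    q_understood = q_forms | x_forms
    return q_forms <= z_understood and z_forms <= q_understood


# Example forms: fire, buy/sell, accent.
Q_FORMS = {"he", "bue", "khiuN"}
Z_FORMS = {"hue", "be", "khioN"}

# Case 1: X uses everything from both sources ("X includes Z", "X includes Q").
X_FULL_MIX = Q_FORMS | Z_FORMS
print(argument_holds(X_FULL_MIX, Z_FORMS, Q_FORMS))

# Case 2: X picks only one variant per word, so some forms never reach X.
X_PARTIAL_MIX = {"he", "be", "khiuN"}
print(argument_holds(X_PARTIAL_MIX, Z_FORMS, Q_FORMS))
```

    In this toy model the conclusion follows only when X really contains the forms of both sources; with a partial mix it can fail, which is the kind of case where test data would settle the question.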

  28. “They moved down to northeast Guangdong, after hundreds of years, a heavy dose of Cantonese went in, producing modern Teochew.”

    The above sentence refers to Chaozhou (which you write as Teochew).

    Chaozhou does not border on any Cantonese-speaking area. I don’t understand how Teochew has Cantonese influence. And I don’t see any trace of Cantonese in Chaozhou.

    Could you please show data to support this?

  29. Regarding Singaporean Hokkien, it’s the same language as Southern Min Xiamen/Quanzhang/Taiwanese.

    Having attended meetings with Singaporeans and Taiwanese, I have witnessed these people communicate at length in Southern Min. Sure, the Singaporeans have a different accent but that’s like British and American English. It doesn’t impede communication.

    If I ask somebody how much of a language they understand, and they reply 65%, I won’t believe them. What are they measuring? How did they come up with that number? If that’s off the top of their head, the margin of error must be astronomical.

    In your article you have links to Sanso and Chaenzo (Taiwan). I’ve been residing here and doing research on Chinese dialectology for more than a decade and never come across these terms. Do you have the Chinese equivalent (as neither have standard spelling). Also, your links point to Ethnologue’s Southern Min page only. There is no mention of these terms on those pages.

    Then you mention Yilan, which of course, I mentioned in my response from yesterday as having the distinct -uiN rhyme. But your proof that it is a separate language comes from an online discussion. Have you investigated Yilan, in that other than the -uiN rhyme, is there anything else that sets it apart? You’ll be hard pressed to find it. Besides, the -uiN rhyme is not rare or uncommon, as it is an older form that has died out in most Southern Min topolects. Like I mentioned, Xiamen and Jinmen used to have this pronunciation in their own names.

    Your link to an online discussion does not constitute data. Any speaker from anywhere in Taiwan can speak with any other person in Taiwan using Taiwanese and still understand each other fine. The differences are not that great. Most of the differences come in names of things, much like you have people in the US who say ladybug or ladybird. Those two uses don’t constitute different languages.
    If two speakers can’t, then their family is not Hokkien, perhaps they are Hakka, aborigine, or other (like me). The national language is Mandarin because not everybody comes from the same background. But as far as Southern Min speakers, it’s pretty uniform island-wide and does not differ enough from X/Z/Q to impede communication.

    • Teochew is Teochew, why censor the name and Mandarize it to Chaozhou?

      Sanso and Chaenzo are 泉州 (Coanciu) and 漳州 (Ciangciu) in Japanese, basically.

      There really is a mix of dialects in the city of Taipak / Taibei / Taihoku. The dialect in the city center was well-mixed, with a slight Coanciu lean, but in Sulim (Shilin) people spoke a dialect with a strong Ciangciu lean. Other districts spoke a Tang’oann-type dialect, which is in itself Coanciu cut with some Ciangciu, if you want to describe it that way. I use the past tense b/c young people are likely to speak the mixed, Ciangciu-leaning “Southern” style favored in the media, or not speak the language at all.

      To get back to the big picture, yes, I think it’s appropriate to consider Ciangciu and Coanciu as all part of one big language, along with the Hailok’hong / Haklau language spoken down the coast between Teochew and the Hong Kong area. Dialect differences can impede communication, but intelligibility can be won after a week or two of contact, whereas the differences between, say, Hokkien and Hinghwa could be considered irreconcilable without deliberate learning.

      One of the hardest dialects of Hokkien for me to understand at this point is Medan Hokkien. It is cut with a load of Malay, Teochew and Cantonese.

      • Thank you very much for all of this. Can Medan Hokkien understand the rest of Hokkien spoken in Malaysia and Indonesia (what I call Malay Hokkien)?

        • It depends. On the west coast of Malaysia from about Taiping on up to the Thai border (it used to be on up to at least Phuket), the Hokkien that’s spoken is almost identical to Medan Hokkien — we are talking about full comprehension, besides local knowledge and the fact that Medan speakers tend to have a shallower grasp of “pure Hokkien” and a deeper command of Malay.

          I’ve heard that some Penang speakers have trouble understanding the Hokkien spoken in Klang, Melaka and Singapore, so it follows that many Medan speakers would have at least as much trouble with those and probably more, since they are at the end of this continuum. Then again, many Penangites have little or no trouble with the southern dialects. Watching Taiwanese telenovelas seems to help. Penang/Medan Hokkien is somewhat creolized. It has Malay and Thai swirled pretty deep into it. The southern dialects are less creolized, if at all. They slot in between Penang and Taiwan on the continuum.

          Bagansiapiapi on Sumatra is a Hoklophone town. Some of the purest Hokkien in the world can be found there, b/c they’d escaped the onslaught of Mandarin till just the last few yrs. They talk like the people in Tang’oann, China (Tong’an on Google).

          Rumor has it two types of Hokkien are spoken in Kelantan, Malaysia. One is wildly creolized and possibly not mutually intelligible with any other Hokkien. Hokkien is also spoken throughout Riau and in a few places around the island of Kalimantan, inc. Kuching esp., and Brunei.

          Since U are into intelligibility… The Hokkien is laced with Teochew throughout this region, but the Teochew is also laced with Hokkien and even as an outside Hoklophone I find that some of the Teochew speakers speaking Teochew are easier to understand than many Hokkien speakers from China esp.

          English-language info on Hokkien can be dicey. The richest source is probably this forum: http://hakkadictionary.com/forums/viewforum.php?f=6&sid=ba7d6a5b9cbbd31e092a5fd0b87397ac … if U haven’t found it already.

          Before I quit, I should mention that “Malay Hokkien” is a misleading term and not even acceptable academically — I know U could care less about the social aspect. There are no Malay families that speak Hokkien — only individuals, although maybe a lot of them, esp. around Penang. There are actual dialects of Hokkien that go heavy on Malay elements — Kelantan, Medan, and Penang somewhat. Then there is Baba Malay, which is Hokkienized Malay spoken by Hokkienese. A better word for the concept U need to express might be “Nusantaran Hokkien”. But that might take in Philippines Hokkien as well.

        • I am from Singapore. I have heard Medan Minnan once, when my students from Indonesia were speaking among themselves. I did not try to communicate. The dialect is very difficult for me to understand, worse than Yilan.

          You can deduce that they are conversing in Minnan, and that is all. I understood little of their speech as a virgin ear. Singaporean Minnan speakers would figure out Taiwanese quickly if the Taiwanese used simpler vocab at normal speed, but I doubt that would be so for Medan. If they spoke slowly, I would have understood more. But I never asked them to speak slowly.

          Also, I have heard Medan just once in my life and never really tried to communicate with them. I simply observed as a passive listener. So my impression cannot be that accurate. For Yilan, I have been there and tried to communicate. So my assessment of Yilan is more accurate.

  30. Hello there. I am willing to accept your evidence that Singaporean and Taiwanese Min are the same and that they can all speak to each other, based on what you observed at the meetings. Thank you. However, I have a question. Is there some standard Southern Min that serves as a koine for both Taiwan and Singapore? If they were speaking the koine instead of Taiwanese Min and Singapore Min, then T Min and S Min may still be separate.

    If I am not mistaken, Sanso and Chaenzo were Min languages spoken on Taiwan around 1900 or so, and they were unintelligible at the time. I was looking for more data on them, but couldn’t find any. I assume that they must have gone extinct. Thank you once again. My work states that Yilan is a dialect and not a separate language, so you misread that part. You are telling me that all Taiwanese Min is intelligible. This is what I suspected but could not prove.

    Online discussions of whether or not people can understand each other is a form of data. Obviously it has weaknesses, but we have to start somewhere. Other forms of data include personal communications such as you are giving me here. The personal communications you give me here are 100% on the same level as online discussions.

    When doing research with native speakers, we ask them how much of a lect that they can understand. They say 90%, or “we understand it, it’s the same language”, or 85%, or 65-70%, or 60%, or 30%, they tell us all of these things. This data has problems, but intelligibility research shows that people’s estimates tend to line up well with their intelligibility.

    The problem with destructive criticism, which is what you specialize in, Mike, is that it offers no alternatives. You immediately shoot down all attempts to determine intelligibility of any and all Chinese lects as worthless and unreliable, but then we ask instead for you to offer some other way of determining this evidence. The evidence being the intelligibility of Chinese lects (ICL). You, Mike, offer no alternative methods whatsoever to enable us to determine ICL. If I read you correctly, your argument is that ICL is impossible to determine with any accuracy, and therefore all attempts to quantify ICL must be rejected. This mentality is not acceptable.

    I would like to remind you that some of the biggest names in Chinese linguistics (Sinologists) are supporting me in this endeavor. They also disagree with all of your assertions about non-quantifiability of ICL.

    WRT your questions about Xiamen, Zhangzhou and Quanzhou:

    Xiamen understands Zhangzhou and Quanzhou and vice versa. Implication being that all 3 are dialects of 1 language. Can Zhangzhou and Quanzhou understand each other in China? I am not sure, but you seem to be saying yes, correct?

    There is 1 language with dialects Taiwanese, Xiamen, Zhangzhou, Amoy, Jinjiang, Lufeng, Jiamen, Yilan and Quanzhou.

    Now we can call this language anything we wish. We can pick one of the names above. We can call it this or that. We can call it Mikecampbellese or Robertlindsayan. I suggest calling it Xiamen, but we can also call it Taiwanese or Amoy.

    • Singaporeans have more difficulty understanding Taiwanese Min Nan. The older generation understands Taiwanese better. The reason is that younger Singaporeans are no longer good at their dialects. The PAP government of Singapore imposes cultural genocide on Singaporeans.

      If you see Taiwanese communicating perfectly in Min Nan with Singaporeans, the primary reason is that the Taiwanese choose to use a simpler Min Nan lexicon.

      I found Singapore Min Nan and Taiwan Min Nan very similar. However, the Taiwanese have a richer vocab and are more expressive.

      However, it takes very little time for non-westernized Min Nan Singaporeans to pick up Taiwanese if they are given a chance to immerse themselves in Taiwan. I would say 3 months is enough.

  31. Sorry about destructive criticism. That wasn’t my intention. I’m not limiting you per se, but also not offering another way, because I don’t know of another way. That is not my area of research nor a strong area of interest.

    I don’t mind that you are asking people and I think that is a method, but I’m not sure how you’re recording it, or what criteria you have set up to find candidates. (Over exposure to a particular lect, or as you say, having learned it obviously does not meet at least one criterion).

    The term “Quanzhang” has already been established as a major dialect of Southern Min covering the topolects you mention. That is the name being used in publications. I’m sure you’d like to nominate it for language status. Okay then. I understand now what your criteria are.

    That name uses Chinese abbreviation methods:
    Quanzhou + Zhangzhou = Quanzhang.

    To be politically correct, even here in Taiwan, the term “Taiwanese” is not good. It’s a layman’s term for Southern Min which we use in everyday speech. However, I almost always refer to it as Southern Min in English, and Minnan-hua in Mandarin and Banlam-oe in Southern Min. The term “Taiwanese” in Taiwan (if referring to a language) almost always refers to the language Southern Min and not aborigines or Hakka. But I still refrain from using the term if possible. Especially in writing!

    What is Jiamen?

  32. In fact, the Teochew speakers ended up very close to the Cantonese speakers and they continue to live almost next door to them. Cantonese speakers often say that Teochew has a “Cantonese” sound or vibe to it.

    I will do some more research into this based on your criticism though.

    • Jason

      I’m a Teochew and I assure you this is a non-starter. Teochew is certainly not a fusion of Minnan and Cantonese. Unless they have actually learnt the other language, no Cantonese speaker can understand colloquial Teochew or vice versa. The only significant connection between the two languages is through literary Chinese.

    • Teochew speakers are very far from Cantonese speakers.

  34. ren

    Robert wrote,
    “The fact that you can understand Yunnan if spoken very slowly and repetitively is no good. You have to be able to understand it spoken at a normal speed and no repeating allowed.”

    “For something to be a dialect and not a language, you need to understand over 90% of it on a tape recorder or video spoken at normal speed with no repetitions. If you can’t, it’s a separate language.”

    I am a native English speaker who grew up in America, and a lot of varieties of English must be classified as separate languages by your standards. If you can understand 90% of Texan, a lot of others can’t. And tongues that require slow, repetitive speech into the ears of the average American in order to achieve 90% understanding, if that’s even possible, are Jamaican English, Scottish English, Irish English, Australian English, New Zealand English, the Cockney dialect of London, and I’m sure a whole bunch of others.

    I would say that a lot of dialects are actually entities that are in-between language and accent/dialect. To this goes the relationship between Jamaican English and Californian English, for example, as well as SW Mandarin and NE Mandarin, and Portuguese as it relates to Castilian and Italian.

  35. Answered in a new post.

    I’m getting kind of tired of endlessly debating this subject, sorry.

    In the field of Linguistics, this stuff is not very controversial and no one really cares. Outside Linguistics, it seems to be incredibly upsetting.

  36. ren

    So are English dialects different languages or not?

    It’s not so much upsetting as it is an unrealistic definition of language, when you can actually have a conversation with someone by speaking a different language. Would a Beijinger learning Jinan Mandarin or a New Yorker learning the Cockney accent be fulfilling a foreign language requirement in college?

    There’s just things that are between language and dialect and it doesn’t help by making criteria about how 90% over or below a cap is a language. It’s unworkable.

  37. Ethnologue is dividing languages from dialects at 90% intelligibility and they are the ones giving out ISO codes for new languages. If you have problems with that, take it up with them. Right now, linguistic science has granted SIL-Ethnologue the task of deciding what is and what is not a language, and they put it at 90%. That’s the state of the science in our field. If that bothers you, go blog on it.

    Linguists don’t care about this, but your average non-linguist who knows nothing about the field is completely outraged.

    Scots is already split off and so are the Caribbean creoles. Yinglish is split off too; it’s some sort of Orthodox Jewish New York dialect. The rest have not been split off yet. I would like to split some, but your average idiot is going to freak out so much if we do that we will never hear the end of it, so it’s more a political question than anything else.

    SIL says that English, Yinglish, Scots and the Caribbean Creoles are all separate languages. All the rest are dialects.

    The truth is that almost all Americans can almost always understand almost all other Americans. To compare US English dialects to the wildly divergent dialects of German and Chinese is completely insane. We barely even have dialects here; we just have accents.

    The dean of Chinese linguistics, Jerry Norman, says that Chinese is made of 350-400 separate languages. I have some of the top Sinologists in the world supporting me on this project, so unless you can offer some objective criticism or tone it down, I’m going to have to shut you down. If you don’t knock it off, I’m going to ban you. Ok?

    I estimate intelligibility between SW Mandarin and the rest of Mandarin is on the order of 20-30%, but studies are lacking.

    A Beijinger learning Jinan would not be fulfilling a foreign language requirement, no. Nor would a New Yorker learning Cockney. BTW, I can understand a lot of Cockney.

    If intelligibility is lower than 90%, it’s hard to have a good conversation about anything complex anyway. If the only way someone can talk to you is real slow and repeating himself all the time, you may well be speaking very closely related foreign languages, despite the fact that you are communicating. Very closely related foreign languages can definitely communicate with each other on some level. This may bother you, but in the field it’s not controversial.

    People living in the region do NOT share your views on SW Mandarin. Chengdu SW Mandarin speakers say they can’t even understand Chongqing SW Mandarin and vice versa. A top Sinologist friend of mine went to Mount Emei in Sichuan.

    On the way up the mountain, he and his wife passed people speaking incomprehensible languages. They finally asked a guide what they were speaking and he said they were speaking a wide variety of SW Mandarin dialects from around Mt. Emei. Neither the professor nor his wife could understand one single word of these dialects!

    His wife was a native speaker of Chengdu SW Mandarin, which is spoken only 100 miles away! And you tell me SW Mandarin is all intelligible with NE Mandarin? Forget it, man. I ain’t buying.

    I would like to ask you what makes you so sure you are speaking to someone who is speaking SW Mandarin to you? Maybe they are speaking Putonghua with a heavy SW Mandarin accent, no?

    You are taking the “everyone understands everyone” POV. I reject this. I know a guy who speaks Hong Kong Cantonese, and he says that there are “thousands” of dialects of Cantonese. He also said that in general, “most” of them can’t understand each other very well.

    No, I’m not saying he’s right, but here is a native speaker of Cantonese telling me that there are 1000 separate languages of Cantonese alone. When I hear people tell me this, and you “everyone can understand everyone” guys come around, I get really dubious about your line.

  38. I would also point out that in Linguistics in general, there is *nothing* that is “between a language and a dialect”. Just about every widely spoken speech form is either a language or a dialect.

    I think what you are referring to are either divergent dialects or very closely related languages.

  39. ren

    No, I’m not upset about anything. I’m simply discussing concepts with you, to which you can’t seem to reply in a mature, thoughtful fashion, except to say that Ethnologue says so and to say that you will ban me (for what?, and as if that would “upset” me).

    The bottom line is that SW Mandarin is to NE Mandarin as Jamaican English is to Yankee English. You may wish to call SW Mandarin and NE Mandarin separate languages but then you have to use the same standard for English as well.

  40. Point is, Ren, Jamaican English and US English are *two completely separate languages* according to ISO, which is the scientific organization that decides all such things. Do you understand me? Go to Ethnologue and look it up, ok?

    Look man, if you think that SIL having charge over ISO language designations is not ok, why don’t you just blog it, man.

    I am a very busy man and these criticisms are not scientifically useful in terms of our discipline.

    Also, if you want to criticize my scholarship, you need to research the tone that is used by academics in peer review and in academia. So all of your criticisms need to be in that tone. Your criticisms are not in an appropriate academic/peer review type tone, so I am threatening to ban you.

    You should note that I lumped all Taiwanese dialects together into one language based on criticism. I also lumped Beijing into a language called “NE Mandarin” based on your criticism.

    Thing is, I DO NOT have to set the same standard for English, ok? I can avoid that whole question on political grounds, which is probably what I want to do anyway.

    I have the world’s top Sinologists behind me on this. Who is on your side? Get some scholars to back you up and maybe I will listen to you.

  41. I doubt if you have the world’s top Sinologists behind you on this, and you haven’t mentioned any names except Jerry Norman–I would like to see proof that he is, as most of his publications are outdated and many of his points have been proven otherwise (unless he’s updated his publications). Most Sinologists are not westerners and are in China. My impression is that you’re communicating only with western scholars. Like I said before, research and publications in Sinology written in Chinese outnumber publications in western languages by 30 to 1. It’s typical to see in a bibliography that for every 300 references, maybe 10 are written in English or another western language. I think that Ren could concur with me on this? Simple enough to prove, just open one of Zhan Bohui’s (THIS IS A TOP SINOLOGIST!) huge bibliographies.

    I haven’t been to Yunnan or Sichuan yet, but I see people interviewed on TV all the time from those areas. Their southwest accents just sound like Americans speaking Mandarin who have their tones wrong (I say that jokingly but these people are comprehensible). Chinese would not be comprehensible if single words are spoken in isolation, so the context is very important for understanding individual words–this is even true for two people speaking the same Sinitic topolect. As we all know, words like ‘xi’, ‘yi’, ‘yu’, ‘xu’ can have hundreds of meanings.

    A Chinese friend of mine just returned from two weeks in Yunnan last year (some project with the locals–mostly children and adults) and she recorded lots of their conversations on video–I helped her with some of the video work and so have a copy of it all. I didn’t find anything strange about the Putonghua they were speaking and I didn’t detect any use of local dialect either. What I really need to do is make a video of myself touring around China interviewing people and see what dialects we come across. That would be very interesting. Now where do I get the time and funding for that? I’d leave my job if I could just do this all the time. I love collecting data and investigating more languages.

    Actually, we have to make the point that there is a difference between a southwest dialect, for example Emei (http://language.glossika.com/zh1459/), and Emei-accented Putonghua. For example, while traveling I cannot understand people’s local dialects, and I have a hard time understanding their accented Putonghua. There are certain areas that I can’t understand (accented Putonghua) more than other areas.

    HOWEVER, I found that if the person comes from a non-Mandarin speaking area, I can always understand their Putonghua better than someone coming from a Mandarin speaking area. The reason for this is because the non-Mandarin speakers learn Putonghua as a foreign language and so the sound system and grammar is learned separately from their own way of speaking. However, Mandarin speakers bring their own sound system into Putonghua making it more difficult to understand them. People whom I’ve met speaking Putonghua with their own accents, specifically from Zhengcao dialect and Jianghuai dialect are particularly difficult for me to understand.

    This makes me think of somebody speaking Chinese-accented English. Sure, we can understand what they’re saying with a lot of difficulty, but if they were speaking Chinglish, nobody would understand unless they had an understanding of both Chinese and English. I don’t like mixing 2 languages. I prefer that if you’re going to speak one, you choose one and speak it correctly. More than half the time, I refuse to speak to Singaporeans in English because they usually speak Chinglish and don’t know how to link a subject-verb-object together to make a complete sentence. Personally, I find communicating in incomplete sentences very irritating. Even their Mandarin is hard to listen to. I speak enough of the *big* languages as it is; if they can’t choose one of them and speak it properly, then what’s the point? I don’t mind an accent, but I don’t like communicating in half of this and half of that. This is probably more sociolinguistics and off topic, sorry.

    Here is the conclusion from my writing:

    There are:
    1. Mandarin, English, and Chinglish, which is a mixing of the two.
    2. Singaporean-accented English: I find it’s mixed with very poor grammatical and syntactic structure
    3. Emei-accented Putonghua and Emei dialect are two different things
    4. Jianghuai dialect and Jianghuai-accented Putonghua are two different things
    5. Zhengcao dialect and Zhengcao-accented Putonghua are two different things
    6. As a fluent speaker of Putonghua, I cannot understand Jianghuai- and Zhengcao-accented Putonghua. (Nor would I be able to understand their local dialects)
    7. Yunnan-accented Putonghua is mutually intelligible with Putonghua
    8. Yunnan Mandarin dialects — I haven’t heard them

    Finally, I support the idea that there are accents (differences in phonology and prosody, as in British and American English), and dialects (in addition to phonology and prosody, differences in vocabulary and syntax) and languages (in addition are differences in vocabulary with many faux amis, and grammar). For example, Romance languages have the cognate “embarrassed” but Castilian Spanish’s cognate means “pregnant”, not to say they don’t have the cognate of “enceinte” but that’s just how people speak. Just as the cognate “vergogno” in Italian means shame, the Spanish cognate “avergonzado” is embarrassed or shamed. Such false friends appear all over the place in the Chinese dialects.


  42. Pingback: A Reclassification of the Occitan Language « Robert Lindsay

  43. Pingback: Sociolinguistics and the Language – Dialect Question « Robert Lindsay

  44. Patrick

    Hi Robert – not sure you’d want to include “Hong Kong Putonghua” as a language (although I understand linguists have a different definition of “language” from the layperson). Most native Hong Kong adults have only a cursory knowledge of Putonghua – there’s a negative relationship between Putonghua skills and age as young adults and students have been exposed or feel the need to be educated in Putonghua. The issue with the idea of “Hong Kong Putonghua” as a language is that there are actually no native speakers of it – just lots of Cantonese trying to speak Putonghua.

  45. Additionally, Putonghua is entirely artificial, like the RP accent of English; it’s almost everyone’s second or third lect. It’s understandable that learning this narrow lect doesn’t give you much help in learning closely related ones.

    An analogous phenomenon in the Internet domain is misspelled English: nativ speekers hav no problum with such errors, but non-nativ speekers often find them completely unintelligibul.

  46. mumbaki00

    I did the same with Philippine languages... but it is subject to revision.

  47. minus273

    As a native Sichuan Mandarin speaker, I would say that Chengdu, Mianyang, Chongqing and my native Deyang speak very much the same language and are mutually intelligible without prior exposure.
    Leshan is difficult, with different function words and retaining the MC entering tone (as a high tone, while they merged into a low falling tone in Chengdu-Chongqing).
    But personally I would take Leshan as dialect of the same language, being too easy to learn after some exposure.

    • Hi, the reason I split off Leshan is due to the post by Victor Mair, a friend of mine, who discussed going to Mt Emei with his Chinese wife, who speaks Chengdu. Even though she spoke Chengdu, she could not understand a word of what the people walking up the mountain were saying. They asked a guard and he said that they were speaking dialects from the various towns around Mt. Emei. I looked at a map and the nearest town to Mt. Emei was Leshan, so I assumed it was one of the unintelligible lects, but maybe I was wrong?

    • minus273

      Of course, Leshan is barely intelligible to the koinè (a different language on your terms). But I bet on my hypothetical Porsche that the dialect on Mt. Emei is as far from Leshan as it is from the koinè.
      Applying your good methodology to Northern and Eastern Fujian, you would have to add 300 languages to your current list. (Fujian is no smaller than South Germany, is more mountainous, and has an equally interesting population history)

  48. minus273

    Anyway, non-urban areas and smaller cities in Sichuan speak radically different from the koinè of the bigger cities (Chengdu-Chongqing, and by extension Guiyang, Wuhan, Kunming etc), as you have noticed from Emei, and the Hakka pockets around Sichuan.

    • I am going to leave Chengdu and Chongqing as separate because I have a quote from a Chinese woman from one of the cities (I think Chongqing) who said she has a hard time even understanding Chengdu. Is she possibly referring to some of the divergent Chengdu dialects that are spoken in the outlying areas of town?

    • minus273

      I once listened to the PRC’s propaganda radio to Taiwan, in Moiyen. I found it easy to understand, as the vocabulary is much too formal, and the tone value is almost the same as Sichuanese. I deeply suspect that Sichuanese was a Hakka-Gan dialect, mandarinized. (or that Moiyen is a Hakka adopting Southern Mandarin tone value by commerce)

  49. minus273

    Basically in Central Southern Min (may I relegate Teochew etc. as “Peripherical” Southern Min?), from what I’ve heard, Zhangzhou and Quanzhou are like two poles, Amoy being in the center, Taipei a little more Quanzhou in the center, and Yilan is almost unadulterated Zhangzhou.

    A “diasystem” with koinè located in the center?

    • Interesting. I am curious about whether or not Zhangzhou and Quanzhou can understand each other on the mainland. Some suggested that maybe they can’t completely understand each other.

    • minus273

      I guess not. If standard Taiwanese is fifty-fifty Zhang-Quan and Yilan is 80% Zhangzhou, and our average Taiwanese regards Yilan as difficult/incomprehensible.

    • minus273

      Just curious, but is this kind of dialect continuum kinda nightmare for classificationists? Say, if we cut it to two languages, suddenly the majority of CSM-speaking population (in Amoy and Taiwan and “colonies”) speak a language with a very fragile position. (“Do you say hong or hoang for wind?” – “Good, hoang. You speak Quanzhou Min.” “But I say suiN for sour!” – drop dead)
      And if we leave them as a language, socially-inexperienced teenagers in urban Quanzhou may legitimately claim that they couldn’t understand another majority variety of their own tongue.

  50. David

    Thanks a lot for this work, by the way. The group insanity of Chinese linguists (in the face of their “voluminous scholarship”) is well known to practically anyone who has seriously studied the Chinese languages. When a Chinese linguist says, “further study is required on the matter,” this almost always is a coded expression meaning, “Well, we only have established around 60% spoken intelligibility, but perhaps we can perform some sophistic loopholes with regards to syntactic and lexical analysis so that we can lump it with another language (preferably ‘Mandarin’).”

    The Chinese linguistic situation with regards to how they perceive their spoken lects would be akin to if Africa (a geographic area with a smaller population, and probably a fewer number of mutually unintelligible lects, by the way) were united under a single political framework, and the political/academic leadership suddenly began claiming, “Everyone speaks African! Oh wait no, there are maybe a dozen or so African dialects, but they are all variations of a single African! Everyone can understand everyone else! Oh wait, no, there are a few minority languages, but everyone else speaks African! No wait…” and on and on and on.

    It’s really a pity you don’t have access to linguistic scholarship written in putonghua, because from the mass of papers already published probably something like 1000 mutually unintelligible lects can already be concluded (although, like I said, this data is usually glossed over or discounted in favor of languages being declared as dialects).

    • Wow, great comment!

      Do you read Chinese? I don’t, and that really made this work difficult.

      • David

        When I was still in college, about 2 years’ intense study followed by 9 months at a Chinese university gave me reasonable enough command of baihua (written putonghua (not considered the same thing weirdly enough (though extremely close))) that I could read Chinese papers on dialectology, whereupon I immediately began to encounter this bizarre contradiction between treatment of Sinitic languages and treatment of pretty much everything else.*

        My Standard Mandarin has since declined to the point that it would probably take 6 to 8 months of serious work (as in 8-12 hours a day) to begin to read it again with any degree of literacy, and another year or so before I could rapidly browse and absorb the gist of academic papers. Unfortunately my time is being invested in other endeavors at the moment (although I would like to ultimately regain my grasp of the written standard sometime in the next 10 or 20 years), otherwise I would assuredly be forwarding you masses of paper summarizations.

        *Though, to be honest, this attitude is not much different than the German/Dutch dialect lumpers, or the English Scots denialists, or American AAVE denialists, etc. etc. It’s just the fact that China has 1.3 BILLION FUCKING PEOPLE, many of whom have been continuously inhabiting the region in geographically isolated pockets for 10,000 years or more, makes the difference between the “idealized national language” viewpoint and the facts on the ground brutally obvious in their scenario. I imagine that if the United States is continuously inhabited for another few thousand years we could possibly develop quite the same situation.

  51. Movenon

    Hi Robert, this is a wonderful post gathering evidence of mutual (un)intelligibility between Sinitic lects. Here are my two cents:

    -Under Min Bei, if 90% is taken as the critical point dividing language from dialect, Jianyang should probably be split off from Jian’ou. It seems they are very similar, but probably more in the 70-80% range? I doubt anyone’s bothered to do meticulous sampling.

    -Nanning Cantonese really should be classified as under the same language as Cantonese. The differences are very minute (basically a couple diphthongs, maybe a little slang), and intelligibility is extremely high, as in pretty much everything. As a native speaker of Standard Cantonese, I was able to understand everything, without prior knowledge/exposure. The thing to be careful about is making sure you find a speaker of Nanning Cantonese (AKA Nanning Baihua), and NOT Nanning Pinghua, which is obviously a Ping lect, formerly subsumed under Cantonese.

    Example of Nanning Cantonese on TV (which every Standard Cantonese speaker would be able to understand fully):

    -I noticed you mention the difference between Vernacular and Literary Teochew as being significant. Personally, I’d avoid opening that can of worms for now, as the Vernacular/Literary split is the direct result of digraphia in written Sinitic (in the past, between lects and Classical Chinese, now, usually between lects and Written Vernacular Chinese based on Standard Mandarin). So basically the pure Literary form would be used only for reciting Classical Chinese texts, and a Literary form mixed with elements of the vernacular would be used for reading a typical modern text. This is by no means anything unique to Teochew, in fact, virtually every lect has this. Even in Standard Mandarin, when reciting Classical Chinese texts, there are many special readings of characters that should be used. So these aren’t even separate registers, mostly situational pronunciations for characters. Listing a Literary version of each lect as a separate language would be about as viable as listing Go’on, To’on, and Kan’on readings in Japanese as separate languages!

    -Under Pu-xian Min, Singapore’s Hinghwa (Heng Hua, Hing Hua, however you spell it) has high, but probably not 90%, intelligibility with Putian City. I don’t know what town or county in Putian prefecture those speakers migrated from; perhaps that is the reason. The other potential reason may be that as Hinghwa is a very small language in Singapore, it has mixed considerably with Singaporean Hokkien, Malay, English, and all the other languages spoken in Singapore. Either way, something’s up.

    -Within Shaozhou Tuhua, there have got to be multiple languages. In Lechang, there are at least five lects which are likely under the 90% bar. In addition, many Tuhua lects themselves are starting to splinter under language shift, as influences from Southwestern Mandarin, Hakka, and Cantonese affect different locales. It seems that the pronunciation has changed between the oldest generation and the youngest generation that still speaks it, although I do not know whether this is enough to get below the 90% range.

    -Within Wenzhou Wu, there are a couple rural dialects that people from Wenzhou city have some difficulties understanding. The rural speakers say they understand city speakers, though. As the links to Glossika are down, I can’t see which lects you are referring to in your post.

    Perhaps a little organization would help? Anyway, keep up the good work!

    • Thanks so much sir!

      I made all of the changes you recommended. Great work!

    • I am from Singapore. I just looked up the map. My ancestral village in Hui-an, China, north of Quanzhou, is just 8 km south of the border with Puxian or Xinghua land. I just realized that my ancestral land is the northernmost Min-nan land.

      I cannot understand them. It’s very much mutually unintelligible.

      I believe we are 2 different Sinitic tribes who happen to be side by side and never really mixed.

      I can understand Chaozhou or Taiwan, which are 200-300 km away.

      I realize that in Fujian, mutual intelligibility has more to do with whether you are descended from the same tribe than with geographical proximity.

      The case of Hui An Min Nan is a clear example that there is not so much of a lect continuum in Fujian, but rather a clearly demarcated lect boundary.

  52. caffeind

    If A is mutually intelligible with B (by 90% or whatever standard) and B is mutually intelligible with C, but A is not mutually intelligible with C, then A and B are part of the same language and B and C are part of the same language, but A and C are not part of the same language. Mathematically, we would say something like “partitioning a dialect continuum into ‘languages’ based on the mutual intelligibility criterion is not well-defined.”

    Mutual intelligibility in practice is mostly sociological. If you have never been exposed to lect A, or even if lect A speakers are all around you but A is low-prestige and you have made no attempt to learn A, comprehension is very low. If you have more exposure, comprehensibility goes up rapidly and you are able to start using regular correspondences to partially predict the meaning of A words based on your knowledge of related lect B. If A is high-prestige, economically advantaged, or a medium for education or electronic media in your area, you are likely to understand it well even if your own lect C is not closely related to A.

    The mutual intelligibility definition of “a language” is mathematically ambiguous or undecidable, assumes an idealized situation where speakers of each lect are innocent of exposure to others, is undecided in practical cases as we see from the arguments above and elsewhere, and grossly contradicts the ordinary, non-linguist-jargon meaning of “language” which usually refers to a somewhat standardized medium that has some currency and has nothing to do with obscure rural dialects. It would be better for linguists to just refer to mutual intelligibility as mutual intelligibility, not load it onto the words “language” and “dialect”.
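The non-transitivity point in this comment can be made concrete with a small sketch. The three lects and their pairwise scores below are invented purely for illustration, and the function names are hypothetical; only the 90% cutoff comes from the discussion above. The sketch shows that a pairwise threshold can hold for A-B and B-C while failing for A-C, and that grouping by chains of links then puts A and C in one "language" anyway.

```python
from itertools import combinations

THRESHOLD = 0.90  # the 90% cutoff discussed in this thread

# Hypothetical pairwise intelligibility scores for three lects on a continuum.
# These numbers are made up for illustration, not measured data.
scores = {
    frozenset({"A", "B"}): 0.93,
    frozenset({"B", "C"}): 0.92,
    frozenset({"A", "C"}): 0.78,
}


def same_language(x, y):
    """True if x and y meet the pairwise threshold (a lect always matches itself)."""
    return x == y or scores[frozenset({x, y})] >= THRESHOLD


def non_transitive_triples(lects):
    """Return (x, y, z) triples where x~y and y~z hold but x~z fails."""
    found = []
    for x, z in combinations(sorted(lects), 2):
        for y in sorted(lects):
            if y in (x, z):
                continue
            if same_language(x, y) and same_language(y, z) and not same_language(x, z):
                found.append((x, y, z))
    return found


def chain_groups(lects):
    """Group lects by chaining: any path of above-threshold links joins them."""
    groups = []
    for lect in sorted(lects):
        linked = [g for g in groups if any(same_language(lect, m) for m in g)]
        merged = {lect}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


lects = {"A", "B", "C"}
print(non_transitive_triples(lects))
print(chain_groups(lects))
```

With these made-up scores, the pairwise test says A and C are different languages, while chaining through B places all three in a single group. Which answer a classification reports depends on a grouping rule that the 90% criterion alone does not supply.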

  53. Movenon

    Hi Robert,

    I’ve been looking into the Oujiang group within Southern Wu Chinese. Things are very complicated, as expected. I’m not done yet, but I would confirm separation of Wenxi from Wenzhou proper (city). The Wenxi lect seems to be listed as “Qingtian” under Glossika’s system. However, this is quite misleading, as Qingtian dialect refers to something different, the main dialect of Qingtian (Chuqu subgroup). The Oujiang dialect spoken in one village within Qingtian county, named Wenxi, is listed by Glossika as the “Oujiang lect of Qingtian.” Confusing, eh? It’s a shame Glossika’s site is still down, I can’t always see which lects are being referred to in the post.

    I suspect more Oujiang lects will fall short of 90% with metropolitan Wenzhou dialect. But at the moment, I’m still reading about them.

    There are two other Sinitic languages that need to be enumerated. Lumped within Mindong (Eastern Min) are Manjiang spoken in the central part of Taishun County and Manhua spoken in the eastern part of Cangnan County. The names Manhua and Manjiang both mean “barbarian speech.” These enigmatic languages are likely a fusion of Southern Wu (Wenzhou etc.), Eastern Min, Northern Min, and maybe even pre-Sinitic languages. Seems like Eastern Min is used on Wikipedia as the genetic affiliation, but I wouldn’t be too sure of that yet. The Zhejiang-Fujian provincial border is a hotspot of diversity of all kinds of Sinitic languages and dialects. Anyway, I’ll be back with more eventually.

    By the way, it would be interesting if you did this project for some other big macrolanguages too. I suggest Japanese (lots of “dialects” even in the mainland), Spanish, Arabic, Italian, and “Hindi.” Come back to your linguist roots 🙂

    • Hi.

      Spanish, Portuguese, Arabic and Italian are all done! I just need to write them up is all. They are in “notes” form right now. But the task is sort of overwhelming for some reason. I look at the notes and start to get a headache.

      So Manhua and Manjiang are separate languages from Eastern Min, correct?

      How are you reading about this stuff? You can read Chinese? If I could read Chinese, I would have been able to do a better job on this. Most of the really good material was in the primary language (Chinese). Not so much in English.

      • Movenon

        Yeah, I read/speak Chinese(s). Manhua and Manjiang are definitely unintelligible with Fuzhou dialect, and I don’t see any other evidence of intelligibility with other Eastern Min lects yet. I’m still reading about them. Very interesting. In general, the whole southern Zhejiang- northern Fujian border is still understudied. There are a number of phyla represented there- Oujiang Wu, Chuqu Wu, Shaojiang Min, Northern Min, Eastern Min, Southern Min, Central Min, Manhua, Manjiang, Hakka, Gan, etc. each of which has many mutually unintelligible subdivisions. And the linguistic divide between Wu and Min is murky around here too. There are some southern Wu lects that clearly align with general Min in initials of some “litmus test” words. And some southern Wu lects have lost the 3-way distinction in plosives. Meanwhile, Manjiang still has the 3-way distinction, which to me is a clear non-Eastern Min indicator. Either it had kept it all along, or it was influenced back in from Wenzhou dialect. Anyway, I’m still reading.

        I’d really like to see your work on the other macrolanguages, especially on Arabic dialects… I hope you can post it soon!

        • Wow, great, you are Korean and you learned how to read Chinese? Why?

          All that stuff is done. It just needs to be written up and formatted into WordPress. But some of it is so confusing that I will have a hard time writing it up. For instance, I wrote up Ladin, but now when I go back and try to read it I have a hard time making sense of it because it is so confusing.

          French and Italian have lots of languages in them. Spanish, not so many. Portuguese, less. Arabic, quite a few.

          Oh, I also did Frisian.

      • Movenon

        Hmm, it seems Taishun’s Manjiang shares affinity with Shouning variety of Eastern Min. Apparently, phonology, vocab, grammar are similar to Shouning, but I haven’t found anything discussing whether they are mutually intelligible. Anyway, Manjiang is under the Eastern Min Macrolanguage.

        Manhua is totally different, so don’t get the names confused. There are varying opinions whether it can be lumped into macro-Wu, macro-Min, or is a creole of the two. I personally think it’s Wu, based on phonology (it has voiced consonants and labiodentals), and some of the Min-like traits can be seen in other Wu lects, such as a couple in the Chuqu group.

        Complicating things further is the administrative decision to split off Cangnan as an independent county from Pingyang County. So, what was formerly referred to as Pingyang Manhua is now Cangnan Manhua, because the area of Pingyang County that was Manhua-speaking became part of the new Cangnan County.

        Within Pingyang/Cangnan Manhua, there is a northern group in Yishan town and a southern group in Qianku and Jinxiang towns. Jinxiang also has its own distinctive Wu lect, with some Mandarin influences. Manhua speakers take Qianku as their standard dialect. The difference between the northern group and the southern group may impede intelligibility, but so far I haven’t found any confirmation either way.

        Unfortunately, some publications I’ve read aren’t very careful with the two names Manhua and Manjiang, which makes things very confusing sometimes. In addition, there is also some sort of lect spoken in Hedi village of Qingyuan County in Lishui, which so far seems to be more like Taishun Manjiang, but often takes the name Manhua instead.

        There is a Taihu (Northern group) Wu outlier lect in Jinxiang town. Also, there is an aberrant Wu lect in Luoyang town of Taishun county, that has been influenced by Taishun’s Manjiang as well as Oujiang Wu.

        Also, can you change Wenxi to a town, rather than a village? That’s a mistake on my part.

        • Movenon

          Oops, Hedi is a town too, not village. Administrative divisions can be so confusing sometimes across languages.

  54. Nikephoros

    There’s an interesting story about Spanish. A guy from Venezuela called Andres Bello essentially saved the unity of the language. It happens that some prominent Argentinian academicians launched a proposal to devise a separate Argentinian official language, based on their local lect. Bello convinced them of the advantages of carrying on with the Spanish standard, like having a unified standard for communication all across Latin America, as well as with the former metropolis. Had they succeeded in implanting this Argie standard, it is possible other countries would have followed their lead, and today we would have had a number of different languages derived from Spanish.

    • There is a totally hardcore Buenos Aires dialect that is used in a lot of tango songs that other Spanish speakers can’t really understand. I forget what it’s called at the moment. It’s more a slang than anything else.

      I do think there are some separate Spanishes in Latin America, but not too many of them. One is Colombian Caribbean Spanish. No one can understand this.

      I have also heard that Spaniards cannot understand Argentine Spanish, and Argentine movies get subtitles in Mexico.

      Argentine Spanish sounds totally bizarre to me. It sounds “European.” I think Catalan or something like that.

      • Nikephoros

        Lunfardo. It is a criminal argot, heavily Neapolitan-influenced. Indeed, porteño dialect has been said to be exactly like Neapolitan in its accentuation, due to the huge influx of migrants from Southern Italy. As for Colombian Caribbean, like other Caribbean dialects, it is a reflex of Andalusian. Those dialects have incorporated a number of Anglicisms, due to the American presence in the region a century ago. Still understandable, barring the odd word here and there. Actually, I don’t have more of a problem understanding the Caribbean dialect than I have with rural dialects of my own region. But, if you included Creoles, there’s one in the Colombian Caribbean, in a settlement founded by escaped slaves near Cartagena called San Basilio de Palenque: Criollo Palenquero. As expected, it is influenced by a number of African languages, principally Kongo. But the prize for shittiest dialect unarguably goes to Chilean. If you didn’t understand Argie, Chilean is gonna make you nuts.

  55. Saim Inayatullah

    What’s the point of all this? Do you think China will allow the standardization of regional languages if there are actually more than a hundred of them? At best we can hope that Cantonese and Taiwanese don’t die out because of Taiwan’s and Hong Kong/Macau’s political autonomy.

    In fact, this is part of the argument many nationalistic Chinese use to defend a single standard (i.e. standard Mandarin) for the Chinese varieties. “If we standardized and officialized Cantonese and Hokkien and Shanghainese, then soon enough each village would have its own language!”. What a load of horseshit.

  56. Daniel

    Mr. Lindsay, I re-post here my comment (first posted April 9) on Prof. Mair’s article regarding intelligibility of Sinitic varieties:

    Chinese “languages,” or “dialects”? I have been stumped by this question for years, but I think I have recently settled on a resolution.

    For the linguist, it would of course be ideal to just apply “purely linguistic” criteria for classifying speech varieties as languages or dialects, and some (like Franz Bebop, in his comments of March 12, 2009, 1:11 am and 8:54 am) are quite uncomfortable with designating languages based on “non-linguistic” criteria like political, socio-cultural, and historical factors (although the idea of these being “non-linguistic” is itself contentious; don’t these factors fall within the purview of sociolinguistics?). Using “purely linguistic” criteria, all the debates about Sinitic, Arabic, Hindi-Urdu, Serbo-Croatian, etc., etc. would theoretically be put to rest.

    The problem is, the major linguistic criterion that is being touted for language-dialect differentiation is the mutual intelligibility criterion, which has a host of inadequacies, including the following theoretical ones:
    (1) mutual intelligibility is not an all-or-none thing, the degree of mutual intelligibility is usually some value between 0% and 100% – What then should be the threshold of mutual intelligibility? 60%?, 70%?, 90%?, 91.5%?
    (2) intelligibility can be asymmetrical between two speech varieties
    (3) no universally accepted method for measuring intelligibility – For example, should it simply be based on the number of words understood? If someone understands 5 out of 6 words in the sentence “I will not sell my horse” that is 83% of the words, but if the word that is not understood is the crucial “not,” then the whole message does not get through.
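    The horse example in point (3) can be made concrete with a small sketch. This is only an illustration: the function names `word_score` and `message_understood`, and the idea of marking a set of "critical" words, are assumptions made here, not an established measurement method.

```python
def word_score(sentence_words, understood):
    """Fraction of the words in the sentence that the listener understood."""
    hits = sum(1 for w in sentence_words if w in understood)
    return hits / len(sentence_words)


def message_understood(critical, understood):
    """Hypothetical stricter test: every critical word must be understood."""
    return all(w in understood for w in critical)


sentence = ["I", "will", "not", "sell", "my", "horse"]
understood = {"I", "will", "sell", "my", "horse"}  # the listener misses "not"
critical = {"not", "sell", "horse"}  # assumed set of meaning-bearing words

score = word_score(sentence, understood)
got_message = message_understood(critical, understood)
print(f"word-level score: {score:.0%}")  # 5 of 6 words
print("message understood:", got_message)
```

    Under these assumptions the word-level score looks high while the stricter test fails, which is the gap the comment is pointing at.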

    Aside from the theoretical ones, the mutual intelligibility criterion also has what might be called “practical” or speaker-related problems (as enumerated in http://www.linguasphere.info/spip.php?article171096) which would affect the measurement:
    (1) different linguistic aptitude among individuals (which probably explains why some of the Mandarin-speaking commenters had more difficulty in understanding Cantonese, whereas others had less)
    (2) previous linguistic experience and/or education of the speakers being evaluated (For example, considering that Standard Mandarin is taught in schools, is it really possible to accurately measure the degree of mutual intelligibility between a Mandarin and non-Mandarin topolect? Should we get uneducated speakers to measure mutual intelligibility?…Could it be that the commenters who had less difficulty understanding Cantonese were those who had a longer time of contact with Cantonese speakers?)
    (3) reciprocal feelings among speakers of the different varieties
    (4) subject matter under discussion (As you previously pointed out, “highly Mandarinized” formal written Cantonese may be easier to understand than conversational Cantonese in speech, comics, and short stories; I suspect TV news broadcasts would also be easier to understand than dialogues of TV dramas)
    (5) hearing acuity of the speakers

    And assuming that all these problems could be surmounted, there is another thing which would render the mutual intelligibility criterion useless: the existence of dialect continua (as already pointed out by LoveEncounterFlow, March 7, 2009, 4:50 pm, and Merri, March 9, 12:49 pm). The Sinitic topolects, which form a speech continuum, are often compared to the Romance languages in terms of diversity (“as different as French or Spanish is from Italian”—or some other variant of the statement). However, West Romance dialects constitute a continuum, and the division of these dialects into “Portuguese,” “Castilian,” “Catalan,” “French,” “Walloon,”…is arbitrary. If the division of West Romance or North Germanic into separate languages is considered arbitrary, then—following the suggestion that what applies to other languages must also apply to Chinese—the division of the Sinitic continuum into separate languages (whether 7, 8, 10, 13, or more) would be just as arbitrary. The debates about Jin, Old and New Xiang, and Northern and Southern Wu are perhaps a reflection of the difficulty of placing the dividing lines. The Sinitic continuum is also more likely to be perpetuated, since unlike West Romance or North Germanic with their numerous standard languages, the Sinitic sphere has only one standard language, toward which all the topolects may eventually gravitate. Since any division into languages would be arbitrary, the logical alternatives would be (1) to consider the entire Sinitic continuum as a single language, or (2) to consider each topolect as an individual language (which means there would be thousands of Sinitic “languages” if the counties are used as basis)
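    The continuum argument can be sketched with made-up numbers. Assume a hypothetical chain of five villages, A to E, in which intelligibility falls by 4 percentage points per step along the chain. The figures, the linear fall-off, and the 90% cut-off are all illustrative assumptions, not measured data.

```python
from itertools import combinations

villages = ["A", "B", "C", "D", "E"]  # hypothetical chain, neighbours adjacent
THRESHOLD = 0.90                      # assumed dialect/language cut-off
DROP_PER_STEP = 0.04                  # assumed loss per step along the chain


def intelligibility(i, j):
    """Made-up score that falls off linearly with distance in the chain."""
    return 1.0 - DROP_PER_STEP * abs(i - j)


# For every pair of villages, do they count as "the same language"?
same_language = {
    (villages[i], villages[j]): intelligibility(i, j) >= THRESHOLD
    for i, j in combinations(range(len(villages)), 2)
}

# Every adjacent pair passes the threshold...
adjacent_ok = all(
    same_language[(villages[i], villages[i + 1])]
    for i in range(len(villages) - 1)
)
# ...yet the two ends of the chain do not.
ends_ok = same_language[("A", "E")]
print(adjacent_ok, ends_ok)
```

    With these invented numbers, "same language" is not transitive: any boundary drawn along the chain separates two neighbouring villages that understand each other above the threshold.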

    Therefore, the reason for the Sinitic problem does not seem to be Chinese nationalism or official intransigence (though both may be very much present), but rather the fact that there is as yet no satisfactory, “purely linguistic” definition of language. In other words, it is not just a problem of Sinitic, but a problem of West Romance, North Germanic, West Germanic, North Slavic, South Slavic, Indic, Arabic, and Turkic as well. Or to put it in another way, it is not a “problem” of any of these, but rather a problem of the linguists themselves, who have yet to come up with a definition of language which conforms to reality.

    Since there is as yet no satisfactory linguistic definition of language, we will just have to fall back on the status quo and rely on the non-linguistic—political, socio-cultural, historical—definitions of language. It seems we would just have to accept the old familiar languages like Portuguese, Spanish, Norwegian, Danish, Ukrainian, Russian, Kirgiz, Kazakh, Hindi, Urdu, and, with apologies to Franz Bebop, Chinese.
    * * *
    Of course, I have learned new things after reading your article here, like the fact that linguists concur on a 90% intelligibility threshold to distinguish languages from dialects.

    Now, I must apologize, for having re-posted that lengthy comment just to serve as a backgrounder to my questions. (1) I learned that the Sinitic spoken varieties formed a continuum from Mahé Ben Hamed, “Neighbour-nets portray the Chinese dialect continuum and the linguistic legacy of China’s demic history,” Proceedings of the Royal Society, Series B 271 (2005): 1015-1022 (published online). Is there really abundant evidence for a Sinitic continuum, or does Hamed offer a minority view? (2) Have linguists resolved the problem of the definition of language, particularly in connection with the existence of language continua?

  57. Iris

    Cantonese unfortunately did not lose out by one vote in becoming Standard Chinese – it’s a myth (and I’m from Hong Kong so I’d really like it to NOT be a myth) – here’s the article that explains it a bit more:

  58. Hi Robert, you are generally quite accurate. I want to add some opinions.

    I am from Hui-an, a village north of Quanzhou. My Min Nan dialect is the northernmost branch. I found myself understanding roughly 50% of Teochew if they speak normally. But I found myself understanding almost 0% of Northern Min dialect. I do not interact that much in Teochew. My ears are not virgin ears as to Teochew. The Teochew people in Singapore speak Min Nan.

    Nevertheless, I consider Teochew part of Min Nan.

    For Eastern Min or Northern Min, I could understand at most 0-20%. It’s very unintelligible.

    For Taiwanese in Yilan, I could understand 30-40% at normal speed. I tested my virgin ears when I visited Yilan some years back. If they speak very slowly, I would be able to understand 60-90%.

    Slowing down in Taiwanese or Xiamen Min Nan helps a lot for me to understand them. Slowing down in Teochew helps a lot as well, but not as much as with other Min Nan dialects. The reason is that Teochew has its own lexicon. I think Teochew is in the Min Nan group, but its small differences in lexicon and unique pronunciations not found in other Min Nan often catch me by surprise. I can pick up Teochew fast if I want to learn the language.

    For eastern Taiwanese, I understand much better.

  59. I believe my dialect should be Quanzhou. My ancestral village is Hui-An, north of Quanzhou, now being absorbed into Quanzhou city.
    It is the northernmost Min Nan dialect. My grandfather and mother were born in Hui-An. I have no problem understanding them or any of my tribe members. I have been to Quanzhou, and I understand them. But if they speak in a more literary way, I may have problems. Anyway, I do not get to hear very literary Min Nan on the streets. My colloquial communication with them is very mutually intelligible.

    For Singaporean Min Nan born in Xiamen or Zhangzhou, I could understand them as well. Probably they are not speaking pure Xiamen or Zhangzhou Min Nan, but just Singaporean Min Nan.

    My sister-in-law is a native Taiwanese. She can understand us 100%. I understand her 100% in day-to-day use. But if she speaks in the full richness and breadth of Taiwanese Min Nan, I believe my intelligibility will drop to 60%.

    In Singapore, you find all sorts of Min Nan people. There are many Teochew people as well. So my ears are not virgin. But I have no problem understanding Zhangzhou as well in basic conversation, and even Teochew I understand to a large extent. The reason is we generally do not need to convey complicated ideas in daily life.

    But we are more westernized. The young people do not speak dialect often. That is the main reason why communication between us and China/Taiwan is asymmetrical. If they use common Min Nan, it’s OK for me at normal speed. If they use literary Min Nan, especially Taiwanese, I would have some difficulties.

    Some comments on Mandarin.
    Most Chinese learn Mandarin. All educated Chinese are very mutually intelligible to one another in Mandarin. We Singaporeans and Malaysians speak a Mandarin heavily colored by our dialects. But since all Mandarin acquired in schools is anchored on the Beijing variant, educated Chinese can understand one another 100% in Mandarin.

    My sister married a Shandong man. She told me she could not understand Shandong Mandarin at normal speed when she visited her husband’s village. She could understand them at slow speed.

    The reason Mandarin is mutually unintelligible among some Chinese is mostly that they are not that educated, and their Mandarin is less biased towards the Beijing variant. Or they prefer to speak their colloquial Mandarin.

  60. My mum told me she can understand many Min Nan varieties to a large extent. For me, understanding Taiwanese soap operas is not a problem. If I watch a political debate, my understanding drops to around 60%. I tested her using a Taiwanese political debate program in Taiwanese. She can understand more than 90%.

    But my mum is educated and she has been more exposed to various Min Nan news radio since she was young. My generation went through cultural genocide by the PAP government.

    Even then, I regard my Min Nan as among the better ones in my generation. For Mandarin TV everywhere in the world, I understand 100%, from soap opera to serious documentary.

  61. Recently on Facebook a number of us discussed which Hokkien and Teochew dialects we understand best, next best, worst, next worst, etc. I recall that of the 3 or 4 of us that spoke “General Taiwan” or “Haiteng” (a district of Ciangciu very close to Amoy), all of us agreed that the Coanciu City dialect was no easier to understand than Teochew, or at least Swatow-type Teochew.

    Regarding Manjiang, known natively as “Mango” or something like that — there is very little data on it floating around. I saw some data once and it doesn’t “look” like a Sinitic language, no more than Vietnamese does…

  62. Eidolon

    I find this exercise intellectually stimulating, but otherwise not all that practical. The central criterion used, mutual intelligibility, is changing all the time. It is a fact that language standardization, displacement, and loss are ongoing, especially in China. Hundreds of years of illiterate, uneducated villagers living out their entire lives within 30 miles of their village have created tons of mutually unintelligible languages in China. But such isolation – the engine of new language creation – has come to an end. Today, mass communication, standardized education, and mobile populations have completely changed the landscape of mutual intelligibility within China. It won’t be long before Chinese speak just one of 7-8 major languages, and then just one of 2-3 major languages, and eventually just one language.

    It’s worthwhile to keep track of major languages eg Cantonese, Hakka, etc. because they have a shot of surviving for a few generations, at the minimum, and therefore being relevant to the future. But the exercise of trying to track down whether a sub-dialect within a sub-dialect is, in fact, a language, is futile. In 1-2 generations, it’s going to be gone, before you even gather the money & resources needed to test its mutual intelligibility with other sub-dialects.

  63. True, as in most of the rest of the world. I would not say the loss is “ongoing” in China, though. It really only started about 30-35 years ago.

    Given the socio-political environment, we could reasonably expect all ethnic Chinese to be speaking Mandarin in a century’s time. However, the situation on the ground with Cantonese defies such expectations at this point. The youngest generation of Mandarin speakers in the Cantonese megalopolis is actually shifting TO Cantonese during their school years in spite of instruction being mostly in Mandarin. Some kind of “soft power play” is in effect. There’s an outside chance that Mandophone China will switch to English — say, in 150 years — while Cantophone China continues to speak Cantonese. I say from direct observation on the ground in the Mega Cantopolis.

  64. Pingback: Beyond Highbrow - Robert Lindsay

  65. Pingback: Hong Kong: a vortex of mother tongue and motherland – RoundTableChina

  66. Pingback: Hong Kong in Vortex: Cantonese or Mandarin? – RoundTableChina
