Monday, September 29, 2008


Cantonese, or Yue, is a major Chinese dialect group or language, a member of the Sino-Tibetan family of languages. Colloquially, it is also known as 廣東話. The exact number of Cantonese speakers is unknown due to a lack of statistics and census data. The areas with the highest concentration of speakers are Guangdong and some parts of Guangxi in southern mainland China, Hong Kong and Macau. Cantonese is the ''de facto'' official language of Hong Kong. Other major groups of speakers include Chinese minorities in Southeast Asia. The name is said to derive from ''Canton'', an English name for both Guangdong province and Guangzhou city.

There are numerous dialects of Cantonese; the most widely spoken is the Guangzhou dialect, often referred to simply as "Cantonese". The Guangzhou dialect is a ''lingua franca'' not just of Guangdong province but also of the overseas Cantonese-speaking diaspora. It is also spoken in Hong Kong.

Like other major varieties of Chinese, Cantonese is often considered a dialect of a single Chinese language for cultural or nationalistic reasons, though in practice Cantonese, like many other Chinese language varieties, is mutually unintelligible with many other Chinese "dialects". See Identification of the varieties of Chinese.


There are at least four major dialect groups of Cantonese: 1) ''Yuehai'', which includes the dialect spoken in Guangzhou, Hong Kong and Macau as well as the dialects of Zhongshan and Dongguan; 2) ''Sìyì'', exemplified by the Taishan dialect, which was ubiquitous in American Chinatowns before 1970; 3) the Gaoyang dialect, as spoken in Yangjiang; and 4) ''Guinan'', spoken widely in Guangxi. In Hainan Province, two Min dialects, both closely related to Cantonese, are spoken: the Mai dialect and the Danzhou dialect. However, "Cantonese" generally refers to the ''Yuehai'' dialect.

There are at least three other major Chinese languages spoken in Guangdong Province: Standard Mandarin, which is used for official occasions, education, the media, and as a national lingua franca; Minnan, spoken in the eastern regions bordering Fujian, such as Chaozhou and Shantou; and Hakka, the language of the Hakka people. Standard Mandarin is mandatory through the state education system, but in Cantonese-speaking households, the popularization of Cantonese-language media, isolation from the other regions of China, local identity, and the existence of the non-Mandarin-speaking Cantonese diaspora give unique characteristics to the language. Most wuxia films from Canton are filmed originally in Cantonese and then dubbed or subtitled in Standard Mandarin, another language, or both.


:''See Standard Cantonese for a discussion of the sounds of Standard Cantonese, and the pages on individual dialects for their phonologies.''

Cantonese development and usage

Officially, Standard Mandarin is the standard language in mainland China and Taiwan and is taught to nearly everyone, with different variations, as a supplement to their native local languages. Cantonese is one of the official languages of Hong Kong and Macau. It is also one of the main languages in many overseas Chinese communities, including those in Australia, Southeast Asia, North America, and Europe. Many of these emigrants or their ancestors originated from Guangdong. In addition, these immigrant communities either formed before the widespread use of Mandarin or came from Hong Kong, where Mandarin is not commonly used.

In some ways, Cantonese is a more conservative language than Mandarin, and in other ways it is not. For example, Cantonese has retained consonant endings from older varieties of Chinese that have been lost in Mandarin, but Cantonese has merged some vowels from older varieties of Chinese.

The Taishan dialect, which in the U.S. nowadays is heard mostly spoken by Chinese actors in old American TV shows and movies, is more conservative than the Guangzhou/Hong Kong variants of Cantonese. It has preserved the initial /n/ sound of words, whereas many post-World War II-born Hong Kong Cantonese speakers have changed it to an /l/ sound and more recently drop the "ng-" initial; this seems to have arisen from some kind of street affectation as opposed to systematic phonological change. The common word for "who" in Taishan is "s?e", which is the same character used in classical Chinese, whereas Cantonese has changed it to "bīngo".

Cantonese sounds quite different from Mandarin, mainly because it has a different set of syllables. The rules for syllable formation are different; for example, there are syllables ending in non-nasal consonants such as p, t and k. It also has different tones, and more of them, than Mandarin. Cantonese is generally considered to have six or seven tones, the choice depending on whether a traditional distinction between a high-level and a high-falling tone is observed; the two tones in question have largely merged into a single, high-level tone, especially in Hong Kong Cantonese, which has tended to simplify traditional Chinese tones. Many descriptions of the Cantonese sound system record a higher number of tones, nine or ten. However, the extra tones differ only in that they end in p, t, or k; otherwise they can be modeled identically.

Cantonese preserves many syllable-final sounds that Mandarin has lost or merged. For example, many characters that are all pronounced "yì" in Mandarin are pronounced differently from one another in Cantonese. Like Hakka and Min Nan, Cantonese has preserved the final consonants from Middle Chinese, while the Mandarin final consonants have been reduced to "n" and "ng". But unlike any other modern Chinese dialect, Cantonese has final consonants that match those of Middle Chinese exactly, with extremely few exceptions. For example, Mandarin lacks the syllable-final sound "m"; the final "m" and final "n" from older varieties of Chinese have merged into "n" in Mandarin, e.g. Cantonese "tàahm" and "tàahn" versus Mandarin tán; "yìhm" and "yìhn" versus Mandarin yán; "tìm" and "tìn" versus Mandarin tiān; "hùhm" and "hòhn" versus Mandarin hán. The examples are too numerous to list. Furthermore, nasals can be independent syllables in Cantonese words, e.g. "ńgh" "five" and "m̀h" "not".
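The merger of final "m" into "n" described above can be sketched in a few lines of Python. This is only an illustration, not a phonological model: the names `cantonese_coda` and `mandarin_coda_for` are hypothetical, the input is assumed to be Yale-style romanization, and the table simply encodes the correspondences the text mentions (final "m" and "n" both give "n", and final p, t, k are dropped). Treating "ng" as unchanged is an added assumption.

```python
# Final consonants (codas) that Cantonese keeps, per the text above.
CANTONESE_CODAS = ("ng", "m", "n", "p", "t", "k")

# Assumed Mandarin outcome for each Cantonese coda ("" means no coda).
MANDARIN_REFLEX = {
    "m": "n",    # final m merged into n
    "n": "n",
    "ng": "ng",  # assumption: unchanged
    "p": "",     # final stops lost
    "t": "",
    "k": "",
    "": "",
}


def cantonese_coda(syllable: str) -> str:
    """Return the final consonant of a Yale-romanized syllable, or ''."""
    for coda in CANTONESE_CODAS:
        if syllable.endswith(coda):
            return coda
    return ""


def mandarin_coda_for(syllable: str) -> str:
    """Return the coda the corresponding Mandarin syllable is expected to have."""
    return MANDARIN_REFLEX[cantonese_coda(syllable)]


if __name__ == "__main__":
    # Pairs from the text: distinct in Cantonese, identical coda in Mandarin.
    for a, b in [("tàahm", "tàahn"), ("yìhm", "yìhn"), ("tìm", "tìn")]:
        print(a, b, cantonese_coda(a), cantonese_coda(b),
              mandarin_coda_for(a), mandarin_coda_for(b))
```

Each pair differs in its Cantonese coda ("m" versus "n") but maps to the same Mandarin coda, which is why the Mandarin readings coincide.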

Differences also arise from Mandarin's relatively recent sound changes. One change, for example, palatalized velar initials such as "k" and merged them with palatalized alveolar initials such as "ts", and is reflected in historical Mandarin romanizations such as ''Peking'' (Beijing), ''Kiangsi'' (Jiangxi), and ''Fukien'' (Fujian). This distinction is still preserved in Cantonese. For example, 晶, 精, 經 and 京 are all pronounced "jīng" in Mandarin, but in Cantonese, the first pair is pronounced "jīng", and the second pair "gīng".

A more drastic example, displaying both the loss of coda plosives and the palatalization of onset consonants, is the character 學, which ended in a final "k" in Middle Chinese. Its modern pronunciations in Cantonese, Hakka, Taiwanese, Vietnamese, Korean, and Japanese are "hohk", "hók", "ha̍k", "học", "hak", and "gaku", respectively, while the pronunciation in Mandarin is xué.

However, Mandarin's vowel system is somewhat more conservative than Cantonese's, in that many diphthongs preserved in Mandarin have merged or been lost in Cantonese. Also, Mandarin makes a three-way distinction among alveolar, alveolopalatal, and retroflex fricatives, distinctions that are not made by modern Cantonese. For example, ''jiang'' and ''zhang'' are two distinct syllables in Mandarin or old Cantonese, but in modern Cantonese they have the same sound, "jeung1". The loss of distinction between the alveolar and the alveolopalatal sibilants in Cantonese occurred in the mid-19th century and was documented in many Cantonese dictionaries and pronunciation guides published prior to the 1950s. ''A Tonic Dictionary of the Chinese Language in the Canton Dialect'' by Williams notes: ''The initials "ch" and "ts" are constantly confounded, and some persons are absolutely unable to detect the difference, more frequently calling the words under "ts" as "ch", than contrariwise.'' ''A Pocket Dictionary of Cantonese'' by Cowles adds: ''"s" initial may be heard for "sh" initial and vice versa.''

There are clear sound correspondences in, for instance, the tones. For example, a fourth-tone word in Cantonese is usually second tone in Mandarin.

This can be partly explained by their common descent from Middle Chinese, still with its different dialects. One way of counting tones gives Cantonese nine tones, Mandarin four, and Middle Chinese eight. Within this system, Mandarin merged the so-called "yin" and "yang" tones except for the Ping category, while Cantonese not only preserved these, but split one of them into two over time. Also, within this system, Cantonese is the only Chinese language known to have split its tones rather than merge them since the time of Late Middle Chinese.
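The tone counts above (eight, nine and four) can be reproduced with a small Python sketch. It rests on assumptions the text does not spell out and that are labeled in the code: Middle Chinese is modeled as four categories (ping, shang, qu, ru) each in a yin and a yang register; the tone Cantonese split is assumed to be the yin ru tone; and Mandarin is assumed to have redistributed the ru category among the others, so it contributes no tone of its own. All function names are hypothetical.

```python
from itertools import product

CATEGORIES = ["ping", "shang", "qu", "ru"]
REGISTERS = ["yin", "yang"]


def middle_chinese_tones():
    """Four categories in two registers each: eight tones."""
    return [(reg, cat) for cat, reg in product(CATEGORIES, REGISTERS)]


def cantonese_tones():
    """Keep all eight tones and split one of them into two."""
    tones = []
    for reg, cat in middle_chinese_tones():
        if (reg, cat) == ("yin", "ru"):
            # Assumption: the split tone is yin ru.
            tones += [("upper yin", "ru"), ("lower yin", "ru")]
        else:
            tones.append((reg, cat))
    return tones


def mandarin_tones():
    """Merge yin and yang everywhere except in the ping category."""
    tones = []
    for reg, cat in middle_chinese_tones():
        if cat == "ru":
            # Assumption: ru syllables were redistributed to other tones.
            continue
        label = (reg, cat) if cat == "ping" else ("merged", cat)
        if label not in tones:
            tones.append(label)
    return tones


if __name__ == "__main__":
    print(len(middle_chinese_tones()),
          len(cantonese_tones()),
          len(mandarin_tones()))  # prints "8 9 4"
```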

Written Cantonese

Standard written Chinese is, in essence, written Standard Mandarin. When standard written Chinese is read aloud with Cantonese sound values, the result sounds stilted and unnatural because it is different from normal spoken Cantonese. Written Cantonese is spoken Cantonese written as it is actually spoken. Unusually for a regional Chinese language, Cantonese has a written form, developed over the last few decades in Hong Kong, which includes many unique characters that are not found in standard written Chinese. Readers who understand standard written Chinese but do not know Cantonese often find written Cantonese hard to understand or even unintelligible, as it differs from standard written Chinese in grammar and vocabulary. However, written Cantonese is commonly used informally among Cantonese speakers. Circumstances where written Cantonese is used include conversations through instant messaging services, letters, notes, entertainment magazines and entertainment sections of newspapers, advertisements, and sometimes subtitles in Hong Kong movies. It rarely finds its way into the subtitles of Western movies or TV shows, though The Simpsons is a notable exception. Cantonese Opera scripts also use the Cantonese written vernacular.

Historically, written Cantonese has been used in Hong Kong for legal proceedings in order to write down the exact spoken testimony of a witness, instead of paraphrasing spoken Cantonese into standard written Chinese. Newspapers have done likewise to capture more exact quotes. Its popularity and usage have been rising in the last two decades, the late Wong Jim being one of the pioneers of its use as an effective written language. Written colloquial Cantonese has become quite popular in certain tabloids, online chat rooms, and instant messaging. Some tabloids, like Apple Daily, write in colloquial Cantonese; other papers may carry editorials that contain Cantonese; and Cantonese-specific characters are increasingly seen in advertisements and on billboards. Written Cantonese remains limited outside Hong Kong, even in other Cantonese-speaking areas such as Guangdong, where the use of colloquial writing is discouraged. Despite the relative popularity of written Cantonese in Hong Kong, some disdain it, believing that becoming too accustomed to writing in this way would affect a person's ability to use standard written Chinese in situations that demand it.

Forms of written Chinese in Hong Kong:

# Standard Written Chinese - used in Hong Kong SAR following the post-WWII Vernacular Reformation.
# Colloquial Written Cantonese - currently used in journals, advertisements, etc. in Hong Kong SAR and in overseas Cantonese Chinese communities.
# Classical Cantonese Chinese - a reconstructed Neo-Classical written Chinese form widely used in Hong Kong from the 1900s to the 1990s in Cantonese opera lyrics, Cantonese Chinese poetic forms and especially in 1980s cantopop.
# Classical Chinese, known as wenyanwen - a written Chinese form found in poems and writings from the dynastic periods.

For colloquial vernacular usage, written Cantonese incorporates an entire range of characters and particles specific to the Cantonese spoken form. This is commonly used in publicity and journalistic writing in Hong Kong and Hong Kong-influenced regions. It reads exactly as Modern Standard Spoken Cantonese.

For literary and artistic purposes, standard literary Chinese, the classical wenyanwen, is used.

Legal records in Hong Kong also sometimes use written Cantonese, in order to record exactly what a witness has said.

Colloquial Cantonese is rarely used in formal forms of writing; formal written communication is almost always in standard written Chinese, albeit still pronounced with Cantonese sound values. However, written colloquial Cantonese does exist; it is used mostly for transcription of speech in tabloids, in some broadsheets, for some subtitles, for personal diaries, and in other informal forms of communication such as Internet bulletin boards or e-mails. It is not uncommon to see the front page of a Cantonese paper written in hanyu, while the entertainment sections are, at least partly, in Cantonese. The vernacular writing system has evolved over time through a process of modifying characters to express lexical and syntactic elements found in Cantonese but not in the standard written language. In spite of their vernacular origin and informal use, these characters have become so common in Hong Kong that the Hong Kong Government has incorporated them into a special character set, the Hong Kong Supplementary Character Set (HKSCS), along with special characters used for proper nouns.

A problem for the student of Cantonese is the lack of a widely accepted, standardized transcription system. Another problem is with Chinese characters: Cantonese uses the same system of characters as standard written Chinese, but it often uses different words, which have to be written with different or new characters. An example of Cantonese using a different word and a different character to write it: the Mandarin word for "to be" is shì and is written as 是, but in Cantonese the word for "to be" is haih, and 係 is used in written Cantonese. In standard written Chinese 是 is normally used, though 係 can be found in classical literature and modern formal writing. In Hong Kong, 係 is often used in colloquial written Cantonese and is therefore actively avoided and discouraged in formal writing; on the other hand, the use of 係 is relatively widespread in both mainland China and Taiwan, in government publications and product descriptions, for example.

Many characters used in colloquial Cantonese writing are made up by putting a mouth radical on the left-hand side of another, better-known character to indicate that the new character is read like the right-hand side but is used only phonetically in the Cantonese context. Many characters of this kind are commonly used in Cantonese writing.

As not all Cantonese words can be found in current encoding systems, or because users simply do not know how to enter such characters on a computer, very informal written Cantonese tends to use extremely simple romanization, symbols, homophones, and Chinese characters with a different Mandarin meaning to compose a message.

Other common characters are unique to Cantonese or deviate from their Standard Chinese usage.

The words represented by these characters are sometimes cognates with pre-existing Chinese words. However, their colloquial Cantonese pronunciations have diverged from formal Cantonese pronunciations. For example, in formal written Chinese, 無 is the character used for "without". In spoken Cantonese, 冇 has the same usage, meaning, and pronunciation as 無, differing only by tone. 冇 is actually a hollowed-out writing of its antonym 有. 冇 represents the spoken Cantonese form of the word "without", while 無 represents the word used in formal Chinese writing. However, 無 is still used in some instances in spoken Chinese in both dialects. A Cantonese-specific example is the doublet 來/嚟, which means "to come". 來 is used in formal writing; 嚟 is the spoken Cantonese form.

Relation to Classical Chinese

Since the pronunciation of all modern varieties of Chinese is different from Old Chinese or other forms of historical Chinese, characters that once rhymed in poetry may no longer do so. Poetry and other rhyme-based writing thus becomes less coherent than the original reading must have been. However, some modern Chinese dialects have certain phonological characteristics that are closer to the older pronunciations than others, as shown by the preservation of certain rhyme structures. Some believe wenyan literature, especially poetry, sounds better when read in certain dialects believed to be closer to older pronunciations, such as Cantonese or Southern Min.

Cantonese outside China

Historically, the majority of the overseas Chinese have originated from just two provinces: Fujian and Guangdong. As a result, the overseas Chinese include a far higher proportion of speakers of Fujian and Guangdong languages and dialects than Chinese speakers in China as a whole. More recent emigration from Fujian and Hong Kong has continued this trend.

The largest number of Cantonese speakers outside mainland China and Hong Kong are in Southeast Asia; however, speakers of Min dialects predominate among the overseas Chinese in Southeast Asia.

United Kingdom

The majority of Cantonese speakers in the UK have their origins in the former British colony of Hong Kong and speak the Guangzhou/Hong Kong dialect, although many are in fact from Hakka-speaking families and are bilingual in Hakka. There are also Cantonese speakers from Southeast Asian countries such as Malaysia and Singapore, as well as from Guangdong in China itself. Today an estimated 300,000 British people have Cantonese as a mother tongue or first language.

United States of America

For the last 150 years, Guangdong Province has been the place of origin of most Chinese emigrants to western countries; one coastal county, Taishan, alone may have been home to more than 60% of Chinese immigrants to the US before 1965. As a result, Guangdong dialects such as ''sei yap'' and what is now called mainstream Cantonese have been the major Cantonese dialects spoken abroad, particularly in the USA.

The Taishan dialect is one of the ''sei yap'' or ''siyi'' dialects, which come from the Guangdong counties where the majority of Guangdong Chinese emigrants to the USA originated. It continues to be spoken both by recent immigrants from Taishan and by third-generation Chinese Americans of Taishan ancestry.

The dialect of Zhongshan in the Pearl River Delta is spoken by many Chinese immigrants in Hawaii, and by some in San Francisco and in the Sacramento River Delta; it is much closer to Cantonese than the Taishan dialect, but has "flatter" tones in pronunciation than Cantonese. Cantonese is the third most widely spoken non-English language in the United States. Currently the most popular romanization for learning Cantonese in the United States is Yale Romanization.

The dialectal situation is now changing in the United States; recent Chinese emigrants originate from many different areas, including mainland China, Hong Kong, Taiwan, and Southeast Asia. Recent immigrants from mainland China and Taiwan in the U.S. all speak Standard Mandarin, with varying degrees of fluency, as well as their native local language or dialect, such as Min or Cantonese. As a result, Standard Mandarin is becoming increasingly common as the Chinese lingua franca among the overseas Chinese.

Cantonese as a foreign language

Cantonese courses can be found at such U.S. universities as Harvard University, Yale University, Stanford University, the University of Hawaii, Brigham Young University, San Jose State University, New York University, Duke University and Cornell University. In Canada, Cantonese courses can be taken at various universities such as Simon Fraser University in Vancouver and the University of Toronto. The language is also commonly taught in 'heritage language' programs in the public schools in areas where many children have parents who speak the language. It can be easier for a Cantonese speaker who does not speak Mandarin to learn Mandarin than vice versa. This is because such speakers are educated to read and write standard modern written Chinese, pronouncing it with Cantonese sound values when reading aloud. Since the grammar and vocabulary of standard modern written Chinese are closer to Mandarin than to Cantonese, they are already familiar with Mandarin grammar and vocabulary.


Qin and Han

In ancient China, Guangdong was called Nanyue, and very few people lived there. Therefore, the Chinese language was not widely spoken there at that time. However, in the Qin Dynasty Chinese troops moved south and conquered the Baiyue territories, and thousands of Han people began settling in the Lingnan area. This migration led to the Chinese language being spoken in the Lingnan area. After Zhao Tuo was made the Duke of Nanyue by the Han Dynasty and given authority over the Nanyue region, many Han people entered the area and lived together with the Nanyue population, consequently affecting the livelihood of the Nanyue people as well as stimulating the spread of the Chinese language. Although Han Chinese settlements and their influences soon dominated, some of the indigenous Nanyue population did not leave the region. Today, the degree of interaction between Han Chinese and the indigenous population remains a highly controversial topic due to the lack of historical records and the hostility between modern Chinese and Vietnamese.


In the Sui Dynasty, the north was in a period of war and discontent, and many people moved southwards to avoid war, forming the first mass migration into the South. As the population in the Lingnan area rose dramatically, the Chinese language in the south developed significantly. Thus, the Cantonese language began to develop more significant differences from central Chinese.


As the Han population in the Guangdong area continued to rise during the Tang Dynasty, some indigenous people living in the south were culturally assimilated by the Han population, while others moved to other regions, developing their own dialects. At the time, Cantonese was affected by central Chinese and became more standardized, but it further developed a more independent language structure, vocabulary, and grammar.

Song, Yuan, Ming and Qing

In the Song Dynasty, the differences between central Chinese and Cantonese became more significant, and the languages became more independent of one another. During the Yuan and Ming dynasties, Cantonese evolved still further, developing its own characteristics.

Mid to Late Qing

In the late Qing, the dynasty had gone through a period of maritime ban under the Hai jin. Guangzhou remained one of the few cities that allowed trade with foreign countries, since the trade chamber of commerce was established there. Therefore, some foreigners learned Cantonese and some Imperial government officials spoke Cantonese, making the language very popular in Guangzhou. Also, European control of Macau and Hong Kong increased the exposure of Cantonese to the world.

20th century

In the Cantonese-speaking region of mainland China, Mandarin is used for official purposes while Cantonese is used in more informal situations.

In Hong Kong, Hong Kong Cantonese is the main and dominant form of spoken Chinese and is used in education, the government, public life, the media and entertainment, and in business dealings with Cantonese-speaking overseas Chinese communities.

Nowadays, because ''Putonghua'' is the medium of education on the mainland, many young people in the Cantonese-speaking region of mainland China do not know specific historical and scientific vocabulary in Cantonese, but do know social, cultural, entertainment, commercial, trading, and other everyday vocabulary. Cantonese is widely spoken and learned by overseas Chinese of Guangdong and Hong Kong origin.

The popularity of Cantonese language media and entertainment from Hong Kong has led to a wide and frequent exposure of Cantonese to large portions of China and the rest of Asia. Cantopop and the Hong Kong film industry are prominent examples of modern Cantonese language media.



