inefficiency of Chinese names is astounding

anti-Shuimo   Tue Apr 21, 2009 8:39 am GMT
Obscure names in Chinese are a burden to both database and brains:

BEIJING — “Ma,” a Chinese character for horse, is the 13th most common family name in China, shared by nearly 17 million people. That can cause no end of confusion when Mas get together, especially if those Mas also share the same given name, as many Chinese do.

Ma Cheng’s book-loving grandfather came up with an elegant solution to this common problem. Twenty-six years ago, when his granddaughter was born, he combed through his library of Chinese dictionaries and lighted upon a character pronounced “cheng.” Cheng, which means galloping steeds, looks just like the character for horse, except that it is condensed and written three times in a row.

The character is so rare that once people see it, Miss Ma said, they tend to remember both her and her name. That is one reason she likes it so much.

That is also why the government wants her to change it.

For Ma Cheng and millions of others, Chinese parents’ desire to give their children a spark of individuality is colliding head-on with the Chinese bureaucracy’s desire for order. Seeking to modernize its vast database on China’s 1.3 billion citizens, the government’s Public Security Bureau has been replacing the handwritten identity card that every Chinese must carry with a computer-readable one, complete with color photos and embedded microchips. The new cards are harder to forge and can be scanned at places like airports where security is a priority.

The bureau’s computers, however, are programmed to read only 32,252 of the roughly 55,000 Chinese characters, according to a 2006 government report. The result is that Miss Ma and at least some of the 60 million other Chinese with obscure characters in their names cannot get new cards — unless they change their names to something more common.

Moreover, the situation is about to get worse or, in the government’s view, better. Since at least 2003, China has been working on a standardized list of characters for people to use in everyday life, including when naming children.

One newspaper reported last week that the list would be issued later this year and would curb the use of obscure names. A government linguistics official told Xinhua, the state-run news agency, that the list would include more than 8,000 characters. Although that is far fewer than the database now supposedly includes, the official said it was more than enough “to convey any concept in any field.” About 3,500 characters are in everyday use.

Government officials suggest that names have gotten out of hand, with too many parents picking the most obscure characters they can find or even making up characters, like linguistic fashion accessories. But many Chinese couples take pride in searching the rich archives of classical Chinese to find a distinctive, pleasing name, partly to help their children stand out in a society with strikingly few surnames.

By some estimates, 100 surnames cover 85 percent of China’s citizens. Laobaixing, or “old hundred names,” is a colloquial term for the masses. By contrast, 70,000 surnames cover 90 percent of Americans.

The number of Chinese family names in use has tended to shrink as China’s population has grown, a winnowing of surnames that has occurred in many cultures over time.

At last count, China’s Wangs were leading with more than 92 million, followed by 91 million Lis and 86 million Zhangs. To refer to an unidentified person — the equivalent of “just anybody” in English — one Chinese saying can be loosely translated this way: “some Zhang, some Li.”

The potential for mix-ups is vast. There are nearly enough Chinese named Zhang Wei to populate the city of Pittsburgh. Nicknames are liberally bestowed in classrooms and workplaces to tell people apart. Confronting three students named Liu Fang, for example, one middle-school teacher nicknamed them Big, Little and Middle.

Wang Daliang, a linguistics scholar with the China Youth University for Political Science, said picking rare characters for given names only compounded the problem and inconvenienced everyone. “Using obscure names to avoid duplication of names or to be unique is not good,” he wrote in an e-mail response to questions.

“Now a lot of people are perplexed by their names,” he said. “The computer cannot even recognize them and people cannot read them. This has become an obstacle in communication.”

But Professor Zhou Youyong, dean of Southeast University’s law school, said the government should tread carefully in issuing any new regulation. “The right to name children is a basic right of citizens,” he said.

Miss Ma said that while her given name was unusual, bank employees, passport control clerks and ticket agents had always managed to deal with it, usually by writing it by hand. But when she tried to renew her identity card last August, she said, Beijing public security officials turned her down flat.

“Your name is so troublesome and problematic,” she recalled an official telling her. “Just change it.”

Miss Ma argues that the government’s technology should adapt, not her.

“There were no such regulations when I was born, so I should be entitled to keep my name for my whole life,” she said. If she changes her name to get an identity card, she noted, it will be wrong on all of her other documents, like her passport and university diploma.

Besides, she said, “I can’t think of another, better name.”

Using the time-honored Chinese method of backdoor connections, Miss Ma was able to get a temporary card in January. She must renew it every three months but considers that a small sacrifice for keeping her name.

Zhao C., a 23-year-old college student, gave up the fight for his. His father, a lawyer, chose the letter C from the English alphabet, saying it was simple, memorable and stood for China.

When he could not get a new identity card in 2006, Zhao C. sued. But security officials convinced him that it would cost millions of dollars to alter the database, his father said, so he dropped the suit in February.

His case might suggest that resistance against China’s powerful bureaucracy was futile. Still, the government’s plan to limit the use of characters has not gone all that smoothly.

The new rules were originally supposed to be issued by 2005. Now, 70 revisions later, they have yet to be put in place.

An official this week batted away questions, saying publicity might delay the rules even longer.
blanc   Tue Apr 21, 2009 8:47 am GMT
The best way to deal with a troll is to ignore him. Please, don't respond to Shuimo.
anti-Shuimo   Tue Apr 21, 2009 8:51 am GMT
Well, it is an interesting article, what with all of the debate on alphabets vs. characters that's been going on lately. Besides, it's in English so most people might actually understand...
alphabetitis   Tue Apr 21, 2009 12:52 pm GMT
<<The bureau’s computers, however, are programmed to read only 32,252 of the roughly 55,000 Chinese characters, according to a 2006 government report.>>

Shouldn't computers adapt to the local culture and language? Why restrict the language because current computer encodings don't include certain characters? Multi-byte encoding schemes (i.e. variable-width, using 1, 2, 3, or 4 bytes per character) exist, and could be used to encode the additional characters that won't fit into a two-byte scheme.

Remember the "old days" when typewriters used a lowercase "l" for "1", so they could save keys? This was OK in the days of human-readable pages, but when computers came along, they had to add a real "1" to keyboards. Maybe China needs to come up with a new multi-byte encoding scheme so they can type in the full language?
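
To make the variable-width point concrete, here is a minimal Python sketch. It only reports how many bytes a few sample characters take in three codecs that ship with Python (gb2312, gb18030, utf-8). The sample characters and the helper names `encoded_length` and `survey` are chosen for illustration and are not from the article. It says nothing about what the bureau's database actually uses.

```python
# Sample characters picked for this sketch (an assumption, not from the article).
SAMPLES = {
    "ASCII letter": "A",
    "common character (horse)": "\u9a6c",
    "rare character (CJK Extension B)": "\U00020000",
}


def encoded_length(ch, codec):
    """Return the byte length of `ch` in `codec`, or None if the codec cannot encode it."""
    try:
        return len(ch.encode(codec))
    except UnicodeEncodeError:
        return None


def survey(codecs=("gb2312", "gb18030", "utf-8")):
    """Map each sample label to a {codec: byte length or None} dictionary."""
    return {
        label: {codec: encoded_length(ch, codec) for codec in codecs}
        for label, ch in SAMPLES.items()
    }


if __name__ == "__main__":
    for label, row in survey().items():
        print(label, row)
```

The older two-byte gb2312 codec has no slot for the rare character, so the helper returns None for it, while the variable-width codecs simply spend four bytes on it.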
Jef   Tue Apr 21, 2009 1:23 pm GMT
I rather appreciate this article being posted in English. Some of the articles posted by Shuimo actually spark my interest, but I cannot understand Chinese characters and I am too lazy to find a translator.
CANadian   Tue Apr 21, 2009 9:34 pm GMT
<< The best way to deal with a troll is to ignore him. >>

Speak for yourself, there's nothing troll-y about this article.
Big5   Wed Apr 22, 2009 12:59 am GMT
Any plans to add in the missing characters to GB18030 to bring the total up to 55000? From what I understand, this encoding scheme has room for a million or more characters.
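
For what it's worth, the "million or more" figure can be checked with some quick arithmetic. The sketch below assumes the GB18030 byte ranges as I understand them: a lead byte of 0x81 to 0xFE, a two-byte trail of 0x40 to 0x7E or 0x80 to 0xFE, and four-byte sequences that alternate lead-range bytes with the digits 0x30 to 0x39. It counts raw code space only, not how many positions are actually assigned to characters.

```python
# Byte ranges assumed for GB18030 (see the note above); this counts code space only.
LEAD = 0xFE - 0x81 + 1                           # 126 possible lead bytes
DIGIT = 0x39 - 0x30 + 1                          # 10 possible digit bytes
TRAIL_2 = (0x7E - 0x40 + 1) + (0xFE - 0x80 + 1)  # 63 + 127 = 190 two-byte trail bytes

one_byte = 0x7F - 0x00 + 1                       # 128 single-byte positions (ASCII range)
two_byte = LEAD * TRAIL_2                        # 23,940 two-byte positions
four_byte = LEAD * DIGIT * LEAD * DIGIT          # 1,587,600 four-byte positions

total = one_byte + two_byte + four_byte
unicode_code_points = 0x110000                   # 1,114,112 code points in Unicode

print(f"two-byte positions:  {two_byte:,}")
print(f"four-byte positions: {four_byte:,}")
print(f"total code space:    {total:,}")
print(f"covers all Unicode code points: {total >= unicode_code_points}")
```

So the two-byte part alone (about 24,000 positions) could not hold 55,000 characters, but the four-byte part leaves plenty of room. Whether the missing characters get added is a question of assignment and of fonts and input methods, not of space.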