The shi-shi-shi poem

Byron Han shared this article [1] by Jian Shuo Wang about a Chinese poem written in 1930 in which every syllable is "shi", differing only in tone [2].

Here's the text of the poem:

石室诗士施氏,嗜食狮,誓食十狮。适施氏时时适市视狮。十时,适十狮适市。是时,适施氏适市。氏视是十狮,恃矢势,使是十狮逝世。氏拾是十狮尸,适石室。石室湿,氏使侍拭石室。石室拭,氏始试食是十狮尸。食时,始识是十狮尸,实十石狮尸。试释是事。

I selected this text and told my Mac to "Start Speaking". For some reason in my System Preferences I had the "Kyoko" voice selected at the slowest speed, so it came out sounding like a drunk Japanese woman.


I changed to the "Ting-Ting" voice at normal speed and heard it in mainland Chinese as intended. Fun!


In Cantonese (the "Sin-Ji" voice) the syllables not only differ from the Mandarin ones but also vary far more among themselves. Most of them are "see", a few are "sik", and quite a few are different altogether. This did not surprise me.


What did surprise me was how it came out in the Taiwanese voice ("Ya-Ling"). I had thought mainland and Taiwanese Mandarin were pronounced the same word for word, with the only difference being a matter of accent, the way Minnesotans pronounce harder R's than Californians. But it turns out there are some words that are pronounced completely differently.


Wang gives a translation of the poem in his blog post. Rather than copy it here, I suggest you visit his blog at the link I gave. I like what Google Translate returns, especially the phrase "relies on the vector potential":

The sarcophagus poetry Guests Amur, addicted to food lion, oath eat ten lions. Suitable Amur always suitable for the city, as the lion.Ten o'clock, fitness, fitness ten lions. Yes, the appropriate Amur appropriate city. 's Depending on those ten lions, relies on the vector potential, so that the death of ten lions. S pick the ten lions corpse, fitness sarcophagus. Shishi wet, that's so paternity swab sarcophagus. The sarcophagus the swab,'s start tasting is ten lions corpse. Food before consensus is ten lions corpse, the real ten lions corpse. Interpreting things.

I used the Mac's "say -o" command to generate the above audio files. The output was in AIFF format, which I converted to MP3 using an app called "Music Converter".
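For anyone who wants to reproduce this, here's a rough sketch of how the AIFF files could be generated in one go. The file name poem.txt, the shi-*.aiff output names, and the aiff_name helper are placeholders for illustration, not the names I actually used; -v, -f, and -o are standard options of the say command. On a machine without say (or without a poem.txt in the current directory) the loop only prints what it would have run.

```shell
#!/bin/sh
# Voices mentioned in this post.
voices="Kyoko Ting-Ting Sin-Ji Ya-Ling"

# Hypothetical helper: name of the AIFF file for a given voice.
aiff_name() {
    printf 'shi-%s.aiff\n' "$1"
}

for voice in $voices; do
    out=$(aiff_name "$voice")
    if command -v say >/dev/null 2>&1 && [ -f poem.txt ]; then
        # -v picks the voice, -f reads the text from a file, -o writes audio to a file.
        say -v "$voice" -f poem.txt -o "$out"
    else
        echo "would run: say -v $voice -f poem.txt -o $out"
    fi
done
```

A voice that hasn't been downloaded yet will make say fail for that iteration, so it's worth installing the voices first.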

UPDATE 1: Edited to mention Jian Shuo Wang by name, and to mention that he offers his own translation of the poem.

UPDATE 2: It occurred to me that a person claiming to be a native Mandarin speaker could be tested by being made to read this poem aloud. In theory a real native speaker would have less trouble getting the tones right than a faker, making this poem — dare I say — a sort of "shi-bboleth".


[1] Actually Byron linked to an article on Shanghaiist.com, but I'm linking here to the original article which it links to.

[2] Note that "shi" is not pronounced "shee" as you might think. It's more like the English word "shirr". It amuses me that the word "是" by itself, which means "Yes" or "Okay," sounds like "Sure." It's an affirmative answer in two languages.

Minor fixes to Speech voices

This is an update to my post about the international text-to-speech voices on the Mac, specifically about the text fragments the "speakers" use to introduce themselves.

The text fragment used by the Cantonese Sin-Ji voice still contains the name "Sin-Ji" in Romanized form, but somehow when it's pronounced it sounds more like Cantonese than before.

Here's the text fragment that gets spoken:

$ plutil -convert xml1 /System/Library/Speech/Voices/Sin-Ji.SpeechVoice/Contents/Info.plist -o - | grep -A 1 VoiceDemoText
 
<key>VoiceDemoText</key>
<string>您好,我叫 Sin-Ji。我講廣東話。</string>

Here's how it used to sound:


Here's how it now sounds:


I wonder how it does that — how the text-to-speech engine can possibly know the right tones to use for "Sin-Ji", or even that it's a Chinese name. Regardless, I wish they'd use the Chinese characters for the name because I'd like to know what they are.

The text fragment used by the Taiwanese Ya-Ling voice does what I want. It has replaced the Romanized "Ya-Ling" with Chinese characters.

Here's the new text fragment ("Ya-Ling" has been replaced with "雅玲"):

$ plutil -convert xml1 /System/Library/Speech/Voices/Ya-Ling.SpeechVoice/Contents/Info.plist -o - | grep -A 1 VoiceDemoText
 
<key>VoiceDemoText</key>
<string>您好,我叫 雅玲。我說國語。</string>
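The same lookup can be repeated for both voices with a small loop. This is only a sketch: demo_text is a hypothetical helper that prints whatever sits inside the <string> element following the VoiceDemoText key, and the loop assumes each voice's Info.plist is at the path shown in the commands in this post. If plutil or a plist is missing, that voice is skipped.

```shell
#!/bin/sh
# Hypothetical helper: read an XML plist on stdin, print the VoiceDemoText string.
demo_text() {
    grep -A 1 '<key>VoiceDemoText</key>' |
        sed -n 's:.*<string>\(.*\)</string>.*:\1:p'
}

for voice in Sin-Ji Ya-Ling; do
    plist="/System/Library/Speech/Voices/${voice}.SpeechVoice/Contents/Info.plist"
    if command -v plutil >/dev/null 2>&1 && [ -f "$plist" ]; then
        printf '%s: ' "$voice"
        # Same conversion as above, but keep only the text itself.
        plutil -convert xml1 "$plist" -o - | demo_text
    else
        echo "$voice: plist not found, skipping"
    fi
done
```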

Here's how it used to sound — pretty disastrous:


Here's how it now sounds — note how different the "Ya-Ling" is:


You can hear these voices yourself by going to System Preferences > Dictation & Speech > Text to Speech. If you don't already have the voices for these languages installed, it'll take a while to download them.
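You can also check from Terminal whether the voices are already installed. The sketch below assumes that say -v '?' prints one installed voice per line, starting with the voice name followed by a space; has_voice is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the voice list on stdin has a line starting with "$1 ".
has_voice() {
    grep -q "^$1 "
}

for voice in Sin-Ji Ya-Ling; do
    if ! command -v say >/dev/null 2>&1; then
        echo "say not available; cannot check $voice"
    elif say -v '?' | has_voice "$voice"; then
        echo "$voice is installed"
    else
        echo "$voice is not installed"
    fi
done
```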

I'm on OS X 10.8.2. I don't know when Apple made these changes.