SM4712A Graduation Project

Theoretical Text

Steven Zixuan ZHOU 55668513

School of Creative Media,

City University of Hong Kong

 

Work Title: Generative Sawndip

 

Artist Statement: 

Sawndip is a logographic, Chinese-derived writing system traditionally used by Tai-speaking peoples in Southern China and Northern Vietnam, including ethnic Zhuang groups. Because it was never standardized, Sawndip characters vary widely across regions and even between individual writers. Generative Sawndip is a generative language art project that uses computer algorithms to create new Sawndip characters for the Zhuang minority language. Inspired by critical software art and generative art, I explore how computer programming can engage with critical social issues, looking for innovative ways to connect underrepresented ethnic minority cultures with modern computing techniques. Using language data and following the historical rules for constructing Sawndip characters, the project generates characters that do not yet exist in order to show the nature of the Sawndip writing system and of Zhuang language culture. As someone who is half Chinese and half Zhuang, I also use the project to explore my cultural heritage and identity by making my own version of Sawndip characters.

 

Motives:

One motive for this project is a concern about the situation of ethnic minorities and about issues of race and ethnicity. According to Healey & Palriwala (1998), a minority group is a group of people who experience disadvantages relative to a dominant social group. The project therefore focuses on ethnic minority groups. A second motive is my own background as a member of an ethnic minority. Because the environment I grew up in had lost its connection with Zhuang culture, the project is a way to explore my cultural identity as half Chinese and half Zhuang. In terms of cultural identity, Karjalainen (2020) stated that studies of different cultures and of the concept of cultural identity remain limited because the influential work of Hofstede directed interest toward national cultures. As a result, the cultural identity of ethnic minorities tends to be neglected. The project therefore also studies and discusses the culture of the Zhuang minority.

 

Artistic Concept:

The artistic concept is to combine modern computing techniques with the culture of ethnic minorities. In my research on artworks related to ethnic minorities, most works about minority cultures were presented in traditional media. Presenting ethnic minority culture in new media forms remains a niche. The project therefore uses computer algorithms to study and discuss the Zhuang minority from a new perspective.

 

The project also focuses on the Zhuang language as a way to study Zhuang culture. Today the Zhuang language is written in a Latin-based writing system, while Sawndip was the traditional script used by the Zhuang people in the past. The project converts the historical rules for constructing Sawndip characters into computer algorithms in order to create new characters, my own version of Sawndip. Moreover, Schwenger (2019) discussed "asemic writing", a kind of writing that is not intended to convey any message, only the nature of writing itself. Likewise, the newly generated Sawndip characters are not meant for communication; they are an art of writing that shows the nature of Sawndip characters and how they are constructed.

 

Sawndip Characters:

The traditional Zhuang script, Sawndip, is a logographic, Chinese-derived writing system used for writing Tai languages in Southern China and Northern Vietnam by many ethnic groups, including the Bouyei, Nùng, Tày, and Zhuang. Most Sawndip characters are based on Chinese characters. However, unlike the similar traditional Vietnamese writing system, Sawndip has never been standardized, which has led to many variants existing in different places (Holm, 2013).

 

According to Bauer (2000), Sawndip characters can be classified under 9 principles based on how they were constructed and where they originated: 1. Symbols that do not originate from Chinese characters and were probably borrowed from other writing systems, including the Latin alphabet and Burmese. 2. Non-standard Chinese-like characters created as semantic compounds, a type that also exists in Chinese. 3. Non-standard Chinese-like characters created as phono-semantic compounds. 4. Chinese characters borrowed only to indicate native Zhuang pronunciations. 5. Chinese characters borrowed only to indicate the native meanings of Zhuang words. 6. Chinese loanwords. 7. Written Cantonese characters. 8. Non-standard Chinese-like characters created to indicate meaning as indicative ideograms. 9. Non-standard Chinese-like characters created as phonetic compounds that “spell out” native pronunciations, comparable to “fanqie” in Chinese. More than 70 percent of Sawndip characters are created in phonetic ways.

 

Sawndip Literature:

Sawndip literature is the Zhuang literature written in Sawndip characters. Among the various pieces of literature, a fairy tale, The Orphan Girl and the Rich Girl, has attracted much attention. The story starts with an orphan girl who was abused by her stepmother after her biological parents passed away. On the day of the festival, her biological mother turned into a crow to help her dress up to attend the festival. At the festival, she met the local prince and lost one of her shoes when she left. The prince used the shoe to find her and finally married the girl. The story is regarded as very similar to the Western story Cinderella (Beauchamp, 2010).

 

Techniques:

1.     Generating Process

Figure 1 shows the process of generating new Sawndip characters. Generation starts with two inputs: a JSON object for a Zhuang word, and one or two Chinese characters that indicate the meaning of that word. No more than two Chinese characters may be given. The algorithm selects a rule according to the Chinese characters. If two Chinese characters are given, the system chooses rule 1, semantic compounds. If the single input is a component or radical, it randomly selects one of rules 2, 5, and 6. If the single input is a full character, it randomly selects one of rules 2, 3, 4, and 6. Then, depending on the rule, the algorithm finds another Chinese character that matches the pronunciation of the Zhuang word and selects a suitable decomposition with which to construct the new character. Each run creates one character for one Zhuang word.

 

Figure 1
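
The rule-selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the function name `select_rule`, the list-of-characters input shape, and the `is_component` callback are all hypothetical, and the sketch assumes that one or two Chinese characters are supplied.

```python
import random

# Rule numbers refer to the six rules listed in the "Rules" section.
RULES_FOR_COMPONENT = (2, 5, 6)
RULES_FOR_CHARACTER = (2, 3, 4, 6)


def select_rule(meaning_chars, is_component, rng=random):
    """Pick a rule number from the input Chinese character(s).

    meaning_chars: list of one or two Chinese characters (hypothetical input shape).
    is_component:  hypothetical callable that says whether a glyph is a
                   radical/component rather than a full character.
    """
    if not 1 <= len(meaning_chars) <= 2:
        raise ValueError("expected one or two Chinese characters")
    if len(meaning_chars) == 2:
        return 1  # two inputs: semantic compound
    if is_component(meaning_chars[0]):
        return rng.choice(RULES_FOR_COMPONENT)
    return rng.choice(RULES_FOR_CHARACTER)
```

Passing the random source as `rng` keeps the choice reproducible when a seeded `random.Random` instance is supplied.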

 

2.     Rules

Rather than using the original principles directly, the project condenses the 9 principles into 6 rules for the generating process: 1. Non-standard Chinese-like characters created as semantic compounds. 2. Non-standard Chinese-like characters created as phono-semantic compounds, a type that also exists widely in Chinese. 3. Standard Chinese characters borrowed solely for their pronunciation. 4. Standard Chinese characters borrowed solely for their meaning. 5. Indicative ideograms (the characters’ radicals or components). 6. Characters made as phonetic compounds that “spell out” the pronunciation of the Zhuang word, following the “fanqie” system in Chinese.

 

3.     Data

The language data comprise Chinese character data, Zhuang word data, and Chinese stroke data, all stored in JSON format. The Zhuang data come from Kaikki.org. The Chinese data come from the Make Me a Hanzi project.
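
As a rough illustration of how such data might be read, the sketch below parses text that holds one JSON object per line and indexes the records by a chosen field. The one-object-per-line layout, the helper name `load_json_lines`, and the sample records with their `character` and `pinyin` fields are assumptions made for the example; the project's real files may be organized differently.

```python
import json


def load_json_lines(text, key):
    """Parse one JSON object per non-empty line and index the records by `key`."""
    index = {}
    for line in text.splitlines():
        if line.strip():
            record = json.loads(line)
            index[record[key]] = record
    return index


# Hypothetical sample records, used only to exercise the loader.
SAMPLE = "\n".join([
    '{"character": "水", "pinyin": ["shuǐ"]}',
    '{"character": "山", "pinyin": ["shān"]}',
])

chinese = load_json_lines(SAMPLE, "character")
```

Indexing by character gives constant-time lookup when the generator later needs the data for a specific glyph.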

 

4.     Ideographic Description Characters (Unicode Blocks)

Ideographic Description Characters is a Unicode block used to describe CJK characters that do not exist or are not included in the Unicode Standard. Unencoded ideographs can be described by combining existing characters with Ideographic Description Characters. Figure 2 shows the 12 Ideographic Description Characters, which represent the 12 primary decompositions of Chinese characters (Howe, n.d.). To describe a non-existing character, an Ideographic Description Character is used as the decomposition, together with the existing characters it combines; the result is an Ideographic Description Sequence. However, computers do not render an Ideographic Description Sequence as the single character it describes.

Figure 2
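
To make the idea of an Ideographic Description Sequence concrete, the following sketch builds one as a plain string. The 12 characters occupy the code points U+2FF0 to U+2FFB; U+2FF2 and U+2FF3 combine three parts and the rest combine two. The helper name `make_ids` is hypothetical, and the components 氵 and 馬 are arbitrary examples rather than output of the project.

```python
# The 12 Ideographic Description Characters occupy U+2FF0..U+2FFB.
IDCS = [chr(cp) for cp in range(0x2FF0, 0x2FFC)]

# U+2FF2 (left-middle-right) and U+2FF3 (above-middle-below) take three parts;
# the other ten take two.
TERNARY = {"\u2ff2", "\u2ff3"}


def make_ids(idc, *parts):
    """Return an Ideographic Description Sequence: the IDC followed by its parts."""
    if idc not in IDCS:
        raise ValueError(f"not an Ideographic Description Character: {idc!r}")
    expected = 3 if idc in TERNARY else 2
    if len(parts) != expected:
        raise ValueError(f"{idc} takes {expected} parts, got {len(parts)}")
    return idc + "".join(parts)


# A left-right compound of two arbitrary example components.
left_right = make_ids("\u2ff0", "氵", "馬")
# Parts may themselves be sequences, which allows nesting.
nested = make_ids("\u2ff1", "艹", left_right)
```

The resulting strings are descriptions only; turning them into a visible glyph is a separate drawing step.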

 

5.     SVG path and the visualization of non-existing characters

Scalable Vector Graphics (SVG) is an XML-based vector image format for 2D graphics that can be scaled to any size without losing quality. With SVG paths, computers can draw lines, curves, and arcs, which makes it possible to draw character strokes and therefore non-existing characters. Figure 3 shows examples of non-existing characters drawn with SVG paths, with the different components of each character highlighted. Each character represents one decomposition.

Figure 3
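
One possible way to draw a composed character is to wrap each component's stroke paths in an SVG group that is moved and scaled into its slot. The sketch below does this for a left-right layout. It is an assumption-laden illustration: the function names, the 1024-unit canvas, the half-width squeeze, and the toy path data are all hypothetical and are not taken from the project.

```python
def place(paths, x, y, sx, sy):
    """Wrap stroke paths in a group moved to (x, y) and scaled by (sx, sy)."""
    body = "".join(f'<path d="{d}"/>' for d in paths)
    return f'<g transform="translate({x} {y}) scale({sx} {sy})">{body}</g>'


def compose_left_right(left_paths, right_paths, size=1024):
    """Return an SVG document that shows two components side by side."""
    half = size // 2
    # Each component is squeezed to half width and keeps its full height.
    left = place(left_paths, 0, 0, 0.5, 1)
    right = place(right_paths, half, 0, 0.5, 1)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">'
        f"{left}{right}</svg>"
    )


# Toy stroke data (two diagonal lines), used only to exercise the functions.
svg_text = compose_left_right(["M 100 100 L 900 900"], ["M 100 900 L 900 100"])
```

Because the transform is applied to a group, the original stroke coordinates stay untouched and the same component data can be reused in any layout.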

 

6.     Decompositions

As mentioned under the generating process, the algorithm selects a suitable decomposition depending on the characters. However, not all decompositions suit the generated characters. First, the algorithm skips the third and fourth decompositions in figure 2 (the top-left corner), because they can be achieved with the first two decompositions. The last decomposition is also excluded from generation, since it depends on the specific shapes of the characters and is therefore unsuitable for the generating process.
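
The filtering described above can be sketched as follows. The sketch assumes that figure 2 lists the Ideographic Description Characters in Unicode code-point order, so that the "third", "fourth", and "last" entries are U+2FF2, U+2FF3, and U+2FFB; if the figure uses a different order, the indices would need to change. The names `USABLE` and `three_in_a_row` are hypothetical.

```python
ALL_IDCS = [chr(cp) for cp in range(0x2FF0, 0x2FFC)]

# Assumption: figure 2 follows code-point order, so the skipped entries are
# the third (U+2FF2), the fourth (U+2FF3) and the last (U+2FFB).
SKIPPED = {ALL_IDCS[2], ALL_IDCS[3], ALL_IDCS[-1]}
USABLE = [idc for idc in ALL_IDCS if idc not in SKIPPED]


def three_in_a_row(a, b, c):
    """Express a left-middle-right layout by nesting two left-right layouts."""
    left_right = ALL_IDCS[0]  # U+2FF0
    return left_right + a + left_right + b + c
```

The helper shows why a three-part layout is redundant here: the same arrangement can be written with the first decomposition applied twice.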

Additionally, components and characters need different position arrangements on the canvas. As shown in figure 4, components are generally smaller than characters, and they appear in different positions even when both are given the same location parameters. Moreover, apart from the first two decompositions, the other decompositions are available only for certain components. For example, the fifth one needs the radical “.

 

Figure 4

 

7.     Pronunciation matching

To find the pronunciation-matching character, regular expressions are used to search for letters. Take the word "Bouxcuengh" (the Zhuang people) as an example. First, the algorithm looks up the phonemes of the word in the International Phonetic Alphabet (IPA). For "Bouxcuengh", the phonemes are "/pou˦˨ ɕuːŋ˧/". Second, since Zhuang words can be either monosyllabic or multisyllabic, it checks whether the phoneme string contains one or more spaces. If so, the word is multisyllabic, as "Bouxcuengh" is. Third, following the commonly used Soundex phonetic algorithm (National Archives and Records Administration, n.d.), it treats consonants as the primary factor in pronunciation similarity. It therefore matches the consonants of each syllable, or the first IPA phone if the syllable contains no consonant. Figure 5 shows the mapping from the Zhuang alphabet to Chinese pinyin. Some letters correspond to multiple pinyin letters; in these cases, the algorithm randomly selects one of the options and then returns the pronunciation-matching character.

 

Figure 5
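
The matching steps for "Bouxcuengh" can be sketched with Python's `re` module. This is a simplified illustration, not the project's implementation: it looks only at the consonants that open each syllable, the vowel list is partial, and the mapping `HYPOTHETICAL_PINYIN` is a placeholder keyed by IPA symbols that does not reproduce the table in figure 5. All function names are hypothetical.

```python
import random
import re

# Tone letters (U+02E5..U+02E9) and the length mark (U+02D0) are removed first.
TONE_OR_LENGTH = re.compile("[\u02e5-\u02e9\u02d0]")
# Partial vowel list, an assumption made for this sketch.
VOWELS = "aeiou\u026f\u0259\u025b\u0254"
# Everything before the first vowel counts as the opening consonant(s).
ONSET = re.compile(f"^[^{VOWELS}]+")

# Placeholder mapping; the real table is the one shown in figure 5.
HYPOTHETICAL_PINYIN = {"p": ["b"], "\u0255": ["x", "j"]}


def is_multisyllabic(ipa):
    """A space inside the phoneme string separates syllables."""
    return " " in ipa.strip("/ ")


def syllables(ipa):
    """Drop slashes, tones and length marks, then split on spaces."""
    return TONE_OR_LENGTH.sub("", ipa.strip("/ ")).split()


def match_key(syllable):
    """Opening consonant(s), or the first phone when a vowel comes first."""
    found = ONSET.match(syllable)
    return found.group(0) if found else syllable[0]


def pinyin_initials(ipa, rng=random):
    """Map each syllable's key to one pinyin option, chosen at random."""
    keys = [match_key(s) for s in syllables(ipa)]
    return [rng.choice(HYPOTHETICAL_PINYIN.get(k, [k])) for k in keys]
```

For "/pou˦˨ ɕuːŋ˧/", `syllables` yields "pou" and "ɕuŋ", and `match_key` reduces them to "p" and "ɕ"; the final lookup of a Chinese character with a matching pinyin initial is left out of the sketch.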

Conclusion:

Generative Sawndip is a generative language art project that uses computer algorithms to generate new Sawndip characters for the Zhuang language. It brings together computer programming and the language of the Zhuang minority. The project explores the culture and identity of the ethnic Zhuang minority and presents them in new media forms.

 

References:

Bauer, R. S. (2000). The Chinese-based writing system of the Zhuang language. Cahiers de Linguistique Asie Orientale, 29(2), 223-253. https://doi.org/10.1163/19606028-90000082

 

Beauchamp, F. (2010). Asian Origins of Cinderella: The Zhuang Storyteller of Guangxi. Oral Tradition, 25(2), 0.

 

Healey, J., & Palriwala, R. (1998). Race, ethnicity, gender and class: The sociology of group conflict and change. Contributions to Indian Sociology, 32(1), 150.

Holm, D. (2013). Mapping the old Zhuang character script: A vernacular writing system from Southern China (Handbuch der Orientalistik. Vierte Abteilung, China; 28. Bd). Boston, Mass.: Brill.

Howe, D. C. (n.d.). Rednoise.org. Retrieved April 15, 2022, from https://rednoise.org/daniel/radicaloftheverticalheart

Karjalainen, H. (2020). Cultural identity and its impact on today’s multicultural organizations. International Journal of Cross Cultural Management, 20(2), 249–262. https://doi.org/10.1177/1470595820944207

National Archives and Records Administration. (n.d.). Soundex system. National Archives and Records Administration. Retrieved April 15, 2022, from https://www.archives.gov/research/census/soundex

Schwenger, P. (2019). Asemic: The art of writing. Minneapolis: University of Minnesota Press.