SM4712A Graduation Project
Theoretical Text
Steven Zixuan ZHOU 55668513
School of Creative Media, City University of Hong Kong
Work Title: Generative Sawndip
Artist Statement:
Sawndip is a logographic Chinese-derived writing system traditionally
used by Tai-speaking people in Southern China and Northern Vietnam including
ethnic Zhuang groups. Due to the lack of standardization, there are many
variations of Sawndip characters across different regions and even different
individuals. Generative Sawndip is a generative language art project using
computer algorithms to create new Sawndip characters for the Zhuang minority
language. Inspired by critical software art and generative art, I explore the
possibilities between computer programming and critical social issues to find
innovative ways of connecting underrepresented ethnic minority cultures with
modern computer techniques. Using language data and following the historical
rules for forming Sawndip characters, the project generates characters that do
not yet exist to show the nature of the Sawndip writing system and Zhuang
language culture. As someone who is half Chinese and half Zhuang, I also use
the project to explore my cultural heritage and identity by making my own
version of Sawndip characters.
Motives:
One motive for this project is my concern about the
situation of ethnic minorities and about issues of race and
ethnicity. According to Healey & Palriwala
(1998), a minority group is a group of people who experience comparative
disadvantage relative to a dominant social group. The project therefore focuses on ethnic
minority groups. Another motive is my own background as a member of
an ethnic minority. Because the environment I grew up in had lost its connection with
Zhuang minority culture, the project explores my cultural identity as
someone who is half Chinese and half Zhuang. Moreover, in terms of cultural identity,
Karjalainen (2020) stated that studies on different cultures and the concept of
cultural identity remain limited due to the influential studies of Hofstede,
which led to increased interest in national cultures. The cultural identity of
ethnic minorities tends to be neglected. Therefore, the project also discusses
and studies the culture of the Zhuang minority groups.
Artistic Concept:
The artistic concept is to combine modern computer
techniques with the culture of ethnic minorities. In my research on artworks
related to ethnic minorities, most works about ethnic minority
culture were presented in traditional media. This leaves a niche for
presenting ethnic minority culture in new media. The
project therefore uses computer algorithms to study and discuss the Zhuang minority
groups from a new perspective.
The project also focuses on the language of the Zhuang
minority groups as a way to study Zhuang culture. Today, the Zhuang language is
written in a Latin-based writing system, while Sawndip was the traditional
script used by the Zhuang people in the past. The project converts
the historical rules for forming Sawndip characters into computer algorithms to
create new characters, my own version of Sawndip. Moreover, Schwenger
(2019) discusses a kind of writing called "asemic writing" that is not
intended to convey any message, only the nature of
writing. Likewise, the newly generated Sawndip characters are not meant for
communication; they are an art of writing that shows the nature of
Sawndip characters and how they are constructed.
Sawndip Characters:
The traditional Zhuang writing script, Sawndip, is a
logographic Chinese-derived writing system used for writing Tai languages in
Southern China and Northern Vietnam by many ethnic groups, including the Bouyei, Nùng, Tày, and Zhuang. Most
Sawndip characters are based on Chinese characters. However,
compared with the similar traditional Vietnamese writing system, Sawndip has
never been standardized, which has led to many variants in different
places (Holm, 2013).
According to Bauer (2000), Sawndip characters can be classified under 9
principles based on how they were constructed and where they originated:
1. Symbols that do not originate from Chinese characters and were probably borrowed from other writing systems, including the Latin alphabet and Burmese.
2. Non-standard Chinese-like characters created as semantic compounds, a method that also exists in Chinese.
3. Non-standard Chinese-like characters created as phono-semantic compounds.
4. Chinese characters borrowed only to indicate native Zhuang pronunciations.
5. Chinese characters borrowed only to indicate the native meanings of Zhuang words.
6. Chinese loanwords.
7. Written Cantonese characters.
8. Non-standard Chinese-like characters created as indicative ideograms to indicate meaning.
9. Non-standard Chinese-like characters created as phonetic compounds that “spell out” native pronunciations, comparable to “fanqie” in Chinese.
More than 70 percent of Sawndip characters are formed by phonetic methods.
Sawndip Literature:
Sawndip literature is the Zhuang literature written in
Sawndip characters. Among the various pieces of literature, a fairy tale, The
Orphan Girl and the Rich Girl, has attracted much attention. The story
starts with an orphan girl who was abused by her stepmother after her
biological parents passed away. On the day of the festival, her biological
mother turned into a crow to help her dress up to attend the festival. At the
festival, she met the local prince and lost one of her shoes when she left. The
prince used the shoe to find her and finally married the girl. The story is
regarded as very similar to the Western story Cinderella (Beauchamp, 2010).
Techniques:
1. Generating Process
Figure 1 shows the process of generating new Sawndip characters. Generation starts
with two inputs: a JSON object for a Zhuang word, and the Chinese character or
characters that indicate the meaning of that word. At most two Chinese
characters may be given. The algorithm selects a rule according to
the Chinese characters; the rule numbers refer to the six rules listed under
Rules. If two Chinese characters are given, the system
chooses rule 1, semantic compounds. If the single input is a
component or radical, it randomly selects one rule from rules 2, 5, and 6.
If the single input is a full character, it randomly selects one rule
from rules 2, 3, 4, and 6. Then, following the selected rule, the
algorithm finds another Chinese character that matches the pronunciation of
the Zhuang word and selects a suitable decomposition to construct the character.
One character is generated for each Zhuang word.
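The rule-selection step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names and the `is_component` flag are hypothetical, and how the project decides whether an input is a component or a full character is not specified here.

```python
import random

def candidate_rules(meaning_chars, is_component):
    """Return the rule numbers that may apply to the given input.

    meaning_chars: list of one or two Chinese characters (the meaning input).
    is_component:  hypothetical flag, True if the single input is a
                   component or radical rather than a full character.
    """
    if not 1 <= len(meaning_chars) <= 2:
        raise ValueError("expected one or two Chinese characters")
    if len(meaning_chars) == 2:
        return [1]            # two inputs: semantic compound
    if is_component:
        return [2, 5, 6]      # component or radical
    return [2, 3, 4, 6]       # full character

def select_rule(meaning_chars, is_component, rng=None):
    """Pick one applicable rule at random."""
    rng = rng or random
    return rng.choice(candidate_rules(meaning_chars, is_component))

print(select_rule(["山", "水"], False))  # two inputs leave only rule 1
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which can help when comparing generated characters.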
Figure 1
2. Rules
Rather than using the original principles directly, the project selects and
condenses the 9 principles into 6 rules for the generating process:
1. Non-standard Chinese-like characters created as semantic compounds.
2. Non-standard Chinese-like characters created as phono-semantic compounds, a method that also exists widely in Chinese.
3. Standard Chinese characters borrowed solely for their pronunciations.
4. Standard Chinese characters borrowed solely for their meanings.
5. Indicative ideograms (the characters’ radicals or components).
6. Characters made as phonetic compounds that “spell out” the pronunciation of the Zhuang word, following the “fanqie” system in Chinese.
3. Data
The language data comprises Chinese, Zhuang, and Chinese stroke data, all stored in JSON format. The Zhuang data are from Kaikki.org. The Chinese data are from the Make Me a Hanzi project.
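As a rough illustration of how such JSON records might be read, the sketch below parses line-delimited JSON entries and indexes them by character. The sample record and its field names (`character`, `pinyin`, `decomposition`) are assumptions modeled on the Make Me a Hanzi dictionary file and should be checked against the actual data; the function names are hypothetical.

```python
import json

# Hypothetical, abbreviated record in the style of a line-delimited JSON file.
SAMPLE_LINES = [
    '{"character": "好", "pinyin": ["hǎo"], "decomposition": "⿰女子"}',
]

def parse_entry(line):
    """Parse one JSON line and keep only the fields used in this sketch."""
    record = json.loads(line)
    return {
        "character": record["character"],
        "pinyin": record.get("pinyin", []),
        "decomposition": record.get("decomposition", ""),
    }

def index_by_character(lines):
    """Build a lookup table from character to its parsed entry."""
    index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = parse_entry(line)
        index[entry["character"]] = entry
    return index

hanzi = index_by_character(SAMPLE_LINES)
print(hanzi["好"]["decomposition"])
```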
4. Ideographic Description Characters (Unicode Block)
Ideographic Description Characters is a Unicode block used to describe CJK
characters that do not exist or are not included in the Unicode Standard. Such
unencoded ideographs can be described using existing characters together with
the Ideographic Description Characters. Figure 2 shows the 12 Ideographic
Description Characters, which represent the 12 primary decompositions of
Chinese characters (Howe, n.d.). A description of a non-existing character
consists of an Ideographic Description Character, which gives the
decomposition, followed by the existing characters it combines; this is called
an Ideographic Description Sequence. However, Unicode cannot render an
Ideographic Description Sequence directly as a single character on screen.
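To make the idea of an Ideographic Description Sequence concrete, the sketch below builds one as a plain string. It assumes the 12 Ideographic Description Characters are the code points U+2FF0 to U+2FFB; the helper name `make_ids` and the example components are hypothetical. The result is only a description: printing it still shows the separate symbols rather than one composed glyph.

```python
# The 12 Ideographic Description Characters, assumed to be U+2FF0..U+2FFB.
IDCS = [chr(cp) for cp in range(0x2FF0, 0x2FFC)]

LEFT_RIGHT = "\u2FF0"   # left-to-right
ABOVE_BELOW = "\u2FF1"  # above-to-below
# The two three-part forms (U+2FF2, U+2FF3) take three operands; the rest take two.
TERNARY = {"\u2FF2", "\u2FF3"}

def make_ids(idc, *parts):
    """Join a description character and its operands into one sequence.

    Each part is either an existing character or a nested sequence.
    """
    if idc not in IDCS:
        raise ValueError("not an Ideographic Description Character")
    expected = 3 if idc in TERNARY else 2
    if len(parts) != expected:
        raise ValueError(f"{idc} takes {expected} parts, got {len(parts)}")
    return idc + "".join(parts)

left_right = make_ids(LEFT_RIGHT, "女", "子")
nested = make_ids(ABOVE_BELOW, "艹", make_ids(LEFT_RIGHT, "氵", "酉"))
print(left_right, nested)
```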
Figure 2
5. SVG path and the visualization of non-existing characters
Scalable Vector Graphics (SVG) is an XML-based vector image format for 2D
graphics that can be scaled to different sizes without losing quality. With SVG
paths, computers can draw lines, curves, and arcs, which makes it possible to
draw character strokes and therefore non-existing characters. Figure 3
shows examples of non-existing characters drawn with SVG paths,
with the different components of each character highlighted. Each character
represents one decomposition.
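The following sketch shows one way to place component outlines on a shared canvas with SVG paths. It is an illustration under assumptions, not the project's renderer: the 1024-unit canvas, the `compose_svg` helper, and the straight-line path data are hypothetical placeholders rather than real stroke data.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def compose_svg(components, size=1024):
    """Return an SVG string that draws each component inside its own box.

    components: list of (paths, (x, y, width, height)), where paths is a
    list of SVG path strings drawn on a size-by-size canvas.
    """
    svg = ET.Element("svg", {"xmlns": SVG_NS, "viewBox": f"0 0 {size} {size}"})
    for paths, (x, y, width, height) in components:
        # Move the component to its box, then shrink it to fit the box.
        transform = f"translate({x},{y}) scale({width / size},{height / size})"
        group = ET.SubElement(svg, "g", {"transform": transform})
        for d in paths:
            ET.SubElement(group, "path", {"d": d, "fill": "none", "stroke": "black"})
    return ET.tostring(svg, encoding="unicode")

# Placeholder outlines (not real strokes), arranged left and right.
left = (["M 100 100 L 900 100", "M 500 100 L 500 900"], (0, 0, 512, 1024))
right = (["M 100 900 L 900 900"], (512, 0, 512, 1024))
svg_text = compose_svg([left, right])
print(svg_text)
```

A top-and-bottom arrangement would use boxes such as `(0, 0, 1024, 512)` and `(0, 512, 1024, 512)` instead.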
Figure 3
6. Decompositions
As mentioned in the generating process, the algorithm selects a suitable decomposition, which depends on the characters. However, not all decompositions are suitable for the generated characters. First, the algorithm skips the third and fourth decompositions in figure 2 (the top-left corner), because they can be achieved with the first two decompositions. The last one is also excluded from generation because it depends on the specific shapes of the characters, which makes it unsuitable for the generating process.
Additionally, components and characters need different position arrangements on the canvas. As shown in figure 4, a component is generally smaller than a character, and its position also differs even when both are given the same location parameters. Moreover, apart from the first two decompositions, the other decompositions are available only for certain components. For example, the fifth one requires the radical “⼞”.
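The filtering described above could look roughly like this. The sketch assumes that figure 2 lists the Ideographic Description Characters in Unicode order (U+2FF0 to U+2FFB), so that the third and fourth are the three-part forms and the last is the overlaid form; that ordering, the names, and the constraint table are assumptions. Only the “⼞” requirement comes from the text; the other restricted decompositions are left empty because their required components are not listed.

```python
# Assumed order of figure 2: Unicode order, U+2FF0..U+2FFB.
ALL_IDCS = [chr(cp) for cp in range(0x2FF0, 0x2FFC)]

# Third, fourth, and last decompositions are skipped by the generator.
SKIPPED = {ALL_IDCS[2], ALL_IDCS[3], ALL_IDCS[-1]}
# The first two decompositions work with any pair of parts.
UNRESTRICTED = set(ALL_IDCS[:2])

ENCLOSURE = "⼞"  # the radical named in the text
# Components that unlock each restricted decomposition. Only the fifth is
# filled in; the empty sets are placeholders to be completed.
REQUIRED_COMPONENTS = {idc: set() for idc in ALL_IDCS[4:-1]}
REQUIRED_COMPONENTS[ALL_IDCS[4]] = {ENCLOSURE}

def usable_decompositions(components):
    """List the decompositions that the given components allow."""
    usable = []
    for idc in ALL_IDCS:
        if idc in SKIPPED:
            continue
        if idc in UNRESTRICTED or REQUIRED_COMPONENTS[idc] & set(components):
            usable.append(idc)
    return usable

print(usable_decompositions(["女", "子"]))
print(usable_decompositions([ENCLOSURE, "玉"]))
```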
Figure 4
7. Pronunciation matching
To find the pronunciation-matching character, regular expressions are used to search for letters. Take the word "Bouxcuengh" (the Zhuang people) as an example. First, the algorithm looks up the phonemes of the word in the International Phonetic Alphabet (IPA). For "Bouxcuengh", the phonemes are "/pou˦˨ ɕuːŋ˧/". Second, since Zhuang words can be either monosyllabic or multisyllabic, it checks whether the phoneme string contains one or more spaces. If so, the word is multisyllabic, as in the case of "Bouxcuengh". Third, following the commonly used Soundex phonetic algorithm (National Archives and Records Administration, n.d.), which treats consonants as the main determinant of pronunciation similarity, it matches on the consonants of each syllable, or on the first IPA phone if the syllable contains no consonant. The table below shows the mapping from the Zhuang alphabet to Chinese pinyin. Some letters correspond to multiple pinyin letters; in these cases, the algorithm randomly selects one of the options and then returns the pronunciation-matching character.
Figure 5
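The syllable-splitting and consonant-matching steps can be sketched as below. This is a minimal sketch under assumptions: the consonant inventory, the choice to use only the first consonant of each syllable, and the function names are hypothetical, and `PLACEHOLDER_MAP` contains made-up illustrative entries, not the mapping shown in figure 5.

```python
import random
import re

# Assumed consonant inventory for this sketch; not taken from the project.
CONSONANTS = "pbɓmfvβtdɗnslkgŋhɕjɲɣʔw"
CONSONANT_RE = re.compile(f"[{CONSONANTS}]")

def split_syllables(ipa):
    """Strip the surrounding slashes and split the IPA string on spaces."""
    return ipa.strip("/").split()

def is_multisyllabic(ipa):
    """A word is multisyllabic if its IPA string contains whitespace."""
    return re.search(r"\s", ipa.strip("/")) is not None

def sound_key(syllable):
    """First consonant of the syllable, or its first phone if none is found."""
    match = CONSONANT_RE.search(syllable)
    return match.group(0) if match else syllable[0]

def pick_pinyin(key, mapping, rng=None):
    """Randomly choose one pinyin option for a sound key."""
    rng = rng or random
    return rng.choice(mapping[key])

# Placeholder values for illustration only; NOT the table in figure 5.
PLACEHOLDER_MAP = {"p": ["b"], "ɕ": ["x", "s"]}

ipa = "/pou˦˨ ɕuːŋ˧/"
keys = [sound_key(s) for s in split_syllables(ipa)]
print(is_multisyllabic(ipa), keys)
print([pick_pinyin(k, PLACEHOLDER_MAP) for k in keys])
```

The chosen pinyin initial would then be used to look up a Chinese character with a matching pronunciation in the Chinese data.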
Conclusion:
Generative Sawndip is a generative language art project that uses computer
algorithms to generate new Sawndip characters for the Zhuang language. It
focuses on computer programming and the language of the Zhuang minority groups.
The project explores the culture and identity of the ethnic Zhuang minority and
presents them in new media forms.
References:
Bauer, R. S. (2000). The Chinese-based writing system of the Zhuang language. Cahiers de Linguistique Asie Orientale, 29(2), 223-253. https://doi.org/10.1163/19606028-90000082
Beauchamp, F. (2010). Asian origins of Cinderella: The Zhuang storyteller of Guangxi. Oral Tradition, 25(2), 0.
Healey, J., & Palriwala, R. (1998). Race, ethnicity, gender and class: The sociology of group conflict and change. Contributions to Indian Sociology, 32(1), 150.
Holm, D. (2013). Mapping the old Zhuang character script: A vernacular writing system from Southern China (Handbuch der Orientalistik. Vierte Abteilung, China; 28. Bd). Boston, Mass.: Brill.
Howe, D. C. (n.d.). Rednoise.org. Retrieved April 15, 2022, from https://rednoise.org/daniel/radicaloftheverticalheart
Karjalainen, H. (2020). Cultural identity and its impact on today’s multicultural organizations. International Journal of Cross Cultural Management, 20(2), 249-262. https://doi.org/10.1177/1470595820944207
National Archives and Records Administration. (n.d.). Soundex system. National Archives and Records Administration. Retrieved April 15, 2022, from https://www.archives.gov/research/census/soundex
Schwenger, P. (2019). Asemic: The art of writing. Minneapolis: University of Minnesota Press.