Still unify to 𥦗 (U+25997) by UCV #22, because no actual use of abstract shape ⿱[宀][总] is found.
If the actual shape ⿱|宀||总| is regarded as more important, it's okay for G-source to update the reference glyph of 𥦗 (U+25997).
Disagree with the analysis, agree with UCV in Henry #13328.
Fanqie information in Evidence #1, “△毒, …, 初錦反” shows 初 initial 侵 final (*tʃʰĭĕm) 上 tone, in phonetic symbol field {*TSƏM (參)}, instead of {*TSAU (喿)}.
But variation relationships are objective and are divided into two groups:
Agree unify to 𥳭 (U+25CED) according to liding unification rule.
There are four types of liding, 𥳭 (Shuowen) / ⿱𥫗肯 (Guangjinshi, same below) / 𦒕 / 𥰸. Any other liding should be unified to these four.
Agree unify to 𦎯 (U+263AF).
Fanqie information of 搆 shows 鈎豆反, 見 initial 侯 final 去 tone, M.C. *kəu.
Four phonetic symbols, {*KOK (𡉉)}, {*KO (冓)}, {*KO (后)} and {*KO (丩)}, cover this status. In phonetic symbol {*KOK (𡉉) > 㱿} there are 𤚲㝅𢐙.
For phonetic symbol 殸, there is only {*KLEŊ (殸)}, which does not cover this status.
So, in M.C. 見侯去 status, |殸| actual shape belongs to [㱿] class.
As UCV #312d covers this case, the rule should be applied.
UCV
For |殸| and |𣪊|:
撀 (GKX) ~ 㝅, 𢐙 (GKX) ~ 彀, 𣫘 (GKX) ~ 㲉, 𧹷 (GKX) ~ 𧹲, 𨢤 (GKX) ~ 𨢋
㯏 (GHZR) ~ 穀, 䅽² (GHZR) ~ 穀, 䡰³ (GHZR) ~ 轂, 𤛗 (GHZR) ~ 𤚲, 𩏜 (GHZR) ~ 轂, 𪍱 (GHZR) ~ 𪍠
𥊧 (B02905-001) ~ 瞉 (B02905)
𮭑 (B06142-016) ~ 鷇 (B06142)
𰒘 (GHZR) ~ 愨 <ORT>, “Since 殸 and 𣪊 often plays a phonetic part in characters, encoding the corrupted forms could lead to confusion. It is recommended that such corruption in vulgar characters should be unifiable via IVS.”
Besides the difference in the relative length of the 丿/丶 stroke, |殸| has an additional 丨 stroke.
Agree not unify to 䇂 (U+41C2).
(1) Actual shape with multiple cognition, one from [䇂] > |⿱丶平|, another from [䘚] > |卒| > |⿳亠儿十| > |⿱丶平|. (2) Productive component: 𠱪𢬪𥞯𨒳𱲵 ...
The same applies to 囙:
(1) Actual shape with multiple cognition, one from [因] > |𡆬| > |囙|, another from [回] > |囙|. (2) Productive component: 𠰸𡖣𡛸𢈞𤇀𤇆𤤨 ...
Fanqie information shows 准閏反, 章 initial 諄 final 去 tone, here the phonetic symbol should be {*TUN (𦎫)}.
For 𣠐, M.C. status should be 見 initial 鐸 final 入 tone, and the phonetic symbol should be {*WAK (𩫖¹)}.
So the abstract shape for WS2021-01837 should be ⿰[木][𦎫] instead of ⿰[木][𩫖¹], which is unencoded. If we have actual shape ⿰|木||𦎫|, we should unify it to WS2021-01837.
Unify to 𠫼 (U+20AFC).
There are two “kinds” of “upper 齊”. One is |厽| shape, the other is |𠫸| shape. The second shape has three vertical bars downward. |厽| and |二| make [亝], |𠫸| and |二| make [𠫼]. Here we think |人| should be the two vertical bars in |𠫸|, so the whole shape should belong to [𠫼].
纒列反 gives 澄 initial 薛 final 入 tone, same as 徹.
So |⿱云𭕄| belongs to [⿱𠫓𭕄 < 㐬], |𠕎| belongs to [肉]. So the abstract shape should be ⿲[彳]⿱[㐬][肉][攵]. Since [㐬] may be different from [𠫓], WS2021-01310 may not be unified to 徹.
According to 汗簡, the correct form should be ⿰[犭]⿱[火][刀], the phonetic symbol should be {*LEK (狄)}.
According to UCV #305 ⿱ ~ ⿸, ⿸[狄][刀] ~ ⿱[狄][刀] = [𠜓].
[总] may be a special case for identifying abstract shape, because it came from simplification in {*TSOŊ (悤)} field.
The two PRC simplified forms that contain [总] is [总] from [總] and [聪] from [聰]. Another over-simplified forms, [𩨂] (GFZ) from [驄], [] (GZH) from [璁], [𱶚] (Leizhou) from [𥡥], establish strong correlation between [总] and [悤]. So if a true ⿱[宀][总] exists, it may connect to ⿱[宀][悤], that is WS2021-01052.
[窗] is related to [窻] [牕] [窓], with M.C. status 初 initial 江 final 平 tone, in phonetic symbol {*TSOŊ (悤)}. [總] is in 精 initial 東 final 上 tone, and [聰], related to [聦], is in 清 initial 東 final 平 tone; all are in phonetic symbol {*TSOŊ (悤)}. So a potentially existing ⿱[宀][总] may also be in {*TSOŊ (悤)}.
Comment
Reply to #14310,
Radical 穴 comes from cognition. Seal [窗] should be analysed as ⿱[穴][囱], and seal [窻] should be analysed as ⿱[穴][悤 (phonetic)].
Then [悤] becomes |怱| |忽| |𱝼| |𢗀|. The |八| from |穴| is regarded to form a unity with |𱝼| |𢗀|, and thus constitutes the |忩| |总|.
The liding unification rule does not involve any authoritative discourse, but only the identification of inter-character relationship, which is in the scope of IRG work.
Determining that two Kaishu actual shapes correspond to the same ancient form, as suggested by IRGN2612, limits the determination to a relatively small scope. This unification rule would come into play only for a special document (e.g., evidence or a repertoire declaring the source of ancient forms) or special content (e.g., the evidence here, which discusses character cognition with direct reference to Shuowen or pre-Qin materials).
Comment
Reply to #14385,
Sorry. “Clearly a case of different components therefore abstract shape is different” may itself be unclear.
In the current UCV list, one may find many so-called “different components” that correspond to the same abstract shape, such as |壬| |王| (#1), |天| |夭| (#8), |戌| |戍| (#49), |凡| |卂| (#55), |圣| |𢀖| (#87), |屮| |山| (#96), and so on. So the sentence is correct only when “different component” means “different abstract component”.
In this case, |囚| and |𠔽| are indeed different actual shapes, but according to the evidence, they belong to the same abstract shape [𠔽] that is part of [𦤝].
Sorry. I may not have explained the information I listed in enough detail.
I use the “~” symbol for what I identify as a Z-variant relation (广义异写) (that may be unifiable). The parentheses indicate the source of the variation relation, e.g., an intercharacter relation identified by GHZR, GZH, GKX, or an intercharacter relation identified by the Variant Dictionary (using the same glyph number).
In particular, encoding determinations made during the Extension B period shall not be regarded as disunification due to different abstract shapes. So all of the examples I listed seem to me to be examples of what could be used as unification.
P.S. I may use “>” for what I identify as a Y-variant relation (广义异构) (that may be disunifiable), use “≁” for non-cognate, [...] for abstract shape (or abstract IDS), |...| for actual shape (or actual IDS), {*... (...)} for phonetic symbol field.
Comment
Reply to #14362 and #14368,
Thank you! I will change to 𧓋 (DCCV: C12501), in which DCCV represents Dictionary of Chinese Character Variants (異體字字典).
I'm still thinking about how to design a proper notation system that simplifies the presentation and clarifies the concepts in ORT reviewing, so the notation system is still unstable, sorry.
Wenzhou /ɡ/ corresponds to the 見-匣 initials. Wenzhou /uɔ/ after 見-匣 initials corresponds to two groups: one is the 肴-宵 finals ({包} {交} {卯} ...), the other is the 江-唐-陽 finals ({旁} {當} {方} {工} ...). Wenzhou /³⁴/ 陽上 (the book says /²⁴/ is used to be conspicuous) corresponds to 上 tone.
With the 見 initial and 江-唐-陽 finals, these syllables become /kɔ̃/ in Taizhou (Xianju point) and /kɔ̃/ in Lishui (Yunhe point). Since we lack direct language material, the linguistic association cannot be substantiated.
IRG Working Set 2021v5.0
Source: Kushim JIANG
Date: Generated on 2024-12-05
Unification
Still unify to 𥦗 (U+25997) by UCV #22, because no actual use of abstract shape ⿱[宀][总] is found.
If the actual shape ⿱|宀||总| is regarded as more important, it's okay for G-source to update the reference glyph of 𥦗 (U+25997).
Agree unify to 𣝼 (U+2377C).
The evidence all points to Shuowen, so the liding unification rule can be applied.
𧓋 (C12501) ~ 𧔊 (C12501-001)
𧸪 (C13655-002) ~ 𧸳 (C13655)
𨞬 ~ 鄽 (GHZR: 4054.04)
⿰金𠪨 (C15168-002) ~ 𨮻 (C15168)
⿰𧾷𠪨 (B04971-005) ~ 躔 (B04971)
⿵門𠪨 (C15366-001) ~ 𨷠 (TB-2215) 𨷭 (TB-2219)
⿺走𠪨 (C13847-002) ~ 𧾡 (C13847)
So it's okay to add UCV 𠪨~廛.
𦕦 (GZH, T6) ~ ⿱⿰耳人⿰𰀪𰀠 (GZH) ~ 焣 ~ 聚
⿳艹⿰耳人⿰亻仅 (GZH) ~ ⿳艹⿰耳人⿰𠂈仪 (GZH) ~ 叢
⿱⿰耳𡿨⿸亻𧘇 (A03273-014) ~ ⿱⿰耳𡿨乑 (A03273-008) ~ 聚
⿱⿰耳人⿰人从 (A03273-028-1) ~ ⿱取巛 (A03273-007) ~ 聚
⿱宀⿰耳𡿨 (A00302-005) ~ 㝡 (A00302-005) > 最
⿱穴⿰耳𡿨 (A00302-004) ~ 𥦡 (A00302-?) > 最
𱀹 (GHZ) > 𨼥 (GKX)
𱌣 (GHZ) > 𪙦 (GKX)
Fanqie information in Evidence #1, “△毒, …, 初錦反” shows 初 initial 侵 final (*tʃʰĭĕm) 上 tone, in phonetic symbol field {*TSƏM (參)}, instead of {*TSAU (喿)}.
But variation relationships are objective and are divided into two groups:
Group 1 {*TSAU}: ⿱品尒/⿱品𠇍 ~ 喿:
⿰氵⿱品尒 (A02344-002) ~ 澡 (A02344)
⿰𧾷⿱品尒 (A04048-002) ~ 躁 (A04048)
⿱艹⿰氵⿱品尒 (A03591-008) ~ 藻 (A03591)
Group 2 {*TSƏM}: ⿱品尒/⿱品𠇍 ~ 參:
⿰亻⿱品尒 (SAT-03878, WS2021-00212) ~ 傪
⿰忄⿱品尒 (A01423-015-4) / ⿰忄⿱品𠇍 (A01423-004) ~ 慘 (A01423)
⿰馬⿱品尒 (B05845-005) ~ 驂 (B05845) ~ 𮪚 (B05845-001)
⿰扌⿱品尒 (A01685-006) ~ ⿰扌⿱品𠇍 (A01685-024) ~ 摻
⿰木⿱品尒 (KC07807) ~ 槮
𥶟 (GKX) ~ 篸
𮇿 (SAT-00064) ~ 糝
The two groups can have UCV level 2.
Agree unify to 𥳭 (U+25CED) according to liding unification rule.
There are four types of liding, 𥳭 (Shuowen) / ⿱𥫗肯 (Guangjinshi, same below) / 𦒕 / 𥰸. Any other liding should be unified to these four.
Agree unify to 𦎯 (U+263AF).
Fanqie information of 搆 shows 鈎豆反, 見 initial 侯 final 去 tone, M.C. *kəu.
Four phonetic symbols, {*KOK (𡉉)}, {*KO (冓)}, {*KO (后)} and {*KO (丩)}, cover this status. In phonetic symbol {*KOK (𡉉) > 㱿} there are 𤚲㝅𢐙.
For phonetic symbol 殸, there is only {*KLEŊ (殸)}, which does not cover this status.
So, in M.C. 見侯去 status, |殸| actual shape belongs to [㱿] class.
As UCV #312d covers this case, the rule should be applied.
撀 (GKX) ~ 㝅, 𢐙 (GKX) ~ 彀, 𣫘 (GKX) ~ 㲉, 𧹷 (GKX) ~ 𧹲, 𨢤 (GKX) ~ 𨢋
㯏 (GHZR) ~ 穀, 䅽² (GHZR) ~ 穀, 䡰³ (GHZR) ~ 轂, 𤛗 (GHZR) ~ 𤚲, 𩏜 (GHZR) ~ 轂, 𪍱 (GHZR) ~ 𪍠
𥊧 (B02905-001) ~ 瞉 (B02905)
𮭑 (B06142-016) ~ 鷇 (B06142)
𰒘 (GHZR) ~ 愨 <ORT>, “Since 殸 and 𣪊 often plays a phonetic part in characters, encoding the corrupted forms could lead to confusion. It is recommended that such corruption in vulgar characters should be unifiable via IVS.”
Besides the difference in the relative length of the 丿/丶 stroke, |殸| has an additional 丨 stroke.
Agree unify to 䇂 (U+41C2) by liding unification rule.
Shuowen: “䇂, 辠也 ... 讀若愆. 張林說.”
Shuowen: “愆, 過也 ... 或从𡫜省. 籒文諐.”
Here 慧琳 connects 䇂 & 諐 with 𠌤𠍴 (愆).
(1) Actual shape with multiple cognition, one from [䇂] > |⿱丶平|, another from [䘚] > |卒| > |⿳亠儿十| > |⿱丶平|. (2) Productive component: 𠱪𢬪𥞯𨒳𱲵 ...
The same applies to 囙:
(1) Actual shape with multiple cognition, one from [因] > |𡆬| > |囙|, another from [回] > |囙|. (2) Productive component: 𠰸𡖣𡛸𢈞𤇀𤇆𤤨 ...
Fanqie information shows 准閏反, 章 initial 諄 final 去 tone, here the phonetic symbol should be {*TUN (𦎫)}.
For 𣠐, M.C. status should be 見 initial 鐸 final 入 tone, and the phonetic symbol should be {*WAK (𩫖¹)}.
So the abstract shape for WS2021-01837 should be ⿰[木][𦎫] instead of ⿰[木][𩫖¹], which is unencoded. If we have actual shape ⿰|木||𦎫|, we should unify it to WS2021-01837.
Unify to 𠫼 (U+20AFC).
There are two “kinds” of “upper 齊”. One is |厽| shape, the other is |𠫸| shape. The second shape has three vertical bars downward. |厽| and |二| make [亝], |𠫸| and |二| make [𠫼]. Here we think |人| should be the two vertical bars in |𠫸|, so the whole shape should belong to [𠫼].
纒列反 gives 澄 initial 薛 final 入 tone, same as 徹.
So |⿱云𭕄| belongs to [⿱𠫓𭕄 < 㐬], |𠕎| belongs to [肉]. So the abstract shape should be ⿲[彳]⿱[㐬][肉][攵]. Since [㐬] may be different from [𠫓], WS2021-01310 may not be unified to 徹.
Agree unify to 𠜓 (U+20713).
According to 汗簡, the correct form should be ⿰[犭]⿱[火][刀], the phonetic symbol should be {*LEK (狄)}.
According to UCV #305 ⿱ ~ ⿸, ⿸[狄][刀] ~ ⿱[狄][刀] = [𠜓].
Agree unify to 𪏻 (U+2A3FB).
Consider 𥣣 ~ 馛, 𩡍 ~ 馛, ⿰⿳禾氺田犬 (GZH) ~ 馛, ⿱⿰⿱禾氺⿹勹丿㐄 (SAT-09909) ~ 𤛿,
|⿱禾氺| should be upper part of seal 香 (𪏰𪏽) and should be [黍].
Examples:
⿳艹⿷匚炎⿰㐄㐄 (GZH) ~ 蕣.
⿰亻⿱⿰㐄㐄朩 (A00221-001) ~ 傑 (A00221).
⿰扌𱣎 (SAT-08367) ~ 搩.
⿰𧾷𱣎 (SAT-07000) ~ 𨃥.
According to the seal form, |夕| ~ [夊] is the mirror form of |㐄| ~ [𡕒]. So it is okay to unify these two.
Evidence
▲ 北京书同文数字化技术有限公司, 古籍汉字字频统计. 商务印书馆, 2008: page 120
The G4K database captures the character and does not submit it to Ext B. Consider that filtering has been done when generating the G4K repertoire.
▲ 南京中医药大学 (编著). 中药大辞典 (下册) (第2版). 2006.
▲ 異體字研究資料集成 一期 別卷二 龍龕手鑑: p. 40
▲ 俄羅斯科學院東方研究所聖彼得堡分所, 中國社會科學院民族研究所, 上海古籍出版社 (編). 俄羅斯科學院東方研究所聖彼得堡分所藏黑水城文獻 4 漢文部分 TK158-TK300. 上海古籍出版社, 1997: p. 43.
▲ 孙伯君. 西夏新译佛经陀罗尼的对音研究. 中国社会科学出版社, 2010: p. 112.
▲ 中国宗教历史文献集成. 102, 民间宝卷 / 周燮藩主编; 濮文起分卷主编. 合肥: 黄山书社, 2005. p. 民 2-288.
▲ 沈克成, 沈迦. 温州话词语考释. 2009: page 627.
▲ 沈克成, 沈迦. 温州话 修订版. 宁波出版社, 2006: page 260.
Other
The two PRC simplified forms that contain [总] is [总] from [總] and [聪] from [聰]. Another over-simplified forms, [𩨂] (GFZ) from [驄], [] (GZH) from [璁], [𱶚] (Leizhou) from [𥡥], establish strong correlation between [总] and [悤]. So if a true ⿱[宀][总] exists, it may connect to ⿱[宀][悤], that is WS2021-01052.
[窗] is related to [窻] [牕] [窓], with M.C. status 初 initial 江 final 平 tone, in phonetic symbol {*TSOŊ (悤)}. [總] is in 精 initial 東 final 上 tone, and [聰], related to [聦], is in 清 initial 東 final 平 tone; all are in phonetic symbol {*TSOŊ (悤)}. So a potentially existing ⿱[宀][总] may also be in {*TSOŊ (悤)}.
Radical 穴 comes from cognition. Seal [窗] should be analysed as ⿱[穴][囱], and seal [窻] should be analysed as ⿱[穴][悤 (phonetic)].
Then [悤] becomes |怱| |忽| |𱝼| |𢗀|. The |八| from |穴| is regarded to form a unity with |𱝼| |𢗀|, and thus constitutes the |忩| |总|.
▲ China NB. Proposal to Encode the Lisu Monosyllabic Script. WG2 N5047 = L2/19-208: pdf page 56.
▲ 木玉璋. 傈僳族音节文字字典. 知识版权出版社, 2006: page 193.
The three Lisu Zhushu characters recording “bird” /niɛ˧˥/ are different.
Determining that two Kaishu actual shapes correspond to the same ancient form, as suggested by IRGN2612, limits the determination to a relatively small scope. This unification rule would come into play only for a special document (e.g., evidence or a repertoire declaring the source of ancient forms) or special content (e.g., the evidence here, which discusses character cognition with direct reference to Shuowen or pre-Qin materials).
Sorry. “Clearly a case of different components therefore abstract shape is different” may itself be unclear.
In the current UCV list, one may find many so-called “different components” that correspond to the same abstract shape, such as |壬| |王| (#1), |天| |夭| (#8), |戌| |戍| (#49), |凡| |卂| (#55), |圣| |𢀖| (#87), |屮| |山| (#96), and so on. So the sentence is correct only when “different component” means “different abstract component”.
In this case, |囚| and |𠔽| are indeed different actual shapes, but according to the evidence, they belong to the same abstract shape [𠔽] that is part of [𦤝].
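The point about the UCV list can be illustrated with a small lookup, given here only as a sketch. The table holds just the six pairs and UCV numbers quoted in this comment, not the full UCV list, and the names UCV_PAIRS and ucv_for are hypothetical. A result of None therefore means only "not among the pairs quoted here"; it says nothing about whether two actual shapes share an abstract shape.

```python
# Actual-shape pairs quoted in this comment, with their UCV numbers.
# This is NOT the full UCV list; it is only the excerpt cited above.
UCV_PAIRS = [
    ("壬", "王", 1),
    ("天", "夭", 8),
    ("戌", "戍", 49),
    ("凡", "卂", 55),
    ("圣", "𢀖", 87),
    ("屮", "山", 96),
]


def ucv_for(a, b, pairs=UCV_PAIRS):
    """Return the UCV number covering the unordered pair (a, b), or None."""
    for x, y, number in pairs:
        if {a, b} == {x, y}:
            return number
    return None
```

For example, ucv_for("王", "壬") finds #1 regardless of argument order, while ucv_for("囚", "𠔽") returns None because that pair is argued from evidence here rather than from a listed UCV.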
Sorry. I may not have explained the information I listed in enough detail.
I use the “~” symbol for what I identify as a Z-variant relation (广义异写) (that may be unifiable). The parentheses indicate the source of the variation relation, e.g., an intercharacter relation identified by GHZR, GZH, GKX, or an intercharacter relation identified by the Variant Dictionary (using the same glyph number).
In particular, encoding determinations made during the Extension B period shall not be regarded as disunification due to different abstract shapes. So all of the examples I listed seem to me to be examples of what could be used as unification.
P.S. I may use “>” for what I identify as a Y-variant relation (广义异构) (that may be disunifiable), use “≁” for non-cognate, [...] for abstract shape (or abstract IDS), |...| for actual shape (or actual IDS), {*... (...)} for phonetic symbol field.
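As a rough illustration of the notation just described, the following sketch splits a line into typed tokens. It is only an assumption about how the notation could be read mechanically, since the notation itself is still unstable; the names TOKEN, RELATIONS and tokenize are hypothetical. Source tags in parentheses such as (GKX) are ignored, and a “>” written inside {*...} is kept as part of the phonetic symbol field.

```python
import re

# One alternative per bracket style from the P.S. above.
TOKEN = re.compile(
    r"\[(?P<abstract>[^\]]+)\]"   # [...]   abstract shape (or abstract IDS)
    r"|\|(?P<actual>[^|]+)\|"     # |...|   actual shape (or actual IDS)
    r"|\{\*(?P<field>[^}]+)\}"    # {*...}  phonetic symbol field
    r"|(?P<rel>[~>≁])"            # relation symbols
)

RELATIONS = {
    "~": "Z-variant",    # may be unifiable
    ">": "Y-variant",    # may be disunifiable
    "≁": "non-cognate",
}


def tokenize(text):
    """Return a list of (kind, value) pairs found in a notation string."""
    tokens = []
    for match in TOKEN.finditer(text):
        kind = match.lastgroup          # name of the alternative that matched
        value = match.group(kind)
        if kind == "rel":
            tokens.append(("relation", RELATIONS[value]))
        else:
            tokens.append((kind, value.strip()))
    return tokens
```

Applied to “[因] > |𡆬| > |囙|”, this yields an abstract shape, a Y-variant relation, an actual shape, another Y-variant relation, and a final actual shape, in that order.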
Thank you! I will change to 𧓋 (DCCV: C12501), in which DCCV represents Dictionary of Chinese Character Variants (異體字字典).
I'm still thinking about how to design a proper notation system that simplifies the presentation and clarifies the concepts in ORT reviewing, so the notation system is still unstable, sorry.
With the 見 initial and 江-唐-陽 finals, these syllables become /kɔ̃/ in Taizhou (Xianju point) and /kɔ̃/ in Lishui (Yunhe point). Since we lack direct language material, the linguistic association cannot be substantiated.