GDM-00313
Date | Description |
---|---|
IRG #61 2023-10-16 (Mon) 12:04 pm -0400 Recorded by CHEN Zhuang | To be discussed offline by IRG #61. |
IRG #61 2023-10-17 (Tue) 8:17 am -0400 Recorded by CHEN Zhuang | Back to M-set as an exceptional case: R=40.0 (宀), SC=9, FS=4. |
IRG #58 2022-03-15 (Tue) 8:46 am +0800 Recorded by CHEN Zhuang | Pending for more evidence |
Version | Description |
---|---|
3.0 | For 02899, add Discussion Record "Pending for more evidence, IRG 58." |
3.0 | For 02899, change Status to Postponed |
6.0 | For 02899, change Status to OK |
6.0 | For 02899, add Discussion Record "Back to M-set (exceptional case), R=40.0 (宀), SC=9, FS=4, IRG 61." |
6.0 | For 02899, change Radical to 40.0 (宀) |
6.0 | For 02899, change Stroke Count to 9 |
6.0 | For 02899, change First Stroke to 4 |
Source Reference | Glyph |
---|---|
GDM-00313 | ![]() |
group | China (GDM - Place Name Characters) |
a) Source reference | GDM-00313 |
b) PUA Code of TTF | E2B7 |
c) KangXi Radical Code(Primary) | 116.0 |
d) Stroke Count(Primary) | 7 |
e) First Stroke(Primary) | 2 |
g) Total Stroke Count | 12 |
i) IDS (Ideographic Description Sequence) | ⿱宀总 |
j) Similar/ Variants | N/A |
k) Ref. to Evidence doc | 宋元以来俗字谱 张克.公社的(宀总)口[J].山花,1961(11):3-5. 八辅 東坡集 檮杌閑評 花嶼詞 樂府詩鈔 綺樓重夢四十八回 愛日吟廬書畫錄 |
Review Comments
Single example of an idiosyncratic way of writing 窓, not sufficient evidence for encoding. Unify with 𥦗 by UCV #22.
Based on the new evidence, I still believe that unification with U+25997 𥦗 is appropriate.
Agree on unification to 𥦗 (U+25997).
Support the unification to 𥦗 (U+25997).
It is common to write 穴 (⿱宀八/⿱宀儿) as ⿱宀丷 in handwriting.
徐铁生: 《中华姓氏源流大辞典》,中华书局出版发行,北京市白帆印务有限公司,2014年1月北京第1版,2014年1月北京第1次印刷,ISBN 978-7-101-09024-6, page 1360
SJ/T 11239-2001 信息技术 信息交换用汉字编码字符集 第八辅助集
Considering that the unification would entail a change of radical, I think it is better to encode the character separately.
Still unify to 𥦗 (U+25997) by UCV #22, because no actual use of the abstract shape ⿱[宀][总] has been found.
If the actual shape ⿱|宀||总| is regarded as more important, it is fine for the G-source to update the reference glyph of 𥦗 (U+25997).
What's more, I'd like to say that we cannot live in theory; the ISO/IEC 10646 standard should be made for practical use. The fact that the character is usually treated as a different character from 𥦗 (U+25997) reflects a wide demand for using both characters in the same context. Unifying common variants of this kind will lead people to use the PUA to represent them, which can easily be avoided if we encode them separately.
1) the radical of ⿱宀总 is 宀, while the radical of 𥦗 (U+25997) is 穴;
2) both the variant ⿱宀总 and 𥦗 (U+25997) are used in many books;
3) ⿱宀总 is used as a surname, which is important to people;
4) ⿱宀总 is used in Chinese place names;
5) ⿱宀总 was removed from Extension I by China because it is in IRG WS2021;
6) people will normally treat ⿱宀总 and 𥦗 (U+25997) as different characters.
It is obviously better to encode ⿱宀总 separately rather than unify it based on academic theory or other rigid “rules”.
▲ 《朱子語類》, 文淵閣四庫全書本, 卷五十三, folio 12A
▲ 《叶韻彙輯》, 文淵閣四庫全書本, 卷十二, folio 12A
▲ 《叶韻彙輯》, 文淵閣四庫全書本, 卷二, folio 1A
空 is also written as ⿳宀丷工, yet we still identify its radical as 穴, not 宀; by the same reasoning, the radical of this character is still 穴, not 宀.
As the experts’ comments indicate, both forms are needed in China, so perhaps the best approach for end users is to use an IVS rather than the PUA.
▲ 北京书同文数字化技术有限公司, 古籍汉字字频统计. 商务印书馆, 2008: page 120
The G4K database contains the character, but it was not submitted to Ext. B. Consider that filtering had already been done when the G4K repertoire was generated.
宋元以来俗字谱
檮杌閑評
花嶼詞
樂府詩鈔
To clarify, I am suggesting encoding the character separately not just because some information processing systems don't support the IVD. I am saying that, for this particular case, considering the facts I brought out in comment #14280, it is much better to encode ⿱宀总 separately. For my part, I would definitely decompose the character as ⿱宀总 at first sight. What's more, the analogy with ⿳宀丷工 doesn't hold, because ⿱丷工 is not a common character whereas ⿱丷𢗀 is.
Moreover, this character occurs as many as 900 times in my database, so it is stable enough to be encoded.
Experts who disagree with encoding ⿱宀总 separately say:
1) Based on UCV #22, 八 and 丷 can be unified;
2) Academically, ⿱宀总 should be decomposed as ⿳宀丷𢗀, so ⿱宀丷 should be unified with 穴, and the radical of ⿱宀总 should be 穴 (comment #14286).
Experts who agree with encoding ⿱宀总 separately say:
1) Technically, ⿱宀总 and 𥦗 (U+25997) can be unified with each other;
2) Practically, most people would decompose ⿱宀总 as ⿱宀总, because ⿱丷𢗀 is a very common character. Meanwhile, most people would decompose 𥦗 (U+25997) as ⿳穴口心 (i.e. ⿱穴𢗀). Since 宀 is not unifiable with 穴 and 总 is not unifiable with 𢗀, ⿱宀总 and 𥦗 (U+25997) should not be unified.
3) We think it is better to decompose the character in a practical way rather than academically; thus they should not be unified.
4) Further considerations:
a) Both the variant ⿱宀总 and 𥦗 (U+25997) are used in many books;
b) ⿱宀总 is used as a surname, which is important to people;
c) ⿱宀总 is used in Chinese place names;
d) ⿱宀总 was removed from Extension I by China because it is in IRG WS2021;
It is obviously better to encode ⿱宀总 separately rather than unify it based on academic theory or other rigid “rules”.
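The two readings of the structure contrasted above can be made concrete with a small sketch. The component tables and the single UCV entry below are hand-written from the breakdowns quoted in this thread; they are illustrative assumptions, not data taken from the IRG's UCV list or from any IDS database. Flattening to a list of leaf components also ignores the difference between ⿱ and ⿳ layouts, so this is only a rough model of the argument.

```python
# "Academic" view: every component is broken down as far as the thread describes.
ACADEMIC = {
    "𥦗": ["穴", "𢗀"],  # ⿱穴𢗀
    "穴": ["宀", "八"],   # ⿱宀八
    "总": ["丷", "𢗀"],   # ⿱丷𢗀
}

# "Practical" view: 总 and 穴 are treated as atomic, familiar components.
PRACTICAL = {
    "𥦗": ["穴", "𢗀"],
}

# Assumed unifiable component variation, after the UCV #22 claim in the thread.
UCV = {"丷": "八"}


def leaves(component, table):
    """Recursively expand a component into its leaf components."""
    parts = table.get(component)
    if parts is None:
        return [component]
    result = []
    for part in parts:
        result.extend(leaves(part, table))
    return result


def unifiable(ids_a, ids_b, table, ucv):
    """Compare two component lists after expansion and UCV normalisation."""
    def norm(components):
        flat = [leaf for c in components for leaf in leaves(c, table)]
        return [ucv.get(leaf, leaf) for leaf in flat]
    return norm(ids_a) == norm(ids_b)


NEW_FORM = ["宀", "总"]  # the proposed character, ⿱宀总
ENCODED = ["𥦗"]         # U+25997

academic_view = unifiable(NEW_FORM, ENCODED, ACADEMIC, UCV)
practical_view = unifiable(NEW_FORM, ENCODED, PRACTICAL, UCV)
print(academic_view, practical_view)
```

Under the academic table both forms reduce to the same leaves, while under the practical table they do not. That is the disagreement in this thread: the outcome depends on which decomposition is taken as authoritative, not on the comparison step.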
Last but not least, I noticed that Eiso's comment #14291 suggests that the ORT manager should remove the following source references:
宋元以来俗字谱
檮杌閑評
花嶼詞
樂府詩鈔
I'd like to say that we found more than 120 pieces of evidence in total for ⿱宀总, which is enough to prove point 4.a in this comment. I will post some of them in my next comment.
Evidence 1: 朱錦琮 撰:治經堂詩集,清道光四年(1824)刻本,卷四,page 20
Evidence 2: 田汝耔 撰:漢隸分韻,清乾隆三十八年(1773)九沙萬氏刻本,序言,page 1
Evidence 3: 錢大昭 撰:廣雅疏義,日本昭和十五年(1940)靜嘉堂影印清抄本,卷十三,page 11
Evidence 4: 李調元 輯:蜀雅,清乾隆中綿州李氏萬卷樓刻、嘉慶十四年(1809)李鼎元重校印本,卷二十,page 16
Evidence 5: 阮文藻 撰:聽松濤館詩鈔,清道光十一年(1831)刻本,卷六,page 20
Evidence 6: 御定佩文韻府,清乾隆間寫摛藻堂四庫全書薈要本,卷八十一,page 42
Evidence 7: 王桂 撰:葵書,清光緒六年(1880)刻本,卷上,page 52
Evidence 8: 新刻今古傳奇,清嘉慶二十三年(1818)刻本,卷十二,page 6
Evidence 9: 彭元端 撰:五代史記注,清道光八年(1828)刻本,卷五十四,page 4
Evidence 10: 劉一明 撰:西遊原旨,清嘉慶二十四年(1819)湖南常德同善分社刻本,卷五,page 5
Evidence 11: 沈濤 撰:常山貞石記,清光緒二十年(1894)靈溪精舍刻本,卷十三,page 26
Evidence 12: 張居正 撰:明張文忠公詩文集,清宣統三年(1911)醉古堂石印本,卷十一,page 15
Evidence 13: 王三接 撰,(明)王用言 輯:王槐溪先生文集五卷,明萬曆三十六年(1608)王學曾刻本,卷一,page 14
Evidence 14: 吴嵩梁 撰:香蘇山館全集,清道光二十三年(1843)刻本,卷四,page 12
Evidence 15: 祁东县志,中国文史出版社,1992年10月,page 81
Also seen in the PUA font of Zhonghua Book Company (中华书局):
2. Why is it classified under 穴部 (the 穴 radical), when it is composed of 宀 (semantic) and 忩 (phonetic), which is essentially equivalent to 宀 (semantic) and 总 (phonetic)?
As L F Cheng says (#14310), maybe the dictionary form is a normalization of ⿱宀总. If 𥦗 is not a common form, then China can consider changing the glyph for U+25997 to ⿱宀总.
The G-source reference for U+25997 𥦗 is GHZ-42731.08, which means it is cited from 《汉语大字典》 (see below).
In fact, 𥦗 and ⿱宀总 are both needed in China, so I think normalization would affect a great many databases. Unifying does not mean we throw the character or form away; Dr. Lu has explained this issue again and again at IRG meetings. What we need to consider is the following.
1) Is there at least one piece of evidence showing that 𥦗 and ⿱宀总 are not variants of 窗? If yes, the character must be encoded separately, or we need to consider another unification; if not, ⿱宀总 should be unified with 𥦗, and we should consider how to update the UCV.
2) The standard must move ahead of reality. Macao SAR has registered several IVSes, but not all of them are yet handled well by operating systems, even macOS, iOS, iPadOS and so on. We cannot say that this is not meaningful and ask Macao SAR not to register IVSes in future.
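The IVS option raised in this comment can be illustrated with a short sketch. An ideographic variation sequence is a base ideograph followed by one of the variation selectors U+E0100..U+E01EF (VS17 to VS256). No IVS for U+25997 is cited in this discussion, so the selector index used below is purely hypothetical; a real sequence would first have to be registered in the Ideographic Variation Database.

```python
BASE = "\U00025997"  # 𥦗, the encoded character discussed in this thread

IVS_FIRST = 0xE0100  # VS17
IVS_LAST = 0xE01EF   # VS256


def make_ivs(base: str, index: int) -> str:
    """Append the ideographic variation selector VS(17 + index) to a base character."""
    if len(base) != 1:
        raise ValueError("base must be a single code point")
    if not 0 <= index <= IVS_LAST - IVS_FIRST:
        raise ValueError("index must be in 0..239")
    return base + chr(IVS_FIRST + index)


def is_ivs(sequence: str) -> bool:
    """True if the string is one base code point plus one ideographic variation selector."""
    return len(sequence) == 2 and IVS_FIRST <= ord(sequence[1]) <= IVS_LAST


# Hypothetical: selector index 0 (VS17) standing in for the ⿱宀总 glyph.
candidate = make_ivs(BASE, 0)
print([f"U+{ord(ch):04X}" for ch in candidate])
```

Software that does not support variation selectors would normally fall back to the base glyph, so the text stays legible but loses the ⿱宀总 shape, which is the practical concern voiced by the experts who favour separate encoding.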
The most useful evidence for encoding this character separately is the surname usage. 《中华姓氏源流大辞典》 indicates that the surname is cited from 《祁东县志》, gives its reading, and states that it is a variant of 窗, but the evidence cited from 《祁东县志》 shows only the character. If any evidence in 《祁东县志》 shows that this character is not related to 窗, this character must be disunified.
There are more than 10,000 entries related to the current U+25997 𥦗 in my databases. The form of U+25997 𥦗 is also stable, and a great many databases have used this form. It is better to keep the code chart stable and not normalize the G glyph.
The two PRC simplified forms that contain [总] are [总] from [總] and [聪] from [聰]. Other over-simplified forms, [𩨂] (GFZ) from [驄], [] (GZH) from [璁], and [𱶚] (Leizhou) from [𥡥], establish a strong correlation between [总] and [悤]. So if a true ⿱[宀][总] exists, it may connect to ⿱[宀][悤], that is, WS2021-01052.
[窗] is related to [窻] [牕] [窓]; its Middle Chinese (M.C.) status is 初 initial, 江 final, 平 tone, in phonetic symbol {*TSOŊ (悤)}. [總] has 精 initial, 東 final, 上 tone; [聰], which is related to [聦], has 清 initial, 東 final, 平 tone; all are in phonetic symbol {*TSOŊ (悤)}. So a potentially existing ⿱[宀][总] may also be in {*TSOŊ (悤)}.
Radical 穴 comes from cognition. Seal [窗] should be analysed as ⿱[穴][囱], and seal [窻] should be analysed as ⿱[穴][悤 (phonetic)].
Then [悤] becomes |怱| |忽| |𱝼| |𢗀|. The |八| from |穴| is regarded as forming a unit with |𱝼| |𢗀|, and thus constitutes |忩| |总|.
Again, to clarify, I am suggesting encoding the character separately NOT JUST AND NOT MAINLY because some information processing systems don't support the IVD. Nor is this about whether unifying a character means throwing it away. So please do not bring unrelated matters into this particular case.
1. This character has developed a distinct identity from 𥦗 in some regions (namely China).
2. This character has a different underlying structure from 𥦗, signified by the orientation of the two dots in the middle.
Criterion #1 is a matter of subjective judgement, so we will defer it to experts from China, but it is perhaps applicable only to post-簡化字 usage, where 总 is established as a canonical shape.
As for #2, we have seen an analogous case in this WS: , where it can be justified if we can prove that this character is the outcome of a development path separate from 𥦗. In favor of that assertion, as Kushim pointed out (comment #14323), we also have in this WS that looks like its traditional variant without the 穴 radical. On the other hand, we also note that, in Japanese 新字体, "window" is 窓 while 總 becomes 総, which suggests that the difference might be irrelevant. Do we have any evidence showing that someone clearly recognizes the character as a composition of 宀 and 总?
Given the sheer amount of discussion, it would be prudent for the IRG to err on the side of disunification in this particular case. What I wrote in the previous paragraph also cannot be ignored. Let us accept this ideograph and move on to more important issues in this working set.
https://www.unicode.org/wg2/docs/n5234-WG2-M70-Recommendations.pdf
The character ⿱宀总 was in the first and second drafts of GB 18030-2022 Amendment 1 and was still included in the draft of Extension I after China removed 55 characters from the set. It was still in Extension I at the last minute. However, experts from China found that ⿱宀总 was in IRG WS2021. Seeing that the character had not been proposed for unification or withdrawal, and in order to reduce the possibility of encoding duplicates, China decided to remove this character from Extension I. It would be very disappointing and hard to accept if the character were unified now. In fact, this character was formally approved by WG2.
What's more, it is reasonable to think that it is better to encode the character separately, as I explained in comment #14307.
The character is already in the PUA fonts of The Natural Resources Surveying and Monitoring Institute of Jiangxi Province (江西省自然资源测绘与监测院), The Ministry of Public Security of the People’s Republic of China (中华人民共和国公安部) and Zhonghua Book Company (中华书局). Unifying this character will simply lead to abuse of the PUA, because people will normally recognize the two as different characters with the same meaning.
Last but not least, please consider this: if ⿱宀总 is unified with 𥦗 (U+25997), then the radical of ⿱宀总 would have to be 穴, which is also hard to accept.
I sincerely suggest encoding this character separately, based on what I wrote in this comment.
招子庸: 《粤謳》, 道光戊子年本, folio 1B