Henry CHAN Comments | WS2021v6.0

IRG Working Set 2021v6.0

Source: Henry CHAN
Date: Generated on 2024-04-25

Unification

Sn	Image/Source	Comment Type	Description
03560	03560 虫 142.8.3 GKJ-00436 TS 14 · IDS ⿰虫肴	Unification	Unifiable to 𧍂 (U+27342)? U+27342 comes from Kangxi, with the given text: 【備考】【申集】【虫字部】【五音篇海】音肴。又音豪。 which indicates they are the same character.
		Unification	To add to Kushim's comment in #15443, here are the variants from MOE Dictionary: I suggest that China update the glyph of 𧍂 (U+27342) to ⿰虫肴 and that GKJ-00436 be unified to 𧍂 (U+27342).
		Unification	Even if the various forms of 𧍂 (U+27342) are stably used in dictionaries, the fact that two forms were historically considered more or less canonical does not mean we need to disunify them. First and foremost, these are different canonical shapes in different dictionaries, and there is no contrast in shape within the same dictionary. Second, there is no contrast in meaning within the same running text. They are variants without a doubt, and the abstract shape is the same, which is more than sufficient for unification. If some digitization projects wish to display one canonical form over the other, that is solely within the realm of Ideographic Variation Sequences. ISO 10646 is a character standard, not a glyph standard.
		Unification	In previous IRG meetings, we already have historical precedent for unifying to an existing Extension B character and then updating the Extension B character to take the canonical shape. Some of these characters even involved GHZ single-source characters, as is the case here. IRG should be consistent in its handling of these cases.
		Unification	We must also take into account that different dictionaries in different eras have different standards for what is considered "canonical". We cannot simply encode another variant as a different character because some dictionaries at some historical period considered another form canonical. Otherwise, all the characters containing "歩" instead of "步" would need to be disunified, as "歩" was considered the canonical form in the Tang dynasty while "步" is considered the canonical form in China today.
		Unification	Another case in WS2021 where we have unified the canonical shape to the non-canonical shape found in some authoritative dictionaries:
		Unification	Another case in WS2017, which involves a taboo character, however it was not found as a head character.
04730	04730 鳥 196.9.1 GKJ-00739 TS 20 · IDS ⿰鳥春	Unification	Unify to 𪄻. Level 2 UCV for 舂 and 春.
04730	04730 鳥 196.9.1 GKJ-00739 TS 20 · IDS ⿰鳥春	Unification	Here are some examples from the MOE Dictionary: The form using 春 instead of 舂 is particularly common in 《集韻》.
01343	01343 心 61.8.5 GKJ-00748 TS 11 · IDS ⿰忄甾	Unification	Unify to 惱. In the included evidence, this is a variant of 惱 without a doubt. Potentially a new level 2 UCV for 甾 & 𡿺.
01343	01343 心 61.8.5 GKJ-00748 TS 11 · IDS ⿰忄甾	Unification	Unify to 惱; and add level 2 UCV 甾 & 𡿺. We have one existing disunification case in Extension B where U+254F2 is disunified from U+78AF. They are both variants of 瑙 U+7459. We have another disunification case in Extension A where U+4409 is disunified from U+8166 because they are non-cognate. Note, ROK has a normalization rule #190-1 in IRGN2573 which covers this exact case: Therefore the variation should be systematic and pretty common in handwriting.
01789	01789 水 85.9.1 KC-05249 TS 12 · IDS ⿰氵柝	Unification	Unifiable with 淅?
03165	03165 羊 123.11.1 SAT-06015 TS 17 · IDS ⿱殸羊	Unification	𦎼 (U+263BC) / 𦎯 (U+263AF). > The two comments together list 22 disunification examples, comment #14276 8 examples and #14290 14 examples. Also that GHZR42524.09 was withdrawn. A character may be withdrawn because the submitter does not agree to unification. The quote that says unifiable was not made by the submitter. The case for removing 殸 is very strong. Examples are from Extension B, which are not considered valid prior examples of disunification by IRG. > The inclusion of ⿱殸⬚ in UCV 312d seems unreasonable as it is not an example of "differences in relative length of strokes" (j-2). UCV 312d should only cover ⿱𣪊⬚ and ⿹𣪊⬚, and ⿱殸⬚ should be removed from the rule. UCV #312d should be moved away from the section J-2 and moved into section j-3 Unification of similar shapes. The fact that 𣪊 is often miswritten as 殸 is not disputed. Sufficient evidence also exists for this particular character. In China's case they may prefer to withdraw a form which is malformed, but SAT does not make an "is it an error or not" determination for a character. Suggesting that a character be encoded via IVS implies the characters are unifiable. I suggest unifying and encoding as an IVS, and keeping the UCV rule as-is.
02629	02629 疒 104.10.5 UK-20388 TS 15 · IDS ⿸疒挐	Unification	Unify to 𤸻 (U+24E3B) or potentially withdraw. The given evidence from UK and the evidence from Tao Yang suggest that the phonetic component should be 拏, not 挐. Even though in some sources 挐 is considered a variant of 拏, they are considered separate characters by various versions of Shuowen.
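
The comments on 03560 and 03165 point to Ideographic Variation Sequences as the place for glyph-level distinctions. As a rough illustration of what that means at the code-point level, here is a minimal Python sketch. It assumes only the general IVS structure (a base ideograph followed by one variation selector in the range U+E0100..U+E01EF); the helper names `make_ivs` and `strip_ivs` and the selector numbers are hypothetical, and no registered sequence for 𧍂 is implied.

```python
# Sketch: an IVS is a base ideograph followed by one variation selector
# from the supplementary range U+E0100..U+E01EF (VS17..VS256).
VS17 = 0xE0100
VS256 = 0xE01EF


def make_ivs(base: str, selector_index: int) -> str:
    """Return base + the selector at VS17 + selector_index (index starts at 0)."""
    if len(base) != 1:
        raise ValueError("base must be a single code point")
    cp = VS17 + selector_index
    if not VS17 <= cp <= VS256:
        raise ValueError("selector index out of range")
    return base + chr(cp)


def strip_ivs(text: str) -> str:
    """Drop ideographic variation selectors, leaving only encoded characters."""
    return "".join(ch for ch in text if not VS17 <= ord(ch) <= VS256)


base = "\U00027342"  # 𧍂, the Extension B character discussed above

# HYPOTHETICAL selectors, chosen only for illustration.
form_a = make_ivs(base, 0)
form_b = make_ivs(base, 1)

# The two sequences differ as strings but share one encoded character.
assert form_a != form_b
assert strip_ivs(form_a) == strip_ivs(form_b) == base
```

With this model, both glyph forms resolve to the same character once selectors are ignored, which is the behaviour the unification argument relies on.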

Attributes

Sn	Image/Source	Comment Type	Description
02305	02305 牛 93.1.5 UK-20572 TS 4 · IDS ⿹⺄⿻𠃊丄	Radical	Change Radical to 5.0 (乙), SC=3, FS=5 after the glyph change.
00027	00027 丨 2.14.4 VN-F1C8D TS 15 · IDS ⿺𱷥中	Radical	Radical 212.2, SC=4, FS=2
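
The Image/Source cells in these tables follow a regular pattern: serial number, one or more radical entries of the form `radical number.SC.FS`, the source reference, `TS` with the total stroke count, and the IDS. The following Python sketch parses such a cell so that attribute changes like the ones above can be checked mechanically. The field layout is inferred from the rows on this page, reading SC as the residual stroke count and FS as the first-stroke code; the names `parse_cell`, `Entry` and `RadicalIndex` are hypothetical and not part of any IRG tool.

```python
import re
from dataclasses import dataclass

# One or more "<radical char> <number>.<SC>.<FS>" groups may appear per cell.
CELL_RE = re.compile(
    r"^(?P<sn>\d{5})\s+"
    r"(?P<indexes>(?:\S\s+\d+\.\d+\.\d+\s+)+)"
    r"(?P<source>\S+)\s+TS\s+(?P<ts>\d+)\s+·\s+IDS\s+(?P<ids>\S+)$"
)
INDEX_RE = re.compile(r"(\S)\s+(\d+)\.(\d+)\.(\d+)")


@dataclass
class RadicalIndex:
    radical: str
    number: int
    sc: int  # stroke count excluding the radical (assumed meaning of SC)
    fs: int  # first-stroke code (assumed meaning of FS)


@dataclass
class Entry:
    sn: str
    indexes: list
    source: str
    total_strokes: int
    ids: str


def parse_cell(cell: str) -> Entry:
    """Parse one Image/Source cell; raise ValueError if the layout differs."""
    m = CELL_RE.match(cell.strip())
    if m is None:
        raise ValueError(f"unrecognised cell layout: {cell!r}")
    indexes = [
        RadicalIndex(r, int(n), int(sc), int(fs))
        for r, n, sc, fs in INDEX_RE.findall(m["indexes"])
    ]
    return Entry(m["sn"], indexes, m["source"], int(m["ts"]), m["ids"])


entry = parse_cell("02305 牛 93.1.5 UK-20572 TS 4 · IDS ⿹⺄⿻𠃊丄")
print(entry.sn, entry.indexes[0].number, entry.source, entry.total_strokes)
```

Cells that list two radicals, such as the one for 02664, yield two `RadicalIndex` items in `indexes`.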

Evidence

Sn	Image/Source	Comment Type	Description
01474	01474 手 64.9.1 GDM-00269 TS 12 · IDS ⿰扌厘	Evidence	Based on the pronunciation, is this potentially a misprint? I suggest postponing this character.
04695	04695 鳥 196.5.1 GKJ-00329 TS 16 · IDS ⿰世鳥	Unclear evidence	Per #11949, this appears to be a misprint of 䳄. Is there more evidence for ⿰世鳥?
04775	04775 鳥 196.13.1 GKJ-00335 TS 24 · IDS ⿱幹鳥	Evidence	This looks like a valid phonetic component replacement so I suggest keeping the character.
04712	04712 鳥 196.7.2 GKJ-00347 TS 18 · IDS ⿰呈鳥	Evidence	Consider changing the source reference to the GZ one because the GZ one can be looked up directly.
02664	02664 鳥 196.5.5 皮 107.11.3 GKJ-00348 TS 16 · IDS ⿰皮鳥	Evidence	Consider changing the source reference to the GZ one because the GZ one can be looked up directly.
02954	02954 竹 118.11.1 UK-20060 TS 17 · IDS ⿱竹𨊸	Evidence	This character was marked as "No Change" without any record of any reply from UK.
04215	04215 長 168.9.3 VN-F1BD7 TS 16 · IDS ⿰镸重	Evidence	The evidence from Andrew and Eiso seems to show a non-cognate but identically shaped character; maybe a horizontal extension can be done.

Glyph Design & Normalization

Sn	Image/Source	Comment Type	Description
01942	01942 毛 82.5.4 UK-20228 TS 9 · IDS ⿺毛主	Glyph design	Consider changing the glyph to ⿺毛玍 based on comment #8367
01713	01713 日 72.13.4 VN-F191D TS 17 · IDS ⿰日滛	Normalization	Potentially normalize the shape on the right to 淫 because, as shown in the dictionary entries, 淫 is also a common form when used as a component.

Editorial

Sn	Image/Source	Comment Type	Description
01146	01146 山 46.8.4 GDM-00251 TS 11 · IDS ⿰山底	Editorial issue	Fixed the status of this character on 2024-03-18.
03607	03607 虫 142.10.1 SAT-07153 TS 16 · IDS ⿱⿰臣巿虫	Editorial issue	I checked and have updated the total stroke count to 16 (previously I marked it as 17). I had miscounted the number of strokes for 臣 as 7; the correct value for 臣 is 6.
02888	02888 禾 115.12.5 TE-7465 TS 17 · IDS ⿰禾巽	Editorial issue	It is a known issue that when the source reference is changed the ORT stops being able to track the glyph of the previous source reference. Sorry for the inconvenience.
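
The stroke-count correction for 03607 is simple arithmetic over the IDS components, and can be sketched in Python as follows. The per-component stroke values are hand-entered for this one example (臣 = 6 as stated in the comment; 巿 = 4 and 虫 = 6 are assumed here), and the helper name `total_strokes` is hypothetical.

```python
# Ideographic description characters (U+2FF0..U+2FFB) carry no strokes.
def is_idc(ch: str) -> bool:
    return 0x2FF0 <= ord(ch) <= 0x2FFB


def total_strokes(ids: str, strokes: dict) -> int:
    """Sum the stroke counts of the leaf components of an IDS."""
    return sum(strokes[ch] for ch in ids if not is_idc(ch))


IDS_03607 = "⿱⿰臣巿虫"

# Hand-entered values for this example only.
STROKES = {"臣": 6, "巿": 4, "虫": 6}
corrected = total_strokes(IDS_03607, STROKES)

# Counting 臣 as 7 strokes reproduces the earlier, incorrect total.
miscounted = total_strokes(IDS_03607, {**STROKES, "臣": 7})

print(corrected, miscounted)  # 16 17
```

The corrected total also agrees with the row's index 142.10.1: 10 residual strokes (臣 6 + 巿 4) plus 6 for the radical 虫.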