I suggest a new UCV of 榦 & 𠏉 only, with 幹 excluded. In that case it won't be unified with 澣 (U+6FA3), but it can be unified with ⿰氵榦 if we see it in the future.
I think we should remove NUCV #317. That rule was put there at a time when we did not have the Ideographic Variation Database. The form 靣 is sufficiently rare nowadays but was a common variant form of 面, which makes it a good candidate to be encoded as an ideographic variation instead of as a separate character.
This seems to be an erroneous form of U+8564 蕤. Suggest providing more evidence that is not computer-typeset to confirm whether this is a stable error; otherwise, withdraw.
The new evidence provided by Eiso shows that the sound is 穴, which indicates that the inner component should be 戉 instead of 戊. Is there any other evidence showing that 戊 is the expected form?
I believe the shape for 7064 in this version is a misprint. Based on the number of strokes in the right hand side index, 7064 should be 臬 (10 strokes) instead of ⿱自夲 (11 strokes). The surrounding characters 7063 and 7067 are also 真 (10 strokes) and 欮 (10 strokes) respectively.
(The characters are ordered by stroke count in ascending order. The only exception is the possibility of an additional character added in the originally empty space at the end of the list. Cf: 7078 闆 (9 strokes) for 門 and 7021 錳 (8 strokes) for 金.)
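The stroke-count argument above can be sketched as a small ordering check. This is only an illustration: the function name and the tuple layout are hypothetical, the stroke counts are the ones quoted in this comment, and entries 7065 and 7066 are left out because they are not quoted here.

```python
def find_order_breaks(entries):
    """Return the index numbers whose stroke count exceeds that of the next entry.

    `entries` is a list of (index_number, character, stroke_count) tuples in
    the order they appear in the index (hypothetical layout for this sketch).
    """
    breaks = []
    for (num, _, strokes), (_, _, next_strokes) in zip(entries, entries[1:]):
        if strokes > next_strokes:
            breaks.append(num)
    return breaks


# Stroke counts as quoted in the comment; 7065 and 7066 are omitted.
as_printed = [(7063, "真", 10), (7064, "⿱自夲", 11), (7067, "欮", 10)]
as_proposed = [(7063, "真", 10), (7064, "臬", 10), (7067, "欮", 10)]

print(find_order_breaks(as_printed))   # [7064]
print(find_order_breaks(as_proposed))  # []
```

The check deliberately ignores the exception noted above: a character appended later in the empty space at the end of a list would make the entry before it look like a break, so the final position should be judged by hand.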
⿰截鳥 is not an error form, but an alternate form closer to modern conventions. In modern conventions 截 is used instead of 𢧵.
Theoretically we could add this as a UCV, as this is a systematic transliteration variant, but most variants are already coded. Alternatively, we should consider adding this as an NUCV and encoding the remaining variants.
As mentioned in the meeting by Wang Yifan, the same character was submitted in WS2017 as 03913 and was postponed because the text suggests it is a miswritten ligature.
In IRG #57, there was discussion on whether this should be unified with U+22994, as WS2021-01417 (⿱木戈) and WS2021-01418 (⿱水戈) constitute a continuum of variants to U+22994.
To recap the discussion: it would be easier to judge whether to unify via a new UCV rule with specific constraints or to encode it separately if more data could be provided on how systematic the "stable misprint" or "stable variation" between the 卩 and 阝 components is. For example, if it is sufficiently rare, IRG may choose to encode this "error" as a separate character. But if such variation is quite common in SAT's corpus, it would be better to unify them all systematically.
To recap the discussion, another variant (WS2017-01228) also exists. It will be easier for IRG to decide which characters to unify and which to encode once we can see the other variants of this character.
The evidence provided by Tao Yang does not support the pronunciation provided by TCA. Are there any other sources besides 汉字海? 汉字海 seems to be unreliable at times.
The traditional character 𬧄 has been encoded in Extension C and seems to be also used in Min Nan. So I think this source can be accepted as sufficient proof of the existence of the simplified form.
IRG Working Set 2021v1.0
Source: Henry CHAN
Date: Generated on 2024-10-08
Unification
Suggest to unify to 𣖑 (U+23591), and update the glyph and source reference of U+23591 to GDM-00241.
Based on the provided evidence, the glyph shape should be recognized as the component in the middle one instead of the current glyph shape in the provided font. It would subsequently be unifiable with 鮆 (U+9B86).
Possibly unifiable to 𩼥 (U+29F25).
This character seems to be a corrupted form of 𩼥 (U+29F25) in which the top grass radical has been moved to the left and joined with 大 to form a 关 structure, and an additional dot has been added to the bottom part by analogy, turning it into another 关.
The main sources quoted for these characters are known to contain vulgar forms of characters. It may be more appropriate to encode them as unified variants of existing characters instead of as new characters.
(See: http://coe21.zinbun.kyoto-u.ac.jp/djvuchar?query=號)
There should be a new UCV for ⿱口了 and 号 on the left.
Based on the same source, ⿱口丁 can also be added:
Consider Unification to 𪆾 U+2A1BE and updating the form at U+2A1BE.
GKJ-00360 seems to be a more normalized transcription than the current form at U+2A1BE.
Suggest unification to 鴛 and encoding via IVS.
Seems to be a commonly miswritten form of 鴛.
怨 ~ ⿱死心:
Source from MOE Dictionary:
Source from 京都大学《拓本文字データベース》:
苑 ~ ⿱艹⿱一夗
Source from MOE Dictionary:
Source from 京都大学《拓本文字データベース》:
Unify to 𪕋 U+2A54B.
Add new UCV 夘 and 卯
The glyph shape provided in the evidence matches 𭺷 (U+2DEB7); it should be unified to 𭺷 (U+2DEB7) upon correction.
If ROK can confirm the shape given on www.koreanhistory.or.kr is correct, then the shape in ISO10646 should be updated.
Based on context, the quoted character is semantically identical to 𠬸, as it refers to an ancient form of the right-hand part of 歿.
Suggest to unify to 㣇 (U+38C7).
Unify to 𤲑 (U+24C91).
Suggest to add a new UCV rule ⿱巛𠙻 to 甾.
(To copy some examples from MOE Dictionary)
Suggest to unify these shapes:
Unify to 延 (U+5EF6) and add a new UCV for the other shape to 廴.
Suggest unification to 䖤 and encoding via IVS.
夗 at the top is often assimilated (類化) to ⿱一夗 or 死, thereby losing its phonetic indication. Some examples of such variations:
怨 ~ ⿱死心:
Source from MOE Dictionary:
Source from 京都大学《拓本文字データベース》:
苑 ~ ⿱艹⿱一夗
Source from MOE Dictionary:
Source from 京都大学《拓本文字データベース》:
See also:
㳫 is a corrupted form of 沓 according to the Kangxi Dictionary.
They are non-cognate. 呣 is used as a modal particle while UTC-03197 is used for negation.
Unify to 𫝖
JH-JTC064 is a variant of 麁.
Also refer to the following Google search result, which shows the place name in the evidence to be 麁利町:
Existing coded variant forms of 所 include 㪽 (U+3ABD), 𠩄 (U+20A44), 𫝂 (U+2B742), 𫠦 (U+2B826), 𬻐 (U+2CED0).
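As a quick cross-check of a list like the one above, the character/code point pairs can be compared mechanically. This is a minimal sketch: the helper names are hypothetical, and the pairs are copied from this comment as written rather than independently verified.

```python
def code_point_label(ch):
    """Format a single character as a 'U+XXXX' label."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"U+{ord(ch):04X}"


def mismatches(pairs):
    """Return (character, claimed, actual) for every pair whose label disagrees."""
    return [
        (ch, claimed, code_point_label(ch))
        for ch, claimed in pairs
        if code_point_label(ch) != claimed
    ]


# Pairs exactly as listed in the comment above.
listed = [
    ("㪽", "U+3ABD"),
    ("𠩄", "U+20A44"),
    ("𫝂", "U+2B742"),
    ("𫠦", "U+2B826"),
    ("𬻐", "U+2CED0"),
]

# An empty list means every character matches its stated code point.
print(mismatches(listed))
```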
Attributes
Evidence
Glyph Design & Normalization
Other
If the text is accurate, either ⿰鱼𬶨 or ⿰鱼暨 seems acceptable.
U+22994:
WS2021-01417 (A):
WS2021-01417 (B):
WS2021-01418:
I think the existence of this character is sufficiently proved.
I think there is sufficient evidence to prove that this character exists.
Data for Unihan