Evidence 1 and 2 both show that the replacement character for ⿰土𤲞 is U+756C 畬 which implies that ⿰土𤲞 = ⿰土畬 = U+302AE 𰊮 (UK-02707). Therefore unify to 𰊮 (U+302AE).
In light of #9797, I suggest to disunify UK-20622 and GKJ-00810, and suggest that China corrects the glyph and IDS for GKJ-00810 to ⿰臼鸟 (simplified form of U+4CCE 䳎).
Unify to 嗌 (U+55CC) because of unification of SAT-06800 to 益 with new UCV. If unification to U+55CC is considered inappropriate, then consider disunifying SAT-06800 from 益.
The character shown in #12873 is U+24D6E 𤵮 kuí. As U+24D6E has a subtle but important difference in glyph construction compared with UK-20302 (⿸疒⿱㇒火 compared with ⿸疒灭) and a different pronunciation and meaning, the two characters are not cognate, and according to IRG PnP Section 2.1.3 should not be unified: "Ideographs with different glyph shapes that are unrelated in historical derivation (non-cognate characters) are not unified no matter how similar their glyph shapes may be".
The T-source character for U+2209E seems to be a variant of 希 xī, so not cognate with the Vietnam character ⿱父布. Therefore unification is not appropriate. Suggest to keep this character in WS2021, and remove the V-source reference (VN-2209E) from U+2209E.
Change Radical to 24.0 (十), SC=5, FS=3 because GKJ-01026 (⿰千分) is under Radical 24. Alternatively, change GKJ-01026 to use Radical 18, but both characters should be under the same radical as they are the same abstract character with different layout of components.
Change Radical to 117.0 (立), SC=4, FS=3. Even if 立 is the phonetic here, 立 is a much better radical for indexing purposes than 又. Note that U+2E128 𮄨 is Radical 117.
Second radical not required because 羅 is such a common phonetic component, and ⿱維土 is not a thing, so there is no scope for confusion over the radical in this case.
The evidence shown by China derives from 爾雅注 which has "蜪蚅(未詳)". ⿰虫陶 in the evidence could be a font error for 蜪, so China should supply an image of the original woodblock edition of 郝懿行集 to confirm that ⿰虫陶 is shown in the original text.
I checked through the whole of 《太平圖話姓氏綜》 and was unable to find this character or any character with a rat 鼠 radical. I suspect that ⿰鼠亘 is a mistake for U+29C35 𩰵 yuán given on 3:4b (is the character given in the entry a variant of U+2CD20 𬴠?):
Therefore suggest to postpone pending additional evidence.
Also a second stage simplified character (《第二次汉字简化方案(草案)》 p. 4):
I suppose that if bare lists of vulgar form characters such as the evidence provided by Wang Xieyang are acceptable for encoding, then all the remaining unencoded second stage simplified characters listed in 《第二次汉字简化方案(草案)》 should be submitted for the next IRG Working Set.
Jumanjikyo notes on Twitter that ⿱艹架 occurs several times in 『私家農業談』 (1789), where it is used to write the word 稲架(はさ). Maybe someone with access to this book can check.
Evidence is sufficient for encoding. It makes no sense to discuss "encoding model". Either unify with U+211A0 𡆠 or move back to the Main set. My preference is to encode ⿴〇乙 as a separate character.
New evidence
Also seen in《五雜組》13:29B (supplementary evidence for VN-F1BD7) on 3rd column from left:
Agree with analysis provided by Huang Junliang that ⿰土苦 in the original evidence is probably a mistake for 塔 = 㙮. Therefore we WITHDRAW this character.
Based on the new evidence provided by Eiso in #6798, we withdraw the withdrawal of this character. The new evidence suggests that ⿱當心 is not a one-off error, but may be a deliberate choice.
The character is likely to be an error. Google Books gives "江口店" for this text in the 康熙順德縣志, but I cannot see the actual text so it may be an OCR error. It seems prudent to postpone pending additional evidence.
I agree that ⿰忄字 is a plausible mistake for 悖, but the Ming edition of 宋史 given in #2989 shows ⿰忄字 where the modern edition has 悖 (image of the pdf evidence shown below), so if it is an error it may be considered to be a stable error. Unfortunately there are no Tangut sources for this general's name. As there are two separate pieces of evidence in support of it, I suggest to keep the character in the Main set.
New evidence
This is the 《摛藻堂四庫全書薈要》 edition of the 宋史 (349:8a) which also shows ⿰忄字:
The character may be an error, but the Wuyingdian edition of 《宋史》 is an important source text which needs to be representable as a digital text. An internet search finds many websites which quote the text "副使安化郎将摩君明稽<田思>等十四人来贡" with UK-20480 represented as <田思> (e.g. here and here). As such, we can consider that "⿰田思" is a stable error, and that it should be encoded for the convenience of users. Therefore we are not willing to withdraw this character.
It does not look like ⿰女啟 should be read miau²¹. It seems possible that ⿰女啟 is a mistake for some other character. I think additional evidence is required in this case.
Evidence
Based on the investigation by Eiso, UK-20807 is probably a mistake for ⿰女畝 (UK-20820), therefore we WITHDRAW this character.
⿰前刂 should be a mistake for U+84AF 蒯 as "老蒯" lǎokuǎi is an affectionate way of addressing an old wife in the Northeast dialect (https://baike.baidu.hk/item/%E8%80%81%E8%92%AF/1335056). Therefore suggest to postpone pending additional evidence.
As the comment by Huang Junliang shows multiple examples of 𠷢, and as there is only a single piece of evidence for ⿲𧾷𦈢亍, it is probable that the proposed character is a one-off mistake for 𠷢. Therefore postpone pending additional evidence.
As UK generally follows PRC conventions, modify the 幾 component to follow PRC convention for this component (tip of 人 should extend through the horizontal stroke above).
As UK generally follows PRC conventions, modify the 畢 component to follow PRC convention for this component (bottom horizontal stroke should be longer than the horizontal stroke above it).
As UK generally follows PRC conventions for Chinese characters, modify the 畢 component to follow PRC convention for this component (bottom horizontal stroke should be longer than the horizontal stroke above it).
In principle UK normalizes glyph forms for characters in Chinese sources to match PRC conventions, so in this case I agree with Henry that it is reasonable to normalize 口䕶 to ⿰口護. Therefore we should consider reverting to the v. 1.0 glyph.
Based on the additional evidence for UK-20678 which shows that ⿵冂力 should be 㘞, and the new evidence above showing ⿰口㘞, I suggest changing the glyph and IDS for UK-20684 to ⿰口㘞.
Image from Tự Điển Chữ Nôm Trích Dẫn shown above shows ⿱𬼀見 rather than ⿰𬼀見. Consider modifying glyph and IDS to match the glyph form shown in Tự Điển Chữ Nôm Trích Dẫn.
Previously submitted by China for CJK_C1 as CYY01318. What was the evidence then? This character is also SJ/T 11239—2001 28-42 so it probably should be encoded as a variant of 衚.
There is very little point in encoding only ⿱宀鹿, so I hope that China will submit ⿱宀艸 and ⿱宀日 for the next working set, otherwise it will still not be possible to represent this text in digital format -- and there is no other use for ⿱宀鹿!
Based on the full evidence, SAT-05240 is the original Wu Zetian character for 月 which was later changed to 𠥱. Therefore SAT-05240 should be considered to be an ideograph not a symbol, and there is as much reason to encode it as a CJK unified ideograph as U+20971 𠥱.
There is no evidence that UK-20131 is a variant of 玩, and I very much doubt that it is related to 玩 ("similar to" just means that there is a graphical similarity between the two characters). As a person's name it is more likely to have been read as yuán.
It is a shame that Wylie stopped at Ω, as 口 forms of the first 26 of the 28 constellations are encoded or proposed in WS2021, but #27 ⿰口翼 and #28 ⿰口軫 are missing.
This is not a "wrong" glyph, but a deliberate variant of 餐. There are many variant forms of 餐, and the author of 《天聞閣琴譜》 decided to write the character as ⿱叔食. Suggest moving UK-20397 back to the Main Set.
Re Comment #9489: Please note that this character is not intended for use by transgender people who identify with a specific gender. As noted in Evidence 1 (highlighted in red at the bottom), this character is intended for use by *non-binary* people who do not identify as either male or female: "{⿰㐅也}為中性代詞,代表非二元性別認同的跨性别人士". The character should be encoded for those people who wish to self-identify using the non-binary pronoun ⿰㐅也. Of course, cis and trans people who do identify with a specific gender will not want to use this pronoun, which is fine!
Seems to be derived from the Buddhist term 菩提 bodhi by addition of 口 radical to both characters. Cf. also addition of 口 to the Buddhist term 三昧 samadhi shown at the bottom of the evidence.
Other sources give this text as "祖明、地軸、誕馬、偶人". In 《唐宋白孔六帖》 it seems that the character 誕 has been miswritten with a 車 radical because of the preceding character 軸.
IRG Working Set 2021v4.0
Source: Andrew WEST
Date: Generated on 2023-03-21
Labels
Unification
Evidence 1 and 2 both show that the replacement character for ⿰土𤲞 is U+756C 畬 which implies that ⿰土𤲞 = ⿰土畬 = U+302AE 𰊮 (UK-02707). Therefore unify to 𰊮 (U+302AE).
Unify to 㠋 (U+380B).
Unify to 𠻢 (U+20EE2): the character is a variant of 謣, so the right-hand sides should be cognate.
Unify to 嗌 (U+55CC) because of unification of SAT-06800 to 益 with new UCV. If unification to U+55CC is considered inappropriate, then consider disunifying SAT-06800 from 益.
We do not oppose unification to 玜 (U+739C) with a new UCV (see also UK-20188)
Agree to unify to 𡧾 (U+219FE) as both are variants of "寧".
The quoted text is a mistake for "劃劙雲陰,卷月日也". Suggest to unify ⿰蟸刂 to 劙 (U+5299) with new UCV for 蟸~蠡.
Unify to 𫫕 (U+2BAD5), and suggest UTC change glyph and source reference for U+2BAD5 to UTC-03216.
Attributes
NB U+25A9D 𥪝 is under Radical 117
Evidence
Also present in various editions of 《陝西通志》
Variant of U+248B9 𤢹
Additional evidence is needed to show that this is not a one-off typo in this edition.
Therefore suggest to postpone pending additional evidence.
I suppose that if bare lists of vulgar form characters such as the evidence provided by Wang Xieyang are acceptable for encoding, then all the remaining unencoded second stage simplified characters listed in 《第二次汉字简化方案(草案)》 should be submitted for the next IRG Working Set.
⿴囗九 in Evidence 2 would seem to be a mistake for 㘞, therefore suggest to postpone pending additional evidence that ⿴囗九 is correct in this context.
NB This evidence shows ⿱微血.
On the other hand, a different edition of 《紹興府志》 has ⿰阝京:
Additional evidence would be useful.
Therefore suggest to postpone pending additional evidence that ⿰貝厨 is not a one-off error.
This indicates that ⿰土取 is a variant of U+966C 陬.
As this variant is not unifiable with any encoded version of 寧, we therefore request to move this character back to the Main set.
Therefore we request to move UK-20398 back to the Main Set.
Suggest to postpone pending additional evidence.
⿰王未 in Evidence 1 is probably 珠 with a missing stroke, therefore suggest to postpone for additional evidence.
I think that ⿰子孟 is almost certainly a one-off error for 猛, so suggest to postpone pending additional evidence.
《宋稗類鈔》(1669):
《山堂肆考》(四庫全書本):
I strongly suspect that ⿰山耳 is a corruption of 㟁, therefore suggest to postpone pending additional evidence.
Pronunciation: k̕a˥ (雷州)
Meaning: 捕魚簍子
This shows that ⿰土囦 is not an error for U+315AE 𱖮 (⿰土困)
Glyph Design & Normalization
Other
Data for Unihan
Other sources give this text as "祖明、地軸、誕馬、偶人". In 《唐宋白孔六帖》 it seems that the character 誕 has been miswritten with a 車 radical because of the preceding character 軸.