Unify to WS2021-01317 ⿰忄史 (Ext. J U+327F6). Second evidence shows ⿰忄史 twice, and first evidence shows ⿰忄史 and ⿰忄⿻口人, but ⿰忄⿻口人 here is clearly a mistake for ⿰忄史.
This is already the U-source form of U+6ECB 滋, so UTC should submit a disunification proposal if they believe that ⿰氵玆 was incorrectly unified to 滋 (U+6ECB) (personally, I think the unification is correct).
Support unification. The difference between the two forms of this character is the same as the difference between the regional forms of e.g. U+6220 戠 and U+6222 戢 (G forms extend the horizontal stroke of the left component into the 戈, but T forms have separate components). We should add a UCV for this variation.
The single piece of evidence provided does not convince me that this character should be separately encoded. If, as suspected, this is a variant form of 工 then it would be most appropriate to deal with it using IVS. Note that the much more common variant of 工 with a zig-zag middle stroke was not considered suitable for separate encoding, and is registered in the IVD for the Adobe and Hanyo-Denshi collections.
Wang Xieyang quoted the relevant sentence of the PnP in comment #3120: "Ideographs with different glyph shapes that are unrelated in historical derivation (non-cognate characters) are not unified no matter how similar their glyph shapes may be". In this case the outer box of UTC-03393 should be noticeably larger than the outer box of U+20B9A.
This is my implementation of the two characters in my BabelStone Han and BabelStone Han PUA fonts, with the two characters clearly distinguished (although still easily confusable):
Unify to 𡎢 (U+213A2). The reading ngồi suggests that this is an error form of 𡎢, and unless there is additional evidence that it is a stable error found in multiple sources I suggest that it is postponed.
GZHSJ-0101 is not quite the same as the outside component of U+206A1 𠚡, which has one additional vertical stroke at the bottom middle. It is the same as the bottom outside component of the T and J forms of U+2700D 𧀍 (but the G form is the same as the outside of 𠚡).
I think 飠 is counted as 9 strokes, so total stroke count of 16 should be correct.
Total Stroke Count
Looking at the stroke counts given in the code charts, for characters with 飠 in the residual part, the calculated stroke count of 飠 is 9 for Exts. A and B: U+3533 㔳 (22.11), U+20343 𠍃 (9.11), U+20957 𠥗 (22.11), U+243FC 𤏼 (86.13), U+20FF0 𠿰 (30.13), U+20FEE 𠿮 (30.13), U+21468 𡑨 (32.13), U+21F3D 𡼽 (46.13), U+24014 𤀔 (85.13), U+25F41 𥽁 (119.13), etc.
Only in Exts. C and later does the calculated stroke count of 飠 change to 8: U+2AF97 𪾗 (108.12), U+2BA1C 𫨜 (27.12), U+325A3 (30.12).
9 strokes seems more consistent, especially when you consider the standard J and K form 𩙿 which is definitely 9 strokes.
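As a rough illustration of the comparison above, the following sketch tabulates the radical.residual values quoted in this comment by extension. The function names are hypothetical, and the data is only the list quoted above, not the full code charts. It does not compute the stroke count of 飠 itself, because that also needs the stroke count of the other residual component, which is not given here.

```python
import re

# Entries copied from the comment above, as "U+XXXX (radical.residual)".
ENTRIES = """
U+3533 (22.11), U+20343 (9.11), U+20957 (22.11), U+243FC (86.13),
U+20FF0 (30.13), U+20FEE (30.13), U+21468 (32.13), U+21F3D (46.13),
U+24014 (85.13), U+25F41 (119.13),
U+2AF97 (108.12), U+2BA1C (27.12), U+325A3 (30.12)
"""

PATTERN = re.compile(r"U\+([0-9A-F]{4,6}) \((\d+)\.(\d+)\)")


def extension_group(cp):
    """Classify a code point as Ext. A, Ext. B, or Ext. C or later."""
    if 0x3400 <= cp <= 0x4DBF:
        return "Ext. A"
    if 0x20000 <= cp <= 0x2A6DF:
        return "Ext. B"
    if cp >= 0x2A700:  # Ext. C starts here; later extensions are higher
        return "Ext. C or later"
    return "other"


def parse_entries(text):
    """Return (code point, radical number, residual strokes) tuples."""
    return [(int(h, 16), int(r), int(s)) for h, r, s in PATTERN.findall(text)]


def residuals_by_group(text):
    """Collect the set of residual stroke counts seen in each group."""
    groups = {}
    for cp, _radical, residual in parse_entries(text):
        groups.setdefault(extension_group(cp), set()).add(residual)
    return groups


if __name__ == "__main__":
    for group, residuals in sorted(residuals_by_group(ENTRIES).items()):
        print(group, sorted(residuals))
```

The grouping only separates the examples by block, so that the residual counts in Exts. A and B can be read side by side with those in Ext. C and later.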
I think 飠 is counted as 9 strokes, so total stroke count of 15 should be correct.
Total Stroke Count
Looking at the stroke counts given in the code charts, for characters with 飠 in the residual part, the calculated stroke count of 飠 is 9 for Exts. A and B: U+3533 㔳 (22.11), U+20343 𠍃 (9.11), U+20957 𠥗 (22.11), U+243FC 𤏼 (86.13), U+20FF0 𠿰 (30.13), U+20FEE 𠿮 (30.13), U+21468 𡑨 (32.13), U+21F3D 𡼽 (46.13), U+24014 𤀔 (85.13), U+25F41 𥽁 (119.13), etc.
Only in Exts. C and later does the calculated stroke count of 飠 change to 8: U+2AF97 𪾗 (108.12), U+2BA1C 𫨜 (27.12), U+325A3 (30.12).
9 strokes seems more consistent, especially when you consider the standard J and K form 𩙿 which is definitely 9 strokes.
《红楼梦》(人民文学出版社, 1982, 6th printing 1985) p. 714 gives ⿰氵⿱𫂁马 which is the correct simplified form of U+24172 𤅲 (⿰氵⿱𫂁馬). It seems that ⿰氵⿱𮅕马 is an error form that is only attested in a single edition (Evidence 1). Based on Evidence 2 and the new evidence from the 1982 edition shown here the IDS and glyph for GCW-00281 should be changed to ⿰氵⿱𫂁马.
Mimeographed typewritten draft (circa 1950-1966) for a translation of Through the Looking Glass using the Gwoyeu Romatzyh system of romanization for Mandarin, with Yuen Ren Chao's hand-annotated Chinese characters:
New evidence
You Chengcheng: "'Jabberwocky' in Chinese"; in "Completely Jabberwocky: A Companion to Translations of the Frabjous Poem by Lewis Carroll" (Evertype, In press) p. 85:
《哥德尔、艾舍尔、巴赫——集异璧之大成》[translation of Douglas Hofstadter's "Gödel, Escher, Bach: an Eternal Golden Braid"] (商务印书馆, 1996) [ISBN 9787100013239] p. 478:
New evidence
Yuen Ren Chao, "Dimensions of Fidelity in Translation With Special Reference to Chinese" (Harvard Journal of Asiatic Studies Vol. 29 [1969]) p. 128:
New evidence
You Chengcheng: "'Jabberwocky' in Chinese"; in "Completely Jabberwocky: A Companion to Translations of the Frabjous Poem by Lewis Carroll" (Evertype, In press) p. 85:
Yuen Ren Chao, "Dimensions of Fidelity in Translation With Special Reference to Chinese" (Harvard Journal of Asiatic Studies Vol. 29 [1969]) p. 128:
New evidence
You Chengcheng: "'Jabberwocky' in Chinese"; in "Completely Jabberwocky: A Companion to Translations of the Frabjous Poem by Lewis Carroll" (Evertype, In press) p. 85:
Yuen Ren Chao, "Dimensions of Fidelity in Translation With Special Reference to Chinese" (Harvard Journal of Asiatic Studies Vol. 29 [1969]) p. 128:
New evidence
You Chengcheng: "'Jabberwocky' in Chinese"; in "Completely Jabberwocky: A Companion to Translations of the Frabjous Poem by Lewis Carroll" (Evertype, In press) p. 85:
《哥德尔、艾舍尔、巴赫——集异璧之大成》[translation of Douglas Hofstadter's "Gödel, Escher, Bach: an Eternal Golden Braid"] (商务印书馆, 1996) [ISBN 9787100013239] p. 478:
New evidence
Yuen Ren Chao, "Dimensions of Fidelity in Translation With Special Reference to Chinese" (Harvard Journal of Asiatic Studies Vol. 29 [1969]) p. 128:
New evidence
You Chengcheng: "'Jabberwocky' in Chinese"; in "Completely Jabberwocky: A Companion to Translations of the Frabjous Poem by Lewis Carroll" (Evertype, In press) p. 85:
《哥德尔、艾舍尔、巴赫——集异璧之大成》[translation of Douglas Hofstadter's "Gödel, Escher, Bach: an Eternal Golden Braid"] (商务印书馆, 1996) [ISBN 9787100013239] p. 478:
New evidence
Yuen Ren Chao, "Dimensions of Fidelity in Translation With Special Reference to Chinese" (Harvard Journal of Asiatic Studies Vol. 29 [1969]) p. 128:
New evidence
You Chengcheng: "'Jabberwocky' in Chinese"; in "Completely Jabberwocky: A Companion to Translations of the Frabjous Poem by Lewis Carroll" (Evertype, In press) p. 85:
Evidence 4 and 5 do not show misidentified glyphs, as we already noted in the submission that these two pieces of evidence show error forms of the submitted character ("Note: Evidence 4 gives ⿰亻⿱羽止; evidence 5 gives ⿰亻翌, both are errors of ⿰亻𰙩 as 㒊 in 㒊譶.").
Pre-modern woodblock editions often show a wide range of glyph forms, and where appropriate we provide evidence showing error forms, as we believe this is useful for textual scholarship.
We do not believe that these two characters are symbols. I think that the gloss on the right side reads 勒忒 which is perhaps a variant of the word 肋脦 lēte 'untidy', although that does not make much sense here. Later editions of this novel have "週摺" (周折) zhōuzhé 'setback, complication' which does make sense here. I suspect that UK-30601 and UK-30602 are an idiosyncratic way of writing 週摺.
The items of evidence provided are not captions. Captions are transcriptions of spoken text superimposed on a video or film/tv broadcast (Wiktionary: "caption: A piece of text appearing on screen as a subtitle or other part of a film or broadcast, describing dialogue (and sometimes other sound) for viewers who cannot hear.").
The evidence shows examples of text usage, and it is not relevant that the text occurs as part of a video, as there is no IRG rule prohibiting the use of video as evidence. There is also no IRG rule prohibiting the use of photographs of signs as evidence, e.g. GDM-00507 and GDM-00508, and if someone took a video of the places shown in the photos for these two characters, a still image from the video would be acceptable evidence, certainly not a caption, do you not agree?
Re Comment #1061, I think the use of 㕱 in the paper must be a mistake or approximation, because the actual character is not encoded and so is not available in the font. 𭀖 is a vulgar form of U+5398 厘 used in Cantonese (see e.g. GlyphWiki and jyut.net), and the character 𭀖 without a 口 radical is also used for English 'ri' or 're' in the sources cited in the UK submission. There is no doubt that the character in the original sources is not 㕱 yóu, which would make no sense phonetically.
All the evidences are hand-written, and at least Evidences 1 and 2 are very unclear. It would be helpful to include a modern typeset edition of the text, if available, to see how the character is transcribed.
Both evidences show a cursive hand-written form of the character, and it is difficult to be sure what the intended character is. Is it possible to provide additional evidence showing a printed transcription of the character?
I am confused about this one. The proposed character is ⿰木朮 but the only evidence provided shows U+233D5 𣏕 (⿰木𣎳). FAD1 is indeed ⿰木朮, but as far as I can see no evidence for its disunification has been provided.
Given the unusual glyph form, the current evidence is insufficient for encoding. Need additional evidence to determine whether the character is a unifiable variant of an existing character or whether it should be separately encoded.
Current evidence is insufficient for encoding, as it is unclear whether this is a unifiable variant of an existing encoded character. Possibly a variant of U+6896 梖?
IDS is ⿰舟玆, font glyph is ⿰舟茲, and evidence shows ⿰舟兹. Please either change glyph to match IDS, or change IDS to match glyph. If IDS is changed, then first stroke also needs to be changed.
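The mismatch described above can be checked mechanically by comparing the three sequences code point by code point. This is only a sketch: the helper names are hypothetical, and the three strings are simply copied from the comment.

```python
# The three forms quoted in the comment above.
forms = {
    "IDS": "⿰舟玆",
    "font glyph": "⿰舟茲",
    "evidence": "⿰舟兹",
}


def describe(ids):
    """List the code points of an IDS string in U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in ids)


def differing_positions(a, b):
    """Return the indexes at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    for label, ids in forms.items():
        print(f"{label}: {ids} = {describe(ids)}")
    print(differing_positions(forms["IDS"], forms["font glyph"]))
```

All three sequences share the operator and the left component, and differ only in the right component.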
The second 纟 should be the same as the first 纟, with a slanting up final stroke rather than a horizontal stroke.
Glyph design
Modify glyph to have a horizontal stroke at the very top, not a dot, as shown in all evidences.
Glyph design
Evidences 4-9 appear to show ⿰月⿹⿶戈一⿳⿲纟言纟⿲长马长心 with a full 言 in the top middle. It is unclear what Evidence 1-3 intend for the top middle component, but it looks more like ⿱丷口 than ⿱二口.
The note at the top of this page and the pages for other resubmitted SAT characters states "Resubmitted from WS2024", but the Excel file submitted by SAT states "Resubmitted from WS2021" in each case. This mistake seems to have been introduced after submission.
I do not understand why the note at the top of this page says "Resubmitted from WS2024-01155" when the Notes column of the UK submitted Excel file says "Resubmitted from WS2021-01155". This mistake seems to have been introduced after submission.
I do not understand why the note at the top of this page says "Resubmitted from WS2024-04503" when the Notes column of the UK submitted Excel file says "Resubmitted from WS2021-04503". This mistake seems to have been introduced after submission.
I do not understand why the note at the top of this page says "Resubmitted from WS2024-03752" when the Notes column of the UK submitted Excel file says "Resubmitted from WS2021-03752". This mistake seems to have been introduced after submission.
I do not understand why the note at the top of this page says "Resubmitted from WS2024-00315" when the Notes column of the UK submitted Excel file says "Resubmitted from WS2021-00315". This mistake seems to have been introduced after submission.
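A discrepancy of this kind could be caught with a simple comparison of the page note against the Notes column of the submitted file. The sketch below assumes that both are available as plain strings in the "Resubmitted from WS…" form quoted above; the function name and the regular expression are illustrative assumptions, not part of any IRG tooling.

```python
import re

# Matches e.g. "Resubmitted from WS2021-01155" or "Resubmitted from WS2021".
NOTE = re.compile(r"Resubmitted from (WS\d{4})(?:-(\d{5}))?")


def note_mismatch(page_note, submitted_note):
    """Return True if the two notes name different working sets or serials.

    Returns None when either note cannot be parsed.
    """
    page = NOTE.search(page_note)
    submitted = NOTE.search(submitted_note)
    if page is None or submitted is None:
        return None
    return page.groups() != submitted.groups()


if __name__ == "__main__":
    print(note_mismatch("Resubmitted from WS2024-01155",
                        "Resubmitted from WS2021-01155"))
```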
Oppose new radical number. If ⿱田儿 is given a new radical number, that implies that it cannot be unified to the same character with the standard 鬼 radical. There are four characters in WS2024 where TCA and UK have submitted the same 鬼 character, but TCA has kept the ⿱田儿 form shown in the source evidence, whereas UK has normalized to 鬼. If we unify then the radical cannot be both 194.0 and 194.1, but it makes no sense to disunify when the evidence for the characters is from the same source.
In 《哥德尔、艾舍尔、巴赫——集异璧之大成》[translation of Douglas Hofstadter's "Gödel, Escher, Bach: an Eternal Golden Braid"] (商务印书馆, 1996) the character is written as ⿱卧龙. 尨 can be used as a variant form of 龍, and no doubt Yuen Ren Chao intended the character ⿱卧尨 to mean 卧龍, but Evidences 1-4 all use 尨 at the bottom, so ⿱卧尨 is the canonical form of the character which we propose for encoding. The form ⿱卧龙 could be considered a unifiable variant of ⿱卧尨, encodable using IVS.
Other
Note that the above evidence and the following page include hand-written unencoded characters ⿰犭偷, ⿰豸若, ⿰赤瓦, ⿱不亞, and ⿰赤禾 which are not used in the 1969 printed edition of Yuen Ren Chao's translation of Through the Looking Glass. Because these characters are only attested in a single hand-annotated draft we have not proposed them for encoding.
Evidence 3 also shows the unencoded characters ⿱罒二 and ⿱格兒, which are not attested elsewhere, so have not been proposed for encoding.
Other
Sources for the hand-annotated typescript draft translation shown in the above evidence:
In 《哥德尔、艾舍尔、巴赫——集异璧之大成》[translation of Douglas Hofstadter's "Gödel, Escher, Bach: an Eternal Golden Braid"] (商务印书馆, 1996) the character is miswritten as U+8994 覔.
To clarify, ⿰又住 means youq 'to be at', with the phonetic component 又 and the semantic component 住. There is also a Sawndip character 难 which is used as a classifier of inanimate objects (aen), but the two characters are distinct and separate.
The images are PDFs displayed within a frame. If they do not show on your system, you can try downloading the PDFs by clicking on the Save icon at the top right of each frame.
My main point is that this character should not have been submitted for encoding, but a disunification proposal for U+6ECB should have been submitted instead.
I don't understand how encoding this character affects FAB0, which is presumably a variant form of 7DF4 used in DPRK, and is not a transcription of a bronze script character as shown by the only evidence provided for the new character. Also, can we move the KP source reference without agreement from DPRK experts? All in all, the evidence for disunification seems very weak to me.
As this is a Han-Katakana hybrid (⿸广⿱タカ taka = U+9DF9 鷹), would it be more appropriate to include it in the proposed "Script-Hybrid CJK Ideographs" block?
Personally, I am OK with encoding script-hybrid CJK ideographs as ordinary CJK unified ideographs, so I am not opposed to encoding UTC-03391, but I am concerned that it is inconsistent with the proposed treatment of similar hybrid characters in L2/24-201.
IRG Working Set 2024v1.0
Source: Andrew WEST
Date: Generated on 2024-10-10
Unification
Unify to WS2021-00778 GDM-00364 ⿴囗恋 (Ext. J U+32629)
Unify to 𭎇 (U+2D387)
Unify to WS2021-00814 ⿰土辰 (Ext. J U+32649)
Unify to 㒷 (U+34B7). Both are vulgar forms of 興, and the crossing or not of the 人 component should be a unifiable difference.
Unify to 𥢑 (U+25891) ?
Unify to 𤍫 (U+2436B) by #307b, with horizontal extension by China.
Unify to 煚 (U+715A)
Unify to 𣕕 (U+23555) by UCV #89
Unify to 徹 (U+5FB9) with new UCV?
Unify to WS2021-01170 ⿱山異 (Ext. J U+32778)
淮南子 gives 篅𥫱, so unify to 𥫱 (U+25AF1)?
Unify to 𡩟 (U+21A5F) (SAT glyph is the same as the K-source glyph for U+21A5F)
Unify to WS2021-01317 ⿰忄史 (Ext. J U+327F6). Second evidence shows ⿰忄史 twice, and first evidence shows ⿰忄史 and ⿰忄⿻口人, but ⿰忄⿻口人 here is clearly a mistake for ⿰忄史.
Unify to 麚 (U+9E9A) with new UCV for 叚~𫨻. See also GZ-2091203 where the evidence shows ⿱⿰虫𫨻共, but the font glyph is normalized to ⿱蝦共.
Unify to 𨂍 (U+2808D) with new UCV for 𧾷 and 𤴔. Note that the following character in the evidence (⿰𤴔居 = 踞) is not encoded or proposed for encoding.
Unify to 蹰 (U+8E70) with new UCV for 𧾷 and 𤴔.
Unify to WS2021-03618 GKJ-00438 ⿱隊虫 (Evidence 2 shows this character form)
Unify to 𮜨 (U+2E728) with new UCV for 𧾷 and 𤴔.
Unify to 𬦷 (U+2C9B7)
Unify to 蟲 (U+87F2) with new UCV for 虫~䖝?
Unify to 𥝸 (U+25778) by UCV #32a
Unify to 蚊 (U+868A) (which is the character actually given in Huainanzi)
Unify to WS2021-04301 ⿱雨仙 (Ext. J U+33249)
Unify to WS2021-04318 ⿱雨炎 (Ext. J U+3325A)
Unify to 𮎟 (U+2E39F)
There are no existing encoded characters with ⿱䀠瓦, although SJ/T 11239—2001 37-75 is ⿰石⿱䀠瓦 (but no corresponding encoded form ⿰石甖).
This is already the U-source form of U+6ECB 滋, so UTC should submit a disunification proposal if they believe that ⿰氵玆 was incorrectly unified to 滋 (U+6ECB) (personally, I think the unification is correct).
Unify to 縠 (U+7E20), and change NUCV #264a to a UCV.
Unify to 𭡽 (U+2D87D) with new UCV ?
Unify to 蔻 (U+853B) ?
The single piece of evidence provided does not convince me that this character should be separately encoded. If, as suspected, this is a variant form of 工 then it would be most appropriate to deal with it using IVS. Note that the much more common variant of 工 with a zig-zag middle stroke was not considered suitable for separate encoding, and is registered in the IVD for the Adobe and Hanyo-Denshi collections.
This is my implementation of the two characters in my BabelStone Han and BabelStone Han PUA fonts, with the two characters clearly distinguished (although still easily confusable):
Unify to 矑 (U+77D1) which is the form that Evidence 2 seems to show.
Unify to 躔 (U+8E94) with new UCV?
Unify to 𪖌 (U+2A58C) -- Evidence 2 seems to show ⿺鼠盧
Unify to 𡎢 (U+213A2). The reading ngồi suggests that this is an error form of 𡎢, and unless there is additional evidence that it is a stable error found in multiple sources I suggest that it is postponed.
Unify to 𮔔 (U+2E514) by UCV #149
Attributes
Only in Exts. C and later does the calculated stroke count of 飠 change to 8: U+2AF97 𪾗 (108.12), U+2BA1C 𫨜 (27.12), U+325A3 (30.12).
9 strokes seems more consistent, especially when you consider the standard J and K form 𩙿 which is definitely 9 strokes.
Evidence
Pre-modern woodblock editions often show a wide range of glyph forms, and where appropriate we provide evidence showing error forms, as we believe this is useful for textual scholarship.
The evidence shows examples of text usage, and it is not relevant that the text occurs as part of a video, as there is no IRG rule prohibiting the use of video as evidence. There is also no IRG rule prohibiting the use of photographs of signs as evidence, e.g. GDM-00507 and GDM-00508, and if someone took a video of the places shown in the photos for these two characters, a still image from the video would be acceptable evidence, certainly not a caption, do you not agree?
Glyph Design & Normalization
Editorial
Other
Evidence 3 also shows the unencoded characters ⿱罒二 and ⿱格兒, which are not attested elsewhere, so have not been proposed for encoding.
https://repository.lib.cuhk.edu.hk/en/item/cuhk-2023562
https://repository.lib.cuhk.edu.hk/en/item/cuhk-2023552
Personally, I am OK with encoding script-hybrid CJK ideographs as ordinary CJK unified ideographs, so I am not opposed to encoding UTC-03391, but I am concerned that it is inconsistent with the proposed treatment of similar hybrid characters in L2/24-201.
Data for Unihan
UK-20805
GKJ-00477