00014
1.0 一
SC=11, FS=5 TS=12

GKJ-00944
IRGN2632 WS2021 v6.0 Pending
Postponed for further discussion of encoding model, IRG 57.
Attributes:



Review Comments

Type
Description
Submitter
Other
COMMENT
WS2021 v1.0
[ Resolved ]
00014 | 一 1.11.5 | GKJ-00944 | TS 12 · IDS
00016 | 一 1.12.5 | GKJ-00941 | TS 13 · IDS
00017 | 一 1.13.5 | GKJ-00942 | TS 14 · IDS
00777 | 囗 31.10.1 | GKJ-00877 | TS 13 · IDS
01900 | 欠 76.8.1 | GKJ-00943 | TS 12 · IDS


I doubt that these should be encoded as Han ideographs. Encoding each of them as a single character is open-ended.

They are among the examples in 有機化學命名芻議[1] (A Proposal for Organic Chemical Nomenclature; see attachments below) by 梁國常. Here I summarize the naming rules relevant to the WS2021 characters.

* Alkane CnH2n+2 is named ⿰⿳一巛⿸厂X充, where X is one of 一二三四五六七八九十 for n = 1, 2, ..., 9, 10. So WS2021-00016 ⿰⿳一巛⿸厂一充 is methane (CH4, n = 1) and WS2021-00017 ⿰⿳一巛⿸厂二充 is ethane (C2H6, n = 2). For n = 11, ..., 19, X is the top-to-bottom combination 十一, 十二, ..., 十九; for example, he gave ⿰⿳一巛⿸厂⿱十五充 for C15H32. For n = 21, ..., 29, X is the combination 廿一, 廿二, ..., 廿九; for example, he gave ⿰⿳一巛⿸厂⿱廿一充 for C21H44. He also gave ⿰⿳一巛⿸厂⿱六十充 for C60H122, but he gave no example for n > 60.

* Alkene CnH2n (n >= 2) is named ⿰⿳一巛⿸厂X欠, where X is formed from n in the same way as he proposed for alkanes. So WS2021-01900 is ethylene (C2H4, n = 2).

* Alkyne CnH2n-2 (n >= 2) is named ⿰⿳一巛⿸厂X少, where X is formed from n in the same way as he proposed for alkanes. So WS2021-00014 is ethyne (C2H2, n = 2).

* Aromatic ring compounds such as benzene and furan are named ⿴囗⿳一巛⿸厂X, where X represents the number of nuclei in the ring. So WS2021-00777 ⿴囗⿳一巛⿸厂六 is benzene, and ⿴囗⿳一巛⿸厂五 (also shown in the WS2021-00777 evidence but not submitted) is furan (C4H4O). He did not mention how to name other ring compounds with the same number of nuclei, such as pyrrole (C4H5N), let alone general ring compounds.
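
The naming rules summarized above can be sketched in Python as follows. This is only an illustration of the pattern: the function and variable names are hypothetical and come neither from 梁國常's paper nor from the WS2021 submission. The sketch covers only the values of n described above (1 to 10, 11 to 19, 21 to 29, and 60) and raises an error for anything else, since the summary gives no rule for other values.

```python
# Illustrative sketch only: names are hypothetical, not from the source.
DIGITS = "一二三四五六七八九"  # numerals for 1..9

PREFIX = "⿰⿳一巛⿸厂"        # shared left-hand structure
RING_PREFIX = "⿴囗⿳一巛⿸厂"  # structure used for ring compounds
SUFFIX = {"alkane": "充", "alkene": "欠", "alkyne": "少"}


def numeral_component(n: int) -> str:
    """Return the X component for carbon count n (documented cases only)."""
    if 1 <= n <= 9:
        return DIGITS[n - 1]
    if n == 10:
        return "十"
    if 11 <= n <= 19:
        return "⿱十" + DIGITS[n - 11]  # 十 stacked above the unit digit
    if 21 <= n <= 29:
        return "⿱廿" + DIGITS[n - 21]  # 廿 stacked above the unit digit
    if n == 60:
        return "⿱六十"                 # the only example given beyond 29
    raise ValueError(f"no rule described for n = {n}")


def hydrocarbon_ids(kind: str, n: int) -> str:
    """Build the IDS for an alkane, alkene, or alkyne with n carbon atoms."""
    if kind not in SUFFIX:
        raise ValueError(f"unknown kind: {kind}")
    if kind != "alkane" and n < 2:
        raise ValueError("alkenes and alkynes need n >= 2")
    return PREFIX + numeral_component(n) + SUFFIX[kind]


def ring_ids(n: int) -> str:
    """Build the IDS for a ring compound with n nuclei in the ring."""
    return RING_PREFIX + numeral_component(n)
```

Under these assumptions, `hydrocarbon_ids("alkyne", 2)` reproduces the IDS of WS2021-00014, and every further supported value of n yields another distinct candidate character, which is the open-endedness at issue.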

Encoding ⿰⿳一巛⿸厂X充 as a single character is open-ended because alkanes, alkenes and alkynes can have an arbitrarily large number of carbon atoms. If we accept ⿰⿳一巛⿸厂一充, should we accept ⿰⿳一巛⿸厂⿱六十充 (already in the given evidence)? Should we accept ⿰⿳一巛⿸厂⿳一百一充?

Encoding ⿴囗⿳一巛⿸厂X as a single character is also open-ended because cycloalkanes (hydrocarbon ring compounds) can also have an arbitrarily large number of carbon atoms, let alone general ring compounds. If we accept ⿴囗⿳一巛⿸厂六, should we accept ⿴囗⿳一巛⿸厂七? Should we accept ⿴囗⿳一巛𫨅?

Alternative encoding solution

To encode these characters, I suggest that they be placed in a new Unicode block, using an encoding model different from that of Han ideographs. Take alkanes, for example: we can encode ⿰⿳一巛⿸厂◌充 as a base character, which starts a ZWJ sequence to represent different alkanes:

⿰⿳一巛⿸厂二充 --> ⿰⿳一巛⿸厂◌充 + ZWJ + 二

⿰⿳一巛⿸厂⿱六十充 --> ⿰⿳一巛⿸厂◌充 + ZWJ + 六 + ZWJ + 十
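
As a rough sketch of how such sequences could be assembled and taken apart, the Python below joins a base character and its numerals with ZWJ (U+200D). The base character ⿰⿳一巛⿸厂◌充 is not encoded, so the sketch assumes a Private Use Area code point (U+E000) purely as a stand-in; that code point and the function names are hypothetical and not part of any proposal.

```python
# Illustrative sketch only: the base code point is a hypothetical stand-in.
ZWJ = "\u200D"          # ZERO WIDTH JOINER
ALKANE_BASE = "\uE000"  # assumed PUA placeholder for ⿰⿳一巛⿸厂◌充


def encode_sequence(base: str, numerals: str) -> str:
    """Return base + ZWJ + numeral for each numeral, as in the examples above."""
    return base + "".join(ZWJ + ch for ch in numerals)


def decode_sequence(sequence: str) -> tuple[str, str]:
    """Split a sequence back into its base character and its numerals."""
    base, *numerals = sequence.split(ZWJ)
    return base, "".join(numerals)


ethane = encode_sequence(ALKANE_BASE, "二")     # base + ZWJ + 二
c60h122 = encode_sequence(ALKANE_BASE, "六十")  # base + ZWJ + 六 + ZWJ + 十
```

With this model only the base character needs a code point, and any number of carbon atoms is expressed by appending numerals, which avoids the open-ended set of precomposed characters.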

Reference:

[1] 梁國常. 有機化學命名芻議. 北京大學月刊. 1920(7). pp. 71-89. Available online: http://read.nlc.cn/allSearch/searchDetail?searchType=all&showType=1&indexName=data_404&fid=01J000340

Attached PDF file
HUANG Junliang
Individual
2021-08-28 02:14:16 UTC
Other
COMMENT
WS2021 v1.0
[ Resolved ]
Mr. Huang's comment #2590 is useful and reasonable. The rationale for this type of character(?) is similar to that of Chinese Jianzi musical notation. I will provide some comments in my feedback on IRGN2492.
Eiso CHAN
Individual
2021-09-02 01:28:21 UTC
Other
COMMENT
WS2021 v1.0
[ Resolved ]
I agree with Mr Huang's comment that these examples of systematically constructed characters should not be encoded as individual CJK unified ideographs. If there is evidence that Liáng Guócháng's system for naming organic chemical compounds was actually used by the scientific community in China, then we can consider encoding them using some other mechanism (e.g. encoding a set of components which could be ligated together). However, if these characters are only attested in Liáng Guócháng's 1920 proposal and other works by the same author, then I do not believe that it is appropriate to encode them at all. Certainly I would be strongly opposed to defining a new Unicode block for a failed scientific orthography.
Andrew WEST
UK
2021-09-07 23:23:23 UTC
Other
COMMENT
WS2021 v1.0
[ Resolved ]
I have shown a preliminary proposal to encode this type of character via sequences in my feedback on IRGN2492.
Eiso CHAN
Individual
2021-09-13 14:16:29 UTC
Evidence
NEW_EVIDENCE
WS2021 v1.0
[ Resolved ]
https://www.docin.com/p-1731720387.html
Eiso CHAN
Individual
2021-09-13 14:16:46 UTC
Other
COMMENT
WS2021 v2.0
[ Unresolved ]
The following is the comment on the Early Chinese chemical characters from the Script Ad Hoc group in L2/22-023, which is also the recommendation for section 3 of my feedback on IRGN2492.

Eiso CHAN
Individual
2022-01-25 01:06:29 UTC

Meeting Minutes

Date | Description
IRG #57, 2021-09-17 (Fri), 8:54 am +0800, recorded by CHEN Zhuang | Postponed for further investigation

Attribute Changes

Version | Description
2.0 | For 00014, change Status to Postponed
2.0 | For 00014, add Discussion Record "Postponed for further discussion of encoding model, IRG 57."

Glyph Changes

Source Reference | Glyph
GKJ-00944
1.0

Raw Info
group: China (GKJ - Science and Technology Characters)
a) Source reference: GKJ-00944
b) PUA Code of TTF: E374
c) KangXi Radical Code (Primary): 1.0
d) Stroke Count (Primary): 11
e) First Stroke (Primary): 5
g) Total Stroke Count: 12
i) IDS (Ideographic Description Sequence): ⿰⿳一巛⿸厂二少
j) Similar/Variants: N/A
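k) Ref. to Evidence doc: 碳原子、取代基及官能团数目的中文命名演变:1908—1932 (The Evolution of Chinese Naming for the Numbers of Carbon Atoms, Substituents and Functional Groups: 1908-1932)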