00014 | ⿰⿳一巛⿸厂二少

00014

1.0 一

⿰⿳一巛⿸厂二少

SC=11, FS=5, TS=12

GKJ-00944

IRGN2632WS2021v6.0 · Pending
Postponed for further discussion of encoding model, IRG 57.

Attributes:

Review Comments

Type

Description

Submitter

Other

COMMENT

WS2021 v1.0

[ Resolved ]

00014

一 1.11.5

GKJ-00944

TS 12 · IDS ⿰⿳一巛⿸厂二少

00016

一 1.12.5

GKJ-00941

TS 13 · IDS ⿰⿳一巛⿸厂一充

00017

一 1.13.5

GKJ-00942

TS 14 · IDS ⿰⿳一巛⿸厂二充

00777

囗 31.10.1

GKJ-00877

TS 13 · IDS ⿴□⿳一巛⿸厂六

01900

欠 76.8.1

GKJ-00943

TS 12 · IDS ⿰⿳一巛⿸厂二欠

I doubt that these should be encoded as Han ideographs. Encoding each of them as a single character is open-ended.

They are among the examples in 有機化學命名芻議 [1] (A Proposal for Organic Chemical Nomenclature; see the attachment below) by 梁國常. Here I summarize the naming rules relevant to the WS2021 characters.

* An alkane CnH2n+2 is named ⿰⿳一巛⿸厂X充, where X is one of 一二三四五六七八九十 for n = 1, 2, ..., 10. So WS2021-00016 ⿰⿳一巛⿸厂一充 is methane (CH4, n = 1) and WS2021-00017 ⿰⿳一巛⿸厂二充 is ethane (C2H6, n = 2). For n = 11, ..., 19, X is the top-to-bottom combination 十一, 十二, ..., 十九; for example, he gave ⿰⿳一巛⿸厂⿱十五充 for C15H32. For n = 21, ..., 29, X is the combination 廿一, 廿二, ..., 廿九; for example, he gave ⿰⿳一巛⿸厂⿱廿一充 for C21H44. He also gave ⿰⿳一巛⿸厂⿱六十充 for C60H122, but no example for n > 60.

* An alkene CnH2n (n >= 2) is named ⿰⿳一巛⿸厂X欠, where X is formed from n in the same way as for alkanes. So WS2021-01900 is ethylene (C2H4, n = 2).

* An alkyne CnH2n-2 (n >= 2) is named ⿰⿳一巛⿸厂X少, where X is formed from n in the same way as for alkanes. So WS2021-00014 is ethyne (C2H2, n = 2).

* Aromatic ring compounds such as benzene and furan are named ⿴囗⿳一巛⿸厂X, where X is the number of nuclei in the ring. So WS2021-00777 ⿴囗⿳一巛⿸厂六 is benzene, and ⿴囗⿳一巛⿸厂五 (also shown in the WS2021-00777 evidence but not submitted) is furan (C4H4O). He did not say how to name other ring compounds with the same number of nuclei, such as pyrrole (C4H5N), let alone ring compounds in general.
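
The chain-hydrocarbon rules summarized above can be sketched in a few lines of Python. This is only an illustration of the pattern, not anything taken from the 1920 proposal: the names `numeral_component` and `hydrocarbon_ids` are hypothetical, and the sketch deliberately covers only the carbon counts attested above (1 to 10, 11 to 19, 21 to 29, and 60), raising an error for everything else.

```python
# Illustrative sketch only; all identifiers are hypothetical.
DIGITS = "一二三四五六七八九十"  # numerals for 1..10
SUFFIX = {"alkane": "充", "alkene": "欠", "alkyne": "少"}
FRAME = "⿰⿳一巛⿸厂"  # the part of the IDS shared by all three families


def numeral_component(n: int) -> str:
    """Return the X component for carbon count n (attested ranges only)."""
    if 1 <= n <= 10:
        return DIGITS[n - 1]
    if 11 <= n <= 19:
        return "⿱十" + DIGITS[n - 11]   # 十 stacked over 一..九
    if 21 <= n <= 29:
        return "⿱廿" + DIGITS[n - 21]   # 廿 stacked over 一..九
    if n == 60:
        return "⿱六十"                  # the single attested larger example
    raise ValueError(f"no attested numeral form for n = {n}")


def hydrocarbon_ids(family: str, n: int) -> str:
    """Build the IDS for an alkane, alkene, or alkyne with n carbons."""
    if family not in SUFFIX:
        raise ValueError(f"unknown family: {family}")
    if family != "alkane" and n < 2:
        raise ValueError("alkenes and alkynes need n >= 2")
    return FRAME + numeral_component(n) + SUFFIX[family]


if __name__ == "__main__":
    for family, n in [("alkane", 1), ("alkene", 2), ("alkyne", 2), ("alkane", 15)]:
        print(family, n, hydrocarbon_ids(family, n))
```

Because the output is a function of an unbounded integer, every gap the sketch rejects is a candidate for a future single-character submission.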

Encoding ⿰⿳一巛⿸厂X充 as a single character is open-ended because alkanes, alkenes and alkynes can have an arbitrarily large number of carbon atoms. If we accept ⿰⿳一巛⿸厂一充, should we accept ⿰⿳一巛⿸厂⿱六十充 (already in the given evidence)? Should we accept ⿰⿳一巛⿸厂⿳一百一充?

Encoding ⿴囗⿳一巛⿸厂X as a single character is also open-ended because cycloalkanes (hydrocarbon ring compounds) can likewise have an arbitrarily large number of carbon atoms, let alone ring compounds in general. If we accept ⿴囗⿳一巛⿸厂六, should we accept ⿴囗⿳一巛⿸厂七? Should we accept ⿴囗⿳一巛𫨅?

Alternative encoding solution

To encode these characters, I suggest placing them in a new Unicode block and using an encoding model different from that of Han ideographs. Take alkanes as an example: we can encode ⿰⿳一巛⿸厂◌充 as a base character, which starts a ZWJ sequence to represent different alkanes:

⿰⿳一巛⿸厂二充 --> ⿰⿳一巛⿸厂◌充 + ZWJ + 二

⿰⿳一巛⿸厂⿱六十充 --> ⿰⿳一巛⿸厂◌充 + ZWJ + 六 + ZWJ + 十
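
As a rough illustration of the sequence model suggested above, the following Python sketch assembles such a ZWJ sequence. The base character ⿰⿳一巛⿸厂◌充 is not encoded, so `BASE_ALKANE` is a hypothetical stand-in (the private-use code point U+E000 is an arbitrary choice), and `zwj_sequence` is likewise an illustrative name. Only U+200D ZERO WIDTH JOINER is an actual encoded character here.

```python
# Illustrative sketch only; the base code point is a hypothetical placeholder.
ZWJ = "\u200d"          # U+200D ZERO WIDTH JOINER
BASE_ALKANE = "\ue000"  # stand-in for the unencoded base ⿰⿳一巛⿸厂◌充


def zwj_sequence(base: str, numerals: str) -> str:
    """Return base followed by ZWJ + numeral for each numeral character."""
    return base + "".join(ZWJ + ch for ch in numerals)


def code_points(text: str) -> str:
    """Render a string as space-separated U+XXXX labels for inspection."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    print(code_points(zwj_sequence(BASE_ALKANE, "二")))    # ethane
    print(code_points(zwj_sequence(BASE_ALKANE, "六十")))  # C60H122
```

Under this model the repertoire stays fixed at one base per family plus the existing numerals, however large n becomes.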

Reference:

[1] 梁國常. 有機化學命名芻議. 北京大學月刊. 1920(7), pp. 71-89. Available online: http://read.nlc.cn/allSearch/searchDetail?searchType=all&showType=1&indexName=data_404&fid=01J000340

Attached PDF file

HUANG Junliang

Individual

2021-08-28 02:14:16 UTC

#2590

Other

COMMENT

WS2021 v1.0

[ Resolved ]

Mr. Huang’s comment #2590 is useful and reasonable. The rationale for this type of character(?) is similar to that of Chinese Jianzi musical notation. I will provide some comments as feedback on IRGN2492.

Eiso CHAN

Individual

2021-09-02 01:28:21 UTC

#2661

Other

COMMENT

WS2021 v1.0

[ Resolved ]

I agree with Mr Huang's comment that these examples of systematically constructed characters should not be encoded as individual CJK unified ideographs. If there is evidence that Liáng Guócháng's system for naming organic chemical compounds was actually used by the scientific community in China, then we can consider encoding them using some other mechanism (e.g. encoding a set of components which could be ligated together). However, if these characters are only attested in Liáng Guócháng's 1920 proposal and other works by the same author, then I do not believe that it is appropriate to encode them at all. Certainly I would be strongly opposed to defining a new Unicode block for a failed scientific orthography.

Andrew WEST

2021-09-07 23:23:23 UTC

#2796

Other

COMMENT

WS2021 v1.0

[ Resolved ]

I have presented a preliminary proposal to encode this type of character via sequences in my feedback on IRGN2492.

Eiso CHAN

Individual

2021-09-13 14:16:29 UTC

#2836

Evidence

NEW_EVIDENCE

WS2021 v1.0

[ Resolved ]

https://www.docin.com/p-1731720387.html

Eiso CHAN

Individual

2021-09-13 14:16:46 UTC

#2837

Other

COMMENT

WS2021 v2.0

[ Unresolved ]

The following is the comment on the early Chinese chemical characters from the Script Ad Hoc group in L2/22-023, which is also the recommendation for section 3 of my feedback on IRGN2492.

Eiso CHAN

Individual

2022-01-25 01:06:29 UTC

#5243

Meeting Minutes

Date	Description
IRG #57 2021-09-17 (Fri) 8:54 am +0800 Recorded by CHEN Zhuang	Postponed for further investigation

Attribute Changes

Version	Description
2.0	For 00014, change Status to Postponed
2.0	For 00014, add Discussion Record "Postponed for further discussion of encoding model, IRG 57."

Glyph Changes

Source Reference	Glyph
GKJ-00944	1.0

Raw Info

group	China (GKJ - Science and Technology Characters)
a) Source reference	GKJ-00944
b) PUA Code of TTF	E374
c) KangXi Radical Code(Primary)	1.0
d) Stroke Count(Primary)	11
e) First Stroke(Primary)	5
g) Total Stroke Count	12
i) IDS (Ideographic Description Sequence)	⿰⿳一巛⿸厂二少
j) Similar/ Variants	N/A
k) Ref. to Evidence doc	碳原子、取代基及官能团数目的中文命名演变：1908—1932
