Cang Jie Dictionary

Technically, the dictionary may be used for two particular operations:

  1. Finding the input code of a given hieroglyph when that hieroglyph can be copied and pasted; and
  2. Finding the input code of a given hieroglyph when that hieroglyph cannot be copied and pasted.

The former case is rather simple and needs little explanation, while the latter is described in more detail.

Hieroglyph Lookup (one may copy and paste)

To find the input code of a Chinese or non-Chinese hieroglyph, simply paste the copied hieroglyph into the input field. Once the user presses the Enter key, the application automatically recognizes the input and switches the radio button to Hieroglyph. Pressing Enter is required because of technical limitations that currently exist in the application.

If the sought character is in the database, the result appears almost immediately to the right of the input field. Normally only one resulting entry is shown; anything else means the user has hit a bug, which is rare, or that something unexpected has happened.
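How the application tells a pasted hieroglyph from a typed code is not documented here. The following is only a minimal sketch of one plausible rule, assuming that a code query consists solely of ASCII letters and '-' placeholders and that anything else is treated as a hieroglyph. The function name detect_mode and the returned labels are hypothetical.

```python
import re

# Assumption: a code query contains only ASCII letters and '-' placeholders.
CODE_QUERY = re.compile(r"[A-Za-z-]+")


def detect_mode(text: str) -> str:
    """Return 'code' for a letter/dash query and 'hieroglyph' for anything else."""
    query = text.strip()
    if not query:
        raise ValueError("empty input")
    return "code" if CODE_QUERY.fullmatch(query) else "hieroglyph"


if __name__ == "__main__":
    for sample in ("日", "a-b"):
        print(sample, "->", detect_mode(sample))
```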

Hieroglyph Code Lookup (one cannot copy and paste)

This kind of search is needed when the user wants to figure out a code or just check one. The input may be exact or fuzzy; in a fuzzy query, some or all of the code letters are replaced with the placeholder dash symbol '-'.

The total number of letters and dashes combined cannot exceed five, which is the length of the longest possible code in Cang Jie.
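The length rule above can be checked before a query is sent. This is a small sketch under the assumption that code letters are typed as ASCII letters; the names is_valid_query, MAX_CODE_LENGTH and PLACEHOLDER are illustrative and not part of the dictionary itself.

```python
MAX_CODE_LENGTH = 5  # longest possible Cang Jie code, as stated in the text
PLACEHOLDER = "-"


def is_valid_query(query: str) -> bool:
    """True if the query has 1 to 5 symbols, each an ASCII letter or a dash."""
    if not 1 <= len(query) <= MAX_CODE_LENGTH:
        return False
    return all(
        ch == PLACEHOLDER or (ch.isascii() and ch.isalpha())
        for ch in query
    )


if __name__ == "__main__":
    for q in ("a-b", "a----b", "-----"):
        print(repr(q), is_valid_query(q))
```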

Overusing placeholders, i.e. using more placeholders than letters, may result in poor search performance: overly long result lists can slow down look-ups significantly. For this reason, the user should be as specific as possible, narrowing the result list to those combinations that are sure to contain the sought hieroglyph. Be forewarned, too, that the dictionary API returns at most 100 hits for any kind of query.
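To illustrate how a placeholder query and the 100-hit limit interact, here is a rough sketch of a fuzzy look-up over an in-memory table. It assumes that each dash stands for exactly one unknown letter, which the text does not state explicitly. The function fuzzy_lookup and the SAMPLE table are hypothetical; the sample entries are hand-picked for illustration and should be verified against the real mapping tables.

```python
import re

MAX_HITS = 100  # output limit mentioned in the text

# Hand-picked illustrative entries (code, hieroglyph); not the real table.
SAMPLE = [
    ("a", "日"), ("b", "月"), ("ab", "明"),
    ("dd", "林"), ("aaa", "晶"), ("ddd", "森"),
]


def fuzzy_lookup(pattern, table, limit=MAX_HITS):
    """Return up to `limit` (code, hieroglyph) pairs whose code matches `pattern`.

    Assumption: every '-' matches exactly one letter.
    """
    regex = re.compile(
        "".join("[a-z]" if ch == "-" else re.escape(ch) for ch in pattern.lower())
    )
    hits = []
    for code, glyph in table:
        if len(hits) >= limit:
            break  # stop as soon as the hit limit is reached
        if regex.fullmatch(code):
            hits.append((code, glyph))
    return hits


if __name__ == "__main__":
    print(fuzzy_lookup("a-", SAMPLE))
    print(fuzzy_lookup("--", SAMPLE))
```

A query made only of dashes matches every code of that length, which is why such queries run into the hit limit quickly on a table with thousands of entries.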

The best way to use a fuzzy search is to put placeholders only where the user is unsure about the position of a letter in the code or about the number of letters in the code, or has no clue how a given hieroglyph may be typed at all. In almost any situation, however, look-ups tend to be more successful when the first and the last letters are provided, because this narrows the search tremendously.
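The narrowing effect of fixing the first and last letters can be estimated with simple counting. The sketch below assumes a 26-letter alphabet and one letter per dash; the actual Cang Jie key set may be smaller, so the figures are only an upper bound on the combinations a pattern can stand for, not the number of hits the dictionary would return. The name max_candidates is hypothetical.

```python
ALPHABET_SIZE = 26  # assumption; the real Cang Jie key set may be smaller


def max_candidates(pattern: str, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Upper bound on distinct codes a pattern can match (one letter per dash)."""
    return alphabet_size ** pattern.count("-")


if __name__ == "__main__":
    print(max_candidates("-----"))  # all five positions unknown: 11881376
    print(max_candidates("a---b"))  # first and last letters known: 17576
```

Under these assumptions, supplying just the first and last letters of a five-letter code shrinks the space of possible combinations by a factor of 676.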

Information About the Dictionary

The total number of hieroglyphs that can be covered by Cang Jie methods depends largely on the implementation and on the operating system it is used on. The open-source version of the mapping tables, which is used here, contains more than 29k entries.

To get a picture of the dictionary's size (29,178 code mappings) and structure, see the breakdown below:

Number of letters in the code    Total number of hieroglyphs
1: 日                            24
2: 日月                          404
3: 日月金                        3,365
4: 日月金木                      14,180
5: 日月金木水                    11,205
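As a quick consistency check, the per-length counts above add up to the stated total of 29,178 code mappings. The short sketch below only reproduces that arithmetic; the variable names are illustrative.

```python
# Counts per code length, copied from the breakdown above.
COUNTS_BY_LENGTH = {1: 24, 2: 404, 3: 3365, 4: 14180, 5: 11205}

total = sum(COUNTS_BY_LENGTH.values())

if __name__ == "__main__":
    print("total mappings:", total)
    for length, count in COUNTS_BY_LENGTH.items():
        # Share of the whole dictionary taken by codes of this length.
        print(f"{length}-letter codes: {count} ({count / total:.1%})")
```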

Unicode is known to be making some effort toward sorting by Cang Jie codes as well as by Chinese radicals, but for now this is of little help to those who wish to master the method.

Presumably, it would also be helpful to know how complex each hieroglyph is in relation to its Cang Jie input code, but such information is not available yet. To gain some understanding of the method, visit the online Cang Jie tutor application.