annotate src/coding.c @ 89665:9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
(CODING_GET_INFO): Delete argument eol_type. Callers changed.
(decode_coding_utf_8): Don't do eol conversion.
(detect_coding_utf_16): Check coding->src_chars, not
coding->src_bytes. Add heuristics for those that have no
signature.
(decode_coding_emacs_mule): Don't do eol conversion.
(decode_coding_iso_2022): Likewise.
(decode_coding_sjis): Likewise.
(decode_coding_big5): Likewise.
(decode_coding_charset): Likewise.
(adjust_coding_eol_type): Return a new coding system.
(detect_coding): Don't detect eol. Fix for utf-16 detection.
(decode_eol): In case of CRLF->LF conversion, use del_range_2 on
each change.
(decode_coding): Pay attention to undo_list. Do eol conversion for
all types of coding-systems (if necessary).
(Vcode_conversion_work_buf_list): Delete it.
(Vcode_conversion_reused_workbuf): Renamed from
Vcode_conversion_reused_work_buf.
(Vcode_conversion_workbuf_name): New variable.
(reused_workbuf_in_use): New variable.
(make_conversion_work_buffer): Delete the arg DEPTH.
(code_conversion_restore): Argument changed to cons.
(code_conversion_save): Delete the argument BUFFER. Callers
changed.
(detect_coding_system): New argument src_chars. Callers changed.
Fix for utf-16 detection.
(init_coding_once): Don't use ISO_carriage_return.
(syms_of_coding): Initialize Vcode_conversion_workbuf_name and
reused_workbuf_in_use.
| author | Kenichi Handa <handa@m17n.org> |
|---|---|
| date | Tue, 02 Dec 2003 01:40:27 +0000 |
| parents | cbaa9fd1aa5c |
| children | cf1ff36f92dc |
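
The log above takes end-of-line conversion out of the individual decoders and handles it as a separate step for all coding system types. As a rough illustration of what such a separate pass involves, here is a minimal sketch that works on a plain byte array rather than on an Emacs buffer. The names `eol_format`, `detect_eol`, and `convert_eol_to_lf` are hypothetical and do not appear in coding.c; the real `decode_eol` operates on buffer text (the log mentions `del_range_2`) and is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical names for illustration only; not part of coding.c.  */
enum eol_format { EOL_NONE, EOL_LF, EOL_CRLF, EOL_CR };

/* Report the format of the first line ending found in SRC,
   or EOL_NONE if SRC contains no line ending at all.  */
static enum eol_format
detect_eol (const unsigned char *src, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    {
      if (src[i] == '\n')
        return EOL_LF;
      if (src[i] == '\r')
        return (i + 1 < len && src[i + 1] == '\n') ? EOL_CRLF : EOL_CR;
    }
  return EOL_NONE;
}

/* Convert line endings of format FMT in BUF to LF, in place.
   Return the new length, which never exceeds LEN.  */
static size_t
convert_eol_to_lf (unsigned char *buf, size_t len, enum eol_format fmt)
{
  size_t r, w = 0;

  for (r = 0; r < len; r++)
    {
      unsigned char c = buf[r];

      if (c == '\r')
        {
          if (fmt == EOL_CR)
            c = '\n';           /* CR alone marks the end of a line.  */
          else if (fmt == EOL_CRLF && r + 1 < len && buf[r + 1] == '\n')
            continue;           /* Drop CR; the LF is copied next.  */
        }
      buf[w++] = c;
    }
  return w;
}
```

For the six bytes `a\r\nb\r\n`, `detect_eol` reports `EOL_CRLF`, and `convert_eol_to_lf` leaves the four bytes `a\nb\n` in the buffer. Because the character decoding and the end-of-line pass are independent, the same pass can follow any decoder.
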
| rev | line source |
|---|---|
| 88936 | 1 /* Coding system handler (conversion, detection, etc). |
| 20708 | 2 Copyright (C) 1995, 1997, 1998 Electrotechnical Laboratory, JAPAN. |
| 89483 | 3 Licensed to the Free Software Foundation. |
| 88862 | 4 Copyright (C) 2001, 2002 Free Software Foundation, Inc. |
| 89483 | 5 Copyright (C) 2003 |
| 88365 | 6 National Institute of Advanced Industrial Science and Technology (AIST) |
| 7 Registration Number H13PRO009 | |
| 17052 | 8 |
| 17071 | 9 This file is part of GNU Emacs. |
| 10 | |
| 11 GNU Emacs is free software; you can redistribute it and/or modify | |
| 12 it under the terms of the GNU General Public License as published by | |
| 13 the Free Software Foundation; either version 2, or (at your option) | |
| 14 any later version. | |
| 15 | |
| 16 GNU Emacs is distributed in the hope that it will be useful, | |
| 17 but WITHOUT ANY WARRANTY; without even the implied warranty of | |
| 18 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
| 19 GNU General Public License for more details. | |
| 20 | |
| 21 You should have received a copy of the GNU General Public License | |
| 22 along with GNU Emacs; see the file COPYING. If not, write to | |
| 23 the Free Software Foundation, Inc., 59 Temple Place - Suite 330, | |
| 24 Boston, MA 02111-1307, USA. */ | |
| 17052 | 25 |
| 26 /*** TABLE OF CONTENTS *** | |
| 27 | |
| 29005 | 28 0. General comments |
| 17052 | 29 1. Preamble |
| 88365 | 30 2. Emacs' internal format (emacs-utf-8) handlers |
| 31 3. UTF-8 handlers | |
| 32 4. UTF-16 handlers | |
| 33 5. Charset-base coding systems handlers | |
| 34 6. emacs-mule (old Emacs' internal format) handlers | |
| 35 7. ISO2022 handlers | |
| 36 8. Shift-JIS and BIG5 handlers | |
| 37 9. CCL handlers | |
| 38 10. C library functions | |
| 39 11. Emacs Lisp library functions | |
| 40 12. Postamble | |
| 17052 | 41 |
| 42 */ | |
| 43 | |
| 88365 | 44 /*** 0. General comments *** |
| 45 | |
| 46 | |
| 47 CODING SYSTEM | |
| 48 | |
| 88485 | 49 A coding system is an object for an encoding mechanism that contains |
| 50 information about how to convert byte sequences to character | |
| 88438 | 51 sequences and vice versa. When we say "decode", it means converting |
| 52 a byte sequence of a specific coding system into a character | |
| 53 sequence that is represented by Emacs' internal coding system | |
| 54 `emacs-utf-8', and when we say "encode", it means converting a | |
| 55 character sequence of emacs-utf-8 to a byte sequence of a specific | |
| 56 coding system. | |
| 57 | |
| 58 In Emacs Lisp, a coding system is represented by a Lisp symbol. At | |
| 59 the C level, a coding system is represented by a vector of attributes | |
| 88485 | 60 stored in the hash table Vcoding_system_hash_table. The conversion from |
| 88438 | 61 coding system symbol to attributes vector is done by looking up |
| 62 Vcoding_system_hash_table by the symbol. | |
| 63 | |
| 64 Coding systems are classified into the following types depending on | |
| 88485 | 65 the encoding mechanism. Here's a brief description of the types. |
| 88365 | 66 |
| 67 o UTF-8 | |
| 68 | |
| 69 o UTF-16 | |
| 70 | |
| 71 o Charset-base coding system | |
| 72 | |
| 73 A coding system defined by one or more (coded) character sets. | |
| 88485 | 74 Decoding and encoding are done by a code converter defined for each |
| 88365 | 75 character set. |
| 76 | |
| 88485 | 77 o Old Emacs internal format (emacs-mule) |
| 78 | |
| 79 The coding system adopted by old versions of Emacs (20 and 21). | |
| 88365 | 80 |
| 81 o ISO2022-base coding system | |
| 17052 | 82 |
| 83 The most famous coding system for multiple character sets. X's | |
| 88365 | 84 Compound Text, various EUCs (Extended Unix Code), and coding systems |
| 85 used in Internet communication such as ISO-2022-JP are all | |
| 86 variants of ISO2022. | |
| 87 | |
| 88 o SJIS (or Shift-JIS or MS-Kanji-Code) | |
| 42104 | 89 |
| 17052 | 90 A coding system to encode character sets: ASCII, JISX0201, and |
| 91 JISX0208. Widely used on PCs in Japan. Details are described in | |
| 88365 | 92 section 8. |
| 93 | |
| 94 o BIG5 | |
| 95 | |
| 96 A coding system to encode character sets: ASCII and Big5. Widely | |
| 88771 | 97 used for Chinese (mainly in Taiwan and Hong Kong). Details are |
| 88365 | 98 described in section 8. In this file, when we write "big5" (all |
| 99 lowercase), we mean the coding system, and when we write "Big5" | |
| 100 (capitalized), we mean the character set. | |
| 101 | |
| 102 o CCL | |
| 103 | |
| 88485 | 104 If a user wants to decode/encode text encoded in a coding system |
| 88365 | 105 not listed above, he can supply a decoder and an encoder for it in |
| 106 CCL (Code Conversion Language) programs. Emacs executes the CCL | |
| 107 program while decoding/encoding. | |
| 108 | |
| 109 o Raw-text | |
| 110 | |
| 88771 | 111 A coding system for text containing raw eight-bit data. Emacs |
| 88485 | 112 treats each byte of source text as a character (except for |
| 88365 | 113 end-of-line conversion). |
| 114 | |
| 115 o No-conversion | |
| 116 | |
| 117 Like raw-text, but without end-of-line conversion. | |
| 118 | |
| 119 | |
| 120 END-OF-LINE FORMAT | |
| 121 | |
| 88485 | 122 How text end-of-line is encoded depends on the operating system. For |
| 88365 | 123 instance, Unix's format is just one byte of LF (line-feed) code, |
| 18766 | 124 whereas DOS's format is a two-byte sequence of `carriage-return' and |
| 20718 | 125 `line-feed' codes. MacOS's format is usually one byte of |
| 126 `carriage-return'. | |
| 17052 | 127 |
| 88485 | 128 Since text character encoding and end-of-line encoding are |
| 88365 | 129 independent, any coding system described above can take any format |
| 130 of end-of-line (except for no-conversion). | |
| 17052 | 131 |
| 88438 | 132 STRUCT CODING_SYSTEM |
| 133 | |
| 134 Before using a coding system for code conversion (i.e. decoding and | |
| 135 encoding), we set up a structure of type `struct coding_system'. | |
| 136 This structure keeps various information about a specific code | |
| 88485 | 137 conversion (e.g. the location of source and destination data). |
| 88438 | 138 |
| 17052 | 139 */ |
| 140 | |
| 88365 | 141 /* COMMON MACROS */ |
| 142 | |
| 143 | |
| 17052 | 144 /*** GENERAL NOTES on `detect_coding_XXX ()' functions *** |
| 145 | |
| 88365 | 146 These functions check if a byte sequence specified as a source in |
| 89331 | 147 CODING conforms to the format of XXX, and update the members of |
| 148 DETECT_INFO. | |
| 149 | |
| 150 Return 1 if the byte sequence conforms to XXX, otherwise return 0. | |
| 88365 | 151 |
| 152 Below is the template of these functions. */ | |
| 153 | |
| 17052 | 154 #if 0 |
| 88365 | 155 static int |
| 89331 | 156 detect_coding_XXX (coding, detect_info) |
| 88365 | 157 struct coding_system *coding; |
| 89331 | 158 struct coding_detection_info *detect_info; |
| 17052 | 159 { |
| 88365 | 160 unsigned char *src = coding->source; |
| 161 unsigned char *src_end = coding->source + coding->src_bytes; | |
| 162 int multibytep = coding->src_multibyte; | |
| 89331 | 163 int consumed_chars = 0; |
| 88365 | 164 int found = 0; |
| 165 ...; | |
| 166 | |
| 167 while (1) | |
| 168 { | |
| 169 /* Get one byte from the source. If the source is exhausted, jump | |
| 170 to no_more_source:. */ | |
| 171 ONE_MORE_BYTE (c); | |
| 89331 | 172 |
| 173 if (! __C_conforms_to_XXX___ (c)) | |
| 174 break; | |
| 175 if (__C_strongly_suggests_XXX__ (c)) | |
| 176 found = CATEGORY_MASK_XXX; | |
| 177 } | |
| 178 /* The byte sequence is invalid for XXX. */ | |
| 179 detect_info->rejected |= CATEGORY_MASK_XXX; | |
| 88365 | 180 return 0; |
| 89331 | 181 |
| 88365 | 182 no_more_source: |
| 89331 | 183 /* The source was exhausted successfully. */ |
| 184 detect_info->found |= found; | |
| 88365 | 185 return 1; |
| 17052 | 186 } |
| 187 #endif | |
| 188 | |
| 189 /*** GENERAL NOTES on `decode_coding_XXX ()' functions *** | |
| 190 | |
| 88365 | 191 These functions decode a byte sequence specified as a source by |
| 192 CODING. The resulting multibyte text goes to a place pointed to by | |
| 193 CODING->charbuf, the length of which should not exceed | |
| 194 CODING->charbuf_size. | |
| 195 | |
| 196 These functions set the information of original and decoded texts in | |
| 197 CODING->consumed, CODING->consumed_char, and CODING->charbuf_used. | |
| 198 They also set CODING->result to one of CODING_RESULT_XXX indicating | |
| 199 how the decoding is finished. | |
| 200 | |
| 201 Below is the template of these functions. */ | |
| 202 | |
| 17052 | 203 #if 0 |
| 29005 | 204 static void |
| 88365 | 205 decode_coding_XXX (coding) |
| 17052 | 206 struct coding_system *coding; |
| 207 { | |
| 88365 | 208 unsigned char *src = coding->source + coding->consumed; |
| 209 unsigned char *src_end = coding->source + coding->src_bytes; | |
| 210 /* SRC_BASE remembers the start position in source in each loop. | |
| 211 The loop will be exited when there's not enough source code, or | |
| 212 when there's no room in CHARBUF for a decoded character. */ | |
| 213 unsigned char *src_base; | |
| 214 /* A buffer to produce decoded characters. */ | |
| 215 int *charbuf = coding->charbuf; | |
| 216 int *charbuf_end = charbuf + coding->charbuf_size; | |
| 217 int multibytep = coding->src_multibyte; | |
| 218 | |
| 219 while (1) | |
| 220 { | |
| 221 src_base = src; | |
| 222 if (charbuf >= charbuf_end) | |
| 223 /* No more room to produce a decoded character. */ | |
| 224 break; | |
| 225 ONE_MORE_BYTE (c); | |
| 226 /* Decode it. */ | |
| 227 } | |
| 228 | |
| 229 no_more_source: | |
| 230 if (src_base < src_end | |
| 231 && coding->mode & CODING_MODE_LAST_BLOCK) | |
| 232 /* If the source ends by partial bytes to construct a character, | |
| 233 treat them as eight-bit raw data. */ | |
| 234 while (src_base < src_end && charbuf < charbuf_end) | |
| 235 *charbuf++ = *src_base++; | |
| 236 /* Remember how many bytes and characters we consumed. If the | |
| 237 source is multibyte, the bytes and chars are not identical. */ | |
| 238 coding->consumed = coding->consumed_char = src_base - coding->source; | |
| 239 /* Remember how many characters we produced. */ | |
| 240 coding->charbuf_used = charbuf - coding->charbuf; | |
| 17052 | 241 } |
| 242 #endif | |
| 243 | |
| 244 /*** GENERAL NOTES on `encode_coding_XXX ()' functions *** | |
| 245 | |
| 88365 | 246 These functions encode SRC_BYTES length text at SOURCE of Emacs' |
| 247 internal multibyte format by CODING. The resulting byte sequence | |
| 29005 | 248 goes to a place pointed to by DESTINATION, the length of which |
| 249 should not exceed DST_BYTES. | |
| 250 | |
| 88365 | 251 These functions set the information of original and encoded texts in |
| 252 the members produced, produced_char, consumed, and consumed_char of | |
| 253 the structure *CODING. They also set the member result to one of | |
| 254 CODING_RESULT_XXX indicating how the encoding finished. | |
| 255 | |
| 256 DST_BYTES zero means that source area and destination area are | |
| 257 overlapped, which means that we can produce an encoded text until it | |
| 258 reaches the head of not-yet-encoded source text. | |
| 259 | |
| 260 Below is a template of these functions. */ | |
| 17052 | 261 #if 0 |
| 29005 | 262 static void |
| 88365 | 263 encode_coding_XXX (coding) |
| 17052 | 264 struct coding_system *coding; |
| 265 { | |
| 88365 | 266 int multibytep = coding->dst_multibyte; |
| 267 int *charbuf = coding->charbuf; | |
| 268 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 269 unsigned char *dst = coding->destination + coding->produced; | |
| 270 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 271 unsigned char *adjusted_dst_end = dst_end - _MAX_BYTES_PRODUCED_IN_LOOP_; | |
| 272 int produced_chars = 0; | |
| 273 | |
| 274 for (; charbuf < charbuf_end && dst < adjusted_dst_end; charbuf++) | |
| 275 { | |
| 276 int c = *charbuf; | |
| 277 /* Encode C into DST, and increment DST. */ | |
| 278 } | |
| 279 label_no_more_destination: | |
| 280 /* How many chars and bytes we produced. */ | |
| 281 coding->produced_char += produced_chars; | |
| 282 coding->produced = dst - coding->destination; | |
| 17052 | 283 } |
| 284 #endif | |
| 285 | |
| 286 | |
| 287 /*** 1. Preamble ***/ | |
| 288 | |
| 26088 | 289 #include <config.h> |
| 17052 | 290 #include <stdio.h> |
| 291 | |
| 292 #include "lisp.h" | |
| 293 #include "buffer.h" | |
| 88365 | 294 #include "character.h" |
| 17052 | 295 #include "charset.h" |
| 88365 | 296 #include "ccl.h" |
| 26847 | 297 #include "composite.h" |
| 17052 | 298 #include "coding.h" |
| 299 #include "window.h" | |
| 300 | |
| 88365 | 301 Lisp_Object Vcoding_system_hash_table; |
| 302 | |
| 303 Lisp_Object Qcoding_system, Qcoding_aliases, Qeol_type; | |
| 88646 | 304 Lisp_Object Qunix, Qdos; |
| 305 extern Lisp_Object Qmac; /* frame.c */ | |
| 17052 | 306 Lisp_Object Qbuffer_file_coding_system; |
| 307 Lisp_Object Qpost_read_conversion, Qpre_write_conversion; | |
| 88365 | 308 Lisp_Object Qdefault_char; |
| 19612 | 309 Lisp_Object Qno_conversion, Qundecided; |
| 88365 | 310 Lisp_Object Qcharset, Qiso_2022, Qutf_8, Qutf_16, Qshift_jis, Qbig5; |
| 89420 | 311 Lisp_Object Qbig, Qlittle; |
| 19750 | 312 Lisp_Object Qcoding_system_history; |
| 22874 | 313 Lisp_Object Qvalid_codes; |
| 89468 | 314 Lisp_Object QCcategory; |
| 17052 | 315 |
| 316 extern Lisp_Object Qinsert_file_contents, Qwrite_region; | |
| 317 Lisp_Object Qcall_process, Qcall_process_region, Qprocess_argument; | |
| 318 Lisp_Object Qstart_process, Qopen_network_stream; | |
| 319 Lisp_Object Qtarget_idx; | |
| 320 | |
| 89483 | 321 int coding_system_require_warning; |
| 322 | |
| 20718 | 323 Lisp_Object Vselect_safe_coding_system_function; |
| 324 | |
| 24200 | 325 /* Mnemonic string for each format of end-of-line. */ |
| 326 Lisp_Object eol_mnemonic_unix, eol_mnemonic_dos, eol_mnemonic_mac; | |
| 327 /* Mnemonic string to indicate format of end-of-line is not yet | |
| 17052 | 328 decided. */ |
| 24200 | 329 Lisp_Object eol_mnemonic_undecided; |
| 17052 | 330 |
| 331 #ifdef emacs | |
| 332 | |
| 20105 | 333 Lisp_Object Vcoding_system_list, Vcoding_system_alist; |
| 334 | |
| 335 Lisp_Object Qcoding_system_p, Qcoding_system_error; | |
| 17052 | 336 |
| 20718 | 337 /* Coding system emacs-mule and raw-text are for converting only |
| 338 end-of-line format. */ | |
| 339 Lisp_Object Qemacs_mule, Qraw_text; | |
| 89483 | 340 Lisp_Object Qutf_8_emacs; |
| 18650 | 341 |
| 17052 | 342 /* Coding-systems are handed between Emacs Lisp programs and C internal |
| 343 routines by the following three variables. */ | |
| 344 /* Coding-system for reading files and receiving data from process. */ | |
| 345 Lisp_Object Vcoding_system_for_read; | |
| 346 /* Coding-system for writing files and sending data to process. */ | |
| 347 Lisp_Object Vcoding_system_for_write; | |
| 348 /* Coding-system actually used in the latest I/O. */ | |
| 349 Lisp_Object Vlast_coding_system_used; | |
| 350 | |
| 19280 | 351 /* A vector of length 256 which contains information about special |
| 22529 | 352 Latin codes (especially for dealing with Microsoft codes). */ |
| 19365 | 353 Lisp_Object Vlatin_extra_code_table; |
| 19280 | 354 |
| 18650 | 355 /* Flag to inhibit code conversion of end-of-line format. */ |
| 356 int inhibit_eol_conversion; | |
| 357 | |
| 30204 | 358 /* Flag to inhibit ISO2022 escape sequence detection. */ |
| 359 int inhibit_iso_escape_detection; | |
| 360 | |
| 21574 | 361 /* Flag to make buffer-file-coding-system inherit from process-coding. */ |
| 362 int inherit_process_coding_system; | |
| 363 | |
| 19280 | 364 /* Coding system to be used to encode text for terminal display. */ |
| 17052 | 365 struct coding_system terminal_coding; |
| 366 | |
| 19280 | 367 /* Coding system to be used to encode text for terminal display when |
| 368 terminal coding system is nil. */ | |
| 369 struct coding_system safe_terminal_coding; | |
| 370 | |
| 371 /* Coding system of what is sent from terminal keyboard. */ | |
| 17052 | 372 struct coding_system keyboard_coding; |
| 373 | |
| 18180 | 374 Lisp_Object Vfile_coding_system_alist; |
| 375 Lisp_Object Vprocess_coding_system_alist; | |
| 376 Lisp_Object Vnetwork_coding_system_alist; | |
| 17052 | 377 |
| 26088 | 378 Lisp_Object Vlocale_coding_system; |
| 379 | |
| 17052 | 380 #endif /* emacs */ |
| 381 | |
| 22186 | 382 /* Flag to tell if we look up translation table on character code |
| 383 conversion. */ | |
| 22119 | 384 Lisp_Object Venable_character_translation; |
| 22186 | 385 /* Standard translation table to look up on decoding (reading). */ |
| 386 Lisp_Object Vstandard_translation_table_for_decode; | |
| 387 /* Standard translation table to look up on encoding (writing). */ | |
| 388 Lisp_Object Vstandard_translation_table_for_encode; | |
| 389 | |
| 390 Lisp_Object Qtranslation_table; | |
| 391 Lisp_Object Qtranslation_table_id; | |
| 392 Lisp_Object Qtranslation_table_for_decode; | |
| 393 Lisp_Object Qtranslation_table_for_encode; | |
| 17052 | 394 |
| 395 /* Alist of charsets vs revision number. */ | |
| 88365 | 396 static Lisp_Object Vcharset_revision_table; |
| 17052 | 397 |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
398 /* Default coding systems used for process I/O. */ |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
399 Lisp_Object Vdefault_process_coding_system; |
400 |
|
48182
9474e269efd1
Reformat some DEFUNs so that etags works.
Dave Love <fx@gnu.org>
parents:
48125
diff
changeset
|
401 /* Char table for translating Quail and self-inserting input. */ |
402 Lisp_Object Vtranslation_table_for_input; |
403 |
| 88365 | 404 /* Two special coding systems. */ |
| 405 Lisp_Object Vsjis_coding_system; | |
| 406 Lisp_Object Vbig5_coding_system; | |
| 407 | |
| 408 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
409 static int detect_coding_utf_8 P_ ((struct coding_system *, |
410 struct coding_detection_info *info)); |
| 88365 | 411 static void decode_coding_utf_8 P_ ((struct coding_system *)); |
| 412 static int encode_coding_utf_8 P_ ((struct coding_system *)); | |
| 413 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
414 static int detect_coding_utf_16 P_ ((struct coding_system *, |
415 struct coding_detection_info *info)); |
| 88365 | 416 static void decode_coding_utf_16 P_ ((struct coding_system *)); |
| 417 static int encode_coding_utf_16 P_ ((struct coding_system *)); | |
| 418 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
419 static int detect_coding_iso_2022 P_ ((struct coding_system *, |
420 struct coding_detection_info *info)); |
| 88365 | 421 static void decode_coding_iso_2022 P_ ((struct coding_system *)); |
| 422 static int encode_coding_iso_2022 P_ ((struct coding_system *)); | |
| 423 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
424 static int detect_coding_emacs_mule P_ ((struct coding_system *, |
425 struct coding_detection_info *info)); |
| 88365 | 426 static void decode_coding_emacs_mule P_ ((struct coding_system *)); |
| 427 static int encode_coding_emacs_mule P_ ((struct coding_system *)); | |
| 428 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
429 static int detect_coding_sjis P_ ((struct coding_system *, |
430 struct coding_detection_info *info)); |
| 88365 | 431 static void decode_coding_sjis P_ ((struct coding_system *)); |
| 432 static int encode_coding_sjis P_ ((struct coding_system *)); | |
| 433 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
434 static int detect_coding_big5 P_ ((struct coding_system *, |
435 struct coding_detection_info *info)); |
| 88365 | 436 static void decode_coding_big5 P_ ((struct coding_system *)); |
| 437 static int encode_coding_big5 P_ ((struct coding_system *)); | |
| 438 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
439 static int detect_coding_ccl P_ ((struct coding_system *, |
440 struct coding_detection_info *info)); |
| 88365 | 441 static void decode_coding_ccl P_ ((struct coding_system *)); |
| 442 static int encode_coding_ccl P_ ((struct coding_system *)); | |
| 443 | |
| 444 static void decode_coding_raw_text P_ ((struct coding_system *)); | |
| 445 static int encode_coding_raw_text P_ ((struct coding_system *)); | |
| 446 | |
| 447 | |
| 448 /* ISO2022 section */ | |
| 449 | |
| 450 #define CODING_ISO_INITIAL(coding, reg) \ | |
| 451 (XINT (AREF (AREF (CODING_ID_ATTRS ((coding)->id), \ | |
| 452 coding_attr_iso_initial), \ | |
| 453 reg))) | |
| 454 | |
| 455 | |
| 456 #define CODING_ISO_REQUEST(coding, charset_id) \ | |
| 457 ((charset_id <= (coding)->max_charset_id \ | |
| 458 ? (coding)->safe_charsets[charset_id] \ | |
| 459 : -1)) | |
| 460 | |
| 461 | |
| 462 #define CODING_ISO_FLAGS(coding) \ | |
| 463 ((coding)->spec.iso_2022.flags) | |
| 464 #define CODING_ISO_DESIGNATION(coding, reg) \ | |
| 465 ((coding)->spec.iso_2022.current_designation[reg]) | |
| 466 #define CODING_ISO_INVOCATION(coding, plane) \ | |
| 467 ((coding)->spec.iso_2022.current_invocation[plane]) | |
| 468 #define CODING_ISO_SINGLE_SHIFTING(coding) \ | |
| 469 ((coding)->spec.iso_2022.single_shifting) | |
| 470 #define CODING_ISO_BOL(coding) \ | |
| 471 ((coding)->spec.iso_2022.bol) | |
| 472 #define CODING_ISO_INVOKED_CHARSET(coding, plane) \ | |
| 473 CODING_ISO_DESIGNATION ((coding), CODING_ISO_INVOCATION ((coding), (plane))) | |
| 474 | |
| 475 /* Control characters of ISO2022. */ | |
| 476 /* code */ /* function */ | |
| 477 #define ISO_CODE_LF 0x0A /* line-feed */ | |
| 478 #define ISO_CODE_CR 0x0D /* carriage-return */ | |
| 479 #define ISO_CODE_SO 0x0E /* shift-out */ | |
| 480 #define ISO_CODE_SI 0x0F /* shift-in */ | |
| 481 #define ISO_CODE_SS2_7 0x19 /* single-shift-2 for 7-bit code */ | |
| 482 #define ISO_CODE_ESC 0x1B /* escape */ | |
| 483 #define ISO_CODE_SS2 0x8E /* single-shift-2 */ | |
| 484 #define ISO_CODE_SS3 0x8F /* single-shift-3 */ | |
| 485 #define ISO_CODE_CSI 0x9B /* control-sequence-introducer */ | |
| 486 | |
| 487 /* Each 1-byte code of ISO2022 is classified into one of the | |
| 488 following classes. */ | |
| 489 enum iso_code_class_type | |
| 490 { | |
| 491 ISO_control_0, /* Control codes in the range | |
| 492 0x00..0x1F and 0x7F, except for the | |
| 493 following 4 codes. */ | |
| 494 ISO_shift_out, /* ISO_CODE_SO (0x0E) */ | |
| 495 ISO_shift_in, /* ISO_CODE_SI (0x0F) */ | |
| 496 ISO_single_shift_2_7, /* ISO_CODE_SS2_7 (0x19) */ | |
| 497 ISO_escape, /* ISO_CODE_ESC (0x1B) */ | |
| 498 ISO_control_1, /* Control codes in the range | |
| 499 0x80..0x9F, except for the | |
| 500 following 3 codes. */ | |
| 501 ISO_single_shift_2, /* ISO_CODE_SS2 (0x8E) */ | |
| 502 ISO_single_shift_3, /* ISO_CODE_SS3 (0x8F) */ | |
| 503 ISO_control_sequence_introducer, /* ISO_CODE_CSI (0x9B) */ | |
| 504 ISO_0x20_or_0x7F, /* Codes of the values 0x20 or 0x7F. */ | |
| 505 ISO_graphic_plane_0, /* Graphic codes in the range 0x21..0x7E. */ | |
| 506 ISO_0xA0_or_0xFF, /* Codes of the values 0xA0 or 0xFF. */ | |
| 507 ISO_graphic_plane_1 /* Graphic codes in the range 0xA1..0xFE. */ | |
| 508 }; | |
| 509 | |
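The classification described by the enum above can be sketched as a standalone function. This is only an illustration derived from the ranges given in the comments: the names `enum iso_class` and `classify_iso_byte` are hypothetical and do not appear in this file. The comments list 0x7F under both ISO_control_0 and ISO_0x20_or_0x7F; the sketch assumes the latter takes precedence.

```c
/* Hypothetical mirror of the classes listed in enum iso_code_class_type.  */
enum iso_class
{
  CLASS_CONTROL_0, CLASS_SHIFT_OUT, CLASS_SHIFT_IN, CLASS_SS2_7,
  CLASS_ESCAPE, CLASS_CONTROL_1, CLASS_SS2, CLASS_SS3, CLASS_CSI,
  CLASS_0x20_OR_0x7F, CLASS_GRAPHIC_0, CLASS_0xA0_OR_0xFF, CLASS_GRAPHIC_1
};

/* Classify one byte C.  The special codes are tested first, then the
   remaining ranges.  Assumption: 0x7F is treated as CLASS_0x20_OR_0x7F.  */
static enum iso_class
classify_iso_byte (unsigned char c)
{
  switch (c)
    {
    case 0x0E: return CLASS_SHIFT_OUT;      /* ISO_CODE_SO */
    case 0x0F: return CLASS_SHIFT_IN;       /* ISO_CODE_SI */
    case 0x19: return CLASS_SS2_7;          /* ISO_CODE_SS2_7 */
    case 0x1B: return CLASS_ESCAPE;         /* ISO_CODE_ESC */
    case 0x8E: return CLASS_SS2;            /* ISO_CODE_SS2 */
    case 0x8F: return CLASS_SS3;            /* ISO_CODE_SS3 */
    case 0x9B: return CLASS_CSI;            /* ISO_CODE_CSI */
    case 0x20: case 0x7F: return CLASS_0x20_OR_0x7F;
    case 0xA0: case 0xFF: return CLASS_0xA0_OR_0xFF;
    }
  if (c < 0x20)
    return CLASS_CONTROL_0;                 /* rest of 0x00..0x1F */
  if (c < 0x7F)
    return CLASS_GRAPHIC_0;                 /* 0x21..0x7E */
  if (c < 0xA0)
    return CLASS_CONTROL_1;                 /* rest of 0x80..0x9F */
  return CLASS_GRAPHIC_1;                   /* 0xA1..0xFE */
}
```

A table indexed by byte value would serve equally well when the lookup runs once per input byte; the function form is used here only to keep the ranges visible.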
| 510 /** Each macro CODING_ISO_FLAG_XXX defines a flag bit of the | |
| 511 `iso-flags' attribute of an iso2022 coding system. */ | |
| 512 | |
| 513 /* If set, produce long-form designation sequence (e.g. ESC $ ( A) | |
| 514 instead of the correct short-form sequence (e.g. ESC $ A). */ | |
| 515 #define CODING_ISO_FLAG_LONG_FORM 0x0001 | |
| 516 | |
| 517 /* If set, reset graphic planes and registers at end-of-line to the | |
| 518 initial state. */ | |
| 519 #define CODING_ISO_FLAG_RESET_AT_EOL 0x0002 | |
| 520 | |
| 521 /* If set, reset graphic planes and registers before any control | |
| 522 characters to the initial state. */ | |
| 523 #define CODING_ISO_FLAG_RESET_AT_CNTL 0x0004 | |
| 524 | |
| 525 /* If set, encode by 7-bit environment. */ | |
| 526 #define CODING_ISO_FLAG_SEVEN_BITS 0x0008 | |
| 527 | |
| 528 /* If set, use locking-shift function. */ | |
| 529 #define CODING_ISO_FLAG_LOCKING_SHIFT 0x0010 | |
| 530 | |
| 531 /* If set, use single-shift function. Overrides | |
| 532 CODING_ISO_FLAG_LOCKING_SHIFT. */ | |
| 533 #define CODING_ISO_FLAG_SINGLE_SHIFT 0x0020 | |
| 534 | |
| 535 /* If set, use designation escape sequence. */ | |
| 536 #define CODING_ISO_FLAG_DESIGNATION 0x0040 | |
| 537 | |
| 538 /* If set, produce revision number sequence. */ | |
| 539 #define CODING_ISO_FLAG_REVISION 0x0080 | |
| 540 | |
| 541 /* If set, produce ISO6429's direction specifying sequence. */ | |
| 542 #define CODING_ISO_FLAG_DIRECTION 0x0100 | |
| 543 | |
| 544 /* If set, assume designation states are reset at beginning of line on | |
| 545 output. */ | |
| 546 #define CODING_ISO_FLAG_INIT_AT_BOL 0x0200 | |
| 547 | |
| 548 /* If set, designation sequence should be placed at beginning of line | |
| 549 on output. */ | |
| 550 #define CODING_ISO_FLAG_DESIGNATE_AT_BOL 0x0400 | |
| 551 | |
| 552 /* If set, do not encode unsafe characters on output. */ | |
| 553 #define CODING_ISO_FLAG_SAFE 0x0800 | |
| 554 | |
| 555 /* If set, extra latin codes (128..159) are accepted as a valid code | |
| 556 on input. */ | |
| 557 #define CODING_ISO_FLAG_LATIN_EXTRA 0x1000 | |
| 558 | |
| 559 #define CODING_ISO_FLAG_COMPOSITION 0x2000 | |
| 560 | |
| 561 #define CODING_ISO_FLAG_EUC_TW_SHIFT 0x4000 | |
| 562 | |
|
88681
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
563 #define CODING_ISO_FLAG_USE_ROMAN 0x8000 |
564 |
565 #define CODING_ISO_FLAG_USE_OLDJIS 0x10000 |
566 |
567 #define CODING_ISO_FLAG_FULL_SUPPORT 0x100000 |
| 88365 | 568 |
| 569 /* A character to be produced on output if encoding of the original | |
| 570 character is prohibited by CODING_ISO_FLAG_SAFE. */ | |
| 571 #define CODING_INHIBIT_CHARACTER_SUBSTITUTION '?' | |
| 572 | |
| 573 | |
| 574 /* UTF-16 section */ | |
| 575 #define CODING_UTF_16_BOM(coding) \ | |
| 576 ((coding)->spec.utf_16.bom) | |
| 577 | |
| 578 #define CODING_UTF_16_ENDIAN(coding) \ | |
| 579 ((coding)->spec.utf_16.endian) | |
| 580 | |
| 581 #define CODING_UTF_16_SURROGATE(coding) \ | |
| 582 ((coding)->spec.utf_16.surrogate) | |
| 583 | |
| 584 | |
| 585 /* CCL section */ | |
| 586 #define CODING_CCL_DECODER(coding) \ | |
| 587 AREF (CODING_ID_ATTRS ((coding)->id), coding_attr_ccl_decoder) | |
| 588 #define CODING_CCL_ENCODER(coding) \ | |
| 589 AREF (CODING_ID_ATTRS ((coding)->id), coding_attr_ccl_encoder) | |
| 590 #define CODING_CCL_VALIDS(coding) \ | |
| 89483 | 591 (SDATA (AREF (CODING_ID_ATTRS ((coding)->id), coding_attr_ccl_valids))) |
| 88365 | 592 |
| 88771 | 593 /* Index for each coding category in `coding_categories' */ |
| 88365 | 594 |
| 595 enum coding_category | |
| 596 { | |
| 597 coding_category_iso_7, | |
| 598 coding_category_iso_7_tight, | |
| 599 coding_category_iso_8_1, | |
| 600 coding_category_iso_8_2, | |
| 601 coding_category_iso_7_else, | |
| 602 coding_category_iso_8_else, | |
| 603 coding_category_utf_8, | |
| 604 coding_category_utf_16_auto, | |
| 605 coding_category_utf_16_be, | |
| 606 coding_category_utf_16_le, | |
| 607 coding_category_utf_16_be_nosig, | |
| 608 coding_category_utf_16_le_nosig, | |
| 609 coding_category_charset, | |
| 610 coding_category_sjis, | |
| 611 coding_category_big5, | |
| 612 coding_category_ccl, | |
| 613 coding_category_emacs_mule, | |
| 614 /* All above are targets of code detection. */ | |
| 615 coding_category_raw_text, | |
| 616 coding_category_undecided, | |
| 617 coding_category_max | |
| 618 }; | |
| 619 | |
| 620 /* Definitions of flag bits used in detect_coding_XXXX. */ | |
| 621 #define CATEGORY_MASK_ISO_7 (1 << coding_category_iso_7) | |
| 622 #define CATEGORY_MASK_ISO_7_TIGHT (1 << coding_category_iso_7_tight) | |
| 623 #define CATEGORY_MASK_ISO_8_1 (1 << coding_category_iso_8_1) | |
| 624 #define CATEGORY_MASK_ISO_8_2 (1 << coding_category_iso_8_2) | |
| 625 #define CATEGORY_MASK_ISO_7_ELSE (1 << coding_category_iso_7_else) | |
| 626 #define CATEGORY_MASK_ISO_8_ELSE (1 << coding_category_iso_8_else) | |
| 627 #define CATEGORY_MASK_UTF_8 (1 << coding_category_utf_8) | |
|
89420
c3e67ce6ee0f
(Qsignature, Qendian): Delete these variables.
Kenichi Handa <handa@m17n.org>
parents:
89418
diff
changeset
|
628 #define CATEGORY_MASK_UTF_16_AUTO (1 << coding_category_utf_16_auto) |
| 88365 | 629 #define CATEGORY_MASK_UTF_16_BE (1 << coding_category_utf_16_be) |
| 630 #define CATEGORY_MASK_UTF_16_LE (1 << coding_category_utf_16_le) | |
| 631 #define CATEGORY_MASK_UTF_16_BE_NOSIG (1 << coding_category_utf_16_be_nosig) | |
| 632 #define CATEGORY_MASK_UTF_16_LE_NOSIG (1 << coding_category_utf_16_le_nosig) | |
| 633 #define CATEGORY_MASK_CHARSET (1 << coding_category_charset) | |
| 634 #define CATEGORY_MASK_SJIS (1 << coding_category_sjis) | |
| 635 #define CATEGORY_MASK_BIG5 (1 << coding_category_big5) | |
| 636 #define CATEGORY_MASK_CCL (1 << coding_category_ccl) | |
| 637 #define CATEGORY_MASK_EMACS_MULE (1 << coding_category_emacs_mule) | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
638 #define CATEGORY_MASK_RAW_TEXT (1 << coding_category_raw_text) |
| 88365 | 639 |
| 640 /* This value is returned if detect_coding_mask () finds nothing other | |
| 641 than ASCII characters. */ | |
| 642 #define CATEGORY_MASK_ANY \ | |
| 643 (CATEGORY_MASK_ISO_7 \ | |
| 644 | CATEGORY_MASK_ISO_7_TIGHT \ | |
| 645 | CATEGORY_MASK_ISO_8_1 \ | |
| 646 | CATEGORY_MASK_ISO_8_2 \ | |
| 647 | CATEGORY_MASK_ISO_7_ELSE \ | |
| 648 | CATEGORY_MASK_ISO_8_ELSE \ | |
| 649 | CATEGORY_MASK_UTF_8 \ | |
| 650 | CATEGORY_MASK_UTF_16_BE \ | |
| 651 | CATEGORY_MASK_UTF_16_LE \ | |
| 652 | CATEGORY_MASK_UTF_16_BE_NOSIG \ | |
| 653 | CATEGORY_MASK_UTF_16_LE_NOSIG \ | |
| 654 | CATEGORY_MASK_CHARSET \ | |
| 655 | CATEGORY_MASK_SJIS \ | |
| 656 | CATEGORY_MASK_BIG5 \ | |
| 657 | CATEGORY_MASK_CCL \ | |
| 658 | CATEGORY_MASK_EMACS_MULE) | |
| 659 | |
| 660 | |
| 661 #define CATEGORY_MASK_ISO_7BIT \ | |
| 662 (CATEGORY_MASK_ISO_7 | CATEGORY_MASK_ISO_7_TIGHT) | |
| 663 | |
| 664 #define CATEGORY_MASK_ISO_8BIT \ | |
| 665 (CATEGORY_MASK_ISO_8_1 | CATEGORY_MASK_ISO_8_2) | |
| 666 | |
| 667 #define CATEGORY_MASK_ISO_ELSE \ | |
| 668 (CATEGORY_MASK_ISO_7_ELSE | CATEGORY_MASK_ISO_8_ELSE) | |
| 669 | |
| 670 #define CATEGORY_MASK_ISO_ESCAPE \ | |
| 671 (CATEGORY_MASK_ISO_7 \ | |
| 672 | CATEGORY_MASK_ISO_7_TIGHT \ | |
| 673 | CATEGORY_MASK_ISO_7_ELSE \ | |
| 674 | CATEGORY_MASK_ISO_8_ELSE) | |
| 675 | |
| 676 #define CATEGORY_MASK_ISO \ | |
| 677 ( CATEGORY_MASK_ISO_7BIT \ | |
| 678 | CATEGORY_MASK_ISO_8BIT \ | |
| 679 | CATEGORY_MASK_ISO_ELSE) | |
| 680 | |
| 681 #define CATEGORY_MASK_UTF_16 \ | |
| 682 (CATEGORY_MASK_UTF_16_BE \ | |
| 683 | CATEGORY_MASK_UTF_16_LE \ | |
| 684 | CATEGORY_MASK_UTF_16_BE_NOSIG \ | |
| 685 | CATEGORY_MASK_UTF_16_LE_NOSIG) | |
| 686 | |
| 687 | |
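The CATEGORY_MASK_XXX macros give each enumerator of `enum coding_category` its own bit, so a set of candidate categories fits in one integer. A minimal sketch of that scheme follows. The reduced enum, the `MASK` macro and `pick_category` are hypothetical names invented for illustration; the detection logic that actually consumes these masks is not part of this section.

```c
/* Reduced, hypothetical category list; the last enumerator is a count.  */
enum category { CAT_ISO_7, CAT_UTF_8, CAT_SJIS, CAT_BIG5, CAT_MAX };

/* One bit per category, in the style of CATEGORY_MASK_XXX.  */
#define MASK(cat) (1u << (cat))

/* Return the first category in PRIORITY (N entries, highest priority
   first) whose bit is set in FOUND, or CAT_MAX if none is set.  */
static enum category
pick_category (unsigned found, const enum category *priority, int n)
{
  int i;

  for (i = 0; i < n; i++)
    if (found & MASK (priority[i]))
      return priority[i];
  return CAT_MAX;
}
```

Composite masks such as CATEGORY_MASK_ISO are then plain bitwise ORs of the single-bit masks, and a whole family can be cleared from a candidate set with a single `found &= ~mask`.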
| 688 /* List of symbols `coding-category-xxx' ordered by priority. This | |
| 689 variable is exposed to Emacs Lisp. */ | |
| 690 static Lisp_Object Vcoding_category_list; | |
| 691 | |
| 692 /* Table of coding categories (Lisp symbols). This variable is for | |
| 693 internal use only. */ | |
| 694 static Lisp_Object Vcoding_category_table; | |
| 695 | |
| 696 /* Table of coding-categories ordered by priority. */ | |
| 697 static enum coding_category coding_priorities[coding_category_max]; | |
| 698 | |
| 699 /* Nth element is a coding context for the coding system bound to the | |
| 700 Nth coding category. */ | |
| 701 static struct coding_system coding_categories[coding_category_max]; | |
| 702 | |
| 703 /*** Commonly used macros and functions ***/ | |
| 704 | |
| 705 #ifndef min | |
| 706 #define min(a, b) ((a) < (b) ? (a) : (b)) | |
| 707 #endif | |
| 708 #ifndef max | |
| 709 #define max(a, b) ((a) > (b) ? (a) : (b)) | |
| 710 #endif | |
| 711 | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
712 #define CODING_GET_INFO(coding, attrs, charset_list) \ |
713 do { \ |
714 (attrs) = CODING_ID_ATTRS ((coding)->id); \ |
715 (charset_list) = CODING_ATTR_CHARSET_LIST (attrs); \ |
| 88365 | 716 } while (0) |
| 717 | |
| 718 | |
| 719 /* Safely get one byte from the source text pointed to by SRC which ends | |
| 720 at SRC_END, and set C to that byte. If there are not enough bytes | |
| 721 in the source, it jumps to `no_more_source'. The caller | |
| 722 should declare and set these variables appropriately in advance: | |
| 723 src, src_base, src_end, multibytep, consumed_chars | |
| 724 */ | |
| 725 | |
| 726 #define ONE_MORE_BYTE(c) \ | |
| 727 do { \ | |
| 728 if (src == src_end) \ | |
| 729 { \ | |
| 730 if (src_base < src) \ | |
| 731 coding->result = CODING_RESULT_INSUFFICIENT_SRC; \ | |
| 732 goto no_more_source; \ | |
| 733 } \ | |
| 734 c = *src++; \ | |
| 735 if (multibytep && (c & 0x80)) \ | |
| 736 { \ | |
| 737 if ((c & 0xFE) != 0xC0) \ | |
| 738 error ("Undecodable char found"); \ | |
| 739 c = ((c & 1) << 6) | *src++; \ | |
| 740 } \ | |
| 741 consumed_chars++; \ | |
| 742 } while (0) | |
| 743 | |
| 744 | |
| 745 #define ONE_MORE_BYTE_NO_CHECK(c) \ | |
| 746 do { \ | |
| 747 c = *src++; \ | |
| 748 if (multibytep && (c & 0x80)) \ | |
| 749 { \ | |
| 750 if ((c & 0xFE) != 0xC0) \ | |
| 751 error ("Undecodable char found"); \ | |
| 752 c = ((c & 1) << 6) | *src++; \ | |
| 753 } \ | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
754 consumed_chars++; \ |
| 88365 | 755 } while (0) |
| 756 | |
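The multibyte branch of ONE_MORE_BYTE folds a two-byte sequence whose lead byte is 0xC0 or 0xC1 back into a single byte in the range 0x80..0xFF. A minimal sketch of that arithmetic as a function follows. The names `encode_raw_byte` and `read_one_byte` are hypothetical, and the encoder is an assumption: it is simply the inverse of the decoding expression shown above, not code taken from this file. Unlike the macro, the sketch returns -1 instead of jumping or signalling an error, and it also checks that the trailing byte is present.

```c
/* Assumed inverse of the decoding step: split raw byte B (0x80..0xFF)
   into a lead byte 0xC0/0xC1 and a trailing byte 0x80..0xBF.  */
static void
encode_raw_byte (unsigned char b, unsigned char out[2])
{
  out[0] = 0xC0 | ((b >> 6) & 1);
  out[1] = 0x80 | (b & 0x3F);
}

/* Read one logical byte from *SRC, advancing *SRC past it.  Return the
   byte, or -1 if the input is exhausted or the sequence is undecodable
   (in which case *SRC is left unchanged).  */
static int
read_one_byte (const unsigned char **src, const unsigned char *src_end,
               int multibytep)
{
  const unsigned char *p = *src;
  int c;

  if (p == src_end)
    return -1;
  c = *p++;
  if (multibytep && (c & 0x80))
    {
      if ((c & 0xFE) != 0xC0 || p == src_end)
        return -1;
      c = ((c & 1) << 6) | *p++;   /* same expression as the macro */
    }
  *src = p;
  return c;
}
```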
| 757 | |
| 758 /* Store a byte C in the place pointed to by DST and increment DST to the | |
| 759 next free point, and increment PRODUCED_CHARS. The caller should | |
| 760 ensure that C is 0..127, and declare and set the variable `dst' | |
| 761 appropriately in advance. | |
| 762 */ | |
| 763 | |
| 764 | |
| 765 #define EMIT_ONE_ASCII_BYTE(c) \ | |
| 766 do { \ | |
| 767 produced_chars++; \ | |
| 768 *dst++ = (c); \ | |
| 769 } while (0) | |
| 770 | |
| 771 | |
| 772 /* Like EMIT_ONE_ASCII_BYTE, but store two bytes: C1 and C2. */ | |
| 773 | |
| 774 #define EMIT_TWO_ASCII_BYTES(c1, c2) \ | |
| 775 do { \ | |
| 776 produced_chars += 2; \ | |
| 777 *dst++ = (c1), *dst++ = (c2); \ | |
| 778 } while (0) | |
| 779 | |
| 780 | |
| 781 /* Store a byte C in the place pointed to by DST and increment DST to the | |
| 782 next free point, and increment PRODUCED_CHARS. If MULTIBYTEP is | |
| 783 nonzero, store in an appropriate multibyte form. The caller should | |
| 784 declare and set the variables `dst' and `multibytep' appropriately | |
| 785 in advance. */ | |
| 786 | |
| 787 #define EMIT_ONE_BYTE(c) \ | |
| 788 do { \ | |
| 789 produced_chars++; \ | |
| 790 if (multibytep) \ | |
| 791 { \ | |
| 792 int ch = (c); \ | |
| 793 if (ch >= 0x80) \ | |
| 794 ch = BYTE8_TO_CHAR (ch); \ | |
| 795 CHAR_STRING_ADVANCE (ch, dst); \ | |
| 796 } \ | |
| 797 else \ | |
| 798 *dst++ = (c); \ | |
| 799 } while (0) | |
| 800 | |
| 801 | |
| 802 /* Like EMIT_ONE_BYTE, but emit two bytes; C1 and C2. */ | |
| 803 | |
|
88438
3a34b722dd71
(encode_coding_utf_8): Initialize produced_chars to 0.
Kenichi Handa <handa@m17n.org>
parents:
88430
diff
changeset
|
804 #define EMIT_TWO_BYTES(c1, c2) \ |
805 do { \ |
806 produced_chars += 2; \ |
807 if (multibytep) \ |
808 { \ |
809 int ch; \ |
810 \ |
811 ch = (c1); \ |
812 if (ch >= 0x80) \ |
813 ch = BYTE8_TO_CHAR (ch); \ |
814 CHAR_STRING_ADVANCE (ch, dst); \ |
815 ch = (c2); \ |
816 if (ch >= 0x80) \ |
817 ch = BYTE8_TO_CHAR (ch); \ |
818 CHAR_STRING_ADVANCE (ch, dst); \ |
819 } \ |
820 else \ |
821 { \ |
822 *dst++ = (c1); \ |
823 *dst++ = (c2); \ |
824 } \ |
| 88365 | 825 } while (0) |
| 826 | |
| 827 | |
| 828 #define EMIT_THREE_BYTES(c1, c2, c3) \ | |
| 829 do { \ | |
| 830 EMIT_ONE_BYTE (c1); \ | |
| 831 EMIT_TWO_BYTES (c2, c3); \ | |
| 832 } while (0) | |
| 833 | |
| 834 | |
| 835 #define EMIT_FOUR_BYTES(c1, c2, c3, c4) \ | |
| 836 do { \ | |
| 837 EMIT_TWO_BYTES (c1, c2); \ | |
| 838 EMIT_TWO_BYTES (c3, c4); \ | |
| 839 } while (0) | |
| 840 | |
| 841 | |
| 842 #define CODING_DECODE_CHAR(coding, src, src_base, src_end, charset, code, c) \ | |
| 843 do { \ | |
| 844 charset_map_loaded = 0; \ | |
| 845 c = DECODE_CHAR (charset, code); \ | |
| 846 if (charset_map_loaded) \ | |
| 847 { \ | |
| 89483 | 848 const unsigned char *orig = coding->source; \ |
| 88365 | 849 EMACS_INT offset; \ |
| 850 \ | |
| 851 coding_set_source (coding); \ | |
| 852 offset = coding->source - orig; \ | |
| 853 src += offset; \ | |
| 854 src_base += offset; \ | |
| 855 src_end += offset; \ | |
| 856 } \ | |
| 857 } while (0) | |
| 858 | |
| 859 | |
| 860 #define ASSURE_DESTINATION(bytes) \ | |
| 861 do { \ | |
| 862 if (dst + (bytes) >= dst_end) \ | |
| 863 { \ | |
| 864 int more_bytes = charbuf_end - charbuf + (bytes); \ | |
| 865 \ | |
| 866 dst = alloc_destination (coding, more_bytes, dst); \ | |
| 867 dst_end = coding->destination + coding->dst_bytes; \ | |
| 868 } \ | |
| 869 } while (0) | |
| 870 | |
| 871 | |
| 872 | |
| 873 static void | |
| 874 coding_set_source (coding) | |
| 875 struct coding_system *coding; | |
| 876 { | |
| 877 if (BUFFERP (coding->src_object)) | |
| 878 { | |
|
89418
a9c2b3712863
(coding_set_source): Fix for the case that the current
Kenichi Handa <handa@m17n.org>
parents:
89404
diff
changeset
|
879 struct buffer *buf = XBUFFER (coding->src_object); |
880 |
| 88365 | 881 if (coding->src_pos < 0) |
|
89418
a9c2b3712863
(coding_set_source): Fix for the case that the current
Kenichi Handa <handa@m17n.org>
parents:
89404
diff
changeset
|
882 coding->source = BUF_GAP_END_ADDR (buf) + coding->src_pos_byte; |
| 88365 | 883 else |
|
89418
a9c2b3712863
(coding_set_source): Fix for the case that the current
Kenichi Handa <handa@m17n.org>
parents:
89404
diff
changeset
|
884 coding->source = BUF_BYTE_ADDRESS (buf, coding->src_pos_byte); |
| 88365 | 885 } |
| 886 else if (STRINGP (coding->src_object)) | |
| 887 { | |
| 89483 | 888 coding->source = SDATA (coding->src_object) + coding->src_pos_byte; |
| 88365 | 889 } |
| 890 else | |
| 891 /* Otherwise, the source is a C string and is never relocated | |
| 892 automatically. Thus we don't have to update anything. */ | |
| 893 ; | |
| 894 } | |
| 895 | |
| 896 static void | |
| 897 coding_set_destination (coding) | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
898 struct coding_system *coding; |
899 { |
| 88365 | 900 if (BUFFERP (coding->dst_object)) |
| 901 { | |
| 902 if (coding->src_pos < 0) | |
|
89042
2b9f8973f240
(coding_set_destination): Fix coding->destination for
Kenichi Handa <handa@m17n.org>
parents:
88977
diff
changeset
|
903 { |
904 coding->destination = BEG_ADDR + coding->dst_pos_byte - 1; |
905 coding->dst_bytes = (GAP_END_ADDR |
906 - (coding->src_bytes - coding->consumed) |
907 - coding->destination); |
908 } |
| 88365 | 909 else |
|
89042
2b9f8973f240
(coding_set_destination): Fix coding->destination for
Kenichi Handa <handa@m17n.org>
parents:
88977
diff
changeset
|
910 { |
911 /* We are sure that coding->dst_pos_byte is before the gap |
912 of the buffer. */ |
913 coding->destination = (BUF_BEG_ADDR (XBUFFER (coding->dst_object)) |
914 + coding->dst_pos_byte - 1); |
915 coding->dst_bytes = (BUF_GAP_END_ADDR (XBUFFER (coding->dst_object)) |
916 - coding->destination); |
917 } |
| 88365 | 918 } |
| 919 else | |
| 920 /* Otherwise, the destination is a C string and is never relocated | |
| 921 automatically. Thus we don't have to update anything. */ | |
| 922 ; | |
| 923 } | |
| 924 | |
| 925 | |
| 926 static void | |
| 927 coding_alloc_by_realloc (coding, bytes) | |
| 928 struct coding_system *coding; | |
| 929 EMACS_INT bytes; | |
| 930 { | |
| 931 coding->destination = (unsigned char *) xrealloc (coding->destination, | |
| 932 coding->dst_bytes + bytes); | |
| 933 coding->dst_bytes += bytes; | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
934 } |
935 |
| 88365 | 936 static void |
| 937 coding_alloc_by_making_gap (coding, bytes) | |
| 938 struct coding_system *coding; | |
| 939 EMACS_INT bytes; | |
| 940 { | |
|
88456
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
941 if (BUFFERP (coding->dst_object) |
942 && EQ (coding->src_object, coding->dst_object)) |
| 88365 | 943 { |
| 944 EMACS_INT add = coding->src_bytes - coding->consumed; | |
| 945 | |
| 946 GAP_SIZE -= add; ZV += add; Z += add; ZV_BYTE += add; Z_BYTE += add; | |
| 947 make_gap (bytes); | |
| 948 GAP_SIZE += add; ZV -= add; Z -= add; ZV_BYTE -= add; Z_BYTE -= add; | |
| 949 } | |
| 950 else | |
| 951 { | |
|
88456
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
952 Lisp_Object this_buffer; |
953 |
954 this_buffer = Fcurrent_buffer (); |
| 88365 | 955 set_buffer_internal (XBUFFER (coding->dst_object)); |
| 956 make_gap (bytes); | |
| 957 set_buffer_internal (XBUFFER (this_buffer)); | |
| 958 } | |
| 959 } | |
| 89483 | 960 |
| 88365 | 961 |
| 962 static unsigned char * | |
| 963 alloc_destination (coding, nbytes, dst) | |
| 964 struct coding_system *coding; | |
|
89545
4f394eed6ff2
(inhibit_pre_post_conversion): Removed (unused).
Dave Love <fx@gnu.org>
parents:
89519
diff
changeset
|
965 EMACS_INT nbytes; |
| 88365 | 966 unsigned char *dst; |
| 967 { | |
| 968 EMACS_INT offset = dst - coding->destination; | |
| 969 | |
| 970 if (BUFFERP (coding->dst_object)) | |
| 971 coding_alloc_by_making_gap (coding, nbytes); | |
| 972 else | |
| 973 coding_alloc_by_realloc (coding, nbytes); | |
| 974 coding->result = CODING_RESULT_SUCCESS; | |
| 975 coding_set_destination (coding); | |
| 976 dst = coding->destination + offset; | |
| 977 return dst; | |
| 978 } | |
| 979 | |
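alloc_destination records the write pointer as an offset from the start of the destination, grows the storage, and then rebuilds the pointer from the possibly relocated start. A minimal sketch of that pattern for the plain realloc case follows. `struct out_buf` and `grow_and_rebase` are hypothetical names; unlike xrealloc in the original, the sketch reports allocation failure by returning NULL.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the destination fields of a coding system.  */
struct out_buf
{
  unsigned char *start;   /* beginning of the destination storage */
  size_t size;            /* bytes currently allocated */
};

/* Grow BUF by EXTRA bytes and return DST rebased onto the new storage.
   DST must point into (or one past the end of) the old storage.
   Return NULL on allocation failure, leaving BUF unchanged.  */
static unsigned char *
grow_and_rebase (struct out_buf *buf, size_t extra, unsigned char *dst)
{
  /* Take the offset before reallocating: the old pointers must not be
     used once realloc has moved the block.  */
  size_t offset = (size_t) (dst - buf->start);
  unsigned char *p = realloc (buf->start, buf->size + extra);

  if (!p)
    return NULL;
  buf->start = p;
  buf->size += extra;
  return p + offset;
}
```

The same offset-and-rebase step appears in CODING_DECODE_CHAR, which shifts src, src_base and src_end by the distance the source moved.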
| 89331 | 980 /** Macros for annotations. */ |
| 981 | |
| 982 /* Maximum length of annotation data (sum of annotations for | |
| 983 composition and charset). */ | |
| 984 #define MAX_ANNOTATION_LENGTH (5 + (MAX_COMPOSITION_COMPONENTS * 2) - 1 + 5) | |
| 985 | |
| 986 /* Annotation data is stored in the array coding->charbuf in this | |
| 987 format: | |
| 988 [ -LENGTH ANNOTATION_MASK FROM TO ... ] | |
| 989 LENGTH is the number of elements in the annotation. | |
| 990 ANNOTATION_MASK is one of CODING_ANNOTATE_XXX_MASK. | |
| 991 FROM and TO specify the range of text annotated. They are relative | |
| 992 to coding->src_pos (on encoding) or coding->dst_pos (on decoding). | |
| 993 | |
| 994 The format of the following elements depends on ANNOTATION_MASK. | |
| 995 | |
| 996 In the case of CODING_ANNOTATE_COMPOSITION_MASK, these elements | |
| 997 follow: | |
| 998 ... METHOD [ COMPOSITION-COMPONENTS ... ] | |
| 999 METHOD is one of enum composition_method. | |
| 1000 Optional COMPOSITION-COMPONENTS are characters and composition | |
| 1001 rules. | |
| 1002 | |
| 1003 In the case of CODING_ANNOTATE_CHARSET_MASK, one element CHARSET-ID | |
| 1004 follows. */ | |
| 1005 | |
| 1006 #define ADD_ANNOTATION_DATA(buf, len, mask, from, to) \ | |
| 1007 do { \ | |
| 1008 *(buf)++ = -(len); \ | |
| 1009 *(buf)++ = (mask); \ | |
| 1010 *(buf)++ = (from); \ | |
| 1011 *(buf)++ = (to); \ | |
| 1012 coding->annotated = 1; \ | |
| 1013 } while (0) | |
| 1014 | |
| 1015 #define ADD_COMPOSITION_DATA(buf, from, to, method) \ | |
| 1016 do { \ | |
| 1017 ADD_ANNOTATION_DATA (buf, 5, CODING_ANNOTATE_COMPOSITION_MASK, from, to); \ | |
| 1018 *buf++ = method; \ | |
| 1019 } while (0) | |
| 1020 | |
| 1021 | |
| 1022 #define ADD_CHARSET_DATA(buf, from, to, id) \ | |
| 1023 do { \ | |
| 1024 ADD_ANNOTATION_DATA (buf, 5, CODING_ANNOTATE_CHARSET_MASK, from, to); \ | |
| 1025 *buf++ = id; \ | |
| 1026 } while (0) | |
| 1027 | |
| 88365 | 1028 |
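The annotation layout `[ -LENGTH ANNOTATION_MASK FROM TO ... ]` described in the comment above can be illustrated with a small standalone sketch. The mask values and the `demo_` helpers are hypothetical. The sketch also assumes that ordinary elements of the buffer are non-negative character codes, so that a negative element marks the start of a record; the comment itself does not state this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mask values; the real CODING_ANNOTATE_XXX_MASK
   constants are defined elsewhere.  */
enum { DEMO_COMPOSITION_MASK = 1, DEMO_CHARSET_MASK = 2 };

/* Append one charset annotation [ -5 MASK FROM TO ID ] at BUF and
   return the position just past it.  */
static int *
demo_add_charset (int *buf, int from, int to, int id)
{
  *buf++ = -5;                  /* negated element count */
  *buf++ = DEMO_CHARSET_MASK;
  *buf++ = from;
  *buf++ = to;
  *buf++ = id;
  return buf;
}

/* Count the annotation records in BUF[0..N).  A negative element
   starts a record whose length is its absolute value; any other
   element is assumed to be an ordinary character code.  */
static int
demo_count_annotations (const int *buf, size_t n)
{
  size_t i = 0;
  int count = 0;

  while (i < n)
    {
      if (buf[i] < 0)
        {
          size_t len = (size_t) -buf[i];

          assert (len >= 4 && i + len <= n);
          i += len;             /* skip the whole record */
          count++;
        }
      else
        i++;
    }
  return count;
}
```

Storing the length negated lets a reader skip a record without interpreting its mask, which is why the count walk needs no knowledge of the composition or charset payload.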
| 1029 /*** 2. Emacs' internal format (emacs-utf-8) ***/ | |
| 1030 | |
| 1031 | |
| 30487 | 1032 |
| 17052 | 1033 |
| 88365 | 1034 /*** 3. UTF-8 ***/ |
| 1035 | |
| 1036 /* See the above "GENERAL NOTES on `detect_coding_XXX ()' functions". | |
| 89331 | 1037 Check if a text is encoded in UTF-8. If it is, return 1, else |
| 1038 return 0. */ | |
| 88365 | 1039 |
| 1040 #define UTF_8_1_OCTET_P(c) ((c) < 0x80) | |
| 1041 #define UTF_8_EXTRA_OCTET_P(c) (((c) & 0xC0) == 0x80) | |
| 1042 #define UTF_8_2_OCTET_LEADING_P(c) (((c) & 0xE0) == 0xC0) | |
| 1043 #define UTF_8_3_OCTET_LEADING_P(c) (((c) & 0xF0) == 0xE0) | |
| 1044 #define UTF_8_4_OCTET_LEADING_P(c) (((c) & 0xF8) == 0xF0) | |
| 1045 #define UTF_8_5_OCTET_LEADING_P(c) (((c) & 0xFC) == 0xF8) | |
| 1046 | |
| 1047 static int | |
| 89331 | 1048 detect_coding_utf_8 (coding, detect_info) |
| 88365 | 1049 struct coding_system *coding; |
| 89331 | 1050 struct coding_detection_info *detect_info; |
| 88365 | 1051 { |
| 89483 | 1052 const unsigned char *src = coding->source, *src_base = src; |
| 1053 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 88365 | 1054 int multibytep = coding->src_multibyte; |
| 1055 int consumed_chars = 0; | |
| 1056 int found = 0; | |
| 89193 | 1057 int incomplete; |
| 88365 | 1058 |
| 89331 | 1059 detect_info->checked |= CATEGORY_MASK_UTF_8; |
| 88365 | 1060 /* A coding system of this category is always ASCII compatible. */ |
| 1061 src += coding->head_ascii; | |
| 1062 | |
| 1063 while (1) | |
| 1064 { | |
| 1065 int c, c1, c2, c3, c4; | |
| 1066 | |
| 89193 | 1067 incomplete = 0; |
| 88365 | 1068 ONE_MORE_BYTE (c); |
| 1069 if (UTF_8_1_OCTET_P (c)) | |
| 1070 continue; | |
| 89193 | 1071 incomplete = 1; |
| 88365 | 1072 ONE_MORE_BYTE (c1); |
| 1073 if (! UTF_8_EXTRA_OCTET_P (c1)) | |
| 1074 break; | |
| 1075 if (UTF_8_2_OCTET_LEADING_P (c)) | |
| 1076 { | |
| 89331 | 1077 found = CATEGORY_MASK_UTF_8; |
| 88365 | 1078 continue; |
| 1079 } | |
| 1080 ONE_MORE_BYTE (c2); | |
| 1081 if (! UTF_8_EXTRA_OCTET_P (c2)) | |
| 1082 break; | |
| 1083 if (UTF_8_3_OCTET_LEADING_P (c)) | |
| 1084 { | |
| 89331 | 1085 found = CATEGORY_MASK_UTF_8; |
| 88365 | 1086 continue; |
| 1087 } | |
| 1088 ONE_MORE_BYTE (c3); | |
| 1089 if (! UTF_8_EXTRA_OCTET_P (c3)) | |
| 1090 break; | |
| 1091 if (UTF_8_4_OCTET_LEADING_P (c)) | |
| 1092 { | |
| 89331 | 1093 found = CATEGORY_MASK_UTF_8; |
| 88365 | 1094 continue; |
| 1095 } | |
| 1096 ONE_MORE_BYTE (c4); | |
| 1097 if (! UTF_8_EXTRA_OCTET_P (c4)) | |
| 1098 break; | |
| 1099 if (UTF_8_5_OCTET_LEADING_P (c)) | |
| 1100 { | |
| 89331 | 1101 found = CATEGORY_MASK_UTF_8; |
| 88365 | 1102 continue; |
| 1103 } | |
| 1104 break; | |
| 1105 } | |
| 89331 | 1106 detect_info->rejected |= CATEGORY_MASK_UTF_8; |
| 88365 | 1107 return 0; |
| 1108 | |
| 1109 no_more_source: | |
| 89193 | 1110 if (incomplete && coding->mode & CODING_MODE_LAST_BLOCK) |
| 1111 { | |
| 89331 | 1112 detect_info->rejected |= CATEGORY_MASK_UTF_8; |
| 89193 | 1113 return 0; |
| 1114 } | |
| 89331 | 1115 detect_info->found |= found; |
| 1116 return 1; | |
| 88365 | 1117 } |
| 1118 | |
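The detection loop above classifies each leading octet with a bit mask and then requires the matching number of extra octets. A standalone sketch of the same structural check, without the ONE_MORE_BYTE machinery, might look like this. The `demo_` names are hypothetical. The sketch also simplifies in one respect: it always treats the input as the last block, so a truncated trailing sequence is rejected.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EXTRA_OCTET_P(c) (((c) & 0xC0) == 0x80)

/* Return the sequence length implied by leading octet C, or 0 if C
   cannot start a sequence of one to five octets.  */
static int
demo_utf8_length (unsigned char c)
{
  if (c < 0x80) return 1;
  if ((c & 0xE0) == 0xC0) return 2;
  if ((c & 0xF0) == 0xE0) return 3;
  if ((c & 0xF8) == 0xF0) return 4;
  if ((c & 0xFC) == 0xF8) return 5;
  return 0;
}

/* Return 1 if SRC[0..N) is structurally well formed, i.e. every
   leading octet is followed by the right number of extra octets.
   Overlong forms and surrogates are not checked here.  */
static int
demo_utf8_wellformed_p (const unsigned char *src, size_t n)
{
  size_t i = 0;

  assert (src != NULL || n == 0);
  while (i < n)
    {
      int len = demo_utf8_length (src[i]);
      int k;

      if (len == 0 || (size_t) len > n - i)
        return 0;               /* bad leading octet or truncation */
      for (k = 1; k < len; k++)
        if (! DEMO_EXTRA_OCTET_P (src[i + k]))
          return 0;
      i += (size_t) len;
    }
  return 1;
}
```

Like the function above, this is a shape check only. Value checks such as overlong forms belong to the decoder.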
| 1119 | |
| 1120 static void | |
| 1121 decode_coding_utf_8 (coding) | |
| 1122 struct coding_system *coding; | |
| 1123 { | |
| 89483 | 1124 const unsigned char *src = coding->source + coding->consumed; |
| 1125 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 1126 const unsigned char *src_base; | |
| 88365 | 1127 int *charbuf = coding->charbuf; |
| 1128 int *charbuf_end = charbuf + coding->charbuf_size; | |
| 1129 int consumed_chars = 0, consumed_chars_base; | |
| 1130 int multibytep = coding->src_multibyte; | |
| 89665 | 1131 Lisp_Object attr, charset_list; |
| 1132 | |
| 1133 CODING_GET_INFO (coding, attr, charset_list); | |
| 88365 | 1134 |
| 1135 while (1) | |
| 1136 { | |
| 1137 int c, c1, c2, c3, c4, c5; | |
| 1138 | |
| 1139 src_base = src; | |
| 1140 consumed_chars_base = consumed_chars; | |
| 1141 | |
| 1142 if (charbuf >= charbuf_end) | |
| 1143 break; | |
| 1144 | |
| 1145 ONE_MORE_BYTE (c1); | |
| 1146 if (UTF_8_1_OCTET_P(c1)) | |
| 1147 { | |
| 1148 c = c1; | |
| 1149 } | |
| 1150 else | |
| 1151 { | |
| 1152 ONE_MORE_BYTE (c2); | |
| 1153 if (! UTF_8_EXTRA_OCTET_P (c2)) | |
| 1154 goto invalid_code; | |
| 1155 if (UTF_8_2_OCTET_LEADING_P (c1)) | |
| 88669 | 1156 { |
| 1157 c = ((c1 & 0x1F) << 6) | (c2 & 0x3F); | |
| 1158 /* Reject overlong sequences here and below. Encoders | |
| 1159 producing them are incorrect, they can be misleading, | |
| 1160 and they mess up read/write invariance. */ | |
| 1161 if (c < 128) | |
| 1162 goto invalid_code; | |
| 1163 } | |
| 88365 | 1164 else |
| 1165 { | |
| 1166 ONE_MORE_BYTE (c3); | |
| 1167 if (! UTF_8_EXTRA_OCTET_P (c3)) | |
| 1168 goto invalid_code; | |
| 1169 if (UTF_8_3_OCTET_LEADING_P (c1)) | |
| 88669 | 1170 { |
| 1171 c = (((c1 & 0xF) << 12) | |
| 1172 | ((c2 & 0x3F) << 6) | (c3 & 0x3F)); | |
| 89184 | 1173 if (c < 0x800 |
| 1174 || (c >= 0xd800 && c < 0xe000)) /* surrogates (invalid) */ | |
| 88669 | 1175 goto invalid_code; |
| 1176 } | |
| 88365 | 1177 else |
| 1178 { | |
| 1179 ONE_MORE_BYTE (c4); | |
| 1180 if (! UTF_8_EXTRA_OCTET_P (c4)) | |
| 1181 goto invalid_code; | |
| 1182 if (UTF_8_4_OCTET_LEADING_P (c1)) | |
| 88669 | 1183 { |
| 88365 | 1184 c = (((c1 & 0x7) << 18) | ((c2 & 0x3F) << 12) |
| 1185 | ((c3 & 0x3F) << 6) | (c4 & 0x3F)); | |
| 88669 | 1186 if (c < 0x10000) |
| 1187 goto invalid_code; | |
| 1188 } | |
| 88365 | 1189 else |
| 1190 { | |
| 1191 ONE_MORE_BYTE (c5); | |
| 1192 if (! UTF_8_EXTRA_OCTET_P (c5)) | |
| 1193 goto invalid_code; | |
| 1194 if (UTF_8_5_OCTET_LEADING_P (c1)) | |
| 1195 { | |
| 1196 c = (((c1 & 0x3) << 24) | ((c2 & 0x3F) << 18) | |
| 1197 | ((c3 & 0x3F) << 12) | ((c4 & 0x3F) << 6) | |
| 1198 | (c5 & 0x3F)); | |
| 88669 | 1199 if ((c > MAX_CHAR) || (c < 0x200000)) |
| 88365 | 1200 goto invalid_code; |
| 1201 } | |
| 1202 else | |
| 1203 goto invalid_code; | |
| 1204 } | |
| 1205 } | |
| 1206 } | |
| 1207 } | |
| 1208 | |
| 1209 *charbuf++ = c; | |
| 1210 continue; | |
| 1211 | |
| 1212 invalid_code: | |
| 1213 src = src_base; | |
| 1214 consumed_chars = consumed_chars_base; | |
| 1215 ONE_MORE_BYTE (c); | |
| 1216 *charbuf++ = ASCII_BYTE_P (c) ? c : BYTE8_TO_CHAR (c); | |
| 1217 coding->errors++; | |
| 1218 } | |
| 1219 | |
| 1220 no_more_source: | |
| 1221 coding->consumed_char += consumed_chars_base; | |
| 1222 coding->consumed = src_base - coding->source; | |
| 1223 coding->charbuf_used = charbuf - coding->charbuf; | |
| 1224 } | |
| 1225 | |
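The decoder above assembles the code point from the payload bits of each octet, then rejects overlong forms by comparing the result with the smallest value that needs that many octets, and rejects surrogates in the three-octet case. A compact sketch of the same checks for sequences of one to four octets follows. The name `demo_utf8_decode` is hypothetical. The five-octet branch and the MAX_CHAR limit are left out, and so is the recovery step at `invalid_code`, which re-reads the first byte and stores it as a raw byte.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one sequence of 1 to 4 octets at SRC, with N bytes available.
   Store the code point in *OUT and return the number of bytes
   consumed, or return 0 if the sequence is truncated, malformed,
   overlong, or encodes a surrogate.  */
static int
demo_utf8_decode (const unsigned char *src, size_t n, long *out)
{
  /* Smallest code point that legitimately needs LEN octets.  */
  static const long min_value[] = { 0, 0, 0x80, 0x800, 0x10000 };
  int len, k;
  long c;

  assert (src != NULL && out != NULL);
  if (n == 0)
    return 0;
  if (src[0] < 0x80)
    {
      *out = src[0];
      return 1;
    }
  if ((src[0] & 0xE0) == 0xC0)
    len = 2, c = src[0] & 0x1F;
  else if ((src[0] & 0xF0) == 0xE0)
    len = 3, c = src[0] & 0x0F;
  else if ((src[0] & 0xF8) == 0xF0)
    len = 4, c = src[0] & 0x07;
  else
    return 0;
  if ((size_t) len > n)
    return 0;
  for (k = 1; k < len; k++)
    {
      if ((src[k] & 0xC0) != 0x80)
        return 0;
      c = (c << 6) | (src[k] & 0x3F);   /* append six payload bits */
    }
  if (c < min_value[len])
    return 0;                           /* overlong form */
  if (c >= 0xD800 && c < 0xE000)
    return 0;                           /* surrogate */
  *out = c;
  return len;
}
```

The table lookup replaces the per-branch comparisons (`c < 128`, `c < 0x800`, `c < 0x10000`) in the function above; the thresholds are the same.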
| 1226 | |
| 1227 static int | |
| 1228 encode_coding_utf_8 (coding) | |
| 1229 struct coding_system *coding; | |
| 1230 { | |
| 1231 int multibytep = coding->dst_multibyte; | |
| 1232 int *charbuf = coding->charbuf; | |
| 1233 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 1234 unsigned char *dst = coding->destination + coding->produced; | |
| 1235 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 88438 | 1236 int produced_chars = 0; |
| 88365 | 1237 int c; |
| 1238 | |
| 1239 if (multibytep) | |
| 1240 { | |
| 1241 int safe_room = MAX_MULTIBYTE_LENGTH * 2; | |
| 1242 | |
| 1243 while (charbuf < charbuf_end) | |
| 1244 { | |
| 1245 unsigned char str[MAX_MULTIBYTE_LENGTH], *p, *pend = str; | |
| 89483 | 1246 |
| 88365 | 1247 ASSURE_DESTINATION (safe_room); |
| 1248 c = *charbuf++; | |
| 89042 | 1249 if (CHAR_BYTE8_P (c)) |
| 1250 { | |
| 1251 c = CHAR_TO_BYTE8 (c); | |
| 1252 EMIT_ONE_BYTE (c); | |
| 1253 } | |
| 1254 else | |
| 1255 { | |
| 1256 CHAR_STRING_ADVANCE (c, pend); | |
| 1257 for (p = str; p < pend; p++) | |
| 1258 EMIT_ONE_BYTE (*p); | |
| 1259 } | |
| 88365 | 1260 } |
| 1261 } | |
| 1262 else | |
| 1263 { | |
| 1264 int safe_room = MAX_MULTIBYTE_LENGTH; | |
| 1265 | |
| 1266 while (charbuf < charbuf_end) | |
| 1267 { | |
| 1268 ASSURE_DESTINATION (safe_room); | |
| 1269 c = *charbuf++; | |
| 1270 dst += CHAR_STRING (c, dst); | |
| 1271 produced_chars++; | |
| 1272 } | |
| 1273 } | |
| 1274 coding->result = CODING_RESULT_SUCCESS; | |
| 1275 coding->produced_char += produced_chars; | |
| 1276 coding->produced = dst - coding->destination; | |
| 1277 return 0; | |
| 1278 } | |
| 1279 | |
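The encoder above delegates the byte layout to CHAR_STRING and CHAR_STRING_ADVANCE, which are defined elsewhere. As a rough illustration of what a UTF-8 byte layout looks like, here is a sketch limited to sequences of one to four octets. The name `demo_utf8_encode` is hypothetical, and the sketch does not reproduce Emacs' handling of raw bytes or of longer internal sequences.

```c
#include <assert.h>
#include <stddef.h>

/* Write the UTF-8 form of code point C to DST, which must have room
   for four bytes.  Return the number of bytes written, or 0 if C
   needs more than four octets.  */
static int
demo_utf8_encode (unsigned long c, unsigned char *dst)
{
  assert (dst != NULL);
  if (c < 0x80)
    {
      dst[0] = (unsigned char) c;
      return 1;
    }
  if (c < 0x800)
    {
      dst[0] = (unsigned char) (0xC0 | (c >> 6));
      dst[1] = (unsigned char) (0x80 | (c & 0x3F));
      return 2;
    }
  if (c < 0x10000)
    {
      dst[0] = (unsigned char) (0xE0 | (c >> 12));
      dst[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
      dst[2] = (unsigned char) (0x80 | (c & 0x3F));
      return 3;
    }
  if (c < 0x200000)
    {
      dst[0] = (unsigned char) (0xF0 | (c >> 18));
      dst[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
      dst[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
      dst[3] = (unsigned char) (0x80 | (c & 0x3F));
      return 4;
    }
  return 0;
}
```

Choosing the shortest length for each range is what keeps the output free of the overlong forms that the decoder rejects.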
| 1280 | |
| 1281 /* See the above "GENERAL NOTES on `detect_coding_XXX ()' functions". | |
| 89331 | 1282 Check if a text is encoded in one of the UTF-16 based coding systems. |
| 1283 If it is, return 1, else return 0. */ | |
| 88365 | 1284 |
| 1285 #define UTF_16_HIGH_SURROGATE_P(val) \ | |
| 1286 (((val) & 0xFC00) == 0xD800) | |
| 1287 | |
| 1288 #define UTF_16_LOW_SURROGATE_P(val) \ | |
| 1289 (((val) & 0xFC00) == 0xDC00) | |
| 1290 | |
| 1291 #define UTF_16_INVALID_P(val) \ | |
| 1292 (((val) == 0xFFFE) \ | |
| 1293 || ((val) == 0xFFFF) \ | |
| 1294 || UTF_16_LOW_SURROGATE_P (val)) | |
| 1295 | |
| 1296 | |
| 1297 static int | |
| 89331 | 1298 detect_coding_utf_16 (coding, detect_info) |
| 88365 | 1299 struct coding_system *coding; |
| 89331 | 1300 struct coding_detection_info *detect_info; |
| 88365 | 1301 { |
| 89483 | 1302 const unsigned char *src = coding->source, *src_base = src; |
| 1303 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 88365 | 1304 int multibytep = coding->src_multibyte; |
| 1305 int consumed_chars = 0; | |
| 1306 int c1, c2; | |
| 1307 | |
| 89331 | 1308 detect_info->checked |= CATEGORY_MASK_UTF_16; |
| 1309 if (coding->mode & CODING_MODE_LAST_BLOCK | |
| 89665 | 1310 && (coding->src_chars & 1)) |
| 89331 | 1311 { |
| 1312 detect_info->rejected |= CATEGORY_MASK_UTF_16; | |
| 1313 return 0; | |
| 1314 } | |
| 89665 | 1315 |
| 88365 | 1316 ONE_MORE_BYTE (c1); |
| 1317 ONE_MORE_BYTE (c2); | |
| 1318 if ((c1 == 0xFF) && (c2 == 0xFE)) | |
| 89331 | 1319 { |
| 89420 | 1320 detect_info->found |= (CATEGORY_MASK_UTF_16_LE |
| 1321 | CATEGORY_MASK_UTF_16_AUTO); | |
| 89665 | 1322 detect_info->rejected |= (CATEGORY_MASK_UTF_16_BE |
| 1323 | CATEGORY_MASK_UTF_16_BE_NOSIG | |
| 1324 | CATEGORY_MASK_UTF_16_LE_NOSIG); | |
| 89331 | 1325 } |
| 88365 | 1326 else if ((c1 == 0xFE) && (c2 == 0xFF)) |
| 89331 | 1327 { |
| 89420 | 1328 detect_info->found |= (CATEGORY_MASK_UTF_16_BE |
| 1329 | CATEGORY_MASK_UTF_16_AUTO); | |
| 89665 | 1330 detect_info->rejected |= (CATEGORY_MASK_UTF_16_LE |
| 1331 | CATEGORY_MASK_UTF_16_BE_NOSIG | |
| 1332 | CATEGORY_MASK_UTF_16_LE_NOSIG); | |
| 1333 } | |
| 1334 else | |
| 1335 { | |
| 1336 unsigned char b1[256], b2[256]; | |
| 1337 int b1_variants = 1, b2_variants = 1; | |
| 1338 int n; | |
| 1339 | |
| 1340 bzero (b1, 256), bzero (b2, 256); | |
| 1341 b1[c1]++, b2[c2]++; | |
| 1342 for (n = 0; n < 256 && src < src_end; n++) | |
| 1343 { | |
| 1344 ONE_MORE_BYTE (c1); | |
| 1345 ONE_MORE_BYTE (c2); | |
| 1346 if (! b1[c1]) b1[c1] = 1, b1_variants++; | |
| 1347 if (! b2[c2]) b2[c2] = 1, b2_variants++; | |
| 1348 } | |
| 1349 if (b1_variants < b2_variants) | |
| 1350 detect_info->found |= CATEGORY_MASK_UTF_16_BE_NOSIG; | |
| 1351 else | |
| 1352 detect_info->found |= CATEGORY_MASK_UTF_16_LE_NOSIG; | |
| 1353 detect_info->rejected | |
| 1354 |= (CATEGORY_MASK_UTF_16_BE | CATEGORY_MASK_UTF_16_LE); | |
| 89331 | 1355 } |
| 1356 no_more_source: | |
| 89193 | 1357 return 1; |
| 88365 | 1358 } |
| 1359 | |
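The detection above first looks for a byte order mark and otherwise guesses the byte order from how many distinct values appear in the first and in the second byte of each pair: the position with fewer distinct values is taken to hold the high-order bytes. A standalone sketch of that heuristic follows. The enum and `demo_guess_utf16` are hypothetical names. Unlike the function above, the sketch rejects any odd-length input rather than only in the last block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_utf16
{
  DEMO_UTF16_UNKNOWN,
  DEMO_UTF16_LE_BOM,            /* starts with FF FE */
  DEMO_UTF16_BE_BOM,            /* starts with FE FF */
  DEMO_UTF16_BE_GUESS,          /* no BOM, first bytes vary less */
  DEMO_UTF16_LE_GUESS           /* no BOM, second bytes vary less */
};

static enum demo_utf16
demo_guess_utf16 (const unsigned char *src, size_t n)
{
  unsigned char seen1[256], seen2[256];
  int variants1 = 0, variants2 = 0;
  size_t i, pairs;

  assert (src != NULL || n == 0);
  if (n < 2 || (n & 1))
    return DEMO_UTF16_UNKNOWN;
  if (src[0] == 0xFF && src[1] == 0xFE)
    return DEMO_UTF16_LE_BOM;
  if (src[0] == 0xFE && src[1] == 0xFF)
    return DEMO_UTF16_BE_BOM;

  memset (seen1, 0, sizeof seen1);
  memset (seen2, 0, sizeof seen2);
  pairs = n / 2;
  if (pairs > 257)              /* first pair plus at most 256 more */
    pairs = 257;
  for (i = 0; i < pairs; i++)
    {
      unsigned char c1 = src[2 * i], c2 = src[2 * i + 1];

      /* Count each byte value once per position.  */
      if (! seen1[c1])
        seen1[c1] = 1, variants1++;
      if (! seen2[c2])
        seen2[c2] = 1, variants2++;
    }
  return variants1 < variants2 ? DEMO_UTF16_BE_GUESS : DEMO_UTF16_LE_GUESS;
}
```

The guess is only a heuristic: text whose high and low bytes vary equally falls through to the little-endian answer, as in the function above.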
| 1360 static void | |
| 1361 decode_coding_utf_16 (coding) | |
| 1362 struct coding_system *coding; | |
| 1363 { | |
| 89483 | 1364 const unsigned char *src = coding->source + coding->consumed; |
| 1365 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 1366 const unsigned char *src_base; | |
| 88365 | 1367 int *charbuf = coding->charbuf; |
| 1368 int *charbuf_end = charbuf + coding->charbuf_size; | |
| 1369 int consumed_chars = 0, consumed_chars_base; | |
| 1370 int multibytep = coding->src_multibyte; | |
| 1371 enum utf_16_bom_type bom = CODING_UTF_16_BOM (coding); | |
| 1372 enum utf_16_endian_type endian = CODING_UTF_16_ENDIAN (coding); | |
| 1373 int surrogate = CODING_UTF_16_SURROGATE (coding); | |
| 89665 | 1374 Lisp_Object attr, charset_list; |
| 1375 | |
| 1376 CODING_GET_INFO (coding, attr, charset_list); | |
| 88365 | 1377 |
| 89420 | 1378 if (bom == utf_16_with_bom) |
| 88365 | 1379 { |
| 1380 int c, c1, c2; | |
| 1381 | |
| 1382 src_base = src; | |
| 1383 ONE_MORE_BYTE (c1); | |
| 1384 ONE_MORE_BYTE (c2); | |
| 88438 | 1385 c = (c1 << 8) | c2; |
| 89420 | 1386 |
| 1387 if (endian == utf_16_big_endian | |
| 1388 ? c != 0xFEFF : c != 0xFFFE) | |
| 88365 | 1389 { |
| 89420 | 1390 /* The first two bytes are not a BOM. Treat them as bytes |
| 1391 for a normal character. */ | |
| 1392 src = src_base; | |
| 1393 coding->errors++; | |
| 88365 | 1394 } |
| 89420 | 1395 CODING_UTF_16_BOM (coding) = utf_16_without_bom; |
| 1396 } | |
| 1397 else if (bom == utf_16_detect_bom) | |
| 1398 { | |
| 1399 /* We have already tried to detect BOM and failed in | |
| 1400 detect_coding. */ | |
| 1401 CODING_UTF_16_BOM (coding) = utf_16_without_bom; | |
| 88365 | 1402 } |
| 1403 | |
| 1404 while (1) | |
| 1405 { | |
| 1406 int c, c1, c2; | |
| 1407 | |
| 1408 src_base = src; | |
| 1409 consumed_chars_base = consumed_chars; | |
| 1410 | |
| 1411 if (charbuf + 2 >= charbuf_end) | |
| 1412 break; | |
| 1413 | |
| 1414 ONE_MORE_BYTE (c1); | |
| 1415 ONE_MORE_BYTE (c2); | |
| 1416 c = (endian == utf_16_big_endian | |
| 88438 | 1417 ? ((c1 << 8) | c2) : ((c2 << 8) | c1)); |
| 88365 | 1418 if (surrogate) |
| 1419 { | |
| 1420 if (! UTF_16_LOW_SURROGATE_P (c)) | |
| 1421 { | |
| 1422 if (endian == utf_16_big_endian) | |
| 1423 c1 = surrogate >> 8, c2 = surrogate & 0xFF; | |
| 1424 else | |
| 1425 c1 = surrogate & 0xFF, c2 = surrogate >> 8; | |
| 1426 *charbuf++ = c1; | |
| 1427 *charbuf++ = c2; | |
| 1428 coding->errors++; | |
| 1429 if (UTF_16_HIGH_SURROGATE_P (c)) | |
| 1430 CODING_UTF_16_SURROGATE (coding) = surrogate = c; | |
| 1431 else | |
| 1432 *charbuf++ = c; | |
| 1433 } | |
| 1434 else | |
| 1435 { | |
| 1436 c = 0x10000 + (((surrogate - 0xD800) << 10) | (c - 0xDC00)); | |
| 1437 CODING_UTF_16_SURROGATE (coding) = surrogate = 0; | |
| 1438 *charbuf++ = c; | |
| 1439 } | |
| 1440 } | |
| 1441 else | |
| 1442 { | |
| 1443 if (UTF_16_HIGH_SURROGATE_P (c)) | |
| 1444 CODING_UTF_16_SURROGATE (coding) = surrogate = c; | |
| 1445 else | |
| 1446 *charbuf++ = c; | |
| 89483 | 1447 } |
| 88365 | 1448 } |
| 1449 | |
| 1450 no_more_source: | |
| 1451 coding->consumed_char += consumed_chars_base; | |
| 1452 coding->consumed = src_base - coding->source; | |
| 1453 coding->charbuf_used = charbuf - coding->charbuf; | |
| 1454 } | |
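The surrogate handling in decode_coding_utf_16 can be illustrated in isolation. The following is a minimal sketch, not code from coding.c: the helper names (read_unit, is_high_surrogate, is_low_surrogate, combine_surrogates) are hypothetical, and the sketch assumes the standard UTF-16 rule that a high/low surrogate pair maps to a code point at or above 0x10000.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these exist in coding.c.  */

/* Read one 16-bit code unit from two bytes in the given byte order.  */
static uint32_t
read_unit (const unsigned char *p, int big_endian)
{
  return big_endian ? (((uint32_t) p[0] << 8) | p[1])
                    : (((uint32_t) p[1] << 8) | p[0]);
}

static int
is_high_surrogate (uint32_t u)
{
  return u >= 0xD800 && u <= 0xDBFF;
}

static int
is_low_surrogate (uint32_t u)
{
  return u >= 0xDC00 && u <= 0xDFFF;
}

/* Combine a valid surrogate pair into one code point.  The 0x10000
   offset undoes the subtraction performed when encoding.  */
static uint32_t
combine_surrogates (uint32_t high, uint32_t low)
{
  return 0x10000 + (((high - 0xD800) << 10) | (low - 0xDC00));
}
```

For example, the big-endian bytes D8 3D DE 00 form the units 0xD83D and 0xDE00, which combine to U+1F600.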
| 1455 | |
| 1456 static int | |
| 1457 encode_coding_utf_16 (coding) | |
| 1458 struct coding_system *coding; | |
| 1459 { | |
| 1460 int multibytep = coding->dst_multibyte; | |
| 1461 int *charbuf = coding->charbuf; | |
| 1462 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 1463 unsigned char *dst = coding->destination + coding->produced; | |
| 1464 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 1465 int safe_room = 8; | |
| 1466 enum utf_16_bom_type bom = CODING_UTF_16_BOM (coding); | |
| 1467 int big_endian = CODING_UTF_16_ENDIAN (coding) == utf_16_big_endian; | |
| 1468 int produced_chars = 0; | |
| 89665 | 1469 Lisp_Object attrs, charset_list; |
| 88365 | 1470 int c; |
| 1471 | |
| 89665 | 1472 CODING_GET_INFO (coding, attrs, charset_list); |
| 88365 | 1473 |
| 89420 | 1474 if (bom != utf_16_without_bom) |
| 88365 | 1475 { |
| 1476 ASSURE_DESTINATION (safe_room); | |
| 1477 if (big_endian) | |
| 89404 | 1478 EMIT_TWO_BYTES (0xFE, 0xFF); |
| 88365 | 1479 else |
| 89404 | 1480 EMIT_TWO_BYTES (0xFF, 0xFE); |
| 88365 | 1481 CODING_UTF_16_BOM (coding) = utf_16_without_bom; |
| 1482 } | |
| 1483 | |
| 1484 while (charbuf < charbuf_end) | |
| 1485 { | |
| 1486 ASSURE_DESTINATION (safe_room); | |
| 1487 c = *charbuf++; | |
| 88438 | 1488 if (c >= MAX_UNICODE_CHAR) |
| 1489 c = coding->default_char; | |
| 88365 | 1490 |
| 1491 if (c < 0x10000) | |
| 1492 { | |
| 1493 if (big_endian) | |
| 1494 EMIT_TWO_BYTES (c >> 8, c & 0xFF); | |
| 1495 else | |
| 1496 EMIT_TWO_BYTES (c & 0xFF, c >> 8); | |
| 1497 } | |
| 1498 else | |
| 1499 { | |
| 1500 int c1, c2; | |
| 1501 | |
| 1502 c -= 0x10000; | |
| 1503 c1 = (c >> 10) + 0xD800; | |
| 1504 c2 = (c & 0x3FF) + 0xDC00; | |
| 1505 if (big_endian) | |
| 1506 EMIT_FOUR_BYTES (c1 >> 8, c1 & 0xFF, c2 >> 8, c2 & 0xFF); | |
| 1507 else | |
| 1508 EMIT_FOUR_BYTES (c1 & 0xFF, c1 >> 8, c2 & 0xFF, c2 >> 8); | |
| 1509 } | |
| 1510 } | |
| 1511 coding->result = CODING_RESULT_SUCCESS; | |
| 1512 coding->produced = dst - coding->destination; | |
| 1513 coding->produced_char += produced_chars; | |
| 1514 return 0; | |
| 1515 } | |
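The byte layout produced by encode_coding_utf_16 can be sketched as a standalone function. This is an illustration under stated assumptions, not the Emacs implementation: utf16_put is a hypothetical name, the caller is assumed to pass a code point no greater than 0x10FFFF that is not itself a surrogate, and the output buffer is assumed to hold at least four bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write code point CP as UTF-16 into OUT and
   return the number of bytes written (2 or 4).  */
static size_t
utf16_put (uint32_t cp, int big_endian, unsigned char *out)
{
  uint16_t units[2];
  size_t n = 0, i;

  if (cp < 0x10000)
    units[n++] = (uint16_t) cp;
  else
    {
      /* Split into a high and a low surrogate, as in the listing.  */
      cp -= 0x10000;
      units[n++] = (uint16_t) (0xD800 + (cp >> 10));
      units[n++] = (uint16_t) (0xDC00 + (cp & 0x3FF));
    }

  for (i = 0; i < n; i++)
    {
      /* The high byte goes first for big endian, second otherwise.  */
      out[2 * i + (big_endian ? 0 : 1)] = (unsigned char) (units[i] >> 8);
      out[2 * i + (big_endian ? 1 : 0)] = (unsigned char) (units[i] & 0xFF);
    }
  return 2 * n;
}
```

Writing U+FEFF with this helper yields FE FF in big-endian order and FF FE in little-endian order, which are the same byte pairs the listing emits as the BOM.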
| 1516 | |
| 1517 | |
| 1518 /*** 6. Old Emacs' internal format (emacs-mule) ***/ | |
| 17052 | 1519 |
| 34888 | 1520 /* Emacs' internal format for representation of multiple character |
| 1521 sets is a kind of multi-byte encoding, i.e. characters are | |
| 1522 represented by variable-length sequences of one-byte codes. | |
| 29005 | 1523 |
| 1524 ASCII characters and control characters (e.g. `tab', `newline') are | |
| 1525 represented by one-byte sequences which are their ASCII codes, in | |
| 1526 the range 0x00 through 0x7F. | |
| 1527 | |
| 1528 8-bit characters of the range 0x80..0x9F are represented by | |
| 1529 two-byte sequences of LEADING_CODE_8_BIT_CONTROL and (their 8-bit | |
| 1530 code + 0x20). | |
| 1531 | |
| 1532 8-bit characters of the range 0xA0..0xFF are represented by | |
| 1533 one-byte sequences which are their 8-bit code. | |
| 1534 | |
| 1535 The other characters are represented by a sequence of `base | |
| 1536 leading-code', optional `extended leading-code', and one or two | |
| 1537 `position-code's. The length of the sequence is determined by the | |
| 34888 | 1538 base leading-code. Leading-code takes the range 0x81 through 0x9D, |
| 29005 | 1539 whereas extended leading-code and position-code take the range 0xA0 |
| 1540 through 0xFF. See `charset.h' for more details about leading-code | |
| 1541 and position-code. | |
| 18766 | 1542 |
| 17052 | 1543 --- CODE RANGE of Emacs' internal format --- |
| 29005 | 1544 character set range |
| 1545 ------------- ----- | |
| 1546 ascii 0x00..0x7F | |
| 1547 eight-bit-control LEADING_CODE_8_BIT_CONTROL + 0xA0..0xBF | |
| 1548 eight-bit-graphic 0xA0..0xFF | |
| 34888 | 1549 ELSE 0x81..0x9D + [0xA0..0xFF]+ |
| 17052 | 1550 --------------------------------------------- |
| 1551 | |
| 34888 | 1552 As this is the internal character representation, the format is |
| 1553 usually not used externally (i.e. in a file or in data sent to a | |
| 1554 process). However, text can be stored externally in this | |
| 1555 format (e.g. by encoding it with the coding system `emacs-mule'). | |
| 1556 | |
| 1557 In that case, a sequence of one-byte codes has a slightly different | |
| 1558 form. | |
| 1559 | |
| 88365 | 1560 First, all characters in eight-bit-control are represented by |
| 34888 | 1561 one-byte sequences which are their 8-bit code. |
| 1562 | |
| 1563 Next, character composition data are represented by the byte | |
| 1564 sequence of the form: 0x80 METHOD BYTES CHARS COMPONENT ..., | |
| 1565 where, | |
| 1566 METHOD is 0xF0 plus one of the composition methods (enum | |
| 1567 composition_method), | |
| 1568 | |
| 88365 | 1569 BYTES is 0xA0 plus the byte length of this composition data, |
| 1570 | |
| 1571 CHARS is 0x20 plus the number of characters composed by this | |
| 34888 | 1572 data, |
| 1573 | |
| 88365 | 1574 COMPONENTs are characters of multibyte form or composition |
| 34888 | 1575 rules encoded by two bytes of ASCII codes. |
| 1576 | |
| 1577 In addition, for backward compatibility, the following formats are | |
| 1578 also recognized as composition data on decoding. | |
| 1579 | |
| 1580 0x80 MSEQ ... | |
| 1581 0x80 0xFF MSEQ RULE MSEQ RULE ... MSEQ | |
| 1582 | |
| 1583 Here, | |
| 1584 MSEQ is a multibyte form but in this special format: | |
| 1585 ASCII: 0xA0 ASCII_CODE+0x80, | |
| 1586 other: LEADING_CODE+0x20 FOLLOWING-BYTE ..., | |
| 1587 RULE is a one byte code of the range 0xA0..0xF0 that | |
| 1588 represents a composition rule. | |
| 17052 | 1589 */ |
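The byte ranges described in the comment above can be turned into a small classifier. The sketch below is only an illustration of those ranges: the enum, mule_classify and mule_eight_bit_control_second are hypothetical names that do not appear in coding.c, and bytes 0x9E and 0x9F are reported as unspecified because the comment gives no role for them beyond the LEADING_CODE_8_BIT_CONTROL constant, whose value is not shown in this listing.

```c
/* Hypothetical classification of a single byte of emacs-mule text,
   following the ranges given in the format comment.  */
enum mule_class
{
  MULE_ASCII,        /* 0x00..0x7F: one-byte ASCII or control character */
  MULE_COMPOSITION,  /* 0x80: start of composition data (external form) */
  MULE_LEADING,      /* 0x81..0x9D: base leading-code */
  MULE_UNSPECIFIED,  /* 0x9E..0x9F: not covered by the ranges above */
  MULE_TRAILING      /* 0xA0..0xFF: extended leading-code, position-code,
                        or an eight-bit-graphic character */
};

static enum mule_class
mule_classify (unsigned char b)
{
  if (b < 0x80)
    return MULE_ASCII;
  if (b == 0x80)
    return MULE_COMPOSITION;
  if (b <= 0x9D)
    return MULE_LEADING;
  if (b < 0xA0)
    return MULE_UNSPECIFIED;
  return MULE_TRAILING;
}

/* Second byte of the internal two-byte form of an eight-bit-control
   character C (0x80..0x9F): the 8-bit code plus 0x20.  */
static unsigned char
mule_eight_bit_control_second (unsigned char c)
{
  return (unsigned char) (c + 0x20);
}
```

With this mapping, the control codes 0x80..0x9F land on second bytes 0xA0..0xBF, which matches the eight-bit-control row of the code range table.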
| 1590 | |
| 88365 | 1591 char emacs_mule_bytes[256]; |
| 1592 | |
| 1593 int | |
| 89331 | 1594 emacs_mule_char (coding, src, nbytes, nchars, id) |
| 88365 | 1595 struct coding_system *coding; |
| 88585 | 1596 unsigned char *src; |
| 89331 | 1597 int *nbytes, *nchars, *id; |
| 88365 | 1598 { |
| 89483 | 1599 const unsigned char *src_end = coding->source + coding->src_bytes; |
| 1600 const unsigned char *src_base = src; | |
| 88365 | 1601 int multibytep = coding->src_multibyte; |
| 1602 struct charset *charset; | |
| 1603 unsigned code; | |
| 1604 int c; | |
| 1605 int consumed_chars = 0; | |
| 1606 | |
| 1607 ONE_MORE_BYTE (c); | |
| 1608 switch (emacs_mule_bytes[c]) | |
| 1609 { | |
| 1610 case 2: | |
| 1611 if (! (charset = emacs_mule_charset[c])) | |
| 1612 goto invalid_code; | |
| 1613 ONE_MORE_BYTE (c); | |
| 1614 code = c & 0x7F; | |
| 1615 break; | |
| 1616 | |
| 1617 case 3: | |
| 88876 | 1618 if (c == EMACS_MULE_LEADING_CODE_PRIVATE_11 |
| 1619 || c == EMACS_MULE_LEADING_CODE_PRIVATE_12) | |
| 88365 | 1620 { |
| 1621 ONE_MORE_BYTE (c); | |
| 1622 if (! (charset = emacs_mule_charset[c])) | |
| 1623 goto invalid_code; | |
| 1624 ONE_MORE_BYTE (c); | |
| 1625 code = c & 0x7F; | |
| 1626 } | |
| 1627 else | |
| 1628 { | |
| 1629 if (! (charset = emacs_mule_charset[c])) | |
| 1630 goto invalid_code; | |
| 1631 ONE_MORE_BYTE (c); | |
| 88585 | 1632 code = (c & 0x7F) << 8; |
| 88365 | 1633 ONE_MORE_BYTE (c); |
| 1634 code |= c & 0x7F; | |
| 1635 } | |
| 1636 break; | |
| 1637 | |
| 1638 case 4: | |
| 88585 | 1639 ONE_MORE_BYTE (c); |
| 88365 | 1640 if (! (charset = emacs_mule_charset[c])) |
| 1641 goto invalid_code; | |
| 1642 ONE_MORE_BYTE (c); | |
| 88585 | 1643 code = (c & 0x7F) << 8; |
| 88365 | 1644 ONE_MORE_BYTE (c); |
| 1645 code |= c & 0x7F; | |
| 1646 break; | |
| 1647 | |
| 1648 case 1: | |
| 1649 code = c; | |
| 88950 | 1650 charset = CHARSET_FROM_ID (ASCII_BYTE_P (code) |
| 1651 ? charset_ascii : charset_eight_bit); | |
| 88365 | 1652 break; |
| 1653 | |
| 1654 default: | |
| 1655 abort (); | |
| 1656 } | |
| 1657 c = DECODE_CHAR (charset, code); | |
| 1658 if (c < 0) | |
| 1659 goto invalid_code; | |
| 1660 *nbytes = src - src_base; | |
| 1661 *nchars = consumed_chars; | |
| 89331 | 1662 if (id) |
| 1663 *id = charset->id; | |
| 88365 | 1664 return c; |
| 1665 | |
| 1666 no_more_source: | |
| 1667 return -2; | |
| 1668 | |
| 1669 invalid_code: | |
| 1670 return -1; | |
| 1671 } | |
| 1672 | |
| 17052 | 1673 |
| 1674 /* See the above "GENERAL NOTES on `detect_coding_XXX ()' functions". | |
| 89331 | 1675 Check if a text is encoded in `emacs-mule'. If it is, return 1, |
| 1676 else return 0. */ | |
| 17052 | 1677 |
| 34531 | 1678 static int |
| 89331 | 1679 detect_coding_emacs_mule (coding, detect_info) |
| 88365 | 1680 struct coding_system *coding; |
| 89331 | 1681 struct coding_detection_info *detect_info; |
| 17052 | 1682 { |
| 89483 | 1683 const unsigned char *src = coding->source, *src_base = src; |
| 1684 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 88365 | 1685 int multibytep = coding->src_multibyte; |
| 1686 int consumed_chars = 0; | |
| 1687 int c; | |
| 1688 int found = 0; | |
| 89193 | 1689 int incomplete; |
| 88365 | 1690 |
| 89331 | 1691 detect_info->checked |= CATEGORY_MASK_EMACS_MULE; |
| 88365 | 1692 /* A coding system of this category is always ASCII compatible. */ |
| 1693 src += coding->head_ascii; | |
| 29005 | 1694 |
| 1695 while (1) | |
| 17052 | 1696 { |
| 89193 | 1697 incomplete = 0; |
| 88365 | 1698 ONE_MORE_BYTE (c); |
| 89193 | 1699 incomplete = 1; |
| 88365 | 1700 |
| 1701 if (c == 0x80) | |
| 17052 | 1702 { |
| 88365 | 1703 /* Perhaps the start of a composite character. We simply skip |
| 1704 it because analyzing it is too heavy for detecting. But, | |
| 1705 at least, we check that the composite character | |
| 1706 consists of more than 4 bytes. */ | |
| 89483 | 1707 const unsigned char *src_base; |
| 88365 | 1708 |
| 1709 repeat: | |
| 1710 src_base = src; | |
| 1711 do | |
| 1712 { | |
| 1713 ONE_MORE_BYTE (c); | |
| 1714 } | |
| 1715 while (c >= 0xA0); | |
| 1716 | |
| 1717 if (src - src_base <= 4) | |
| 1718 break; | |
| 89331 | 1719 found = CATEGORY_MASK_EMACS_MULE; |
| 29005 | 1720 if (c == 0x80) |
| 88365 | 1721 goto repeat; |
| 29005 | 1722 } |
| 88365 | 1723 |
| 1724 if (c < 0x80) | |
| 1725 { | |
| 1726 if (c < 0x20 | |
| 1727 && (c == ISO_CODE_ESC || c == ISO_CODE_SI || c == ISO_CODE_SO)) | |
| 1728 break; | |
| 1729 } | |
| 1730 else | |
| 1731 { | |
| 89483 | 1732 const unsigned char *src_base = src - 1; |
| 88365 | 1733 |
| 1734 do | |
| 1735 { | |
| 1736 ONE_MORE_BYTE (c); | |
| 1737 } | |
| 1738 while (c >= 0xA0); | |
| 1739 if (src - src_base != emacs_mule_bytes[*src_base]) | |
| 1740 break; | |
| 89331 | 1741 found = CATEGORY_MASK_EMACS_MULE; |
| 88365 | 1742 } |
| 1743 } | |
| 89331 | 1744 detect_info->rejected |= CATEGORY_MASK_EMACS_MULE; |
| 88365 | 1745 return 0; |
| 1746 | |
| 1747 no_more_source: | |
| 89193 | 1748 if (incomplete && coding->mode & CODING_MODE_LAST_BLOCK) |
| 1749 { | |
| 89331 | 1750 detect_info->rejected |= CATEGORY_MASK_EMACS_MULE; |
| 89193 | 1751 return 0; |
| 1752 } | |
| 89331 | 1753 detect_info->found |= found; |
| 1754 return 1; | |
| 29005 | 1755 } |
| 1756 | |
| 1757 | |
| 88365 | 1758 /* See the above "GENERAL NOTES on `decode_coding_XXX ()' functions". */ |
| 34888 | 1759 |
| 1760 /* Decode a character represented as a component of composition | |
| 88365 | 1761 sequence of Emacs 20/21 style at SRC. Set C to that character and |
| 1762 update SRC to the head of next character (or an encoded composition | |
| 1763 rule). If SRC doesn't point to a composition component, set C to -1. | |
| 1764 If SRC points to an invalid byte sequence, exit globally with a | |
| 1765 return value of 0. */ | |
| 1766 | |
| 1767 #define DECODE_EMACS_MULE_COMPOSITION_CHAR(buf) \ | |
| 1768 if (1) \ | |
| 1769 { \ | |
| 1770 int c; \ | |
| 1771 int nbytes, nchars; \ | |
| 34888 | 1772 \ |
| 88365 | 1773 if (src == src_end) \ |
| 1774 break; \ | |
| 89331 | 1775 c = emacs_mule_char (coding, src, &nbytes, &nchars, NULL);\ |
| 88365 | 1776 if (c < 0) \ |
| 1777 { \ | |
| 1778 if (c == -2) \ | |
| 1779 break; \ | |
| 1780 goto invalid_code; \ | |
| 1781 } \ | |
| 1782 *buf++ = c; \ | |
| 1783 src += nbytes; \ | |
| 1784 consumed_chars += nchars; \ | |
| 1785 } \ | |
| 1786 else | |
| 34888 | 1787 |
| 1788 | |
| 1789 /* Decode a composition rule represented as a component of composition | |
| 88585 | 1790 sequence of Emacs 20 style at SRC. Store the decoded rule in *BUF, |
| 1791 and increment BUF. If SRC points to an invalid byte sequence, set C | |
| 1792 to -1. */ | |
| 1793 | |
| 1794 #define DECODE_EMACS_MULE_COMPOSITION_RULE_20(buf) \ | |
| 34888 | 1795 do { \ |
| 88365 | 1796 int c, gref, nref; \ |
| 1797 \ | |
| 88585 | 1798 if (src >= src_end) \ |
| 88365 | 1799 goto invalid_code; \ |
| 1800 ONE_MORE_BYTE_NO_CHECK (c); \ | |
| 88585 | 1801 c -= 0x20; \ |
| 34888 | 1802 if (c < 0 || c >= 81) \ |
| 88365 | 1803 goto invalid_code; \ |
| 1804 \ | |
| 1805 gref = c / 9, nref = c % 9; \ | |
| 1806 *buf++ = COMPOSITION_ENCODE_RULE (gref, nref); \ | |
| 1807 } while (0) | |
| 1808 | |
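The rule decoding in DECODE_EMACS_MULE_COMPOSITION_RULE_20 can be sketched as a standalone function. This is only an illustration of the arithmetic (subtract 0x20, range-check against 81, split by 9); the name `decode_rule_20` and its return convention are hypothetical and do not appear in coding.c, which jumps to `invalid_code` instead of returning an error.

```c
/* Hypothetical helper, not part of coding.c: split an Emacs 20 style
   composition rule byte into its global and new reference points.
   Returns 0 on success, -1 if the byte is outside the valid range
   (where the macro would jump to invalid_code).  */
static int
decode_rule_20 (int byte, int *gref, int *nref)
{
  int c = byte - 0x20;

  if (c < 0 || c >= 81)
    return -1;
  *gref = c / 9;   /* 0..8 */
  *nref = c % 9;   /* 0..8 */
  return 0;
}
```

For example, the byte 0x20 + 4 * 9 + 7 decodes to gref 4 and nref 7.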
| 1809 | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1810 /* Decode a composition rule represented as a component of composition |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1811 sequence of Emacs 21 style at SRC. Store the decoded rule in *BUF, |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1812 and increment BUF. If SRC points to an invalid byte sequence, set C |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1813 to -1. */ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1814 |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1815 #define DECODE_EMACS_MULE_COMPOSITION_RULE_21(buf) \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1816 do { \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1817 int gref, nref; \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1818 \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1819 if (src + 1 >= src_end) \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1820 goto invalid_code; \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1821 ONE_MORE_BYTE_NO_CHECK (gref); \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1822 gref -= 0x20; \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1823 ONE_MORE_BYTE_NO_CHECK (nref); \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1824 nref -= 0x20; \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1825 if (gref < 0 || gref >= 81 \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1826 || nref < 0 || nref >= 81) \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1827 goto invalid_code; \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1828 *buf++ = COMPOSITION_ENCODE_RULE (gref, nref); \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1829 } while (0) |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1830 |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1831 |
| 88365 | 1832 #define DECODE_EMACS_MULE_21_COMPOSITION(c) \ |
| 1833 do { \ | |
| 1834 /* Emacs 21 style format. The first three bytes at SRC are \ | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1835 (METHOD + 0xF2), (BYTES + 0xA0), (CHARS + 0xA0), where BYTES is \ |
| 88365 | 1836 the byte length of this composition information, CHARS is the \ |
| 1837 number of characters composed by this composition. */ \ | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1838 enum composition_method method = c - 0xF2; \ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1839 int *charbuf_base = charbuf; \ |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1840 int from, to; \ |
| 88365 | 1841 int consumed_chars_limit; \ |
| 1842 int nbytes, nchars; \ | |
| 1843 \ | |
| 1844 ONE_MORE_BYTE (c); \ | |
| 1845 nbytes = c - 0xA0; \ | |
| 1846 if (nbytes < 3) \ | |
| 1847 goto invalid_code; \ | |
| 1848 ONE_MORE_BYTE (c); \ | |
| 1849 nchars = c - 0xA0; \ | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1850 from = coding->produced + char_offset; \ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1851 to = from + nchars; \ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1852 ADD_COMPOSITION_DATA (charbuf, from, to, method); \ |
| 88365 | 1853 consumed_chars_limit = consumed_chars_base + nbytes; \ |
| 1854 if (method != COMPOSITION_RELATIVE) \ | |
| 1855 { \ | |
| 1856 int i = 0; \ | |
| 1857 while (consumed_chars < consumed_chars_limit) \ | |
| 1858 { \ | |
| 1859 if (i % 2 && method != COMPOSITION_WITH_ALTCHARS) \ | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1860 DECODE_EMACS_MULE_COMPOSITION_RULE_21 (charbuf); \ |
| 88365 | 1861 else \ |
| 1862 DECODE_EMACS_MULE_COMPOSITION_CHAR (charbuf); \ | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1863 i++; \ |
| 88365 | 1864 } \ |
| 1865 if (consumed_chars < consumed_chars_limit) \ | |
| 1866 goto invalid_code; \ | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1867 charbuf_base[0] -= i; \ |
| 88365 | 1868 } \ |
| 1869 } while (0) | |
| 1870 | |
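The header handling at the start of DECODE_EMACS_MULE_21_COMPOSITION can be sketched as follows. This is a minimal illustration, assuming the three header bytes are already available in a plain array; `struct comp_header_21` and `parse_comp_header_21` are hypothetical names that do not exist in coding.c, and the sketch covers only the header, not the component loop that follows it.

```c
#include <stddef.h>

/* Hypothetical container for the decoded header; not a coding.c type.  */
struct comp_header_21
{
  int method;   /* first byte minus 0xF2 */
  int nbytes;   /* second byte minus 0xA0: byte length of the
                   composition information */
  int nchars;   /* third byte minus 0xA0: number of composed characters */
};

/* Parse the three header bytes at P.  Returns 0 on success, -1 when
   fewer than three bytes are available or when NBYTES is below 3,
   which the macro also treats as invalid.  */
static int
parse_comp_header_21 (const unsigned char *p, size_t len,
                      struct comp_header_21 *h)
{
  if (len < 3)
    return -1;
  h->method = p[0] - 0xF2;
  h->nbytes = p[1] - 0xA0;
  h->nchars = p[2] - 0xA0;
  if (h->nbytes < 3)
    return -1;
  return 0;
}
```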
| 1871 | |
| 1872 #define DECODE_EMACS_MULE_20_RELATIVE_COMPOSITION(c) \ | |
| 1873 do { \ | |
| 1874 /* Emacs 20 style format for relative composition. */ \ | |
| 1875 /* Store multibyte form of characters to be composed. */ \ | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1876 enum composition_method method = COMPOSITION_RELATIVE; \ |
| 88365 | 1877 int components[MAX_COMPOSITION_COMPONENTS * 2 - 1]; \ |
| 1878 int *buf = components; \ | |
| 1879 int i, j; \ | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1880 int from, to; \ |
| 88365 | 1881 \ |
| 1882 src = src_base; \ | |
| 1883 ONE_MORE_BYTE (c); /* skip 0x80 */ \ | |
| 1884 for (i = 0; i < MAX_COMPOSITION_COMPONENTS; i++) \ | |
| 1885 DECODE_EMACS_MULE_COMPOSITION_CHAR (buf); \ | |
| 1886 if (i < 2) \ | |
| 1887 goto invalid_code; \ | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1888 from = coding->produced_char + char_offset; \ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1889 to = from + i; \ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1890 ADD_COMPOSITION_DATA (charbuf, from, to, method); \ |
| 88365 | 1891 for (j = 0; j < i; j++) \ |
| 1892 *charbuf++ = components[j]; \ | |
| 1893 } while (0) | |
| 1894 | |
| 1895 | |
| 1896 #define DECODE_EMACS_MULE_20_RULEBASE_COMPOSITION(c) \ | |
| 1897 do { \ | |
| 1898 /* Emacs 20 style format for rule-base composition. */ \ | |
| 1899 /* Store multibyte form of characters to be composed. */ \ | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1900 enum composition_method method = COMPOSITION_WITH_RULE; \ |
| 88365 | 1901 int components[MAX_COMPOSITION_COMPONENTS * 2 - 1]; \ |
| 1902 int *buf = components; \ | |
| 1903 int i, j; \ | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1904 int from, to; \ |
| 88365 | 1905 \ |
| 1906 DECODE_EMACS_MULE_COMPOSITION_CHAR (buf); \ | |
| 1907 for (i = 0; i < MAX_COMPOSITION_COMPONENTS; i++) \ | |
| 1908 { \ | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1909 DECODE_EMACS_MULE_COMPOSITION_RULE_20 (buf); \ |
| 88365 | 1910 DECODE_EMACS_MULE_COMPOSITION_CHAR (buf); \ |
| 1911 } \ | |
| 1912 if (i < 1 || (buf - components) % 2 == 0) \ | |
| 1913 goto invalid_code; \ | |
| 1914 if (charbuf + i + (i / 2) + 1 >= charbuf_end) \ | |
| 1915 goto no_more_source; \ | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1916 from = coding->produced_char + char_offset; \ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1917 to = from + i; \ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1918 ADD_COMPOSITION_DATA (charbuf, from, to, method); \ |
| 88365 | 1919 for (j = 0; j < i; j++) \ |
| 1920 *charbuf++ = components[j]; \ | |
| 1921 for (j = 0; j < i; j += 2) \ | |
| 1922 *charbuf++ = components[j]; \ | |
| 1923 } while (0) | |
| 1924 | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
1925 |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
1926 static void |
| 88365 | 1927 decode_coding_emacs_mule (coding) |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
1928 struct coding_system *coding; |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
1929 { |
| 89483 | 1930 const unsigned char *src = coding->source + coding->consumed; |
| 1931 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 1932 const unsigned char *src_base; | |
| 88365 | 1933 int *charbuf = coding->charbuf; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1934 int *charbuf_end = charbuf + coding->charbuf_size - MAX_ANNOTATION_LENGTH; |
| 88365 | 1935 int consumed_chars = 0, consumed_chars_base; |
| 1936 int multibytep = coding->src_multibyte; | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
1937 Lisp_Object attrs, charset_list; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1938 int char_offset = coding->produced_char; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1939 int last_offset = char_offset; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1940 int last_id = charset_ascii; |
| 88365 | 1941 |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
1942 CODING_GET_INFO (coding, attrs, charset_list); |
| 88365 | 1943 |
| 1944 while (1) | |
| 1945 { | |
| 1946 int c; | |
| 1947 | |
| 1948 src_base = src; | |
| 1949 consumed_chars_base = consumed_chars; | |
| 1950 | |
| 1951 if (charbuf >= charbuf_end) | |
| 1952 break; | |
| 1953 | |
| 1954 ONE_MORE_BYTE (c); | |
| 1955 | |
| 1956 if (c < 0x80) | |
|
32806
9502d0a5b2ad
(decode_coding_emacs_mule): If coding->eol_type is CR
Eli Zaretskii <eliz@gnu.org>
parents:
32745
diff
changeset
|
1957 { |
| 88365 | 1958 *charbuf++ = c; |
| 1959 char_offset++; | |
|
32806
9502d0a5b2ad
(decode_coding_emacs_mule): If coding->eol_type is CR
Eli Zaretskii <eliz@gnu.org>
parents:
32745
diff
changeset
|
1960 } |
| 88365 | 1961 else if (c == 0x80) |
|
32806
9502d0a5b2ad
(decode_coding_emacs_mule): If coding->eol_type is CR
Eli Zaretskii <eliz@gnu.org>
parents:
32745
diff
changeset
|
1962 { |
| 88365 | 1963 ONE_MORE_BYTE (c); |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1964 if (c - 0xF2 >= COMPOSITION_RELATIVE |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1965 && c - 0xF2 <= COMPOSITION_WITH_RULE_ALTCHARS) |
| 88365 | 1966 DECODE_EMACS_MULE_21_COMPOSITION (c); |
| 1967 else if (c < 0xC0) | |
| 1968 DECODE_EMACS_MULE_20_RELATIVE_COMPOSITION (c); | |
| 1969 else if (c == 0xFF) | |
| 1970 DECODE_EMACS_MULE_20_RULEBASE_COMPOSITION (c); | |
| 1971 else | |
| 1972 goto invalid_code; | |
|
32806
9502d0a5b2ad
(decode_coding_emacs_mule): If coding->eol_type is CR
Eli Zaretskii <eliz@gnu.org>
parents:
32745
diff
changeset
|
1973 } |
| 88365 | 1974 else if (c < 0xA0 && emacs_mule_bytes[c] > 1) |
|
34888
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
1975 { |
| 88365 | 1976 int nbytes, nchars; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1977 int id; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1978 |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1979 src = src_base; |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1980 consumed_chars = consumed_chars_base; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1981 c = emacs_mule_char (coding, src, &nbytes, &nchars, &id); |
| 88365 | 1982 if (c < 0) |
|
34888
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
1983 { |
| 88365 | 1984 if (c == -2) |
| 1985 break; | |
| 1986 goto invalid_code; | |
|
34888
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
1987 } |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1988 if (last_id != id) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1989 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1990 if (last_id != charset_ascii) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1991 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1992 last_id = id; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1993 last_offset = char_offset; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
1994 } |
| 88365 | 1995 *charbuf++ = c; |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1996 src += nbytes; |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
1997 consumed_chars += nchars; |
| 88365 | 1998 char_offset++; |
|
34888
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
1999 } |
| 88365 | 2000 continue; |
| 2001 | |
| 2002 invalid_code: | |
| 2003 src = src_base; | |
| 2004 consumed_chars = consumed_chars_base; | |
| 2005 ONE_MORE_BYTE (c); | |
| 2006 *charbuf++ = ASCII_BYTE_P (c) ? c : BYTE8_TO_CHAR (c); | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2007 char_offset++; |
| 88365 | 2008 coding->errors++; |
| 2009 } | |
| 2010 | |
| 2011 no_more_source: | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2012 if (last_id != charset_ascii) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2013 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); |
| 88365 | 2014 coding->consumed_char += consumed_chars_base; |
| 2015 coding->consumed = src_base - coding->source; | |
| 2016 coding->charbuf_used = charbuf - coding->charbuf; | |
| 2017 } | |
| 2018 | |
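The dispatch that decode_coding_emacs_mule performs on the byte following 0x80 can be sketched as a small classifier. This is an illustration only: the enum and function names are hypothetical, and the sketch assumes COMPOSITION_RELATIVE is 0 and COMPOSITION_WITH_RULE_ALTCHARS is 3, so that the Emacs 21 range is 0xF2..0xF5. Those enum values are not shown in this part of the file.

```c
/* Hypothetical classification of the byte that follows 0x80.  */
enum comp_format
{
  COMP_EMACS_21,            /* handled by DECODE_EMACS_MULE_21_COMPOSITION */
  COMP_EMACS_20_RELATIVE,   /* DECODE_EMACS_MULE_20_RELATIVE_COMPOSITION */
  COMP_EMACS_20_RULEBASE,   /* DECODE_EMACS_MULE_20_RULEBASE_COMPOSITION */
  COMP_INVALID              /* the decoder jumps to invalid_code */
};

/* Tests are applied in the same order as in the decoder.
   Assumption: composition methods are numbered 0..3.  */
static enum comp_format
classify_composition_byte (int c)
{
  if (c - 0xF2 >= 0 && c - 0xF2 <= 3)
    return COMP_EMACS_21;
  if (c < 0xC0)
    return COMP_EMACS_20_RELATIVE;
  if (c == 0xFF)
    return COMP_EMACS_20_RULEBASE;
  return COMP_INVALID;
}
```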
| 2019 | |
| 2020 #define EMACS_MULE_LEADING_CODES(id, codes) \ | |
| 2021 do { \ | |
| 2022 if (id < 0xA0) \ | |
| 2023 codes[0] = id, codes[1] = 0; \ | |
| 2024 else if (id < 0xE0) \ | |
| 2025 codes[0] = 0x9A, codes[1] = id; \ | |
| 2026 else if (id < 0xF0) \ | |
| 2027 codes[0] = 0x9B, codes[1] = id; \ | |
| 2028 else if (id < 0xF5) \ | |
| 2029 codes[0] = 0x9C, codes[1] = id; \ | |
| 2030 else \ | |
| 2031 codes[0] = 0x9D, codes[1] = id; \ | |
| 2032 } while (0) | |
| 2033 | |
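The mapping done by EMACS_MULE_LEADING_CODES may be easier to follow as a function. The sketch below mirrors the macro's thresholds; the function name `emacs_mule_leading_codes` is hypothetical and is not defined in coding.c.

```c
/* Hypothetical function form of EMACS_MULE_LEADING_CODES.  Charset IDs
   below 0xA0 are their own single leading code; larger IDs get one of
   the prefixes 0x9A..0x9D followed by the ID itself.  */
static void
emacs_mule_leading_codes (int id, unsigned char codes[2])
{
  if (id < 0xA0)
    {
      codes[0] = (unsigned char) id;
      codes[1] = 0;   /* no second leading code */
      return;
    }
  if (id < 0xE0)
    codes[0] = 0x9A;
  else if (id < 0xF0)
    codes[0] = 0x9B;
  else if (id < 0xF5)
    codes[0] = 0x9C;
  else
    codes[0] = 0x9D;
  codes[1] = (unsigned char) id;
}
```

A zero in codes[1] signals that only one leading byte should be emitted, which is how encode_coding_emacs_mule uses the result.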
| 2034 | |
| 2035 static int | |
| 2036 encode_coding_emacs_mule (coding) | |
| 2037 struct coding_system *coding; | |
| 2038 { | |
| 2039 int multibytep = coding->dst_multibyte; | |
| 2040 int *charbuf = coding->charbuf; | |
| 2041 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 2042 unsigned char *dst = coding->destination + coding->produced; | |
| 2043 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 2044 int safe_room = 8; | |
| 2045 int produced_chars = 0; | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
2046 Lisp_Object attrs, charset_list; |
| 88365 | 2047 int c; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2048 int preferred_charset_id = -1; |
| 88365 | 2049 |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
2050 CODING_GET_INFO (coding, attrs, charset_list); |
|
89644
fc9cda144ffc
(encode_coding_emacs_mule): Resync charset_list to
Kenichi Handa <handa@m17n.org>
parents:
89642
diff
changeset
|
2051 if (! EQ (charset_list, Vemacs_mule_charset_list)) |
|
fc9cda144ffc
(encode_coding_emacs_mule): Resync charset_list to
Kenichi Handa <handa@m17n.org>
parents:
89642
diff
changeset
|
2052 { |
|
fc9cda144ffc
(encode_coding_emacs_mule): Resync charset_list to
Kenichi Handa <handa@m17n.org>
parents:
89642
diff
changeset
|
2053 CODING_ATTR_CHARSET_LIST (attrs) |
|
fc9cda144ffc
(encode_coding_emacs_mule): Resync charset_list to
Kenichi Handa <handa@m17n.org>
parents:
89642
diff
changeset
|
2054 = charset_list = Vemacs_mule_charset_list; |
|
fc9cda144ffc
(encode_coding_emacs_mule): Resync charset_list to
Kenichi Handa <handa@m17n.org>
parents:
89642
diff
changeset
|
2055 } |
| 88365 | 2056 |
| 2057 while (charbuf < charbuf_end) | |
| 2058 { | |
| 2059 ASSURE_DESTINATION (safe_room); | |
| 2060 c = *charbuf++; | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2061 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2062 if (c < 0) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2063 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2064 /* Handle an annotation. */ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2065 switch (*charbuf) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2066 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2067 case CODING_ANNOTATE_COMPOSITION_MASK: |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2068 /* Not yet implemented. */ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2069 break; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2070 case CODING_ANNOTATE_CHARSET_MASK: |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2071 preferred_charset_id = charbuf[3]; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2072 if (preferred_charset_id >= 0 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2073 && NILP (Fmemq (make_number (preferred_charset_id), |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2074 charset_list))) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2075 preferred_charset_id = -1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2076 break; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2077 default: |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2078 abort (); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2079 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2080 charbuf += -c - 1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2081 continue; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2082 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2083 |
| 88365 | 2084 if (ASCII_CHAR_P (c)) |
| 2085 EMIT_ONE_ASCII_BYTE (c); | |
|
88690
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
2086 else if (CHAR_BYTE8_P (c)) |
|
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
2087 { |
|
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
2088 c = CHAR_TO_BYTE8 (c); |
|
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
2089 EMIT_ONE_BYTE (c); |
|
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
2090 } |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2091 else |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2092 { |
| 88365 | 2093 struct charset *charset; |
| 2094 unsigned code; | |
| 2095 int dimension; | |
| 2096 int emacs_mule_id; | |
| 2097 unsigned char leading_codes[2]; | |
| 2098 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2099 if (preferred_charset_id >= 0) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2100 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2101 charset = CHARSET_FROM_ID (preferred_charset_id); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2102 if (! CHAR_CHARSET_P (c, charset)) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2103 charset = char_charset (c, charset_list, NULL); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2104 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2105 else |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2106 charset = char_charset (c, charset_list, &code); |
| 88365 | 2107 if (! charset) |
| 2108 { | |
| 2109 c = coding->default_char; | |
| 2110 if (ASCII_CHAR_P (c)) | |
| 2111 { | |
| 2112 EMIT_ONE_ASCII_BYTE (c); | |
| 2113 continue; | |
| 2114 } | |
| 2115 charset = char_charset (c, charset_list, &code); | |
| 2116 } | |
| 2117 dimension = CHARSET_DIMENSION (charset); | |
| 2118 emacs_mule_id = CHARSET_EMACS_MULE_ID (charset); | |
| 2119 EMACS_MULE_LEADING_CODES (emacs_mule_id, leading_codes); | |
| 2120 EMIT_ONE_BYTE (leading_codes[0]); | |
| 2121 if (leading_codes[1]) | |
| 2122 EMIT_ONE_BYTE (leading_codes[1]); | |
| 2123 if (dimension == 1) | |
|
89642
e97441b6244b
(encode_coding_emacs_mule): Emit bytes with MSB.
Kenichi Handa <handa@m17n.org>
parents:
89575
diff
changeset
|
2124 EMIT_ONE_BYTE (code | 0x80); |
| 88365 | 2125 else |
| 2126 { | |
|
89642
e97441b6244b
(encode_coding_emacs_mule): Emit bytes with MSB.
Kenichi Handa <handa@m17n.org>
parents:
89575
diff
changeset
|
2127 code |= 0x8080; |
| 88365 | 2128 EMIT_ONE_BYTE (code >> 8); |
| 2129 EMIT_ONE_BYTE (code & 0xFF); | |
| 2130 } | |
| 17052 | 2131 } |
| 88365 | 2132 } |
| 2133 coding->result = CODING_RESULT_SUCCESS; | |
| 2134 coding->produced_char += produced_chars; | |
| 2135 coding->produced = dst - coding->destination; | |
| 2136 return 0; | |
|
34888
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
2137 } |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2138 |
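The last step of encode_coding_emacs_mule, emitting the code point with the most significant bit set in every byte, can be sketched in isolation. This is a minimal illustration, assuming the caller supplies a destination buffer with at least two free bytes; `emit_code_bytes` is a hypothetical name, and the real function uses EMIT_ONE_BYTE after the leading codes have been written.

```c
#include <stddef.h>

/* Hypothetical helper: write the position code of a character to DST
   with the MSB set in each byte, as the encoder does after the leading
   codes.  Returns the number of bytes written (1 or 2).  */
static size_t
emit_code_bytes (unsigned code, int dimension, unsigned char *dst)
{
  if (dimension == 1)
    {
      dst[0] = (unsigned char) (code | 0x80);
      return 1;
    }
  code |= 0x8080;                         /* set the MSB of both bytes */
  dst[0] = (unsigned char) (code >> 8);
  dst[1] = (unsigned char) (code & 0xFF);
  return 2;
}
```

For a dimension-2 code such as 0x3021 this writes the bytes 0xB0 0xA1.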
| 17052 | 2139 |
| 88365 | 2140 /*** 7. ISO2022 handlers ***/ |
| 17052 | 2141 |
| 2142 /* The following note describes the coding system ISO2022 briefly. | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2143 Since the intention of this note is to help understand the |
| 88771 | 2144 functions in this file, some parts are NOT ACCURATE or are OVERLY |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2145 SIMPLIFIED. For thorough understanding, please refer to the |
| 88771 | 2146 original document of ISO2022. This is equivalent to the standard |
| 2147 ECMA-35, obtainable from <URL:http://www.ecma.ch/> (*). | |
| 17052 | 2148 |
| 2149 ISO2022 provides many mechanisms to encode several character sets | |
| 88771 | 2150 in 7-bit and 8-bit environments. For 7-bit environments, all text |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2151 is encoded using bytes less than 128. This may make the encoded |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2152 text a little bit longer, but the text passes more easily through |
| 88771 | 2153 several types of gateway, some of which strip off the MSB (Most |
| 2154 Significant Bit). | |
| 2155 | |
| 2156 There are two kinds of character sets: control character sets and | |
| 2157 graphic character sets. The former contain control characters such | |
| 17052 | 2158 as `newline' and `escape' to provide control functions (control |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2159 functions are also provided by escape sequences). The latter |
| 88771 | 2160 contain graphic characters such as 'A' and '-'. Emacs recognizes |
| 17052 | 2161 two control character sets and many graphic character sets. |
| 2162 | |
| 2163 Graphic character sets are classified into one of the following | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2164 four classes, according to the number of bytes (DIMENSION) and |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2165 number of characters in one dimension (CHARS) of the set: |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2166 - DIMENSION1_CHARS94 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2167 - DIMENSION1_CHARS96 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2168 - DIMENSION2_CHARS94 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2169 - DIMENSION2_CHARS96 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2170 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2171 In addition, each character set is assigned an identification tag, |
| 88771 | 2172 unique for each set, called the "final character" (denoted as <F> |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2173 hereafter). The <F> of each character set is decided by ECMA(*) |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2174 when it is registered in ISO. The code range of <F> is 0x30..0x7F |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2175 (0x30..0x3F are for private use only). |
| 17052 | 2176 |
| 2177 Note (*): ECMA = European Computer Manufacturers Association | |
| 2178 | |
| 88771 | 2179 Here are examples of graphic character sets [NAME(<F>)]: |
| 17052 | 2180 o DIMENSION1_CHARS94 -- ASCII('B'), right-half-of-JISX0201('I'), ... |
| 2181 o DIMENSION1_CHARS96 -- right-half-of-ISO8859-1('A'), ... | |
| 2182 o DIMENSION2_CHARS94 -- GB2312('A'), JISX0208('B'), ... | |
| 2183 o DIMENSION2_CHARS96 -- none for the moment | |
| 2184 | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2185 A code area (1 byte=8 bits) is divided into 4 areas, C0, GL, C1, and GR. |
| 17052 | 2186 C0 [0x00..0x1F] -- control character plane 0 |
| 2187 GL [0x20..0x7F] -- graphic character plane 0 | |
| 2188 C1 [0x80..0x9F] -- control character plane 1 | |
| 2189 GR [0xA0..0xFF] -- graphic character plane 1 | |
| 2190 | |
| 2191 A control character set is directly designated and invoked to C0 or | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2192 C1 by an escape sequence. The most common case is that: |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2193 - ISO646's control character set is designated/invoked to C0, and |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2194 - ISO6429's control character set is designated/invoked to C1, |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2195 and usually these designations/invocations are omitted in encoded |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2196 text. In a 7-bit environment, only C0 can be used, and a control |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2197 character for C1 is encoded by an appropriate escape sequence to |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2198 fit into the environment. All control characters for C1 are |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2199 defined to have corresponding escape sequences. |
| 17052 | 2200 |
| 2201 A graphic character set is at first designated to one of four | |
| 2202 graphic registers (G0 through G3), then these graphic registers are | |
| 2203 invoked to GL or GR. These designations and invocations can be | |
| 2204 done independently. The most common case is that G0 is invoked to | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2205 GL, G1 is invoked to GR, and ASCII is designated to G0. Usually |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2206 these invocations and designations are omitted in encoded text. |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2207 In a 7-bit environment, only GL can be used. |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2208 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2209 When a graphic character set of CHARS94 is invoked to GL, codes |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2210 0x20 and 0x7F of the GL area work as control characters SPACE and |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2211 DEL respectively, and codes 0xA0 and 0xFF of the GR area should not |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2212 be used. |
| 17052 | 2213 |
| 2214 There are two ways of invocation: locking-shift and single-shift. | |
| 2215 With locking-shift, the invocation lasts until the next different | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2216 invocation, whereas with single-shift, the invocation affects the |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2217 following character only and doesn't affect the locking-shift |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2218 state. Invocations are done by the following control characters or |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2219 escape sequences: |
| 17052 | 2220 |
| 2221 ---------------------------------------------------------------------- | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2222 abbrev function cntrl escape seq description |
| 17052 | 2223 ---------------------------------------------------------------------- |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2224 SI/LS0 (shift-in) 0x0F none invoke G0 into GL |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2225 SO/LS1 (shift-out) 0x0E none invoke G1 into GL |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2226 LS2 (locking-shift-2) none ESC 'n' invoke G2 into GL |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2227 LS3 (locking-shift-3) none ESC 'o' invoke G3 into GL |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2228 LS1R (locking-shift-1 right) none ESC '~' invoke G1 into GR (*) |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2229 LS2R (locking-shift-2 right) none ESC '}' invoke G2 into GR (*) |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2230 LS3R (locking-shift-3 right) none ESC '|' invoke G3 into GR (*) |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2231 SS2 (single-shift-2) 0x8E ESC 'N' invoke G2 for one char |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2232 SS3 (single-shift-3) 0x8F ESC 'O' invoke G3 for one char |
| 17052 | 2233 ---------------------------------------------------------------------- |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2234 (*) These are not used by any known coding system. |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2235 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2236 Control characters for these functions are defined by macros |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2237 ISO_CODE_XXX in `coding.h'. |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2238 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2239 Designations are done by the following escape sequences: |
| 17052 | 2240 ---------------------------------------------------------------------- |
| 2241 escape sequence description | |
| 2242 ---------------------------------------------------------------------- | |
| 2243 ESC '(' <F> designate DIMENSION1_CHARS94<F> to G0 | |
| 2244 ESC ')' <F> designate DIMENSION1_CHARS94<F> to G1 | |
| 2245 ESC '*' <F> designate DIMENSION1_CHARS94<F> to G2 | |
| 2246 ESC '+' <F> designate DIMENSION1_CHARS94<F> to G3 | |
| 2247 ESC ',' <F> designate DIMENSION1_CHARS96<F> to G0 (*) | |
| 2248 ESC '-' <F> designate DIMENSION1_CHARS96<F> to G1 | |
| 2249 ESC '.' <F> designate DIMENSION1_CHARS96<F> to G2 | |
| 2250 ESC '/' <F> designate DIMENSION1_CHARS96<F> to G3 | |
| 2251 ESC '$' '(' <F> designate DIMENSION2_CHARS94<F> to G0 (**) | |
| 2252 ESC '$' ')' <F> designate DIMENSION2_CHARS94<F> to G1 | |
| 2253 ESC '$' '*' <F> designate DIMENSION2_CHARS94<F> to G2 | |
| 2254 ESC '$' '+' <F> designate DIMENSION2_CHARS94<F> to G3 | |
| 2255 ESC '$' ',' <F> designate DIMENSION2_CHARS96<F> to G0 (*) | |
| 2256 ESC '$' '-' <F> designate DIMENSION2_CHARS96<F> to G1 | |
| 2257 ESC '$' '.' <F> designate DIMENSION2_CHARS96<F> to G2 | |
| 2258 ESC '$' '/' <F> designate DIMENSION2_CHARS96<F> to G3 | |
| 2259 ---------------------------------------------------------------------- | |
| 2260 | |
| 2261 In this list, "DIMENSION1_CHARS94<F>" means a graphic character set | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2262 of dimension 1, chars 94, and final character <F>, etc... |
| 17052 | 2263 |
| 2264 Note (*): Although these designations are not allowed in ISO2022, | |
| 2265 Emacs accepts them on decoding, and produces them on encoding | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2266 CHARS96 character sets in a coding system which is characterized as |
| 17052 | 2267 7-bit environment, non-locking-shift, and non-single-shift. |
| 2268 | |
| 2269 Note (**): If <F> is '@', 'A', or 'B', the intermediate character | |
| 88365 | 2270 '(' must be omitted. We refer to this as "short-form" hereafter. |
| 2271 | |
| 88771 | 2272 Now you may notice that there are a lot of ways of encoding the |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2273 same multilingual text in ISO2022. Actually, there exist many |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2274 coding systems such as Compound Text (used in X11's inter-client |
| 88771 | 2275 communication), ISO-2022-JP (used in Japanese Internet), ISO-2022-KR |
| 2276 (used in Korean Internet), EUC (Extended UNIX Code, used in Asian | |
| 17052 | 2277 localized platforms), and all of these are variants of ISO2022. |
| 2278 | |
| 2279 In addition to the above, Emacs handles two more kinds of escape | |
| 2280 sequences: ISO6429's direction specification and Emacs' private | |
| 2281 sequence for specifying character composition. | |
| 2282 | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2283 ISO6429's direction specification takes the following form: |
| 17052 | 2284 o CSI ']' -- end of the current direction |
| 2285 o CSI '0' ']' -- end of the current direction | |
| 2286 o CSI '1' ']' -- start of left-to-right text | |
| 2287 o CSI '2' ']' -- start of right-to-left text | |
| 2288 The control character CSI (0x9B: control sequence introducer) is | |
|
24425
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2289 abbreviated to the escape sequence ESC '[' in a 7-bit environment. |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2290 |
|
61c6b3be1d51
Comment for ISO 2022 encoding mechanism modified.
Kenichi Handa <handa@m17n.org>
parents:
24344
diff
changeset
|
2291 Character composition specification takes the following form: |
| 26847 | 2292 o ESC '0' -- start relative composition |
| 2293 o ESC '1' -- end composition | |
| 2294 o ESC '2' -- start rule-based composition (*) | |
| 2295 o ESC '3' -- start relative composition with alternate chars (**) | |
| 2296 o ESC '4' -- start rule-based composition with alternate chars (**) | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2297 Since these are not standard escape sequences of any ISO standard, |
| 88771 | 2298 the use of them with these meanings is restricted to Emacs only. |
| 2299 | |
| 2300 (*) This form is used only in Emacs 20.7 and older versions, | |
| 2301 but newer versions can safely decode it. | |
| 2302 (**) This form is used only in Emacs 21.1 and newer versions, | |
| 2303 and older versions can't decode it. | |
| 2304 | |
| 2305 Here's a list of example usages of these composition escape | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2306 sequences (categorized by `enum composition_method'). |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2307 |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2308 COMPOSITION_RELATIVE: |
| 26847 | 2309 ESC 0 CHAR [ CHAR ] ESC 1 |
| 88771 | 2310 COMPOSITION_WITH_RULE: |
| 26847 | 2311 ESC 2 CHAR [ RULE CHAR ] ESC 1 |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2312 COMPOSITION_WITH_ALTCHARS: |
| 26847 | 2313 ESC 3 ALTCHAR [ ALTCHAR ] ESC 0 CHAR [ CHAR ] ESC 1 |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2314 COMPOSITION_WITH_RULE_ALTCHARS: |
| 26847 | 2315 ESC 4 ALTCHAR [ RULE ALTCHAR ] ESC 0 CHAR [ CHAR ] ESC 1 */ |
| 17052 | 2316 |
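The invocation rules described in the comment above (locking-shift versus single-shift) can be modelled with a very small state machine. The following is only an illustrative sketch, not code from coding.c: the type `struct shift_state` and the functions `shift_init`, `apply_control`, `apply_escape`, and `register_for_gl_byte` are hypothetical names introduced here. It assumes the "most common case" initial state from the comment (G0 invoked to GL, G1 invoked to GR) and uses the control codes and escape finals listed in the invocation table.

```c
#include <assert.h>

/* Hypothetical model of the ISO 2022 invocation state.  */
struct shift_state
{
  int gl;      /* register locked into GL: 0..3 for G0..G3 */
  int gr;      /* register locked into GR: 0..3 for G0..G3 */
  int single;  /* register for the next character only, or -1 */
};

enum { CTRL_SO = 0x0E, CTRL_SI = 0x0F, CTRL_SS2 = 0x8E, CTRL_SS3 = 0x8F };

/* Assumed initial state: G0 in GL, G1 in GR, no pending single shift.  */
static void
shift_init (struct shift_state *s)
{
  s->gl = 0;
  s->gr = 1;
  s->single = -1;
}

/* Apply a shift given as a control character.
   Return 1 if C was a shift function, 0 otherwise.  */
static int
apply_control (struct shift_state *s, int c)
{
  switch (c)
    {
    case CTRL_SI:  s->gl = 0; return 1;      /* LS0 */
    case CTRL_SO:  s->gl = 1; return 1;      /* LS1 */
    case CTRL_SS2: s->single = 2; return 1;
    case CTRL_SS3: s->single = 3; return 1;
    default:       return 0;
    }
}

/* Apply a shift given as the byte C that follows ESC.
   Return 1 if ESC C was a shift function, 0 otherwise.  */
static int
apply_escape (struct shift_state *s, int c)
{
  switch (c)
    {
    case 'n': s->gl = 2; return 1;           /* LS2 */
    case 'o': s->gl = 3; return 1;           /* LS3 */
    case '~': s->gr = 1; return 1;           /* LS1R */
    case '}': s->gr = 2; return 1;           /* LS2R */
    case '|': s->gr = 3; return 1;           /* LS3R */
    case 'N': s->single = 2; return 1;       /* SS2 */
    case 'O': s->single = 3; return 1;       /* SS3 */
    default:  return 0;
    }
}

/* Return the register that interprets the next graphic byte in GL.
   A pending single shift is used once and then cleared; the
   locking-shift state is left untouched.  */
static int
register_for_gl_byte (struct shift_state *s)
{
  int reg;

  assert (s->gl >= 0 && s->gl <= 3);
  reg = s->single >= 0 ? s->single : s->gl;
  s->single = -1;
  return reg;
}
```

The point of keeping `single` separate from `gl` is that a single shift must not disturb the locking-shift state, exactly as the comment states; after one character the previous GL invocation is in effect again.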
| 2317 enum iso_code_class_type iso_code_class[256]; | |
| 2318 | |
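The designation table in the ISO 2022 comment maps the bytes after ESC to a dimension, a set size (94 or 96), a graphic register, and a final character. As a rough sketch of that mapping (not the parser used in coding.c), the hypothetical function `parse_designation` below classifies the bytes following ESC. The type `struct designation` is also an assumption made for this example. The sketch accepts final bytes in 0x30..0x7E, the range ISO 2022 assigns to final bytes, and it accepts both the short form `ESC '$' <F>` and the long form `ESC '$' '(' <F>`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type; not part of coding.c.  */
struct designation
{
  int dimension;  /* 1 or 2 */
  int chars;      /* 94 or 96 */
  int reg;        /* 0..3 for G0..G3 */
  int final;      /* the final character <F> */
};

/* Parse the LEN bytes at P, which follow an ESC.  Return the number
   of bytes consumed, or 0 if they do not form a designation
   sequence.  On failure the contents of *D are unspecified.  */
static size_t
parse_designation (const unsigned char *p, size_t len,
                   struct designation *d)
{
  size_t i = 0;

  assert (p != NULL && d != NULL);
  d->dimension = 1;
  if (i < len && p[i] == '$')
    {
      d->dimension = 2;
      i++;
      /* Short form: ESC '$' <F> with <F> one of '@', 'A', 'B'.  */
      if (i < len && p[i] >= '@' && p[i] <= 'B')
        {
          d->chars = 94;
          d->reg = 0;
          d->final = p[i];
          return i + 1;
        }
    }
  /* We need an intermediate character and a final character.  */
  if (i + 1 >= len || p[i] < '(' || p[i] > '/')
    return 0;
  /* '(' ')' '*' '+' select a 94-set; ',' '-' '.' '/' a 96-set.
     Within each group the position selects G0..G3.  */
  d->chars = p[i] >= ',' ? 96 : 94;
  d->reg = (p[i] - '(') & 3;
  d->final = p[i + 1];
  if (d->final < 0x30 || d->final > 0x7E)
    return 0;
  return i + 2;
}
```

For example, the two bytes `$B` (the short form that designates JISX0208 to G0) are reported as dimension 2, 94 characters, register 0, while `-A` (right half of ISO8859-1 to G1) is reported as dimension 1, 96 characters, register 1.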
| 88365 | 2319 #define SAFE_CHARSET_P(coding, id) \ |
| 2320 ((id) <= (coding)->max_charset_id \ | |
| 2321 && (coding)->safe_charsets[id] >= 0) | |
| 2322 | |
| 2323 | |
| 2324 #define SHIFT_OUT_OK(category) \ | |
| 2325 (CODING_ISO_INITIAL (&coding_categories[category], 1) >= 0) | |
| 2326 | |
| 2327 static void | |
|
88631
780b91d4a7e5
(setup_iso_safe_charsets): Fix arg decl for K&R.
Dave Love <fx@gnu.org>
parents:
88607
diff
changeset
|
2328 setup_iso_safe_charsets (attrs) |
|
780b91d4a7e5
(setup_iso_safe_charsets): Fix arg decl for K&R.
Dave Love <fx@gnu.org>
parents:
88607
diff
changeset
|
2329 Lisp_Object attrs; |
| 88365 | 2330 { |
| 2331 Lisp_Object charset_list, safe_charsets; | |
| 2332 Lisp_Object request; | |
| 2333 Lisp_Object reg_usage; | |
| 2334 Lisp_Object tail; | |
| 2335 int reg94, reg96; | |
| 2336 int flags = XINT (AREF (attrs, coding_attr_iso_flags)); | |
| 2337 int max_charset_id; | |
| 2338 | |
| 2339 charset_list = CODING_ATTR_CHARSET_LIST (attrs); | |
| 2340 if ((flags & CODING_ISO_FLAG_FULL_SUPPORT) | |
| 2341 && ! EQ (charset_list, Viso_2022_charset_list)) | |
| 2342 { | |
| 2343 CODING_ATTR_CHARSET_LIST (attrs) | |
| 2344 = charset_list = Viso_2022_charset_list; | |
| 2345 ASET (attrs, coding_attr_safe_charsets, Qnil); | |
| 2346 } | |
| 2347 | |
| 2348 if (STRINGP (AREF (attrs, coding_attr_safe_charsets))) | |
| 2349 return; | |
| 2350 | |
| 2351 max_charset_id = 0; | |
| 2352 for (tail = charset_list; CONSP (tail); tail = XCDR (tail)) | |
| 2353 { | |
| 2354 int id = XINT (XCAR (tail)); | |
| 2355 if (max_charset_id < id) | |
| 2356 max_charset_id = id; | |
| 2357 } | |
| 2358 | |
| 2359 safe_charsets = Fmake_string (make_number (max_charset_id + 1), | |
| 2360 make_number (255)); | |
| 2361 request = AREF (attrs, coding_attr_iso_request); | |
| 2362 reg_usage = AREF (attrs, coding_attr_iso_usage); | |
| 2363 reg94 = XINT (XCAR (reg_usage)); | |
| 2364 reg96 = XINT (XCDR (reg_usage)); | |
| 2365 | |
| 2366 for (tail = charset_list; CONSP (tail); tail = XCDR (tail)) | |
| 2367 { | |
| 2368 Lisp_Object id; | |
| 2369 Lisp_Object reg; | |
| 2370 struct charset *charset; | |
| 2371 | |
| 2372 id = XCAR (tail); | |
| 2373 charset = CHARSET_FROM_ID (XINT (id)); | |
|
88681
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2374 reg = Fcdr (Fassq (id, request)); |
| 88365 | 2375 if (! NILP (reg)) |
| 89483 | 2376 SSET (safe_charsets, XINT (id), XINT (reg)); |
| 88365 | 2377 else if (charset->iso_chars_96) |
| 2378 { | |
| 2379 if (reg96 < 4) | |
| 89483 | 2380 SSET (safe_charsets, XINT (id), reg96); |
| 88365 | 2381 } |
| 2382 else | |
| 2383 { | |
| 2384 if (reg94 < 4) | |
| 89483 | 2385 SSET (safe_charsets, XINT (id), reg94); |
| 88365 | 2386 } |
| 2387 } | |
| 2388 ASET (attrs, coding_attr_safe_charsets, safe_charsets); | |
| 2389 } | |
| 2390 | |
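The table built by setup_iso_safe_charsets can be pictured as a byte array indexed by charset ID, where each entry holds the graphic register to use for that charset and the filler value 255 marks a charset that is not safe. The sketch below is a simplified standalone model under several assumptions: it uses plain C arrays instead of Lisp objects, it leaves out the per-charset request list, and the names `struct safe_table`, `build_safe_table`, `safe_charset_p`, and `NO_REGISTER` are hypothetical. It stores entries as `unsigned char` and compares against the filler explicitly, so the result does not depend on whether plain `char` is signed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NO_REGISTER 255   /* filler meaning "charset is not safe" */

/* Hypothetical stand-in for the safe_charsets string.  */
struct safe_table
{
  int max_charset_id;
  unsigned char *reg;   /* reg[id] is 0..3, or NO_REGISTER */
};

/* Build a table for the N non-negative charset IDs in IDS.
   IS_96[i] is nonzero if IDS[i] names a 96-character set.
   REG94 and REG96 are the default registers for each kind of set;
   a value of 4 or more means "no default register".
   Return 0 on success, -1 if memory cannot be allocated.  */
static int
build_safe_table (struct safe_table *t, const int *ids,
                  const int *is_96, int n, int reg94, int reg96)
{
  int i, max_id = 0;

  /* First pass: find the largest ID to size the table.  */
  for (i = 0; i < n; i++)
    if (max_id < ids[i])
      max_id = ids[i];

  t->reg = malloc ((size_t) max_id + 1);
  if (t->reg == NULL)
    return -1;
  memset (t->reg, NO_REGISTER, (size_t) max_id + 1);
  t->max_charset_id = max_id;

  /* Second pass: record a register for each charset that has one.  */
  for (i = 0; i < n; i++)
    {
      int r = is_96[i] ? reg96 : reg94;
      if (r < 4)
        t->reg[ids[i]] = (unsigned char) r;
    }
  return 0;
}

/* Nonzero if charset ID has a usable register in T.  */
static int
safe_charset_p (const struct safe_table *t, int id)
{
  return (id >= 0 && id <= t->max_charset_id
          && t->reg[id] != NO_REGISTER);
}
```

A charset that is absent from the list, or whose kind has no default register, keeps the filler value and is therefore reported as unsafe, which mirrors how the detection code uses the table to reject a category.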
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2391 |
| 17052 | 2392 /* See the above "GENERAL NOTES on `detect_coding_XXX ()' functions". |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2393 Check if a text is encoded in one of the ISO-2022 based coding systems. |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2394 If it is, return 1, else return 0. */ |
| 17052 | 2395 |
|
34531
37f85e931855
(ONE_MORE_BYTE_CHECK_MULTIBYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34197
diff
changeset
|
2396 static int |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2397 detect_coding_iso_2022 (coding, detect_info) |
| 88365 | 2398 struct coding_system *coding; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2399 struct coding_detection_info *detect_info; |
| 17052 | 2400 { |
| 89483 | 2401 const unsigned char *src = coding->source, *src_base = src; |
| 2402 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 88365 | 2403 int multibytep = coding->src_multibyte; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2404 int single_shifting = 0; |
| 88365 | 2405 int id; |
| 2406 int c, c1; | |
| 2407 int consumed_chars = 0; | |
| 2408 int i; | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2409 int rejected = 0; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2410 int found = 0; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2411 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2412 detect_info->checked |= CATEGORY_MASK_ISO; |
| 88365 | 2413 |
| 2414 for (i = coding_category_iso_7; i <= coding_category_iso_8_else; i++) | |
| 2415 { | |
| 2416 struct coding_system *this = &(coding_categories[i]); | |
| 2417 Lisp_Object attrs, val; | |
| 2418 | |
| 2419 attrs = CODING_ID_ATTRS (this->id); | |
| 2420 if (CODING_ISO_FLAGS (this) & CODING_ISO_FLAG_FULL_SUPPORT | |
| 2421 && ! EQ (CODING_ATTR_SAFE_CHARSETS (attrs), Viso_2022_charset_list)) | |
| 2422 setup_iso_safe_charsets (attrs); | |
| 2423 val = CODING_ATTR_SAFE_CHARSETS (attrs); | |
| 89483 | 2424 this->max_charset_id = SCHARS (val) - 1; |
| 2425 this->safe_charsets = (char *) SDATA (val); | |
| 88365 | 2426 } |
| 2427 | |
| 2428 /* A coding system of this category is always ASCII compatible. */ | |
| 2429 src += coding->head_ascii; | |
| 2430 | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2431 while (rejected != CATEGORY_MASK_ISO) |
| 88365 | 2432 { |
| 2433 ONE_MORE_BYTE (c); | |
| 17052 | 2434 switch (c) |
| 2435 { | |
| 2436 case ISO_CODE_ESC: | |
|
30204
35aec8514228
(inhibit_iso_escape_detection): New variable.
Kenichi Handa <handa@m17n.org>
parents:
29985
diff
changeset
|
2437 if (inhibit_iso_escape_detection) |
|
35aec8514228
(inhibit_iso_escape_detection): New variable.
Kenichi Handa <handa@m17n.org>
parents:
29985
diff
changeset
|
2438 break; |
|
23088
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2439 single_shifting = 0; |
| 88365 | 2440 ONE_MORE_BYTE (c); |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2441 if (c >= '(' && c <= '/') |
|
19134
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2442 { |
|
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2443 /* Designation sequence for a charset of dimension 1. */ |
| 88365 | 2444 ONE_MORE_BYTE (c1); |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2445 if (c1 < ' ' || c1 >= 0x80 |
| 88365 | 2446 || (id = iso_charset_table[0][c >= ','][c1]) < 0) |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2447 /* Invalid designation sequence. Just ignore it. */ |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2448 break; |
|
19134
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2449 } |
|
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2450 else if (c == '$') |
| 17052 | 2451 { |
|
19134
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2452 /* Designation sequence for a charset of dimension 2. */ |
| 88365 | 2453 ONE_MORE_BYTE (c); |
|
19134
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2454 if (c >= '@' && c <= 'B') |
|
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2455 /* Designation for JISX0208.1978, GB2312, or JISX0208. */ |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2456 id = iso_charset_table[1][0][c]; |
|
19134
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2457 else if (c >= '(' && c <= '/') |
|
17320
9d15bec5f47e
(detect_coding_iso2022, detect_coding_mask): Ignore
Kenichi Handa <handa@m17n.org>
parents:
17304
diff
changeset
|
2458 { |
| 88365 | 2459 ONE_MORE_BYTE (c1); |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2460 if (c1 < ' ' || c1 >= 0x80 |
| 88365 | 2461 || (id = iso_charset_table[1][c >= ','][c1]) < 0) |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2462 /* Invalid designation sequence. Just ignore it. */ |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2463 break; |
|
17320
9d15bec5f47e
(detect_coding_iso2022, detect_coding_mask): Ignore
Kenichi Handa <handa@m17n.org>
parents:
17304
diff
changeset
|
2464 } |
|
19134
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2465 else |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2466 /* Invalid designation sequence. Just ignore it. */ |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2467 break; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2468 } |
|
23116
6736da064f4a
(detect_coding_iso2022): Handle ESC N and ESC O
Kenichi Handa <handa@m17n.org>
parents:
23089
diff
changeset
|
2469 else if (c == 'N' || c == 'O') |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2470 { |
|
23116
6736da064f4a
(detect_coding_iso2022): Handle ESC N and ESC O
Kenichi Handa <handa@m17n.org>
parents:
23089
diff
changeset
|
2471 /* ESC <Fe> for SS2 or SS3. */ |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2472 single_shifting = 1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2473 rejected |= CATEGORY_MASK_ISO_7BIT | CATEGORY_MASK_ISO_8BIT; |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2474 break; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2475 } |
| 26847 | 2476 else if (c >= '0' && c <= '4') |
| 2477 { | |
| 2478 /* ESC <Fp> for start/end composition. */ | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2479 found |= CATEGORY_MASK_ISO; |
| 26847 | 2480 break; |
| 2481 } | |
|
19134
8fa6e23f8d22
(detect_coding_iso2022): Do not exclude posibility of
Kenichi Handa <handa@m17n.org>
parents:
19118
diff
changeset
|
2482 else |
| 88365 | 2483 { |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2484 /* Invalid escape sequence. Just ignore it. */ |
| 88365 | 2485 break; |
| 2486 } | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2487 |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2488 /* We found a valid designation sequence for CHARSET. */ |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2489 rejected |= CATEGORY_MASK_ISO_8BIT; |
| 88365 | 2490 if (SAFE_CHARSET_P (&coding_categories[coding_category_iso_7], |
| 2491 id)) | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2492 found |= CATEGORY_MASK_ISO_7; |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2493 else |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2494 rejected |= CATEGORY_MASK_ISO_7; |
| 88365 | 2495 if (SAFE_CHARSET_P (&coding_categories[coding_category_iso_7_tight], |
| 2496 id)) | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2497 found |= CATEGORY_MASK_ISO_7_TIGHT; |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2498 else |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2499 rejected |= CATEGORY_MASK_ISO_7_TIGHT; |
| 88365 | 2500 if (SAFE_CHARSET_P (&coding_categories[coding_category_iso_7_else], |
| 2501 id)) | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2502 found |= CATEGORY_MASK_ISO_7_ELSE; |
|
23116
6736da064f4a
(detect_coding_iso2022): Handle ESC N and ESC O
Kenichi Handa <handa@m17n.org>
parents:
23089
diff
changeset
|
2503 else |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2504 rejected |= CATEGORY_MASK_ISO_7_ELSE; |
| 88365 | 2505 if (SAFE_CHARSET_P (&coding_categories[coding_category_iso_8_else], |
| 2506 id)) | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2507 found |= CATEGORY_MASK_ISO_8_ELSE; |
|
23116
6736da064f4a
(detect_coding_iso2022): Handle ESC N and ESC O
Kenichi Handa <handa@m17n.org>
parents:
23089
diff
changeset
|
2508 else |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2509 rejected |= CATEGORY_MASK_ISO_8_ELSE; |
| 17052 | 2510 break; |
| 2511 | |
| 2512 case ISO_CODE_SO: | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2513 case ISO_CODE_SI: |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2514 /* Locking shift out/in. */ |
|
30204
35aec8514228
(inhibit_iso_escape_detection): New variable.
Kenichi Handa <handa@m17n.org>
parents:
29985
diff
changeset
|
2515 if (inhibit_iso_escape_detection) |
|
35aec8514228
(inhibit_iso_escape_detection): New variable.
Kenichi Handa <handa@m17n.org>
parents:
29985
diff
changeset
|
2516 break; |
|
23088
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2517 single_shifting = 0; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2518 rejected |= CATEGORY_MASK_ISO_7BIT | CATEGORY_MASK_ISO_8BIT; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2519 found |= CATEGORY_MASK_ISO_ELSE; |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2520 break; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2521 |
| 17052 | 2522 case ISO_CODE_CSI: |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2523 /* Control sequence introducer. */ |
|
23088
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2524 single_shifting = 0; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2525 rejected |= CATEGORY_MASK_ISO_7BIT | CATEGORY_MASK_ISO_7_ELSE; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2526 found |= CATEGORY_MASK_ISO_8_ELSE; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2527 goto check_extra_latin; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2528 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2529 |
| 17052 | 2530 case ISO_CODE_SS2: |
| 2531 case ISO_CODE_SS3: | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2532 /* Single shift. */ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2533 if (inhibit_iso_escape_detection) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2534 break; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2535 single_shifting = 1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2536 rejected |= CATEGORY_MASK_ISO_7BIT; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2537 if (CODING_ISO_FLAGS (&coding_categories[coding_category_iso_8_1]) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2538 & CODING_ISO_FLAG_SINGLE_SHIFT) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2539 found |= CATEGORY_MASK_ISO_8_1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2540 if (CODING_ISO_FLAGS (&coding_categories[coding_category_iso_8_2]) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2541 & CODING_ISO_FLAG_SINGLE_SHIFT) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2542 found |= CATEGORY_MASK_ISO_8_2; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2543 goto check_extra_latin; |
| 17052 | 2544 |
| 2545 default: | |
| 2546 if (c < 0x80) | |
|
23088
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2547 { |
|
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2548 single_shifting = 0; |
|
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2549 break; |
|
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2550 } |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2551 if (c >= 0xA0) |
|
19280
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
2552 { |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2553 rejected |= CATEGORY_MASK_ISO_7BIT | CATEGORY_MASK_ISO_7_ELSE; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2554 found |= CATEGORY_MASK_ISO_8_1; |
|
23088
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2555 /* Check the length of succeeding codes of the range |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2556 0xA0..0xFF. If the byte length is even, we include |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2557 CATEGORY_MASK_ISO_8_2 in `found'. We can check this |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2558 only when we are not single shifting. */ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2559 if (! single_shifting |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2560 && ! (rejected & CATEGORY_MASK_ISO_8_2)) |
|
23088
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2561 { |
|
29299
b33b38d81020
(detect_coding_iso2022): Fix code for checking
Kenichi Handa <handa@m17n.org>
parents:
29275
diff
changeset
|
2562 int i = 1; |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2563 while (src < src_end) |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2564 { |
| 88365 | 2565 ONE_MORE_BYTE (c); |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2566 if (c < 0xA0) |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2567 break; |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2568 i++; |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2569 } |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2570 |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2571 if (i & 1 && src < src_end) |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2572 rejected |= CATEGORY_MASK_ISO_8_2; |
|
23088
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2573 else |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2574 found |= CATEGORY_MASK_ISO_8_2; |
|
23088
45c36d636f66
(detect_coding_iso2022): Don't check the byte length of
Kenichi Handa <handa@m17n.org>
parents:
23082
diff
changeset
|
2575 } |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2576 break; |
| 17052 | 2577 } |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2578 check_extra_latin: |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2579 single_shifting = 0; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2580 if (! VECTORP (Vlatin_extra_code_table) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2581 || NILP (XVECTOR (Vlatin_extra_code_table)->contents[c])) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2582 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2583 rejected = CATEGORY_MASK_ISO; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2584 break; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2585 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2586 if (CODING_ISO_FLAGS (&coding_categories[coding_category_iso_8_1]) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2587 & CODING_ISO_FLAG_LATIN_EXTRA) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2588 found |= CATEGORY_MASK_ISO_8_1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2589 else |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2590 rejected |= CATEGORY_MASK_ISO_8_1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2591 if (CODING_ISO_FLAGS (&coding_categories[coding_category_iso_8_2]) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2592 & CODING_ISO_FLAG_LATIN_EXTRA) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2593 found |= CATEGORY_MASK_ISO_8_2; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2594 else |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2595 rejected |= CATEGORY_MASK_ISO_8_2; |
| 17052 | 2596 } |
| 2597 } | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2598 detect_info->rejected |= CATEGORY_MASK_ISO; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2599 return 0; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2600 |
| 88365 | 2601 no_more_source: |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2602 detect_info->rejected |= rejected; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2603 detect_info->found |= (found & ~rejected); |
| 88365 | 2604 return 1; |
| 17052 | 2605 } |
| 2606 | |
| 2607 | |
| 2608 /* Set designation state into CODING. */ | |
| 88365 | 2609 #define DECODE_DESIGNATION(reg, dim, chars_96, final) \ |
| 2610 do { \ | |
| 2611 int id, prev; \ | |
| 2612 \ | |
| 2613 if (final < '0' || final >= 128 \ | |
| 2614 || ((id = ISO_CHARSET_TABLE (dim, chars_96, final)) < 0) \ | |
| 2615 || !SAFE_CHARSET_P (coding, id)) \ | |
| 2616 { \ | |
| 2617 CODING_ISO_DESIGNATION (coding, reg) = -2; \ | |
| 2618 goto invalid_code; \ | |
| 2619 } \ | |
| 2620 prev = CODING_ISO_DESIGNATION (coding, reg); \ | |
|
88681
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2621 if (id == charset_jisx0201_roman) \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2622 { \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2623 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_USE_ROMAN) \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2624 id = charset_ascii; \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2625 } \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2626 else if (id == charset_jisx0208_1978) \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2627 { \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2628 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_USE_OLDJIS) \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2629 id = charset_jisx0208; \ |
|
2cdfbffa8a0d
(CODING_ISO_FLAG_USE_ROMAN): New macro
Kenichi Handa <handa@m17n.org>
parents:
88669
diff
changeset
|
2630 } \ |
| 88365 | 2631 CODING_ISO_DESIGNATION (coding, reg) = id; \ |
| 2632 /* If there was an invalid designation to REG previously, and this \ | |
| 2633 designation is ASCII to REG, we should keep this designation \ | |
| 2634 sequence. */ \ | |
| 2635 if (prev == -2 && id == charset_ascii) \ | |
| 2636 goto invalid_code; \ | |
| 17052 | 2637 } while (0) |
| 2638 | |
| 88365 | 2639 |
| 2640 #define MAYBE_FINISH_COMPOSITION() \ | |
| 2641 do { \ | |
| 2642 int i; \ | |
| 2643 if (composition_state == COMPOSING_NO) \ | |
| 2644 break; \ | |
| 2645 /* It is assured that we have enough room for producing \ | |
| 2646 characters stored in the table `components'. */ \ | |
| 2647 if (charbuf + component_idx > charbuf_end) \ | |
| 2648 goto no_more_source; \ | |
| 2649 composition_state = COMPOSING_NO; \ | |
| 2650 if (method == COMPOSITION_RELATIVE \ | |
| 2651 || method == COMPOSITION_WITH_ALTCHARS) \ | |
| 2652 { \ | |
| 2653 for (i = 0; i < component_idx; i++) \ | |
| 2654 *charbuf++ = components[i]; \ | |
| 2655 char_offset += component_idx; \ | |
| 2656 } \ | |
| 2657 else \ | |
| 2658 { \ | |
| 2659 for (i = 0; i < component_idx; i += 2) \ | |
| 2660 *charbuf++ = components[i]; \ | |
| 2661 char_offset += (component_idx / 2) + 1; \ | |
| 2662 } \ | |
| 2663 } while (0) | |
| 2664 | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
2665 |
|
34888
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
2666 /* Handle composition start sequence ESC 0, ESC 2, ESC 3, or ESC 4. |
|
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
2667 ESC 0 : relative composition : ESC 0 CHAR ... ESC 1 |
|
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
2668 ESC 2 : rulebase composition : ESC 2 CHAR RULE CHAR RULE ... CHAR ESC 1 |
| 88365 | 2669 ESC 3 : altchar composition : ESC 3 CHAR ... ESC 0 CHAR ... ESC 1 |
| 2670 ESC 4 : alt&rule composition : ESC 4 CHAR RULE ... CHAR ESC 0 CHAR ... ESC 1 | |
|
34888
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
2671 */ |
| 26847 | 2672 |
| 88365 | 2673 #define DECODE_COMPOSITION_START(c1) \ |
| 26847 | 2674 do { \ |
| 88365 | 2675 if (c1 == '0' \ |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2676 && composition_state == COMPOSING_COMPONENT_RULE) \ |
| 26847 | 2677 { \ |
| 88365 | 2678 component_len = component_idx; \ |
| 2679 composition_state = COMPOSING_CHAR; \ | |
| 26847 | 2680 } \ |
| 2681 else \ | |
| 2682 { \ | |
| 89483 | 2683 const unsigned char *p; \ |
| 88365 | 2684 \ |
| 2685 MAYBE_FINISH_COMPOSITION (); \ | |
| 2686 if (charbuf + MAX_COMPOSITION_COMPONENTS > charbuf_end) \ | |
| 2687 goto no_more_source; \ | |
| 2688 for (p = src; p < src_end - 1; p++) \ | |
| 2689 if (*p == ISO_CODE_ESC && p[1] == '1') \ | |
| 2690 break; \ | |
| 2691 if (p == src_end - 1) \ | |
| 2692 { \ | |
| 2693 if (coding->mode & CODING_MODE_LAST_BLOCK) \ | |
| 2694 goto invalid_code; \ | |
| 2695 goto no_more_source; \ | |
| 2696 } \ | |
| 2697 \ | |
| 2698 /* This is surely the start of a composition. */ \ | |
| 2699 method = (c1 == '0' ? COMPOSITION_RELATIVE \ | |
| 2700 : c1 == '2' ? COMPOSITION_WITH_RULE \ | |
| 2701 : c1 == '3' ? COMPOSITION_WITH_ALTCHARS \ | |
| 2702 : COMPOSITION_WITH_RULE_ALTCHARS); \ | |
| 2703 composition_state = (c1 <= '2' ? COMPOSING_CHAR \ | |
| 2704 : COMPOSING_COMPONENT_CHAR); \ | |
| 2705 component_idx = component_len = 0; \ | |
| 26847 | 2706 } \ |
| 2707 } while (0) | |
| 2708 | |
| 88365 | 2709 |
| 2710 /* Handle composition end sequence ESC 1. */ | |
| 2711 | |
| 2712 #define DECODE_COMPOSITION_END() \ | |
| 2713 do { \ | |
| 2714 int nchars = (component_len > 0 ? component_idx - component_len \ | |
| 2715 : method == COMPOSITION_RELATIVE ? component_idx \ | |
| 2716 : (component_idx + 1) / 2); \ | |
| 2717 int i; \ | |
| 2718 int *saved_charbuf = charbuf; \ | |
| 89483 | 2719 int from = char_offset; \ |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2720 int to = from + nchars; \ |
| 88365 | 2721 \ |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2722 ADD_COMPOSITION_DATA (charbuf, from, to, method); \ |
| 88365 | 2723 if (method != COMPOSITION_RELATIVE) \ |
| 2724 { \ | |
| 2725 if (component_len == 0) \ | |
| 2726 for (i = 0; i < component_idx; i++) \ | |
| 2727 *charbuf++ = components[i]; \ | |
| 2728 else \ | |
| 2729 for (i = 0; i < component_len; i++) \ | |
| 2730 *charbuf++ = components[i]; \ | |
| 2731 *saved_charbuf = saved_charbuf - charbuf; \ | |
| 2732 } \ | |
| 2733 if (method == COMPOSITION_WITH_RULE) \ | |
| 2734 for (i = 0; i < component_idx; i += 2, char_offset++) \ | |
| 2735 *charbuf++ = components[i]; \ | |
| 2736 else \ | |
| 2737 for (i = component_len; i < component_idx; i++, char_offset++) \ | |
| 2738 *charbuf++ = components[i]; \ | |
| 2739 coding->annotated = 1; \ | |
| 2740 composition_state = COMPOSING_NO; \ | |
| 2741 } while (0) | |
| 2742 | |
| 2743 | |
| 26847 | 2744 /* Decode a composition rule from the byte C1 (and maybe one more byte |
| 2745 from SRC) and store one encoded composition rule in | |
| 2746 coding->cmp_data. */ | |
| 2747 | |
| 2748 #define DECODE_COMPOSITION_RULE(c1) \ | |
| 2749 do { \ | |
| 2750 (c1) -= 32; \ | |
| 2751 if (c1 < 81) /* old format (before ver.21) */ \ | |
| 2752 { \ | |
| 2753 int gref = (c1) / 9; \ | |
| 2754 int nref = (c1) % 9; \ | |
| 2755 if (gref == 4) gref = 10; \ | |
| 2756 if (nref == 4) nref = 10; \ | |
| 88365 | 2757 c1 = COMPOSITION_ENCODE_RULE (gref, nref); \ |
| 26847 | 2758 } \ |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2759 else if (c1 < 93) /* new format (after ver.21) */ \ |
| 26847 | 2760 { \ |
| 2761 ONE_MORE_BYTE (c2); \ | |
| 88365 | 2762 c1 = COMPOSITION_ENCODE_RULE (c1 - 81, c2 - 32); \ |
| 26847 | 2763 } \ |
| 88365 | 2764 else \ |
| 2765 c1 = 0; \ | |
| 26847 | 2766 } while (0) |
| 2767 | |
| 2768 | |
| 17052 | 2769 /* See the above "GENERAL NOTES on `decode_coding_XXX ()' functions". */ |
| 2770 | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2771 static void |
| 88365 | 2772 decode_coding_iso_2022 (coding) |
| 17052 | 2773 struct coding_system *coding; |
| 2774 { | |
| 89483 | 2775 const unsigned char *src = coding->source + coding->consumed; |
| 2776 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 2777 const unsigned char *src_base; | |
| 88365 | 2778 int *charbuf = coding->charbuf; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2779 int *charbuf_end |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2780 = charbuf + coding->charbuf_size - 4 - MAX_ANNOTATION_LENGTH; |
| 88365 | 2781 int consumed_chars = 0, consumed_chars_base; |
| 2782 int multibytep = coding->src_multibyte; | |
| 17052 | 2783 /* Charsets invoked to graphic plane 0 and 1 respectively. */ |
| 88365 | 2784 int charset_id_0 = CODING_ISO_INVOKED_CHARSET (coding, 0); |
| 2785 int charset_id_1 = CODING_ISO_INVOKED_CHARSET (coding, 1); | |
| 2786 struct charset *charset; | |
| 2787 int c; | |
| 2788 /* For handling composition sequence. */ | |
| 2789 #define COMPOSING_NO 0 | |
| 2790 #define COMPOSING_CHAR 1 | |
| 2791 #define COMPOSING_RULE 2 | |
| 2792 #define COMPOSING_COMPONENT_CHAR 3 | |
| 2793 #define COMPOSING_COMPONENT_RULE 4 | |
| 2794 | |
| 2795 int composition_state = COMPOSING_NO; | |
| 2796 enum composition_method method; | |
| 2797 int components[MAX_COMPOSITION_COMPONENTS * 2 + 1]; | |
| 2798 int component_idx; | |
| 2799 int component_len; | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
2800 Lisp_Object attrs, charset_list; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2801 int char_offset = coding->produced_char; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2802 int last_offset = char_offset; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
2803 int last_id = charset_ascii; |
| 88365 | 2804 |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
2805 CODING_GET_INFO (coding, attrs, charset_list); |
| 88365 | 2806 setup_iso_safe_charsets (attrs); |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2807 |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2808 while (1) |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2809 { |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2810 int c1, c2; |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2811 |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2812 src_base = src; |
| 88365 | 2813 consumed_chars_base = consumed_chars; |
| 2814 | |
| 2815 if (charbuf >= charbuf_end) | |
| 2816 break; | |
| 2817 | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2818 ONE_MORE_BYTE (c1); |
| 17052 | 2819 |
|
89279
1fd77c471ee6
(decode_coding_utf_8): When eol_type is Qdos, handle
Kenichi Handa <handa@m17n.org>
parents:
89227
diff
changeset
|
2820 /* We produce at most one character. */ |
| 17052 | 2821 switch (iso_code_class [c1]) |
| 2822 { | |
| 2823 case ISO_0x20_or_0x7F: | |
| 88365 | 2824 if (composition_state != COMPOSING_NO) |
| 26847 | 2825 { |
| 88365 | 2826 if (composition_state == COMPOSING_RULE |
| 2827 || composition_state == COMPOSING_COMPONENT_RULE) | |
| 2828 { | |
| 2829 DECODE_COMPOSITION_RULE (c1); | |
| 2830 components[component_idx++] = c1; | |
| 2831 composition_state--; | |
| 2832 continue; | |
| 2833 } | |
| 26847 | 2834 } |
| 88365 | 2835 if (charset_id_0 < 0 |
| 2836 || ! CHARSET_ISO_CHARS_96 (CHARSET_FROM_ID (charset_id_0))) | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2837 /* This is SPACE or DEL. */ |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2838 charset = CHARSET_FROM_ID (charset_ascii); |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2839 else |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2840 charset = CHARSET_FROM_ID (charset_id_0); |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2841 break; |
| 17052 | 2842 |
| 2843 case ISO_graphic_plane_0: | |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2844 if (composition_state != COMPOSING_NO) |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2845 { |
|
88585
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2846 if (composition_state == COMPOSING_RULE |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2847 || composition_state == COMPOSING_COMPONENT_RULE) |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2848 { |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2849 DECODE_COMPOSITION_RULE (c1); |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2850 components[component_idx++] = c1; |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2851 composition_state--; |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2852 continue; |
|
c7772f702227
(ONE_MORE_BYTE_NO_CHECK): Increment consumed_chars.
Kenichi Handa <handa@m17n.org>
parents:
88573
diff
changeset
|
2853 } |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2854 } |
| 88365 | 2855 charset = CHARSET_FROM_ID (charset_id_0); |
| 17052 | 2856 break; |
| 2857 | |
| 2858 case ISO_0xA0_or_0xFF: | |
| 88365 | 2859 if (charset_id_1 < 0 |
| 2860 || ! CHARSET_ISO_CHARS_96 (CHARSET_FROM_ID (charset_id_1)) | |
| 2861 || CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SEVEN_BITS) | |
| 2862 goto invalid_code; | |
| 17052 | 2863 /* This is a graphic character; we fall through ... */ |
| 2864 | |
| 2865 case ISO_graphic_plane_1: | |
| 88365 | 2866 if (charset_id_1 < 0) |
| 2867 goto invalid_code; | |
| 2868 charset = CHARSET_FROM_ID (charset_id_1); | |
| 17052 | 2869 break; |
| 2870 | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2871 case ISO_control_0: |
| 88365 | 2872 MAYBE_FINISH_COMPOSITION (); |
| 2873 charset = CHARSET_FROM_ID (charset_ascii); | |
| 17052 | 2874 break; |
| 2875 | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2876 case ISO_control_1: |
| 88365 | 2877 MAYBE_FINISH_COMPOSITION (); |
| 2878 goto invalid_code; | |
| 17052 | 2879 |
| 2880 case ISO_shift_out: | |
| 88365 | 2881 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_LOCKING_SHIFT) |
| 2882 || CODING_ISO_DESIGNATION (coding, 1) < 0) | |
| 2883 goto invalid_code; | |
| 2884 CODING_ISO_INVOCATION (coding, 0) = 1; | |
| 2885 charset_id_0 = CODING_ISO_INVOKED_CHARSET (coding, 0); | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2886 continue; |
| 17052 | 2887 |
| 2888 case ISO_shift_in: | |
| 88365 | 2889 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_LOCKING_SHIFT)) |
| 2890 goto invalid_code; | |
| 2891 CODING_ISO_INVOCATION (coding, 0) = 0; | |
| 2892 charset_id_0 = CODING_ISO_INVOKED_CHARSET (coding, 0); | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2893 continue; |
| 17052 | 2894 |
| 2895 case ISO_single_shift_2_7: | |
| 2896 case ISO_single_shift_2: | |
| 88365 | 2897 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SINGLE_SHIFT)) |
| 2898 goto invalid_code; | |
| 17052 | 2899 /* SS2 is handled as an escape sequence of ESC 'N' */ |
| 2900 c1 = 'N'; | |
| 2901 goto label_escape_sequence; | |
| 2902 | |
| 2903 case ISO_single_shift_3: | |
| 88365 | 2904 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SINGLE_SHIFT)) |
| 2905 goto invalid_code; | |
| 17052 | 2906 /* SS3 is handled as an escape sequence of ESC 'O' */ |
| 2907 c1 = 'O'; | |
| 2908 goto label_escape_sequence; | |
| 2909 | |
| 2910 case ISO_control_sequence_introducer: | |
| 2911 /* CSI is handled as an escape sequence of ESC '[' ... */ | |
| 2912 c1 = '['; | |
| 2913 goto label_escape_sequence; | |
| 2914 | |
| 2915 case ISO_escape: | |
| 2916 ONE_MORE_BYTE (c1); | |
| 2917 label_escape_sequence: | |
| 88365 | 2918 /* Escape sequences handled here are invocation, |
| 17052 | 2919 designation, direction specification, and character |
| 2920 composition specification. */ | |
| 2921 switch (c1) | |
| 2922 { | |
| 2923 case '&': /* revision of following character set */ | |
| 2924 ONE_MORE_BYTE (c1); | |
| 2925 if (!(c1 >= '@' && c1 <= '~')) | |
| 88365 | 2926 goto invalid_code; |
| 17052 | 2927 ONE_MORE_BYTE (c1); |
| 2928 if (c1 != ISO_CODE_ESC) | |
| 88365 | 2929 goto invalid_code; |
| 17052 | 2930 ONE_MORE_BYTE (c1); |
| 2931 goto label_escape_sequence; | |
| 2932 | |
| 2933 case '$': /* designation of 2-byte character set */ | |
| 88365 | 2934 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_DESIGNATION)) |
| 2935 goto invalid_code; | |
| 17052 | 2936 ONE_MORE_BYTE (c1); |
| 2937 if (c1 >= '@' && c1 <= 'B') | |
| 2938 { /* designation of JISX0208.1978, GB2312.1980, | |
|
23339
2da87b489590
(check_composing_code): Fix previous change. Now it
Kenichi Handa <handa@m17n.org>
parents:
23325
diff
changeset
|
2939 or JISX0208.1983 */ |
| 88365 | 2940 DECODE_DESIGNATION (0, 2, 0, c1); |
| 17052 | 2941 } |
| 2942 else if (c1 >= 0x28 && c1 <= 0x2B) | |
| 2943 { /* designation of DIMENSION2_CHARS94 character set */ | |
| 2944 ONE_MORE_BYTE (c2); | |
| 88365 | 2945 DECODE_DESIGNATION (c1 - 0x28, 2, 0, c2); |
| 17052 | 2946 } |
| 2947 else if (c1 >= 0x2C && c1 <= 0x2F) | |
| 2948 { /* designation of DIMENSION2_CHARS96 character set */ | |
| 2949 ONE_MORE_BYTE (c2); | |
| 88365 | 2950 DECODE_DESIGNATION (c1 - 0x2C, 2, 1, c2); |
| 17052 | 2951 } |
| 2952 else | |
| 88365 | 2953 goto invalid_code; |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2954 /* We must update these variables now. */ |
| 88365 | 2955 charset_id_0 = CODING_ISO_INVOKED_CHARSET (coding, 0); |
| 2956 charset_id_1 = CODING_ISO_INVOKED_CHARSET (coding, 1); | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2957 continue; |
| 17052 | 2958 |
| 2959 case 'n': /* invocation of locking-shift-2 */ | |
| 88365 | 2960 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_LOCKING_SHIFT) |
| 2961 || CODING_ISO_DESIGNATION (coding, 2) < 0) | |
| 2962 goto invalid_code; | |
| 2963 CODING_ISO_INVOCATION (coding, 0) = 2; | |
| 2964 charset_id_0 = CODING_ISO_INVOKED_CHARSET (coding, 0); | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
2965 continue; |
| 17052 | 2966 |
| 2967 case 'o': /* invocation of locking-shift-3 */ | |
| 88365 | 2968 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_LOCKING_SHIFT) |
| 2969 || CODING_ISO_DESIGNATION (coding, 3) < 0) | |
| 2970 goto invalid_code; | |
| 2971 CODING_ISO_INVOCATION (coding, 0) = 3; | |
| 2972 charset_id_0 = CODING_ISO_INVOKED_CHARSET (coding, 0); | |
| 29005 | 2973 continue; |
| 17052 | 2974 |
| 2975 case 'N': /* invocation of single-shift-2 */ | |
| 88365 | 2976 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SINGLE_SHIFT) |
| 2977 || CODING_ISO_DESIGNATION (coding, 2) < 0) | |
| 2978 goto invalid_code; | |
| 2979 charset = CHARSET_FROM_ID (CODING_ISO_DESIGNATION (coding, 2)); | |
| 17052 | 2980 ONE_MORE_BYTE (c1); |
| 30578 | 2981 if (c1 < 0x20 || (c1 >= 0x80 && c1 < 0xA0)) |
| 88365 | 2982 goto invalid_code; |
| 17052 | 2983 break; |
| 2984 | |
| 2985 case 'O': /* invocation of single-shift-3 */ | |
| 88365 | 2986 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SINGLE_SHIFT) |
| 2987 || CODING_ISO_DESIGNATION (coding, 3) < 0) | |
| 2988 goto invalid_code; | |
| 2989 charset = CHARSET_FROM_ID (CODING_ISO_DESIGNATION (coding, 3)); | |
| 17052 | 2990 ONE_MORE_BYTE (c1); |
| 30578 | 2991 if (c1 < 0x20 || (c1 >= 0x80 && c1 < 0xA0)) |
| 88365 | 2992 goto invalid_code; |
| 17052 | 2993 break; |
| 2994 | |
| 26847 | 2995 case '0': case '2': case '3': case '4': /* start composition */ |
| 88365 | 2996 if (! (coding->common_flags & CODING_ANNOTATE_COMPOSITION_MASK)) |
| 2997 goto invalid_code; | |
| 26847 | 2998 DECODE_COMPOSITION_START (c1); |
| 29005 | 2999 continue; |
| 17052 | 3000 |
| 26847 | 3001 case '1': /* end composition */ |
| 88365 | 3002 if (composition_state == COMPOSING_NO) |
| 3003 goto invalid_code; | |
| 3004 DECODE_COMPOSITION_END (); | |
| 29005 | 3005 continue; |
| 17052 | 3006 |
| 3007 case '[': /* specification of direction */ | |
| 88365 | 3008 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_DIRECTION)) |
| 3009 goto invalid_code; | |
| 17052 | 3010 /* For the moment, nested direction is not supported. |
| 20718 | 3011 So, `coding->mode & CODING_MODE_DIRECTION' zero means |
| 88365 | 3012 left-to-right, and nonzero means right-to-left. */ |
| 17052 | 3013 ONE_MORE_BYTE (c1); |
| 3014 switch (c1) | |
| 3015 { | |
| 3016 case ']': /* end of the current direction */ | |
| 20718 | 3017 coding->mode &= ~CODING_MODE_DIRECTION; |
| 17052 | 3018 |
| 3019 case '0': /* end of the current direction */ | |
| 3020 case '1': /* start of left-to-right direction */ | |
| 3021 ONE_MORE_BYTE (c1); | |
| 3022 if (c1 == ']') | |
| 20718 | 3023 coding->mode &= ~CODING_MODE_DIRECTION; |
| 17052 | 3024 else |
| 88365 | 3025 goto invalid_code; |
| 17052 | 3026 break; |
| 3027 | |
| 3028 case '2': /* start of right-to-left direction */ | |
| 3029 ONE_MORE_BYTE (c1); | |
| 3030 if (c1 == ']') | |
| 20718 | 3031 coding->mode |= CODING_MODE_DIRECTION; |
| 17052 | 3032 else |
| 88365 | 3033 goto invalid_code; |
| 17052 | 3034 break; |
| 3035 | |
| 3036 default: | |
| 88365 | 3037 goto invalid_code; |
| 17052 | 3038 } |
| 29005 | 3039 continue; |
| 17052 | 3040 |
| 89442 | 3041 case '%': |
| 3042 ONE_MORE_BYTE (c1); | |
| 3043 if (c1 == '/') | |
| 3044 { | |
| 3045 /* CTEXT extended segment: | |
| 3046 ESC % / [0-4] M L --ENCODING-NAME-- \002 --BYTES-- | |
| 3047 We keep these bytes as is for the moment. | |
| 3048 They may be decoded by post-read-conversion. */ | |
| 3049 int dim, M, L; | |
| 3050 int size; | |
| 89483 | 3051 |
| 89442 | 3052 ONE_MORE_BYTE (dim); |
| 3053 ONE_MORE_BYTE (M); | |
| 3054 ONE_MORE_BYTE (L); | |
| 3055 size = ((M - 128) * 128) + (L - 128); | |
| 3056 if (charbuf + 8 + size > charbuf_end) | |
| 3057 goto break_loop; | |
| 3058 *charbuf++ = ISO_CODE_ESC; | |
| 3059 *charbuf++ = '%'; | |
| 3060 *charbuf++ = '/'; | |
| 3061 *charbuf++ = dim; | |
| 3062 *charbuf++ = BYTE8_TO_CHAR (M); | |
| 3063 *charbuf++ = BYTE8_TO_CHAR (L); | |
| 3064 while (size-- > 0) | |
| 3065 { | |
| 3066 ONE_MORE_BYTE (c1); | |
| 3067 *charbuf++ = ASCII_BYTE_P (c1) ? c1 : BYTE8_TO_CHAR (c1); | |
| 3068 } | |
| 3069 } | |
| 3070 else if (c1 == 'G') | |
| 3071 { | |
| 3072 /* XFree86 extension for embedding UTF-8 in CTEXT: | |
| 3073 ESC % G --UTF-8-BYTES-- ESC % @ | |
| 3074 We keep these bytes as is for the moment. | |
| 3075 They may be decoded by post-read-conversion. */ | |
| 3076 int *p = charbuf; | |
| 3077 | |
| 3078 if (p + 6 > charbuf_end) | |
| 3079 goto break_loop; | |
| 3080 *p++ = ISO_CODE_ESC; | |
| 3081 *p++ = '%'; | |
| 3082 *p++ = 'G'; | |
| 3083 while (p < charbuf_end) | |
| 3084 { | |
| 3085 ONE_MORE_BYTE (c1); | |
| 3086 if (c1 == ISO_CODE_ESC | |
| 3087 && src + 1 < src_end | |
| 3088 && src[0] == '%' | |
| 3089 && src[1] == '@') | |
| 3090 break; | |
| 3091 *p++ = ASCII_BYTE_P (c1) ? c1 : BYTE8_TO_CHAR (c1); | |
| 3092 } | |
| 3093 if (p + 3 > charbuf_end) | |
| 3094 goto break_loop; | |
| 3095 *p++ = ISO_CODE_ESC; | |
| 3096 *p++ = '%'; | |
| 3097 *p++ = '@'; | |
| 3098 charbuf = p; | |
| 3099 } | |
| 3100 else | |
| 3101 goto invalid_code; | |
| 3102 continue; | |
| 3103 break; | |
| 3104 | |
| 17052 | 3105 default: |
| 88365 | 3106 if (! (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_DESIGNATION)) |
| 3107 goto invalid_code; | |
| 17052 | 3108 if (c1 >= 0x28 && c1 <= 0x2B) |
| 3109 { /* designation of DIMENSION1_CHARS94 character set */ | |
| 3110 ONE_MORE_BYTE (c2); | |
| 88365 | 3111 DECODE_DESIGNATION (c1 - 0x28, 1, 0, c2); |
| 17052 | 3112 } |
| 3113 else if (c1 >= 0x2C && c1 <= 0x2F) | |
| 3114 { /* designation of DIMENSION1_CHARS96 character set */ | |
| 3115 ONE_MORE_BYTE (c2); | |
| 88365 | 3116 DECODE_DESIGNATION (c1 - 0x2C, 1, 1, c2); |
| 17052 | 3117 } |
| 3118 else | |
| 88365 | 3119 goto invalid_code; |
| 29005 | 3120 /* We must update these variables now. */ |
| 88365 | 3121 charset_id_0 = CODING_ISO_INVOKED_CHARSET (coding, 0); |
| 3122 charset_id_1 = CODING_ISO_INVOKED_CHARSET (coding, 1); | |
| 29005 | 3123 continue; |
| 17052 | 3124 } |
| 29005 | 3125 } |
| 3126 | |
| 89331 | 3127 if (charset->id != charset_ascii |
| 3128 && last_id != charset->id) | |
| 3129 { | |
| 3130 if (last_id != charset_ascii) | |
| 3131 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); | |
| 3132 last_id = charset->id; | |
| 3133 last_offset = char_offset; | |
| 3134 } | |
| 3135 | |
| 29005 | 3136 /* Now we know CHARSET and 1st position code C1 of a character. |
| 88365 | 3137 Produce a decoded character while getting 2nd position code |
| 3138 C2 if necessary. */ | |
| 3139 c1 &= 0x7F; | |
| 3140 if (CHARSET_DIMENSION (charset) > 1) | |
| 29005 | 3141 { |
| 3142 ONE_MORE_BYTE (c2); | |
| 88365 | 3143 if (c2 < 0x20 || (c2 >= 0x80 && c2 < 0xA0)) |
| 29005 | 3144 /* C2 is not in a valid range. */ |
| 88365 | 3145 goto invalid_code; |
| 3146 c1 = (c1 << 8) | (c2 & 0x7F); | |
| 3147 if (CHARSET_DIMENSION (charset) > 2) | |
| 3148 { | |
| 3149 ONE_MORE_BYTE (c2); | |
| 3150 if (c2 < 0x20 || (c2 >= 0x80 && c2 < 0xA0)) | |
| 3151 /* C2 is not in a valid range. */ | |
| 3152 goto invalid_code; | |
| 3153 c1 = (c1 << 8) | (c2 & 0x7F); | |
| 3154 } | |
| 17052 | 3155 } |
| 88365 | 3156 |
| 3157 CODING_DECODE_CHAR (coding, src, src_base, src_end, charset, c1, c); | |
| 3158 if (c < 0) | |
| 3159 { | |
| 3160 MAYBE_FINISH_COMPOSITION (); | |
| 3161 for (; src_base < src; src_base++, char_offset++) | |
| 3162 { | |
| 3163 if (ASCII_BYTE_P (*src_base)) | |
| 3164 *charbuf++ = *src_base; | |
| 3165 else | |
| 3166 *charbuf++ = BYTE8_TO_CHAR (*src_base); | |
| 3167 } | |
| 3168 } | |
| 3169 else if (composition_state == COMPOSING_NO) | |
| 3170 { | |
| 3171 *charbuf++ = c; | |
| 3172 char_offset++; | |
| 3173 } | |
| 3174 else | |
| 88585 | 3175 { |
| 3176 components[component_idx++] = c; | |
| 3177 if (method == COMPOSITION_WITH_RULE | |
| 3178 || (method == COMPOSITION_WITH_RULE_ALTCHARS | |
| 3179 && composition_state == COMPOSING_COMPONENT_CHAR)) | |
| 3180 composition_state++; | |
| 3181 } | |
| 17052 | 3182 continue; |
| 3183 | |
| 88365 | 3184 invalid_code: |
| 3185 MAYBE_FINISH_COMPOSITION (); | |
| 3186 src = src_base; | |
| 3187 consumed_chars = consumed_chars_base; | |
| 3188 ONE_MORE_BYTE (c); | |
| 3189 *charbuf++ = ASCII_BYTE_P (c) ? c : BYTE8_TO_CHAR (c); | |
| 89331 | 3190 char_offset++; |
| 29005 | 3191 coding->errors++; |
| 89442 | 3192 continue; |
| 3193 | |
| 3194 break_loop: | |
| 3195 break; | |
| 88365 | 3196 } |
| 3197 | |
| 3198 no_more_source: | |
| 89331 | 3199 if (last_id != charset_ascii) |
| 3200 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); | |
| 88365 | 3201 coding->consumed_char += consumed_chars_base; |
| 3202 coding->consumed = src_base - coding->source; | |
| 3203 coding->charbuf_used = charbuf - coding->charbuf; | |
| 17052 | 3204 } |
| 3205 | |
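The CTEXT extended-segment branch of the decoder above takes the payload length from the two bytes M and L, each of which carries seven bits of the length with the high bit set. The following is a minimal standalone sketch of that arithmetic. The function name `ctext_segment_size` is hypothetical and not part of coding.c, and the range check is an addition that the decoder itself does not perform.

```c
#include <assert.h>

/* Hypothetical helper (not in coding.c): compute the number of bytes
   that follow "ESC % / F M L" in a CTEXT extended segment.  M and L
   each carry 7 bits of the length with bit 7 set, so valid values are
   0x80..0xFF.  Returns -1 if either byte is outside that range; this
   check is an assumption added for the sketch.  */
int
ctext_segment_size (int m, int l)
{
  if (m < 0x80 || m > 0xFF || l < 0x80 || l > 0xFF)
    return -1;
  /* Same formula as the decoder: high 7 bits from M, low 7 from L.  */
  return ((m - 128) * 128) + (l - 128);
}
```

With this layout the largest representable segment is 127 * 128 + 127 = 16383 bytes.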
| 29005 | 3206 |
| 18766 | 3207 /* ISO2022 encoding stuff. */ |
| 17052 | 3208 |
| 3209 /* | |
| 18766 | 3210 Saying just "ISO2022" is not enough for encoding; we have to |
| 88365 | 3211 specify more details. In Emacs, each coding system of ISO2022 |
| 17052 | 3212 variant has the following specifications: |
| 88365 | 3213 1. Initial designation to G0 through G3. |
| 17052 | 3214 2. Allows short-form designation? |
| 3215 3. ASCII should be designated to G0 before control characters? | |
| 3216 4. ASCII should be designated to G0 at end of line? | |
| 3217 5. 7-bit environment or 8-bit environment? | |
| 3218 6. Use locking-shift? | |
| 3219 7. Use single-shift? | |
| 3220 And the following two are only for Japanese: | |
| 3221 8. Use ASCII in place of JISX0201-1976-Roman? | |
| 3222 9. Use JISX0208-1983 in place of JISX0208-1978? | |
| 88365 | 3223 These specifications are encoded in CODING_ISO_FLAGS (coding) as flag bits |
| 3224 defined by macros CODING_ISO_FLAG_XXX. See `coding.h' for more | |
| 18766 | 3225 details. |
| 17052 | 3226 */ |
| 3227 | |
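The comment above says that the per-coding-system choices are stored as flag bits and tested by masking. The sketch below illustrates that pattern with made-up names and values: the `DEMO_FLAG_*` constants are hypothetical, and the real CODING_ISO_FLAG_XXX values live in coding.h and may differ. It also spells out a precedence pitfall worth keeping in mind when writing such tests, since `! flags & MASK` parses as `(! flags) & MASK`.

```c
#include <assert.h>

/* Illustrative flag bits only.  The real CODING_ISO_FLAG_XXX values
   are defined in coding.h and may differ.  */
enum
{
  DEMO_FLAG_LONG_FORM     = 0x0001,
  DEMO_FLAG_SEVEN_BITS    = 0x0002,
  DEMO_FLAG_LOCKING_SHIFT = 0x0004,
  DEMO_FLAG_SINGLE_SHIFT  = 0x0008
};

/* Return 1 if no bit of MASK is set in FLAGS, else 0.  */
int
demo_flag_unset_p (unsigned flags, unsigned mask)
{
  return ! (flags & mask);
}

/* What "! flags & mask" actually computes: the negation applies to
   FLAGS first, so the result is 0 whenever any flag at all is set.  */
int
demo_flag_unset_wrong (unsigned flags, unsigned mask)
{
  return (! flags) & mask;
}
```

For a coding system that has only the seven-bit flag set, `demo_flag_unset_p` correctly reports that locking shift is disabled, while the unparenthesized form reports 0 and would let the escape sequence through.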
| 3228 /* Produce codes (escape sequence) for designating CHARSET to graphic | |
| 29005 | 3229 register REG at DST, and increment DST. If <final-char> of CHARSET is |
| 3230 '@', 'A', or 'B' and the coding system CODING allows, produce | |
| 3231 designation sequence of short-form. */ | |
| 17052 | 3232 |
| 3233 #define ENCODE_DESIGNATION(charset, reg, coding) \ | |
| 3234 do { \ | |
| 88365 | 3235 unsigned char final_char = CHARSET_ISO_FINAL (charset); \ |
| 17052 | 3236 char *intermediate_char_94 = "()*+"; \ |
| 3237 char *intermediate_char_96 = ",-./"; \ | |
| 88365 | 3238 int revision = -1; \ |
| 3239 int c; \ | |
| 3240 \ | |
| 3241 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_REVISION) \ | |
| 88856 | 3242 revision = CHARSET_ISO_REVISION (charset); \ |
| 88365 | 3243 \ |
| 3244 if (revision >= 0) \ | |
| 20150 | 3245 { \ |
| 88365 | 3246 EMIT_TWO_ASCII_BYTES (ISO_CODE_ESC, '&'); \ |
| 3247 EMIT_ONE_BYTE ('@' + revision); \ | |
| 17052 | 3248 } \ |
| 88365 | 3249 EMIT_ONE_ASCII_BYTE (ISO_CODE_ESC); \ |
| 17052 | 3250 if (CHARSET_DIMENSION (charset) == 1) \ |
| 3251 { \ | |
| 88365 | 3252 if (! CHARSET_ISO_CHARS_96 (charset)) \ |
| 3253 c = intermediate_char_94[reg]; \ | |
| 17052 | 3254 else \ |
| 88365 | 3255 c = intermediate_char_96[reg]; \ |
| 3256 EMIT_ONE_ASCII_BYTE (c); \ | |
| 17052 | 3257 } \ |
| 3258 else \ | |
| 3259 { \ | |
| 88365 | 3260 EMIT_ONE_ASCII_BYTE ('$'); \ |
| 3261 if (! CHARSET_ISO_CHARS_96 (charset)) \ | |
| 17052 | 3262 { \ |
| 88365 | 3263 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_LONG_FORM \ |
| 29005 | 3264 || reg != 0 \ |
| 3265 || final_char < '@' || final_char > 'B') \ | |
| 88365 | 3266 EMIT_ONE_ASCII_BYTE (intermediate_char_94[reg]); \ |
| 17052 | 3267 } \ |
| 3268 else \ | |
| 88365 | 3269 EMIT_ONE_ASCII_BYTE (intermediate_char_96[reg]); \ |
| 17052 | 3270 } \ |
| 88365 | 3271 EMIT_ONE_ASCII_BYTE (final_char); \ |
| 3272 \ | |
| 3273 CODING_ISO_DESIGNATION (coding, reg) = CHARSET_ID (charset); \ | |
| 17052 | 3274 } while (0) |
| 3275 | |
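To make the byte layout produced by ENCODE_DESIGNATION easier to follow, here is a rough standalone sketch of the same decision tree as an ordinary function. It is not the Emacs implementation: the name `demo_designation`, its parameters, and the `DEMO_ESC` constant are assumptions made for illustration, the revision prefix (ESC & ...) is left out, and the coding-system state update is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ESC 0x1B

/* Hypothetical standalone version of the logic in ENCODE_DESIGNATION
   (revision handling omitted).  Writes the escape sequence that
   designates a charset to graphic register REG (0..3) into DST and
   returns the number of bytes written (at most 4).  DIMENSION is 1 or
   2, CHARS96 is nonzero for a 96-character set, FINAL_CHAR is the ISO
   final byte, and LONG_FORM forces the intermediate byte even where
   the short form would be allowed.  */
size_t
demo_designation (unsigned char *dst, int dimension, int chars96,
                  int reg, unsigned char final_char, int long_form)
{
  static const char intermediate_94[] = "()*+";
  static const char intermediate_96[] = ",-./";
  size_t n = 0;

  assert (reg >= 0 && reg <= 3);
  dst[n++] = DEMO_ESC;
  if (dimension == 1)
    dst[n++] = chars96 ? intermediate_96[reg] : intermediate_94[reg];
  else
    {
      dst[n++] = '$';
      if (chars96)
        dst[n++] = intermediate_96[reg];
      else if (long_form || reg != 0
               || final_char < '@' || final_char > 'B')
        dst[n++] = intermediate_94[reg];
      /* Otherwise: short form "ESC $ F" for G0 with F in '@'..'B'.  */
    }
  dst[n++] = final_char;
  return n;
}
```

For example, designating a two-byte 94-character set with final byte 'B' to G0 yields the three bytes ESC $ B in short form and ESC $ ( B when the long form is requested.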
| 88365 | 3276 |
| 17052 | 3277 /* The following two macros produce codes (control character or escape |
| 3278 sequence) for ISO2022 single-shift functions (single-shift-2 and | |
| 3279 single-shift-3). */ | |
| 3280 | |
| 88365 | 3281 #define ENCODE_SINGLE_SHIFT_2 \ |
| 3282 do { \ | |
| 3283 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SEVEN_BITS) \ | |
| 3284 EMIT_TWO_ASCII_BYTES (ISO_CODE_ESC, 'N'); \ | |
| 3285 else \ | |
| 3286 EMIT_ONE_BYTE (ISO_CODE_SS2); \ | |
| 3287 CODING_ISO_SINGLE_SHIFTING (coding) = 1; \ | |
| 17052 | 3288 } while (0) |
| 3289 | |
| 88365 | 3290 |
| 3291 #define ENCODE_SINGLE_SHIFT_3 \ | |
| 3292 do { \ | |
| 3293 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SEVEN_BITS) \ | |
| 3294 EMIT_TWO_ASCII_BYTES (ISO_CODE_ESC, 'O'); \ | |
| 3295 else \ | |
| 3296 EMIT_ONE_BYTE (ISO_CODE_SS3); \ | |
| 3297 CODING_ISO_SINGLE_SHIFTING (coding) = 1; \ | |
| 17052 | 3298 } while (0) |
| 3299 | |
| 88365 | 3300 |
| 17052 | 3301 /* The following four macros produce codes (control character or |
| 3302 escape sequence) for ISO2022 locking-shift functions (shift-in, | |
| 3303 shift-out, locking-shift-2, and locking-shift-3). */ | |
| 3304 | |
| 88365 | 3305 #define ENCODE_SHIFT_IN \ |
| 3306 do { \ | |
| 3307 EMIT_ONE_ASCII_BYTE (ISO_CODE_SI); \ | |
| 3308 CODING_ISO_INVOCATION (coding, 0) = 0; \ | |
| 17052 | 3309 } while (0) |
| 3310 | |
| 88365 | 3311 |
| 3312 #define ENCODE_SHIFT_OUT \ | |
| 3313 do { \ | |
| 3314 EMIT_ONE_ASCII_BYTE (ISO_CODE_SO); \ | |
| 3315 CODING_ISO_INVOCATION (coding, 0) = 1; \ | |
| 17052 | 3316 } while (0) |
| 3317 | |
| 88365 | 3318 |
| 3319 #define ENCODE_LOCKING_SHIFT_2 \ | |
| 3320 do { \ | |
| 3321 EMIT_TWO_ASCII_BYTES (ISO_CODE_ESC, 'n'); \ | |
| 3322 CODING_ISO_INVOCATION (coding, 0) = 2; \ | |
| 17052 | 3323 } while (0) |
| 3324 | |
| 88365 | 3325 |
| 3326 #define ENCODE_LOCKING_SHIFT_3 \ | |
| 3327 do { \ | |
| 3328 EMIT_TWO_ASCII_BYTES (ISO_CODE_ESC, 'o'); \ | |
| 3329 CODING_ISO_INVOCATION (coding, 0) = 3; \ | |
| 17052 | 3330 } while (0) |
| 3331 | |
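The single-shift and locking-shift macros above each emit one control byte or a two-byte escape sequence. The sketch below gathers those byte values into two plain functions so they can be inspected side by side. The function names and `SHIFT_*` constants are hypothetical stand-ins for the ISO_CODE_* values, and the final bytes for the locking shifts follow the decoder's case labels earlier in this file ('n' for locking-shift-2, 'o' for locking-shift-3).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the ISO_CODE_* constants used by coding.c.  */
enum
{
  SHIFT_SO  = 0x0E,   /* shift-out: invoke G1 into GL */
  SHIFT_SI  = 0x0F,   /* shift-in: invoke G0 into GL */
  SHIFT_ESC = 0x1B,
  SHIFT_SS2 = 0x8E,   /* 8-bit single-shift-2 */
  SHIFT_SS3 = 0x8F    /* 8-bit single-shift-3 */
};

/* Single-shift to G2 (N == 2) or G3 (N == 3).  In a 7-bit environment
   this is ESC N or ESC O; in an 8-bit environment it is the C1 control
   SS2 or SS3.  Returns the number of bytes written (1 or 2).  */
size_t
demo_single_shift (unsigned char *dst, int n, int seven_bits)
{
  assert (n == 2 || n == 3);
  if (seven_bits)
    {
      dst[0] = SHIFT_ESC;
      dst[1] = (n == 2) ? 'N' : 'O';
      return 2;
    }
  dst[0] = (n == 2) ? SHIFT_SS2 : SHIFT_SS3;
  return 1;
}

/* Locking shift that invokes graphic register REG (0..3) into GL:
   SI, SO, ESC n (LS2), ESC o (LS3).  Returns bytes written.  */
size_t
demo_locking_shift (unsigned char *dst, int reg)
{
  assert (reg >= 0 && reg <= 3);
  switch (reg)
    {
    case 0: dst[0] = SHIFT_SI; return 1;
    case 1: dst[0] = SHIFT_SO; return 1;
    default:
      dst[0] = SHIFT_ESC;
      dst[1] = (reg == 2) ? 'n' : 'o';
      return 2;
    }
}
```

Unlike the macros, these functions do not record the new invocation or single-shifting state; a caller would have to track that separately.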
| 88365 | 3332 |
| 18766 | 3333 /* Produce codes for a DIMENSION1 character whose character set is |
| 3334 CHARSET and whose position-code is C1. Designation and invocation | |
| 17052 | 3335 sequences are also produced in advance if necessary. */ |
| 3336 | |
| 19285 | 3337 #define ENCODE_ISO_CHARACTER_DIMENSION1(charset, c1) \ |
| 3338 do { \ | |
| 88365 | 3339 int id = CHARSET_ID (charset); \ |
| 88681 | 3340 \ |
| 3341 if ((CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_USE_ROMAN) \ | |
| 3342 && id == charset_ascii) \ | |
| 3343 { \ | |
| 3344 id = charset_jisx0201_roman; \ | |
| 3345 charset = CHARSET_FROM_ID (id); \ | |
| 3346 } \ | |
| 3347 \ | |
| 88365 | 3348 if (CODING_ISO_SINGLE_SHIFTING (coding)) \ |
| 19285 | 3349 { \ |
| 88365 | 3350 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SEVEN_BITS) \ |
| 3351 EMIT_ONE_ASCII_BYTE (c1 & 0x7F); \ | |
| 19285 | 3352 else \ |
| 88365 | 3353 EMIT_ONE_BYTE (c1 | 0x80); \ |
| 3354 CODING_ISO_SINGLE_SHIFTING (coding) = 0; \ | |
| 19285 | 3355 break; \ |
| 3356 } \ | |
| 88365 | 3357 else if (id == CODING_ISO_INVOKED_CHARSET (coding, 0)) \ |
| 19285 | 3358 { \ |
| 88365 | 3359 EMIT_ONE_ASCII_BYTE (c1 & 0x7F); \ |
| 19285 | 3360 break; \ |
| 3361 } \ | |
| 88365 | 3362 else if (id == CODING_ISO_INVOKED_CHARSET (coding, 1)) \ |
| 19285 | 3363 { \ |
| 88365 | 3364 EMIT_ONE_BYTE (c1 | 0x80); \ |
| 19285 | 3365 break; \ |
| 3366 } \ | |
| 3367 else \ | |
| 3368 /* Since CHARSET is not yet invoked to any graphic planes, we \ | |
| 3369 must invoke it, or, at first, designate it to some graphic \ | |
| 3370 register. Then repeat the loop to actually produce the \ | |
| 3371 character. */ \ | |
| 88365 | 3372 dst = encode_invocation_designation (charset, coding, dst, \ |
| 3373 &produced_chars); \ | |
| 17052 | 3374 } while (1) |
| 3375 | |
| 88365 | 3376 |
| 3377 /* Produce codes for a DIMENSION2 character whose character set is | |
| 3378 CHARSET and whose position-codes are C1 and C2. Designation and | |
| 3379 invocation codes are also produced in advance if necessary. */ | |
| 3380 | |
| 3381 #define ENCODE_ISO_CHARACTER_DIMENSION2(charset, c1, c2) \ | |
| 24506 | 3382 do { \ |
| 88365 | 3383 int id = CHARSET_ID (charset); \ |
| 88681 | 3384 \ |
| 3385 if ((CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_USE_OLDJIS) \ | |
| 3386 && id == charset_jisx0208) \ | |
| 3387 { \ | |
| 3388 id = charset_jisx0208_1978; \ | |
| 3389 charset = CHARSET_FROM_ID (id); \ | |
| 3390 } \ | |
| 3391 \ | |
| 88365 | 3392 if (CODING_ISO_SINGLE_SHIFTING (coding)) \ |
| 3393 { \ | |
| 3394 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SEVEN_BITS) \ | |
| 3395 EMIT_TWO_ASCII_BYTES ((c1) & 0x7F, (c2) & 0x7F); \ | |
| 3396 else \ | |
| 3397 EMIT_TWO_BYTES ((c1) | 0x80, (c2) | 0x80); \ | |
| 3398 CODING_ISO_SINGLE_SHIFTING (coding) = 0; \ | |
| 3399 break; \ | |
| 3400 } \ | |
| 3401 else if (id == CODING_ISO_INVOKED_CHARSET (coding, 0)) \ | |
| 3402 { \ | |
| 3403 EMIT_TWO_ASCII_BYTES ((c1) & 0x7F, (c2) & 0x7F); \ | |
| 3404 break; \ | |
| 3405 } \ | |
| 3406 else if (id == CODING_ISO_INVOKED_CHARSET (coding, 1)) \ | |
| 3407 { \ | |
| 3408 EMIT_TWO_BYTES ((c1) | 0x80, (c2) | 0x80); \ | |
| 3409 break; \ | |
| 3410 } \ | |
| 3411 else \ | |
| 3412 /* Since CHARSET is not yet invoked to any graphic planes, we \ | |
| 3413 must invoke it, or, at first, designate it to some graphic \ | |
| 3414 register. Then repeat the loop to actually produce the \ | |
| 3415 character. */ \ | |
| 3416 dst = encode_invocation_designation (charset, coding, dst, \ | |
| 3417 &produced_chars); \ | |
| 3418 } while (1) | |
| 3419 | |
| 3420 | |
| 3421 #define ENCODE_ISO_CHARACTER(charset, c) \ | |
| 3422 do { \ | |
| 3423 int code = ENCODE_CHAR ((charset),(c)); \ | |
| 3424 \ | |
| 3425 if (CHARSET_DIMENSION (charset) == 1) \ | |
| 3426 ENCODE_ISO_CHARACTER_DIMENSION1 ((charset), code); \ | |
| 3427 else \ | |
| 3428 ENCODE_ISO_CHARACTER_DIMENSION2 ((charset), code >> 8, code & 0xFF); \ | |
| 22119 | 3429 } while (0) |
| 17725 | 3430 |
| 30487 | 3431 |
| 17052 | 3432 /* Produce designation and invocation codes at a place pointed to by DST |
| 88365 | 3433 to use CHARSET. The element `spec.iso_2022' of *CODING is updated. |
| 17052 | 3434 Return new DST. */ |
| 3435 | |
| 3436 unsigned char * | |
| 88365 | 3437 encode_invocation_designation (charset, coding, dst, p_nchars) |
| 3438 struct charset *charset; | |
| 17052 | 3439 struct coding_system *coding; |
| 3440 unsigned char *dst; | |
| 88365 | 3441 int *p_nchars; |
| 17052 | 3442 { |
| 88365 | 3443 int multibytep = coding->dst_multibyte; |
| 3444 int produced_chars = *p_nchars; | |
| 17052 | 3445 int reg; /* graphic register number */ |
| 88365 | 3446 int id = CHARSET_ID (charset); |
| 17052 | 3447 |
| 3448 /* At first, check designations. */ | |
| 3449 for (reg = 0; reg < 4; reg++) | |
| 88365 | 3450 if (id == CODING_ISO_DESIGNATION (coding, reg)) |
| 17052 | 3451 break; |
| 3452 | |
| 3453 if (reg >= 4) | |
| 3454 { | |
| 3455 /* CHARSET is not yet designated to any graphic registers. */ | |
| 3456 /* At first check the requested designation. */ | |
| 88365 | 3457 reg = CODING_ISO_REQUEST (coding, id); |
| 3458 if (reg < 0) | |
| 18002 | 3459 /* Since CHARSET requests no special designation, designate it |
| 3460 to graphic register 0. */ | |
| 17052 | 3461 reg = 0; |
| 3462 | |
| 3463 ENCODE_DESIGNATION (charset, reg, coding); | |
| 3464 } | |
| 3465 | |
| 88365 | 3466 if (CODING_ISO_INVOCATION (coding, 0) != reg |
| 3467 && CODING_ISO_INVOCATION (coding, 1) != reg) | |
| 17052 | 3468 { |
| 3469 /* Since the graphic register REG is not invoked to any graphic | |
| 3470 planes, invoke it to graphic plane 0. */ | |
| 3471 switch (reg) | |
| 3472 { | |
| 3473 case 0: /* graphic register 0 */ | |
| 3474 ENCODE_SHIFT_IN; | |
| 3475 break; | |
| 3476 | |
| 3477 case 1: /* graphic register 1 */ | |
| 3478 ENCODE_SHIFT_OUT; | |
| 3479 break; | |
| 3480 | |
| 3481 case 2: /* graphic register 2 */ | |
| 88365 | 3482 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SINGLE_SHIFT) |
| 17052 | 3483 ENCODE_SINGLE_SHIFT_2; |
| 3484 else | |
| 3485 ENCODE_LOCKING_SHIFT_2; | |
| 3486 break; | |
| 3487 | |
| 3488 case 3: /* graphic register 3 */ | |
| 88365 | 3489 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_SINGLE_SHIFT) |
| 17052 | 3490 ENCODE_SINGLE_SHIFT_3; |
| 3491 else | |
| 3492 ENCODE_LOCKING_SHIFT_3; | |
| 3493 break; | |
| 3494 } | |
| 3495 } | |
| 29005 | 3496 |
| 88365 | 3497 *p_nchars = produced_chars; |
| 17052 | 3498 return dst; |
| 3499 } | |
| 3500 | |
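
The designate-then-invoke logic of `encode_invocation_designation` can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `struct iso_state`, `invoked_plane`, and `prepare_charset` are hypothetical names that do not exist in coding.c, no escape sequences or shift codes are actually emitted, and single shifts (which do not change the invocation state) are ignored.

```c
enum { PLANE_GL = 0, PLANE_GR = 1 };

/* Hypothetical, simplified stand-in for the ISO 2022 state that the
   real code keeps inside struct coding_system.  */
struct iso_state
{
  int designation[4];   /* charset id designated to G0..G3, or -1 */
  int invocation[2];    /* register invoked to GL and GR, or -1 */
};

/* Return the plane (PLANE_GL or PLANE_GR) through which CHARSET_ID can
   be emitted right now, or -1 if it is not invoked to any plane.  */
int
invoked_plane (const struct iso_state *st, int charset_id)
{
  int plane;

  for (plane = PLANE_GL; plane <= PLANE_GR; plane++)
    {
      int reg = st->invocation[plane];

      if (reg >= 0 && st->designation[reg] == charset_id)
        return plane;
    }
  return -1;
}

/* Make CHARSET_ID usable and return the register that holds it.
   REQUESTED_REG is the register the charset asks for, or -1 for
   "no preference", in which case G0 is used.  */
int
prepare_charset (struct iso_state *st, int charset_id, int requested_reg)
{
  int reg;

  /* First, look for an existing designation.  */
  for (reg = 0; reg < 4; reg++)
    if (st->designation[reg] == charset_id)
      break;

  if (reg >= 4)
    {
      /* Not designated yet.  A real encoder would emit an escape
         sequence here; this model only records the new state.  */
      reg = requested_reg >= 0 ? requested_reg : 0;
      st->designation[reg] = charset_id;
    }

  /* If the register is invoked to neither plane, invoke it to GL.
     A real encoder would emit SI, SO, or a locking shift here.  */
  if (st->invocation[PLANE_GL] != reg && st->invocation[PLANE_GR] != reg)
    st->invocation[PLANE_GL] = reg;

  return reg;
}
```

Starting from G0 invoked to GL and G1 to GR, calling `prepare_charset (&st, 7, 2)` designates charset 7 to G2 and invokes G2 to GL, after which `invoked_plane (&st, 7)` reports `PLANE_GL`. That is the same two-step order (designation, then invocation) the function above follows before the encoding macro repeats its loop.
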
| 3501 /* The following three macros produce codes for indicating direction | |
| 3502 of text. */ | |
| 88365 | 3503 #define ENCODE_CONTROL_SEQUENCE_INTRODUCER \ |
| 3504 do { \ | |
| 3505 if (CODING_ISO_FLAGS (coding) == CODING_ISO_FLAG_SEVEN_BITS) \ | |
| 3506 EMIT_TWO_ASCII_BYTES (ISO_CODE_ESC, '['); \ | |
| 3507 else \ | |
| 3508 EMIT_ONE_BYTE (ISO_CODE_CSI); \ | |
| 17052 | 3509 } while (0) |
| 3510 | |
| 88365 | 3511 |
| 3512 #define ENCODE_DIRECTION_R2L() \ | |
| 3513 do { \ | |
| 3514 ENCODE_CONTROL_SEQUENCE_INTRODUCER (dst); \ | |
| 3515 EMIT_TWO_ASCII_BYTES ('2', ']'); \ | |
| 3516 } while (0) | |
| 3517 | |
| 3518 | |
| 3519 #define ENCODE_DIRECTION_L2R() \ | |
| 3520 do { \ | |
| 3521 ENCODE_CONTROL_SEQUENCE_INTRODUCER (dst); \ | |
| 3522 EMIT_TWO_ASCII_BYTES ('0', ']'); \ | |
| 3523 } while (0) | |
| 3524 | |
| 17052 | 3525 |
| 3526 /* Produce codes for designation and invocation to reset the graphic | |
| 3527 planes and registers to initial state. */ | |
| 88365 | 3528 #define ENCODE_RESET_PLANE_AND_REGISTER() \ |
| 3529 do { \ | |
| 3530 int reg; \ | |
| 3531 struct charset *charset; \ | |
| 3532 \ | |
| 3533 if (CODING_ISO_INVOCATION (coding, 0) != 0) \ | |
| 3534 ENCODE_SHIFT_IN; \ | |
| 3535 for (reg = 0; reg < 4; reg++) \ | |
| 3536 if (CODING_ISO_INITIAL (coding, reg) >= 0 \ | |
| 3537 && (CODING_ISO_DESIGNATION (coding, reg) \ | |
| 3538 != CODING_ISO_INITIAL (coding, reg))) \ | |
| 3539 { \ | |
| 3540 charset = CHARSET_FROM_ID (CODING_ISO_INITIAL (coding, reg)); \ | |
| 3541 ENCODE_DESIGNATION (charset, reg, coding); \ | |
| 3542 } \ | |
| 17052 | 3543 } while (0) |
| 3544 | |
| 88365 | 3545 |
| 17725 | 3546 /* Produce designation sequences of charsets in the line starting at |
| 29005 | 3547 CHARBUF to a place pointed to by DST, and return updated DST. |
| 17725 | 3548 |
| 3549 If the current block ends before any end-of-line, we may fail to | |
| 20718 | 3550 find all the necessary designations. */ |
| 3551 | |
| 29005 | 3552 static unsigned char * |
| 88365 | 3553 encode_designation_at_bol (coding, charbuf, charbuf_end, dst) |
| 17119 | 3554 struct coding_system *coding; |
| 88365 | 3555 int *charbuf, *charbuf_end; |
| 3556 unsigned char *dst; | |
| 17119 | 3557 { |
| 88365 | 3558 struct charset *charset; |
| 17725 | 3559 /* Table of charsets to be designated to each graphic register. */ |
| 3560 int r[4]; | |
| 88365 | 3561 int c, found = 0, reg; |
| 3562 int produced_chars = 0; | |
| 3563 int multibytep = coding->dst_multibyte; | |
| 3564 Lisp_Object attrs; | |
| 3565 Lisp_Object charset_list; | |
| 3566 | |
| 3567 attrs = CODING_ID_ATTRS (coding->id); | |
| 3568 charset_list = CODING_ATTR_CHARSET_LIST (attrs); | |
| 3569 if (EQ (charset_list, Qiso_2022)) | |
| 3570 charset_list = Viso_2022_charset_list; | |
| 17725 | 3571 |
| 3572 for (reg = 0; reg < 4; reg++) | |
| 3573 r[reg] = -1; | |
| 3574 | |
| 29005 | 3575 while (found < 4) |
| 17119 | 3576 { |
| 88365 | 3577 int id; |
| 3578 | |
| 3579 c = *charbuf++; | |
| 29005 | 3580 if (c == '\n') |
| 3581 break; | |
| 88365 | 3582 charset = char_charset (c, charset_list, NULL); |
| 3583 id = CHARSET_ID (charset); | |
| 3584 reg = CODING_ISO_REQUEST (coding, id); | |
| 3585 if (reg >= 0 && r[reg] < 0) | |
| 17725 | 3586 { |
| 3587 found++; | |
| 88365 | 3588 r[reg] = id; |
| 17119 | 3589 } |
| 3590 } | |
| 17725 | 3591 |
| 3592 if (found) | |
| 3593 { | |
| 3594 for (reg = 0; reg < 4; reg++) | |
| 3595 if (r[reg] >= 0 | |
| 88365 | 3596 && CODING_ISO_DESIGNATION (coding, reg) != r[reg]) |
| 3597 ENCODE_DESIGNATION (CHARSET_FROM_ID (r[reg]), reg, coding); | |
| 17725 | 3598 } |
| 29005 | 3599 |
| 3600 return dst; | |
| 17119 | 3601 } |
| 3602 | |
| 17052 | 3603 /* See the above "GENERAL NOTES on `encode_coding_XXX ()' functions". */ |
| 3604 | |
| 88365 | 3605 static int |
| 3606 encode_coding_iso_2022 (coding) | |
| 17052 | 3607 struct coding_system *coding; |
| 3608 { | |
| 88365 | 3609 int multibytep = coding->dst_multibyte; |
| 3610 int *charbuf = coding->charbuf; | |
| 3611 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 3612 unsigned char *dst = coding->destination + coding->produced; | |
| 3613 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 3614 int safe_room = 16; | |
| 3615 int bol_designation | |
| 3616 = (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_DESIGNATE_AT_BOL | |
| 3617 && CODING_ISO_BOL (coding)); | |
| 3618 int produced_chars = 0; | |
| 3619 Lisp_Object attrs, eol_type, charset_list; | |
| 3620 int ascii_compatible; | |
| 29005 | 3621 int c; |
| 89331 | 3622 int preferred_charset_id = -1; |
| 88365 | 3623 |
| 89665 | 3624 CODING_GET_INFO (coding, attrs, charset_list); |
| 3625 eol_type = CODING_ID_EOL_TYPE (coding->id); | |
| 3626 if (VECTORP (eol_type)) | |
| 3627 eol_type = Qunix; | |
| 3628 | |
| 88497 | 3629 setup_iso_safe_charsets (attrs); |
| 89331 | 3630 /* Charset list may have been changed. */ |
| 3631 charset_list = CODING_ATTR_CHARSET_LIST (attrs); \ | |
| 89483 | 3632 coding->safe_charsets = (char *) SDATA (CODING_ATTR_SAFE_CHARSETS(attrs)); |
| 88365 | 3633 |
| 3634 ascii_compatible = ! NILP (CODING_ATTR_ASCII_COMPAT (attrs)); | |
| 3635 | |
| 3636 while (charbuf < charbuf_end) | |
| 3637 { | |
| 3638 ASSURE_DESTINATION (safe_room); | |
| 3639 | |
| 3640 if (bol_designation) | |
| 29005 | 3641 { |
| 88365 | 3642 unsigned char *dst_prev = dst; |
| 3643 | |
| 17725 | 3644 /* We have to produce designation sequences if any now. */ |
| 88365 | 3645 dst = encode_designation_at_bol (coding, charbuf, charbuf_end, dst); |
| 3646 bol_designation = 0; | |
| 3647 /* We are sure that designation sequences are all ASCII bytes. */ | |
| 3648 produced_chars += dst - dst_prev; | |
| 17119 | 3649 } |
| 3650 | |
| 88365 | 3651 c = *charbuf++; |
| 29005 | 3652 |
| 89331 | 3653 if (c < 0) |
| 3654 { | |
| 3655 /* Handle an annotation. */ | |
| 3656 switch (*charbuf) | |
| 3657 { | |
| 3658 case CODING_ANNOTATE_COMPOSITION_MASK: | |
| 3659 /* Not yet implemented. */ | |
| 3660 break; | |
| 3661 case CODING_ANNOTATE_CHARSET_MASK: | |
| 3662 preferred_charset_id = charbuf[3]; | |
| 3663 if (preferred_charset_id >= 0 | |
| 3664 && NILP (Fmemq (make_number (preferred_charset_id), | |
| 3665 charset_list))) | |
| 3666 preferred_charset_id = -1; | |
| 3667 break; | |
| 3668 default: | |
| 3669 abort (); | |
| 3670 } | |
| 3671 charbuf += -c - 1; | |
| 3672 continue; | |
| 3673 } | |
| 3674 | |
| 29005 | 3675 /* Now encode the character C. */ |
| 3676 if (c < 0x20 || c == 0x7F) | |
| 17052 | 3677 { |
| 88365 | 3678 if (c == '\n' |
| 3679 || (c == '\r' && EQ (eol_type, Qmac))) | |
| 29005 | 3680 { |
| 88365 | 3681 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_RESET_AT_EOL) |
| 3682 ENCODE_RESET_PLANE_AND_REGISTER (); | |
| 3683 if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_INIT_AT_BOL) | |
| 3684 { | |
| 3685 int i; | |
| 3686 | |
| 3687 for (i = 0; i < 4; i++) | |
| 3688 CODING_ISO_DESIGNATION (coding, i) | |
| 3689 = CODING_ISO_INITIAL (coding, i); | |
| 3690 } | |
| 3691 bol_designation | |
| 3692 = CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_DESIGNATE_AT_BOL; | |
| 19052 | 3693 } |
| 88365 | 3694 else if (CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_RESET_AT_CNTL) |
| 3695 ENCODE_RESET_PLANE_AND_REGISTER (); | |
| 3696 EMIT_ONE_ASCII_BYTE (c); | |
| 29005 | 3697 } |
| 88365 | 3698 else if (ASCII_CHAR_P (c)) |
| 29005 | 3699 { |
| 88365 | 3700 if (ascii_compatible) |
| 3701 EMIT_ONE_ASCII_BYTE (c); | |
| 3702 else | |
| 88681 | 3703 { |
| 3704 struct charset *charset = CHARSET_FROM_ID (charset_ascii); | |
| 3705 ENCODE_ISO_CHARACTER (charset, c); | |
| 3706 } | |
| 17052 | 3707 } |
| 88690 | 3708 else if (CHAR_BYTE8_P (c)) |
| 3709 { | |
| 3710 c = CHAR_TO_BYTE8 (c); | |
| 3711 EMIT_ONE_BYTE (c); | |
| 3712 } | |
| 29005 | 3713 else |
| 88365 | 3714 { |
| 89331 | 3715 struct charset *charset; |
| 3716 | |
| 3717 if (preferred_charset_id >= 0) | |
| 3718 { | |
| 3719 charset = CHARSET_FROM_ID (preferred_charset_id); | |
| 3720 if (! CHAR_CHARSET_P (c, charset)) | |
| 3721 charset = char_charset (c, charset_list, NULL); | |
| 3722 } | |
| 3723 else | |
| 3724 charset = char_charset (c, charset_list, NULL); | |
| 88365 | 3725 if (!charset) |
| 3726 { | |
| 88573 | 3727 if (coding->mode & CODING_MODE_SAFE_ENCODING) |
| 3728 { | |
| 3729 c = CODING_INHIBIT_CHARACTER_SUBSTITUTION; | |
| 3730 charset = CHARSET_FROM_ID (charset_ascii); | |
| 3731 } | |
| 3732 else | |
| 3733 { | |
| 3734 c = coding->default_char; | |
| 3735 charset = char_charset (c, charset_list, NULL); | |
| 3736 } | |
| 88365 | 3737 } |
| 3738 ENCODE_ISO_CHARACTER (charset, c); | |
| 3739 } | |
| 3740 } | |
| 3741 | |
| 3742 if (coding->mode & CODING_MODE_LAST_BLOCK | |
| 3743 && CODING_ISO_FLAGS (coding) & CODING_ISO_FLAG_RESET_AT_EOL) | |
| 3744 { | |
| 3745 ASSURE_DESTINATION (safe_room); | |
| 3746 ENCODE_RESET_PLANE_AND_REGISTER (); | |
| 3747 } | |
| 3748 coding->result = CODING_RESULT_SUCCESS; | |
| 3749 CODING_ISO_BOL (coding) = bol_designation; | |
| 3750 coding->produced_char += produced_chars; | |
| 3751 coding->produced = dst - coding->destination; | |
| 3752 return 0; | |
| 17052 | 3753 } |
| 3754 | |
| 3755 | |
| 88365 | 3756 /*** 8,9. SJIS and BIG5 handlers ***/ |
| 3757 | |
| 3758 /* Although SJIS and BIG5 are not ISO coding systems, they are used | |
| 17052 | 3759 quite widely. So, for the moment, Emacs supports them in the bare |
| 3760 C code. But, in the future, they may be supported only by CCL. */ | |
| 3761 | |
| 3762 /* SJIS is a coding system encoding three character sets: ASCII, right | |
| 3763 half of JISX0201-Kana, and JISX0208. An ASCII character is encoded | |
| 3764 as is. A character of charset katakana-jisx0201 is encoded by | |
| 3765 "position-code + 0x80". A character of charset japanese-jisx0208 | |
| 3766 is encoded in two bytes, but the two position-codes are divided and shifted | |
| 88365 | 3767 so that they fit in the ranges below. |
| 17052 | 3768 |
| 3769 --- CODE RANGE of SJIS --- | |
| 3770 (character set) (range) | |
| 3771 ASCII 0x00 .. 0x7F | |
| 88365 | 3772 KATAKANA-JISX0201 0xA0 .. 0xDF |
| 24324 | 3773 JISX0208 (1st byte) 0x81 .. 0x9F and 0xE0 .. 0xEF |
| 23564 | 3774 (2nd byte) 0x40 .. 0x7E and 0x80 .. 0xFC |
| 17052 | 3775 ------------------------------- |
| 3776 | |
| 3777 */ | |
| 3778 | |
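
The "divided and shifted" step described in the SJIS comment can be sketched as follows. This is an illustrative standalone version of the widely used JIS X 0208 to Shift_JIS arithmetic, not the macro coding.c itself uses. The names `jisx0208_to_sjis` and `sjis_two_byte_p` are hypothetical, and the position-codes are assumed to be in the range 0x21 .. 0x7E.

```c
/* Return nonzero if B1 and B2 fall in the two-byte SJIS ranges listed
   in the code-range table above.  */
int
sjis_two_byte_p (int b1, int b2)
{
  int lead_ok = (b1 >= 0x81 && b1 <= 0x9F) || (b1 >= 0xE0 && b1 <= 0xEF);
  int trail_ok = (b2 >= 0x40 && b2 <= 0x7E) || (b2 >= 0x80 && b2 <= 0xFC);

  return lead_ok && trail_ok;
}

/* Convert the JIS X 0208 position-codes J1 and J2 (each assumed to be
   in 0x21..0x7E) into the SJIS byte pair *S1, *S2.  */
void
jisx0208_to_sjis (int j1, int j2, int *s1, int *s2)
{
  /* Two JIS rows share one SJIS lead byte.  An odd row maps to the
     lower half of the trail-byte range (skipping 0x7F), an even row
     to the upper half.  */
  if (j1 & 1)
    *s2 = j2 + (j2 < 0x60 ? 0x1F : 0x20);
  else
    *s2 = j2 + 0x7E;

  /* Rows below 0x5F land in 0x81..0x9F, the rest in 0xE0..0xEF.  */
  *s1 = ((j1 + 1) >> 1) + (j1 < 0x5F ? 0x70 : 0xB0);
}
```

For example, the position-codes 0x30, 0x21 map to the bytes 0x88, 0x9F, and every pair produced from valid position-codes satisfies `sjis_two_byte_p`.
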
| 3779 /* BIG5 is a coding system encoding two character sets: ASCII and | |
| 3780 Big5. An ASCII character is encoded as is. Big5 is a two-byte | |
| 88365 | 3781 character set and is encoded in two bytes. |
| 17052 | 3782 |
| 3783 --- CODE RANGE of BIG5 --- | |
| 3784 (character set) (range) | |
| 3785 ASCII 0x00 .. 0x7F | |
| 3786 Big5 (1st byte) 0xA1 .. 0xFE | |
| 3787 (2nd byte) 0x40 .. 0x7E and 0xA1 .. 0xFE | |
| 3788 -------------------------- | |
| 3789 | |
| 88365 | 3790 */ |
| 17052 | 3791 |
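
The BIG5 code ranges above translate directly into a byte-level validity check. The sketch below is a minimal standalone illustration, not the detector coding.c implements. `big5_count_chars` is a hypothetical helper that counts characters in a buffer and returns -1 as soon as a byte falls outside the listed ranges or a two-byte sequence is truncated.

```c
#include <stddef.h>

/* Count the characters in the LEN bytes at SRC, treating bytes below
   0x80 as ASCII and everything else as two-byte Big5.  Return -1 if a
   byte is outside the ranges in the table above or if the input ends
   in the middle of a two-byte character.  */
long
big5_count_chars (const unsigned char *src, size_t len)
{
  size_t i = 0;
  long nchars = 0;

  while (i < len)
    {
      unsigned char c = src[i++];

      if (c < 0x80)
        {
          /* ASCII is encoded as is.  */
          nchars++;
          continue;
        }

      /* First byte must be 0xA1..0xFE and a second byte must follow.  */
      if (c < 0xA1 || c > 0xFE || i >= len)
        return -1;

      /* Second byte must be 0x40..0x7E or 0xA1..0xFE.  */
      c = src[i++];
      if (! ((c >= 0x40 && c <= 0x7E) || (c >= 0xA1 && c <= 0xFE)))
        return -1;

      nchars++;
    }
  return nchars;
}
```

A buffer holding `'A'`, 0xA4, 0x40, `'B'` counts as three characters, while a lone 0xA4 or the pair 0xA4, 0x7F is rejected.
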
| 3792 /* See the above "GENERAL NOTES on `detect_coding_XXX ()' functions". | |
| 3793 Check if a text is encoded in SJIS. If it is, return | |
| 88365 | 3794 CATEGORY_MASK_SJIS, else return 0. */ |
| 17052 | 3795 |
| 34531 | 3796 static int |
| 89331 | 3797 detect_coding_sjis (coding, detect_info) |
| 88365 | 3798 struct coding_system *coding; |
| 89331 | 3799 struct coding_detection_info *detect_info; |
| 17052 | 3800 { |
| 89483 | 3801 const unsigned char *src = coding->source, *src_base = src; |
| 3802 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 88365 | 3803 int multibytep = coding->src_multibyte; |
| 3804 int consumed_chars = 0; | |
| 3805 int found = 0; | |
| 29005 | 3806 int c; |
| 89193 | 3807 int incomplete; |
| 88365 | 3808 |
| 89331 | 3809 detect_info->checked |= CATEGORY_MASK_SJIS; |
| 88365 | 3810 /* A coding system of this category is always ASCII compatible. */ |
| 3811 src += coding->head_ascii; | |
| 29005 | 3812 |
| 3813 while (1) | |
| 17052 | 3814 { |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
3815 incomplete = 0; |
| 88365 | 3816 ONE_MORE_BYTE (c); |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
3817 incomplete = 1; |
|
36647
0a75ccbe42b2
(detect_coding_sjis): Do more rigid check.
Kenichi Handa <handa@m17n.org>
parents:
36520
diff
changeset
|
3818 if (c < 0x80) |
|
0a75ccbe42b2
(detect_coding_sjis): Do more rigid check.
Kenichi Handa <handa@m17n.org>
parents:
36520
diff
changeset
|
3819 continue; |
| 88365 | 3820 if ((c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xEF)) |
| 17052 | 3821 { |
| 88365 | 3822 ONE_MORE_BYTE (c); |
|
36647
0a75ccbe42b2
(detect_coding_sjis): Do more rigid check.
Kenichi Handa <handa@m17n.org>
parents:
36520
diff
changeset
|
3823 if (c < 0x40 || c == 0x7F || c > 0xFC) |
| 88365 | 3824 break; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
3825 found = CATEGORY_MASK_SJIS; |
| 17052 | 3826 } |
| 88365 | 3827 else if (c >= 0xA0 && c < 0xE0) |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
3828 found = CATEGORY_MASK_SJIS; |
| 88365 | 3829 else |
| 3830 break; | |
| 3831 } | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
3832 detect_info->rejected |= CATEGORY_MASK_SJIS; |
| 88365 | 3833 return 0; |
| 3834 | |
| 3835 no_more_source: | |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
3836 if (incomplete && coding->mode & CODING_MODE_LAST_BLOCK) |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
3837 { |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
3838 detect_info->rejected |= CATEGORY_MASK_SJIS; |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
3839 return 0; |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
3840 } |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
3841 detect_info->found |= found; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
3842 return 1; |
| 17052 | 3843 } |
| 3844 | |
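The byte-range test performed by detect_coding_sjis can be read in isolation as the sketch below. It is an illustration only: `looks_like_sjis` is a hypothetical name, it scans a plain byte array instead of a `struct coding_system`, and it assumes the buffer is the last block, so a truncated two-byte sequence is always rejected.

```c
#include <stddef.h>

/* Return 1 if the N bytes at BUF satisfy the same byte-range rules as
   the detector in the listing, 0 otherwise.  Hypothetical helper; it
   does not track the `found' / `rejected' masks.  */
int
looks_like_sjis (const unsigned char *buf, size_t n)
{
  size_t i = 0;

  while (i < n)
    {
      unsigned char c = buf[i++];

      if (c < 0x80)
        continue;               /* ASCII byte */
      if ((c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xEF))
        {
          /* Lead byte of a two-byte character: a trail byte must follow.  */
          if (i >= n)
            return 0;           /* truncated sequence */
          c = buf[i++];
          if (c < 0x40 || c == 0x7F || c > 0xFC)
            return 0;           /* invalid trail byte */
        }
      else if (c >= 0xA0 && c < 0xE0)
        continue;               /* single-byte range accepted by the detector */
      else
        return 0;               /* 0x80 or 0xF0 .. 0xFF */
    }
  return 1;
}
```

The real function differs in one notable way: when input runs out in the middle of a character it only rejects the category if `CODING_MODE_LAST_BLOCK` is set, since more bytes may still arrive.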
| 3845 /* See the above "GENERAL NOTES on `detect_coding_XXX ()' functions". | |
| 3846 Check if a text is encoded in BIG5. If it is, return | |
| 88365 | 3847 CATEGORY_MASK_BIG5, else return 0. */ |
| 17052 | 3848 |
| 34531 | 3849 static int |
| 89331 | 3850 detect_coding_big5 (coding, detect_info) |
| 88365 | 3851 struct coding_system *coding; |
| 89331 | 3852 struct coding_detection_info *detect_info; |
| 17052 | 3853 { |
| 89483 | 3854 const unsigned char *src = coding->source, *src_base = src; |
| 3855 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 88365 | 3856 int multibytep = coding->src_multibyte; |
| 3857 int consumed_chars = 0; | |
| 3858 int found = 0; | |
| 29005 | 3859 int c; |
| 89193 | 3860 int incomplete; |
| 88365 | 3861 |
| 89331 | 3862 detect_info->checked |= CATEGORY_MASK_BIG5; |
| 88365 | 3863 /* A coding system of this category is always ASCII compatible. */ |
| 3864 src += coding->head_ascii; | |
| 29005 | 3865 |
| 3866 while (1) | |
| 28022 | 3867 { |
| 89193 | 3868 incomplete = 0; |
| 88365 | 3869 ONE_MORE_BYTE (c); |
| 89193 | 3870 incomplete = 1; |
| 88365 | 3871 if (c < 0x80) |
| 28022 | 3872 continue; |
| 88365 | 3873 if (c >= 0xA1) |
| 28022 | 3874 { |
| 88365 | 3875 ONE_MORE_BYTE (c); |
| 3876 if (c < 0x40 || (c >= 0x7F && c <= 0xA0)) | |
| 28022 | 3877 return 0; |
| 89331 | 3878 found = CATEGORY_MASK_BIG5; |
| 28022 | 3879 } |
| 88365 | 3880 else |
| 3881 break; | |
| 3882 } | |
| 89331 | 3883 detect_info->rejected |= CATEGORY_MASK_BIG5; |
| 28022 | 3884 return 0; |
| 88365 | 3885 |
| 3886 no_more_source: | |
| 89193 | 3887 if (incomplete && coding->mode & CODING_MODE_LAST_BLOCK) |
| 3888 { | |
| 89331 | 3889 detect_info->rejected |= CATEGORY_MASK_BIG5; |
| 89193 | 3890 return 0; |
| 3891 } | |
| 89331 | 3892 detect_info->found |= found; |
| 3893 return 1; | |
| 28022 | 3894 } |
| 3895 | |
| 17052 | 3896 /* See the above "GENERAL NOTES on `decode_coding_XXX ()' functions". |
| 3897 If SJIS_P is 1, decode SJIS text, else decode BIG5 text. */ | |
| 3898 | |
| 29005 | 3899 static void |
| 88365 | 3900 decode_coding_sjis (coding) |
| 17052 | 3901 struct coding_system *coding; |
| 3902 { | |
| 89483 | 3903 const unsigned char *src = coding->source + coding->consumed; |
| 3904 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 3905 const unsigned char *src_base; | |
| 88365 | 3906 int *charbuf = coding->charbuf; |
| 89331 | 3907 int *charbuf_end = charbuf + coding->charbuf_size - MAX_ANNOTATION_LENGTH; |
| 88365 | 3908 int consumed_chars = 0, consumed_chars_base; |
| 3909 int multibytep = coding->src_multibyte; | |
| 3910 struct charset *charset_roman, *charset_kanji, *charset_kana; | |
| 89665 | 3911 Lisp_Object attrs, charset_list, val; |
| 89331 | 3912 int char_offset = coding->produced_char; |
| 3913 int last_offset = char_offset; | |
| 3914 int last_id = charset_ascii; | |
| 88365 | 3915 |
| 89665 | 3916 CODING_GET_INFO (coding, attrs, charset_list); |
| 88365 | 3917 |
| 3918 val = charset_list; | |
| 3919 charset_roman = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); | |
| 89193 | 3920 charset_kana = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); |
| 3921 charset_kanji = CHARSET_FROM_ID (XINT (XCAR (val))); | |
| 88365 | 3922 |
| 29005 | 3923 while (1) |
| 17052 | 3924 { |
| 88365 | 3925 int c, c1; |
| 89665 | 3926 struct charset *charset; |
| 29005 | 3927 |
| 3928 src_base = src; | |
| 88365 | 3929 consumed_chars_base = consumed_chars; |
| 3930 | |
| 3931 if (charbuf >= charbuf_end) | |
| 3932 break; | |
| 3933 | |
| 3934 ONE_MORE_BYTE (c); | |
| 3935 | |
| 89665 | 3936 if (c < 0x80) |
| 3937 charset = charset_roman; | |
| 3938 else | |
| 3939 { | |
| 3940 if (c >= 0xF0) | |
| 3941 goto invalid_code; | |
| 3942 if (c < 0xA0 || c >= 0xE0) | |
| 17052 | 3943 { |
| 89665 | 3944 /* SJIS -> JISX0208 */ |
| 3945 ONE_MORE_BYTE (c1); | |
| 3946 if (c1 < 0x40 || c1 == 0x7F || c1 > 0xFC) | |
| 88365 | 3947 goto invalid_code; |
| 89665 | 3948 c = (c << 8) | c1; |
| 3949 SJIS_TO_JIS (c); | |
| 3950 charset = charset_kanji; | |
| 3951 } | |
| 3952 else if (c > 0xA0) | |
| 3953 { | |
| 3954 /* SJIS -> JISX0201-Kana */ | |
| 3955 c &= 0x7F; | |
| 3956 charset = charset_kana; | |
| 20931 | 3957 } |
| 89665 | 3958 else |
| 3959 goto invalid_code; | |
| 3960 } | |
| 3961 if (charset->id != charset_ascii | |
| 3962 && last_id != charset->id) | |
| 3963 { | |
| 3964 if (last_id != charset_ascii) | |
| 3965 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); | |
| 3966 last_id = charset->id; | |
| 3967 last_offset = char_offset; | |
| 3968 } | |
| 3969 CODING_DECODE_CHAR (coding, src, src_base, src_end, charset, c, c); | |
| 88365 | 3970 *charbuf++ = c; |
| 89331 | 3971 char_offset++; |
| 88365 | 3972 continue; |
| 3973 | |
| 3974 invalid_code: | |
| 3975 src = src_base; | |
| 3976 consumed_chars = consumed_chars_base; | |
| 3977 ONE_MORE_BYTE (c); | |
| 3978 *charbuf++ = ASCII_BYTE_P (c) ? c : BYTE8_TO_CHAR (c); | |
| 89331 | 3979 char_offset++; |
| 88365 | 3980 coding->errors++; |
| 3981 } | |
| 3982 | |
| 3983 no_more_source: | |
| 89331 | 3984 if (last_id != charset_ascii) |
| 3985 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); | |
| 88365 | 3986 coding->consumed_char += consumed_chars_base; |
| 3987 coding->consumed = src_base - coding->source; | |
| 3988 coding->charbuf_used = charbuf - coding->charbuf; | |
| 3989 } | |
| 3990 | |
| 3991 static void | |
| 3992 decode_coding_big5 (coding) | |
| 3993 struct coding_system *coding; | |
| 3994 { | |
| 89483 | 3995 const unsigned char *src = coding->source + coding->consumed; |
| 3996 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 3997 const unsigned char *src_base; | |
| 88365 | 3998 int *charbuf = coding->charbuf; |
| 89331 | 3999 int *charbuf_end = charbuf + coding->charbuf_size - MAX_ANNOTATION_LENGTH; |
| 88365 | 4000 int consumed_chars = 0, consumed_chars_base; |
| 4001 int multibytep = coding->src_multibyte; | |
| 4002 struct charset *charset_roman, *charset_big5; | |
| 89665 | 4003 Lisp_Object attrs, charset_list, val; |
| 89331 | 4004 int char_offset = coding->produced_char; |
| 4005 int last_offset = char_offset; | |
| 4006 int last_id = charset_ascii; | |
| 88365 | 4007 |
| 89665 | 4008 CODING_GET_INFO (coding, attrs, charset_list); |
| 88365 | 4009 val = charset_list; |
| 4010 charset_roman = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); | |
| 4011 charset_big5 = CHARSET_FROM_ID (XINT (XCAR (val))); | |
| 4012 | |
| 4013 while (1) | |
| 4014 { | |
| 4015 int c, c1; | |
| 89665 | 4016 struct charset *charset; |
| 88365 | 4017 |
| 4018 src_base = src; | |
| 4019 consumed_chars_base = consumed_chars; | |
| 4020 | |
| 4021 if (charbuf >= charbuf_end) | |
| 4022 break; | |
| 4023 | |
| 4024 ONE_MORE_BYTE (c); | |
| 4025 | |
| 89665 | 4026 if (c < 0x80) |
| 4027 charset = charset_roman; | |
| 88365 | 4028 else |
| 4029 { | |
| 89665 | 4030 /* BIG5 -> Big5 */ |
| 4031 if (c < 0xA1 || c > 0xFE) | |
| 4032 goto invalid_code; | |
| 4033 ONE_MORE_BYTE (c1); | |
| 4034 if (c1 < 0x40 || (c1 > 0x7E && c1 < 0xA1) || c1 > 0xFE) | |
| 4035 goto invalid_code; | |
| 4036 c = c << 8 | c1; | |
| 4037 charset = charset_big5; | |
| 4038 } | |
| 4039 if (charset->id != charset_ascii | |
| 4040 && last_id != charset->id) | |
| 4041 { | |
| 4042 if (last_id != charset_ascii) | |
| 4043 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); | |
| 4044 last_id = charset->id; | |
| 4045 last_offset = char_offset; | |
| 4046 } | |
| 4047 CODING_DECODE_CHAR (coding, src, src_base, src_end, charset, c, c); | |
| 88365 | 4048 *charbuf++ = c; |
| 89331 | 4049 char_offset++; |
| 20931 | 4050 continue; |
| 4051 | |
| 88365 | 4052 invalid_code: |
| 17052 | 4053 src = src_base; |
| 88365 | 4054 consumed_chars = consumed_chars_base; |
| 4055 ONE_MORE_BYTE (c); | |
| 4056 *charbuf++ = ASCII_BYTE_P (c) ? c : BYTE8_TO_CHAR (c); | |
| 89331 | 4057 char_offset++; |
| 88365 | 4058 coding->errors++; |
| 4059 } | |
| 4060 | |
| 4061 no_more_source: | |
| 89331 | 4062 if (last_id != charset_ascii) |
| 4063 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); | |
| 88365 | 4064 coding->consumed_char += consumed_chars_base; |
| 4065 coding->consumed = src_base - coding->source; | |
| 4066 coding->charbuf_used = charbuf - coding->charbuf; | |
| 17052 | 4067 } |
| 4068 | |
| 4069 /* See the above "GENERAL NOTES on `encode_coding_XXX ()' functions". | |
| 29005 | 4070 This function can encode charsets `ascii', `katakana-jisx0201', |
| 4071 `japanese-jisx0208', `chinese-big5-1', and `chinese-big5-2'. We | |
| 4072 are sure that all these charsets are registered as official charsets | |
| 17052 | 4073 (i.e. do not have extended leading-codes). Characters of other |
| 4074 charsets are produced without any encoding. If SJIS_P is 1, encode | |
| 4075 SJIS text, else encode BIG5 text. */ | |
| 4076 | |
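The SJIS encoder described above converts `japanese-jisx0208' codes with the JIS_TO_SJIS macro, whose definition is not part of this listing. As a rough illustration of what such a conversion involves, the sketch below implements the commonly documented JIS X 0208 to Shift_JIS byte arithmetic. `jis_to_sjis` is a hypothetical helper and is not claimed to match the macro Emacs actually uses.

```c
/* Convert a 16-bit JIS X 0208 code (both bytes in 0x21 .. 0x7E) to the
   corresponding 16-bit Shift_JIS code.  Hypothetical helper; no range
   checking is done on the input.  */
unsigned
jis_to_sjis (unsigned jis)
{
  unsigned j1 = (jis >> 8) & 0xFF;
  unsigned j2 = jis & 0xFF;
  unsigned s1 = ((j1 + 1) >> 1) + 0x70;  /* two JIS rows share one lead byte */
  unsigned s2;

  if (s1 >= 0xA0)
    s1 += 0x40;                 /* skip the single-byte katakana range */
  if (j1 & 1)
    {
      s2 = j2 + 0x1F;           /* odd row: lower half of the trail range */
      if (s2 >= 0x7F)
        s2++;                   /* 0x7F is never a valid trail byte */
    }
  else
    s2 = j2 + 0x7E;             /* even row: upper half of the trail range */
  return (s1 << 8) | s2;
}
```

For example, the JIS code 0x2422 maps to the Shift_JIS bytes 0x82 0xA0, which an encoder would then emit as two bytes, high byte first.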
| 88365 | 4077 static int |
| 4078 encode_coding_sjis (coding) | |
| 17052 | 4079 struct coding_system *coding; |
| 4080 { | |
| 88365 | 4081 int multibytep = coding->dst_multibyte; |
| 4082 int *charbuf = coding->charbuf; | |
| 4083 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 4084 unsigned char *dst = coding->destination + coding->produced; | |
| 4085 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 4086 int safe_room = 4; | |
| 4087 int produced_chars = 0; | |
| 89665 | 4088 Lisp_Object attrs, charset_list, val; |
| 88365 | 4089 int ascii_compatible; |
| 4090 struct charset *charset_roman, *charset_kanji, *charset_kana; | |
| 4091 int c; | |
| 4092 | |
| 89665 | 4093 CODING_GET_INFO (coding, attrs, charset_list); |
| 88365 | 4094 val = charset_list; |
| 4095 charset_roman = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); | |
| 4096 charset_kana = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); | |
| 4097 charset_kanji = CHARSET_FROM_ID (XINT (XCAR (val))); | |
| 4098 | |
| 4099 ascii_compatible = ! NILP (CODING_ATTR_ASCII_COMPAT (attrs)); | |
| 4100 | |
| 4101 while (charbuf < charbuf_end) | |
| 4102 { | |
| 4103 ASSURE_DESTINATION (safe_room); | |
| 4104 c = *charbuf++; | |
| 29005 | 4105 /* Now encode the character C. */ |
| 88365 | 4106 if (ASCII_CHAR_P (c) && ascii_compatible) |
| 4107 EMIT_ONE_ASCII_BYTE (c); | |
| 88690 | 4108 else if (CHAR_BYTE8_P (c)) |
| 4109 { | |
| 4110 c = CHAR_TO_BYTE8 (c); | |
| 4111 EMIT_ONE_BYTE (c); | |
| 4112 } | |
| 29005 | 4113 else |
| 4114 { | |
| 88365 | 4115 unsigned code; |
| 4116 struct charset *charset = char_charset (c, charset_list, &code); | |
| 4117 | |
| 4118 if (!charset) | |
| 4119 { | |
| 88573 | 4120 if (coding->mode & CODING_MODE_SAFE_ENCODING) |
| 4121 { | |
| 4122 code = CODING_INHIBIT_CHARACTER_SUBSTITUTION; | |
| 4123 charset = CHARSET_FROM_ID (charset_ascii); | |
| 4124 } | |
| 4125 else | |
| 4126 { | |
| 4127 c = coding->default_char; | |
| 4128 charset = char_charset (c, charset_list, &code); | |
| 4129 } | |
| 88365 | 4130 } |
| 4131 if (code == CHARSET_INVALID_CODE (charset)) | |
| 4132 abort (); | |
| 4133 if (charset == charset_kanji) | |
| 29005 | 4134 { |
| 88365 | 4135 int c1, c2; |
| 4136 JIS_TO_SJIS (code); | |
| 4137 c1 = code >> 8, c2 = code & 0xFF; | |
| 4138 EMIT_TWO_BYTES (c1, c2); | |
| 4139 } | |
| 4140 else if (charset == charset_kana) | |
| 4141 EMIT_ONE_BYTE (code | 0x80); | |
| 4142 else | |
| 4143 EMIT_ONE_ASCII_BYTE (code & 0x7F); | |
| 4144 } | |
| 4145 } | |
| 4146 coding->result = CODING_RESULT_SUCCESS; | |
| 4147 coding->produced_char += produced_chars; | |
| 4148 coding->produced = dst - coding->destination; | |
| 4149 return 0; | |
| 4150 } | |
| 4151 | |
| 4152 static int | |
| 4153 encode_coding_big5 (coding) | |
| 4154 struct coding_system *coding; | |
| 4155 { | |
| 4156 int multibytep = coding->dst_multibyte; | |
| 4157 int *charbuf = coding->charbuf; | |
| 4158 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 4159 unsigned char *dst = coding->destination + coding->produced; | |
| 4160 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 4161 int safe_room = 4; | |
| 4162 int produced_chars = 0; | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
4163 Lisp_Object attrs, charset_list, val; |
| 88365 | 4164 int ascii_compatible; |
| 4165 struct charset *charset_roman, *charset_big5; | |
| 4166 int c; | |
| 4167 | |
| 89665 | 4168 CODING_GET_INFO (coding, attrs, charset_list); |
| 88365 | 4169 val = charset_list; |
| 4170 charset_roman = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); | |
| 4171 charset_big5 = CHARSET_FROM_ID (XINT (XCAR (val))); | |
| 4172 ascii_compatible = ! NILP (CODING_ATTR_ASCII_COMPAT (attrs)); | |
| 4173 | |
| 4174 while (charbuf < charbuf_end) | |
| 4175 { | |
| 4176 ASSURE_DESTINATION (safe_room); | |
| 4177 c = *charbuf++; | |
| 4178 /* Now encode the character C. */ | |
| 4179 if (ASCII_CHAR_P (c) && ascii_compatible) | |
| 4180 EMIT_ONE_ASCII_BYTE (c); | |
| 88690 | 4181 else if (CHAR_BYTE8_P (c)) |
| 4182 { | |
| 4183 c = CHAR_TO_BYTE8 (c); | |
| 4184 EMIT_ONE_BYTE (c); | |
| 4185 } | |
| 88365 | 4186 else |
| 4187 { | |
| 4188 unsigned code; | |
| 4189 struct charset *charset = char_charset (c, charset_list, &code); | |
| 4190 | |
| 4191 if (! charset) | |
| 4192 { | |
| 88573 | 4193 if (coding->mode & CODING_MODE_SAFE_ENCODING) |
| 4194 { | |
| 4195 code = CODING_INHIBIT_CHARACTER_SUBSTITUTION; | |
| 4196 charset = CHARSET_FROM_ID (charset_ascii); | |
| 4197 } | |
| 4198 else | |
| 4199 { | |
| 4200 c = coding->default_char; | |
| 4201 charset = char_charset (c, charset_list, &code); | |
| 4202 } | |
| 88365 | 4203 } |
| 4204 if (code == CHARSET_INVALID_CODE (charset)) | |
| 4205 abort (); | |
| 4206 if (charset == charset_big5) | |
| 4207 { | |
| 4208 int c1, c2; | |
| 4209 | |
| 4210 c1 = code >> 8, c2 = code & 0xFF; | |
| 4211 EMIT_TWO_BYTES (c1, c2); | |
| 29005 | 4212 } |
| 17052 | 4213 else |
| 88365 | 4214 EMIT_ONE_ASCII_BYTE (code & 0x7F); |
| 17052 | 4215 } |
| 88365 | 4216 } |
| 4217 coding->result = CODING_RESULT_SUCCESS; | |
| 4218 coding->produced_char += produced_chars; | |
| 4219 coding->produced = dst - coding->destination; | |
| 4220 return 0; | |
| 17052 | 4221 } |
| 4222 | |
| 4223 | |
| 88365 | 4224 /*** 10. CCL handlers ***/ |
| 22874 | 4225 |
| 4226 /* See the above "GENERAL NOTES on `detect_coding_XXX ()' functions". | |
| 4227 Check if a text is encoded in a coding system of which | |
| 4228 encoder/decoder are written in CCL program. If it is, return | |
| 88365 | 4229 CATEGORY_MASK_CCL, else return 0. */ |
| 22874 | 4230 |
| 34531 | 4231 static int |
| 89331 | 4232 detect_coding_ccl (coding, detect_info) |
| 88365 | 4233 struct coding_system *coding; |
| 89331 | 4234 struct coding_detection_info *detect_info; |
| 88365 | 4235 { |
| 89483 | 4236 const unsigned char *src = coding->source, *src_base = src; |
| 4237 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 88365 | 4238 int multibytep = coding->src_multibyte; |
| 4239 int consumed_chars = 0; | |
| 4240 int found = 0; | |
| 4241 unsigned char *valids = CODING_CCL_VALIDS (coding); | |
| 4242 int head_ascii = coding->head_ascii; | |
| 4243 Lisp_Object attrs; | |
| 4244 | |
| 89331 | 4245 detect_info->checked |= CATEGORY_MASK_CCL; |
| 4246 | |
| 88365 | 4247 coding = &coding_categories[coding_category_ccl]; |
| 4248 attrs = CODING_ID_ATTRS (coding->id); | |
| 4249 if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))) | |
| 4250 src += head_ascii; | |
| 4251 | |
| 4252 while (1) | |
| 4253 { | |
| 4254 int c; | |
| 4255 ONE_MORE_BYTE (c); | |
| 4256 if (! valids[c]) | |
| 4257 break; | |
| 89331 | 4258 if ((valids[c] > 1)) |
| 4259 found = CATEGORY_MASK_CCL; | |
| 4260 } | |
| 4261 detect_info->rejected |= CATEGORY_MASK_CCL; | |
| 88365 | 4262 return 0; |
| 4263 | |
| 4264 no_more_source: | |
| 89331 | 4265 detect_info->found |= found; |
| 4266 return 1; | |
| 88365 | 4267 } |
| 4268 | |
| 4269 static void | |
| 4270 decode_coding_ccl (coding) | |
| 4271 struct coding_system *coding; | |
| 22874 | 4272 { |
| 88876 | 4273 const unsigned char *src = coding->source + coding->consumed; |
| 89483 | 4274 const unsigned char *src_end = coding->source + coding->src_bytes; |
| 88365 | 4275 int *charbuf = coding->charbuf; |
| 4276 int *charbuf_end = charbuf + coding->charbuf_size; | |
| 4277 int consumed_chars = 0; | |
| 4278 int multibytep = coding->src_multibyte; | |
| 4279 struct ccl_program ccl; | |
| 4280 int source_charbuf[1024]; | |
| 4281 int source_byteidx[1024]; | |
| 89665 | 4282 Lisp_Object attrs, charset_list; |
| 4283 | |
| 4284 CODING_GET_INFO (coding, attrs, charset_list); | |
| 88365 | 4285 setup_ccl_program (&ccl, CODING_CCL_DECODER (coding)); |
| 4286 | |
| 4287 while (src < src_end) | |
| 4288 { | |
| 88876 | 4289 const unsigned char *p = src; |
| 88365 | 4290 int *source, *source_end; |
| 4291 int i = 0; | |
| 4292 | |
| 4293 if (multibytep) | |
| 4294 while (i < 1024 && p < src_end) | |
| 4295 { | |
| 4296 source_byteidx[i] = p - src; | |
| 4297 source_charbuf[i++] = STRING_CHAR_ADVANCE (p); | |
| 4298 } | |
| 4299 else | |
| 4300 while (i < 1024 && p < src_end) | |
| 4301 source_charbuf[i++] = *p++; | |
| 89483 | 4302 |
| 88365 | 4303 if (p == src_end && coding->mode & CODING_MODE_LAST_BLOCK) |
| 4304 ccl.last_block = 1; | |
| 4305 | |
| 4306 source = source_charbuf; | |
| 4307 source_end = source + i; | |
| 4308 while (source < source_end) | |
| 4309 { | |
| 4310 ccl_driver (&ccl, source, charbuf, | |
| 89373 | 4311 source_end - source, charbuf_end - charbuf, |
| 4312 charset_list); | |
| 88365 | 4313 source += ccl.consumed; |
| 4314 charbuf += ccl.produced; | |
| 4315 if (ccl.status != CCL_STAT_SUSPEND_BY_DST) | |
| 4316 break; | |
| 4317 } | |
| 4318 if (source < source_end) | |
| 4319 src += source_byteidx[source - source_charbuf]; | |
| 4320 else | |
| 4321 src = p; | |
| 4322 consumed_chars += source - source_charbuf; | |
| 4323 | |
| 4324 if (ccl.status != CCL_STAT_SUSPEND_BY_SRC | |
| 4325 && ccl.status != CODING_RESULT_INSUFFICIENT_SRC) | |
| 4326 break; | |
| 4327 } | |
| 4328 | |
| 4329 switch (ccl.status) | |
| 4330 { | |
| 4331 case CCL_STAT_SUSPEND_BY_SRC: | |
| 4332 coding->result = CODING_RESULT_INSUFFICIENT_SRC; | |
| 4333 break; | |
| 4334 case CCL_STAT_SUSPEND_BY_DST: | |
| 4335 break; | |
| 4336 case CCL_STAT_QUIT: | |
| 4337 case CCL_STAT_INVALID_CMD: | |
| 4338 coding->result = CODING_RESULT_INTERRUPT; | |
| 4339 break; | |
| 4340 default: | |
| 4341 coding->result = CODING_RESULT_SUCCESS; | |
| 4342 break; | |
| 4343 } | |
| 4344 coding->consumed_char += consumed_chars; | |
| 4345 coding->consumed = src - coding->source; | |
| 4346 coding->charbuf_used = charbuf - coding->charbuf; | |
| 22874 | 4347 } |
| 4348 | |
| 88365 | 4349 static int |
| 4350 encode_coding_ccl (coding) | |
| 4351 struct coding_system *coding; | |
| 4352 { | |
| 4353 struct ccl_program ccl; | |
| 4354 int multibytep = coding->dst_multibyte; | |
| 4355 int *charbuf = coding->charbuf; | |
| 4356 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 4357 unsigned char *dst = coding->destination + coding->produced; | |
| 4358 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 4359 unsigned char *adjusted_dst_end = dst_end - 1; | |
| 4360 int destination_charbuf[1024]; | |
| 4361 int i, produced_chars = 0; | |
| 89665 | 4362 Lisp_Object attrs, charset_list; |
| 4363 | |
| 4364 CODING_GET_INFO (coding, attrs, charset_list); | |
| 88365 | 4365 setup_ccl_program (&ccl, CODING_CCL_ENCODER (coding)); |
| 4366 | |
| 4367 ccl.last_block = coding->mode & CODING_MODE_LAST_BLOCK; | |
| 4368 ccl.dst_multibyte = coding->dst_multibyte; | |
| 4369 | |
| 4370 while (charbuf < charbuf_end && dst < adjusted_dst_end) | |
| 4371 { | |
| 4372 int dst_bytes = dst_end - dst; | |
| 4373 if (dst_bytes > 1024) | |
| 4374 dst_bytes = 1024; | |
| 4375 | |
| 4376 ccl_driver (&ccl, charbuf, destination_charbuf, | |
| 89373 | 4377 charbuf_end - charbuf, dst_bytes, charset_list); |
| 88365 | 4378 charbuf += ccl.consumed; |
| 4379 if (multibytep) | |
| 4380 for (i = 0; i < ccl.produced; i++) | |
| 4381 EMIT_ONE_BYTE (destination_charbuf[i] & 0xFF); | |
| 4382 else | |
| 4383 { | |
| 4384 for (i = 0; i < ccl.produced; i++) | |
| 4385 *dst++ = destination_charbuf[i] & 0xFF; | |
| 4386 produced_chars += ccl.produced; | |
| 4387 } | |
| 4388 } | |
| 4389 | |
| 4390 switch (ccl.status) | |
| 4391 { | |
| 4392 case CCL_STAT_SUSPEND_BY_SRC: | |
| 4393 coding->result = CODING_RESULT_INSUFFICIENT_SRC; | |
| 4394 break; | |
| 4395 case CCL_STAT_SUSPEND_BY_DST: | |
| 4396 coding->result = CODING_RESULT_INSUFFICIENT_DST; | |
| 4397 break; | |
| 4398 case CCL_STAT_QUIT: | |
| 4399 case CCL_STAT_INVALID_CMD: | |
| 4400 coding->result = CODING_RESULT_INTERRUPT; | |
| 4401 break; | |
| 4402 default: | |
| 4403 coding->result = CODING_RESULT_SUCCESS; | |
| 4404 break; | |
| 4405 } | |
| 4406 | |
| 4407 coding->produced_char += produced_chars; | |
| 4408 coding->produced = dst - coding->destination; | |
| 4409 return 0; | |
| 4410 } | |
| 4411 | |
| 4412 | |
| 22874 | 4413 |
| 88365 | 4414 /*** 10, 11. no-conversion handlers ***/ |
| 17052 | 4415 |
| 29005 | 4416 /* See the above "GENERAL NOTES on `decode_coding_XXX ()' functions". */ |
| 4417 | |
| 4418 static void | |
| 88365 | 4419 decode_coding_raw_text (coding) |
| 17052 | 4420 struct coding_system *coding; |
| 4421 { | |
| 88365 | 4422 coding->chars_at_source = 1; |
| 88456 | 4423 coding->consumed_char = 0; |
| 4424 coding->consumed = 0; | |
| 88365 | 4425 coding->result = CODING_RESULT_SUCCESS; |
| 4426 } | |
| 4427 | |
| 4428 static int | |
| 4429 encode_coding_raw_text (coding) | |
| 4430 struct coding_system *coding; | |
| 4431 { | |
| 4432 int multibytep = coding->dst_multibyte; | |
| 4433 int *charbuf = coding->charbuf; | |
| 4434 int *charbuf_end = coding->charbuf + coding->charbuf_used; | |
| 4435 unsigned char *dst = coding->destination + coding->produced; | |
| 4436 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 4437 int produced_chars = 0; | |
| 29005 | 4438 int c; |
| 4439 | |
| 88365 | 4440 if (multibytep) |
| 4441 { | |
| 4442 int safe_room = MAX_MULTIBYTE_LENGTH * 2; | |
| 4443 | |
| 4444 if (coding->src_multibyte) | |
| 4445 while (charbuf < charbuf_end) | |
| 4446 { | |
| 4447 ASSURE_DESTINATION (safe_room); | |
| 4448 c = *charbuf++; | |
| 4449 if (ASCII_CHAR_P (c)) | |
| 4450 EMIT_ONE_ASCII_BYTE (c); | |
| 4451 else if (CHAR_BYTE8_P (c)) | |
| 4452 { | |
| 4453 c = CHAR_TO_BYTE8 (c); | |
| 4454 EMIT_ONE_BYTE (c); | |
| 4455 } | |
| 4456 else | |
| 4457 { | |
| 4458 unsigned char str[MAX_MULTIBYTE_LENGTH], *p0 = str, *p1 = str; | |
| 4459 | |
| 4460 CHAR_STRING_ADVANCE (c, p1); | |
| 4461 while (p0 < p1) | |
| 88950 | 4462 { |
| 4463 EMIT_ONE_BYTE (*p0); | |
| 4464 p0++; | |
| 4465 } | |
| 88365 | 4466 } |
| 4467 } | |
| 4468 else | |
| 4469 while (charbuf < charbuf_end) | |
| 4470 { | |
| 4471 ASSURE_DESTINATION (safe_room); | |
| 4472 c = *charbuf++; | |
| 29005 | 4473 EMIT_ONE_BYTE (c); |
| 88365 | 4474 } |
| 20718 | 4475 } |
| 4476 else | |
| 4477 { | |
| 88365 | 4478 if (coding->src_multibyte) |
| 20718 | 4479 { |
| 88365 | 4480 int safe_room = MAX_MULTIBYTE_LENGTH; |
| 4481 | |
| 4482 while (charbuf < charbuf_end) | |
| 4483 { | |
| 4484 ASSURE_DESTINATION (safe_room); | |
| 4485 c = *charbuf++; | |
| 4486 if (ASCII_CHAR_P (c)) | |
| 4487 *dst++ = c; | |
| 4488 else if (CHAR_BYTE8_P (c)) | |
| 4489 *dst++ = CHAR_TO_BYTE8 (c); | |
| 4490 else | |
| 4491 CHAR_STRING_ADVANCE (c, dst); | |
| 4492 produced_chars++; | |
| 4493 } | |
| 20718 | 4494 } |
| 20931 | 4495 else |
| 20718 | 4496 { |
| 88365 | 4497 ASSURE_DESTINATION (charbuf_end - charbuf); |
| 4498 while (charbuf < charbuf_end && dst < dst_end) | |
| 4499 *dst++ = *charbuf++; | |
| 4500 produced_chars = dst - (coding->destination + coding->dst_bytes); | |
| 89483 | 4501 } |
| 88365 | 4502 } |
| 4503 coding->result = CODING_RESULT_SUCCESS; | |
| 4504 coding->produced_char += produced_chars; | |
| 4505 coding->produced = dst - coding->destination; | |
| 4506 return 0; | |
| 4507 } | |
| 4508 | |
| 89331 | 4509 /* See the above "GENERAL NOTES on `detect_coding_XXX ()' functions". |
| 4510 Check if a text is encoded in a charset-based coding system. If it | |
| 4511 is, return 1, else return 0. */ | |
| 4512 | |
| 88365 | 4513 static int |
| 89331 | 4514 detect_coding_charset (coding, detect_info) |
| 88365 | 4515 struct coding_system *coding; |
| 89331 | 4516 struct coding_detection_info *detect_info; |
| 88365 | 4517 { |
| 89483 | 4518 const unsigned char *src = coding->source, *src_base = src; |
| 4519 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 88365 | 4520 int multibytep = coding->src_multibyte; |
| 4521 int consumed_chars = 0; | |
| 4522 Lisp_Object attrs, valids; | |
| 89225 | 4523 int found = 0; |
| 88365 | 4524 |
| 89331 | 4525 detect_info->checked |= CATEGORY_MASK_CHARSET; |
| 4526 | |
| 88365 | 4527 coding = &coding_categories[coding_category_charset]; |
| 4528 attrs = CODING_ID_ATTRS (coding->id); | |
| 4529 valids = AREF (attrs, coding_attr_charset_valids); | |
| 4530 | |
| 4531 if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))) | |
| 4532 src += coding->head_ascii; | |
| 4533 | |
| 4534 while (1) | |
| 4535 { | |
| 4536 int c; | |
| 4537 | |
| 4538 ONE_MORE_BYTE (c); | |
| 4539 if (NILP (AREF (valids, c))) | |
| 4540 break; | |
| 89225 | 4541 if (c >= 0x80) |
| 89331 | 4542 found = CATEGORY_MASK_CHARSET; |
| 4543 } | |
| 4544 detect_info->rejected |= CATEGORY_MASK_CHARSET; | |
| 88365 | 4545 return 0; |
| 4546 | |
| 4547 no_more_source: | |
| 89331 | 4548 detect_info->found |= found; |
| 4549 return 1; | |
| 88365 | 4550 } |
| 4551 | |
| 4552 static void | |
| 4553 decode_coding_charset (coding) | |
| 4554 struct coding_system *coding; | |
| 4555 { | |
| 89483 | 4556 const unsigned char *src = coding->source + coding->consumed; |
| 4557 const unsigned char *src_end = coding->source + coding->src_bytes; | |
| 4558 const unsigned char *src_base; | |
| 88365 | 4559 int *charbuf = coding->charbuf; |
| 89331 | 4560 int *charbuf_end = charbuf + coding->charbuf_size - MAX_ANNOTATION_LENGTH; |
| 88365 | 4561 int consumed_chars = 0, consumed_chars_base; |
| 4562 int multibytep = coding->src_multibyte; | |
| 89665 | 4563 Lisp_Object attrs, charset_list, valids; |
| 89331 | 4564 int char_offset = coding->produced_char; |
| 4565 int last_offset = char_offset; | |
| 4566 int last_id = charset_ascii; | |
| 88365 | 4567 |
| 89665 | 4568 CODING_GET_INFO (coding, attrs, charset_list); |
| 88465 | 4569 valids = AREF (attrs, coding_attr_charset_valids); |
| 88365 | 4570 |
| 4571 while (1) | |
| 4572 { | |
| 88465 | 4573 int c; |
| 89665 | 4574 Lisp_Object val; |
| 4575 struct charset *charset; | |
| 4576 int dim; | |
| 4577 int len = 1; | |
| 4578 unsigned code; | |
| 88365 | 4579 |
| 4580 src_base = src; | |
| 4581 consumed_chars_base = consumed_chars; | |
| 4582 | |
| 4583 if (charbuf >= charbuf_end) | |
| 4584 break; | |
| 4585 | |
| 88465 | 4586 ONE_MORE_BYTE (c); |
| 89665 | 4587 code = c; |
| 4588 | |
| 4589 val = AREF (valids, c); | |
| 4590 if (NILP (val)) | |
| 4591 goto invalid_code; | |
| 4592 if (INTEGERP (val)) | |
| 4593 { | |
| 4594 charset = CHARSET_FROM_ID (XFASTINT (val)); | |
| 4595 dim = CHARSET_DIMENSION (charset); | |
| 4596 while (len < dim) | |
| 88365 | 4597 { |
| 89665 | 4598 ONE_MORE_BYTE (c); |
| 4599 code = (code << 8) | c; | |
| 4600 len++; | |
| 88365 | 4601 } |
| 89665 | 4602 CODING_DECODE_CHAR (coding, src, src_base, src_end, |
| 4603 charset, code, c); | |
| 29005 | 4604 } |
| 88365 | 4605 else |
| 29005 | 4606 { |
| 89665 | 4607 /* VAL is a list of charset IDs. It is assured that the |
| 4608 list is sorted by charset dimensions (smaller one | |
| 4609 comes first). */ | |
| 4610 while (CONSP (val)) | |
| 88465 | 4611 { |
| 89665 | 4612 charset = CHARSET_FROM_ID (XFASTINT (XCAR (val))); |
| 88597 | 4613 dim = CHARSET_DIMENSION (charset); |
| 88607 | 4614 while (len < dim) |
| 88465 | 4615 { |
| 88598 | 4616 ONE_MORE_BYTE (c); |
| 4617 code = (code << 8) | c; | |
| 88607 | 4618 len++; |
| 88465 | 4619 } |
| 89665 | 4620 CODING_DECODE_CHAR (coding, src, src_base, |
| 4621 src_end, charset, code, c); | |
| 4622 if (c >= 0) | |
| 4623 break; | |
| 4624 val = XCDR (val); | |
| 88465 | 4625 } |
| 89665 | 4626 } |
| 4627 if (c < 0) | |
| 4628 goto invalid_code; | |
| 4629 if (charset->id != charset_ascii | |
| 4630 && last_id != charset->id) | |
| 4631 { | |
| 4632 if (last_id != charset_ascii) | |
| 4633 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); | |
| 4634 last_id = charset->id; | |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
4635 last_offset = char_offset; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
4636 } |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
4637 |
| 88365 | 4638 *charbuf++ = c; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
4639 char_offset++; |
| 88365 | 4640 continue; |
| 4641 | |
| 4642 invalid_code: | |
| 4643 src = src_base; | |
| 4644 consumed_chars = consumed_chars_base; | |
| 4645 ONE_MORE_BYTE (c); | |
| 4646 *charbuf++ = ASCII_BYTE_P (c) ? c : BYTE8_TO_CHAR (c); | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
4647 char_offset++; |
| 88365 | 4648 coding->errors++; |
| 4649 } | |
| 4650 | |
| 4651 no_more_source: | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
4652 if (last_id != charset_ascii) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
4653 ADD_CHARSET_DATA (charbuf, last_offset, char_offset, last_id); |
| 88365 | 4654 coding->consumed_char += consumed_chars_base; |
| 4655 coding->consumed = src_base - coding->source; | |
| 4656 coding->charbuf_used = charbuf - coding->charbuf; | |
| 4657 } | |
| 4658 | |
| 4659 static int | |
| 4660 encode_coding_charset (coding) | |
| 4661 struct coding_system *coding; | |
| 4662 { | |
| 4663 int multibytep = coding->dst_multibyte; | |
| 4664 int *charbuf = coding->charbuf; | |
| 4665 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 4666 unsigned char *dst = coding->destination + coding->produced; | |
| 4667 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 4668 int safe_room = MAX_MULTIBYTE_LENGTH; | |
| 4669 int produced_chars = 0; | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
4670 Lisp_Object attrs, charset_list; |
| 88365 | 4671 int ascii_compatible; |
| 4672 int c; | |
| 4673 | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
4674 CODING_GET_INFO (coding, attrs, charset_list); |
| 88365 | 4675 ascii_compatible = ! NILP (CODING_ATTR_ASCII_COMPAT (attrs)); |
| 4676 | |
| 4677 while (charbuf < charbuf_end) | |
| 4678 { | |
|
88465
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4679 struct charset *charset; |
| 88365 | 4680 unsigned code; |
| 89483 | 4681 |
| 88365 | 4682 ASSURE_DESTINATION (safe_room); |
| 4683 c = *charbuf++; | |
| 4684 if (ascii_compatible && ASCII_CHAR_P (c)) | |
| 4685 EMIT_ONE_ASCII_BYTE (c); | |
|
88690
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
4686 else if (CHAR_BYTE8_P (c)) |
|
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
4687 { |
|
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
4688 c = CHAR_TO_BYTE8 (c); |
|
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
4689 EMIT_ONE_BYTE (c); |
|
7f284ac55b07
(encode_coding_emacs_mule): Pay attention to raw-8-bit chars.
Kenichi Handa <handa@m17n.org>
parents:
88681
diff
changeset
|
4690 } |
| 88365 | 4691 else |
|
88465
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4692 { |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4693 charset = char_charset (c, charset_list, &code); |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4694 if (charset) |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4695 { |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4696 if (CHARSET_DIMENSION (charset) == 1) |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4697 EMIT_ONE_BYTE (code); |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4698 else if (CHARSET_DIMENSION (charset) == 2) |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4699 EMIT_TWO_BYTES (code >> 8, code & 0xFF); |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4700 else if (CHARSET_DIMENSION (charset) == 3) |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4701 EMIT_THREE_BYTES (code >> 16, (code >> 8) & 0xFF, code & 0xFF); |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4702 else |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4703 EMIT_FOUR_BYTES (code >> 24, (code >> 16) & 0xFF, |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4704 (code >> 8) & 0xFF, code & 0xFF); |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4705 } |
|
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4706 else |
|
88573
133bf7ab1bad
(encode_coding_iso_2022): If coding requires safe
Kenichi Handa <handa@m17n.org>
parents:
88544
diff
changeset
|
4707 { |
|
133bf7ab1bad
(encode_coding_iso_2022): If coding requires safe
Kenichi Handa <handa@m17n.org>
parents:
88544
diff
changeset
|
4708 if (coding->mode & CODING_MODE_SAFE_ENCODING) |
|
133bf7ab1bad
(encode_coding_iso_2022): If coding requires safe
Kenichi Handa <handa@m17n.org>
parents:
88544
diff
changeset
|
4709 c = CODING_INHIBIT_CHARACTER_SUBSTITUTION; |
|
133bf7ab1bad
(encode_coding_iso_2022): If coding requires safe
Kenichi Handa <handa@m17n.org>
parents:
88544
diff
changeset
|
4710 else |
|
133bf7ab1bad
(encode_coding_iso_2022): If coding requires safe
Kenichi Handa <handa@m17n.org>
parents:
88544
diff
changeset
|
4711 c = coding->default_char; |
|
133bf7ab1bad
(encode_coding_iso_2022): If coding requires safe
Kenichi Handa <handa@m17n.org>
parents:
88544
diff
changeset
|
4712 EMIT_ONE_BYTE (c); |
|
133bf7ab1bad
(encode_coding_iso_2022): If coding requires safe
Kenichi Handa <handa@m17n.org>
parents:
88544
diff
changeset
|
4713 } |
|
88465
ae455bb40718
(decode_coding_charset, encode_coding_charset): Handle
Kenichi Handa <handa@m17n.org>
parents:
88456
diff
changeset
|
4714 } |
| 88365 | 4715 } |
| 4716 | |
| 4717 coding->result = CODING_RESULT_SUCCESS; | |
| 4718 coding->produced_char += produced_chars; | |
| 4719 coding->produced = dst - coding->destination; | |
| 4720 return 0; | |
| 17052 | 4721 } |
| 4722 | |
| 4723 | |
|
22874
b133f07a76db
(Qvalid_codes): New variable.
Kenichi Handa <handa@m17n.org>
parents:
22812
diff
changeset
|
4724 /*** 7. C library functions ***/ |
| 17052 | 4725 |
| 88365 | 4726 /* Setup coding context CODING from information about CODING_SYSTEM. |
| 4727 If CODING_SYSTEM is nil, `no-conversion' is assumed. If | |
| 4728 CODING_SYSTEM is invalid, signal an error. */ | |
| 4729 | |
| 4730 void | |
|
17119
2cfb31c15ced
(create_process, Fopen_network_stream): Typo in indexes
Kenichi Handa <handa@m17n.org>
parents:
17071
diff
changeset
|
4731 setup_coding_system (coding_system, coding) |
|
2cfb31c15ced
(create_process, Fopen_network_stream): Typo in indexes
Kenichi Handa <handa@m17n.org>
parents:
17071
diff
changeset
|
4732 Lisp_Object coding_system; |
| 17052 | 4733 struct coding_system *coding; |
| 4734 { | |
| 88365 | 4735 Lisp_Object attrs; |
| 4736 Lisp_Object eol_type; | |
| 4737 Lisp_Object coding_type; | |
|
20105
c017642863c2
(Qcoding_system_spec): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
19824
diff
changeset
|
4738 Lisp_Object val; |
| 17052 | 4739 |
|
24460
be35d27a4bfb
(setup_coding_system): Check for CODING_SYSTEM = nil.
Kenichi Handa <handa@m17n.org>
parents:
24425
diff
changeset
|
4740 if (NILP (coding_system)) |
| 88365 | 4741 coding_system = Qno_conversion; |
| 4742 | |
| 4743 CHECK_CODING_SYSTEM_GET_ID (coding_system, coding->id); | |
| 4744 | |
| 4745 attrs = CODING_ID_ATTRS (coding->id); | |
| 4746 eol_type = CODING_ID_EOL_TYPE (coding->id); | |
| 4747 | |
| 4748 coding->mode = 0; | |
| 4749 coding->head_ascii = -1; | |
| 4750 coding->common_flags | |
| 4751 = (VECTORP (eol_type) ? CODING_REQUIRE_DETECTION_MASK : 0); | |
|
89448
de8b460070cc
(setup_coding_system): If coding has
Kenichi Handa <handa@m17n.org>
parents:
89446
diff
changeset
|
4752 if (! NILP (CODING_ATTR_POST_READ (attrs))) |
|
de8b460070cc
(setup_coding_system): If coding has
Kenichi Handa <handa@m17n.org>
parents:
89446
diff
changeset
|
4753 coding->common_flags |= CODING_REQUIRE_DECODING_MASK; |
|
de8b460070cc
(setup_coding_system): If coding has
Kenichi Handa <handa@m17n.org>
parents:
89446
diff
changeset
|
4754 if (! NILP (CODING_ATTR_PRE_WRITE (attrs))) |
|
de8b460070cc
(setup_coding_system): If coding has
Kenichi Handa <handa@m17n.org>
parents:
89446
diff
changeset
|
4755 coding->common_flags |= CODING_REQUIRE_ENCODING_MASK; |
| 89483 | 4756 if (! NILP (CODING_ATTR_FOR_UNIBYTE (attrs))) |
| 4757 coding->common_flags |= CODING_FOR_UNIBYTE_MASK; | |
| 88365 | 4758 |
| 4759 val = CODING_ATTR_SAFE_CHARSETS (attrs); | |
| 89483 | 4760 coding->max_charset_id = SCHARS (val) - 1; |
| 4761 coding->safe_charsets = (char *) SDATA (val); | |
| 88365 | 4762 coding->default_char = XINT (CODING_ATTR_DEFAULT_CHAR (attrs)); |
| 4763 | |
| 4764 coding_type = CODING_ATTR_TYPE (attrs); | |
| 4765 if (EQ (coding_type, Qundecided)) | |
| 4766 { | |
| 4767 coding->detector = NULL; | |
| 4768 coding->decoder = decode_coding_raw_text; | |
| 4769 coding->encoder = encode_coding_raw_text; | |
| 4770 coding->common_flags |= CODING_REQUIRE_DETECTION_MASK; | |
| 4771 } | |
| 4772 else if (EQ (coding_type, Qiso_2022)) | |
| 4773 { | |
| 4774 int i; | |
| 4775 int flags = XINT (AREF (attrs, coding_attr_iso_flags)); | |
| 4776 | |
| 4777 /* Invoke graphic register 0 to plane 0. */ | |
| 4778 CODING_ISO_INVOCATION (coding, 0) = 0; | |
| 4779 /* Invoke graphic register 1 to plane 1 if we can use 8-bit. */ | |
| 4780 CODING_ISO_INVOCATION (coding, 1) | |
| 4781 = (flags & CODING_ISO_FLAG_SEVEN_BITS ? -1 : 1); | |
| 4782 /* Setup the initial status of designation. */ | |
| 4783 for (i = 0; i < 4; i++) | |
| 4784 CODING_ISO_DESIGNATION (coding, i) = CODING_ISO_INITIAL (coding, i); | |
| 4785 /* Not single shifting initially. */ | |
| 4786 CODING_ISO_SINGLE_SHIFTING (coding) = 0; | |
| 4787 /* Beginning of buffer should also be regarded as bol. */ | |
| 4788 CODING_ISO_BOL (coding) = 1; | |
| 4789 coding->detector = detect_coding_iso_2022; | |
| 4790 coding->decoder = decode_coding_iso_2022; | |
| 4791 coding->encoder = encode_coding_iso_2022; | |
| 4792 if (flags & CODING_ISO_FLAG_SAFE) | |
| 4793 coding->mode |= CODING_MODE_SAFE_ENCODING; | |
|
20227
71008f909642
(setup_coding_system): Initialize common_flags member
Kenichi Handa <handa@m17n.org>
parents:
20150
diff
changeset
|
4794 coding->common_flags |
| 88365 | 4795 |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK |
| 4796 | CODING_REQUIRE_FLUSHING_MASK); | |
| 4797 if (flags & CODING_ISO_FLAG_COMPOSITION) | |
| 4798 coding->common_flags |= CODING_ANNOTATE_COMPOSITION_MASK; | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
4799 if (flags & CODING_ISO_FLAG_DESIGNATION) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
4800 coding->common_flags |= CODING_ANNOTATE_CHARSET_MASK; |
| 88365 | 4801 if (flags & CODING_ISO_FLAG_FULL_SUPPORT) |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4802 { |
| 88365 | 4803 setup_iso_safe_charsets (attrs); |
| 4804 val = CODING_ATTR_SAFE_CHARSETS (attrs); | |
| 89483 | 4805 coding->max_charset_id = SCHARS (val) - 1; |
| 4806 coding->safe_charsets = (char *) SDATA (val); | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4807 } |
| 88365 | 4808 CODING_ISO_FLAGS (coding) = flags; |
| 4809 } | |
| 4810 else if (EQ (coding_type, Qcharset)) | |
| 4811 { | |
| 4812 coding->detector = detect_coding_charset; | |
| 4813 coding->decoder = decode_coding_charset; | |
| 4814 coding->encoder = encode_coding_charset; | |
| 4815 coding->common_flags | |
| 4816 |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK); | |
| 4817 } | |
| 4818 else if (EQ (coding_type, Qutf_8)) | |
| 4819 { | |
| 4820 coding->detector = detect_coding_utf_8; | |
| 4821 coding->decoder = decode_coding_utf_8; | |
| 4822 coding->encoder = encode_coding_utf_8; | |
|
34888
b469d29c0815
(SAFE_ONE_MORE_BYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34813
diff
changeset
|
4823 coding->common_flags |
| 88365 | 4824 |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK); |
| 4825 } | |
| 4826 else if (EQ (coding_type, Qutf_16)) | |
| 4827 { | |
| 4828 val = AREF (attrs, coding_attr_utf_16_bom); | |
| 4829 CODING_UTF_16_BOM (coding) = (CONSP (val) ? utf_16_detect_bom | |
| 4830 : EQ (val, Qt) ? utf_16_with_bom | |
| 4831 : utf_16_without_bom); | |
| 4832 val = AREF (attrs, coding_attr_utf_16_endian); | |
|
89420
c3e67ce6ee0f
(Qsignature, Qendian): Delete these variables.
Kenichi Handa <handa@m17n.org>
parents:
89418
diff
changeset
|
4833 CODING_UTF_16_ENDIAN (coding) = (EQ (val, Qbig) ? utf_16_big_endian |
| 88365 | 4834 : utf_16_little_endian); |
|
88438
3a34b722dd71
(encode_coding_utf_8): Initialize produced_chars to 0.
Kenichi Handa <handa@m17n.org>
parents:
88430
diff
changeset
|
4835 CODING_UTF_16_SURROGATE (coding) = 0; |
| 88365 | 4836 coding->detector = detect_coding_utf_16; |
| 4837 coding->decoder = decode_coding_utf_16; | |
| 4838 coding->encoder = encode_coding_utf_16; | |
|
20227
71008f909642
(setup_coding_system): Initialize common_flags member
Kenichi Handa <handa@m17n.org>
parents:
20150
diff
changeset
|
4839 coding->common_flags |
| 88365 | 4840 |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK); |
|
89420
c3e67ce6ee0f
(Qsignature, Qendian): Delete these variables.
Kenichi Handa <handa@m17n.org>
parents:
89418
diff
changeset
|
4841 if (CODING_UTF_16_BOM (coding) == utf_16_detect_bom) |
|
c3e67ce6ee0f
(Qsignature, Qendian): Delete these variables.
Kenichi Handa <handa@m17n.org>
parents:
89418
diff
changeset
|
4842 coding->common_flags |= CODING_REQUIRE_DETECTION_MASK; |
| 88365 | 4843 } |
| 4844 else if (EQ (coding_type, Qccl)) | |
| 4845 { | |
| 4846 coding->detector = detect_coding_ccl; | |
| 4847 coding->decoder = decode_coding_ccl; | |
| 4848 coding->encoder = encode_coding_ccl; | |
|
20227
71008f909642
(setup_coding_system): Initialize common_flags member
Kenichi Handa <handa@m17n.org>
parents:
20150
diff
changeset
|
4849 coding->common_flags |
| 88365 | 4850 |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK |
| 4851 | CODING_REQUIRE_FLUSHING_MASK); | |
| 4852 } | |
| 4853 else if (EQ (coding_type, Qemacs_mule)) | |
| 4854 { | |
| 4855 coding->detector = detect_coding_emacs_mule; | |
| 4856 coding->decoder = decode_coding_emacs_mule; | |
| 4857 coding->encoder = encode_coding_emacs_mule; | |
| 4858 coding->common_flags | |
| 4859 |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK); | |
| 4860 if (! NILP (AREF (attrs, coding_attr_emacs_mule_full)) | |
| 4861 && ! EQ (CODING_ATTR_CHARSET_LIST (attrs), Vemacs_mule_charset_list)) | |
| 4862 { | |
| 4863 Lisp_Object tail, safe_charsets; | |
| 4864 int max_charset_id = 0; | |
| 4865 | |
| 4866 for (tail = Vemacs_mule_charset_list; CONSP (tail); | |
| 4867 tail = XCDR (tail)) | |
| 4868 if (max_charset_id < XFASTINT (XCAR (tail))) | |
| 4869 max_charset_id = XFASTINT (XCAR (tail)); | |
| 4870 safe_charsets = Fmake_string (make_number (max_charset_id + 1), | |
| 4871 make_number (255)); | |
| 4872 for (tail = Vemacs_mule_charset_list; CONSP (tail); | |
| 4873 tail = XCDR (tail)) | |
| 89483 | 4874 SSET (safe_charsets, XFASTINT (XCAR (tail)), 0); |
| 88365 | 4875 coding->max_charset_id = max_charset_id; |
| 89483 | 4876 coding->safe_charsets = (char *) SDATA (safe_charsets); |
| 88365 | 4877 } |
| 4878 } | |
| 4879 else if (EQ (coding_type, Qshift_jis)) | |
| 4880 { | |
| 4881 coding->detector = detect_coding_sjis; | |
| 4882 coding->decoder = decode_coding_sjis; | |
| 4883 coding->encoder = encode_coding_sjis; | |
| 4884 coding->common_flags | |
| 4885 |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK); | |
| 4886 } | |
| 4887 else if (EQ (coding_type, Qbig5)) | |
| 4888 { | |
| 4889 coding->detector = detect_coding_big5; | |
| 4890 coding->decoder = decode_coding_big5; | |
| 4891 coding->encoder = encode_coding_big5; | |
|
20227
71008f909642
(setup_coding_system): Initialize common_flags member
Kenichi Handa <handa@m17n.org>
parents:
20150
diff
changeset
|
4892 coding->common_flags |
| 88365 | 4893 |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK); |
| 4894 } | |
| 4895 else /* EQ (coding_type, Qraw_text) */ | |
| 4896 { | |
| 4897 coding->detector = NULL; | |
| 4898 coding->decoder = decode_coding_raw_text; | |
| 4899 coding->encoder = encode_coding_raw_text; | |
| 4900 } | |
| 4901 | |
| 4902 return; | |
| 17052 | 4903 } |
| 4904 | |
| 88365 | 4905 /* Return raw-text or one of its subsidiaries that has the same |
| 4906 eol_type as CODING-SYSTEM. */ | |
| 4907 | |
| 4908 Lisp_Object | |
| 4909 raw_text_coding_system (coding_system) | |
| 4910 Lisp_Object coding_system; | |
| 26847 | 4911 { |
|
88430
6418a272b97e
* coding.c: Delete unused variables.
Kenichi Handa <handa@m17n.org>
parents:
88365
diff
changeset
|
4912 Lisp_Object spec, attrs; |
| 88365 | 4913 Lisp_Object eol_type, raw_text_eol_type; |
| 4914 | |
|
89462
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
4915 if (NILP (coding_system)) |
|
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
4916 return Qraw_text; |
| 88365 | 4917 spec = CODING_SYSTEM_SPEC (coding_system); |
| 4918 attrs = AREF (spec, 0); | |
| 89483 | 4919 |
| 88365 | 4920 if (EQ (CODING_ATTR_TYPE (attrs), Qraw_text)) |
| 4921 return coding_system; | |
| 4922 | |
| 4923 eol_type = AREF (spec, 2); | |
| 4924 if (VECTORP (eol_type)) | |
| 4925 return Qraw_text; | |
| 4926 spec = CODING_SYSTEM_SPEC (Qraw_text); | |
| 4927 raw_text_eol_type = AREF (spec, 2); | |
| 4928 return (EQ (eol_type, Qunix) ? AREF (raw_text_eol_type, 0) | |
| 4929 : EQ (eol_type, Qdos) ? AREF (raw_text_eol_type, 1) | |
| 4930 : AREF (raw_text_eol_type, 2)); | |
| 26847 | 4931 } |
| 4932 | |
| 88365 | 4933 |
| 4934 /* If CODING_SYSTEM doesn't specify end-of-line format but PARENT | |
| 4935 does, return one of the subsidiaries that has the same eol-spec as | |
| 4936 PARENT. Otherwise, return CODING_SYSTEM. */ | |
| 4937 | |
| 4938 Lisp_Object | |
| 4939 coding_inherit_eol_type (coding_system, parent) | |
| 88473 | 4940 Lisp_Object coding_system, parent; |
|
22616
c493ce6a31e4
(setup_raw_text_coding_system): New function.
Kenichi Handa <handa@m17n.org>
parents:
22529
diff
changeset
|
4941 { |
|
89545
4f394eed6ff2
(inhibit_pre_post_conversion): Removed (unused).
Dave Love <fx@gnu.org>
parents:
89519
diff
changeset
|
4942 Lisp_Object spec, eol_type; |
| 88365 | 4943 |
|
89462
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
4944 if (NILP (coding_system)) |
|
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
4945 coding_system = Qraw_text; |
| 88365 | 4946 spec = CODING_SYSTEM_SPEC (coding_system); |
| 4947 eol_type = AREF (spec, 2); | |
|
89462
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
4948 if (VECTORP (eol_type) |
|
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
4949 && ! NILP (parent)) |
| 88365 | 4950 { |
| 4951 Lisp_Object parent_spec; | |
| 4952 Lisp_Object parent_eol_type; | |
| 4953 | |
| 4954 parent_spec | |
| 4955 = CODING_SYSTEM_SPEC (parent); | |
| 4956 parent_eol_type = AREF (parent_spec, 2); | |
| 4957 if (EQ (parent_eol_type, Qunix)) | |
| 4958 coding_system = AREF (eol_type, 0); | |
| 4959 else if (EQ (parent_eol_type, Qdos)) | |
| 4960 coding_system = AREF (eol_type, 1); | |
| 4961 else if (EQ (parent_eol_type, Qmac)) | |
| 4962 coding_system = AREF (eol_type, 2); | |
| 4963 } | |
| 4964 return coding_system; | |
|
22616
c493ce6a31e4
(setup_raw_text_coding_system): New function.
Kenichi Handa <handa@m17n.org>
parents:
22529
diff
changeset
|
4965 } |
|
c493ce6a31e4
(setup_raw_text_coding_system): New function.
Kenichi Handa <handa@m17n.org>
parents:
22529
diff
changeset
|
4966 |
| 17052 | 4967 /* Emacs has a mechanism to automatically detect a coding system if it |
| 4968 is Emacs' internal format, ISO2022, SJIS, or BIG5. However, | |
| 4969 it's impossible to distinguish some coding systems accurately | |
| 4970 because they use the same range of codes. So, at first, coding | |
| 4971 systems are grouped into the categories listed below: | |
| 4972 | |
|
17835
f36ffb6f1208
Name change through the code:
Kenichi Handa <handa@m17n.org>
parents:
17725
diff
changeset
|
4973 o coding-category-emacs-mule |
| 17052 | 4974 |
| 4975 The category for a coding system which has the same code range | |
| 4976 as Emacs' internal format. Assigned the coding-system (Lisp | |
|
17835
f36ffb6f1208
Name change through the code:
Kenichi Handa <handa@m17n.org>
parents:
17725
diff
changeset
|
4977 symbol) `emacs-mule' by default. |
| 17052 | 4978 |
| 4979 o coding-category-sjis | |
| 4980 | |
| 4981 The category for a coding system which has the same code range | |
| 4982 as SJIS. Assigned the coding-system (Lisp | |
|
18787
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
4983 symbol) `japanese-shift-jis' by default. |
| 17052 | 4984 |
| 4985 o coding-category-iso-7 | |
| 4986 | |
| 4987 The category for a coding system which has the same code range | |
|
18787
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
4988 as ISO2022 of 7-bit environment. This doesn't use any locking |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4989 shift and single shift functions. This can encode/decode all |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4990 charsets. Assigned the coding-system (Lisp symbol) |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4991 `iso-2022-7bit' by default. |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4992 |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4993 o coding-category-iso-7-tight |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4994 |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4995 Same as coding-category-iso-7 except that this can |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
4996 encode/decode only the specified charsets. |
| 17052 | 4997 |
| 4998 o coding-category-iso-8-1 | |
| 4999 | |
| 5000 The category for a coding system which has the same code range | |
| 5001 as ISO2022 of 8-bit environment and graphic plane 1 used only | |
|
18787
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5002 for DIMENSION1 charset. This doesn't use any locking shift |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5003 and single shift functions. Assigned the coding-system (Lisp |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5004 symbol) `iso-latin-1' by default. |
| 17052 | 5005 |
| 5006 o coding-category-iso-8-2 | |
| 5007 | |
| 5008 The category for a coding system which has the same code range | |
| 5009 as ISO2022 of 8-bit environment and graphic plane 1 used only | |
|
18787
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5010 for DIMENSION2 charset. This doesn't use any locking shift |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5011 and single shift functions. Assigned the coding-system (Lisp |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5012 symbol) `japanese-iso-8bit' by default. |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5013 |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5014 o coding-category-iso-7-else |
| 17052 | 5015 |
| 5016 The category for a coding system which has the same code range | |
| 88365 | 5017 as ISO2022 of 7-bit environment but uses locking shift or |
|
18787
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5018 single shift functions. Assigned the coding-system (Lisp |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5019 symbol) `iso-2022-7bit-lock' by default. |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5020 |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5021 o coding-category-iso-8-else |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5022 |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5023 The category for a coding system which has the same code range |
| 88365 | 5024 as ISO2022 of 8-bit environment but uses locking shift or |
|
18787
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5025 single shift functions. Assigned the coding-system (Lisp |
|
954e6be0a757
(detect_coding_iso2022): Distinguish coding-category-iso-7-else and
Kenichi Handa <handa@m17n.org>
parents:
18766
diff
changeset
|
5026 symbol) `iso-2022-8bit-ss2' by default. |
| 17052 | 5027 |
| 5028 o coding-category-big5 | |
| 5029 | |
| 5030 The category for a coding system which has the same code range | |
| 5031 as BIG5. Assigned the coding-system (Lisp symbol) | |
|
17119
2cfb31c15ced
(create_process, Fopen_network_stream): Typo in indexes
Kenichi Handa <handa@m17n.org>
parents:
17071
diff
changeset
|
5032 `cn-big5' by default. |
| 17052 | 5033 |
|
28022
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5034 o coding-category-utf-8 |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5035 |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5036 The category for a coding system which has the same code range |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5037 as UTF-8 (cf. RFC2279). Assigned the coding-system (Lisp |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5038 symbol) `utf-8' by default. |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5039 |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5040 o coding-category-utf-16-be |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5041 |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5042 The category for a coding system in which a text has an |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5043 Unicode signature (cf. Unicode Standard) in the order of BIG |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5044 endian at the head. Assigned the coding-system (Lisp symbol) |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5045 `utf-16-be' by default. |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5046 |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5047 o coding-category-utf-16-le |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5048 |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5049 The category for a coding system in which a text has an |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5050 Unicode signature (cf. Unicode Standard) in the order of |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5051 LITTLE endian at the head. Assigned the coding-system (Lisp |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5052 symbol) `utf-16-le' by default. |
|
6c41f3276340
Add comments on coding-category-utf-8,
Kenichi Handa <handa@m17n.org>
parents:
27943
diff
changeset
|
5053 |
|
22874
b133f07a76db
(Qvalid_codes): New variable.
Kenichi Handa <handa@m17n.org>
parents:
22812
diff
changeset
|
5054 o coding-category-ccl |
|
b133f07a76db
(Qvalid_codes): New variable.
Kenichi Handa <handa@m17n.org>
parents:
22812
diff
changeset
|
5055 |
|
b133f07a76db
(Qvalid_codes): New variable.
Kenichi Handa <handa@m17n.org>
parents:
22812
diff
changeset
|
5056 The category for a coding system of which encoder/decoder is |
|
b133f07a76db
(Qvalid_codes): New variable.
Kenichi Handa <handa@m17n.org>
parents:
22812
diff
changeset
|
5057 written in CCL programs. The default value is nil, i.e., no |
|
b133f07a76db
(Qvalid_codes): New variable.
Kenichi Handa <handa@m17n.org>
parents:
22812
diff
changeset
|
5058 coding system is assigned. |
|
b133f07a76db
(Qvalid_codes): New variable.
Kenichi Handa <handa@m17n.org>
parents:
22812
diff
changeset
|
5059 |
| 17052 | 5060 o coding-category-binary |
| 5061 | |
| 5062 The category for a coding system not categorized in any of the | |
| 5063 above. Assigned the coding-system (Lisp symbol) | |
|
17119
2cfb31c15ced
(create_process, Fopen_network_stream): Typo in indexes
Kenichi Handa <handa@m17n.org>
parents:
17071
diff
changeset
|
5064 `no-conversion' by default. |
| 17052 | 5065 |
| 5066 Each of them is a Lisp symbol and the value is an actual | |
| 88365 | 5067 `coding-system's (this is also a Lisp symbol) assigned by a user. |
| 17052 | 5068 What Emacs does actually is to detect a category of coding system. |
| 5069 Then, it uses a `coding-system' assigned to it. If Emacs can't | |
| 88365 | 5070 decide only one possible category, it selects a category of the |
| 17052 | 5071 highest priority. Priorities of categories are also specified by a |
| 5072 user in a Lisp variable `coding-category-list'. | |
| 5073 | |
| 5074 */ | |
| 5075 | |
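
The priority rule described in the comment above (when several categories match, the one listed first in `coding-category-list` wins) can be illustrated with a small standalone sketch. The enum members, the function name `sketch_pick_category`, and the fallback to a binary category are assumptions made for this example only; they are not the definitions used in coding.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical category indices for illustration; the real enum in
   coding.c has more members and different names.  */
enum sketch_category
{
  SK_UTF_8,
  SK_BIG5,
  SK_ISO_8_ELSE,
  SK_BINARY
};

/* Return the first category in PRIORITIES (highest priority first)
   whose bit is set in FOUND_MASK.  Fall back to SK_BINARY when no
   listed category matched (an assumption of this sketch).  */
enum sketch_category
sketch_pick_category (const enum sketch_category *priorities, size_t n,
                      unsigned found_mask)
{
  size_t i;

  assert (priorities != NULL);
  for (i = 0; i < n; i++)
    if (found_mask & (1u << priorities[i]))
      return priorities[i];
  return SK_BINARY;
}
```

With the priority list `{ SK_BIG5, SK_UTF_8, SK_ISO_8_ELSE }`, a text that matches both UTF-8 and BIG5 is assigned BIG5, because BIG5 comes first in the list.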
| 88365 | 5076 #define EOL_SEEN_NONE 0 |
| 88365 | 5077 #define EOL_SEEN_LF 1 |
| 88365 | 5078 #define EOL_SEEN_CR 2 |
| 88365 | 5079 #define EOL_SEEN_CRLF 4 |
| 88365 | 5080 |
| 89331 | 5081 /* Detect how end-of-line of a text of length SRC_BYTES pointed by |
| 89331 | 5082 SOURCE is encoded. If CATEGORY is one of |
| 89331 | 5083 coding_category_utf_16_XXXX, assume that CR and LF are encoded by |
| 89331 | 5084 two-byte, else they are encoded by one-byte. |
| 89331 | 5085 |
| 89331 | 5086 Return one of EOL_SEEN_XXX. */ |
| 17052 | 5087 |
| 19173 | 5088 #define MAX_EOL_CHECK_COUNT 3 |
| 19173 | 5089 |
| 20718 | 5090 static int |
| 89193 | 5091 detect_eol (source, src_bytes, category) |
| 20718 | 5092 unsigned char *source; |
| 88365 | 5093 EMACS_INT src_bytes; |
| 89193 | 5094 enum coding_category category; |
| 17052 | 5095 { |
| 20718 | 5096 unsigned char *src = source, *src_end = src + src_bytes; |
| 17052 | 5097 unsigned char c; |
| 88365 | 5098 int total = 0; |
| 88365 | 5099 int eol_seen = EOL_SEEN_NONE; |
| 88365 | 5100 |
| 89193 | 5101 if ((1 << category) & CATEGORY_MASK_UTF_16) |
| 88365 | 5102 { |
| 88365 | 5103 int msb, lsb; |
| 88365 | 5104 |
| 89193 | 5105 msb = category == (coding_category_utf_16_le |
| 89193 | 5106 | coding_category_utf_16_le_nosig); |
| 88365 | 5107 lsb = 1 - msb; |
| 88365 | 5108 |
| 88365 | 5109 while (src + 1 < src_end) |
| 17052 | 5110 { |
| 88365 | 5111 c = src[lsb]; |
| 88365 | 5112 if (src[msb] == 0 && (c == '\n' || c == '\r')) |
| 20718 | 5113 { |
| 88365 | 5114 int this_eol; |
| 88365 | 5115 |
| 88365 | 5116 if (c == '\n') |
| 88365 | 5117 this_eol = EOL_SEEN_LF; |
| 88365 | 5118 else if (src + 3 >= src_end |
| 88365 | 5119 || src[msb + 2] != 0 |
| 88365 | 5120 || src[lsb + 2] != '\n') |
| 88365 | 5121 this_eol = EOL_SEEN_CR; |
| 88365 | 5122 else |
| 89483 | 5123 this_eol = EOL_SEEN_CRLF; |
| 88365 | 5124 |
| 88365 | 5125 if (eol_seen == EOL_SEEN_NONE) |
| 88365 | 5126 /* This is the first end-of-line. */ |
| 88365 | 5127 eol_seen = this_eol; |
| 88365 | 5128 else if (eol_seen != this_eol) |
| 88365 | 5129 { |
| 88365 | 5130 /* The found type is different from what was found before. */ |
| 88365 | 5131 eol_seen = EOL_SEEN_LF; |
| 88365 | 5132 break; |
| 88365 | 5133 } |
| 88365 | 5134 if (++total == MAX_EOL_CHECK_COUNT) |
| 88365 | 5135 break; |
| 88365 | 5136 } |
| 88365 | 5137 src += 2; |
| 88365 | 5138 } |
| 30833 | 5139 } |
| 88365 | 5140 else |
| 88365 | 5141 { |
| 88365 | 5142 while (src < src_end) |
| 88365 | 5143 { |
| 88365 | 5144 c = *src++; |
| 88365 | 5145 if (c == '\n' || c == '\r') |
| 88365 | 5146 { |
| 88365 | 5147 int this_eol; |
| 88365 | 5148 |
| 88365 | 5149 if (c == '\n') |
| 88365 | 5150 this_eol = EOL_SEEN_LF; |
| 88365 | 5151 else if (src >= src_end || *src != '\n') |
| 88365 | 5152 this_eol = EOL_SEEN_CR; |
| 88365 | 5153 else |
| 88365 | 5154 this_eol = EOL_SEEN_CRLF, src++; |
| 88365 | 5155 |
| 88365 | 5156 if (eol_seen == EOL_SEEN_NONE) |
| 88365 | 5157 /* This is the first end-of-line. */ |
| 88365 | 5158 eol_seen = this_eol; |
| 88365 | 5159 else if (eol_seen != this_eol) |
| 88365 | 5160 { |
| 88365 | 5161 /* The found type is different from what was found before. */ |
| 88365 | 5162 eol_seen = EOL_SEEN_LF; |
| 88365 | 5163 break; |
| 88365 | 5164 } |
| 88365 | 5165 if (++total == MAX_EOL_CHECK_COUNT) |
| 88365 | 5166 break; |
| 20718 | 5167 } |
| 17052 | 5168 } |
| 17052 | 5169 } |
| 88365 | 5170 return eol_seen; |
| 17052 | 5171 } |
| 17052 | 5172 |
| 88365 | 5173 |
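
The single-byte branch of `detect_eol` can be tried outside Emacs with a minimal standalone sketch. It follows the same rules as the listing (inspect at most three line endings, and fall back to LF as soon as two of them disagree), but the names `sketch_detect_eol`, `SK_EOL_*`, and `SK_MAX_EOL_CHECK` are invented for this example, and the UTF-16 branch is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants mirroring EOL_SEEN_XXX and
   MAX_EOL_CHECK_COUNT from the listing; the names are hypothetical.  */
enum { SK_EOL_NONE = 0, SK_EOL_LF = 1, SK_EOL_CR = 2, SK_EOL_CRLF = 4 };
#define SK_MAX_EOL_CHECK 3

/* Classify the end-of-line convention of the LEN bytes at SRC,
   assuming CR and LF are each encoded as one byte.  */
int
sketch_detect_eol (const unsigned char *src, size_t len)
{
  const unsigned char *end;
  int seen = SK_EOL_NONE;
  int total = 0;

  assert (src != NULL);
  end = src + len;
  while (src < end)
    {
      unsigned char c = *src++;
      int this_eol;

      if (c != '\n' && c != '\r')
        continue;
      if (c == '\n')
        this_eol = SK_EOL_LF;
      else if (src >= end || *src != '\n')
        this_eol = SK_EOL_CR;
      else
        {
          /* CR immediately followed by LF: consume the LF as well.  */
          this_eol = SK_EOL_CRLF;
          src++;
        }

      if (seen == SK_EOL_NONE)
        seen = this_eol;        /* First end-of-line found.  */
      else if (seen != this_eol)
        return SK_EOL_LF;       /* Inconsistent text: treat as LF.  */
      if (++total == SK_MAX_EOL_CHECK)
        break;
    }
  return seen;
}
```

For instance, a buffer holding `a\r\nb\r\n` is classified as CRLF, while `a\r\nb\n` mixes two conventions and is therefore reported as LF.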
| 89665 | 5174 static Lisp_Object |
| 88365 | 5175 adjust_coding_eol_type (coding, eol_seen) |
| 88365 | 5176 struct coding_system *coding; |
| 88365 | 5177 int eol_seen; |
| 88365 | 5178 { |
| 88430 | 5179 Lisp_Object eol_type; |
| 89483 | 5180 |
| 88365 | 5181 eol_type = CODING_ID_EOL_TYPE (coding->id); |
| 88365 | 5182 if (eol_seen & EOL_SEEN_LF) |
| 89665 | 5183 { |
| 89665 | 5184 coding->id = CODING_SYSTEM_ID (AREF (eol_type, 0)); |
| 89665 | 5185 eol_type = Qunix; |
| 89665 | 5186 } |
| 88862 | 5187 else if (eol_seen & EOL_SEEN_CRLF) |
| 89665 | 5188 { |
| 89665 | 5189 coding->id = CODING_SYSTEM_ID (AREF (eol_type, 1)); |
| 89665 | 5190 eol_type = Qdos; |
| 89665 | 5191 } |
| 88862 | 5192 else if (eol_seen & EOL_SEEN_CR) |
| 89665 | 5193 { |
| 89665 | 5194 coding->id = CODING_SYSTEM_ID (AREF (eol_type, 2)); |
| 89665 | 5195 eol_type = Qmac; |
| 89665 | 5196 } |
| 89665 | 5197 return eol_type; |
| 88365 | 5198 } |
| 88365 | 5199 |
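
`adjust_coding_eol_type` tests the `eol_seen` bits in a fixed order: LF first, then CRLF, then CR. The sketch below isolates that ordering. The enum `sketch_eol_type` and the function `sketch_eol_type_of` are hypothetical stand-ins for the Lisp symbols `Qunix`, `Qdos`, and `Qmac`, and `SK_UNDECIDED` stands for the case where the function returns the original `eol_type` unchanged.

```c
#include <assert.h>

/* Illustrative bit values mirroring EOL_SEEN_XXX from the listing.  */
enum { SK_EOL_NONE = 0, SK_EOL_LF = 1, SK_EOL_CR = 2, SK_EOL_CRLF = 4 };

/* Hypothetical stand-ins for Qunix, Qdos, and Qmac.  */
enum sketch_eol_type { SK_UNDECIDED, SK_UNIX, SK_DOS, SK_MAC };

/* Map an EOL_SEEN bit mask to an end-of-line type, testing the bits
   in the same order as the listing: LF, then CRLF, then CR.  */
enum sketch_eol_type
sketch_eol_type_of (int eol_seen)
{
  /* Only the three known bits may be set.  */
  assert ((eol_seen & ~(SK_EOL_LF | SK_EOL_CR | SK_EOL_CRLF)) == 0);

  if (eol_seen & SK_EOL_LF)
    return SK_UNIX;
  if (eol_seen & SK_EOL_CRLF)
    return SK_DOS;
  if (eol_seen & SK_EOL_CR)
    return SK_MAC;
  return SK_UNDECIDED;
}
```

Because LF is tested first, any mask that includes the LF bit maps to the Unix type, whatever other bits are set.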
| 88365 | 5200 /* Detect how a text specified in CODING is encoded. If a coding |
| 88365 | 5201 system is detected, update fields of CODING by the detected coding |
| 88365 | 5202 system. */ |
| 88365 | 5203 |
| 88365 | 5204 void |
| 88365 | 5205 detect_coding (coding) |
| 88365 | 5206 struct coding_system *coding; |
| 28022 | 5207 { |
| 89483 | 5208 const unsigned char *src, *src_end; |
| 88365 | 5209 Lisp_Object attrs, coding_type; |
| 88365 | 5210 |
| 88365 | 5211 coding->consumed = coding->consumed_char = 0; |
| 88365 | 5212 coding->produced = coding->produced_char = 0; |
| 88365 | 5213 coding_set_source (coding); |
| 88365 | 5214 |
| 88365 | 5215 src_end = coding->source + coding->src_bytes; |
| 88365 | 5216 |
| 88365 | 5217 /* If we have not yet decided the text encoding type, detect it |
| 88365 | 5218 now. */ |
| 88365 | 5219 if (EQ (CODING_ATTR_TYPE (CODING_ID_ATTRS (coding->id)), Qundecided)) |
| 88365 | 5220 { |
| 88365 | 5221 int c, i; |
| 88365 | 5222 |
| 89665 | 5223 for (i = 0, src = coding->source; src < src_end; i++, src++) |
| 28022 | 5224 { |
| 88365 | 5225 c = *src; |
| 89665 | 5226 if (c & 0x80 || (c < 0x20 && (c == 0 |
| 89665 | 5227 || c == ISO_CODE_ESC |
| 88365 | 5228 || c == ISO_CODE_SI |
| 88365 | 5229 || c == ISO_CODE_SO))) |
| 88365 | 5230 break; |
| 88365 | 5231 } |
| 89665 | 5232 /* Skipped bytes must be even for utf-16 detector. */ |
| 89665 | 5233 if (i % 2) |
| 89665 | 5234 src--; |
| 88365 | 5235 coding->head_ascii = src - (coding->source + coding->consumed); |
| 88365 | 5236 |
| 88365 | 5237 if (coding->head_ascii < coding->src_bytes) |
| 88365 | 5238 { |
| 89331 | 5239 struct coding_detection_info detect_info; |
| 89331 | 5240 enum coding_category category; |
| 89331 | 5241 struct coding_system *this; |
| 89331 | 5242 |
| 89331 | 5243 detect_info.checked = detect_info.found = detect_info.rejected = 0; |
| 88365 | 5244 for (i = 0; i < coding_category_raw_text; i++) |
| 28022 | 5245 { |
| 89331 | 5246 category = coding_priorities[i]; |
| 89331 | 5247 this = coding_categories + category; |
| 88365 | 5248 if (this->id < 0) |
| 28022 | 5249 { |
| 88365 | 5250 /* No coding system of this category is defined. */ |
| 89331 | 5251 detect_info.rejected |= (1 << category); |
| 28022 | 5252 } |
| 89331 | 5253 else if (category >= coding_category_raw_text) |
| 89193 | 5254 continue; |
| 89331 | 5255 else if (detect_info.checked & (1 << category)) |
| 28022 | 5256 { |
| 89331 | 5257 if (detect_info.found & (1 << category)) |
| 89331 | 5258 break; |
| 28022 | 5259 } |
| 89331 | 5260 else if ((*(this->detector)) (coding, &detect_info) |
| 89331 | 5261 && detect_info.found & (1 << category)) |
| 89665 | 5262 { |
| 89665 | 5263 if (category == coding_category_utf_16_auto) |
| 89665 | 5264 { |
| 89665 | 5265 if (detect_info.found & CATEGORY_MASK_UTF_16_LE) |
| 89665 | 5266 category = coding_category_utf_16_le; |
| 89665 | 5267 else |
| 89665 | 5268 category = coding_category_utf_16_be; |
| 89665 | 5269 } |
| 89665 | 5270 break; |
| 89665 | 5271 } |
| 28022 | 5272 } |
| 89331 | 5273 if (i < coding_category_raw_text) |
| 89331 | 5274 setup_coding_system (CODING_ID_NAME (this->id), coding); |
| 89331 | 5275 else if (detect_info.rejected == CATEGORY_MASK_ANY) |
| 88365 | 5276 setup_coding_system (Qraw_text, coding); |
| 89331 | 5277 else if (detect_info.rejected) |
| 88365 | 5278 for (i = 0; i < coding_category_raw_text; i++) |
| 89331 | 5279 if (! (detect_info.rejected & (1 << coding_priorities[i]))) |
| 89331 | 5280 { |
| 89331 | 5281 this = coding_categories + coding_priorities[i]; |
| 89331 | 5282 setup_coding_system (CODING_ID_NAME (this->id), coding); |
| 89331 | 5283 break; |
| 89331 | 5284 } |
| 88365 | 5285 } |
| 88365 | 5286 } |
| 89665 | 5287 else if (XINT (CODING_ATTR_CATEGORY (CODING_ID_ATTRS (coding->id))) |
| 89665 | 5288 == coding_category_utf_16_auto) |
| 89420 | 5289 { |
| 89420 | 5290 Lisp_Object coding_systems; |
| 89420 | 5291 struct coding_detection_info detect_info; |
| 89420 | 5292 |
| 89420 | 5293 coding_systems |
| 89420 | 5294 = AREF (CODING_ID_ATTRS (coding->id), coding_attr_utf_16_bom); |
| 89420 | 5295 detect_info.found = detect_info.rejected = 0; |
| 89420 | 5296 if (CONSP (coding_systems) |
| 89665 | 5297 && detect_coding_utf_16 (coding, &detect_info)) |
| 89420 | 5298 { |
| 89420 | 5299 if (detect_info.found & CATEGORY_MASK_UTF_16_LE) |
| 89420 | 5300 setup_coding_system (XCAR (coding_systems), coding); |
| 89665 | 5301 else if (detect_info.found & CATEGORY_MASK_UTF_16_BE) |
| 89420 | 5302 setup_coding_system (XCDR (coding_systems), coding); |
| 89420 | 5303 } |
| 89420 | 5304 } |
| 88365 | 5305 } |
| 88365 | 5306 |
| 88365 | 5307 |
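
Before running any detector, `detect_coding` skips the leading run of plain ASCII bytes and rounds the skipped count down to an even number so that a UTF-16 detector starts on a two-byte boundary. The following standalone sketch reproduces only that prefix scan. The function name `sketch_head_ascii` is hypothetical, and the byte values 0x1B, 0x0E, and 0x0F are the standard ASCII codes for ESC, SO, and SI, which this example assumes match `ISO_CODE_ESC`, `ISO_CODE_SO`, and `ISO_CODE_SI`.

```c
#include <assert.h>
#include <stddef.h>

/* Return how many leading bytes of SRC can be skipped as plain ASCII.
   Scanning stops at the first byte that has the high bit set or that
   is NUL, ESC (0x1B), SO (0x0E), or SI (0x0F).  The result is rounded
   down to an even count, as the listing does for the UTF-16 detector.  */
size_t
sketch_head_ascii (const unsigned char *src, size_t len)
{
  size_t i;

  assert (src != NULL);
  for (i = 0; i < len; i++)
    {
      unsigned char c = src[i];

      if ((c & 0x80) || c == 0x00 || c == 0x1B || c == 0x0E || c == 0x0F)
        break;
    }
  if (i % 2)
    i--;                        /* Keep the skipped byte count even.  */
  return i;
}
```

Note that an all-ASCII buffer of odd length yields one byte less than its length, so in the listing the detection branch is still entered for such input.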
| 88365 | 5308 static void |
| 88365 | 5309 decode_eol (coding) |
| 88365 | 5310 struct coding_system *coding; |
| 88365 | 5311 { |
| 89665 | 5312 Lisp_Object eol_type; |
| 89665 | 5313 unsigned char *p, *pbeg, *pend; |
| 89665 | 5314 |
| 89665 | 5315 eol_type = CODING_ID_EOL_TYPE (coding->id); |
| 89665 | 5316 if (EQ (eol_type, Qunix)) |
| 89665 | 5317 return; |
| 89665 | 5318 |
| 89665 | 5319 if (NILP (coding->dst_object)) |
| 89665 | 5320 pbeg = coding->destination; |
| 89665 | 5321 else |
| 89665 | 5322 pbeg = BYTE_POS_ADDR (coding->dst_pos_byte); |
| 89665 | 5323 pend = pbeg + coding->produced; |
| 89665 | 5324 |
| 89665 | 5325 if (VECTORP (eol_type)) |
| 89665 | 5326 { |
| 88365 | 5327 int eol_seen = EOL_SEEN_NONE; |
| 88365 | 5328 |
| 89665 | 5329 for (p = pbeg; p < pend; p++) |
| 88365 | 5330 { |
| 88365 | 5331 if (*p == '\n') |
| 88365 | 5332 eol_seen |= EOL_SEEN_LF; |
| 88365 | 5333 else if (*p == '\r') |
| 28022 | 5334 { |
| 88365 | 5335 if (p + 1 < pend && *(p + 1) == '\n') |
| 88365 | 5336 { |
| 88365 | 5337 eol_seen |= EOL_SEEN_CRLF; |
| 88365 | 5338 p++; |
| 88365 | 5339 } |
| 88365 | 5340 else |
| 88365 | 5341 eol_seen |= EOL_SEEN_CR; |
| 28022 | 5342 } |
| 28022 | 5343 } |
| 89665 | 5344 if (eol_seen != EOL_SEEN_NONE |
| 89665 | 5345 && eol_seen != EOL_SEEN_LF |
| 89665 | 5346 && eol_seen != EOL_SEEN_CRLF |
| 89665 | 5347 && eol_seen != EOL_SEEN_CR) |
| 89665 | 5348 eol_seen = EOL_SEEN_LF; |
| 88365 | 5349 if (eol_seen != EOL_SEEN_NONE) |
| 89665 | 5350 eol_type = adjust_coding_eol_type (coding, eol_seen); |
| 89665 | 5351 } |
| 89665 | 5352 |
| 89665 | 5353 if (EQ (eol_type, Qmac)) |
| 89665 | 5354 { |
| 89665 | 5355 for (p = pbeg; p < pend; p++) |
| 88365 | 5356 if (*p == '\r') |
| 88365 | 5357 *p = '\n'; |
| 88365 | 5358 } |
| 89665 | 5359 else if (EQ (eol_type, Qdos)) |
| 89665 | 5360 { |
| 89665 | 5361 int n = 0; |
| 89665 | 5362 |
| 89665 | 5363 if (NILP (coding->dst_object)) |
| 89665 | 5364 { |
| 89665 | 5365 for (p = pend - 2; p >= pbeg; p--) |
| 89665 | 5366 if (*p == '\r') |
| 89665 | 5367 { |
| 89665 | 5368 safe_bcopy ((char *) (p + 1), (char *) p, pend-- - p - 1); |
| 89665 | 5369 n++; |
| 89665 | 5370 } |
| 89665 | 5371 } |
| 89665 | 5372 else |
| 89665 | 5373 { |
| 89665 | 5374 for (p = pend - 2; p >= pbeg; p--) |
| 89665 | 5375 if (*p == '\r') |
| 89665 | 5376 { |
| 89665 | 5377 int pos_byte = coding->dst_pos_byte + (p - pbeg); |
| 89665 | 5378 int pos = BYTE_TO_CHAR (pos_byte); |
| 89665 | 5379 |
| 89665 | 5380 del_range_2 (pos, pos_byte, pos + 1, pos_byte + 1, 0); |
| 89665 | 5381 n++; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
5382 } |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
5383 } |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
5384 coding->produced -= n; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
5385 coding->produced_char -= n; |
| 17052 | 5386 } |
| 5387 } | |
| 5388 | |
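The CRLF branch of decode_eol above removes carriage returns from already-decoded text and then shrinks the produced counts. A minimal standalone sketch of the same idea on a plain byte array follows. It is not the code from coding.c: the helper name `strip_crlf` is hypothetical, it scans forward in a single pass rather than backwards, and it drops a CR only when an LF follows, which is an assumption about the intended behaviour rather than what the listing does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of coding.c: collapse every CR LF
   pair in BUF[0..LEN) to a single LF, in place, and return the new
   length.  A lone CR is kept as is.  */
static size_t
strip_crlf (unsigned char *buf, size_t len)
{
  size_t r = 0, w = 0;

  assert (buf != NULL || len == 0);
  while (r < len)
    {
      if (buf[r] == '\r' && r + 1 < len && buf[r + 1] == '\n')
        r++;                    /* skip the CR, keep the LF */
      buf[w++] = buf[r++];      /* w never overtakes r */
    }
  return w;
}
```

The difference between the old and new length plays the role of `n` in the listing: it is the amount by which a caller would reduce its byte and character counts.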
| 88365 | 5389 static void |
| 5390 translate_chars (coding, table) | |
| 17052 | 5391 struct coding_system *coding; |
| 88365 | 5392 Lisp_Object table; |
| 17052 | 5393 { |
| 88365 | 5394 int *charbuf = coding->charbuf; |
| 5395 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 5396 int c; | |
| 5397 | |
| 5398 if (coding->chars_at_source) | |
| 5399 return; | |
| 5400 | |
| 5401 while (charbuf < charbuf_end) | |
| 5402 { | |
| 5403 c = *charbuf; | |
| 5404 if (c < 0) | |
| 5405 charbuf += -c; | |
| 5406 else | |
| 5407 *charbuf++ = translate_char (table, c); | |
| 5408 } | |
| 17052 | 5409 } |
| 5410 | |
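translate_chars walks CHARBUF, translating ordinary characters and stepping over annotation records, which start with a negative element. The sketch below shows that traversal in isolation. It assumes the record layout documented for produce_composition and produce_charset, where the first element is -LENGTH and LENGTH covers the whole record including that element. The names `map_charbuf` and `upcase_ascii` are hypothetical stand-ins for the real function and for translate_char.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: apply MAP to every character in BUF[0..USED),
   stepping over annotation records.  A record starts with a negative
   element -LENGTH, where LENGTH counts the whole record including
   that element.  Returns the number of characters mapped.  */
static size_t
map_charbuf (int *buf, size_t used, int (*map) (int))
{
  int *p = buf, *end = buf + used;
  size_t n = 0;

  assert (buf != NULL || used == 0);
  while (p < end)
    {
      if (*p < 0)
        p += -*p;               /* skip forward over the record */
      else
        {
          *p = map (*p);
          p++;
          n++;
        }
    }
  return n;
}

/* Stand-in for a translation table lookup.  */
static int
upcase_ascii (int c)
{
  return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}
```

The key point is the sign: because the stored length is negative, the pointer has to advance by its negation to move forward past the record.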
| 88365 | 5411 static int |
| 5412 produce_chars (coding) | |
| 5413 struct coding_system *coding; | |
| 17052 | 5414 { |
| 88365 | 5415 unsigned char *dst = coding->destination + coding->produced; |
| 5416 unsigned char *dst_end = coding->destination + coding->dst_bytes; | |
| 5417 int produced; | |
| 5418 int produced_chars = 0; | |
| 5419 | |
| 5420 if (! coding->chars_at_source) | |
| 5421 { | |
| 5422 /* Characters are in coding->charbuf. */ | |
| 89575 | 5423 int *buf = coding->charbuf; |
| 89575 | 5424 int *buf_end = buf + coding->charbuf_used; |
| 88365 | 5425 unsigned char *adjusted_dst_end; |
| 5426 | |
| 5427 if (BUFFERP (coding->src_object) | |
| 5428 && EQ (coding->src_object, coding->dst_object)) | |
| 89483 | 5429 dst_end = ((unsigned char *) coding->source) + coding->consumed; |
| 88365 | 5430 adjusted_dst_end = dst_end - MAX_MULTIBYTE_LENGTH; |
| 5431 | |
| 5432 while (buf < buf_end) | |
| 5433 { | |
| 5434 int c = *buf++; | |
| 89483 | 5435 |
| 88365 | 5436 if (dst >= adjusted_dst_end) |
| 5437 { | |
| 5438 dst = alloc_destination (coding, | |
| 5439 buf_end - buf + MAX_MULTIBYTE_LENGTH, | |
| 5440 dst); | |
| 5441 dst_end = coding->destination + coding->dst_bytes; | |
| 5442 adjusted_dst_end = dst_end - MAX_MULTIBYTE_LENGTH; | |
| 5443 } | |
| 5444 if (c >= 0) | |
| 5445 { | |
| 5446 if (coding->dst_multibyte | |
| 5447 || ! CHAR_BYTE8_P (c)) | |
| 5448 CHAR_STRING_ADVANCE (c, dst); | |
| 5449 else | |
| 5450 *dst++ = CHAR_TO_BYTE8 (c); | |
| 5451 produced_chars++; | |
| 5452 } | |
| 5453 else | |
| 89462 | 5454 /* This is an annotation datum. (-C) is the length of |
| 89462 | 5455 it. */ |
| 89462 | 5456 buf += -c - 1; |
| 88365 | 5457 } |
| 30833 | 5458 } |
| 30833 | 5459 else |
| 30833 | 5460 { |
| 89483 | 5461 const unsigned char *src = coding->source; |
| 5462 const unsigned char *src_end = src + coding->src_bytes; | |
| 88365 | 5463 Lisp_Object eol_type; |
| 5464 | |
| 5465 eol_type = CODING_ID_EOL_TYPE (coding->id); | |
| 5466 | |
| 5467 if (coding->src_multibyte != coding->dst_multibyte) | |
| 34892 | 5468 { |
| 88365 | 5469 if (coding->src_multibyte) |
| 34892 | 5470 { |
| 88443 | 5471 int multibytep = 1; |
| 88365 | 5472 int consumed_chars; |
| 88365 | 5473 |
| 88365 | 5474 while (1) |
| 34892 | 5475 { |
| 89483 | 5476 const unsigned char *src_base = src; |
| 88365 | 5477 int c; |
| 5478 | |
| 5479 ONE_MORE_BYTE (c); | |
| 5480 if (c == '\r') | |
| 5481 { | |
| 5482 if (EQ (eol_type, Qdos)) | |
| 5483 { | |
| 89279 | 5484 if (src == src_end) |
| 89279 | 5485 { |
| 89279 | 5486 coding->result = CODING_RESULT_INSUFFICIENT_SRC; |
| 89279 | 5487 goto no_more_source; |
| 89279 | 5488 } |
| 89279 | 5489 if (*src == '\n') |
| 88365 | 5490 c = *src++; |
| 5491 } | |
| 5492 else if (EQ (eol_type, Qmac)) | |
| 5493 c = '\n'; | |
| 5494 } | |
| 5495 if (dst == dst_end) | |
| 5496 { | |
| 88456 | 5497 coding->consumed = src - coding->source; |
| 88456 | 5498 |
| 88456 | 5499 if (EQ (coding->src_object, coding->dst_object)) |
| 89483 | 5500 dst_end = (unsigned char *) src; |
| 88456 | 5501 if (dst == dst_end) |
| 88456 | 5502 { |
| 88456 | 5503 dst = alloc_destination (coding, src_end - src + 1, |
| 88456 | 5504 dst); |
| 88456 | 5505 dst_end = coding->destination + coding->dst_bytes; |
| 88456 | 5506 coding_set_source (coding); |
| 88456 | 5507 src = coding->source + coding->consumed; |
| 88456 | 5508 src_end = coding->source + coding->src_bytes; |
| 88456 | 5509 } |
| 88365 | 5510 } |
| 5511 *dst++ = c; | |
| 5512 produced_chars++; | |
| 5513 } | |
| 5514 no_more_source: | |
| 5515 ; | |
| 5516 } | |
| 5517 else | |
| 5518 while (src < src_end) | |
| 5519 { | |
| 88443 | 5520 int multibytep = 1; |
| 88365 | 5521 int c = *src++; |
| 5522 | |
| 5523 if (c == '\r') | |
| 5524 { | |
| 5525 if (EQ (eol_type, Qdos)) | |
| 5526 { | |
| 5527 if (src < src_end | |
| 5528 && *src == '\n') | |
| 5529 c = *src++; | |
| 5530 } | |
| 5531 else if (EQ (eol_type, Qmac)) | |
| 5532 c = '\n'; | |
| 5533 } | |
| 5534 if (dst >= dst_end - 1) | |
| 5535 { | |
| 88456 | 5536 coding->consumed = src - coding->source; |
| 88456 | 5537 |
| 88456 | 5538 if (EQ (coding->src_object, coding->dst_object)) |
| 89483 | 5539 dst_end = (unsigned char *) src; |
| 88456 | 5540 if (dst >= dst_end - 1) |
| 88456 | 5541 { |
| 88456 | 5542 dst = alloc_destination (coding, src_end - src + 2, |
| 88456 | 5543 dst); |
| 88456 | 5544 dst_end = coding->destination + coding->dst_bytes; |
| 88456 | 5545 coding_set_source (coding); |
| 88456 | 5546 src = coding->source + coding->consumed; |
| 88456 | 5547 src_end = coding->source + coding->src_bytes; |
| 88456 | 5548 } |
| 88365 | 5549 } |
| 5550 EMIT_ONE_BYTE (c); | |
| 5551 } | |
| 5552 } | |
| 5553 else | |
| 5554 { | |
| 5555 if (!EQ (coding->src_object, coding->dst_object)) | |
| 5556 { | |
| 5557 int require = coding->src_bytes - coding->dst_bytes; | |
| 5558 | |
| 5559 if (require > 0) | |
| 5560 { | |
| 5561 EMACS_INT offset = src - coding->source; | |
| 5562 | |
| 5563 dst = alloc_destination (coding, require, dst); | |
| 5564 coding_set_source (coding); | |
| 5565 src = coding->source + offset; | |
| 5566 src_end = coding->source + coding->src_bytes; | |
| 34892 | 5567 } |
| 34892 | 5568 } |
| 88365 | 5569 produced_chars = coding->src_chars; |
| 88365 | 5570 while (src < src_end) |
| 34892 | 5571 { |
| 88365 | 5572 int c = *src++; |
| 5573 | |
| 5574 if (c == '\r') | |
| 5575 { | |
| 5576 if (EQ (eol_type, Qdos)) | |
| 5577 { | |
| 5578 if (src < src_end | |
| 5579 && *src == '\n') | |
| 5580 c = *src++; | |
| 5581 produced_chars--; | |
| 5582 } | |
| 5583 else if (EQ (eol_type, Qmac)) | |
| 5584 c = '\n'; | |
| 5585 } | |
| 5586 *dst++ = c; | |
| 34892 | 5587 } |
| 34892 | 5588 } |
| 88456 | 5589 coding->consumed = coding->src_bytes; |
| 88456 | 5590 coding->consumed_char = coding->src_chars; |
| 88365 | 5591 } |
| 5592 | |
| 5593 produced = dst - (coding->destination + coding->produced); | |
| 5594 if (BUFFERP (coding->dst_object)) | |
| 5595 insert_from_gap (produced_chars, produced); | |
| 5596 coding->produced += produced; | |
| 5597 coding->produced_char += produced_chars; | |
| 5598 return produced_chars; | |
| 5599 } | |
| 5600 | |
| 89331 | 5601 /* Compose text in CODING->object according to the annotation data at |
| 89331 | 5602 CHARBUF. CHARBUF is an array: |
| 89331 | 5603 [ -LENGTH ANNOTATION_MASK FROM TO METHOD COMP_LEN [ COMPONENTS... ] ] |
| 88365 | 5604 */ |
| 5605 | |
| 5606 static INLINE void | |
| 5607 produce_composition (coding, charbuf) | |
| 5608 struct coding_system *coding; | |
| 5609 int *charbuf; | |
| 5610 { | |
| 5611 int len; | |
| 89331 | 5612 EMACS_INT from, to; |
| 88365 | 5613 enum composition_method method; |
| 5614 Lisp_Object components; | |
| 5615 | |
| 5616 len = -charbuf[0]; | |
| 89331 | 5617 from = coding->dst_pos + charbuf[2]; |
| 89331 | 5618 to = coding->dst_pos + charbuf[3]; |
| 89331 | 5619 method = (enum composition_method) (charbuf[4]); |
| 88365 | 5620 |
| 5621 if (method == COMPOSITION_RELATIVE) | |
| 5622 components = Qnil; | |
| 5623 else | |
| 5624 { | |
| 5625 Lisp_Object args[MAX_COMPOSITION_COMPONENTS * 2 - 1]; | |
| 5626 int i; | |
| 5627 | |
| 5628 len -= 5; | |
| 5629 charbuf += 5; | |
| 5630 for (i = 0; i < len; i++) | |
| 5631 args[i] = make_number (charbuf[i]); | |
| 5632 components = (method == COMPOSITION_WITH_ALTCHARS | |
| 5633 ? Fstring (len, args) : Fvector (len, args)); | |
| 5634 } | |
| 89331 | 5635 compose_text (from, to, components, Qnil, coding->dst_object); |
| 20718 | 5636 } |
| 20718 | 5637 |
| 89331 | 5638 |
| 89331 | 5639 /* Put `charset' property on text in CODING->object according to |
| 89331 | 5640 the annotation data at CHARBUF. CHARBUF is an array: |
| 89331 | 5641 [ -LENGTH ANNOTATION_MASK FROM TO CHARSET-ID ] |
| 89331 | 5642 */ |
| 89331 | 5643 |
| 89331 | 5644 static INLINE void |
| 89331 | 5645 produce_charset (coding, charbuf) |
| 89331 | 5646 struct coding_system *coding; |
| 89331 | 5647 int *charbuf; |
| 88365 | 5648 { |
| 89331 | 5649 EMACS_INT from = coding->dst_pos + charbuf[2]; |
| 89331 | 5650 EMACS_INT to = coding->dst_pos + charbuf[3]; |
| 89331 | 5651 struct charset *charset = CHARSET_FROM_ID (charbuf[4]); |
| 89331 | 5652 |
| 89331 | 5653 Fput_text_property (make_number (from), make_number (to), |
| 89331 | 5654 Qcharset, CHARSET_NAME (charset), |
| 89331 | 5655 coding->dst_object); |
| 88365 | 5656 } |
| 5657 | |
| 89331 | 5658 |
| 88365 | 5659 #define CHARBUF_SIZE 0x4000 |
| 5660 | |
| 5661 #define ALLOC_CONVERSION_WORK_AREA(coding) \ | |
| 5662 do { \ | |
| 5663 int size = CHARBUF_SIZE; \ | |
| 5664 \ | |
| 5665 coding->charbuf = NULL; \ | |
| 5666 while (size > 1024) \ | |
| 5667 { \ | |
| 5668 coding->charbuf = (int *) alloca (sizeof (int) * size); \ | |
| 5669 if (coding->charbuf) \ | |
| 5670 break; \ | |
| 5671 size >>= 1; \ | |
| 5672 } \ | |
| 5673 if (! coding->charbuf) \ | |
| 5674 { \ | |
| 5675 coding->result = CODING_RESULT_INSUFFICIENT_MEM; \ | |
| 5676 return coding->result; \ | |
| 5677 } \ | |
| 5678 coding->charbuf_size = size; \ | |
| 5679 } while (0) | |
| 5680 | |
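ALLOC_CONVERSION_WORK_AREA tries the largest work area first and halves the request until an allocation succeeds or the size drops to 1024. The sketch below shows the same halving strategy with `malloc`, which reports failure by returning NULL, so the retry loop has something to test. On common platforms `alloca` gives no such failure indication, so treat the NULL check in the macro with care. The function name `alloc_work_area` and the `WORK_MAX` / `WORK_MIN` constants are hypothetical and only mirror the values in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define WORK_MAX 0x4000         /* mirrors CHARBUF_SIZE in the listing */
#define WORK_MIN 1024           /* smallest size worth retrying */

/* Hypothetical sketch: allocate an int work area, halving the request
   on failure.  On success, store the element count in *SIZE_OUT and
   return the buffer, which the caller must free.  On failure, store 0
   and return NULL.  */
static int *
alloc_work_area (size_t *size_out)
{
  size_t size = WORK_MAX;

  assert (size_out != NULL);
  while (size > WORK_MIN)
    {
      int *buf = malloc (sizeof (int) * size);

      if (buf)
        {
          *size_out = size;
          return buf;
        }
      size >>= 1;
    }
  *size_out = 0;
  return NULL;
}
```

A heap allocation changes the lifetime rules: unlike stack space obtained inside the macro, this buffer has to be released explicitly on every exit path of the caller.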
| 29725 | 5681 |
| 29725 | 5682 static void |
| 88365 | 5683 produce_annotation (coding) |
| 29725 | 5684 struct coding_system *coding; |
| 29725 | 5685 { |
| 88365 | 5686 int *charbuf = coding->charbuf; |
| 5687 int *charbuf_end = charbuf + coding->charbuf_used; | |
| 5688 | |
| 89331 | 5689 if (NILP (coding->dst_object)) |
| 89331 | 5690 return; |
| 89331 | 5691 |
| 88365 | 5692 while (charbuf < charbuf_end) |
| 5693 { | |
| 5694 if (*charbuf >= 0) | |
| 5695 charbuf++; | |
| 5696 else | |
| 29877 | 5697 { |
| 88365 | 5698 int len = -*charbuf; |
| 89331 | 5699 switch (charbuf[1]) |
| 29725 | 5700 { |
| 88365 | 5701 case CODING_ANNOTATE_COMPOSITION_MASK: |
| 5702 produce_composition (coding, charbuf); | |
| 5703 break; | |
| 89331 | 5704 case CODING_ANNOTATE_CHARSET_MASK: |
| 89331 | 5705 produce_charset (coding, charbuf); |
| 89331 | 5706 break; |
| 88365 | 5707 default: |
| 5708 abort (); | |
| 29725 | 5709 } |
| 88365 | 5710 charbuf += len; |
| 29725 | 5711 } |
| 29725 | 5712 } |
| 29725 | 5713 } |
| 29725 | 5714 |
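produce_annotation scans CHARBUF, skips ordinary characters, and dispatches each annotation record on its second element. A self-contained sketch of that dispatch loop is given below. The tag values `TAG_COMPOSITION` and `TAG_CHARSET` are assumptions made for illustration, since the real CODING_ANNOTATE_*_MASK constants are defined elsewhere, and `count_annotations` is a hypothetical name. Unlike the listing, which calls abort on an unknown kind, this sketch skips unknown records and stops on a malformed length.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed tag values for illustration only.  */
enum { TAG_COMPOSITION = 1, TAG_CHARSET = 2 };

struct annotation_counts
{
  int compositions;
  int charsets;
};

/* Hypothetical sketch: count annotation records in BUF[0..USED).
   Each record is [ -LENGTH TAG ... ] and LENGTH covers the whole
   record.  */
static struct annotation_counts
count_annotations (const int *buf, size_t used)
{
  struct annotation_counts counts = { 0, 0 };
  const int *p = buf, *end = buf + used;

  assert (buf != NULL || used == 0);
  while (p < end)
    {
      if (*p >= 0)
        {
          p++;                  /* ordinary character */
          continue;
        }
      size_t len = (size_t) -(long long) *p;
      if (len < 2 || len > (size_t) (end - p))
        break;                  /* malformed record: stop scanning */
      switch (p[1])
        {
        case TAG_COMPOSITION:
          counts.compositions++;
          break;
        case TAG_CHARSET:
          counts.charsets++;
          break;
        default:
          break;                /* unknown kind: skip it */
        }
      p += len;
    }
  return counts;
}
```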
| 88365 | 5715 /* Decode the data at CODING->src_object into CODING->dst_object. |
| 5716 CODING->src_object is a buffer, a string, or nil. | |
| 5717 CODING->dst_object is a buffer. | |
| 5718 | |
| 5719 If CODING->src_object is a buffer, it must be the current buffer. | |
| 5720 In this case, if CODING->src_pos is positive, it is a position of | |
| 5721 the source text in the buffer, otherwise, the source text is in the | |
| 5722 gap area of the buffer, and CODING->src_pos specifies the offset of | |
| 5723 the text from GPT (which must be the same as PT). If this is the | |
| 5724 same buffer as CODING->dst_object, CODING->src_pos must be | |
| 5725 negative. | |
| 5726 | |
| 5727 If CODING->src_object is a string, CODING->src_pos is an index to | |
| 5728 that string. | |
| 5729 | |
| 5730 If CODING->src_object is nil, CODING->source must already point to | |
| 5731 the non-relocatable memory area. In this case, CODING->src_pos is | |
| 5732 an offset from CODING->source. | |
| 5733 | |
| 5734 The decoded data is inserted at the current point of the buffer | |
| 5735 CODING->dst_object. | |
| 5736 */ | |
| 5737 | |
| 5738 static int | |
| 5739 decode_coding (coding) | |
| 20718 | 5740 struct coding_system *coding; |
| 20718 | 5741 { |
| 88365 | 5742 Lisp_Object attrs; |
| 89665 | 5743 Lisp_Object undo_list; |
| 88365 | 5744 |
| 5745 if (BUFFERP (coding->src_object) | |
| 5746 && coding->src_pos > 0 | |
| 5747 && coding->src_pos < GPT | |
| 5748 && coding->src_pos + coding->src_chars > GPT) | |
| 5749 move_gap_both (coding->src_pos, coding->src_pos_byte); | |
| 5750 | |
| 89665 | 5751 undo_list = Qt; |
| 88365 | 5752 if (BUFFERP (coding->dst_object)) |
| 5753 { | |
| 5754 if (current_buffer != XBUFFER (coding->dst_object)) | |
| 5755 set_buffer_internal (XBUFFER (coding->dst_object)); | |
| 5756 if (GPT != PT) | |
| 5757 move_gap_both (PT, PT_BYTE); | |
| 89665 | 5758 undo_list = current_buffer->undo_list; |
| 89665 | 5759 current_buffer->undo_list = Qt; |
| 88365 | 5760 } |
| 5761 | |
| 5762 coding->consumed = coding->consumed_char = 0; | |
| 29005 | 5763 coding->produced = coding->produced_char = 0; |
| 88365 | 5764 coding->chars_at_source = 0; |
| 88365 | 5765 coding->result = CODING_RESULT_SUCCESS; |
| 29005 | 5766 coding->errors = 0; |
| 88365 | 5767 |
| 5768 ALLOC_CONVERSION_WORK_AREA (coding); | |
| 5769 | |
| 5770 attrs = CODING_ID_ATTRS (coding->id); | |
| 5771 | |
| 5772 do | |
| 5773 { | |
| 5774 coding_set_source (coding); | |
| 5775 coding->annotated = 0; | |
| 5776 (*(coding->decoder)) (coding); | |
| 5777 if (!NILP (CODING_ATTR_DECODE_TBL (attrs))) | |
| 89207 | 5778 translate_chars (coding, CODING_ATTR_DECODE_TBL (attrs)); |
| 89207 | 5779 else if (!NILP (Vstandard_translation_table_for_decode)) |
| 89207 | 5780 translate_chars (coding, Vstandard_translation_table_for_decode); |
| 88365 | 5781 coding_set_destination (coding); |
| 5782 produce_chars (coding); | |
| 5783 if (coding->annotated) | |
| 5784 produce_annotation (coding); | |
| 5785 } | |
| 5786 while (coding->consumed < coding->src_bytes | |
| 5787 && ! coding->result); | |
| 5788 | |
| 5789 coding->carryover_bytes = 0; | |
| 5790 if (coding->consumed < coding->src_bytes) | |
| 5791 { | |
| 5792 int nbytes = coding->src_bytes - coding->consumed; | |
| 89483 | 5793 const unsigned char *src; |
| 88365 | 5794 |
| 5795 coding_set_source (coding); | |
| 5796 coding_set_destination (coding); | |
| 5797 src = coding->source + coding->consumed; | |
| 5798 | |
| 5799 if (coding->mode & CODING_MODE_LAST_BLOCK) | |
| 29725 | 5800 { |
| 88365 | 5801 /* Flush out unprocessed data as binary chars. We are sure |
| 5802 that the number of data is less than the size of | |
| 5803 coding->charbuf. */ | |
| 5804 while (nbytes-- > 0) | |
| 5805 { | |
| 5806 int c = *src++; | |
| 89279 | 5807 |
| 89279 | 5808 coding->charbuf[coding->charbuf_used++] = (c & 0x80 ? - c : c); |
| 88365 | 5809 } |
| 5810 produce_chars (coding); | |
| 29725 | 5811 } |
| 88365 | 5812 else |
| 20718 | 5813 { |
| 88365 | 5814 /* Record unprocessed bytes in coding->carryover. We are |
| 5815 sure that the number of data is less than the size of | |
| 5816 coding->carryover. */ | |
| 5817 unsigned char *p = coding->carryover; | |
| 5818 | |
| 5819 coding->carryover_bytes = nbytes; | |
| 5820 while (nbytes-- > 0) | |
| 5821 *p++ = *src++; | |
| 20718 | 5822 } |
| 88365 | 5823 coding->consumed = coding->src_bytes; |
| 5824 } | |
| 5825 | |
| 89665 | 5826 if (BUFFERP (coding->dst_object)) |
| 89665 | 5827 { |
| 89665 | 5828 current_buffer->undo_list = undo_list; |
| 89665 | 5829 record_insert (coding->dst_pos, coding->produced_char); |
| 89665 | 5830 } |
| 89665 | 5831 if (! EQ (CODING_ID_EOL_TYPE (coding->id), Qunix)) |
| 89665 | 5832 decode_eol (coding); |
| 29005 | 5833 return coding->result; |
| 20718 | 5834 } |
| 20718 | 5835 |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5836 |
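The carryover step near the end of the function above (unprocessed source bytes are copied into coding->carryover for the next call) can be sketched in isolation. This is only an illustration: `struct carry_sketch`, `CARRYOVER_MAX`, and `save_carryover` are hypothetical names, and the explicit size check is an assumption added here; the original relies on the caller guaranteeing that the data fits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CARRYOVER_MAX 64	/* hypothetical capacity, not the real one */

/* Hypothetical stand-in for the carryover fields of struct coding_system.  */
struct carry_sketch
{
  unsigned char carryover[CARRYOVER_MAX];
  size_t carryover_bytes;
};

/* Save the NBYTES unprocessed bytes at SRC so that a later call can
   resume with them.  Return 0 on success, or -1 (saving nothing) if
   they do not fit.  */
static int
save_carryover (struct carry_sketch *cs, const unsigned char *src,
		size_t nbytes)
{
  assert (cs != NULL && src != NULL);
  if (nbytes > CARRYOVER_MAX)
    return -1;
  memcpy (cs->carryover, src, nbytes);
  cs->carryover_bytes = nbytes;
  return 0;
}
```

A typical case is a multibyte sequence cut off at the end of a chunk: its leading bytes are saved and decoded once the rest arrives.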
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
5837 /* Extract an annotation datum from a composition starting at POS and |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5838 ending before LIMIT of CODING->src_object (buffer or string), store |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5839 the data in BUF, set *STOP to the starting position of the next |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5840 composition (if any) or to LIMIT, and return the address of the |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5841 next element of BUF. |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5842 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5843 If such an annotation is not found, set *STOP to the starting |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5844 position of a composition after POS (if any) or to LIMIT, and |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5845 return BUF. */ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5846 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5847 static INLINE int * |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5848 handle_composition_annotation (pos, limit, coding, buf, stop) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5849 EMACS_INT pos, limit; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5850 struct coding_system *coding; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5851 int *buf; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5852 EMACS_INT *stop; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5853 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5854 EMACS_INT start, end; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5855 Lisp_Object prop; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5856 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5857 if (! find_composition (pos, limit, &start, &end, &prop, coding->src_object) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5858 || end > limit) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5859 *stop = limit; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5860 else if (start > pos) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5861 *stop = start; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5862 else |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5863 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5864 if (start == pos) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5865 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5866 /* We found a composition. Store the corresponding |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5867 annotation data in BUF. */ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5868 int *head = buf; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5869 enum composition_method method = COMPOSITION_METHOD (prop); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5870 int nchars = COMPOSITION_LENGTH (prop); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5871 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5872 ADD_COMPOSITION_DATA (buf, 0, nchars, method); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5873 if (method != COMPOSITION_RELATIVE) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5874 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5875 Lisp_Object components; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5876 int len, i, i_byte; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5877 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5878 components = COMPOSITION_COMPONENTS (prop); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5879 if (VECTORP (components)) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5880 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5881 len = XVECTOR (components)->size; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5882 for (i = 0; i < len; i++) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5883 *buf++ = XINT (AREF (components, i)); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5884 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5885 else if (STRINGP (components)) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5886 { |
| 89483 | 5887 len = SCHARS (components); |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5888 i = i_byte = 0; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5889 while (i < len) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5890 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5891 FETCH_STRING_CHAR_ADVANCE (*buf, components, i, i_byte); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5892 buf++; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5893 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5894 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5895 else if (INTEGERP (components)) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5896 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5897 len = 1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5898 *buf++ = XINT (components); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5899 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5900 else if (CONSP (components)) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5901 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5902 for (len = 0; CONSP (components); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5903 len++, components = XCDR (components)) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5904 *buf++ = XINT (XCAR (components)); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5905 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5906 else |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5907 abort (); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5908 *head -= len; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5909 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5910 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5911 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5912 if (find_composition (end, limit, &start, &end, &prop, |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5913 coding->src_object) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5914 && end <= limit) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5915 *stop = start; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5916 else |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5917 *stop = limit; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5918 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5919 return buf; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5920 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5921 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5922 |
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
5923 /* Extract an annotation datum from a text property `charset' at POS of |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5924 CODING->src_object (buffer or string), store the data in BUF, set |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5925 *STOP to the position where the value of `charset' property changes |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5926 (limited by LIMIT), and return the address of the next element of |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5927 BUF. |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5928 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5929 If the property value is nil, set *STOP to the position where the |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5930 property value is non-nil (limited by LIMIT), and return BUF. */ |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5931 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5932 static INLINE int * |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5933 handle_charset_annotation (pos, limit, coding, buf, stop) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5934 EMACS_INT pos, limit; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5935 struct coding_system *coding; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5936 int *buf; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5937 EMACS_INT *stop; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5938 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5939 Lisp_Object val, next; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5940 int id; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5941 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5942 val = Fget_text_property (make_number (pos), Qcharset, coding->src_object); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5943 if (! NILP (val) && CHARSETP (val)) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5944 id = XINT (CHARSET_SYMBOL_ID (val)); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5945 else |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5946 id = -1; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5947 ADD_CHARSET_DATA (buf, 0, 0, id); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5948 next = Fnext_single_property_change (make_number (pos), Qcharset, |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5949 coding->src_object, |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5950 make_number (limit)); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5951 *stop = XINT (next); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5952 return buf; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5953 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5954 |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5955 |
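Both annotation handlers above report, through *STOP, the next position at which their kind of annotation may change, and the caller pauses at the smaller of the two. The sketch below illustrates that "two stop positions, take the minimum" scan on plain arrays. All names (`next_change`, `count_stops`) are hypothetical, and sorted arrays of change positions stand in for the text-property lookups, which is an assumption made only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Return the first element of SORTED[0..N) that is greater than POS,
   clipped to LIMIT; return LIMIT if there is none.  */
static long
next_change (const long *sorted, size_t n, long pos, long limit)
{
  for (size_t i = 0; i < n; i++)
    if (sorted[i] > pos)
      return sorted[i] < limit ? sorted[i] : limit;
  return limit;
}

/* Scan positions START..END and count how often the scan pauses to
   look for an annotation.  COMP and CSET hold the sorted positions at
   which the two (hypothetical) properties change.  */
static int
count_stops (const long *comp, size_t ncomp,
	     const long *cset, size_t ncset, long start, long end)
{
  long pos = start, stop = start;
  long stop_comp = start, stop_cset = start;
  int stops = 0;

  assert (start <= end);
  for (;;)
    {
      if (pos == stop)
	{
	  if (pos == end)
	    break;
	  stops++;
	  /* Refresh only the stop position(s) that were reached.  */
	  if (pos == stop_comp)
	    stop_comp = next_change (comp, ncomp, pos, end);
	  if (pos == stop_cset)
	    stop_cset = next_change (cset, ncset, pos, end);
	  stop = stop_comp < stop_cset ? stop_comp : stop_cset;
	}
      pos++;
    }
  return stops;
}
```

Between two stops the scan does no property work at all, which is the point of caching the stop positions.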
| 88365 | 5956 static void |
| 5957 consume_chars (coding) | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
5958 struct coding_system *coding; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
5959 { |
| 88365 | 5960 int *buf = coding->charbuf; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5961 int *buf_end = coding->charbuf + coding->charbuf_size; |
|
88876
af9012fdad56
(LEADING_CODE_PRIVATE_11, LEADING_CODE_PRIVATE_12,
Kenichi Handa <handa@m17n.org>
parents:
88862
diff
changeset
|
5962 const unsigned char *src = coding->source + coding->consumed; |
|
89442
7349f4473e7f
(detected_mask): Delete unused variable.
Kenichi Handa <handa@m17n.org>
parents:
89429
diff
changeset
|
5963 const unsigned char *src_end = coding->source + coding->src_bytes; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5964 EMACS_INT pos = coding->src_pos + coding->consumed_char; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5965 EMACS_INT end_pos = coding->src_pos + coding->src_chars; |
| 88365 | 5966 int multibytep = coding->src_multibyte; |
| 5967 Lisp_Object eol_type; | |
| 5968 int c; | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5969 EMACS_INT stop, stop_composition, stop_charset; |
| 88365 | 5970 |
| 5971 eol_type = CODING_ID_EOL_TYPE (coding->id); | |
| 5972 if (VECTORP (eol_type)) | |
| 5973 eol_type = Qunix; | |
| 5974 | |
| 5975 /* Note: composition handling is not yet implemented. */ | |
| 5976 coding->common_flags &= ~CODING_ANNOTATE_COMPOSITION_MASK; | |
| 5977 | |
|
89562
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5978 if (NILP (coding->src_object)) |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5979 stop = stop_composition = stop_charset = end_pos; |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
5980 else |
|
89562
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5981 { |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5982 if (coding->common_flags & CODING_ANNOTATE_COMPOSITION_MASK) |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5983 stop = stop_composition = pos; |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5984 else |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5985 stop = stop_composition = end_pos; |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5986 if (coding->common_flags & CODING_ANNOTATE_CHARSET_MASK) |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5987 stop = stop_charset = pos; |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5988 else |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5989 stop_charset = end_pos; |
|
12fbcfebb9ad
(consume_chars): If coding->src_object is nil, don't check annotation.
Kenichi Handa <handa@m17n.org>
parents:
89545
diff
changeset
|
5990 } |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5991 |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
5992 /* Compensate for CRLF and conversion. */ |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
5993 buf_end -= 1 + MAX_ANNOTATION_LENGTH; |
| 88365 | 5994 while (buf < buf_end) |
| 5995 { | |
| 5996 if (pos == stop) | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
5997 { |
| 88365 | 5998 if (pos == end_pos) |
| 5999 break; | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6000 if (pos == stop_composition) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6001 buf = handle_composition_annotation (pos, end_pos, coding, |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6002 buf, &stop_composition); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6003 if (pos == stop_charset) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6004 buf = handle_charset_annotation (pos, end_pos, coding, |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6005 buf, &stop_charset); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6006 stop = (stop_composition < stop_charset |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6007 ? stop_composition : stop_charset); |
| 88365 | 6008 } |
| 6009 | |
| 6010 if (! multibytep) | |
|
89442
7349f4473e7f
(detected_mask): Delete unused variable.
Kenichi Handa <handa@m17n.org>
parents:
89429
diff
changeset
|
6011 { |
|
89462
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
6012 EMACS_INT bytes; |
|
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
6013 |
|
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
6014 if (! CODING_FOR_UNIBYTE (coding) |
|
4e359ebf3984
(decode_coding_iso_2022): Fix handling of invalid
Kenichi Handa <handa@m17n.org>
parents:
89448
diff
changeset
|
6015 && (bytes = MULTIBYTE_LENGTH (src, src_end)) > 0) |
|
89442
7349f4473e7f
(detected_mask): Delete unused variable.
Kenichi Handa <handa@m17n.org>
parents:
89429
diff
changeset
|
6016 c = STRING_CHAR_ADVANCE (src), pos += bytes; |
|
7349f4473e7f
(detected_mask): Delete unused variable.
Kenichi Handa <handa@m17n.org>
parents:
89429
diff
changeset
|
6017 else |
|
7349f4473e7f
(detected_mask): Delete unused variable.
Kenichi Handa <handa@m17n.org>
parents:
89429
diff
changeset
|
6018 c = *src++, pos++; |
|
7349f4473e7f
(detected_mask): Delete unused variable.
Kenichi Handa <handa@m17n.org>
parents:
89429
diff
changeset
|
6019 } |
| 88365 | 6020 else |
|
89442
7349f4473e7f
(detected_mask): Delete unused variable.
Kenichi Handa <handa@m17n.org>
parents:
89429
diff
changeset
|
6021 c = STRING_CHAR_ADVANCE (src), pos++; |
| 88365 | 6022 if ((c == '\r') && (coding->mode & CODING_MODE_SELECTIVE_DISPLAY)) |
| 6023 c = '\n'; | |
| 6024 if (! EQ (eol_type, Qunix)) | |
| 6025 { | |
| 6026 if (c == '\n') | |
| 6027 { | |
| 6028 if (EQ (eol_type, Qdos)) | |
| 6029 *buf++ = '\r'; | |
| 6030 else | |
| 6031 c = '\r'; | |
| 6032 } | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6033 } |
| 88365 | 6034 *buf++ = c; |
| 6035 } | |
| 6036 | |
| 6037 coding->consumed = src - coding->source; | |
| 6038 coding->consumed_char = pos - coding->src_pos; | |
| 6039 coding->charbuf_used = buf - coding->charbuf; | |
| 6040 coding->chars_at_source = 0; | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6041 } |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6042 |
| 88365 | 6043 |
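The end-of-line handling at the bottom of the loop in consume_chars can be read on its own: for a DOS-style eol type each '\n' is preceded by an extra '\r', and for any other non-Unix type the '\n' is replaced by '\r'. A minimal sketch of just that step follows. The enum and the function name are hypothetical stand-ins for Qunix/Qdos and the surrounding loop, and the output buffer sizing is an assumption stated in the comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Lisp symbols Qunix, Qdos, and the
   remaining (CR-only) eol type.  */
enum eol_kind { EOL_UNIX, EOL_DOS, EOL_MAC };

/* Copy SRC[0..N) to DST, rewriting each '\n' according to EOL.
   DST must have room for 2 * N ints (worst case: every character is
   '\n' and EOL is EOL_DOS).  Return the number of ints written.  */
static size_t
encode_eol_sketch (const int *src, size_t n, enum eol_kind eol, int *dst)
{
  size_t used = 0;

  assert (src != NULL && dst != NULL);
  for (size_t i = 0; i < n; i++)
    {
      int c = src[i];

      if (c == '\n' && eol != EOL_UNIX)
	{
	  if (eol == EOL_DOS)
	    dst[used++] = '\r';	/* emit CR; the LF follows below */
	  else
	    c = '\r';		/* LF becomes a lone CR */
	}
      dst[used++] = c;
    }
  return used;
}
```

The possible extra '\r' per character is why the original reserves one spare slot (plus room for an annotation) before entering its loop.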
| 6044 /* Encode the text at CODING->src_object into CODING->dst_object. | |
| 6045 CODING->src_object is a buffer or a string. | |
| 6046 CODING->dst_object is a buffer or nil. | |
| 6047 | |
| 6048 If CODING->src_object is a buffer, it must be the current buffer. | |
| 6049 In this case, if CODING->src_pos is positive, it is a position of | |
| 6050 the source text in the buffer; otherwise, the source text is in the | |
| 6051 gap area of the buffer, and coding->src_pos specifies the offset of | |
| 6052 the text from GPT (which must be the same as PT). If this is the | |
| 6053 same buffer as CODING->dst_object, CODING->src_pos must be | |
| 6054 negative and CODING should not have `pre-write-conversion'. | |
| 6055 | |
| 6056 If CODING->src_object is a string, CODING should not have | |
| 6057 `pre-write-conversion'. | |
| 6058 | |
| 6059 If CODING->dst_object is a buffer, the encoded data is inserted at | |
| 6060 the current point of that buffer. | |
| 6061 | |
| 6062 If CODING->dst_object is nil, the encoded data is placed at the | |
| 6063 memory area specified by CODING->destination. */ | |
| 6064 | |
| 6065 static int | |
| 6066 encode_coding (coding) | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6067 struct coding_system *coding; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6068 { |
| 88365 | 6069 Lisp_Object attrs; |
| 6070 | |
| 6071 attrs = CODING_ID_ATTRS (coding->id); | |
| 6072 | |
| 6073 if (BUFFERP (coding->dst_object)) | |
| 6074 { | |
| 6075 set_buffer_internal (XBUFFER (coding->dst_object)); | |
| 6076 coding->dst_multibyte | |
| 6077 = ! NILP (current_buffer->enable_multibyte_characters); | |
| 6078 } | |
| 6079 | |
| 6080 coding->consumed = coding->consumed_char = 0; | |
| 6081 coding->produced = coding->produced_char = 0; | |
| 6082 coding->result = CODING_RESULT_SUCCESS; | |
| 6083 coding->errors = 0; | |
| 6084 | |
| 6085 ALLOC_CONVERSION_WORK_AREA (coding); | |
| 6086 | |
| 6087 do { | |
| 6088 coding_set_source (coding); | |
| 6089 consume_chars (coding); | |
| 6090 | |
| 6091 if (!NILP (CODING_ATTR_ENCODE_TBL (attrs))) | |
|
89207
c232917f49f7
(decode_coding): Fix args to translate_chars. Pay
Kenichi Handa <handa@m17n.org>
parents:
89193
diff
changeset
|
6092 translate_chars (coding, CODING_ATTR_ENCODE_TBL (attrs)); |
|
c232917f49f7
(decode_coding): Fix args to translate_chars. Pay
Kenichi Handa <handa@m17n.org>
parents:
89193
diff
changeset
|
6093 else if (!NILP (Vstandard_translation_table_for_encode)) |
|
c232917f49f7
(decode_coding): Fix args to translate_chars. Pay
Kenichi Handa <handa@m17n.org>
parents:
89193
diff
changeset
|
6094 translate_chars (coding, Vstandard_translation_table_for_encode); |
| 88365 | 6095 |
| 6096 coding_set_destination (coding); | |
| 6097 (*(coding->encoder)) (coding); | |
| 6098 } while (coding->consumed_char < coding->src_chars); | |
| 6099 | |
| 6100 if (BUFFERP (coding->dst_object)) | |
| 6101 insert_from_gap (coding->produced_char, coding->produced); | |
| 6102 | |
| 6103 return (coding->result); | |
| 6104 } | |
| 6105 | |
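The do/while loop in encode_coding follows a common chunked-conversion shape: fill a bounded work area from the source, run the encoder on it, and repeat until every source character has been consumed. The sketch below shows only that control flow. Everything in it is hypothetical: `struct conv_sketch`, the tiny `WORK_SIZE`, and an "encoder" that merely upcases ASCII letters in place of a real coding-system encoder.

```c
#include <assert.h>
#include <stddef.h>

#define WORK_SIZE 4	/* deliberately tiny so that the loop iterates */

/* Hypothetical, much reduced stand-in for struct coding_system.  */
struct conv_sketch
{
  const char *src;		/* source text */
  size_t src_chars;		/* number of source characters */
  size_t consumed_char;		/* characters consumed so far */
  char *dst;			/* destination; needs src_chars bytes */
  size_t produced;		/* bytes produced so far */
  int work[WORK_SIZE];		/* work area, like coding->charbuf */
  size_t work_used;
};

/* Move as many source characters as fit into the work area.  */
static void
consume_sketch (struct conv_sketch *cv)
{
  cv->work_used = 0;
  while (cv->work_used < WORK_SIZE && cv->consumed_char < cv->src_chars)
    cv->work[cv->work_used++] = (unsigned char) cv->src[cv->consumed_char++];
}

/* Toy encoder: upcase ASCII letters and append them to the output.  */
static void
encode_chunk_sketch (struct conv_sketch *cv)
{
  for (size_t i = 0; i < cv->work_used; i++)
    {
      int c = cv->work[i];

      if (c >= 'a' && c <= 'z')
	c -= 'a' - 'A';
      cv->dst[cv->produced++] = (char) c;
    }
}

/* Drive the conversion chunk by chunk; return the bytes produced.  */
static size_t
encode_sketch (struct conv_sketch *cv)
{
  assert (cv != NULL);
  cv->consumed_char = cv->produced = 0;
  do
    {
      consume_sketch (cv);
      encode_chunk_sketch (cv);
    }
  while (cv->consumed_char < cv->src_chars);
  return cv->produced;
}
```

As in the original, the loop body runs at least once, so an empty source simply produces nothing.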
| 89442 | 6106 |
| 89665 | 6107 /* Name (or base name) of work buffer for code conversion. */ |
| 6108 static Lisp_Object Vcode_conversion_workbuf_name; | |
| 6109 | |
| 6110 /* A working buffer used by the top level conversion. Once it is | |
| 6111 created, it is never destroyed. It has the name | |
| 6112 Vcode_conversion_workbuf_name. The other working buffers are | |
| 6113 destroyed after use, and their names are modified | |
| 6114 versions of Vcode_conversion_workbuf_name. */ | |
| 6115 static Lisp_Object Vcode_conversion_reused_workbuf; | |
| 6116 | |
| 6117 /* Nonzero iff Vcode_conversion_reused_workbuf is already in use. */ | |
| 6118 static int reused_workbuf_in_use; | |
| 6119 | |
| 6120 | |
| 6121 /* Return a working buffer for code conversion. MULTIBYTE specifies the | |
| 6122 multibyteness of the returned buffer. */ | |
| 88365 | 6123 |
| 6124 Lisp_Object | |
| 89665 | 6125 make_conversion_work_buffer (multibyte) |
| 88365 | 6126 { |
| 89665 | 6127 Lisp_Object name, workbuf; |
| 6128 struct buffer *current; | |
| 6129 | |
| 6130 if (reused_workbuf_in_use++) | |
| 6131 name = Fgenerate_new_buffer_name (Vcode_conversion_workbuf_name, Qnil); | |
| 20718 | 6132 else |
| 89665 | 6133 name = Vcode_conversion_workbuf_name; |
| 6134 workbuf = Fget_buffer_create (name); | |
| 6135 current = current_buffer; | |
| 6136 set_buffer_internal (XBUFFER (workbuf)); | |
| 6137 Ferase_buffer (); | |
| 88365 | 6138 current_buffer->undo_list = Qt; |
| 89665 | 6139 current_buffer->enable_multibyte_characters = multibyte ? Qt : Qnil; |
| 88365 | 6140 set_buffer_internal (current); |
| 89665 | 6141 return workbuf; |
| 20718 | 6142 } |
| 6143 | |
| 89665 | 6144 |
| 89442 | 6145 static Lisp_Object |
| 89665 | 6146 code_conversion_restore (arg) |
| 6147 Lisp_Object arg; | |
| 26067 | 6148 { |
| 89665 | 6149 Lisp_Object current, workbuf; |
| 6150 | |
| 6151 current = XCAR (arg); | |
| 6152 workbuf = XCDR (arg); | |
| 6153 if (! NILP (workbuf)) | |
| 6154 { | |
| 6155 if (EQ (workbuf, Vcode_conversion_reused_workbuf)) | |
| 6156 reused_workbuf_in_use = 0; | |
| 6157 else if (! NILP (Fbuffer_live_p (workbuf))) | |
| 6158 Fkill_buffer (workbuf); | |
| 6159 } | |
| 6160 set_buffer_internal (XBUFFER (current)); | |
| 89442 | 6161 return Qnil; |
| 26067 | 6162 } |
| 6163 | |
| 89665 | 6164 Lisp_Object |
| 6165 code_conversion_save (with_work_buf, multibyte) | |
| 89442 | 6166 int with_work_buf, multibyte; |
| 6167 { | |
| 89665 | 6168 Lisp_Object workbuf = Qnil; |
| 89442 | 6169 |
| 6170 if (with_work_buf) | |
| 89665 | 6171 workbuf = make_conversion_work_buffer (multibyte); |
| 6172 record_unwind_protect (code_conversion_restore, | |
| 6173 Fcons (Fcurrent_buffer (), workbuf)); | |
| 89442 | 6174 return workbuf; |
| 6175 } | |
| 88365 | 6176 |
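The work-buffer handling above (one long-lived buffer that is reused, plus throwaway buffers for nested conversions, with a restore step that either frees the shared one or kills the temporary) can be sketched outside Emacs roughly as follows. This is a simplified illustration, not the Emacs implementation: `struct workbuf`, `acquire_workbuf` and `release_workbuf` are hypothetical names, a plain flag stands in for `reused_workbuf_in_use`, and `malloc`/`free` stand in for buffer creation and `Fkill_buffer`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an Emacs buffer; not the real struct buffer.  */
struct workbuf
{
  int is_shared;   /* 1 for the long-lived buffer, 0 for a temporary */
};

/* The one buffer that is created once and never destroyed.  */
static struct workbuf shared_workbuf = { 1 };

/* Nonzero while the shared buffer is handed out.  */
static int shared_in_use;

/* Hand out the shared buffer if it is free, otherwise a fresh temporary
   one.  Returns NULL only if allocating a temporary buffer fails.  */
struct workbuf *
acquire_workbuf (void)
{
  struct workbuf *tmp;

  if (! shared_in_use)
    {
      shared_in_use = 1;
      return &shared_workbuf;
    }
  tmp = malloc (sizeof *tmp);
  if (tmp)
    tmp->is_shared = 0;
  return tmp;
}

/* Counterpart of acquire_workbuf: mark the shared buffer free again,
   or destroy a temporary buffer.  */
void
release_workbuf (struct workbuf *buf)
{
  if (! buf)
    return;
  if (buf == &shared_workbuf)
    {
      assert (shared_in_use);
      shared_in_use = 0;
    }
  else
    free (buf);
}
```

In the listing the release step is not called directly; it is registered with `record_unwind_protect`, so it also runs when the conversion exits non-locally.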
| 6177 int | |
| 6178 decode_coding_gap (coding, chars, bytes) | |
| 26847 | 6179 struct coding_system *coding; |
| 88365 | 6180 EMACS_INT chars, bytes; |
| 6181 { | |
| 6182 int count = specpdl_ptr - specpdl; | |
| 89448 | 6183 Lisp_Object attrs; |
| 89665 | 6184 |
| 6185 code_conversion_save (0, 0); | |
| 6186 | |
| 6187 coding->src_object = Fcurrent_buffer (); | |
| 88365 | 6188 coding->src_chars = chars; |
| 6189 coding->src_bytes = bytes; | |
| 6190 coding->src_pos = -chars; | |
| 6191 coding->src_pos_byte = -bytes; | |
| 6192 coding->src_multibyte = chars < bytes; | |
| 89665 | 6193 coding->dst_object = coding->src_object; |
| 88365 | 6194 coding->dst_pos = PT; |
| 6195 coding->dst_pos_byte = PT_BYTE; | |
| 88443 | 6196 coding->dst_multibyte = ! NILP (current_buffer->enable_multibyte_characters); |
| 89279 | 6197 coding->mode |= CODING_MODE_LAST_BLOCK; |
| 88365 | 6198 |
| 6199 if (CODING_REQUIRE_DETECTION (coding)) | |
| 6200 detect_coding (coding); | |
| 89483 | 6201 |
| 88365 | 6202 decode_coding (coding); |
| 6203 | |
| 89448 | 6204 attrs = CODING_ID_ATTRS (coding->id); |
| 6205 if (! NILP (CODING_ATTR_POST_READ (attrs))) | |
| 6206 { | |
| 6207 EMACS_INT prev_Z = Z, prev_Z_BYTE = Z_BYTE; | |
| 6208 Lisp_Object val; | |
| 6209 | |
| 6210 TEMP_SET_PT_BOTH (coding->dst_pos, coding->dst_pos_byte); | |
| 6211 val = call1 (CODING_ATTR_POST_READ (attrs), | |
| 6212 make_number (coding->produced_char)); | |
| 6213 CHECK_NATNUM (val); | |
| 6214 coding->produced_char += Z - prev_Z; | |
| 6215 coding->produced += Z_BYTE - prev_Z_BYTE; | |
| 6216 } | |
| 6217 | |
| 88365 | 6218 unbind_to (count, Qnil); |
| 6219 return coding->result; | |
| 6220 } | |
| 6221 | |
| 6222 int | |
| 6223 encode_coding_gap (coding, chars, bytes) | |
| 6224 struct coding_system *coding; | |
| 6225 EMACS_INT chars, bytes; | |
| 26847 | 6226 { |
| 88365 | 6227 int count = specpdl_ptr - specpdl; |
| 89665 | 6228 |
| 6229 code_conversion_save (0, 0); | |
| 6230 | |
| 6231 coding->src_object = Fcurrent_buffer (); | |
| 88365 | 6232 coding->src_chars = chars; |
| 6233 coding->src_bytes = bytes; | |
| 6234 coding->src_pos = -chars; | |
| 6235 coding->src_pos_byte = -bytes; | |
| 6236 coding->src_multibyte = chars < bytes; | |
| 6237 coding->dst_object = coding->src_object; | |
| 6238 coding->dst_pos = PT; | |
| 6239 coding->dst_pos_byte = PT_BYTE; | |
| 6240 | |
| 6241 encode_coding (coding); | |
| 6242 | |
| 6243 unbind_to (count, Qnil); | |
| 6244 return coding->result; | |
| 26847 | 6245 } |
| 6246 | |
| 88365 | 6247 |
| 6248 /* Decode the text in the range FROM/FROM_BYTE and TO/TO_BYTE in | |
| 6249 SRC_OBJECT into DST_OBJECT by coding context CODING. | |
| 6250 | |
| 6251 SRC_OBJECT is a buffer, a string, or Qnil. | |
| 6252 | |
| 6253 If it is a buffer, the text is at point of the buffer. FROM and TO | |
| 6254 are positions in the buffer. | |
| 6255 | |
| 6256 If it is a string, the text is at the beginning of the string. | |
| 6257 FROM and TO are indices to the string. | |
| 6258 | |
| 6259 If it is nil, the text is at coding->source. FROM and TO are | |
| 6260 indices to coding->source. | |
| 6261 | |
| 6262 DST_OBJECT is a buffer, Qt, or Qnil. | |
| 6263 | |
| 6264 If it is a buffer, the decoded text is inserted at point of the | |
| 6265 buffer. If the buffer is the same as SRC_OBJECT, the source text | |
| 6266 is deleted. | |
| 6267 | |
| 6268 If it is Qt, a string is made from the decoded text, and | |
| 6269 set in CODING->dst_object. | |
| 6270 | |
| 6271 If it is Qnil, the decoded text is stored at CODING->destination. | |
| 89418 | 6272 The caller must allocate CODING->dst_bytes bytes at |
| 88365 | 6273 CODING->destination by xmalloc. If the decoded text is longer than |
| 6274 CODING->dst_bytes, CODING->destination is relocated by xrealloc. | |
| 6275 */ | |
| 26847 | 6276 |
| 29275 | 6277 void |
| 88365 | 6278 decode_coding_object (coding, src_object, from, from_byte, to, to_byte, |
| 6279 dst_object) | |
| 26847 | 6280 struct coding_system *coding; |
| 88365 | 6281 Lisp_Object src_object; |
| 6282 EMACS_INT from, from_byte, to, to_byte; | |
| 6283 Lisp_Object dst_object; | |
| 26847 | 6284 { |
| 88365 | 6285 int count = specpdl_ptr - specpdl; |
| 6286 unsigned char *destination; | |
| 6287 EMACS_INT dst_bytes; | |
| 6288 EMACS_INT chars = to - from; | |
| 6289 EMACS_INT bytes = to_byte - from_byte; | |
| 6290 Lisp_Object attrs; | |
| 89442 | 6291 Lisp_Object buffer; |
| 6292 int saved_pt = -1, saved_pt_byte; | |
| 6293 | |
| 6294 buffer = Fcurrent_buffer (); | |
| 88365 | 6295 |
| 6296 if (NILP (dst_object)) | |
| 6297 { | |
| 6298 destination = coding->destination; | |
| 6299 dst_bytes = coding->dst_bytes; | |
| 6300 } | |
| 6301 | |
| 6302 coding->src_object = src_object; | |
| 6303 coding->src_chars = chars; | |
| 6304 coding->src_bytes = bytes; | |
| 6305 coding->src_multibyte = chars < bytes; | |
| 6306 | |
| 6307 if (STRINGP (src_object)) | |
| 6308 { | |
| 6309 coding->src_pos = from; | |
| 6310 coding->src_pos_byte = from_byte; | |
| 6311 } | |
| 6312 else if (BUFFERP (src_object)) | |
| 6313 { | |
| 6314 set_buffer_internal (XBUFFER (src_object)); | |
| 6315 if (from != GPT) | |
| 6316 move_gap_both (from, from_byte); | |
| 6317 if (EQ (src_object, dst_object)) | |
| 26847 | 6318 { |
| 89442 | 6319 saved_pt = PT, saved_pt_byte = PT_BYTE; |
| 88365 | 6320 TEMP_SET_PT_BOTH (from, from_byte); |
| 6321 del_range_both (from, from_byte, to, to_byte, 1); | |
| 6322 coding->src_pos = -chars; | |
| 6323 coding->src_pos_byte = -bytes; | |
| 20931 | 6324 } |
| 42661 | 6325 else |
| 6326 { | |
| 88365 | 6327 coding->src_pos = from; |
| 6328 coding->src_pos_byte = from_byte; | |
| 29985 | 6329 } |
| 88365 | 6330 } |
| 6331 | |
| 6332 if (CODING_REQUIRE_DETECTION (coding)) | |
| 6333 detect_coding (coding); | |
| 6334 attrs = CODING_ID_ATTRS (coding->id); | |
| 6335 | |
| 89418 | 6336 if (EQ (dst_object, Qt) |
| 6337 || (! NILP (CODING_ATTR_POST_READ (attrs)) | |
| 6338 && NILP (dst_object))) | |
| 88365 | 6339 { |
| 89665 | 6340 coding->dst_object = code_conversion_save (1, 1); |
| 88365 | 6341 coding->dst_pos = BEG; |
| 6342 coding->dst_pos_byte = BEG_BYTE; | |
| 6343 coding->dst_multibyte = 1; | |
| 6344 } | |
| 6345 else if (BUFFERP (dst_object)) | |
| 6346 { | |
| 89665 | 6347 code_conversion_save (0, 0); |
| 88365 | 6348 coding->dst_object = dst_object; |
| 6349 coding->dst_pos = BUF_PT (XBUFFER (dst_object)); | |
| 6350 coding->dst_pos_byte = BUF_PT_BYTE (XBUFFER (dst_object)); | |
| 6351 coding->dst_multibyte | |
| 6352 = ! NILP (XBUFFER (dst_object)->enable_multibyte_characters); | |
| 6353 } | |
| 6354 else | |
| 6355 { | |
| 89665 | 6356 code_conversion_save (0, 0); |
| 88365 | 6357 coding->dst_object = Qnil; |
| 6358 coding->dst_multibyte = 1; | |
| 6359 } | |
| 6360 | |
| 6361 decode_coding (coding); | |
| 6362 | |
| 6363 if (BUFFERP (coding->dst_object)) | |
| 6364 set_buffer_internal (XBUFFER (coding->dst_object)); | |
| 6365 | |
| 6366 if (! NILP (CODING_ATTR_POST_READ (attrs))) | |
| 6367 { | |
| 6368 struct gcpro gcpro1, gcpro2; | |
| 6369 EMACS_INT prev_Z = Z, prev_Z_BYTE = Z_BYTE; | |
| 6370 Lisp_Object val; | |
| 6371 | |
| 88506 | 6372 TEMP_SET_PT_BOTH (coding->dst_pos, coding->dst_pos_byte); |
| 88365 | 6373 GCPRO2 (coding->src_object, coding->dst_object); |
| 6374 val = call1 (CODING_ATTR_POST_READ (attrs), | |
| 6375 make_number (coding->produced_char)); | |
| 6376 UNGCPRO; | |
| 6377 CHECK_NATNUM (val); | |
| 6378 coding->produced_char += Z - prev_Z; | |
| 6379 coding->produced += Z_BYTE - prev_Z_BYTE; | |
| 6380 } | |
| 6381 | |
| 6382 if (EQ (dst_object, Qt)) | |
| 6383 { | |
| 6384 coding->dst_object = Fbuffer_string (); | |
| 6385 } | |
| 6386 else if (NILP (dst_object) && BUFFERP (coding->dst_object)) | |
| 6387 { | |
| 6388 set_buffer_internal (XBUFFER (coding->dst_object)); | |
| 6389 if (dst_bytes < coding->produced) | |
| 42105 | 6390 { |
| 88365 | 6391 destination |
| 6392 = (unsigned char *) xrealloc (destination, coding->produced); | |
| 6393 if (! destination) | |
| 20718 | 6394 { |
| 88365 | 6395 coding->result = CODING_RESULT_INSUFFICIENT_DST; |
| 6396 unbind_to (count, Qnil); | |
| 6397 return; | |
| 20718 | 6398 } |
| 88365 | 6399 if (BEGV < GPT && GPT < BEGV + coding->produced_char) |
| 6400 move_gap_both (BEGV, BEGV_BYTE); | |
| 6401 bcopy (BEGV_ADDR, destination, coding->produced); | |
| 6402 coding->destination = destination; | |
| 23279 | 6403 } |
| 88365 | 6404 } |
| 6405 | |
| 89442 | 6406 if (saved_pt >= 0) |
| 6407 { | |
| 6408 /* This is the case of: | |
| 6409 (BUFFERP (src_object) && EQ (src_object, dst_object)) | |
| 6410 As we have moved PT while replacing the original buffer | |
| 6411 contents, we must recover it now. */ | |
| 6412 set_buffer_internal (XBUFFER (src_object)); | |
| 6413 if (saved_pt < from) | |
| 6414 TEMP_SET_PT_BOTH (saved_pt, saved_pt_byte); | |
| 6415 else if (saved_pt < from + chars) | |
| 6416 TEMP_SET_PT_BOTH (from, from_byte); | |
| 6417 else if (! NILP (current_buffer->enable_multibyte_characters)) | |
| 6418 TEMP_SET_PT_BOTH (saved_pt + (coding->produced_char - chars), | |
| 6419 saved_pt_byte + (coding->produced - bytes)); | |
| 6420 else | |
| 6421 TEMP_SET_PT_BOTH (saved_pt + (coding->produced - bytes), | |
| 6422 saved_pt_byte + (coding->produced - bytes)); | |
| 6423 } | |
| 6424 | |
| 88365 | 6425 unbind_to (count, Qnil); |
| 6426 } | |
| 6427 | |
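The comment before `decode_coding_object` says that, when DST_OBJECT is Qnil, the caller supplies a heap block of CODING->dst_bytes bytes and the block is relocated if the decoded text does not fit. A minimal standalone sketch of that contract follows. It is an illustration under assumptions, not the Emacs code: `store_result` is a hypothetical helper, `realloc`/`memcpy` stand in for `xrealloc`/`bcopy`, and unlike the listing it copies the result whether or not the block had to grow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy PRODUCED bytes from SRC into *DST, whose current capacity is
   *DST_BYTES.  If the capacity is too small, the block is grown with
   realloc and *DST and *DST_BYTES are updated.  Returns 0 on success,
   -1 if realloc fails, in which case *DST is left untouched.  */
int
store_result (unsigned char **dst, size_t *dst_bytes,
              const unsigned char *src, size_t produced)
{
  assert (dst && dst_bytes && src);

  if (*dst_bytes < produced)
    {
      unsigned char *grown = realloc (*dst, produced);

      if (! grown)
        return -1;
      *dst = grown;
      *dst_bytes = produced;
    }
  memcpy (*dst, src, produced);
  return 0;
}
```

Because the block may move, a caller has to re-read the destination pointer after the call instead of keeping the address it passed in; the listing does the same by storing the new address back into `coding->destination`.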
| 6428 | |
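When source and destination are the same buffer, `decode_coding_object` saves point before deleting the source region and puts it back afterwards, using three cases: before the region, inside it, or after it. The character-position half of that rule can be sketched as a pure function. `restore_point` is a hypothetical name used only for this illustration; the listing applies the same arithmetic to byte positions as well.

```c
#include <assert.h>

/* Given the point SAVED_PT recorded before the region
   [FROM, FROM + CHARS) was replaced by PRODUCED_CHAR characters,
   return where point belongs afterwards.  */
long
restore_point (long saved_pt, long from, long chars, long produced_char)
{
  assert (chars >= 0 && produced_char >= 0);

  if (saved_pt < from)
    /* Point was before the region, so it is unaffected.  */
    return saved_pt;
  if (saved_pt < from + chars)
    /* Point was inside the replaced text; snap it to the region start.  */
    return from;
  /* Point was after the region; shift it by the change in length.  */
  return saved_pt + (produced_char - chars);
}
```

For example, with a region starting at 10 that shrinks from 5 characters to 2, a point saved at 20 moves to 17, while a point saved at 12 lands on 10.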
| 6429 void | |
| 6430 encode_coding_object (coding, src_object, from, from_byte, to, to_byte, | |
| 6431 dst_object) | |
| 6432 struct coding_system *coding; | |
| 6433 Lisp_Object src_object; | |
| 6434 EMACS_INT from, from_byte, to, to_byte; | |
| 6435 Lisp_Object dst_object; | |
| 6436 { | |
| 6437 int count = specpdl_ptr - specpdl; | |
| 6438 EMACS_INT chars = to - from; | |
| 6439 EMACS_INT bytes = to_byte - from_byte; | |
| 6440 Lisp_Object attrs; | |
| 89442 | 6441 Lisp_Object buffer; |
| 6442 int saved_pt = -1, saved_pt_byte; | |
| 6443 | |
| 6444 buffer = Fcurrent_buffer (); | |
| 88365 | 6445 |
| 6446 coding->src_object = src_object; | |
| 6447 coding->src_chars = chars; | |
| 6448 coding->src_bytes = bytes; | |
| 6449 coding->src_multibyte = chars < bytes; | |
| 6450 | |
| 6451 attrs = CODING_ID_ATTRS (coding->id); | |
| 6452 | |
| 6453 if (! NILP (CODING_ATTR_PRE_WRITE (attrs))) | |
| 21062 | 6454 { |
| 89665 | 6455 coding->src_object = code_conversion_save (1, coding->src_multibyte); |
| 88365 | 6456 set_buffer_internal (XBUFFER (coding->src_object)); |
| 6457 if (STRINGP (src_object)) | |
| 6458 insert_from_string (src_object, from, from_byte, chars, bytes, 0); | |
| 6459 else if (BUFFERP (src_object)) | |
| 6460 insert_from_buffer (XBUFFER (src_object), from, chars, 0); | |
| 6461 else | |
| 6462 insert_1_both (coding->source + from, chars, bytes, 0, 0, 0); | |
| 6463 | |
| 6464 if (EQ (src_object, dst_object)) | |
| 6465 { | |
| 6466 set_buffer_internal (XBUFFER (src_object)); | |
| 89442 | 6467 saved_pt = PT, saved_pt_byte = PT_BYTE; |
| 88365 | 6468 del_range_both (from, from_byte, to, to_byte, 1); |
| 6469 set_buffer_internal (XBUFFER (coding->src_object)); | |
| 6470 } | |
| 6471 | |
| 88510 | 6472 call2 (CODING_ATTR_PRE_WRITE (attrs), |
| 6473 make_number (BEG), make_number (Z)); | |
| 6474 coding->src_object = Fcurrent_buffer (); | |
| 88365 | 6475 if (BEG != GPT) |
| 6476 move_gap_both (BEG, BEG_BYTE); | |
| 6477 coding->src_chars = Z - BEG; | |
| 6478 coding->src_bytes = Z_BYTE - BEG_BYTE; | |
| 6479 coding->src_pos = BEG; | |
| 6480 coding->src_pos_byte = BEG_BYTE; | |
| 6481 coding->src_multibyte = Z < Z_BYTE; | |
| 6482 } | |
| 6483 else if (STRINGP (src_object)) | |
| 6484 { | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6485 code_conversion_save (0, 0); |
| 88365 | 6486 coding->src_pos = from; |
| 6487 coding->src_pos_byte = from_byte; | |
| 6488 } | |
| 6489 else if (BUFFERP (src_object)) | |
| 6490 { | |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6491 code_conversion_save (0, 0); |
| 88365 | 6492 set_buffer_internal (XBUFFER (src_object)); |
| 6493 if (EQ (src_object, dst_object)) | |
| 6494 { | |
|
89442
7349f4473e7f
(detected_mask): Delete unused variable.
Kenichi Handa <handa@m17n.org>
parents:
89429
diff
changeset
|
6495 saved_pt = PT, saved_pt_byte = PT_BYTE; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6496 coding->src_object = del_range_1 (from, to, 1, 1); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6497 coding->src_pos = 0; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6498 coding->src_pos_byte = 0; |
| 88365 | 6499 } |
|
23514
7bad909cd6f1
(setup_coding_system): Fix setting up
Kenichi Handa <handa@m17n.org>
parents:
23475
diff
changeset
|
6500 else |
| 88365 | 6501 { |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6502 if (from < GPT && to >= GPT) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6503 move_gap_both (from, from_byte); |
| 88365 | 6504 coding->src_pos = from; |
| 6505 coding->src_pos_byte = from_byte; | |
| 6506 } | |
| 6507 } | |
| 89442 | 6508 else |
| 89665 | 6509 code_conversion_save (0, 0); |
| 88365 | 6510 |
| 6511 if (BUFFERP (dst_object)) | |
| 6512 { | |
| 6513 coding->dst_object = dst_object; | |
| 89042 | 6514 if (EQ (src_object, dst_object)) |
| 6515 { | |
| 6516 coding->dst_pos = from; | |
| 6517 coding->dst_pos_byte = from_byte; | |
| 6518 } | |
| 6519 else | |
| 6520 { | |
| 6521 coding->dst_pos = BUF_PT (XBUFFER (dst_object)); | |
| 6522 coding->dst_pos_byte = BUF_PT_BYTE (XBUFFER (dst_object)); | |
| 6523 } | |
| 88365 | 6524 coding->dst_multibyte |
| 6525 = ! NILP (XBUFFER (dst_object)->enable_multibyte_characters); | |
| 6526 } | |
| 6527 else if (EQ (dst_object, Qt)) | |
| 6528 { | |
| 6529 coding->dst_object = Qnil; | |
| 6530 coding->dst_bytes = coding->src_chars; | |
| 88510 | 6531 if (coding->dst_bytes == 0) |
| 6532 coding->dst_bytes = 1; | |
| 6533 coding->destination = (unsigned char *) xmalloc (coding->dst_bytes); | |
| 88365 | 6534 coding->dst_multibyte = 0; |
| 6535 } | |
| 6536 else | |
| 6537 { | |
| 6538 coding->dst_object = Qnil; | |
| 6539 coding->dst_multibyte = 0; | |
| 6540 } | |
| 6541 | |
| 6542 encode_coding (coding); | |
| 6543 | |
| 6544 if (EQ (dst_object, Qt)) | |
| 6545 { | |
| 6546 if (BUFFERP (coding->dst_object)) | |
| 6547 coding->dst_object = Fbuffer_string (); | |
| 6548 else | |
| 6549 { | |
| 6550 coding->dst_object | |
| 6551 = make_unibyte_string ((char *) coding->destination, | |
| 6552 coding->produced); | |
| 6553 xfree (coding->destination); | |
| 6554 } | |
| 6555 } | |
| 6556 | |
| 89442 | 6557 if (saved_pt >= 0) |
| 6558 { | |
| 6559 /* This is the case of: | |
| 6560 (BUFFERP (src_object) && EQ (src_object, dst_object)) | |
| 6561 As we have moved PT while replacing the original buffer | |
| 6562 contents, we must recover it now. */ | |
| 6563 set_buffer_internal (XBUFFER (src_object)); | |
| 6564 if (saved_pt < from) | |
| 6565 TEMP_SET_PT_BOTH (saved_pt, saved_pt_byte); | |
| 6566 else if (saved_pt < from + chars) | |
| 6567 TEMP_SET_PT_BOTH (from, from_byte); | |
| 6568 else if (! NILP (current_buffer->enable_multibyte_characters)) | |
| 6569 TEMP_SET_PT_BOTH (saved_pt + (coding->produced_char - chars), | |
| 6570 saved_pt_byte + (coding->produced - bytes)); | |
| 6571 else | |
| 6572 TEMP_SET_PT_BOTH (saved_pt + (coding->produced - bytes), | |
| 6573 saved_pt_byte + (coding->produced - bytes)); | |
| 6574 } | |
| 6575 | |
| 88365 | 6576 unbind_to (count, Qnil); |
| 20718 | 6577 } |
| 6578 | |
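
The point-recovery branch at source lines 6557-6574 above can be read as a three-way rule on character positions. The following is a minimal standalone sketch of that rule, not the Emacs implementation: `adjust_point` and its parameter names are hypothetical, it tracks character positions only (the real code also updates the byte position via `TEMP_SET_PT_BOTH`), and it assumes the region `[from, from + chars)` was replaced by `produced_chars` characters.

```c
#include <assert.h>

/* Hypothetical helper (not part of coding.c): return the character
   position point should have after the region [from, from + chars)
   has been replaced by produced_chars characters.  */
static long
adjust_point (long saved_pt, long from, long chars, long produced_chars)
{
  assert (chars >= 0 && produced_chars >= 0);

  if (saved_pt < from)
    /* Point was before the replaced region: it does not move.  */
    return saved_pt;
  if (saved_pt < from + chars)
    /* Point was inside the replaced region: park it at the start.  */
    return from;
  /* Point was after the region: shift it by the change in length.  */
  return saved_pt + (produced_chars - chars);
}
```

For example, with a 4-character region at position 10 that encodes to 6 characters, a point saved at 20 would end up at 22, while a point saved at 12 (inside the region) would be moved to 10.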
| 29005 | 6579 |
| 6580 Lisp_Object | |
| 88365 | 6581 preferred_coding_system () |
| 20718 | 6582 { |
| 88365 | 6583 int id = coding_categories[coding_priorities[0]].id; |
| 6584 | |
| 6585 return CODING_ID_NAME (id); | |
| 20718 | 6586 } |
| 6587 | |
| 17052 | 6588 |
| 6589 #ifdef emacs | |
| 22874 | 6590 /*** 8. Emacs Lisp library functions ***/ |
| 17052 | 6591 |
| 6592 DEFUN ("coding-system-p", Fcoding_system_p, Scoding_system_p, 1, 1, 0, | |
| 40713 | 6593 doc: /* Return t if OBJECT is nil or a coding-system. |
| 88365 | 6594 See the documentation of `define-coding-system' for information |
| 40713 | 6595 about coding-system objects. */) |
| 6596 (obj) | |
| 17052 | 6597 Lisp_Object obj; |
| 6598 { | |
| 88365 | 6599 return ((NILP (obj) || CODING_SYSTEM_P (obj)) ? Qt : Qnil); |
| 17052 | 6600 } |
| 6601 | |
| 17717 | 6602 DEFUN ("read-non-nil-coding-system", Fread_non_nil_coding_system, |
| 6603 Sread_non_nil_coding_system, 1, 1, 0, | |
| 40713 | 6604 doc: /* Read a coding system from the minibuffer, prompting with string PROMPT. */) |
| 6605 (prompt) | |
| 17052 | 6606 Lisp_Object prompt; |
| 6607 { | |
| 17119 | 6608 Lisp_Object val; |
| 17717 | 6609 do |
| 6610 { | |
| 20105 | 6611 val = Fcompleting_read (prompt, Vcoding_system_alist, Qnil, |
| 6612 Qt, Qnil, Qcoding_system_history, Qnil, Qnil); | |
| 17717 | 6613 } |
| 46370 | 6614 while (SCHARS (val) == 0); |
| 17119 | 6615 return (Fintern (val, Qnil)); |
| 17052 | 6616 } |
| 6617 | |
| 19758 | 6618 DEFUN ("read-coding-system", Fread_coding_system, Sread_coding_system, 1, 2, 0, |
| 40713 | 6619 doc: /* Read a coding system from the minibuffer, prompting with string PROMPT. |
| 6620 If the user enters null input, return second argument DEFAULT-CODING-SYSTEM. */) | |
| 6621 (prompt, default_coding_system) | |
| 19758 | 6622 Lisp_Object prompt, default_coding_system; |
| 17052 | 6623 { |
| 19747 | 6624 Lisp_Object val; |
| 19758 | 6625 if (SYMBOLP (default_coding_system)) |
| 89519 | 6626 XSETSTRING (default_coding_system, XPNTR (SYMBOL_NAME (default_coding_system))); |
| 20105 | 6627 val = Fcompleting_read (prompt, Vcoding_system_alist, Qnil, |
| 19758 | 6628 Qt, Qnil, Qcoding_system_history, |
| 6629 default_coding_system, Qnil); | |
| 46370 | 6630 return (SCHARS (val) == 0 ? Qnil : Fintern (val, Qnil)); |
| 17052 | 6631 } |
| 6632 | |
| 6633 DEFUN ("check-coding-system", Fcheck_coding_system, Scheck_coding_system, | |
| 6634 1, 1, 0, | |
| 40713 | 6635 doc: /* Check validity of CODING-SYSTEM. |
| 89218 | 6636 If valid, return CODING-SYSTEM, else signal a `coding-system-error' error. */) |
| 88365 | 6637 (coding_system) |
| 17052 | 6638 Lisp_Object coding_system; |
| 6639 { | |
| 40656 | 6640 CHECK_SYMBOL (coding_system); |
| 17052 | 6641 if (!NILP (Fcoding_system_p (coding_system))) |
| 6642 return coding_system; | |
| 6643 while (1) | |
| 18180 | 6644 Fsignal (Qcoding_system_error, Fcons (coding_system, Qnil)); |
| 17052 | 6645 } |
| 88365 | 6646 |
| 20680 | 6647 |
| 89193 | 6648 /* Detect how the bytes at SRC of length SRC_BYTES are encoded. If |
| 6649 HIGHEST is nonzero, return the coding system of the highest | |
| 6650 priority among the detected coding systems. Otherwise return a | |
| 6651 list of detected coding systems sorted by their priorities. If | |
| 6652 MULTIBYTEP is nonzero, it is assumed that the bytes are in correct | |
| 6653 multibyte form but contain only ASCII and eight-bit chars. | |
| 6654 Otherwise, the bytes are raw bytes. | |
| 6655 | |
| 6656 CODING-SYSTEM controls the detection as below: | |
| 6657 | |
| 6658 If it is nil, detect both text-format and eol-format. If the | |
| 6659 text-format part of CODING-SYSTEM is already specified | |
| 6660 (e.g. `iso-latin-1'), detect only eol-format. If the eol-format | |
| 6661 part of CODING-SYSTEM is already specified (e.g. `undecided-unix'), | |
| 6662 detect only text-format. */ | |
| 6663 | |
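
Before running any per-category detector, the function documented above skips the leading run of plain ASCII bytes and records its length in `coding.head_ascii`. As a rough illustration only, that pre-scan can be sketched as a standalone helper. The name `ascii_head_length` and the `SK_*` constants are assumptions introduced here; the constants carry the standard ASCII values of ESC, SO and SI, which `ISO_CODE_ESC`, `ISO_CODE_SO` and `ISO_CODE_SI` are assumed to denote.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values: the standard ASCII codes for ESC, SO and SI.  */
enum { SK_ESC = 0x1B, SK_SO = 0x0E, SK_SI = 0x0F };

/* Hypothetical helper (not part of coding.c): return how many leading
   bytes of SRC can be skipped as plain ASCII.  Scanning stops at the
   first eight-bit byte, NUL, ESC, SI or SO.  The result is rounded
   down to an even count so that a UTF-16 detector would start on a
   two-byte boundary.  */
static size_t
ascii_head_length (const unsigned char *src, size_t len)
{
  size_t i;

  assert (src != NULL || len == 0);

  for (i = 0; i < len; i++)
    {
      unsigned char c = src[i];

      if ((c & 0x80) || c == 0 || c == SK_ESC || c == SK_SI || c == SK_SO)
        break;
    }
  if (i % 2)
    i--;			/* Keep the skipped length even.  */
  return i;
}
```

Under these assumptions, a buffer such as `"abc"` yields 2 rather than 3, because the odd skipped byte is handed back to the detectors.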
| 20718 | 6664 Lisp_Object |
| 89665 | 6665 detect_coding_system (src, src_chars, src_bytes, highest, multibytep, |
| 6666 coding_system) | |
| 46548 | 6667 const unsigned char *src; |
| 89665 | 6668 int src_chars, src_bytes, highest; |
| 34531 | 6669 int multibytep; |
| 88365 | 6670 Lisp_Object coding_system; |
| 17052 | 6671 { |
| 89483 | 6672 const unsigned char *src_end = src + src_bytes; |
| 88365 | 6673 Lisp_Object attrs, eol_type; |
| 6674 Lisp_Object val; | |
| 6675 struct coding_system coding; | |
| 89193 | 6676 int id; |
| 89331 | 6677 struct coding_detection_info detect_info; |
| 89665 | 6678 enum coding_category base_category; |
| 88365 | 6679 |
| 6680 if (NILP (coding_system)) | |
| 6681 coding_system = Qundecided; | |
| 6682 setup_coding_system (coding_system, &coding); | |
| 6683 attrs = CODING_ID_ATTRS (coding.id); | |
| 6684 eol_type = CODING_ID_EOL_TYPE (coding.id); | |
| 89193 | 6685 coding_system = CODING_ATTR_BASE_NAME (attrs); |
| 88365 | 6686 |
| 6687 coding.source = src; | |
| 89665 | 6688 coding.src_chars = src_chars; |
| 88365 | 6689 coding.src_bytes = src_bytes; |
| 6690 coding.src_multibyte = multibytep; | |
| 6691 coding.consumed = 0; | |
| 89193 | 6692 coding.mode |= CODING_MODE_LAST_BLOCK; |
| 6693 | |
| 89331 | 6694 detect_info.checked = detect_info.found = detect_info.rejected = 0; |
| 6695 | |
| 89193 | 6696 /* At first, detect text-format if necessary. */ |
| 89665 | 6697 base_category = XINT (CODING_ATTR_CATEGORY (attrs)); |
| 6698 if (base_category == coding_category_undecided) | |
| 89193 | 6699 { |
| 89331 | 6700 enum coding_category category; |
| 6701 struct coding_system *this; | |
| 6702 int c, i; | |
| 6703 | |
| 89665 | 6704 /* Skip all ASCII bytes except for a few ISO2022 controls. */ |
| 6705 for (i = 0; src < src_end; i++, src++) | |
| 17052 | 6706 { |
| 88365 | 6707 c = *src; |
| 89665 | 6708 if (c & 0x80 || (c < 0x20 && (c == 0 |
| 6709 || c == ISO_CODE_ESC | |
| 6710 || c == ISO_CODE_SI | |
| 6711 || c == ISO_CODE_SO))) | |
| 20718 | 6712 break; |
| 17052 | 6713 } |
| 89665 | 6714 /* Skipped bytes must be even for utf-16 detector. */ |
| 6715 if (i % 2) | |
| 6716 src--; | |
| 88365 | 6717 coding.head_ascii = src - coding.source; |
| 6718 | |
| 6719 if (src < src_end) | |
| 6720 for (i = 0; i < coding_category_raw_text; i++) | |
| 6721 { | |
| 89331 | 6722 category = coding_priorities[i]; |
| 6723 this = coding_categories + category; | |
| 88365 | 6724 |
| 6725 if (this->id < 0) | |
| 6726 { | |
| 6727 /* No coding system of this category is defined. */ | |
| 89331 | 6728 detect_info.rejected |= (1 << category); |
| 88365 | 6729 } |
| 89331 | 6730 else if (category >= coding_category_raw_text) |
| 89193 | 6731 continue; |
| 89331 | 6732 else if (detect_info.checked & (1 << category)) |
| 6733 { | |
| 6734 if (highest | |
| 6735 && (detect_info.found & (1 << category))) | |
| 6736 break; | |
| 6737 } | |
| 88365 | 6738 else |
| 6739 { | |
| 89331 | 6740 if ((*(this->detector)) (&coding, &detect_info) |
| 89193 | 6741 && highest |
| 89331 | 6742 && (detect_info.found & (1 << category))) |
| 89665 | 6743 { |
| 6744 if (category == coding_category_utf_16_auto) | |
| 6745 { | |
| 6746 if (detect_info.found & CATEGORY_MASK_UTF_16_LE) | |
| 6747 category = coding_category_utf_16_le; | |
| 6748 else | |
| 6749 category = coding_category_utf_16_be; | |
| 6750 } | |
| 6751 break; | |
| 6752 } | |
| 88365 | 6753 } |
| 6754 } | |
| 89193 | 6755 |
| 89331 | 6756 if (detect_info.rejected == CATEGORY_MASK_ANY) |
| 89193 | 6757 { |
| 89331 | 6758 detect_info.found = CATEGORY_MASK_RAW_TEXT; |
| 89193 | 6759 id = coding_categories[coding_category_raw_text].id; |
| 6760 val = Fcons (make_number (id), Qnil); | |
| 6761 } | |
| 89331 | 6762 else if (! detect_info.rejected && ! detect_info.found) |
| 89193 | 6763 { |
| 89331 | 6764 detect_info.found = CATEGORY_MASK_ANY; |
| 89193 | 6765 id = coding_categories[coding_category_undecided].id; |
| 6766 val = Fcons (make_number (id), Qnil); | |
| 6767 } | |
| 6768 else if (highest) | |
| 6769 { | |
| 89331 | 6770 if (detect_info.found) |
| 6771 { | |
| 6772 detect_info.found = 1 << category; | |
| 6773 val = Fcons (make_number (this->id), Qnil); | |
| 6774 } | |
| 6775 else | |
| 6776 for (i = 0; i < coding_category_raw_text; i++) | |
| 6777 if (! (detect_info.rejected & (1 << coding_priorities[i]))) | |
| 6778 { | |
| 6779 detect_info.found = 1 << coding_priorities[i]; | |
| 6780 id = coding_categories[coding_priorities[i]].id; | |
| 6781 val = Fcons (make_number (id), Qnil); | |
| 6782 break; | |
| 6783 } | |
| 6784 } | |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6785 else |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6786 { |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6787 int mask = detect_info.rejected | detect_info.found; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6788 int found = 0; |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6789 val = Qnil; |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6790 |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6791 for (i = coding_category_raw_text - 1; i >= 0; i--) |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6792 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6793 category = coding_priorities[i]; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6794 if (! (mask & (1 << category))) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6795 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6796 found |= 1 << category; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6797 id = coding_categories[category].id; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6798 val = Fcons (make_number (id), val); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6799 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6800 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6801 for (i = coding_category_raw_text - 1; i >= 0; i--) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6802 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6803 category = coding_priorities[i]; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6804 if (detect_info.found & (1 << category)) |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6805 { |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6806 id = coding_categories[category].id; |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6807 val = Fcons (make_number (id), val); |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6808 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6809 } |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6810 detect_info.found |= found; |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6811 } |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6812 } |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6813 else if (base_category == coding_category_utf_16_auto) |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6814 { |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6815 if (detect_coding_utf_16 (&coding, &detect_info)) |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6816 { |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6817 enum coding_category category; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6818 struct coding_system *this; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6819 |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6820 if (detect_info.found & CATEGORY_MASK_UTF_16_LE) |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6821 this = coding_categories + coding_category_utf_16_le; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6822 else if (detect_info.found & CATEGORY_MASK_UTF_16_BE) |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6823 this = coding_categories + coding_category_utf_16_be; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6824 else if (detect_info.rejected & CATEGORY_MASK_UTF_16_LE_NOSIG) |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6825 this = coding_categories + coding_category_utf_16_be_nosig; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6826 else |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6827 this = coding_categories + coding_category_utf_16_le_nosig; |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6828 val = Fcons (make_number (this->id), Qnil); |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6829 } |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6830 } |
| 88365 | 6831 else |
| 6832 { | |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6833 detect_info.found = 1 << XINT (CODING_ATTR_CATEGORY (attrs)); |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6834 val = Fcons (make_number (coding.id), Qnil); |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6835 } |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6836 |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6837 /* Then, detect eol-format if necessary. */ |
| 88365 | 6838 { |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6839 int normal_eol = -1, utf_16_be_eol = -1, utf_16_le_eol = -1; |
| 88365 | 6840 Lisp_Object tail; |
| 6841 | |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6842 if (VECTORP (eol_type)) |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6843 { |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6844 if (detect_info.found & ~CATEGORY_MASK_UTF_16) |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6845 normal_eol = detect_eol (coding.source, src_bytes, |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6846 coding_category_raw_text); |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6847 if (detect_info.found & (CATEGORY_MASK_UTF_16_BE |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6848 | CATEGORY_MASK_UTF_16_BE_NOSIG)) |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6849 utf_16_be_eol = detect_eol (coding.source, src_bytes, |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6850 coding_category_utf_16_be); |
|
89331
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6851 if (detect_info.found & (CATEGORY_MASK_UTF_16_LE |
|
1892a75ffcac
(CATEGORY_MASK_RAW_TEXT): New macro.
Kenichi Handa <handa@m17n.org>
parents:
89279
diff
changeset
|
6852 | CATEGORY_MASK_UTF_16_LE_NOSIG)) |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6853 utf_16_le_eol = detect_eol (coding.source, src_bytes, |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6854 coding_category_utf_16_le); |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6855 } |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6856 else |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6857 { |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6858 if (EQ (eol_type, Qunix)) |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6859 normal_eol = utf_16_be_eol = utf_16_le_eol = EOL_SEEN_LF; |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6860 else if (EQ (eol_type, Qdos)) |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6861 normal_eol = utf_16_be_eol = utf_16_le_eol = EOL_SEEN_CRLF; |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6862 else |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6863 normal_eol = utf_16_be_eol = utf_16_le_eol = EOL_SEEN_CR; |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6864 } |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6865 |
| 88365 | 6866 for (tail = val; CONSP (tail); tail = XCDR (tail)) |
| 6867 { | |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6868 enum coding_category category; |
| 88365 | 6869 int this_eol; |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6870 |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6871 id = XINT (XCAR (tail)); |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6872 attrs = CODING_ID_ATTRS (id); |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6873 category = XINT (CODING_ATTR_CATEGORY (attrs)); |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6874 eol_type = CODING_ID_EOL_TYPE (id); |
| 88365 | 6875 if (VECTORP (eol_type)) |
| 6876 { | |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6877 if (category == coding_category_utf_16_be |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6878 || category == coding_category_utf_16_be_nosig) |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6879 this_eol = utf_16_be_eol; |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6880 else if (category == coding_category_utf_16_le |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6881 || category == coding_category_utf_16_le_nosig) |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6882 this_eol = utf_16_le_eol; |
| 88365 | 6883 else |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6884 this_eol = normal_eol; |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6885 |
| 88365 | 6886 if (this_eol == EOL_SEEN_LF) |
| 6887 XSETCAR (tail, AREF (eol_type, 0)); | |
| 6888 else if (this_eol == EOL_SEEN_CRLF) | |
| 6889 XSETCAR (tail, AREF (eol_type, 1)); | |
| 6890 else if (this_eol == EOL_SEEN_CR) | |
| 6891 XSETCAR (tail, AREF (eol_type, 2)); | |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6892 else |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6893 XSETCAR (tail, CODING_ID_NAME (id)); |
| 88365 | 6894 } |
|
89193
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6895 else |
|
311d061195ef
(detect_coding_utf_8): Check incomplete byte sequence.
Kenichi Handa <handa@m17n.org>
parents:
89184
diff
changeset
|
6896 XSETCAR (tail, CODING_ID_NAME (id)); |
| 88365 | 6897 } |
| 6898 } | |
| 6899 | |
|
25662
0a7261c1d487
Use XCAR, XCDR, and XFLOAT_DATA instead of explicit member access.
Ken Raeburn <raeburn@raeburn.org>
parents:
25370
diff
changeset
|
6900 return (highest ? XCAR (val) : val); |
|
42104
d69c2368e549
(DECODE_COMPOSITION_END): Fixed a typo in the last
Sam Steingold <sds@gnu.org>
parents:
42103
diff
changeset
|
6901 } |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6902 |
| 88365 | 6903 |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6904 DEFUN ("detect-coding-region", Fdetect_coding_region, Sdetect_coding_region, |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6905 2, 3, 0, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6906 doc: /* Detect coding system of the text in the region between START and END. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6907 Return a list of possible coding systems ordered by priority. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6908 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6909 If only ASCII characters are found, return a list with the single element |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6910 `undecided' or its subsidiary coding system according to a detected |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6911 end-of-line format. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6912 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6913 If optional argument HIGHEST is non-nil, return the coding system of |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6914 highest priority. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6915 (start, end, highest) |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6916 Lisp_Object start, end, highest; |
| 17052 | 6917 { |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6918 int from, to; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6919 int from_byte, to_byte; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6920 |
|
40656
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
6921 CHECK_NUMBER_COERCE_MARKER (start); |
|
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
6922 CHECK_NUMBER_COERCE_MARKER (end); |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6923 |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6924 validate_region (&start, &end); |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6925 from = XINT (start), to = XINT (end); |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6926 from_byte = CHAR_TO_BYTE (from); |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6927 to_byte = CHAR_TO_BYTE (to); |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6928 |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6929 if (from < GPT && to >= GPT) |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6930 move_gap_both (to, to_byte); |
| 88365 | 6931 |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6932 return detect_coding_system (BYTE_POS_ADDR (from_byte), |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6933 to - from, to_byte - from_byte, |
|
34531
37f85e931855
(ONE_MORE_BYTE_CHECK_MULTIBYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34197
diff
changeset
|
6934 !NILP (highest), |
|
37f85e931855
(ONE_MORE_BYTE_CHECK_MULTIBYTE): New macro.
Kenichi Handa <handa@m17n.org>
parents:
34197
diff
changeset
|
6935 !NILP (current_buffer |
| 88365 | 6936 ->enable_multibyte_characters), |
| 6937 Qnil); | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6938 } |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6939 |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6940 DEFUN ("detect-coding-string", Fdetect_coding_string, Sdetect_coding_string, |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6941 1, 2, 0, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6942 doc: /* Detect coding system of the text in STRING. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6943 Return a list of possible coding systems ordered by priority. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6944 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6945 If only ASCII characters are found, return a list with the single element |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6946 `undecided' or its subsidiary coding system according to a detected |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6947 end-of-line format. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6948 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6949 If optional argument HIGHEST is non-nil, return the coding system of |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6950 highest priority. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6951 (string, highest) |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6952 Lisp_Object string, highest; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6953 { |
|
40656
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
6954 CHECK_STRING (string); |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
6955 |
|
89665
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6956 return detect_coding_system (SDATA (string), |
|
9010cefe8d29
(enum iso_code_class_type): Delete ISO_carriage_return.
Kenichi Handa <handa@m17n.org>
parents:
89653
diff
changeset
|
6957 SCHARS (string), SBYTES (string), |
| 89483 | 6958 !NILP (highest), STRING_MULTIBYTE (string), |
| 88365 | 6959 Qnil); |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6960 } |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6961 |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6962 |
| 88365 | 6963 static INLINE int |
| 6964 char_encodable_p (c, attrs) | |
| 6965 int c; | |
| 6966 Lisp_Object attrs; | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6967 { |
| 88365 | 6968 Lisp_Object tail; |
| 6969 struct charset *charset; | |
| 6970 | |
| 6971 for (tail = CODING_ATTR_CHARSET_LIST (attrs); | |
| 6972 CONSP (tail); tail = XCDR (tail)) | |
| 6973 { | |
| 6974 charset = CHARSET_FROM_ID (XINT (XCAR (tail))); | |
| 6975 if (CHAR_CHARSET_P (c, charset)) | |
| 6976 break; | |
| 6977 } | |
| 6978 return (! NILP (tail)); | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6979 } |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6980 |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6981 |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6982 /* Return a list of coding systems that safely encode the text between |
| 88365 | 6983 START and END. If EXCLUDE is non-nil, it is a list of coding |
| 6984 systems not to check. The returned list doesn't contain any such | |
|
88889
4548f224c603
(Ffind_coding_systems_region_internal): Detect an
Kenichi Handa <handa@m17n.org>
parents:
88876
diff
changeset
|
6985 coding systems. In any case, if the text contains only ASCII or is |
| 88365 | 6986 unibyte, return t. */ |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6987 |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6988 DEFUN ("find-coding-systems-region-internal", |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6989 Ffind_coding_systems_region_internal, |
| 88365 | 6990 Sfind_coding_systems_region_internal, 2, 3, 0, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
6991 doc: /* Internal use only. */) |
| 88365 | 6992 (start, end, exclude) |
| 6993 Lisp_Object start, end, exclude; | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
6994 { |
| 88365 | 6995 Lisp_Object coding_attrs_list, safe_codings; |
| 6996 EMACS_INT start_byte, end_byte; | |
|
88876
af9012fdad56
(LEADING_CODE_PRIVATE_11, LEADING_CODE_PRIVATE_12,
Kenichi Handa <handa@m17n.org>
parents:
88862
diff
changeset
|
6997 const unsigned char *p, *pbeg, *pend; |
| 88365 | 6998 int c; |
| 6999 Lisp_Object tail, elt; | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7000 |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7001 if (STRINGP (start)) |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7002 { |
| 88365 | 7003 if (!STRING_MULTIBYTE (start) |
| 89483 | 7004 || SCHARS (start) == SBYTES (start)) |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7005 return Qt; |
| 88365 | 7006 start_byte = 0; |
| 89483 | 7007 end_byte = SBYTES (start); |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7008 } |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7009 else |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7010 { |
|
40656
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
7011 CHECK_NUMBER_COERCE_MARKER (start); |
|
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
7012 CHECK_NUMBER_COERCE_MARKER (end); |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7013 if (XINT (start) < BEG || XINT (end) > Z || XINT (start) > XINT (end)) |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7014 args_out_of_range (start, end); |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7015 if (NILP (current_buffer->enable_multibyte_characters)) |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7016 return Qt; |
| 88365 | 7017 start_byte = CHAR_TO_BYTE (XINT (start)); |
| 7018 end_byte = CHAR_TO_BYTE (XINT (end)); | |
| 7019 if (XINT (end) - XINT (start) == end_byte - start_byte) | |
| 7020 return Qt; | |
| 7021 | |
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7022 if (XINT (start) < GPT && XINT (end) > GPT) |
| 88365 | 7023 { |
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7024 if ((GPT - XINT (start)) < (XINT (end) - GPT)) |
|
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7025 move_gap_both (XINT (start), start_byte); |
| 88365 | 7026 else |
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7027 move_gap_both (XINT (end), end_byte); |
| 88365 | 7028 } |
| 7029 } | |
| 7030 | |
| 7031 coding_attrs_list = Qnil; | |
| 7032 for (tail = Vcoding_system_list; CONSP (tail); tail = XCDR (tail)) | |
| 7033 if (NILP (exclude) | |
| 7034 || NILP (Fmemq (XCAR (tail), exclude))) | |
| 7035 { | |
| 7036 Lisp_Object attrs; | |
| 7037 | |
| 7038 attrs = AREF (CODING_SYSTEM_SPEC (XCAR (tail)), 0); | |
| 7039 if (EQ (XCAR (tail), CODING_ATTR_BASE_NAME (attrs)) | |
| 7040 && ! EQ (CODING_ATTR_TYPE (attrs), Qundecided)) | |
| 7041 coding_attrs_list = Fcons (attrs, coding_attrs_list); | |
| 7042 } | |
| 7043 | |
| 7044 if (STRINGP (start)) | |
| 89483 | 7045 p = pbeg = SDATA (start); |
| 88365 | 7046 else |
| 7047 p = pbeg = BYTE_POS_ADDR (start_byte); | |
| 7048 pend = p + (end_byte - start_byte); | |
| 7049 | |
| 7050 while (p < pend && ASCII_BYTE_P (*p)) p++; | |
| 7051 while (p < pend && ASCII_BYTE_P (*(pend - 1))) pend--; | |
| 7052 | |
| 7053 while (p < pend) | |
| 7054 { | |
| 7055 if (ASCII_BYTE_P (*p)) | |
| 7056 p++; | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7057 else |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7058 { |
| 88365 | 7059 c = STRING_CHAR_ADVANCE (p); |
| 7060 | |
| 7061 charset_map_loaded = 0; | |
| 7062 for (tail = coding_attrs_list; CONSP (tail);) | |
| 7063 { | |
| 7064 elt = XCAR (tail); | |
| 7065 if (NILP (elt)) | |
| 7066 tail = XCDR (tail); | |
| 7067 else if (char_encodable_p (c, elt)) | |
| 7068 tail = XCDR (tail); | |
| 7069 else if (CONSP (XCDR (tail))) | |
| 7070 { | |
| 7071 XSETCAR (tail, XCAR (XCDR (tail))); | |
| 7072 XSETCDR (tail, XCDR (XCDR (tail))); | |
| 7073 } | |
| 7074 else | |
| 7075 { | |
| 7076 XSETCAR (tail, Qnil); | |
| 7077 tail = XCDR (tail); | |
| 7078 } | |
| 7079 } | |
| 7080 if (charset_map_loaded) | |
| 7081 { | |
| 7082 EMACS_INT p_offset = p - pbeg, pend_offset = pend - pbeg; | |
| 7083 | |
| 7084 if (STRINGP (start)) | |
| 89483 | 7085 pbeg = SDATA (start); |
| 88365 | 7086 else |
| 7087 pbeg = BYTE_POS_ADDR (start_byte); | |
| 7088 p = pbeg + p_offset; | |
| 7089 pend = pbeg + pend_offset; | |
| 7090 } | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7091 } |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7092 } |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7093 |
| 88365 | 7094 safe_codings = Qnil; |
| 7095 for (tail = coding_attrs_list; CONSP (tail); tail = XCDR (tail)) | |
| 7096 if (! NILP (XCAR (tail))) | |
| 7097 safe_codings = Fcons (CODING_ATTR_BASE_NAME (XCAR (tail)), safe_codings); | |
| 7098 | |
| 7099 return safe_codings; | |
| 7100 } | |
| 7101 | |
| 7102 | |
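The inner loop of the function above prunes `coding_attrs_list` in place: a rejected cell is overwritten with the contents of its successor, and a rejected final cell is set to nil, so no pointer to the previous cell is needed. A minimal sketch of that idea follows. It assumes a plain C list in place of Lisp conses; `struct cell`, `filter_in_place`, `is_even`, and `count_live` are hypothetical names, and the value 0 stands in for nil.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Lisp cons cell; value 0 plays the role of nil.  */
struct cell
{
  int value;
  struct cell *next;
};

/* Drop every element for which KEEP returns 0, without tracking the
   previous cell.  A rejected cell is overwritten with its successor;
   a rejected final cell is tombstoned with 0.  */
static void
filter_in_place (struct cell *head, int (*keep) (int))
{
  struct cell *tail = head;

  while (tail != NULL)
    {
      if (tail->value == 0 || keep (tail->value))
        tail = tail->next;      /* already dead, or it survives */
      else if (tail->next != NULL)
        {
          /* Copy the successor into this cell and do not advance, so
             the copied value is tested on the next iteration.  */
          tail->value = tail->next->value;
          tail->next = tail->next->next;
        }
      else
        {
          tail->value = 0;      /* last cell: mark it dead */
          tail = tail->next;
        }
    }
}

/* Example predicate (an assumption for illustration only).  */
static int
is_even (int v)
{
  return v % 2 == 0;
}

/* Count the cells that were not tombstoned.  */
static int
count_live (const struct cell *head)
{
  int n = 0;

  for (; head != NULL; head = head->next)
    if (head->value != 0)
      n++;
  return n;
}
```

Because a trailing rejected cell is only marked rather than unlinked, a consumer has to skip dead cells, which is what the final loop over `coding_attrs_list` does with its `! NILP (XCAR (tail))` test.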
|
46859
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7103 DEFUN ("unencodable-char-position", Funencodable_char_position, |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7104 Sunencodable_char_position, 3, 5, 0, |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7105 doc: /* |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7106 Return position of first un-encodable character in a region. |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7107 START and END specify the region and CODING-SYSTEM specifies the |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7108 encoding to check. Return nil if CODING-SYSTEM can encode the whole region. |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7109 |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7110 If optional 4th argument COUNT is non-nil, it specifies at most how |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7111 many un-encodable characters to search for. In this case, the value is a |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7112 list of positions. |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7113 |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7114 If optional 5th argument STRING is non-nil, it is a string to search |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7115 for un-encodable characters. In that case, START and END are indexes |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7116 to the string. */) |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7117 (start, end, coding_system, count, string) |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7118 Lisp_Object start, end, coding_system, count, string; |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7119 { |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7120 int n; |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7121 struct coding_system coding; |
| 89483 | 7122 Lisp_Object attrs, charset_list; |
|
46859
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7123 Lisp_Object positions; |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7124 int from, to; |
| 89483 | 7125 const unsigned char *p, *stop, *pend; |
| 7126 int ascii_compatible; | |
| 7127 | |
| 7128 setup_coding_system (Fcheck_coding_system (coding_system), &coding); | |
| 7129 attrs = CODING_ID_ATTRS (coding.id); | |
| 7130 if (EQ (CODING_ATTR_TYPE (attrs), Qraw_text)) | |
| 7131 return Qnil; | |
| 7132 ascii_compatible = ! NILP (CODING_ATTR_ASCII_COMPAT (attrs)); | |
| 7133 charset_list = CODING_ATTR_CHARSET_LIST (attrs); | |
|
46859
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7134 |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7135 if (NILP (string)) |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7136 { |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7137 validate_region (&start, &end); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7138 from = XINT (start); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7139 to = XINT (end); |
| 89483 | 7140 if (NILP (current_buffer->enable_multibyte_characters) |
| 7141 || (ascii_compatible | |
| 7142 && (to - from) == (CHAR_TO_BYTE (to) - (CHAR_TO_BYTE (from))))) | |
|
46859
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7143 return Qnil; |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7144 p = CHAR_POS_ADDR (from); |
| 89483 | 7145 pend = CHAR_POS_ADDR (to); |
| 7146 if (from < GPT && to >= GPT) | |
| 7147 stop = GPT_ADDR; | |
|
48829
f6c59ca557c7
(Funencodable_char_position): Set pend correctly.
Kenichi Handa <handa@m17n.org>
parents:
48230
diff
changeset
|
7148 else |
| 89483 | 7149 stop = pend; |
|
46859
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7150 } |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7151 else |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7152 { |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7153 CHECK_STRING (string); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7154 CHECK_NATNUM (start); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7155 CHECK_NATNUM (end); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7156 from = XINT (start); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7157 to = XINT (end); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7158 if (from > to |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7159 || to > SCHARS (string)) |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7160 args_out_of_range_3 (string, start, end); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7161 if (! STRING_MULTIBYTE (string)) |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7162 return Qnil; |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7163 p = SDATA (string) + string_char_to_byte (string, from); |
| 89483 | 7164 stop = pend = SDATA (string) + string_char_to_byte (string, to); |
| 7165 if (ascii_compatible && (to - from) == (pend - p)) | |
| 7166 return Qnil; | |
| 7167 } | |
|
46859
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7168 |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7169 if (NILP (count)) |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7170 n = 1; |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7171 else |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7172 { |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7173 CHECK_NATNUM (count); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7174 n = XINT (count); |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7175 } |
|
a26dd8891732
(unencodable_char_position): New function.
Kenichi Handa <handa@m17n.org>
parents:
46839
diff
changeset
|
7176 |
| 89483 | 7177 positions = Qnil; |
| 7178 while (1) | |
| 7179 { | |
| 7180 int c; | |
| 7181 | |
| 7182 if (ascii_compatible) | |
| 7183 while (p < stop && ASCII_BYTE_P (*p)) | |
| 7184 p++, from++; | |
| 7185 if (p >= stop) | |
| 7186 { | |
| 7187 if (p >= pend) | |
| 7188 break; | |
| 7189 stop = pend; | |
| 7190 p = GAP_END_ADDR; | |
| 7191 } | |
| 7192 | |
| 7193 c = STRING_CHAR_ADVANCE (p); | |
| 7194 if (! (ASCII_CHAR_P (c) && ascii_compatible) | |
| 7195 && ! char_charset (c, charset_list, NULL)) | |
| 7196 { | |
| 7197 positions = Fcons (make_number (from), positions); | |
| 7198 n--; | |
| 7199 if (n == 0) | |
| 7200 break; | |
| 7201 } | |
| 7202 | |
| 7203 from++; | |
| 7204 } | |
| 7205 | |
| 7206 return (NILP (count) ? Fcar (positions) : Fnreverse (positions)); | |
| 7207 } | |
| 7208 | |
| 7209 | |
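The scan in `unencodable-char-position` skips runs of ASCII bytes cheaply, decodes one multibyte character at a time, and records the character index of each character the coding system cannot represent until COUNT positions have been collected. A minimal sketch of that loop follows. It assumes plain, well-formed UTF-8 as a stand-in for the internal multibyte representation, uses a pretend Latin-1 target in place of `char_charset`, and omits the buffer-gap handling; `utf8_advance`, `latin1_encodable`, and `unencodable_positions` are hypothetical names.

```c
#include <stddef.h>

/* Decode one UTF-8 sequence starting at *P and advance *P past it.
   Assumes well-formed input; a simplified stand-in for
   STRING_CHAR_ADVANCE.  */
static unsigned
utf8_advance (const unsigned char **p)
{
  const unsigned char *s = *p;
  unsigned c = *s++;
  int extra = 0;

  if (c >= 0xF0)
    c &= 0x07, extra = 3;
  else if (c >= 0xE0)
    c &= 0x0F, extra = 2;
  else if (c >= 0xC0)
    c &= 0x1F, extra = 1;
  while (extra-- > 0)
    c = (c << 6) | (*s++ & 0x3F);
  *p = s;
  return c;
}

/* Hypothetical encodability test: pretend the target is Latin-1.  */
static int
latin1_encodable (unsigned c)
{
  return c <= 0xFF;
}

/* Store up to MAX character indices of unencodable characters in OUT
   and return how many were stored.  */
static size_t
unencodable_positions (const unsigned char *text, size_t nbytes,
                       size_t *out, size_t max)
{
  const unsigned char *p = text, *pend = text + nbytes;
  size_t pos = 0, found = 0;

  while (found < max)
    {
      /* Fast path: ASCII bytes are always encodable in this sketch.  */
      while (p < pend && *p < 0x80)
        p++, pos++;
      if (p >= pend)
        break;
      if (! latin1_encodable (utf8_advance (&p)))
        out[found++] = pos;
      pos++;
    }
  return found;
}
```

Passing `max` as 1 mirrors the COUNT-is-nil case, where only the first offending position is wanted.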
| 88365 | 7210 DEFUN ("check-coding-systems-region", Fcheck_coding_systems_region, |
| 7211 Scheck_coding_systems_region, 3, 3, 0, | |
| 7212 doc: /* Check if the region is encodable by coding systems. | |
| 7213 | |
| 7214 START and END are buffer positions specifying the region. | |
| 7215 CODING-SYSTEM-LIST is a list of coding systems to check. | |
| 7216 | |
| 7217 The value is an alist ((CODING-SYSTEM POS0 POS1 ...) ...), where | |
| 7218 CODING-SYSTEM is a member of CODING-SYSTEM-LIST and can't encode the | |
| 7219 whole region, POS0, POS1, ... are buffer positions where non-encodable | |
| 7220 characters are found. | |
| 7221 | |
| 7222 If all coding systems in CODING-SYSTEM-LIST can encode the region, the | |
| 7223 value is nil. | |
| 7224 | |
| 7225 START may be a string. In that case, check if the string is | |
| 7226 encodable, and the value contains indices to the string instead of | |
| 7227 buffer positions. END is ignored. */) | |
| 7228 (start, end, coding_system_list) | |
| 7229 Lisp_Object start, end, coding_system_list; | |
| 7230 { | |
| 7231 Lisp_Object list; | |
| 7232 EMACS_INT start_byte, end_byte; | |
| 7233 int pos; | |
|
88876
af9012fdad56
(LEADING_CODE_PRIVATE_11, LEADING_CODE_PRIVATE_12,
Kenichi Handa <handa@m17n.org>
parents:
88862
diff
changeset
|
7234 const unsigned char *p, *pbeg, *pend; |
| 88365 | 7235 int c; |
| 7236 Lisp_Object tail, elt; | |
| 7237 | |
| 7238 if (STRINGP (start)) | |
| 7239 { | |
| 7240 if (!STRING_MULTIBYTE (start) | |
| 89483 | 7241 && SCHARS (start) != SBYTES (start)) |
| 88365 | 7242 return Qnil; |
| 7243 start_byte = 0; | |
| 89483 | 7244 end_byte = SBYTES (start); |
| 88365 | 7245 pos = 0; |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7246 } |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7247 else |
| 88365 | 7248 { |
| 7249 CHECK_NUMBER_COERCE_MARKER (start); | |
| 7250 CHECK_NUMBER_COERCE_MARKER (end); | |
| 7251 if (XINT (start) < BEG || XINT (end) > Z || XINT (start) > XINT (end)) | |
| 7252 args_out_of_range (start, end); | |
| 7253 if (NILP (current_buffer->enable_multibyte_characters)) | |
| 7254 return Qnil; | |
| 7255 start_byte = CHAR_TO_BYTE (XINT (start)); | |
| 7256 end_byte = CHAR_TO_BYTE (XINT (end)); | |
| 7257 if (XINT (end) - XINT (start) == end_byte - start_byte) | |
| 7258 return Qnil; | |
| 7259 | |
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7260 if (XINT (start) < GPT && XINT (end) > GPT) |
| 88365 | 7261 { |
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7262 if ((GPT - XINT (start)) < (XINT (end) - GPT)) |
|
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7263 move_gap_both (XINT (start), start_byte); |
| 88365 | 7264 else |
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7265 move_gap_both (XINT (end), end_byte); |
| 88365 | 7266 } |
|
89394
6ffca50f25b9
(Fcheck_coding_systems_region): Fix type errors.
Dave Love <fx@gnu.org>
parents:
89373
diff
changeset
|
7267 pos = XINT (start); |
| 88365 | 7268 } |
| 7269 | |
| 7270 list = Qnil; | |
| 7271 for (tail = coding_system_list; CONSP (tail); tail = XCDR (tail)) | |
| 7272 { | |
| 7273 elt = XCAR (tail); | |
| 7274 list = Fcons (Fcons (elt, Fcons (AREF (CODING_SYSTEM_SPEC (elt), 0), | |
| 7275 Qnil)), | |
| 7276 list); | |
| 7277 } | |
| 7278 | |
| 7279 if (STRINGP (start)) | |
| 89483 | 7280 p = pbeg = SDATA (start); |
| 88365 | 7281 else |
| 7282 p = pbeg = BYTE_POS_ADDR (start_byte); | |
| 7283 pend = p + (end_byte - start_byte); | |
| 7284 | |
| 7285 while (p < pend && ASCII_BYTE_P (*p)) p++, pos++; | |
| 7286 while (p < pend && ASCII_BYTE_P (*(pend - 1))) pend--; | |
| 7287 | |
| 7288 while (p < pend) | |
| 7289 { | |
| 7290 if (ASCII_BYTE_P (*p)) | |
| 7291 p++; | |
| 7292 else | |
| 7293 { | |
| 7294 c = STRING_CHAR_ADVANCE (p); | |
| 7295 | |
| 7296 charset_map_loaded = 0; | |
| 7297 for (tail = list; CONSP (tail); tail = XCDR (tail)) | |
| 7298 { | |
| 7299 elt = XCDR (XCAR (tail)); | |
| 7300 if (! char_encodable_p (c, XCAR (elt))) | |
| 7301 XSETCDR (elt, Fcons (make_number (pos), XCDR (elt))); | |
| 7302 } | |
| 7303 if (charset_map_loaded) | |
| 7304 { | |
| 7305 EMACS_INT p_offset = p - pbeg, pend_offset = pend - pbeg; | |
| 7306 | |
| 7307 if (STRINGP (start)) | |
| 89483 | 7308 pbeg = SDATA (start); |
| 88365 | 7309 else |
| 7310 pbeg = BYTE_POS_ADDR (start_byte); | |
| 7311 p = pbeg + p_offset; | |
| 7312 pend = pbeg + pend_offset; | |
| 7313 } | |
| 7314 } | |
| 7315 pos++; | |
| 7316 } | |
| 7317 | |
| 7318 tail = list; | |
| 7319 list = Qnil; | |
| 7320 for (; CONSP (tail); tail = XCDR (tail)) | |
| 7321 { | |
| 7322 elt = XCAR (tail); | |
| 7323 if (CONSP (XCDR (XCDR (elt)))) | |
| 7324 list = Fcons (Fcons (XCAR (elt), Fnreverse (XCDR (XCDR (elt)))), | |
| 7325 list); | |
| 7326 } | |
| 7327 | |
| 7328 return list; | |
|
30487
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7329 } |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7330 |
|
6165da9c89c6
(Qsafe_charsets): This variable deleted.
Kenichi Handa <handa@m17n.org>
parents:
30384
diff
changeset
|
7331 |
| 88365 | 7332 |
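Both region-checking functions above guard against `char_encodable_p` loading a charset map, which may relocate the text being scanned: when `charset_map_loaded` is set, they re-derive `p` and `pend` from byte offsets relative to a freshly fetched `pbeg`. A minimal sketch of that rebase pattern follows. It uses `realloc` as a stand-in for whatever moves the storage and saves the offsets before the move; `struct scan`, `grow_storage`, and `grow_and_rebase` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Cursors into a block of text that may be relocated.  */
struct scan
{
  unsigned char *pbeg;          /* start of the text */
  unsigned char *p;             /* current scan position */
  unsigned char *pend;          /* end of the text */
};

/* Hypothetical operation that may move the storage.  Returns the new
   base, or NULL on failure (the old block then remains valid).  */
static unsigned char *
grow_storage (unsigned char *base, size_t new_size)
{
  return realloc (base, new_size);
}

/* Grow the block behind S while keeping its cursors meaningful: record
   P and PEND as offsets from PBEG, then rebuild them from the new base.
   NEW_SIZE must be at least the old PEND offset.  */
static int
grow_and_rebase (struct scan *s, size_t new_size)
{
  size_t p_offset = (size_t) (s->p - s->pbeg);
  size_t pend_offset = (size_t) (s->pend - s->pbeg);
  unsigned char *base = grow_storage (s->pbeg, new_size);

  if (base == NULL)
    return -1;
  s->pbeg = base;
  s->p = base + p_offset;
  s->pend = base + pend_offset;
  return 0;
}
```

Storing offsets rather than raw pointers across the call is the design point: an offset stays valid no matter where the block ends up.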
|
20803
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7333 Lisp_Object |
| 88365 | 7334 code_convert_region (start, end, coding_system, dst_object, encodep, norecord) |
| 7335 Lisp_Object start, end, coding_system, dst_object; | |
| 7336 int encodep, norecord; | |
|
20680
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7337 { |
|
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7338 struct coding_system coding; |
| 88365 | 7339 EMACS_INT from, from_byte, to, to_byte; |
| 7340 Lisp_Object src_object; | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
7341 |
|
40656
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
7342 CHECK_NUMBER_COERCE_MARKER (start); |
|
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
7343 CHECK_NUMBER_COERCE_MARKER (end); |
| 88365 | 7344 if (NILP (coding_system)) |
| 7345 coding_system = Qno_conversion; | |
| 7346 else | |
| 7347 CHECK_CODING_SYSTEM (coding_system); | |
| 7348 src_object = Fcurrent_buffer (); | |
| 7349 if (NILP (dst_object)) | |
| 7350 dst_object = src_object; | |
| 7351 else if (! EQ (dst_object, Qt)) | |
| 7352 CHECK_BUFFER (dst_object); | |
|
20680
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7353 |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
7354 validate_region (&start, &end); |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
7355 from = XFASTINT (start); |
| 88365 | 7356 from_byte = CHAR_TO_BYTE (from); |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
7357 to = XFASTINT (end); |
| 88365 | 7358 to_byte = CHAR_TO_BYTE (to); |
| 7359 | |
| 7360 setup_coding_system (coding_system, &coding); | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
7361 coding.mode |= CODING_MODE_LAST_BLOCK; |
| 88365 | 7362 |
| 7363 if (encodep) | |
| 7364 encode_coding_object (&coding, src_object, from, from_byte, to, to_byte, | |
| 7365 dst_object); | |
| 7366 else | |
| 7367 decode_coding_object (&coding, src_object, from, from_byte, to, to_byte, | |
| 7368 dst_object); | |
| 7369 if (! norecord) | |
| 7370 Vlast_coding_system_used = CODING_ID_NAME (coding.id); | |
| 7371 | |
| 7372 if (coding.result != CODING_RESULT_SUCCESS) | |
| 7373 error ("Code conversion error: %d", coding.result); | |
| 7374 | |
| 7375 return (BUFFERP (dst_object) | |
| 7376 ? make_number (coding.produced_char) | |
| 7377 : coding.dst_object); | |
|
20803
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7378 } |
|
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7379 |
| 88365 | 7380 |
|
20803
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7381 DEFUN ("decode-coding-region", Fdecode_coding_region, Sdecode_coding_region, |
| 88365 | 7382 3, 4, "r\nzCoding system: ", |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7383 doc: /* Decode the current region from the specified coding system. |
| 88365 | 7384 When called from a program, takes four arguments: |
| 7385 START, END, CODING-SYSTEM, and DESTINATION. | |
| 7386 START and END are buffer positions. | |
| 7387 | |
| 7388 Optional 4th argument DESTINATION specifies where the decoded text goes. | |
| 7389 If nil, the region between START and END is replaced by the decoded text. | |
| 7390 If a buffer, the decoded text is inserted in that buffer. | |
| 7391 If t, the decoded text is returned. | |
| 7392 | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7393 This function sets `last-coding-system-used' to the precise coding system |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7394 used (which may be different from CODING-SYSTEM if CODING-SYSTEM is |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7395 not fully specified.) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7396 It returns the length of the decoded text. */) |
| 88365 | 7397 (start, end, coding_system, destination) |
| 7398 Lisp_Object start, end, coding_system, destination; | |
|
20803
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7399 { |
| 88365 | 7400 return code_convert_region (start, end, coding_system, destination, 0, 0); |
|
20680
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7401 } |
|
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7402 |
|
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7403 DEFUN ("encode-coding-region", Fencode_coding_region, Sencode_coding_region, |
| 88365 | 7404 3, 4, "r\nzCoding system: ", |
| 7405 doc: /* Encode the current region by specified coding system. | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7406 When called from a program, takes four arguments: |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7407 START, END, CODING-SYSTEM, and DESTINATION. START and END are buffer positions. |
| 88365 | 7408 |
| 7409 Optional 4th argument DESTINATION specifies where the encoded text goes. | |
| 7410 If nil, the region between START and END is replaced by the encoded text. | |
| 7411 If a buffer, the encoded text is inserted in that buffer. | |
| 7412 If t, the encoded text is returned. | |
| 7413 | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7414 This function sets `last-coding-system-used' to the precise coding system |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7415 used (which may be different from CODING-SYSTEM if CODING-SYSTEM is |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7416 not fully specified.) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7417 It returns the length of the encoded text. */) |
| 88365 | 7418 (start, end, coding_system, destination) |
| 7419 Lisp_Object start, end, coding_system, destination; | |
|
20680
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7420 { |
| 88365 | 7421 return code_convert_region (start, end, coding_system, destination, 1, 0); |
| 17052 | 7422 } |
| 7423 | |
|
20803
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7424 Lisp_Object |
| 88365 | 7425 code_convert_string (string, coding_system, dst_object, |
| 7426 encodep, nocopy, norecord) | |
| 7427 Lisp_Object string, coding_system, dst_object; | |
| 7428 int encodep, nocopy, norecord; | |
| 17052 | 7429 { |
| 7430 struct coding_system coding; | |
| 88365 | 7431 EMACS_INT chars, bytes; |
| 17052 | 7432 |
|
40656
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
7433 CHECK_STRING (string); |
| 88365 | 7434 if (NILP (coding_system)) |
| 7435 { | |
| 7436 if (! norecord) | |
| 7437 Vlast_coding_system_used = Qno_conversion; | |
| 7438 if (NILP (dst_object)) | |
| 7439 return (nocopy ? string : Fcopy_sequence (string)); | |
| 7440 } | |
| 17052 | 7441 |
|
17119
2cfb31c15ced
(create_process, Fopen_network_stream): Typo in indexes
Kenichi Handa <handa@m17n.org>
parents:
17071
diff
changeset
|
7442 if (NILP (coding_system)) |
| 88365 | 7443 coding_system = Qno_conversion; |
| 7444 else | |
| 7445 CHECK_CODING_SYSTEM (coding_system); | |
| 7446 if (NILP (dst_object)) | |
| 7447 dst_object = Qt; | |
| 7448 else if (! EQ (dst_object, Qt)) | |
| 7449 CHECK_BUFFER (dst_object); | |
| 7450 | |
| 7451 setup_coding_system (coding_system, &coding); | |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
7452 coding.mode |= CODING_MODE_LAST_BLOCK; |
| 89483 | 7453 chars = SCHARS (string); |
| 7454 bytes = SBYTES (string); | |
| 88365 | 7455 if (encodep) |
| 7456 encode_coding_object (&coding, string, 0, 0, chars, bytes, dst_object); | |
| 7457 else | |
| 7458 decode_coding_object (&coding, string, 0, 0, chars, bytes, dst_object); | |
| 7459 if (! norecord) | |
| 7460 Vlast_coding_system_used = CODING_ID_NAME (coding.id); | |
| 7461 | |
| 7462 if (coding.result != CODING_RESULT_SUCCESS) | |
| 7463 error ("Code conversion error: %d", coding.result); | |
| 7464 | |
| 7465 return (BUFFERP (dst_object) | |
| 7466 ? make_number (coding.produced_char) | |
| 7467 : coding.dst_object); | |
|
20803
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7468 } |
|
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7469 |
|
0fa2183c587d
(ENCODE_ISO_CHARACTER): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
20794
diff
changeset
|
7470 |
|
22341
572ba933a4bf
(code_convert_string_norecord): New function.
Karl Heuer <kwzh@gnu.org>
parents:
22329
diff
changeset
|
7471 /* Encode or decode STRING according to CODING_SYSTEM. |
| 26847 | 7472 Do not set Vlast_coding_system_used. |
| 7473 | |
| 7474 This function is called only from macros DECODE_FILE and | |
| 7475 ENCODE_FILE, thus we ignore character composition. */ | |
|
22341
572ba933a4bf
(code_convert_string_norecord): New function.
Karl Heuer <kwzh@gnu.org>
parents:
22329
diff
changeset
|
7476 |
|
572ba933a4bf
(code_convert_string_norecord): New function.
Karl Heuer <kwzh@gnu.org>
parents:
22329
diff
changeset
|
7477 Lisp_Object |
|
572ba933a4bf
(code_convert_string_norecord): New function.
Karl Heuer <kwzh@gnu.org>
parents:
22329
diff
changeset
|
7478 code_convert_string_norecord (string, coding_system, encodep) |
|
572ba933a4bf
(code_convert_string_norecord): New function.
Karl Heuer <kwzh@gnu.org>
parents:
22329
diff
changeset
|
7479 Lisp_Object string, coding_system; |
|
572ba933a4bf
(code_convert_string_norecord): New function.
Karl Heuer <kwzh@gnu.org>
parents:
22329
diff
changeset
|
7480 int encodep; |
|
572ba933a4bf
(code_convert_string_norecord): New function.
Karl Heuer <kwzh@gnu.org>
parents:
22329
diff
changeset
|
7481 { |
|
88430
6418a272b97e
* coding.c: Delete unused variables.
Kenichi Handa <handa@m17n.org>
parents:
88365
diff
changeset
|
7482 return code_convert_string (string, coding_system, Qt, encodep, 0, 1); |
|
22341
572ba933a4bf
(code_convert_string_norecord): New function.
Karl Heuer <kwzh@gnu.org>
parents:
22329
diff
changeset
|
7483 } |
| 88365 | 7484 |
| 7485 | |
| 7486 DEFUN ("decode-coding-string", Fdecode_coding_string, Sdecode_coding_string, | |
| 7487 2, 4, 0, | |
| 7488 doc: /* Decode STRING which is encoded in CODING-SYSTEM, and return the result. | |
| 7489 | |
| 7490 Optional third arg NOCOPY non-nil means it is OK to return STRING itself | |
| 7491 if the decoding operation is trivial. | |
| 7492 | |
| 7493 Optional fourth arg BUFFER non-nil means that the decoded text is | |
|
88845
64b8f6168269
(Fset_coding_system_priority): Allow null arg list.
Dave Love <fx@gnu.org>
parents:
88771
diff
changeset
|
7494 inserted in BUFFER instead of returned as a string. In this case, |
| 88365 | 7495 the return value is BUFFER. |
| 7496 | |
| 7497 This function sets `last-coding-system-used' to the precise coding system | |
| 7498 used (which may be different from CODING-SYSTEM if CODING-SYSTEM is | |
| 7499 not fully specified.) */) | |
| 7500 (string, coding_system, nocopy, buffer) | |
| 7501 Lisp_Object string, coding_system, nocopy, buffer; | |
| 7502 { | |
| 7503 return code_convert_string (string, coding_system, buffer, | |
| 7504 0, ! NILP (nocopy), 0); | |
| 7505 } | |
| 7506 | |
| 7507 DEFUN ("encode-coding-string", Fencode_coding_string, Sencode_coding_string, | |
| 7508 2, 4, 0, | |
| 7509 doc: /* Encode STRING to CODING-SYSTEM, and return the result. | |
| 7510 | |
| 7511 Optional third arg NOCOPY non-nil means it is OK to return STRING | |
| 7512 itself if the encoding operation is trivial. | |
| 7513 | |
| 7514 Optional fourth arg BUFFER non-nil means that the encoded text is | |
|
88845
64b8f6168269
(Fset_coding_system_priority): Allow null arg list.
Dave Love <fx@gnu.org>
parents:
88771
diff
changeset
|
7515 inserted in BUFFER instead of returned as a string. In this case, |
| 88365 | 7516 the return value is BUFFER. |
| 7517 | |
| 7518 This function sets `last-coding-system-used' to the precise coding system | |
| 7519 used (which may be different from CODING-SYSTEM if CODING-SYSTEM is | |
| 7520 not fully specified.) */) | |
| 7521 (string, coding_system, nocopy, buffer) | |
| 7522 Lisp_Object string, coding_system, nocopy, buffer; | |
| 7523 { | |
| 7524 return code_convert_string (string, coding_system, buffer, | |
| 88856 | 7525 1, ! NILP (nocopy), 1); |
| 88365 | 7526 } |
| 7527 | |
|
20680
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7528 |
| 17052 | 7529 DEFUN ("decode-sjis-char", Fdecode_sjis_char, Sdecode_sjis_char, 1, 1, 0, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7530 doc: /* Decode a Japanese character which has CODE in shift_jis encoding. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7531 Return the corresponding character. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7532 (code) |
| 17052 | 7533 Lisp_Object code; |
| 7534 { | |
| 88365 | 7535 Lisp_Object spec, attrs, val; |
| 7536 struct charset *charset_roman, *charset_kanji, *charset_kana, *charset; | |
| 7537 int c; | |
| 7538 | |
| 7539 CHECK_NATNUM (code); | |
| 7540 c = XFASTINT (code); | |
| 7541 CHECK_CODING_SYSTEM_GET_SPEC (Vsjis_coding_system, spec); | |
| 7542 attrs = AREF (spec, 0); | |
| 7543 | |
| 7544 if (ASCII_BYTE_P (c) | |
| 7545 && ! NILP (CODING_ATTR_ASCII_COMPAT (attrs))) | |
| 7546 return code; | |
| 7547 | |
| 7548 val = CODING_ATTR_CHARSET_LIST (attrs); | |
| 7549 charset_roman = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); | |
|
88497
d2b9e0d4c2f6
(Fdecode_sjis_char): Fix typo (0x7F->0xFF). Fix the
Kenichi Handa <handa@m17n.org>
parents:
88485
diff
changeset
|
7550 charset_kana = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); |
|
d2b9e0d4c2f6
(Fdecode_sjis_char): Fix typo (0x7F->0xFF). Fix the
Kenichi Handa <handa@m17n.org>
parents:
88485
diff
changeset
|
7551 charset_kanji = CHARSET_FROM_ID (XINT (XCAR (val))); |
| 88365 | 7552 |
| 7553 if (c <= 0x7F) | |
| 7554 charset = charset_roman; | |
| 7555 else if (c >= 0xA0 && c < 0xDF) | |
| 7556 { | |
| 7557 charset = charset_kana; | |
| 7558 c -= 0x80; | |
|
24065
7e291dea6141
(Fdecode_sjis_char): Decode Japanese Katakana character
Kenichi Handa <handa@m17n.org>
parents:
24056
diff
changeset
|
7559 } |
|
7e291dea6141
(Fdecode_sjis_char): Decode Japanese Katakana character
Kenichi Handa <handa@m17n.org>
parents:
24056
diff
changeset
|
7560 else |
|
7e291dea6141
(Fdecode_sjis_char): Decode Japanese Katakana character
Kenichi Handa <handa@m17n.org>
parents:
24056
diff
changeset
|
7561 { |
|
88497
d2b9e0d4c2f6
(Fdecode_sjis_char): Fix typo (0x7F->0xFF). Fix the
Kenichi Handa <handa@m17n.org>
parents:
88485
diff
changeset
|
7562 int s1 = c >> 8, s2 = c & 0xFF; |
| 88365 | 7563 |
| 7564 if (s1 < 0x81 || (s1 > 0x9F && s1 < 0xE0) || s1 > 0xEF | |
| 7565 || s2 < 0x40 || s2 == 0x7F || s2 > 0xFC) | |
| 7566 error ("Invalid code: %d", XFASTINT (code)); | |
| 7567 SJIS_TO_JIS (c); | |
| 7568 charset = charset_kanji; | |
| 7569 } | |
| 7570 c = DECODE_CHAR (charset, c); | |
| 7571 if (c < 0) | |
| 7572 error ("Invalid code: %d", XFASTINT (code)); | |
| 7573 return make_number (c); | |
| 17052 | 7574 } |
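
The two-byte branch of Fdecode_sjis_char above first range-checks the lead and trail bytes and then converts the code with SJIS_TO_JIS, whose definition is not part of this excerpt. The following is a minimal standalone sketch of those two steps, assuming the usual Shift_JIS to JIS X 0208 arithmetic; the helper names `sjis_two_byte_valid_p` and `sjis_to_jis` are hypothetical and do not exist in coding.c.

```c
#include <stdbool.h>

/* Hypothetical helper: same range test as the listing applies to a
   two-byte Shift_JIS code (lead byte S1, trail byte S2).  */
static bool
sjis_two_byte_valid_p (int code)
{
  int s1 = code >> 8, s2 = code & 0xFF;

  return !(s1 < 0x81 || (s1 > 0x9F && s1 < 0xE0) || s1 > 0xEF
           || s2 < 0x40 || s2 == 0x7F || s2 > 0xFC);
}

/* Hypothetical helper: map a valid two-byte Shift_JIS code to the
   corresponding JIS X 0208 code, returned as (j1 << 8) | j2.
   Assumption: this is the conventional conversion, not a copy of
   the SJIS_TO_JIS macro, which the excerpt does not show.  */
static int
sjis_to_jis (int code)
{
  int s1 = code >> 8, s2 = code & 0xFF;
  int j1, j2;

  if (s2 >= 0x9F)
    {
      /* Trail bytes 0x9F..0xFC select the even JIS row.  */
      j1 = s1 * 2 - (s1 >= 0xE0 ? 0x160 : 0xE0);
      j2 = s2 - 0x7E;
    }
  else
    {
      /* Trail bytes 0x40..0x9E (skipping 0x7F) select the odd row.  */
      j1 = s1 * 2 - (s1 >= 0xE0 ? 0x161 : 0xE1);
      j2 = s2 - (s2 >= 0x7F ? 0x20 : 0x1F);
    }
  return (j1 << 8) | j2;
}
```

For instance, under these assumptions the Shift_JIS code 0x82A0 maps to JIS 0x2422, while 0x817F is rejected by the range test because 0x7F is never a valid trail byte.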
| 7575 | |
| 88365 | 7576 |
| 17052 | 7577 DEFUN ("encode-sjis-char", Fencode_sjis_char, Sencode_sjis_char, 1, 1, 0, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7578 doc: /* Encode a Japanese character CHAR to shift_jis encoding. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7579 Return the corresponding code in SJIS. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7580 (ch) |
| 88365 | 7581 Lisp_Object ch; |
| 17052 | 7582 { |
| 88365 | 7583 Lisp_Object spec, attrs, charset_list; |
| 7584 int c; | |
| 7585 struct charset *charset; | |
| 7586 unsigned code; | |
| 7587 | |
| 7588 CHECK_CHARACTER (ch); | |
| 7589 c = XFASTINT (ch); | |
| 7590 CHECK_CODING_SYSTEM_GET_SPEC (Vsjis_coding_system, spec); | |
| 7591 attrs = AREF (spec, 0); | |
| 7592 | |
| 7593 if (ASCII_CHAR_P (c) | |
| 7594 && ! NILP (CODING_ATTR_ASCII_COMPAT (attrs))) | |
| 7595 return ch; | |
| 7596 | |
| 7597 charset_list = CODING_ATTR_CHARSET_LIST (attrs); | |
| 7598 charset = char_charset (c, charset_list, &code); | |
| 7599 if (code == CHARSET_INVALID_CODE (charset)) | |
| 7600 error ("Can't encode by shift_jis encoding: %d", c); | |
| 7601 JIS_TO_SJIS (code); | |
| 7602 | |
| 7603 return make_number (code); | |
| 17052 | 7604 } |
| 7605 | |
| 7606 DEFUN ("decode-big5-char", Fdecode_big5_char, Sdecode_big5_char, 1, 1, 0, | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7607 doc: /* Decode a Big5 character which has CODE in BIG5 coding system. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7608 Return the corresponding character. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7609 (code) |
| 17052 | 7610 Lisp_Object code; |
| 7611 { | |
| 88365 | 7612 Lisp_Object spec, attrs, val; |
| 7613 struct charset *charset_roman, *charset_big5, *charset; | |
| 7614 int c; | |
| 7615 | |
| 7616 CHECK_NATNUM (code); | |
| 7617 c = XFASTINT (code); | |
| 7618 CHECK_CODING_SYSTEM_GET_SPEC (Vbig5_coding_system, spec); | |
| 7619 attrs = AREF (spec, 0); | |
| 7620 | |
| 7621 if (ASCII_BYTE_P (c) | |
| 7622 && ! NILP (CODING_ATTR_ASCII_COMPAT (attrs))) | |
| 7623 return code; | |
| 7624 | |
| 7625 val = CODING_ATTR_CHARSET_LIST (attrs); | |
| 7626 charset_roman = CHARSET_FROM_ID (XINT (XCAR (val))), val = XCDR (val); | |
| 7627 charset_big5 = CHARSET_FROM_ID (XINT (XCAR (val))); | |
| 7628 | |
| 7629 if (c <= 0x7F) | |
| 7630 charset = charset_roman; | |
|
24324
2eec590faf26
(Fdecode_sjis_char, Fencode_sjis_char): Hanlde
Kenichi Handa <handa@m17n.org>
parents:
24316
diff
changeset
|
7631 else |
|
2eec590faf26
(Fdecode_sjis_char, Fencode_sjis_char): Hanlde
Kenichi Handa <handa@m17n.org>
parents:
24316
diff
changeset
|
7632 { |
| 88365 | 7633 int b1 = c >> 8, b2 = c & 0x7F; |
| 7634 if (b1 < 0xA1 || b1 > 0xFE | |
| 7635 || b2 < 0x40 || (b2 > 0x7E && b2 < 0xA1) || b2 > 0xFE) | |
| 7636 error ("Invalid code: %d", XFASTINT (code)); | |
| 7637 charset = charset_big5; | |
| 7638 } | |
| 7639 c = DECODE_CHAR (charset, (unsigned) c); | |
| 7640 if (c < 0) | |
| 7641 error ("Invalid code: %d", XFASTINT (code)); | |
| 7642 return make_number (c); | |
| 17052 | 7643 } |
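
Fdecode_big5_char above rejects two-byte codes whose lead or trail byte falls outside the Big5 ranges. The following is a minimal standalone sketch of such a range test. It assumes the trail byte is the whole low byte of the code (mask 0xFF), whereas the listing masks with 0x7F; which mask is intended is not settled here. The helper name `big5_two_byte_valid_p` is hypothetical and does not exist in coding.c.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of coding.c.  Assumption: the trail
   byte B2 is the full low byte of CODE.  */
static bool
big5_two_byte_valid_p (int code)
{
  int b1 = code >> 8, b2 = code & 0xFF;

  /* Lead byte must lie in 0xA1..0xFE.  */
  if (b1 < 0xA1 || b1 > 0xFE)
    return false;
  /* Trail byte must lie in 0x40..0x7E or 0xA1..0xFE.  */
  return (b2 >= 0x40 && b2 <= 0x7E) || (b2 >= 0xA1 && b2 <= 0xFE);
}
```

With the full-byte mask, both trail-byte ranges named in the condition are reachable, so a code such as 0xA4A1 passes while 0xA47F does not.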
| 7644 | |
| 7645 DEFUN ("encode-big5-char", Fencode_big5_char, Sencode_big5_char, 1, 1, 0, | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7646 doc: /* Encode the Big5 character CHAR to BIG5 coding system. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7647 Return the corresponding character code in Big5. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7648 (ch) |
| 17052 | 7649 Lisp_Object ch; |
| 7650 { | |
| 88365 | 7651 Lisp_Object spec, attrs, charset_list; |
| 7652 struct charset *charset; | |
| 7653 int c; | |
| 7654 unsigned code; | |
| 7655 | |
| 7656 CHECK_CHARACTER (ch); | |
| 7657 c = XFASTINT (ch); | |
| 7658 CHECK_CODING_SYSTEM_GET_SPEC (Vbig5_coding_system, spec); | |
| 7659 attrs = AREF (spec, 0); | |
| 7660 if (ASCII_CHAR_P (c) | |
| 7661 && ! NILP (CODING_ATTR_ASCII_COMPAT (attrs))) | |
| 7662 return ch; | |
| 7663 | |
| 7664 charset_list = CODING_ATTR_CHARSET_LIST (attrs); | |
| 7665 charset = char_charset (c, charset_list, &code); | |
| 7666 if (code == CHARSET_INVALID_CODE (charset)) | |
| 7667 error ("Can't encode by Big5 encoding: %d", c); | |
| 7668 | |
| 7669 return make_number (code); | |
| 17052 | 7670 } |
| 88365 | 7671 |
|
20680
dd46027e8412
(code_convert_region): Always count chars inserted
Richard M. Stallman <rms@gnu.org>
parents:
20668
diff
changeset
|
7672 |
|
18002
a14261786239
(encode_invocation_designation): Use macro
Kenichi Handa <handa@m17n.org>
parents:
17835
diff
changeset
|
7673 DEFUN ("set-terminal-coding-system-internal", |
|
a14261786239
(encode_invocation_designation): Use macro
Kenichi Handa <handa@m17n.org>
parents:
17835
diff
changeset
|
7674 Fset_terminal_coding_system_internal, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7675 Sset_terminal_coding_system_internal, 1, 1, 0, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7676 doc: /* Internal use only. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7677 (coding_system) |
| 88473 | 7678 Lisp_Object coding_system; |
| 17052 | 7679 { |
|
40656
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
7680 CHECK_SYMBOL (coding_system); |
| 88365 | 7681 setup_coding_system (Fcheck_coding_system (coding_system), |
| 7682 &terminal_coding); | |
| 89483 | 7683 |
|
20150
402b6e5f4b58
(encode_designation_at_bol): Fix bug of finding graphic
Kenichi Handa <handa@m17n.org>
parents:
20105
diff
changeset
|
7684 /* We had better not send unsafe characters to terminal. */ |
| 88365 | 7685 terminal_coding.mode |= CODING_MODE_SAFE_ENCODING; |
| 7686 /* Character composition should be disabled. */ | |
| 7687 terminal_coding.common_flags &= ~CODING_ANNOTATE_COMPOSITION_MASK; | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
7688 terminal_coding.src_multibyte = 1; |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
7689 terminal_coding.dst_multibyte = 0; |
| 17052 | 7690 return Qnil; |
| 7691 } | |
| 7692 | |
|
19280
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
7693 DEFUN ("set-safe-terminal-coding-system-internal", |
|
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
7694 Fset_safe_terminal_coding_system_internal, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7695 Sset_safe_terminal_coding_system_internal, 1, 1, 0, |
| 41006 | 7696 doc: /* Internal use only. */) |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7697 (coding_system) |
| 88473 | 7698 Lisp_Object coding_system; |
|
19280
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
7699 { |
|
40656
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
7700 CHECK_SYMBOL (coding_system); |
|
19280
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
7701 setup_coding_system (Fcheck_coding_system (coding_system), |
|
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
7702 &safe_terminal_coding); |
| 88365 | 7703 /* Character composition should be disabled. */ |
| 7704 safe_terminal_coding.common_flags &= ~CODING_ANNOTATE_COMPOSITION_MASK; | |
|
29005
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
7705 safe_terminal_coding.src_multibyte = 1; |
|
b396df3a5181
(ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->resutl to
Kenichi Handa <handa@m17n.org>
parents:
28512
diff
changeset
|
7706 safe_terminal_coding.dst_multibyte = 0; |
|
19280
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
7707 return Qnil; |
|
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
7708 } |
|
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
7709 |
| 17052 | 7710 DEFUN ("terminal-coding-system", |
| 7711 Fterminal_coding_system, Sterminal_coding_system, 0, 0, 0, | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7712 doc: /* Return coding system specified for terminal output. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7713 () |
| 17052 | 7714 { |
| 88365 | 7715 return CODING_ID_NAME (terminal_coding.id); |
| 17052 | 7716 } |
| 7717 | |
|
18002
a14261786239
(encode_invocation_designation): Use macro
Kenichi Handa <handa@m17n.org>
parents:
17835
diff
changeset
|
7718 DEFUN ("set-keyboard-coding-system-internal", |
|
a14261786239
(encode_invocation_designation): Use macro
Kenichi Handa <handa@m17n.org>
parents:
17835
diff
changeset
|
7719 Fset_keyboard_coding_system_internal, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7720 Sset_keyboard_coding_system_internal, 1, 1, 0, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7721 doc: /* Internal use only. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7722 (coding_system) |
| 17052 | 7723 Lisp_Object coding_system; |
| 7724 { | |
|
40656
cdfd4d09b79a
Update usage of CHECK_ macros (remove unused second argument).
Pavel Janík <Pavel@Janik.cz>
parents:
40461
diff
changeset
|
7725 CHECK_SYMBOL (coding_system); |
| 88365 | 7726 setup_coding_system (Fcheck_coding_system (coding_system), |
| 7727 &keyboard_coding); | |
| 7728 /* Character composition should be disabled. */ | |
| 7729 keyboard_coding.common_flags &= ~CODING_ANNOTATE_COMPOSITION_MASK; | |
| 17052 | 7730 return Qnil; |
| 7731 } | |
| 7732 | |
| 7733 DEFUN ("keyboard-coding-system", | |
| 7734 Fkeyboard_coding_system, Skeyboard_coding_system, 0, 0, 0, | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7735 doc: /* Return coding system specified for decoding keyboard input. */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7736 () |
| 17052 | 7737 { |
| 88365 | 7738 return CODING_ID_NAME (keyboard_coding.id); |
| 17052 | 7739 } |
| 7740 | |
| 7741 | |
|
18536
69c0e220b626
(Vstandard_character_unification_table_for_decode):
Kenichi Handa <handa@m17n.org>
parents:
18523
diff
changeset
|
7742 DEFUN ("find-operation-coding-system", Ffind_operation_coding_system, |
|
69c0e220b626
(Vstandard_character_unification_table_for_decode):
Kenichi Handa <handa@m17n.org>
parents:
18523
diff
changeset
|
7743 Sfind_operation_coding_system, 1, MANY, 0, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7744 doc: /* Choose a coding system for an operation based on the target name. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7745 The value names a pair of coding systems: (DECODING-SYSTEM . ENCODING-SYSTEM). |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7746 DECODING-SYSTEM is the coding system to use for decoding |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7747 \(in case OPERATION does decoding), and ENCODING-SYSTEM is the coding system |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7748 for encoding (in case OPERATION does encoding). |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7749 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7750 The first argument OPERATION specifies an I/O primitive: |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7751 For file I/O, `insert-file-contents' or `write-region'. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7752 For process I/O, `call-process', `call-process-region', or `start-process'. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7753 For network I/O, `open-network-stream'. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7754 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7755 The remaining arguments should be the same arguments that were passed |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7756 to the primitive. Depending on which primitive, one of those arguments |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7757 is selected as the TARGET. For example, if OPERATION does file I/O, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7758 whichever argument specifies the file name is TARGET. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7759 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7760 TARGET has a meaning which depends on OPERATION: |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7761 For file I/O, TARGET is a file name. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7762 For process I/O, TARGET is a process name. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7763 For network I/O, TARGET is a service name or a port number. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7764 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7765 This function looks up what is specified for TARGET in |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7766 `file-coding-system-alist', `process-coding-system-alist', |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7767 or `network-coding-system-alist' depending on OPERATION. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7768 They may specify a coding system, a cons of coding systems, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7769 or a function symbol to call. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7770 In the last case, we call the function with one argument, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7771 which is a list of all the arguments given to this function. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7772 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7773 usage: (find-operation-coding-system OPERATION ARGUMENTS ...) */) |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
7774 (nargs, args) |
| 17052 | 7775 int nargs; |
| 7776 Lisp_Object *args; | |
| 7777 { | |
| 7778 Lisp_Object operation, target_idx, target, val; | |
| 7779 register Lisp_Object chain; | |
| 7780 | |
| 7781 if (nargs < 2) | |
| 7782 error ("Too few arguments"); | |
| 7783 operation = args[0]; | |
| 7784 if (!SYMBOLP (operation) | |
| 7785 || !INTEGERP (target_idx = Fget (operation, Qtarget_idx))) | |
| 88365 | 7786 error ("Invalid first argument"); |
| 17052 | 7787 if (nargs < 1 + XINT (target_idx)) |
| 7788 error ("Too few arguments for operation: %s", | |
|
46370
40db0673e6f0
Most uses of XSTRING combined with STRING_BYTES or indirection changed to
Ken Raeburn <raeburn@raeburn.org>
parents:
46293
diff
changeset
|
7789 SDATA (SYMBOL_NAME (operation))); |
| 17052 | 7790 target = args[XINT (target_idx) + 1]; |
| 7791 if (!(STRINGP (target) | |
| 7792 || (EQ (operation, Qopen_network_stream) && INTEGERP (target)))) | |
| 88365 | 7793 error ("Invalid %dth argument", XINT (target_idx) + 1); |
| 17052 | 7794 |
|
18613
614b916ff5bf
Fix bugs with inappropriate mixing of Lisp_Object with int.
Richard M. Stallman <rms@gnu.org>
parents:
18536
diff
changeset
|
7795 chain = ((EQ (operation, Qinsert_file_contents) |
|
614b916ff5bf
Fix bugs with inappropriate mixing of Lisp_Object with int.
Richard M. Stallman <rms@gnu.org>
parents:
18536
diff
changeset
|
7796 || EQ (operation, Qwrite_region)) |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7797 ? Vfile_coding_system_alist |
|
18613
614b916ff5bf
Fix bugs with inappropriate mixing of Lisp_Object with int.
Richard M. Stallman <rms@gnu.org>
parents:
18536
diff
changeset
|
7798 : (EQ (operation, Qopen_network_stream) |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7799 ? Vnetwork_coding_system_alist |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7800 : Vprocess_coding_system_alist)); |
| 17052 | 7801 if (NILP (chain)) |
| 7802 return Qnil; | |
| 7803 | |
|
25662
0a7261c1d487
Use XCAR, XCDR, and XFLOAT_DATA instead of explicit member access.
Ken Raeburn <raeburn@raeburn.org>
parents:
25370
diff
changeset
|
7804 for (; CONSP (chain); chain = XCDR (chain)) |
| 17052 | 7805 { |
|
19747
bed06df9cbc5
(setup_coding_system, Ffind_operation_coding_system)
Richard M. Stallman <rms@gnu.org>
parents:
19743
diff
changeset
|
7806 Lisp_Object elt; |
| 88365 | 7807 |
|
25662
0a7261c1d487
Use XCAR, XCDR, and XFLOAT_DATA instead of explicit member access.
Ken Raeburn <raeburn@raeburn.org>
parents:
25370
diff
changeset
|
7808 elt = XCAR (chain); |
| 17052 | 7809 if (CONSP (elt) |
| 7810 && ((STRINGP (target) | |
|
25662
0a7261c1d487
Use XCAR, XCDR, and XFLOAT_DATA instead of explicit member access.
Ken Raeburn <raeburn@raeburn.org>
parents:
25370
diff
changeset
|
7811 && STRINGP (XCAR (elt)) |
|
0a7261c1d487
Use XCAR, XCDR, and XFLOAT_DATA instead of explicit member access.
Ken Raeburn <raeburn@raeburn.org>
parents:
25370
diff
changeset
|
7812 && fast_string_match (XCAR (elt), target) >= 0) |
|
0a7261c1d487
Use XCAR, XCDR, and XFLOAT_DATA instead of explicit member access.
Ken Raeburn <raeburn@raeburn.org>
parents:
25370
diff
changeset
|
7813 || (INTEGERP (target) && EQ (target, XCAR (elt))))) |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7814 { |
|
25662
0a7261c1d487
Use XCAR, XCDR, and XFLOAT_DATA instead of explicit member access.
Ken Raeburn <raeburn@raeburn.org>
parents:
25370
diff
changeset
|
7815 val = XCDR (elt); |
|
19763
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7816 /* Here, if VAL is both a valid coding system and a valid |
|
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7817 function symbol, we return VAL as a coding system. */ |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7818 if (CONSP (val)) |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7819 return val; |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7820 if (! SYMBOLP (val)) |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7821 return Qnil; |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7822 if (! NILP (Fcoding_system_p (val))) |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7823 return Fcons (val, val); |
|
19763
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7824 if (! NILP (Ffboundp (val))) |
|
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7825 { |
|
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7826 val = call1 (val, Flist (nargs, args)); |
|
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7827 if (CONSP (val)) |
|
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7828 return val; |
|
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7829 if (SYMBOLP (val) && ! NILP (Fcoding_system_p (val))) |
|
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7830 return Fcons (val, val); |
|
ab2fd2c85986
(Ffind_operation_coding_system): If a function in
Kenichi Handa <handa@m17n.org>
parents:
19758
diff
changeset
|
7831 } |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7832 return Qnil; |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
7833 } |
| 17052 | 7834 } |
| 7835 return Qnil; | |
| 7836 } | |
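
Ffind_operation_coding_system above walks the selected alist in order and returns the value of the first entry whose key matches TARGET. The following is a minimal standalone sketch of that first-match lookup for string targets. It is not the Emacs implementation: it uses POSIX `regcomp`/`regexec` in place of fast_string_match (POSIX extended syntax differs from Emacs regexp syntax), plain C strings in place of Lisp objects, and the names `struct coding_alist_entry` and `lookup_coding_entry` are hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-in for one alist element:
   (PATTERN . (DECODING . ENCODING)).  */
struct coding_alist_entry
{
  const char *pattern;   /* regular expression matched against TARGET */
  const char *decoding;  /* coding system name used for decoding */
  const char *encoding;  /* coding system name used for encoding */
};

/* Return the first entry of ALIST (N elements) whose pattern matches
   TARGET, or NULL if none matches.  Entries whose pattern fails to
   compile are skipped.  */
static const struct coding_alist_entry *
lookup_coding_entry (const struct coding_alist_entry *alist, size_t n,
                     const char *target)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      regex_t re;
      int matched;

      if (regcomp (&re, alist[i].pattern, REG_EXTENDED | REG_NOSUB) != 0)
        continue;
      matched = regexec (&re, target, 0, NULL, 0) == 0;
      regfree (&re);
      if (matched)
        return &alist[i];
    }
  return NULL;
}
```

Because the search stops at the first match, the order of the entries decides which coding systems are chosen when several patterns match the same target.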
| 7837 | |
| 88365 | 7838 DEFUN ("set-coding-system-priority", Fset_coding_system_priority, |
|
88845
64b8f6168269
(Fset_coding_system_priority): Allow null arg list.
Dave Love <fx@gnu.org>
parents:
88771
diff
changeset
|
7839 Sset_coding_system_priority, 0, MANY, 0, |
| 88645 | 7840 doc: /* Assign higher priority to the coding systems given as arguments. |
|
89467
e911ca706166
(Fset_coding_system_priority): Doc fix. Update values
Kenichi Handa <handa@m17n.org>
parents:
89462
diff
changeset
|
7841 If multiple coding systems belong to the same category, |
|
89519
040a08a2a879
(Fread_coding_system): Fix arg of XSETSTRING.
Dave Love <fx@gnu.org>
parents:
89483
diff
changeset
|
7842 all but the first one are ignored. |
|
040a08a2a879
(Fread_coding_system): Fix arg of XSETSTRING.
Dave Love <fx@gnu.org>
parents:
89483
diff
changeset
|
7843 |
|
040a08a2a879
(Fread_coding_system): Fix arg of XSETSTRING.
Dave Love <fx@gnu.org>
parents:
89483
diff
changeset
|
7844 usage: (set-coding-system-priority ...) */) |
| 88365 | 7845 (nargs, args) |
| 7846 int nargs; | |
| 7847 Lisp_Object *args; | |
| 7848 { | |
| 7849 int i, j; | |
| 7850 int changed[coding_category_max]; | |
| 7851 enum coding_category priorities[coding_category_max]; | |
| 7852 | |
| 7853 bzero (changed, sizeof changed); | |
| 7854 | |
| 7855 for (i = j = 0; i < nargs; i++) | |
| 7856 { | |
| 7857 enum coding_category category; | |
| 7858 Lisp_Object spec, attrs; | |
| 7859 | |
| 7860 CHECK_CODING_SYSTEM_GET_SPEC (args[i], spec); | |
| 7861 attrs = AREF (spec, 0); | |
| 7862 category = XINT (CODING_ATTR_CATEGORY (attrs)); | |
| 7863 if (changed[category]) | |
| 7864 /* Ignore this coding system because a coding system of the | |
| 7865 same category already had a higher priority. */ | |
| 7866 continue; | |
| 7867 changed[category] = 1; | |
| 7868 priorities[j++] = category; | |
| 7869 if (coding_categories[category].id >= 0 | |
| 7870 && ! EQ (args[i], CODING_ID_NAME (coding_categories[category].id))) | |
| 7871 setup_coding_system (args[i], &coding_categories[category]); | |
| 89467 | 7872 Fset (AREF (Vcoding_category_table, category), args[i]); |
| 88365 | 7873 } |
| 7874 | |
| 7875 /* Now we have decided the top J priorities. Fill the remaining | |
| 7876 slots with the other categories in their original order. */ | |
| 7877 | |
| 7878 for (i = j, j = 0; i < coding_category_max; i++, j++) | |
| 7879 { | |
| 7880 while (j < coding_category_max | |
| 7881 && changed[coding_priorities[j]]) | |
| 7882 j++; | |
| 7883 if (j == coding_category_max) | |
| 7884 abort (); | |
| 7885 priorities[i] = coding_priorities[j]; | |
| 7886 } | |
| 7887 | |
| 7888 bcopy (priorities, coding_priorities, sizeof priorities); | |
| 89467 | 7889 |
| 7890 /* Update `coding-category-list'. */ | |
| 7891 Vcoding_category_list = Qnil; | |
| 7892 for (i = coding_category_max - 1; i >= 0; i--) | |
| 7893 Vcoding_category_list | |
| 7894 = Fcons (AREF (Vcoding_category_table, priorities[i]), | |
| 7895 Vcoding_category_list); | |
| 7896 | |
| 88365 | 7897 return Qnil; |
| 7898 } | |
| 7899 | |
| 7900 DEFUN ("coding-system-priority-list", Fcoding_system_priority_list, | |
| 7901 Scoding_system_priority_list, 0, 1, 0, | |
| 88645 | 7902 doc: /* Return a list of coding systems ordered by their priorities. |
| 7903 HIGHESTP non-nil means just return the highest priority one. */) | |
| 88365 | 7904 (highestp) |
| 7905 Lisp_Object highestp; | |
| 20718 | 7906 { |
| 7907 int i; | |
| 88365 | 7908 Lisp_Object val; |
| 7909 | |
| 7910 for (i = 0, val = Qnil; i < coding_category_max; i++) | |
| 7911 { | |
| 7912 enum coding_category category = coding_priorities[i]; | |
| 7913 int id = coding_categories[category].id; | |
| 7914 Lisp_Object attrs; | |
| 7915 | |
| 7916 if (id < 0) | |
| 7917 continue; | |
| 7918 attrs = CODING_ID_ATTRS (id); | |
| 7919 if (! NILP (highestp)) | |
| 7920 return CODING_ATTR_BASE_NAME (attrs); | |
| 7921 val = Fcons (CODING_ATTR_BASE_NAME (attrs), val); | |
| 7922 } | |
| 7923 return Fnreverse (val); | |
| 7924 } | |
| 7925 | |
| 88631 | 7926 static char *suffixes[] = { "-unix", "-dos", "-mac" }; |
| 7927 | |
| 88365 | 7928 static Lisp_Object |
| 7929 make_subsidiaries (base) | |
| 7930 Lisp_Object base; | |
| 7931 { | |
| 7932 Lisp_Object subsidiaries; | |
| 89483 | 7933 int base_name_len = SBYTES (SYMBOL_NAME (base)); |
| 88365 | 7934 char *buf = (char *) alloca (base_name_len + 6); |
| 7935 int i; | |
| 89483 | 7936 |
| 7937 bcopy (SDATA (SYMBOL_NAME (base)), buf, base_name_len); | |
| 88365 | 7938 subsidiaries = Fmake_vector (make_number (3), Qnil); |
| 7939 for (i = 0; i < 3; i++) | |
| 7940 { | |
| 7941 bcopy (suffixes[i], buf + base_name_len, strlen (suffixes[i]) + 1); | |
| 7942 ASET (subsidiaries, i, intern (buf)); | |
| 7943 } | |
| 7944 return subsidiaries; | |
| 7945 } | |
| 7946 | |
| 7947 | |
| 7948 DEFUN ("define-coding-system-internal", Fdefine_coding_system_internal, | |
| 7949 Sdefine_coding_system_internal, coding_arg_max, MANY, 0, | |
| 88544 | 7950 doc: /* For internal use only. |
| 7951 usage: (define-coding-system-internal ...) */) | |
| 88365 | 7952 (nargs, args) |
| 7953 int nargs; | |
| 7954 Lisp_Object *args; | |
| 7955 { | |
| 7956 Lisp_Object name; | |
| 7957 Lisp_Object spec_vec; /* [ ATTRS ALIASES EOL_TYPE ] */ | |
| 7958 Lisp_Object attrs; /* Vector of attributes. */ | |
| 7959 Lisp_Object eol_type; | |
| 7960 Lisp_Object aliases; | |
| 7961 Lisp_Object coding_type, charset_list, safe_charsets; | |
| 7962 enum coding_category category; | |
| 7963 Lisp_Object tail, val; | |
| 7964 int max_charset_id = 0; | |
| 7965 int i; | |
| 7966 | |
| 7967 if (nargs < coding_arg_max) | |
| 7968 goto short_args; | |
| 7969 | |
| 7970 attrs = Fmake_vector (make_number (coding_attr_last_index), Qnil); | |
| 7971 | |
| 7972 name = args[coding_arg_name]; | |
| 7973 CHECK_SYMBOL (name); | |
| 7974 CODING_ATTR_BASE_NAME (attrs) = name; | |
| 7975 | |
| 7976 val = args[coding_arg_mnemonic]; | |
| 7977 if (! STRINGP (val)) | |
| 7978 CHECK_CHARACTER (val); | |
| 7979 CODING_ATTR_MNEMONIC (attrs) = val; | |
| 7980 | |
| 7981 coding_type = args[coding_arg_coding_type]; | |
| 7982 CHECK_SYMBOL (coding_type); | |
| 7983 CODING_ATTR_TYPE (attrs) = coding_type; | |
| 7984 | |
| 7985 charset_list = args[coding_arg_charset_list]; | |
| 7986 if (SYMBOLP (charset_list)) | |
| 7987 { | |
| 7988 if (EQ (charset_list, Qiso_2022)) | |
| 7989 { | |
| 7990 if (! EQ (coding_type, Qiso_2022)) | |
| 7991 error ("Invalid charset-list"); | |
| 7992 charset_list = Viso_2022_charset_list; | |
| 7993 } | |
| 7994 else if (EQ (charset_list, Qemacs_mule)) | |
| 7995 { | |
| 7996 if (! EQ (coding_type, Qemacs_mule)) | |
| 7997 error ("Invalid charset-list"); | |
| 7998 charset_list = Vemacs_mule_charset_list; | |
| 7999 } | |
| 8000 for (tail = charset_list; CONSP (tail); tail = XCDR (tail)) | |
| 8001 if (max_charset_id < XFASTINT (XCAR (tail))) | |
| 8002 max_charset_id = XFASTINT (XCAR (tail)); | |
| 8003 } | |
| 8004 else | |
| 8005 { | |
| 8006 charset_list = Fcopy_sequence (charset_list); | |
| 8007 for (tail = charset_list; !NILP (tail); tail = Fcdr (tail)) | |
| 8008 { | |
| 8009 struct charset *charset; | |
| 8010 | |
| 8011 val = Fcar (tail); | |
| 8012 CHECK_CHARSET_GET_CHARSET (val, charset); | |
| 8013 if (EQ (coding_type, Qiso_2022) | |
| 8014 ? CHARSET_ISO_FINAL (charset) < 0 | |
| 8015 : EQ (coding_type, Qemacs_mule) | |
| 8016 ? CHARSET_EMACS_MULE_ID (charset) < 0 | |
| 8017 : 0) | |
| 8018 error ("Can't handle charset `%s'", | |
| 89483 | 8019 SDATA (SYMBOL_NAME (CHARSET_NAME (charset)))); |
| 8020 | |
| 8021 XSETCAR (tail, make_number (charset->id)); | |
| 88365 | 8022 if (max_charset_id < charset->id) |
| 8023 max_charset_id = charset->id; | |
| 8024 } | |
| 8025 } | |
| 8026 CODING_ATTR_CHARSET_LIST (attrs) = charset_list; | |
| 8027 | |
| 8028 safe_charsets = Fmake_string (make_number (max_charset_id + 1), | |
| 8029 make_number (255)); | |
| 8030 for (tail = charset_list; CONSP (tail); tail = XCDR (tail)) | |
| 89483 | 8031 SSET (safe_charsets, XFASTINT (XCAR (tail)), 0); |
| 88365 | 8032 CODING_ATTR_SAFE_CHARSETS (attrs) = safe_charsets; |
| 8033 | |
| 89225 | 8034 CODING_ATTR_ASCII_COMPAT (attrs) = args[coding_arg_ascii_compatible_p]; |
| 8035 | |
| 88365 | 8036 val = args[coding_arg_decode_translation_table]; |
| 8037 if (! NILP (val)) | |
| 8038 CHECK_CHAR_TABLE (val); | |
| 8039 CODING_ATTR_DECODE_TBL (attrs) = val; | |
| 8040 | |
| 8041 val = args[coding_arg_encode_translation_table]; | |
| 8042 if (! NILP (val)) | |
| 8043 CHECK_CHAR_TABLE (val); | |
| 8044 CODING_ATTR_ENCODE_TBL (attrs) = val; | |
| 8045 | |
| 8046 val = args[coding_arg_post_read_conversion]; | |
| 8047 CHECK_SYMBOL (val); | |
| 8048 CODING_ATTR_POST_READ (attrs) = val; | |
| 8049 | |
| 8050 val = args[coding_arg_pre_write_conversion]; | |
| 8051 CHECK_SYMBOL (val); | |
| 8052 CODING_ATTR_PRE_WRITE (attrs) = val; | |
| 8053 | |
| 8054 val = args[coding_arg_default_char]; | |
| 8055 if (NILP (val)) | |
| 8056 CODING_ATTR_DEFAULT_CHAR (attrs) = make_number (' '); | |
| 8057 else | |
| 8058 { | |
| 89483 | 8059 CHECK_CHARACTER (val); |
| 88365 | 8060 CODING_ATTR_DEFAULT_CHAR (attrs) = val; |
| 8061 } | |
| 8062 | |
| 89483 | 8063 val = args[coding_arg_for_unibyte]; |
| 8064 CODING_ATTR_FOR_UNIBYTE (attrs) = NILP (val) ? Qnil : Qt; | |
| 8065 | |
| 88365 | 8066 val = args[coding_arg_plist]; |
| 8067 CHECK_LIST (val); | |
| 8068 CODING_ATTR_PLIST (attrs) = val; | |
| 8069 | |
| 8070 if (EQ (coding_type, Qcharset)) | |
| 8071 { | |
| 88597 | 8072 /* Generate a Lisp vector of 256 elements. Each element is nil, |
| 8073 integer, or a list of charset IDs. | |
| 8074 | |
| 8075 If Nth element is nil, the byte code N is invalid in this | |
| 8076 coding system. | |
| 8077 | |
| 8078 If Nth element is a number NUM, N is the first byte of a | |
| 8079 charset whose ID is NUM. | |
| 8080 | |
| 8081 If Nth element is a list of charset IDs, N is the first byte | |
| 8082 of one of them. The list is sorted by dimensions of the | |
| 89648 | 8083 charsets. A charset of smaller dimension comes first. */ |
| 88365 | 8084 val = Fmake_vector (make_number (256), Qnil); |
| 8085 | |
| 89653 | 8086 for (tail = charset_list; CONSP (tail); tail = XCDR (tail)) |
| 88365 | 8087 { |
| 88597 | 8088 struct charset *charset = CHARSET_FROM_ID (XFASTINT (XCAR (tail))); |
| 8089 int dim = CHARSET_DIMENSION (charset); | |
| 8090 int idx = (dim - 1) * 4; | |
| 89483 | 8091 |
| 89653 | 8092 if (CHARSET_ASCII_COMPATIBLE_P (charset)) |
| 89225 | 8093 CODING_ATTR_ASCII_COMPAT (attrs) = Qt; |
| 8094 | |
| 88477 | 8095 for (i = charset->code_space[idx]; |
| 8096 i <= charset->code_space[idx + 1]; i++) | |
| 8097 { | |
| 88597 | 8098 Lisp_Object tmp, tmp2; |
| 8099 int dim2; | |
| 8100 | |
| 8101 tmp = AREF (val, i); | |
| 8102 if (NILP (tmp)) | |
| 8103 tmp = XCAR (tail); | |
| 8104 else if (NUMBERP (tmp)) | |
| 8105 { | |
| 8106 dim2 = CHARSET_DIMENSION (CHARSET_FROM_ID (XFASTINT (tmp))); | |
| 8107 if (dim < dim2) | |
| 88607 | 8108 tmp = Fcons (XCAR (tail), Fcons (tmp, Qnil)); |
| 88597 | 8109 else |
| 88607 | 8110 tmp = Fcons (tmp, Fcons (XCAR (tail), Qnil)); |
| 88597 | 8111 } |
| 88477 | 8112 else |
| 88597 | 8113 { |
| 8114 for (tmp2 = tmp; CONSP (tmp2); tmp2 = XCDR (tmp2)) | |
| 8115 { | |
| 8116 dim2 = CHARSET_DIMENSION (CHARSET_FROM_ID (XFASTINT (XCAR (tmp2)))); | |
| 8117 if (dim < dim2) | |
| 8118 break; | |
| 8119 } | |
| 8120 if (NILP (tmp2)) | |
| 8121 tmp = nconc2 (tmp, Fcons (XCAR (tail), Qnil)); | |
| 8122 else | |
| 8123 { | |
| 8124 XSETCDR (tmp2, Fcons (XCAR (tmp2), XCDR (tmp2))); | |
| 8125 XSETCAR (tmp2, XCAR (tail)); | |
| 8126 } | |
| 8127 } | |
| 8128 ASET (val, i, tmp); | |
| 88477 | 8129 } |
| 88365 | 8130 } |
| 8131 ASET (attrs, coding_attr_charset_valids, val); | |
| 8132 category = coding_category_charset; | |
| 8133 } | |
| 8134 else if (EQ (coding_type, Qccl)) | |
| 8135 { | |
| 8136 Lisp_Object valids; | |
| 89483 | 8137 |
| 88365 | 8138 if (nargs < coding_arg_ccl_max) |
| 8139 goto short_args; | |
| 8140 | |
| 8141 val = args[coding_arg_ccl_decoder]; | |
| 8142 CHECK_CCL_PROGRAM (val); | |
| 8143 if (VECTORP (val)) | |
| 8144 val = Fcopy_sequence (val); | |
| 8145 ASET (attrs, coding_attr_ccl_decoder, val); | |
| 8146 | |
| 8147 val = args[coding_arg_ccl_encoder]; | |
| 8148 CHECK_CCL_PROGRAM (val); | |
| 8149 if (VECTORP (val)) | |
| 8150 val = Fcopy_sequence (val); | |
| 8151 ASET (attrs, coding_attr_ccl_encoder, val); | |
| 8152 | |
| 8153 val = args[coding_arg_ccl_valids]; | |
| 8154 valids = Fmake_string (make_number (256), make_number (0)); | |
| 8155 for (tail = val; !NILP (tail); tail = Fcdr (tail)) | |
| 8156 { | |
| 89373 | 8157 int from, to; |
| 8158 | |
| 88365 | 8159 val = Fcar (tail); |
| 8160 if (INTEGERP (val)) | |
| 89373 | 8161 { |
| 8162 from = to = XINT (val); | |
| 8163 if (from < 0 || from > 255) | |
| 8164 args_out_of_range_3 (val, make_number (0), make_number (255)); | |
| 8165 } | |
| 88365 | 8166 else |
| 8167 { | |
| 8168 CHECK_CONS (val); | |
| 89483 | 8169 CHECK_NATNUM_CAR (val); |
| 8170 CHECK_NATNUM_CDR (val); | |
| 88365 | 8171 from = XINT (XCAR (val)); |
| 89483 | 8172 if (from > 255) |
| 89373 | 8173 args_out_of_range_3 (XCAR (val), |
| 8174 make_number (0), make_number (255)); | |
| 88365 | 8175 to = XINT (XCDR (val)); |
| 89373 | 8176 if (to < from || to > 255) |
| 8177 args_out_of_range_3 (XCDR (val), | |
| 8178 XCAR (val), make_number (255)); | |
| 88365 | 8179 } |
| 89373 | 8180 for (i = from; i <= to; i++) |
| 89483 | 8181 SSET (valids, i, 1); |
| 88365 | 8182 } |
| 8183 ASET (attrs, coding_attr_ccl_valids, valids); | |
| 89483 | 8184 |
| 88365 | 8185 category = coding_category_ccl; |
| 8186 } | |
| 8187 else if (EQ (coding_type, Qutf_16)) | |
| 8188 { | |
| 8189 Lisp_Object bom, endian; | |
| 8190 | |
| 89225 | 8191 CODING_ATTR_ASCII_COMPAT (attrs) = Qnil; |
| 8192 | |
| 88365 | 8193 if (nargs < coding_arg_utf16_max) |
| 8194 goto short_args; | |
| 8195 | |
| 8196 bom = args[coding_arg_utf16_bom]; | |
| 8197 if (! NILP (bom) && ! EQ (bom, Qt)) | |
| 22874 | 8198 { |
| 88365 | 8199 CHECK_CONS (bom); |
| 89483 | 8200 val = XCAR (bom); |
| 8201 CHECK_CODING_SYSTEM (val); | |
| 8202 val = XCDR (bom); | |
| 8203 CHECK_CODING_SYSTEM (val); | |
| 88365 | 8204 } |
| 8205 ASET (attrs, coding_attr_utf_16_bom, bom); | |
| 8206 | |
| 8207 endian = args[coding_arg_utf16_endian]; | |
| 89420 | 8208 CHECK_SYMBOL (endian); |
| 8209 if (NILP (endian)) | |
| 8210 endian = Qbig; | |
| 8211 else if (! EQ (endian, Qbig) && ! EQ (endian, Qlittle)) | |
| 89483 | 8212 error ("Invalid endian: %s", SDATA (SYMBOL_NAME (endian))); |
| 88365 | 8213 ASET (attrs, coding_attr_utf_16_endian, endian); |
| 8214 | |
| 8215 category = (CONSP (bom) | |
| 8216 ? coding_category_utf_16_auto | |
| 8217 : NILP (bom) | |
| 89420 | 8218 ? (EQ (endian, Qbig) |
| 88365 | 8219 ? coding_category_utf_16_be_nosig |
| 8220 : coding_category_utf_16_le_nosig) | |
| 89420 | 8221 : (EQ (endian, Qbig) |
| 88365 | 8222 ? coding_category_utf_16_be |
| 8223 : coding_category_utf_16_le)); | |
| 8224 } | |
| 8225 else if (EQ (coding_type, Qiso_2022)) | |
| 8226 { | |
| 8227 Lisp_Object initial, reg_usage, request, flags; | |
| 89442 | 8228 int i; |
| 88365 | 8229 |
| 8230 if (nargs < coding_arg_iso2022_max) | |
| 8231 goto short_args; | |
| 8232 | |
| 8233 initial = Fcopy_sequence (args[coding_arg_iso2022_initial]); | |
| 8234 CHECK_VECTOR (initial); | |
| 8235 for (i = 0; i < 4; i++) | |
| 8236 { | |
| 8237 val = Faref (initial, make_number (i)); | |
| 8238 if (! NILP (val)) | |
| 8239 { | |
| 89225 | 8240 struct charset *charset; |
| 8241 | |
| 8242 CHECK_CHARSET_GET_CHARSET (val, charset); | |
| 8243 ASET (initial, i, make_number (CHARSET_ID (charset))); | |
| 8244 if (i == 0 && CHARSET_ASCII_COMPATIBLE_P (charset)) | |
| 8245 CODING_ATTR_ASCII_COMPAT (attrs) = Qt; | |
| 88365 | 8246 } |
| 8247 else | |
| 8248 ASET (initial, i, make_number (-1)); | |
| 8249 } | |
| 8250 | |
| 8251 reg_usage = args[coding_arg_iso2022_reg_usage]; | |
| 8252 CHECK_CONS (reg_usage); | |
| 89483 | 8253 CHECK_NUMBER_CAR (reg_usage); |
| 8254 CHECK_NUMBER_CDR (reg_usage); | |
| 88365 | 8255 |
| 8256 request = Fcopy_sequence (args[coding_arg_iso2022_request]); | |
| 8257 for (tail = request; ! NILP (tail); tail = Fcdr (tail)) | |
| 8258 { | |
| 8259 int id; | |
| 89483 | 8260 Lisp_Object tmp; |
| 88365 | 8261 |
| 8262 val = Fcar (tail); | |
| 8263 CHECK_CONS (val); | |
| 89483 | 8264 tmp = XCAR (val); |
| 8265 CHECK_CHARSET_GET_ID (tmp, id); | |
| 8266 CHECK_NATNUM_CDR (val); | |
| 88365 | 8267 if (XINT (XCDR (val)) >= 4) |
| 8268 error ("Invalid graphic register number: %d", XINT (XCDR (val))); | |
| 89483 | 8269 XSETCAR (val, make_number (id)); |
| 88365 | 8270 } |
| 8271 | |
| 8272 flags = args[coding_arg_iso2022_flags]; | |
| 8273 CHECK_NATNUM (flags); | |
| 8274 i = XINT (flags); | |
| 8275 if (EQ (args[coding_arg_charset_list], Qiso_2022)) | |
| 8276 flags = make_number (i | CODING_ISO_FLAG_FULL_SUPPORT); | |
| 8277 | |
| 8278 ASET (attrs, coding_attr_iso_initial, initial); | |
| 8279 ASET (attrs, coding_attr_iso_usage, reg_usage); | |
| 8280 ASET (attrs, coding_attr_iso_request, request); | |
| 8281 ASET (attrs, coding_attr_iso_flags, flags); | |
| 8282 setup_iso_safe_charsets (attrs); | |
| 8283 | |
| 8284 if (i & CODING_ISO_FLAG_SEVEN_BITS) | |
| 8285 category = ((i & (CODING_ISO_FLAG_LOCKING_SHIFT | |
| 8286 | CODING_ISO_FLAG_SINGLE_SHIFT)) | |
| 8287 ? coding_category_iso_7_else | |
| 8288 : EQ (args[coding_arg_charset_list], Qiso_2022) | |
| 8289 ? coding_category_iso_7 | |
| 8290 : coding_category_iso_7_tight); | |
| 8291 else | |
| 8292 { | |
| 8293 int id = XINT (AREF (initial, 1)); | |
| 8294 | |
| 88977 | 8295 category = (((i & CODING_ISO_FLAG_LOCKING_SHIFT) |
| 88365 | 8296 || EQ (args[coding_arg_charset_list], Qiso_2022) |
| 8297 || id < 0) | |
| 8298 ? coding_category_iso_8_else | |
| 8299 : (CHARSET_DIMENSION (CHARSET_FROM_ID (id)) == 1) | |
| 8300 ? coding_category_iso_8_1 | |
| 8301 : coding_category_iso_8_2); | |
| 22874 | 8302 } |
| 89227 | 8303 if (category != coding_category_iso_8_1 |
| 8304 && category != coding_category_iso_8_2) | |
| 8305 CODING_ATTR_ASCII_COMPAT (attrs) = Qnil; | |
| 88365 | 8306 } |
| 8307 else if (EQ (coding_type, Qemacs_mule)) | |
| 8308 { | |
| 8309 if (EQ (args[coding_arg_charset_list], Qemacs_mule)) | |
| 8310 ASET (attrs, coding_attr_emacs_mule_full, Qt); | |
| 89225 | 8311 CODING_ATTR_ASCII_COMPAT (attrs) = Qt; |
| 88365 | 8312 category = coding_category_emacs_mule; |
| 8313 } | |
| 8314 else if (EQ (coding_type, Qshift_jis)) | |
| 8315 { | |
| 8316 | |
| 8317 struct charset *charset; | |
| 8318 | |
| 8319 if (XINT (Flength (charset_list)) != 3) | |
| 8320 error ("There should be just three charsets"); | |
| 8321 | |
| 8322 charset = CHARSET_FROM_ID (XINT (XCAR (charset_list))); | |
| 8323 if (CHARSET_DIMENSION (charset) != 1) | |
| 8324 error ("Dimension of charset %s is not one", | |
| 89483 | 8325 SDATA (SYMBOL_NAME (CHARSET_NAME (charset)))); |
| 89225 | 8326 if (CHARSET_ASCII_COMPATIBLE_P (charset)) |
| 8327 CODING_ATTR_ASCII_COMPAT (attrs) = Qt; | |
| 88365 | 8328 |
| 8329 charset_list = XCDR (charset_list); | |
| 8330 charset = CHARSET_FROM_ID (XINT (XCAR (charset_list))); | |
| 8331 if (CHARSET_DIMENSION (charset) != 1) | |
| 8332 error ("Dimension of charset %s is not one", | |
| 89483 | 8333 SDATA (SYMBOL_NAME (CHARSET_NAME (charset)))); |
| 88365 | 8334 |
| 8335 charset_list = XCDR (charset_list); | |
| 8336 charset = CHARSET_FROM_ID (XINT (XCAR (charset_list))); | |
| 8337 if (CHARSET_DIMENSION (charset) != 2) | |
| 8338 error ("Dimension of charset %s is not two", | |
| 89483 | 8339 SDATA (SYMBOL_NAME (CHARSET_NAME (charset)))); |
| 88365 | 8340 |
| 8341 category = coding_category_sjis; | |
| 8342 Vsjis_coding_system = name; | |
| 8343 } | |
| 8344 else if (EQ (coding_type, Qbig5)) | |
| 8345 { | |
| 8346 struct charset *charset; | |
| 8347 | |
| 8348 if (XINT (Flength (charset_list)) != 2) | |
| 8349 error ("There should be just two charsets"); | |
| 8350 | |
| 8351 charset = CHARSET_FROM_ID (XINT (XCAR (charset_list))); | |
| 8352 if (CHARSET_DIMENSION (charset) != 1) | |
| 8353 error ("Dimension of charset %s is not one", | |
| 89483 | 8354 SDATA (SYMBOL_NAME (CHARSET_NAME (charset)))); |
| 89225 | 8355 if (CHARSET_ASCII_COMPATIBLE_P (charset)) |
| 8356 CODING_ATTR_ASCII_COMPAT (attrs) = Qt; | |
| 88365 | 8357 |
| 8358 charset_list = XCDR (charset_list); | |
| 8359 charset = CHARSET_FROM_ID (XINT (XCAR (charset_list))); | |
| 8360 if (CHARSET_DIMENSION (charset) != 2) | |
| 8361 error ("Dimension of charset %s is not two", | |
| 89483 | 8362 SDATA (SYMBOL_NAME (CHARSET_NAME (charset)))); |
| 88365 | 8363 |
| 8364 category = coding_category_big5; | |
| 8365 Vbig5_coding_system = name; | |
| 8366 } | |
| 8367 else if (EQ (coding_type, Qraw_text)) | |
| 89225 | 8368 { |
| 8369 category = coding_category_raw_text; | |
| 8370 CODING_ATTR_ASCII_COMPAT (attrs) = Qt; | |
| 8371 } | |
| 88365 | 8372 else if (EQ (coding_type, Qutf_8)) |
| 89225 | 8373 { |
| 8374 category = coding_category_utf_8; | |
| 8375 CODING_ATTR_ASCII_COMPAT (attrs) = Qt; | |
| 8376 } | |
| 88365 | 8377 else if (EQ (coding_type, Qundecided)) |
| 8378 category = coding_category_undecided; | |
| 8379 else | |
| 8380 error ("Invalid coding system type: %s", | |
| 89483 | 8381 SDATA (SYMBOL_NAME (coding_type))); |
| 88365 | 8382 |
| 8383 CODING_ATTR_CATEGORY (attrs) = make_number (category); | |
| 89468 | 8384 CODING_ATTR_PLIST (attrs) |
| 8385 = Fcons (QCcategory, Fcons (AREF (Vcoding_category_table, category), | |
| 8386 CODING_ATTR_PLIST (attrs))); | |
| 88365 | 8387 |
| 8388 eol_type = args[coding_arg_eol_type]; | |
| 8389 if (! NILP (eol_type) | |
| 8390 && ! EQ (eol_type, Qunix) | |
| 8391 && ! EQ (eol_type, Qdos) | |
| 8392 && ! EQ (eol_type, Qmac)) | |
| 8393 error ("Invalid eol-type"); | |
| 8394 | |
| 8395 aliases = Fcons (name, Qnil); | |
| 8396 | |
| 8397 if (NILP (eol_type)) | |
| 8398 { | |
| 8399 eol_type = make_subsidiaries (name); | |
| 8400 for (i = 0; i < 3; i++) | |
| 22874 | 8401 { |
| 88365 | 8402 Lisp_Object this_spec, this_name, this_aliases, this_eol_type; |
| 8403 | |
| 8404 this_name = AREF (eol_type, i); | |
| 8405 this_aliases = Fcons (this_name, Qnil); | |
| 8406 this_eol_type = (i == 0 ? Qunix : i == 1 ? Qdos : Qmac); | |
| 8407 this_spec = Fmake_vector (make_number (3), attrs); | |
| 8408 ASET (this_spec, 1, this_aliases); | |
| 8409 ASET (this_spec, 2, this_eol_type); | |
| 8410 Fputhash (this_name, this_spec, Vcoding_system_hash_table); | |
| 8411 Vcoding_system_list = Fcons (this_name, Vcoding_system_list); | |
| 8412 Vcoding_system_alist = Fcons (Fcons (Fsymbol_name (this_name), Qnil), | |
| 8413 Vcoding_system_alist); | |
| 22874 | 8414 } |
| 20718 | 8415 } |
| 22874 | 8416 |
| 88365 | 8417 spec_vec = Fmake_vector (make_number (3), attrs); |
| 8418 ASET (spec_vec, 1, aliases); | |
| 8419 ASET (spec_vec, 2, eol_type); | |
| 8420 | |
| 8421 Fputhash (name, spec_vec, Vcoding_system_hash_table); | |
| 8422 Vcoding_system_list = Fcons (name, Vcoding_system_list); | |
| 8423 Vcoding_system_alist = Fcons (Fcons (Fsymbol_name (name), Qnil), | |
| 8424 Vcoding_system_alist); | |
| 8425 | |
| 8426 { | |
| 8427 int id = coding_categories[category].id; | |
| 8428 | |
| 8429 if (id < 0 || EQ (name, CODING_ID_NAME (id))) | |
| 8430 setup_coding_system (name, &coding_categories[category]); | |
| 8431 } | |
| 8432 | |
| 8433 return Qnil; | |
| 8434 | |
| 8435 short_args: | |
| 8436 return Fsignal (Qwrong_number_of_arguments, | |
| 8437 Fcons (intern ("define-coding-system-internal"), | |
| 8438 make_number (nargs))); | |
| 8439 } | |
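
As a rough illustration of the eol-type handling above: when no eol-type is given, three subsidiary coding systems are registered, one per end-of-line format, in the order unix, dos, mac. The sketch below is standalone C, not Emacs code. `subsidiary_name` is a hypothetical helper, and the `-unix`/`-dos`/`-mac` suffixes are an assumption about what `make_subsidiaries` produces, since its body is not shown in this part of the file.

```c
#include <stdio.h>

/* Hypothetical helper: write the name of the I-th subsidiary of BASE
   into BUF.  Index 0, 1, 2 corresponds to unix, dos, mac, matching the
   order used by the registration loop.  The suffixes are an assumed
   naming convention.  Returns 0 on success, -1 on a bad index or if
   BUF is too small.  */
int
subsidiary_name (const char *base, int i, char *buf, size_t size)
{
  static const char *const suffix[] = { "-unix", "-dos", "-mac" };
  int n;

  if (i < 0 || i > 2)
    return -1;
  n = snprintf (buf, size, "%s%s", base, suffix[i]);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

Calling `subsidiary_name ("utf-8", 0, buf, sizeof buf)` would fill `buf` with `utf-8-unix` under that assumed convention.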
| 8440 | |
| 89571 | 8441 |
| 88365 | 8442 DEFUN ("define-coding-system-alias", Fdefine_coding_system_alias, |
| 8443 Sdefine_coding_system_alias, 2, 2, 0, | |
| 8444 doc: /* Define ALIAS as an alias for CODING-SYSTEM. */) | |
| 8445 (alias, coding_system) | |
| 8446 Lisp_Object alias, coding_system; | |
| 8447 { | |
| 8448 Lisp_Object spec, aliases, eol_type; | |
| 8449 | |
| 8450 CHECK_SYMBOL (alias); | |
| 8451 CHECK_CODING_SYSTEM_GET_SPEC (coding_system, spec); | |
| 8452 aliases = AREF (spec, 1); | |
| 89571 | 8453 /* ALIASES should be a list of length more than zero, and the first |
| 8454 element is a base coding system. Append ALIAS at the tail of the | |
| 8455 list. */ | |
| 88365 | 8456 while (!NILP (XCDR (aliases))) |
| 8457 aliases = XCDR (aliases); | |
| 89483 | 8458 XSETCDR (aliases, Fcons (alias, Qnil)); |
| 88365 | 8459 |
| 8460 eol_type = AREF (spec, 2); | |
| 8461 if (VECTORP (eol_type)) | |
| 8462 { | |
| 8463 Lisp_Object subsidiaries; | |
| 8464 int i; | |
| 8465 | |
| 8466 subsidiaries = make_subsidiaries (alias); | |
| 8467 for (i = 0; i < 3; i++) | |
| 8468 Fdefine_coding_system_alias (AREF (subsidiaries, i), | |
| 8469 AREF (eol_type, i)); | |
| 8470 } | |
| 8471 | |
| 8472 Fputhash (alias, spec, Vcoding_system_hash_table); | |
| 89571 | 8473 Vcoding_system_list = Fcons (alias, Vcoding_system_list); |
| 88485 | 8474 Vcoding_system_alist = Fcons (Fcons (Fsymbol_name (alias), Qnil), |
| 8475 Vcoding_system_alist); | |
| 88365 | 8476 |
| 20718 | 8477 return Qnil; |
| 8478 } | |
| 8479 | |
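
The alias list is extended by walking to the last cell and attaching a new one there, which keeps the base coding system as the first element. A minimal standalone sketch of that tail-append pattern follows. It uses an ordinary singly linked list rather than Lisp conses, and `struct name_node` and `append_name` are hypothetical names introduced only for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a cons cell holding one name.  */
struct name_node
{
  const char *name;
  struct name_node *next;
};

/* Append NAME after the last node of the non-empty list HEAD, so the
   first element (the base name) never moves.  Returns the new node,
   or NULL if allocation fails.  */
struct name_node *
append_name (struct name_node *head, const char *name)
{
  struct name_node *node = malloc (sizeof *node);

  if (node == NULL)
    return NULL;
  node->name = name;
  node->next = NULL;
  while (head->next != NULL)
    head = head->next;
  head->next = node;
  return node;
}
```

Like the loop in the function above, this assumes the list already has at least one element. An empty list would need separate handling.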
| 88365 | 8480 DEFUN ("coding-system-base", Fcoding_system_base, Scoding_system_base, |
| 8481 1, 1, 0, | |
| 8482 doc: /* Return the base of CODING-SYSTEM. | |
| 88645 | 8483 An alias or a subsidiary coding system is never a base coding system. */) |
| 88365 | 8484 (coding_system) |
| 8485 Lisp_Object coding_system; | |
| 8486 { | |
| 8487 Lisp_Object spec, attrs; | |
| 8488 | |
| 8489 if (NILP (coding_system)) | |
| 8490 return (Qno_conversion); | |
| 8491 CHECK_CODING_SYSTEM_GET_SPEC (coding_system, spec); | |
| 8492 attrs = AREF (spec, 0); | |
| 8493 return CODING_ATTR_BASE_NAME (attrs); | |
| 8494 } | |
| 8495 | |
| 8496 DEFUN ("coding-system-plist", Fcoding_system_plist, Scoding_system_plist, | |
| 8497 1, 1, 0, | |
| 8498 doc: /* Return the property list of CODING-SYSTEM. */) | |
| 8499 (coding_system) | |
| 8500 Lisp_Object coding_system; | |
| 22226 | 8501 { |
| 88365 | 8502 Lisp_Object spec, attrs; |
| 8503 | |
| 8504 if (NILP (coding_system)) | |
| 8505 coding_system = Qno_conversion; | |
| 8506 CHECK_CODING_SYSTEM_GET_SPEC (coding_system, spec); | |
| 8507 attrs = AREF (spec, 0); | |
| 8508 return CODING_ATTR_PLIST (attrs); | |
| 8509 } | |
| 8510 | |
| 8511 | |
| 8512 DEFUN ("coding-system-aliases", Fcoding_system_aliases, Scoding_system_aliases, | |
| 8513 1, 1, 0, | |
| 88645 | 8514 doc: /* Return the list of aliases of CODING-SYSTEM. */) |
| 88365 | 8515 (coding_system) |
| 8516 Lisp_Object coding_system; | |
| 8517 { | |
| 8518 Lisp_Object spec; | |
| 8519 | |
| 8520 if (NILP (coding_system)) | |
| 8521 coding_system = Qno_conversion; | |
| 8522 CHECK_CODING_SYSTEM_GET_SPEC (coding_system, spec); | |
| 88645 | 8523 return AREF (spec, 1); |
| 88365 | 8524 } |
| 8525 | |
| 8526 DEFUN ("coding-system-eol-type", Fcoding_system_eol_type, | |
| 8527 Scoding_system_eol_type, 1, 1, 0, | |
| 8528 doc: /* Return eol-type of CODING-SYSTEM. | |
| 8529 An eol-type is the integer 0, 1, or 2, or a vector of coding systems. | |
| 8530 | |
| 8531 The integer values 0, 1, and 2 indicate the end-of-line format: LF, CRLF, | |
| 8532 and CR, respectively. | |
| 8533 | |
| 8534 A vector value indicates that the end-of-line format should be | |
| 8535 detected automatically. The Nth element of the vector is the subsidiary | |
| 8536 coding system whose eol-type is N. */) | |
| 8537 (coding_system) | |
| 8538 Lisp_Object coding_system; | |
| 8539 { | |
| 8540 Lisp_Object spec, eol_type; | |
| 8541 int n; | |
| 8542 | |
| 8543 if (NILP (coding_system)) | |
| 8544 coding_system = Qno_conversion; | |
| 8545 if (! CODING_SYSTEM_P (coding_system)) | |
| 8546 return Qnil; | |
| 8547 spec = CODING_SYSTEM_SPEC (coding_system); | |
| 8548 eol_type = AREF (spec, 2); | |
| 8549 if (VECTORP (eol_type)) | |
| 8550 return Fcopy_sequence (eol_type); | |
| 8551 n = EQ (eol_type, Qunix) ? 0 : EQ (eol_type, Qdos) ? 1 : 2; | |
| 8552 return make_number (n); | |
| 22226 | 8553 } |
| 8554 | |
| 17052 | 8555 #endif /* emacs */ |
| 8556 | |
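
The eol-type convention documented for `coding-system-eol-type` (0 for LF, 1 for CRLF, 2 for CR) can be sketched in standalone C as follows. This is only an illustration under stated assumptions: `enum eol_kind`, `eol_kind_to_int`, and `eol_sequence` are hypothetical names standing in for the Lisp symbols `unix`, `dos`, and `mac`, and none of them exist in Emacs.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the symbols unix, dos and mac.  */
enum eol_kind { EOL_UNIX, EOL_DOS, EOL_MAC };

/* Return the integer code described in the docstring:
   0 for LF, 1 for CRLF, 2 for CR.  */
int
eol_kind_to_int (enum eol_kind kind)
{
  return kind == EOL_UNIX ? 0 : kind == EOL_DOS ? 1 : 2;
}

/* Return the byte sequence that terminates a line for integer CODE,
   or NULL if CODE is not 0, 1 or 2.  */
const char *
eol_sequence (int code)
{
  switch (code)
    {
    case 0: return "\n";
    case 1: return "\r\n";
    case 2: return "\r";
    default: return NULL;
    }
}
```

The vector case (automatic detection) is deliberately left out, since it depends on the subsidiary coding systems rather than on a single integer.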
| 8557 | |
| 22874 | 8558 /*** 9. Post-amble ***/ |
| 17052 | 8559 |
| 21514 | 8560 void |
| 17052 | 8561 init_coding_once () |
| 8562 { | |
| 8563 int i; | |
| 8564 | |
| 88365 | 8565 for (i = 0; i < coding_category_max; i++) |
| 8566 { | |
| 8567 coding_categories[i].id = -1; | |
| 8568 coding_priorities[i] = i; | |
| 8569 } | |
| 17052 | 8570 |
| 8571 /* ISO 2022 specific initialization. */ | |
| 8572 for (i = 0; i < 0x20; i++) | |
| 29005 | 8573 iso_code_class[i] = ISO_control_0; |
| 17052 | 8574 for (i = 0x21; i < 0x7F; i++) |
| 8575 iso_code_class[i] = ISO_graphic_plane_0; | |
| 8576 for (i = 0x80; i < 0xA0; i++) | |
| 29005 | 8577 iso_code_class[i] = ISO_control_1; |
| 17052 | 8578 for (i = 0xA1; i < 0xFF; i++) |
| 8579 iso_code_class[i] = ISO_graphic_plane_1; | |
| 8580 iso_code_class[0x20] = iso_code_class[0x7F] = ISO_0x20_or_0x7F; | |
| 8581 iso_code_class[0xA0] = iso_code_class[0xFF] = ISO_0xA0_or_0xFF; | |
| 8582 iso_code_class[ISO_CODE_SO] = ISO_shift_out; | |
| 8583 iso_code_class[ISO_CODE_SI] = ISO_shift_in; | |
| 8584 iso_code_class[ISO_CODE_SS2_7] = ISO_single_shift_2_7; | |
| 8585 iso_code_class[ISO_CODE_ESC] = ISO_escape; | |
| 8586 iso_code_class[ISO_CODE_SS2] = ISO_single_shift_2; | |
| 8587 iso_code_class[ISO_CODE_SS3] = ISO_single_shift_3; | |
| 8588 iso_code_class[ISO_CODE_CSI] = ISO_control_sequence_introducer; | |
| 8589 | |
| 88365 | 8590 for (i = 0; i < 256; i++) |
| 8591 { | |
| 8592 emacs_mule_bytes[i] = 1; | |
| 8593 } | |
| 88876 | 8594 emacs_mule_bytes[EMACS_MULE_LEADING_CODE_PRIVATE_11] = 3; |
| 8595 emacs_mule_bytes[EMACS_MULE_LEADING_CODE_PRIVATE_12] = 3; | |
| 8596 emacs_mule_bytes[EMACS_MULE_LEADING_CODE_PRIVATE_21] = 4; | |
| 8597 emacs_mule_bytes[EMACS_MULE_LEADING_CODE_PRIVATE_22] = 4; | |
| 17119 | 8598 } |
| 8599 | |
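
The ISO 2022 part of `init_coding_once` fills a 256-entry table by byte range and then overrides individual entries for special codes. A reduced standalone sketch of that pattern follows. The enum and function names are hypothetical, only a subset of the classes is modelled, and ESC is taken to be byte 0x1B as in ASCII. The real `enum iso_code_class_type` and the `ISO_CODE_*` values are defined elsewhere in the file.

```c
/* Hypothetical, reduced set of byte classes.  */
enum byte_class
{
  CLASS_CONTROL_0,      /* 0x00..0x1F */
  CLASS_GRAPHIC_0,      /* 0x21..0x7E */
  CLASS_CONTROL_1,      /* 0x80..0x9F */
  CLASS_GRAPHIC_1,      /* 0xA1..0xFE */
  CLASS_0X20_OR_0X7F,
  CLASS_0XA0_OR_0XFF,
  CLASS_ESCAPE
};

static enum byte_class byte_class_table[256];

/* Fill the table range by range, then override single entries.  */
void
init_byte_class_table (void)
{
  int i;

  for (i = 0; i < 0x20; i++)
    byte_class_table[i] = CLASS_CONTROL_0;
  for (i = 0x21; i < 0x7F; i++)
    byte_class_table[i] = CLASS_GRAPHIC_0;
  for (i = 0x80; i < 0xA0; i++)
    byte_class_table[i] = CLASS_CONTROL_1;
  for (i = 0xA1; i < 0xFF; i++)
    byte_class_table[i] = CLASS_GRAPHIC_1;
  /* The four boundary bytes get their own classes.  */
  byte_class_table[0x20] = byte_class_table[0x7F] = CLASS_0X20_OR_0X7F;
  byte_class_table[0xA0] = byte_class_table[0xFF] = CLASS_0XA0_OR_0XFF;
  /* Special control codes override the generic class; only ESC
     (assumed to be 0x1B) is shown here.  */
  byte_class_table[0x1B] = CLASS_ESCAPE;
}

/* Look up the class of byte C; call init_byte_class_table first.  */
enum byte_class
classify_byte (unsigned char c)
{
  return byte_class_table[c];
}
```

The order matters: the range loops run first so that the later single-entry assignments win, which mirrors how the shift and escape codes are installed after the loops in the function above.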
| 8600 #ifdef emacs | |
| 8601 | |
| 21514 | 8602 void |
| 17119 | 8603 syms_of_coding () |
| 8604 { | |
| 88365 | 8605 staticpro (&Vcoding_system_hash_table); |
| 89483 | 8606 { |
| 8607 Lisp_Object args[2]; | |
| 8608 args[0] = QCtest; | |
| 8609 args[1] = Qeq; | |
| 8610 Vcoding_system_hash_table = Fmake_hash_table (2, args); | |
| 8611 } | |
| 88365 | 8612 |
| 8613 staticpro (&Vsjis_coding_system); | |
| 8614 Vsjis_coding_system = Qnil; | |
| 8615 | |
| 8616 staticpro (&Vbig5_coding_system); | |
| 8617 Vbig5_coding_system = Qnil; | |
| 8618 | |
| 89665 | 8619 staticpro (&Vcode_conversion_reused_workbuf); |
| 8620 Vcode_conversion_reused_workbuf = Qnil; | |
| 8621 | |
| 8622 staticpro (&Vcode_conversion_workbuf_name); | |
| 8623 Vcode_conversion_workbuf_name = build_string (" *code-conversion-work*"); | |
| 8624 | |
| 8625 reused_workbuf_in_use = 0; | |
| 88365 | 8626 |
| 8627 DEFSYM (Qcharset, "charset"); | |
| 8628 DEFSYM (Qtarget_idx, "target-idx"); | |
| 8629 DEFSYM (Qcoding_system_history, "coding-system-history"); | |
| 19750 | 8630 Fset (Qcoding_system_history, Qnil); |
| 8631 | |
| 18650 | 8632 /* Target FILENAME is the first argument. */ |
| 17119 | 8633 Fput (Qinsert_file_contents, Qtarget_idx, make_number (0)); |
| 18650 | 8634 /* Target FILENAME is the third argument. */ |
| 17119 | 8635 Fput (Qwrite_region, Qtarget_idx, make_number (2)); |
| 8636 | |
| 88365 | 8637 DEFSYM (Qcall_process, "call-process"); |
| 18650 | 8638 /* Target PROGRAM is the first argument. */ |
| 17119 | 8639 Fput (Qcall_process, Qtarget_idx, make_number (0)); |
| 8640 | |
| 88365 | 8641 DEFSYM (Qcall_process_region, "call-process-region"); |
| 18650 | 8642 /* Target PROGRAM is the third argument. */ |
| 17119 | 8643 Fput (Qcall_process_region, Qtarget_idx, make_number (2)); |
| 8644 | |
| 88365 | 8645 DEFSYM (Qstart_process, "start-process"); |
| 18650 | 8646 /* Target PROGRAM is the third argument. */ |
| 17119 | 8647 Fput (Qstart_process, Qtarget_idx, make_number (2)); |
| 8648 | |
| 88365 | 8649 DEFSYM (Qopen_network_stream, "open-network-stream"); |
| 18650 | 8650 /* Target SERVICE is the fourth argument. */ |
| 17119 | 8651 Fput (Qopen_network_stream, Qtarget_idx, make_number (3)); |
| 8652 | |
| 88365 | 8653 DEFSYM (Qcoding_system, "coding-system"); |
| 8654 DEFSYM (Qcoding_aliases, "coding-aliases"); | |
| 8655 | |
| 8656 DEFSYM (Qeol_type, "eol-type"); | |
| 8657 DEFSYM (Qunix, "unix"); | |
| 8658 DEFSYM (Qdos, "dos"); | |
| 8659 | |
| 8660 DEFSYM (Qbuffer_file_coding_system, "buffer-file-coding-system"); | |
| 8661 DEFSYM (Qpost_read_conversion, "post-read-conversion"); | |
| 8662 DEFSYM (Qpre_write_conversion, "pre-write-conversion"); | |
| 8663 DEFSYM (Qdefault_char, "default-char"); | |
| 8664 DEFSYM (Qundecided, "undecided"); | |
| 8665 DEFSYM (Qno_conversion, "no-conversion"); | |
| 8666 DEFSYM (Qraw_text, "raw-text"); | |
| 8667 | |
| 8668 DEFSYM (Qiso_2022, "iso-2022"); | |
| 8669 | |
| 8670 DEFSYM (Qutf_8, "utf-8"); | |
| 89483 | 8671 DEFSYM (Qutf_8_emacs, "utf-8-emacs"); |
| 88365 | 8672 |
| 8673 DEFSYM (Qutf_16, "utf-16"); | |
| 8674 DEFSYM (Qbig, "big"); | |
| 8675 DEFSYM (Qlittle, "little"); | |
| 8676 | |
| 8677 DEFSYM (Qshift_jis, "shift-jis"); | |
| 8678 DEFSYM (Qbig5, "big5"); | |
| 8679 | |
| 8680 DEFSYM (Qcoding_system_p, "coding-system-p"); | |
| 8681 | |
| 8682 DEFSYM (Qcoding_system_error, "coding-system-error"); | |
| 17052 | 8683 Fput (Qcoding_system_error, Qerror_conditions, |
| 8684 Fcons (Qcoding_system_error, Fcons (Qerror, Qnil))); | |
| 8685 Fput (Qcoding_system_error, Qerror_message, | |
| 18650 | 8686 build_string ("Invalid coding system")); |
| 17052 | 8687 |
| 30487 | 8688 /* Intern this now in case it isn't already done. |
| 8689 Setting this variable twice is harmless. | |
| 8690 But don't staticpro it here--that is done in alloc.c. */ | |
| 8691 Qchar_table_extra_slots = intern ("char-table-extra-slots"); | |
| 88365 | 8692 |
| 8693 DEFSYM (Qtranslation_table, "translation-table"); | |
| 8694 Fput (Qtranslation_table, Qchar_table_extra_slots, make_number (1)); | |
| 8695 DEFSYM (Qtranslation_table_id, "translation-table-id"); | |
| 8696 DEFSYM (Qtranslation_table_for_decode, "translation-table-for-decode"); | |
| 8697 DEFSYM (Qtranslation_table_for_encode, "translation-table-for-encode"); | |
| 8698 | |
| 8699 DEFSYM (Qvalid_codes, "valid-codes"); | |
| 8700 | |
| 8701 DEFSYM (Qemacs_mule, "emacs-mule"); | |
| 8702 | |
| 89468 | 8703 DEFSYM (QCcategory, ":category"); |
| 8704 | |
| 88365 | 8705 Vcoding_category_table |
| 8706 = Fmake_vector (make_number (coding_category_max), Qnil); | |
| 8707 staticpro (&Vcoding_category_table); | |
| 8708 /* The following are targets of code detection. */ | |
| 8709 ASET (Vcoding_category_table, coding_category_iso_7, | |
| 8710 intern ("coding-category-iso-7")); | |
| 8711 ASET (Vcoding_category_table, coding_category_iso_7_tight, | |
| 8712 intern ("coding-category-iso-7-tight")); | |
| 8713 ASET (Vcoding_category_table, coding_category_iso_8_1, | |
| 8714 intern ("coding-category-iso-8-1")); | |
| 8715 ASET (Vcoding_category_table, coding_category_iso_8_2, | |
| 8716 intern ("coding-category-iso-8-2")); | |
| 8717 ASET (Vcoding_category_table, coding_category_iso_7_else, | |
| 8718 intern ("coding-category-iso-7-else")); | |
| 8719 ASET (Vcoding_category_table, coding_category_iso_8_else, | |
| 8720 intern ("coding-category-iso-8-else")); | |
| 8721 ASET (Vcoding_category_table, coding_category_utf_8, | |
| 8722 intern ("coding-category-utf-8")); | |
| 8723 ASET (Vcoding_category_table, coding_category_utf_16_be, | |
| 8724 intern ("coding-category-utf-16-be")); | |
| 89467 | 8725 ASET (Vcoding_category_table, coding_category_utf_16_auto, |
| 8726 intern ("coding-category-utf-16-auto")); | |
| 88365 | 8727 ASET (Vcoding_category_table, coding_category_utf_16_le, |
| 8728 intern ("coding-category-utf-16-le")); | |
| 8729 ASET (Vcoding_category_table, coding_category_utf_16_be_nosig, | |
| 8730 intern ("coding-category-utf-16-be-nosig")); | |
| 8731 ASET (Vcoding_category_table, coding_category_utf_16_le_nosig, | |
| 8732 intern ("coding-category-utf-16-le-nosig")); | |
| 8733 ASET (Vcoding_category_table, coding_category_charset, | |
| 8734 intern ("coding-category-charset")); | |
| 8735 ASET (Vcoding_category_table, coding_category_sjis, | |
| 8736 intern ("coding-category-sjis")); | |
| 8737 ASET (Vcoding_category_table, coding_category_big5, | |
| 8738 intern ("coding-category-big5")); | |
| 8739 ASET (Vcoding_category_table, coding_category_ccl, | |
| 8740 intern ("coding-category-ccl")); | |
| 8741 ASET (Vcoding_category_table, coding_category_emacs_mule, | |
| 8742 intern ("coding-category-emacs-mule")); | |
| 8743 /* The following are NOT targets of code detection. */ | |
| 8744 ASET (Vcoding_category_table, coding_category_raw_text, | |
| 8745 intern ("coding-category-raw-text")); | |
| 8746 ASET (Vcoding_category_table, coding_category_undecided, | |
| 8747 intern ("coding-category-undecided")); | |
| 8748 | |
| 17052 | 8749 defsubr (&Scoding_system_p); |
| 8750 defsubr (&Sread_coding_system); | |
| 8751 defsubr (&Sread_non_nil_coding_system); | |
| 8752 defsubr (&Scheck_coding_system); | |
| 8753 defsubr (&Sdetect_coding_region); | |
| 20718 | 8754 defsubr (&Sdetect_coding_string); |
| 30487 | 8755 defsubr (&Sfind_coding_systems_region_internal); |
| 46859 | 8756 defsubr (&Sunencodable_char_position); |
| 88365 | 8757 defsubr (&Scheck_coding_systems_region); |
| 17052 | 8758 defsubr (&Sdecode_coding_region); |
| 8759 defsubr (&Sencode_coding_region); | |
| 8760 defsubr (&Sdecode_coding_string); | |
| 8761 defsubr (&Sencode_coding_string); | |
| 8762 defsubr (&Sdecode_sjis_char); | |
| 8763 defsubr (&Sencode_sjis_char); | |
| 8764 defsubr (&Sdecode_big5_char); | |
| 8765 defsubr (&Sencode_big5_char); | |
| 18002 | 8766 defsubr (&Sset_terminal_coding_system_internal); |
| 19280 | 8767 defsubr (&Sset_safe_terminal_coding_system_internal); |
| 17052 | 8768 defsubr (&Sterminal_coding_system); |
| 18002 | 8769 defsubr (&Sset_keyboard_coding_system_internal); |
| 17052 | 8770 defsubr (&Skeyboard_coding_system); |
| 18536 | 8771 defsubr (&Sfind_operation_coding_system); |
| 88365 | 8772 defsubr (&Sset_coding_system_priority); |
| 8773 defsubr (&Sdefine_coding_system_internal); | |
| 8774 defsubr (&Sdefine_coding_system_alias); | |
| 8775 defsubr (&Scoding_system_base); | |
| 8776 defsubr (&Scoding_system_plist); | |
| 8777 defsubr (&Scoding_system_aliases); | |
| 8778 defsubr (&Scoding_system_eol_type); | |
| 8779 defsubr (&Scoding_system_priority_list); | |
| 17052 | 8780 |
| 20105 | 8781 DEFVAR_LISP ("coding-system-list", &Vcoding_system_list, |
| 40713 | 8782 doc: /* List of coding systems. |
| 8783 | |
| 8784 Do not alter the value of this variable manually. This variable should be | |
| 88365 | 8785 updated by the functions `define-coding-system' and |
| 40713 | 8786 `define-coding-system-alias'. */); |
| 20105 | 8787 Vcoding_system_list = Qnil; |
| 8788 | |
| 8789 DEFVAR_LISP ("coding-system-alist", &Vcoding_system_alist, | |
| 40713 | 8790 doc: /* Alist of coding system names. |
| 8791 Each element is a one-element list containing a coding system name. | |
| 8792 This variable is given to `completing-read' as the TABLE argument. | |
| 8793 | |
| 8794 Do not alter the value of this variable manually. This variable should be | |
| 8795 updated by the functions `make-coding-system' and | |
| 8796 `define-coding-system-alias'. */); | |
| 20105 | 8797 Vcoding_system_alist = Qnil; |
| 8798 | |
| 17052 | 8799 DEFVAR_LISP ("coding-category-list", &Vcoding_category_list, |
| 40713 | 8800 doc: /* List of coding-categories (symbols) ordered by priority. |
| 8801 | |
| 8802 On detecting a coding system, Emacs tries code detection algorithms | |
| 8803 associated with each coding-category one by one in this order. When | |
| 8804 one algorithm agrees with a byte sequence of source text, the coding | |
| 8805 system bound to the corresponding coding-category is selected. */); | |
| 17052 | 8806 { |
| 8807 int i; | |
| 8808 | |
| 8809 Vcoding_category_list = Qnil; | |
| 88365 | 8810 for (i = coding_category_max - 1; i >= 0; i--) |
| 17052 | 8811 Vcoding_category_list |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
8812 = Fcons (XVECTOR (Vcoding_category_table)->contents[i], |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
8813 Vcoding_category_list); |
| 17052 | 8814 } |
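The loop above walks the category table from its last index down to 0 and conses each entry onto the front of the list, so the finished list starts with entry 0. A minimal standalone sketch of that idiom follows; `struct cell`, `cons`, and `list_from_table` are hypothetical stand-ins for illustration, not the Emacs Lisp object API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Lisp cons cell that holds an int. */
struct cell {
    int value;
    struct cell *next;
};

/* Allocate a new cell in front of NEXT (analogous to Fcons). */
static struct cell *cons(int value, struct cell *next)
{
    struct cell *c = malloc(sizeof *c);
    assert(c != NULL);
    c->value = value;
    c->next = next;
    return c;
}

/* Prepend table[n-1], ..., table[0] in turn; because each element is
   pushed onto the front, the resulting list reads table[0], table[1], ... */
struct cell *list_from_table(const int *table, int n)
{
    struct cell *list = NULL;
    for (int i = n - 1; i >= 0; i--)
        list = cons(table[i], list);
    return list;
}
```

Iterating backwards avoids a separate reversal pass: the list keeps the table's order while every insertion stays a constant-time prepend.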
| 8815 | |
| 8816 DEFVAR_LISP ("coding-system-for-read", &Vcoding_system_for_read, | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8817 doc: /* Specify the coding system for read operations. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8818 It is useful to bind this variable with `let', but do not set it globally. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8819 If the value is a coding system, it is used for decoding on read operation. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8820 If not, an appropriate element is used from one of the coding system alists: |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8821 There are three such tables, `file-coding-system-alist', |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8822 `process-coding-system-alist', and `network-coding-system-alist'. */); |
| 17052 | 8823 Vcoding_system_for_read = Qnil; |
| 8824 | |
| 8825 DEFVAR_LISP ("coding-system-for-write", &Vcoding_system_for_write, | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8826 doc: /* Specify the coding system for write operations. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8827 Programs bind this variable with `let', but you should not set it globally. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8828 If the value is a coding system, it is used for encoding of output, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8829 when writing it to a file and when sending it to a file or subprocess. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8830 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8831 If this does not specify a coding system, an appropriate element |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8832 is used from one of the coding system alists: |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8833 There are three such tables, `file-coding-system-alist', |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8834 `process-coding-system-alist', and `network-coding-system-alist'. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8835 For output to files, if the above procedure does not specify a coding system, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8836 the value of `buffer-file-coding-system' is used. */); |
| 17052 | 8837 Vcoding_system_for_write = Qnil; |
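The doc string for `coding-system-for-write` describes a fallback chain for file output: the bound value first, then a match from one of the coding system alists, then `buffer-file-coding-system`. A rough sketch of that order follows, assuming coding systems are represented as plain strings and NULL plays the role of nil; `choose_write_coding` and its parameters are hypothetical names, not functions in coding.c.

```c
#include <stddef.h>

/* Pick the coding system for writing a file, following the order in the
   doc string.  NULL stands in for nil (an assumption of this sketch). */
const char *choose_write_coding(const char *for_write,      /* coding-system-for-write */
                                const char *alist_match,    /* result of the alist lookup */
                                const char *buffer_default) /* buffer-file-coding-system */
{
    if (for_write != NULL)
        return for_write;       /* an explicit binding wins */
    if (alist_match != NULL)
        return alist_match;     /* otherwise use the alist entry */
    return buffer_default;      /* last resort for file output */
}
```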
| 8838 | |
| 8839 DEFVAR_LISP ("last-coding-system-used", &Vlast_coding_system_used, | |
| 88365 | 8840 doc: /* |
| 8841 Coding system used in the latest file or process I/O. */); | |
| 17052 | 8842 Vlast_coding_system_used = Qnil; |
| 8843 | |
|
18650
aa3f2820e2ac
(Qemacs_mule, inhibit_eol_conversion): New variables.
Kenichi Handa <handa@m17n.org>
parents:
18613
diff
changeset
|
8844 DEFVAR_BOOL ("inhibit-eol-conversion", &inhibit_eol_conversion, |
| 88365 | 8845 doc: /* |
| 8846 *Non-nil means always inhibit code conversion of end-of-line format. | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8847 See info node `Coding Systems' and info node `Text and Binary' concerning |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8848 such conversion. */); |
|
18650
aa3f2820e2ac
(Qemacs_mule, inhibit_eol_conversion): New variables.
Kenichi Handa <handa@m17n.org>
parents:
18613
diff
changeset
|
8849 inhibit_eol_conversion = 0; |
|
aa3f2820e2ac
(Qemacs_mule, inhibit_eol_conversion): New variables.
Kenichi Handa <handa@m17n.org>
parents:
18613
diff
changeset
|
8850 |
|
21574
30394e3ae7f8
(syms_of_coding): Declare and define inherit-process-coding-system.
Eli Zaretskii <eliz@gnu.org>
parents:
21520
diff
changeset
|
8851 DEFVAR_BOOL ("inherit-process-coding-system", &inherit_process_coding_system, |
| 88365 | 8852 doc: /* |
| 8853 Non-nil means process buffer inherits coding system of process output. | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8854 Bind it to t if the process output is to be treated as if it were a file |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8855 read from some filesystem. */); |
|
21574
30394e3ae7f8
(syms_of_coding): Declare and define inherit-process-coding-system.
Eli Zaretskii <eliz@gnu.org>
parents:
21520
diff
changeset
|
8856 inherit_process_coding_system = 0; |
|
30394e3ae7f8
(syms_of_coding): Declare and define inherit-process-coding-system.
Eli Zaretskii <eliz@gnu.org>
parents:
21520
diff
changeset
|
8857 |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8858 DEFVAR_LISP ("file-coding-system-alist", &Vfile_coding_system_alist, |
| 88365 | 8859 doc: /* |
| 8860 Alist to decide a coding system to use for a file I/O operation. | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8861 The format is ((PATTERN . VAL) ...), |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8862 where PATTERN is a regular expression matching a file name, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8863 VAL is a coding system, a cons of coding systems, or a function symbol. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8864 If VAL is a coding system, it is used for both decoding and encoding |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8865 the file contents. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8866 If VAL is a cons of coding systems, the car part is used for decoding, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8867 and the cdr part is used for encoding. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8868 If VAL is a function symbol, the function must return a coding system |
|
41678
5aa97e545399
(syms_of_coding) <Qchar_coding_system>: Give it an
Dave Love <fx@gnu.org>
parents:
41624
diff
changeset
|
8869 or a cons of coding systems which are used as above. The function gets |
|
5aa97e545399
(syms_of_coding) <Qchar_coding_system>: Give it an
Dave Love <fx@gnu.org>
parents:
41624
diff
changeset
|
8870 the arguments with which `find-operation-coding-system' was called. |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8871 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8872 See also the function `find-operation-coding-system' |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8873 and the variable `auto-coding-alist'. */); |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8874 Vfile_coding_system_alist = Qnil; |
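The `((PATTERN . VAL) ...)` format described above can be pictured as a table of rules searched by file name. The sketch below is only an illustration under stated assumptions: it uses POSIX extended regular expressions rather than Emacs regexps, models VAL as a decoding/encoding pair (the car and cdr of the cons case), and assumes the first matching entry wins, which the doc string does not state. `struct coding_rule` and `find_coding_rule` are hypothetical names.

```c
#include <regex.h>
#include <stddef.h>

/* One (PATTERN . (DECODING . ENCODING)) entry; names are hypothetical. */
struct coding_rule {
    const char *pattern;   /* POSIX extended regexp (Emacs syntax differs) */
    const char *decoding;  /* car part: used when reading */
    const char *encoding;  /* cdr part: used when writing */
};

/* Return the first rule whose pattern matches NAME, or NULL if none does. */
const struct coding_rule *find_coding_rule(const struct coding_rule *rules,
                                           size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        regex_t re;
        if (regcomp(&re, rules[i].pattern, REG_EXTENDED | REG_NOSUB) != 0)
            continue;   /* skip patterns that fail to compile */
        int matched = regexec(&re, name, 0, NULL, 0) == 0;
        regfree(&re);
        if (matched)
            return &rules[i];
    }
    return NULL;
}
```

A VAL that is a single coding system corresponds to an entry whose `decoding` and `encoding` fields are the same; the function-symbol case is not modeled here.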
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8875 |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8876 DEFVAR_LISP ("process-coding-system-alist", &Vprocess_coding_system_alist, |
| 88365 | 8877 doc: /* |
| 8878 Alist to decide a coding system to use for a process I/O operation. | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8879 The format is ((PATTERN . VAL) ...), |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8880 where PATTERN is a regular expression matching a program name, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8881 VAL is a coding system, a cons of coding systems, or a function symbol. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8882 If VAL is a coding system, it is used for both decoding what is received |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8883 from the program and encoding what is sent to the program. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8884 If VAL is a cons of coding systems, the car part is used for decoding, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8885 and the cdr part is used for encoding. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8886 If VAL is a function symbol, the function must return a coding system |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8887 or a cons of coding systems which are used as above. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8888 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8889 See also the function `find-operation-coding-system'. */); |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8890 Vprocess_coding_system_alist = Qnil; |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8891 |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8892 DEFVAR_LISP ("network-coding-system-alist", &Vnetwork_coding_system_alist, |
| 88365 | 8893 doc: /* |
| 8894 Alist to decide a coding system to use for a network I/O operation. | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8895 The format is ((PATTERN . VAL) ...), |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8896 where PATTERN is a regular expression matching a network service name |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8897 or is a port number to connect to, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8898 VAL is a coding system, a cons of coding systems, or a function symbol. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8899 If VAL is a coding system, it is used for both decoding what is received |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8900 from the network stream and encoding what is sent to the network stream. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8901 If VAL is a cons of coding systems, the car part is used for decoding, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8902 and the cdr part is used for encoding. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8903 If VAL is a function symbol, the function must return a coding system |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8904 or a cons of coding systems which are used as above. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8905 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8906 See also the function `find-operation-coding-system'. */); |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8907 Vnetwork_coding_system_alist = Qnil; |
| 17052 | 8908 |
|
26088
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
8909 DEFVAR_LISP ("locale-coding-system", &Vlocale_coding_system, |
|
41026
6f20449b7e12
(syms_of_coding): Doc fix.
Richard M. Stallman <rms@gnu.org>
parents:
41006
diff
changeset
|
8910 doc: /* Coding system to use with system messages. |
|
6f20449b7e12
(syms_of_coding): Doc fix.
Richard M. Stallman <rms@gnu.org>
parents:
41006
diff
changeset
|
8911 Also used for decoding keyboard input on the X Window System. */); |
|
26088
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
8912 Vlocale_coding_system = Qnil; |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
8913 |
|
29182
1d1c27067af4
(encode_eol): Add null statement after label.
Dave Love <fx@gnu.org>
parents:
29172
diff
changeset
|
8914 /* The eol mnemonics are reset in startup.el system-dependently. */ |
|
24200
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8915 DEFVAR_LISP ("eol-mnemonic-unix", &eol_mnemonic_unix, |
| 88365 | 8916 doc: /* |
| 8917 *String displayed in mode line for UNIX-like (LF) end-of-line format. */); | |
|
24200
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8918 eol_mnemonic_unix = build_string (":"); |
|
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8919 |
|
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8920 DEFVAR_LISP ("eol-mnemonic-dos", &eol_mnemonic_dos, |
| 88365 | 8921 doc: /* |
| 8922 *String displayed in mode line for DOS-like (CRLF) end-of-line format. */); | |
|
24200
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8923 eol_mnemonic_dos = build_string ("\\"); |
|
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8924 |
|
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8925 DEFVAR_LISP ("eol-mnemonic-mac", &eol_mnemonic_mac, |
| 88365 | 8926 doc: /* |
| 8927 *String displayed in mode line for MAC-like (CR) end-of-line format. */); | |
|
24200
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8928 eol_mnemonic_mac = build_string ("/"); |
|
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8929 |
|
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8930 DEFVAR_LISP ("eol-mnemonic-undecided", &eol_mnemonic_undecided, |
| 88365 | 8931 doc: /* |
| 8932 *String displayed in mode line when end-of-line format is not yet determined. */); | |
|
24200
b9d9fccad516
(syms_of_coding): eol-mnemonic-* variables are now
Eli Zaretskii <eliz@gnu.org>
parents:
24178
diff
changeset
|
8933 eol_mnemonic_undecided = build_string (":"); |
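The four `eol-mnemonic-*` variables above amount to a small mapping from end-of-line format to a mode-line string. The sketch below reproduces the initial values set in this file; per the comment above, startup.el may reset them per system, so these are only the compiled-in defaults. `enum eol_format` and `eol_mnemonic` are hypothetical names for illustration.

```c
/* End-of-line formats named in the doc strings (hypothetical enum). */
enum eol_format { EOL_UNDECIDED, EOL_UNIX, EOL_DOS, EOL_MAC };

/* Return the initial mode-line mnemonic for FORMAT, mirroring the
   build_string calls above. */
const char *eol_mnemonic(enum eol_format format)
{
    switch (format) {
    case EOL_UNIX: return ":";   /* LF */
    case EOL_DOS:  return "\\";  /* CRLF: a single backslash */
    case EOL_MAC:  return "/";   /* CR */
    default:       return ":";   /* not yet determined */
    }
}
```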
| 17052 | 8934 |
|
22119
592bb8b9bcfd
Change terms unify/unification to
Kenichi Handa <handa@m17n.org>
parents:
22020
diff
changeset
|
8935 DEFVAR_LISP ("enable-character-translation", &Venable_character_translation, |
| 88365 | 8936 doc: /* |
| 8937 *Non-nil enables character translation while encoding and decoding. */); | |
|
22119
592bb8b9bcfd
Change terms unify/unification to
Kenichi Handa <handa@m17n.org>
parents:
22020
diff
changeset
|
8938 Venable_character_translation = Qt; |
|
592bb8b9bcfd
Change terms unify/unification to
Kenichi Handa <handa@m17n.org>
parents:
22020
diff
changeset
|
8939 |
|
22186
fc4aaf1b1772
Change term "character translation table" to "translation table".
Kenichi Handa <handa@m17n.org>
parents:
22166
diff
changeset
|
8940 DEFVAR_LISP ("standard-translation-table-for-decode", |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8941 &Vstandard_translation_table_for_decode, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8942 doc: /* Table for translating characters while decoding. */); |
|
22186
fc4aaf1b1772
Change term "character translation table" to "translation table".
Kenichi Handa <handa@m17n.org>
parents:
22166
diff
changeset
|
8943 Vstandard_translation_table_for_decode = Qnil; |
|
fc4aaf1b1772
Change term "character translation table" to "translation table".
Kenichi Handa <handa@m17n.org>
parents:
22166
diff
changeset
|
8944 |
|
fc4aaf1b1772
Change term "character translation table" to "translation table".
Kenichi Handa <handa@m17n.org>
parents:
22166
diff
changeset
|
8945 DEFVAR_LISP ("standard-translation-table-for-encode", |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8946 &Vstandard_translation_table_for_encode, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8947 doc: /* Table for translating characters while encoding. */); |
|
22186
fc4aaf1b1772
Change term "character translation table" to "translation table".
Kenichi Handa <handa@m17n.org>
parents:
22166
diff
changeset
|
8948 Vstandard_translation_table_for_encode = Qnil; |
| 17052 | 8949 |
| 88365 | 8950 DEFVAR_LISP ("charset-revision-table", &Vcharset_revision_table, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8951 doc: /* Alist of charsets vs revision numbers. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8952 While encoding, if a charset (car part of an element) is found, |
| 88365 | 8953 designate it with the escape sequence identifying revision (cdr part |
| 8954 of the element). */); | |
| 8955 Vcharset_revision_table = Qnil; | |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8956 |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8957 DEFVAR_LISP ("default-process-coding-system", |
|
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8958 &Vdefault_process_coding_system, |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8959 doc: /* Cons of coding systems used for process I/O by default. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8960 The car part is used for decoding process output, |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8961 the cdr part is used for encoding text to be sent to a process. */); |
|
18180
5f4c4da24e75
(Vcoding_system_alist): Deleted.
Kenichi Handa <handa@m17n.org>
parents:
18002
diff
changeset
|
8962 Vdefault_process_coding_system = Qnil; |
|
19280
e755044718ee
(ENCODE_ISO_CHARACTER_DIMENSION1): Pay attention to
Kenichi Handa <handa@m17n.org>
parents:
19193
diff
changeset
|
8963 |
|
19365
d9374f5ebd3a
(CODING_FLAG_ISO_LATIN_EXTRA): New macro.
Kenichi Handa <handa@m17n.org>
parents:
19285
diff
changeset
|
8964 DEFVAR_LISP ("latin-extra-code-table", &Vlatin_extra_code_table, |
| 88365 | 8965 doc: /* |
| 8966 Table of extra Latin codes in the range 128..159 (inclusive). | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8967 This is a vector of length 256. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8968 If the Nth element is non-nil, the existence of code N in a file |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8969 \(or output of a subprocess) doesn't prevent it from being detected as |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8970 a coding system of ISO 2022 variant which has a flag |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8971 `accept-latin-extra-code' t (e.g. iso-latin-1) on reading a file |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8972 or reading output of a subprocess. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8973 Only the 128th through 159th elements have a meaning. */); |
|
19365
d9374f5ebd3a
(CODING_FLAG_ISO_LATIN_EXTRA): New macro.
Kenichi Handa <handa@m17n.org>
parents:
19285
diff
changeset
|
8974 Vlatin_extra_code_table = Fmake_vector (make_number (256), Qnil); |
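The table described above is a 256-slot vector in which only slots 128 through 159 are consulted. A minimal sketch of how such a table could gate detection follows, assuming a plain `bool` array in place of the Lisp vector; `latin_extra_codes_allowed` is a hypothetical helper, not the detection routine in coding.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if every byte of TEXT that falls in 128..159 is marked as
   acceptable in EXTRA_OK.  Bytes outside that range are never checked,
   matching the note that only those elements have a meaning. */
bool latin_extra_codes_allowed(const unsigned char *text, size_t len,
                               const bool extra_ok[256])
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = text[i];
        if (c >= 128 && c <= 159 && !extra_ok[c])
            return false;   /* an unlisted extra code rules the text out */
    }
    return true;
}
```

Indexing by the byte value itself is why the table has 256 slots even though only 32 of them matter: no offset arithmetic is needed at lookup time.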
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
8975 |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
8976 DEFVAR_LISP ("select-safe-coding-system-function", |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
8977 &Vselect_safe_coding_system_function, |
| 88365 | 8978 doc: /* |
| 8979 Function to call to select a safe coding system for encoding text. | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8980 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8981 If set, this function is called to force a user to select a proper |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8982 coding system which can encode the text in the case that a default |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8983 coding system used in each operation can't encode the text. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8984 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
8985 The default value is `select-safe-coding-system' (which see). */); |
|
20718
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
8986 Vselect_safe_coding_system_function = Qnil; |
|
c600dea3b06b
Vselect_safe_coding_system_function): New variable.
Kenichi Handa <handa@m17n.org>
parents:
20708
diff
changeset
|
8987 |
|
48874
3002a87cc629
(coding_system_require_warning): New variable.
Kenichi Handa <handa@m17n.org>
parents:
48829
diff
changeset
|
8988 DEFVAR_BOOL ("coding-system-require-warning", |
|
3002a87cc629
(coding_system_require_warning): New variable.
Kenichi Handa <handa@m17n.org>
parents:
48829
diff
changeset
|
8989 &coding_system_require_warning, |
|
3002a87cc629
(coding_system_require_warning): New variable.
Kenichi Handa <handa@m17n.org>
parents:
48829
diff
changeset
|
8990 doc: /* Internal use only. |
|
49539
1ad5bfbb831a
(syms_of_coding): Add `...' for symbols in the docstring of
Kenichi Handa <handa@m17n.org>
parents:
48874
diff
changeset
|
8991 If non-nil, on writing a file, `select-safe-coding-system-function' is |
|
1ad5bfbb831a
(syms_of_coding): Add `...' for symbols in the docstring of
Kenichi Handa <handa@m17n.org>
parents:
48874
diff
changeset
|
8992 called even if `coding-system-for-write' is non-nil. The command |
|
1ad5bfbb831a
(syms_of_coding): Add `...' for symbols in the docstring of
Kenichi Handa <handa@m17n.org>
parents:
48874
diff
changeset
|
8993 `universal-coding-system-argument' binds this variable to t temporarily. */); |
|
48874
3002a87cc629
(coding_system_require_warning): New variable.
Kenichi Handa <handa@m17n.org>
parents:
48829
diff
changeset
|
8994 coding_system_require_warning = 0; |
|
3002a87cc629
(coding_system_require_warning): New variable.
Kenichi Handa <handa@m17n.org>
parents:
48829
diff
changeset
|
8995 |
|
3002a87cc629
(coding_system_require_warning): New variable.
Kenichi Handa <handa@m17n.org>
parents:
48829
diff
changeset
|
8996 |
|
30292
14a9937df1f5
(syms_of_coding): Fix typo in spelling of variable
Gerd Moellmann <gerd@gnu.org>
parents:
30263
diff
changeset
|
8997 DEFVAR_BOOL ("inhibit-iso-escape-detection", |
|
30204
35aec8514228
(inhibit_iso_escape_detection): New variable.
Kenichi Handa <handa@m17n.org>
parents:
29985
diff
changeset
|
8998 &inhibit_iso_escape_detection, |
| 88365 | 8999 doc: /* |
| 9000 If non-nil, Emacs ignores ISO2022's escape sequence on code detection. | |
|
40713
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9001 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9002 By default, on reading a file, Emacs tries to detect how the text is |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9003 encoded. This code detection is sensitive to escape sequences. If |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9004 the sequence is valid as ISO2022, the code is determined as one of |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9005 the ISO2022 encodings, and the file is decoded by the corresponding |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9006 coding system (e.g. `iso-2022-7bit'). |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9007 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Jan?k <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9008 However, there may be cases where you want to read escape sequences in |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9009 a file as is. In such a case, you can set this variable to non-nil. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9010 Then, as the code detection ignores any escape sequences, no file is |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9011 detected as encoded in some ISO2022 encoding. The result is that all |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9012 escape sequences become visible in a buffer. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9013 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9014 The default value is nil, and it is strongly recommended not to change |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9015 it. That is because many Emacs Lisp source files that contain |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9016 non-ASCII characters are encoded by the coding system `iso-2022-7bit' |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9017 in Emacs's distribution, and they won't be decoded correctly on |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9018 reading if you suppress escape sequence detection. |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9019 |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9020 The other way to read escape sequences in a file without decoding is |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9021 to explicitly specify some coding system that doesn't use ISO2022's |
|
42351475da08
Change doc-string comments to `new style' [w/`doc:' keyword].
Pavel Janík <Pavel@Janik.cz>
parents:
40656
diff
changeset
|
9022 escape sequence (e.g. `latin-1') on reading by \\[universal-coding-system-argument]. */); |
|
30204
35aec8514228
(inhibit_iso_escape_detection): New variable.
Kenichi Handa <handa@m17n.org>
parents:
29985
diff
changeset
|
9023 inhibit_iso_escape_detection = 0; |
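The doc string above says that code detection reacts to escape sequences unless the variable is non-nil. The following is a minimal sketch of that idea, not the detector in coding.c. The names `inhibit_escape_detection` and `looks_like_iso2022` are hypothetical, and the test used here (ESC followed by an intermediate byte in the range 0x20..0x2F) is a deliberately crude assumption about what counts as an ISO 2022 sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the variable defined above; 0 means
   escape sequences take part in detection.  */
int inhibit_escape_detection = 0;

/* Return 1 if SRC (LEN bytes) contains ESC followed by a byte in
   0x20..0x2F and detection is not inhibited, else 0.
   This is a crude illustration, not the logic in coding.c.  */
int
looks_like_iso2022 (const unsigned char *src, size_t len)
{
  size_t i;

  assert (src != NULL || len == 0);
  if (inhibit_escape_detection)
    return 0;			/* escape sequences are ignored */
  for (i = 0; i + 1 < len; i++)
    if (src[i] == 0x1B && src[i + 1] >= 0x20 && src[i + 1] <= 0x2F)
      return 1;
  return 0;
}
```

With the flag set, a buffer that starts with ESC $ B is no longer reported as ISO 2022, so a caller that then skips decoding would leave the escape bytes visible, which mirrors the behavior the doc string describes.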
|
88456
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9024 |
|
48182
9474e269efd1
Reformat some DEFUNs so that etags works.
Dave Love <fx@gnu.org>
parents:
48125
diff
changeset
|
9025 DEFVAR_LISP ("translation-table-for-input", &Vtranslation_table_for_input, |
|
48230
c2ce8280fb97
(Vtranslation_table_for_input): New.
Dave Love <fx@gnu.org>
parents:
48182
diff
changeset
|
9026 doc: /* Char table for translating self-inserting characters. |
|
c2ce8280fb97
(Vtranslation_table_for_input): New.
Dave Love <fx@gnu.org>
parents:
48182
diff
changeset
|
9027 This is applied to the result of input methods, not their input. See also |
|
c2ce8280fb97
(Vtranslation_table_for_input): New.
Dave Love <fx@gnu.org>
parents:
48182
diff
changeset
|
9028 `keyboard-translate-table'. */); |
|
48182
9474e269efd1
Reformat some DEFUNs so that etags works.
Dave Love <fx@gnu.org>
parents:
48125
diff
changeset
|
9029 Vtranslation_table_for_input = Qnil; |
| 89483 | 9030 |
|
88456
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9031 { |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9032 Lisp_Object args[coding_arg_max]; |
| 89483 | 9033 Lisp_Object plist[16]; |
|
88456
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9034 int i; |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9035 |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9036 for (i = 0; i < coding_arg_max; i++) |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9037 args[i] = Qnil; |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9038 |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9039 plist[0] = intern (":name"); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9040 plist[1] = args[coding_arg_name] = Qno_conversion; |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9041 plist[2] = intern (":mnemonic"); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9042 plist[3] = args[coding_arg_mnemonic] = make_number ('='); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9043 plist[4] = intern (":coding-type"); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9044 plist[5] = args[coding_arg_coding_type] = Qraw_text; |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9045 plist[6] = intern (":ascii-compatible-p"); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9046 plist[7] = args[coding_arg_ascii_compatible_p] = Qt; |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9047 plist[8] = intern (":default-char"); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9048 plist[9] = args[coding_arg_default_char] = make_number (0); |
| 89483 | 9049 plist[10] = intern (":for-unibyte"); |
| 9050 plist[11] = args[coding_arg_for_unibyte] = Qt; | |
| 9051 plist[12] = intern (":docstring"); | |
| 9052 plist[13] = build_string ("Do no conversion.\n\ | |
|
88456
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9053 \n\ |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9054 When you visit a file with this coding, the file is read into a\n\ |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9055 unibyte buffer as is, thus each byte of a file is treated as a\n\ |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9056 character."); |
| 89483 | 9057 plist[14] = intern (":eol-type"); |
| 9058 plist[15] = args[coding_arg_eol_type] = Qunix; | |
| 9059 args[coding_arg_plist] = Flist (16, plist); | |
|
88456
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9060 Fdefine_coding_system_internal (coding_arg_max, args); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9061 } |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9062 |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9063 setup_coding_system (Qno_conversion, &keyboard_coding); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9064 setup_coding_system (Qno_conversion, &terminal_coding); |
|
a7b309f72920
(coding_alloc_by_making_gap): Check the case that the
Kenichi Handa <handa@m17n.org>
parents:
88443
diff
changeset
|
9065 setup_coding_system (Qno_conversion, &safe_terminal_coding); |
|
89467
e911ca706166
(Fset_coding_system_priority): Doc fix. Update values
Kenichi Handa <handa@m17n.org>
parents:
89462
diff
changeset
|
9066 |
|
e911ca706166
(Fset_coding_system_priority): Doc fix. Update values
Kenichi Handa <handa@m17n.org>
parents:
89462
diff
changeset
|
9067 { |
|
e911ca706166
(Fset_coding_system_priority): Doc fix. Update values
Kenichi Handa <handa@m17n.org>
parents:
89462
diff
changeset
|
9068 int i; |
|
e911ca706166
(Fset_coding_system_priority): Doc fix. Update values
Kenichi Handa <handa@m17n.org>
parents:
89462
diff
changeset
|
9069 |
|
e911ca706166
(Fset_coding_system_priority): Doc fix. Update values
Kenichi Handa <handa@m17n.org>
parents:
89462
diff
changeset
|
9070 for (i = 0; i < coding_category_max; i++) |
|
e911ca706166
(Fset_coding_system_priority): Doc fix. Update values
Kenichi Handa <handa@m17n.org>
parents:
89462
diff
changeset
|
9071 Fset (AREF (Vcoding_category_table, i), Qno_conversion); |
|
e911ca706166
(Fset_coding_system_priority): Doc fix. Update values
Kenichi Handa <handa@m17n.org>
parents:
89462
diff
changeset
|
9072 } |
| 17052 | 9073 } |
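The block that defines `no-conversion` fills `plist` with alternating keys and values (`:name`, `:mnemonic`, `:coding-type`, and so on) before handing it to `Flist`. The sketch below shows the same alternating layout with plain C strings in place of `Lisp_Object` values. The function `plist_get` is a hypothetical helper written for this illustration and is not part of coding.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Look up KEY in PLIST, an array of N strings laid out as
   key0, value0, key1, value1, ...  Return the matching value,
   or NULL if KEY is absent.  Hypothetical helper.  */
const char *
plist_get (const char *const *plist, size_t n, const char *key)
{
  size_t i;

  assert (n % 2 == 0);		/* keys and values come in pairs */
  for (i = 0; i + 1 < n; i += 2)
    if (strcmp (plist[i], key) == 0)
      return plist[i + 1];
  return NULL;
}
```

Because every key sits at an even index and its value at the following odd index, the array length must be even, which is why the source sizes `plist` at 16 for eight properties.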
| 9074 | |
|
26088
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9075 char * |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9076 emacs_strerror (error_number) |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9077 int error_number; |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9078 { |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9079 char *str; |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9080 |
|
26526
b7438760079b
* callproc.c (strerror): Remove decl.
Paul Eggert <eggert@twinsun.com>
parents:
26240
diff
changeset
|
9081 synchronize_system_messages_locale (); |
|
26088
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9082 str = strerror (error_number); |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9083 |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9084 if (! NILP (Vlocale_coding_system)) |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9085 { |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9086 Lisp_Object dec = code_convert_string_norecord (build_string (str), |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9087 Vlocale_coding_system, |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9088 0); |
|
46370
40db0673e6f0
Most uses of XSTRING combined with STRING_BYTES or indirection changed to
Ken Raeburn <raeburn@raeburn.org>
parents:
46293
diff
changeset
|
9089 str = (char *) SDATA (dec); |
|
26088
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9090 } |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9091 |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9092 return str; |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9093 } |
|
b7aa6ac26872
Add support for large files, 64-bit Solaris, system locale codings.
Paul Eggert <eggert@twinsun.com>
parents:
26067
diff
changeset
|
9094 |
| 17052 | 9095 #endif /* emacs */ |
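`emacs_strerror` above fetches the system message with `strerror` and, when a locale coding system is set, decodes it before returning. A minimal sketch of that "fetch, then optionally decode" pattern follows. It uses no Emacs internals: `error_message`, `decode_fn` and `upcase_ascii` are hypothetical names, the decoder is a toy stand-in, and copying into a caller-supplied buffer is a design choice of this sketch rather than something the original does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical decoder type: rewrite the NUL-terminated TEXT in place.  */
typedef void (*decode_fn) (char *text);

/* Toy decoder used only for illustration: upcase ASCII letters.
   A real decoder would convert between character encodings.  */
void
upcase_ascii (char *text)
{
  for (; *text != '\0'; text++)
    *text = (char) toupper ((unsigned char) *text);
}

/* Copy the system message for ERROR_NUMBER into BUF (capacity SIZE),
   truncating if necessary, then run DECODE on it unless DECODE is
   NULL.  Return BUF.  */
char *
error_message (int error_number, char *buf, size_t size, decode_fn decode)
{
  assert (buf != NULL && size > 0);
  snprintf (buf, size, "%s", strerror (error_number));
  if (decode != NULL)
    decode (buf);
  return buf;
}
```

Passing `NULL` as the decoder corresponds to the case where `Vlocale_coding_system` is nil and the raw `strerror` text is returned unchanged.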
