diff lispref/objects.texi @ 21007:66d807bdc5b4
*** empty log message ***
author: Richard M. Stallman <rms@gnu.org>
date: Sat, 28 Feb 1998 01:53:53 +0000
parents: 981e116b4ac6
children: 90da2489c498
--- a/lispref/objects.texi	Sat Feb 28 01:49:58 1998 +0000
+++ b/lispref/objects.texi	Sat Feb 28 01:53:53 1998 +0000
@@ -1,6 +1,6 @@
 @c -*-texinfo-*-
 @c This is part of the GNU Emacs Lisp Reference Manual.
-@c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc. 
+@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998 Free Software Foundation, Inc. 
 @c See the file elisp.texi for copying conditions.
 @setfilename ../info/objects
 @node Lisp Data Types, Numbers, Introduction, Top
@@ -66,8 +66,10 @@
 output generated by the Lisp printer (the function @code{prin1}) for
 that object.  The @dfn{read syntax} of an object is the format of the
 input accepted by the Lisp reader (the function @code{read}) for that
-object.  Most objects have more than one possible read syntax.  Some
-types of object have no read syntax; except for these cases, the printed
+object.  @xref{Read and Print}.
+
+  Most objects have more than one possible read syntax.  Some types of
+object have no read syntax; except for these cases, the printed
 representation of an object is also a read syntax for it.
 
   In other languages, an expression is text; it has no other form.  In
@@ -143,6 +145,8 @@
 * Array Type::          Arrays include strings and vectors.
 * String Type::         An (efficient) array of characters.
 * Vector Type::         One-dimensional arrays.
+* Char-Table Type::     One-dimensional sparse arrays indexed by characters.
+* Bool-Vector Type::    One-dimensional arrays of @code{t} or @code{nil}.
 * Function Type::       A piece of executable code you can call from elsewhere.
 * Macro Type::          A method of expanding an expression into another
                           expression, more fundamental but less pretty.
@@ -196,9 +200,9 @@
 @node Floating Point Type
 @subsection Floating Point Type
 
-  Emacs version 19 supports floating point numbers (though there is a
-compilation option to disable them).  The precise range of floating
-point numbers is machine-specific.
+  Emacs supports floating point numbers (though there is a compilation
+option to disable them).  The precise range of floating point numbers is
+machine-specific.
 
   The printed representation for floating point numbers requires either
 a decimal point (with at least one digit following), an exponent, or
@@ -221,9 +225,10 @@
 characters.  @xref{String Type}.
 
   Characters in strings, buffers, and files are currently limited to the
-range of 0 to 255---eight bits.  If you store a larger integer into a
-string, buffer or file, it is truncated to that range.  Characters that
-represent keyboard input have a much wider range.
+range of 0 to 524287---nineteen bits.  But not all values in that range
+are valid character codes.  Characters that represent keyboard input
+have a much wider range, so they can represent modifier keys such as Control,
+Meta and Shift.
 
 @cindex read syntax for characters
 @cindex printed representation for characters
@@ -272,8 +277,7 @@
   You can express the characters Control-g, backspace, tab, newline,
 vertical tab, formfeed, return, and escape as @samp{?\a}, @samp{?\b},
 @samp{?\t}, @samp{?\n}, @samp{?\v}, @samp{?\f}, @samp{?\r}, @samp{?\e},
-respectively.  Those values are 7, 8, 9, 10, 11, 12, 13, and 27 in
-decimal.  Thus,
+respectively.  Thus,
 
 @example
 ?\a @result{} 7                 ; @r{@kbd{C-g}}
@@ -306,10 +310,10 @@
 ?\^I @result{} 9     ?\C-I @result{} 9
 @end example
 
-  For use in strings and buffers, you are limited to the control
-characters that exist in @sc{ASCII}, but for keyboard input purposes,
-you can turn any character into a control character with @samp{C-}.  The
-character codes for these non-@sc{ASCII} control characters include the
+  In strings and buffers, the only control characters allowed are those
+that exist in @sc{ASCII}; but for keyboard input purposes, you can turn
+any character into a control character with @samp{C-}.  The character
+codes for these non-@sc{ASCII} control characters include the
 @iftex
 $2^{26}$
 @end iftex
@@ -359,11 +363,11 @@
 @ifinfo
 2**7
 @end ifinfo
-bit indicates a meta character, so the meta
-characters that can fit in a string have codes in the range from 128 to
-255, and are the meta versions of the ordinary @sc{ASCII} characters.
-(In Emacs versions 18 and older, this convention was used for characters
-outside of strings as well.)
+bit attached to an ASCII character indicates a meta character; thus, the
+meta characters that can fit in a string have codes in the range from
+128 to 255, and are the meta versions of the ordinary @sc{ASCII}
+characters.  (In Emacs versions 18 and older, this convention was used
+for characters outside of strings as well.)
 
   The read syntax for meta characters uses @samp{\M-}.  For example,
 @samp{?\M-A} stands for @kbd{M-A}.  You can use @samp{\M-} together with
@@ -372,9 +376,10 @@
 or as @samp{?\M-\101}.  Likewise, you can write @kbd{C-M-b} as
 @samp{?\M-\C-b}, @samp{?\C-\M-b}, or @samp{?\M-\002}.
 
-  The case of an ordinary letter is indicated by its character code as
-part of @sc{ASCII}, but @sc{ASCII} has no way to represent whether a
-control character is upper case or lower case.  Emacs uses the
+  The case of a graphic character is indicated by its character code;
+for example, @sc{ASCII} distinguishes between the characters @samp{a}
+and @samp{A}.  But @sc{ASCII} has no way to represent whether a control
+character is upper case or lower case.  Emacs uses the
 @iftex
 $2^{25}$
 @end iftex
@@ -407,8 +412,9 @@
 @cindex @samp{\} in character constant
 @cindex backslash in character constant
 @cindex octal character code
-  Finally, the most general read syntax consists of a question mark
-followed by a backslash and the character code in octal (up to three
+  Finally, the most general read syntax for a character represents the
+character code in either octal or hex.  To use octal, write a question
+mark followed by a backslash and the octal character code (up to three
 octal digits); thus, @samp{?\101} for the character @kbd{A},
 @samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the
 character @kbd{C-b}.  Although this syntax can represent any @sc{ASCII}
@@ -422,6 +428,18 @@
 @end group
 @end example
 
+  To use hex, write a question mark followed by a backslash, @samp{x},
+and the hexadecimal character code.  You can use any number of hex
+digits, so you can represent any character code in this way.
+Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the
+character @kbd{C-a}, and @code{?\x8c0} for the character
+@iftex
+@`a.
+@end iftex
+@ifinfo
+@samp{a} with grave accent.
+@end ifinfo
+
   A backslash is allowed, and harmless, preceding any character without
 a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}.
 There is no reason to add a backslash before most characters.  However,
@@ -788,49 +806,36 @@
 text extracted from buffers.  Strings in Lisp are constants: evaluation
 of a string returns the same string.
 
+  @xref{Strings and Characters}, for functions that operate on strings.
+
+@menu
+* Syntax for Strings::
+* Non-ASCII in Strings::
+* Nonprinting Characters::
+* Text Props and Strings::
+@end menu
+
+@node Syntax for Strings
+@subsubsection Syntax for Strings
+
 @cindex @samp{"} in strings
 @cindex double-quote in strings
 @cindex @samp{\} in strings
 @cindex backslash in strings
   The read syntax for strings is a double-quote, an arbitrary number of
-characters, and another double-quote, @code{"like this"}.  The Lisp
-reader accepts the same formats for reading the characters of a string
-as it does for reading single characters (without the question mark that
-begins a character literal).  You can enter a nonprinting character such
-as tab, @kbd{C-a} or @kbd{M-C-A} using the convenient escape sequences,
-like this: @code{"\t, \C-a, \M-\C-a"}.  You can include a double-quote
-in a string by preceding it with a backslash; thus, @code{"\""} is a
-string containing just a single double-quote character.
-(@xref{Character Type}, for a description of the read syntax for
-characters.)
+characters, and another double-quote, @code{"like this"}.  To include a
+double-quote in a string, precede it with a backslash; thus, @code{"\""}
+is a string containing just a single double-quote character.  Likewise,
+you can include a backslash by preceding it with another backslash, like
+this: @code{"this \\ is a single embedded backslash"}.
 
-  If you use the @samp{\M-} syntax to indicate a meta character in a
-string constant, this sets the
-@iftex
-$2^{7}$
-@end iftex
-@ifinfo
-2**7
-@end ifinfo
-bit of the character in the string.
-This is not the same representation that the meta modifier has in a
-character on its own (not inside a string).  @xref{Character Type}.
-
-  Strings cannot hold characters that have the hyper, super, or alt
-modifiers; they can hold @sc{ASCII} control characters, but no others.
-They do not distinguish case in @sc{ASCII} control characters.
-
-  The printed representation of a string consists of a double-quote, the
-characters it contains, and another double-quote.  However, you must
-escape any backslash or double-quote characters in the string with a
-backslash, like this: @code{"this \" is an embedded quote"}.
-
+@cindex newline in strings
   The newline character is not special in the read syntax for strings;
 if you write a new line between the double-quotes, it becomes a
 character in the string.  But an escaped newline---one that is preceded
 by @samp{\}---does not become part of the string; i.e., the Lisp reader
-ignores an escaped newline while reading a string.
-@cindex newline in strings
+ignores an escaped newline while reading a string.  An escaped space
+@w{@samp{\ }} is likewise ignored.
 
 @example
 "It is useful to include newlines
@@ -842,11 +847,73 @@
 but the newline is ignored if escaped."
 @end example
 
-  A string can hold properties of the text it contains, in addition to
-the characters themselves.  This enables programs that copy text between
-strings and buffers to preserve the properties with no special effort.
-@xref{Text Properties}.  Strings with text properties have a special
-read and print syntax:
+@node Non-ASCII in Strings
+@subsubsection Non-ASCII Characters in Strings
+
+  You can include a non-@sc{ASCII} international character in a string
+constant by writing it literally.  There are two text representations
+for non-@sc{ASCII} characters in Emacs strings (and in buffers): unibyte
+and multibyte.  If the string constant is read from a multibyte source,
+then the character is read as a multibyte character, and that makes the
+string multibyte.  If the string constant is read from a unibyte source,
+then the character is read as unibyte and that makes the string unibyte.
+
+  You can also represent a multibyte non-@sc{ASCII} character with its
+character code, using a hex escape, @samp{\x@var{nnnnnnn}}, with as many
+digits as necessary.  (Multibyte non-@sc{ASCII} character codes are all
+greater than 256.)  Any character which is not a valid hex digit
+terminates this construct.  If the character that would follow is a hex
+digit, write @samp{\ } to terminate the hex escape---for example,
+@samp{\x8c0\ } represents one character, @samp{a} with grave accent.
+@samp{\ } in a string constant is just like backslash-newline; it does
+not contribute any character to the string, but it does terminate the
+preceding hex escape.
+
+  Using a multibyte hex escape forces the string to multibyte.  You can
+represent a unibyte non-@sc{ASCII} character with its character code,
+which must be in the range from 128 (0200 octal) to 255 (0377 octal).
+This forces a unibyte string.
+
+  @xref{Text Representations}, for more information about the two
+text representations.
+
+@node Nonprinting Characters
+@subsubsection Nonprinting Characters in Strings
+
+  Strings cannot hold characters that have the hyper, super, or alt
+modifiers; the only control or meta characters they can hold are the
+@sc{ASCII} control characters.  Strings do not distinguish case in
+@sc{ASCII} control characters.
+
+  You can use the same backslash escape-sequences in a string constant
+as in character literals (but do not use the question mark that begins a
+character constant).  For example, you can write a string containing the
+nonprinting characters tab, @kbd{C-a} and @kbd{M-C-a}, with commas and
+spaces between them, like this: @code{"\t, \C-a, \M-\C-a"}.
+@xref{Character Type}, for a description of the read syntax for
+characters.
+
+  If you use the @samp{\M-} syntax to indicate a meta character in a
+string constant, this sets the
+@iftex
+$2^{7}$
+@end iftex
+@ifinfo
+2**7
+@end ifinfo
+bit of the character in the string.  This construct works only with
+ASCII characters.  Note that the same meta characters have a different
+representation when not in a string.  @xref{Character Type}.
+
+@node Text Props and Strings
+@subsubsection Text Properties in Strings
+
+  A string can hold properties for the characters it contains, in
+addition to the characters themselves.  This enables programs that copy
+text between strings and buffers to copy the text's properties with no
+special effort.  @xref{Text Properties}, for an explanation of what text
+properties mean.  Strings with text properties use a special read and
+print syntax:
 
 @example
 #("@var{characters}" @var{property-data}...)
@@ -863,9 +930,18 @@
 @noindent
 The elements @var{beg} and @var{end} are integers, and together specify
 a range of indices in the string; @var{plist} is the property list for
-that range.
+that range.  For example,
+
+@example
+#("foo bar" 0 3 (face bold) 3 4 nil 4 7 (face italic))
+@end example
 
-  @xref{Strings and Characters}, for functions that work on strings.
+@noindent
+represents a string whose textual contents are @samp{foo bar}, in which
+the first three characters have a @code{face} property with value
+@code{bold}, and the last three have a @code{face} property with value
+@code{italic}.  (The fourth character has no text properties so its
+property list is @code{nil}.)
 
 @node Vector Type
 @subsection Vector Type
@@ -887,6 +963,44 @@
 
   @xref{Vectors}, for functions that work with vectors.
 
+@node Char-Table Type
+@subsection Char-Table Type
+
+  A @dfn{char-table} is a one-dimensional array of elements of any type,
+indexed by character codes.  Char-tables have certain extra features to
+make them more useful for many jobs that involve assigning information
+to character codes---for example, a char-table can have a parent to
+inherit from, a default value, and a small number of extra slots to use for
+special purposes.  A char-table can also specify a single value for
+a whole character set.
+
+  The printed representation of a char-table is like a vector
+except that there is an extra @samp{#} at the beginning.
+
+  @xref{Char-Tables}, for special functions to operate on char-tables.
+
+@node Bool-Vector Type
+@subsection Bool-Vector Type
+
+  A @dfn{bool-vector} is a one-dimensional array of elements that
+must be @code{t} or @code{nil}.
+
+  The printed representation of a bool-vector is like a string, except
+that it begins with @samp{#&} followed by the length.  The string
+constant that follows actually specifies the contents of the bool-vector
+as a bitmap---each ``character'' in the string contains 8 bits, which
+specify the next 8 elements of the bool-vector (1 stands for @code{t},
+and 0 for @code{nil}).  If the length is not a multiple of 8, the
+printed representation describes extra elements, but these really
+make no difference.
+
+@example
+(make-bool-vector 3 t)
+     @result{} #&3"\377"
+(make-bool-vector 3 nil)
+     @result{} #&3"\0"
+@end example
+
 @node Function Type
 @subsection Function Type
 
@@ -922,6 +1036,10 @@
 a macro as far as Emacs is concerned.  @xref{Macros}, for an explanation
 of how to write a macro.
 
+  @strong{Warning}: Lisp macros and keyboard macros (@pxref{Keyboard
+Macros}) are entirely different things.  When we use the word ``macro''
+without qualification, we mean a Lisp macro, not a keyboard macro.
+
 @node Primitive Function Type
 @subsection Primitive Function Type
 @cindex special forms
@@ -939,7 +1057,8 @@
 function written in Lisp for a primitive of the same name.  The reason
 is that the primitive function may be called directly from C code.
 Calls to the redefined function from Lisp will use the new definition,
-but calls from C code may still use the built-in definition.
+but calls from C code may still use the built-in definition.  Therefore,
+@strong{we discourage redefinition of primitive functions}.
 
   The term @dfn{function} refers to all Emacs functions, whether written
 in Lisp or C.  @xref{Function Type}, for information about the
@@ -1227,18 +1346,22 @@
 @node Syntax Table Type
 @subsection Syntax Table Type
 
-  A @dfn{syntax table} is a vector of 256 integers.  Each element of the
-vector defines how one character is interpreted when it appears in a
+  A @dfn{syntax table} is a char-table which specifies the syntax of
+each character, for word and list parsing.  Each element of the syntax
+table defines how one character is interpreted when it appears in a
 buffer.  For example, in C mode (@pxref{Major Modes}), the @samp{+}
 character is punctuation, but in Lisp mode it is a valid character in a
 symbol.  These modes specify different interpretations by changing the
 syntax table entry for @samp{+}, at index 43 in the syntax table.
 
-  Syntax tables are used only for scanning text in buffers, not for
-reading Lisp expressions.  The table the Lisp interpreter uses to read
-expressions is built into the Emacs source code and cannot be changed;
-thus, to change the list delimiters to be @samp{@{} and @samp{@}}
-instead of @samp{(} and @samp{)} would be impossible.
+  Syntax tables are used only to control primitives that scan text in
+buffers, not for reading Lisp expressions.  The syntax that the Lisp
+interpreter uses to read expressions is built into the Emacs source code
+and cannot be changed; thus, to change the list delimiters to be
+@samp{@{} and @samp{@}} instead of @samp{(} and @samp{)} would be
+impossible.  (Some Lisp systems provide ways to redefine the read
+syntax, but we decided to leave this feature out of Emacs Lisp for
+simplicity.)
 
   @xref{Syntax Tables}, for details about syntax classes and how to make
 and modify syntax tables.
@@ -1248,18 +1371,18 @@
 
   A @dfn{display table} specifies how to display each character code.
 Each buffer and each window can have its own display table.  A display
-table is actually a vector of length 262.  @xref{Display Tables}.
+table is actually a char-table.  @xref{Display Tables}.
 
 @node Overlay Type
 @subsection Overlay Type
 
-  An @dfn{overlay} specifies temporary alteration of the display
-appearance of a part of a buffer.  It contains markers delimiting a
-range of the buffer, plus a property list (a list whose elements are
-alternating property names and values).  Overlays are used to present
-parts of the buffer temporarily in a different display style.  They have
-no read syntax, and print in hash notation, giving the buffer name and
-range of positions.
+  An @dfn{overlay} specifies properties that apply to a part of a
+buffer.  Each overlay applies to a specified range of the buffer, and
+contains a property list (a list whose elements are alternating property
+names and values).  Overlay properties are used to present parts of the
+buffer temporarily in a different display style.  Overlays have no read
+syntax, and print in hash notation, giving the buffer name and range of
+positions.
 
   @xref{Overlays}, for how to create and use overlays.
 
@@ -1284,7 +1407,7 @@
 @example
 @group
 (+ 2 'a)
-     @error{} Wrong type argument: integer-or-marker-p, a
+     @error{} Wrong type argument: number-or-marker-p, a
 @end group
 @end example
 
@@ -1355,6 +1478,9 @@
 @item framep
 @xref{Frames, framep}.
 
+@item functionp
+@xref{Functions, functionp}.
+
 @item integer-or-marker-p
 @xref{Predicates on Markers, integer-or-marker-p}.
 
@@ -1572,10 +1698,10 @@
 @end group
 @end example
 
-Comparison of strings is case-sensitive and takes account of text
-properties as well as the characters in the strings.  To compare
-two strings' characters without comparing their text properties,
-use @code{string=} (@pxref{Text Comparison}).
+Comparison of strings is case-sensitive, but does not take account of
+text properties---it compares only the characters in the strings.
+A unibyte string never equals a multibyte string unless the
+contents are entirely @sc{ASCII} (@pxref{Text Representations}).
 
 @example
 @group