Characters are objects that represent printed characters such as letters and digits. All Scheme implementations must support at least the ASCII character repertoire: that is, Unicode characters U+0000 through U+007F. Implementations may support any other Unicode characters they see fit, and may also support non-Unicode characters. Except as otherwise specified, the result of applying any of the following procedures to a non-Unicode character is implementation-dependent.
Characters are written using the notation #\⟨character⟩, #\⟨character name⟩, or #\x⟨hex scalar value⟩.
The following character names must be supported by all implementations with the given values. Implementations may add other names provided they cannot be interpreted as hex scalar values preceded by x.
    #\alarm      ; U+0007
    #\backspace  ; U+0008
    #\delete     ; U+007F
    #\escape     ; U+001B
    #\newline    ; the linefeed character, U+000A
    #\null       ; the null character, U+0000
    #\return     ; the return character, U+000D
    #\space      ; the preferred way to write a space
    #\tab        ; the tab character, U+0009
Here are some additional examples:
    #\a          ; lower case letter
    #\A          ; upper case letter
    #\(          ; left parenthesis
    #\           ; the space character
    #\x03BB      ; λ (if character is supported)
    #\iota       ; ι (if character and name are supported)
Case is significant in #\⟨character⟩ and in #\⟨character name⟩, but not in #\x⟨hex scalar value⟩. If ⟨character⟩ in #\⟨character⟩ is alphabetic, then any character immediately following ⟨character⟩ cannot be one that can appear in an identifier. This rule resolves the ambiguous case where, for example, the sequence of characters ‘#\space’ could be taken to be either a representation of the space character or a representation of the character #\s followed by a representation of the symbol pace.
Characters written in the #\ notation are self-evaluating. That is, they do not have to be quoted in programs.
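As a small illustration of the notation rules above, the following sketch compares a few spellings of the same characters. The variable names are made up for this example, and only ASCII characters are used so that it should not depend on optional Unicode support.

```scheme
(import (scheme base))

;; Characters are self-evaluating, so no quote is needed here.
(define letter-a #\A)

;; The hex form denotes the same character: #\x41 is U+0041.
(define same-as-hex (char=? #\x41 letter-a))

;; Case is not significant in the hex scalar value.
(define hex-case-ignored (char=? #\x7e #\x7E))

;; #\space is read as a single character datum (U+0020),
;; not as #\s followed by the symbol pace.
(define space-is-space (char=? #\space #\x20))
```

Each of the three boolean definitions should evaluate to #t.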
Some of the procedures that operate on characters ignore the difference between upper case and lower case. The procedures that ignore case have ‘-ci’ (for “case insensitive”) embedded in their names.
The char? procedure returns #t if obj is a character, otherwise it returns #f.
The procedures char=?, char<?, char>?, char<=?, and char>=? return #t if the results of passing their arguments to char->integer are respectively equal, monotonically increasing, monotonically decreasing, monotonically non-decreasing, or monotonically non-increasing.
These predicates are required to be transitive.
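Because the comparison is defined on the values returned by char->integer, the ordering follows scalar values rather than dictionary order. The sketch below shows this with a range check; char-between? is a hypothetical helper introduced only for this example.

```scheme
(import (scheme base))

;; Hypothetical helper: true when c lies between lo and hi inclusive.
;; It relies on char<=? accepting more than two arguments and testing
;; that their scalar values are monotonically non-decreasing.
(define (char-between? lo c hi)
  (char<=? lo c hi))

(char-between? #\a #\m #\z)   ; ⇒ #t
(char-between? #\a #\M #\z)   ; ⇒ #f, U+004D comes before U+0061
(char<? #\A #\a)              ; ⇒ #t, since 65 < 97
```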
The procedures char-ci=?, char-ci<?, char-ci>?, char-ci<=?, and char-ci>=? are similar to char=? et cetera, but they treat upper case and lower case letters as the same. For example, (char-ci=? #\A #\a) returns #t.
Specifically, these procedures behave as if char-foldcase were applied to their arguments before they were compared.
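The "as if" rule can be spelled out for the two-argument case as follows. This is only a sketch: my-char-ci=? is a hypothetical name, and a real implementation is free to compute the result differently.

```scheme
(import (scheme base) (scheme char))

;; Hypothetical two-argument model of char-ci=?:
;; fold both arguments, then compare with char=?.
(define (my-char-ci=? a b)
  (char=? (char-foldcase a) (char-foldcase b)))

(my-char-ci=? #\A #\a)   ; ⇒ #t
(char-ci=? #\A #\a)      ; ⇒ #t
(char-ci<? #\a #\B)      ; ⇒ #t, although (char<? #\a #\B) ⇒ #f
```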
The procedures char-alphabetic?, char-numeric?, char-whitespace?, char-upper-case?, and char-lower-case? return #t if their arguments are alphabetic, numeric, whitespace, upper case, or lower case characters, respectively; otherwise they return #f.
Specifically, they must return #t when applied to characters with the Unicode properties Alphabetic, Numeric_Type=Decimal, White_Space, Uppercase, and Lowercase respectively, and #f when applied to any other Unicode characters. Note that many Unicode characters are alphabetic but neither upper nor lower case.
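A minimal sketch of how these predicates might be combined into a classifier is shown below. The procedure char-class and the symbols it returns are hypothetical and exist only for this example. The case tests come before the alphabetic test so that characters that are alphabetic but caseless fall into their own bucket.

```scheme
(import (scheme base) (scheme char))

;; Hypothetical classifier built from the standard predicates.
(define (char-class c)
  (cond ((char-upper-case? c) 'upper)
        ((char-lower-case? c) 'lower)
        ((char-alphabetic? c) 'other-alphabetic)
        ((char-numeric? c)    'numeric)
        ((char-whitespace? c) 'whitespace)
        (else                 'other)))

(char-class #\A)     ; ⇒ upper
(char-class #\a)     ; ⇒ lower
(char-class #\7)     ; ⇒ numeric
(char-class #\tab)   ; ⇒ whitespace
(char-class #\()     ; ⇒ other
```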
The digit-value procedure returns the numeric value (0 to 9) of its argument if it is a numeric digit (that is, if char-numeric? returns #t), or #f on any other character.
    (digit-value #\3)      ⇒ 3
    (digit-value #\x0664)  ⇒ 4
    (digit-value #\x0AE6)  ⇒ 0
    (digit-value #\x0EA6)  ⇒ #f
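One possible use of digit-value is converting a string of decimal digits to an exact integer. The sketch below assumes a hypothetical procedure digits->integer that returns #f as soon as it meets a character that is not a digit; it is not a replacement for string->number.

```scheme
(import (scheme base) (scheme char))

;; Hypothetical helper: accumulate digit values left to right,
;; giving up with #f on the first non-digit character.
(define (digits->integer str)
  (let loop ((i 0) (acc 0))
    (if (= i (string-length str))
        acc
        (let ((d (digit-value (string-ref str i))))
          (and d (loop (+ i 1) (+ (* acc 10) d)))))))

(digits->integer "2024")   ; ⇒ 2024
(digits->integer "12a")    ; ⇒ #f
```

Note that this sketch returns 0 for the empty string, which a caller may want to treat as an error instead.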
Given a Unicode character, char->integer returns an exact integer between 0 and #xD7FF or between #xE000 and #x10FFFF which is equal to the Unicode scalar value of that character. Given a non-Unicode character, it returns an exact integer greater than #x10FFFF. This is true independent of whether the implementation uses the Unicode representation internally.
Given an exact integer that is the value returned by a character when char->integer is applied to it, integer->char returns that character.
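The two procedures are inverses on valid scalar values, which the following sketch illustrates with ASCII characters. The helper char-successor is hypothetical and deliberately simple.

```scheme
(import (scheme base))

(char->integer #\A)                   ; ⇒ 65
(integer->char 97)                    ; ⇒ #\a
(integer->char (char->integer #\~))   ; ⇒ #\~, the round trip is the identity

;; Hypothetical helper: the character whose scalar value is one higher.
;; It does not guard against results outside the valid ranges
;; (for example just past #xD7FF), where no character is guaranteed.
(define (char-successor c)
  (integer->char (+ (char->integer c) 1)))

(char-successor #\a)                  ; ⇒ #\b
```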
The char-upcase procedure, given an argument that is the lowercase member of a Unicode casing pair, returns the uppercase member of the pair, provided that both characters are supported by the Scheme implementation. Note that language-sensitive casing pairs are not used. If the argument is not the lowercase member of such a pair, it is returned.
The char-downcase procedure, given an argument that is the uppercase member of a Unicode casing pair, returns the lowercase member of the pair, provided that both characters are supported by the Scheme implementation. Note that language-sensitive casing pairs are not used. If the argument is not the uppercase member of such a pair, it is returned.
The char-foldcase procedure applies the Unicode simple case-folding algorithm to its argument and returns the result. Note that language-sensitive folding is not used. If the character that results from folding is not supported by the implementation, the argument is returned. See UAX #29 [uax29] (part of the Unicode Standard) for details.
Note that many Unicode lowercase characters do not have uppercase equivalents.
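The behavior of the three case-conversion procedures on ASCII characters can be sketched as follows. Only ASCII is used here because results for other characters depend on which characters the implementation supports.

```scheme
(import (scheme base) (scheme char))

(char-upcase #\a)     ; ⇒ #\A
(char-upcase #\A)     ; ⇒ #\A, already the uppercase member of its pair
(char-upcase #\1)     ; ⇒ #\1, not part of any casing pair
(char-downcase #\Q)   ; ⇒ #\q
(char-foldcase #\Q)   ; ⇒ #\q
(char-foldcase #\q)   ; ⇒ #\q
```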