Strings are sequences of characters. Strings are written as sequences of characters enclosed within quotation marks ("). Within a string literal, various escape sequences represent characters other than themselves. Escape sequences always start with a backslash (\):
\a : alarm, U+0007
\b : backspace, U+0008
\t : character tabulation, U+0009
\n : linefeed, U+000A
\r : return, U+000D
\" : double quote, U+0022
\\ : backslash, U+005C
\| : vertical line, U+007C
\⟨intraline whitespace⟩*⟨line ending⟩⟨intraline whitespace⟩* : nothing
\x⟨hex scalar value⟩; : specified character (note the terminating semicolon)
The result is unspecified if any other character in a string occurs after a backslash.
Except for a line ending, any character outside of an escape sequence stands for itself in the string literal. A line ending which is preceded by \⟨intraline whitespace⟩ expands to nothing (along with any trailing intraline whitespace), and can be used to indent strings for improved legibility. Any other line ending has the same effect as inserting a \n character into the string.
Examples:
"The word \"recursion\" has many meanings."
"Another example:\ntwo lines of text"
"Here's text \
   containing just one line"
"\x03B1; is named GREEK SMALL LETTER ALPHA."
The length of a string is the number of characters that it contains. This number is an exact, non-negative integer that is fixed when the string is created. The valid indexes of a string are the exact non-negative integers less than the length of the string. The first character of a string has index 0, the second has index 1, and so on.
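The literal syntax and the indexing rule above can be checked with a minimal sketch like the following. It assumes an R7RS implementation; the names sample, continued, and the other definitions are illustrative only, and string-ref is the accessor described later in this section.

```scheme
(import (scheme base) (scheme write))

;; A hex escape names a character by scalar value; U+0041 is "A".
(define hex-escape-ok? (string=? "\x41;BC" "ABC"))

;; A backslash before a line ending swallows the line ending and the
;; next line's leading whitespace, so this literal is a single line.
(define continued "one \
                   line")
(define continuation-ok? (string=? continued "one line"))

;; An escaped double quote counts as one character: a, ", b.
(define sample "a\"b")
(define sample-length (string-length sample))

;; Valid indexes run from 0 to (- length 1).
(define first-char (string-ref sample 0))
(define last-char (string-ref sample (- sample-length 1)))

(write (list hex-escape-ok? continuation-ok? sample-length first-char last-char))
(newline)
```

Under these assumptions the program writes (#t #t 3 #\a #\b).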
Some of the procedures that operate on strings ignore the difference between upper and lower case. The names of the versions that ignore case end with ‘-ci’ (for “case insensitive”).
Implementations may forbid certain characters from appearing in strings. However, with the exception of #\null, ASCII characters must not be forbidden. For example, an implementation might support the entire Unicode repertoire, but only allow characters U+0001 to U+00FF (the Latin-1 repertoire without #\null) in strings.
It is an error to pass such a forbidden character to make-string, string, string-set!, or string-fill!, as part of the list passed to list->string, or as part of the vector passed to vector->string (see Vectors), or in UTF-8 encoded form within a bytevector passed to utf8->string (see Bytevectors). It is also an error for a procedure passed to string-map (see Control features) to return a forbidden character, or for read-string (see Input) to attempt to read one.
The string? procedure returns #t if obj is a string, otherwise returns #f.
The make-string procedure returns a newly allocated string of length k. If char is given, then all the characters of the string are initialized to char, otherwise the contents of the string are unspecified.
The string procedure returns a newly allocated string composed of the arguments. It is analogous to list.
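The two constructors above can be compared side by side. This is a small sketch assuming an R7RS implementation; the names dashes, abc, and blank are illustrative.

```scheme
(import (scheme base) (scheme write))

;; make-string with a fill character: every position holds that character.
(define dashes (make-string 4 #\-))

;; string builds a string from its character arguments, as list builds a list.
(define abc (string #\a #\b #\c))

;; Without a fill character only the length is guaranteed;
;; the contents are unspecified and should not be relied on.
(define blank (make-string 2))

(write (list dashes abc (string-length blank)))
(newline)
```

Under these assumptions the program writes ("----" "abc" 2).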
The string-length procedure returns the number of characters in the given string.
It is an error if k is not a valid index of string. The string-ref procedure returns character k of string using zero-origin indexing. There is no requirement for this procedure to execute in constant time.
It is an error if k is not a valid index of string. The string-set! procedure stores char in element k of string. There is no requirement for this procedure to execute in constant time.
(define (f) (make-string 3 #\*))
(define (g) "***")
(string-set! (f) 0 #\?)  ⇒  unspecified
(string-set! (g) 0 #\?)  ⇒  error
(string-set! (symbol->string 'immutable)
             0
             #\?)  ⇒  error
The string=? procedure returns #t if all the strings are the same length and contain exactly the same characters in the same positions, otherwise returns #f.
The string-ci=? procedure returns #t if, after case-folding, all the strings are the same length and contain the same characters in the same positions, otherwise returns #f. Specifically, it behaves as if string-foldcase were applied to its arguments before comparing them.
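The difference between the case-sensitive and case-insensitive equality predicates can be sketched as follows, assuming an R7RS implementation that provides the (scheme char) library. The definition names are illustrative.

```scheme
(import (scheme base) (scheme char) (scheme write))

;; Case matters for string=?.
(define exact-match? (string=? "Scheme" "scheme"))

;; string-ci=? compares as if both arguments had been case-folded first.
(define folded-match? (string-ci=? "Scheme" "SCHEME"))

;; The same comparison written out with string-foldcase.
(define explicit-fold-match?
  (string=? (string-foldcase "Scheme") (string-foldcase "SCHEME")))

;; Both predicates accept more than two arguments.
(define all-equal? (string=? "abc" "abc" "abc"))

(write (list exact-match? folded-match? explicit-fold-match? all-equal?))
(newline)
```

Under these assumptions the program writes (#f #t #t #t).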
The string<?, string>?, string<=?, and string>=? procedures return #t if their arguments are (respectively): monotonically increasing, monotonically decreasing, monotonically non-decreasing, or monotonically non-increasing. These predicates are required to be transitive.
These procedures compare strings in an implementation-defined way. One approach is to make them the lexicographic extensions to strings of the corresponding orderings on characters. In that case, string<? would be the lexicographic ordering on strings induced by the ordering char<? on characters, and if the two strings differ in length but are the same up to the length of the shorter string, the shorter string would be considered to be lexicographically less than the longer string. However, it is also permitted to use the natural ordering imposed by the implementation’s internal representation of strings, or a more complex locale-specific ordering.
In all cases, a pair of strings must satisfy exactly one of string<?, string=?, and string>?, and must satisfy string<=? if and only if they do not satisfy string>? and string>=? if and only if they do not satisfy string<?.
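Because the ordering itself is implementation-defined, portable code can rely only on the consistency conditions just stated. The following sketch checks them for a pair of strings; the helper names relations-holding and consistent? are hypothetical, not part of any library.

```scheme
(import (scheme base) (scheme write))

;; Count how many of <, =, > hold for a pair; exactly one must.
(define (relations-holding a b)
  (+ (if (string<? a b) 1 0)
     (if (string=? a b) 1 0)
     (if (string>? a b) 1 0)))

;; Check the trichotomy plus the two "if and only if" conditions.
(define (consistent? a b)
  (and (= 1 (relations-holding a b))
       (boolean=? (string<=? a b) (not (string>? a b)))
       (boolean=? (string>=? a b) (not (string<? a b)))))

(write (list (consistent? "apple" "apply")
             (consistent? "abc" "abc")
             (consistent? "ab" "abc")))
(newline)
```

On a conforming implementation each check yields #t regardless of which ordering the implementation chose.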
The ‘-ci’ procedures behave as if they applied string-foldcase to their arguments before invoking the corresponding procedures without ‘-ci’.
The string-upcase, string-downcase, and string-foldcase procedures apply the Unicode full string uppercasing, lowercasing, and case-folding algorithms, respectively, to their arguments and return the result. In certain cases, the result differs in length from the argument. If the result is equal to the argument in the sense of string=?, the argument may be returned. Note that language-sensitive mappings and foldings are not used.
The Unicode Standard prescribes special treatment of the Greek letter Σ, whose normal lower-case form is σ but which becomes ς at the end of a word. See UAX #44 [uax44] (part of the Unicode Standard) for details. However, implementations of string-downcase are not required to provide this behavior, and may choose to change Σ to σ in all cases.
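A minimal sketch of the case-conversion procedures, assuming an R7RS implementation with the (scheme char) library. It deliberately sticks to ASCII input, since results for characters such as Σ may vary between implementations as noted above. The definition names are illustrative.

```scheme
(import (scheme base) (scheme char) (scheme write))

(define up (string-upcase "Hello, World"))
(define down (string-downcase "Hello, World"))

;; Case folding is meant for comparison: two strings that differ only
;; in case fold to equal strings.
(define folded-equal?
  (string=? (string-foldcase "Hello") (string-foldcase "hELLO")))

(write (list up down folded-equal?))
(newline)
```

Under these assumptions the program writes ("HELLO, WORLD" "hello, world" #t).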
The substring procedure returns a newly allocated string formed from the characters of string beginning with index start (inclusive) and ending with index end (exclusive). This is equivalent to calling string-copy with the same arguments, but is provided for backward compatibility and stylistic flexibility.
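The equivalence with string-copy can be sketched as follows, assuming an R7RS implementation. Note that the character at index end is not included in the result. The names text, word, and same-as-copy? are illustrative.

```scheme
(import (scheme base) (scheme write))

(define text "hello world")

;; Indexes 6 through 10 hold "world"; index 11 is the end bound.
(define word (substring text 6 11))

;; string-copy with the same arguments yields an equal string.
(define same-as-copy? (string=? word (string-copy text 6 11)))

(write (list word same-as-copy?))
(newline)
```

Under these assumptions the program writes ("world" #t).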
The string-append procedure returns a newly allocated string whose characters are the concatenation of the characters in the given strings.
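A short sketch of concatenation, assuming an R7RS implementation; the definition names are illustrative.

```scheme
(import (scheme base) (scheme write))

;; Any number of strings may be concatenated in one call.
(define joined (string-append "foo" "-" "bar"))

;; With no arguments the result is the empty string.
(define empty (string-append))

(write (list joined (string-length empty)))
(newline)
```

Under these assumptions the program writes ("foo-bar" 0).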
It is an error if any element of list is not a character. The string->list procedure returns a newly allocated list of the characters of string between start and end. list->string returns a newly allocated string formed from the elements in the list list. In both procedures, order is preserved. string->list and list->string are inverses so far as equal? is concerned.
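The inverse relationship can be sketched as follows, assuming an R7RS implementation; the definition names are illustrative.

```scheme
(import (scheme base) (scheme write))

(define chars (string->list "apple"))

;; The optional start argument skips the first two characters.
(define tail (string->list "apple" 2))

;; Converting back yields a string that is equal? to the original.
(define round-trip (list->string chars))
(define inverse? (equal? round-trip "apple"))

(write (list chars tail round-trip inverse?))
(newline)
```

Under these assumptions the program writes ((#\a #\p #\p #\l #\e) (#\p #\l #\e) "apple" #t).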
The string-copy procedure returns a newly allocated copy of the part of the given string between start and end.
It is an error if at is less than zero or greater than the length of to. It is also an error if (- (string-length to) at) is less than (- end start).
The string-copy! procedure copies the characters of the string from, between start and end, into the string to, starting at index at. The order in which characters are copied is unspecified, except that if the source and destination overlap, copying takes place as if the source is first copied into a temporary string and then into the destination. This can be achieved without allocating storage by making sure to copy in the correct direction in such circumstances.
(define a "12345")
(define b (string-copy "abcde"))
(string-copy! b 1 a 0 2)
b  ⇒  "a12de"
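The overlapping case can be illustrated with a sketch in which source and destination are the same string, assuming an R7RS implementation. The name s is illustrative.

```scheme
(import (scheme base) (scheme write))

;; string-copy gives a mutable string; a literal must not be mutated.
(define s (string-copy "abcde"))

;; Copy characters 0..2 ("abc") of s onto positions 1..3 of s itself.
;; The result must be as if "abc" had been saved to a temporary first.
(string-copy! s 1 s 0 3)

(write s)
(newline)
```

Under these assumptions the program writes "aabce". A naive left-to-right copy that read from already overwritten positions would produce "aaaae" instead, which is why the copy direction matters.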