Next: , Previous: , Up: Standard procedures   [Index]


6.7 Strings

Strings are sequences of characters. Strings are written as sequences of characters enclosed within quotation marks ("). Within a string literal, various escape sequences represent characters other than themselves. Escape sequences always start with a backslash (\):

The result is unspecified if any other character in a string occurs after a backslash.

Except for a line ending, any character outside of an escape sequence stands for itself in the string literal. A line ending which is preceded by \⟨intraline whitespace⟩ expands to nothing (along with any trailing intraline whitespace), and can be used to indent strings for improved legibility. Any other line ending has the same effect as inserting a \n character into the string.

Examples:

"The word \"recursion\" has many meanings."
"Another example:\ntwo lines of text"
"Here's text \
   containing just one line"
"\x03B1; is named GREEK SMALL LETTER ALPHA."

The length of a string is the number of characters that it contains. This number is an exact, non-negative integer that is fixed when the string is created. The valid indexes of a string are the exact non-negative integers less than the length of the string. The first character of a string has index 0, the second has index 1, and so on.

Some of the procedures that operate on strings ignore the difference between upper and lower case. The names of the versions that ignore case end with ‘-ci’ (for “case insensitive”).

Implementations may forbid certain characters from appearing in strings. However, with the exception of #\null, ASCII characters must not be forbidden. For example, an implementation might support the entire Unicode repertoire, but only allow characters U+0001 to U+00FF (the Latin-1 repertoire without #\null) in strings.

It is an error to pass such a forbidden character to make-string, string, string-set!, or string-fill!, as part of the list passed to list->string, or as part of the vector passed to vector->string (see Vectors), or in UTF-8 encoded form within a bytevector passed to utf8->string (see Bytevectors). It is also an error for a procedure passed to string-map (see Control features) to return a forbidden character, or for read-string (see Input) to attempt to read one.

procedure: string? obj

Returns #t if obj is a string, otherwise returns #f.

procedure: make-string k
procedure: make-string k char

The make-string procedure returns a newly allocated string of length k. If char is given, then all the characters of the string are initialized to char, otherwise the contents of the string are unspecified.

procedure: string char…

Returns a newly allocated string composed of the arguments. It is analogous to list.

procedure: string-length string

Returns the number of characters in the given string.

procedure: string-ref string k

It is an error if k is not a valid index of string.

The string-ref procedure returns character k of string using zero-origin indexing.

There is no requirement for this procedure to execute in constant time.

procedure: string-set! string k char

It is an error if k is not a valid index of string.

The string-set! procedure stores char in element k of string. There is no requirement for this procedure to execute in constant time.

(define (f) (make-string 3 #\*))
(define (g) "***")
(string-set! (f) 0 #\?) ⇒ unspecified
(string-set! (g) 0 #\?) ⇒ error
(string-set! (symbol->string 'immutable)
             0
             #\?) ⇒ error
procedure: string=? string1 string2 string3

Returns #t if all the strings are the same length and contain exactly the same characters in the same positions, otherwise returns #f.

char library procedure: string-ci=? string1 string2 string3

Returns #t if, after case-folding, all the strings are the same length and contain the same characters in the same positions, otherwise returns #f. Specifically, these procedures behave as if string-foldcase were applied to their arguments before comparing them.

procedure: string<? string1 string2 string3
char library procedure: string-ci<? string1 string2 string3
procedure: string>? string1 string2 string3
char library procedure: string-ci>? string1 string2 string3
procedure: string<=? string1 string2 string3
char library procedure: string-ci<=? string1 string2 string3
procedure: string>=? string1 string2 string3
char library procedure: string-ci>=? string1 string2 string3

These procedures return #t if their arguments are (respectively): monotonically increasing, monotonically decreasing, monotonically non-decreasing, or monotonically non-increasing.

These predicates are required to be transitive.

These procedures compare strings in an implementation-defined way. One approach is to make them the lexicographic extensions to strings of the corresponding orderings on characters. In that case, string<? would be the lexicographic ordering on strings induced by the ordering char<? on characters, and if the two strings differ in length but are the same up to the length of the shorter string, the shorter string would be considered to be lexicographically less than the longer string. However, it is also permitted to use the natural ordering imposed by the implementation’s internal representation of strings, or a more complex locale-specific ordering.

In all cases, a pair of strings must satisfy exactly one of string<?, string=?, and string>?, and must satisfy string<=? if and only if they do not satisfy string>? and string>=? if and only if they do not satisfy string<?.

The ‘-ci’ procedures behave as if they applied string-foldcase to their arguments before invoking the corresponding procedures without ‘-ci’.

char library procedure: string-upcase string
char library procedure: string-downcase string
char library procedure: string-foldcase string

These procedures apply the Unicode full string uppercasing, lowercasing, and case-folding algorithms to their arguments and return the result. In certain cases, the result differs in length from the argument. If the result is equal to the argument in the sense of string=?, the argument may be returned. Note that language-sensitive mappings and foldings are not used.

The Unicode Standard prescribes special treatment of the Greek letter Σ, whose normal lower-case form is σ but which becomes ς at the end of a word. See UAX #44 [uax44] (part of the Unicode Standard) for details. However, implementations of string-downcase are not required to provide this behavior, and may choose to change Σ to σ in all cases.

procedure: substring string start end

The substring procedure returns a newly allocated string formed from the characters of string beginning with index start and ending with index end. This is equivalent to calling string-copy with the same arguments, but is provided for backward compatibility and stylistic flexibility.

procedure: string-append string…

Returns a newly allocated string whose characters are the concatenation of the characters in the given strings.

procedure: string->list string
procedure: string->list string start
procedure: string->list string start end
procedure: list->string list

It is an error if any element of list is not a character.

The string->list procedure returns a newly allocated list of the characters of string between start and end. list->string returns a newly allocated string formed from the elements in the list list. In both procedures, order is preserved. string->list and list->string are inverses so far as equal? is concerned.

procedure: string-copy string
procedure: string-copy string start
procedure: string-copy string start end

Returns a newly allocated copy of the part of the given string between start and end.

procedure: string-copy! to at from
procedure: string-copy! to at from start
procedure: string-copy! to at from start end

It is an error if at is less than zero or greater than the length of to. It is also an error if (- (string-length to) at) is less than (- end start).

Copies the characters of string from between start and end to string to, starting at at. The order in which characters are copied is unspecified, except that if the source and destination overlap, copying takes place as if the source is first copied into a temporary string and then into the destination. This can be achieved without allocating storage by making sure to copy in the correct direction in such circumstances.

(define a "12345")
(define b (string-copy "abcde"))
(string-copy! b 1 a 0 2)
b ⇒ "a12de"
procedure: string-fill! string fill
procedure: string-fill! string fill start
procedure: string-fill! string fill start end

It is an error if fill is not a character.

The string-fill! procedure stores fill in the elements of string between start and end.


Next: Vectors, Previous: Characters, Up: Standard procedures   [Index]