String
Standard library for string operations.
Functions in this module operate on type String, which is an ordered collection of UTF-8 characters. Literal strings can be created with double quotes (a regular string) or backticks (a raw string), such as in "hello" or `hey there`.
Within a regular string, escape sequences can be used to include special characters. Within a raw string, no escape sequences are allowed, meaning backslash is a literal character. Raw strings are often used to define regular expressions, which frequently contain literal backslashes.
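As a brief illustration of the difference, the following sketch compares regular and raw literals. It assumes both forms produce String values that can be compared with ==, which the text implies but does not state outright.

```par
// in a regular string, \\ is an escape sequence for one backslash,
// so both literals should contain the three characters \d+
assert "\\d+" == `\d+`

// "\n" is a single newline character in a regular string, but a
// literal backslash followed by n in a raw string
assert !("a\nb" == `a\nb`)
```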
To import all names from this module, use:
import String (*)
With no import, you can still access anything with the String. prefix, as in String.slice.
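A short sketch of the two access styles follows. As with the other examples on this page, it is a fragment rather than a complete module, and it assumes both forms resolve to the same function.

```par
// with the wildcard import, names can be used unqualified
import String (*)
assert slice("bar", 0, 1) == "b"

// the String. prefix works regardless of imports
assert String.slice("bar", 0, 1) == "b"
```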
Index
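Name | Type | Description |
---|---|---|
length | String -> Int | Returns the length of sized. |
empty? | String -> Bool | Returns whether sized is empty. |
concat | (String, String) -> String | Concatenates a and b together. |
concat_all | [String] -> String | Concatenates all elements in the list l together. |
slice | (String, Int, Int) -> String | Returns a substring of s starting at index i with length len. |
range | (String, Int, Int) -> String | Returns a substring of s starting at index start and ending at index end, inclusive. |
capitalize | String -> String | Capitalizes the first letter in s. |
to_lower | String -> String | Converts every character in s to lowercase. |
to_upper | String -> String | Converts every character in s to uppercase. |
lstrip | String -> String | Removes leading whitespace in s. |
rstrip | String -> String | Removes trailing whitespace in s. |
strip | String -> String | Removes leading and trailing whitespace in s. |
lines | String -> [String] | Returns a list of lines in s. |
reverse | String -> String | Reverses the string s. |
starts_with? | (String, String) -> Bool | Returns true if s starts with the given prefix. |
ends_with? | (String, String) -> Bool | Returns true if s ends with the given suffix. |
substr? | (String, String) -> Bool | Returns true if s contains the substring sub. |
re | (String, [RegexOpt]) -> Regex | Creates a regular expression based on the pattern given by string s. |
matches? | (String, Regex) -> Bool | Returns true if the regex r successfully matches some part of s. |
search | (String, Regex) -> [String] | Runs the regex r on s once. |
search_all | (String, Regex) -> [[String]] | Runs the regex r on s iteratively until there are no more matches. |
span | (String, String) -> Option<(String, String)> | Looks for the first occurrence of sub in s. |
rspan | (String, String) -> Option<(String, String)> | Looks for the last occurrence of sub in s. |
search_span | (String, Regex) -> Option<(String, [String], String)> | Runs the regex r on s once. |
Pattern | T ~ Pattern | A Pattern is used to split a string into parts or to replace portions of a string. |
splitn | (String, T ~ Pattern, Int) -> [String] | Splits a string into at most n parts. |
replace | (String, T ~ Pattern, String) -> String | Replaces all occurrences of pat in s with replacement. |
replace_one | (String, T ~ Pattern, String) -> String | Replaces the first occurrence of pat in s with replacement. |
split | (String, T ~ Pattern) -> [String] | This works exactly like splitn() when n is set to -1. |
to_chars | String -> [Char] | Converts a string s into a list of characters. |
from_chars | [Char] -> String | Converts a list of characters, chars, into a string. |
to_int | String -> Int | Converts a to an integer. |
to_int | [Char] -> Int | Converts a to an integer. |
to_float | String -> Float | Converts a to a float. |
to_float | [Char] -> Float | Converts a to a float. |
to_atom | String -> Atom | Converts a to an atom. |
to_atom | [Char] -> Atom | Converts a to an atom. |
Regex | Regex | Represents a compiled regular expression. |
RegexOpt | RegexOpt | Options for creating a regular expression, passed to re(). |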
Functions
slice : (String, Int, Int) -> String
slice(s, i, len)
Returns a substring of s starting at index i with length len. i can be negative, where -1 is the last index, -2 is the second-to-last index, etc. If len <= 0, returns "". If slice tries to access an element at an invalid index, it raises BadStringIndex.
assert slice("bar", 0, 1) == "b"
assert slice("bar", -1, 1) == "r"
assert slice("bar", 0, 3) == "bar"
assert slice("bar", -3, 2) == "ba"
range : (String, Int, Int) -> String
range(s, start, end)
Returns a substring of s starting at index start and ending at index end, inclusive. start and end can be negative, where -1 is the last index, -2 is the second-to-last index, etc. If start > end (after resolving negative indexes), returns "". If range tries to access an element at an invalid index, it raises BadStringIndex.
assert range("bar", 0, 1) == "ba"
assert range("bar", -1, 2) == "r"
assert range("bar", 2, -2) == ""
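Since slice takes a length while range takes an inclusive end index, it can help to see the same substrings requested both ways. The sketch below is derived from the two descriptions above and has not been run against a Par installation; the input "hello" is chosen only for illustration.

```par
// indexes of "hello": h=0, e=1, l=2, l=3, o=4
// three characters starting at index 1
assert slice("hello", 1, 3) == "ell"
// indexes 1 through 3, inclusive
assert range("hello", 1, 3) == "ell"

// negative indexes: -3 resolves to 2 and -2 resolves to 3
assert slice("hello", -3, 2) == "ll"
assert range("hello", -3, -2) == "ll"

// the empty-result cases for each function
assert slice("hello", 2, 0) == ""
assert range("hello", 3, 2) == ""
```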
capitalize : String -> String
capitalize(s)
Capitalizes the first letter in s, returning a new string. If the first character isn't a letter, simply returns s.
assert capitalize("foo BAR") == "Foo BAR"
assert capitalize("Foo BAR") == "Foo BAR"
assert capitalize("158") == "158"
assert capitalize("") == ""
to_lower : String -> String
to_lower(s)
Converts every character in s to lowercase, returning a new string.
assert to_lower("foo BAR") == "foo bar"
assert to_lower("158") == "158"
assert to_lower("") == ""
to_upper : String -> String
to_upper(s)
Converts every character in s to uppercase, returning a new string.
assert to_upper("foo BAR") == "FOO BAR"
assert to_upper("158") == "158"
assert to_upper("") == ""
lstrip : String -> String
lstrip(s)
Removes leading whitespace in s, returning a new string.
assert lstrip(" \n\tfoo bar") == "foo bar"
assert lstrip(" \r\n\tfoo bar \t\r\n") == "foo bar \t\r\n"
assert lstrip("") == ""
rstrip : String -> String
rstrip(s)
Removes trailing whitespace in s, returning a new string.
assert rstrip("foo bar \t\n") == "foo bar"
assert rstrip(" \r\n\tfoo bar \t\r\n") == " \r\n\tfoo bar"
assert rstrip("") == ""
strip : String -> String
strip(s)
Removes leading and trailing whitespace in s, returning a new string.
assert strip(" \n\tfoo bar") == "foo bar"
assert strip("foo bar \t\n") == "foo bar"
assert strip(" \r\n\tfoo bar \t\r\n") == "foo bar"
lines : String -> [String]
lines(s)
Returns a list of lines in s, where lines are separated either by "\n" or "\r\n". Each line in the result list is stripped of the trailing "\n" or "\r\n". All trailing blank lines at the end of s are removed from the resultant list.
assert lines("foo bar") == ["foo bar"]
assert lines("foo\nbar") == ["foo", "bar"]
assert lines("foo\r\nbar\r\n") == ["foo", "bar"]
assert lines("\r\nfoo bar") == ["", "foo bar"]
assert lines("\r\n\r\n") == []
reverse : String -> String
reverse(s)
Reverses the string s, returning a new string.
assert reverse("foo bar") == "rab oof"
assert reverse("") == ""
starts_with? : (String, String) -> Bool
starts_with?(s, prefix)
Returns true if s starts with the given prefix.
assert starts_with?("foo bar", "foo ")
assert starts_with?("foo bar", "")
assert !starts_with?("foo bar", "asdf")
assert !starts_with?("foo bar", "F")
ends_with? : (String, String) -> Bool
ends_with?(s, suffix)
Returns true if s ends with the given suffix.
assert ends_with?("foo bar", "r")
assert ends_with?("foo bar", "bar")
assert ends_with?("foo bar", "")
assert !ends_with?("foo bar", "asdf")
substr? : (String, String) -> Bool
substr?(s, sub)
Returns true if s contains the substring sub. If s == sub, returns true.
assert substr?("foo bar", "r")
assert substr?("foo bar", "bar")
assert !substr?("foo bar", "flo")
assert !substr?("foo bar", "R")
re : (String, [RegexOpt]) -> Regex
re(s, opts)
Creates a regular expression based on the pattern given by string s. See Erlang's re documentation for a full description of the supported regex syntax. If s isn't a valid pattern, raises BadRegex.
Many regular expressions contain literal backslashes. For instance, the regular expression \d+ matches one or more digits. Because Par uses backslashes as the escape character in normal strings, you'd have to write this as String.re("\\d+", []) (the \\ represents a single \ in a string). To avoid this, it's recommended to use raw strings, which are delimited by backticks. In raw strings, backslashes don't need to be escaped, so you can simply write String.re(`\d+`, []). See primitives for details about raw strings.
opts is a list of regex options. See RegexOpt for details about each option.
RegExr is a useful resource to test regular expressions and understand how they match text.
// matches one or more digits, optionally followed by a decimal
// point and more digits
re(`\d+(\.\d+)?`, [])
// matches the literal text "foo bar", case-insensitive
re(`foo bar`, [Caseless])
// matches any text within braces that spans an entire line
re(`^\{.*?\}$`, [Multiline])
matches? : (String, Regex) -> Bool
matches?(s, r)
Returns true if the regex r successfully matches some part of s.
assert matches?("foo bar", re(`ba`, []))
assert matches?("foo bar", re(`\w+`, []))
assert matches?("foo bar", re(`FO{2}\s`, [Caseless]))
assert !matches?("foo bar", re(`Bar`, []))
assert !matches?("foo bar", re(`o{3}`, []))
search : (String, Regex) -> [String]
search(s, r)
Runs the regex r on s once. If there is a match, returns a list of strings describing the match. The first element in the resultant list is the full matched text. The second element is the text matched by the first capture group in r. The third element is the text matched by the second capture group in r, and so on for all capture groups. If no match is found, returns [].
Capture groups are defined by parentheses. In the regex foo ((\w+) bar), the first capture group is ((\w+) bar), which matches one or more word characters followed by the literal text bar. The second capture group, which happens to be a subset of the first capture group, is (\w+), which matches one or more word characters.
See re() for how to create a Regex and details about syntax.
assert search("foo bar", re(`\w+`, [])) == ["foo"]
assert search("foo bar", re(`foo (\w{2})`, [])) == ["foo ba", "ba"]
assert search("foo\nbar", re(`^bar$`, [Multiline])) == ["bar"]
assert search("foo bar", re(`f o o`, [Extended])) == ["foo"]
assert search("foo bar", re(`Bar`, [])) == []
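The nested capture groups described above can be made concrete with a sketch like the following. The input string "foo baz bar" is chosen only for illustration, and the expected list is derived from the description rather than from a test run.

```par
// group 1 is ((\w+) bar) and group 2 is the inner (\w+);
// groups are numbered by the position of their opening parenthesis
assert search("foo baz bar", re(`foo ((\w+) bar)`, [])) ==
  ["foo baz bar", "baz bar", "baz"]
```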
search_all : (String, Regex) -> [[String]]
search_all(s, r)
Runs the regex r on s iteratively until there are no more matches. For each match, creates a list of strings describing the match, where the first element is the full matched text, and each subsequent element corresponds to a capture group in r (see search() for details). Returns a list of lists of strings describing all matches.
assert search_all("foo bar", re(`\w+`, [])) == [["foo"], ["bar"]]
assert search_all("foo bar", re(`\w(\w)`, [])) == [["fo", "o"], ["ba", "a"]]
assert search_all("foo bar", re(`O`, [Caseless])) == [["o"], ["o"]]
assert search_all("foo bar", re(`Bar`, [])) == []
span : (String, String) -> Option<(String, String)>
span(s, sub)
Looks for the first occurrence of sub in s. If sub is found, returns Some((a, b)) where a is the text before sub, and b is the text after sub. If sub isn't found, returns None.
assert span("foo bar", "bar") == Some(("foo ", ""))
assert span("foo bar", "o") == Some(("f", "o bar"))
assert span("foo bar", "flo") == None
rspan : (String, String) -> Option<(String, String)>
rspan(s, sub)
Looks for the last occurrence of sub in s. If sub is found, returns Some((a, b)) where a is the text before sub, and b is the text after sub. If sub isn't found, returns None.
assert rspan("foo bar", "bar") == Some(("foo ", ""))
assert rspan("foo bar", "o") == Some(("fo", " bar"))
assert rspan("foo bar", "flo") == None
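The difference between span and rspan shows most clearly when sub occurs more than once. The sketch below uses a made-up "a=b=c" input; the expected values follow from the two descriptions above rather than from a test run.

```par
// span splits around the first "=", rspan around the last
assert span("a=b=c", "=") == Some(("a", "b=c"))
assert rspan("a=b=c", "=") == Some(("a=b", "c"))

// sub may be longer than one character; none of it appears in a or b
assert span("a--b", "--") == Some(("a", "b"))
```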
search_span : (String, Regex) -> Option<(String, [String], String)>
search_span(s, r)
Runs the regex r on s once. If there is a match, returns Some((a, l, b)), where a is the text before the match, l is a list describing the match, and b is the text after the match. Each element in l corresponds to a capture group in r (see search() for details). Note that, unlike search(), the first element in l is not the fully matched text (though you can add a capture group around the entire regex to get this behavior). If there is no match, returns None.
assert search_span("foo bar", re(`ba`, [])) == Some(("foo ", [], "r"))
assert search_span("foo bar", re(`(\w+)`, [])) == Some(("", ["foo"], " bar"))
assert search_span("foo bar", re(`\w(\w)`, [])) == Some(("", ["o"], "o bar"))
assert search_span("foo bar", re(`Bar`, [])) == None
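To illustrate the note about wrapping the entire regex in a capture group, here is a sketch that compares the two forms. The expected values are derived from the description above and have not been verified by running them.

```par
// without an outer group, l only holds the inner capture
assert search_span("foo bar", re(`\w(\w)`, [])) == Some(("", ["o"], "o bar"))

// with an outer group, the full match "fo" becomes the first element
assert search_span("foo bar", re(`(\w(\w))`, [])) ==
  Some(("", ["fo", "o"], "o bar"))
```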
split : (String, T ~ Pattern) -> [String]
split(s, r)
This works exactly like splitn() when n is set to -1, meaning there's no limit on the number of splits.
assert split("foo bar", " ") == ["foo", "bar"]
assert split("foo bar", "o") == ["f", "", " bar"]
assert split("foo bar", re(`(\w+)`, [])) == ["", "foo", " ", "bar", ""]
assert split("foo bar", re(``, [])) == ["f", "o", "o", " ", "b", "a", "r"]
to_chars : String -> [Char]
to_chars(s)
Converts a string s into a list of characters.
assert to_chars("foo BAR") == ['f', 'o', 'o', ' ', 'B', 'A', 'R']
assert to_chars("åäö") == ['å', 'ä', 'ö']
assert to_chars("") == []
from_chars : [Char] -> String
from_chars(chars)
Converts a list of characters, chars, into a string.
assert from_chars(['f', 'o', 'o', ' ', 'B', 'A', 'R']) == "foo BAR"
assert from_chars(['å', 'ä', 'ö']) == "åäö"
assert from_chars([]) == ""
Interfaces
interface Pattern
A Pattern is used to split a string into parts or to replace portions of a string. A pattern identifies where to split or what to replace. The types String and Regex implement the Pattern interface.
interface Pattern {
  splitn : (String, T, Int) -> [String]
  replace : (String, T, String) -> String
  replace_one : (String, T, String) -> String
}
splitn : (String, T ~ Pattern, Int) -> [String]
splitn(s, pat, n)
Splits a string into at most n parts, where each part is separated by the given pattern pat. Returns a list with these constituent parts. If n <= 0, there is no limit to the number of parts. If n > 0 and there are more than n - 1 occurrences of pat, only the first n - 1 occurrences will be used to determine the parts.
If pat is a string, this function splits on substrings that exactly match pat.
If pat is a regex, this function splits on any substring that matches pat. If pat contains capture groups, the matched text for each capture group is included in the result list, located in-between the two parts that were split by the substring matched by pat.
If s is the empty string "", returns [].
assert splitn("foo bar", "o", 2) == ["f", "o bar"]
assert splitn("foo bar", "bar", 10) == ["foo ", ""]
assert splitn("foo bar", re(`(\w+)`, []), 2) == ["", "foo", " bar"]
assert splitn("foo\nbar", re(`^bar$`, [Multiline]), 1) == ["foo\nbar"]
assert splitn("foo bar", re(``, []), -1) == ["f", "o", "o", " ", "b", "a", "r"]
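To make the effect of n easier to see, the sketch below splits one made-up input, "a,b,c,d", with several limits. The expected lists follow from the rules above (only the first n - 1 occurrences of pat are used) and have not been run.

```par
// n = 1 uses zero occurrences of pat, so the string is returned whole
assert splitn("a,b,c,d", ",", 1) == ["a,b,c,d"]
// n = 2 and n = 3 split on the first one and two commas, respectively
assert splitn("a,b,c,d", ",", 2) == ["a", "b,c,d"]
assert splitn("a,b,c,d", ",", 3) == ["a", "b", "c,d"]
// n <= 0, or an n larger than the number of parts, imposes no limit
assert splitn("a,b,c,d", ",", 0) == ["a", "b", "c", "d"]
assert splitn("a,b,c,d", ",", 10) == ["a", "b", "c", "d"]
// the empty string always yields an empty list
assert splitn("", ",", 2) == []
```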
replace : (String, T ~ Pattern, String) -> String
replace(s, pat, replacement)
Replaces all occurrences of pat in s with replacement.
If pat is a string, this function looks for substrings that exactly match pat, and replaces them with the literal text in replacement.
If pat is a regex, this function looks for substrings that match pat. The special sequences \1, \2, and so on in replacement can be used to include the text captured by the corresponding capture group in pat (\1 for the first group, \2 for the second, etc.).
If you're including \1, \2, etc. in replacement, it's generally best to use raw strings so you don't need to escape the backslashes, such as in replace(s, pat, `\1`). Otherwise, you'll need to write replace(s, pat, "\\1"). See re() for details about raw strings.
assert replace("foo bar", "r", "cat") == "foo bacat"
assert replace("foo bar", "o", "i") == "fii bar"
assert replace("foo bar", "flo", "abc") == "foo bar"
assert replace("foo bar", re(`\w+`, []), "baz") == "baz baz"
assert replace("foo bar", re(`foo (\w{2})`, []), `\1zaa`) == "bazaar"
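As a further sketch of capture-group references, the following swaps two captured words and shows the raw and regular forms of the same replacement string. The inputs are illustrative, and the expected results are derived from the description above rather than from a test run.

```par
// \1 is the first captured word and \2 is the second
assert replace("foo bar", re(`(\w+) (\w+)`, []), `\2 \1`) == "bar foo"
// the same replacement written as a regular string needs \\
assert replace("foo bar", re(`(\w+) (\w+)`, []), "\\2 \\1") == "bar foo"

// every match is rewritten, each using its own captures
assert replace("a1 b2", re(`([a-z])(\d)`, []), `\2\1`) == "1a 2b"
```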
replace_one : (String, T ~ Pattern, String) -> String
replace_one(s, pat, replacement)
Replaces the first occurrence of pat in s with replacement. This works exactly like replace, but only replaces one occurrence.
assert replace_one("foo bar", "r", "cat") == "foo bacat"
assert replace_one("foo bar", "o", "i") == "fio bar"
assert replace_one("foo bar", "flo", "abc") == "foo bar"
assert replace_one("foo bar", re(`\w+`, []), "baz") == "baz bar"
assert replace_one("foo bar", re(`foo (\w{2})`, []), `\1zaa`) == "bazaar"
Implementations
impl Sized for String
The following functions are from the Sized interface.
length : String -> Int
length(sized)
Returns the length of sized. See the full description in the Sized interface.
empty? : String -> Bool
empty?(sized)
Returns whether sized is empty. See the full description in the Sized interface.
impl Concat for String
The following functions are from the Concat interface.
concat : (String, String) -> String
concat(a, b)
Concatenates a and b together, equivalent to a ++ b. See the full description in the Concat interface.
concat_all : [String] -> String
concat_all(l)
Concatenates all elements in the list l together. See the full description in the Concat interface.
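The Sized and Concat functions have no examples on this page, so here is a minimal sketch using ASCII strings. It assumes these functions are in scope unqualified and that length counts characters; the authoritative behavior is in the Sized and Concat interface documentation.

```par
assert length("foo bar") == 7
assert empty?("")
assert !empty?("foo")

// concat(a, b) is described as equivalent to a ++ b
assert concat("foo", "bar") == "foobar"
assert concat("foo", "bar") == ("foo" ++ "bar")
assert concat_all(["foo", " ", "bar"]) == "foo bar"
```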
impl Pattern for String
The following functions are from the Pattern interface.
splitn : (String, String, Int) -> [String]
splitn(s, pat, n)
Splits a string into at most n parts, where each part is separated by the given pattern pat. See the full description in the Pattern interface.
replace : (String, String, String) -> String
replace(s, pat, replacement)
Replaces all occurrences of pat in s with replacement. See the full description in the Pattern interface.
replace_one : (String, String, String) -> String
replace_one(s, pat, replacement)
Replaces the first occurrence of pat in s with replacement. See the full description in the Pattern interface.
impl Pattern for Regex
The following functions are from the Pattern interface.
splitn : (String, Regex, Int) -> [String]
splitn(s, pat, n)
Splits a string into at most n parts, where each part is separated by the given pattern pat. See the full description in the Pattern interface.
replace : (String, Regex, String) -> String
replace(s, pat, replacement)
Replaces all occurrences of pat in s with replacement. See the full description in the Pattern interface.
replace_one : (String, Regex, String) -> String
replace_one(s, pat, replacement)
Replaces the first occurrence of pat in s with replacement. See the full description in the Pattern interface.
impl ToInt for String
The following functions are from the ToInt interface.
to_int : String -> Int
to_int(a)
Converts a to an integer. See the full description in the ToInt interface.
impl ToInt for [Char]
The following functions are from the ToInt interface.
to_int : [Char] -> Int
to_int(a)
Converts a to an integer. See the full description in the ToInt interface.
impl ToFloat for String
The following functions are from the ToFloat interface.
to_float : String -> Float
to_float(a)
Converts a to a float. See the full description in the ToFloat interface.
impl ToFloat for [Char]
The following functions are from the ToFloat interface.
to_float : [Char] -> Float
to_float(a)
Converts a to a float. See the full description in the ToFloat interface.
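A small sketch of the numeric conversions follows, covering both the String and [Char] implementations. It assumes well-formed decimal input; how malformed input is handled is specified by the ToInt and ToFloat interfaces and is not shown here.

```par
assert to_int("42") == 42
assert to_float("3.5") == 3.5

// the [Char] implementations accept the output of to_chars
assert to_int(to_chars("42")) == 42
assert to_float(to_chars("3.5")) == 3.5
```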
impl ToAtom for String
The following functions are from the ToAtom interface.
to_atom : String -> Atom
to_atom(a)
Converts a to an atom. See the full description in the ToAtom interface.
impl ToAtom for [Char]
The following functions are from the ToAtom interface.
to_atom : [Char] -> Atom
to_atom(a)
Converts a to an atom. See the full description in the ToAtom interface.
Types
struct Regex
Represents a compiled regular expression. To create a regular expression, use re().
enum RegexOpt
Options for creating a regular expression, passed to re().
Caseless: Case-insensitive, so A matches both A and a.
Multiline: ^ matches at the start of a line and $ matches at the end of a line. Without this option, ^ only matches at the start of the entire string, and $ only matches at the end of the entire string (or before a final \n character).
DotAll: . matches any character, including \n. Without this option, . matches anything except \n.
Extended: Whitespace characters are ignored, unless they're escaped by \ or in a character class. An unescaped # starts a comment that lasts until the next \n.
enum RegexOpt {
  Caseless
  Multiline
  DotAll
  Extended
}
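The effect of each option can be sketched by running the same pattern with and without it. The inputs below are illustrative, and the expected results follow from the option descriptions above rather than from a test run.

```par
// Caseless: letters match regardless of case
assert !matches?("FOO", re(`foo`, []))
assert matches?("FOO", re(`foo`, [Caseless]))

// Multiline: ^ and $ also match at line boundaries
assert !matches?("foo\nbar", re(`^bar$`, []))
assert matches?("foo\nbar", re(`^bar$`, [Multiline]))

// DotAll: . also matches "\n"
assert !matches?("foo\nbar", re(`foo.bar`, []))
assert matches?("foo\nbar", re(`foo.bar`, [DotAll]))

// Extended: unescaped whitespace in the pattern is ignored,
// so `foo bar` behaves like `foobar`
assert matches?("foobar", re(`foo bar`, [Extended]))
assert !matches?("foo bar", re(`foo bar`, [Extended]))

// options can be combined in one list
assert matches?("foo\nBAR", re(`^bar$`, [Caseless, Multiline]))
```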
Exceptions
exception BadStringIndex(Int)
Raised by slice() and range() when they try to access an element at an invalid index.
exception BadRegex({ reason : String, index : Int })
Raised by re() when the regular expression pattern is invalid. reason is a description of what's wrong, and index is the index at which the error occurs.