String

Standard library for string operations.

Functions in this module operate on type String, which is an ordered collection of UTF-8 characters. Literal strings can be created with double quotes (a regular string) or backticks (a raw string), such as in "hello" or `hey there`.

Within a regular string, escape sequences can be used to include special characters. Within a raw string, no escape sequences are allowed, meaning backslash is a literal character. Raw strings are often used to define regular expressions, which frequently contain literal backslashes.
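
As a small illustration of the difference, the following sketch compares regular and raw literals that should denote the same text. It assumes only that \\ is the escape sequence for a single backslash in a regular string, as described for re() further down.

```par
// "\\" in a regular string is one backslash; a raw string needs no escaping
assert "\\d+" == `\d+`
assert "C:\\tmp" == `C:\tmp`
```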

To import all names from this module, use:

import String (*)

Without an import, you can still access any name in this module by prefixing it with String., as in String.slice.
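
For instance, a minimal sketch of qualified access, using two functions documented on this page:

```par
// no import: qualify each name with the module prefix
assert String.slice("bar", 0, 1) == "b"
assert String.to_upper("bar") == "BAR"
```

The remaining examples on this page are written unqualified, which assumes the import shown above.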

Index

Name | Type | Description

length(sized)

String -> Int

Returns the length of sized.

empty?(sized)

String -> Bool

Returns whether sized is empty.

concat(a, b)

(String, String) -> String

Concatenates a and b together, equivalent to a ++ b.

concat_all(l)

[String] -> String

Concatenates all elements in the list l together.

slice(s, i, len)

(String, Int, Int) -> String

Returns a substring of s starting at index i with length len.

range(s, start, end)

(String, Int, Int) -> String

Returns a substring of s starting at index start and ending at index end, inclusive.

capitalize(s)

String -> String

Capitalizes the first letter in s, returning a new string.

to_lower(s)

String -> String

Converts every character in s to lowercase, returning a new string.

to_upper(s)

String -> String

Converts every character in s to uppercase, returning a new string.

lstrip(s)

String -> String

Removes leading whitespace in s, returning a new string.

rstrip(s)

String -> String

Removes trailing whitespace in s, returning a new string.

strip(s)

String -> String

Removes leading and trailing whitespace in s, returning a new string.

lines(s)

String -> [String]

Returns a list of lines in s, where lines are separated either by "\n" or "\r\n".

reverse(s)

String -> String

Reverses the string s, returning a new string.

starts_with?(s, prefix)

(String, String) -> Bool

Returns true if s starts with the given prefix.

ends_with?(s, suffix)

(String, String) -> Bool

Returns true if s ends with the given suffix.

substr?(s, sub)

(String, String) -> Bool

Returns true if s contains the substring sub.

re(s, opts)

(String, [RegexOpt]) -> Regex

Creates a regular expression based on the pattern given by string s.

matches?(s, r)

(String, Regex) -> Bool

Returns true if the regex r successfully matches some part of s.

search(s, r)

(String, Regex) -> [String]

Runs the regex r on s once.

search_all(s, r)

(String, Regex) -> [[String]]

Runs the regex r on s iteratively until there are no more matches.

span(s, sub)

(String, String) -> Option<(String, String)>

Looks for the first occurrence of sub in s.

rspan(s, sub)

(String, String) -> Option<(String, String)>

Looks for the last occurrence of sub in s.

search_span(s, r)

(String, Regex) -> Option<(String, [String], String)>

Runs the regex r on s once.

interface Pattern

T ~ Pattern

A Pattern is used to split a string into parts or to replace portions of a string.

splitn(s, pat, n)

(String, T ~ Pattern, Int) -> [String]

Splits a string into at most n parts, where each part is separated by the given pattern pat.

replace(s, pat, replacement)

(String, T ~ Pattern, String) -> String

Replaces all occurrences of pat in s with replacement.

replace_one(s, pat, replacement)

(String, T ~ Pattern, String) -> String

Replaces the first occurrence of pat in s with replacement.

split(s, r)

(String, T ~ Pattern) -> [String]

Splits s into parts separated by the pattern r, with no limit on the number of parts. This works exactly like splitn() when n is set to -1.

to_chars(s)

String -> [Char]

Converts a string s into a list of characters.

from_chars(chars)

[Char] -> String

Converts a list of characters, chars, into a string.

to_int(a)

String -> Int

Converts a to an integer.

to_int(a)

[Char] -> Int

Converts a to an integer.

to_float(a)

String -> Float

Converts a to a float.

to_float(a)

[Char] -> Float

Converts a to a float.

to_atom(a)

String -> Atom

Converts a to an atom.

to_atom(a)

[Char] -> Atom

Converts a to an atom.

struct Regex

Regex

Represents a compiled regular expression.

enum RegexOpt

RegexOpt

Options for creating a regular expression, passed to re().

Functions

slice : (String, Int, Int) -> String
slice(s, i, len)

Returns a substring of s starting at index i with length len. i can be negative, where -1 is the last index, -2 is the second-to-last index, etc. If len <= 0, returns "". If slice tries to access an element at an invalid index, it raises BadStringIndex.

assert slice("bar", 0, 1) == "b"
assert slice("bar", -1, 1) == "r"
assert slice("bar", 0, 3) == "bar"
assert slice("bar", -3, 2) == "ba"
range : (String, Int, Int) -> String
range(s, start, end)

Returns a substring of s starting at index start and ending at index end, inclusive. start and end can be negative, where -1 is the last index, -2 is the second-to-last index, etc. If start > end (after resolving negative indexes), returns "". If range tries to access an element at an invalid index, it raises BadStringIndex.

assert range("bar", 0, 1) == "ba"
assert range("bar", -1, 2) == "r"
assert range("bar", 2, -2) == ""
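
The index rules for slice() and range() can be compared side by side. The following is a sketch derived from the descriptions above, using a longer string so that negative indexes are easier to follow; the expected values are worked out by hand from those rules.

```par
// -2 resolves to index 3 in a five-character string
assert slice("hello", -2, 2) == "lo"
assert range("hello", -2, -1) == "lo"

// slice(s, i, len) covers the same characters as range(s, i, i + len - 1)
assert slice("hello", 1, 3) == range("hello", 1, 3)

// a non-positive length gives the empty string
assert slice("hello", 1, 0) == ""
```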
capitalize : String -> String
capitalize(s)

Capitalizes the first letter in s, returning a new string. If the first character isn't a letter, simply returns s.

assert capitalize("foo BAR") == "Foo BAR"
assert capitalize("Foo BAR") == "Foo BAR"
assert capitalize("158") == "158"
assert capitalize("") == ""
to_lower : String -> String
to_lower(s)

Converts every character in s to lowercase, returning a new string.

assert to_lower("foo BAR") == "foo bar"
assert to_lower("158") == "158"
assert to_lower("") == ""
to_upper : String -> String
to_upper(s)

Converts every character in s to uppercase, returning a new string.

assert to_upper("foo BAR") == "FOO BAR"
assert to_upper("158") == "158"
assert to_upper("") == ""
lstrip : String -> String
lstrip(s)

Removes leading whitespace in s, returning a new string.

assert lstrip(" \n\tfoo bar") == "foo bar"
assert lstrip(" \r\n\tfoo bar \t\r\n") == "foo bar \t\r\n"
assert lstrip("") == ""
rstrip : String -> String
rstrip(s)

Removes trailing whitespace in s, returning a new string.

assert rstrip("foo bar \t\n") == "foo bar"
assert rstrip(" \r\n\tfoo bar \t\r\n") == " \r\n\tfoo bar"
assert rstrip("") == ""
strip : String -> String
strip(s)

Removes leading and trailing whitespace in s, returning a new string.

assert strip(" \n\tfoo bar") == "foo bar"
assert strip("foo bar \t\n") == "foo bar"
assert strip(" \r\n\tfoo bar \t\r\n") == "foo bar"
lines : String -> [String]
lines(s)

Returns a list of lines in s, where lines are separated either by "\n" or "\r\n". Each line in the result list is stripped of the trailing "\n" or "\r\n". All trailing blank lines at the end of s are removed from the resultant list.

assert lines("foo bar") == ["foo bar"]
assert lines("foo\nbar") == ["foo", "bar"]
assert lines("foo\r\nbar\r\n") == ["foo", "bar"]
assert lines("\r\nfoo bar") == ["", "foo bar"]
assert lines("\r\n\r\n") == []
reverse : String -> String
reverse(s)

Reverses the string s, returning a new string.

assert reverse("foo bar") == "rab oof"
assert reverse("") == ""
starts_with? : (String, String) -> Bool
starts_with?(s, prefix)

Returns true if s starts with the given prefix.

assert starts_with?("foo bar", "foo ")
assert starts_with?("foo bar", "")
assert !starts_with?("foo bar", "asdf")
assert !starts_with?("foo bar", "F")
ends_with? : (String, String) -> Bool
ends_with?(s, suffix)

Returns true if s ends with the given suffix.

assert ends_with?("foo bar", "r")
assert ends_with?("foo bar", "bar")
assert ends_with?("foo bar", "")
assert !ends_with?("foo bar", "asdf")
substr? : (String, String) -> Bool
substr?(s, sub)

Returns true if s contains the substring sub. If s == sub, returns true.

assert substr?("foo bar", "r")
assert substr?("foo bar", "bar")
assert !substr?("foo bar", "flo")
assert !substr?("foo bar", "R")
re : (String, [RegexOpt]) -> Regex
re(s, opts)

Creates a regular expression based on the pattern given by string s. See Erlang's re documentation for a full description of the supported regex syntax. If s isn't a valid pattern, raises BadRegex.

Many regular expressions contain literal backslashes. For instance, the regular expression \d+ matches one or more digits. Because Par uses backslashes as the escape character in normal strings, you'd have to write this as String.re("\\d+", []) (the \\ represents a single \ in a string). To avoid this, it's recommended to use raw strings, which are delimited by backticks. In raw strings, backslashes don't need to be escaped, so you can simply write String.re(`\d+`, []). See primitives for details about raw strings.

opts is a list of regex options. See RegexOpt for details about each option.

RegExr is a useful resource to test regular expressions and understand how they match text.

// matches one or more digits, optionally followed by a decimal
// point and more digits
re(`\d+(\.\d+)?`, [])

// matches the literal text "foo bar", case-insensitive
re(`foo bar`, [Caseless])

// matches any text within braces that spans an entire line
re(`^\{.*?\}$`, [Multiline])
matches? : (String, Regex) -> Bool
matches?(s, r)

Returns true if the regex r successfully matches some part of s.

assert matches?("foo bar", re(`ba`, []))
assert matches?("foo bar", re(`\w+`, []))
assert matches?("foo bar", re(`FO{2}\s`, [Caseless]))
assert !matches?("foo bar", re(`Bar`, []))
assert !matches?("foo bar", re(`o{3}`, []))
search : (String, Regex) -> [String]
search(s, r)

Runs the regex r on s once. If there is a match, returns a list of strings describing the match. The first element in the resultant list is the full matched text. The second element is the text matched by the first capture group in r. The third element is the text matched by the second capture group in r, and so on for all capture groups. If no match is found, returns [].

Capture groups are defined by parentheses. In the regex foo ((\w+) bar), the first capture group is ((\w+) bar), which matches one or more word characters followed by the literal text bar. The second capture group, which happens to be a subset of the first capture group, is (\w+), which matches one or more word characters.

See re() for how to create a Regex and details about syntax.

assert search("foo bar", re(`\w+`, [])) == ["foo"]
assert search("foo bar", re(`foo (\w{2})`, [])) == ["foo ba", "ba"]
assert search("foo\nbar", re(`^bar$`, [Multiline])) == ["bar"]
assert search("foo bar", re(`f  o   o`, [Extended])) == ["foo"]
assert search("foo bar", re(`Bar`, [])) == []
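
The nested capture groups described above can be seen in a single call. This sketch reuses the regex foo ((\w+) bar) from the text; the input string is chosen for illustration.

```par
// full match first, then the outer group, then the inner group
assert search("foo baz bar", re(`foo ((\w+) bar)`, [])) ==
  ["foo baz bar", "baz bar", "baz"]
```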
search_all : (String, Regex) -> [[String]]
search_all(s, r)

Runs the regex r on s iteratively until there are no more matches. For each match, creates a list of strings describing the match, where the first element is the full matched text, and each subsequent element corresponds to a capture group in r (see search() for details). Returns a list of lists of strings describing all matches.

assert search_all("foo bar", re(`\w+`, [])) == [["foo"], ["bar"]]
assert search_all("foo bar", re(`\w(\w)`, [])) ==
  [["fo", "o"], ["ba", "a"]]
assert search_all("foo bar", re(`O`, [Caseless])) == [["o"], ["o"]]
assert search_all("foo bar", re(`Bar`, [])) == []
span : (String, String) -> Option<(String, String)>
span(s, sub)

Looks for the first occurrence of sub in s. If sub is found, returns Some((a, b)) where a is the text before sub, and b is the text after sub. If sub isn't found, returns None.

assert span("foo bar", "bar") == Some(("foo ", ""))
assert span("foo bar", "o") == Some(("f", "o bar"))
assert span("foo bar", "flo") == None
rspan : (String, String) -> Option<(String, String)>
rspan(s, sub)

Looks for the last occurrence of sub in s. If sub is found, returns Some((a, b)) where a is the text before sub, and b is the text after sub. If sub isn't found, returns None.

assert rspan("foo bar", "bar") == Some(("foo ", ""))
assert rspan("foo bar", "o") == Some(("fo", " bar"))
assert rspan("foo bar", "flo") == None
search_span : (String, Regex) -> Option<(String, [String], String)>
search_span(s, r)

Runs the regex r on s once. If there is a match, returns Some((a, l, b)), where a is the text before the match, l is a list describing the match, and b is the text after the match. Each element in l corresponds to a capture group in r (see search() for details). Note that, unlike search(), the first element in l is not the fully matched text (though you can add a capture group around the entire regex to get this behavior). If there is no match, returns None.

assert search_span("foo bar", re(`ba`, [])) ==
  Some(("foo ", [], "r"))
assert search_span("foo bar", re(`(\w+)`, [])) ==
  Some(("", ["foo"], " bar"))
assert search_span("foo bar", re(`\w(\w)`, [])) ==
  Some(("", ["o"], "o bar"))
assert search_span("foo bar", re(`Bar`, [])) == None
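
To make the note about the fully matched text concrete, here is a sketch that runs the same pattern with and without a capture group around the entire regex. The expected values follow from the description above.

```par
// no capture groups: l is empty
assert search_span("foo bar", re(`o+`, [])) ==
  Some(("f", [], " bar"))

// wrapping the whole regex in a group puts the full match in l
assert search_span("foo bar", re(`(o+)`, [])) ==
  Some(("f", ["oo"], " bar"))
```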
split : (String, T ~ Pattern) -> [String]
split(s, r)

Splits s into parts separated by the pattern r, with no limit on the number of parts. This works exactly like splitn() when n is set to -1.

assert split("foo bar", " ") == ["foo", "bar"]
assert split("foo bar", "o") == ["f", "", " bar"]
assert split("foo bar", re(`(\w+)`, [])) == ["", "foo", " ", "bar", ""]
assert split("foo bar", re(``, [])) ==
  ["f", "o", "o", " ", "b", "a", "r"]
to_chars : String -> [Char]
to_chars(s)

Converts a string s into a list of characters.

assert to_chars("foo BAR") == ['f', 'o', 'o', ' ', 'B', 'A', 'R']
assert to_chars("åäö") == ['å', 'ä', 'ö']
assert to_chars("") == []
from_chars : [Char] -> String
from_chars(chars)

Converts a list of characters, chars, into a string.

assert from_chars(['f', 'o', 'o', ' ', 'B', 'A', 'R']) == "foo BAR"
assert from_chars(['å', 'ä', 'ö']) == "åäö"
assert from_chars([]) == ""

Interfaces

interface Pattern

A Pattern is used to split a string into parts or to replace portions of a string. A pattern identifies where to split or what to replace. The types String and Regex implement the Pattern interface.

interface Pattern {
  splitn : (String, T, Int) -> [String]
  replace : (String, T, String) -> String
  replace_one : (String, T, String) -> String
}
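
Because both String and Regex implement Pattern, the same function accepts either one. The sketch below contrasts the two on an input where the difference is visible: a String pattern is matched literally, while a Regex pattern follows regex syntax, so an unescaped . matches any character.

```par
// String pattern: "." is matched literally
assert replace("a.b", ".", "-") == "a-b"

// Regex pattern: . matches any character
assert replace("a.b", re(`.`, []), "-") == "---"

// Regex pattern with the dot escaped behaves like the String pattern
assert replace("a.b", re(`\.`, []), "-") == "a-b"
```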
splitn : (String, T ~ Pattern, Int) -> [String]
splitn(s, pat, n)

Splits a string into at most n parts, where each part is separated by the given pattern pat. Returns a list with these constituent parts. If n <= 0, there is no limit to the number of parts. If n > 0 and there are more than n - 1 occurrences of pat, only the first n - 1 occurrences will be used to determine the parts.

If pat is a string, this function splits on substrings that exactly match pat.

If pat is a regex, this function splits on any substring that matches pat. If pat contains capture groups, the matched text for each capture group is included in the result list, located in-between the two parts that were split by the substring matched by pat.

If s is the empty string "", returns [].

assert splitn("foo bar", "o", 2) == ["f", "o bar"]
assert splitn("foo bar", "bar", 10) == ["foo ", ""]
assert splitn("foo bar", re(`(\w+)`, []), 2) == ["", "foo", " bar"]
assert splitn("foo\nbar", re(`^bar$`, [Multiline]), 1) == ["foo\nbar"]
assert splitn("foo bar", re(``, []), -1) ==
  ["f", "o", "o", " ", "b", "a", "r"]
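
The effect of n is easiest to see by splitting one input with several limits. This is a sketch based on the rules above; the comma-separated input is chosen for illustration.

```par
// n = 1: no occurrences are used, so the string comes back whole
assert splitn("a,b,c,d", ",", 1) == ["a,b,c,d"]

// n = 2 and n = 3: only the first n - 1 occurrences split the string
assert splitn("a,b,c,d", ",", 2) == ["a", "b,c,d"]
assert splitn("a,b,c,d", ",", 3) == ["a", "b", "c,d"]

// n <= 0: no limit
assert splitn("a,b,c,d", ",", 0) == ["a", "b", "c", "d"]

// the empty string always gives an empty list
assert splitn("", ",", 2) == []
```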
replace : (String, T ~ Pattern, String) -> String
replace(s, pat, replacement)

Replaces all occurrences of pat in s with replacement.

If pat is a string, this function looks for substrings that exactly match pat, and replaces them with the literal text in replacement.

If pat is a regex, this function looks for substrings that match pat. The special sequences \1, \2, and so on in replacement insert the text captured by the corresponding capture group in pat: \1 for the first group, \2 for the second, etc.

If you're including \1, \2, etc. in replacement, it's generally best to use raw strings so you don't need to escape the backslashes, such as in replace(s, pat, `\1`). Otherwise, you'll need to write replace(s, pat, "\\1"). See re() for details about raw strings.

assert replace("foo bar", "r", "cat") == "foo bacat"
assert replace("foo bar", "o", "i") == "fii bar"
assert replace("foo bar", "flo", "abc") == "foo bar"
assert replace("foo bar", re(`\w+`, []), "baz") == "baz baz"
assert replace("foo bar", re(`foo (\w{2})`, []), `\1zaa`) == "bazaar"
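
With more than one capture group, the replacement can reorder the captured text. The following sketch swaps two words, once with a raw-string replacement and once with the equivalent escaped regular string, as discussed above.

```par
// raw string: backslashes are written as-is
assert replace("foo bar", re(`(\w+) (\w+)`, []), `\2 \1`) == "bar foo"

// regular string: each backslash must be escaped
assert replace("foo bar", re(`(\w+) (\w+)`, []), "\\2 \\1") == "bar foo"
```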
replace_one : (String, T ~ Pattern, String) -> String
replace_one(s, pat, replacement)

Replaces the first occurrence of pat in s with replacement. This works exactly like replace, but only replaces one occurrence.

assert replace_one("foo bar", "r", "cat") == "foo bacat"
assert replace_one("foo bar", "o", "i") == "fio bar"
assert replace_one("foo bar", "flo", "abc") == "foo bar"
assert replace_one("foo bar", re(`\w+`, []), "baz") == "baz bar"
assert replace_one("foo bar", re(`foo (\w{2})`, []), `\1zaa`) == "bazaar"

Implementations

impl Sized for String

The following functions are from the Sized interface.

length : String -> Int
length(sized)

Returns the length of sized. See the full description in the Sized interface.

empty? : String -> Bool
empty?(sized)

Returns whether sized is empty. See the full description in the Sized interface.

impl Concat for String

The following functions are from the Concat interface.

concat : (String, String) -> String
concat(a, b)

Concatenates a and b together, equivalent to a ++ b. See the full description in the Concat interface.

concat_all : [String] -> String
concat_all(l)

Concatenates all elements in the list l together. See the full description in the Concat interface.

impl Pattern for String

The following functions are from the Pattern interface.

splitn : (String, String, Int) -> [String]
splitn(s, pat, n)

Splits a string into at most n parts, where each part is separated by the given pattern pat. See the full description in the Pattern interface.

replace : (String, String, String) -> String
replace(s, pat, replacement)

Replaces all occurrences of pat in s with replacement. See the full description in the Pattern interface.

replace_one : (String, String, String) -> String
replace_one(s, pat, replacement)

Replaces the first occurrence of pat in s with replacement. See the full description in the Pattern interface.

impl Pattern for Regex

The following functions are from the Pattern interface.

splitn : (String, Regex, Int) -> [String]
splitn(s, pat, n)

Splits a string into at most n parts, where each part is separated by the given pattern pat. See the full description in the Pattern interface.

replace : (String, Regex, String) -> String
replace(s, pat, replacement)

Replaces all occurrences of pat in s with replacement. See the full description in the Pattern interface.

replace_one : (String, Regex, String) -> String
replace_one(s, pat, replacement)

Replaces the first occurrence of pat in s with replacement. See the full description in the Pattern interface.

impl ToInt for String

The following functions are from the ToInt interface.

to_int : String -> Int
to_int(a)

Converts a to an integer. See the full description in the ToInt interface.

impl ToInt for [Char]

The following functions are from the ToInt interface.

to_int : [Char] -> Int
to_int(a)

Converts a to an integer. See the full description in the ToInt interface.

impl ToFloat for String

The following functions are from the ToFloat interface.

to_float : String -> Float
to_float(a)

Converts a to a float. See the full description in the ToFloat interface.

impl ToFloat for [Char]

The following functions are from the ToFloat interface.

to_float : [Char] -> Float
to_float(a)

Converts a to a float. See the full description in the ToFloat interface.

impl ToAtom for String

The following functions are from the ToAtom interface.

to_atom : String -> Atom
to_atom(a)

Converts a to an atom. See the full description in the ToAtom interface.

impl ToAtom for [Char]

The following functions are from the ToAtom interface.

to_atom : [Char] -> Atom
to_atom(a)

Converts a to an atom. See the full description in the ToAtom interface.

Types

struct Regex

Represents a compiled regular expression. To create a regular expression, use re().

enum RegexOpt

Options for creating a regular expression, passed to re().

  • Caseless: Case-insensitive, so A matches both A and a.
  • Multiline: ^ matches at the start of a line and $ matches at the end of a line. Without this option, ^ only matches at the start of the entire string, and $ only matches at the end of the entire string (or before a final \n character).
  • DotAll: . matches any character, including \n. Without this option, . matches anything except \n.
  • Extended: Whitespace characters are ignored, unless they're escaped by \ or in a character class. An unescaped # starts a comment that lasts until the next \n.
enum RegexOpt {
  Caseless
  Multiline
  DotAll
  Extended
}
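
The options can be compared by running the same pattern with and without each one. This is a sketch using matches?() and search() as documented above; the expected results follow from the option descriptions.

```par
// Caseless
assert matches?("FOO", re(`foo`, [Caseless]))
assert !matches?("FOO", re(`foo`, []))

// Multiline: ^ and $ match at line boundaries
assert search("foo\nbar", re(`^bar$`, [Multiline])) == ["bar"]
assert search("foo\nbar", re(`^bar$`, [])) == []

// DotAll: . also matches "\n"
assert matches?("foo\nbar", re(`foo.bar`, [DotAll]))
assert !matches?("foo\nbar", re(`foo.bar`, []))

// options can be combined in one list
assert matches?("FOO\nbar", re(`foo.bar`, [Caseless, DotAll]))
```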

Exceptions

exception BadStringIndex(Int)

Raised by slice() and range() when they try to access an index out of bounds.

exception BadRegex({ reason : String, index : Int })

Raised by re() when the regular expression pattern is invalid. reason is a description of what's wrong, and index is the index at which the error occurs.