Module Str

Contents

Instructions: Use this module in your project

In the IDE (CLion, Visual Studio Code, Xcode, etc.) you use for your DkSDK project:

  1. Add the following to your project's dependencies/CMakeLists.txt:

    Copy
    DkSDKProject_DeclareAvailable(ocaml
        CONSTRAINT "= 4.14.0"
        FINDLIBS str unix runtime_events threads dynlink)
    DkSDKProject_MakeAvailable(ocaml)
  2. Add the Findlib::ocaml library to any desired targets in src/*/CMakeLists.txt:

    Copy
    target_link_libraries(YourPackage_YourLibraryName
         # ... existing libraries, if any ...
         Findlib::ocaml)
  3. Click your IDE's Build button

Not using DkSDK?

FIRST, do one or all of the following:

  1. Run:

    Copy
    opam install ocaml.4.14.0
  2. Edit your dune-project and add:

    Copy
    (package
      (name YourExistingPackage)
      (depends
      ; ... existing dependenices ...
      (ocaml (>= 4.14.0))))

    Then run:

    Copy
    dune build *.opam # if this fails, run: dune build
  3. Edit your <package>.opam file and add:

    Copy
    depends: [
      # ... existing dependencies ...
      "ocaml" {>= "4.14.0"}
    ]

    Then run:

    Copy
    opam install . --deps-only

FINALLY, add the str library to any desired (library)and/or (executable) targets in your **/dune files:

Copy
(library
  (name YourLibrary)
  ; ... existing library options ...
  (libraries
    ; ... existing libraries ...
    str))

(executable
  (name YourExecutable)
  ; ... existing executable options ...
  (libraries
    ; ... existing libraries ...
    str))

Regular expressions

type regexp

The type of compiled regular expressions.

valregexp : ``string-> regexp

Compile a regular expression. The following constructs are recognized:

  • . Matches any character except newline.
  • * (postfix) Matches the preceding expression zero, one or several times
  • + (postfix) Matches the preceding expression one or several times
  • ? (postfix) Matches the preceding expression once or not at all
  • [..] Character set. Ranges are denoted with -, as in [a-z]. An initial ^, as in [^0-9], complements the set. To include a ] character in a set, make it the first character of the set. To include a - character in a set, make it the first or the last character of the set.
  • ^ Matches at beginning of line: either at the beginning of the matched string, or just after a '\n' character.
  • $ Matches at end of line: either at the end of the matched string, or just before a '\n' character.
  • \| (infix) Alternative between two expressions.
  • \(..\) Grouping and naming of the enclosed expression.
  • \1 The text matched by the first \(...\) expression (\2 for the second expression, and so on up to \9).
  • \b Matches word boundaries.
  • \ Quotes special characters. The special characters are $^\.*+?[].

In regular expressions you will often use backslash characters; it's easier to use a quoted string literal {|...|} to avoid having to escape backslashes.

For example, the following expression:

Copy
let r = Str.regexp {|hello \([A-Za-z]+\)|} in
Str.replace_first r {|\1|} "hello world" 

returns the string "world".

If you want a regular expression that matches a literal backslash character, you need to double it: Str.regexp {|\\|}.

You can use regular string literals "..." too, however you will have to escape backslashes. The example above can be rewritten with a regular string literal as:

Copy
let r = Str.regexp "hello \\([A-Za-z]+\\)" in
Str.replace_first r "\\1" "hello world" 

And the regular expression for matching a backslash becomes a quadruple backslash: Str.regexp "\\\\".

valregexp_case_fold : ``string-> regexp

Same as regexp, but the compiled expression will match text in a case-insensitive way: uppercase and lowercase letters will be considered equivalent.

valquote : ``string-> string

Str.quote s returns a regexp string that matches exactly s and nothing else.

valregexp_string : ``string-> regexp

Str.regexp_string s returns a regular expression that matches exactly s and nothing else.

valregexp_string_case_fold : ``string-> regexp

Str.regexp_string_case_fold is similar to Str.regexp_string, but the regexp matches in a case-insensitive way.

String matching and searching

valstring_match :regexp ->``string->``int-> bool

string_match r s start tests whether a substring of s that starts at position start matches the regular expression r. The first character of a string has position 0, as usual.

valsearch_forward :regexp ->``string->``int-> int

search_forward r s start searches the string s for a substring matching the regular expression r. The search starts at position start and proceeds towards the end of the string. Return the position of the first character of the matched substring.

  • raises Not_found

    if no substring matches.

valsearch_backward :regexp ->``string->``int-> int

search_backward r s last searches the string s for a substring matching the regular expression r. The search first considers substrings that start at position last and proceeds towards the beginning of string. Return the position of the first character of the matched substring.

  • raises Not_found

    if no substring matches.

valstring_partial_match :regexp ->``string->``int-> bool

Similar to Str.string_match, but also returns true if the argument string is a prefix of a string that matches. This includes the case of a true complete match.

valmatched_string : ``string-> string

matched_string s returns the substring of s that was matched by the last call to one of the following matching or searching functions:

provided that none of the following functions was called in between:

Note: in the case of global_substitute and substitute_first, a call to matched_string is only valid within the subst argument, not after global_substitute or substitute_first returns.

The user must make sure that the parameter s is the same string that was passed to the matching or searching function.

valmatch_beginning : ``unit-> int

match_beginning() returns the position of the first character of the substring that was matched by the last call to a matching or searching function (see Str.matched_string for details).

valmatch_end : ``unit-> int

match_end() returns the position of the character following the last character of the substring that was matched by the last call to a matching or searching function (see Str.matched_string for details).

valmatched_group : ``int->``string-> string

matched_group n s returns the substring of s that was matched by the nth group \(...\) of the regular expression that was matched by the last call to a matching or searching function (see Str.matched_string for details). When n is 0, it returns the substring matched by the whole regular expression. The user must make sure that the parameter s is the same string that was passed to the matching or searching function.

  • raises Not_found

    if the nth group of the regular expression was not matched. This can happen with groups inside alternatives \|, options ? or repetitions *. For instance, the empty string will match \(a\)*, but matched_group 1 "" will raise Not_found because the first group itself was not matched.

valgroup_beginning : ``int-> int

group_beginning n returns the position of the first character of the substring that was matched by the nth group of the regular expression that was matched by the last call to a matching or searching function (see Str.matched_string for details).

  • raises Not_found

    if the nth group of the regular expression was not matched.

  • raises Invalid_argument

    if there are fewer than n groups in the regular expression.

valgroup_end : ``int-> int

group_end n returns the position of the character following the last character of substring that was matched by the nth group of the regular expression that was matched by the last call to a matching or searching function (see Str.matched_string for details).

  • raises Not_found

    if the nth group of the regular expression was not matched.

  • raises Invalid_argument

    if there are fewer than n groups in the regular expression.

Replacement

valglobal_replace :regexp ->``string->``string-> string

global_replace regexp templ s returns a string identical to s, except that all substrings of s that match regexp have been replaced by templ. The replacement template templ can contain \1, \2, etc; these sequences will be replaced by the text matched by the corresponding group in the regular expression. \0 stands for the text matched by the whole regular expression.

valreplace_first :regexp ->``string->``string-> string

Same as Str.global_replace, except that only the first substring matching the regular expression is replaced.

valglobal_substitute :regexp ->``(``string->string)``->``string-> string

global_substitute regexp subst s returns a string identical to s, except that all substrings of s that match regexp have been replaced by the result of function subst. The function subst is called once for each matching substring, and receives s (the whole text) as argument.

valsubstitute_first :regexp ->``(``string->string)``->``string-> string

Same as Str.global_substitute, except that only the first substring matching the regular expression is replaced.

valreplace_matched : ``string->``string-> string

replace_matched repl s returns the replacement text repl in which \1, \2, etc. have been replaced by the text matched by the corresponding groups in the regular expression that was matched by the last call to a matching or searching function (see Str.matched_string for details). s must be the same string that was passed to the matching or searching function.

Splitting

valsplit :regexp ->``string-> ``string list

split r s splits s into substrings, taking as delimiters the substrings that match r, and returns the list of substrings. For instance, split (regexp "[ \t]+") s splits s into blank-separated words. An occurrence of the delimiter at the beginning or at the end of the string is ignored.

valbounded_split :regexp ->``string->``int-> ``string list

Same as Str.split, but splits into at most n substrings, where n is the extra integer parameter.

valsplit_delim :regexp ->``string-> ``string list

Same as Str.split but occurrences of the delimiter at the beginning and at the end of the string are recognized and returned as empty strings in the result. For instance, split_delim (regexp " ") " abc " returns [""; "abc"; ""], while split with the same arguments returns ["abc"].

valbounded_split_delim :regexp ->``string->``int-> ``string list

Same as Str.bounded_split, but occurrences of the delimiter at the beginning and at the end of the string are recognized and returned as empty strings in the result.

typesplit_result`` =
| Text of string
| Delim of string
valfull_split :regexp ->``string-> split_result list

Same as Str.split_delim, but returns the delimiters as well as the substrings contained between delimiters. The former are tagged Delim in the result list; the latter are tagged Text. For instance, full_split (regexp "[{}]") "{ab}" returns [Delim "{"; Text "ab"; Delim "}"].

valbounded_full_split :regexp ->``string->``int-> split_result list

Same as Str.bounded_split_delim, but returns the delimiters as well as the substrings contained between delimiters. The former are tagged Delim in the result list; the latter are tagged Text.

Extracting substrings

valstring_before : ``string->``int-> string

string_before s n returns the substring of all characters of s that precede position n (excluding the character at position n).

valstring_after : ``string->``int-> string

string_after s n returns the substring of all characters of s that follow position n (including the character at position n).

valfirst_chars : ``string->``int-> string

first_chars s n returns the first n characters of s. This is the same function as Str.string_before.

vallast_chars : ``string->``int-> string

last_chars s n returns the last n characters of s.