Module Stdlib.Lexing

Contents

Instructions: Use this module in your project

In the IDE (CLion, Visual Studio Code, Xcode, etc.) you use for your DkSDK project:

  1. Add the following to your project's dependencies/CMakeLists.txt:

    Copy
    DkSDKProject_DeclareAvailable(ocaml
        CONSTRAINT "= 4.14.0"
        FINDLIBS str unix runtime_events threads dynlink)
    DkSDKProject_MakeAvailable(ocaml)
  2. Add the Findlib::ocaml library to any desired targets in src/*/CMakeLists.txt:

    Copy
    target_link_libraries(YourPackage_YourLibraryName
         # ... existing libraries, if any ...
         Findlib::ocaml)
  3. Click your IDE's Build button

Not using DkSDK?

FIRST, do one or all of the following:

  1. Run:

    Copy
    opam install ocaml.4.14.0
  2. Edit your dune-project and add:

    Copy
    (package
      (name YourExistingPackage)
      (depends
      ; ... existing dependenices ...
      (ocaml (>= 4.14.0))))

    Then run:

    Copy
    dune build *.opam # if this fails, run: dune build
  3. Edit your <package>.opam file and add:

    Copy
    depends: [
      # ... existing dependencies ...
      "ocaml" {>= "4.14.0"}
    ]

    Then run:

    Copy
    opam install . --deps-only

FINALLY, add the library to any desired (library)and/or (executable) targets in your **/dune files:

Copy
(library
  (name YourLibrary)
  ; ... existing library options ...
  (libraries
    ; ... existing libraries ...
    ))

(executable
  (name YourExecutable)
  ; ... existing executable options ...
  (libraries
    ; ... existing libraries ...
    ))

Positions

type position`` = ``{
pos_fname : string;
pos_lnum : int;
pos_bol : int;
pos_cnum : int;}

A value of type position describes a point in a source file. pos_fname is the file name; pos_lnum is the line number; pos_bol is the offset of the beginning of the line (number of characters between the beginning of the lexbuf and the beginning of the line); pos_cnum is the offset of the position (number of characters between the beginning of the lexbuf and the position). The difference between pos_cnum and pos_bol is the character offset within the line (i.e. the column number, assuming each character is one column wide).

See the documentation of type lexbuf for information about how the lexing engine will manage positions.

valdummy_pos :position

A value of type position, guaranteed to be different from any valid position.

Lexer buffers

type lexbuf`` = ``{
refill_buff : lexbuf -> unit;
mutable lex_buffer : bytes;
mutable lex_buffer_len : int;
mutable lex_abs_pos : int;
mutable lex_start_pos : int;
mutable lex_curr_pos : int;
mutable lex_last_pos : int;
mutable lex_last_action : int;
mutable lex_eof_reached : bool;
mutable lex_mem : ``int array``;
mutablelex_start_p :position;
mutablelex_curr_p :position;}

The type of lexer buffers. A lexer buffer is the argument passed to the scanning functions defined by the generated scanners. The lexer buffer holds the current state of the scanner, plus a function to refill the buffer from the input.

Lexers can optionally maintain the lex_curr_p and lex_start_p position fields. This "position tracking" mode is the default, and it corresponds to passing ~with_position:true to functions that create lexer buffers. In this mode, the lexing engine and lexer actions are co-responsible for properly updating the position fields, as described in the next paragraph. When the mode is explicitly disabled (with ~with_position:false), the lexing engine will not touch the position fields and the lexer actions should be careful not to do it either; the lex_curr_p and lex_start_p field will then always hold the dummy_pos invalid position. Not tracking positions avoids allocations and memory writes and can significantly improve the performance of the lexer in contexts where lex_start_p and lex_curr_p are not needed.

Position tracking mode works as follows. At each token, the lexing engine will copy lex_curr_p to lex_start_p, then change the pos_cnum field of lex_curr_p by updating it with the number of characters read since the start of the lexbuf. The other fields are left unchanged by the lexing engine. In order to keep them accurate, they must be initialised before the first use of the lexbuf, and updated by the relevant lexer actions (i.e. at each end of line -- see also new_line).

valfrom_channel : ``?with_positions:bool-> in_channel -> lexbuf

Create a lexer buffer on the given input channel. Lexing.from_channel inchan returns a lexer buffer which reads from the input channel inchan, at the current reading position.

valfrom_string : ``?with_positions:bool->``string-> lexbuf

Create a lexer buffer which reads from the given string. Reading starts from the first character in the string. An end-of-input condition is generated when the end of the string is reached.

valfrom_function : ``?with_positions:bool->``(``bytes->``int->int)``-> lexbuf

Create a lexer buffer with the given function as its reading method. When the scanner needs more characters, it will call the given function, giving it a byte sequence s and a byte count n. The function should put n bytes or fewer in s, starting at index 0, and return the number of bytes provided. A return value of 0 means end of input.

valset_position :lexbuf -> position -> unit

Set the initial tracked input position for lexbuf to a custom value. Ignores pos_fname. See set_filename for changing this field.

  • since 4.11
valset_filename :lexbuf ->``string-> unit

Set filename in the initial tracked position to file in lexbuf.

  • since 4.11
valwith_positions :lexbuf -> bool

Tell whether the lexer buffer keeps track of position fields lex_curr_p / lex_start_p, as determined by the corresponding optional argument for functions that create lexer buffers (whose default value is true).

When with_positions is false, lexer actions should not modify position fields. Doing it nevertheless could re-enable the with_position mode and degrade performances.

Functions for lexer semantic actions

The following functions can be called from the semantic actions of lexer definitions (the ML code enclosed in braces that computes the value returned by lexing functions). They give access to the character string matched by the regular expression associated with the semantic action. These functions must be applied to the argument lexbuf, which, in the code generated by ocamllex, is bound to the lexer buffer passed to the parsing function.

vallexeme :lexbuf -> string

Lexing.lexeme lexbuf returns the string matched by the regular expression.

vallexeme_char :lexbuf ->``int-> char

Lexing.lexeme_char lexbuf i returns character number i in the matched string.

vallexeme_start :lexbuf -> int

Lexing.lexeme_start lexbuf returns the offset in the input stream of the first character of the matched string. The first character of the stream has offset 0.

vallexeme_end :lexbuf -> int

Lexing.lexeme_end lexbuf returns the offset in the input stream of the character following the last character of the matched string. The first character of the stream has offset 0.

vallexeme_start_p :lexbuf -> position

Like lexeme_start, but return a complete position instead of an offset. When position tracking is disabled, the function returns dummy_pos.

vallexeme_end_p :lexbuf -> position

Like lexeme_end, but return a complete position instead of an offset. When position tracking is disabled, the function returns dummy_pos.

valnew_line :lexbuf -> unit

Update the lex_curr_p field of the lexbuf to reflect the start of a new line. You can call this function in the semantic action of the rule that matches the end-of-line character. The function does nothing when position tracking is disabled.

  • since 3.11.0

Miscellaneous functions

valflush_input :lexbuf -> unit

Discard the contents of the buffer and reset the current position to 0. The next use of the lexbuf will trigger a refill.

More from the DkSDK Book

    1. DkSDK
      1. Package capnp
        1. Module Capnp
            1. Module type MessageSig.MESSAGE
            1. Module type MessageSig.S
              1. Module S.ListStorage
              1. Module S.Message
              1. Module S.Object
              1. Module S.Segment
              1. Module S.Slice
              1. Module S.StructStorage
            1. Module type MessageSig.SEGMENT
            1. Module type MessageSig.SLICE
        1. Module Capnp_unix
      1. Package cmdliner
        1. Module Cmdliner
        1. Module Cmdliner_arg
        1. Module Cmdliner_base
        1. Module Cmdliner_cline
        1. Module Cmdliner_cmd
        1. Module Cmdliner_docgen
        1. Module Cmdliner_eval
        1. Module Cmdliner_info
        1. Module Cmdliner_manpage
        1. Module Cmdliner_msg
        1. Module Cmdliner_term
        1. Module Cmdliner_term_deprecated
        1. Module Cmdliner_trie
      1. Package fmt
        1. Module Fmt
        1. Module Fmt_cli
        1. Module Fmt_tty
      1. Package logs
        1. Module Logs
          1. Module type Logs.LOG
          1. ...
        1. Module Logs_cli
        1. Module Logs_fmt
        1. Module Logs_lwt
          1. Module type Logs_lwt.LOG
        1. Module Logs_threaded
      1. Package lwt
        1. Module Lwt
        1. Module Lwt_bytes
        1. Module Lwt_condition
        1. Module Lwt_config
        1. Module Lwt_engine
        1. Module Lwt_features
        1. Module Lwt_fmt
        1. Module Lwt_gc
        1. Module Lwt_io
            1. Module type Lwt_io.NumberIO
        1. Module Lwt_list
        1. Module Lwt_main
            1. Module type Lwt_main.Hooks
        1. Module Lwt_mutex
        1. Module Lwt_mvar
        1. Module Lwt_pool
        1. Module Lwt_pqueue
            1. Module type Lwt_pqueue.OrderedType
            1. Module type Lwt_pqueue.S
        1. Module Lwt_preemptive
        1. Module Lwt_process
        1. Module Lwt_result
        1. Module Lwt_seq
        1. Module Lwt_sequence
        1. Module Lwt_stream
        1. Module Lwt_switch
        1. Module Lwt_sys
        1. Module Lwt_throttle
            1. Module type Lwt_throttle.S
        1. Module Lwt_timeout
        1. Module Lwt_unix
      1. Package mtime
        1. Module Mtime
        1. Module Mtime_clock
      1. Package ocaml
        1. Module Bigarray
        1. Module Condition
        1. Module Dynlink
        1. Module Event
        1. Module Mutex
        1. Module Profiling
        1. Module Semaphore
        1. Module Stdlib
          1. ...
          1. Module Stdlib.Lexing
        1. Module Str
        1. Module Thread
        1. Module ThreadUnix
        1. Module Topdirs
        1. Module Unix
        1. Module UnixLabels
      1. Package
      1. Package result
        1. Module Result
      1. Package stdint
        1. Module Stdint
            1. Module type Stdint.Int