Decides which representations of OCaml types and values to use so that generated code works at scale. For example, Stripe's OpenAPI spec was 5.9 MB as of mid-2024 and its schemas were highly mutually recursive.
In OCaml, that led to the following compiler issues:
- Backport linear closures bugfix
- ocamlopt.opt 5.2.0 fails to compile very large file but ocamlc compiles fine
Since the behavior can vary based on the OCaml compiler version (and the presence of Flambda, etc.), the choice is hidden behind create.
type repr_sumtype =
| RegularVariants
(* The representation using regular variants (ex. type pet = | Cat | Dog). Regular variants are limited to 246 variants.
   Mitigation. None. Do not use if there are more than 246 variants. *)
| ExtensibleVariants
(* The representation using extensible variants (ex. type pet = .. type pet += | Cat | Dog). Extensible variants are mostly unlimited except when the definition and the access are within the same module file.
   Confer: ocamlopt.opt 5.2.0 fails to compile very large file but ocamlc compiles fine.
   Confer: https://stackoverflow.com/questions/54730373/when-should-extensible-variant-types-be-used-in-ocaml for the memory representation.
   Mitigation. Avoid the scaling limits by placing the definitions of the extensible variants in a separate module from the modules that access them. *)
| PolymorphicVariants
(* The representation using polymorphic variants (ex. type pet = [`Cat | `Dog]). Polymorphic variants do not have to be pre-declared. They are mostly unlimited except when the hashes of the variant names collide; for example, squigglier and premiums collide. The collision probability is the classic birthday problem. With 2^31 possible hash values from caml_hash_variant, 54,563 variants would have a 50% probability of collision (assuming a strong hash, which caml_hash_variant is not).
   Mitigation. Change the variant names if there is a collision. *)
The representation of variants in a sum type in generated source code.
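The 54,563 figure quoted for PolymorphicVariants can be checked with a short calculation. The sketch below assumes an ideal uniform hash over 2^31 values (which, as noted above, caml_hash_variant is not) and multiplies out the probability that all names hash to distinct values. The function name names_for_collision is made up for this illustration and is not part of this module.

```ocaml
(* Smallest number of variant names for which the probability of at least
   one hash collision reaches [target], assuming an ideal uniform hash
   over [buckets] possible values. *)
let names_for_collision ~buckets ~target =
  let rec go n p_distinct =
    if 1.0 -. p_distinct >= target then n
    else
      (* Adding one more name: it must avoid the n hash values already taken. *)
      go (n + 1) (p_distinct *. (1.0 -. (float_of_int n /. buckets)))
  in
  go 0 1.0

let () =
  let buckets = 2.0 ** 31.0 in
  Printf.printf "%d\n" (names_for_collision ~buckets ~target:0.5)
```

Under those assumptions the result agrees with the 54,563 quoted above. Because the real hash is weaker than an ideal one, collisions in practice may show up with fewer names.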
variantize t ~ocaml_id creates the name of a variant specific to the OCaml identifier ocaml_id. With polymorphic variants (PolymorphicVariants) the name will be `Id, and with the other representations it will be Id, assuming that ocaml_id = "Id".
It is the responsibility of the caller to ensure that ocaml_id is a valid OCaml identifier. However, ocaml_id does not need to be a valid (i.e. capitalized) variant name.
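The naming rule can be sketched as follows. This is not the module's implementation: variantize_sketch is a hypothetical stand-in that takes the representation directly instead of the decider t, and it assumes that capitalizing the first letter is all that is needed to turn a valid identifier into a variant name.

```ocaml
type repr_sumtype = RegularVariants | ExtensibleVariants | PolymorphicVariants

(* Hypothetical stand-in for [variantize]. Assumes [ocaml_id] is already a
   valid OCaml identifier, as the caller is responsible for that. *)
let variantize_sketch (repr : repr_sumtype) ~ocaml_id =
  let capitalized = String.capitalize_ascii ocaml_id in
  match repr with
  | PolymorphicVariants -> "`" ^ capitalized
  | RegularVariants | ExtensibleVariants -> capitalized

let () =
  (* Prints `Id and then Id. *)
  print_endline (variantize_sketch PolymorphicVariants ~ocaml_id:"Id");
  print_endline (variantize_sketch RegularVariants ~ocaml_id:"id")
```

Identifiers that start with an underscore would need extra handling, which this sketch leaves out.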
require_separated_modules t is true if creating a single global module would create problems with the OCaml compiler.
For example, extensible variants behave differently (and more scalably) when their definition is in a separate module from the modules that access them.
Confer: ocamlopt.opt 5.2.0 fails to compile very large file but ocamlc compiles fine.
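The layout that separation implies might look like the sketch below. The module names Pet_types and Pet_access are hypothetical, and the nested modules only stand in for two separate .ml files so that the example fits in one block; nesting them in a single file does not by itself give the scaling benefit described above.

```ocaml
(* Stand-in for pet_types.ml: only the extensible variant definitions. *)
module Pet_types = struct
  type pet = ..
  type pet += Cat | Dog
end

(* Stand-in for pet_access.ml: code that matches on the constructors. *)
module Pet_access = struct
  let describe : Pet_types.pet -> string = function
    | Pet_types.Cat -> "cat"
    | Pet_types.Dog -> "dog"
    (* Extensible variants are open, so a catch-all case is required. *)
    | _ -> "unknown pet"
end

let () = print_endline (Pet_access.describe Pet_types.Dog)
```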
include DkCoder_Std.SCRIPT
__init context is the entry point for running a script module. The DkCoder compiler will inject this function at the top and bottom of the script module. The top __init does nothing, while the bottom __init calls the prior __init.
That means:
- calling the __init function guarantees that the script module is initialized; that is, all of the script module's side-effects (ex. let () = Format.printf "Hello world@.") are executed before __init returns to the caller.
- you can override the __init function by defining your own idempotent __init. It will shadow the top __init, and when the bottom __init is executed your __init will be called instead of the do-nothing top __init.
Future versions of DkCoder will call __init in dependency order for all `You script modules. Your __init function may be called several times.
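The shadowing mechanism described above can be imitated in plain OCaml. In the sketch below the two "injected" definitions are written out by hand, and the context type is a hypothetical placeholder (the real one is supplied by DkCoder). The override guards itself with a flag so that repeated calls are harmless.

```ocaml
(* Hypothetical context type; the real type comes from DkCoder. *)
type context = unit

(* What the compiler is described as injecting at the top: does nothing. *)
let __init (_ : context) = ()

(* User code: an idempotent override that shadows the top __init. *)
let init_count = ref 0
let initialized = ref false

let __init (_ : context) =
  if not !initialized then begin
    initialized := true;
    incr init_count;
    print_endline "script module initialized"
  end

(* What is described as injected at the bottom: calls the prior __init,
   which is now the user's definition because let is not recursive. *)
let __init (context : context) = __init context

let () =
  (* Calling twice runs the initialization body only once. *)
  __init ();
  __init ()
```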
__repl context is the entry point for debugging a script module in a REPL. The DkCoder compiler will inject this function at the top and bottom of the script module. The top __repl does nothing, while the bottom __repl calls the prior __repl.
That means:
- you can override the __repl function by defining your own idempotent __repl. It will shadow the top __repl, and when the bottom __repl is executed your __repl will be called instead of the do-nothing top __repl.
The run-time module information for the script module.