In this part I will be describing the core semantics of MlFront. Much of the core semantics is captured in the MlFront_Core API documentation, but this post will tie it all together.
Parties
My initial designs had modules being able to compile other modules (the logical equivalent of an eval statement in other languages). This was essential because I wanted DkCoder and now MlFront to have the ability to “bootstrap” (compile itself).
At the same time, I did not want to give that eval ability to end-user code, which would have made it difficult to analyse end-user code.
So I introduced a trust model in MlFront that distinguished between modules that were written by me (i.e. any MlFront-based build system) and those written by end-users. In that trust model my trusted modules could have access to eval but the less-trusted modules of end-users would not have access.
And then from there I extended the trust model slightly and formalized it.
When MlFront analyses a project, each module in MlFront is categorized as belonging to one of three parties:
- The “You” party are you and your team, who write the primary modules in a project. By convention these “You” modules are organized under the src/ folder of a project.
- The “Us” party are the developers of the MlFront-based build system that is analysing the current project. These developers write trusted modules which have a higher level of privilege than “You” modules. For example, in the DkCoder build system, the “Us” modules are given environment variables that describe how to compile source code (the equivalent of giving the eval permission). By convention these trusted “Us” modules are located in read-only installation folders of the MlFront-based build system.
- The “Them” party includes every other developer. These “Them” developers write modules which have a lower level of privilege than “You” modules, and may even be completely untrusted. The convention is that these “Them” modules are downloaded by the MlFront-based build system.
Any MlFront-based build system is free to ignore the party conventions.
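To make the trust model concrete, here is a minimal sketch of the three parties as an OCaml variant. It is not the MlFront_Core API: the type name, the function names and the numeric privilege levels are all assumptions made for illustration. Only the relative ordering (Us above You above Them) and the eval-like permission for “Us” come from the description above.

```ocaml
(* Hypothetical model of the three parties; not the MlFront_Core API. *)
type party = You | Us | Them

(* Assumed numeric levels; only the relative order comes from the text. *)
let privilege_level = function Us -> 2 | You -> 1 | Them -> 0

(* Only trusted "Us" modules get the eval-like ability to compile modules. *)
let may_compile_modules party = (party = Us)

(* The conventional (not mandatory) location of each party's modules. *)
let conventional_location = function
  | You -> "src/ folder of the project"
  | Us -> "read-only installation folder of the build system"
  | Them -> "downloaded by the build system"

let () =
  List.iter
    (fun p ->
      Printf.printf "%s: level=%d, may compile=%b\n"
        (conventional_location p) (privilege_level p)
        (may_compile_modules p))
    [ You; Us; Them ]
```

Because a build system is free to ignore the party conventions, a real implementation would likely derive the party from configuration rather than from a fixed folder name.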
Libraries
One of the first technical challenges I had to overcome was how to distinguish relative module references from absolute module references.
For example, let’s pretend you are writing code in the following “You” module:
(* file: src/A/B/C.ml *)
let () = D.E.F.say_hello ()
Does D.E.F refer to what I’m calling a “relative reference”:
(* file: src/A/B/C/D/E/F.ml *)
let say_hello () = Tr1Stdlib_V414CRuntime.Printf.printf "hello"
or an “absolute reference”:
(* file: src/D/E/F.ml *)
let say_hello () = Tr1Stdlib_V414CRuntime.Printf.printf "hello"
I resolved that ambiguity by making the anchor (the “D” in D.E.F) have a different lexical structure than the non-anchors (the E.F in D.E.F).
I’ve called the anchor the library. We’ve already encountered the Tr1Stdlib_V414CRuntime library; that library is a sub-partition of the OCaml Standard Library that has all the functions that need a C99 runtime. It looks different compared to a regular module name.
A library name always has:
- A capital starting letter, followed by
- One lowercase letter, followed by
- Zero or more digits or lowercase letters, followed by
- Another capital letter, followed by
- One or more digits or lowercase letters, followed by
- An underscore (_), followed by
- A capital letter, followed by
- Zero or more letters, digits and underscores.
The rules are strict but using the mixed casing below with the underscore will always lead to a valid library name:
VendorProject_Unit
│     │      ││
│     │      │└ UPPERCASE
│     │      └ UNDERSCORE
│     └ UPPERCASE
└ UPPERCASE
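The naming rules can be sketched as a small hand-written scanner. This is a literal reading of the rules listed above, not the validation code in MlFront_Core, and every function name here is hypothetical. The real implementation may be more lenient, for instance about extra capital letters inside the project part. The last function is my own reading of how a lexically distinct anchor removes the relative versus absolute ambiguity.

```ocaml
(* Hypothetical checker: a literal reading of the library name rules. *)
let is_upper c = c >= 'A' && c <= 'Z'
let is_lower c = c >= 'a' && c <= 'z'
let is_digit c = c >= '0' && c <= '9'
let is_lower_or_digit c = is_lower c || is_digit c
let is_tail c = is_upper c || is_lower c || is_digit c || c = '_'

let is_library_name (s : string) : bool =
  let n = String.length s in
  (* [skip p i] advances past every character satisfying [p] from [i]. *)
  let rec skip p i = if i < n && p s.[i] then skip p (i + 1) else i in
  (* [one p i] consumes exactly one character satisfying [p], or fails. *)
  let one p i = if i < n && p s.[i] then Some (i + 1) else None in
  let ( >>= ) = Option.bind in
  let result =
    one is_upper 0                        (* capital starting letter *)
    >>= one is_lower                      (* one lowercase letter *)
    >>= fun i -> one is_upper (skip is_lower_or_digit i)
    >>= one is_lower_or_digit             (* one or more after 2nd capital *)
    >>= fun i -> one (fun c -> c = '_') (skip is_lower_or_digit i)
    >>= one is_upper                      (* capital letter after underscore *)
    >>= fun i -> Some (skip is_tail i)    (* letters, digits, underscores *)
  in
  result = Some n

(* A reference whose first component is a library name is anchored. *)
let starts_with_library (path : string) : bool =
  match String.split_on_char '.' path with
  | anchor :: _ -> is_library_name anchor
  | [] -> false
```

Under this reading, D.E.F from the earlier example is not anchored on a library, while Tr1Stdlib_V414CRuntime.Printf is.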
Not only is an MlFront library the anchor of the naming convention, but the library is the entity which owns all the modules underneath it.
The module Tr1Stdlib_V414CRuntime.Printf is owned by the library Tr1Stdlib_V414CRuntime and the module AcmeWidgets_Std.Activities.Manufacturing is owned by the library AcmeWidgets_Std.
What is in a library name?
Let’s go back to the mixed casing form of a library name:
VendorProject_Unit
│     │      ││
│     │      │└ UPPERCASE
│     │      └ UNDERSCORE
│     └ UPPERCASE
└ UPPERCASE
The first part, the “vendor”, is the organization or person who owns the library. We heard in the Origin Story of the original post that cohttp (and other packages like it) were able to simplify their package naming by prefixing each package with their name (cohttp). The vendor plays the same role in MlFront.
There are some reserved vendors:
- Ml, which is reserved by Diskuv (my company) on behalf of the OCaml compiler, runtime and dependency analyzers.
- Dk, which is reserved by Diskuv for DkCoder.
- Tr1, Tr2, … Tr<Number>, which are reserved by Diskuv for “Technical Report” proposals that split up and extend the OCaml standard library. We saw the Tr1 vendor in the first post’s Tr1Stdlib_V414CRuntime library.
In the first post we also encountered a library MmotlSqlite3_Std. That had a personal vendor Mmotl which corresponded to that user’s GitHub username. Using a GitHub username is the convention for personal vendors because the GitHub username is a globally unique identifier for developers. Of course, not every developer has a GitHub username, so just pick a vendor name for yourself that won’t be chosen by any other developer.
VendorProject_Unit
│     │      ││
│     │      │└ UPPERCASE
│     │      └ UNDERSCORE
│     └ UPPERCASE
└ UPPERCASE
The second part of a library name is the “project”. By convention, when you create one or more MlFront libraries in a source repository, those libraries will share the “project” name. More simply, all the libraries in a source repository belong to the same project.
So if we did a clone of the MlFront source repository we would see the following abbreviated listing of files:
$ git clone https://gitlab.com/dkml/build-tools/MlFront.git
Cloning ...
$ tree MlFront -P '*.ml' -I ci/ -I .ci/ -I _build/ -I msys64/
MlFront
├── src
│   ├── MlFront_Cli
│   │   ├── CmiUtils.ml
│   │   ├── ColorDetect.ml
│   │   ├── GeneratedLoads.ml
│   │   ├── Optslog.ml
│   │   └── TerminalLogSetup.ml
│   ├── MlFront_Codept
│   │   ├── CodeptFiles.ml
│   │   ├── CodeptLog.ml
│   │   ├── CodeptOrd.ml
│   │   ├── DepGraph.ml
│   │   ├── Errors.ml
│   │   ├── ModuleUnit.ml
│   │   ├── ModuleUniverse.ml
│   │   ├── NamespacedId.ml
│   │   ├── Trace.ml
│   │   └── UnitPp.ml
│   └── MlFront_Core
│       ├── LibraryId.ml
│       ├── MlFront_Core.ml
│       ├── ModuleAssumptions.ml
│       ├── ModuleId.ml
│       ├── SpecialModuleId.ml
│       ├── Squish.ml
│       ├── StandardModuleId.ml
│       └── UnitId.ml
└── tests
    └── MlFront_Core
Notice how all the libraries MlFront_Cli, MlFront_Codept and MlFront_Core share the project name Front.
The stem of the source code repository URL (the part before .git) is conventionally the vendor (Ml) followed by the project (Front). So MlFront.git is a conventionally named stem for the repository URL https://gitlab.com/dkml/build-tools/MlFront.git.
This convention is important because a git clone uses the stem of the repository as the name of the new directory created during a clone (aka. a checkout).
Many package managers (ex. opam, npm) have the concept of “overriding” a package for local development. By following the VendorProject naming convention for stems, you can have a set of projects checked out in one directory:
AcmeWidget/
  src/
    AcmeWidget_Std/
AcmeRobot/
  src/
    AcmeRobot_Std/
and the MlFront-based tooling may assume that the projects are all local overrides for each other.
VendorProject_Unit
│     │      ││
│     │      │└ UPPERCASE
│     │      └ UNDERSCORE
│     └ UPPERCASE
└ UPPERCASE
The final part of the library name is the library “unit”. The unit is what distinguishes one library from the next, and should be short and somewhat descriptive of the contents of the library.
By convention, the main unit in a project is named Std.
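The three parts of a library name can be pulled apart mechanically. The sketch below is a hypothetical helper, not part of MlFront_Core: it assumes the name is already a valid library name, treats everything before the second capital letter as the vendor, everything up to the first underscore as the project, and the rest as the unit. The record type and function names are made up for illustration.

```ocaml
(* Hypothetical decomposition of a library name; not the MlFront_Core API. *)
type library_parts = { vendor : string; project : string; unit_name : string }

let split_library_name (s : string) : library_parts option =
  match String.index_opt s '_' with
  | None -> None
  | Some u ->
      let is_upper c = c >= 'A' && c <= 'Z' in
      (* Find the second capital letter, which starts the project part. *)
      let rec find i =
        if i >= u then None
        else if is_upper s.[i] then Some i
        else find (i + 1)
      in
      (match find 1 with
       | None -> None
       | Some p ->
           Some
             { vendor = String.sub s 0 p;
               project = String.sub s p (u - p);
               unit_name = String.sub s (u + 1) (String.length s - u - 1) })

(* The conventional repository stem is the vendor followed by the project. *)
let repository_stem (parts : library_parts) : string =
  parts.vendor ^ parts.project

let () =
  match split_library_name "MlFront_Core" with
  | Some parts ->
      Printf.printf "vendor=%s project=%s unit=%s stem=%s\n" parts.vendor
        parts.project parts.unit_name (repository_stem parts)
  | None -> print_endline "not a library name"
```

For MlFront_Core this yields the vendor Ml, the project Front and the unit Core, which matches the MlFront.git stem discussed earlier.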
Arranging a library in a file system
Here we have a library DkSubscribeWebhook_Std located in the src/ folder of a project. Remember from our earlier discussion about parties that src/ is the customary location for the “You” party.
src
└── DkSubscribeWebhook_Std
    ├── Aws
    │   ├── Endpoints.ml
    │   └── Signing.ml
    ├── CliEmail.ml
    ├── CliTemplate.ml
    ├── CurlIo.ml
    ├── Errors.ml
    ├── Expiry.ml
    ├── PingHandler.ml
    ├── Prov1Password.ml
    ├── ProvAwsSes.ml
    ├── ProvGitLab.ml
    ├── ProvStripe.ml
    ├── Providers.ml
    ├── Subscriptions.ml
    ├── TemplateInvoicePaid.ml
    ├── TemplateSubscriptionCancelled.ml
    ├── TemplateSubscriptionDeleted.ml
    ├── TemplateSubscriptionPaused.ml
    ├── TemplateSubscriptionResumed.ml
    └── WebhookHandler.ml
What you see above are standard modules, where the hierarchy you see in the filesystem reflects how they are named in your source code:
DkSubscribeWebhook_Std
DkSubscribeWebhook_Std.CliEmail
DkSubscribeWebhook_Std.Expiry
DkSubscribeWebhook_Std.Aws
DkSubscribeWebhook_Std.Aws.Endpoints
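The mapping from a file path to a module name can be sketched in a few lines. This is a hypothetical helper that covers only standard modules stored with forward-slash paths under a known root folder; it is not how MlFront itself computes names, and the function name and the ~root parameter are assumptions.

```ocaml
(* Hypothetical helper: derive a standard module name from a file path.
   Assumes forward slashes and a path that lives under [root]. *)
let module_name_of_path ~(root : string) (path : string) : string option =
  let prefix = root ^ "/" in
  let plen = String.length prefix in
  if String.length path > plen && String.sub path 0 plen = prefix then
    (* Drop the root folder, then the .ml or .mli extension. *)
    let relative = String.sub path plen (String.length path - plen) in
    let without_ext = Filename.remove_extension relative in
    (* Each directory level becomes one dotted component. *)
    Some (String.concat "." (String.split_on_char '/' without_ext))
  else None

let () =
  match
    module_name_of_path ~root:"src"
      "src/DkSubscribeWebhook_Std/Aws/Endpoints.ml"
  with
  | Some name -> print_endline name
  | None -> print_endline "not under src/"
```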
The “open” module
There is one more category of modules that can be saved in the file system. These are called special modules. Unlike standard modules, special modules cannot be referenced in your source code.
Today there is only one type of special module [1]: the open module. It appears on the file system as the file open__.ml:
src
└── DkSubscribeWebhook_Std
    ├── open__.ml  <------- The "open" module
    ├── Aws
    │   └── Signing.ml
    ├── TemplateSubscriptionResumed.ml
    └── WebhookHandler.ml
[1] Actually, there are two types of special module but the second type is deprecated.
The open module logically belongs to the library, and it can only be placed directly in the library directory rather than in a subdirectory. In the example above, the open__.ml could not be in the Aws/ subdirectory.
We’ve seen that open module being used in the first post to introduce an alias used in all the modules of the library:
(* file: src/AcmeWidgets_Db/open__.ml *)
module Sqlite3 = MmotlSqlite3_Std.Sqlite3
Wrapping up the filesystem
The standard module and the special module are instances of a module unit. Any module file that ends with .ml or .mli is a module unit.
Referencing modules in source code
All the items below contain valid module references except the one line that is commented out:
module A = DkSubscribeWebhook_Std
module C = DkSubscribeWebhook_Std.Expiry
module D = DkSubscribeWebhook_Std.Aws
module E = DkSubscribeWebhook_Std.Aws.Endpoints
(* let () = DkSubscribeWebhook_Std.cannot_do_this () *)
let () = DkSubscribeWebhook_Std.Expiry.print_tomorrow ()
let () = DkSubscribeWebhook_Std.Aws.print_services ()
let () = DkSubscribeWebhook_Std.Aws.Endpoints.print_region ()
The DkSubscribeWebhook_Std.cannot_do_this () call is not allowed because the library module contains only submodules.
This makes sense because you, as an end-user, never created a DkSubscribeWebhook_Std.ml file. It was automatically generated by the MlFront-based build system as we saw in the first post.
But for now, let’s recap what we have seen today:
- Modules you write as files are called module units. And there are two types of units: the standard module and the special module. What makes the special module “special” is that you can’t reference it in your source code.
- All the modules you can reference in your source code are called packages. We’ve already encountered two types of packages:
  - The standard module can be referenced in source code. Examples: DkSubscribeWebhook_Std.Expiry, DkSubscribeWebhook_Std.Aws and DkSubscribeWebhook_Std.Aws.Endpoints.
  - The library module can be referenced in source code, although the only values it contains are other (standard) modules. Example: DkSubscribeWebhook_Std.
There are more types of packages which we’ll encounter in the next post.
Summary
Here is a Venn diagram for how the different types of modules are identified by MlFront:
┌──────────────| UNIT ID |───────────────┐
│                                        │
│                                        │
│  ┌──────| MODULE ID |───────────┐      │
│  │                              │      │
│  │  DkEx_Std/open__.ml          │      │
│  │    library_id: DkEx_Std      │      │
│  │    state: Special            │      │
│  │                              │      │
│  │  ┌───────────────────────────┼──┐   │
│  │  │                           │  │   │
│  │  │  DkEx_Std/One.ml          │  │   │
│  │  │    library_id: DkEx_Std   │  │   │
│  │  │    state: Standard        │  │   │
│  │  │                           │  │   │
│  │  │  DkEx_Std/Sub/Two.ml      │  │   │
│  │  │    library_id: DkEx_Std   │  │   │
│  │  │    state: Standard        │  │   │
│  │  │                           │  │   │
│  └──┼───────────────────────────┘  │   │
│     │                              │   │
│     │  DkEx_Std.ml                 │   │
│     │    library_id: DkEx_Std      │   │
│     │    state: Library            │   │
│     │                              │   │
│     └─────| PACKAGE ID |───────────┘   │
│                                        │
└────────────────────────────────────────┘
The inner-most region (DkEx_Std/One.ml and DkEx_Std/Sub/Two.ml) holds the standard modules; those are the modules you will write most often.
There are modules you write (aka. “units”) that can’t be referenced in code: the “special” modules like DkEx_Std/open__.ml.
And there are modules you can reference (aka. “packages”) that you can’t write at all. They are autogenerated: the “library” modules like DkEx_Std.ml and some more you’ll encounter next post.
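The overlap described in this summary can be restated as two predicates over the three states that appear in the diagram. This is only an illustrative sketch: the state names are taken from the diagram's state labels, but the type and function names are hypothetical and are not the MlFront_Core API.

```ocaml
(* Hypothetical restatement of the summary; not the MlFront_Core API. *)
type state = Special | Standard | Library

(* Modules you write yourself as files. *)
let written_as_file = function
  | Special | Standard -> true
  | Library -> false

(* Modules you can reference from source code. *)
let referenceable = function
  | Standard | Library -> true
  | Special -> false

let describe (st : state) : string =
  match (written_as_file st, referenceable st) with
  | true, true -> "written by you and referenceable"
  | true, false -> "written by you but not referenceable"
  | false, true -> "autogenerated but referenceable"
  | false, false -> "neither"

let () =
  List.iter
    (fun (name, st) -> Printf.printf "%s: %s\n" name (describe st))
    [ ("DkEx_Std/open__.ml", Special);
      ("DkEx_Std/One.ml", Standard);
      ("DkEx_Std.ml", Library) ]
```

Only the standard state satisfies both predicates, which is the overlapping region of the diagram.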