Intro to DkCoder Scripting

DkCoder: Scripting at Scale

Hello Builder! Scripting is a small, free and important piece of DkSDK.

A few clicks from your web browser and four (4) minutes later you and your Windows and macOS users can start scripting with DkCoder. And all users, including glibc-based Linux desktop users, can use their Unix shells or Windows PowerShell. Nothing needs to be pre-installed on Windows and macOS. Just copy and paste two lines (you’ll see examples soon) and your script is running and your project is editable with an LSP-capable IDE like Visual Studio Code.

Unlike most scripting frameworks, DkCoder solves the problem of scale: you start with small scripts that do immediately useful things for you and your team, and when inevitably you need to expand, distribute or embed those scripts to make full-featured applications, you don’t need to throw out what you have already written. DkCoder is a re-imagining of the scripting experience that re-uses the best historical ideas:

  1. You don’t write build files. If that sounds like Unix /bin/sh or the Windows Command Prompt, that is intentional.
  2. Most files you write can be immediately run. If that sounds like how Python scripts are almost indistinguishable from Python modules, or like JavaScript modules, that is intentional.
  3. Most files you write can be referenced with a fully-qualified name. If that sounds like Java packages and how that has been proven to scale to large code bases, that is intentional.
  4. Your scripts play well together and don’t bit rot. It is conventional to add static typing (TypeScript, mypy) when scripting projects get large. DkCoder has type-safety from Day One that is safer and easier to use.
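
Item 3 can be made concrete with a small sketch. It assumes the naming pattern visible in this walkthrough's commands, where the file src/DkHelloScript_Std/AndHello.ml is run as DkHelloScript_Std.AndHello. The helper name module_id_of_path is hypothetical and is not part of DkCoder; the code is plain OCaml that uses only the standard library.

```ocaml
(* Hypothetical helper, not a DkCoder API: derive a dotted, fully-qualified
   name from a script path laid out as src/<Library>/<Dirs...>/<Script>.ml *)
let module_id_of_path (path : string) : string option =
  if not (Filename.check_suffix path ".ml") then None
  else
    match String.split_on_char '/' (Filename.chop_suffix path ".ml") with
    | "src" :: (_ :: _ as parts) -> Some (String.concat "." parts)
    | _ -> None

let () =
  [ "src/DkHelloScript_Std/AndHello.ml";
    "src/DkHelloScript_Std/B43Shell/B35Shexp/B43Countdown.ml";
    "README.md" ]
  |> List.iter (fun path ->
         match module_id_of_path path with
         | Some id -> Printf.printf "%s -> %s\n" path id
         | None -> Printf.printf "%s -> (not a script)\n" path)
```

The point of the sketch is only that a directory layout can double as a package namespace, much as it does in Java. How DkCoder resolves names internally may differ.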

You’ll start with the mouse-click install and the basic one-liner:

let () = Tr1Stdlib_V414Io.StdIo.print_endline "Hello Builder!"

which you run with a single command:

./dk DkRun_V2_2.Run -- DkHelloScript_Std.AndHello
Hello Builder!

You’ll do a quick tour of the prior art where we will acknowledge situations when DkCoder is not the right tool for you.

You’ll see three examples of scripts:

  1. the integration test script that produces this documentation page
  2. a 2D game to show non-traditional uses of scripts and the re-use of existing code
  3. the security-conscious production service managing subscriptions for DkSDK Pricing

Along the way you’ll encounter a small language made by a community who can write some very good libraries, and a software kit that gets out of the way and makes good software accessible.

Feedback?
If you want to "like" DkCoder with GitHub stars, or if you want help with DkCoder, please visit https://github.com/diskuv/dkcoder.

Let’s begin!

You, your co-developers and your users can start scripting in a couple clicks.

FIRST, if you don't have these installed then install Git from https://git-scm.com/downloads and Visual Studio Code from https://code.visualstudio.com/download.

Then click this link: Clone and Open DkHelloScript in Visual Studio Code.

Click to Trust the Authors and Install the Recommended Extensions!

Want to Use The Command Line Instead?

git clone https://gitlab.com/diskuv/samples/dkcoder/DkHelloScript.git
code DkHelloScript

SECOND, open src/DkHelloScript_Std/AndHello.ml in your IDE or open src/DkHelloScript_Std/AndHello.ml in your browser.

You should see:

open Tr1Stdlib_V414Io

let () = StdIo.print_endline "Hello Builder!"

Wait a minute or two for the one-time background installation, and then you'll notice the text colors have changed. Hover over the print_endline to see its API. After DkCoder gets out of Alpha, there will be a progress bar to provide visual feedback during the background install.
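
If you have not seen OCaml's open before: the two-line script above and the earlier one-liner that spelled out the whole module path do the same thing. The sketch below shows the idea in plain OCaml. Demo_V1Io is a made-up stand-in module defined inside the example; it is not the real Tr1Stdlib_V414Io.

```ocaml
(* Stand-in namespace for illustration only; not a DkCoder module. *)
module Demo_V1Io = struct
  module StdIo = struct
    let print_endline = Stdlib.print_endline
  end
end

(* Style 1: spell out the fully-qualified name at the call site. *)
let () = Demo_V1Io.StdIo.print_endline "Hello Builder!"

(* Style 2: open the outer module once, then use the shorter name. *)
open Demo_V1Io

let () = StdIo.print_endline "Hello Builder!"
```

Both styles call the same function; open only changes how much of the name you have to type.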

THIRD, open a Terminal > New Terminal and run:

./dk DkRun_V2_2.Run -- DkHelloScript_Std.AndHello
Hello Builder!

✔️   DONE! Ok, one-liners are not very interesting.

But ... that should have only taken a few minutes.

And ... the same command works on Windows, macOS and GNU/Linux.

You will be prompted to accept the DkCoder licenses. The game demo has a separate license: GPL 3.0. If you want to adapt the game demo for commercial purposes you must contact the demo author at https://github.com/sanette.
We'll go over licensing in general later in this walkthrough.

Reproducibility or quick typing? Pick one

We specified a version number and a double dash separator (DkRun_V2_2.Run --) in our last example:

./dk DkRun_V2_2.Run -- DkHelloScript_Std.AndHello

You can relax your fingers by instead typing:

./dk DkHelloScript_Std.AndHello

In your everyday scripting you won’t want to type DkRun_V2_2.Run --. Leave it out.

However, when you publish documentation (like this article) you should always include the version number. Your users can still copy and paste the command and, more importantly, the behavior of your scripts won’t change in some future version of ./dk.
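
If you generate documentation from code, one way to keep commands pinned is to build them from a single version value. The sketch below is plain OCaml; the function names pinned_command and short_command are hypothetical, and the strings they produce simply mirror the two command forms shown above.

```ocaml
(* Hypothetical helpers for documentation generators; not a DkCoder API. *)

(* The reproducible form: runner version and "--" separator spelled out. *)
let pinned_command ~run_module script_module =
  Printf.sprintf "./dk %s -- %s" run_module script_module

(* The quick-typing form for everyday use. *)
let short_command script_module = Printf.sprintf "./dk %s" script_module

let () =
  print_endline
    (pinned_command ~run_module:"DkRun_V2_2.Run" "DkHelloScript_Std.AndHello");
  print_endline (short_command "DkHelloScript_Std.AndHello")
```

Keeping the runner version in one place means a version bump is a one-line change in the generator rather than a search through every published command.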

Focus on what you are running

Open src/DkHelloScript_Std/AndHelloAgain.ml in your IDE.

It has the same content as DkHelloScript_Std/AndHello.ml. However, it has red squiggly lines.

The red is an indication that DkCoder has not compiled the script. That is a good thing because generally you don't want to waste time compiling every script every time you make an edit in a large project.

Now run the DkHelloScript_Std.AndHelloAgain script with:

./dk DkRun_V2_2.Run -- DkHelloScript_Std.AndHelloAgain

Visual Studio Code should remove the red in a minute.

Faster?
You can speed up how fast Visual Studio Code recognizes the newly focused script by using the View > Command Palette menu, and then using the OCaml: Restart Language Server action.

DkCoder optimizes for rapid iterative development by:

  1. Only compiling the script that you last ran (ex. DkHelloScript_Std/AndHelloAgain.ml). If your script requires other scripts to run, those are also compiled.
  2. Compiling all the scripts in a project when you first open Visual Studio Code. That means you can browse your project when you first start your day without seeing any red. Then, when you have found a script you want to edit or a new script to add, edit and run that script repeatedly throughout the day.
    Upcoming Changes
    Since large projects may have many scripts, a future version of DkCoder will allow you to select which modules are automatically compiled at startup.
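
The first rule above (compile the script you ran plus whatever it requires) amounts to a reachability walk over a dependency graph. The following is a minimal sketch of that idea in plain OCaml, not DkCoder's actual implementation. The graph and the MyScripts_Std.* names are invented for illustration.

```ocaml
module Names = Set.Make (String)

(* The graph is an association list: script -> scripts it requires. *)
let direct_deps graph script =
  match List.assoc_opt script graph with Some deps -> deps | None -> []

(* Collect the script that was run and everything reachable from it. *)
let rec needed graph seen script =
  if Names.mem script seen then seen
  else
    List.fold_left (needed graph) (Names.add script seen)
      (direct_deps graph script)

let () =
  let graph =
    [ ("MyScripts_Std.Deploy", [ "MyScripts_Std.Config"; "MyScripts_Std.Log" ]);
      ("MyScripts_Std.Config", [ "MyScripts_Std.Log" ]);
      ("MyScripts_Std.Unrelated", [ "MyScripts_Std.Log" ]) ]
  in
  needed graph Names.empty "MyScripts_Std.Deploy"
  |> Names.elements |> String.concat ", " |> print_endline
```

Running Deploy selects Config, Deploy and Log; Unrelated is never visited, which is why an unfocused script can stay uncompiled (and red in the IDE) until you run it.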

Runtime requirements

  • Windows 10 (version 1903 aka 19H1, or later), or Windows 11, either 32-bit or 64-bit.

  • macOS AMD/Intel or Apple Silicon (Big Sur 11.0 or higher).

  • Linux AMD/Intel 64-bit with glibc 2.27 or higher.

    • You will need the following system tools installed:

      • tar, wget or curl, git, unzip
      • libssl-dev (Debian, Ubuntu) or openssl-devel (Red Hat)
      • ninja (this requirement will be removed)
      • If any are missing you will be asked if DkCoder can install them the first time you run ./dk from the command line.
    • For graphics you will need one of:

      • X11 with a direct OpenGL driver.
        • A good prereq test is running glxgears (ex. yum install glx-utils; glxgears). If that works, DkCoder graphics should work.
        • Don’t expect the LIBGL_ALWAYS_INDIRECT=1 environment variable to work. That is sometimes recommended when forwarding X11 with ssh -X or into a Docker container, but it forces the protocol to be a too-old OpenGL version 1.4.
      • Wayland. Set the environment variable SDL_VIDEODRIVER=wayland to force it.
      • Linux Direct Rendering Manager. If supported, you will have a /dev/dri/card0. Set the environment variable SDL_VIDEODRIVER=kmsdrm to force it.
    • That’s it. Linux desktops are complex!
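
If you are unsure which Linux graphics option applies to your machine, a quick probe can help. The sketch below is plain OCaml and rests on assumptions that are not from DkCoder: it treats the conventional DISPLAY and WAYLAND_DISPLAY environment variables as signs of X11 and Wayland sessions, and the presence of /dev/dri/card0 as a sign of Direct Rendering Manager support. It only prints hints; it does not verify that OpenGL works.

```ocaml
type backend = X11 | Wayland | Kmsdrm

let hint = function
  | X11 -> "X11 display found; run glxgears to confirm a direct OpenGL driver"
  | Wayland -> "Wayland session found; SDL_VIDEODRIVER=wayland forces it"
  | Kmsdrm -> "/dev/dri/card0 found; SDL_VIDEODRIVER=kmsdrm forces it"

(* getenv and file_exists are parameters so the probe can be tested
   without touching the real machine. *)
let detect ~getenv ~file_exists =
  List.filter_map
    (fun (present, backend) -> if present then Some backend else None)
    [ (getenv "DISPLAY" <> None, X11);
      (getenv "WAYLAND_DISPLAY" <> None, Wayland);
      (file_exists "/dev/dri/card0", Kmsdrm) ]

let () =
  if Sys.getenv_opt "LIBGL_ALWAYS_INDIRECT" = Some "1" then
    print_endline "warning: LIBGL_ALWAYS_INDIRECT=1 forces the too-old OpenGL 1.4";
  match detect ~getenv:Sys.getenv_opt ~file_exists:Sys.file_exists with
  | [] -> print_endline "no graphics backend detected"
  | backends -> List.iter (fun b -> print_endline (hint b)) backends
```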

Runtime system

Your scripts will have access to the following libraries:

  • curl
    • All OSes:
      • Brotli and zstd compression
      • Asynchronous DNS resolution
      • HTTP/2
      • Websockets
    • Windows: Schannel TLS backend (Microsoft CryptoAPI) and WinIDN domain name resolution
    • macOS: Secure Transport backend (Keychain) and IDN2
    • Linux: OpenSSL is required from the operating system. DkCoder does not provide it.
  • SDL2

Basic shell scripting

In an earlier section an OCaml programming language environment was transparently downloaded for you. You will be using OCaml in this walkthrough, even though you may find other DkSDK documentation that uses DkCoder with C and Java. You can proceed through this walkthrough without knowing OCaml.

Once you are done with the DkCoder walkthrough you may want to go to the OCaml - Learn site. For this walkthrough we’ll stick to ordinary OCaml open-source examples that you can replicate in a conventional OCaml environment.

In this section we’ll be using the shexp library that was created by Jane Street Capital. If you are familiar with traditional shell scripts like bash you’ll find shexp more powerful.

Open src/DkHelloScript_Std/B43Shell/B35Shexp/B43Countdown.ml in your IDE or open src/DkHelloScript_Std/B43Shell/B35Shexp/B43Countdown.ml in your browser.

You should see:

open Tr1Shexp_Std.Shexp_process
open Tr1Shexp_Std.BindsShexp

(* Counts down from n with a one second delay between ticks *)
let rec countdown (n : int) : unit t =
  if n > 0 then begin
    echo (string_of_int n) ;%bind
    sleep 1.0 ;%bind
    countdown (n - 1)
  end
  else echo "Done countdown. Bye Builder!"

let main_t : unit t =
  echo "Hello Builder! ..." ;%bind
  echo "------------------------------------------------------------" ;%bind
  echo "Starting countdown..." ;%bind
  countdown 5

let () = if Tr1EntryName.module_id = __MODULE_ID__ then eval main_t

And running it with:

./dk DkRun_V2_2.Run -- DkHelloScript_Std.B43Shell.B35Shexp.B43Countdown

gives:

Hello Builder! ...
------------------------------------------------------------
Starting countdown...
5
4
3
2
1
Done countdown. Bye Builder!
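
If the shape of the countdown script is unfamiliar, the toy model below may help. It is not shexp and makes no claim about how shexp or the ;%bind separator is implemented. It assumes only that a value of type unit t is a description of work that does nothing until it is evaluated, and that bind chains one description after another. All names are defined inside the example, and the sleep is left out to keep it short.

```ocaml
(* Toy stand-in for a process description: a delayed computation. *)
type 'a t = unit -> 'a

let return (x : 'a) : 'a t = fun () -> x

(* Run m, then feed its result to f and run what f returns. *)
let bind (m : 'a t) (f : 'a -> 'b t) : 'b t = fun () -> f (m ()) ()

let echo (s : string) : unit t = fun () -> print_endline s

let eval (m : 'a t) : 'a = m ()

let rec countdown (n : int) : unit t =
  if n > 0 then bind (echo (string_of_int n)) (fun () -> countdown (n - 1))
  else echo "Done countdown. Bye Builder!"

let () =
  let description = countdown 3 in
  (* Nothing has been printed yet: description is only a plan. *)
  print_endline "Starting countdown...";
  eval description
```

In this toy, "Starting countdown..." is printed before any number even though countdown 3 was called first, because building the description and running it are separate steps. That separation is why the real script ends with an explicit eval main_t.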

Prior Art

This is only a quick tour of the most popular alternatives to DkCoder. Most of the advantages of DkCoder boil down to one thing: it works on Windows and Unix without pre-installing other packages.

  • Shell scripts like /bin/bash. A POSIX-compatible shell is pre-installed on almost all Unix distributions. By comparison, DkCoder has a POSIX shell script launcher that transparently installs pre-compiled binaries for macOS and the major Linux desktop distributions. However, DkCoder has expanded reach with its Windows shell script launcher that transparently installs on Windows (and Unix) machines.
  • PowerShell. The scripts are written in a full-featured programming language, and in that respect PowerShell is similar to DkCoder. Furthermore, PowerShell has a command shell that is a complete alternative to both the Command Prompt on Windows and /bin/sh on Unix. DkCoder does not have a command shell. However, DkCoder scripts work on both Windows and Unix without pre-installing the PowerShell distribution, so DkCoder may be easier to adopt.
  • Perl. CPAN gives Perl a huge standardized package registry with standardized package tools. By comparison, DkCoder has no packages (today!) in its package registry. However, DkCoder will soon have an alternative to package registries that is expected to be the main form of sharing until DkCoder is out of its infancy stage. Also, as with shell scripts, Perl is pre-installed on almost all Unix distributions. However, DkCoder has expanded reach with its Windows and Unix shell script launcher.
  • Python. Most of the advantages and disadvantages for Perl apply to Python. Python is less ubiquitous but more popular than Perl. DkCoder scripting should have a similar learning curve to Python, as evidenced by its early use by high-schoolers.
  • Fortran. The DkCoder author (me!) has never written a line of Fortran, but its “dynamic dependency” system of Fortran files containing declarations of which Fortran modules are used is similar to DkCoder. DkCoder goes a step further and drops the need to declare dependencies that are available in the DkRegistry.
  • etc.

There are a few reasons not to use DkCoder:

  • OCaml has memory inefficiencies due to garbage collection, boxing, cache non-locality and wide native types. This would affect you if your application is memory-constrained rather than CPU or IO-constrained. The main mitigation is that using OCaml FFI to interface with memory efficient languages like C and Rust is not overly complex.
  • The OCaml ecosystem is small and biased towards 64-bit Linux applications. This would affect you if you develop applications on Windows or for mobile devices. The main mitigation is to use cross-platform distributions like esy, DkML and DkSDK.
  • The company behind DkCoder is very small (at times only one FTE). In particular, some features like fast native compilation are going to take substantial time to deliver.
    The main mitigation is that DkSDK customers get source code rights for life, and can prioritize their wishlist with sponsored feature development.
  • DkCoder is currently in Alpha. Major features like auto-downloading of other scripts are almost feature complete but paused while we collect Alpha feedback. The main mitigation is to wait if you need post-Alpha features.
  • The licensing of OCaml (LGPL) and the broader OCaml ecosystem (many GPL) may be too restrictive for some companies to adopt, especially in comparison to many modern (Rust, Go) or dominant (C, Python, JavaScript) ecosystems. The main mitigation is to strategically choose either static or shared linking (tools in DkSDK CMake and DkSDK FFI C make this easier).

Integration Testing

Let’s run a script that is simultaneously both an integration test and source of documentation:

./dk DkRun_V2_2.Run DkHelloScript_Std.Y33Article --serve

While that is running, open your web browser to http://localhost:8080

You will see a preview of the documentation you are reading right now! The documentation you are reading is a side-effect of running integration tests. Open src/DkHelloScript_Std/Y33Article.ml in your IDE or open src/DkHelloScript_Std/Y33Article.ml in your browser.

You should see:

open Tr1Tezt_C.Tezt
open Tr1Htmlit_Std.Htmlit

(** {1 Register Tests incl. Documentation and Server}
    
    If you are familiar with React, you can "push down" functions and values
    so that your deeper content can render itself. We do the same thing
    here, and that is why you see all the [~make_capture ~run_dk] and other
    parameters.
    
    We could have also used a `src/DkHelloScript_Std/open__.ml` file
    to provide values to all the scripts of the [DkHelloScript_Std] library.
    If you are familiar with React, that "global" style of distributing
    values would be called a "context".

    Even simpler would be to create functions inside [Doc] since
    [Y33ArticleX/Doc] is visible to the [Y33ArticleX/Section*] scripts. 
    
    If possible find a module like [Doc] that is visible to all the scripts
    that need it, and place your functions in the module. We don't do it
    in this project except for one function {!ucodeblock} for demonstration
    purposes.

    If that doesn't work for the way you have organized your project code,
    prefer the "push down" method over the "global context". It is simpler
    to test, your functions will be re-usable outside of the current
    library (ex. outside of [DkHelloScript_Std]), and you won't pull in
    unnecessary dependencies for scripts that don't need it.
    
    You can avoid "unused parameters" errors with the "push down" method
    by attaching the following extension on your "push down" functions:
    
    [[
      let register_before_tests (* ... *) =
         (* ... *)
         unit
         [@@warning "-unused-var-strict"]
    ]] *)

let title_section () =
  Y33ArticleX.Doc.(
    append
      (usection
         [ ucontainer
             El.
               [p ~at:[At.class' "subtitle"] [txt "DkCoder: Scripting at Scale"]]
         ] ) )

let () =
  if Tr1EntryName.module_id = __MODULE_ID__ then begin
    (* SECTIONS *)
    let andhello_file = AndHello.__FILE__ in
    let andhello_module = AndHello.__MODULE_ID__ in
    title_section () ;
    Y33ArticleX.S004Intro.register_before_tests ~andhello_module () ;
    Y33ArticleX.S008StartScript.register_before_tests ~andhello_module
      ~andhello_file () ;
    Y33ArticleX.S012Reproducib.register_before_tests ~andhello_module () ;
    Y33ArticleX.S016Focus.register_before_tests ~andhello_file
      ~andhelloagain_module:AndHelloAgain.__MODULE_ID__
      ~andhelloagain_file:AndHelloAgain.__FILE__ () ;
    Y33ArticleX.S020RuntimeReqs.register_before_tests () ;
    Y33ArticleX.S022RuntimeSys.register_before_tests () ;
    Y33ArticleX.S024BasShell.register_before_tests
      ~shexpcountdown_module:B43Shell.Index.B35Shexp.B43Countdown.__MODULE_ID__
      ~shexpcountdown_file:B43Shell.Index.B35Shexp.B43Countdown.__FILE__ () ;
    Y33ArticleX.S028PriorArt.register_before_tests () ;
    Y33ArticleX.S032Testing.register_before_tests ~article_module:__MODULE_ID__
      ~article_file:__FILE__ () ;
    Y33ArticleX.S036FlairGraph.register_before_tests
      ~boguetiny_module:B57Graphics.Index.B43Bogue.B43Tiny.__MODULE_ID__
      ~boguetiny_file:B57Graphics.Index.B43Bogue.B43Tiny.__FILE__ () ;
    Y33ArticleX.S038SnokeGame.register_before_tests () ;
    (* Y33ArticleX.S040ReusingYouScripts.register_before_tests () ; *)
    (* Y33ArticleX.S044ListingThirdParties.register_before_tests () ; *)
    (* Y33ArticleX.S048UsingListedThirdParties.register_before_tests () ; *)
    (* Y33ArticleX.S052FunBreak.register_before_tests () ; *)
    (* Y33ArticleX.S056UsingRegThirdParties.register_before_tests () ; *)
    (* Y33ArticleX.S060DkRegistry.register_before_tests () ; *)
    Y33ArticleX.S064MakeProject.register_before_tests () ;
    Y33ArticleX.S066Production.register_before_tests () ;
    Y33ArticleX.S068Parties.register_before_tests () ;
    Y33ArticleX.S072Stdlib.register_before_tests () ;
    Y33ArticleX.S076RuntimeLibs.register_before_tests () ;
    Y33ArticleX.S080InducLimits.register_before_tests () ;
    Y33ArticleX.S084EarlyLimits.register_before_tests () ;
    Y33ArticleX.S088SecDesign.register_before_tests () ;
    Y33ArticleX.S092InTouch.register_before_tests () ;
    (* HTTP SERVER *)
    Y33ArticleX.Httpd.register_before_tests () ;
    (* DOCUMENTATION PRINTING *)
    Y33ArticleX.Doc.register_before_tests () ;
    (* RUN TESTS *)
    Test.run ()
  end

The test infrastructure is provided by the module Tr1Tezt_C.Tezt. I recommend you read the Announcing Tezt by Nomadic Labs article to see what it can do and why it was developed.

DkCoder has Tezt pre-installed, so you won’t have to create ‘dune’ files or do any ‘dune build’ steps that regular OCaml developers would have to do. To use Tezt, follow the examples that are given to you in this documentation.

I wrote the module DkHelloScript_Std.Y33ArticleX.Doc in a very “scripty” style: it incrementally builds documentation in memory. It renders to either HTML or Markdown, and makes use of the Bulma CSS Framework for styling and layout. You can copy and customize the module in your own scripts. For those of you familiar with JavaScript and the DOM, the incremental approach is similar to how your web browser renders a web page. For everybody else, just think of a file being created in-memory the first time DkHelloScript_Std.Y33ArticleX.Doc is accessed, and through the use of several helper functions that write (“append”) fragments at the end of the file, the DkHelloScript_Std.Y33ArticleX.Doc.register_before_tests () is able to print a complete documentation file to the console or a real file.

The title_section () function is the first documentation fragment that is built. You should recognize it on the top of this documentation page.
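
The "append fragments, then print the whole file" idea can be sketched in a few lines of plain OCaml. This is a toy stand-in and not the real DkHelloScript_Std.Y33ArticleX.Doc module, which builds typed HTML elements rather than raw strings. The module and function names below are reused only to make the correspondence easy to see.

```ocaml
(* Toy stand-in for an incrementally built, in-memory document. *)
module Doc = struct
  (* Global mutable state: fragments in the order they were appended. *)
  let fragments : string Queue.t = Queue.create ()

  let append (fragment : string) : unit = Queue.add fragment fragments

  let render () : string =
    String.concat "\n" (List.of_seq (Queue.to_seq fragments))
end

let () =
  Doc.append "<p class=\"subtitle\">DkCoder: Scripting at Scale</p>";
  Doc.append "<p>Hello Builder!</p>";
  print_endline (Doc.render ())
```

Because the fragments live in one shared queue, any script that can see Doc can contribute a section, and the order of the calls decides the order of the page.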

Imperative programming
You may have heard that OCaml is a functional language. However, OCaml also supports imperative programming with global variables. The Tezt test framework uses imperative programming to allow you to register tests whenever you want. I personally find imperative programming the most natural programming model when writing scripts.

Each section of the documentation has its own test script which registers its own integration test.

Once all of the tests are registered, they are all run with the OCaml function call:

Test.run ()
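
The register-now, run-later pattern described above can be imitated with a global list. The sketch below is plain OCaml and is not Tezt: real Tezt tests take more arguments (such as ~__FILE__ and ~tags, as seen in the article source). It shows only why every register_before_tests call must come before the single run.

```ocaml
(* Toy registry: the newest test is consed on the front, so run reverses. *)
let tests : (string * (unit -> unit)) list ref = ref []

let register ~title (body : unit -> unit) : unit =
  tests := (title, body) :: !tests

let run () : unit =
  List.iter
    (fun (title, body) ->
      body ();
      Printf.printf "[ok] %s\n" title)
    (List.rev !tests)

let () =
  (* Registration order decides execution order. *)
  register ~title:"intro" (fun () -> print_endline "appending intro section");
  register ~title:"testing" (fun () -> print_endline "appending testing section");
  run ()
```

Nothing inside a test body executes at registration time, which is what lets each section live in its own script and still come out in article order.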

Let’s drill down into this section of the documentation. Open src/DkHelloScript_Std/Y33ArticleX/S032Testing.ml in your IDE or open src/DkHelloScript_Std/Y33ArticleX/S032Testing.ml in your browser.

open Tr1Tezt_C.Tezt
open Tr1Tezt_C.Tezt.Base
open Tr1Htmlit_Std.Htmlit
module Printf = Tr1Stdlib_V414CRuntime.Printf

let register_before_tests ~article_module ~article_file () =
  Test.register ~__FILE__ ~title:"testing" ~tags:[]
  @@ fun () ->
  let self_FILE = __FILE__ in
  let open Doc in
  let open El in
  append
    (usection
       [ ucontainer
           [ ucard ~title:"Integration Testing" []
           ; upar
               [ txt
                   "Let's run a script that is simultaneously both an \
                    integration test and source of documentation:" ]
           ; ucodeblock `Shell
               (Printf.sprintf "./dk %s %s --serve" Tr1Version.run_module
                  article_module )
           ; upar
               [ txt "While that is running, open your web browser to "
               ; ulink ~url:"http://localhost:8080" "http://localhost:8080" ]
           ; upar
               [ txt
                   "You will see a preview of the documentation you are \
                    reading right now! "
               ; unsafe_raw
                   "<strong>The documentation you are reading is a side-effect \
                    of running integration tests.</strong>" ]
           ; ucodeaction ~ide:(txt "Open") ~browser:(Some "open") article_file
           ; txt "You should see:"
           ; ucodefile article_file
           ; upar
               [ txt "The test infrastructure is provided by the module "
               ; ucode "Tr1Tezt_C.Tezt"
               ; txt ". I recommend you read the "
               ; ulink
                   ~url:
                     "https://research-development.nomadic-labs.com/announcing-tezt.html"
                   "Announcing Tezt by Nomadic Labs"
               ; txt " article to see what it can do and why it was developed. "
               ]
           ; uinfo
               [ txt
                   "DkCoder has Tezt pre-installed, so you won't have to \
                    create 'dune' files or do any 'dune build' steps that \
                    regular OCaml developers would have to do. To use Tezt, \
                    follow the examples that are given to you in this \
                    documentation." ]
           ; upar
               [ txt "I wrote the module "
               ; ucode Doc.__MODULE_ID__
               ; txt
                   " in a very \"scripty\" style: it incrementally builds \
                    documentation in memory. "
               ; txt
                   "It renders to either HTML or Markdown, and makes use of \
                    the "
               ; ulink ~url:"https://bulma.io/" "Bulma CSS Framework"
               ; txt " for styling and layout. "
               ; txt
                   "You can copy and customize the module in your own scripts. "
               ; txt "For those of you familiar with "
               ; ulink
                   ~url:
                     "https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Introduction"
                   "JavaScript and the DOM"
               ; txt
                   ", the incremental approach is similar to how your web \
                    browser renders a web page. For everybody else, just think \
                    of a file being created in-memory the first time "
               ; ucode Doc.__MODULE_ID__
               ; txt
                   " is accessed, and through the use of several helper \
                    functions that write (\"append\") fragments at the end of \
                    the file, the "
               ; ucode
                   (Printf.sprintf "%s.register_before_tests ()"
                      Doc.__MODULE_ID__ )
               ; txt
                   " is able to print a complete documentation file to the \
                    console or a real file." ]
           ; upar
               [ txt "The "
               ; ucode "title_section ()"
               ; txt
                   " function is the first documentation fragment that is \
                    built. "
               ; txt
                   "You should recognize it on the top of this documentation \
                    page." ]
           ; uinfo
               ~header:[txt "Imperative programming"]
               [ txt "You may have heard that OCaml is a functional language. "
               ; txt
                   "However, OCaml also supports imperative programming with \
                    global variables. "
               ; txt
                   "The Tezt test framework uses imperative programming to \
                    allow you to register tests whenever you want. "
               ; txt
                   "I personally find imperative programming the most natural \
                    programming model when writing scripts. " ]
           ; upar
               [ txt
                   "Each section of the documentation has its own test script \
                    which registers its own integration test. " ]
           ; upar
               [ txt
                   "Once all of the tests are registered, they are all run \
                    with the OCaml function call:"
               ; ucodeblock `OCaml "Test.run ()" ]
           ; upar
               [txt "Let's drill down into this section of the documentation. "]
           ; ucodeaction ~ide:(txt "Open") ~browser:(Some "open") self_FILE
           ; ucodefile self_FILE
           ; upar
               [ txt "You will see some brief boilerplate to register the test ("
               ; ucode "Test.register ~__FILE__ ..."
               ; txt
                   "). The test adds a \"section\" fragment to the in-memory \
                    documentation. In fact, the side-effect of creating the \
                    section fragment is all that the test does." ]
           ; upar [txt "Let's see something more complex."]
           ; ucodeaction
               ~ide:(txt "Open the \"Basic shell scripting\" test")
               ~browser:(Some "open") S024BasShell.__FILE__
           ; ucodefile S024BasShell.__FILE__
           ; upar
               [ txt "In this test we run a script from the command line ("
               ; ucode "RunDk.make_capture ()"
               ; txt " and "
               ; ucode "RunDk.run_dk ~hooks [shexpcountdown_module]"
               ; txt "). "
               ; txt
                   "The source code, the script command and its output become \
                    part of the documentation:"
               ; ucodeblock `OCaml
                   {|
; ucodefile shexpcountdown_file
; ucodeblock `Shell !ref_cmd
; ucodeblock `Output dk_output
|}
               ] ] ] ) ;
  unit
[@@warning "-unused-var-strict"]

You will see some brief boilerplate to register the test (Test.register ~__FILE__ ...). The test adds a “section” fragment to the in-memory documentation. In fact, the side-effect of creating the section fragment is all that the test does.

Let’s see something more complex. Open the “Basic shell scripting” test src/DkHelloScript_Std/Y33ArticleX/S024BasShell.ml in your IDE or open src/DkHelloScript_Std/Y33ArticleX/S024BasShell.ml in your browser.

open Tr1Tezt_C.Tezt
open Tr1Tezt_C.Tezt.Base
open Tr1Htmlit_Std.Htmlit

let register_before_tests ~shexpcountdown_module ~shexpcountdown_file () =
  Test.register ~__FILE__ ~title:"basic shell scripting" ~tags:[]
  @@ fun () ->
  let ref_cmd, hooks = RunDk.make_capture () in
  let* dk_output = RunDk.run_dk ~hooks [shexpcountdown_module] in
  let open Doc in
  let open El in
  append
    (usection
       [ ucontainer
           [ ucard ~title:"Basic shell scripting" []
           ; upar
               [ unsafe_raw
                   "In an earlier section an <strong>OCaml</strong> \
                    programming language environment was transparently \
                    downloaded for you. "
               ; txt
                   "You will be using OCaml in this walkthrough, even though \
                    you may find other DkSDK documentation that uses DkCoder \
                    with C and Java. "
               ; txt
                   "You can proceed through this walkthrough without knowing \
                    OCaml." ]
           ; uinfo
               [ unsafe_raw
                   {|Once you are done with the DkCoder walkthrough you may want to go to the <a href="https://ocaml.org/docs">OCaml - Learn</a> site. |}
               ; unsafe_raw
                   {|For this walkthrough <strong>we'll stick to ordinary OCaml open-source examples</strong> that you can replicate in a conventional OCaml environment.|}
               ]
           ; upar
               [ txt "In this section we'll be using the "
               ; ulink ~url:"https://github.com/janestreet/shexp" "shexp"
               ; txt " library that was created by "
               ; ulink
                   ~url:"https://www.janestreet.com/join-jane-street/overview/"
                   "Jane Street Capital"
               ; txt
                   ". If you are familiar with traditional shell scripts like "
               ; ucode "bash"
               ; txt " you'll find "
               ; ucode "shexp"
               ; txt " more powerful." ]
           ; upar
               [ ucodeaction ~ide:(txt "Open") ~browser:(Some "open")
                   shexpcountdown_file
               ; txt "You should see:" ]
           ; ucodefile shexpcountdown_file
           ; upar [txt "And running it with:"]
           ; ucodeblock `Shell !ref_cmd
           ; txt "gives:"
           ; ucodeblock `Output dk_output ] ] ) ;
  unit
[@@warning "-unused-var-strict"]

In this test we run a script from the command line (RunDk.make_capture () and RunDk.run_dk ~hooks [shexpcountdown_module]). The source code, the script command and its output become part of the documentation:

; ucodefile shexpcountdown_file
; ucodeblock `Shell !ref_cmd
; ucodeblock `Output dk_output
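
The capture step can be pictured as a hook that records the command line just before it is executed. The sketch below is a plain OCaml toy under that assumption. The hooks record, run_with and fake_exec are invented for illustration, and the real RunDk module may work differently. No process is actually started; fake_exec returns a canned string so the example stays self-contained.

```ocaml
(* A hook that is told about each command before it runs. *)
type hooks = { on_spawn : string -> unit }

(* Returns a reference that will hold the last command, plus the hook
   that fills it in. *)
let make_capture () =
  let ref_cmd = ref "" in
  (ref_cmd, { on_spawn = (fun cmd -> ref_cmd := cmd) })

(* exec stands in for "start the process and collect its output". *)
let run_with ~hooks ~(exec : string -> string) (args : string list) : string =
  let cmd = String.concat " " ("./dk" :: args) in
  hooks.on_spawn cmd;
  exec cmd

let () =
  let ref_cmd, hooks = make_capture () in
  let fake_exec _cmd = "Hello Builder!" in
  let output =
    run_with ~hooks ~exec:fake_exec
      [ "DkRun_V2_2.Run"; "--"; "DkHelloScript_Std.AndHello" ]
  in
  Printf.printf "$ %s\n%s\n" !ref_cmd output
```

The documentation gets both the exact command (!ref_cmd) and its output from the same run, so the two cannot drift apart.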

Adding some flair with graphics

Open src/DkHelloScript_Std/B57Graphics/B43Bogue/B43Tiny.ml in your IDE or open src/DkHelloScript_Std/B57Graphics/B43Bogue/B43Tiny.ml in your browser.

You should see:

open Tr1Bogue_Std.Bogue

let () =
  if Tr1EntryName.module_id = __MODULE_ID__ then
    Widget.label "Hello world"
    |> Layout.resident
    |> Bogue.of_layout
    |> Bogue.run

Run it with:

./dk DkRun_V2_2.Run -- DkHelloScript_Std.B57Graphics.B43Bogue.B43Tiny

to see:

[Image: Bogue's Hello World]
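
The |> operator in the script above is OCaml's pipeline operator: x |> f means f x, so a chain reads top to bottom like a shell pipeline. A minimal plain OCaml illustration, with a made-up shout function:

```ocaml
(* Hypothetical helper used only to have something to pipe through. *)
let shout (s : string) : string = String.uppercase_ascii s ^ "!"

let () =
  (* Pipeline style: the value flows from one function to the next. *)
  "hello world" |> shout |> print_endline;
  (* The same thing written as nested calls. *)
  print_endline (shout "hello world")
```

Both lines print the same text; the pipeline form just avoids reading inside-out.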

The Snoke game

So far you have seen a very simple use of the Bogue graphics library.

The author of Bogue has created a demonstration game which was “ported” to DkCoder.

The port did not change a single line of the original code. The directory structure was re-arranged (recall that there is a Java-like package mechanism underneath DkCoder) and an extra .ml file was added.

Run it outside your project (perhaps your home directory) with:

git clone --branch V2_1 https://gitlab.com/diskuv/samples/dkcoder/SanetteBogue.git

./SanetteBogue/dk DkRun_V2_1.Run -- SanetteBogue_Snoke.Snoke

You should see a really fun game!

[Image: sanette's Snoke game]

You can explore its GPL-3.0 licensed source code at https://gitlab.com/diskuv/samples/dkcoder/SanetteBogue.git

Making Your Own Script Project

Create an empty folder for your project. In the Terminal run:

git clone https://github.com/diskuv/dkcoder.git
dkcoder/dk user.dkcoder.project.init

Files Created
.
├── .git/
├── .gitattributes
├── .gitignore
├── .merlin
├── .ocamlformat
├── .vscode
│   ├── extensions.json
│   └── settings.json
├── __dk.cmake
├── dk
├── dk.cmd
└── src
    └── MyScripts_Std
        └── StartHere.ml

✔️   DONE! You can edit the StartHere.ml script, or add new scripts now.

Using scripts in Production

We’ve gone through the mechanics of creating your own projects and seeing what scripts can do on a desktop. However, scripts are not limited to running on the desktop!

Let’s run a real production script:

git clone --branch V0_3 https://gitlab.com/diskuv/samples/devops/DkSubscribeWebhook.git

./DkSubscribeWebhook/dk DkRun_V0_3.Run -- DkSubscribeWebhook_Std.Subscriptions --help

You’ll see the help text and the numerous options available for running the production webhook that manages the DkSDK subscriptions at DkSDK Pricing.

A webhook is a production microservice that responds to Internet requests from third parties. You could run the webhook yourself to manage your own customer subscriptions after you have configured some cloud SaaS services (more on this soon). The basic flow is:

  1. Stripe is the payments provider for DkSDK. Stripe sends an invoice.paid event to the webhook after establishing a subscription from the <https://diskuv.com/pricing> website.
  2. GitLab is the source control provider for DkSDK. A GitLab group token is created for the subscriber that expires after the subscription (plus a grace period).
  3. AWS SES is one of the email gateways used by DkSDK. An AWS SES email is sent to the subscriber containing the group token.

You could run the webhook script just like you did above. However, it is common to package up your scripts into a Docker container for deployment to production. Then in production when you execute:

docker-compose up --build

the 100MB webhook container image starts up with all the command line options and all the credentials to your cloud SaaS services. The Docker Compose example we provide also has a Let’s Encrypt web interface so that you can manage SSL certificates for the webhook.

I recommend you read the README at https://gitlab.com/diskuv/samples/devops/DkSubscribeWebhook.git if you would like to see the production scripts in detail.

What may not be obvious from looking at the whole DkSubscribeWebhook project is how it was developed. I encourage you to look at the git commit history to see how each SaaS provider was tested and developed separately as runnable scripts. Those same “provider” scripts can be used for manually administering subscriptions. They don’t bitrot and they compose well into the larger webhook service.

Parties Who Own Scripts

There are three parties of people who own DkCoder scripts:

  1. You (the first party)
  2. Us (the second party)
  3. Them (the third parties)

Here is a handy reference table that summarizes the distinctions. You won't understand the table right now and that is OK. We'll cover "Us" and "Them" in later sections in a future edition of DkCoder, but it is nice to have all the information in one place. All you need to remember is if you have a question about parties, this is the table to check.

Party Reference Table
Party | Code Generators | Terms | Compiled Initially
You | DuneIde (default), Dune | Checked | Yes
Us | Dune | | No
Them | DuneIde (default), Dune | Checked | No

In this section we'll talk about where your own scripts (the "You" party) go.

Your scripts must be placed in specific paths. Here is an example which you've seen in this walkthrough, and the general pattern you must follow:

.
└── src/
    ├── DkHelloScript_Std/
    │   └── Example001.ml
    └── <Libraryowner><Project>_<Unit>/
        ├── <Modulename1>.ml
        └── <Modulename2>/
            ├── <Modulename3>.ml
            ├── <Modulename3>.mli
            └── <Modulename4>/
                └── <Modulename5>.ml

In the example above the DkHelloScript_Std is a library. A library is an organization of files and directories. These directories can have subdirectories, and those subdirectories can have their own subdirectories (etc.).

Any directory under src/ that is named in the format:

<Libraryowner><Project>_<Unit>

is also a library.

Breakdown of the DkHelloScript_Std library name
Component | Part | Why
Dk | Library owner | A short identifier for which party owns the source code
HelloScript | Project | Conventionally this is a name related to your source code repository, but it can be any organizational name you want. You might, for example, want to name it your product name.
Std | Unit | Conventionally if you want to create a single library in a project, you use the name Std. Once you get too much code in your project, you can use the Unit to separate your code into more manageable, smaller pieces.

We’ll define the exact format of <Libraryowner>, <Project>, and <Unit> below.

You can place your script into the library directory, or into a subdirectory (or subdirectory of a subdirectory, etc.) named according to the <Modulename>/ format defined below, as long as your script is named according to the <Modulename>.ml format. There is also an optional <Modulename>.mli script interface which is not covered in this article.
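
As a rough illustration of the layout described above, the following sketch maps a script path under src/ to the fully-qualified name you would pass to ./dk. The function qualified_name_of_path is hypothetical and is not part of DkCoder; it only mirrors the pattern already seen in this walkthrough, where src/DkHelloScript_Std/B57Graphics/B43Bogue/B43Tiny.ml is run as DkHelloScript_Std.B57Graphics.B43Bogue.B43Tiny. It assumes forward-slash separators and does not check the naming rules.

```ocaml
(* Hypothetical helper, not part of DkCoder: turn a path like
   "src/Lib_Std/A/B.ml" into the dotted name "Lib_Std.A.B". *)
let qualified_name_of_path (path : string) : string option =
  let prefix = "src/" in
  let plen = String.length prefix in
  if String.length path > plen
     && String.sub path 0 plen = prefix
     && Filename.check_suffix path ".ml"
  then
    (* Drop the ".ml" extension and the leading "src/". *)
    let without_ext = Filename.chop_suffix path ".ml" in
    let relative =
      String.sub without_ext plen (String.length without_ext - plen)
    in
    if relative = "" then None
    else Some (String.concat "." (String.split_on_char '/' relative))
  else None

let () =
  match
    qualified_name_of_path
      "src/DkHelloScript_Std/B57Graphics/B43Bogue/B43Tiny.ml"
  with
  | Some name -> print_endline name
  | None -> print_endline "not a script path"
```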

Rules for the parts of a library name
Part | Examples | Rules
Library owner | Dk, Acme, Blue123 | The Library owner must start with an ASCII capital letter and have a lowercase ASCII second letter; the remaining characters (if any) must be lowercase ASCII letters and/or ASCII digits.
Project | X, Ab, Widgets, WidgetsPlus | The Project must start with an ASCII capital letter. The remaining characters (if any) must be ASCII letters (any case) and/or ASCII digits. Conventionally you use something similar to your source code repository name as the Project.
Unit | Std, V1, X | The library unit must start with an ASCII capital letter. The remaining characters (if any) must be ASCII letters (any case) and/or ASCII digits and/or underscores (`_`). Conventionally the main unit for your project is named `Std`. Once you get too much code in your project, you can use the Unit to separate your code into more manageable, smaller pieces.

Carefully choose the library owner
The Library owner (ex. Acme above) should uniquely identify yourself or your organization. Pick one and use it consistently. If you decide to publish libraries to the DkRegistry (described later) you will avoid conflicts with other libraries.

Rules for a module name
Examples | Counter Examples | Rule
Acme, Acme_Two, X | DkHelloScript_Std | The module name MUST NOT be a valid <Libraryowner><Project>_<Unit> name.
Acme, Acme_Two, X | 12345, someThing | The module name must start with an ASCII capital letter. The remaining characters (if any) must be ASCII letters (any case) and/or ASCII digits and/or underscores (`_`).
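
The naming rules in the tables above can be sketched as a small checker. This is only an illustration written from the rules as stated in this article; the function names (is_library_name, is_module_name and the helpers) are hypothetical, and DkCoder's own validation may differ in details.

```ocaml
let is_upper c = c >= 'A' && c <= 'Z'
let is_lower c = c >= 'a' && c <= 'z'
let is_digit c = c >= '0' && c <= '9'
let is_alnum c = is_upper c || is_lower c || is_digit c
let is_ident_char c = is_alnum c || c = '_'

(* True when every character of [s] from index [start] onward satisfies [p]. *)
let all_from p s start =
  let rec go i = i >= String.length s || (p s.[i] && go (i + 1)) in
  go start

(* ASCII capital letter first, then only characters accepted by [p]. *)
let starts_upper_then p s = s <> "" && is_upper s.[0] && all_from p s 1

let is_project = starts_upper_then is_alnum
let is_unit = starts_upper_then is_ident_char

(* <Libraryowner><Project>_<Unit> *)
let is_library_name s =
  let n = String.length s in
  (* Owner: capital, lowercase, then lowercase letters and/or digits. *)
  if n < 2 || not (is_upper s.[0]) || not (is_lower s.[1]) then false
  else
    let rec owner_end i =
      if i < n && (is_lower s.[i] || is_digit s.[i]) then owner_end (i + 1)
      else i
    in
    let i = owner_end 2 in
    (* Owner and Project contain no underscore, so the first underscore
       after the owner separates the Project from the Unit. *)
    match String.index_from_opt s i '_' with
    | None -> false
    | Some j ->
        is_project (String.sub s i (j - i))
        && is_unit (String.sub s (j + 1) (n - j - 1))

(* A module name has the same character rules as a Unit but must not
   itself be a valid library name. *)
let is_module_name s =
  starts_upper_then is_ident_char s && not (is_library_name s)

let () =
  List.iter
    (fun s ->
      Printf.printf "%-18s library=%b module=%b\n" s (is_library_name s)
        (is_module_name s))
    [ "DkHelloScript_Std"; "Acme_Two"; "X"; "12345"; "someThing" ]
```

Note how Acme_Two is a module name but not a library name: after the owner Acme there is no capitalized Project before the underscore.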

Standard OCaml Library - Stdlib

The Stdlib standard library provides the basic operations over the built-in types (numbers, booleans, byte sequences, strings, exceptions, references, lists, arrays, input-output channels, …) and standard library modules.

In conventional OCaml programs the Stdlib is automatically opened. That means you can type print_endline "Hi" rather than Stdlib.print_endline "Hi". However, in DkCoder access to the Stdlib is restricted: Stdlib does a bit too much. In particular, input-output channels and threading do not make sense when compiling to JavaScript for use on a web page. Even more critical is that the Stdlib.Obj module is included, which has unsafe functions that make it impossible to prove overall code safety.

You can explicitly get back the unsafe standard library by performing an open at the top of your scripts:

open Tr1Stdlib_V414

However, it is recommended to open the pieces of the standard library you actually need. That way if, for example, your script does not use files it can be compiled to JavaScript.

Splitting the OCaml Standard Library
Package | Description | Modules
Tr1Stdlib_V414CRuntime | Modules that need a C99 runtime | Arg, Callback, Filename, Format, In_channel, LargeFile, Lexing, Out_channel, Printexc, Printf, Scanf, StdExit (see Footnote 1), Sys, Unix, UnixLabels
Tr1Stdlib_V414Io | Modules for input and output. Can't be used on Android and iOS. | StdIo (see Footnote 2)
Tr1Stdlib_V414Threads | Modules to create and coordinate threads. Brings in Tr1Stdlib_V414CRuntime. | Condition, Event, Thread
Tr1Stdlib_V414Gc | Modules for garbage collection | Ephemeron, Gc, Weak
Tr1Stdlib_V414Random | Modules for random numbers | Random
Tr1Stdlib_V414Unsafe | Modules that can break type-safety | Dynlink, Marshal, Obj
Tr1Stdlib_V414Base | Modules safe for use on any platform. Also includes safe functions like sqrt. | Array, ArrayLabels, Atomic, Bool, Buffer, Bytes, BytesLabels, Char, Complex, Digest, Either, Float, Fun, Hashtbl, Int, Int32, Int64, Lazy, List, ListLabels, Map, MoreLabels, Nativeint, Oo, Option, Parsing, Queue, Result, Seq, Set, Stack, StdBinds (see Footnote 3), StdLabels, String, StringLabels, Uchar, Unit

Deprecated Stdlib modules
The ThreadUnix, Stream, Pervasives, and Genlex modules have been deprecated in OCaml and do not appear in any of the Tr1Stdlib_V414* packages.

Footnote 1: StdExit
StdExit is not a module provided by Stdlib. It is a new module that contains all the program termination types and functions of Stdlib:
val exit : int -> 'a
(** Terminate the process, returning the given status code
   to the operating system: usually 0 to indicate no errors,
   and a small positive integer to indicate failure.
   ... *)

val at_exit : (unit -> unit) -> unit
(** Register the given function to be called at program termination
   time. ... *)

Full documentation is at Stdlib - Program termination
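
To make the idea of splitting the standard library concrete, here is a minimal plain-OCaml sketch of what a module shaped like StdExit could look like. It is an assumption for illustration only and not DkCoder's implementation; in a DkCoder script you would use the module provided by Tr1Stdlib_V414CRuntime rather than define your own.

```ocaml
(* Illustrative stand-in only: re-export just the program-termination
   functions of Stdlib behind a narrow signature. *)
module StdExit : sig
  val exit : int -> 'a
  val at_exit : (unit -> unit) -> unit
end = struct
  let exit = Stdlib.exit
  let at_exit = Stdlib.at_exit
end

let () =
  (* The registered handler runs when the program terminates. *)
  StdExit.at_exit (fun () -> print_endline "cleaning up");
  print_endline "working"
```

The narrow signature is the point of the sketch: code that only sees StdExit can terminate the program but cannot reach the rest of Stdlib through it.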

Footnote 2: StdIo
StdIo is not a module provided by Stdlib. It is a new module that contains all the input/output types and functions of Stdlib:
type in_channel
(** The type of input channel. *)

type out_channel
(** The type of output channel. *)

val stdin : in_channel
(** The standard input for the process. *)

val stdout : out_channel
(** The standard output for the process. *)

val stderr : out_channel
(** The standard error output for the process. *)

val print_char : char -> unit
(** Print a character on standard output. *)

val print_string : string -> unit
(** Print a string on standard output. *)

val print_bytes : bytes -> unit
(** Print a byte sequence on standard output. *)

val print_int : int -> unit
(** Print an integer, in decimal, on standard output. *)

val print_float : float -> unit
(** Print a floating-point number, in decimal, on standard output. *)

val print_endline : string -> unit
(** Print a string, followed by a newline character, on
   standard output and flush standard output. *)

val print_newline : unit -> unit
(** Print a newline character on standard output, and flush
   standard output. *)

val prerr_char : char -> unit
(** Print a character on standard error. *)

val prerr_string : string -> unit
(** Print a string on standard error. *)

val prerr_bytes : bytes -> unit
(** Print a byte sequence on standard error. *)

val prerr_int : int -> unit
(** Print an integer, in decimal, on standard error. *)

val prerr_float : float -> unit
(** Print a floating-point number, in decimal, on standard error. *)

val prerr_endline : string -> unit
(** Print a string, followed by a newline character on standard
   error and flush standard error. *)

val prerr_newline : unit -> unit
(** Print a newline character on standard error, and flush
   standard error. *)

val read_line : unit -> string
(** Flush standard output, then read characters from standard input
   until a newline character is encountered. ... *)

val read_int_opt: unit -> int option
(** Flush standard output, then read one line from standard input
   and convert it to an integer. ... *)

val read_int : unit -> int
(** Same as {!read_int_opt}, but raise [Failure "int_of_string"]
   instead of returning [None]. *)

val read_float_opt: unit -> float option
(** Flush standard output, then read one line from standard input
   and convert it to a floating-point number. ... *)

val read_float : unit -> float
(** Same as {!read_float_opt}, but raise [Failure "float_of_string"]
   instead of returning [None]. *)

type open_flag =
    Open_rdonly      (** open for reading. *)
  | Open_wronly      (** open for writing. *)
  | Open_append      (** open for appending: always write at end of file. *)
  | Open_creat       (** create the file if it does not exist. *)
  | Open_trunc       (** empty the file if it already exists. *)
  | Open_excl        (** fail if Open_creat and the file already exists. *)
  | Open_binary      (** open in binary mode (no conversion). *)
  | Open_text        (** open in text mode (may perform conversions). *)
  | Open_nonblock    (** open in non-blocking mode. *)
(** Opening modes for {!open_out_gen} and {!open_in_gen}. *)

val open_out : string -> out_channel
(** Open the named file for writing, and return a new output channel
   on that file, positioned at the beginning of the file. ... *)

val open_out_bin : string -> out_channel
(** Same as {!open_out}, but the file is opened in binary mode,
   so that no translation takes place during writes. ... *)

val open_out_gen : open_flag list -> int -> string -> out_channel
(** [open_out_gen mode perm filename] opens the named file for writing,
   as described above. ... *)

val flush : out_channel -> unit
(** Flush the buffer associated with the given output channel,
   performing all pending writes on that channel. ... *)

val flush_all : unit -> unit
(** Flush all open output channels; ignore errors. *)

val output_char : out_channel -> char -> unit
(** Write the character on the given output channel. *)

val output_string : out_channel -> string -> unit
(** Write the string on the given output channel. *)

val output_bytes : out_channel -> bytes -> unit
(** Write the byte sequence on the given output channel. *)

val output : out_channel -> bytes -> int -> int -> unit
(** [output oc buf pos len] writes [len] characters from byte sequence [buf],
   starting at offset [pos], to the given output channel [oc]. ... *)

val output_substring : out_channel -> string -> int -> int -> unit
(** Same as [output] but take a string as argument instead of
   a byte sequence. *)

val output_byte : out_channel -> int -> unit
(** Write one 8-bit integer (as the single character with that code)
   on the given output channel. ... *)

val output_binary_int : out_channel -> int -> unit
(** Write one integer in binary format (4 bytes, big-endian)
   on the given output channel. ... *)

val output_value : out_channel -> 'a -> unit
(** Write the representation of a structured value of any type
   to a channel. ... *)

val seek_out : out_channel -> int -> unit
(** [seek_out chan pos] sets the current writing position to [pos]
   for channel [chan]. ... *)

val pos_out : out_channel -> int
(** Return the current writing position for the given channel. ... *)

val out_channel_length : out_channel -> int
(** Return the size (number of characters) of the regular file
   on which the given channel is opened. ... *)

val close_out : out_channel -> unit
(** Close the given channel, flushing all buffered write operations. ... *)

val close_out_noerr : out_channel -> unit
(** Same as [close_out], but ignore all errors. *)

val set_binary_mode_out : out_channel -> bool -> unit
(** [set_binary_mode_out oc true] sets the channel [oc] to binary
   mode: no translations take place during output. ... *)

val open_in : string -> in_channel
(** Open the named file for reading, and return a new input channel
   on that file, positioned at the beginning of the file. *)

val open_in_bin : string -> in_channel
(** Same as {!open_in}, but the file is opened in binary mode,
   so that no translation takes place during reads. ... *)

val open_in_gen : open_flag list -> int -> string -> in_channel
(** [open_in_gen mode perm filename] opens the named file for reading,
   as described above. ... *)

val input_char : in_channel -> char
(** Read one character from the given input channel. ... *)

val input_line : in_channel -> string
(** Read characters from the given input channel, until a
   newline character is encountered. ... *)

val input : in_channel -> bytes -> int -> int -> int
(** [input ic buf pos len] reads up to [len] characters from
   the given channel [ic], storing them in byte sequence [buf], starting at
   character number [pos]. ... *)

val really_input : in_channel -> bytes -> int -> int -> unit
(** [really_input ic buf pos len] reads [len] characters from channel [ic],
   storing them in byte sequence [buf], starting at character number [pos].
   ... *)

val really_input_string : in_channel -> int -> string
(** [really_input_string ic len] reads [len] characters from channel [ic]
   and returns them in a new string. ...*)

val input_byte : in_channel -> int
(** Same as {!input_char}, but return the 8-bit integer representing
   the character. ... *)

val input_binary_int : in_channel -> int
(** Read an integer encoded in binary format (4 bytes, big-endian)
   from the given input channel. ... *)

val input_value : in_channel -> 'a
(** Read the representation of a structured value, as produced
   by {!output_value}, and return the corresponding value. ... *)

val seek_in : in_channel -> int -> unit
(** [seek_in chan pos] sets the current reading position to [pos]
   for channel [chan]. ... *)

val pos_in : in_channel -> int
(** Return the current reading position for the given channel. ... *)

val in_channel_length : in_channel -> int
(** Return the size (number of characters) of the regular file
    on which the given channel is opened. ... *)

val close_in : in_channel -> unit
(** Close the given channel. ... *)

val close_in_noerr : in_channel -> unit
(** Same as [close_in], but ignore all errors. *)

val set_binary_mode_in : in_channel -> bool -> unit
(** [set_binary_mode_in ic true] sets the channel [ic] to binary
   mode: no translations take place during input. ... *)

val __LOC__ : string
(** [__LOC__] returns the location at which this expression appears in
    the file currently being parsed by the compiler, with the standard
    error format of OCaml: "File %S, line %d, characters %d-%d". *)

val __FILE__ : string
(** [__FILE__] returns the name of the file currently being
    parsed by the compiler. *)

val __LOC_OF__ : 'a -> string * 'a
(** [__LOC_OF__ expr] returns a pair [(loc, expr)] where [loc] is the
    location of [expr] in the file currently being parsed by the
    compiler, with the standard error format of OCaml: "File %S, line
    %d, characters %d-%d". *)

Full documentation is at Stdlib - Input/output

Footnote 3: StdBinds
StdBinds is not a module provided by Stdlib. It contains modules that can be opened and used with the ocaml-monadic PPX macros:
(** Open to use [let%bind] with {!Result}. *)
module BindsResult = struct
  exception ResultFailed of string option

  let bind = Result.bind
  let map = Result.map
  let return = Result.ok
  let zero ?msg () = raise (ResultFailed msg)
end

(** Open to use [let%bind] with {!Option}. *)
module BindsOption = struct
  let bind = Stdlib.Option.bind
  let map = Stdlib.Option.map
  let return = Stdlib.Option.some
  let zero () = None
end

The macro documentation is at ocaml-monadic extensions
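
Here is a sketch of how such a module is meant to be used. With the ocaml-monadic PPX, let%bind x = e in body is expected to expand to roughly bind e (fun x -> body), using whichever bind is in scope. The code below writes that expansion by hand so that it runs without the PPX. The parse_age, check_adult and validate functions are hypothetical, and the module is a trimmed copy of the BindsResult module shown above.

```ocaml
(* Trimmed copy of the BindsResult module shown above. *)
module BindsResult = struct
  let bind = Result.bind
  let map = Result.map
  let return = Result.ok
end

(* Hypothetical helpers for the example. *)
let parse_age (s : string) : (int, string) result =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error ("not a number: " ^ s)

let check_adult (age : int) : (int, string) result =
  if age >= 18 then Ok age else Error "too young"

(* Hand-written equivalent of:
     let%bind age = parse_age s in
     let%bind adult = check_adult age in
     return adult
   The first Error short-circuits the rest of the chain. *)
let validate (s : string) : (int, string) result =
  let open BindsResult in
  bind (parse_age s) (fun age ->
      bind (check_adult age) (fun adult -> return adult))

let () =
  match validate "42" with
  | Ok age -> Printf.printf "accepted age %d\n" age
  | Error msg -> Printf.printf "rejected: %s\n" msg
```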

DkCoder Runtime Libraries

shexp.process

This is a library for creating possibly parallelized pipelines that represent shell scripts.

Docs: shexp README

Warning!
This library only works on 64-bit machines.

ocaml-monadic

This small macro (PPX) library provides the let%bind, if%bind and match%bind forms. These are similar in design to Jane Street’s ppx_let macros but work on 32-bit architectures and should make backwards compatibility on OCaml 4 easier to support.

This library also offers the ;%bind, let%orzero and [%guard] forms.

Docs: ocaml-monadic README

Limitations imposed by cross-platform support

DkSDK is a family of tools that supports cross-platform development, even on embedded devices. This cross-platform support places constraints on how you structure your DkCoder projects.

  • The FAT32 filesystem is found on older Windows disks, CD drives and on embedded devices. FAT32 only supports filenames up to 255 characters. DkCoder limits the paths <Libraryowner><Libraryproject>_<Libraryunit>/<Modulename>/.../<Modulename>.mli to 240 characters. Even when the FAT32 filesystem entirely disappears from use this 240 character limit may remain since the limit encourages modular, single-focused libraries.
  • The FAT16 filesystem is the only filesystem in moderate use today that is not supported by DkCoder. FAT16 might still be used on your older USB thumbdrives and some very old embedded devices. Since FAT16 directory names are limited to 8 uppercase characters, and the minimum <Libraryowner><Libraryproject>_<Libraryunit> name is 7 characters, it is not reasonable to support FAT16.

Release Notes - Early Technical Limitations

  1. On a first install on Windows, running the ./dk DkRun_V2_2.Run -- DkHelloScript_Std.Y33Article --serve example can fail with:

    [00:00:22.564] [SUCCESS] (3/18) reproducibility or quick typing
    [00:00:22.564] Starting test: focus on what you run
    [00:00:22.566] [.\dk.cmd#3] '.\dk.cmd' DkRun_V2_1.Run '--generator=dune' -- DkHelloScript_Std.AndHelloAgain
    [ERROR][2024-04-29T00:00:43Z] /Run/
           Failed to run
             C:\Users\WDAGUtilityAccount\DkHelloScript\src\DkHelloScript_Std\Y33Article.ml
             (DkHelloScript_Std.Y33Article). Code fa10b83d.
    
           Problem: The DkHelloScript_Std.Y33Article script exited with
             STATUS_ACCESS_VIOLATION (0xC0000005) - The instruction at 0x%08lx
           referenced memory at 0x%08lx. The memory could not be %s.
           Solution: Scroll up to see why.

    If you rerun it, it succeeds.

  2. The ./dk script delegates to the build tool CMake first, and then to OCaml. That design is historical: DkCoder was first designed for C packages. The current design means:

    1. ./dk has the performance overhead of spawning CMake on startup.
    2. Ctrl-C does not cleanly kill all of the subprocesses on Windows. You may need to run taskkill /f /im ocamlrunx.exe to kill these hung processes.
    3. ./dk has command line arguments interpreted by CMake before being interpreted by your script. In particular, nested double-quotes like ./dk ... "here is a nested `" double quote" in PowerShell won't work.
  3. The GUI code was very recently ported to Windows. It has not had memory auditing so segfaults may occur. Playing sounds is also known to hang. Do not use the GUI or sound features in production code until we announce that these issues have been fixed.

Security

Advanced Topic
This section is intended for security engineers.

The primary security goal of DkCoder is to allow the script user to assert, before a script is run, that the script does not access resources like files, the console, sound, etc. Not all scripts can be asserted, but if DkCoder asserts that a script does not have access to files, for example, then the script does not have access to files under some light assumptions. These assumptions are detailed in this section.

Access to Source Code
Security engineers, auditors and certifiers are granted free audit access to DkCoder and other DkSDK source code. The requirements are that you have worn the security hat for the majority of the past three months, have a publicly discoverable telephone switchboard, and reside in a country having no United States export controls. Contact information to request an audit account is available at the bottom of this article.
Status
# | ETA | What is Implemented
Alpha | Now | Split the OCaml Standard Library by resource type
Beta | TBD | Implementation of the Technical Requirements below
After GA | TBD | Allow users to write AWS Cedar policies to restrict access to a) resources used by running scripts and b) which scripts can be downloaded and c) data submitted to DkCoder services.

Assumption A1 - A superset of modules used by a script can be obtained

This assumption is based on the OCaml 4.14 environment model Env.t defined in typing/env.mli. These Env.t have values, modules, types, module types, classes and class types produced during compilation. Only add operations like add_value are present, so by the end of compilation we have a superset of statically compiled modules used by the compiled script.

However,

  1. Some of the modules may be functors; that is, new modules can be created at runtime of a specified module type. That leads to a technical requirement:

    If a script uses a functor module, transitively or not, then the script is unassertable (aka. “tainted”).

How it will be implemented: The codept analysis tool parses the source and gathers module information, including whether a module is a functor. Using the --log-level DEBUG option shows the modules used by the scripts.
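
To illustrate the concern with functors, here is a small plain-OCaml sketch. All names in it (SINK, Null_sink, Console_sink, Make_logger, logger_sink_name) are hypothetical and unrelated to DkCoder's code; it shows one way a module can come into existence through functor application rather than as a named file in the source tree.

```ocaml
module type SINK = sig
  val name : string
  val write : string -> unit
end

(* Discards everything: touches no console or file. *)
module Null_sink : SINK = struct
  let name = "null"
  let write _ = ()
end

(* Writes to the console, which is a resource. *)
module Console_sink : SINK = struct
  let name = "console"
  let write s = print_endline s
end

(* A functor: builds a new logger module from any SINK. *)
module Make_logger (S : SINK) = struct
  let sink_name = S.name
  let log msg = S.write ("[log] " ^ msg)
end

(* Logger is not a named module in the source tree. It is created by
   applying the functor, here to a module that is picked at runtime. *)
let logger_sink_name (use_console : bool) : string =
  let sink =
    if use_console then (module Console_sink : SINK)
    else (module Null_sink : SINK)
  in
  let module S = (val sink : SINK) in
  let module Logger = Make_logger (S) in
  Logger.log "hello";
  Logger.sink_name

let () = print_endline (logger_sink_name false)
```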

Assumption A2 - Access to resources can be gated through modules

By definition, resources are either values like input channels or are accessed through values like function calls (including external C function calls).

In OCaml, values are present in a toplevel Env.t or in a (possibly nested) module and/or class.

  1. Since there is no tool like codept to analyze OCaml classes, there is a technical requirement:

    If a script uses a class, transitively or not, then the script is unassertable (aka. “tainted”).

  2. Since the standard OCaml toplevel environment is the contents of the Stdlib module (ie. there is an implicit open Stdlib at the top of each module), and Stdlib contains toplevel values like print_endline, there is a technical requirement:

    All Stdlib toplevel values are annotated with OCaml alerts that fail the compilation when the toplevel values are used.

  3. Since alerts can be locally overridden, there is a technical requirement:

    If a script uses a local alert override, transitively or not, then the script is unassertable (aka. “tainted”).

To repeat: In OCaml, values are present in a toplevel Env.t or in a (possibly nested) module and/or class. What remains after implementing the technical requirements above is that resource values necessary for safety assertions are only present in (possibly nested) modules.

How it will be implemented: A PPX can be run that injects a new empty module module Dk__Tainted = struct end whenever a class or local alert override is used. The module analysis by codept will discover the Dk__Tainted, and the analysis phase can fail.
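
The alert mechanism mentioned above can be sketched in plain OCaml. The module name Gated and the alert name console are made up for this example and are not DkCoder's actual annotations. The sketch shows an alert attached to a value and a local override, which is the pattern that would have to taint a script.

```ocaml
(* Hypothetical gated module: the alert name "console" and the module
   name Gated are assumptions for this sketch. *)
module Gated : sig
  val shout : string -> string
  [@@alert console "Console-related value; access is gated."]
end = struct
  let shout s = String.uppercase_ascii s ^ "!"
end

(* A plain use such as [Gated.shout "hi"] triggers the "console" alert,
   which a build can be configured to treat as a compile error.
   The local override below is expected to silence that alert for this
   one expression, which is why overrides cannot be trusted. *)
let loud : string = (Gated.shout [@alert "-console"]) "hi"

let () = print_endline loud
```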

Assumption A3 - Violating the correctness of OCaml type checking can be detected at compile time

The type-safety violations fall into these categories:

  1. Using an external C function declaration
  2. Using the Obj module
  3. Using the Marshal module

This is a weak assumption as it requires expert knowledge of OCaml.

Since the presence of modules can be trivially detected with assumption A1, the technical requirement is:

If a script uses an external C function declaration, transitively or not, then the script is unassertable (aka. “tainted”).

How it will be implemented: Any external declaration can be caught by a PPX which injects a module Dk__Tainted = struct end. The presence of any of Dk__Tainted, Obj or Marshal during the codept analysis will taint the script.
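
As a small illustration of why an external declaration matters, the sketch below declares a compiler primitive at a type the type checker cannot verify. The name reinterpret is hypothetical. The example is deliberately harmless (it relies on the integer 1 and the boolean true sharing a runtime representation), but the same declaration could be used to forge values of any type.

```ocaml
(* "%identity" is a compiler primitive that returns its argument
   unchanged. Declaring it at type 'a -> 'b makes a claim that the
   type checker accepts without being able to verify it. *)
external reinterpret : 'a -> 'b = "%identity"

(* Accepted by the type checker even though no conversion happens:
   the integer 1 is simply treated as a bool. *)
let as_bool : bool = reinterpret 1

let () = Printf.printf "%b\n" as_bool
```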

Getting in touch

I'm a recovering Luddite when it comes to social media; I deactivated Facebook and stopped using LinkedIn years ago. That will change. In the meantime, it is best to post OCaml questions on discuss.ocaml.org. And you can follow or DM me on the low-volume @Diskuv on 𝕏.