Codebase Overview

The Mina Protocol is written in OCaml, a statically typed, functional programming language.

For OCaml beginners, it may help to skim through Real World OCaml for a good introduction to the language, and some deep dives into specific topics if you're interested. Assuming basic familiarity with OCaml, here's some more info on how it is used in the Mina Protocol.

Code Structure

See the Repository Structure page.

Compilation

The OCaml compiler can target bytecode or native code. Because the code statically links with some libraries, it cannot be compiled to bytecode. The code also doesn't play well with the REPL. Dune, the build system, has a concept of folders that represent modules and files that belong to a module. If the folder has a file with the same name, that file is essentially equivalent to index.js in Node.

Interface files in OCaml with the .mli extension contain type signatures and structures for a module. The corresponding implementation must have the same file name with the .ml extension. Only the things defined in the interface are available from other modules. If an interface file does not exist for a module, everything is exposed by default. The same convention and rules apply to Reason files with the .rei and .re extensions.
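The hiding behavior of an interface file can be tried out in a single file, because a module constrained by a signature behaves much like an .ml file paired with an .mli file. The following is a minimal sketch; the Counter module and its members are hypothetical names chosen for illustration, not part of the Mina codebase.

```ocaml
(* The signature plays the role of counter.mli,
   the structure plays the role of counter.ml. *)
module Counter : sig
  type t (* abstract: callers cannot see that it is an int *)

  val zero : t
  val incr : t -> t
  val to_int : t -> int
end = struct
  type t = int

  let zero = 0

  (* Not listed in the signature, so invisible outside the module. *)
  let step = 1
  let incr n = n + step
  let to_int n = n
end

let two = Counter.(to_int (incr (incr zero)))

(* Referring to Counter.step here would be rejected by the compiler,
   because the signature does not expose it. *)
```

Removing the signature (everything between the colon and the equals sign) corresponds to deleting the .mli file: step and the concrete type of t would then be visible to callers.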

For the linking step, the toolchain uses the system linker (ld) under the hood. You can also pass optimization flags such as -O3, which take effect when the compiler is built with flambda. For debugging, you can use gdb.

Open-source Library Documentation

There are multiple libraries for OCaml. One challenge with learning OCaml is locating and reading documentation for the various libraries. For example, the Jane Street Core library has the following structure:

            Base
             |
        Core_kernel -> Async_kernel
             |              |
    Unix <- Core    ->    Async

In general, the source code of an installed library is not available, so follow these tips to find the documentation.

Core is a standard library overlay and a popular alternative to the OCaml standard library. To review the docs for Core, first go to the Core codebase on GitHub and locate the documentation for the module you need. If you don't see the module you're looking for, look in Core_kernel next. If that fails, look in Base. To use the find capability in GitHub, expand the sections first.

Most documentation is published in HTML. However, you can also use an IDE to find and review code and documentation, for example through Merlin, an editor service that provides modern IDE features for OCaml. Merlin provides type hints and code navigation, like Go To Definition. Note that Merlin works only if your code compiles.

OPAM, the source-based package manager for OCaml, usually ships documentation with libraries that you can access using Merlin.

Extensions

OCaml uses the ppx meta-programming system that generates code at compile time.

For example, consider this ppx extension on a type signature:

    type t =
      | A
      | B [@to_yojson f]
    [@@deriving yojson]

The single @ attaches an extension to a single node, such as one expression, type, or constructor. The @@ attaches the extension to the enclosing item, here the whole type declaration.
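To get a feel for what a deriving extension saves you from writing, here is a hand-written sketch of the kind of converters it could generate for a two-constructor type. Everything here is an assumption made for illustration: the json type is a local stand-in rather than the Yojson type, to_json and of_json are hypothetical names, and the encoding (a one-element list holding the constructor name) is not guaranteed to match what the real plugin emits.

```ocaml
(* Local stand-in for a JSON value; hypothetical, not Yojson's type. *)
type json =
  [ `String of string
  | `List of json list ]

type t =
  | A
  | B

(* Hand-written converter: one match arm per constructor.
   A deriver writes this boilerplate at compile time. *)
let to_json : t -> json = function
  | A -> `List [ `String "A" ]
  | B -> `List [ `String "B" ]

(* The inverse direction reports unknown input as an error. *)
let of_json : json -> (t, string) result = function
  | `List [ `String "A" ] -> Ok A
  | `List [ `String "B" ] -> Ok B
  | _ -> Error "t: unexpected JSON"
```

Adding a constructor to t means touching both functions by hand, which is exactly the maintenance cost that the attribute on the type declaration removes.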

For an extension on a structure or a value, use the following syntax.

% returns a value or expression.

%% injects a statement (a top-level item).

    let x = [% ...]
    [%% ...]
    let y =
      let%... z = ... in
      match%... ... with
      | ...
      | ... in

    [%% if x]
    let x = y
    [%% else]
    let x = z
    [%% endif]

TL;DR: Anytime you see [@ ...], [@@ ...], [% ...], or [%% ...], it's an OCaml language extension.

Monads

Monads are a functional design pattern that lets you write computations in a very generic way and removes the need for boilerplate code. Monads allow us to abstract up to a higher level of computation, while taking care of all the glue for us. In other words, monads are programmable semicolons.

For example, consider the following imperative function:

    function example(x) {
      if ( x == null ) return null;
      x = x + 1;
      if ( !isEven(x) ) return null;
      return x;
    }

This can be expressed similarly in functional programming with a monad, here the option type:

    type 'a option =
      | None
      | Some of 'a

    let return x = Some x

    (* Bind infix operation, applies f to the value inside m, if any *)
    let (>>=) m f =
      match m with
      | Some x -> f x
      | None -> None

    (**
      Map infix operation
      Essentially the same as bind, but f returns a plain value,
      which is wrapped again with return.
    **)
    let (>>|) m f = m >>= (fun x -> return (f x))

Now we can use these primitives to reimplement the imperative example above as follows.

    let add_one = ((+) 1)

    (* Keeps even numbers, turns odd numbers into None *)
    let bind_even : int -> int option =
      fun x -> if x mod 2 = 0 then Some x else None

    let example x = x >>| add_one >>= bind_even
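A self-contained version of the option example, using the built-in option type, makes it easy to check how the None case short-circuits. This is a sketch for experimentation; keep_even is a hypothetical name standing in for the even check.

```ocaml
let return x = Some x

(* Bind: run f on the wrapped value, or stop early on None. *)
let ( >>= ) m f =
  match m with
  | Some x -> f x
  | None -> None

(* Map: like bind, but f returns a plain value that is re-wrapped. *)
let ( >>| ) m f = m >>= fun x -> return (f x)

let add_one = ( + ) 1

(* Hypothetical helper: keep even numbers, reject odd ones. *)
let keep_even x = if x mod 2 = 0 then Some x else None

let example x = x >>| add_one >>= keep_even

let () =
  (* 3 + 1 = 4, which is even, so the value survives. *)
  assert (example (Some 3) = Some 4);
  (* 2 + 1 = 3, which is odd, so the second step yields None. *)
  assert (example (Some 2) = None);
  (* A missing input skips both steps. *)
  assert (example None = None)
```

Note that no step contains an explicit null check: the operators carry that logic, which is the glue the pattern is meant to absorb.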

OCaml has a `ppx` (Jane Street's `ppx_let`) that makes monadic code much easier to follow, using the let syntax.

    let%bind x = y in
    f x

    (* This is equivalent to the following *)
    y >>= (fun x -> f x)

Essentially, this syntax takes the expression on the right-hand side of the let binding and places it on the left of the bind infix call, then turns the bound name and the body that follows into a lambda.
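The same rewriting can be tried without any ppx by using binding operators, a built-in feature since OCaml 4.08. This swaps in a different mechanism from let%bind, but the desugaring has the same shape. The sketch below assumes the option monad; keep_even is a hypothetical helper.

```ocaml
(* Define let* as bind for options. Option.bind is in the stdlib. *)
let ( let* ) = Option.bind

(* Hypothetical helper: keep even numbers, reject odd ones. *)
let keep_even x = if x mod 2 = 0 then Some x else None

(* Sugared form: reads like straight-line code. *)
let example (m : int option) : int option =
  let* x = m in
  keep_even (x + 1)

(* What the compiler turns it into: the bound expression goes to bind,
   and the rest of the body becomes a lambda over x. *)
let example_desugared (m : int option) : int option =
  Option.bind m (fun x -> keep_even (x + 1))
```

Both functions behave identically; the sugared form simply avoids the nesting that builds up when several binds follow each other.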

Async

Under the hood, Async uses monads. However, ivars are the low-level Async primitive. An 'a Ivar.t is essentially a write-once cell that starts empty and can be filled only once. When a computation produces its value, that value fills the ivar. The rest of the syntactic sugar takes ivars and passes them through the Deferred monad.
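The fill-once idea can be modeled in a few lines of plain OCaml. The following is a toy, single-threaded sketch and not Async's implementation: Toy_ivar and all of its functions are hypothetical names, and callbacks run immediately on fill, whereas the real library hands them to its scheduler.

```ocaml
module Toy_ivar : sig
  type 'a t

  val create : unit -> 'a t
  val fill : 'a t -> 'a -> unit (* raises if already filled *)
  val upon : 'a t -> ('a -> unit) -> unit
  val peek : 'a t -> 'a option
end = struct
  (* Either still empty with waiting callbacks, or holding its value. *)
  type 'a state =
    | Empty of ('a -> unit) list
    | Full of 'a

  type 'a t = 'a state ref

  let create () = ref (Empty [])

  let fill t v =
    match !t with
    | Full _ -> invalid_arg "Toy_ivar.fill: already filled"
    | Empty handlers ->
      t := Full v;
      (* Run waiting callbacks in registration order. *)
      List.iter (fun h -> h v) (List.rev handlers)

  let upon t h =
    match !t with
    | Full v -> h v
    | Empty handlers -> t := Empty (h :: handlers)

  let peek t =
    match !t with
    | Full v -> Some v
    | Empty _ -> None
end

(* Monadic bind: the result is filled once t and then (f a) are filled. *)
let bind (t : 'a Toy_ivar.t) (f : 'a -> 'b Toy_ivar.t) : 'b Toy_ivar.t =
  let result = Toy_ivar.create () in
  Toy_ivar.upon t (fun a ->
      Toy_ivar.upon (f a) (fun b -> Toy_ivar.fill result b));
  result

(* Wrap an already-known value. *)
let return v =
  let t = Toy_ivar.create () in
  Toy_ivar.fill t v;
  t
```

With this model, a chain such as bind iv (fun x -> return (x * 2)) stays empty until iv is filled, which mirrors how code after a Deferred binding only runs once the awaited value exists.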

A yield function exists, but avoid using it since it has some weird behavior. Instead, operate on the wrapped values that happen in between Deferred bindings.

Custom Helpers

Use these custom helpers:

  • Strict_pipe - wraps pipe and gives certain guarantees around how it can be used.
  • Broadcast_pipe - allows a single pipe to be hooked up to multiple downstream pipes.

Do not use:

  • Async.Pipe - operates essentially like a buffer and is unsafe because it has unlimited buffering by default (so memory use can grow without bound) and has some confusing behavior around which end of the pipe is responsible for what.
  • Linear_pipe - deprecated in favor of Strict_pipe and Broadcast_pipe.
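To illustrate why unlimited buffering is a concern, here is a toy bounded buffer with an explicit overflow policy, written against the standard library only. It is a sketch of the general idea of capping a buffer and deciding up front what happens when it is full. It is not the API of Strict_pipe or Async.Pipe, and every name in it is hypothetical.

```ocaml
(* What to do when a write arrives and the buffer is full. *)
type overflow =
  | Drop_oldest (* discard the oldest item to make room *)
  | Reject      (* refuse the new item *)

type 'a bounded =
  { capacity : int
  ; policy : overflow
  ; items : 'a Queue.t
  }

let create ~capacity ~policy =
  if capacity <= 0 then invalid_arg "create: capacity must be positive";
  { capacity; policy; items = Queue.create () }

(* Returns true if the item was stored, false if it was rejected. *)
let write t x =
  if Queue.length t.items < t.capacity then (
    Queue.add x t.items;
    true)
  else
    match t.policy with
    | Reject -> false
    | Drop_oldest ->
      ignore (Queue.pop t.items);
      Queue.add x t.items;
      true

(* Returns the oldest stored item, if any. *)
let read t = Queue.take_opt t.items

let size t = Queue.length t.items
```

Because size can never exceed capacity, a slow reader costs at most a fixed amount of memory, and the writer learns immediately whether its item was kept.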