io7m-jparasol 0.11.3 Documentation
Package Information
Orientation
Overview
The io7m-jparasol package implements a minimalist pure functional shading language.
Installation
Source compilation
The project can be compiled and installed with Maven:
$ mvn -C clean install
Maven
Regular releases are made to the Central Repository, so it's possible to use the io7m-jparasol package in your projects with the following Maven dependency:
<dependency>
  <groupId>com.io7m.jparasol</groupId>
  <artifactId>io7m-jparasol-core</artifactId>
  <version>0.11.3</version>
</dependency>
All io7m.com packages use Semantic Versioning [0], which implies that it is always safe to use version ranges with an exclusive upper bound equal to the next major version - the API of the package will not change in a backwards-incompatible manner before the next major version.
Platform Specific Issues
There are currently no known platform-specific issues.
License
All files distributed with the io7m-jparasol package are placed under the following license:
Copyright © 2014 <code@io7m.com> http://io7m.com

Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
        
Language Tutorial
Prerequisites
The parasol compiler is implemented as a set of APIs with a simple command-line frontend. The frontend is packaged into a convenient executable jar containing all of the compiler's dependencies. The jar file can be executed in the usual manner:
$ java -jar io7m-jparasol-compiler-frontend-0.11.3-jparasol-c.jar
parasol-c: [options] --compile-one output shader file0 [file1 ... fileN]
        or [options] --compile-batch output batch-list source-list
        or [options] --check file0 [file1 ... fileN]
        or [options] --show-glsl-versions
        or [options] --version

  Where: output           is a directory (unless --zip is specified) that will be populated with GLSL shaders
         shader           is the fully-qualified name of a shading program
         batch-list       is a file containing (output , ':' , shader) tuples, separated by newlines
         source-list      is a file containing a set of filenames, separated by newlines
         file[0 .. N]     is a series of filenames containing source code

     --check                            Parse and type-check all source files, but do not produce GLSL source
     --compact                          Enable compaction (eliminates duplicate source files)
     --compile-batch                    Produce multiple GLSL programs from a set of sources
     --compile-one                      Compile a specific shader program to GLSL source
  -h,--help                             Show this help message
     --log-properties <properties>      Configure logging based on the given property file
     --log-stack-traces                 Enable logging of exception stack traces
     --require-glsl <version-set>       Require GLSL source code for the given set of GLSL versions, failing if any of
                                        the versions cannot be satisfied
     --require-glsl-es <version-set>    Require GLSL ES source code for the given set of GLSL ES versions, failing if
                                        any of the versions cannot be satisfied
     --show-glsl-versions               Show the available GLSL versions
     --threads <count>                  Set the number of threads to use during code generation
     --version                          Display compiler version
     --zip                              Write shaders to a zip archive instead of a directory

   Where: version-set     := version-segment ( ',' version-segment )*
          version-segment := version-exact | version-range
          version-exact   := integer
          version-range   := ('(' | '[') integer? ',' integer? (')' | ']')

     Where '[' and ']' denote inclusive bounds, and '(' ')' denote exclusive bounds.

     Example: 130 selects version 130
     Example: [120, 150] selects versions 120 to 150
     Example: (120, 150] selects versions 130 to 150
     Example: 120,[140,330],440 selects versions 120, 140, 150, 330, and 440

  Version: 0.11.3
The name of the io7m-jparasol-compiler-frontend-0.11.3-jparasol-c.jar file is actually selected by Maven. For the sake of brevity, it is assumed that the user has renamed this file to jparasol-c.jar for this tutorial.
The programmer is expected to be familiar enough with GLSL to dislike the language.
Why?
Versioning
GLSL is a versioning and compatibility nightmare: There are, at the time of writing, fourteen different versions of GLSL in use. The versions of OpenGL and GLSL available are tied to hardware and to OpenGL drivers. Different versions of GLSL differ wildly in syntax and in the names of functions defined in the standard library. The same program may have to be rewritten five or six times to deal with trivial syntax differences between versions. The standard recommendation is for programs to explicitly mark the version of GLSL for which they're intended by including a #version directive at the start of the file. Without this directive, many GLSL implementations fall back to using the oldest available version, which then typically means that the compiler rejects the program as invalid. With a #version directive included, the programmer is asserting that they wish to use a specific version which, of course, may not be available. Certain versions of OpenGL (such as OpenGL ES 2) place arbitrary restrictions on programs (such as not being able to declare multiple fragment shader outputs).
Most programmers attempt to work around these problems by layering on hacks and preprocessor macros. This is obviously unacceptable to programmers who actually care about correctness.
Correctness
GLSL is a nightmare from the perspective of correctness: Programs are parsed and type-checked at runtime. The committee that mismanages OpenGL decided that all OpenGL drivers should implement their own compilers. Apart from the inevitable correctness issues that stem from this requirement, this also means that errors that should be entirely statically detectable are signalled at the latest possible stage (run time). Systems that use thousands of GLSL shaders are required to do obscene amounts of testing to ensure that all of their shaders are valid GLSL.
The language, on one hand, requires explicit type conversions. This is generally beneficial because there are none of the magical implicit type conversions that appear in somewhat more weakly-typed languages. The language then works hard to eliminate the beneficial aspects by overloading all functions and operators, so that the programmer really has no idea which particular overloaded variant is being used [1].
The language has no module system and simply dumps its entire standard library into the same namespace as the programmer's code. The programmer must explicitly prefix the names of any defined functions or variables in order to avoid potential collisions. The OpenGL committee are free to introduce new name collisions at any time.
There are also very limited facilities for inspecting programs that have been compiled. For example, the GLSL compiler is free to remove any program inputs or parameters that it has deemed to be unused (and has therefore optimized out). The programmer using the shading program wants to be able to assign values to inputs by name. Unfortunately, without the programmer manually parsing the original program, there is no way to tell the difference between an input or parameter that has been optimized out, and an input or parameter that never existed in the first place. There is no way to distinguish between "The compiler removed the parameter" and "I got the name of the parameter wrong". The GLSL compiler obviously has this information, but it is not made available to the programmer. In order to really guarantee correctness, it's necessary for the programmer to reimplement parts of the GLSL compiler!
Modularity and re-use
A GLSL shader is represented by a single file. Code that is re-used between shaders must either be copied and pasted, or must be inserted automatically via some error-prone macro or preprocessor system. The GLSL language does not support #include directives, so the programmer is forced to build their own system to manage pieces of shaders.
What?
Parasol
The Parasol language is an aggressively minimalist pure-functional shading language. The intention is to provide a language with simple and predictable semantics that can then be compiled to all possible versions of GLSL, eliminating all of the versioning concerns previously expressed. On versions of GLSL that do not provide certain functions, the compiler silently provides emulations [2]. Essentially, the programmer provides a Parasol program and a range of GLSL versions upon which the program is expected to run. The compiler produces GLSL for all of the possible versions and will raise errors if a valid program really cannot be produced for a particular GLSL version [3].
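The trunc emulation mentioned in footnote [2] can be checked numerically. The following is a minimal sketch in Python rather than the GLSL that the compiler actually emits; the name emulated_trunc is hypothetical, and the identity used is the one given in the footnote, sign(x) * floor(abs(x)).

```python
import math

def emulated_trunc(x):
    """Truncate x towards zero using only sign, floor, and abs."""
    # sign(x) is -1, 0, or 1, mirroring the GLSL sign function.
    sign = (x > 0) - (x < 0)
    return sign * math.floor(abs(x))

# Compare the emulation against the built-in truncation.
for x in [-2.7, -1.0, -0.5, 0.0, 0.5, 1.0, 2.7]:
    assert emulated_trunc(x) == math.trunc(x)
```

The point of the sketch is only that the emulation relies on functions available in every GLSL version, so a call to Float.truncate does not need to care whether the target provides trunc natively.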
The Parasol language is both pure-functional and total. That is, programs do not have side effects and are guaranteed to terminate. The language statically rejects recursive and mutually recursive terms, and does not provide looping constructs or arrays.
The Parasol language categorically rejects operator and function overloading [4]; the programmer always knows exactly which function is being applied on sight. This eliminates the correctness issues caused by pervasive overloading.
The language provides a simple non-hierarchical module and package system for controlling the namespace. Parasol programs can be developed as sets of re-usable libraries without any fear of name collisions. The language works from the assumption that, although shader programs are likely to be short and linear, there are also likely to be a large number of them, with significant amounts of duplicated code. The Parasol module system allows code to be imported and re-used safely, with full type checking and without requiring a complicated and error-prone preprocessor.
The jparasol compiler is a strictly offline compiler developed in Java. The compiler takes Parasol programs as input, and produces GLSL programs (and additional metadata) as output. The produced GLSL can then be loaded as normal in any OpenGL program and has no dependencies on Parasol, the jparasol compiler, or anything else. The additional metadata produced as output describes various properties of the program. This means that programmers are not required to parse Parasol programs to, for example, determine all of the declared inputs, parameters, and outputs. The metadata is an optional component; programmers are not required to use it in order to make use of compiled GLSL programs.
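As a rough illustration of why the metadata is useful, the sketch below classifies a parameter name given two sets of names. It assumes that the declared names have already been read from the metadata and that the active names have been queried from the OpenGL implementation after linking; the function name, both arguments, and the result strings are hypothetical, and the metadata format itself is not shown here.

```python
def classify_parameter(name, declared, active):
    """Classify a parameter name.

    declared: names declared in the Parasol program (assumed to come
              from the metadata produced by the compiler)
    active:   names the GLSL implementation reports as active
    """
    if name in active:
        return "active"
    if name in declared:
        # Declared in the program, but removed by the GLSL compiler.
        return "optimized out"
    # Not declared anywhere: most likely a misspelled name.
    return "never declared"

declared = {"m_modelview", "m_projection"}
active = {"m_projection"}
print(classify_parameter("m_modelview", declared, active))
print(classify_parameter("m_modelveiw", declared, active))
```

With GLSL alone, the second and third cases are indistinguishable, because only the set of active names is available.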
Limitations
The language considers portability and correctness to be more important than exposing the latest flashy features of GLSL. The language therefore exposes features roughly analogous to version 1.40 of GLSL in an attempt to provide the subset of OpenGL and GLSL that will work on as many implementations as possible. The language does not support geometry shaders or tessellation shaders. The language does not support double-precision floating point. These features will likely be introduced as time moves on and older versions of OpenGL disappear into history.
As described, the language is currently total, meaning that all programs are guaranteed to terminate. This is both a desirable property (because non-terminating shader programs are extremely bad news on some OpenGL implementations), and a severe limitation (because programs that require loops cannot be expressed in Parasol at all). In practice, this has not turned out to be much of a problem [5]. The shading programs found in most game engines tend to be linear in nature and do not require iterating over arrays of values.
How?
Parasol ↔ GLSL
The Parasol language takes a view of the world similar to that of a typical GLSL program. That is, a program consists of exactly one vertex shader and exactly one fragment shader.
In GLSL, the vertex shader is responsible for assigning values to a set of declared outputs, and additionally assigning a clip-space vertex position to a built-in variable named gl_Position.
In Parasol, a vertex shader declares a set of inputs, outputs, and parameters. One of the outputs is marked as being the vertex output, to which the programmer is expected to assign a clip-space vertex position. There are no built-in names. Input, output, and parameter names in Parasol programs will appear exactly as written in the resulting GLSL and are therefore subject to the same naming restrictions (and this is checked by the compiler).
In early versions of GLSL, the fragment shader was responsible for calculating a single RGBA colour vector and assigning it to a built-in variable called gl_FragColor. In later versions of OpenGL, it became possible to have multiple colour attachments to framebuffers, and therefore it became necessary for the fragment shader to assign multiple outputs. The gl_FragColor variable was essentially replaced with the gl_FragData variable, which is of an array type. In modern OpenGL, the gl_FragColor and gl_FragData variables have been removed, and programmers are required to explicitly declare named fragment shader outputs.
In Parasol, a fragment shader declares a set of inputs, outputs, and parameters. All of the outputs must be named and numbered. This allows the compiler to map numbered outputs to the gl_FragData array in early versions of GLSL, and to correctly declare named outputs in modern GLSL. The compiler checks that output numbers start at 0 and increase monotonically.
As stated previously, in GLSL a vertex or fragment shader is typically represented by a single file and execution begins at a function called main.
The Parasol language implements a simple module system to promote re-use of code, and all functions, types, and shaders must be declared inside of modules. Vertex and fragment shaders are declared inside of modules and then aggregated into programs. When it comes time to produce GLSL, the programmer provides the fully-qualified name of a declared program and the compiler produces a set of GLSL shaders based on the selection.
Hello World GLSL
A simple vertex shader in GLSL 1.40 that transforms the given object-space coordinates to clip-space by multiplying them by the given modelview and projection matrices:
#version 140

in vec4 v_position;
uniform mat4 m_modelview;
uniform mat4 m_projection;

void
main ()
{
  gl_Position = m_projection * m_modelview * v_position;
}
A simple fragment shader in GLSL 1.40 that declares a colour output and assigns a constant red colour to it:
#version 140

out vec4 out_rgba;

void
main ()
{
  out_rgba = vec4 (1.0, 0.0, 0.0, 1.0);
}
The programmer then compiles and loads each shader file from the OpenGL API, links them, and then uses them.
Hello World Parasol
As stated, all terms, types, and shaders must be declared in modules. For this tutorial, a HelloWorld module suffices, declared in package com.io7m.examples.
The previous GLSL shaders translated to Parasol become:
package com.io7m.examples;

module HelloWorld is

  import com.io7m.parasol.Matrix4x4f as M4;

  shader vertex v is
    in         v_position   : vector_4f;
    parameter  m_modelview  : matrix_4x4f;
    parameter  m_projection : matrix_4x4f;
    out vertex o            : vector_4f;
  with
    value position_clip =
      M4.multiply_vector (M4.multiply (m_projection, m_modelview), v_position);
  as
    out o = position_clip;
  end;

  shader fragment f is
    out o : vector_4f as 0;
  with
    value red =
      new vector_4f (1.0, 0.0, 0.0, 1.0);
  as
    out o = red;
  end;

  shader program p is
    vertex v;
    fragment f;
  end;

end;
The module declares a vertex shader named v, a fragment shader named f, and combines the two into a program named p. Any number of shaders and programs can be declared in a single module (but their names must obviously differ).
The vertex shader declares an input v_position, two parameters m_modelview and m_projection, and an output named o which is marked as the main vertex output with the vertex keyword. The shader performs the same multiplication as the GLSL program, assigning the result to a local variable named position_clip. Variables in Parasol are immutable and the programmer is not required to state the types of values because the language features local type inference. Finally, the shader writes the calculated value to the output o with an output assignment. The right-hand side of an output assignment is required to be a name [6]. The Parasol compiler requires that all outputs be assigned.
Because the language only provides functions and not operators, the order of operations is completely unambiguous; the arguments to functions are evaluated eagerly from left to right and then substituted into the body of the function. Does the multiplication in the GLSL version mean (p * m) * v or p * (m * v)? In this case, it may not matter, but what about in more complicated expressions with overloading? Programmers get this wrong constantly. The Parasol language keeps things explicit - mistakes become difficult to make.
The fragment shader simply declares a value named red and assigns it to the single declared output named o. Note that o is assigned both a name and a number; this is required by the language.
The program p states that it uses the vertex shader v and the fragment shader f. The compiler will check that the shaders are compatible when p is declared. The rules for compatibility are the same as they are for GLSL; the vertex shader must have a corresponding output with the same name and type as each of the fragment shader's inputs.
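The compatibility rule can be sketched as a simple lookup. This is an illustration of the rule as stated above, not the compiler's implementation; the function name and the representation of shaders as dictionaries from names to type names are assumptions.

```python
def incompatible_inputs(vertex_outputs, fragment_inputs):
    """Return the fragment inputs that have no matching vertex output.

    Both arguments map declared names to type names. A fragment input
    matches only if the vertex shader declares an output with the same
    name and the same type.
    """
    return sorted(
        name
        for name, type_name in fragment_inputs.items()
        if vertex_outputs.get(name) != type_name
    )

# The HelloWorld fragment shader declares no inputs, so it is
# trivially compatible with the vertex shader.
print(incompatible_inputs({"o": "vector_4f"}, {}))
# A fragment input with the right name but the wrong type is rejected.
print(incompatible_inputs({"o": "vector_4f"}, {"o": "vector_3f"}))
```

Note that the rule is one-directional: a vertex output that no fragment input consumes, such as o in the example above, does not make the program invalid.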
Assuming that the above is in a file named HelloWorld.txt and that the resulting GLSL shaders should be written to a directory called /tmp/shaders, the program can be compiled as follows:
$ java -jar jparasol-c.jar --compile-one /tmp/shaders com.io7m.examples.HelloWorld.p HelloWorld.txt
The compiler says nothing unless an error occurs.
By default, the compiler will attempt to produce GLSL shaders for all known versions of GLSL. It will silently fail to produce a program for a version that cannot be supported. In the case of the above program, there are no versions of GLSL that cannot support the program. It is usually more desirable to specify a range of required versions and have the compiler raise an error when it cannot produce code for one or more versions. To compile the same program but with the requirement that all versions of GLSL must be supported:
$ java -jar jparasol-c.jar --require-glsl [,] --require-glsl-es [,] --compile-one /tmp/shaders com.io7m.examples.HelloWorld.p HelloWorld.txt
The [,] notation is standard mathematical range notation. This particular expression represents an inclusive upper and lower bound that effectively covers all versions. The notation is explained fully in the documentation for the compiler frontend.
An examination of the output directory will show that the compiler has produced code for all of the GLSL versions:
$ ls -alF /tmp/shaders/
total 0
drwxr-xr-x  5 nobody nobody   100 2014-06-11 18:06 ./
drwxrwxrwt 66 root   root    6020 2014-06-11 18:06 ../
drwxr-xr-x  2 nobody nobody   320 2014-06-11 18:06 com.io7m.examples.HelloWorld.f/
drwxr-xr-x  2 nobody nobody    60 2014-06-11 18:06 com.io7m.examples.HelloWorld.p/
drwxr-xr-x  2 nobody nobody   320 2014-06-11 18:06 com.io7m.examples.HelloWorld.v/

$ cat /tmp/shaders/com.io7m.examples.HelloWorld.v/glsl-110.v
#version 110

attribute vec4 v_position;
uniform mat4 m_modelview;
uniform mat4 m_projection;
varying vec4 o;

void
main (void)
{
  vec4 pl_position_clip = ((m_projection * m_modelview) * v_position);
  gl_Position = pl_position_clip;
  o = pl_position_clip;
}

$ cat /tmp/shaders/com.io7m.examples.HelloWorld.f/glsl-110.f
#version 110

void
main (void)
{
  vec4 pl_red = vec4 (1.0, 0.0, 0.0, 1.0);
  gl_FragColor = pl_red;
}

$ cat /tmp/shaders/com.io7m.examples.HelloWorld.v/glsl-140.v
#version 140

in vec4 v_position;
uniform mat4 m_modelview;
uniform mat4 m_projection;
out vec4 o;

void
main (void)
{
  vec4 pl_position_clip = ((m_projection * m_modelview) * v_position);
  gl_Position = pl_position_clip;
  o = pl_position_clip;
}

$ cat /tmp/shaders/com.io7m.examples.HelloWorld.f/glsl-140.f
#version 140

out vec4 o;

void
main (void)
{
  vec4 pl_red = vec4 (1.0, 0.0, 0.0, 1.0);
  o = pl_red;
}

$ cat /tmp/shaders/com.io7m.examples.HelloWorld.v/glsl-es-100.v
#version 100

precision highp float;
precision highp int;

attribute vec4 v_position;
uniform mat4 m_modelview;
uniform mat4 m_projection;
varying vec4 o;

void
main (void)
{
  vec4 pl_position_clip = ((m_projection * m_modelview) * v_position);
  gl_Position = pl_position_clip;
  o = pl_position_clip;
}

$ cat /tmp/shaders/com.io7m.examples.HelloWorld.f/glsl-es-100.f
#version 100

precision highp float;
precision highp int;

void
main (void)
{
  vec4 pl_red = vec4 (1.0, 0.0, 0.0, 1.0);
  gl_FragColor = pl_red;
}
As can be seen, the resulting GLSL program is almost identical, with the added bonus that the Parasol version provides code for all possible GLSL versions.
Where next?
The specification gives the precise formal definitions for the language, and a complete listing of the standard libraries.
Tools reference
jparasol-c
Overview
parasol-c: [options] --compile-one output shader file0 [file1 ... fileN]
  or [options] --compile-batch output batch-list source-list
  or [options] --check file0 [file1 ... fileN]
  or [options] --show-glsl-versions
  or [options] --version
Compile
The --compile-one command accepts an output directory (or archive, see --zip) o, the fully-qualified name of a program p, and a list of one or more Parasol source files. The files are parsed and type-checked and then p, and the constituent shaders of p, are transformed to GLSL and written to o (creating the directory if it does not already exist).
The --require-glsl and --require-glsl-es options specify the GLSL and GLSL ES versions for which the compiler must generate code. The compiler will raise an error if the program cannot be supported on one or more versions. The notation for specifying versions is described below. If these options are not specified, the compiler will attempt to generate code for all versions but will not raise an error if code cannot be generated for one or more versions.
Check
The --check command accepts a list of one or more Parasol source files. The files are parsed and type-checked and any errors are printed on the standard error stream.
Compile Batch
The --compile-batch command accepts a base directory b, a file f containing a list of batches, and a file s containing a list of source files with one file per line.
For each program p in the list of batches, the compiler will produce a GLSL program in b/p.
The --require-glsl and --require-glsl-es options specify the GLSL and GLSL ES versions for which the compiler must generate code. The compiler will raise an error if the program cannot be supported on one or more versions. The notation for specifying versions is described below. If these options are not specified, the compiler will attempt to generate code for all versions but will not raise an error if code cannot be generated for one or more versions.
The command is intended to be used to process thousands of programs. The compiler supports multi-threaded program processing and can, after parsing and type-checking, produce multiple GLSL programs from the resulting typed AST in parallel. The --threads option specifies the number of threads that should be used during batch compilation, with 1 thread being the default.
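A build script might prepare the two list files and the command line along the following lines. This is a sketch under stated assumptions: the file names batches.txt and sources.txt, both function names, and the choice to write batch entries without whitespace around the colon are all hypothetical. Only the --threads and --compile-batch options and the argument order come from the usage text.

```python
import os

def write_batch_inputs(directory, batches, sources):
    """Write a batch-list file and a source-list file, one entry per line.

    batches: list of (output_name, program_name) pairs
    sources: list of Parasol source file names
    """
    batch_path = os.path.join(directory, "batches.txt")
    source_path = os.path.join(directory, "sources.txt")
    with open(batch_path, "w") as f:
        for output_name, program_name in batches:
            f.write("%s:%s\n" % (output_name, program_name))
    with open(source_path, "w") as f:
        for source in sources:
            f.write(source + "\n")
    return batch_path, source_path

def batch_command(jar, output, batch_path, source_path, threads=1):
    """Build the argument list for a batch compilation."""
    return [
        "java", "-jar", jar,
        "--threads", str(threads),
        "--compile-batch", output, batch_path, source_path,
    ]
```

The resulting list can be handed to subprocess.run with check=True, so that a non-zero exit status from the compiler is raised as an error in the build script.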
Zip
When the --zip parameter is specified, the output path is instead assumed to be an archive that will be created and populated with the resulting shading programs.
By default, any file that exists at the named output path will be replaced. When the --zip-append option is specified, the output path is assumed to refer to an existing zip archive, which will be updated with the generated files.
Compaction
Normally, one source file will be emitted per version of GLSL. Often, these different source files will actually be identical in content apart from a #version directive on the first line of the file.
When the --compact parameter is specified, shaders will be written in so-called compacted form. Essentially, source files are stripped of their version directives and then hashed with SHA256. Then, each source file is renamed to its SHA256 hash with the addition of its original suffix, and a set of mappings from GLSL versions to SHA256 hashes is written into the program's meta.xml file. The user is responsible for re-inserting the correct version directive upon passing the program to a GLSL compiler.
The storage space savings of compaction are typically in the range 60-70%. In rendering systems that use thousands of shaders, this can be significant!
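The idea behind compaction can be sketched in a few lines. This is not the compiler's implementation: the function names are hypothetical, and the exact stripping rule (only a #version directive on the first line is removed) and the UTF-8 encoding used for hashing are assumptions. The layout of meta.xml is not modelled; the mappings are simply returned as a dictionary.

```python
import hashlib

def strip_version(source):
    """Remove a #version directive if it is the first line of the source."""
    first, _, rest = source.partition("\n")
    return rest if first.startswith("#version") else source

def compact(sources, suffix):
    """Deduplicate shader sources that differ only in their version directive.

    sources: dict mapping GLSL version number -> source text
    suffix:  original file suffix, for example ".v" or ".f"
    Returns (files, mappings), where files maps a hashed file name to the
    stripped source and mappings maps each version to its hashed file name.
    """
    files = {}
    mappings = {}
    for version, source in sources.items():
        body = strip_version(source)
        name = hashlib.sha256(body.encode("utf-8")).hexdigest() + suffix
        files[name] = body
        mappings[version] = name
    return files, mappings

def restore(version, body):
    """Re-insert the version directive before handing the source to GLSL."""
    return "#version %d\n%s" % (version, body)
```

Two sources that are identical apart from their first line hash to the same name and are stored once, which is where the space savings come from. The restore function corresponds to the step that the text leaves to the user.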
Range notation
The notation for ranges is given by the following EBNF:
version-set     := version-segment ( ',' version-segment )*
version-segment := version-exact | version-range
version-exact   := integer
version-range   := ('(' | '[') integer? ',' integer? (')' | ']')
The [ and ] characters indicate inclusive bounds. The ( and ) characters indicate exclusive bounds.
Some examples from the jparasol-c command line help, with x ⊢ y indicating that the notation x denotes the set of versions y.
130               ⊢ {130}
[120, 150]        ⊢ {120, 130, 140, 150}
(120, 150]        ⊢ {130, 140, 150}
120,[140,330],440 ⊢ {120, 140, 150, 330, 440}
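The grammar and the meaning of the bounds can be sketched as a small parser. This is an illustration rather than the frontend's own code: the function name is hypothetical, the KNOWN_VERSIONS list is an assumed set of desktop GLSL versions (the authoritative set is whatever --show-glsl-versions reports), and the choice to silently ignore an exact version that is not in the list is also an assumption.

```python
import re

# Assumption: an illustrative list of desktop GLSL version numbers.
KNOWN_VERSIONS = [110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440]

# One version-segment: either an exact integer or a bracketed range
# whose lower and upper bounds are both optional.
_SEGMENT = re.compile(
    r"\s*(?:(?P<exact>\d+)"
    r"|(?P<open>[\[(])\s*(?P<lo>\d*)\s*,\s*(?P<hi>\d*)\s*(?P<close>[\])]))\s*"
)

def select_versions(text, known=KNOWN_VERSIONS):
    """Return the sorted known versions denoted by a version-set."""
    selected = set()
    pos = 0
    while True:
        m = _SEGMENT.match(text, pos)
        if m is None:
            raise ValueError("expected a version or range at offset %d" % pos)
        if m.group("exact") is not None:
            wanted = int(m.group("exact"))
            selected.update(v for v in known if v == wanted)
        else:
            lo = int(m.group("lo")) if m.group("lo") else None
            hi = int(m.group("hi")) if m.group("hi") else None
            lo_inclusive = m.group("open") == "["
            hi_inclusive = m.group("close") == "]"
            for v in known:
                above = lo is None or v > lo or (lo_inclusive and v == lo)
                below = hi is None or v < hi or (hi_inclusive and v == hi)
                if above and below:
                    selected.add(v)
        pos = m.end()
        if pos == len(text):
            return sorted(selected)
        # Segments are separated by commas.
        if text[pos] != ",":
            raise ValueError("expected ',' at offset %d" % pos)
        pos += 1

print(select_versions("[120, 150]"))
print(select_versions("(120, 150]"))
```

Because a range itself contains a comma, the input cannot simply be split on commas; the sketch instead consumes one segment at a time. An omitted bound is treated as unbounded, which is why [,] covers every known version.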
Batches
The syntax for batches is given by the following EBNF:
batch := [ name ] , ":" , program_name
Where name is an optional output name, and program_name is the fully qualified name of a program (consisting of a package, module, and program name). An example program_name would be com.io7m.examples.Example.p, where com.io7m.examples is the package, Example is the module, and p is the program.
A batch file contains a list of batches, one per line.
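A single batch line can be decomposed as in the following sketch. The function name and the returned dictionary are hypothetical, and tolerating whitespace around the colon is an assumption; how the compiler treats an omitted name is not modelled, so it is simply returned as None.

```python
def parse_batch_line(line):
    """Split one batch line into its name, package, module, and program."""
    name, separator, program_name = line.partition(":")
    if not separator:
        raise ValueError("missing ':' in batch line: %r" % line)
    name = name.strip() or None
    program_name = program_name.strip()
    # The last two dot-separated components are the module and the
    # program; everything before them is the package.
    parts = program_name.rsplit(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError("not a fully qualified program name: %r" % program_name)
    package, module, program = parts
    return {"name": name, "package": package, "module": module, "program": program}

print(parse_batch_line("example : com.io7m.examples.Example.p"))
```

Splitting from the right matters because the package itself may contain any number of dots, whereas the module and program names are always the final two components.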
Logging
By default, the compiler remains silent except in the case of errors. It is possible to enable more logging for the purposes of debugging by providing a configuration file with the --log-properties option. The configuration file is a set of Java properties that configure the jlog package used internally by the compiler; the documentation of that package should be examined for the precise formats of the properties.
To save some time, an example file that enables absolutely all logging:
com.io7m.jparasol.logs.compiler = true
com.io7m.jparasol.level         = LOG_DEBUG
An example file that enables the minimum debug messages for the frontend by disabling logging for all of the main compiler components but leaving other messages enabled:
com.io7m.jparasol.logs.compiler                 = true
com.io7m.jparasol.logs.compiler.pipeline        = false
com.io7m.jparasol.logs.compiler.gpipeline       = false
com.io7m.jparasol.logs.compiler.compactor       = false
com.io7m.jparasol.logs.compiler.serializer-zip  = false
com.io7m.jparasol.logs.compiler.serializer-file = false
com.io7m.jparasol.level                         = LOG_DEBUG
API Reference
Javadoc
API documentation for the package is provided via the included Javadoc.

[1]
This often causes serious performance problems. See Low-level Thinking in High-Level Shading Language Programs.
[2]
Consider the trunc function. This function is not available on certain versions of GLSL, but is equivalent to the expression sign(x) * floor(abs(x)) for some x. The compiler can provide a software emulation in this case, allowing the programmer to safely call Parasol's Float.truncate function anywhere without issue.
[3]
For example, if the programmer writes a fragment shader that writes to multiple outputs, there is no way this program can run on OpenGL ES 2 and there is no way that the compiler can provide any kind of emulation to work around the problem.
[4]
The language actually rejects operators entirely, to keep the language definition as simple as possible.
[5]
It is possible to express all of the shaders required to implement a renderer equal in capability to that of the Source engine (circa 2014) without any loops or arrays.
[6]
The reason for this is that the GLSL standard has various rules about how and when shader outputs must be assigned. Parasol language programs syntactically guarantee that either all outputs are assigned, or none of them are.