FAQ

4.1 Origins

4.1.1 Why a new array programming language?

I’ve had a strong liking for array programming languages for quite a few years, but I never managed to like array-based text processing. The approach provides smart and efficient solutions for some kinds of parsing tasks, as BQN’s parser illustrates well, but it does not offer immediate solutions to various fundamental text processing needs that arise in common scripting tasks.

Also, Unicode and UTF-8 don’t play well with the array vision. Text is complicated: there is no straightforward mapping between “character” (a somewhat ill-defined notion that is maybe best approached by grapheme clusters) and bytes or code points. For example, some abstract characters cannot be encoded by a single code point, and some abstract characters have more than one possible encoding.
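As a small illustration of that last point, here is a sketch in Go (Goal’s implementation language) that compares two encodings of the same abstract character. The helper name describe is made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports the number of UTF-8 bytes and of code points in s.
// (Hypothetical helper, for illustration only.)
func describe(s string) (bytes, codePoints int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	precomposed := "\u00e9" // "é" as a single code point
	decomposed := "e\u0301" // "e" followed by a combining acute accent

	for _, s := range []string{precomposed, decomposed} {
		b, c := describe(s)
		fmt.Printf("%q: %d bytes, %d code points\n", s, b, c)
	}
	// Both strings display as the same character, yet they differ
	// as byte sequences and as code point sequences.
	fmt.Println(precomposed == decomposed) // false
}
```

The first string takes 2 bytes and 1 code point, the second 3 bytes and 2 code points, so neither count matches the single character a reader sees.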

That’s why I feel a scripting language should consider strings as a whole in most operations, which in an array language carries implicit iteration over collections of strings as an added benefit. Moreover, other than providing all the usual string handling functions, I think there is value in integrating standard text-processing features into the design. Regular expressions are a nice tool for describing character classes and Unicode properties: having dedicated syntax support makes them more convenient and less error-prone. String interpolation and quoting constructs make simple templating more intuitive.

Text processing aside, Goal does have a few other characteristics that I wanted to see in an array language:

Last but not least, I had some previous experience with various aspects of compilation, but I had never written a whole bytecode interpreter from scratch before: it’s a great and fun experience!

4.1.2 Influences

Language design was greatly inspired by both K (for syntax and many primitives), in particular the ngn/k dialect, and BQN (a few fundamental primitives like group by, classify and shifts). I was thinking of Perl and Raku when adding regexp literals, quoting constructs, and a couple of IO primitives. There is some inspiration from the implementation language, Go: similar syntax for numbers, string literals, and time layouts.

I wrote the bytecode implementation after reading the one for GoAWK, and it still shows. I wrote the scanner after reading ivy’s. Vim syntax highlighting is based on ngn/k’s. Be sure to check out all those great projects if you haven’t yet!

4.1.3 What does the name “Goal” stand for?

4.2 Design

4.2.1 Why use so many symbols over words?

Goal uses both symbols and keywords depending on the nature of the operation and its frequency.

Goal prefers symbols for common pure operations, like most programming languages do for arithmetic operations. As an array language, the range of interesting pure and common operations increases significantly. Also, due to practical considerations, Goal extends that reasoning to common string handling functions too.

The usefulness of concise notation is well-known in mathematics, and array languages have made use of it since early APL versions. Unlike APL or BQN, but like its main inspiration K, Goal prefers highly-polysemic symbols, preferably ASCII.

That choice is based on my personal experience that non-extensible polysemy is intuitive and natural as long as the meanings are both mnemonic and easily resolved by context. Note that using words would not be as well-suited in that respect, because natural language already has its own polysemic meanings.

For I/O operations, as well as less frequent operations, Goal uses keywords. This makes the distinction between math-like code and stateful code with side effects clear.

4.2.2 Why are there a few non-ASCII symbols?

A few primitives have non-ASCII symbols: «, », ¿, ´. The first three have keyword alternatives. The last does not have a single-keyword alternative (due to the lack of adverbial keywords, which would be quite verbose anyway). It does, though, have reasonable idiomatic alternatives. While not ASCII, all of those symbols are still very common, found even in Latin-1 and used in many natural languages, so they should be accessible on most systems without configuration.

4.2.3 Does Goal perform tail-call optimization?

Goal optimizes tail calls made with the special o variable. Other kinds of tail calls are not optimized: in particular, tail-call elimination is not performed for mutually recursive functions.

4.2.4 Why does Goal not have imperative loops?

Most loops in Goal are performed implicitly by primitive verbs. Other kinds of loops are replaced with functional alternatives using the various adverbial forms. Absence of any kind of imperative backwards flow in the language simplifies both language semantics and static analysis of bytecode in the implementation.

Both verbs and adverbs make the lack of imperative loop syntax mostly a non-issue. Sometimes, due to the lack of closures, you might need to explicitly pass around state from local variables, for example with the “while” adverbial form or when writing a tail-recursive function. You may also simply use a global: abusing globals is harder in a language where explicit loops are rare anyway.

4.2.5 Why does Goal not have closures?

Most other K dialects do not feature them either, or only in some restricted form. The lack of closures is rarely felt, as functional projections can be used to pass variables as extra arguments: typical cases can be written succinctly with tacit composition and field expression syntax or using named partial application syntax in the formal argument list (see help). In this context, not supporting closures helps with implementation simplicity and performance with little drawback.

If you feel you need mutable closures at some point, first check whether what you want to do can be done easily with the “while” or “fold while” adverbial forms, or with a tail-recursive function. If not, just try using a global variable instead: it’s the only source of mutable state in Goal, and it’s perfectly reasonable for typical scripts, as it should be at most an occasional need.

4.2.6 Is Goal designed as a code golf language?

No, and I don’t even use it as such. I don’t have much experience with code golf. Most array languages are concise by their very nature and their larger range of symbol operators, so Goal shares some of this heritage.

4.2.7 Is Goal stable? Is backward compatibility expected?

Both Goal the language and the interpreter are stable. Since version 1 was released, programs written in Goal or using its Go API are expected to continue to work in future versions without changes, within some limits detailed below.

While compatibility will be maintained for most programs, there are a few limitations to the compatibility promise to keep in mind:

4.3 Syntax

4.3.1 Is Goal space-insensitive?

Goal is space-insensitive for the most part, except in a few cases where spaces are either required or forbidden to disambiguate:

Other than those few cases, you are free to use spacing in the way that seems most readable to you.

4.3.2 What’s the difference between newline and semicolon?

As stated in the help, newlines are ignored after any of ({[ or before )}] but act as semicolons otherwise. There’s still one minor difference: duplicate newlines are ignored so that it’s possible to freely use spacing and comments within a multi-line list.

4.3.3 Is : syntax or an operator?

Colon can represent various things in Goal: early return, a verb returning its right argument, assignment syntax, or either a monadic verb or modified assignment marker when tightly following an operator (like in +:). Despite all those different uses, there’s no confusion in practice.

The assignment or modified assignment meaning is used when it follows an identifier like x, an indexing L-value like x[i], or a list of identifiers like (a;b;c) (only plain assignment). If the colon is followed by indexing brackets, the verb meaning is used. When there is an expression on the right, but no noun on the left, a single colon means “return”.

4.3.4 What are parens used for?

Goal uses parens for two things: list creation and controlling operation precedence. List creation happens when one or more semicolons ; appear within parens. The semicolon is used as item separator, and items are evaluated left-to-right. Otherwise, the parens are used to control precedence of operations, as is usual in mathematics and most languages. Lists with a single item are created using the “enlist” monadic form ,x.

As a special case, a lone pair of parens () represents an empty generic list. Note that other kinds of empty lists don’t have a special syntax and have to be produced using primitives: an empty list of integers is !0, an empty list of strings is !"", and an empty list of floats is ?0. While not syntax, those forms are currently optimized by a basic constant-folding pass.

4.3.5 How do you write a delimited comment?

Comments in Goal are either line-based or multi-line: there are no C-style delimited block comments that let you control both the start and the end of a comment within a line. However, the discard `expr form, which ignores the expression on its right, can be used for similar purposes. As a special case, (`expr) is not parsed as an empty generic list, but is completely ignored instead. This can be used to control even more finely which portion of the code should be ignored, when necessary.

Also, the special x:expr form with literal number or string x can be used as a sort of prefix comment before an expression: it is recognized and optimized, so there’s no runtime overhead.

4.3.6 Which number and string literal syntax is supported?

Number literals are based on Go’s integer and floating-point literals. Goal first attempts to parse a number literal as a 64-bit integer using ParseInt (with automatic base recognition based on string prefix), then as a 64-bit float using ParseFloat, and finally as a 64-bit integer duration using ParseDuration. Moreover, Goal introduces special literals for a few specific numeric values: 0i for the smallest 64-bit integer value, 0n for NaN, 0w for positive infinity, and -0w for negative infinity.
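The parsing order can be sketched in Go with the three standard library functions named above. This is only a model of the described fallback chain, not Goal’s actual scanner: the parseNumber helper and its return labels are made up for this example, and the special literals 0i, 0n, and 0w are left out.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseNumber tries the three parsers in the order given in the text and
// returns the value together with a label naming the parser that succeeded.
// (Hypothetical helper; Goal's own scanner may differ in details.)
func parseNumber(s string) (any, string, error) {
	// Base 0 asks ParseInt to recognize prefixes such as 0x, 0o and 0b.
	if i, err := strconv.ParseInt(s, 0, 64); err == nil {
		return i, "int", nil
	}
	if f, err := strconv.ParseFloat(s, 64); err == nil {
		return f, "float", nil
	}
	if d, err := time.ParseDuration(s); err == nil {
		return int64(d), "duration", nil // durations are nanosecond counts
	}
	return nil, "", fmt.Errorf("not a number literal: %q", s)
}

func main() {
	for _, s := range []string{"42", "0x2a", "1_000", "1.5", "1e3", "1h30m"} {
		v, kind, err := parseNumber(s)
		fmt.Println(s, v, kind, err)
	}
}
```

With this order, 0x2a is read as the integer 42, 1e3 falls through to the float parser, and 1h30m is only accepted by the duration parser.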

Double-quoted string literals are similar to Go’s too, as described in the specification, but they can be multi-line and support variable interpolation using $var or ${var}, meaning that a literal $ needs to be escaped. The qq/STRING/ form accepts the same syntax as double-quoted strings, but supports a custom delimiter, as described in the help. The rq/STRING/ form is a raw string literal variant with a custom delimiter; the delimiter can still be inserted in the string, by doubling it rather than by using a backslash.

4.3.7 Which regexp syntax is supported?

The syntax for regexps is the same as the one described in the regexp/syntax Go package, using the default Perl-like syntax. As an extension, Goal allows multi-line regexps when using the quoting construct rx/PATTERN/. In multi-line regexp literals, leading and trailing spaces on each line are ignored, and space followed by / (or # if / is already used as the regexp delimiter) starts a comment that spans until the end of the line.

4.3.8 How do variable scoping rules work?

Variable scoping rules in Goal are simple: variables are either global or local to a lambda function. In other words, Goal has global scope and function scope. There is no concept of block, and nested functions don’t have access to their parent’s scope (see related question about the lack of closures).

Within a lambda, single-colon assignment defines local variables, so globals have to be assigned using double-colon :: there. A global variable can still be accessed, but priority is given to the local variable in case of naming conflict. Note, however, that assignment operations aren’t ambiguous, so :: isn’t needed for them. Also, local variable names cannot have dots in them, so :: isn’t needed either for global names with dots.

Note: a related point is that keywords and variables share the same namespace in source code: keywords are resolved early during scanning and take priority over variable names. They are fixed and cannot be reassigned in any way from Goal code.

4.3.9 How do tacit compositions work?

Goal’s tacit compositions are similar to other K dialects, but they are just sugar for a lambda or a lambda projection. A composition is formed from any kind of expression that ends in a verb: it simply produces an equivalent function with the implicit arguments added at the end. If the last verb is monadic, the function takes just one argument x. If it is dyadic, the function takes two arguments x and y. Both dyadic built-in operators and derived verbs can be made monadic by appending a : without spacing. Dyadic keyword verbs can be made monadic by adding :: at the end. Note that arity in tacit compositions is a syntactic notion unrelated to the semantic concept of function rank.

Most compositions translate easily into a lambda, but when compositions make use of non-constant expressions, they are represented as a lambda projection. In particular, compositions do not capture global variables: those get automatically passed as extra arguments.

Some array languages support more complex tacit features, like BQN or J. Those features carry some cognitive overhead, at least for me. I find the regular switching between explicit and tacit styles distracting.

4.3.10 How do .. and field expressions work?

Double dot .. is syntax sugar that can be used for several purposes, but the main motivation is allowing concise evaluation of expressions under a dict by referring to values using string keys as unquoted variable names.

The simplest case is a tight bind between identifiers as in x..a, which is parsed as a single token and expands to x["a"]. There’s a special consideration with respect to bracket indexing as in x..a[y], which expands to x["a";y] instead of x["a"][y]. In the common case where x holds a dict to some nested array, the result is the same, but merging both applications is closer to how one would write such cases by hand without using the .. syntactic sugar, and it also simplifies making extensions with new kinds of values callable in a method-like style (for example a hypothetical heap value type could be made usable as in x..push[y]). You can write (x..a)[y] to prevent this merging behavior when appropriate.

When .. is not tightly surrounded by two identifiers, as in x .. expr where x can be any kind of expression, a more general expansion is performed: if provided, x is passed as a single argument to a lambda (or lambda projection) described by expr using special variable scoping rules, without braces. Any variable named a (without dot-prefix and not among x, y, z, and o) appearing in expr expands to x..a, that is x["a"]. Variables with a p. prefix are passed as extra projection arguments from the parent context (without the prefix), while variables with a q. quoting prefix are inserted as-is (without the prefix). Other kinds of dot-prefixed names always represent globals and don’t get any kind of special treatment. Note that those expansion rules are not only convenient for concise dict manipulation but also sometimes for concise definition of lambda projections without having to explicitly pass local variables from a parent context as extra arguments.

When .. is tightly followed by an opening bracket [, we get unquoted assignment-style syntax for dict amend d..[a:e1;b:e2], which expands to @[d;"a""b";:;d..(e1;e2)], which is further expanded following the field expression expansion rules already described. When d is not provided, the syntax is simply used as dict creation syntax, expanding directly to "a""b"!(e1;e2), as already shown in the help. As a special case, when keys are provided without a corresponding value in dict creation, as in ..[a;b], the value is assumed to come from a variable named as that key, expanding to ..[a:a;b:b], that is "a""b"!(a;b). This is useful to create a dict from a few variables by using their names as keys without naming redundancy.

Practical examples of the various kinds of usages for .. can be found in the Working with tables chapter.

4.4 Semantics

4.4.1 How does namespacing work?

Goal makes use of a simple flat naming scheme for globals so that they can be efficiently stored in an array, supporting fast access by index. For convenience, variable names can make use of a dot prefix as in pfx.name. Some primitives, such as eval and import, allow setting a prefix for evaluation. That prefix works as a means to add code into a namespace of choice. The default for import is to use a prefix per file and use the filename as the prefix, but a custom prefix (even empty) can be used. Variables with no prefix can be accessed as main.name from other namespaces.

If you need a way to pass a namespace-like structure as a value, you have to use a dictionary. The var..field syntax is convenient when using dictionaries for such purpose. Also, note that the p. and q. prefixes have a special meaning and are used to refer to projection variables and quoted variables in field expressions.

4.4.2 How does function rank work?

The rank of a function is a semantic notion and should not be confused with syntactically resolved dyadic or monadic usage of a verbal operator.

Other than in projections created by implicitly fixing some of the first arguments, rank is mainly used in adverbial forms, for example with the “fold” adverb /, which, depending on the rank of the function it modifies, can mean either fold (rank > 1) or while/converge (rank 1). Note that derived verbs also have a rank, which depends both on the adverb and modified value:

4.4.3 Which values are false in conditionals?

There are several kinds of false values in Goal: 0, 0i, 0n, -0w, "", rx//, the identity function (:), empty arrays and error values.

False values only matter when used in conditionals: the ?[cond;then;else] form, as well as short-circuiting and[x;y;…] and or[x;y;…]. Numerical 0 is the most common form of false value, while 0i being false is useful so that |/ on an empty list of booleans returns a false value. NaNs 0n and floating point negative infinity -0w being false is consistent with 0i being false. Also, empty strings or arrays and error values being false is often convenient for obvious reasons.

4.4.4 How do zero value fills work?

In primitives that require it, Goal uses zero values to provide fill elements, for example in outdexing, folds over empty arrays, or in the take/pad i@y verb form. The kind of zero value fills used for an array X depends on its type and/or first element, as described in the following table:

Note that this means empty generic arrays may lose type information: this could be partially solved with a smarter prototype/fill system, but Goal’s behavior is simpler to understand and implement.

Zero value fills are useful in practice: for example, outdexing the result of “group by” will always return an empty array. By definition, zero values are also false values, which can occasionally be convenient.

4.4.5 How do error values work?

Goal has a dedicated value type "e" for non-fatal errors, such as those returned by some IO primitives. As seen in the tutorial, custom error values of any kind can be produced with error. In addition to the .e form that retrieves the underlying value, and the syntactic sugar 'e that is used to return early on error, error values are special in that they are callable using the e..msg form to produce an error message suitable for user consumption, instead of a program string representation as $e would return.

The e..msg call works in different ways depending on the type of error. If .e is a string, number, or array, then e..msg returns a default string representation suitable as error message. Otherwise, the value (.e)..msg is computed: if it is a function f, a default string representation of f@.e is returned; otherwise, a default string representation of (.e)..msg itself is returned. The e..msg call is implicit in string interpolation, say and print, as those contexts expect an error message suitable for user consumption.

4.5 Primitives

4.5.1 Which primitives generalize operations to arrays element-wise?

This is the case of arithmetic primitives, but also of most other primitives when the generalization is useful and does not conflict with other usages. Primitives recursively operating at the scalar or atomic level are often referred to as scalar, pervasive, or atomic functions, but there’s no universal terminology for them. Some non-arithmetic primitives are right-atomic, meaning they handle the right argument recursively but not the left one, for example mod/div i!n. More rarely, a primitive can be left-atomic, like rotate.

Primitives working on more than two arguments may be pervasive on one or more of the arguments. For example, the time verb is pervasive with respect to the time t argument but not the others. The amend form is pervasive with respect to the index and function arguments.

The case of IO primitives is worth mentioning: none of them is truly pervasive. In particular, they do not operate recursively on generic array inputs. However, when the help says they work on file name string inputs, they may accept a list of strings too: this is the case for import, mkdir, read, remove, and stat. Unlike when calling them using the “each” adverb, they return an error as soon as a call on any of the files produces one. Also, both mkdir and remove return the number of files on success, while stat returns a table-like dict, instead of a list of dicts. The monads abspath and glob also accept a list of strings but with special non-pervasive semantics: abspath first joins the path elements into one, while glob gathers the path results for each pattern in a single common list of strings.

4.5.2 Which primitives support dictionaries?

In most cases, primitives handle dictionary values by applying to their value arrays but returning matching keys along when sensible. This happens, for example, in monadic arithmetic forms like -d, or dyadic ones when only one argument is a dict like n+d, as well as structural operations like reverse |d that simply reverses both key and value arrays. For example:

The case of dyadic operations on dictionary pairs, for arithmetic operators or merge, doesn’t require both dicts to have a matching set of keys, using zero value fills as needed. For example, ..[a:1;b:2]+..[b:3;c:4] gives ..[a:1;b:5;c:4]. Also note that the merge form d,d uses upsert semantics: when some keys are common to both dicts, it gives priority to values of the right one.
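The two behaviors can be modeled in Go with plain maps. This is only a sketch of the semantics described above: the helper names addDicts and merge are made up, values are restricted to integers, and Go maps do not preserve key order the way Goal dicts do.

```go
package main

import "fmt"

// addDicts adds two dicts key by key; a key missing on one side
// contributes the zero value fill 0. (Hypothetical helper.)
func addDicts(a, b map[string]int) map[string]int {
	out := make(map[string]int, len(a)+len(b))
	for k, v := range a {
		out[k] = v
	}
	for k, v := range b {
		out[k] += v // a missing key reads as 0 in Go maps
	}
	return out
}

// merge models upsert semantics: on common keys, the right dict wins.
// (Hypothetical helper.)
func merge(a, b map[string]int) map[string]int {
	out := make(map[string]int, len(a)+len(b))
	for k, v := range a {
		out[k] = v
	}
	for k, v := range b {
		out[k] = v
	}
	return out
}

func main() {
	x := map[string]int{"a": 1, "b": 2}
	y := map[string]int{"b": 3, "c": 4}
	fmt.Println(addDicts(x, y)) // map[a:1 b:5 c:4]
	fmt.Println(merge(x, y))    // map[a:1 b:3 c:4]
}
```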

The help only mentions cases that have some special dictionary-specific semantics. Also, some primitives have additional forms for dictionaries with string keys and columnar values: notation t is used to refer to such table-like dicts in the help.

4.5.3 Why don’t weed-out and replicate call the filter for each element?

By passing the whole argument to the filter function and making it return an array, the performance of f#Y and f^Y is greatly improved: the filter is only called once and performs fast whole-array operations.

It’s also more flexible, as you can use the whole array to compute the filtering condition: for example, {x<(+/x)%#x}#Y keeps all values less than the average.
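The same design can be sketched in Go: the filter receives the whole slice once and returns a mask, instead of being called per element. The names keep and belowAverage are made up for this example, and the mask uses booleans for simplicity.

```go
package main

import "fmt"

// keep calls filter exactly once on the whole slice and keeps the
// elements whose mask entry is true. (Hypothetical helper.)
func keep(filter func([]float64) []bool, ys []float64) []float64 {
	mask := filter(ys)
	var out []float64
	for i, y := range ys {
		if mask[i] {
			out = append(out, y)
		}
	}
	return out
}

// belowAverage marks the values strictly less than the mean of xs,
// mirroring the {x<(+/x)%#x} filter from the text.
func belowAverage(xs []float64) []bool {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	mean := sum / float64(len(xs))
	mask := make([]bool, len(xs))
	for i, x := range xs {
		mask[i] = x < mean
	}
	return mask
}

func main() {
	fmt.Println(keep(belowAverage, []float64{1, 2, 3, 10})) // [1 2 3]
}
```

A per-element filter could not compute the mean at all, since it would never see the other values.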

4.5.4 How do format strings s$ work?

Format strings follow the conventions of Go’s fmt package with some limitations: only %-format “verbs” related to integers, floating-point numbers and strings are supported. Applying a format string to other kinds of atomic values or using other formatting verbs will produce an unspecified string result, but later Goal versions may extend and specify support beyond that.

Format strings can work in two distinct modes in Goal. If there is only one %-format verb in s, the s$y form is right-atomic, applying recursively to each atom in y. Otherwise, y is expected to be an array or dict of same length as the number of %-format verbs appearing in s, matching %-format verbs to values in occurrence order (except if the %[n] indexing syntax is used).
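The two modes can be approximated in Go with fmt.Sprintf. This is a sketch only: the helper names formatEach and formatAll are made up, and formatEach handles a flat list rather than recursing into nested arrays.

```go
package main

import "fmt"

// formatEach models the single-verb mode: the format string is applied
// to each value separately. (Hypothetical helper, flat lists only.)
func formatEach(format string, values []any) []string {
	out := make([]string, len(values))
	for i, v := range values {
		out[i] = fmt.Sprintf(format, v)
	}
	return out
}

// formatAll models the multi-verb mode: values are matched to verbs in
// occurrence order, unless %[n] indexing is used. (Hypothetical helper.)
func formatAll(format string, values []any) string {
	return fmt.Sprintf(format, values...)
}

func main() {
	fmt.Println(formatEach("%03d", []any{7, 42}))      // [007 042]
	fmt.Println(formatAll("%s=%d", []any{"n", 42}))    // n=42
	fmt.Println(formatAll("%[2]d/%[1]d", []any{1, 2})) // 2/1
}
```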

4.5.5 How does regexp application work?

The mandatory argument x should be a string or a recursively-handled list whose atoms are all strings. If no additional arguments are provided, as in r[x], the type of result depends on whether the regular expression r contains capturing groups. If it does not contain a capturing group, the operation returns a boolean integer telling whether the string contains any match of the regular expression. Otherwise, the result is a list containing the whole regexp leftmost match, followed by any submatches.

The optional argument i specifies the maximum number of matches (negative means any number). A regexp with capturing groups still behaves the same, but returns a list of results for successive matches. A regexp without capturing groups now simply returns a list of successive string matches.

The optional final argument s can be used to ask for a specific kind of result, irrespective of the presence of any capturing groups in the regexp. The default behavior is convenient most of the time, but you might occasionally want to avoid it. If i is not provided, a value of "i" for s indicates an integer boolean result is wanted; a value of "s" asks instead for a string match result for the whole regexp. If i is also provided, "s" simply indicates that the result should ignore capturing groups and treat them as non-capturing ones (single string result for each successive match).
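The default behavior of r[x] for a single string, without the optional arguments, can be modeled with Go’s regexp package. This is a sketch under assumptions: the apply helper is made up, it returns a Go bool where Goal returns a boolean integer, and its result when a regexp with groups does not match (a nil slice) is not meant to reflect Goal.

```go
package main

import (
	"fmt"
	"regexp"
)

// apply models the default result type selection: a boolean when the
// pattern has no capturing groups, otherwise the leftmost match followed
// by its submatches. It panics on an invalid pattern.
// (Hypothetical helper, single string input only.)
func apply(pattern, s string) any {
	re := regexp.MustCompile(pattern)
	if re.NumSubexp() == 0 {
		return re.MatchString(s)
	}
	return re.FindStringSubmatch(s)
}

func main() {
	fmt.Println(apply(`\d+`, "abc 123"))             // true
	fmt.Println(apply(`(\w+)@(\w+)`, "bob@example")) // [bob@example bob example]
}
```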

4.5.6 When does converge stop?

The f/x and f\x forms both repeatedly apply a monadic function to successive results and stop when the next result matches either the current one or the original input. The f\x form gathers all intermediate results, like other scan-like forms.
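The stopping rule can be sketched in Go as follows. The converge helper is made up, it compares with == where Goal uses its own match semantics, and returning the last value computed before the repetition is an assumption of this sketch.

```go
package main

import "fmt"

// converge applies f repeatedly, starting from x, and stops when the next
// result equals either the current value or the original input. It returns
// the current value at that point. (Hypothetical helper.)
func converge[T comparable](f func(T) T, x T) T {
	cur := x
	for {
		next := f(cur)
		if next == cur || next == x {
			return cur
		}
		cur = next
	}
}

func main() {
	half := func(n int) int { return n / 2 }
	fmt.Println(converge(half, 40)) // 0: fixed point reached

	cycle := func(n int) int { return (n + 1) % 3 }
	fmt.Println(converge(cycle, 0)) // 2: next result would be the input 0
}
```

The second stopping condition is what keeps a cyclic function such as cycle from looping forever.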

4.5.7 Can you return early from fold/scan?

Goal supports a “fold while” form F/[f;x;y] that combines the “fold” and “while” adverbial forms. It works like a seeded fold, but it takes an additional first argument f that is called on the accumulator x before each iteration. If f x returns a false value, the iteration will stop early and return x. There’s also a “scan while” form F\[f;x;y] that works in a similar way but collecting all prior results. Note that both forms support as many list arguments as the rank of F minus one, like for plain “fold” and “scan”.

Pure code rarely needs those extensions of “fold” and “scan”, as there are often easy and efficient ways to determine the portion of the array you want to iterate on. However, it can sometimes come in handy, like when the iterating function F can return an error, for example after some IO processing, making it easy to return early in case of error. The alternatives in such cases, like using a plain “while” or a recursive function, are cumbersome.
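A minimal Go model of “fold while” with a single list argument might look like this. The foldWhile helper and its parameter names are made up for this example.

```go
package main

import "fmt"

// foldWhile is a seeded left fold over ys that calls pred on the
// accumulator before each iteration and stops early, returning the
// accumulator, as soon as pred is false. (Hypothetical helper.)
func foldWhile[A, B any](pred func(A) bool, step func(A, B) A, seed A, ys []B) A {
	acc := seed
	for _, y := range ys {
		if !pred(acc) {
			break
		}
		acc = step(acc, y)
	}
	return acc
}

func main() {
	below10 := func(acc int) bool { return acc < 10 }
	add := func(acc, y int) int { return acc + y }
	// 0 -> 4 -> 9 -> 15, then the check fails and 7 is never added.
	fmt.Println(foldWhile(below10, add, 0, []int{4, 5, 6, 7})) // 15
}
```

In the error-handling use case mentioned above, pred would test that the accumulator is not an error value.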

4.5.8 How do decode/encode work?

Decode I/x is simply a polynomial evaluation function, implemented using Horner’s method, mostly equivalent to the adverbial projection 0{z+x*y}/. The argument x in I/x represents the coefficients, and I represents the bases, which can be a single number or a list of numbers.

Encode I\x is the inverse of decode, limited to non-negative integer bases. As a special case, using 0 as base in encode is equivalent to using an integer larger than the maximum value being encoded, so it can be a convenient leftmost base.
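Both operations can be sketched in Go for a list of positive integer bases. The decode and encode helpers below are illustrative only: they require one base per digit and do not model the single-base form or the special base 0.

```go
package main

import "fmt"

// decode evaluates digits in the given mixed bases using Horner's method.
// bases and digits must have the same length. (Hypothetical helper.)
func decode(bases, digits []int) int {
	acc := 0
	for i, d := range digits {
		acc = acc*bases[i] + d
	}
	return acc
}

// encode is the inverse of decode for non-negative n and positive bases,
// producing digits from the rightmost base to the leftmost one.
// (Hypothetical helper.)
func encode(bases []int, n int) []int {
	digits := make([]int, len(bases))
	for i := len(bases) - 1; i >= 0; i-- {
		digits[i] = n % bases[i]
		n /= bases[i]
	}
	return digits
}

func main() {
	hms := []int{24, 60, 60}
	fmt.Println(decode([]int{10, 10, 10}, []int{1, 2, 3})) // 123
	fmt.Println(decode(hms, []int{1, 2, 3}))               // 3723 seconds
	fmt.Println(encode(hms, 3723))                         // [1 2 3]
}
```

Note that the leftmost base never multiplies anything in decode, which is why any sufficiently large value (or 0 in Goal) works there.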

4.5.9 How does time handle locations?

Specifying a location follows the conventions described in the documentation of the LoadLocation function from Go’s time package, using the mapping from names to locations described by the IANA Time Zone database. Note that "UTC" is always the default location, both when parsing and formatting. To use the local time zone, you need to pass "Local" as location.

Location is used when formatting integer time values into a string representation, and when parsing non-integer time values without time zone offset information. In the latter case, formatting will still use "UTC". If you need to convert from a string representation of time to another with a specific location, you first need to convert to an integer time value.

4.5.10 How do time layouts work?

Time layouts follow the conventions described in Go’s time package, using the reference time "01/02 03:04:05PM '06 -0700", which is easy enough to remember (except for the historic convention of month 01 before day 02). For convenience, the predefined constant layouts can be used as fmt format argument too, for example "UnixDate" or "DateOnly". Any new additions in the Go package will only be supported after some delay, because we want to support the latest two Go releases at least.
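Since layouts are Go layouts, their behavior can be checked directly in Go. The sketch below formats an integer Unix time in UTC; the formatUnix helper is made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// formatUnix formats a Unix time in seconds using a Go layout, in UTC.
// (Hypothetical helper.)
func formatUnix(sec int64, layout string) string {
	return time.Unix(sec, 0).UTC().Format(layout)
}

func main() {
	const t = 1700000000
	// A custom layout written with the reference time's fields.
	fmt.Println(formatUnix(t, "2006-01-02 15:04:05")) // 2023-11-14 22:13:20
	// The full reference time used as a layout.
	fmt.Println(formatUnix(t, "01/02 03:04:05PM '06 -0700")) // 11/14 10:13:20PM '23 +0000
	// One of the predefined constant layouts.
	fmt.Println(formatUnix(t, time.UnixDate))
}
```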

4.5.11 Does time support date calculations?

The time verb does not provide a special way to add raw durations, subtract time values, or perform comparisons on them. Those can be performed on an integer representation of the time like the one given by "unix". Note therefore that, unlike time.Time values in Go, time difference operations on values obtained in successive reads of the current time won’t implicitly use a monotonic clock reading.

Date calculations, like adding months or days, can be performed using the "date" format, as it normalizes values outside their usual range, like the time.Date function from Go’s time package. For example, time["date";2024 14 0;"date"] returns 2025 1 31. As per the help, some format names can be used both as cmd or fmt argument: here, the first "date" argument asks for results in (year;month;day) form, while the "date" used as third argument determines how the time argument is interpreted. Like time.Date, the "date" format also accepts extra optional clock fields for hour, minute, second, and nanoseconds. Moreover, note that to process a list of dates at once, you have to pass a single (I;I;I) list with all years, months and days instead. This is in line with how other primitives like odometer or encode/decode work in Goal, the rationale being that a short list of long nested ones is more efficient and convenient than a long one of short nested ones.
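The normalization mentioned above is that of Go’s time.Date, so it can be reproduced in Go. The normalize helper is made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// normalize returns the (year, month, day) that time.Date produces for
// possibly out-of-range month and day values. (Hypothetical helper.)
func normalize(year, month, day int) (int, int, int) {
	t := time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
	return t.Year(), int(t.Month()), t.Day()
}

func main() {
	// Month 14 of 2024 is February 2025; day 0 is the last day of the
	// previous month.
	fmt.Println(normalize(2024, 14, 0)) // 2025 1 31
	// Day 60 of January 2024 (a leap year) is February 29.
	fmt.Println(normalize(2024, 1, 60)) // 2024 2 29
}
```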

4.5.12 How does json handle booleans and nulls?

Parsing and encoding use -0w for false and 0w for true. Similarly, 0n is used for nulls. One has to be careful not to encode infinity by mistake. The choice of using special floating values is somewhat arbitrary but simple to implement and understand, and it takes advantage of -0w being a false value, and of 0n being easy to handle thanks to the nan verb. Note that some kind of arbitrary choice had to be made anyway, because Goal lacks both a dedicated boolean type and a general kind of null value.

4.5.13 What kinds of errors do primitives return?

IO primitives, as well as parsing primitives and forms like json, time or "v"$, return an error value when appropriate. The underlying values of such errors can be plain strings, but they are usually dicts. In the latter case, the only mandatory key is "msg", which provides an error message suitable for user consumption. See the question about how error values work for the basics of error handling in Goal.

Depending on the kind of error, various kinds of fields with extra information can be provided, as described in the following table:

The lib/os.goal and lib/fs.goal user libraries define a few globals with portable error strings that can be compared with the "err" field when present: ErrInvalid, ErrPermission, ErrExist, ErrNotExist, and ErrClosed. Each of those abstracts one or more concrete kinds of errors in a portable way.

4.5.14 What is the purpose of dirfs?

Goal’s verbs glob, import, open, read, and stat accept a read-only file system value as left argument, with semantics similar to the monadic cases but using the provided file system value instead of the host’s file system rooted at the current directory. There is also a subfs dyad that is used to derive a new file system rooted at a subtree. All those operations accept slash-separated paths, working portably on all systems.

In the case of import, the semantics follow the same rules as the empty extension GOALLIB case but using the provided file system value instead, and accepting a file name with extension as well.

The dirfs monad returns a read-only file system value as provided by the host operating system, rooted at the given directory. Extensions may provide other kinds of file system values that can then be used from Goal using the same builtins.
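For readers who know Go, the closest standard-library analogs are probably os.DirFS, fs.Sub and fs.Glob. Whether Goal’s dirfs and subfs are built on them is an assumption here. The sketch below only illustrates the idea of a read-only file system value rooted at a directory and addressed with slash-separated paths. The helper names readVia and demo are hypothetical:

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// readVia reads a file through any read-only file system value, using a
// slash-separated path regardless of the host operating system.
func readVia(fsys fs.FS, name string) (string, error) {
	b, err := fs.ReadFile(fsys, name)
	return string(b), err
}

// demo creates a temporary tree containing lib/a.goal, then reads that
// file through a directory-rooted file system and through a subtree of it.
func demo() (fromRoot, fromSub string, matches []string, err error) {
	root, err := os.MkdirTemp("", "goal-faq")
	if err != nil {
		return "", "", nil, err
	}
	defer os.RemoveAll(root)
	if err = os.Mkdir(filepath.Join(root, "lib"), 0o755); err != nil {
		return "", "", nil, err
	}
	if err = os.WriteFile(filepath.Join(root, "lib", "a.goal"), []byte("1+2"), 0o644); err != nil {
		return "", "", nil, err
	}

	fsys := os.DirFS(root) // rooted at root, comparable to a dirfs value
	if fromRoot, err = readVia(fsys, "lib/a.goal"); err != nil {
		return "", "", nil, err
	}
	sub, err := fs.Sub(fsys, "lib") // rooted at a subtree, comparable to subfs
	if err != nil {
		return "", "", nil, err
	}
	if fromSub, err = readVia(sub, "a.goal"); err != nil {
		return "", "", nil, err
	}
	matches, err = fs.Glob(fsys, "lib/*.goal")
	return fromRoot, fromSub, matches, err
}

func main() {
	a, b, names, err := demo()
	fmt.Println(a, b, names, err) // 1+2 1+2 [lib/a.goal] <nil>
}
```

Because readVia only depends on the fs.FS interface, any other kind of file system value (an embedded one, an archive) could be passed to it unchanged, which is the point made above about extensions.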

4.5.15 Are stdin/stdout/stderr/… configurable in run and open? #

Both run and open accept a left dict argument to handle more advanced use cases. By default, commands inherit the parent’s stdin, stderr, and stdout, as well as its environment and working directory. When using a dict, the open dyad uses the special "mode" key for specifying the mode, defaulting to "r". The accepted configuration keys are described in the table of configuration keys for run and open.

When used in pipe configuration, file handles using modes "r" or "w" with buffering disabled will be directly connected to the standard input/output/error of the process.

4.5.16 How do "sr" and "sw" open modes work? #

These modes produce handles for reading and building in-memory strings. The string-writer "sw" mode can be useful when redirecting standard output or error in a command pipe, or when progressively building a long string. The string-reader "sr" mode is less useful, but can be used to pass a string as a handle to a function expecting a handle argument.

The "sw" open mode accepts either an initial string s argument, or an initial buffer size i argument.
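As a loose analogy in Go, the implementation language, a string-writer handle behaves like a strings.Builder and a string-reader handle like a strings.Reader: both can be handed to code that only expects generic handles. The function names below are hypothetical, and the sketch says nothing about how Goal implements these modes:

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// shout copies everything from r to w in upper case. It only knows about
// generic reader and writer handles, not about strings.
func shout(w io.Writer, r io.Reader) error {
	b, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	_, err = io.WriteString(w, strings.ToUpper(string(b)))
	return err
}

// build progressively assembles a string through a writer handle.
func build() (string, error) {
	var sb strings.Builder  // plays the role of an "sw" handle
	sb.WriteString("log: ") // initial string content
	err := shout(&sb, strings.NewReader("hello")) // reader plays the role of "sr"
	return sb.String(), err
}

func main() {
	s, err := build()
	fmt.Println(s, err) // log: HELLO <nil>
}
```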

4.6 Caveats #

4.6.1 Why do -2+3 and - 2+3 give different results? #

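In the first case, -2 is parsed as a single token, so the expression adds the number -2 to 3 and gives 1. In the second case, the - represents the verb “negate”, which applies to the result of 2+3 and gives -5.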

4.6.2 How do operations behave with 0n NaN inputs? #

Goal follows the usual floating-point conventions for NaN values in most operations. Any atomic comparison primitive (among =, <, and >) where either operand is 0n will return 0. Most arithmetic operations (like +, -, *, %, or abs) propagate NaN values, returning 0n when either operand is NaN. In a few special cases (min &, max |, and sign), behavior with NaN inputs is implementation-dependent (in particular, amd64 builds don’t propagate NaN values in a standard way for min/max unless the -tags nosse4 build option was used).

When appropriate, to avoid any issues, use the nan verb’s monadic and dyadic forms to search for or replace NaN values.

Note that non-atomic comparison primitives are not bound by the floating-point standard rules. In particular, the match ~ dyad supports NaNs and 0n~0n holds. As a result, all (self-)search primitives like “classify”, “find” or “distinct” support inputs containing 0n too, using the same matching convention. Also, as a special case, sorting primitives sort NaNs before other numeric values.
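The floating-point conventions referred to in this answer are the IEEE 754 ones, which Go follows as well. The Go sketch below shows the three behaviors discussed: comparisons with NaN are false, arithmetic propagates NaN, and a match-like equality can choose to treat NaN as equal to itself. The helpers same and replaceNaN are hypothetical stand-ins for the ideas behind ~ and the dyadic nan, not their implementation:

```go
package main

import (
	"fmt"
	"math"
)

// same reports whether a and b match, treating NaN as equal to itself,
// unlike the == operator.
func same(a, b float64) bool {
	return a == b || (math.IsNaN(a) && math.IsNaN(b))
}

// replaceNaN returns a copy of xs where every NaN is replaced by fill.
func replaceNaN(xs []float64, fill float64) []float64 {
	out := make([]float64, len(xs))
	for i, x := range xs {
		if math.IsNaN(x) {
			x = fill
		}
		out[i] = x
	}
	return out
}

// sample returns a short list containing one NaN.
func sample() []float64 {
	return []float64{1, math.NaN(), 3}
}

func main() {
	a, b := math.NaN(), math.NaN()
	fmt.Println(a == b, a < 1, a > 1)  // false false false
	fmt.Println(math.IsNaN(a + 1))     // true
	fmt.Println(same(a, b))            // true
	fmt.Println(replaceNaN(sample(), 0)) // [1 0 3]
}
```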

4.6.3 Implicit numeric type conversions and overflow #

Goal has both 64-bit integer and floating point numbers, whose types are "i" and "n" as returned by @ respectively. Primitives convert from one to another whenever possible, so most applications do not have to care about this distinction.

Conversion from integer to float means that big integers might be approximated. From float to integer, if the float is too big to be represented or is NaN, it will not be considered an integer by primitives that want an integer. Also, operations on integer operands can overflow, as defined by two’s complement integer overflow.
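Go’s int64 and float64 have the same properties, so the three effects described above can be reproduced in a short Go sketch. The helper names are hypothetical, and asInt is only one plausible way to decide whether a float can stand for an integer, not necessarily the rule Goal’s primitives apply:

```go
package main

import (
	"fmt"
	"math"
)

// limit is 2^63 as an untyped constant; valid int64 values lie in [-limit, limit).
const limit = 1 << 63

// wrapAdd adds two int64 values. Signed overflow in Go wraps around
// following two's complement arithmetic.
func wrapAdd(a, b int64) int64 { return a + b }

// roundTrip converts an integer to float64 and back, which may lose
// precision for magnitudes above 2^53.
func roundTrip(i int64) int64 { return int64(float64(i)) }

// asInt returns f as an int64 when f is integral and in range, and
// reports false for NaN, infinities, fractions and too-big values.
func asInt(f float64) (int64, bool) {
	if math.IsNaN(f) || f != math.Trunc(f) || f < -limit || f >= limit {
		return 0, false
	}
	return int64(f), true
}

func main() {
	var big int64 = 1<<53 + 1
	fmt.Println(roundTrip(big) == big)                      // false
	fmt.Println(wrapAdd(math.MaxInt64, 1) == math.MinInt64) // true
	fmt.Println(asInt(3.0))                                 // 3 true
	fmt.Println(asInt(1e300))                               // 0 false
}
```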

One thing worth noting is that while integers and floats are two different types, Goal does not allow flat generic arrays with mixed floats and integers: it will convert all elements to floats in such cases, because that’s what’s most convenient and efficient in the common case. If mixed numeric types without coercion are needed, you will have to enlist the values separately or append a dummy generic value.

4.6.4 Fold and Each on empty lists #

When applying a function f using fold on an empty list x, as in f/x, the result is the default zero value fill for that type of list. There is a special exception, though: specially recognized adverbial forms (see section on special combinations) return a neutral element for the involved operation instead, which helps avoid unwanted edge cases in common operations. For example */!0 returns 1, |/!0 returns 0i (both the smallest possible integer and a false value), and |/?0 returns -0w.
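The neutral-element idea can be sketched in Go with a generic fold that starts from an explicit identity value, so that an empty input simply yields that value. This is an illustration of the principle only: Goal picks the neutral element itself for recognized forms, and the function names below are hypothetical. The minimum int64 plays the role of 0i, and negative infinity the role of -0w:

```go
package main

import (
	"fmt"
	"math"
)

// fold reduces xs with f starting from the neutral element id, so an
// empty input yields id.
func fold[T any](f func(T, T) T, id T, xs []T) T {
	acc := id
	for _, x := range xs {
		acc = f(acc, x)
	}
	return acc
}

func mul(a, b int64) int64 { return a * b }

func maxInt(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}

func maxFloat(a, b float64) float64 { return math.Max(a, b) }

func main() {
	fmt.Println(fold(mul, 1, []int64{}))                  // 1
	fmt.Println(fold(maxInt, math.MinInt64, []int64{}))   // -9223372036854775808
	fmt.Println(fold(maxFloat, math.Inf(-1), []float64{})) // -Inf
	fmt.Println(fold(mul, 1, []int64{2, 3, 4}))           // 24
}
```

Using the neutral element means that folding a list split in two parts and combining the partial results gives the same answer even when one part is empty.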

The “each” adverb also has a special consideration with respect to empty lists: the result is usually an empty generic list, but specially recognized forms may return specialized kinds of empty lists. For example #'() returns !0, and $'() returns !"".

4.7 Scripting #

4.7.1 How do you exit early with a status from a script? #

When executing a script using the goal command-line interpreter, using return :x in global code exits the script immediately. The exit status will be non-zero if x is an error, and 0 otherwise. In the error case, the error message will be displayed on standard error before execution ends. Also, if the error value is a dict with a "code" key, the corresponding value is used as the exit status instead of the default 1, following a convention similar to the one of the run dyad. Only portable integer values within [1,125] are supported for the exit status.
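The exit status rules above can be summarized as a small decision function. The Go sketch below is hypothetical throughout: the codedError type stands in for an error dict with a code entry, and since the text does not say what happens for codes outside [1,125], the sketch arbitrarily falls back to the default 1 in that case:

```go
package main

import (
	"errors"
	"fmt"
)

// codedError is a hypothetical error carrying an optional exit code,
// standing in for an error dict with a code entry.
type codedError struct {
	msg  string
	code int
}

func (e *codedError) Error() string { return e.msg }

// exitStatus maps the value returned by a script to a process exit
// status: 0 when there is no error, the error's code when it is within
// the portable range [1,125], and 1 otherwise.
func exitStatus(err error) int {
	if err == nil {
		return 0
	}
	var ce *codedError
	if errors.As(err, &ce) && ce.code >= 1 && ce.code <= 125 {
		return ce.code
	}
	// Default status. Using it for out-of-range codes too is an
	// assumption of this sketch, not documented behavior.
	return 1
}

func main() {
	fmt.Println(exitStatus(nil))                              // 0
	fmt.Println(exitStatus(errors.New("boom")))               // 1
	fmt.Println(exitStatus(&codedError{"not found", 3}))      // 3
	fmt.Println(exitStatus(&codedError{"out of range", 300})) // 1
}
```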

When executing the code using the Go API instead, it is possible to inspect the returned value and handle it in whatever way is most appropriate.

Note that you can’t exit directly from within a lambda: you have to return from the lambda first, and then return early in global code.

4.7.2 Is there something similar to awk or perl -p one-liners? #

Goal does not provide a built-in command-line option for that kind of mode of operation, but the examples/goalx script at the root of the distribution provides an alternative that can be used for similar purposes.

Note that the goalx script works a bit differently from AWK and Perl, because Goal favors loading a whole file into memory and performing whole-array operations on all the lines, like other array languages, instead of working line by line.

4.7.3 Is there editor support for Vim/Emacs/LSP/...? #

Advanced features like language-aware auto-completion and syntax checkers are unlikely to be very useful when writing typical Goal scripts, but some syntax highlighting is nice.

The vim-goal repository provides syntax highlighting support for Vim. It is what I use and maintain. There is not yet support for other editors (that I’m aware of). Highlighting beyond strings, numbers and comments is not really that useful, so adding support for basic syntax highlighting should be simple enough to do for most editors, and might be both a good contribution and learning experience. LSP would be a nice bonus.

4.8 Interactive use #

4.8.1 How do you quit the REPL? #

Quitting the read-eval-print-loop is done by closing standard input. This can be done with Ctrl-D on Unix-like systems. Typing close STDIN also works.

4.8.2 Can you get standard shortcuts and completion in the REPL? #

Goal’s interactive mode is quite minimal, so you need to use an external program like the readline-wrapper rlwrap to get a more convenient interactive experience. It’s as easy as installing rlwrap and then typing rlwrap goal instead of just goal. The rlwrap program provides some programmable completion functionality.

Also, the ongoing project ari by semperos aims at providing an interactive programming environment built on top of Goal. Among other extra features, including a dedicated SQL mode, it provides language-aware auto-completion and a more interactive help than Goal’s default minimal REPL.

4.8.3 How do you clear the screen in the REPL? #

On systems that have a clear command, you can clear the screen in interactive mode by executing print run "clear";.

4.9 Implementation #

4.9.1 How is Goal implemented? #

Goal is implemented as an embeddable bytecode interpreter, written in Go, without any dependencies outside the standard library. Go provides good garbage collection, a comprehensive standard library, fast compilation, and a higher-level library interface than a non-GC language would. As a tradeoff, we cannot catch out of memory errors reliably in programs, and a panic is expected in such cases.

The implementation makes use of a recursive-descent parser that provides quite accurate error messages.

Interestingly, Goal is at least the third project for an array language in Go, after ivy and ktye/i. While Goal and Go are quite the opposite in terms of conciseness due to the huge gap in their programming paradigms, they both share a practical mindset that encourages idioms over abstraction, and writing executable code over writing declarations.

4.9.2 Is Goal’s performance any good? #

Array performance is good, with specialized algorithms depending on inputs (type, size, range), and variable liveness analysis that reduces cloning by reusing dead immutable arrays (in code with limited branching). On amd64, many algorithms have assembly SIMD implementations using the various SSE extensions up to SSE4.2 (no AVX yet). However, it is not a goal to reach state-of-the-art: we don’t use bit booleans nor the full range of integer sizes, fitting integers in arrays using either uint8 or int64 elements, striking a middle-ground between performance and implementation simplicity.

Scalar performance is typical for a bytecode-compiled interpreter (without JIT), somewhat slower than a well-tuned C bytecode interpreter: value representation in Go is less compact than how it could be done in C, but Goal does have unboxed integers and floats.

4.9.3 Does Goal optimize any special combinations of primitives? #

In floating-point operations like +/N, the implementation may choose, for performance reasons, to ignore non-associativity, potentially producing slightly different results than the non-optimized and explicitly sequential {x+y}/N. This currently happens on amd64 with the default optimized SIMD build.
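The effect of reordering floating-point additions is easy to reproduce. The Go sketch below compares a strictly sequential sum with a sum that keeps two independent accumulators, which is a simplified model of how a SIMD reduction groups additions. It is not Goal’s actual algorithm, and the function names are hypothetical:

```go
package main

import "fmt"

// sumSequential adds from left to right, like an explicit fold.
func sumSequential(xs []float64) float64 {
	s := 0.0
	for _, x := range xs {
		s += x
	}
	return s
}

// sumTwoLanes accumulates even and odd positions separately and combines
// both lanes at the end, which changes the order of additions.
func sumTwoLanes(xs []float64) float64 {
	var lanes [2]float64
	for i, x := range xs {
		lanes[i%2] += x
	}
	return lanes[0] + lanes[1]
}

func main() {
	// 1e16+1 rounds back to 1e16 in float64, so the sequential sum
	// loses the first 1, while the lane sum cancels 1e16 exactly first.
	xs := []float64{1e16, 1, -1e16, 1}
	fmt.Println(sumSequential(xs), sumTwoLanes(xs)) // 1 2
}
```

The input is deliberately extreme. With typical data the two results usually differ only in the last few bits, if at all.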

4.9.4 When does Goal perform in-place mutations? #

Goal’s arrays are immutable, but when the implementation can determine that an array will not be used again, it will use in-place mutation in most operations where it makes sense, like arithmetic operators, join or amend.

Goal makes use of a reference count and a variable liveness compilation pass to determine if a value is reusable and will not be used again. In typical branchless code portions the last use of local variables is always determined. In code with branches the analysis is incomplete and may not always allow in-place mutation when variables are used in branches (though common cases are still handled, like when the branch ends the function, including early-return cases, or when an explicit assignment operation is used).

Because memory management is handled by Go’s GC, Goal only keeps track of a reference count for arrays, and only if they’re not nested; nested arrays are simply marked as not reusable. This makes the implementation simpler and less error-prone, and reference count handling faster in the common case. However, if you have a matrix as a list of lists, any modification in a given line will replace the whole line, so consider batching updates or using a flat list in such cases.
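The reference-count part of this mechanism can be sketched with a toy model in Go. Everything below is hypothetical (type, field and function names), and the liveness analysis and the handling of nested arrays are not modeled. The sketch only shows the basic decision: reuse the storage when the array has a single reference, clone it otherwise:

```go
package main

import "fmt"

// array is a toy array value with a reference count.
type array struct {
	refs int
	data []int64
}

// amend returns an array equal to a with element i set to v. When the
// caller holds the only reference, the storage is mutated in place;
// otherwise the data is cloned and the shared array is left untouched.
func amend(a *array, i int, v int64) *array {
	if a.refs > 1 {
		a.refs-- // the caller gives up its reference to the shared array
		data := make([]int64, len(a.data))
		copy(data, a.data)
		a = &array{refs: 1, data: data}
	}
	a.data[i] = v
	return a
}

func main() {
	a := &array{refs: 1, data: []int64{1, 2, 3}}
	b := amend(a, 0, 9)
	fmt.Println(a == b, a.data) // true [9 2 3]

	a.refs++ // simulate a second live reference to a
	c := amend(a, 1, 7)
	fmt.Println(a == c, a.data, c.data) // false [9 2 3] [9 7 3]
}
```

From the point of view of a program, both calls behave as if a new array had been produced. The in-place path is only an optimization that is invisible as long as nobody else can observe the old value.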

Zero value fills by array type (see 4.4.4):

array type	zero value fill
`"I"`	`0`
`"N"`	`0.0`
`"S"`	`""`
`"A"`	zero value of type `@*X` (defaulting to `!""` if `X` is empty)

Error dict keys (see 4.5.13):

key	description (type)
`"code"`	exit code of external command (i)
`"err"`	short description of the error’s nature (s)
`"layout"`	format layout of `time` parse error (s)
`"newpath"`	new path name in `rename` (s)
`"offset"`	number of bytes parsed before error occurred (in `json` and `"v"$`) (i)
`"oldpath"`	old path name in `rename` (s)
`"op"`	name of path-related operation (s)
`"out"`	standard output of external command (s)
`"path"`	file path (s)
`"syscall"`	name of syscall (s)
`"time"`	date/time of `time` parse error (s)

Configuration keys for run and open (see 4.5.15):

key	description (type)	`open` modes
`"buf"`	whether to enable buffering (i)	`"w" "a"`
`"dir"`	working directory (s)	`"pw" "pr"`
`"env"`	`"key=value"` environment list (S)	`"pw" "pr"`
`"err"`	stderr filename (s, `""` to discard) or handle (h)	`"pw" "pr"`
`"in"`	stdin filename (s) or handle (h)	`"pr"`
`"out"`	stdout filename (s, `""` to discard) or handle (h)	`"pw"`
`"s"`	stdin from input string (s) or lines (S)	`"pr"`

