I’ve had a strong liking for array programming languages for quite a few years, but I never managed to like array-based text processing. The approach provides smart and efficient solutions for some kinds of parsing tasks, as BQN’s parser illustrates well, but it does not offer immediate solutions to various fundamental text processing needs that arise in common scripting tasks.
Also, Unicode and UTF-8 don’t play well with the array vision. Text is complicated: there is no straightforward mapping between “character” (a somewhat ill-defined notion that is maybe best approached by grapheme clusters) and bytes or code points. For example, some abstract characters cannot be encoded by a single code point, and some abstract characters have more than one possible encoding.
That’s why I feel a scripting language should consider strings as a whole in most operations, which in an array language carries implicit iteration over collections of strings as an added benefit. Moreover, other than providing all the usual string handling functions, I think there is value in integrating standard text-processing features into the design. Regular expressions are a nice tool for describing character classes and Unicode properties: having dedicated syntax support makes them more convenient and less error-prone. String interpolation and quoting constructs make simple templating more intuitive.
Text processing aside, Goal does have a few other characteristics that I wanted to see in an array language:
I wanted some of BQN’s primitives, but with less tacit stuff and leaving the multi-dimensional complexity out, like K. Dictionaries are nice to have too.
I wanted both “ASCII is easy to type” and “no digraphs”. Actually, there are a few exceptions that prove the ASCII rule to help with “no digraphs”, but they each have an ASCII keyword or idiom alternative, and they’re still common enough symbols that I have direct access to them on my bépo keyboard layout :-)
Goal’s easily embeddable and extensible in Go, which has a nice ecosystem and provides a higher-level interface than C.
Last but not least, I had some previous experience with various aspects of compilation, but I had never written a whole bytecode interpreter from scratch before: it’s a great and fun experience!
Goal drew on many sources of inspiration, both for design and implementation.
Language design was greatly inspired by both K (for syntax and many primitives), in particular the ngn/k dialect, and BQN (a few fundamental primitives like group by, classify and shifts). I was thinking of Perl and Raku when adding regexp literals, quoting constructs, and a couple of IO primitives. There is some inspiration from the implementation language, Go: similar syntax for numbers, string literals, and time layouts.
I wrote the bytecode implementation after reading the one for GoAWK, and it still shows. I wrote the scanner after reading ivy’s. Vim syntax highlighting is based on ngn/k’s. Be sure to check out all those great projects if you haven’t yet!
“Goal” stands for Go Array Language. Not to be confused with GOAL, as in Game Oriented Assembly Lisp, nor with its namesake GOAL the agent programming language!
Goal uses both symbols and keywords depending on the nature of the operation and its frequency.
Goal prefers symbols for common pure operations, like most programming languages do for arithmetic operations. In an array language, the range of interesting pure and common operations is significantly larger. Also, due to practical considerations, Goal extends that reasoning to common string handling functions too.
The usefulness of concise notation is well-known in mathematics, and array languages have made use of it since early APL versions. Unlike APL or BQN, but like its main inspiration K, Goal prefers highly-polysemic symbols, preferably ASCII.
That choice is based on my personal experience that non-extensible polysemy is intuitive and natural as long as the meanings are both mnemonic and easily resolved by context. Note that using words would not be as well-suited in that respect, because natural language already has its own polysemic meanings.
For I/O operations, as well as less frequent operations, Goal uses keywords. This makes the distinction between math-like code and stateful code with side effects clear.
A few primitives have non-ASCII symbols: «, », ¿, and ´.
The first three have keyword alternatives. The last does
not have a single-keyword alternative (due to lack of
adverbial keywords which would be quite verbose anyway).
It does though have reasonable idiom alternatives. While not
ASCII, all of those symbols are still very common symbols
found even in Latin1 and used in many natural languages, so
they should be accessible on most systems without
configuration.
Goal optimizes tail calls made with the special
o
variable. Other kinds of tail calls are not optimized: in
particular, tail-call elimination is not performed for
mutually recursive functions.
Most loops in Goal are performed implicitly by primitive verbs. Other kinds of loops are replaced with functional alternatives using the various adverbial forms. Absence of any kind of imperative backwards flow in the language simplifies both language semantics and static analysis of bytecode in the implementation.
Both verbs and adverbs make the lack of imperative loop syntax mostly a non-issue. Sometimes, due to the lack of closures, you might need to explicitly pass around state from local variables, for example with the “while” adverbial form or when writing a tail-recursive function. You may also simply use a global: abusing globals is harder in a language where explicit loops are rare anyway.
Most other K dialects do not feature closures either, or only in some restricted form. The lack of closures is rarely felt, as functional projections can be used to pass variables as extra arguments: typical cases can be written succinctly with tacit composition and field expression syntax. In this context, not supporting closures helps with implementation simplicity and performance with little drawback.
If you feel you need mutable closures at some point, first check whether what you want to do can be done easily with the “while” or “fold while” adverbial forms, or with a tail-recursive function. If not, just try using a global variable instead: it’s the only source of mutable state in Goal, and it’s perfectly reasonable for typical scripts, as it should be at most an occasional need.
No, and I don’t even use it as such. I don’t have much experience with code golf. Most array languages are concise by their very nature and their larger range of symbol operators, so Goal shares some of this heritage.
Both Goal the language and the interpreter are stable. Since version 1 was released, programs written in Goal or using its Go API are expected to continue to work in future versions without changes, within some limits detailed below.
While compatibility will be maintained for most programs, there are a few limitations to the compatibility promise to keep in mind:
Bugs will be fixed, in particular if they violate documented behavior or rely on an unspecified edge-case nobody cares about. This is unlikely to break user code, but it could. The changelog should document those. Of course, given that Goal doesn’t provide a full language specification, “unspecified edge-case” should be interpreted magnanimously: while stability is an important priority, Goal is a practical but opinionated niche language, developed as a hobby project for fun.
New features, libraries, or optional extensions may be explicitly marked as experimental at first. The compatibility guarantees won’t apply to them until stabilized in a later release.
New keywords may be defined under the reserved
rt
and
os
dot prefixes (the latter is currently unused).
If new keywords are added, they’ll either use one of those
prefixes, or be optional (through a build option). It is
recommended that extensions provide a way to use a custom
prefix. Also, most of the time, exporting new functionality
as globals instead of new keywords is preferable.
Builtins like
read
or
stat
may return struct-like dict results: new fields may be
added, so dict length and key order may change.
Panic error messages may be improved: do not rely on their contents.
Goal is space-insensitive for the most part, except in a few cases where spaces are either needed or forbidden to disambiguate:
A space on at least one of the dot sides disambiguates n-ary
application
a . b
from a variable name
a.b
with dot-prefix.
Also, field syntax
a..b
binds identifiers tightly.
No space is allowed between a verb or noun and the adverb
that modifies it.
Otherwise
/
,
\
and
'
would get their non-adverb syntax meanings: commenting,
logging, and early-return on error.
In an assignment operation, no space is allowed between the operator and the colon.
In bracket indexing, like in
f[x;y]
,
no space is allowed between the indexed (or applied) value
and the left bracket
[
.
Otherwise the code between brackets would be parsed as a
sequence instead.
When minus
-
is followed by a number with no space in between and also
doesn’t tightly follow a noun, it’s parsed as a single token
with the number and not as a verb.
No space is allowed between the quoting construct name, like
qq
or
rx
,
and the starting delimiter. This shouldn’t come as a
surprise, but Perl allows such a fancy thing, so I’m
mentioning it for completeness :-)
Other than those few cases, you are free to use spacing in the way that seems the most readable for you.
As stated in the help, newlines are ignored after any of
({[
or before
)}]
but act as semi-colons otherwise. There’s still one
minor difference: duplicate newlines are ignored so that
it’s possible to freely use spacing and comments within a
multi-line list.
: syntax or an operator?
Colon can represent various things in Goal: early return, a
verb returning its right argument, assignment syntax, or
either a monadic verb or modified assignment marker when
tightly following an operator (like in
+:
).
Despite all those different uses, there’s no confusion in
practice.
The assignment or modified assignment meaning is used when
it follows an identifier like
x
,
an indexing L-value like
x[i]
,
or a list of identifiers like
(a;b;c)
(only plain assignment).
If the colon is followed by indexing brackets, the verb
meaning is used. When there is an expression on the right,
but no noun on the left, a single colon means “return”.
Goal uses parens for two things: list creation and
controlling operation precedence. List creation happens when
one or more semicolons
;
appear within parens. The semicolon is used as item
separator, and items are evaluated left-to-right.
Otherwise, the parens are used to control precedence of
operations, as is usual in mathematics and most languages.
Lists with a single item are created using the “enlist”
monadic form
,x
.
As a special case, a pair of lone parens
()
represents an empty generic list. Note that other kinds of
empty lists don’t have a special syntax and have to be
produced using primitives: an empty list of
integers is
!0
,
an empty list of strings is
!""
,
and an empty list of floats is
?0
.
While not syntax, those forms are currently optimized by a
basic constant-folding pass.
Comments in Goal are either line-based or multi-line: there
are no C-style delimited block comments that let you control
both comment start and end within a line. However,
the discard
`expr
form, which ignores the expression on its right, can
be used for similar purposes. As a special case,
(`expr)
is not parsed as an empty generic list, but is completely
ignored instead. This can be used to even more finely
control the portion of the code that should be ignored, when
necessary.
Also, the special
x:expr
form with literal number or string
x
can be used as a sort of prefix comment before an
expression: it is recognized and optimized, so there’s no
runtime overhead.
Number literals are based on Go’s integer and floating-point
literals. Goal first attempts to parse a number literal as a
64-bit integer using
ParseInt
(with automatic base recognition based on string prefix),
then as a 64-bit float using
ParseFloat,
or as a 64-bit integer duration using
ParseDuration.
Moreover, Goal introduces special literals for a few
specific numeric values:
0i
for the smallest 64-bit integer value,
0n
for NaN,
0w
for positive infinity, and
-0w
for negative infinity.
Double-quoted string literals are similar to Go’s too, as
described
in the specification,
but they can be multi-line and support variable
interpolation using
$var
or
${var}
,
meaning that a literal
$
needs to be escaped.
The
qq/STRING/
form accepts the same syntax as double-quoted strings, but
supports a custom delimiter, as described in the help.
The
rq/STRING/
is a raw string literal variant supporting a custom
delimiter that can still be inserted, but by doubling it
instead of by using a backslash.
The syntax for regexps is the same as the one described in
the
regexp/syntax
Go package, using the default Perl-like syntax. As an
extension, Goal allows multi-line regexps when using the
quoting construct
rx/PATTERN/
.
In multi-line regexp literals, leading and trailing spaces
on each line are ignored, and space followed by
/
(or
#
if
/
is already used as the regexp delimiter)
starts a comment that spans until the end of the line.
Variable scoping rules in Goal are simple: variables are either global or local to a lambda function. In other words, Goal has global scope and function scope. There is no concept of block, and nested functions don’t have access to their parent’s scope (see related question about the lack of closures).
Within a lambda, single-colon assignment defines local
variables, so globals have to be assigned using double-colon
::
there.
A global variable can still be accessed, but priority is
given to the local variable in case of naming conflict.
Note, however, that assignment operations aren’t ambiguous,
so
::
isn’t needed for them. Also, local variable names cannot
have dots in them, so
::
isn’t needed either for global names with dots.
Note: on a related matter, keywords and variables share the same namespace in source code: keywords are resolved early during scanning and take priority over variable names. They are fixed and cannot be reassigned in any way from Goal code.
Goal’s tacit compositions are similar to other K dialects,
but they are just sugar for a lambda or a lambda projection.
A composition is formed from any kind of expression that
ends in a verb: it simply produces an equivalent function
with the implicit arguments added at the end. If the last
verb is monadic, the function takes just one argument
x
.
If it is dyadic, the function takes two arguments
x
and
y
.
Both dyadic built-in operators and derived verbs can be made
monadic by appending a
:
without spacing.
Dyadic keyword verbs can be made monadic by adding
::
at the end.
Note that arity in tacit compositions is a syntactic notion
unrelated to the semantic concept of
function rank.
Most compositions translate easily into a lambda, but when compositions make use of non-constant expressions, they are represented as a lambda projection. In particular, compositions do not capture global variables: those get automatically passed as extra arguments.
Some array languages support more complex tacit features, like BQN or J. Those features carry some cognitive overhead, at least for me. I find the regular switching between explicit and tacit styles distracting.
.. and field expressions work?
Double dot
..
is syntax sugar that can be used for several purposes, but
the main motivation is allowing concise evaluation of
expressions under a dict by referring to values using string
keys as unquoted variable names.
The simplest case is a tight bind between identifiers as in
x..a
,
which is parsed as a single token and expands to
x["a"]
.
There’s a special consideration with respect to bracket
indexing as in
x..a[y]
,
which expands to
x["a";y]
instead of
x["a"][y]
.
In the common case where
x
holds a dict to some nested array, the result is the same,
but merging both applications is closer to how one would
write such cases by hand without using the
..
syntactic sugar, and it also simplifies making extensions
with new kinds of values callable in a method-like style
(for example a hypothetical heap value type could be made
usable as in
x..push[y]
).
You can write
(x..a)[y]
to prevent this merging behavior when appropriate.
When
..
is not tightly surrounded by two identifiers, as in
x .. expr
where
x
can be any kind of expression,
a more general expansion is performed: if provided,
x
is passed as a single argument to a lambda (or lambda
projection) described by
expr
using special variable scoping rules, without braces.
Any variable named
a
(without dot-prefix and not among
x
,
y
,
z
,
and
o
)
appearing in
expr
expands to
x..a
,
that is
x["a"]
.
Variables with a
p.
prefix are passed as extra projection arguments from the
parent context (without the prefix), while variables with a
q.
quoting prefix are inserted as-is (without the prefix).
Other kinds of dot-prefixed names always represent globals
and don’t get any kind of special treatment.
Note that those expansion rules are not only convenient for
concise dict manipulation but also sometimes for concise
definition of lambda projections without having to
explicitly pass local variables from a parent context as
extra arguments.
When
..
is tightly followed by an opening bracket
[
,
we get unquoted assignment-style syntax for dict amend
d..[a:e1;b:e2]
,
that expands to
@[x;"a""b";:;x..(e1;e2)]
,
which is further expanded following the field expression
expansion rules already described.
When
d
is not provided, the syntax is simply used as dict creation
syntax, expanding directly to
"a""b"!(e1;e2)
,
as already shown in the help.
As a special case, when keys are provided without a
corresponding value in dict creation, as in
..[a;b]
,
the value is assumed to come from a variable named as that
key, expanding to
..[a:a;b:b]
,
that is
"a""b"!(a;b)
.
This is useful to create a dict from a few variables by
using their names as keys without naming redundancy.
Practical examples of the various kinds of usages for
..
can be found in the
Working with tables
chapter.
Goal makes use of a simple flat naming scheme for globals so
that they can be efficiently stored in an array, supporting
fast access by index. For convenience, variable names can
make use of a dot prefix as in
pfx.name
.
Some primitives, such as
eval
and
import
,
allow setting a prefix for evaluation. That prefix works as
a means to add code into a
namespace
of choice.
The default for
import
is to use a prefix per file and use the filename as the
prefix, but a custom prefix (even empty) can be used.
Variables with no prefix can be accessed as
main.name
from other namespaces.
If you need a way to pass a namespace-like structure as a
value, you have to use a dictionary. The
var..field
syntax is convenient when using dictionaries for such
purpose. Also, note that the
p.
and
q.
prefixes have a special meaning and are used to refer to
projection variables and quoted variables in
field expressions.
In Goal, functions have a semantic rank (default arity). For example:
Monadic keywords and substitutions have a rank of 1.
Verbal operators and dyadic keywords have a rank of 2, even
when they also have a monadic case. Verbal operators can be
made monadic by appending a colon, as in flip
+:
.
Monadic marking with colon is also used in
tacit compositions
but for syntactic reasons instead of semantic function rank.
The rank of a lambda is the number of arguments, which unlike for built-in verbs is fixed.
The rank of a projection is either given by the number of gaps when created with explicit bracket projection syntax, or by the rank of the function minus the number of fixed arguments (for lambdas, projections and derived verbs).
The rank of a function is a semantic notion and should not be confused with syntactically resolved dyadic or monadic usage of a verbal operator.
Other than in projections created by implicitly fixing some
of the first arguments, rank is mainly used in adverbial
forms, for example with the “fold” adverb
/
,
which depending on the rank of the function it modifies can
either mean fold (rank > 1) or while/converge (rank 1). Note
that derived verbs also have a rank, which depends both on
the adverb and modified value:
The rank of
f'
is the rank of
f
,
or 1 if
f
is not a function.
The rank of
F`
and
F´
is always 2.
The rank of
f/
and
f\
is usually equal to the rank of
f
,
but folds and scans have a special rule: when deriving from
a function of rank 2, the result has rank 1 (corresponding
to the non-seeded case), instead of rank 2. This is for
convenience, so that idioms like
,//
use the monadic meaning of
,/
which is much more useful and frequent.
There are several kinds of false values in Goal: 0, 0i, 0n, -0w, "", rx//, the identity function (:), empty arrays and error values.
False values only matter when used in conditionals: the
?[cond;then;else]
form, as well as short-circuiting
and[x;y;…]
and
or[x;y;…]
.
Numerical
0
is the most common form of false value, while
0i
being false is useful so that
|/
on an empty list of booleans returns a false value.
NaNs
0n
and floating point negative infinity
-0w
being false is consistent with
0i
being false. Also, empty strings or arrays and error values
being false is often convenient for obvious reasons.
In primitives that require it, Goal uses zero values to
provide fill elements, for example in outdexing,
folds over empty arrays,
or in the take/pad
i@y
verb form.
The kind of zero value fills used for an array
X
depends on its type and/or first element, as described in
the following table:
array type | zero value fill |
|
|
|
|
|
|
|
zero value of type
|
Note that this means empty generic arrays may lose type information: this could be partially solved with a smarter prototype/fill system, but Goal’s behavior is simpler to understand and implement.
Zero value fills are useful in practice: for example, outdexing the result of “group by” will always return an empty array. Moreover, by definition, zero values are false values, which can occasionally be convenient.
Goal has a dedicated value type
"e"
for non-fatal errors, as for example returned by some IO
primitives. As seen in the tutorial, custom error values of
any kind can be produced with
error
.
In addition to the
.e
form that retrieves the underlying value, and the syntactic
sugar
'e
that is used to return early on error, error values are
special in that they are callable using the
e..msg
form to produce an error message suitable for
user consumption, instead of a program string representation
as
$e
would return.
The
e..msg
call works in different ways depending on the type of error.
If
.e
is a string, number, or array, then
e..msg
returns a default string representation suitable as error
message.
Otherwise, the value
(.e)..msg
is computed: if it is a function
f
,
a default string representation of
f@.e
is returned; otherwise, a default string representation of
(.e)..msg
itself is returned.
The
e..msg
call is implicit in string interpolation,
say
and
print
,
as those contexts expect an error message suitable for user
consumption.
This is the case of arithmetic primitives, but also of most
other primitives when the generalization is useful and does
not conflict with other usages. Primitives recursively
operating at the scalar or atomic level are often referred to
as
scalar,
pervasive,
or
atomic
functions, but there’s no universal terminology for them.
Some non-arithmetic primitives are right-atomic, meaning
they handle the right argument recursively but not the left
one, like for example mod/div
i!n
.
More rarely, a primitive can be left-atomic, like
rotate
.
Primitives working on more than two arguments may be
pervasive on one or more of the arguments. For example, the
time
verb is pervasive with respect to the time
t
argument but not the others.
The amend form is pervasive with respect to the index and
function arguments.
The case of IO primitives is worth mentioning: none of them
is truly pervasive. In particular, they do not operate
recursively on generic array inputs. However, when the help
says they work on file name string inputs, they may accept a
list of strings too: this is the case for
import
,
mkdir
,
read
,
remove
,
and
stat
.
Unlike when calling them using the “each” adverb, they
return an error as soon as a call on any of the files
produces one. Also, both
mkdir
and
remove
return the number of files on success, while
stat
returns a table-like dict, instead of a list of dicts.
The monads
abspath
and
glob
also accept a list of strings but with special
non-pervasive semantics:
abspath
first joins the path elements into one, while
glob
gathers the path results for each pattern in a single common
list of strings.
In most cases, primitives handle dictionary values by
applying to their value arrays but returning matching keys
along when sensible. This happens, for example, in
monadic arithmetic forms like
-d
,
or dyadic ones when only one
argument is a dict like
n+d
,
as well as structural operations like reverse
|d
that simply reverses both key and value arrays.
For example:
-..[a:1;b:2] → ..[a:-1;b:-2]
|..[a:1;b:2] → ..[b:2;a:1]
The case of dyadic operations on dictionary pairs, for
arithmetic operators or merge, doesn’t require both dicts to
have a matching set of keys, using zero value fills as
needed. For example,
..[a:1;b:2]+..[b:3;c:4]
gives
..[a:1;b:5;c:4]
.
Also note that the merge form
d,d
uses upsert semantics: when some keys are common to both
dicts, it gives priority to values of the right one.
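As a rough model of the two dyadic behaviors above, here is a Go sketch using a hypothetical ordered `dict` type with string keys and integer values. It is not how Goal implements dictionaries. It only reproduces the zero-fill addition and the upsert merge on the example from the text.

```go
package main

import "fmt"

// dict is a hypothetical ordered dictionary: keys and values are kept
// in two parallel slices, which preserves insertion order.
type dict struct {
	keys []string
	vals []int
}

// index returns the position of key k, or -1 if it is absent.
func (d dict) index(k string) int {
	for i, key := range d.keys {
		if key == k {
			return i
		}
	}
	return -1
}

// combine applies f to values sharing a key. A key found only in b is
// combined with the zero value fill 0 and appended after the keys of a.
func combine(a, b dict, f func(x, y int) int) dict {
	r := dict{append([]string(nil), a.keys...), append([]int(nil), a.vals...)}
	for i, k := range b.keys {
		if j := r.index(k); j >= 0 {
			r.vals[j] = f(r.vals[j], b.vals[i])
		} else {
			r.keys = append(r.keys, k)
			r.vals = append(r.vals, f(0, b.vals[i]))
		}
	}
	return r
}

// add sums values key by key, using 0 for missing keys.
func add(a, b dict) dict { return combine(a, b, func(x, y int) int { return x + y }) }

// merge has upsert semantics: on common keys, the right value wins.
func merge(a, b dict) dict { return combine(a, b, func(x, y int) int { return y }) }

func main() {
	a := dict{[]string{"a", "b"}, []int{1, 2}}
	b := dict{[]string{"b", "c"}, []int{3, 4}}
	fmt.Println(add(a, b))   // {[a b c] [1 5 4]}
	fmt.Println(merge(a, b)) // {[a b c] [1 3 4]}
}
```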
The help only mentions cases that have some special
dictionary-specific semantics. Also, some primitives have
additional forms for dictionaries with string keys and
columnar values: notation
t
is used to refer to such table-like dicts in the help.
By passing the whole argument to the filter function and
making it return an array, the performance of
f#Y
and
f^Y
is greatly improved: the filter is only called once and
performs fast whole-array operations.
It’s also more flexible, as you can use the whole array to
compute the filtering condition: for example,
{x<(+/x)%#x}#Y
keeps all values less than the average.
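The design can be pictured with a small Go sketch: the predicate receives the whole slice once and returns a boolean mask, instead of being called once per item. The names `filterBy` and `belowAverage` are hypothetical and only serve the illustration.

```go
package main

import "fmt"

// filterBy calls pred a single time on the whole slice and keeps the
// items whose corresponding mask entry is true.
func filterBy(pred func([]float64) []bool, ys []float64) []float64 {
	mask := pred(ys)
	var out []float64
	for i, y := range ys {
		if mask[i] {
			out = append(out, y)
		}
	}
	return out
}

// belowAverage needs the whole slice to compute its condition, which a
// per-item predicate could not do without extra state.
func belowAverage(xs []float64) []bool {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	avg := sum / float64(len(xs))
	mask := make([]bool, len(xs))
	for i, x := range xs {
		mask[i] = x < avg
	}
	return mask
}

func main() {
	fmt.Println(filterBy(belowAverage, []float64{1, 2, 3, 10})) // [1 2 3]
}
```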
s$ work?
Format strings follow the conventions of
Go’s fmt package
with some limitations: only
%
-format
“verbs” related to integers, floating-point numbers and
strings are supported. Applying a format string to other
kinds of atomic values or using other formatting verbs will
produce an unspecified string result, but later Goal
versions may extend and specify support beyond that.
Format strings can work in two distinct modes in Goal.
If there is only one
%
-format
verb in
s
,
the
s$y
form is right-atomic, applying recursively to each atom in
y
.
Otherwise,
y
is expected to be an array or dict of same length as the
number of
%
-format
verbs appearing in
s
,
matching
%
-format
verbs to values in occurrence order (except if the
%[n]
indexing syntax is used).
General regexp application has the following signature:
r[x[;i;s]] .
The mandatory argument
x
should be a string or a recursively-handled list whose atoms
are all strings. If no additional arguments are provided, as
in
r[x]
,
the type of result depends on whether the regular expression
r
contains capturing groups. If it does not contain a
capturing group, the operation returns a boolean integer
telling whether the string contains any match of the regular
expression. Otherwise, the result is a list containing the whole
regexp leftmost match, followed by any submatches.
The optional argument
i
specifies the maximum number of matches (negative means any
number). A regexp with capturing groups still behaves the
same, but returns a list of results for successive matches.
A regexp without capturing groups now simply returns a list
of successive string matches.
The optional final argument
s
can be used to ask for a specific kind of result,
irrespective of the presence of any capturing groups in
the regexp. The default behavior is convenient most of the
time, but you might occasionally want to avoid it. If
i
is not provided, a value of
"i"
for
s
indicates an integer boolean result is wanted; a value of
"s"
asks instead for a string match result for the whole regexp.
If
i
is also provided,
"s"
simply indicates that the result should ignore capturing
groups and treat them as non-capturing ones (single string
result for each successive match).
The
f/x
and
f\x
forms both repeatedly apply a monadic function to successive
results and stop when the next result matches either the
current one or the original input. The
f\x
form gathers all intermediate results, like other scan-like
forms.
A common idiom based on converge is
,//X
for flattening a list of any depth.
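The stopping rule of converge can be sketched in Go with a generic helper. The `converge` name is hypothetical, and returning the last value computed before the repetition is an assumption of this sketch rather than a statement about Goal's exact result.

```go
package main

import "fmt"

// converge repeatedly applies f, starting from x, and stops when the
// next result equals either the current value or the original input.
// It returns the current value at that point (an assumption of this sketch).
func converge[T comparable](f func(T) T, x T) T {
	cur := x
	for {
		next := f(cur)
		if next == cur || next == x {
			return cur
		}
		cur = next
	}
}

func main() {
	// Halving reaches the fixed point 0.
	fmt.Println(converge(func(n int) int { return n / 2 }, 100)) // 0

	// A cycle 0, 1, 2, 0, ... stops when the original input comes back.
	fmt.Println(converge(func(n int) int { return (n + 1) % 3 }, 0)) // 2
}
```

Checking against the original input as well as the current value is what keeps cyclic functions from looping forever.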
Goal supports a “fold while” form
F/[f;x;y]
that combines the “fold” and “while” adverbial forms.
It works like a seeded fold, but it takes an additional
first argument
f
that is called on the accumulator
x
before each iteration. If
f x
returns a false value, the iteration will stop early and
return
x
.
There’s also a “scan while” form
F\[f;x;y]
that works in a similar way but collecting all prior
results. Note that both forms support as many list arguments
as the rank of
F
minus one, like for plain “fold” and “scan”.
Pure code rarely needs those extensions of “fold” and
“scan”, as there are often easy and efficient ways to
determine the portion of the array you want to iterate on.
However, it can sometimes come in handy, like when the
iterating function
F
can return an error, for example after some IO processing,
making it easy to return early in case of error. The
alternatives in such cases, like using a plain “while” or a
recursive function, are cumbersome.
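A Go sketch of the “fold while” idea, restricted to a single list argument, could look as follows. The `foldWhile` helper and its parameter names are hypothetical. The point is only that the condition is checked on the accumulator before each step.

```go
package main

import "fmt"

// foldWhile is a seeded fold that checks cond on the accumulator before
// each iteration and returns early as soon as cond reports false.
func foldWhile[A, T any](cond func(A) bool, step func(A, T) A, acc A, ys []T) A {
	for _, y := range ys {
		if !cond(acc) {
			return acc
		}
		acc = step(acc, y)
	}
	return acc
}

func main() {
	ys := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	below10 := func(acc int) bool { return acc < 10 }
	sum := func(acc, y int) int { return acc + y }
	// Stops after 1+2+3+4, because the accumulator is no longer below 10.
	fmt.Println(foldWhile(below10, sum, 0, ys)) // 10
}
```

The same shape works for the error case mentioned above: the accumulator can carry an error, and the condition simply tests whether one has occurred.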
Decode
I/x
is simply a polynomial evaluation function, implemented
using Horner’s method, mostly equivalent to the adverbial
projection
0{z+x*y}/
.
The argument
x
in
I/x
represents the coefficients, and
I
represents the bases, which can be a single number or a list
of numbers.
Encode
I\x
is the inverse of decode, limited to non-negative
integer bases. As a special case, using
0
as base in encode is equivalent to using an integer larger
than the maximum value being encoded, so it can be a
convenient leftmost base.
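Here is a minimal Go sketch of both operations for a list of bases, assuming non-negative inputs and one base per digit. The `decode` and `encode` functions are illustrative helpers, not Goal's implementation.

```go
package main

import "fmt"

// decode evaluates digits in the given mixed bases with Horner's
// method. It assumes len(bases) == len(digits).
func decode(bases, digits []int64) int64 {
	var acc int64
	for i, d := range digits {
		acc = acc*bases[i] + d
	}
	return acc
}

// encode is the inverse for a non-negative n: digits are produced from
// right to left. A base of 0 takes whatever remains, as if the base
// were larger than any value being encoded.
func encode(bases []int64, n int64) []int64 {
	digits := make([]int64, len(bases))
	for i := len(bases) - 1; i >= 0; i-- {
		if bases[i] == 0 {
			digits[i] = n
			n = 0
			continue
		}
		digits[i] = n % bases[i]
		n /= bases[i]
	}
	return digits
}

func main() {
	hms := []int64{24, 60, 60}
	fmt.Println(decode(hms, []int64{1, 2, 3}))    // 3723
	fmt.Println(encode(hms, 3723))                // [1 2 3]
	fmt.Println(encode([]int64{0, 60, 60}, 100000)) // [27 46 40]
}
```

With `0` as the leftmost base, the hours digit is not wrapped at 24, which is why 100000 seconds encode as 27 hours, 46 minutes and 40 seconds.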
time handle locations?
Specifying a location follows the conventions described in
the documentation of the
LoadLocation
function from
Go’s time package,
using the mapping from names to locations described by the
IANA Time Zone database.
Note that
"UTC"
is always the default location, both when parsing and
formatting. To use the local time zone, you need to pass
"Local"
as location.
Location is used when formatting integer time values into a
string representation, and when parsing non-integer time
values without time zone offset information. In the latter
case, formatting will still use
"UTC"
.
If you need to convert from a string representation of time
to another with a specific location, you first need to
convert to an integer time value.
time layouts work?
Time layouts follow the conventions described in
Go’s time package,
using the reference time
"01/02 03:04:05PM '06 -0700"
,
which is easy enough to remember (except for the historic
convention of month
01
before day
02
).
For convenience, the predefined constant layouts can be used
as
fmt
format argument too, for example
"UnixDate"
or
"DateOnly"
.
Any new additions in the Go package will only be supported
after some delay, because we want to support the latest two
Go releases at least.
time
support date calculations?
The
time
verb does not provide a special way to add raw durations,
subtract time values, or perform comparisons on them. Those
can be performed on an integer representation of the time
like the one given by
"unix"
.
Note that, as a consequence, unlike
time.Time
values in Go, time difference operations on values obtained
in successive reads of the current time won’t implicitly use
a monotonic clock reading.
Date calculations, like adding months or days, can be
performed using the
"date"
format, as it normalizes values outside their usual range,
like the
time.Date
function from
Go’s time package.
For example,
time["date";2024 14 0;"date"]
returns
2025 1 31
.
As per the help, some format names can be used both as
cmd
or
fmt
argument: here, the first
"date"
argument asks for results in
(year;month;day)
form, while the
"date"
used as third argument determines how the time argument is
interpreted.
Like
time.Date
,
the
"date"
format also accepts extra optional clock fields for hour,
minute, second, and nanoseconds.
Moreover, note that to process a list of dates at once, you
have to pass a single
(I;I;I)
list with all years, months and days instead.
This is in line with how other primitives like odometer or
encode/decode work in Goal, the rationale being that a short
list of long nested ones is more efficient and convenient
than a long one of short nested ones.
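The normalization behavior comes from Go's `time.Date`, so it can be checked there. The sketch below also mimics the column-wise layout (one list of years, one of months, one of days); `normalizeColumns` is a hypothetical helper written for illustration, not part of Goal.

```go
package main

import (
	"fmt"
	"time"
)

// normalizeColumns takes parallel lists of years, months and days, and
// returns them normalized by time.Date, which accepts out-of-range values.
func normalizeColumns(years, months, days []int) ([]int, []int, []int) {
	ys := make([]int, len(years))
	ms := make([]int, len(years))
	ds := make([]int, len(years))
	for i := range years {
		t := time.Date(years[i], time.Month(months[i]), days[i], 0, 0, 0, 0, time.UTC)
		y, m, d := t.Date()
		ys[i], ms[i], ds[i] = y, int(m), d
	}
	return ys, ms, ds
}

func main() {
	// First date: month 14 of 2024 is February 2025, and day 0 is the last
	// day of the previous month. Second date: January 65th, 2024.
	ys, ms, ds := normalizeColumns([]int{2024, 2024}, []int{14, 1}, []int{0, 65})
	fmt.Println(ys, ms, ds) // [2025 2024] [1 3] [31 5]
}
```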
json
handle booleans and nulls?
Parsing and encoding use
-0w
for
false
and
0w
for
true
.
Similarly,
0n
is used for nulls. Be careful not to encode
infinity by mistake. The choice of using special floating
values is somewhat arbitrary but simple to implement and
understand, and it takes advantage of
-0w
being a false value, and of
0n
being easy to handle thanks to the
nan
verb. Note that some kind of arbitrary choice had to be made
anyway, because Goal lacks a dedicated boolean type and a
general kind of null value.
IO primitives, as well as parsing primitives and forms like
json
,
time
or
"v"$
,
return an error value when appropriate.
The underlying values of such errors can be plain strings,
but they are usually dicts.
In the latter case, the only mandatory key is
"msg"
,
which provides an error message suitable for
user consumption. See the question about
how error values work
for the basics of error handling in Goal.
Depending on the kind of error, various kinds of fields with extra information can be provided, as described in the following table:
key | description (type) |
 | exit code of external command (i) |
 | short description of the error’s nature (s) |
 | format layout of |
 | new path name in |
 | number of bytes parsed before error occurred (in |
 | old path name in |
 | name of path-related operation (s) |
 | standard output of external command (s) |
 | file path (s) |
 | name of syscall (s) |
 | date/time of |
The
lib/os.goal
and
lib/fs.goal
user libraries define a few globals with portable error
strings that can be compared with the
"err"
field when present:
ErrInvalid
,
ErrPermission
,
ErrExist
,
ErrNotExist
,
and
ErrClosed
.
Each of those abstracts one or more concrete kinds of errors
in a portable way.
dirfs
?
Goal’s verbs
glob
,
import
,
open
,
read
,
and
stat
accept a read-only file system value as left argument, with
semantics similar to the monadic cases but using the
provided file system value instead of the host’s file system
rooted at the current directory.
There is also a
subfs
dyad that is used to derive a new file system rooted at a
subtree.
All those operations accept slash-separated paths, working
portably on all systems.
In the case of
import
,
the semantics follow the same rules as the empty extension
GOALLIB
case but using the provided file system value instead, and
accepting a file name with extension as well.
The
dirfs
monad returns a read-only file system value as provided by
the host operating system, rooted at the given directory.
Extensions may provide other kinds of file system values
that can then be used from Goal using the same builtins.
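For readers who know Go, the idea of read-only file system values with slash-separated paths resembles the `io/fs` package. The sketch below uses that package with an in-memory file system; whether Goal's `dirfs` and `subfs` are built on `os.DirFS` and `fs.Sub` is an assumption here, and the file names and the `readFrom` helper are made up for the example.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// demoFS returns a small in-memory read-only file system (hypothetical
// content), standing in for a value obtained from a directory on disk.
func demoFS() fs.FS {
	return fstest.MapFS{
		"lib/a.goal": {Data: []byte("/ file a\n")},
		"lib/b.goal": {Data: []byte("/ file b\n")},
		"README":     {Data: []byte("demo\n")},
	}
}

// readFrom derives a file system rooted at dir, then reads name from it.
func readFrom(fsys fs.FS, dir, name string) (string, error) {
	sub, err := fs.Sub(fsys, dir)
	if err != nil {
		return "", err
	}
	b, err := fs.ReadFile(sub, name)
	return string(b), err
}

func main() {
	fsys := demoFS()
	// Paths are slash-separated regardless of the host system.
	names, err := fs.Glob(fsys, "lib/*.goal")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(names)) // 2
	s, err := readFrom(fsys, "lib", "a.goal")
	if err != nil {
		panic(err)
	}
	fmt.Print(s)
}
```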
run
and open
?
Both
run
and
open
accept a left dict argument to handle more advanced use
cases. By default, commands inherit the parent’s stdin,
stderr, and stdout, as well as its environment and working
directory.
When using a dict, the
open
dyad uses the special
"mode"
key for specifying the mode, defaulting to
"r"
.
The accepted configuration keys are described in the
following table:
key | description (type) |
 | whether to enable buffering (i) |
 | working directory (s) |
 | stderr filename (s, |
 | stdin filename (s) or handle (h) |
 | stdout filename (s, |
 | stdin from input string (s) |
When used in pipe configuration, file handles using modes
"r"
or
"w"
with buffering disabled will be directly connected to the
standard input/output/error of the process.
"sr"
and "sw"
open
modes work?
These modes produce handles for reading and building
in-memory strings. The string-writer
"sw"
mode can be useful when redirecting standard output or error
in a command pipe, or when progressively building a long
string. The string-reader
"sr"
mode is less useful, but can be used to pass a string as a
handle to a function expecting a handle argument.
The
"sw"
open
mode accepts either an initial string
s
argument, or an initial buffer size
i
argument.
-2 + 3
and
- 2 + 3
give different results?
In the first case,
-2
is parsed as a single token. In the second case, the
-
represents the verb “negate”.
0n
not equal to itself?
Goal follows the usual floating-point arithmetic conventions
for NaN values, so any atomic comparison primitive
(among
=
,
<
,
and
>
)
where either operand is
0n
will return
0
.
Use the
nan
verb’s monadic and dyadic forms to search for or replace NaN
values when needed.
Note that only the above atomic comparison primitives are
affected by the floating-point standard rules.
In particular, the match
~
dyad supports NaNs and
0n~0n
holds.
As a result, all (self-)search primitives like “classify”,
“find” or “distinct” support inputs containing
0n
too using the same matching convention.
Also, as a special case, sorting primitives sort NaNs before
other numeric values.
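The floating-point conventions in question are not specific to Goal. A short Go illustration of the same rules follows; the `eq` and `replaceNaN` helpers are hypothetical, the latter being a loose analogue of replacing NaN values with a fill value.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// eq is a plain floating-point equality test, which is false for NaN.
func eq(a, b float64) bool { return a == b }

// replaceNaN returns a copy of xs with every NaN replaced by fill.
func replaceNaN(xs []float64, fill float64) []float64 {
	out := make([]float64, len(xs))
	for i, x := range xs {
		if math.IsNaN(x) {
			x = fill
		}
		out[i] = x
	}
	return out
}

func main() {
	nan := math.NaN()
	fmt.Println(eq(nan, nan), nan < 1, nan > 1) // false false false
	fmt.Println(math.IsNaN(nan))                // true
	fmt.Println(replaceNaN([]float64{1, nan, 3}, 0)) // [1 0 3]

	// Go's sort.Float64s also happens to order NaNs before other values.
	xs := []float64{2, nan, 1}
	sort.Float64s(xs)
	fmt.Println(xs) // [NaN 1 2]
}
```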
Goal has both 64-bit integer and floating point numbers,
whose types are
"i"
and
"n"
as returned by
@
respectively. Primitives convert from one to another whenever
possible, so most applications do not have to care about
this distinction.
Conversion from integer to float means that big integers might be approximated. From float to integer, if the float is too big to be represented or is NaN, it will not be considered an integer by primitives that want an integer. Also, operations on integer operands can overflow, as defined by two’s complement integer overflow.
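These effects follow from 64-bit integer and floating-point arithmetic in general, and can be reproduced in Go. The helper names below are hypothetical; `isIntegral` is only a sketch of the kind of check a primitive that wants an integer might perform, not Goal's actual test.

```go
package main

import (
	"fmt"
	"math"
)

// roundTrips reports whether i survives a conversion to float64 and back.
func roundTrips(i int64) bool { return int64(float64(i)) == i }

// isIntegral reports whether f has an exact int64 representation.
// NaN and out-of-range values are rejected.
func isIntegral(f float64) bool {
	return f == math.Trunc(f) && f >= -(1<<63) && f < 1<<63
}

// addWrap adds two int64 values; overflow wraps around (two's complement).
func addWrap(a, b int64) int64 { return a + b }

func main() {
	big := int64(1)<<53 + 1
	fmt.Println(roundTrips(big - 1)) // true
	fmt.Println(roundTrips(big))     // false: approximated to 2^53
	fmt.Println(isIntegral(3.0), isIntegral(1e300), isIntegral(math.NaN())) // true false false
	fmt.Println(addWrap(math.MaxInt64, 1) == math.MinInt64) // true
}
```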
One thing worth noting is that while integers and floats are two different types, Goal does not allow flat generic arrays with mixed floats and integers: it will convert all elements to floats in such cases, because that’s what’s most convenient and efficient in the common case. If mixed numeric types without coercion are needed, you will have to enlist the values separately or append a dummy generic value.
When applying a function
f
using fold on an empty list
x
,
as in
f/x
,
the result is the default
zero value fill
for that type of list.
There is a special exception, though: specially recognized
adverbial forms (see section on
special combinations)
return a neutral element for the involved
operation instead, which helps avoid unwanted edge cases in
common operations. For example
*/!0
returns
1
,
|/!0
returns
0i
(both the smallest possible integer and a false value),
and
|/?0
returns
-0w
.
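The neutral elements mentioned above fall out naturally when a fold is written as a loop starting from an initial value. A Go sketch with hypothetical function names, using the minimum int64 as the smallest possible integer:

```go
package main

import (
	"fmt"
	"math"
)

// product folds multiplication, starting from its neutral element 1.
func product(xs []int64) int64 {
	acc := int64(1)
	for _, x := range xs {
		acc *= x
	}
	return acc
}

// maxInt folds maximum over integers, starting from the smallest integer.
func maxInt(xs []int64) int64 {
	acc := int64(math.MinInt64)
	for _, x := range xs {
		if x > acc {
			acc = x
		}
	}
	return acc
}

// maxFloat folds maximum over floats, starting from negative infinity.
func maxFloat(xs []float64) float64 {
	acc := math.Inf(-1)
	for _, x := range xs {
		if x > acc {
			acc = x
		}
	}
	return acc
}

func main() {
	fmt.Println(product(nil))  // 1
	fmt.Println(maxInt(nil))   // -9223372036854775808
	fmt.Println(maxFloat(nil)) // -Inf
	fmt.Println(product([]int64{2, 3, 4}), maxInt([]int64{2, 7, 4})) // 24 7
}
```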
The “each” adverb also has a special consideration with respect
to empty lists: the result is usually an empty generic list,
but specially recognized forms may return specialized kinds
of empty lists. For example
#'()
returns
!0
,
and
$'()
returns
!""
.
When executing a script using the
goal
command-line interpreter, using return
:x
in global code exits the script immediately.
The exit status will be non-zero if
x
is an error, and
0
otherwise. In the error case, the error message will be
displayed on standard error before execution ends. Also, if
the error value is a dict with a key
code
,
the corresponding value is used to set the exit error code,
instead of the default
1
,
following a convention similar to the one of the
run
dyad. Only portable integer values within
[1,125]
are supported for the exit error code.
When executing the code using the Go API instead, it is possible to inspect the returned value and handle it in whatever way is most appropriate.
Note that you can’t exit directly from within a lambda: you first have to return from the lambda, and then return early in global code.
awk
or perl -p
one-liners?
Goal does not provide a built-in command-line option for
that kind of mode of operation, but the
examples/goalx
script at the root of the distribution provides an
alternative that can be used for similar purposes.
Note that the
goalx
script works a bit differently from AWK and Perl, because
Goal favors loading a whole file into memory and performing
whole-array operations on all the lines, like other array
languages, instead of working line by line.
Advanced features like language-aware auto-completion and syntax checkers are unlikely to be very useful when writing typical Goal scripts, but some syntax highlighting is nice.
The vim-goal repository provides syntax highlighting support for Vim. It is what I use and maintain. There is not yet support for other editors (that I’m aware of). Highlighting beyond strings, numbers and comments is not really that useful, so adding support for basic syntax highlighting should be simple enough to do for most editors, and might be both a good contribution and learning experience. LSP would be a nice bonus.
Quitting the read-eval-print-loop is done by closing
standard input. This can be done with
Ctrl-D
on Unix-like systems. Typing
close STDIN
also works.
Goal’s interactive mode is quite minimal, so you need to use
an external program like the readline-wrapper
rlwrap
to get a more convenient interactive experience.
It’s as easy as installing
rlwrap
and then typing
rlwrap goal
instead of just
goal
.
The
rlwrap
program provides some programmable completion functionality.
Also, the ongoing project ari by semperos aims at providing an interactive programming environment built on top of Goal. Among other extra features, including a dedicated SQL mode, it provides language-aware auto-completion and a more interactive help than Goal’s default minimal REPL.
On systems that have a
clear
command, you can clear the screen in interactive mode by
executing
print run "clear";
.
Goal is implemented as an embeddable bytecode interpreter, written in Go, without any dependencies outside the standard library. Go provides good garbage collection, a comprehensive standard library, fast compilation, and a higher-level library interface than a non-GC language would. As a tradeoff, we cannot catch out of memory errors reliably in programs, and a panic is expected in such cases.
The implementation makes use of a recursive-descent parser that provides quite accurate error messages.
Interestingly, Goal is at least the third project for an array language in Go, after ivy and ktye/i. While Goal and Go are quite the opposite in terms of conciseness due to the huge gap in their programming paradigms, they both share a practical mindset that encourages idioms over abstraction, and writing executable code over writing declarations.
Array performance is quite decent, with specialized algorithms depending on inputs (type, size, range), and variable liveness analysis that reduces cloning by reusing dead immutable arrays (in code with limited branching). However, reaching state-of-the-art performance is not a goal: there is no SIMD and there are no bit booleans, and integers are stored in arrays using either uint8 or int64 elements.
Scalar performance is typical for a bytecode-compiled interpreter (without JIT), somewhat slower than a C bytecode interpreter: value representation in Go is less compact than how it could be done in C, but Goal does have unboxed integers and floats.
Goal uses optimized code paths for the following adverbial and verbal forms:
+/ -/ */ |/ &/ ,/ / folds (monadic and dyadic forms)
,// / converge
+\ -\ |\ &\ / scans (monadic and dyadic forms)
<\ =\ / boolean scans (monadic and dyadic forms)
$' #' *' @' / each (monadic forms)
@[x;y;:;z] / tetradic amend with :
@[x;y;op;z] / with arithmetic op among + - * % | &
@[x;y;~] @[x;y;-] / triadic amend for not and negate
Also, Goal recognizes and optimizes a few monadic idioms:
++ / flip twice (make all rows have same size using take/repeat)
*| / last
*< / index of first occurrence of minimum value (for arrays)
*> / index of first occurrence of maximum value (for arrays)
Goal’s arrays are immutable, but in cases where the implementation can determine that an array will not be used again, it will use in-place mutation in most operations where it makes sense, like arithmetic operators, join or amend.
Goal makes use of a reference count and a variable liveness compilation pass to determine if a value is reusable and will not be used again. In typical branchless code portions the last use of local variables is always determined. In code with branches the analysis is incomplete and may not always allow in-place mutation when variables are used in branches (though common cases are still handled, like when the branch ends the function, including early-return cases, or when an explicit assignment operation is used).
Because memory management is handled by Go’s GC, Goal only keeps track of a reference count for arrays, and only if they’re not nested; nested arrays are simply marked as not reusable. This makes the implementation simpler and less error-prone, and makes reference count handling faster in the common case. However, if you have a matrix as a list of lists, any modification in a given line will replace the whole line, so consider batching updates or using a flat list in such cases.
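The general idea of reusing an immutable array when nothing else refers to it can be shown with a toy model in Go. This is a deliberately simplified sketch under stated assumptions: the `array` type, its `shared` field, and `addScalar` are hypothetical and do not reflect Goal's internal data structures.

```go
package main

import "fmt"

// array is a toy immutable array value. shared counts the other live
// references to the same data (hypothetical bookkeeping for this sketch).
type array struct {
	data   []int64
	shared int
}

// addScalar returns a+n. When nothing else refers to a, it mutates the
// data in place; otherwise it works on a copy, so that other holders
// never observe a change.
func addScalar(a *array, n int64) *array {
	out := a
	if a.shared > 0 {
		out = &array{data: make([]int64, len(a.data))}
		copy(out.data, a.data)
	}
	for i := range out.data {
		out.data[i] += n
	}
	return out
}

func main() {
	dead := &array{data: []int64{1, 2, 3}}
	fmt.Println(addScalar(dead, 1) == dead) // true: reused in place

	live := &array{data: []int64{1, 2, 3}, shared: 1}
	res := addScalar(live, 1)
	fmt.Println(res == live, live.data, res.data) // false [1 2 3] [2 3 4]
}
```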