Issue #15: Compile-time errors are the best kind of errors
Summary
Topics: Improved, type-aware static analyser and Partial type inference
This week was spent on further improvements to the new SA, and making it complete (i.e. making it cover
all instructions).
The new static analysis is not yet ready, but for supported subset of instructions it is available
on devel
branch, hidden behind --new-sa
option.
To see the new SA in action download Viua VM, compile it, and use ./build/bin/vm/asm --new-sa
on files from
the ./sample/static_analysis
directory.
Improved, type-aware static analyser
The new SA does not only track register use and warn about leaving values unused, or using uninitialised registers (which the old SA was capable of). It is now also aware of the type of values stored in the registers and is able to catch some errors arising due to use of incorrect type combinations (e.g. shifting bits by a text value, instead of an integer value).
Compile-time type errors
Catching type errors at compile time (which the new SA is capable of) is another feature of Viua that is intended to improve the reliability of the code that runs on the VM.
Without static type checking all type error detection is deferred until runtime, when it is sometimes to late to do something about them. With static type checking some type errors can be caught at compile-time which means that invalid programs will not even compile, and, as an effect, will not get the chance to be run and crash.
Types are attached to values by "constructor instructions" (istore
, text
,
vec
, etc.) and then are carried with values during the usual value-movement analysis.
Basic type system
The fact that SA was made type-aware means that the type system of Viua now manifests at compile time. It also means that, due to the fact that it is actually being used to analyse and verify correctness of Viua VM programs, it might prove useful to spend some time defining and describing it.
The type-aware SA also serves as a basic documentation for the inputs taken and outputs produced by all instructions, since the SA by necessity codifies all these rules. Without a doubt, though, such a documentation would be more approachable in a form of typical documentation (a human-readable English text, instead of a piece of C++ code).
Partial type inference
Sometimes values are being defined by instructions that are not in the "constructor instructions" group, and in such cases, it is not immediately obvious what type a value has. The SA does not bail and complain but instead tries to deduce what type the value is.
The inferencer used by the SA is quite primitive.
Types
It looks at the instructions used to manipulate the value, and combines this with the assumptions it has about the types of instructions' operands to infer what type the value contained in registers used as operands has.
For example, if a value with previously undefined type is used as an input to the
iinc
instruction the inferencer will assign it the integer
type, and
if it is used as an operand of texteq
instruction the inferencer will decide
that the type of the value is text
.
Pointerness
The second piece of information the inferencer deduces is the "pointerness".
Basically, a type may exist in two variants: X
or pointer to X
.
Accessing values by-pointer requires using a different access mode than the default direct one.
The SA ensures that the values are accessed using correct modes; it does not allow using pointers as direct values which is an error. By analysing the access mode used to fetch values from registers the inferencer is able to deduce the pointerness of a value.
The inferencer is able to deduce the full type of a value (the basic type plus pointerness) in two
steps, if needed.
It may deduce pointerness without the basic type if the value as accessed as a pointer by an
instruction that has no assumptions about basic types (e.g. print
, or
vpush
for pushed operands).
Next: #16: More work on static analyser
Previous: #14: More, better error messages