30 March 2018

Floating-point numbers

Often floating-point numbers lead to confusion and frustration. Unfortunately, these problems cannot be avoided and to properly work with floating-point values a basic understanding is required.

How floating-point numbers are stored
Floating point decimal values generally do not have an exact binary representation. This is a side effect of how the FPU represents and processes floating point numbers. The storage format for Double and Single is the same as expected by the FPU-registers of the CPU. This ensures consistency and fast reading from and writing to memory. The problem however, is how to store a floating-point value in a binary computer. This is solved by storing a floating-point number as a formula. There are two types of floating-point numbers: Float (or Single) and Double. The difference is their size in bytes, and therefore the minimum and maximum values that can be stored. Another difference is the higher accuracy for a Double. The maximum number a Float (taking 4-bytes) can store is much less than a Double (taking 8-bytes) can store. Since floating-point numbers can have an infinite number of values, you cannot store all of them in either 4 bytes (float) or 8 bytes (double). To be able to store as much numbers as possible, with as much accuracy as possible, another approach is necessary. A floating-point value is stored as a formula:

X = (-1)^sign * 2^(exponent - bias) * (1 + fraction * 2^-23)

The formula contains 3 variables (sign, exponent, fraction) and one constant (bias).The bias for single-precision numbers is 127 and 1,023 (decimal) for double-precision numbers. The values of these formula-variables are stored in either 4 bytes for a Single or 8 bytes for a Double.

The next example uses an user defined type Sfloat to illustrate the storage of the Float data type. The values for fraction and exponent, together with a sign bit, are stored in the 4 bytes. By using a Union we can assign a value to a Single variable and then use Sfloat to dump the 4 bytes that make up the Float:

Debug.Show
Type Sfloat
fraction As Bits 23   // fractional part
exponent As Bits  8   // exponent + 127
sign     As Bits  1   // sign bit
EndType
Type TFloat Union       // sizeof() = 4
value As Float
sf    As Sfloat
EndType
Dim fv As TFloat
fv.value = 2.0     : DumpFloat(fv)
fv.value = 0.0     : DumpFloat(fv)
fv.value = -345.01 : DumpFloat(fv)
' Do test some more ..

Proc DumpFloat(ByRef tflt As TFloat)
Global Const bias As Int = 127        ' Standard IEEE
Global Const frexp As Float = 2 ^ -23 ' Standard IEEE
Dim flt!

Debug "> DumpFloat:"; tflt.value;
With tflt.sf
Debug " (sign =";.sign; " exponent ="; .exponent; _
" fraction =";.fraction;")"
Debug "Binary format: ";Bin(.sign, 1)` _
Bin(.exponent, 8)`Bin(.fraction, 23)

' Reconstruct value from Sfloat using formula:
flt! = ((-1) ^ .sign Mul 2 ^ (.exponent - bias)) _
* (1 + (.fraction * frexp))
Debug "Float reconstructed ="; flt!
EndWith
Debug
EndProc

The output of the demo is:

> DumpFloat: 2 (sign = 0 exponent = 128 fraction = 0)
Binary format: 0 10000000 00000000000000000000000
Float reconstructed = 2
> DumpFloat: 0 (sign = 0 exponent = 0 fraction = 0)
Binary format: 0 00000000 00000000000000000000000
Float reconstructed = 0
> DumpFloat:-345.01 (sign = 1 exponent = 135 fraction = 2916680)
Binary format: 1 10000111 01011001000000101001000
Float reconstructed =-345.01

What does this tell us? A Float (or Double) is stored and described using 3 components in the bits of either a 4 or 8 bytes type. Due to the limited storage of all these components only an approximation of the decimal value can be ‘described’. When a floating point value is assigned to a variable the value is dissected into these 3 components. To get back to the original floating-point number these 3 components are substituted in this standardized formula.

Effect of floating point values
A floating-point value is stored by a description, not by its value. This makes it inherently inaccurate. Even common decimal fractions, such as 0.0001 cannot be represented exactly in binary, only fractional numbers of the form n/f where f is an integer power of 2 can be expressed exactly with a finite number of bits. Examples are 1/4, 7/16, 3/128, of each f is a power of 2.
The inaccuracy may increase slightly when a floating point value is loaded into a 80-bits FPU register. The FPU fills the remaining bits, because the 80-bits representation is different from the 32-bits Single or 64-bits Double format. Moving a value out of the FPU register will round the value back to fit in either a Single or a Double. Storing and transporting may add to the inaccuracy of of the value.

The following example shows what happens when the small error in representing 0.0001 propagates to the sum:

Dim dSum As Double, i As Int
For i = 1 To 10000
dSum = dSum + 0.0001
Next i
Debug dSum     ' = 0.999999999999906

Theoretically the sum should be 1.0.

Not only the calculations suffer from inaccuracy, comparisons with floating point numbers are equally problematic. The following example demonstrates a ‘forbidden’ comparison between a floating-point constant number and the result of a calculation:

Global Double dVal1, dVal2
dVal1 = 69.82
dVal2 = 69.20 + 0.62
Assert dVal1 == dVal2  ' Not equal

This throws an ASSERT exception, because the assertion that dVal1 and dVal2 are equal fails.
A comparison between two floating-point constants of the same type is allowed. For instance, the GFA-BASIC runtime returns a Single constant from DllVersion (2.33; 2.341; etc.). This constant may be compared to a literal Single constant (note the exclamation mark, without it 2.33 is a double!):

If DllVersion == 2.33! MsgBox "This is version 2.33"

Never compare two different data types, a Single to to Double, or a floating-point to an integer, These comparisons will most certainly fail (unless they can be described using a finite number of bits, see above). Any comparison to a floating point will most likely fail, because the comparison is executed in the FPU expanding the values to 80-bits. The same is true for the comparison of the results of two floating point calculations, it will most certainly fail.

The most logical solution for floating-point comparison of type Double is the use of the special operator NEAR, which uses only 7 decimal digits from both expressions for the comparison. In practice the expressions are compared as if they are both of Single precision.

Improve floating-point consistency in calculations
If your application expects multiple fp-calculations, it is necessary to keep the intermediate values in the proper data format, otherwise small errors are propagated through the calculations. For multiple floating-point calculations the FPU uses the intermediate results that it holds in the 80-bits FPU registers. However, these 80-bits calculations do not reflect the data types involved, the Single or Double. Due to the extra level of accuracy multiple calculations may produce unexpected results. The compiler setting ‘Improve floating-point consistency’ inserts code to load and write immediate results from and to memory in the appropriate type. This decreases program speed, but improves the chance for an expected result of the calculation. Make sure the ‘Improve floating-point consistency’ is checked always (unless you know exactly what you’re doing).

Conclusion
Floating-point values are inherently inaccurate, you might want to avoid them as much as possible. Instead use integers when ever possible, or otherwise use Currency, which is an integer value as well. The Currency data type exactly stores up to 19 digits, with 4 digits after the decimal point.

21 March 2018

Function and Sub parameters

In the new English Html help additional information is provided for Function and Sub. Since this is new information a copy of the text has a place in a blogpost.

Function parameters
The return value of a Function can be assigned to a local variable with the same name as the Function. When the return type is a numeric data type a local variable of that type is automatically added to the function’s local variables. With String, Type (UDT) and Variant as the return type a by reference variable is passed as the last argument on the stack. The string, UDT or variant becomes the variable that can be used to pass the function’s return value. Therefor, the following is equal:

Dim h\$
h\$ = testf(8) ' assign result to h\$
testp(8, h\$)  ' put result in h\$
Function testf(a%) As String testf = "3" & a%
Procedure testp(a%, ByRef p\$) p\$ = "3" & a%

In the function testf the variable h\$ is silently passed on the stack. Inside the function this by reference variable is known as testf. Assigning a new string to testf actually assigns the string to h\$ directly.

When a Function is used for a Windows API callback make sure the return data type is a primary numeric type (Byte, Word, Long, Int64, Single, Double), otherwise the stack will be overwritten.

Sub And FunctionVar parameters
For compatibility reasons GFA-BASIC 32 includes the Sub and FunctionVar statements. FunctionVar is compatible with VB’s Function; arguments are passed by reference by default and without a data type the Variant type is assumed. The same is true for Sub, without a ByVal or ByRef keyword the default is by reference, an implicit by reference. When a datatype is missing the Variant type is the default. Examples:

Sub test(vnt)        ' implicit ByRef, Variant datatype
vnt = "new Value"  ' do not write to parameter

FunctionVar tfv(a As String) ' As Variant
tfv = "new value" + s

Although ByRef is implied the rules aren’t as strict as with an explicit ByRef. When ByRef is included only actual variables can be passed, without the ByRef keyword the Sub and FunctionVar also accept literal values and types that don’t match. The following calls are allowed:

Local vnt As Variant, s As String
test(7)     ' 7 is assigned to a local Variant first
test(vnt)   ' vnt variable is passed by ref
test(s)     ' types don't match
vnt = tfv(7)' 7 is assigned to string first

Note that the literal value 7 is passed to the by reference parameter vnt in test. The compiler won’t complain since an implicit by reference does not have to reference an actual application varaible. When the Sub test(vnt) and FunctionVar tfv() include a ByRef keyword explicitly the compiler will check the argument’s type against the parameter’s type and will complain if they don’t match. An explicit ByRef declaration only accepts actual variables of the same type as the argument. Both, the variable that is passed as the argument, and the by reference parameter must be of the same type, like this

Dim s As String
CallByRef s     ' must be a variable
Debug s         ' = “new value”
Sub CallByRef(ByRef p As String)
' only accepts String variables as arguments
p = "new value"
EndSub

Limits to the use of parameters
So, there is a subtle difference between the default, an implicit ByRef and an explicit ByRef in a Sub and FunctionVar. In both cases only a variable can be passed by reference properly. With an implicit by reference an actual variable can be passed only when the types match. In all other circumstances the argument is first copied to a hidden local variable of the same datatype as the parameter and then the hidden variable is passed by reference. In the example above, the literal value 7 is first copied to a temporary hidden Variant variable and then the temporary variable is passed by reference. The same is true for the third call: test(s). Since the data types don’t match the string s is first copied to a temporary Variant variable which in turn is then passed to test(). The temporary hidden variable is immediately destroyed after returning from the Sub or FunctionVar.

Now lets look at it from the Sub’s point of view. Although by reference is implied the Sub doesn’t know what kind of variable is actually passed. It might be an actual variable, but it might also be a reference to a temporary local variable. Therefore Sub and FunctionVar cannot return a value using an implicit by reference parameter. In case a temporary variable is passed, it will be destroyed immediately after returning from the Sub. Only when a parameter is declared using the ByRef keyword explicitly is the Sub guaranteed to receive an actual variable.

Note An implicit ByRef parameter cannot be used as a local variable as can with ByVal parameters. The Sub doesn’t know whether the parameter is a reference to a temporary variable or to an actual variable. In case the parameter references an actual application variable the Sub might very well overwrite the contents of that variable.

04 March 2018

GfaWin23.Ocx Update 2.34

Another three bugs are fixed in this version. The SetPrinterByName needed maintenance because of the always growing need of memory with newer versions of Windows. The command failed with some printers that returned a lot of information in the GetPrinter() API. GFA-BASIC did not reserve enough memory and caused a buffer overrun. The EOF() function now also works with inline files, the ones that are stored in the :Files section of the program. The ListView.GetFirstVisible property now returns a ListItem. 