27 July 2020

Where are variables stored?

Over the past months I received some questions that are perfectly suited to a blog post. One of them asked about the storage location of variables. In other words: where are variables stored?
The location of a variable depends on several things. First, global and local variables are handled differently. Second, there are non-dynamic variables, such as the simple numeric data types, and dynamic variables, such as strings, arrays and objects.

Global variables
To the compiler a variable name references a memory address. Declaration statements like Dim and Global enter the specified variable name(s) into a database that holds information about the declared global variables. At the same time, the compiler allocates some memory for the variable. When the compiler encounters the same variable name again, it replaces the name with the variable's memory address.

The compiler allocates a data section to store the global variables, and the section grows as more global variables are entered in the database. When a program is run inside the IDE, the compiler uses a simple malloc to allocate the data section. When a program is compiled to an EXE, the data section is saved with the EXE.

The amount of memory reserved for a variable depends on its data type. A Long (Integer32) variable is assigned a block of 4 bytes, a Word variable gets 2 bytes, etc. The following table describes the amount of memory that is reserved for non-dynamic variables.

Non-dynamic type    Memory requirement
Bool                1 byte
Byte                1 byte
Word, Card          2 bytes
Integer             4 bytes
Single              4 bytes
Double              8 bytes
Currency            8 bytes
Date                8 bytes
Large               8 bytes
String * n          n bytes
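
The bookkeeping described above (a name is entered in a database, the data section grows by the size of the type, and later references resolve to the stored location) can be sketched in C. This is only an illustrative model: VarDatabase, declare_global and resolve_global are hypothetical names, the model stores offsets rather than absolute addresses, and it assumes that no alignment padding is inserted between variables, which the compiler may or may not do.

```c
#include <stddef.h>
#include <string.h>

#define MAX_VARS 64   /* arbitrary limit for this sketch */

/* Hypothetical database entry; not GFA-BASIC 32's real structure. */
typedef struct {
    const char *name;    /* variable name as written in the source */
    size_t      offset;  /* byte offset inside the global data section */
    size_t      size;    /* bytes reserved, taken from the table above */
} VarEntry;

typedef struct {
    VarEntry entries[MAX_VARS];
    size_t   count;
    size_t   section_size;   /* grows with every declaration */
} VarDatabase;

/* Enter a variable and reserve its bytes.
   Returns the offset, or (size_t)-1 when the table is full. */
size_t declare_global(VarDatabase *db, const char *name, size_t size)
{
    if (db->count == MAX_VARS)
        return (size_t)-1;
    VarEntry *e = &db->entries[db->count++];
    e->name   = name;
    e->offset = db->section_size;
    e->size   = size;
    db->section_size += size;   /* assumption: no alignment padding */
    return e->offset;
}

/* Resolve a later reference to the same name.
   Returns the offset, or (size_t)-1 when the name is unknown. */
size_t resolve_global(const VarDatabase *db, const char *name)
{
    for (size_t i = 0; i < db->count; i++)
        if (strcmp(db->entries[i].name, name) == 0)
            return db->entries[i].offset;
    return (size_t)-1;
}
```

Declaring a 4-byte, a 2-byte and an 8-byte variable in that order places them at offsets 0, 4 and 6 in this model and leaves a 14-byte data section.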

The location of a variable of a primary data type can be obtained with VarPtr, ArrPtr, V:, or the * operator. Each of these returns the fixed memory address of the global variable. The global data section is cleared, so every variable starts out with the value zero.

Dynamic variables allocate their memory at runtime; a string, for instance, is allocated dynamically when it is assigned a value. However, for the compiler to handle a string, the variable must still be entered in the database and have a memory address assigned to it. In general, a dynamic variable is assigned a Long (4 bytes) in the global data section, which stores the address of the memory that is allocated dynamically at runtime. In other words, it is a pointer. The data of a dynamic variable is stored elsewhere, not at the variable’s memory address. To obtain the storage address of the pointer use ArrPtr or the * operator. The location of the data is only known at runtime and can be obtained using VarPtr or V:. These functions read the pointer stored at the address that ArrPtr returns.
An OCX or Object variable receives a 4-byte pointer in the global data section, which is initially zero (Nothing). For an Object variable, VarPtr does not return the address of the object but, like ArrPtr, the location of the Object variable itself.
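
The split between the pointer slot and the data it points to may be easier to see in a small C analogy. It is not GFA-BASIC 32 code: g_text, slot_address, data_address and assign_text are hypothetical names, with slot_address playing the role of ArrPtr and data_address the role of VarPtr or V: for a string. In a 64-bit C build the slot is 8 bytes wide rather than the 4 bytes of a 32-bit process.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* The pointer slot: static storage, so it starts out as zero (NULL). */
static char *g_text;

/* Analogue of ArrPtr: the fixed address of the pointer slot itself. */
uintptr_t slot_address(void)
{
    return (uintptr_t)&g_text;
}

/* Analogue of VarPtr / V: for a string: read the slot to learn where
   the data currently lives. Returns 0 while nothing is assigned. */
uintptr_t data_address(void)
{
    return (uintptr_t)g_text;
}

/* Assigning a value allocates heap memory at run time and stores its
   address in the slot. Returns 1 on success, 0 on allocation failure. */
int assign_text(const char *value)
{
    char *copy = malloc(strlen(value) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, value);
    free(g_text);      /* release the previous data, if any */
    g_text = copy;
    return 1;
}
```

The slot address stays the same for the life of the program, while the data address is zero until the first assignment and may change with every later one.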

An array is handled differently. A global array reserves an array descriptor of 124 bytes in the global data section. The descriptor contains information about the data type, the number of dimensions (if specified in the declaration), and the upper and lower bounds. The address of the descriptor can be obtained with ArrPtr; the actual memory location of the array’s elements can be obtained with VarPtr.
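
As a rough C illustration of a descriptor that sits in static storage while the elements live on the heap, consider the sketch below. The field layout is invented for the example, since the real 124-byte layout is not given here, and ArrayDescriptor, dim_1d, arr_ptr and var_ptr are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_DIMS 4   /* arbitrary limit chosen for this sketch */

/* Hypothetical descriptor; the real layout is not documented here. */
typedef struct {
    void  *data;              /* heap block holding the elements */
    size_t elem_size;         /* stands in for the data type */
    int    dims;              /* number of dimensions */
    int    lower[MAX_DIMS];   /* lower bound per dimension */
    int    upper[MAX_DIMS];   /* upper bound per dimension */
} ArrayDescriptor;

/* A "global array": the descriptor is static and zero-initialised. */
static ArrayDescriptor g_arr;

/* Dimension a one-dimensional array; assumes d is not yet dimensioned.
   Returns 1 on success, 0 on allocation failure. */
int dim_1d(ArrayDescriptor *d, size_t elem_size, int lower, int upper)
{
    size_t count = (size_t)(upper - lower + 1);
    void *block = calloc(count, elem_size);   /* elements start at zero */
    if (block == NULL)
        return 0;
    d->data      = block;
    d->elem_size = elem_size;
    d->dims      = 1;
    d->lower[0]  = lower;
    d->upper[0]  = upper;
    return 1;
}

/* Analogue of ArrPtr: address of the descriptor. */
uintptr_t arr_ptr(const ArrayDescriptor *d)
{
    return (uintptr_t)d;
}

/* Analogue of VarPtr: address of the first element. */
uintptr_t var_ptr(const ArrayDescriptor *d)
{
    return (uintptr_t)d->data;
}
```

In this model arr_ptr gives the same answer before and after dimensioning, whereas var_ptr only becomes meaningful once the elements have been allocated.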

The following table specifies the number of bytes required to store a pointer or descriptor for dynamic variables.

Variable type    Memory requirement
String           4 bytes
Array            124 bytes (descriptor)
Hash             8 bytes (descriptor: pointer + data type)
Object           4 bytes
Variant          16 bytes

For all dynamic variables the runtime uses malloc() to obtain the required memory. The malloc() function uses the Windows API HeapAlloc(), so the data for the dynamic variables is stored on the heap.

Local variables
Almost the same can be said of local variables, except that they are stored on the stack. When the compiler encounters a local variable declaration, it calculates the variable's offset from the stack pointer and enters that offset in the variable database. Any reference to the local variable is then replaced with this offset. The location of a variable of a primary data type can be obtained with VarPtr, ArrPtr, V:, or the * operator. At runtime these functions return the absolute memory address of the local variable, calculated from the stack pointer and the offset.
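
One consequence is that the address of a local variable is not fixed: the offset is known at compile time, but the absolute address depends on where the stack pointer is when the procedure runs. The C sketch below illustrates this with recursion, which is an analogy rather than GFA-BASIC 32 behaviour taken from this post. The names record_local, seen and DEPTH are made up for the example.

```c
#include <stdint.h>

#define DEPTH 3   /* number of nested calls to record */

/* Address of the local variable as observed in each nested call. */
static uintptr_t seen[DEPTH];

/* Records the address of its own local, then recurses.
   Returns the number of frames whose local still held its value. */
int record_local(int depth)
{
    volatile int local = depth;        /* lives in this call's stack frame */
    seen[depth] = (uintptr_t)&local;   /* captured while the frame is alive */

    int deeper = 0;
    if (depth + 1 < DEPTH)
        deeper = record_local(depth + 1);

    /* Reading local after the nested call keeps this frame's variable
       alive while the deeper frames exist. */
    return deeper + (local == depth);
}
```

After record_local(0) returns, the entries in seen all differ, because each nested call placed its own copy of the variable in a different stack frame.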

The only difference from global variable creation is the local array declaration. The compiler does not reserve a 124-byte descriptor relative to the stack pointer, but only a 4-byte pointer. When the program executes the Dim statement, the descriptor and the required array memory are allocated. An array declaration that does not specify a dimension only allocates a dynamic descriptor of 124 bytes at runtime. Note that the Erase command only releases the array elements; the dynamically allocated array descriptor is not released. This matches the behavior of Erase for a global array, where the descriptor is static and stored in the global data section.
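
The life cycle of a local array can be modelled in C as follows. This is a sketch under assumptions, not the runtime's actual code: Descriptor is a two-field stand-in for the 124-byte descriptor, and local_dim, local_erase and local_release are hypothetical names for the Dim statement, the Erase command and the automatic release when the procedure ends.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the 124-byte array descriptor. */
typedef struct {
    void  *data;    /* heap block holding the elements */
    size_t count;   /* number of elements */
} Descriptor;

/* What the stack frame holds: only a pointer to the descriptor. */
typedef struct {
    Descriptor *desc;
} LocalArray;

/* Model of Dim: allocate the descriptor and, when count > 0, the elements.
   Returns 1 on success, 0 on allocation failure. */
int local_dim(LocalArray *a, size_t count, size_t elem_size)
{
    a->desc = malloc(sizeof *a->desc);
    if (a->desc == NULL)
        return 0;
    a->desc->data  = NULL;
    a->desc->count = 0;
    if (count > 0) {
        a->desc->data = calloc(count, elem_size);
        if (a->desc->data == NULL) {
            free(a->desc);
            a->desc = NULL;
            return 0;
        }
        a->desc->count = count;
    }
    return 1;
}

/* Model of Erase: release the elements only; the descriptor stays. */
void local_erase(LocalArray *a)
{
    if (a->desc == NULL)
        return;
    free(a->desc->data);
    a->desc->data  = NULL;
    a->desc->count = 0;
}

/* Model of the automatic release at procedure exit: elements and descriptor. */
void local_release(LocalArray *a)
{
    local_erase(a);
    free(a->desc);
    a->desc = NULL;
}
```

Calling local_dim with a count of zero corresponds to a declaration without a dimension: only the descriptor is allocated.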

Note: In GFA-BASIC 32 versions before 2.52 the automatic release of local arrays and local hashes did not work, and programs suffered from memory leaks. A local hash could still be freed with Hash Erase, but there was no way to free an array descriptor. This has since been fixed, and local arrays and hashes are now released automatically.