31 May 2014

Problem with local Hash variables

GB32 does not support the destruction of local Hash variables. This is by design. Global Hash variables are released when the program terminates. This does not mean you can’t use local Hash variables; you just have to add the cleanup code yourself.

This behavior might be the cause of many reported memory leaks.

Let us look at some examples. First suppose you have a subroutine like this where a local hash variable is used to store the results of the Split regular expression command:

Proc Split_Local(ByRef t$, sep$)
  ' Declare a local Hash variable.
  Dim hs As Hash String

  ' Split creates a new hash table of String,
  ' and destroys the hash allocated memory first.
  Split hs[] = t$, sep$

  ' Explicitly erase Hash variable, because
  ' memory allocated by hs[] is not destroyed
  ' automatically when going out of scope.
  Hash Erase hs[]
EndProc

The Dim command declares a Hash String variable hs. The variable is put on the stack. Stack memory is temporary space limited to the scope of the procedure call. A hash variable occupies 8 bytes, split into two Longs. The first Long is a pointer to heap memory. This pointer is set to the hash-table descriptor once a value is added to the table. Initially, the pointer is Null. The second Long contains the code for the type of data the hash is going to store. Here the type indicates a String, so the hash is used to store String values.

When the hash variable goes out of scope, at the end of the procedure, the stack is cleared or reset to the value it had at the point where the procedure started executing. The 8 bytes reserved for the hash variable are simply discarded without freeing the allocated memory first.

A hash table allocates memory dynamically. When the hash needs more memory, it is allocated automatically. The hash grows and shrinks automatically, allocating and freeing heap memory on demand. When the hash variable goes out of scope, the allocated memory is no longer referenced by any GB variable or GB garbage collector. This memory is freed only after the application has ended, when all of the application’s memory is released to the OS.
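
The two Longs described above can be inspected from within a procedure. The following is only a sketch: the procedure name is made up for this illustration, and it assumes that LPeek() reads a Long from a memory address and that Debug.Print writes to the debug output window. The actual pointer value and type code depend on the runtime, so none are shown.

Proc Show_Hash_Descriptor()
  Dim hs As Hash String

  ' First Long: pointer to the hash table, Null while the hash is empty.
  Debug.Print LPeek(VarPtr(hs))

  ' Second Long: code for the type of data stored (String here).
  Debug.Print LPeek(VarPtr(hs) + 4)

  ' Adding a value makes the hash allocate heap memory,
  ' so the first Long should no longer be Null.
  hs["first"] = "one"
  Debug.Print LPeek(VarPtr(hs))

  ' Free the heap memory before the 8 stack bytes are discarded.
  Hash Erase hs[]
EndProc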

  • A local hash variable must be freed explicitly using Hash Erase on the variable name.

The Split command clears the Hash table as well, that is, when the first Long of the hash variable isn’t Null. Before Split starts splitting the string into tokens that are to be stored in the Hash String variable, it completely erases the Hash. In fact, the Split command invokes the runtime function HASHERASE, which is also called by the Hash Erase command.

Static Hash
Now suppose you would like to use Static on a local hash variable. That would prevent unnecessary memory de-allocations in your procedure and would improve performance when the procedure is executed, wouldn’t it? Wrong. Look at the next example:

Proc Split_Static(ByRef t$, sep$)
  ' Static declares a global Hash variable
  ' with local scope.
  Static hs As Hash String
  Split hs[] = t$, sep$, 10
EndProc

Although the hash variable is declared local, a Static variable is actually a global variable. Static variables are treated the same as other global variables; only their visibility is limited to the procedure in which they are declared.

Note: When asked for the variable address using VarPtr(hs), the address returned is located in the global data section of the program. VarPtr() does not return a stack address, because the Static variable is not stored on the stack. For the duration of the program execution, the static/global hash variable hs[] cannot be changed by any code other than the procedure it is declared in. Because a static variable is assigned to the program’s data section, the GB32 application will release the memory it allocated. The static hs[] variable is destroyed when the application quits. When a program is run from within the IDE, all globals are destroyed as well, including the static hs[] hash variable.

Let us consider the example where the static/global hash variable is passed to Split. Upon entry, Split destroys any entries the hash variable references; the hash table is destroyed. Then a new hash table is allocated and its pointer is stored at VarPtr(hs) + 0. The contents of VarPtr(hs) + 4 remain unchanged, because they specify the hash data type. When Split finishes, the global/static variable hs[] points to a new hash table. Access to the elements in hs[] is now limited to the code between the Split command and EndProc. Since the hash isn’t destroyed when the procedure returns, the program now carries allocated memory that cannot be accessed until the next time the procedure is called. And when the procedure is executed again, the Split command immediately destroys that hash table. You could insert a Hash Erase at the end of the procedure instead, but what does that give you? A local hash variable that is stored as a global variable.
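
For comparison, this is roughly what the Static variant looks like with a Hash Erase at the end. It is a sketch only; the procedure name Split_Static_Erase is made up for this illustration, and it uses nothing beyond the commands already shown in this post.

Proc Split_Static_Erase(ByRef t$, sep$)
  ' Static places hs in the global data section.
  Static hs As Hash String

  ' Split erases whatever hs[] still references
  ' and then fills the hash again.
  Split hs[] = t$, sep$

  ' ... use the tokens in hs[] here ...

  ' Free the hash table now, rather than at the
  ' next call or when the program ends.
  Hash Erase hs[]
EndProc

In effect this does the same work as the Dim version in Split_Local. The only difference is where the 8 bytes of the variable itself are stored, so Static gains nothing here.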

If you would like to read more on hash variables, go to Passing a hash variable to a subroutine.

More on local scope
Automatic destruction of variables with local scope is only supported for String, Variant, Object, BSTR and arrays. When a local array is used, the compiler inserts a call to CLEARARR(). The code responsible for clearing the other types of local variables, a function called CLEARMULTI located in the runtime GfaWin23.Ocx, accepts a Long integer (4 bytes) where each byte contains the number of local variables of one specific type. The Long is coded like this: BBVVOOSS. The low byte contains the number of String variables to clean, and the high byte of the low word contains the number of COM objects to release. The high word contains the number of Variants and BSTR types.
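
To make the BBVVOOSS coding concrete, here is a small sketch that builds such a Long by hand. The procedure name and the counts are made up for this illustration. Which byte of the high word holds the Variants and which the BSTRs is read off the letters of the mnemonic and is an assumption, and Debug.Print and Hex$() are assumed to behave as in other BASIC dialects. The compiler generates this value itself; the code does not call CLEARMULTI.

Proc Show_ClearMulti_Argument()
  Dim nStr As Long, nObj As Long, nVar As Long, nBstr As Long
  Dim packed As Long

  ' Hypothetical procedure with 3 local Strings, no COM
  ' objects, 2 Variants and 1 hidden BSTR.
  nStr = 3
  nObj = 0
  nVar = 2
  nBstr = 1

  ' One byte per type, laid out as BBVVOOSS.
  ' Plain arithmetic is used; the counts are kept small.
  packed = nStr + nObj * 256 + nVar * 65536 + nBstr * 16777216

  ' packed now holds hexadecimal 01020003.
  Debug.Print Hex$(packed, 8)
EndProc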

Yes, BSTR types. Unfortunately, GB does not provide a BSTR data type that we can use to store UNICODE strings. However, all COM objects that require a string use BSTR as the parameter type. When the compiler creates code to invoke a method or property taking a COM string as an argument, the ANSI string is converted to a BSTR before the method or property is called. When necessary, the BSTR is stored in a hidden local variable, which must later be destroyed.