01 December 2018

Anatomy of a procedure (2)

Using ‘Proc Disassembly’ we can inspect the assembly code produced by the compiler. In the first part of this series we looked at the generated code for a Naked proc. Now we will discuss regular procedures (subs and functions), which do not produce minimal code. Regular procs support destruction of local dynamic variable types such as String, Object, Variant, arrays and hash tables. In addition, they provide the logic to trace code using the Tron and Gfa_Tron statements. A regular procedure stores information for Trace, TraceLnr, and other debugging-related commands. Finally, and maybe most importantly, regular procs support structured exception handling (Try/Catch, On Error, and unwinding).

Inspecting a procedure’s disassembly requires the ability to identify the three main parts of a procedure: the entry code, the actual code, and the exit code. In the previous post we discussed how to recognize these parts in Naked procedures; now we’ll see how to identify them in a regular procedure. The test() procedure is changed a little to demonstrate the use of local dynamic variables:

test(2, 6)
Proc test(x As Int, y As Int)
  Local sStr As String, result As Int
  result = x \ y
  sStr = Str(result)
  Print sStr
EndProc

This procedure declares a local dynamic variable of type String. Before the procedure exits the memory allocated for the string-data has to be released. We’ll see how this will become part of the exit code.

In a regular procedure all dynamic types are destroyed automatically before the procedure returns. (Starting with version 2.5 local arrays and hash tables are destructed correctly as well).

After selecting ‘Proc Disassembly’ the result is displayed in the Debug Output window:

--------  Disassembly -----------------------------------
1 Proc test(x As Int, y As Int) (Lines=7)
03C002D0: 6A 02                   push    2
03C002D2: B8 63 00 00 00          mov     eax,0x00000063
03C002D7: FF 15 3C 1A 4D 00       scall   INITPROC ; Ocx: $1802775D
03C002DD: E8 5A 00 00 00          call    0x03C0033C
03C002E2: FF 55 B4                call    dpt -76[ebp] ; @Tron
03C002E5: 8B 45 14                mov     eax,dpt 20[ebp]
03C002E8: 99                      cdq    
03C002E9: F7 7D 18                idiv    dpt 24[ebp]
03C002EC: 89 43 78                mov     dpt 120[ebx],eax
03C002EF: FF 55 B4                call    dpt -76[ebp] ; @Tron
03C002F2: 50                      push    eax
03C002F3: FF 15 60 1B 4D 00       scall   STRSTRI ; Ocx: $1806AC50
03C002F9: 50                      push    eax
03C002FA: 8D 43 7C                lea     eax,124[ebx]
03C002FD: 50                      push    eax
03C002FE: FF 15 C0 1D 4D 00       scall   STOSTRSV ; Ocx: $18067F73
03C00304: FF 55 B4                call    dpt -76[ebp] ; @Tron
03C00307: 6A FF                   push    -1
03C00309: 8B 43 7C                mov     eax,dpt 124[ebx]
03C0030C: FF 15 08 24 4D 00       scall   PRSEXPCR ; Ocx: $18043D3F
03C00312: 5A                      pop     edx
03C00313: FF 55 B4                call    dpt -76[ebp] ; @Tron
03C00316: 8B 4D F0                mov     ecx,dpt -16[ebp]
03C00319: 64 89 0D 00 00 00 00    mov     dpt fs:[0x00000000],ecx
03C00320: 8D 4B 7C                lea     ecx,124[ebx]
03C00323: FF 15 CC 25 4D 00       scall   CLEARSTR ; Ocx: $1807BA06
03C00329: 8B E5                   mov     esp,ebp
03C0032B: 5D                      pop     ebp
03C0032C: 5B                      pop     ebx
03C0032D: 5F                      pop     edi
03C0032E: 5E                      pop     esi
03C0032F: C2 08 00                ret     8
03C00332: 51                      push    ecx
03C00333: 8D 4B 7C                lea     ecx,124[ebx]
03C00336: FF 15 CC 25 4D 00       scall   CLEARSTR ; Ocx: $1807BA06
03C0033C: C3                      ret 
03C0033D: 90                      nop    
03C0033E: EB F2                   jmp     short 0x03C00332  

It’s immediately clear that the disassembly differs greatly from the Naked attributed procedure discussed in the previous post. The first thing we need to identify is the entry code where the stackframe is established.

The entry code
Part of the procedure’s entry code, where the stack is prepared, is located in the INITPROC library function, which takes two arguments. As we will see, in GB arguments to library functions are often passed via eax and the stack. Here, the first argument is passed on the stack and specifies an encoded value used to reserve and initialize stack space for local variables. The second argument is passed in eax and specifies the offset to an unwind (termination) handler. INITPROC is a general function that is responsible for setting up the stack for a regular GB procedure, preparing it for structured exception handling and for use with Tron/Trace. Therefore, the stack needs an additional 80 bytes for an ‘Extended Information Block’.

After INITPROC has returned, the stack for test() has been set up as shown in the figure below.

These four lines constitute the entry code of the proc:

03C002D0:  push    2
03C002D2:  mov     eax,0x00000063
03C002D7:  scall   INITPROC ; Ocx: $1802775D
03C002DD:  call    0x03C0033C

If a procedure contains local variables, the first argument tells INITPROC the number of stack bytes to reserve and initialize. This happens in the same way as we saw in the Naked procedure: the stack bytes are reserved and initialized through a series of push eax instructions, determined by the encoded value of the argument. (The value is a compiler-encoded number and does not specify the actual number of pushes that are inserted. In this case the value is coincidentally 2.)
The second argument of INITPROC, passed via eax, is an offset value used by INITPROC to calculate the address of the unwind code stored at virtual address $03C00332.

Without local variables the compiler inserts a call to INITPROC0 instead of INITPROC. INITPROC0 takes one argument only, the offset to the unwind-handler, and omits the code to prepare the procedure’s stack for use of local variables.

The unwind-termination code is only executed in case of an unhandled exception in the current proc, that is, an exception that isn’t caught by a Try/Catch handler. (To be discussed in a coming post.)

The fourth and last line of the entry code calls $03C0033C and returns immediately. Why? I have no idea …

When the program is compiled to EXE, the calls to INITPROC or INITPROC0 are replaced by calls to INITPROCEXE and INITPROCEXE0 respectively. These library functions produce smaller extended stack information blocks (68 bytes), because EXEs don’t need the Tron/Trace support. In addition, the mysterious call just below INITPROC is removed.

The actual code
The actual code starts at the fifth line, at $03C002E2, with a call to the tron-handler:

03C002E2:  call    dpt -76[ebp] ; @Tron

The address of the tron-handler is stored on the stack in the extended information block. If there is no tron procedure present, the call returns immediately and nothing happens. Because a call to the tron-handler occurs before each code statement is executed, we can use these calls to identify and examine the actual code. The first executable statement is result = x \ y, so its assembly code is found just below the first call to the tron-handler. Note that the Local statement doesn’t produce executable code here; declaration statements only introduce variables to the compiler.

We can now inspect the code for the division of x by y and the storage of the result in a local variable. Identifying the parameters is a bit more complicated now.

03C002E5:  mov     eax,dpt 20[ebp]   ; eax = x
03C002E8:  cdq                       ; sign-extend eax into edx:eax
03C002E9:  idiv    dpt 24[ebp]       ; eax = edx:eax \ y
03C002EC:  mov     dpt 120[ebx],eax  ; store eax in result

The x-parameter is accessed using 20[ebp] and the y-parameter through 24[ebp]. This means that the parameters have moved 8 bytes compared to the Naked procedure, which is exactly the number of bytes required to save esi and edi so that they can be used for Register Int types.
The local variable result is accessed through the value in ebx, at 120[ebx]. As you can see, the compiler generated code for an integer division (idiv).
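
To make the cdq / idiv pair concrete, here is a small model of the two instructions. It is written in Python purely as neutral notation; it is not GFA-BASIC code, and the function names are made up for this illustration. The model assumes the standard x86 behaviour: cdq copies the sign bit of eax into every bit of edx, and idiv divides the signed 64-bit value edx:eax by its operand, truncating the quotient toward zero.

```python
def to_signed(value, bits):
    """Interpret a bit pattern as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def cdq(eax):
    """Model of cdq: fill edx with the sign bit of eax."""
    eax &= 0xFFFFFFFF
    edx = 0xFFFFFFFF if eax & 0x80000000 else 0
    return edx, eax

def idiv(edx, eax, divisor):
    """Model of idiv: signed edx:eax divided by a 32-bit divisor.
    Returns (quotient, remainder); the quotient truncates toward zero."""
    dividend = to_signed((edx << 32) | eax, 64)
    divisor = to_signed(divisor, 32)
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return quotient, remainder

def int_divide(x, y):
    """Integer-divide x by y the way the listing does it."""
    edx, eax = cdq(x)
    return idiv(edx, eax, y)

print(int_divide(2, 6))    # (0, 2): the call test(2, 6) stores 0 in result
print(int_divide(-7, 2))   # (-3, -1): truncation toward zero
```

Without the cdq step, edx would hold whatever was left there by earlier code, and a negative dividend in eax would be divided as a large positive 64-bit number.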

The next statement assigns the string representation of result to sStr: sStr = Str(result). Again, the statement is preceded by a call to the tron-handler. We can easily identify the code:

03C002EF:  call    dpt -76[ebp] ; @Tron
03C002F2:  push    eax          ; result of division
03C002F3:  scall   STRSTRI      ; integer to temp string
03C002F9:  push    eax          ; address of temp string
03C002FA:  lea     eax,124[ebx] ; address of string variable
03C002FD:  push    eax
03C002FE:  scall   STOSTRSV     ; assign to variable

The instruction push eax passes the result of the division, which is still in eax, to STRSTRI. The integer argument is converted to string and STRSTRI returns (in eax) a pointer to a temporary string. Both the temporary string and the local string variable sStr at 124[ebx] are passed to STOSTRSV to assign (attach) the temporary string to the variable sStr, which makes it a permanent string.

Finally, the code prints the contents of sStr to the window: Print sStr

03C00304:  call    dpt -76[ebp] ; @Tron
03C00307:  push    -1                    ; True, print CRLF
03C00309:  mov     eax,dpt 124[ebx]      ; address of stringdata
03C0030C:  scall   PRSEXPCR              ; print to window 
03C00312:  pop     edx          ; fix the stack      

PRSEXPCR shows how GB optimizes library function calls. The calling convention of this function, and of many others, is GB-specific: one argument is pushed on the stack and one is passed via eax. Passing arguments via a register is very common and is generally faster than using the stack. VC++ uses ecx and edx to pass the first arguments to a __fastcall function, and Borland uses eax, ecx, and edx with its fast function calls. Many GB library functions use eax and the stack, but there are also GB library functions that use the VC++ __fastcall convention, with ecx and edx for argument passing. We’ll see this with the local variable destruction functions in the exit code.

PRSEXPCR leaves the stack cleanup to the caller, as in the cdecl convention, so the stack has to be corrected by popping the one argument.

The exit code
The EndProc statement is preceded by a call to the tron-handler as well:

03C00313:  call    dpt -76[ebp] ; @Tron
03C00316:  mov     ecx,dpt -16[ebp]         ; get saved ptr to prev SEH
03C00319:  mov     dpt fs:[0x00000000],ecx  ; remove us from SEH-list
03C00320:  lea     ecx,124[ebx]             ; address string variable
03C00323:  scall   CLEARSTR ; Ocx: $1807BA06
03C00329:  mov     esp,ebp                  ; restore the stack
03C0032B:  pop     ebp
03C0032C:  pop     ebx
03C0032D:  pop     edi
03C0032E:  pop     esi
03C0032F:  ret     8                  ; pop 2 parameters of 4 bytes

The first two lines remove the structured exception record from the thread’s SEH-linked list. The record was inserted when INITPROC created the ‘Extended Information Block’. Then, before leaving the procedure, the dynamic string has to be destroyed. The address of the string variable is passed in the ecx register to CLEARSTR which frees the allocated string memory.

The compiler inserts destruction code for all local variables with a dynamic datatype: String, Variant, Object (all COM objects), array and hash table. Before GFA-BASIC version 2.5 the destruction code for an array was only inserted if the proc contained at least one other local variable of type String, Variant or Object. Often, this required the addition of a dummy local string variable so the compiler was forced to generate the array’s destruction code. For hash tables the situation was even worse; a local hash wasn’t destructed at all. This has been fixed in update version 2.5.

Finally, the disassembly shows some more code below the procedure’s return instruction.

03C00332:  push    ecx
03C00333:  lea     ecx,124[ebx]
03C00336:  scall   CLEARSTR ; Ocx: $1807BA06
03C0033C:  ret 
03C0033D:  nop    
03C0033E:  jmp     short 0x03C00332 
some more code that is actually data

Below the procedure’s return instruction the destruction code for the local variables is replicated. During normal execution of the procedure, without exceptions, this code is never executed. It is only called by the OS when the structured exception handler tries to recover from an error and starts unwinding. Most importantly, the destruction code for dynamic types within the normal flow of the procedure must be replicated here. The rest of the disassembly contains information for Tron/Trace. Although these bytes represent data, the disassembler tries to produce assembly code from them. In fact, everything below the second ret instruction is data.

Conclusion
Regular procedures are separated into four parts: the entry code, actual code, exit code, and unwind code. The construction of the entry code is delegated to INITPROC(0). The statements in the actual code are preceded by calls to the tron-handler, which can be used to identify the statement lines. Library functions use a wide variety of calling conventions, so it sometimes requires some puzzling to identify the arguments.
When you start analyzing procedure disassemblies you will encounter many variations of the same sort of code. The information presented in this and the previous post should help you in interpreting it.

16 November 2018

Anatomy of a procedure (1)

The IDE recently gained ‘Proc Disassembly’, an option available under the Edit | Proc menu item. This is a valuable resource if you want to get a better understanding of the code generated by the compiler. Once you understand the disassembly of a proc you can use the information to your advantage, especially when it comes to optimizing procedures.

Bare minimum: Naked
Let’s start with a Naked procedure. A Naked procedure is fully optimized, both in size and in performance. This comes with a penalty though: a Naked procedure lacks support for dynamic variables, structured exception handling, and runtime debugging (Tron, Trace). The Naked attribute forces the compiler to produce code much like what you would write in a pure assembly program. The assembly code of such a procedure closely resembles textbook samples, so it’s not hard to understand the procedure flow when it is compared to the theory in the assembly books. Therefore, I start this series on the anatomy of procedures with these bare minimum procs. Examining Naked procedures allows us to understand how a proc is constructed, and this knowledge can later be used to examine regular procedures.

The following sample shows a Naked proc taking two parameters of a simple type (Int, i.e. a 32-bit Long). For now, we’ll omit the use of dynamic datatypes like String, Variant, Object, etc. The procedure contains the local variable tmp, also of a simple datatype, and assigns the product of x and y to tmp. This is the entire program:

TestMul(2, 3)
Proc TestMul(x As Int, y As Int) Naked
  Local Int tmp
  tmp = x * y
EndProc

Now put the caret inside the procedure TestMul and select Proc | Disassembly. This produces the following listing in the debug output window:

--------  Disassembly -----------------------------------
1 Proc TestMul(x As Int, y As Int) Naked (Lines=5)
042704D0: 53                             push    ebx
042704D1: 55                             push    ebp
042704D2: 8B EC                          mov     ebp,esp
042704D4: 8D 5D 80                       lea     ebx,-128[ebp]
042704D7: 2B C0                          sub     eax,eax
042704D9: 50                             push    eax
042704DA: DB 45 0C                       fild    dpt 12[ebp]
042704DD: DA 4D 10                       fimul   dpt 16[ebp]
042704E0: DB 5B 7C                       fistp   dpt 124[ebx]
042704E3: 8B E5                          mov     esp,ebp
042704E5: 5D                             pop     ebp
042704E6: 5B                             pop     ebx
042704E7: C2 08 00                       ret     8

The first line specifies the line number of the procedure (1), its entire prototype, and the number of lines (here 5, but might be more if the procedure includes any trailing empty lines).
The numbers at the start of each line show the memory address of the instructions, which might be different from your result. Consequently, in this case, the function ProcAddr(TestMul) would return the address of the first byte of the procedure: 0x042704D0.
After the memory address follow the opcode bytes of the assembly instruction. For instance, the byte 0x53 corresponds to the push ebx instruction. Some instructions require only one byte, others require several.

The first 6 lines make up the procedure’s entry code (sometimes called prologue). The last 4 lines are the procedure’s exit code (or epilogue). The lines in between represent the actual functionality of the procedure.

Entry code
The procedure’s entry code prepares the procedure’s code to handle parameters and local variables:

push    ebx            ; save ebx
push    ebp            ; save ebp
mov     ebp,esp        ; establish stackframe
lea     ebx,-128[ebp]  ; let ebx reference local vars
sub     eax,eax        ; eax = 0
push    eax            ; clear first local var

Whenever a procedure takes a parameter or declares a local variable you’ll always find the same three instructions at the start of the procedure: push ebx / push ebp / mov ebp, esp. If the procedure also contains local variables the fourth line lea ebx, -128[ebp] is present as well. Following this line you’ll find the code that initializes the local variables; all local variables are initialized to zero.

Local variables, the purpose of ebx
In GFA-BASIC 32 the ebx register has a special purpose and thus ebx cannot be used as a general purpose register. It is used as a fixed reference point to address the local variables.

Note - According to the documentation this allows variables that require more than 4 bytes (Double, Date, Large, Currency) to be laid out on 8-byte boundaries, which increases performance when they are accessed.

The ebx value points to an address 128 bytes down the stack relative to the value in ebp, the stackframe. Although the first local variable is actually located at ebp - 4, it is referenced using the value in ebx. The location of the local variable is +124 bytes relative to the value in ebx; in assembly syntax the tmp variable is located at 124[ebx].
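
The conversion between the two ways of addressing a local is simple arithmetic, sketched here in Python as neutral notation (not GFA-BASIC). The helper name and the second local are hypothetical; only the 128-byte bias and the location of tmp come from the listing.

```python
EBX_BIAS = 128  # from the entry code: lea ebx,-128[ebp]

def local_offsets(names, size=4):
    """For 4-byte locals allocated downward from ebp, return
    {name: (offset from ebp, offset from ebx)}."""
    offsets = {}
    for index, name in enumerate(names, start=1):
        from_ebp = -index * size
        offsets[name] = (from_ebp, from_ebp + EBX_BIAS)
    return offsets

# tmp is the real local; 'second' is a hypothetical extra Int local.
print(local_offsets(["tmp", "second"]))
# {'tmp': (-4, 124), 'second': (-8, 120)}
```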

The value stored at that position is obtained using dword ptr 124[ebx]. This is illustrated by the next three lines of code where the parameters are multiplied by the fpu (floating-point processor) and where the result is assigned to the local variable tmp.

042704DA: fild    dpt 12[ebp]  ; load value of param x into fpu
042704DD: fimul   dpt 16[ebp]  ; multiply by value in param y
042704E0: fistp   dpt 124[ebx] ; store result in tmp

The parameters x and y are accessed using the value in ebp as we will see.

Stack structure
When the procedure is called, the caller puts the parameters y and x on the stack, in reversed order. GB subroutines conform to the stdcall convention, which means that the parameters are pushed from right to left and that the subroutine corrects the stack before returning. Since y is the rightmost parameter it is pushed first, followed by the parameter to its left (here x). Then the call instruction pushes the return address on the stack and execution continues in the subroutine.

From this point on the stack is prepared according to the procedure’s entry code discussed above. The result can be viewed in the next picture:

The entry code saves the current values from the ebx and ebp registers on the stack. Then ebp is assigned the new esp value. Now ebp is used to address the parameters: parameter x is located at a positive offset of 12 bytes from the value in ebp; in assembly code 12[ebp]. Parameter y is 16 bytes up the stack relative to the value in ebp, in assembly code 16[ebp].
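
The parameter offsets follow directly from what lies between ebp and the first parameter: the saved registers and the return address. The following Python sketch (neutral notation, with a made-up helper name) computes them. The register list for the regular procedure of part 2 is an inference from the pop sequence in its exit code, not something the listings state explicitly.

```python
def parameter_offsets(saved_registers, param_names, slot=4):
    """Return {name: offset from ebp} for stdcall parameters.

    saved_registers: registers saved by the entry code, in push order;
    ebp is pushed last, so the saved ebp sits at 0[ebp].
    param_names: parameters from left to right (pushed right to left).
    """
    # Above ebp: the saved registers, then the return address.
    first_param = len(saved_registers) * slot + slot
    return {name: first_param + i * slot for i, name in enumerate(param_names)}

naked = parameter_offsets(["ebx", "ebp"], ["x", "y"])
regular = parameter_offsets(["esi", "edi", "ebx", "ebp"], ["x", "y"])
print(naked)    # {'x': 12, 'y': 16}
print(regular)  # {'x': 20, 'y': 24}
```

Saving two more registers moves every parameter up by 8 bytes, which matches the 20[ebp] and 24[ebp] seen in the regular procedure.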

To address the parameters throughout the procedure ebp needs to remain constant during the execution of the procedure. The same is true for ebx that is used to address the local variables. We cannot use esp to reference both parameters and local variables because esp changes automatically during the execution of the procedure. (Although C/C++ compilers sometimes keep track of esp and address all stack variables using an offset to esp.)

Allocating and initializing
After the stackframe is established (mov ebp, esp) the next step is to reserve stack space for local variables, see listing. The general idea, described in most textbooks, is to subtract the required number of bytes from esp and then clear that piece of memory. In our sample esp would have to be decreased by 4 bytes (for the Long variable tmp), after which those bytes would be set to zero. Although GB produces the same effect, it proceeds a bit differently.

Note that the first byte of the 32-bits local variable tmp is located at ebp-4. After creating the stackframe by mov ebp, esp the registers esp and ebp point to the same stack address. To reserve and initialize the 4 bytes below esp GB uses the instructions sub eax, eax / push eax.

Subtracting a register from itself results in zero. By pushing zero, now the value in eax, GB both reserves and initializes the local variable in one step. This avoids the additional step of decreasing esp explicitly first. The technique of using push to reserve and initialize is typical for GB. The push eax can be repeated to reserve and clear all stack memory necessary for local variables. Thus, if the procedure had contained two local variables of type Long, it would have had two push eax instructions.
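
The reserve-and-initialize effect of the push can be checked with a toy model of the stack. This is a Python sketch used as neutral notation; the Stack class, the start address, and the second local are assumptions made for the illustration.

```python
class Stack:
    """A toy 32-bit stack: a dict from address to dword, plus esp."""
    def __init__(self, esp):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4                # push first decrements esp ...
        self.mem[self.esp] = value   # ... then stores the dword there

def reserve_locals(stack, count):
    """Model of 'sub eax,eax' followed by count times 'push eax'."""
    eax = 0                          # sub eax,eax
    for _ in range(count):
        stack.push(eax)              # reserves 4 bytes and zeroes them

stack = Stack(esp=0x0012FF00)        # arbitrary example address
ebp = stack.esp                      # mov ebp,esp
reserve_locals(stack, 2)             # two hypothetical Int locals
print(ebp - stack.esp)               # 8 bytes reserved
print(stack.mem[ebp - 4], stack.mem[ebp - 8])  # 0 0
```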

Exit code
Before leaving the procedure the stack must be returned to the state it was in when the procedure was entered. In addition, because of the stdcall convention, the procedure must remove the bytes used for the parameters (2 * 4 bytes for two Long parameters). This is how it’s done:

042704E3:   mov     esp,ebp ; restore esp
042704E5:   pop     ebp     ; restore ebp
042704E6:   pop     ebx     ; restore ebx
042704E7:   ret     8       ; return, discarding parameters

When the program returns to the caller, the registers that matter and must remain constant have been restored. This makes sure the caller can use the correct ebp value to access its parameters and that ebx can be used to access its local variables.

Optimize using disassembly
Inspecting a procedure’s disassembly is useful to get an idea of what’s going on underneath the GFA-BASIC statements. The example presented in this post shows why. The example performs a multiplication of two integer parameters and stores the result in another integer. As you can see, the compiler generates floating-point assembly instructions to perform the math. Since all variables are of type Long, the compiler could have generated more efficient code using the integer multiplication instruction imul. However, the compiler generates integer instructions only for the addition and subtraction operators. Now it’s up to the programmer to optimize this procedure by replacing the multiplication operator * by the Mul operator. The optimized procedure then becomes:

Proc TestMul(x As Int, y As Int) Naked
  Local Int tmp
  tmp = x Mul y
EndProc

Now compile the code and inspect the disassembly. As you can see, the floating-point instructions have vanished and are replaced by the imul instruction.

Conclusion
Inspecting a procedure’s disassembly requires the ability to identify the three parts of a procedure: the entry code, the actual code, and the exit code. We discussed how to identify parameters and local variables and saw how GB uses a specific technique to reserve and initialize local variables.

In coming blog posts we’ll discuss non-naked procedures and how you can tell a procedure is a good candidate to be naked.

09 September 2018

Did the mouse leave the window?

There are two mouse messages that are never received unless you explicitly instruct Windows to track the mouse movement. The first is WM_MOUSELEAVE, which reports that the mouse has left the client area. The second is WM_MOUSEHOVER, which is posted after the mouse has hovered over some area for a certain amount of time. To obtain one (or both) of these messages you need to call the TrackMouseEvent() API, which notifies the application when the mouse leaves the window or when the mouse hovers over an area for a while.

The next program illustrates how to implement a mouseover feature by drawing a box that turns black when you move the mouse over it. The basic idea is to use WM_MOUSEMOVE to know when the mouse has moved in or out of the box. The only problem is that if the user moves the mouse quickly outside the window, you won't get a WM_MOUSEMOVE. To implement correct mouseover behavior, you need to know when the mouse has left the window entirely.

The program doesn’t use the _MouseMove eventsub, but combines the mouseover-logic into the _Message eventsub, which receives all (posted) mouse messages. There are no event subs for WM_MOUSELEAVE and WM_MOUSEHOVER, so they have to be handled in a general eventsub. An alternative would be to handle the messages in _MessageProc, but its use is a bit more complicated. In addition, _Message doesn’t require any return values, so it serves our purpose best.

OpenW Center 1
Do
  Sleep
Until Win_1 Is Nothing

Sub Win_1_Paint
  Box 10, 10, 100, 100
EndSub

Sub Win_1_Message(hWnd%, Mess%, wParam%, lParam%)
  Static Bool fTrackingMouse, fBoxHighLighted
  Dim tme As TRACKMOUSEEVENT
  Local Int mx, my
  Switch Mess%
  Case WM_MOUSEMOVE

    ' Track a mouseleave event. Results in a WM_MOUSELEAVE
    ' message when the mouse leaves the window.
    If !fTrackingMouse           ' set it only once
      tme.cbSize     = SizeOf(TRACKMOUSEEVENT)
      tme.dwFlags    = TME_LEAVE
      tme.hwndTrack  = Me.hWnd
      fTrackingMouse = TrackMouseEvent(tme) != 0
    EndIf

    ' If mouse is over the box start timer
    mx = LoWord(lParam%), my = HiWord(lParam%)
    If mx > 10 && mx < 100 && my > 10 && my < 100
      tme.cbSize      = SizeOf(TRACKMOUSEEVENT)
      tme.dwFlags     = TME_HOVER       ' start timer
      tme.hwndTrack   = Me.hWnd
      tme.dwHoverTime = HOVER_DEFAULT   ' use default time
      TrackMouseEvent(tme)              ' now wait for WM_MOUSEHOVER
    Else If fBoxHighLighted             ' hilighted and not over box
      Win_1.Invalidate 10, 10, 90, 90
      fBoxHighLighted = False
    EndIf

  Case WM_MOUSELEAVE            ' triggered by TrackMouseEvent
    fTrackingMouse = False      ' TrackMouseEvent not active anymore
    If fBoxHighLighted          ' redraw original box
      Win_1.Invalidate 10, 10, 90, 90
      fBoxHighLighted = False   ' box is not highlighted
    EndIf

  Case WM_MOUSEHOVER            ' triggered by TrackMouseEvent's timer
    mx = LoWord(lParam%), my = HiWord(lParam%)    ' mouse coordinates
    If !fBoxHighLighted && mx > 10 && mx < 100 && my > 10 && my < 100
      PBox 10, 10, 100, 100                     ' highlight the box
      fBoxHighLighted = True
    EndIf
  EndSwitch
EndSub

Public Const HOVER_DEFAULT   = 0xFFFFFFFF
Type TRACKMOUSEEVENT
  - DWord  cbSize
  - DWord  dwFlags
  - Handle hwndTrack
  - DWord  dwHoverTime
EndType
Declare Function TrackMouseEvent Lib "user32" Alias _
  "TrackMouseEvent" (ByRef EventTrack As TRACKMOUSEEVENT) As Long

The _Message sub declares two static booleans, fTrackingMouse and fBoxHighLighted, that keep track of the current state of the mouseover-logic. (If I had used the _MouseMove eventsub to initiate the mouse tracking, the variables would have had to be declared global. I always try to avoid global variables as much as possible.)
When the first (of many) WM_MOUSEMOVE messages is received, the TrackMouseEvent() API is used to set up a WM_MOUSELEAVE "one-shot" event. Exactly one WM_MOUSELEAVE message will be posted to the window specified in the hwndTrack member of the TRACKMOUSEEVENT structure when the mouse has left the client area.
Note - The message will be generated only once. The application must call the TrackMouseEvent API again in order for the system to generate another WM_MOUSELEAVE message. In addition, when the mouse pointer is not over the application, a call to TrackMouseEvent() will result in the immediate posting of a WM_MOUSELEAVE message.

When the mouse is over the box, TrackMouseEvent() is used to start a timer that eventually posts the WM_MOUSEHOVER message. After receiving WM_MOUSEHOVER the box is highlighted if the mouse is still over the box. The fBoxHighLighted variable is set to indicate the state of the box. If the variable is set but the mouse is no longer over the box, the area occupied by the box is invalidated so that it is eventually redrawn.
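
The state handling can be isolated from the Windows API and exercised on its own. The sketch below is a Python model of the logic only: FakeTracker is a stand-in that merely records what would be requested from TrackMouseEvent(), and all class and method names are invented for this illustration.

```python
class FakeTracker:
    """Stand-in for TrackMouseEvent: records the requested events."""
    def __init__(self):
        self.requests = []

    def track(self, flag):
        self.requests.append(flag)
        return True

class MouseOver:
    BOX = (10, 10, 100, 100)  # left, top, right, bottom, as in the listing

    def __init__(self, tracker):
        self.tracker = tracker
        self.tracking_leave = False
        self.highlighted = False

    def in_box(self, x, y):
        left, top, right, bottom = self.BOX
        return left < x < right and top < y < bottom

    def on_mouse_move(self, x, y):
        if not self.tracking_leave:        # arm the one-shot only once
            self.tracking_leave = self.tracker.track("LEAVE")
        if self.in_box(x, y):
            self.tracker.track("HOVER")    # (re)start the hover timer
        elif self.highlighted:
            self.highlighted = False       # moved off the box

    def on_mouse_hover(self, x, y):
        if not self.highlighted and self.in_box(x, y):
            self.highlighted = True

    def on_mouse_leave(self):
        self.tracking_leave = False        # the one-shot is used up
        self.highlighted = False

tracker = FakeTracker()
ui = MouseOver(tracker)
ui.on_mouse_move(50, 50)
ui.on_mouse_hover(50, 50)
print(ui.highlighted)      # True
ui.on_mouse_leave()
print(ui.highlighted)      # False
ui.on_mouse_move(5, 5)
print(tracker.requests)    # ['LEAVE', 'HOVER', 'LEAVE']
```

The last line shows the re-arming: after a leave, the next mouse move requests leave tracking again.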

21 August 2018

What’s the difference between ?: and Iif()?

There is a simple answer: none. The conditional ?: is called a ternary operator because it takes three arguments. It is often used as a shortcut for an If – Else statement. The syntax is:

result = condition ? expr1 : expr2

The condition must evaluate to either True or False. If the condition evaluates to True, expr1 is assigned to result. If the condition is False, result becomes the value of expr2. Consequently, the data types of expr1 and expr2 must match the type of result: they are either equal, or the types of expr1 and expr2 can be implicitly converted to the type of result. Here is an example of how to use the ternary operator:

Dim number As Int, result As String
number = 1
result = number >= 0 ? "Positive" : "Negative"

This construct replaces the following If-Else statement:

If number >= 0
  result = "Positive"
Else
  result = "Negative"
EndIf

The ?: operator is ‘inlined’: the compiler generates the fastest code possible without calling any function. Nonmatching data types are caught by the compiler.
The If-Else statement generates a little more code, because it contains two assignment statements.

The Iif() function is implemented in exactly the same way as the ?: operator. It generates the exact same inlined code.

result = Iif(number >= 0, "Positive", "Negative")

This results in the same optimized code:

037C04D0: mov     eax,[0x048BC190]  ; number to eax
037C04D5: test    eax,eax
037C04D7: jl      short 0x037C04E0  ; jmp if < 0
037C04D9: mov     eax,0x049205B4    ; address of “Positive”
037C04DE: jmp     short 0x037C04E5  ; goto
037C04E0: mov     eax,0x049205F4    ; address of “Negative”
037C04E5: push    eax               ; address of string
037C04E6: push    0x02A37F90        ; address of result
037C04EB: scall   STOSTRSV          ; store string in result

The disassembly is produced with the new Disassemble Proc feature present in the recent updates of GFA-BASIC.
Note – the compiler setting Branch-Optimizing is set to ‘Normal’, which results in shorter code.

29 April 2018

Definition of KB changed

Being an old-school programmer I learned that 1 KB == 1024 bytes; however, this changed back in 1998. The prefix K (kilo) now means 1000, not 1024: 1 kilobyte is 1000 bytes. The why is discussed at physics.nist.gov and of course Wikipedia.

The latest GB32 update displays the size of the created EXE (Gll or Lg32) after it has been written to disk. The number of KB is calculated as filesize / 1024. However, that value should have been labelled with the KiB unit rather than the KB unit. The KiB unit denotes a value divided by 1024; KB doesn’t. If the file size is to be displayed in the KB unit it should have been divided by 1000.

I became aware of this when I noticed that the Explorer displayed a different size for the compiled file. The next update will fix this by displaying the correct value in KB, so filesize / 1000.
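
The difference between the two units is easy to see side by side. This is a small Python sketch used as neutral notation; the file size is a made-up example and the function names are not part of GB32.

```python
def size_in_kb(num_bytes):
    """Size in kilobytes (decimal prefix: 1 KB = 1000 bytes)."""
    return num_bytes / 1000

def size_in_kib(num_bytes):
    """Size in kibibytes (binary prefix: 1 KiB = 1024 bytes)."""
    return num_bytes / 1024

filesize = 204800  # hypothetical EXE size in bytes
print(f"{size_in_kb(filesize):.1f} KB")    # 204.8 KB
print(f"{size_in_kib(filesize):.1f} KiB")  # 200.0 KiB
```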

30 March 2018

Floating-point numbers

Often floating-point numbers lead to confusion and frustration. Unfortunately, these problems cannot be avoided and to properly work with floating-point values a basic understanding is required.

How floating-point numbers are stored
Decimal floating-point values generally do not have an exact binary representation. This is a side effect of how the FPU represents and processes floating-point numbers. The storage format for Double and Single is the same as the one expected by the FPU registers of the CPU, which ensures consistency and fast reading from and writing to memory. There are two types of floating-point numbers: Float (or Single) and Double. They differ in their size in bytes, and therefore in the minimum and maximum values they can store; a Double also offers higher accuracy. The maximum number a Float (taking 4 bytes) can store is much smaller than what a Double (taking 8 bytes) can store. Since floating-point numbers can take an infinite number of values, they cannot all be stored in either 4 bytes (Float) or 8 bytes (Double). To store as many numbers as possible, with as much accuracy as possible, a floating-point value is stored as a formula:

X = (-1)^sign * 2^(exponent - bias) * (1 + fraction * 2^-23)

The formula contains 3 variables (sign, exponent, fraction) and one constant (bias). The bias is 127 for single-precision numbers and 1,023 (decimal) for double-precision numbers. The factor 2^-23 as written applies to a Single, which has a 23-bit fraction; a Double has a 52-bit fraction, so the factor becomes 2^-52. The values of these formula variables are stored in either 4 bytes for a Single or 8 bytes for a Double.

The next example uses a user-defined type Sfloat to illustrate the storage of the Float data type. The values for fraction and exponent, together with a sign bit, are stored in the 4 bytes. By using a Union we can assign a value to a Single variable and then use Sfloat to dump the 4 bytes that make up the Float:

Debug.Show
Type Sfloat
  fraction As Bits 23   // fractional part
  exponent As Bits  8   // exponent + 127
  sign     As Bits  1   // sign bit
EndType
Type TFloat Union       // sizeof() = 4
  value As Float
  sf    As Sfloat
EndType
Dim fv As TFloat
fv.value = 2.0     : DumpFloat(fv)
fv.value = 0.0     : DumpFloat(fv)
fv.value = -345.01 : DumpFloat(fv)
' Do test some more ..

Proc DumpFloat(ByRef tflt As TFloat)
  Global Const bias As Int = 127        ' Standard IEEE
  Global Const frexp As Float = 2 ^ -23 ' Standard IEEE
  Dim flt!

  Debug "> DumpFloat:"; tflt.value;
  With tflt.sf
    Debug " (sign =";.sign; " exponent ="; .exponent; _
      " fraction =";.fraction;")"
    Debug "Binary format: ";Bin(.sign, 1)` _
      Bin(.exponent, 8)`Bin(.fraction, 23)

    ' Reconstruct value from Sfloat using formula:
    flt! = ((-1) ^ .sign Mul 2 ^ (.exponent - bias)) _
      * (1 + (.fraction * frexp))
    Debug "Float reconstructed ="; flt!
  EndWith
  Debug
EndProc

The output of the demo is:

> DumpFloat: 2 (sign = 0 exponent = 128 fraction = 0)
Binary format: 0 10000000 00000000000000000000000
Float reconstructed = 2
> DumpFloat: 0 (sign = 0 exponent = 0 fraction = 0)
Binary format: 0 00000000 00000000000000000000000
Float reconstructed = 0
> DumpFloat:-345.01 (sign = 1 exponent = 135 fraction = 2916680)
Binary format: 1 10000111 01011001000000101001000
Float reconstructed =-345.01

What does this tell us? A Float (or Double) is stored and described using 3 components in the bits of either a 4-byte or an 8-byte type. Due to the limited storage for these components, only an approximation of the decimal value can be ‘described’. When a floating-point value is assigned to a variable, the value is dissected into these 3 components. To get back to the original floating-point number, these 3 components are substituted into the standardized formula.
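The same dissection works for the 8-byte format. The following is a minimal sketch in Python (not GFA-BASIC), assuming the standard IEEE 754 double layout of 1 sign bit, 11 exponent bits with bias 1023, and 52 fraction bits. The helper names decode_double and rebuild_double are made up for this illustration.

```python
import struct

def decode_double(value):
    """Split an IEEE 754 double into its sign, exponent and fraction fields."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, stored as exponent + 1023
    fraction = bits & ((1 << 52) - 1)  # 52 bits
    return sign, exponent, fraction

def rebuild_double(sign, exponent, fraction):
    """Apply the formula; valid for normal numbers only (0 < exponent < 2047)."""
    return (-1) ** sign * 2.0 ** (exponent - 1023) * (1 + fraction * 2.0 ** -52)

parts = decode_double(-345.01)
print(parts)
print(rebuild_double(*parts) == -345.01)   # True
```

The stored exponent values 0 and 2047 are reserved for zero, denormals, infinities and NaN, so the sketch does not apply the formula to them.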

Effect of floating point values
A floating-point value is stored as a description, not as its value. This makes it inherently inaccurate. Even common decimal fractions, such as 0.0001, cannot be represented exactly in binary; only fractional numbers of the form n/f, where f is an integer power of 2, can be expressed exactly with a finite number of bits. Examples are 1/4, 7/16, and 3/128, where each f is a power of 2.
The inaccuracy may increase slightly when a floating-point value is loaded into an 80-bit FPU register. The FPU fills the remaining bits, because the 80-bit representation is different from the 32-bit Single or 64-bit Double format. Moving a value out of the FPU register rounds the value back to fit in either a Single or a Double. Storing and transporting may thus add to the inaccuracy of the value.
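The n/f rule can be checked with exact rational arithmetic. This is a small sketch in Python (not GFA-BASIC); Fraction converts the stored double to an exact ratio without rounding, and the helper name stored_exactly is invented for the example.

```python
from fractions import Fraction

def stored_exactly(value, numerator, denominator):
    """True if the double `value` equals numerator/denominator exactly."""
    return Fraction(value) == Fraction(numerator, denominator)

print(stored_exactly(0.25, 1, 4))          # True
print(stored_exactly(0.4375, 7, 16))       # True
print(stored_exactly(0.0234375, 3, 128))   # True
print(stored_exactly(0.0001, 1, 10000))    # False: 10000 is not a power of 2
```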

The following example shows what happens when the small error in representing 0.0001 propagates to the sum:

Dim dSum As Double, i As Int
For i = 1 To 10000
  dSum = dSum + 0.0001
Next i
Debug dSum     ' = 0.999999999999906

Theoretically the sum should be 1.0.

Not only do calculations suffer from inaccuracy; comparisons with floating-point numbers are equally problematic. The following example demonstrates a ‘forbidden’ comparison between a floating-point constant and the result of a calculation:

Global Double dVal1, dVal2
dVal1 = 69.82
dVal2 = 69.20 + 0.62
Assert dVal1 == dVal2  ' Not equal

This throws an ASSERT exception, because the assertion that dVal1 and dVal2 are equal fails.
A comparison between two floating-point constants of the same type is allowed. For instance, the GFA-BASIC runtime returns a Single constant from DllVersion (2.33; 2.341; etc.). This constant may be compared to a literal Single constant (note the exclamation mark; without it 2.33 is a Double!):

If DllVersion == 2.33! MsgBox "This is version 2.33"

Never compare two different data types, such as a Single to a Double or a floating-point to an integer. These comparisons will most certainly fail (unless the values can be described using a finite number of bits, see above). Any comparison to a floating-point value will most likely fail, because the comparison is executed in the FPU, which expands the values to 80 bits. The same is true for the comparison of the results of two floating-point calculations: it will most certainly fail.

The most logical solution for floating-point comparison of type Double is the use of the special operator NEAR, which uses only 7 decimal digits from both expressions for the comparison. In practice the expressions are compared as if they are both of Single precision.
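The idea behind such a tolerant comparison can be sketched as follows. This is Python, not GFA-BASIC, and it is not the implementation of NEAR: the function near below is a hypothetical stand-in that uses a relative tolerance of 1e-7 to approximate "equal on about 7 significant digits".

```python
import math

def near(a, b, digits=7):
    """Compare two values on roughly `digits` significant decimal digits."""
    return math.isclose(a, b, rel_tol=10.0 ** -digits)

d_val1 = 69.82
d_val2 = 69.20 + 0.62
print(d_val1 == d_val2)       # False: the two doubles differ in the last bit
print(near(d_val1, d_val2))   # True: equal within the tolerance
```

The tolerance is relative, so the same function works for both very large and very small magnitudes; comparisons against exactly 0 would need an additional absolute tolerance.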

Improve floating-point consistency in calculations
If your application performs multiple floating-point calculations, it is necessary to keep the intermediate values in the proper data format; otherwise small errors are propagated through the calculations. For multiple floating-point calculations the FPU uses the intermediate results that it holds in the 80-bit FPU registers. However, these 80-bit calculations do not reflect the data types involved, the Single or Double. Due to the extra level of accuracy, multiple calculations may produce unexpected results. The compiler setting ‘Improve floating-point consistency’ inserts code to load and write intermediate results from and to memory in the appropriate type. This decreases program speed, but improves the chance of getting the expected result of the calculation. Make sure ‘Improve floating-point consistency’ is always checked (unless you know exactly what you’re doing).

Conclusion 
Floating-point values are inherently inaccurate, so you might want to avoid them as much as possible. Instead use integers whenever possible, or otherwise use Currency, which is an integer value as well. The Currency data type exactly stores up to 19 digits, with 4 digits after the decimal point.
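Why a scaled integer avoids the problem can be shown with a small sketch. It is written in Python (not GFA-BASIC) and only imitates the idea of a Currency-like type, assuming values are kept as an integer count of 1/10000 units. The names SCALE, to_scaled, sum_scaled and sum_float are made up for the example.

```python
from decimal import Decimal

SCALE = 10_000  # four digits after the decimal point

def to_scaled(text):
    """Convert a decimal string such as '0.0001' to an integer number of 1/10000 units."""
    return int(Decimal(text) * SCALE)

def sum_scaled(text, count):
    """Add the scaled integer value `count` times; integer addition is exact."""
    total = 0
    for _ in range(count):
        total += to_scaled(text)
    return total

def sum_float(value, count):
    """Add a binary floating-point value `count` times; rounding errors accumulate."""
    total = 0.0
    for _ in range(count):
        total += value
    return total

print(sum_scaled("0.0001", 10_000) == SCALE)   # True: exactly 1.0000
print(sum_float(0.0001, 10_000) == 1.0)        # False
```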

21 March 2018

Function and Sub parameters

The new English HTML help provides additional information on Function and Sub. Since this information is new, a copy of the text is included in this blog post.

Function parameters
The return value of a Function can be assigned to a local variable with the same name as the Function. When the return type is a numeric data type, a local variable of that type is automatically added to the function’s local variables. With String, Type (UDT) and Variant as the return type, a by reference variable is passed as the last argument on the stack. The string, UDT or variant becomes the variable that can be used to pass the function’s return value. Therefore, the following two calls are equivalent:

Dim h$
h$ = testf(8) ' assign result to h$
testp(8, h$)  ' put result in h$
Function testf(a%) As String
  testf = "3" & a%
EndFunc
Procedure testp(a%, ByRef p$)
  p$ = "3" & a%
EndProc

In the function testf the variable h$ is silently passed on the stack. Inside the function this by reference variable is known as testf. Assigning a new string to testf actually assigns the string to h$ directly.

When a Function is used for a Windows API callback make sure the return data type is a primary numeric type (Byte, Word, Long, Int64, Single, Double), otherwise the stack will be overwritten.

Sub And FunctionVar parameters
For compatibility reasons GFA-BASIC 32 includes the Sub and FunctionVar statements. FunctionVar is compatible with VB’s Function; arguments are passed by reference by default and without a data type the Variant type is assumed. The same is true for Sub, without a ByVal or ByRef keyword the default is by reference, an implicit by reference. When a datatype is missing the Variant type is the default. Examples:

Sub test(vnt)        ' implicit ByRef, Variant datatype
  vnt = "new Value"  ' do not write to parameter

FunctionVar tfv(a As String) ' As Variant
  tfv = "new value" + a

Although ByRef is implied, the rules aren’t as strict as with an explicit ByRef. When ByRef is included, only actual variables can be passed; without the ByRef keyword the Sub and FunctionVar also accept literal values and types that don’t match. The following calls are allowed:

Local vnt As Variant, s As String
test(7)     ' 7 is assigned to a local Variant first
test(vnt)   ' vnt variable is passed by ref
test(s)     ' types don't match
vnt = tfv(7)' 7 is assigned to string first

Note that the literal value 7 is passed to the by reference parameter vnt in test. The compiler won’t complain, since an implicit by reference does not have to reference an actual application variable. When the Sub test(vnt) and FunctionVar tfv() include a ByRef keyword explicitly, the compiler will check the argument’s type against the parameter’s type and will complain if they don’t match. An explicit ByRef declaration only accepts actual variables of the same type as the parameter. Both the variable that is passed as the argument and the by reference parameter must be of the same type, like this:

Dim s As String
CallByRef s     ' must be a variable
Debug s         ' = “new value”
Sub CallByRef(ByRef p As String)
  ' only accepts String variables as arguments
  p = "new value"
EndSub

Limits to the use of parameters
So, there is a subtle difference between the default (an implicit ByRef) and an explicit ByRef in a Sub and FunctionVar. In both cases only a variable can be passed by reference properly. With an implicit by reference, an actual variable can be passed only when the types match. In all other circumstances the argument is first copied to a hidden local variable of the same datatype as the parameter, and then the hidden variable is passed by reference. In the example above, the literal value 7 is first copied to a temporary hidden Variant variable and then the temporary variable is passed by reference. The same is true for the third call: test(s). Since the data types don’t match, the string s is first copied to a temporary Variant variable, which in turn is then passed to test(). The temporary hidden variable is destroyed immediately after returning from the Sub or FunctionVar.

Now let’s look at it from the Sub’s point of view. Although by reference is implied, the Sub doesn’t know what kind of variable is actually passed. It might be an actual variable, but it might also be a reference to a temporary local variable. Therefore Sub and FunctionVar cannot reliably return a value using an implicit by reference parameter. In case a temporary variable is passed, it will be destroyed immediately after returning from the Sub. Only when a parameter is declared using the ByRef keyword explicitly is the Sub guaranteed to receive an actual variable.

Note: An implicit ByRef parameter cannot be used as a local variable, as can be done with ByVal parameters. The Sub doesn’t know whether the parameter is a reference to a temporary variable or to an actual variable. In case the parameter references an actual application variable, the Sub might very well overwrite the contents of that variable.


04 March 2018

GfaWin23.Ocx Update 2.34

Another three bugs are fixed in this version. SetPrinterByName needed maintenance because newer versions of Windows need ever more memory. The command failed with some printers that returned a lot of information from the GetPrinter() API: GFA-BASIC did not reserve enough memory, which caused a buffer overrun. The EOF() function now also works with inline files, the ones that are stored in the :Files section of the program. The ListView.GetFirstVisible property now returns a ListItem.

About Version-numbering
The update gets FileVersion 2.34.1803 and still belongs to GFA-BASIC product version 2.3. The Build number now shows the year and month of the release. This is the first update with the new version structure. The DllVersion$ is 2.34 Build 1803 and indicates a release date of March 2018. A DLLVERSION structure always uses a 3-part format: major, minor and build. The VERSIONINFO resource, on the other hand, uses a 4-part format. The GfaWin23.Ocx VERSIONINFO structure == 2.34.1803.0 and leaves the last part unused. This is the version shown in the File Properties dialog.

The ocx extension indicates a DLL with OCX-controls that are described using a type library. In contrast with the purpose of the extension, the OCX controls are not publicly registered and are only available within GFA-BASIC 32.

Automation
All OLE classes defined in the GfaWin23.ocx are private to the GFA-BASIC 32 application; each OLE class is implemented with its own command. This leaves no room for dynamic loading of other COM classes and automatic use of their interfaces (like VB). Third party COM classes can only be used when they implement a dual interface, so they are accessible through a dispatch identifier. In GFA-BASIC these dual interface classes are supported through the use of CreateObject and the Object.property syntax. Unfortunately, calling an interface member (property/method) requires a two-step process. First the caller must ask the server for an ID number and then use that number to actually invoke the member (property/method). Each dot operator requires this two-step process, and accessing dual interface members can cost quite some time when a command consists of many dots. For instance, something like Object.List.Items(n).Text requires 6 calls to the COM-class provider. GFA-BASIC 32 provides a hidden optimization for automation objects created with CreateObject(). It caches all IDs in a hash table the first time they are used. The next time a property/method is used, its ID is looked up in the hash table, which is considerably faster.
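The effect of caching the IDs can be illustrated with a small model. This is a sketch in Python, not GFA-BASIC and not the runtime's actual code: FakeServer and CachingCaller are invented classes, and the member names and ID numbers are arbitrary. The server counts how often it is asked to translate a name into an ID.

```python
class FakeServer:
    """Hypothetical automation server that counts name-to-ID lookups."""
    def __init__(self):
        self.members = {"List": 1, "Items": 2, "Text": 3}
        self.lookups = 0

    def get_id(self, name):
        self.lookups += 1            # each call stands for one round trip to the server
        return self.members[name]

class CachingCaller:
    """Keeps every ID in a hash table (a dict) after its first lookup."""
    def __init__(self, server):
        self.server = server
        self.ids = {}

    def member_id(self, name):
        if name not in self.ids:     # only the first use reaches the server
            self.ids[name] = self.server.get_id(name)
        return self.ids[name]

server = FakeServer()
caller = CachingCaller(server)
for _ in range(100):                 # evaluate Object.List.Items(n).Text 100 times
    for name in ("List", "Items", "Text"):
        caller.member_id(name)
print(server.lookups)                # 3 instead of 300
```

Only the first step of the two-step process is saved; the call that actually invokes the member is still needed for every dot.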