24 March 2020

Converting a float to an integer

Without giving it another thought we often convert a floating-point value to an integer by simply assigning a float to an integer data type variable. We assume the compiler knows what’s best and we trust the compiler does the proper conversion for us. Time for a look behind the scenes.

The FPU and the control register
The FPU is independent of the main processor and contains its own set of registers to perform its task. The FPU registers include eight 80-bit data registers (ST0-ST7) and three 16-bit registers called the control, status, and tag registers. We will ignore the status and tag registers and focus on the control register, which controls the floating-point functions within the FPU. Each bit of this 16-bit register defines a specific setting, such as the exceptions to produce, the precision the FPU uses to calculate floating-point values, and the method used to round floating-point results. The bits are shown below:

Control bits   Description
0-5            Exception masks
6-7            Reserved
8-9            Precision control
10-11          Rounding control
12             Infinity control

For the conversion from float to integer we are only interested in bits 10 and 11. The x87 FPU implements four rounding methods in hardware. The possible settings of the rounding control bits are as follows:

00 - Round to nearest (even)
01 - Round down, towards negative infinity
10 - Round up, towards positive infinity
11 - Round toward zero

The "Round to nearest (even)" method is used by default by GFA-BASIC 32, so there's a high chance you're already using it. The current rounding mode of the x87 can be obtained using the fstcw assembler command, as shown in the following sample:

' Status of bits 10 and 11 of control word
Dim cf As Word
. fstcw [cf]
Debug Bin(cf %& %110000000000, 2)  ' = 00

By default, GB clears both rounding bits and converts floating-point values to integers using “Round to nearest (even)”. For this to happen, GB initializes the FPU control word with the value 0x372, in which the rounding bits are cleared. The rounding mode can be changed by loading a new control word with the fldcw instruction. GB changes the rounding bits when it rounds a floating-point value using one of the truncation/rounding functions like Int, Trunc, Floor, Ceil, etc., as we’ll see later in this post.

Rounding to the nearest (even) is correct for most calculations, but it can cause unexpected behavior. For instance, 2.5 is not rounded to 3 but to 2, because the FPU prefers the nearest even value when the fractional part is exactly .5. For the same reason 3.5 is rounded to 4. This is useful in statistical analysis, where you want to spread numbers evenly when they are exactly in the middle of two integers. However, this might not always be what you want. In C and C++, for instance, converting a floating-point value to an integer type (a cast to int) always truncates toward zero.
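The difference between the two tie rules is easy to see outside GB as well. The following is a small Python sketch, not GB code: Python's built-in round() happens to use the same round-half-to-even rule as the FPU default, and int() truncates toward zero like a C cast. The sample values are chosen here purely for illustration.

```python
# Python's built-in round() uses round-half-to-even, the same tie rule
# as the x87 "Round to nearest (even)" mode.
ties = [0.5, 1.5, 2.5, 3.5, -2.5]

nearest_even = [round(x) for x in ties]   # ties go to the even neighbour
toward_zero = [int(x) for x in ties]      # int() truncates like a C cast

print(nearest_even)
print(toward_zero)
```

Note how 2.5 and 3.5 end up on different sides under round-half-to-even, while truncation always drops the fraction.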

If you don’t care and you are happy with GB’s default rounding, you can convert a floating-point value to an integer by simply assigning one to the other:

Dim f As Float = 2.5, i As Int
i = f     ' assign float to int
Debug i   ' = 2

This is the fastest possible conversion: it takes only one assembler instruction (fistp) to convert a floating-point value, as shown in the disassembly (without the Debug command). For more details on disassembling GB code, see the blog post Anatomy of a procedure.

--------  Disassembly -----------------------------------
0 - (Sub Main) (Lines=2)
0559C350: B8 2E 00 00 00                 mov     eax,0x0000002E
0559C355: FF 15 40 1A 4D 00              scall   INITPROC0 ; Ocx: $180277CB
0559C35B: F8                             clc    
0559C35C: FF 55 B4                       call    dpt -76[ebp] ; @Tron
0559C35F: C7 05 10 A7 5C 06 00 00 20 40  mov     dpt [0x065CA710],0x40200000
0559C369: FF 55 B4                       call    dpt -76[ebp] ; @Tron
0559C36C: D9 05 10 A7 5C 06              fld     dpt [0x065CA710]
0559C372: DB 1D 14 A7 5C 06              fistp   dpt [0x065CA714]

0559C378: 8B 4D F0                       mov     ecx,dpt -16[ebp]
0559C37B: 64 89 0D 00 00 00 00           mov     dpt fs:[0x00000000],ecx
0559C382: 8B E5                          mov     esp,ebp
0559C384: 5D                             pop     ebp
0559C385: 5B                             pop     ebx
0559C386: 5F                             pop     edi
0559C387: 5E                             pop     esi
0559C388: C3                             ret
    

The interesting instructions are fld and fistp. First fld loads the floating-point value into ST0 (the top of the stack). Then fistp converts the value in ST0 to an integer, stores it at the specified address, and pops it off the floating-point stack. fistp uses the rounding control setting to determine how it converts the floating-point value to an integer during the store operation.

There is one other assembler instruction that rounds to integer. The frndint instruction rounds the value in ST0 (the top of the stack) to the nearest integer using the rounding algorithm specified in the control register. The result remains in ST0 as a floating point value, it simply does not have a fractional component. GB uses the frndint instruction for its truncation functions Int(), Fix() and others.

The truncation functions
The functions Trunc and Fix round towards zero, Int and Floor round towards negative infinity, and Ceil rounds towards positive infinity. Let’s start with the most often used truncation function, Int(), or its synonym Floor(). We’ll use this simple program to look at its disassembly.

Dim f As Float = 2.5, i As Int
i = Int(f)   ' truncate down to negative infinity

The disassembly of the Int() function (without the surrounding commands):

0559C4EC: D9 05 10 A7 5C 06   fld     dpt [0x065CA710]
0559C4F2: D9 2D 1C 1A 4D 00   fldcw   V_RNDMINUS
0559C4F8: D9 FC               frndint
0559C4FA: D9 2D 14 1A 4D 00   fldcw   V_RNDNEAR
0559C500: DB 1D 14 A7 5C 06   fistp   dpt [0x065CA714]

First ST0 is loaded with the value in variable f, then the control word of the FPU is loaded with the value from a variable called V_RNDMINUS, which is 0x772. This value sets the rounding bits of the control word to %01. By executing frndint the value in ST0 is rounded to an integer using the setting “Round down towards negative infinity”. Int() rounds 2.5 to 2 and -2.5 to -3. After rounding, the control word is reset to the default value 0x372, which is stored in the runtime variable V_RNDNEAR. Finally, the value in ST0 is moved to the variable i with fistp.

Note Technically fld and fistp aren’t part of the Int() function. How a value ends up in ST0 depends on the code of the program. Similarly, fistp is only inserted if the result of Int() is to be stored in a variable. If Int() is used inside an expression the value remains in ST0.

In the same way you can create samples for the other truncation functions Trunc(), Fix(), and Ceil() and then examine their disassembly output. Trunc() and its synonym Fix() load the control word with the value stored in V_RNDZERO ( = 0xF72). This value sets the rounding bits to %11 to round towards zero. For Ceil() the rounding bits are set to %10, the value for the control word is obtained from the variable V_RNDPLUS ( = 0xB72), and frndint then rounds to positive infinity.
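To check how a control word value maps to a rounding mode, the rounding bits can be extracted by shifting the word right by 10 and masking two bits. The Python sketch below is a model only, not GB code: it decodes the values 0x372, 0x772, 0xB72 and 0xF72 and applies an equivalent Python rounding function to 2.5 and -2.5. The names ROUNDING_MODES, rounding_bits and round_with are made up for this illustration.

```python
import math

# Rounding control field (bits 10-11) mapped to an equivalent Python function.
ROUNDING_MODES = {
    0b00: ("nearest (even)", round),
    0b01: ("down", math.floor),
    0b10: ("up", math.ceil),
    0b11: ("toward zero", math.trunc),
}

def rounding_bits(control_word):
    """Return bits 10-11 of an x87 control word."""
    return (control_word >> 10) & 0b11

def round_with(control_word, value):
    """Round value the way frndint would under the given control word."""
    name, func = ROUNDING_MODES[rounding_bits(control_word)]
    return name, func(value)

for cw in (0x372, 0x772, 0xB72, 0xF72):
    print(hex(cw), round_with(cw, 2.5), round_with(cw, -2.5))
```

The four words differ only in bits 10 and 11; the lower bits (0x372) stay the same in every case.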

The QRound function
The QRound function is an addition to the truncation/rounding functions. The compiler generates only one instruction for this function: the frndint instruction to round the value in ST0. The compiler does not load the control word prior to executing the rounding.  QRound uses the current control word setting (0x372) and “Round to nearest (even)”. QRound is useful inside a mathematical expression where some interim outcome needs to be rounded (converted) to integer using the current control word setting. Since the interim outcome remains in ST0 (without the fractional part) a (complex) expression can be evaluated more quickly. In short the steps for variable = QRound(float) are:

fld float
frndint
fistp variable

Again, fld and fistp are not part of the function itself. Since QRound is mostly used with the default control word setting, the rounding is the same as when a float is assigned to an integer variable directly, as demonstrated at the beginning of this blog post.

Note that using the assembler instruction fldcw prior to QRound you can determine your own float-to-integer conversion. Do not forget to return the value of the control word back to 0x372 afterwards.

Use Round for proper rounding
The Round function generates the same assembler instructions as Int(). However, before frndint is executed, 0.5 is added to the value in ST0. The value in ST0 is then rounded towards negative infinity. In short, these steps are:

fld value
fadd 0.5 / fldcw 0x772 / frndint / fldcw 0x372
fistp variable

If your program wants “proper rounding” it should use Round() to convert a floating-point value to an integer.
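The "add 0.5, then round down" recipe can be modelled in a few lines. The Python sketch below is an illustration of the arithmetic only, not GB's implementation; the function name round_half_up is made up here. It compares the recipe with round-half-to-even for a few sample values.

```python
import math

def round_half_up(value):
    """Add 0.5, then round toward negative infinity."""
    return math.floor(value + 0.5)

samples = [2.5, 3.5, -2.5, 2.4, -2.6]
half_up = [round_half_up(x) for x in samples]   # models the Round() steps
half_even = [round(x) for x in samples]         # models the default mode

print(half_up)
print(half_even)
```

The two methods only disagree on exact ties such as 2.5. Note that with this recipe a negative tie like -2.5 goes up to -2, not away from zero.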

18 January 2020

High resolution timer wrapped in a COM object

Only recently I needed a timer with a shorter interval than the Ocx Timer can provide. The Ocx Timer’s smallest interval is 15.625 ms (64 ticks per second), whereas I needed an interval of 10 ms to receive 100 timer events per second. After some research I decided to use the API function CreateTimerQueueTimer() as a high-resolution timer. For a discussion on available timers see this Code Project article. I didn’t use the multimedia timers because MS advises against them: they increase the system clock’s frequency, which drains battery power on mobile devices. Nevertheless, the CreateTimerQueueTimer() API only produces intervals shorter than 15.625 ms if the system clock resolution is adjusted as well, using the multimedia function timeBeginPeriod(). This function cannot be used to increase the resolution of the SetTimer API, which is used by the Ocx Timer.
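The numbers above can be reproduced with a deliberately simplified model, which assumes that a timer can only fire on a system clock tick, so every requested period is stretched to a whole number of ticks. That assumption is mine and ignores scheduling jitter; the function name events_per_second is made up for this Python sketch.

```python
import math

def events_per_second(requested_ms, tick_ms):
    """Simplified model: each period is stretched to a whole number of ticks."""
    ticks = max(1, math.ceil(requested_ms / tick_ms))
    return 1000.0 / (ticks * tick_ms)

default_rate = events_per_second(10, 15.625)   # default 64 Hz system clock
raised_rate = events_per_second(10, 1.0)       # clock resolution set to 1 ms

print(default_rate, raised_rate)
```

Under this model a 10 ms request yields 64 events per second at the default resolution and 100 events per second once the resolution is 1 ms.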

A resource must be deleted
As with most Windows resources the queued timer comes with a create- and a release function. The created timer is released using the Windows API DeleteTimerQueueTimer() and is (usually) invoked when the program terminates. In addition, at the very end of the application, the system timer must be reset using timeEndPeriod(). The most common scenario for a GB program is outlined in this simple program:

$Library "mmsystem.inc"
OpenW 1
Global timerHandle As Handle, param As Large
' Create a 10 ms timer with ID=1 for Me
param = MakeLargeHiLo(Me.hWnd, 1)   ' assemble handle and ID
~timeBeginPeriod(1)
CreateTimerQueueTimer(timerHandle, Null, _
  ProcAddr(TimerQProc), V:param, 0, 10, WT_EXECUTEINTIMERTHREAD)
Do
  Sleep
Until Me Is Nothing
DeleteTimerQueueTimer(Null, timerHandle, Null)
~timeEndPeriod(1)

Proc TimerQProc(ByVal pParameter As Long, ByVal TimerOrWaitFired As Long) Naked
  ' Process timer event
  Dim pL As Pointer Large, hWnd As Handle, ID As Long
  Pointer pL = pParameter
  hWnd = HiLarge(pL), ID = LoLarge(pL)
EndProc

This sample only shows the general structure of a program that uses a Windows timer resource; the structure is the same if the program uses some other Windows resource. In this scenario a Windows resource is allocated before entering the message loop and released after the message loop has finished and the last Form has closed. However, when the program unexpectedly stops with a runtime error, the code below the message loop is never executed! This leads to unreleased Windows resources, something you don’t want. When GB raises a runtime error it stops at the line where the error occurred and halts further execution of the program. The program’s windows (Forms) remain on the screen waiting to be closed or ‘cleaned up’ by using the wipe-window button in the IDE’s toolbar. Closing the remaining windows this way does not trigger any event subs such as the Form_Destroy event sub. Consequently, it is pointless to move the resource delete function to this event sub, because it is not executed once the program has stopped with a runtime error.

Each time the program is run within the IDE and stops with a runtime error, it does not release the allocated resources. The same is true for memory allocated with the GB function mAlloc, which requires a call to mFree to release it. We need a way to release allocated resources under all circumstances.

Using a COM wrapper
When GB stops executing after a runtime error it still releases GB resources: it closes I/O channels, deletes any TempFileName files, and finally clears all the program’s global variables. For dynamic variable types (String, Object, arrays, hashes) the allocated memory is freed as well. (Therefore, it is sometimes better to use a string to allocate memory than to use mAlloc, because strings are freed automatically.) For global variables that hold a COM object GB calls the Release vtable function of the IUnknown interface that each COM object implements. So, if we wrap the resource handling in a (minimal) COM wrapper and store it in an Object type, we are assured that the Release function is called and we can properly delete the resource in the object’s Release function. This way we’re able to free the resources under all conditions.
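The idea of tying a resource to a reference count can be sketched independently of COM. The following Python model is an illustration only: the class MinimalWrapper and its cleanup callback are made-up names, and the callback merely stands in for a resource delete call. It shows that the cleanup runs exactly once, when the last reference is released.

```python
class MinimalWrapper:
    """Toy model of a reference-counted wrapper around one resource."""

    def __init__(self, cleanup):
        self.refcount = 1        # the creator holds the first reference
        self._cleanup = cleanup  # stands in for the resource delete call

    def add_ref(self):
        self.refcount += 1
        return self.refcount

    def release(self):
        self.refcount -= 1
        if self.refcount == 0:
            self._cleanup()      # runs exactly once, on the last release
        return self.refcount

log = []
timer = MinimalWrapper(lambda: log.append("timer deleted"))
timer.add_ref()     # a second variable references the object
timer.release()     # first reference goes away, resource is kept
timer.release()     # last reference goes away, resource is deleted

print(log)
```

Whoever clears the last reference, including a runtime that cleans up global variables after an error, triggers the cleanup.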

If you’re not familiar with COM objects and the IUnknown implementation you might read a previous post first: COM in GB32 – IUnknown. The rest of this post discusses how to create a minimal COM wrapper for the queued timer APIs.

The minimal COM wrapper
The following full working sample creates a queued timer in the QueTimer function which returns an Object that holds a reference to the minimal COM object it creates. A COM object must at least implement the IUnknown interface that consists of the QueryInterface, AddRef and Release functions. Since this COM object doesn’t support any other interfaces (except IUnknown) we simply return with E_NOTIMPL from the QueryInterface function. The COM object is built manually in code and cannot be created by a function like CreateObject(). As a result QueryInterface is never called. The AddRef and Release functions require a proper implementation since these vtable functions are called by GB’s Set command.
The vtable functions must have the Naked attribute, or at least a $StepOff command, to prevent the GB compiler from inserting Tron code which can result in nasty and hard to find bugs. This is also true for any callback function Windows calls; the QueTimer callback procedure needs the Naked attribute as well.

$Library "mmsystem.inc"

OpenW 1, 0, 0, 300, 300, 48
PrintScroll = 1 : PrintWrap = 1

Global Object tmrQ1, tmrQ2
Set tmrQ1 = QueTimer(Me, 1, 10)   ' ID=1, 10 msec
Set tmrQ2 = QueTimer(Me, 2, 1000) ' ID=2, 1000 msec

Global Long Count, CountToErr
Do
  Sleep
Until Me Is Nothing

Sub Win_1_Message(hWnd%, Mess%, wParam%, lParam%)
  ' Process the WM_TIMER
  Static Long CountToErr
  If Mess% = WM_TIMER
    If wParam% == 1
      Count++
      Print ".";      // do something
    ElseIf wParam% == 2
      TitleW 1, "Timer Events/s:" + Str(Count) : Count = 0
      ' Interrupt GB with a runtime error after 10 sec
      CountToErr++ : If CountToErr = 10 Then Error 3
    EndIf
  EndIf
EndSub

Proc QueTimerProc(ByVal pParameter As Long, ByVal TimerOrWaitFired As Long) Naked
  ' Callback function
  Local hWnd As Handle, id As Long, pObj As Pointer IQueTimer
  Pointer pObj = pParameter             ' holds address of a IQueTimer object
  ~PostMessage(pObj.hWndTarget, WM_TIMER, pObj.TimerID, 0)
EndProc


Function QueTimer(frm As Form, id As Long, mSec As Long) As Object
  ' Create high-resolution timer wrapped in a minimal COM object.
  Global Long g_IQueTimerCnt

  Type IQueTimer         ' definition of object
    lpVtbl As Long
    refcount As Long
    Handle As Handle
    hWndTarget As Handle
    TimerID As Long
  EndType

  ' Set up the IUnknown vtable, same for each object
  Static vTable(0 .. 2) As Long    ' must remain in memory
  If vTable(0) == 0                ' do this only once
    vTable(0) = ProcAddr(IQueTimerVtbl_QueryInterface)
    vTable(1) = ProcAddr(IQueTimerVtbl_AddRef)
    vTable(2) = ProcAddr(IQueTimerVtbl_Release)
  EndIf

  ' Alloc and clear an IQueTimer object (Type) and
  ' assign it to an IQueTimer pointer.
  Local pObj As Pointer IQueTimer
  Pointer pObj = cAlloc(1, SizeOf(IQueTimer))

  ' Initialize the IQueTimer object
  pObj.lpVtbl = ArrayAddr(vTable())     ' set vtable
  pObj.refcount = 1                     ' set refcount
  pObj.hWndTarget = frm.hWnd            ' target window
  pObj.TimerID = id                     ' timer ID

  ' Create the API timerqueue resource and
  ' if successful finish the COM object, otherwise
  ' free the allocated memory.
  Local timerHandle As Handle
  If CreateTimerQueueTimer(timerHandle, Null, _
    ProcAddr(QueTimerProc), Pointer(pObj), 0, mSec, WT_EXECUTEINTIMERTHREAD)
    '  store the resource handle in the object
    pObj.Handle = timerHandle

    ' Set the system's clock resolution to 1 ms,
    ' do this only once per application.
    If g_IQueTimerCnt == 0 Then ~timeBeginPeriod(1)
    g_IQueTimerCnt++            ' count the number of instances

    ' Return COM object as Object
    {V:QueTimer} = Pointer(pObj)

  Else    ' something went wrong, release already allocated resource(s)
    ~mFree(Pointer(pObj))     ' free the alloced memory

    ' Do not set returnvalue to return Nothing
  EndIf
EndFunc

Function IQueTimerVtbl_QueryInterface(ByRef This As IQueTimer, _
  ByVal riid As Long, ByVal ppvObject As Long) As Long Naked
  Return E_NOTIMPL
EndFunc

Function IQueTimerVtbl_AddRef(ByRef this As IQueTimer) As Long Naked
  this.refcount++
  Return this.refcount
EndFunc

Function IQueTimerVtbl_Release(ByRef this As IQueTimer) As Long Naked
  this.refcount--
  If this.refcount == 0
    MsgBox "terminating" ' shows that Release is called; remove when no longer needed
    DeleteTimerQueueTimer(Null, this.Handle, Null)
    g_IQueTimerCnt--     ' decrease instance counter
    ' If all instances are released, reset the system clock
    If g_IQueTimerCnt == 0 Then ~timeEndPeriod(1)
    ~mFree(*this)
    Return 0             ' object memory is freed, do not read this.refcount anymore
  EndIf
  Return this.refcount
EndFunc

' Declares and Constants
Declare Function CreateTimerQueueTimer Lib "kernel32" (ByRef hNewTimer As Handle, _
  ByVal hTimer As Handle, ByVal Callbck As Long, ByVal Parameter As Long, _
  ByVal DueTime As Long, ByVal Period As Long, ByVal Flags As Long) As Long

Declare Function DeleteTimerQueueTimer Lib "kernel32" (ByVal hTimer As Handle, _
  ByVal Timer As Handle, ByVal CompletionEvent As Handle) As Long

Global Const E_NOTIMPL = 0x80004001
Global Const WT_EXECUTEINTIMERTHREAD = 0x00000020

The program creates two high-resolution timers and stores the minimal COM wrappers in the Object variables tmrQ1 and tmrQ2. The second timer is used to display the number of timer events per second produced by timer 1. There is nothing you can do with the Object variables, because the minimal COM wrapper does not support any properties or methods. The Set command is the only command that can be used on these objects. The only reason for the existence of these Object variables is to sit and wait to be released so that the resources can be deleted properly. In fact, you could collect all globally used resources into the creation function (here QueTimer()) and release them in the Release vtable function.
To demonstrate the proper calling of the object’s Release the program raises an error after 10 seconds. A message box pops up to show you that Release is invoked after a runtime error.
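The instance counter that guards the clock resolution calls follows a simple pattern: the first instance switches the shared setting on and the last instance switches it off. The Python sketch below models only that bookkeeping; the class ClockResolution and its calls list are made up for the illustration and do not call any Windows API.

```python
class ClockResolution:
    """Toy stand-in for a begin/end pair shared by all timer instances."""
    instances = 0
    calls = []

    @classmethod
    def acquire(cls):
        if cls.instances == 0:
            cls.calls.append("begin")   # first user raises the resolution
        cls.instances += 1

    @classmethod
    def release(cls):
        cls.instances -= 1
        if cls.instances == 0:
            cls.calls.append("end")     # last user restores it

ClockResolution.acquire()   # first timer created
ClockResolution.acquire()   # second timer created
ClockResolution.release()   # first timer released
ClockResolution.release()   # second timer released

print(ClockResolution.calls)
```

However many timers are created, the shared setting is changed once and restored once.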

Finally
If you’re not familiar with the binary layout of a COM object and are having trouble understanding how the COM object is built, don’t worry. You can copy and paste this code to create your own minimal COM wrapper: simply replace the string ‘QueTimer’ with a name of your own (do not select Whole Word in the Replace dialog box). Then replace the code that creates and deletes the Windows resource with the functions you require. Of course you will need to edit the IQueTimer type that holds the information for a particular COM object.

14 October 2019

Update v 2.57 and Find in Files

On October the 7th I released version 2.57 of GFA-BASIC 32. It fixes problems in GfaWin32.exe (the IDE), gfawin32.gll, GfaWin23.ocx (the runtime) and the manifest file.

  • The GfaWin32.exe now properly handles the *Call()() functions like (L)StdCall and (L)CCall that take coercion modifiers like Dbl:, Sng:, etc. in the parameter list.
  • The new GfaWin23.ocx version 2.36 fixes DlgBase Inside and DlgBase Outside, that were swapped. This is important if you want to use the Dialog command together with a dialog definition from an external program like I showed in Using Unicode Controls.
  • The gfawin32.gll has undergone some maintenance and should be even more stable. PeekView in the editor (while editing, not when running) is now optional, to prevent the sudden appearance of large boxes of info. When it is disabled, right-click a word to obtain instant help. However, by disabling PeekView you will be missing instant help for variables and procedures.
  • Also updated is gfawinx.lg32 found in the Include directory (together with its source file). It is extended with some new functions and procedures. All of its procedures and functions are displayed in the GFA-BASIC command and function syntax colors and are discussed in the English helpfile. Gfawinx contains some essential functions like a new version helper function that indicates on what version of Windows the application is running. The return value depends on the enabled OS-es in the manifest file, which is updated now to ‘unlock’ features of all Windows OS-es.
  • The manifest file that comes with GfaWin32.exe is now also included as a resource in the compiled EXE (unless the program includes the $ManifestOff directive). The advantage is obvious: the manifest used to develop the application inside the IDE is now the same as the one used in the resulting EXE.

For more information about the update please see the readme25.rtf available from the Start menu.

Find In Files
Another GLL that has been updated is the findfiles.gll extension that can be installed from the GFA-BASIC home directory. From the main menubar choose Extra | Extension Manager and click the Add button and then select findfiles.gll. After it is loaded it inserts itself in the Edit submenu, where you’ll find it in the Find & Replace submenu, but you can also start it with Shift+Ctrl+F.

The findfiles.gll is a dialog-based editor extension developed and tested from inside the IDE. The structure of the program makes it quick to edit and run in the IDE, so it isn’t necessary to compile the extension and then load it with the Extension Manager to test it.

The source code can be found here.

Find in Files enables you to locate a string in a group of files and folders. After locating the files, a search result can be opened in an application of your choice.

In the Find what box, type the string you want to find (the button to the right is not functional). The search string may not contain wildcards. In the Look in box specify the folder where you want to begin the search. Use the Browse button to the right of the box to display a dialog where you can navigate to the folder you want. You can search recursively through a directory structure by selecting the Include subdirectories check box. In the Look at these file types box, specify the types of files in which you want to search. To limit the list of search results only 5 results per file are displayed.
You can perform a regular expression search if the corresponding checkbox is selected. The Find What string must specify a GFA-BASIC 32 regular expression as discussed in the reMatch command topic in the GFA-BASIC 32 helpfile.
Click Find now to start searching. You may abort the search at any time by clicking the button again or by pressing Esc.
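For readers who want to see the search logic spelled out, here is a rough Python sketch of the options described above: a start folder, optional recursion, a file type filter, a regular expression, and a cap of five results per file. It is not the source of findfiles.gll, and Python's regular expression syntax is not the same as the GFA-BASIC 32 reMatch syntax. The function find_in_files and its parameters are made up for the illustration.

```python
import os
import re
import tempfile

def find_in_files(root, pattern, extensions, recurse=True, max_hits=5):
    """Yield (path, line_number, line) for lines matching a regex."""
    regex = re.compile(pattern)
    for folder, subfolders, files in os.walk(root):
        if not recurse:
            subfolders.clear()          # stop os.walk from descending
        for name in sorted(files):
            if not name.lower().endswith(extensions):
                continue                # file type filter
            path = os.path.join(folder, name)
            hits = 0
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")
                        hits += 1
                        if hits == max_hits:
                            break       # cap the results per file

# Small self-contained demonstration on a temporary folder.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "demo.txt"), "w", encoding="utf-8") as f:
        f.write("QRound\n" * 8 + "Trunc\n")
    results = list(find_in_files(root, r"Q?Round", (".txt",)))

print(len(results))
```

The demo file contains eight matching lines, but only the first five are reported because of the per-file cap.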

Together with Incremental Search (Shift+Ctrl+I) the Find in Files feature is my most often used utility to locate information. Incremental Search locates a word very fast in the current GB file and Find In Files does the same job in external files.