Sorting and stability

Often you want to sort into alphabetical order or numerical order. And there are various ways to accomplish this.

But what if you need to sort first numerically and then alphabetically and maybe several other additional criteria ? How does this affect the overall result ?

It might be nice to make it so that the second sort doesn’t break the ordering the first sort did. And there are sorting schemes that do this.

For instance if you had the following data :

35,33, 42,10,14,19,26,44,26,31

and used a “stable sorting algorithm” then you should see the following (note that items have been numbered so its possible to keep track of which item is where in the newly sorted data)

Note that 26 appears twice in the input and, in the output the two instances of 26 retain their relative order. The 26 at item 6 appears in the output before the 26 in item 8.

Wikipedia defines a “stable sort” as

Stable sorting algorithms maintain the relative order of records with equal keys (i.e. values). That is, a sorting algorithm is stable if whenever there are two records R and S with the same key and with R appearing before S in the original list, R will appear before S in the sorted list.

https://en.wikipedia.org/wiki/Category:Stable_sorts

Fortunately Xojo uses a RADIX sort which IS a stable sort so subsequent sorting wont ruin the relative order of elements. This code shows this occurring.

Dim rawData() As Integer = Array(35,33, 42,10,14,19,26,44,26,31)

Dim indexes() As Integer
Dim data() As Integer

For i As Integer = 0 To rawData.Ubound
  indexes.append 0i
  data.append rawData(i)
Next


data.sortwith indexes

Break

Whats harder to see is that if you then sort the data in another way that the relative orders are preserved in the result.

The following more complex code shows this. We first sort by ascending numeric value, and then by ascending alphabetical value.

Dim rawData() As Integer = Array(35,33, 42,10,14,19,26,44,26,31)
Dim alphaRawData() As String = Array("f", "z", "c", "d", "e", "a", "b", "q", "s", "b")

Dim indexes() As Integer
Dim data() As Integer

For i As Integer = 0 To rawData.Ubound
  indexes.append i
  data.append rawData(i)
Next

data.sortwith indexes, alphaRawData

Break

alphaRawData.SortWith data, indexes

Break

Again note that the 26’s retain their relative order. Sorting multiple times doesnt ruin the relative order of the input items.

Even if we chnage the data so that the two 26’s use the same alphabetic value they will retain their relative order.

Error handling speed

A question I was asked recently posed an interesting question about exceptions and the cost of using them as a general error handling mechanism.

Xojo uses whats know as “zero cost exception handling”. Essentially at runtime there is no penalty for having exception handling in place. It imposes no runtime overhead when NO exceptions are encountered. But, when exceptions are encountered, it can be “expensive”.

Expensive in this sense can mean it induces slowness or requires more memory. Or both.

So I put together a simple example that demonstrates the cost of using exception handling instead of error codes. Its very simple and uses a deprecated API that reported error codes instead of the newer one that raised an exception instead.

Its just a simple desktop application with a textarea on the default window and this code in the Window’s Open event

Dim f As folderitem = SpecialFolder.Desktop.Child("foo")

Dim errorcodeStart As Double = Microseconds
Dim errorCodeCount As Integer
For i As Integer = 1 To 1000
  
  Dim ts As TextInputStream = f.OpenasTextFile
  
  If f.LastErrorCode <> 0 Then
    errorCodeCount = errorCodeCount + 1
  End If
  
Next

Dim errorcodeEnd As Double = Microseconds

Dim exceptionStart As Double = Microseconds
Dim exceptionCount As Integer
For i As Integer = 1 To 1000
  
  Try
    Dim ts As TextInputStream = TextInputStream.Open(f)
    
  Catch IOX As IOException
    exceptionCount = exceptionCount + 1
  End Try
  
Next

Dim exceptionEnd As Double = Microseconds


Dim errorTotal As Double = errorcodeEnd - errorcodeStart
Dim exceptionTotal As Double = exceptionEnd - exceptionStart


TextArea1.AppendText "1000 iterations" + EndOfLine 
TextArea1.AppendText "Error Code = " + Str(errorTotal) + "ms" + EndOfLine 
TextArea1.AppendText "Exceptions = " + Str(exceptionTotal) + "ms" + EndOfLine 

In a debug run on my compute I get the following output.

1000 iterations
Error Code = 28096.87ms
Exceptions = 41711.62ms

The version using exceptions is juts about 50% slower over 1000 iterations when debugging.

In a version compiled with the DEFAULT setting the difference is less, but still present.

1000 iterations
Error Code = 28367.42ms
Exceptions = 35925.94ms

Exceptions are still about 25% slower than error codes. The Moderate optimization setting is similar

1000 iterations
Error Code = 29292.96ms
Exceptions = 34415.9ms

Aggressive settings remain similar

1000 iterations
Error Code = 28222.49ms
Exceptions = 35309.52ms

There are some advantages to exceptions. Unlike error codes they are impossible to ignore at runtime. At some point you MUST put code in place to handle them or your application will just quit with an unhandled exception error.

However, there’s nothing in Xojo that helps you make sure you have handled the possible exceptions that can be raised, and also nothing that tells you what exceptions might be raised.

So heads up before you dive into using exceptions everywhere as a general error handling mechanism in your applications. There are costs to doing this and they could manifest themselves in slower code or code that requires more memory. Or both.

Huh !?

Something I never really gave much thought about made me do a bit of a double take today.

If you assign a color literal to a color variable you get what might look like “funny” behaviour when compared to other types of variables.

For instance, if you do

Dim c As Color = &c12345

what you get in the color is the color value &c12345000, a right zero fill, instead of &c00012345, a left zero fill.

If however you do something like

Dim i32 As Int32 = &h12345

what you get is a 32 bit value that is analogous to &h00012345 and its left zero filled.

So why does a color do a zero fill on the “right” and yet an integer, which is also a 32 bit value, you get what looks like a zero fill on the left ?

I suppose internally the compiler is actually doing something like

Dim i32 As Int32 = Int32 (&h12345)
Dim c As Color = Color(i32)

It first converts the &c notation into an Int32 which is then recast as a color and since a color, in hex, is actually AARRGGBB where a color is RRGGBBAA this is why you get the apparent “right zero fill” for a color.

One of those things you just take for granted and never really think about.

Now what got me even thinking about this ? The code editors display of a color literal.

It struck me that IF a color literal was left zero filled like you might assume it would be then this image SHOULD look like this

But, since I assume the compiler is doing the conversion of a color via an Int32. you get LEFT zero filling on the color constant and it shows up the way it does.

One of those things I just never really thought of in all the time I’ve used Xojo.

More good habits

On the forum there was a suggestion that a person should use a container control that had a few controls embedded in it at design time. And then when using that container control that they should skip providing an API that the container presented and just reach inside the container and manipulate the controls it contain directly.

Personally I would recommend against this.

I’d start by saying when you create a container control ALL the controls in it should be private by default to prevent this. And that if you want to expose functionality of the controls on the container you do so by defining methods and events on the container that any code OUTSIDE the container can call or react to just as if the container control was any other single control and not a composite one like it is.

Why would I make such a recommendation ?

  1. good habits
  2. encapsulation
  3. reusability
  4. long term flexibility and maintainability

The first point is just that this is a good habit to get into. And the reason its a good habit is because of points 2, 3 and 4. Properly encapsulating and hiding the details from other bits of your code is a good thing. Code outside the container doesnt need to know HOW the container does what it does. Just that it does what is expected when you call its methods, change its properties and react to the events it exposes. Thats it. It should be a black box like the built in Xojo listbox, pushbutton, or any other built in control is. You dont need to know how those do what they do, just that they do what you expect when you call the methods, set the properties and react to their events.

And the bonus to doing this is that it makes the likelihood you, or others, can reuse your control in more places in your project or in other projects much higher because the control is self contained.

Long term it also lets you do things like completely swap out the implementation of the container for some other means and as long as you dont need to change the API nothing outside the container control even needs to be aware this has happened. This makes your own code easier to maintain since you no longer have to look through all the code outside of the container to know if you also need to alter it because something in the container changed.

These are all good things regardless of whether this code is for your own use, more general distribution or possibly for sale or to give away.

I’d encourage everyone to keep these things in mind when ever they write their own custom controls.

Of bounds and positions

Windows have two sets of properties that all relate to positioning a window.

However, they are not all quite created equally.

There are, of course, the typical top, left, width and height properties. And also the “bounds” property which is a Rect.

If you examine the bounds property for a Document window, and the top, left, width and height you will find that the Bounds.top is not the same as the top property value. Nor is the height. Now why is that ?

In the following image the BOUNDS are the area enclosed by the red rectangle. And the top, left, width and height properties describe the area in light blue.

If you only had the top, left, width and height properties to use to position a window you would have to somehow figure out what the real size of the window was and account for the title bar size and possibly the outer window frame size. There may or may not be one depending on platform and window type.

And if you add a toolbar this further complicates that. The following image is the same window with a toolbar and once again marked with a red rectangle around the bounds and the blue area is the rectangle described by the top, left, width and height properties.

If, for some reason, you want to know the height of the title bar you can use the difference between the bounds top and the windows top property to see how tall it is.

Note that you cant use this difference to know how tall the title bar and toolbar independently. And toolbars dont appear to propertly report their top, left, width, or height at runtime. 🙁

Still the difference between the bounds properties and the windows other properties will let you determine how tall the title bar + any toolbar is.

Careful with those bounds out there.

Generics

One of the things that Xojo lacks is the notion of generics.

So what are these things and why would they be useful ?

In many programming languages you might want to define a class that behaves like a List. But you want to be able to make this generic enough that when you go to use one you can make a List of Strings, a List of Classes, a List of controls etc.

Right now the only way to do this in Xojo is to make all the the parameters and return values be variants in the interface definition. The downside to this is that you lose all compile time type checking and have to rely solely on runtime checks YOU put in the code.

If you could declare a List variable like

Dim myStringList as List<String>

This would indicate that the list should use String as the “generic type” for all the method parameters and return types. The interface declaration might have to change to something like

Interface List<Type>

  Sub AddRow(ParamArray values() as <Type>)
  End Sub
  
  Sub AddRowAt(ParamArray values() as <Type>, zeroBasedInxed as integer)
  End Sub
  
  Sub FirstRowIndex() as integer
  End Sub
  
  Sub LastAddedRowIndex() as integer
  End Sub
  
  Sub LastRowIndex() as integer
  End Sub
  
  Sub RemoveAllRows()
  End Sub
  
  Sub RemoveRowAt(zeroBasedIndex as integer)
  End Sub
  
  Sub RowCount() as integer
  End Sub
  
  Sub RowTag() as Variant
  End Sub
  
  Sub RowTagAt(zeroBasedIndex as integer) as <Type>
  End Sub
  
  Sub RowValue() as <Type>
  End Sub
  
  Sub RowValueAt(zeroBasedIndex as integer) as <Type>
  End Sub
  
  Sub SelectedRowCount() as Integer
  End Sub
  
  Sub SelectedRowIndex() as integer
  End Sub
End Interface

And now we have a generic interface AND a way to define a list that will, at compile time, have a specific and known type so the compiler can detect any incompatible type errors.

This would make interfaces even more useful than they are now.

Good habits when creating custom controls

Suppose you have the need to create a custom control like I did recently

One of the things that you should do so people do not get confused about using your control is to “implement” any events in whatever you use as your base class that should not be exposed to end users of your control. If you dont inplement these events then a user could, and that might end up in surprising behaviour in your carefully crafted control.

If you dont implement the Open event for your custom control a user could put an instance on a layout and implement that event. If this causes problems than you can make it so they cannot implement the Open event simply by adding that event handler to your custom control.

There may be events, like ConstructContextualMenu, DragEnter, DragMove, etc that make no sense for your custom control and so implementing them in your class would make it so users cant.

And this should make your custom controls easier for others to use.