INSIGHT: Atari
Bill Wilkinson
More On Structure
In last month's column, I discussed structured data types. In the program listing, I then used a large string and its substrings to simulate an array of records. This month, I will continue my efforts to help you write more structured programs, but first I'll make some general comments on Atari BASIC strings.
I think that last month's program demonstrates that long strings can be just as powerful as string arrays. I would expect that program to operate at least as quickly as an equivalent Microsoft BASIC program using string arrays, because the typical Microsoft implementation goes through a lot of overhead generating and reclaiming dynamic strings. In that example, we were inserting new records into our string structure. If we had been deleting records, Atari BASIC really would have shone. For example, suppose we have a string filled with 50-character pseudorecords. To delete the third such record, we could simply do this:
RECORD$(101) = RECORD$(151)
Presto! All records are moved up one spot, and the third one is gone.
A small sidetrack: Unfortunately, one failing of Atari BASIC is that it has no built-in way to conveniently save and restore such long strings to and from disk. The most common output method is to PRINT* such a string, but that doesn't help a lot, since INPUT# is limited to no more than 255 bytes per line. I have seen many users resort to using a FOR/NEXT loop to PRINT* or INPUT* the individual pseudorecords one at a time. That works, but it certainly slows down disk I/O speed. In BASIC XL and BASIC XE, we added a pair of special statements for this purpose (BPUT* and BGET#, where the B stands for Block or Buffer), but you can accomplish the same thing in Atari BASIC with a pair of fairly short USR assembly language subroutines. (These routines have been published several times, and I won't repeat them this month.)
Out Of Sorts
To continue my discussion of how to achieve structured data features in Atari BASIC, I call your attention to Program 1. Study it, type it in, and run it. It is a fairly clumsy but working record-sort routine. That is, it sorts the kinds of pseudorecord strings that we also used in last month's program. (Incidentally, I have used the worst of all possible sort algorithms: the bubble sort. Please, if you are serious about sorting your data, learn a couple of other methods, such as the heap sort, Shell sort, and quicksort. Why do I use the bubble sort? Because it is the smallest and easiest to demonstrate. Or maybe because I'm just lazy.)
As you study that listing, pay attention to lines 230-260, where the tests and swaps necessary to any sort routine are made. Because I purposely organized the data in my array of pseudorecords in the worst possible way (for a bubble sort, at least), the IF test will never branch around the swap, and we will make more than 4900 of these string swaps. Surely there must be a better way.
Pointing The Finger
Time to introduce another concept from the structured languages. In any computer language, moving blocks of data around (whether records, pictures, disk blocks, or whatever) is time-consuming. So most programs don't actually move the data. Instead, they move pointers to the data.
What is a pointer? Quite simply, a pointer is a variable that contains the address of another variable. Atari BASIC allows only one explicit kind of pointer—the ADR function that gives the address of a string. And, indeed, many programmers use ADR as a pointer when they pass the address of a string to an assembly language subroutine. (Imagine having to pass the bytes of a string through a series of POKEs.)
But there is another, hidden kind of pointer in almost any computer language: array and string indices. As an example, if the record data we are working on is always within a particular string, for example, then we need only know the relative position of a given record within the string in order to obtain its information. We most likely do not care about the actual physical memory address of the data.
Merge the lines shown in Program 2 to those already in place in Program 1 (deleting the three lines noted) and study the results of using implicit pointers. There is (rather obviously) little difference between the two programs. Instead of swapping actual substring pseudorecords, we are now swapping only the indices into the master string. When you run this second version, you should notice a speed improvement of almost 2:1. (I got 70.8 seconds versus 135.3 seconds, but that was done using the FAST mode of BASIC XL. Your times will likely be slower.)
As clever as this trick is, it does not answer all of a programmer's needs. Suppose, for example, that the data to be sorted resides in two or three different arrays. The logistics in BASIC get complicated. In a language with more data structuring capabilities, where a pointer can point to a given record type no matter where the record might be, such a division of data would probably make virtually no difference on program speed.
Enough for this month. Next month, we will go back to the acrostics puzzle of the December issue, since it generated more mail for me than any topic in recent months. If I have managed to convince you that records and pointers are valuable tools, wait until you see my proposed solution to the acrostics problem!