Machine Language

MACHINE LANGUAGE

Jim Butterfield, Associate Editor

A Simple Sort

I recently received a request from Marshall Stewart in Louisiana for a numeric array sort. Such a sort isn't too useful for real data, but can illustrate a number of machine language coding techniques.

It should be noted that a sort, in order to be practical, should be able to find its way through multifield records and should handle strings, floating point, and fixed point numbers. The program presented here, "Tiny Sort," is written for the Commodore 64 and sorts a single floating point array into ascending order. This might be useful for certain types of statistical analysis, but is otherwise of limited practical use.

The sorting method (or algorithm) is called an "insertion sort." In other words, each number is inserted into the collection of sorted numbers obtained so far. As an example; suppose we have so far sorted the five numbers: 3, 8, 22, 35, and 84. Now the next number comes along; it has a value of 18. The insertion sort will "move up" the values 22, 35, and 84, pop the 18 into the blank space to get the sequence of six: 3, 8, 18, 22, 35, and 84. This algorithm is easy to follow, but like most simple sorting procedures it takes a long time to sort large arrays. Most simple sort algorithms are called "N squared"; this means that if you have an array twice as big as before, it will take four times as much time to do the job. With large collections of data, the programmer must seek out more sophisticated algorithms.

So Tiny Sort is limited in application, and it uses a decent but not superfast algorithm. It is useful for study purposes, however. We do a number of interesting jobs, such as digging into the workings of an array and comparing floating point numbers.

Tracking The Program

When Tiny Sort is called, it assumes that only one array is in the machine—or at least it looks only at the first array. It assumes that the array is one-dimensional, that the type is floating point, and that the zero element is part of the data to be sorted. We could choose to check all this, but let's forge ahead.

How do we find the array? Well, there's a pointer which indicates the start of the first array, and that's the one we want. It's called the Start-of-Arrays pointer (ARYTAB), and in the Commodore 64 it's found at addresses $2F and $30. (Consult your memory maps to find similar pointers in other 6502 machines.) By looking at this pointer, we can tell where to find the first array.

The array comes in two parts: information about the array, and the array data itself. Most of the information we'll pass by: the array name, its size in bytes, and the number of dimensions. We'll assume it's the right array and that it's singly dimensioned. One piece of information we will extract: the number of elements in the array. That will tell us how many items we have to sort. If there are 15 elements, we'll need to do 14 inserts. The first element is already "sorted." The number of elements is held in two bytes, which are to be found five locations from the start of the array. So we dig out the array size minus one and place it into our storage location we call SIZE, at hex address 033D and 033E:

	LDY	#5		; get array size
	LDA	(SOA), Y	; from pointer
	TAX			; size hi byte
	INY			;try for 10 byte
	LDA	(SOA), Y	; here it is
	TAY			; check zero
	BNE	DECK		; minus one
	DEX
DECK	DEY
	STY	SIZE		; store size
	STX	SIZE + 1

Now let's go for the array data. For a single dimension array, we must skip ahead 7 locations to get past the overhead information. The start of the data will be logged in START, and we'll also place it into pointer NEXT. START will stay where it is, but NEXT will move along as we add data to our sorted list.

	CLC			;go for start
	LDA	 SOA		;of array
	ADC	 #7		;plus 7
	STA	 START		;gives start
	STA	 NEXT		;of numbers
	LDA	 SOA + 1
	ADC	 #0
	STA	 START + 1
	STA	 NEXT + 1

Now we accept a value into the sorted list, and move pointer NEW along five locations. Each floating value occupies five locations.

* SORT NEW ITEM INTO EXISTING ARRAY
BIGLP	CLC			;on to next
	LDA	 NEXT		;array item
	ADC	 #5		;five bytes up
	STA	 NEXT
	LDA	 NEXT + 1
	ADC	 #0
	STA	 NEXT + 1

All five bytes of the new item of data, which pointer NEW has selected, are transferred to a work area WORK. That makes comparisons simpler, but performs another task. As we search the list, we'll move the existing items up to make room. The new value's old location will be written over as we do this move.

	LDY	 #4		;move item to
	MVLP	LDA (NEXT), Y	;work area
	STA	WORK, Y	;for testing
	DEY
	BPL	MVLP

Now the stage is set. We'll call subroutine SCAN to find the proper insertion point, move the existing values over, and put the new value in place.

	 JSR	 SCAN	;insert it

Most of the work has been done. We may count the number of insertions—by counting down SIZE—and if there are more numbers, loop back to BIGLP.

	 LDY	 SIZE		;now count down
	 BNE	 INK
	 DEC	 SIZE + 1	;hi and low
INK	 DEC	 SIZE
	 BNE	 BIGLP		;more? go back
	 LDA	 SIZE + 1
	 BNE	 BIGLP
	 RTS

Subroutine SCAN's task is to move down through the data until the correct spot is found to insert the new item. We use pointer CHECK to do the scan; first, we must set it up.

*MOVE EVERYTHING UP AND INSERT ITEM
SCAN	 LDA	 NEXT	;start at top
	 STA	 CHECK
	 LDA	 NEXT + 1
	 STA	 CHECK + 1

Now we move the pointer CHECK down to look at the next item. We do this, of course, by subtracting five from pointer CHECK.

*DOWN TO NEXT ITEM
SLOOP	 SEC
	 LDA	 CHECK	 ;go five bytes
	 SBC	 #5	 ;lower
	 STA	 CHECK
	 LDA	 CHECK + 1
	 SBC	 #0
	 STA CHECK+1

CHECK may have gone too far. We must compare it with pointer START; if it's gone below, we must insert the new item at the bottom. We do the comparison by subtraction. Usually, before we subtract, we give an SEC command; in this case, it's not necessary since we have just completed a previous legal subtraction.

*TEST IF BOTTOM OF DATA
	 LDA	 CHECK		;subtract
	 SBC	 START		;pointer from
	 LDA	 CHECK + 1	;bottom pointer
	 SBC	 START + 1
	 BCC	 SWRAP		;if low, wrap up

Now that it has been established that CHECK is in a legitimate range, we may perform the comparison. Subroutine COMPAR will do this for us. If the new value compares the right way (low), we go to SWRAP to insert it.

*COMPARE NEW ITEM WITH CURRENT ENTRY
	 JSR	 COMPAR	;compare it
	 BCS	 SWARP		;yup, insert it

If we haven't rambled away to SWRAP, it means we haven't yet found the right spot to insert the new item. We move over the item in the list that we have just checked; when we finally find the right spot, everything will be moved over neatly. To move up this five-byte item, we use the stack. When we're finished, back to SLOOP to check the next point on the list.

* NOT YET; MOVE ENTRY UP
	 LDY	 #4			;take out entry
SPUSH	 LDA	 (CHECK), Y		;and push to
	 PHA				;stack
	 DEY
	 BPL	 SPUSH
	 LDY	 #5			;pull entry back
SPULL	 PLA				;and insert five
	 STA	 (CHECK), Y		;bytes higher
	 INY
	 CPY	 #10
	 BCC	 SPULL
	 BCS	 SLOOP			;now get next

When we get to SWRAP, we can put the item into its proper place. Pointer CHECK has gone too far; rather than back it up, we use a higher index value.

* FOUND THE SPOT; PUT NEW ITEM IN PLACE
SWARP	LDY	#5
SWLOOP	LDA	WORK-5, Y
	STA	(CHECK), Y
	INY
	CPY	#10
	BNE	SWLOOP
	RTS

The COMPAR subroutine compares signed floating point numbers. Floating point numbers as stored in arrays consist of one byte giving the exponent and four bytes giving the mantissa. But there's more: The high bit in the mantissa is the sign of the number. Providing we check the signs first, everything works out neatly: compare the exponents, then the bytes of the mantissa. But first, the signs; if they match we can continue with the main comparison.

*COMPARE CURRENT ENTRY TO NEW ITEM IN WORK
COMPAR	LDY	#1		;floating signs
	LDA	WORK,Y
	EOR	(CHECK), Y	;do they match?
	BMI	SGDIF		;no, special

An EOR (Exclusive OR) is an excellent way to check if the high bits match. If they are different, the EOR'd result will have a high bit on, and the N flag will be set. Thus, BMI will branch on unequal signs.

If we didn't branch, the signs are the same. We still need to note the sign, since negative numbers will sort "backward" compared to positive numbers.

	LDA	WORK, Y	;yes, log
	STA	SIGN		;..the sign

Now for the comparison. Quite straight-forward coding.

* COMPARE UNSIGNED VALUE
	LDY	#0		;compare bytes
CLOOP	LDA	WORK, Y	;from left
	CMP	(CHECK), Y	;to right
	BNE	CEXIT		;quit not equal
	INY
	CPY	#5
	BCC	CLOOP

At this time, the C flag (carry) will tell us how the comparison went. But if the numbers are negative, we must invert the comparison result. By switching the carry flag into the high bit of the accumulator, using EOR again, and sliding the high bit back into the carry, we can do the job neatly.

* INSERT SIGN DATA
CEXIT	ROR		;carry to hi-bit
	EOR	SIGN	;flip if negative
	ASL		;back to carry
	RTS

If the signs are different, we don't need to do the main comparison. The negative value is smaller, of course.

* DIFFERING SIGNS–SPECIAL CHECK
SGDIF	LDA	(CHECK), Y	;get sign
	ASL			;switch to carry
	RTS

That's the whole program. Note that the subroutines are called only once. In principle, we could have written the program into a single mainstream. The subroutines tend to break up the logic into neat modules, however.

Note that the comparison subroutine COMPAR always returns the result of the comparison in the Carry flag. That's where it belongs: Carry is the natural flag for signaling less-than or greater-equal-than. We might have used the N flag instead of the C flag to signal the result; this would have saved us two bytes (two ASL instructions), but it seems less comfortable than the traditional Carry.

BASIC Demonstration

The program can be typed in as a BASIC module on the Commodore 64. Since the machine language portion will end up at address $C000 (decimal 49152), be sure you don't have any special software up there.

10	FOR I = 49152 TO 49344			:rem 126
20	READ A : CK=CK+A				:rem 190
30	POKE I, A : NEXT				:rem 193
40	IFCK <> 24165 THEN PRINT "TYPING ERROR IN DATA STATEMENTS"	:rem 27
49152	DATA 160, 5, 177, 47, 170, 200, 177	:rem 198
49159	DATA 47, 168, 208, 1, 202, 136, 140	:rem 198
49166	DATA 61, 3, 142, 62, 3, 24, 165		:rem 250
49173	DATA 47, 105, 7, 141, 63, 3, 133	:rem 43
49180	DATA 251, 165, 48, 105, 0, 141, 64	:rem 142
49187	DATA 3, 133, 252, 24, 165, 251, 105	:rem 194
49194	DATA 5, 133, 251, 165, 252, 105, 0	:rem 140
49201	DATA 133, 252, 160, 4, 177, 251, 153	:rem 237
49208	DATA 67, 3, 136, 16, 248, 32, 83	:rem 56
49215	DATA 192, 172, 61, 3, 208, 3, 206	:rem 92
49222	DATA 62, 3, 206, 61, 3, 208, 217	:rem 38
49229	DATA 173, 62, 3, 208, 212, 96, 165	:rem 156
49236	DATA 251, 133, 253, 165, 252, 133, 254	:rem 90
49243	DATA 56, 165, 253, 233, 5, 133, 253	:rem 199
49250	DATA 165, 254, 233, 0, 133, 254, 165	:rem 243
49257	DATA 253, 237, 63, 3, 165, 254, 237	:rem 210
49264	DATA 64, 3, 144, 25, 32, 154, 192	:rem 99
49271	DATA 176, 20, 160, 4, 177, 253, 72	:rem 150
49278	DATA 136, 16, 250, 160, 5, 104, 145	:rem 195
49285	DATA 253, 200, 192, 10, 144, 248, 176	:rem 44
49292	DATA 206, 160, 5, 185, 62, 3, 145	:rem 99
49299	DATA 253, 200, 192, 10, 208, 246, 96	:rem 1
49306	DATA 160, 1, 185, 67, 3, 81, 253	:rem 49
49313	DATA 48, 26, 185, 67, 3, 141, 72	:rem 55
49320	DATA 3, 160, 0, 185, 67, 3, 209		:rem 247
49327	DATA 253, 208, 5, 200, 192, 5, 144	:rem 144
49334	DATA 244, 106, 77, 72, 3, 10, 96	:rem 52
49341	DATA 177, 253, 10, 96	:rem 172

Once the machine language is in place, we can demonstrate the program with a random number generator. After the first program run, the machine language program remains in place and RUN 900 allows another try.

899 REM RANDOM NUMBER GENERATOR				:rem 191
900 INPUT "NUMBER IF ITEMS" ; X				:rem 218
910 J = RND(0) : X = X - 1 : DIM A(X)		:rem 9
920 FOR J = 0 TO X						:rem 52
930 A(J) = RND(1) * 50-20					:rem 57
940 NEXT J	:rem 38
950 FOR J = 0 TO X : PRINT A(J); : NEXT J : PRINT	:rem 159
960 PRINT : PRINT						:rem 243
970 SYS12 * 4096						:rem 255
980 FOR J = 0 TO X : PRINT A(J); : NEXT : PRINT	:rem 88