MACHINE LANGUAGE
Jim Butterfield, Associate Editor
Hopping Around
Transfer of control — jumping and branching — seems to be easy and straightforward to accomplish. In 6502 programming, you can make a decision-based branch, which will take you forward or backward a hundred-odd locations; or an unconditional jump, which will take you anywhere you want to go.
Yet there are a number of techniques that transfer control in unusual ways. Often they may seem like tricks, but they can be useful in achieving programming objectives: speed, flexibility, or compactness. We'll look at some of these techniques here.
The Long Branch
If you want to use a branch to implement a decision, your range is limited to slightly over 120 locations forward or backward: the offset is a single signed byte, so the target must lie within 128 bytes before or 127 bytes after the instruction that follows the branch. We often want to get around this limitation. It may be argued, by the way, that well-organized programs should never need to branch over any great distance; that your programs should be organized into subroutine modules so that transfers of control will always be short and visible.
For the moment, let's look at an example:
2000 | | LDX | #$20 |
2002 | BIGLOOP | LDA | #$0D |
.... | | | |
.... | | | |
20C0 | | DEX | |
20C1 | | BNE | BIGLOOP |
20C3 | | .... | |
We have a problem here. We can't branch over the needed range — about 190 bytes. The simple way is to insert a JMP:
20C0 | | DEX | |
20C1 | | BEQ | SKIP |
20C3 | | JMP | BIGLOOP |
20C6 | SKIP | .... | |
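The arithmetic behind the problem and its fix can be checked with a few lines of Python. This is only a sketch: the helper names `branch_offset` and `in_range` are made up for illustration, and the addresses are taken from the listings above. It relies on the fact that a 6502 branch offset is one signed byte, counted from the instruction that follows the two-byte branch.

```python
def branch_offset(branch_addr, target):
    """Offset a 6502 relative branch at branch_addr needs to reach target.

    The offset is counted from the instruction after the two-byte branch.
    """
    return target - (branch_addr + 2)


def in_range(branch_addr, target):
    """True if the offset fits in a signed byte (-128..+127)."""
    return -128 <= branch_offset(branch_addr, target) <= 127


BIGLOOP = 0x2002
SKIP = 0x20C6

# Original loop: BNE BIGLOOP at $20C1 would have to reach back too far.
print(branch_offset(0x20C1, BIGLOOP), in_range(0x20C1, BIGLOOP))  # -193 False

# Fixed loop: BEQ SKIP at $20C1 only hops over the three-byte JMP.
print(branch_offset(0x20C1, SKIP), in_range(0x20C1, SKIP))        # 3 True
```

The JMP itself carries a full two-byte address, so it has no range limit.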
Another way is more subtle and must be used with care. It avoids the JMP, and thus makes a routine more easily relocatable. Let's assume that somewhere in our program sequence we have a BNE:
2000 | | LDX | #$20 |
2002 | BIGLOOP | LDA | #$0D |
.... | | | |
.... | | | |
2065 | | LDA | $027A |
2068 | | BNE | STEP |
Now, immediately after the BNE at address 2068, another BNE instruction would never branch. After all, if the Z flag is clear, we will take the previous branch to STEP. And if the Z flag is set, neither branch will be taken. So we might use:
2000 | | LDX | #$20 |
2002 | BIGLOOP | LDA | #$0D |
.... | | | |
.... | | | |
2065 | | LDA | $027A |
2068 | | BNE | STEP |
206A | LINK | BNE | BIGLOOP |
.... | | | |
.... | | | |
20C2 | | DEX | |
20C3 | | BNE | LINK |
As the program executes in the area of 2065, it will never take the branch to BIGLOOP. But when we get down to the bottom, the instruction at 20C3 will (if conditions are right) branch to LINK, and will immediately branch again to BIGLOOP. Each branch is now a shorter hop and easily within range.
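To see that both hops really fit, here is a small Python check using the addresses from the listing. It is a sketch only: `branch_offset` and `after_2068` are illustrative names, and STEP's actual address is not given in the listing, so it appears only as a label.

```python
def branch_offset(branch_addr, target):
    """Signed offset a two-byte branch at branch_addr needs to reach target."""
    return target - (branch_addr + 2)


def after_2068(z_flag):
    """Where control goes when we arrive at $2068 from above with this Z flag."""
    if not z_flag:
        return "STEP"       # BNE STEP is taken, so LINK is never reached
    return "past LINK"      # Z is set: BNE STEP and BNE BIGLOOP both fall through


BIGLOOP, LINK = 0x2002, 0x206A

first_hop = branch_offset(0x20C3, LINK)       # BNE LINK at the bottom of the loop
second_hop = branch_offset(0x206A, BIGLOOP)   # BNE BIGLOOP at LINK

for hop in (first_hop, second_hop):
    print(hop, -128 <= hop <= 127)
print(after_2068(False), after_2068(True))
```

Both offsets come out well inside the signed-byte limit, and in neither Z-flag case does the fall-through path end up at BIGLOOP. The care needed is that nothing between the two BNE instructions may change the Z flag.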
Hidden Instructions
Suppose you need a series of PRINT subroutines: one to print a RETURN ($0D), one to print a space ($20), and another to print a question mark ($3F). You could write three subroutines; or you could write the three Load commands and then branch to a common point; or you could do this:
2000 | A9 | 0D | | LDA | #$0D | ;return |
2002 | 2C | A9 | 20 | BIT | $20A9 | ;hidden space |
2005 | 2C | A9 | 3F | BIT | $3FA9 | ;hidden question mark |
2008 | 20 | D2 | FF | JSR | $FFD2 | ;print it |
200B | 60 | | | RTS | | ;return |
What happens when we call address 2000? We load the RETURN character, perform two meaningless BIT tests — they set the status flags, but we never test them — and then print RETURN.
But, what happens if we JSR to 2003? That's not an instruction — wait — yes, it is. It's A9 20, which is the same as LDA #$20. So we load the A register with a space character, do one meaningless BIT instruction, and print it. And if we JSR to 2006, we'll load A with $3F, the question mark, and print that.
What's happening here? By inserting the byte 2C ahead of the two extra A9 or LDA commands, we have made them "invisible." We can slide right through them, without needing to jump over them.
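One way to convince yourself is to decode the same twelve bytes from each entry point. The Python sketch below is a toy tracer, not a 6502 emulator: it knows only the three opcodes that appear in the listing, and it stops at the JSR, treating it as "print whatever is in A". The names `CODE`, `BASE`, and `run` are made up for illustration.

```python
# The bytes of the listing, starting at $2000.
CODE = bytes.fromhex("A9 0D 2C A9 20 2C A9 3F 20 D2 FF 60")
BASE = 0x2000


def run(entry):
    """Trace from entry up to the JSR; return (instructions seen, value in A)."""
    pc, a, trace = entry, None, []
    while True:
        i = pc - BASE
        op = CODE[i]
        if op == 0xA9:                      # LDA immediate, two bytes
            a = CODE[i + 1]
            trace.append(f"LDA #${a:02X}")
            pc += 2
        elif op == 0x2C:                    # BIT absolute, three bytes; only flags change
            trace.append(f"BIT ${CODE[i + 2]:02X}{CODE[i + 1]:02X}")
            pc += 3
        elif op == 0x20:                    # JSR: stand-in for the print call
            trace.append(f"JSR ${CODE[i + 2]:02X}{CODE[i + 1]:02X}")
            return trace, a
        else:
            raise ValueError(f"opcode ${op:02X} is not modeled in this sketch")


for entry in (0x2000, 0x2003, 0x2006):
    trace, a = run(entry)
    print(f"${entry:04X}: {', '.join(trace)} -> prints ${a:02X}")
```

Each entry point sees a different instruction stream in the same bytes, and each arrives at the JSR with its own character in A.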
The BIT test, $2C, is ideal since it does not affect memory or any registers other than the status register, which we don't need. Some computers have a series of NOP commands of various instruction lengths, which are useful for "hiding" instructions within the address field. Sometimes these instructions have names other than NOP (for example, "Branch Never" or "Rotate 0 Bits"), but you get the idea.
The Invisible Return
Our last example ended with a JSR and RTS. Think about this. We will call a subroutine; it will return to us; and then we will return to the routine that called us. The return addresses are kept on the stack, of course. Suppose we just JMP to the subroutine. When the subroutine is ready to return, it will go directly to the routine that called our program. Thus, with rare exceptions, a JSR followed immediately by an RTS can be replaced by a JMP to the same address. We've saved a byte (four bytes become three) and a little time.
Programmers working with limited memory find this kind of tightening up useful, and it often leads to further economies. For example, if there's a routine called DOG and one called CAT; and if DOG ends with JSR CAT:RTS; then the first step is to replace this with JMP CAT. Now, we won't need to jump to CAT if that subroutine immediately follows. Instead of jumping there, we'll just fall into it. Suddenly, two subroutines have become one — with two entry points.
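The three stages of that tightening can be compared with a toy model of the return stack. This Python sketch is an assumption-laden illustration, not real 6502 behavior: rows are `(label, op, arg)` tuples, `OUT` is a hypothetical stand-in for whatever work DOG and CAT do, and the "woof" and "meow" strings are invented for the demonstration.

```python
def run(program, entry):
    """Run a toy program until the RTS that returns to the original caller.

    Returns (list of OUT values, number of instructions executed).
    """
    labels = {row[0]: i for i, row in enumerate(program) if row[0]}
    stack = ["CALLER"]          # marker for whoever called us
    out, steps = [], 0
    pc = labels[entry]
    while True:
        _, op, arg = program[pc]
        steps += 1
        if op == "JSR":
            stack.append(pc + 1)    # come back to the row after the JSR
            pc = labels[arg]
        elif op == "JMP":
            pc = labels[arg]        # nothing pushed: the callee's RTS is ours too
        elif op == "RTS":
            ret = stack.pop()
            if ret == "CALLER":
                return out, steps
            pc = ret
        elif op == "OUT":
            out.append(arg)
            pc += 1
        else:
            raise ValueError(f"unknown op {op}")


with_jsr = [("DOG", "OUT", "woof"), (None, "JSR", "CAT"), (None, "RTS", None),
            ("CAT", "OUT", "meow"), (None, "RTS", None)]
with_jmp = [("DOG", "OUT", "woof"), (None, "JMP", "CAT"),
            ("CAT", "OUT", "meow"), (None, "RTS", None)]
fall_through = [("DOG", "OUT", "woof"),
                ("CAT", "OUT", "meow"), (None, "RTS", None)]

for program in (with_jsr, with_jmp, fall_through):
    print(run(program, "DOG"))
print(run(fall_through, "CAT"))     # CAT is still usable on its own
```

All three versions of DOG produce the same output; only the instruction count shrinks, and the last version is the single routine with two entry points.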
There's another interesting use for this technique. Suppose you've written a subroutine SPC to print a space, and now you want to write a subroutine to print two spaces. You might start with the sequence JSR SPC:JSR SPC:RTS, but a little boiling down will generate the sequence:
SPC2 | JSR | SPC |
SPC | LDA | #$20 |
| JMP | |
It seems odd to see a subroutine that starts out by calling the following instruction as a subroutine. But if you think of the way subroutines work, you'll see that it does a simple job: it executes the subroutine twice.
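A short Python trace of the return stack shows why. Two assumptions are built into this sketch: the JMP's operand does not appear in the listing, so the model assumes it goes to a print routine that outputs A and ends with its own RTS (like the one called in the earlier example); and the stack holds row numbers rather than real return addresses. The function name `call` is invented for illustration.

```python
def call(entry):
    """Simulate calling SPC2 (row 0) or SPC (row 1); return the text printed."""
    rows = [
        ("JSR", 1),           # SPC2: JSR SPC
        ("LDA", 0x20),        # SPC:  LDA #$20
        ("JMP_PRINT", None),  # JMP to the assumed print routine
    ]
    stack = [None]            # None marks the return to the original caller
    a, pc, out = 0, entry, ""
    while pc is not None:
        op, arg = rows[pc]
        if op == "JSR":
            stack.append(pc + 1)   # the return lands on the next row: SPC itself
            pc = arg
        elif op == "LDA":
            a, pc = arg, pc + 1
        else:                      # print routine outputs A, then its RTS pops the stack
            out += chr(a)
            pc = stack.pop()
    return out


print(repr(call(0)))   # entering at SPC2
print(repr(call(1)))   # entering at SPC
```

Entering at SPC2, the first RTS returns to SPC rather than to the caller, so the body runs a second time before the final RTS goes home.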
By the way, some theorists are very strong on the idea that all subroutines should have one entry point and one clearly defined exit. You'll have to decide on your own style. If you have lots of memory and processing time, you might prefer neatness. On the other hand, if you're trying to crowd a lot of programming into a small 2K ROM, you'll take all the economies you can get.