BASIC Style—Program Evolution
Jim Butterfield, Associate Editor
Sometimes you see programs that are so crisp and neat that you wonder how the programmer's mind can be so orderly. The statements come out in an elegant, incisive style. Every line zeros in on exactly the right thing to do.
How does a programmer develop an elegant style? Why can't you write like that? Sometimes a lowly hacker can feel inferior when facing such immaculate programming style. Yet the program you see is often a matter of evolution—rewriting and tidying up. It's not always written that way from the beginning.
I have been accused of writing "squeaky clean" programs. It seems to me that you might like to see how my murky first programs get reworked and tightened up into their final version. In some ways, programming style isn't what you write (at least at first)—it's knowing what to look for when you clean up.
A Simple Lister
I needed to do an almost trivial job: list a file from disk to the printer. I had a few minor extra features to add: I wanted individual pages, so the lines needed to be counted; I needed a title on each page; and at the end of the run, for the sake of neatness, I wanted the printer to eject the page.
It's not a demanding task, but I'd like to show you how I went about it. Even a simple job like that can be revised and tightened up extensively.
Here's my first program: I'll talk my way through the listing.
100 OPEN 4,3
Open file number four to the screen. Why? So I can send the program's output to the screen and see that it's working right. After the program looks good, I'll change the above line to OPEN 4,4.
105 OPEN 1,8,3,"CONTROL"
That's my input file to be listed.
110 REM START OF PAGE
120 FOR J = 1 TO 2 : PRINT#4 : L = L + 1 : NEXT J
130 PRINT#4,"{5 SPACES}TITLE{3 SPACES}" : L = L + 1
140 PRINT#4 : L = L + 1
This prints the page title. I know I'll come back here for each new page, so I'm placing a REM statement here to mark the place. I rigorously add 1 to the line count, L, each time I print a line.
150 INPUT#1, A$ : SW = ST
170 PRINT#4, A$ : L = L + 1
Here's where I input from disk and output (to the screen first, later to the printer). I need to save the value of ST (the status variable) so that later I can check to see if this is the last line from the file. ST will be changed by the PRINT# command, so I save its input value in variable SW.
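For readers more at home in a modern language, the reason for copying ST into SW can be sketched in Python. This is only an illustration: the Channel class, its method names, and the way it sets status are hypothetical stand-ins, not anything from the BASIC listing. The value 64 is the end-of-file flag that Commodore BASIC reports in ST.

```python
END_OF_FILE = 64  # the end-of-file bit that Commodore BASIC reports in ST


class Channel:
    """Hypothetical toy device whose status word is overwritten by every I/O call."""

    def __init__(self, records):
        self._records = list(records)
        self._position = 0
        self.status = 0      # plays the role of ST
        self.printed = []

    def read(self):
        """Return the next record; flag end of file when it is the last one."""
        text = self._records[self._position]
        self._position += 1
        at_end = self._position == len(self._records)
        self.status = END_OF_FILE if at_end else 0
        return text

    def write(self, text):
        """Record the output; like PRINT#, this resets the status word."""
        self.printed.append(text)
        self.status = 0


channel = Channel(["first", "last"])
channel.read()
text = channel.read()
saved = channel.status   # like SW = ST: capture it before the next I/O call
channel.write(text)      # the status word is overwritten here
```

After the write, channel.status no longer says anything about the input file, but saved still does.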
180 IF L<62 GOTO 250
190 IF L = 66 THEN L = 0 : GOTO 250
200 PRINT#4 : L = L + 1 : GOTO 190
If I have printed the maximum number of lines desired, I want to eject the paper by printing until the line count L equals 66. Since each page has 66 lines, I'm now at the start of the next page and can set L back to zero.
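The arithmetic behind the eject can be sketched in Python. This is a rough equivalent rather than a translation of the listing: the function names and the emit callback are assumptions made for the example, and only the 66-line page length comes from the article.

```python
PAGE_LENGTH = 66  # lines per physical page, as stated in the article


def blank_lines_to_eject(line_count, page_length=PAGE_LENGTH):
    """Return how many blank lines are needed to fill out the current page."""
    if not 0 <= line_count <= page_length:
        raise ValueError("line_count must be between 0 and page_length")
    return page_length - line_count


def eject(line_count, emit, page_length=PAGE_LENGTH):
    """Emit blank lines until the page is full, then return the reset count."""
    for _ in range(blank_lines_to_eject(line_count, page_length)):
        emit("")
    return 0  # the next page starts with nothing printed on it
```

With 62 lines already on the page, eject sends four blank lines and hands back a line count of zero, which is what lines 190 and 200 do one PRINT# at a time.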
250 IF SW<>0 GOTO 300
260 IF L = 0 GOTO 110
270 GOTO 150
If I'm at the end of the input file (SW is not zero), I'll go to line 300 and wind things up. Otherwise, I want to go back.
Here's a cute touch—perhaps too cute for some tastes. Variable L can only be equal to zero if I've just ejected a page. If so, I want to go back to 110 and print a new title. If not, get another line from the input file starting at line 150.
300 IF L<>0 GOTO 190
Here's a supercute trick. I pondered this one for a while, since it's almost too clever; that sort of thing can trip up your logic. Here's the objective: If we're finished, but the paper hasn't been ejected, go back to line 190 and eject the paper. The program will branch back here again, but this time variable L will be zero and we can finish the job by closing the files.
310 CLOSE 1
320 CLOSE 4
That's it. It's really rather messy. It works, and for a temporary job that's all we would need.
But it doesn't feel right. The code feels messy: It seems to jump around, and I don't get a feeling of smoothness in the program. It's time to pick at the coding.
First Revision
The first awkward spot is around lines 190 and 200. The routine to eject the paper works but looks clumsy. Besides, we call it twice (once at 62 lines, and again at end of file).
I have feelings about this part of the program, too. It's a unit to do a particular job. I would feel better moving it to a separate subroutine where it can stand out as an identifiable action. Sometimes I create a subroutine out of some in-line code and then move it back later; it helps me identify the modules that make up the program. Let's move the eject routine to a subroutine at line 500, clean it up a bit, and see what we get:
100 OPEN 4, 3
105 OPEN 1, 8, 3, "CONTROL"
110 REM START OF PAGE
120 FOR J = 1 TO 2 : PRINT#4 : L = L + 1 : NEXT J
130 PRINT#4, "{5 SPACES}TITLE{3 SPACES}" : L = L + 1
140 PRINT#4 : L = L + 1
150 INPUT#1, A$ : SW = ST
170 PRINT#4, A$ : L = L + 1
180 IF L<62 GOTO 250
190 GOSUB 500 : GOTO 250
250 IF SW<>0 GOTO 300
260 IF L = 0 GOTO 110
270 GOTO 150
300 IF L<>0 GOTO 190
310 CLOSE 1
320 CLOSE 4
330 END
500 FOR J = L + 1 TO 66 : PRINT#4 : NEXT J
510 L = 0 : RETURN
We can see that the GOTO 250 on line 190 is now redundant since we'll go there anyway. But we have other things to do. We're still trimming the program and have some distance to go yet.
Digging Deeper
Around lines 250 to 270, we jump around a lot. We have one jump forward to 300 and two jumps back to 110 or 150. The logic seems scattered.
I have a thing about loops: I like to see them neatly nested, with short jumps entirely within longer jumps. It might even be summarized as a rule of thumb: Where possible, make short jumps as short as possible.
Using this rule, I want to get the loop back to 150 into logical order first. Then we'll work in the longer loop to 110 and finally the forward branch to 300. We'll need to expand the logic using an AND operator, but that's not too hard.
As the routine is written, certain logical things start to fall together. For example, we don't have to GOTO forward to line 300. When we're finished writing the two loops, we'll fall into 300 naturally. ("Naturally" seems to be a key word in how programs seem to come together as you tighten them up.)
We can also tighten up the page eject conditions. If we write line 180 correctly, there will be no need to go back to get a page ejection. One option would be to call the subroutine at 500 twice. But if we think of what our objective really is at line 180, we can do it all correctly the first time through. Inverting the logic and adding an OR connective does the trick nicely.
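The two decisions at the bottom of the loop can be written out in Python as a rough sketch. The function names and the returned labels are made up for the example; the thresholds follow the tests described in the text (a full page at 62 printed lines, and a nonzero saved status at end of file).

```python
MAX_PRINTED = 62  # eject once this many lines are on the page (the test L > 61)


def should_eject(line_count, saved_status):
    """One test covers both reasons to eject: a full page OR the end of the file."""
    return line_count >= MAX_PRINTED or saved_status != 0


def next_step(line_count, saved_status):
    """Pick the branch: the short loop first, then the long one, else fall through."""
    if saved_status == 0 and line_count > 0:
        return "read"    # short loop: fetch the next line of the file
    if saved_status == 0:
        return "title"   # long loop: a page was just ejected, so start a new one
    return "close"       # end of file: fall through and close the files
```

Because should_eject fires on either condition, a file that ends exactly at the bottom of a page still gets only one eject, and there is no need to branch back for a second one.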
Look at how far the original program has come:
100 OPEN 4, 4
105 OPEN 1, 8, 3, "CONTROL"
110 REM START OF PAGE
120 FOR J = 1 TO 2 : PRINT#4 : L = L + 1 : NEXT J
130 PRINT#4, "{5 SPACES}TITLE{3 SPACES}" : L = L + 1
140 PRINT#4 : L = L + 1
150 INPUT#1, A$ : SW = ST
170 PRINT#4, A$ : L = L + 1
180 IF L>61 OR SW<>0 THEN GOSUB 500
250 IF SW = 0 AND L>0 GOTO 150
260 IF SW = 0 GOTO 110
310 CLOSE 1
320 CLOSE 4
330 END
500 FOR J = L + 1 TO 66 : PRINT#4 : NEXT J
510 L = 0 : RETURN
This is pleasing, but we can do even more. The repeated SW = 0 test in lines 250 and 260 still irks a little: It seems clumsy. The whole business is tied up with whether to print a title or not. Is there a better way? Could the test of L>0 be somehow shuttled up to the top of the loop instead of sitting at the bottom?
The Header Module
While we're thinking about it, that whole business of printing a header is really a module—we must do the whole thing, title and all, or nothing. If we move it out to a subroutine, we might see the logic flow more clearly. Let's do it and work on the logic flow. We end up with this:
100 OPEN 4, 3
105 OPEN 1, 8, 3, "CONTROL"
110 IF L = 0 THEN GOSUB 600
150 INPUT#1, A$ : SW = ST
170 PRINT#4, A$ : L = L + 1
180 IF L>61 OR SW<>0 THEN GOSUB 500
260 IF SW = 0 GOTO 110
310 CLOSE 1
320 CLOSE 4
330 END
500 FOR J = L + 1 TO 66 : PRINT#4 : NEXT J
510 L = 0 : RETURN
600 FOR J = 1 TO 2 : PRINT#4 : L = L + 1 : NEXT J
610 PRINT#4, "{5 SPACES}TITLE{3 SPACES}" : L = L + 1
620 PRINT#4 : L = L + 1
630 RETURN
Look at that main section from lines 100 to 330. It now seems tight and concise like a finely tuned instrument.
Both subroutines (at lines 500 and 600) are each called from only one place. If it seemed important, we could put them back into the main program stream. But I'm happy to see them as clearly isolated modules. At this stage I would add comments (line 499: REM PAGE EJECT and line 599: REM PAGE TITLE) to neaten things up.
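The same shape carries over to other languages. Here is a hedged Python sketch of the finished lister, with one helper per module. The names list_file, print_title, eject, and emit are assumptions for illustration, the records come from any iterable rather than a disk file, and an empty input simply prints nothing, which is a choice made here rather than something the BASIC program defines.

```python
PAGE_LENGTH = 66  # lines per physical page
BODY_LIMIT = 62   # eject once this many lines are on the page


def print_title(emit, title):
    """Page title module: two blank lines, the title, one blank line."""
    emit("")
    emit("")
    emit(f"     {title}   ")
    emit("")
    return 4  # lines now on the page


def eject(line_count, emit):
    """Page eject module: pad the page with blank lines, then reset the count."""
    for _ in range(PAGE_LENGTH - line_count):
        emit("")
    return 0


def list_file(records, emit, title="TITLE"):
    """Main loop: title at the top of a page, one record, eject when needed."""
    records = list(records)
    line_count = 0
    pages = 0
    for index, text in enumerate(records):
        at_end = index == len(records) - 1   # plays the role of the saved status
        if line_count == 0:
            line_count = print_title(emit, title)
        emit(text)
        line_count += 1
        if line_count >= BODY_LIMIT or at_end:
            line_count = eject(line_count, emit)
            pages += 1
    return pages


output = []
page_total = list_file([f"line {n}" for n in range(60)], output.append)
```

Sixty records fill one page (58 lines under the four-line title) and spill two lines onto a second, so the run produces two full pages of output.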
Moral
First, what you see published is not always the first idea that popped into the author's head. The programmer is not always smarter than you. Time has been taken to groom the program into its final shape. When many people are going to read your code, you like to take a few extra pains with its appearance.
Second, don't be afraid to revise your programs, even if they work correctly. Sure, a one-shot program often doesn't warrant picking over; use it and forget it. But sometimes the exercise can reveal, almost accidentally, powerful and effective programming methods.
Third, style isn't an inborn talent that some people have and some don't. You learn it as you go. Some things you will discover for yourself, and others you'll pick up by looking at other people's programs.
The odd thing is that we instinctively recognize better writing when we have written it. You may not know exactly why, but you often feel good about a certain piece of programming. Usually, it's because it has style.
Copyright © 1983 Jim Butterfield