BFP Chapter 05: Scrolling

Download sources for this chapter

Introduction

In this chapter we’re going to handle timing and have our code run periodically. This will enable us to do such cool things as a scrolling background.
We’re also going to ingest the super mushroom! We’ll start looking at what other people have done. Figuring out stuff by myself has been fun but it gets old real soon.

So, let’s dig in!

Creating a main loop

There are different kinds of things we can do during the active scan and the vertical retrace periods. During the active scan, the VDP is really busy, so we’d better not mess with it too much. While the VDP is putting pixels on screen, the CPU has “limited access” to it. This means that the VDP consumes our writes to its data and control ports less frequently, so the corresponding FIFO fills up quickly. I think I saw somewhere in the manual that the FIFO has enough space for 4 writes.

It is both a good and bad thing that, after the FIFO becomes full, the CPU must wait before it can write again. Thank God for this, because otherwise we’d have to check if the FIFO is full before writing. This would complicate things a lot. The bad thing is that we get a performance penalty. Fortunately this is small (about 5-6 microseconds), but nevertheless we don’t want to perform frequent writes to the VDP during active scan. These small delays could stack up pretty easily.

During the vertical retrace, we have unlimited access to the VDP.

Given the above, the simplest way to organize our main loop would be:

	Update (perform game logic etc)
	Wait for VBlank
	Render (push cells/indices/palettes to VDP)
	Wait for active scan
	Repeat

Probably, this is the only way we can do it. There’s this VBlank interrupt that bugs me. I’m trying to figure out what its purpose is, and if we could somehow make use of it. For example, we could set a flag in the interrupt, so that our update routine terminates, but this requires that our update routine is periodically checking for that flag. And if we do that, we don’t even need to use the interrupt. The flag is already available to us in the VDP “status” register:

statusRegister.PNG

The other thing that’s bugging me is that we have absolutely no way of controlling when a frame is displayed. Actually, the frame is controlling us. So, in case our update routine takes too much time, we will end up waiting for the VBlank of the next frame. I guess the only way around this is to make sure that our update routine is short enough.

Since there’s no practical way of enforcing this, the next best thing is to somehow display a warning if update exceeds its time limit. Let’s push that to our TO-DO stack and carry on with our main loop. We need two routines to implement it:

	VDPWaitForVBlank
	VDPWaitForActiveScan

Both routines will loop until a condition is met. They’re going to be similar to what we did for waiting for DMA completion:

VDPWaitForDMA:
	VDPReadStatus d0
	btst #3, d0
	bne VDPWaitForDMA
	rts

This time we’re looking at bit #2. Let’s start with waiting for the active scan, which will be nearly identical to waiting for DMA completion. We’re looking for a bit to be cleared:

VDPWaitForActiveScan:
	VDPReadStatus d0
	btst #2, d0
	bne VDPWaitForActiveScan
	rts

And then, we flip the condition to wait for VBlank (the things I’m capable of, to avoid understanding again what “bne” means):

VDPWaitForVBlank:
	VDPReadStatus d0
	btst #2, d0
	beq VDPWaitForVBlank
	rts

Ready to roll

We are now ready to make our main loop. But I’d really like to make it do something. Would scrolling our background be too much to ask? Let’s see what the manual has to say…

IhadToAsk.PNG

I had to ask, hadn’t I?

So, to scroll horizontally, essentially you fill this table with the same offset value. Even slots of the table will affect scroll A and odd will affect scroll B. We can also specify a different offset per scanline, which would enable us to do some nice effects. I never liked a game that didn’t have a fake perspective floor.

sf2.png

But I’m not going to do it right now. I’m not in the mood to push 480 words to VRAM during the VBlank, which is severely limited in time. Alternatively, you can set only the first table entry to scroll the whole of scroll A, and the second to scroll the whole of scroll B. To do that, you set register #11 to zero, which we already did during initialization.
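
For reference, the per-scanline variant being skipped here could be sketched roughly like this. It is untested and rests on several assumptions: register #11 set to 3 selects one entry per scanline, auto-increment (register #15) is already 2, the table sits at $F800, the plane A offset is in d1, and d2 is free to use as a counter. FillHScrollTable and FillHScrollLoop are made-up names.

	FillHScrollTable:
		VDPWriteToControl #$8B03	; register #11 = 3: one scroll entry per scanline (assumed)
		VDPWriteToControl #$7800	; VRAM write to $F800, first command word
		VDPWriteToControl #$0003	; second command word (address bits 15-14)
		move.w #239, d2				; 240 scanlines, dbra runs the loop d2+1 times
	FillHScrollLoop:
		VDPWriteToData d1			; even slot: scroll A offset for this line
		VDPWriteToData #0			; odd slot: scroll B offset for this line
		dbra d2, FillHScrollLoop
		rts

That is 240 lines times 2 planes, which is where the 480 words come from.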

Like the sprite table, this table will be more or less permanent in VRAM. So, let’s put it right above the sprite table.

We’re putting the HScroll table at $F800, which is right above the sprite table, as tightly packed as the resolution permits (1KB increments: the 6 most significant bits of the 16-bit address). We write those bits to register #13.
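
As a worked example, here is the arithmetic for $F800. It assumes register writes use the usual $8000 | (register << 8) | value control word, and that VDPWriteToControl writes a single word to the control port.

		; $F800 >> 10 = $3E, the top 6 bits of the address
		VDPWriteToControl #$8D3E	; $8000 | (13 << 8) | $3E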

By the way, here is the main program logic:

	jsr appStart					; app initialization routine
	jsr VDPWaitForActiveScan		; wait for the next active scan
	
mainLoop:
	jsr appUpdate					; app update - during active scan
	jsr VDPWaitForVBlank
	jsr appRender					; app render - during VBlanking
	jsr VDPWaitForActiveScan
	jmp mainLoop

I put the start/update/render routines in a separate file called “app.x68”. Now our previous test’s code is in appStart. We now have a cleaner main file. Actually let me rename it from “test.x68” to “entryPoint.x68”, to be more descriptive.

Now, let’s actually do something in our update routine. I’m just going to increase d1 by one:

appUpdate:
	add.l #1, d1	
	rts

It’s going way too fast. I’ve put code in the wait functions as well, to increment other registers every time they loop. I then divided these registers by d1 to find out that we can do 6 iterations of adding and branching back per frame. I know the Mega Drive is a bit limited, but this is just ridiculous! Furthermore, my frame counter goes up really fast: on the order of thousands of frames per second.

This can only mean one thing: we are not actually waiting for VSync.

Maybe we’re not testing the right bit? Nope, I double-checked the docs.

– Enough is enough. –

facepalm.jpg

This is a major turning point in this series. From now on, I’m going to utilise every single bit of information I can get my hands on. Because otherwise we’re getting nowhere. I’ll find another way to add some adventure to it.

There are far too many errors in the sega2 document. Look at this working code, for example, which tests a different bit than the one we used:

WaitVBlankStart:
	move.w  vdp_control, d0	; Move VDP status word to d0
	andi.w  #0x0008, d0     ; AND with bit 4 (vblank), result in status register
	bne     WaitVBlankStart ; Branch if not equal (to zero)
	rts

WaitVBlankEnd:
	move.w  vdp_control, d0	; Move VDP status word to d0
	andi.w  #0x0008, d0     ; AND with bit 4 (vblank), result in status register
	beq     WaitVBlankEnd   ; Branch if equal (to zero)
	rts

This is written by Matt, from BIG EVIL CORPORATION.

Again, I’m not going to try and trace where this info came from, but I’d really like to recursively thank everyone who helped solve this ridiculous mystery.

Also, the BIG EVIL CORPORATION blog has a very nice wordpress theme. I’m stealing that as well.

So, after all, we need to check for bit #3. Fine by me, but then, what on earth were we doing when we were waiting for DMA completion? We were actually waiting for the next frame…

So, where is the DMA bit then? Aha:

realStatusRegister.PNG

I’m keeping this txt by Charles MacDonald.

So we correct all 3 of our waiting routines. Now, frame counts are more believable.
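
The corrected routines are not listed here, so what follows is a reconstruction rather than the actual source. It assumes, going by Charles MacDonald’s document, that VBlank is bit #3 of the status register and DMA busy is bit #1:

	VDPWaitForDMA:
		VDPReadStatus d0
		btst #1, d0					; DMA busy (assumed to be bit #1)
		bne VDPWaitForDMA			; loop while a DMA is in progress
		rts

	VDPWaitForActiveScan:
		VDPReadStatus d0
		btst #3, d0					; VBlank flag
		bne VDPWaitForActiveScan	; loop while still in VBlank
		rts

	VDPWaitForVBlank:
		VDPReadStatus d0
		btst #3, d0					; VBlank flag
		beq VDPWaitForVBlank		; loop while still in active scan
		rts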

Some stats

Again, I let it run for a while and did the divisions in Windows Calculator. Seems like we can do about 2658 repetitions of “add, read status, check bit, branch” per frame during the active scan (the time we have for our update routine), and about 242 during VBlank (the time we have for our render routine). Which totally sucks.
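
The counting can be reproduced roughly like this. It is a sketch only: d2 and d3 are assumed to be free registers, and d1 is the frame counter incremented once per frame in appUpdate.

	VDPWaitForVBlank:
		add.l #1, d2				; counts polls made during active scan
		VDPReadStatus d0
		btst #3, d0
		beq VDPWaitForVBlank
		rts

	VDPWaitForActiveScan:
		add.l #1, d3				; counts polls made during VBlank
		VDPReadStatus d0
		btst #3, d0
		bne VDPWaitForActiveScan
		rts

Dividing d2 and d3 by d1 then gives the average number of repetitions per frame in each period.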

So, VBlanking will take about 8% of our time (242 out of 2658 + 242 = 2900 repetitions). That’s an easier number to keep in mind.
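
The overrun warning from the TO-DO stack could be sketched by peeking at the VBlank bit as soon as the update returns. If the bit is already set, the update has eaten into VBlank. This is only one possible approach: appWarnOverrun is a hypothetical routine (it could change the background colour, for instance), and updateWasOnTime is a made-up label.

	mainLoop:
		jsr appUpdate
		VDPReadStatus d0
		btst #3, d0					; are we already in VBlank?
		beq updateWasOnTime			; no: the update finished during active scan
		jsr appWarnOverrun			; yes: the update ran too long (hypothetical routine)
	updateWasOnTime:
		jsr VDPWaitForVBlank
		jsr appRender
		jsr VDPWaitForActiveScan
		jmp mainLoop

The check misses an update so slow that it finishes in the active scan of a later frame, and calling the warning routine shortens the time left for rendering in that frame.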

Back to scrolling

The only thing left to do now, is write the value of d1 to VRAM location F800. A macro would be handy for writing VRAM:

	macro VDPPointToVRAM, addr
		VDPWriteToControl #(( \addr & $3FFF) | $4000)
		VDPWriteToControl #(( \addr >> 14) & 3)
	endmacro
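
To see what the macro produces, here is the expansion for $F800 worked out by hand:

		VDPWriteToControl #$7800	; ($F800 & $3FFF) | $4000 = $3800 | $4000
		VDPWriteToControl #$0003	; ($F800 >> 14) & 3 = address bits 15-14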

And then, here is how we scroll:

	VDPPointToVRAM $F800
	VDPWriteToData d1

And yes! It slides!

itslides.PNG

I’m only scrolling plane A. It seems to be enough: plane A is of higher priority than plane B and there are no transparent pixels in our image, so we can’t see plane B anyway.

Now, I’d like it to scroll a bit faster and from right to left:

	add.l #-4, d1

Initializing the rest of the VDP (CRAM, VSRAM)

I’m adding a similar macro to point to CRAM, so we can easily change colors:

	macro VDPPointToCRAM, addr
		VDPWriteToControl #(( \addr & $3FFF) | $C000)
		VDPWriteToControl #(( \addr >> 14) & 3)
	endmacro

And one more for VSRAM (Vertical Scrolling RAM):

	macro VDPPointToVSRAM, addr
		VDPWriteToControl #(( \addr & $3FFF) | $4000)
		VDPWriteToControl #((( \addr >> 14) & 3) | $10)	; CD2 is bit 4 of the second word
	endmacro

And here is how we clear both the CRAM and VSRAM to zero in our initialization routine:

	VDPPointToCRAM 0
	repeat 64
		VDPWriteToData #0
	endrepeat
	
	VDPPointToVSRAM 0
	repeat 40
		VDPWriteToData #0
	endrepeat

A little more housekeeping

Sprite table and Hscroll table will be constants, so let’s make some assembler symbols for them:

	defc VDPSpriteTable = $FE00
	defc VDPHScrollTable = $F800
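
With those symbols in place, the render routine can presumably be written without magic numbers. A sketch, assuming appRender does nothing else, that the macro accepts a symbol as its argument, and that the scroll offset still lives in d1:

	appRender:
		VDPPointToVRAM VDPHScrollTable	; point the VDP at the first HScroll entry
		VDPWriteToData d1				; scroll A offset for the whole screen
		rts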

Next item in my list is: “Optimize double writes to control with a long write”. I don’t know if this is going to save some cycles, but let’s do it anyway.

I’m creating two more macros VDPWrite*L, with the ‘L’ suffix:

	macro VDPWriteToControlL, value
		move.l \value, $C00004
	endmacro
	
	macro VDPWriteToDataL, value
		move.l \value, $C00000
	endmacro

And replacing my writes with packed long word writes throughout the code, like this:

	VDPWriteToControlL #$40000080		; Destination address (0)
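
The pointing macros can be packed the same way. The sketch below uses a hypothetical VDPPointToVRAML macro and assumes the assembler accepts the << operator in expressions, which nothing above confirms:

	macro VDPPointToVRAML, addr
		VDPWriteToControlL #(((( \addr & $3FFF) | $4000) << 16) | (( \addr >> 14) & 3))
	endmacro

For $F800 this works out to $78000003: the first command word ($7800) in the upper half and the second ($0003) in the lower half.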

Afterword

Having taken care of a lot of items in our list, it’s time to move to somewhat higher level programs. Stay tuned for next chapter where we’ll create a primitive memory manager that will let us refer to memory locations using variable names instead of numbers.


