6502eval
kickc c64 golang
Introduction
Figuring I needed to learn a bit more Go, I set out to expand on Marco Peereboom’s toy6502 CPU emulator, which is written in Go.
After forking the project, fixing a small nit so that it actually compiles, and turning it into a Go module, I set off to make a 6502eval program that uses the module for my own purposes, which relate to KickC and C64 development.

The end goal was to have something that can read C64 programs, run them in the CPU emulator and do a checksum on the screen afterwards.
Loader and BASIC parsing
I had to make it possible to read files into the emulator memory using a small C64 bin-loader that can load a binary where it wants to go, and make it exit nicely.
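As a rough illustration of the loading step, here is a minimal sketch of reading a C64 .prg image into a flat 64 KiB memory array. It relies on the standard .prg layout (a two-byte little-endian load address followed by the payload). The function name loadPRG, the memory representation and the choice to report the end address as one past the last byte are assumptions made for this example, not the actual code in 6502eval.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// loadPRG copies a C64 .prg image into mem (hypothetical helper).
// The first two bytes are the little-endian load address, the rest is payload.
// It returns the load address and the address one past the last byte written.
func loadPRG(mem *[65536]byte, prg []byte) (start, end int, err error) {
	if len(prg) < 3 {
		return 0, 0, errors.New("prg too short: need a load address and at least one byte")
	}
	start = int(binary.LittleEndian.Uint16(prg[:2]))
	payload := prg[2:]
	end = start + len(payload)
	if end > len(mem) {
		return 0, 0, fmt.Errorf("prg does not fit: 0x%04x + %d bytes", start, len(payload))
	}
	copy(mem[start:end], payload)
	return start, end, nil
}

func main() {
	var mem [65536]byte
	// Made-up three-byte payload that asks to be loaded at 0x0801.
	prg := []byte{0x01, 0x08, 0xa9, 0x00, 0x02}
	start, end, err := loadPRG(&mem, prg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("C64 Load Addr: 0x%04x, end: 0x%04x\n", start, end)
}
```

In a real tool the byte slice would come from os.ReadFile on the .prg file rather than from a literal.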
My loader is kind of simplistic. C64 BASIC is lightly tokenized, and calling into machine code is done with a SYS statement, which is encoded as the single byte 0x9e, followed by an optional space that usually isn’t there and then the address as PETSCII digits.
The loader reads the characters one by one after the first 0x9e. As long as a character is within 0x30-0x39 it takes the value minus 0x30; if another digit follows, it multiplies the sum so far by 10, adds the new digit, and repeats until it has figured out where the program meant to jump.
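The digit-by-digit parsing described above could look roughly like the sketch below. The function name findSysAddress and the example stub bytes are made up for illustration. Like the description, it naively takes the first 0x9e byte it sees, so a line link or line number that happens to contain 0x9e would confuse it.

```go
package main

import (
	"errors"
	"fmt"
)

const sysToken = 0x9e // tokenized BASIC keyword SYS

// findSysAddress (hypothetical helper) scans a tokenized BASIC stub for the
// first SYS token and parses the decimal PETSCII digits that follow it.
func findSysAddress(basic []byte) (int, error) {
	for i, b := range basic {
		if b != sysToken {
			continue
		}
		rest := basic[i+1:]
		// Skip the optional space(s) between SYS and the number.
		for len(rest) > 0 && rest[0] == ' ' {
			rest = rest[1:]
		}
		addr, digits := 0, 0
		for _, c := range rest {
			if c < '0' || c > '9' {
				break // stop at the first non-digit
			}
			addr = addr*10 + int(c-'0')
			digits++
			if addr > 0xffff {
				return 0, errors.New("SYS address does not fit in 16 bits")
			}
		}
		if digits == 0 {
			return 0, errors.New("SYS token without a decimal address")
		}
		return addr, nil
	}
	return 0, errors.New("no SYS token found")
}

func main() {
	// Made-up stub for "10 SYS2061": line link, line number, token, digits, terminators.
	stub := []byte{0x0b, 0x08, 0x0a, 0x00, sysToken, '2', '0', '6', '1', 0x00, 0x00, 0x00}
	addr, err := findSysAddress(stub)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("SYS target: %d (0x%04x)\n", addr, addr)
}
```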
Common structure of C64 asm programs
In almost all cases for compiled code, compressed code or output from a cross-assembler, the only BASIC line is a single SYS that calls into the machine code without having the BASIC part prepare anything. There might be a space and text afterwards, like the group name or something, but it doesn’t matter for this case since I stop the parser at the first non-digit anyhow.
Jmp right in
After that, I start the emulator at this address and let it run until it either hits an unimplemented 65xx opcode and crashes or, more nicely, ends when it reaches a JAM (0x02) instruction or when last-pc == curr-pc (i.e. an endless loop like JMP *).
This has sufficed for most programs I’ve tested, but I will probably have to add some kind of limit on the number of cycles it emulates before deciding the program will never finish. This limit can probably be quite large, since the emulator already runs a huge number of cycles in very little wall-clock time:
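To make the stop conditions concrete, here is a sketch of such a run loop, including the cycle limit that is not implemented yet. The cpu interface, the toyCPU stand-in (which only knows NOP and JMP absolute) and the run function are all invented for this example; the API of the real toy6502 module may look quite different.

```go
package main

import "fmt"

const opJAM = 0x02 // the JAM opcode used as the "nice" exit

// cpu is a hypothetical minimal interface, not the toy6502 module's API.
type cpu interface {
	PC() uint16
	Read(addr uint16) byte
	Step() (cycles int, err error) // execute one instruction
}

type runResult struct {
	Reason               string
	PC                   uint16
	Instructions, Cycles uint64
}

// run steps the CPU until JAM, an emulator error, a jump-to-self or the cycle limit.
func run(c cpu, maxCycles uint64) runResult {
	var res runResult
	for res.Cycles < maxCycles {
		pc := c.PC()
		res.PC = pc
		if c.Read(pc) == opJAM {
			res.Reason = "JAM instruction"
			return res
		}
		n, err := c.Step()
		if err != nil {
			res.Reason = err.Error()
			return res
		}
		res.Instructions++
		res.Cycles += uint64(n)
		if c.PC() == pc { // last-pc == curr-pc, e.g. JMP *
			res.Reason = "endless loop"
			return res
		}
	}
	res.PC = c.PC()
	res.Reason = "cycle limit"
	return res
}

// toyCPU is a stand-in that implements just enough to exercise run.
type toyCPU struct {
	mem [65536]byte
	pc  uint16
}

func (t *toyCPU) PC() uint16         { return t.pc }
func (t *toyCPU) Read(a uint16) byte { return t.mem[a] }
func (t *toyCPU) Step() (int, error) {
	switch op := t.mem[t.pc]; op {
	case 0xea: // NOP, 2 cycles
		t.pc++
		return 2, nil
	case 0x4c: // JMP absolute, 3 cycles
		t.pc = uint16(t.mem[t.pc+1]) | uint16(t.mem[t.pc+2])<<8
		return 3, nil
	default:
		return 0, fmt.Errorf("unimplemented opcode 0x%02x", op)
	}
}

// newToy places program at 0x0801 and points the PC at it.
func newToy(program ...byte) *toyCPU {
	t := &toyCPU{pc: 0x0801}
	copy(t.mem[0x0801:], program)
	return t
}

func main() {
	for _, prog := range [][]byte{
		{0xea, 0xea, 0x02},       // NOP NOP JAM
		{0xea, 0x4c, 0x02, 0x08}, // NOP, then JMP to itself
		{0xea, 0x4c, 0x01, 0x08}, // NOP/JMP loop that never repeats a PC back to back
	} {
		r := run(newToy(prog...), 100)
		fmt.Printf("Exit on %s at PC 0x%04x. instructions run: %d cycles: %d\n",
			r.Reason, r.PC, r.Instructions, r.Cycles)
	}
}
```

The third program shows why the last-pc check alone is not enough: a loop of two or more instructions never repeats the same PC twice in a row, so only the cycle limit stops it.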
C64 Load Addr: 0x0801, end: 0x0828
Exit on JAM instruction at PC 0x0827.
instructions run: 100007424 cycles: 416480995
real 0m0.502s
Checksumming
The next step for my program was to print a checksum of the screen so I can verify that the output is the same after changing something, like packing the program or adding some smart inline asm to speed up a certain subroutine. I wanted a checksum that is fast and “only” 64 bits; I don’t think a larger checksum is required for something as small as the 1k of screen RAM, so I ended up with xxhash.
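A minimal sketch of the screen checksum idea follows. To keep the example free of third-party dependencies it uses FNV-1a from the Go standard library as a stand-in for xxhash, so the values it prints will not match the checksums shown in this post. It also assumes the default C64 screen location at 0x0400 and hashes only the 1000 visible character cells; the exact range that 6502eval hashes may differ.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	screenStart = 0x0400 // default C64 screen RAM
	screenSize  = 1000   // 40 columns x 25 rows
)

// screenChecksum returns a 64-bit hash of the screen area.
// FNV-1a is a stand-in here; the post itself uses xxhash.
func screenChecksum(mem *[65536]byte) uint64 {
	h := fnv.New64a()
	h.Write(mem[screenStart : screenStart+screenSize])
	return h.Sum64()
}

func main() {
	var mem [65536]byte
	// Pretend the program cleared the screen with spaces (screen code 0x20).
	for i := 0; i < screenSize; i++ {
		mem[screenStart+i] = 0x20
	}
	before := screenChecksum(&mem)

	mem[screenStart] = 0x01 // put an "A" in the top-left corner
	after := screenChecksum(&mem)

	fmt.Printf("cleared screen:  checksum: %016x\n", before)
	fmt.Printf("one char changed: checksum: %016x\n", after)
	fmt.Println("same output:", before == after)
}
```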
All in all this means I can play around with KickC options and/or compressors as long as my C64 programs don’t use BASIC/ROM/hardware or depend on interrupts, and as long as they clear the screen RAM at the start of the run to remove potential output from loading the binary in the normal C64 case. It would be somewhat easy to add both BASIC and Kernal ROMs from files, but many programs seem to do very well without calling into either.
Example output:
$ ./6502eval -filename goloop.prg
** Starting emulation **
Exit on JAM instruction at PC 0x0827.
instructions run: 100007424 cycles: 416480995
Reading screen memory, checksum: 14d4deb8c881a940
How well does it work
I have tested programs generated with KickC/KickAssembler, the output of various crunchers, and the executable output from PetMate, and they all work fine. Some compiled programs I tried used illegal 6510 ops, and while the 6502 emulator module is complete and validated for the documented 6502 ops, it has nothing yet for the undocumented ones, so I have this left to add before I can take on a larger set of programs.
The structure of the instruction decoding is very simple and clean, so the work will mostly be tedious: grasping all the subtleties of the illegal ops and implementing their quirks correctly.
Scripts I use around this
One of the things I usually do for my C64 programs as they grow large is to run them through a wide variety of packers. Since the packers all perform differently depending on the structure of the data, I need to run all of them (possibly several times with different flags) whenever I have made large changes to a program, like relocating a large part of the code, importing a picture or adding music.
It was easy to modify the script I normally use for testing all possible packer combinations. Instead of just picking the smallest output, I now run the original and each of the outputs from exomizer, subsizer and pucrunch through the evaluator and compare the checksums:
Original: checksum: 89bbeba7f39405ea size: 1990
prog.m_5: checksum: 89bbeba7f39405ea size: 1389
prog.m_6: checksum: 89bbeba7f39405ea size: 1390
prog.m_7: checksum: 89bbeba7f39405ea size: 1390
prog.msb: checksum: 89bbeba7f39405ea size: 1327
prog.mx7: checksum: 89bbeba7f39405ea size: 1290
prog.mx9: checksum: 89bbeba7f39405ea size: 1290
prog.mxB: checksum: 89bbeba7f39405ea size: 1288
prog.mxC: checksum: 89bbeba7f39405ea size: 1287
prog.mxD: checksum: 89bbeba7f39405ea size: 1286
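The selection step of such a script can be sketched as follows: keep only the packed variants whose checksum matches the original, then pick the smallest. My actual script is not shown here; the result type and the pickSmallest function are invented for this example, and the data is a few rows from the listing above.

```go
package main

import "fmt"

// result is one evaluated binary (hypothetical type for this sketch).
type result struct {
	Name     string
	Checksum string
	Size     int
}

// pickSmallest returns the smallest candidate whose checksum matches the
// reference, plus the names of all candidates that produced different output.
// If nothing smaller matches, the reference itself is returned.
func pickSmallest(ref result, candidates []result) (best result, mismatches []string) {
	best = ref
	for _, c := range candidates {
		if c.Checksum != ref.Checksum {
			mismatches = append(mismatches, c.Name)
			continue
		}
		if c.Size < best.Size {
			best = c
		}
	}
	return best, mismatches
}

func main() {
	original := result{"Original", "89bbeba7f39405ea", 1990}
	packed := []result{
		{"prog.m_5", "89bbeba7f39405ea", 1389},
		{"prog.msb", "89bbeba7f39405ea", 1327},
		{"prog.mxC", "89bbeba7f39405ea", 1287},
		{"prog.mxD", "89bbeba7f39405ea", 1286},
	}
	best, bad := pickSmallest(original, packed)
	fmt.Printf("smallest verified: %s (%d bytes)\n", best.Name, best.Size)
	fmt.Printf("checksum mismatches: %d\n", len(bad))
}
```

A packed file that is smaller but produces a different checksum is reported instead of being picked, which is the whole point of running the evaluator in the loop.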
Back to how KickC fits into all this
I am testing a lot of C sources to see how KickC fares in generating decent C64 asm. While it does handle many normal C sources and constructs (apart from the deliberately not-implemented parts), there are still some things it gets stuck on. Coding on compilers in Java is not my strongest point, but there are lots of things one can do to improve the compiler.
One of the neat things KickC does when you give it some odd C construct that it doesn’t yet know how to turn into a small asm fragment is to print out something along the lines of “if only I had a fragment that took a pointer to a word-sized unsigned int in zeropage, tested whether it is not equal to a signed byte pointed to by a constant, and jumped to label LA1 if so”, but of course in a more cryptic form:
Fragment not found vwuz1_neq_vbsc1_then_la1
The neatness comes in that you can start coding that fragment, drop the hopefully-working version in the fragment dir, recompile your program and suddenly your program works and the compiler is a tad bit better. You don’t even have to rebuild the compiler, the fragment lists are built at runtime so any new fragment is available immediately.
Fragment combining
KickC can also tell you when it lacks a fragment like the above but manages to combine two, three or four smaller fragments to synthesize the same result. That gives you the chance to whip up some neat 65xx asm so your program gets smaller and/or runs faster. If you code a fragment wrong it will be hard to debug, which is why I would like to validate that test programs don’t change behaviour after my edits.
TODO
- 6510 illegal ops
- Possibility to emulate interrupts
- Some web service: upload your prg, get a checksum back
- OpenBSD pledge/unveil support for such a web service
- Additional 65xx platform loaders, VIC20? AtariXL?
- Print out the screen memory after the run
- Option to load Kernal and BASIC ROMs
- Get a range of test programs from KickC regression tests and other places to compare known-good checksums for the future