Labels & Symbols

Suppose you want to call some code at address $1000. CPUs fundamentally deal with numeric values, so the machine code to call it would be JSR $1000. Humans tend to work better with words, so associating a meaningful symbol with address $1000 can greatly improve the readability of the code: something like JSR DrawSprite is far more helpful for human readers. Further, once the code has been disassembled to source code, using symbols instead of fixed addresses makes it easier to alter the program or reuse the code in another project.

When the target address of an instruction like JSR or LDA falls within the bounds of the data file, SourceGen classifies the reference as internal, and automatically adds a generic symbolic label (e.g. L1000). The label can be edited if desired.

On the line at address $2000, select Actions > Edit Label, or double-click on the label "L2000". Change the label to "MAIN", and hit Enter. The label changes on that line, and on the two lines that refer to address $2000. (If you're not sure which lines refer to address $2000, select line $2000 and check the list in the References window.)

Sometimes the target address falls outside the data file. Examples include calls to ROM routines, use of zero-page storage, and access to memory-mapped I/O locations. SourceGen classifies these as external, and does not generate a symbol. In an assembler source file, symbols for these would be expressed as equates (e.g. FOO = $8000), usually at the top of the file or in an "include file". SourceGen allows you to specify symbols for addresses and numeric constants within the project ("project symbols"), or in a symbol file that can be included in multiple projects ("platform symbols"). The SourceGen distribution includes platform symbol files with ROM addresses for several common systems.
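The internal/external distinction can be sketched in a few lines of Python. This is only a conceptual model, not SourceGen's implementation: the function name is hypothetical, and it assumes the whole file is a single contiguous block loaded at one address. The "L" plus four hex digits label format follows the L1000 example above.

```python
def classify_reference(target, file_start, file_length):
    """Classify an operand address and suggest an auto-generated label.

    Assumes the whole file is one contiguous block loaded at file_start.
    """
    if file_start <= target < file_start + file_length:
        # Internal: the address is inside the file, so it can carry a label.
        return "internal", f"L{target:04X}"
    # External: nothing in the file to attach a label to.
    return "external", None


# Hypothetical file: 64 bytes loaded at $1000.
print(classify_reference(0x1000, 0x1000, 64))
print(classify_reference(0x8000, 0x1000, 64))
```

An external address such as $8000 gets no label from this model, which is why it needs an equate (a project or platform symbol) instead.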

For an example, consider the code at address $2000, which is LDA $3000. We want to assign the symbol "INPUT" to address $3000, but we can't do that by editing a label because it's not inside the file bounds. We can open the project symbol editor from the project properties editor, or we can use a shortcut.

With the line at $2000 selected, use Actions > Edit Operand, or double-click on the value in the Operand column ("$3000"). This opens the Edit Instruction Operand dialog. In the bottom left, click Create Project Symbol. Set the Label field to "INPUT", and click OK, then OK in the operand editor.

The instruction at $2000 now uses the symbol "INPUT" as its operand. If you scroll to the top of the file, you will see a ".EQ" line for the symbol.

Numeric v. Symbolic

When SourceGen sees a reference to an address, such as the operand of an absolute JSR or LDA, it treats it as a numeric reference. You can edit the instruction's operand to use a symbol instead, which changes it to a symbolic reference. The two kinds are handled differently, which can be confusing.

Let's use the branch statement at $2005 to illustrate the difference. It performs a branch to $2009, which was automatically assigned the label "L2009".

Edit the label at $2009 (double-click on "L2009" there), and change it to "IN_RANGE". Line $2005 changes to match. This works because SourceGen is auto-formatting line $2005's operand based on the label it finds when it chases the numeric reference to $2009. The Info window shows this as Format (auto): symbol "IN_RANGE".

Use Edit > Undo to revert the label change.

Edit the instruction operand at $2005 (double-click on "L2009" there). Change the format to Symbol, and type "IN_RANGE" in the symbol box. The preview shows BCC IN_RANGE (?), which hints at a problem. Click OK.

Some things changed, but not the things we wanted. Line $2005 now says BCC $2009 instead of BCC L2009, and the label at $2009 has disappeared entirely. What went wrong?

The problem is that we edited the operand to use a symbol that isn't defined anywhere. Because "IN_RANGE" isn't defined, the operand was given the default format, and displayed as a hex value. The numeric reference to $2009 was replaced by the symbol, and nothing else refers to that address, so SourceGen no longer had any reason to put an auto-generated label on line $2009, which is why that disappeared.
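To make the disappearing label concrete, here is a minimal Python sketch of the rule described above: an address gets an auto-generated label only while some numeric reference targets it. The data layout and function name are hypothetical and chosen for illustration; SourceGen's internal bookkeeping is certainly more involved.

```python
def auto_label_addresses(operands):
    """Return the addresses that get an auto-generated label.

    Each operand is (target_address, symbol); symbol is None for a
    numeric reference, or a name for an explicit symbolic reference.
    Only numeric references cause a label to be generated.
    """
    return {target for target, symbol in operands if symbol is None}


# The branch at $2005 as a numeric reference: $2009 gets a label.
before = [(0x2009, None)]
# The same branch after the operand is set to the symbol "IN_RANGE".
after = [(0x2009, "IN_RANGE")]

print(sorted(f"L{a:04X}" for a in auto_label_addresses(before)))
print(sorted(f"L{a:04X}" for a in auto_label_addresses(after)))
```

Once the only reference to $2009 becomes symbolic, the set of auto-labeled addresses is empty, matching what happened in the code list.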

The missing symbol is called out in a message window that pops up at the bottom of the code list window. This window only appears when there are messages to read. You can hide it with the Hide button, and make it reappear with the button in the bottom right of the main window, which currently says "1 message".

We can resolve this issue by providing the desired symbol. As you did earlier, edit the label on line $2009 (double-click in the label column) and set it to "IN_RANGE". When you do, the operand on line $2005 is updated appropriately. If you select line $2005, the Info window shows the format as Format: symbol "IN_RANGE", indicating that the symbol was set explicitly rather than automatically.

Symbolic references always link to the symbol, even when the symbol doesn't match the numeric reference. To see this, remove the label from line $2009 by undoing that change with Edit > Undo, so the symbol is again undefined. Now set the label on the following line, $200A, to "IN_RANGE".

Line $2005 now says "BCC IN_RANGE-1". Earlier you set the operand to be a symbolic reference to "IN_RANGE", but the symbol doesn't quite match, so SourceGen automatically adjusted the operand by one byte to point to the correct address. Generally speaking, SourceGen will do its best to use the symbols that you tell it to, and will adjust the symbolic references so that the code assembles correctly.
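The adjustment is simple arithmetic: the operand is the symbol plus or minus the distance between the address the instruction actually targets and the address where the symbol is defined. The Python sketch below models that, along with the hex fallback for an undefined symbol seen earlier. The function name and the dictionary used as a symbol table are assumptions made for illustration, not SourceGen internals.

```python
def format_symbolic_operand(symbol, target, symbols):
    """Render the operand of an explicit symbolic reference.

    symbol  -- the name the user typed into the operand editor
    target  -- the address the instruction actually refers to
    symbols -- dict mapping defined symbol names to their addresses
    """
    if symbol not in symbols:
        # Undefined symbol: fall back to the default hex format.
        return f"${target:04X}"
    delta = target - symbols[symbol]
    if delta == 0:
        return symbol
    sign = "+" if delta > 0 else "-"
    return f"{symbol}{sign}{abs(delta)}"


print(format_symbolic_operand("IN_RANGE", 0x2009, {}))
print(format_symbolic_operand("IN_RANGE", 0x2009, {"IN_RANGE": 0x2009}))
print(format_symbolic_operand("IN_RANGE", 0x2009, {"IN_RANGE": 0x200A}))
```

The three calls correspond to the three states in this walkthrough: symbol undefined, symbol defined on line $2009, and symbol defined one byte later on line $200A.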

Edit the label on line $200A, and change it to "NIFTY". Note how the reference on line $2005 also changed. This is an example of a "refactoring rename": when you changed the label, SourceGen automatically found everything that referred to it and updated it. If you edit the operand on line $2005, you can confirm that the symbol has changed.

(If you want to clean this up before continuing on to the next section, put the label back on line $2009.)

Non-Unique Labels

Most assemblers have a notion of "local" labels, which go out of scope when a non-local (global) label is encountered. The actual definition of "local" is assembler-specific, but SourceGen allows you to create labels that serve the same purpose.

By default, newly-created labels have global scope and must be unique. You can change these attributes when you edit the label. Up near the top of the file, at address $1002, double-click on the label ("L1002"). Change the label to "LOOP" and click the "non-unique local" radio button. Click OK.

The label at line $1002 (and the operand on line $100B) should now be "@LOOP". By default, '@' is used to indicate non-unique labels, though you can change it to a different character in the application settings.

At address $2019, double-click to edit the label ("L2019"). If you type "MAIN" or "IS_OK" with Global selected you'll get an error, but if you type "@LOOP" it will be accepted. Note that the "non-unique local" radio button is selected automatically if you start a label with '@' (or whatever character you have configured). Click OK.
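The behavior of the label editor in this step can be modeled roughly as follows. This Python sketch covers only what is described here (a duplicate global name is rejected, and a leading '@' selects the non-unique local option); the function name is hypothetical, and any other validation SourceGen performs is not represented.

```python
def parse_label(text, global_labels, local_prefix="@"):
    """Return (name, is_local), or raise ValueError for a duplicate global.

    A label that starts with local_prefix is treated as non-unique local,
    so it is not checked against the existing global labels.
    """
    is_local = text.startswith(local_prefix)
    name = text[len(local_prefix):] if is_local else text
    if not is_local and name in global_labels:
        raise ValueError(f"global label {name!r} is already in use")
    return name, is_local


existing = {"MAIN", "IS_OK"}
print(parse_label("@LOOP", existing))
try:
    parse_label("MAIN", existing)
except ValueError as err:
    print("rejected:", err)
```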

You now have two lines with the same label. In some cases the assembly source generator may need to "promote" them to globals, or rename them to make them unique, depending on what your preferred assembler allows.

Address Region Pre-Labels

When we created an address region at $2000, the LDA on line $1002 lost its label, and the STA on line $1005 gained one. The difficulty with having labels in both operands is that both instructions refer to the byte at offset +000017, but that offset has different addresses before and after the code is relocated, and you can't assign multiple addresses to a single file offset.

In assembly source code, we'd solve this by putting a label right before the address change, and another one after. We can do the same thing here by setting a "pre-label" for the address region.
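The "one offset, two addresses" problem can be illustrated with a small Python model. It rests on assumptions that are not stated above: that the file is loaded at $1000 starting at offset +000000, that the $2000 region begins at offset +000017, and that each region simply runs until the next one starts. The function names are hypothetical.

```python
def address_at(offset, regions):
    """Map a file offset to the address of the region that contains it.

    regions is a list of (start_offset, start_address) pairs sorted by
    start_offset; each region is assumed to run until the next one begins.
    """
    start, base = [r for r in regions if r[0] <= offset][-1]
    return base + (offset - start)


def pre_label_address(index, regions):
    """Address the first byte of regions[index] would have had if the
    preceding region had simply continued."""
    start = regions[index][0]
    return address_at(start, regions[:index])


# Assumed layout: file loaded at $1000, relocated code starts at +000017.
regions = [(0x000000, 0x1000), (0x000017, 0x2000)]

print(f"${address_at(0x17, regions):04X}")
print(f"${pre_label_address(1, regions):04X}")
```

Under these assumptions the offset itself maps only to $2000, while the pre-label supplies the second address, the one the byte occupies before the code is relocated.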

Select the .addrs line before line $2000, then use Actions > Create/Edit Address Region (or double-click on the operand of the .addrs line, i.e. "$2000"; if you double-click on the .addrs opcode you'll jump to the matching .adrend instead). This opens the address region editor. In the Pre-label section near the bottom, enter "COPY_SRC", then click OK.

This added a line before the .addrs $2000 with the new label, and updated the LDA at line $1002 to refer to the symbol.

Pre-labels are treated as "external" symbols, because the address they're associated with isn't actually represented by the file in its final form. As a result, you can't use non-unique local names like @LOOP.