Labels & Symbols
Suppose you want to call some code at address $1000. CPUs
fundamentally deal with numeric values, so the instruction to
call it would be JSR $1000. Humans tend to work better with
words, so associating a meaningful symbol with address $1000
can greatly improve the readability of the code: something like
JSR DrawSprite is far more helpful for human readers.
Further, once the code has been disassembled to source code, using symbols
instead of fixed addresses makes it easier to alter the program or reuse
the code in another project.
When the target address of instructions like JSR and LDA
falls within the scope of the data file, SourceGen classifies
the reference as internal, and automatically adds a generic
symbolic label (e.g. L1000). The label can be edited if desired.
On the line at address $2000, select
Actions > Edit Label, or double-click on the label
"L2000". Change the label to "MAIN", and hit
Enter. The label changes on that line,
and on the two lines that refer to address $2000.
(If you're not sure which lines refer to address $2000,
select line $2000 and check the list in the References window.)
Sometimes the target address falls outside the data file. Examples
include calls to ROM routines, use of zero-page storage, and access to
memory-mapped I/O locations. SourceGen classifies these as external,
and does not generate a symbol. In an assembler source file, symbols
for these would be expressed as equates (e.g. FOO = $8000),
usually at the top of the file or in an "include file". SourceGen
allows you to specify symbols for addresses and numeric constants within
the project ("project symbols"), or in a symbol file that can be
included in multiple projects ("platform symbols"). The SourceGen
distribution includes platform symbol files with ROM addresses for
several common systems.
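The internal/external distinction can be sketched in a few lines of Python. This is only an illustrative model, not SourceGen's actual implementation: the function name classify_reference, the REGIONS list, and the region sizes are all hypothetical, chosen so that $2000 lands inside the file and $3000 lands outside it, as in this tutorial.

```python
# Illustrative model only; this is not how SourceGen is implemented.
# Each region is (start_address, length_in_bytes). The sizes here are
# made up for the example.
REGIONS = [(0x1000, 0x0017), (0x2000, 0x0020)]

def classify_reference(target, regions):
    """Return ("internal", auto_label) if target is inside the file,
    or ("external", None) if it is not."""
    for start, length in regions:
        if start <= target < start + length:
            # Auto-generated labels look like "L" plus the hex address.
            return "internal", f"L{target:04X}"
    return "external", None

print(classify_reference(0x2000, REGIONS))
print(classify_reference(0x3000, REGIONS))
```

In this model an external address gets no label at all, which is why a project symbol or platform symbol has to supply one.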
For an example, consider the code at address $2000, which is
LDA $3000. We want to assign the symbol "INPUT" to address
$3000, but we can't do that by editing a label because it's not inside
the file bounds. We can open the project symbol editor from the project
properties editor, or we can use a shortcut.
With the line at $2000 selected, use Actions > Edit Operand,
or double-click on the value in the Operand column
("$3000"). This opens the
Edit Instruction Operand dialog. In the bottom left, click
Create Project Symbol. Set the Label field to
"INPUT", and
click OK, then OK in the operand editor.
The instruction at $2000 now uses the symbol "INPUT"
as its operand. If you scroll to the top of the file, you will see a
".EQ" line for the symbol.
Numeric v. Symbolic
When SourceGen sees a reference to an address, such as the operand of an
absolute JSR or LDA, it recognizes it as a numeric reference. You can
edit the instruction's operand to use a symbol instead, changing it to
a symbolic reference.
Sometimes the way these are handled can be confusing.
Let's use the branch statement at $2005 to illustrate the difference. It performs a branch to $2009, which was automatically assigned the label "L2009".
Edit the label at $2009 (double-click on "L2009" there),
and change it to "IN_RANGE". Line $2005 changes to match.
This works because SourceGen
is auto-formatting line $2005's operand based on the label it finds when it
chases the numeric reference to $2009.
The Info window shows this as Format (auto): symbol "IN_RANGE".
Use Edit > Undo to revert the label change.
Edit the instruction operand at $2005 (double-click on "L2009" there). Change the format to Symbol, and type "IN_RANGE" in the symbol box. The preview shows BCC IN_RANGE (?), which hints at a problem. Click OK.
Some things changed, but not the things we wanted. Line $2005 now
says BCC $2009, instead of BCC L2009, and the
label at $2009 has disappeared entirely. What went wrong?
The problem is that we edited the operand to use a symbol that isn't defined anywhere. Because "IN_RANGE" isn't defined, the operand was given the default format, and displayed as a hex value. The numeric reference to $2009 was replaced by the symbol, and nothing else refers to that address, so SourceGen no longer had any reason to put an auto-generated label on line $2009, which is why that disappeared.
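One way to picture why the label vanished is to treat auto-generated labels as something derived from the current set of numeric references, rather than something stored. The Python sketch below is a simplified model under that assumption; the function auto_labels and its arguments are hypothetical and do not reflect SourceGen's internals.

```python
# Simplified model: an address gets an auto-generated label only if some
# numeric reference targets it and the user has not labeled it already.
def auto_labels(numeric_targets, user_labels):
    """numeric_targets: addresses referenced numerically.
    user_labels: dict mapping address -> user-assigned label."""
    return {
        addr: f"L{addr:04X}"
        for addr in sorted(set(numeric_targets))
        if addr not in user_labels
    }

# Before the edit: the branch at $2005 is a numeric reference to $2009.
before = auto_labels([0x2009], {})
# After the edit: the operand is symbolic, so no numeric reference remains.
after = auto_labels([], {})
print(before, after)
```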
The missing symbol is called out in a message window that popped up at the bottom of the code list window. The message window only appears when there are messages to read. You can hide the window with the Hide button, and make it re-appear with the button in the bottom right of the main window that currently says 1 message.
We can resolve this issue by providing the desired symbol. As you did earlier, edit the label on line $2009 (double-click in the label column) and set it to "IN_RANGE". When you do, the operand on line $2005 is updated appropriately. If you select line $2005, the Info window shows the format as Format: symbol "IN_RANGE", indicating that the symbol was set explicitly rather than automatically.
Symbolic references always link to the symbol, even when the symbol doesn't match the numeric reference. To see this, remove the label from line $2009 by undoing that change with Edit > Undo, so the symbol is again undefined. Now set the label on the following line, $200A, to "IN_RANGE".
Line $2005 now says "BCC IN_RANGE-1". Earlier you set
the operand to be a symbolic reference to "IN_RANGE", but the symbol
doesn't quite match, so SourceGen automatically adjusted the operand by
one byte to point to the correct address. Generally speaking, SourceGen
will do its best to use the symbols that you tell it to, and will adjust the
symbolic references so that the code assembles correctly.
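The adjustment amounts to simple arithmetic: the operand is written as the symbol plus the difference between the address the instruction really targets and the address the symbol is attached to. A minimal Python sketch, assuming a hypothetical helper format_operand and a plain dictionary as the symbol table (neither is part of SourceGen):

```python
# Illustrative only: render a symbolic operand the way the text describes.
def format_operand(target_addr, symbol, symbols):
    """target_addr: address the instruction actually refers to.
    symbol: name the user typed into the operand editor.
    symbols: dict mapping defined symbol names to addresses."""
    if symbol not in symbols:
        # Undefined symbol: fall back to the default hex format.
        return f"${target_addr:04X}"
    delta = target_addr - symbols[symbol]
    if delta == 0:
        return symbol
    # "+d" always emits the sign, giving e.g. "IN_RANGE-1" or "IN_RANGE+2".
    return f"{symbol}{delta:+d}"

print(format_operand(0x2009, "IN_RANGE", {}))
print(format_operand(0x2009, "IN_RANGE", {"IN_RANGE": 0x2009}))
print(format_operand(0x2009, "IN_RANGE", {"IN_RANGE": 0x200A}))
```

The three calls correspond to the three states seen in this section: symbol undefined, symbol on line $2009, and symbol on line $200A.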
Edit the label on line $200A, and change it to "NIFTY". Note how the reference on line $2005 also changed. This is an example of a "refactoring rename": when you changed the label, SourceGen automatically found everything that referred to it and updated it. If you edit the operand on line $2005, you can confirm that the symbol has changed.
(If you want to clean this up before continuing on to the next section, put the label back on line $2009.)
Non-Unique Labels
Most assemblers have a notion of "local" labels, which go out of scope when a non-local (global) label is encountered. The actual definition of "local" is assembler-specific, but SourceGen allows you to create labels that serve the same purpose.
By default, newly-created labels have global scope and must be unique. You can change these attributes when you edit the label. Up near the top of the file, at address $1002, double-click on the label ("L1002"). Change the label to "LOOP" and click the "non-unique local" radio button. Click OK.
The label at line $1002 (and the operand on line $100B) should now be "@LOOP". By default, '@' is used to indicate non-unique labels, though you can change it to a different character in the application settings.
At address $2019, double-click to edit the label ("L2019"). If you type "MAIN" or "IS_OK" with Global selected you'll get an error, but if you type "@LOOP" it will be accepted. Note the "non-unique local" radio button is selected automatically if you start a label with '@' (or whatever character you have configured). Click OK.
You now have two lines with the same label. In some cases the assembly source generator may need to "promote" them to globals, or rename them to make them unique, depending on what your preferred assembler allows.
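The uniqueness rule used in this section can be summarized in a short Python sketch. It is a deliberately reduced model: the function label_is_accepted is hypothetical, and it ignores any other checks a real label editor would apply, such as valid characters or length.

```python
# Reduced model of the rule: global labels must be unique,
# non-unique local labels (prefixed with '@' by default) may repeat.
def label_is_accepted(label, existing_globals, local_prefix="@"):
    if label.startswith(local_prefix):
        return True  # non-unique local: duplicates are fine
    return label not in existing_globals

existing = {"MAIN", "IS_OK"}
print(label_is_accepted("MAIN", existing))
print(label_is_accepted("@LOOP", existing))
```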
Address Region Pre-Labels
When we created an address region at $2000, the LDA on line $1002
lost its label, and the STA on line $1005
gained one. The difficulty with having labels in both operands is that
both instructions refer to the byte at offset +000017, but that offset has
different addresses before and after the code is relocated, and you
can't assign multiple addresses to a single file offset.
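The arithmetic behind the conflict can be sketched in Python. Two things here are assumptions rather than facts stated above: that the file is loaded at $1000, and that the $2000 address region begins exactly at offset +000017. All names are hypothetical.

```python
# Assumptions for illustration: load address and region start offset.
FILE_LOAD_ADDR = 0x1000   # assumed address of file offset +000000
REGION_OFFSET = 0x17      # offset +000017, taken to be where the region starts
REGION_ADDR = 0x2000      # address assigned to the region

def address_before_relocation(offset):
    """Where the byte sits when the file is first loaded."""
    return FILE_LOAD_ADDR + offset

def address_after_relocation(offset):
    """Where the same byte sits once the code has been copied."""
    return REGION_ADDR + (offset - REGION_OFFSET)

print(f"${address_before_relocation(0x17):04X}")
print(f"${address_after_relocation(0x17):04X}")
```

Under these assumptions the same byte has two addresses, one for the instruction that reads it before the copy and one for the instruction that writes it, but a file offset can carry only one.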
In assembly source code, we'd solve this by putting a label right before the address change, and another one after. We can do the same thing here by setting a "pre-label" for the address region.
Select the .addrs line before line $2000, then use
Actions > Create/Edit Address Region (or double-click
on the operand of the .addrs line, i.e. "$2000";
if you double-click on the .addrs opcode you'll jump
to the matching .adrend instead).
This opens the address region editor. In the Pre-label
section near the bottom, enter "COPY_SRC", then click
OK.
This added a line before the .addrs $2000 with
the new label, and updated the LDA at line $1002 to
refer to the symbol.
Pre-labels are treated as "external" symbols, because the address they're associated with isn't actually represented by the file in its final form. As a result, you can't use non-unique local names like @LOOP.