Remember how we fetched the EPROM outputs with a 74374 at the
rising edge of CLK, to get a stable/clean signal for one CLK cycle?
When we want to control tristate_buffers, latches and bus signals
(such as address_valid and write_enable), this trick will also work.
And if there are synchronous counters in our processor (PC,
stackpointer, etc.), they can be controlled in the same way
as our 74163 microcode counter.
We just define single Bits, and combine them in one line of
assembler source code with logic_OR_operators, as we already
did in our traffic light example.
Now imagine 16 registers (maybe static RAM, or latches
encoded/controlled by 74138).
To handle them, we only have to modify what we already know.
Imagine O0..O3 as register read select, and O4..O7 as register
write select for one CLK cycle (reading and writing different
registers turned out to be useful when calculating jumps/branches).
The trick is to define 4 Bits at once.
R_R0  EQU $00000000
R_R1  EQU $00000001
R_R2  EQU $00000002
;...
R_R15 EQU $0000000f
W_R0  EQU $00000000
W_R1  EQU $00000010
W_R2  EQU $00000020
;...
W_R15 EQU $000000f0
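A minimal Python sketch of how these two 4-Bit fields pack into one microcode word, and how the hardware side would pick them apart again. Only the Bit positions (O0..O3 read select, O4..O7 write select) come from the text; the helper names `r_reg`, `w_reg` and `decode_selects` are made up for illustration.

```python
# Model of the two 4-bit register select fields.
# O0..O3 = read select, O4..O7 = write select (from the text).
# The helper names are hypothetical, not part of any assembler.

def r_reg(n: int) -> int:
    """Read select constant, same value as R_Rn."""
    assert 0 <= n <= 15
    return n

def w_reg(n: int) -> int:
    """Write select constant, same value as W_Rn."""
    assert 0 <= n <= 15
    return n << 4

def decode_selects(word: int) -> tuple:
    """Split a microcode word into (read register, write register)."""
    return word & 0x0F, (word >> 4) & 0x0F

# Read R2 and write R15 within the same CLK cycle.
word = r_reg(2) | w_reg(15)
print(f"${word:08x}", decode_selects(word))
```

Because each field owns its own four Bits, the OR never lets the read select disturb the write select, which is what makes the one-line notation safe.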
When using static RAM as registers, we could use two 74374 to
fetch the read and write select/address from the EPROMs.
The outputs of both latches are wired together to the RAM address
lines; we only have to switch the output_enables of both latches at
the right moment.
Be warned: the timing issues when implementing registers,
no matter if RAM or TTL latches in parallel, can be tough,
unless there is a chance of using synchronous counters as registers
(preferably up/down_counters with three_state outputs).
Hint: it is not necessary to disable RAM write cycles,
if we could write "unwanted" ALU_results
(after comparing values by subtraction, for instance)
into oblivion... also known as /dev/null.
Try to reserve one register for that purpose, preferably register 0.
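A small Python sketch of the /dev/null idea, under a few assumptions: the 16 registers are from the text, but the 8 Bit width, the flag names and the helper functions are only chosen for illustration. A compare is a subtraction whose result still gets written, just into the reserved register.

```python
# Register 0 is reserved as a dump for results we only wanted for the flags.
# ASSUMPTIONS: 8-bit registers, N/Z flags only, hypothetical helper names.

regs = [0] * 16
NULL_REG = 0   # reserved; nothing ever relies on its contents

def alu_sub(a: int, b: int):
    """8-bit subtract; returns the result and the N/Z flags."""
    result = (a - b) & 0xFF
    flags = {"Z": result == 0, "N": bool(result & 0x80)}
    return result, flags

def compare(reg_a: int, reg_b: int) -> dict:
    """Compare two registers. The write cycle still happens, but hits R0."""
    result, flags = alu_sub(regs[reg_a], regs[reg_b])
    regs[NULL_REG] = result   # unwanted result lands in the dump register
    return flags

regs[3], regs[4] = 0x42, 0x42
flags = compare(3, 4)
print(flags, regs[3], regs[4])
```

Both operands stay untouched, and no extra logic is needed to suppress the write.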
It's a good thing to use a notation such as R_register for read,
and W_register for write.
We may as well define registers as R_PC, W_PC, R_ACC, W_ACC
and so on, to make the code look better.
And for the ALU, we just go on with that example, and reserve
(for instance) O8..O15 from the next EPROM.
ALU_INC EQU $00000300
ALU_DEC EQU $00000400
So one line of assembler code for incrementing register 14
would look like this:
DC.L R_R14 | W_R14 | ALU_INC
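As a quick check of the arithmetic, here is a Python sketch that ORs the three constants for this line together, using the read select and write select values for register 14 that follow from the EQU pattern. Splitting the word by EPROM assumes 8 outputs per EPROM, as "O8..O15 from the next EPROM" suggests.

```python
# Constants following the EQU definitions in the text.
R_R14   = 0x0000000E   # read select, O0..O3
W_R14   = 0x000000E0   # write select, O4..O7
ALU_INC = 0x00000300   # ALU command, O8..O15

microword = R_R14 | W_R14 | ALU_INC
print(f"${microword:08x}")   # prints $000003ee

# What each EPROM has to deliver on its outputs for this step
# (ASSUMPTION: 8 outputs per EPROM).
eprom0 = microword & 0xFF          # O0..O7: both register selects
eprom1 = (microword >> 8) & 0xFF   # O8..O15: ALU command
```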
And if there is a 74374 between ALU output and address_bus
to program memory, and R14 is used as PC
(assuming equal word sizes in ALU and address_bus, which we
won't have in a 6502), you sure can imagine what to do with it.
Another latch could be between ALU output and data_bus.
To make things more complicated, we could add some other
latches and/or multiplexers for bypassing the ALU and
increasing throughput to external address_bus/data_bus.
And now a macro for incrementing one register:
INC_REG MACRO regr,regw
DC.L regr | regw | ALU_INC
ENDM
;...
INC_REG R_PC,W_PC ;increment PC
And for performing operations with the accumulator
(just another register), we could also pass the ALU
command into the macro, while using R_ACC | W_ACC
inside of it.
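A Python stand-in for such a macro could look like the sketch below. The register number of the accumulator is an assumption (the text does not say which register it is), and `acc_op` is a hypothetical name.

```python
# Python stand-in for a macro that takes the ALU command as its parameter
# and always reads and writes the accumulator.
# ASSUMPTION: the accumulator lives in register 13.
ACC_NUM = 13
R_ACC = ACC_NUM        # read select field, O0..O3
W_ACC = ACC_NUM << 4   # write select field, O4..O7

ALU_INC = 0x00000300   # from the text
ALU_DEC = 0x00000400   # from the text

def acc_op(alu_cmd: int) -> int:
    """Microcode word for 'ACC = alu_cmd(ACC)'."""
    return R_ACC | W_ACC | alu_cmd

print(f"${acc_op(ALU_DEC):08x}")
```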
In our traffic light example, we used a complete macro
for our "program".
When writing microcode for a processor, it's a good thing
to combine several macros into a "program" for one opcode.
Example:
ORG $29*$20*4
;$29 = 6502 opcode AND#,
;$20 Longwords (4 Bytes) reserved for each "program".
AND_IMM ;label that tells us what the "program" is good for.
M_IMM ;calculate address for data (would be PC+1),
;and put it on the address_bus
M_AND ;perform AND with ACC and Byte from data_bus,
;write the result back to ACC, set status register.
M_NEXT ;increment PC, put it on the address_bus,
;prepare to fetch the next command.
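The ORG arithmetic can be checked with a few lines of Python. The constants are the ones from the comments above; the idea that the opcode sits directly above a 5 Bit step counter on the EPROM address lines is an assumption derived from the $20 longwords per opcode, and the function names are made up.

```python
LONGWORDS_PER_OPCODE = 0x20   # from the text: $20 longwords per "program"
BYTES_PER_LONGWORD = 4

def microprogram_origin(opcode: int) -> int:
    """Byte offset of an opcode's 'program', as in ORG opcode*$20*4."""
    assert 0 <= opcode <= 0xFF
    return opcode * LONGWORDS_PER_OPCODE * BYTES_PER_LONGWORD

def microcode_address(opcode: int, step: int) -> int:
    """Longword index seen by the EPROMs.
    ASSUMPTION: opcode on the upper address lines, step counter on the lower 5."""
    assert 0 <= step < LONGWORDS_PER_OPCODE
    return (opcode << 5) | step

origin = microprogram_origin(0x29)   # AND #imm
print(hex(origin))                   # prints 0x1480
```

With this layout every opcode gets its own fixed slot, so a "program" that needs fewer than $20 steps simply leaves the rest of its slot unused.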
When using an EPROM based ALU, there is a trick for reducing
component count (at the cost of speed).
The basic idea is to store the ALU status outputs (temporary
status) into a 74374 after a data calculation.
The status register would be just a register like all the others,
and there would be special ALU commands that read processor
status and temporary status, throw the Bits together according
to the type of the previous data calculation (arithmetic/logic),
and write the result back into the status register.
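One way to picture such a "throw the Bits together" command is a masked merge, sketched here in Python. The flag positions are those of the 6502 status register; the mask names and `merge_flags` are hypothetical, and a real EPROM based ALU might encode this quite differently.

```python
# 6502 status bit positions (N V - B D I Z C).
FLAG_N, FLAG_V, FLAG_Z, FLAG_C = 0x80, 0x40, 0x02, 0x01

# Hypothetical merge masks: which flags a kind of operation may change.
MASK_NZ   = FLAG_N | FLAG_Z                     # logic operations
MASK_NVZC = FLAG_N | FLAG_V | FLAG_Z | FLAG_C   # arithmetic operations

def merge_flags(status: int, temp: int, mask: int) -> int:
    """Take the masked Bits from the temporary status, keep the rest."""
    return (status & ~mask & 0xFF) | (temp & mask)

status = 0x25   # C and I set, plus the unused bit 5
temp   = 0x80   # last calculation: negative result, no carry

after_logic = merge_flags(status, temp, MASK_NZ)    # C survives
after_arith = merge_flags(status, temp, MASK_NVZC)  # C is replaced
print(hex(after_logic), hex(after_arith))
```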
The problem is that we need a 74374 as "shadow register", that
gives the flags to the IRQ enable and the flag multiplexer
(for testing true/false condition in microcode, as already
discussed).
We could clock this latch with another microcode Bit... or decide
to use register 15 and decode the write address with a NAND.
Updating the flags can be done during command fetch
(or memory write back).
The macro could look like:
M_NEXT MACRO flagmod
DC.L R_R14 | W_R14 | ALU_INC | LD_MA
;increment PC, load memory_address output latch with the result
DC.L R_R15 | W_R15 | flagmod | OE_TF | LD_CMD
;modify flags (OE_TF=output enable for temporary flag register)
;opcode should be stable on databus, so we load it at the next
;rising edge on CLK with LD_CMD. (remember our traffic light "AGAIN".)
ENDM
We would just define some constants as ALU commands (F_NZ, let's suppose), and then write:
M_NEXT F_NZ ;increment PC. modify N,Z. fetch next command.
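To see what the macro expands to, here is a Python sketch that builds the two longwords. The register select and ALU_INC values follow the earlier definitions; the values for F_NZ, LD_MA, OE_TF and LD_CMD do not appear in the text and are pure assumptions, picked only so that the Bits don't collide.

```python
R_R14, W_R14 = 0x0E, 0xE0   # PC, following the R_Rn / W_Rn pattern
R_R15, W_R15 = 0x0F, 0xF0   # status register
ALU_INC = 0x0300            # from the text

# ASSUMPTIONS: none of these values are given in the text.
F_NZ   = 0x0500     # ALU command "merge N and Z into the status"
LD_MA  = 0x010000   # load memory_address output latch
OE_TF  = 0x020000   # output enable, temporary flag register
LD_CMD = 0x040000   # load command latch at the next rising edge

def m_next(flagmod: int) -> list:
    """The two microcode words that M_NEXT would emit."""
    return [
        R_R14 | W_R14 | ALU_INC | LD_MA,           # PC+1 onto the address_bus
        R_R15 | W_R15 | flagmod | OE_TF | LD_CMD,  # update flags, fetch opcode
    ]

words = m_next(F_NZ)
print([f"${w:08x}" for w in words])
```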
Now for something different:
Oh captain, my captain! Iceberg ahead!
(Responding to real life events/issues can take some time.)
When writing the opcode into the 74374 command latch at the
rising edge of CLK, and fetching the microcode Bits into other
74374 at the next rising edge, there is one problem:
The processor can not respond within the next cycle after
fetching the opcode, only in the cycle thereafter.
One trick would be to put a 74373 and a 74374 in series for
fetching the opcode: we could either read the opcode the
old fashioned way, or try to do a "prefetch".
Theoretically, using an up/down_counter as PC could be a good idea.
Thinking one step ahead when writing microcode isn't much fun.
If we don't aim for high speed (3 MHz, for instance),
we could simplify the problem by arranging program memory and
microcode EPROMs in series.
That means, the program memory address changes at the
rising edge of CLK, the opcode is fetched with a 74373
(that is transparent for the full cycle), runs through the
microcode EPROMs, and the resulting microcode Bits
are fetched at the next rising edge on CLK (and so, we
can respond to the opcode within the next cycle).
Note that the propagation delays of Program Memory
(and address decoding), Command Latch, Microcode ROM
(and perhaps Microcode Latches) will add up
in that case.
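A rough timing budget shows how quickly those delays eat up a cycle. In the Python sketch below only the 3 MHz figure is from the text; every delay value is an assumption for illustration and has to be replaced with the numbers from the datasheets of the parts actually used.

```python
CLK_HZ = 3_000_000        # the 3 MHz example from the text
cycle_ns = 1e9 / CLK_HZ   # about 333 ns per CLK cycle

# ASSUMED delays in ns, illustrative only.
delays_ns = {
    "program memory access": 120,
    "address decoding": 20,
    "command latch (74373, transparent)": 20,
    "microcode EPROM access": 150,
    "setup time of microcode latches": 20,
}

total_ns = sum(delays_ns.values())
slack_ns = cycle_ns - total_ns

print(f"cycle {cycle_ns:.1f} ns, chain {total_ns} ns, slack {slack_ns:.1f} ns")
```

With these made-up numbers the chain just fits into one cycle, which is why the series arrangement only works at modest clock rates.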
But we are going too far off topic, because hardware design
would be another story.
The things discussed in here are just a "starting point" for
experimenting with your own microcode, and quite a few
tricks went unmentioned.
For more sophisticated approaches on microprogramming,
consult/dissect the AM2910 datasheet.
Be aware that writing microcode can be more time consuming
than hardware design.
That's all for now. Good luck (You're going to need it).
(c) Dieter Mueller 2005.