home.social

#4-bit — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #4-bit, aggregated by home.social.

  1. TD4 4-bit Sound

    Over on my other blog, I spent a fair bit of time looking at the TD4 4-bit CPU. One of the things I wanted to do with my NAND Oscillators and Logic Sequencer PCB was hook up the address/select pins to something else. And with three select pins, allowing the choice between 8 notes, what better to connect it to, than a 4-bit CPU?

    https://makertube.net/w/aroDZYM2BHYpoB9QLJvHnk

    Warning! I strongly recommend using old or second hand equipment for your experiments.  I am not responsible for any damage to expensive instruments!

    If you are new to microcontrollers, see the Getting Started pages.

    Parts list

    The Setup

    The most obvious thing in my mind, is to hook up three of the four outputs to the three selection pins of the NAND sequencer, so that is what this post explores.

    The NAND PCB needs the jumpers removing, which disconnects the pot-driven oscillators. Then the three select/address lines can be connected to three of the four resistors supporting the OUTPUT LEDs of the TD4, as shown above.

    It is also possible to use the POWER header pins to power the NAND PCB too.

    Any of the variants of TD4 I’ve built could be used, but I’ve shown above where they would need to be connected on the original. In the end I actually soldered four header pins to the appropriate side of the resistors on my own PCB version of the TD4 as shown below. A bit crude, but it does the job.

    Connecting these over to the NAND sequencer and hooking up power gives me the following.

    The Code

    The simplest way to create a sequence is a set of OUT xx instructions where the least significant 3 bits (so values 0 to 7) map onto the eight possible notes played by the NAND sequencer.

    This is the simple LED OUTPUT code from Part 3 of my series, but this continually toggles between the lowest and highest notes.

    0000 OUT 0001   # 1000 1101
    0001 OUT 0111   # 1110 1101
    0010 JMP 0000   # 0000 1111
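    The binary column in these listings appears to be written least-significant bit first, which I assume matches the DIP switch order on the board. As a minimal sketch of that reading, the following packs an opcode and an immediate into a byte and prints it in the same order. The opcode values (1011 for OUT im, 1111 for JMP) are inferred from the listings and ROM tables in these posts, and the function names are hypothetical.

```python
def encode(opcode, im=0):
    """Pack a 4-bit opcode and a 4-bit immediate into one instruction byte."""
    return ((opcode & 0x0F) << 4) | (im & 0x0F)


def lsb_first(byte):
    """Render a byte with bit 0 first, grouped in fours, as in the listings."""
    bits = "".join(str((byte >> i) & 1) for i in range(8))
    return bits[:4] + " " + bits[4:]


OUT_IM = 0b1011  # inferred from the listings
JMP = 0b1111     # inferred from the listings

for opcode, im in [(OUT_IM, 0b0001), (OUT_IM, 0b0111), (JMP, 0b0000)]:
    print(lsb_first(encode(opcode, im)))
```

    Read this way, OUT 0001 is the byte 0xB1, which agrees with the first entry of the demo ROM used in Part 6.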

    A counter can be used to play all 8 notes. Note that in this code B will go from 0 to 15 (b0000 to b1111) but only the last three bits select notes. This means that the sequence will count from b000 to b111 twice for each pass through this loop with the top bit being ignored.

    0000 ADD B,0001  # 1000 1010
    0001 OUT B       # 0000 1001
    0010 JMP 0000    # 0000 1111

    There are only two speeds though, 1Hz and 10Hz so the above, which has three instructions, has a tempo of 20 bpm (1 note every 3 seconds) or 200 bpm (approx 3 notes every second). The tempo can be slowed down in steps of 1 second or 1/10 second by moving the JMP an instruction further down and back-filling with other instructions (ADD A,0 or b00000000 is a good one, and is essentially equivalent to a NOP).
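    The tempo arithmetic above can be sketched as a one-line calculation, assuming one instruction per clock tick and one note per pass through the loop. The function name is hypothetical.

```python
def tempo_bpm(clock_hz, instructions_per_note):
    """Notes per minute when each note takes a fixed number of instructions."""
    return 60.0 * clock_hz / instructions_per_note


print(tempo_bpm(1, 3))   # three-instruction loop at 1Hz
print(tempo_bpm(10, 3))  # three-instruction loop at 10Hz
print(tempo_bpm(1, 4))   # the same loop with one NOP added
```

    Each extra NOP adds one clock period per note, which is where the steps of 1 second or 1/10 second come from.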

    The following code uses the INPUT as a counter in a loop to provide a partly configurable tempo.

    0000 IN A        # 0000 0100    A = INPUT
    0001 OUT B       # 0000 1001    OUTPUT = B                  # Plays the note in B
    0010 ADD B,0001  # 1000 1010    B = B + 1
    0011 ADD A,1111  # 1111 0000    A = A + (-1)                # Loops until A = 0
    0100 JNC 0000    # 0000 0111    JUMP IF NO CARRY TO 0000    # Jump back to start for next note
    0101 ADD A,0     # 0000 0000    Optional additional NOPs
    0110 JMP 0011    # 1100 1111    JUMP to 0011                # Else keep counting
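    To see how the INPUT value stretches the time between notes, here is a rough simulation sketch of just the instructions this program uses. It is not a full TD4 model: the opcode values are inferred from the listings, and I am assuming the carry flag only reflects the instruction immediately before the JNC. The run function and the TEMPO_ROM name are hypothetical.

```python
# ROM image of the tempo program above; unused slots are ADD A,0 (NOP).
TEMPO_ROM = [0x20, 0x90, 0x51, 0x0F, 0xE0, 0x00, 0xF3] + [0x00] * 9


def run(rom, input_value, max_cycles):
    """Run a 16-byte ROM image and return (cycle, value) for every OUT."""
    a = b = pc = 0
    carry = False
    outs = []
    for cycle in range(max_cycles):
        ins = rom[pc]
        op, im = ins >> 4, ins & 0x0F
        pc = (pc + 1) & 0x0F
        new_carry = False
        if op == 0x0:    # ADD A,im
            total = a + im
            a, new_carry = total & 0x0F, total > 0x0F
        elif op == 0x2:  # IN A
            a = input_value & 0x0F
        elif op == 0x5:  # ADD B,im
            total = b + im
            b, new_carry = total & 0x0F, total > 0x0F
        elif op == 0x9:  # OUT B
            outs.append((cycle, b))
        elif op == 0xB:  # OUT im
            outs.append((cycle, im))
        elif op == 0xE:  # JNC im: jump only if the previous add did not carry
            if not carry:
                pc = im
        elif op == 0xF:  # JMP im
            pc = im
        else:
            raise ValueError("opcode not modelled in this sketch: %X" % op)
        carry = new_carry
    return outs


for n in (0, 1, 2):
    cycles = [c for c, _ in run(TEMPO_ROM, n, 40)]
    print(n, cycles[1] - cycles[0])  # clock ticks between consecutive notes
```

    Under these assumptions the gap between notes works out as 5 + 4n clock ticks for an INPUT value of n, so the DIP switch setting adds four ticks per count.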

    This is still only cycling through each note individually though, but that is kind of what an 8-step sequencer would do.

    To get more creative with the programmability of the sequencer requires a series of OUT instructions and NOPs between them, for example:

    0000 OUT 0000    # 0000 1101    OUTPUT = 0000    # Play note 000
    0001 OUT 0010    # 0100 1101    OUTPUT = 0010    # Play note 010
    0010 ADD A,0     # 0000 0000    A = A + 0        # NOP
    0011 OUT 0001    # 1000 1101    OUTPUT = 0001    # Play note 001
    0100 OUT 0100    # 0010 1101    OUTPUT = 0100    # Play note 100
    0101 ADD A,0     # 0000 0000    A = A + 0        # NOP
    0110 ADD A,0     # 0000 0000    A = A + 0        # NOP
    0111 OUT 0110    # 0110 1101    OUTPUT = 0110    # Play note 110
    1000 OUT 0101    # 1010 1101    OUTPUT = 0101    # Play note 101
    1001 OUT 0011    # 1100 1101    OUTPUT = 0011    # Play note 011
    1010 ADD A,0     # 0000 0000    A = A + 0        # NOP
    1011 ADD A,0     # 0000 0000    A = A + 0        # NOP
    1100 ADD A,0     # 0000 0000    A = A + 0        # NOP
    1101 OUT 0111    # 1110 1101    OUTPUT = 0111    # Play note 111
    1110 ADD A,0     # 0000 0000    A = A + 0        # NOP
    1111 ADD A,0     # 0000 0000    A = A + 0        # NOP

    This last programme is the one running in the video at the start of this post.
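    As a small sketch of how the NOPs shape the rhythm, the following walks the 16-byte image of that programme and counts how many clock ticks each note is held for, assuming a note sounds from its OUT until the next OUT. The byte values use the OUT im opcode 1011 inferred from the listings; the names are hypothetical.

```python
MELODY_ROM = [
    0xB0, 0xB2, 0x00, 0xB1,
    0xB4, 0x00, 0x00, 0xB6,
    0xB5, 0xB3, 0x00, 0x00,
    0x00, 0xB7, 0x00, 0x00,
]


def note_durations(rom):
    """Return (note, ticks) pairs; NOPs extend the note that precedes them."""
    durations = []
    for ins in rom:
        op, im = ins >> 4, ins & 0x0F
        if op == 0xB:                    # OUT im starts a new note
            durations.append([im & 0b111, 1])
        elif durations:                  # anything else holds the current note
            durations[-1][1] += 1
    return [tuple(d) for d in durations]


for note, ticks in note_durations(MELODY_ROM):
    print(format(note, "03b"), ticks)
```

    The tick counts add up to 16, one full pass through the address space, after which the program counter wraps and the pattern repeats.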

    Closing Thoughts

    I appear to have made a sound card for a 4-bit CPU 🙂

    One thing I am quite keen to do is connect up the sequencer’s select pins to the TD4’s address lines, as I’d like to be able to have some incidental (accidental?) music that appears as a result of the CPU just running any other normal programme.

    To do this I’d need to either hook into the output of the PC register or the input to the HC154 ROM decoder.

    In fact, it would be really interesting to be able to hook up any sets of four signals – so the INPUT selector, or even the control decoding logic – just to see what it sounds like as the CPU is running normal code. That might require a special build of the CPU though.

    I also have an address line spare of course, so it would also be interesting to use that to select between two NAND sequencers to give me a 16 step sequence.

    Kevin

    #4Bit #74hc4051 #nand #sequencer #td4

  2. TD4 4-bit DIY CPU – Part 7

    Once the idea of creating an Arduino “direct to ROM” assembler was floated in Part 6, I had to just do it, so this post is a little diversion from the hardware discussion into how that could work.

    • Part 1 – Introduction, Discussion and Analysis
    • Part 2 – Building and Hardware
    • Part 3 – Programming and Simple Programs
    • Part 4 – Some hardware enhancements
    • Part 5 – My own PCB version
    • Part 6 – Replacing the ROM with a microcontroller
    • Part 7 – Creating an Arduino “assembler” for the TD4

    Basic Concepts

    This relies on using an Arduino as the ROM as described in Part 6, but the Arduino now has the option to change the ROM contents independently of the TD4 itself.

    The Arduino sketch will do the following:

    • Run the TD4 ROM routine off a timer interrupt so that it is always running and responsive.
    • Take input over the Arduino serial port to allow basic control, e.g. list, clear, etc.
    • Allow the direct input of assembler instructions, such as MOVE A,B or OUT B and so on.
    • Provide a means of selecting which line of the program to change.

    The code will thus have a number of key sections:

    • The TD4 ROM routine.
    • Some kind of serial-port command-line interpreter.
    • Handler routines for all the commands.
    • An assembler.
    • A disassembler.

    The TD4 ROM routine has already been fully described in Part 6. The only difference is that the scanning routine will be driven from a 1 ms timer using the TimerOne library.

    As I want to still support a built-in demo, I now have the concept of ROM being the demo code and RAM being the “live” code to pass onto the TD4. The Arduino will initialise the RAM on startup from the ROM.

    As far as the TD4 is concerned of course, this is all still ROM.

    Command Line Interpreter

    The standard Arduino Serial routines will be used to scan for input via the serial port. It will support a line-oriented input as follows:

    bool cmdRunner (void) {
      while (Serial.available()) {
        char c = Serial.read();
        if (c == '\n') {
          // End of line: save the completed command and restart the buffer
          strcpy(cmdSaved, cmdInput);
          cmdIdx = 0;
          return true;
        }
        else if (cmdIdx < CMD_BUFFER-1) {
          // Append the character, keeping the buffer NUL-terminated
          cmdInput[cmdIdx++] = c;
          cmdInput[cmdIdx] = '\0';
        }
      }
      return false;
    }

    This will keep adding any received characters to the cmdInput buffer until a newline is received, at which point the command is saved in cmdSaved and the routine will return true indicating a full line is ready to be processed.

    Once a complete line is received, then a processing function will parse it.

    Key to the processing of commands is a command table that stores the text to match and the handler function to call on finding a valid command. There is an additional parameter that will be passed into the handler function to allow the same handler function to support several commands. This will be used in the assembler itself later.

    struct cmd_t {
      char cmd[CMD_BUFFER+1];
      hdlr_t pFn;
      uint8_t idx;
    };

    const cmd_t PROGMEM cmdTable[NUM_CMDS] = {
      {"H", hdlrHelp, 0},
      {"L", hdlrList, 0},
      {"G", hdlrGoto, 0},
    };

    The algorithm for parsing commands is as follows:

    cmdProcess:
      Look for a space or newline
      IF found a space THEN
        This is the start of the parameter

      Look for the command in the command table
      IF command found THEN
        Call the handler function with the parameters

    The implementation is a bit complex, as it uses string pointers and has to chop and parse strings as it goes. It is also dealing with the command table in the Arduino’s PROGMEM which is an additional complication too.
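    Stripped of the pointer handling and PROGMEM access, the table-driven parsing can be sketched in a few lines of Python. This is only an illustration of the algorithm, not the Arduino code: the handler bodies and return values are made up for the example.

```python
def hdlr_help(idx, param):
    return "H: Help"


def hdlr_list(idx, param):
    return "RAM Disassembly"


def hdlr_goto(idx, param):
    return "Goto line {}".format(int(param))


# (text to match, handler function, extra index passed to the handler)
CMD_TABLE = [
    ("H", hdlr_help, 0),
    ("L", hdlr_list, 0),
    ("G", hdlr_goto, 0),
]


def cmd_process(line):
    """Split the line at the first space, then look the command up in the table."""
    name, _, param = line.strip().partition(" ")
    for cmd, handler, idx in CMD_TABLE:
        if cmd == name:
            return handler(idx, param.strip())
    return None  # unknown command


print(cmd_process("G 13"))
print(cmd_process("H"))
```

    Because the handler receives the table's index value, a single handler can serve many table rows, which is what the assembler relies on later.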

    In order to be able to use the same command line interpreter for the input of assembler instructions, I’ve had to simplify the syntax. There are no spaces in opcodes and there has to be a space between the opcode and immediate value if used.

    Here are some examples:

    IN A      -> INA
    MOVE A,B  -> MOVAB
    OUT im    -> OUT im
    JNC im    -> JNC im
    ADD A,im  -> ADDA im

    Handler Routines

    All handler routines have the following prototype:

    typedef void (*hdlr_t)(int idx, char *param);

    void hdlrHelp(int idx, char *pParam) {
      Serial.print("\nHelp\n----\n");
      Serial.println("H: Help");
    }

    The idx parameter is the number in the last field of the command table. pParam will be a pointer to the parameter string for the command (if used).

    As we’re dealing with strings all the time, there are a number of helper functions to do things like convert strings to numbers as well as others to print numbers in various formats.

    Number formats are assumed to be as follows:

    0..9    - decimal digits
    0x0..F  - hex digits
    b0..1   - binary digits

    The code provides the following:

    • str2num – the basic string parsing routine to recognise all three number formats as strings.
    • printbin – print a number in b0..1 format.
    • printhex – print a number in 0x0..F format, allowing for a possible leading zero if required.
    • printins – print an instruction in textual format.
    • printop – print an instruction in binary and hex opcode format.
    • printline – print a line number in a consistent binary and hex format.
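    As a rough Python sketch of the three number formats, the following shows one way the parsing and binary printing could behave. The real helpers are C routines whose exact behaviour is not shown in this post, so the function names and details here are assumptions.

```python
def str2num(text):
    """Parse decimal ("13"), hex ("0xD") or binary ("b1101") strings."""
    s = text.strip()
    if s[:2].lower() == "0x":
        return int(s[2:], 16)
    if s[:1].lower() == "b":
        return int(s[1:], 2)
    return int(s, 10)


def format_bin(value, digits):
    """Return the binary digits of value, zero padded, without the 'b' prefix."""
    return format(value, "0{}b".format(digits))


print(str2num("13"), str2num("0xD"), str2num("b1101"))
print("b" + format_bin(5, 4))
```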

    The code supports the following commands, so each has its own handler function:

    • H – help – show the list of commands.
    • L – list – show the disassembly of the whole working memory (RAM).
    • G – goto – set the working line number.
    • C – clear – reset all working memory (RAM) to zeros.
    • R – restore – restore the working memory (RAM) to the pre-built demo code (ROM).
    • O – opcodes – list the supported opcodes.

    Assembler

    As already mentioned, I’m using the same command line interpreter code to create the assembler. To do this, each opcode has an entry in the command table:

    const cmd_t PROGMEM cmdTable[NUM_CMDS] = {
      // Assembly commands - must be first
      {"ADDA", hdlrAsm, 0},
      {"MOVAB", hdlrAsm, 1},
      {"INA", hdlrAsm, 2},
      {"MOVA", hdlrAsm, 3},
      {"MOVBA", hdlrAsm, 4},
      {"ADDB", hdlrAsm, 5},
      {"INB", hdlrAsm, 6},
      {"MOVB", hdlrAsm, 7},
      {"OUTB", hdlrAsm, 8},
      {"OUT2B", hdlrAsm, 9},
      {"OUT", hdlrAsm, 10},
      {"OUT2", hdlrAsm, 11},
      {"JNCB", hdlrAsm, 12},
      {"JMPB", hdlrAsm, 13},
      {"JNC", hdlrAsm, 14},
      {"JMP", hdlrAsm, 15},

      // Other commands
      {"H", hdlrHelp, 0},
      {"L", hdlrList, 0},
      {"G", hdlrGoto, 0},
      {"C", hdlrClear, 0},
      {"R", hdlrRestore, 0},
      {"O", hdlrOpcodes, 0},
    };

    The order corresponds to the opcode command value, as does the parameter. As these are at the start of the table, I can assume that the position in the table is the same as the command value. This does mean that I also need to account for the duplicated instructions even if I don’t need to use them.

    I’m making the following design decisions:

    • There is the concept of a “current line” which can be set with the G (goto) command.
    • Entering a valid opcode automatically moves the current line on by 1.
    • No line information is entered as part of the opcode.

    The main logic of the assembler handler is as follows:

    Assembler:
      Command value is the provided index parameter
      Determine the im value from the provided string parameter
      RAM[line] = (cmd << 4) + im
      Increment current line
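    The same logic can be sketched in Python as follows. The mnemonic order is taken from the command table above; the class itself, the wrap-around of the current line after 15, and the use of Python-style number literals for the immediate are assumptions made for the illustration.

```python
MNEMONICS = [
    "ADDA", "MOVAB", "INA", "MOVA", "MOVBA", "ADDB", "INB", "MOVB",
    "OUTB", "OUT2B", "OUT", "OUT2", "JNCB", "JMPB", "JNC", "JMP",
]


class Assembler:
    def __init__(self):
        self.ram = [0] * 16
        self.line = 0

    def goto(self, line):
        self.line = line & 0x0F

    def assemble(self, text):
        """Store one instruction at the current line, then move on by one."""
        name, _, param = text.strip().partition(" ")
        cmd = MNEMONICS.index(name)            # table position == opcode value
        im = int(param, 0) & 0x0F if param.strip() else 0
        self.ram[self.line] = (cmd << 4) | im  # shift first, then merge im
        self.line = (self.line + 1) & 0x0F     # assumed to wrap at 16


asm = Assembler()
asm.goto(13)
asm.assemble("OUTB")
print(hex(asm.ram[13]), asm.line)
```

    Note the explicit brackets around the shift: in C-like languages addition binds tighter than the shift operator, so the shift has to be grouped before the immediate is merged in.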

    Disassembler

    Disassembly is really largely a look-up table matching opcode command values to text. This is all hidden away behind the two print routines printins() and printop().

    void printins (uint8_t ins) {
      uint8_t cmd = ins >> 4;
      uint8_t im = ins & 0x0F;

      Serial.print(FSH(cmdTable[cmd].cmd));
      if (HASIM(cmd)) {
        Serial.print(" b");
        printbin(im,4);
      } else {
        Serial.print(" ");
      }
    }

    void printop (uint8_t op) {
      uint8_t cmd = op >> 4;
      uint8_t im = op & 0x0F;

      Serial.print("b");
      printbin(cmd,4);
      Serial.print(" ");
      printbin(im,4);
      Serial.print("\t0x");
      printhex(op,2);
    }

    The main complexity is pulling the strings out of the command table. I’ve had to include a macro to provide access to the strings from the Arduino’s PROGMEM:

    #define FSH(x) ((const __FlashStringHelper *)x)

    This feels like a bit of a hack, but apparently this is how it should be done for the kind of thing I need to do!

    There is another macro here that needs explaining:

    #define HASIM(op) (op==0||op==3||op==5||op==7||op>9)

    This is a set of conditions that if true means that the command supports an immediate value. This is used in a few places to know how to parse the commands.

    Whilst in principle all commands could use the immediate value, the “official” statement of how they work assumes im=0 in many cases. So, for example, OUT B does not require an immediate value, but if one is provided then OUT B becomes OUT B+im.

    I’m not really supporting that with this code at the moment.
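    Putting the look-up table and the immediate-value test together, a disassembler line can be sketched in Python like this. The mnemonic order and the has_im conditions mirror the command table and HASIM macro above; the exact spacing and tab placement in the real serial output are assumptions.

```python
MNEMONICS = [
    "ADDA", "MOVAB", "INA", "MOVA", "MOVBA", "ADDB", "INB", "MOVB",
    "OUTB", "OUT2B", "OUT", "OUT2", "JNCB", "JMPB", "JNC", "JMP",
]


def has_im(cmd):
    """Same conditions as the HASIM macro."""
    return cmd in (0, 3, 5, 7) or cmd > 9


def disassemble(ins):
    """Return 'MNEMONIC [bIMMM]<tab>bCCCC IIII<tab>0xHH' for one byte."""
    cmd, im = ins >> 4, ins & 0x0F
    text = MNEMONICS[cmd]
    if has_im(cmd):
        text += " b" + format(im, "04b")
    opcode = "b{:04b} {:04b}".format(cmd, im)
    return "{}\t{}\t0x{:02X}".format(text, opcode, ins)


for byte in (0xE8, 0x51, 0x80):
    print(disassemble(byte))
```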

    Putting it all together

    Here is a serial output log of a session using the assembler.

    > H
    Help
    ----
    H: Help
    L: List
    G: Goto
    C: Clear
    R: Restore
    O: Opcodes
    OpCode
    OpCode im

    Current line: b0000 [0]

    > L
    RAM Disassembly

    b0000 [0]: JNC b1000     b1110 1000    0xE8
    b0001 [1]: JMP b0011     b1111 0011    0xF3
    b0010 [2]: OUT b0010     b1010 0010    0xA2
    b0011 [3]: ADDB b0001    b0101 0001    0x51
    b0100 [4]: OUT b0100     b1010 0100    0xA4
    b0101 [5]: ADDA b0001    b0000 0001    0x01
    b0110 [6]: OUT b1000     b1010 1000    0xA8
    b0111 [7]: ADDB b0001    b0101 0001    0x51
    b1000 [8]: OUT b0100     b1010 0100    0xA4
    b1001 [9]: ADDA b0001    b0000 0001    0x01
    b1010 [A]: OUT b0010     b1010 0010    0xA2
    b1011 [B]: ADDB b0001    b0101 0001    0x51
    b1100 [C]: JMP b0000     b1111 0000    0xF0
    b1101 [D]: ADDA b0000    b0000 0000    0x00
    b1110 [E]: ADDA b0000    b0000 0000    0x00
    b1111 [F]: ADDA b0000    b0000 0000    0x00
    Current line: b0010 [2]

    > G 13
    Goto line 13
    Current line: b1101 [D]

    > OUTB
    Assemble:
    b1101 [D] OUTB           b1000 0000    0x80
    Current line: b1110 [E]

    > L
    RAM Disassembly

    b0000 [0]: JNC b1000     b1110 1000    0xE8
    b0001 [1]: JMP b0011     b1111 0011    0xF3
    b0010 [2]: OUT b0010     b1010 0010    0xA2
    b0011 [3]: ADDB b0001    b0101 0001    0x51
    b0100 [4]: OUT b0100     b1010 0100    0xA4
    b0101 [5]: ADDA b0001    b0000 0001    0x01
    b0110 [6]: OUT b1000     b1010 1000    0xA8
    b0111 [7]: ADDB b0001    b0101 0001    0x51
    b1000 [8]: OUT b0100     b1010 0100    0xA4
    b1001 [9]: ADDA b0001    b0000 0001    0x01
    b1010 [A]: OUT b0010     b1010 0010    0xA2
    b1011 [B]: ADDB b0001    b0101 0001    0x51
    b1100 [C]: JMP b0000     b1111 0000    0xF0
    b1101 [D]: OUTB          b1000 0000    0x80
    b1110 [E]: ADDA b0000    b0000 0000    0x00
    b1111 [F]: ADDA b0000    b0000 0000    0x00
    Current line: b1110 [E]

    > O
    Supported OpCodes:
    b0000 data    ADDA im
    b0001 0000    MOVAB
    b0010 0000    INA
    b0011 data    MOVA im
    b0100 0000    MOVBA
    b0101 data    ADDB im
    b0110 0000    INB
    b0111 data    MOVB im
    b1000 0000    OUTB
    b1001 0000    OUT2B
    b1010 data    OUT im
    b1011 data    OUT2 im
    b1100 data    JNCB im
    b1101 data    JMPB im
    b1110 data    JNC im
    b1111 data    JMP im

    > C
    Clearing RAM ... Done

    Find the code on GitHub here.

    Conclusion

    The basics for this actually came together fairly quickly, but I must admit to spending a fair bit of time fiddling about with output formats and refactoring various bits of code to try to give some consistency in terms of when newlines are applied, what is shown in binary, what in hex, and so on.

    I can’t guarantee everything has been caught, but I’ve typed in all the programs (using the newer, limited syntax) from Part 3 and they all seem to work.

    It would be nice to be able to automatically reset the TD4 from the Arduino, but for now, pressing the button when required is fine.

    For the most part, unless there is a loop to get caught in, the code will cycle back to the start anyway.

    In terms of possible updates and enhancements, there are a few on my mind:

    • It would be nice to support the undocumented use of immediate values somehow.
    • It might be nice to have a way to save/load the code. It only needs to be a string of 16 two-digit hex codes (one byte per instruction).
    • It might be nice to have several demo programs to choose from.
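    As one possible shape for the save/load idea, here is a hedged Python sketch that turns the 16 instruction bytes into a 32-character hex string and back. It is not part of the existing sketch; the function names and the format are purely illustrative. The example bytes are the demo program from the listing above.

```python
DEMO = [
    0xE8, 0xF3, 0xA2, 0x51,
    0xA4, 0x01, 0xA8, 0x51,
    0xA4, 0x01, 0xA2, 0x51,
    0xF0, 0x00, 0x00, 0x00,
]


def dump_program(ram):
    """Encode 16 instruction bytes as a string of two-digit hex codes."""
    return "".join("{:02X}".format(b & 0xFF) for b in ram)


def load_program(text):
    """Decode the hex string back into a list of 16 instruction bytes."""
    data = bytes.fromhex(text.strip())
    if len(data) != 16:
        raise ValueError("expected 16 bytes, got {}".format(len(data)))
    return list(data)


saved = dump_program(DEMO)
print(saved)
print(load_program(saved) == DEMO)
```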

    If I expand the instruction set and architecture, then I’ll have to think again about chunks of this code, but for now, it seems to work pretty well.

    Kevin

    #4bit #arduinoUno #define #td4

  3. TD4 4-bit DIY CPU – Part 6

    Having now successfully built my own version of the TD4 4-bit CPU in Part 5, I’m now chewing over some of the ways I’d like to try to expand it.

    • Part 1 – Introduction, Discussion and Analysis
    • Part 2 – Building and Hardware
    • Part 3 – Programming and Simple Programs
    • Part 4 – Some hardware enhancements
    • Part 5 – My own PCB version
    • Part 6 – Replacing the ROM with a microcontroller
    • Part 7 – Creating an Arduino “assembler” for the TD4
    • Part 8 – Extending the address space to 5-bits and an Arduino ROM PCB

    I already have a list of other extended projects at the end of Part 4, so I might be drawing on some of them for inspiration moving forward. Many of these are very similar projects, but with a completely different architecture. But really at this stage rather than build a different, more capable, 4-bit CPU from someone else’s design, I’m interested in seeing how far the TD4 design can go. So, ultimately, like all my projects, the fun here is in the reinventing and learning on the way.

    One of the questions I have is can I replace the DIP switches with something that can provide the data in a better way? This would be particularly critical if I expand the address space in the future. A ROM is the obvious option, but something more dynamic might be an interesting experiment too.

    This post looks at options for replacing the DIP switches with microcontrollers.

    Now I feel like I really ought to state right up front that this is a pretty ludicrous thing to do.

    At the more charitable end of the endeavor I’m using a 16MHz 8-bit AVR microcontroller with 2kB of RAM to serve up 16 8-bit values to a 10Hz 4-bit CPU.

    At the most extreme end I’m using a 125MHz, dual-core, 32-bit ARM Cortex M0+ CPU with 264 kB of RAM running an entire interpreted programming environment requiring (probably) millions of lines of low-level code to implement it, to do the same thing.

    So why bother? Well – why not?

    TD4 without the ROM

    To interface to a microcontroller, I’m after two things:

    • Ability to read the 4 address lines.
    • Ability to drive the 8 data lines.

    The best place to get at these signals is on the interface to the ROM itself – the 74HC540 octal line driver, and 74HC154 4-to-16 line decoder.

    Conveniently, these signals can be broken out quite easily on my board as shown below.

    The pink shaded area shows which components are needed for a ROM-less build. The two yellow highlights show where headers should be soldered to permit access to the address lines (top) and data lines (bottom).

    In this build, the following components are omitted from the full board:

    • 74HC154
    • 74HC540
    • 16x 8-way DIP switches
    • 128x small signal diodes
    • 8x 10k pull-up resistors

    I’ve used 6-way and 10-way pin header sockets to allow me to patch in a microcontroller. This allows for each header to conveniently include 5V and GND too. I’ve included the USB socket for power to the PCB but expect I’ll probably power the board via these 5V and GND links from the microcontroller.

    Using Arduino

    The natural choice here is to use one of the older Arduino boards, as these are all 5V IO which makes interfacing with the 4-bit CPU fairly straight forward.

    Using Arduino direct PORTIO should also make it pretty trivial to read address lines and write the data. I’ve configured the connections as follows:

    TD4 Signal   Arduino GPIO   Arduino PORTIO
    A0           A0             PORTC:0
    A1           A1             PORTC:1
    A2           A2             PORTC:2
    A3           A3             PORTC:3
    D0           D8             PORTB:0
    D1           D9             PORTB:1
    D2           D10            PORTB:2
    D3           D11            PORTB:3
    D4           D4             PORTD:4
    D5           D5             PORTD:5
    D6           D6             PORTD:6
    D7           D7             PORTD:7

    I’m avoiding D0/D1 (PORTD[0:1]) and D13 as they all have other hardware attached (serial port and LED in this case).

    Accessing the data corresponding to any specific address is as simple as follows:

    uint8_t ROM[16];

    loop:
      uint8_t addr = PINC & 0x0F;
      PORTB = (PORTB & ~(0x0F)) | (ROM[addr] & 0x0F);
      PORTD = (PORTD & ~(0xF0)) | (ROM[addr] & 0xF0);

    The code could be simplified if I didn’t mind trashing whatever is configured for the other GPIO pins via the PORTIO, but it is good practice to preserve those values when only writing to a subset of the IO ports.
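    The masking can be illustrated away from the hardware with a short Python sketch, assuming the intent is for each port to keep its own untouched bits: the low nibble of the ROM byte goes to the low four bits of PORTB and the high nibble to the high four bits of PORTD. The function name and example port values are hypothetical.

```python
def split_rom_byte(port_b, port_d, value):
    """Return new (PORTB, PORTD) values with only the data-line bits changed."""
    new_b = (port_b & 0xF0) | (value & 0x0F)  # keep PORTB bits 4-7
    new_d = (port_d & 0x0F) | (value & 0xF0)  # keep PORTD bits 0-3
    return new_b, new_d


# PORTB bit 5 (D13) and PORTD bits 0-1 (serial) are set beforehand
new_b, new_d = split_rom_byte(0x20, 0x03, 0xB1)
print(hex(new_b), hex(new_d))
```

    The bits outside each mask come through unchanged, which is the point of the read-modify-write pattern.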

    In the final code below, I’ve included a toggle for A5 which allows me to do some timing measurements too.

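    uint8_t ROM[16] = {
      0xB1, 0x01, 0xB2, 0x51,
      0xB4, 0x01, 0xB8, 0x51,
      0xB4, 0x01, 0xB2, 0x51,
      0xF0, 0x00, 0x00, 0x00
    };

    void setup() {
      DDRB |= 0x0F;  // D8-D11 as outputs (data lines D0-D3)
      DDRD |= 0xF0;  // D4-D7 as outputs (data lines D4-D7)
      DDRC |= 0x20;  // A5 as output (timing toggle)
    }

    int toggle;
    void loop() {
      // Toggle A5 on every pass so the scan rate can be measured
      if (toggle == 0) {
        toggle = 1;
        PORTC |= 0x20;
      } else {
        toggle = 0;
        PORTC &= ~(0x20);
      }

      // Read the address, then update only the data-line bits of each port
      uint8_t addr = PINC & 0x0F;
      PORTB = (PORTB & ~(0x0F)) | (ROM[addr] & 0x0F);
      PORTD = (PORTD & ~(0xF0)) | (ROM[addr] & 0xF0);
    }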

    Running the code in a loop like this gives a scan frequency of around 500kHz and a response time of something like 2-3 µs for each read. That seems pretty responsive and I’m sure will be fine for a 10Hz CPU. And it is – it works great!

    Using Circuitpython

    One thing that would be really nice is a workflow that allows more of a “direct save to the CPU” approach to programming it. One option is to use a more modern microcontroller that supports a filesystem.

    The obvious choice here is a 32-bit microcontroller that supports Circuitpython. But will IO in Circuitpython be fast enough to respond to the CPU? There is one obvious way to find out – give it a try.

    There is another complication too – most Circuitpython boards run at 3.3V not 5V so that needs to be addressed too.

    Level Shifting

    I’m going to use a 74LVC245. The Adafruit product page puts it best:

    “essentially: connect VCC to your logic level you want to convert to (say 3.3V), Ground connects to Ground. Wire OE (output enable) to ground to enable the device and DIR (direction) to VCC. Then digital logic on the A pins up to 5V will appear on the B pins shifted down to the VCC logic.”

    This is an 8-way bi-directional bus transceiver and should be powered by 3V3, then the direction pin will determine the direction of the conversion as shown below.

    Two devices will be required. The address lines will need a 5V to 3V3 conversion; the data lines will need 3V3 to 5V.

    Here is how I’ve wired these up for a Raspberry Pi Pico:

    The Pico is connected as follows:

    • INPUT: GPIO 10-13 = A0-A3
    • OUTPUT: GPIO 2-9 = D7-D0 (note the ordering!)

    CircuitPython ROM

    The basic algorithm will be as follows:

    ROM = [16 command byte values]

    LOOP:
    Read four address lines
    Set data lines from ROM[address]

    For performance reasons it would be best to optimise both the reading of the address lines and the writing of the data lines, ideally into a single access. But as this is for a CPU that runs at a maximum of 10Hz, for now I’m just going with the simple approach to see how it goes.

    import board
    import digitalio

    ROM = [
    0xB1, 0x01, 0xB2, 0x51,
    0xB4, 0x01, 0xB8, 0x51,
    0xB4, 0x01, 0xB2, 0x51,
    0xF0, 0x00, 0x00, 0x00
    ]

    Tpin = digitalio.DigitalInOut(board.GP21)
    Tpin.direction = digitalio.Direction.OUTPUT

    A0pin = digitalio.DigitalInOut(board.GP10)
    A1pin = digitalio.DigitalInOut(board.GP11)
    A2pin = digitalio.DigitalInOut(board.GP12)
    A3pin = digitalio.DigitalInOut(board.GP13)

    D0pin = digitalio.DigitalInOut(board.GP2)
    D0pin.direction = digitalio.Direction.OUTPUT
    D1pin = digitalio.DigitalInOut(board.GP3)
    D1pin.direction = digitalio.Direction.OUTPUT
    D2pin = digitalio.DigitalInOut(board.GP4)
    D2pin.direction = digitalio.Direction.OUTPUT
    D3pin = digitalio.DigitalInOut(board.GP5)
    D3pin.direction = digitalio.Direction.OUTPUT
    D4pin = digitalio.DigitalInOut(board.GP6)
    D4pin.direction = digitalio.Direction.OUTPUT
    D5pin = digitalio.DigitalInOut(board.GP7)
    D5pin.direction = digitalio.Direction.OUTPUT
    D6pin = digitalio.DigitalInOut(board.GP8)
    D6pin.direction = digitalio.Direction.OUTPUT
    D7pin = digitalio.DigitalInOut(board.GP9)
    D7pin.direction = digitalio.Direction.OUTPUT

    def doOutput (data):
        if (data & 0x01):
            D0pin.value = True
        else:
            D0pin.value = False

        if (data & 0x02):
            D1pin.value = True
        else:
            D1pin.value = False

        if (data & 0x04):
            D2pin.value = True
        else:
            D2pin.value = False

        if (data & 0x08):
            D3pin.value = True
        else:
            D3pin.value = False

        if (data & 0x10):
            D4pin.value = True
        else:
            D4pin.value = False

        if (data & 0x20):
            D5pin.value = True
        else:
            D5pin.value = False

        if (data & 0x40):
            D6pin.value = True
        else:
            D6pin.value = False

        if (data & 0x80):
            D7pin.value = True
        else:
            D7pin.value = False

    while True:
        Tpin.value = True
        addr = 0
        if (A0pin.value == True):
            addr = addr + 1
        if (A1pin.value == True):
            addr = addr + 2
        if (A2pin.value == True):
            addr = addr + 4
        if (A3pin.value == True):
            addr = addr + 8

        Tpin.value = False
        doOutput(ROM[addr])

    I’ve included a timing pin to GPIO21 so I can see how long it takes to access the IO.

    It turns out that it takes something of the order of 50-60 µs to read the four address lines and something in the region of 70-80 µs to write out the 8 data lines. The above simple Circuitpython code to do this is running with a frequency of around 7kHz.
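    As a quick sanity check on those figures, the loop frequency is roughly the reciprocal of the read time plus the write time. The following sketch does the arithmetic; the function names are hypothetical and loop overhead is ignored.

```python
def loop_frequency_hz(read_us, write_us):
    """Scan frequency if one pass is just the read time plus the write time."""
    return 1_000_000 / (read_us + write_us)


def scans_per_cpu_clock(loop_hz, cpu_hz=10):
    """How many times the ROM loop runs during one TD4 clock period."""
    return loop_hz / cpu_hz


slow = loop_frequency_hz(60, 80)  # slowest measured read and write
fast = loop_frequency_hz(50, 70)  # fastest measured read and write
print(round(slow), round(fast))
print(round(scans_per_cpu_clock(slow)))
```

    That gives a range of roughly 7 to 8kHz, in line with the measured figure, and several hundred scans of the address lines for every tick of the 10Hz clock.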

    Now at this point I ought to be reading through the datasheets for the ICs used in the CPU to check response times and timing tolerances to see if this is ok. But I didn’t bother with any of that as it all appears to work!

    Conclusion

    The Circuitpython is obviously a lot slower than the Arduino running optimised PORTIO code, even though the Circuitpython is running on a 125MHz processor compared to the Arduino’s 16MHz. Of course, if performance was critical then switching to direct GPIO access in C on the Pico would be a lot faster again. Even just having a way to do a single block-access of GPIO would probably make quite a difference.

    But for this application, either seems to work absolutely fine as it is.

    The ability to quickly edit the ROM contents is pretty useful with the Circuitpython. But I am now wondering how difficult it would be to have some kind of uploader to the Arduino over the serial port. There are only 16 bytes to transfer after all.

    In fact it might even be possible to create a simple interactive assembler that allows code to be typed in over the serial port using proper word-based op-codes (like ADD, IN, OUT, etc). At the very least a simple serial port interface to type in numeric values would be relatively straight forward I think. It might also be possible to allow the microcontroller to reset the CPU too.

    I’m not sure the added complications of level shifting, etc, make it worth carrying on with a Pico version at this stage, so I think improving the Arduino is probably the way to go for now.

    Kevin

    #4bit #arduinoUno #circuitpython #PORTIO #raspberryPiPico #TD4

  4. TD4 4-bit DIY CPU – Part 5

    As a prelude to expanding the TD4 in my own way, I thought I ought to at least attempt to reproduce the circuit myself. But if I’m going to design my own version of the TD4 PCB and build it, I may as well add a few little extras.

    • Part 1 – Introduction, Discussion and Analysis
    • Part 2 – Building and Hardware
    • Part 3 – Programming and Simple Programs
    • Part 4 – Some hardware enhancements
    • Part 5 – My own PCB version

    PCB Design

    I’ve built the schematic up from the published schematics to be found in: https://github.com/wuxx/TD4-4BIT-CPU

    But I wanted to ensure I had the following additions:

    • LED outputs for the two registers.
    • Some kind of LED output to show which instruction is being worked on.
    • Minimal number of surface mount components.
    • Option to experiment with replacing the diodes with LEDs.

    With all that in mind, I’ve ended up with the following schematic, which I’ve spread over four sheets.

    The core CPU:

    Power, clock and reset:

    IO and User Interface:

    ROM:

    I’ve managed to get this all into a 180×110 mm board.

    Design choices/notes:

    • I’ve largely followed the same layout as the cheap TD4 kit I bought.
    • I’ve added the register LEDs to the top of the board.
    • And moved the INPUT and OUTPUT up there to match.
    • I’ve added LEDs next to each bank of switches to show which instruction is being worked on.
    • All LEDs are 3x2mm rectangle LEDs, including the OUTPUT LEDs (which were surface mount on the original).
    • I’ve kept a micro USB socket, which unfortunately is still surface mount. I’m hoping it is the right footprint for a commonly available connector. But I’ve moved it to the bottom of the board.
    • I’ve made sure all IC labels are included on the silkscreen along with all resistor values.

    Full BOM

    ICs:

    • 1x 74HC10 Triple 3-input NAND
    • 1x 74HC14 Hex Schmitt trigger inverters
    • 1x 74HC32 Quad 2-input OR
    • 2x 74HC153 Dual 4-to-1 selector/multiplexer
    • 1x 74HC154 4-to-16 decoder/demultiplexer
    • 4x 74HC161 4-bit binary counter
    • 1x 74HC283 4-bit binary full adder
    • 1x 74HC540 Octal inverting buffer/line driver

    Semiconductors:

    • 128x 1N4148 or 1N914 small signal diode
    • 28x 3x2mm rectangular LED

    Passive components:

    • Resistors: 2x 100R; 35x 1K; 1x 3K3; 9x 10K; 1x 33K; 3x 100K
    • Capacitors: 3x 10uF electrolytic

    Other components:

    • 2x SPDT slider switches (see PCB for footprint)
    • 1x micro USB socket (Molex, see PCB for footprint)
    • 2x tactile switches
    • 1x 4-way DIP switches
    • 16x 8-way DIP switches
    • DIP sockets: 7x 16 way; 4x 14 way; 1x 20 way; 1x 24 way

    Build Photos

    Conclusion

    I’ve managed to get the sense of the address LEDs reversed. They are all lit apart from the running instruction. But actually I quite like that effect. It has more LEDs active and still shows which instruction is active.

    Apart from that, this seems to work as far as I can tell at present, so I’m saying this is a success!

    I haven’t decided if I want to publish this board yet though. It relies on so much of the effort of others. I’ve really not done very much myself at all.

    But now I can go back to the schematic and see if I can expand on the logic in any way, knowing I have a known-good, working starting point.

    Kevin

    https://makertube.net/w/beyW1XWnp4dfxkXhFWfhhm

    #4bit #cpu #TD4

  5. TD4 4-bit DIY CPU

    I was looking for DIY CPU projects, as I like kits that help me think at the lowest level of processing. It helps keep me grounded in how far technology has come over the years.

    • Part 1 – Introduction, Discussion and Analysis
    • Part 2 – Building and Hardware
    • Part 3 – Programming and Simple Programs
    • Part 4 – Some hardware enhancements
    • Part 5 – My own PCB version
    • Part 6 – Replacing the ROM with a microcontroller
    • Part 7 – Creating an Arduino “assembler” for the TD4
    • Part 8 – Extending the address space to 5-bits and an Arduino ROM PCB

    Some of the options that I know about, that actually come as kits you can buy and are interesting for me for DIY computers are:

    But I wanted to go further down and actually find something that lets me build a simple CPU from gates. Here there are several options too:

    Whilst I’d love to build Ben Eater’s 8-bit CPU, the kit as provided is too much of an outlay for me. It is ~$300 – I mean, good for what you get and all the knowledge, but it is a solderless breadboard kit and that isn’t really what I’m after. The Gigatron is a distinct possibility that I’ll come back to at some point I think.

    NAND to Tetris is excellent, and I have their book, but it is all emulated or virtualised, which does allow for all the scaling required for an (arguably) actually useful device, but isn’t designed to be built in actual hardware.

    But the TD4 is really interesting. It is available as a PCB and components for approx £25 on Aliexpress and based on an open source design that shows the basic operation of a 4-bit CPU.

    The “deluxe” kit mentioned above is a lot more expensive at ~£120 but has all signals broken out to LEDs which, whilst an awful lot of soldering, does look incredibly impressive! The MiniMax is an evolution of the TD4 and kits for that are around £120. In fact, searching on Tindie and Hackaday.io for “TD4” will surface a few other DIY projects and even kits to purchase.

    The TD4 does seem to fit the bill for me as an inexpensive kit to try. The downside is that documentation for it (in English) is pretty sparse.

    The TD4 project itself is by “wuxx”, an embedded engineer from HangZhou, and much of the documentation is in Chinese. It is based on a Japanese book by Kaoru Tonami called “how to build a CPU” which can be found for ~£50 online, but as I don’t know Japanese either, it is unlikely to help me very much.

    There are some sources of information that others have put together though, so I’m going to be using those as a starting point along with whatever I can figure out myself:

    This post is my own “thinking out loud” as I work through the various parts to see how they work.

    Basic Architecture

    This is a 4-bit computer, with a 4-bit data bus, 4-bit commands, and a 4-bit address bus.

    There is a block diagram on GitHub:

    The fundamental process is as follows. For each “tick” of the computer:

    • An OpCode is read from the ROM using the current 4-bit address (0 to 15) from the program counter.
    • Each ROM entry is an 8-bit word with 4-bits as a command and 4-bits as data for the command.
    • The data selector determines a 4-bit INPUT value. This can come from one of the two registers (A or B); or a set of four switches for the IN register; or be set to zero.
    • This goes to the adder which adds it with the immediate data from the ROM (which could of course be zero).
    • The OUTPUT of the adder can go to either of the two registers (A or B), an OUT register which is hooked up to four LEDs, or the program counter register to create a “jump”.

    I’ll pull apart the different parts of the CPU in the following sections.

    ROM Format

    Each 8-bit word in the 16-byte ROM has the following format:

    • 4 command bits
    • 4 immediate data bits
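
    As a small illustration of this format, here is a Python sketch that splits a ROM word into its two halves. It assumes the command sits in the upper nibble (D7-D4) and the immediate data in the lower nibble (D3-D0), as described later in this post; the function name split_rom_word is made up for the example.

```python
def split_rom_word(word):
    """Split an 8-bit ROM word into (command, immediate) nibbles."""
    if not 0 <= word <= 0xFF:
        raise ValueError("ROM words are 8 bits wide")
    command = (word >> 4) & 0xF   # D7-D4: the 4 command bits
    immediate = word & 0xF        # D3-D0: the 4 immediate data bits
    return command, immediate


# Command 1011 with immediate data 0111
print(split_rom_word(0b1011_0111))  # (11, 7)
```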

    Instruction Decoding

    The 4 command bits from each ROM instruction have to be turned into the various selection signals to activate different parts of the CPU.

    There is a table from GitHub again:

    The explanation in Japanese translates (apparently) to:

    “Explanation: The SEL_B and SEL_A signals select the ALU data source, while #LOAD0-#LOAD3 select the ALU data destination. More formally, they control the source and destination operands of instructions, respectively.”

    From this we can note the following:

    • There is no instruction for 1000, 1010, 1100 or 1101.
    • Instruction 1110 appears twice, and the selectors set are dependent on the state of the C (carry) flag.
    • Some instructions act on immediate data, others assume it will be 0.

    The LOAD# have the following meanings in the system:

    • LOAD#0 – Register A (A)
    • LOAD#1 – Register B (B)
    • LOAD#2 – OUTPUT (OUT)
    • LOAD#3 – Program counter (PC)

    The actual decoding happens in two parts: input selection; and output selection.

    Registers

    The system has four registers, each formed from a 74HC161 “presettable, synchronous, 4-bit binary counter”. There are two general purpose registers: A and B. There is one output register, whose contents drive the state of four LEDs. And there is a program counter. Here is the schematic for register A:

    P0-P3 come from the output of the adder directly. RST and CLK are hopefully self-explanatory. For the A and B registers, Q0-Q3 go into the INPUT selection section (see later). For the OUTPUT register, these go directly to LEDs. For the program counter, these go into the ROM address logic (again more on that later).

    The relevant operation of the 161 is described in the datasheet:

    “The outputs (Q0 to Q3) of the counters may be preset HIGH or LOW. A LOW at the parallel enable input (PE) disables the counting action and causes the data at the data inputs (D0 to D3) to be loaded into the counter on the positive-going edge of the clock… A LOW at the master reset input (MR) sets Q0 to Q3 LOW…”

    So on reset the outputs are all 0. When PE goes LOW, on the next clock pulse, the value on the inputs (P0-P3) is loaded into the counter and reflected on Q0-Q3. However, because CET and CEP are LOW the counter won’t actually count any further.

    The program counter is a bit special, in that it is actually allowed to count by having CET and CEP set HIGH. This allows it to step through the instructions on each clock pulse.

    In this case Q0-Q3 go off to the ROM address decoding, which I’ll come to in a moment.
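
    The register behaviour described above can be sketched as a toy Python model. This is only an approximation of the 74HC161 based on the datasheet quote above: Register161 and its method names are invented for the example, count_enable stands in for CET and CEP being tied HIGH, and timing details (such as reset being asynchronous) are ignored.

```python
class Register161:
    """Toy model of a 74HC161 used as a 4-bit TD4 register."""

    def __init__(self, count_enable=False):
        self.q = 0                        # Q0-Q3
        self.count_enable = count_enable  # True only for the program counter

    def reset(self):
        """/MR LOW forces Q0-Q3 LOW."""
        self.q = 0

    def clock(self, pe_n, p):
        """Rising clock edge. pe_n is the active-LOW /PE input, p is P0-P3."""
        if pe_n == 0:
            self.q = p & 0xF              # parallel load takes priority
        elif self.count_enable:
            self.q = (self.q + 1) & 0xF   # 4-bit counter wraps 15 -> 0
        return self.q


reg_a = Register161()
print(reg_a.clock(pe_n=1, p=5))  # 0: /PE HIGH and counting disabled, value held
print(reg_a.clock(pe_n=0, p=5))  # 5: /PE LOW, value loaded

pc = Register161(count_enable=True)
print(pc.clock(pe_n=1, p=0))     # 1: the program counter steps on each clock
```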

    INPUT Selection

    Two SELECT lines select the INPUT data as follows:

    SEL_B  SEL_A  SOURCE
    0      0      Register A (A)
    0      1      Register B (B)
    1      0      INPUT (IN)
    1      1      Zero value (0)

    Input selection is handled by two 74HC153 dual 4-input multiplexers. Two are required as there are four data lines to be switched, and they all have one of four options to switch between based on the SELECT lines above.

    Here is the relevant part of the schematic.

    On the left are the three sets of four data signals that come from the A, B and IN inputs. D0 from each of the inputs goes to U7/1Cn; D1 goes to U7/2Cn; D2 to U8/1Cn; and D3 to U8/2Cn. Notice that the fourth set of data signals (U7/1C3, 2C3 and U8/1C3, 2C3) are connected directly to GND for the “zero” INPUT state (SEL_A=1, SEL_B=1).

    On the right, the two pairs of outputs make up the four data lines to feed into the adder section.

    So where do the SEL_A and SEL_B signals come from? From the schematic, we can see:

    • SEL_A = D4 OR D7 (via U10B – one of the 74HC32 2-input OR gates)
    • SEL_B = D5

    We can start to explain why some of the instruction combinations don’t exist (or at least, aren’t distinct) as we can see that SEL_A depends on either D4 or D7.
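
    To make the input selection concrete, here is a Python sketch of the two relationships above feeding a 4-way selector. The function names are made up, and the command is assumed to be passed as a 4-bit number with D4 as bit 0 and D7 as bit 3.

```python
def select_lines(command):
    """Derive (SEL_B, SEL_A) from the command nibble D7-D4."""
    d4 = command & 1
    d5 = (command >> 1) & 1
    d7 = (command >> 3) & 1
    sel_a = d4 | d7   # the 74HC32 OR gate
    sel_b = d5
    return sel_b, sel_a


def select_input(command, reg_a, reg_b, in_switches):
    """Model the pair of 74HC153s: pick one of four 4-bit sources."""
    sel_b, sel_a = select_lines(command)
    sources = [reg_a, reg_b, in_switches, 0]  # index = SEL_B * 2 + SEL_A
    return sources[(sel_b << 1) | sel_a] & 0xF


# With A=3, B=9 and the IN switches set to 6:
print(select_input(0b0001, 3, 9, 6))  # 9: command 0001 selects register B
print(select_input(0b0010, 3, 9, 6))  # 6: command 0010 selects IN
print(select_input(0b1000, 3, 9, 6))  # 9: D7 alone also forces SEL_A HIGH
```

    The last line shows why some commands are not distinct: with D7 set, register A and IN can never be selected.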

    OUTPUT Selection

    The OUTPUT selection is a little more complicated. As previously mentioned, there are four destinations: the two registers, the OUTPUT register, and the program counter.

    Each register has a /PE (“parallel enable input”) signal which is active low. These are individually fed by the output of the LOAD# logic.

    The three signals at the bottom are D6, D7 and D4. The lone signal top left is the carry (/C) flag, and the four outputs top right are the four LOAD# signals which feed directly into the /PE pins of the four registers.

    So from this we deduce the following relationships:

    • Reg A LOAD0 HIGH = D6 OR D7 – so LOAD0 is only active (LOW) when both D6 and D7 are LOW.
    • Reg B LOAD1 HIGH = NOT D6 OR D7 – so LOAD1 is only active (LOW) when D6 is HIGH and D7 is LOW.
    • OUT LOAD2 LOW = NOT D6 AND D7 – so LOAD2 is only active (LOW) when D6 is LOW and D7 is HIGH.
    • PC LOAD3 LOW = D6 AND D7 AND (D4 OR /C) – so LOAD3 is only active (LOW) when both D6 and D7 are HIGH and either D4 is HIGH or /C is HIGH (i.e. the carry flag is not set).

    This effectively means that D6 is used to select between registers A and B when D7 is LOW; and between OUT and PC when D7 is HIGH (subject to either D4 or the /C signal too in the case of PC).

    Once again, we can see that there is some redundancy in the system for certain combinations of D4 to D7.
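
    The four LOAD# relationships can be written out as a short Python sketch. This is an interpretation of the logic as listed above rather than a gate-for-gate copy of the schematic; load_signals is an invented name, the command is assumed to have D4 as bit 0 and D7 as bit 3, and carry_n stands for /C, taken to be HIGH when the carry flag is not set.

```python
def load_signals(command, carry_n):
    """Return the active-LOW (LOAD0, LOAD1, LOAD2, LOAD3) signals.

    command is the nibble D7-D4; carry_n is /C (1 = carry flag not set).
    """
    d4 = command & 1
    d6 = (command >> 2) & 1
    d7 = (command >> 3) & 1
    load0 = d6 | d7                           # LOW only when D6=0, D7=0
    load1 = (1 - d6) | d7                     # LOW only when D6=1, D7=0
    load2 = 1 - ((1 - d6) & d7)               # LOW only when D6=0, D7=1
    load3 = 1 - (d6 & d7 & (d4 | carry_n))    # LOW when D6=D7=1 and (D4 or /C)
    return load0, load1, load2, load3


print(load_signals(0b0000, 1))  # (0, 1, 1, 1): register A is loaded
print(load_signals(0b1001, 1))  # (1, 1, 0, 1): OUT is loaded
print(load_signals(0b1110, 0))  # (1, 1, 1, 1): carry set, nothing is loaded
```

    For every command at most one LOAD# line is LOW, and the only case where none is active is a conditional jump (D4 LOW) with the carry flag set.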

    ROM Address Decoding

    The 4-bit output from the program counter is effectively a 4-bit address bus. This gets turned into a set of selection signals to select which “byte” of the ROM should be active.

    This simply uses a 74HC154, 4 to 16 line decoder, meaning that a 4-bit number goes in and one of 16 corresponding outputs goes LOW whilst the rest remain HIGH. There is no memory address or matrix handling – there is literally one control line per “memory” location.

    The ROM itself is a set of 16 8-way DIP switches and diodes, so once its control signal is active (LOW) then those DIP switches become relevant on the data bus. Here is the last location and data bus logic. Note that all data signals are pulled HIGH by default, so will only be read as LOW if the DIP switch connects it to LOW via the diode, and that is only possible if that DIP block is selected from the 4 to 16 line decoder.

    The 74HC540 is an inverting line buffer, turning any active LOW DIP switch settings into HIGH signals on the command/data bus. Recall that D0-D3 represent immediate data and D4-D7 represent command logic.
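
    The whole ROM read path can be modelled in a few lines of Python. This is a sketch under some assumptions: the function names are invented, a switch that is ON is taken to mean "connected through its diode", and switch index 0 is assumed to map to D0 (the physical switch order on a real board may differ).

```python
def decode_154(address):
    """74HC154: one of 16 active-LOW outputs goes LOW, the rest stay HIGH."""
    return [0 if i == (address & 0xF) else 1 for i in range(16)]


def read_rom(dip_banks, address):
    """Return the 8-bit word seen on the command/data bus.

    dip_banks is 16 lists of 8 booleans (True = switch ON), index 0 = D0.
    """
    selects = decode_154(address)
    bus = [1] * 8                      # pull-ups: every line defaults HIGH
    for select, bank in zip(selects, dip_banks):
        if select == 0:                # only the selected bank can pull LOW
            for bit, switch_on in enumerate(bank):
                if switch_on:
                    bus[bit] = 0       # the diode pulls this line LOW
    # The 74HC540 inverts, so a LOW line becomes a 1 on the bus
    return sum((1 - level) << bit for bit, level in enumerate(bus))


banks = [[False] * 8 for _ in range(16)]
banks[3] = [True, True, True, False, True, True, False, True]
print(bin(read_rom(banks, 3)))  # 0b10110111
print(read_rom(banks, 4))       # 0: no switches set at this address
```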

    The Adder (ALU)

    The arithmetic logic unit (ALU) for this CPU is a simple adder. A 74HC283 is a 4-bit binary full adder. It is “full” in that it supports 4-bit add-with-carry functionality, although in this design, carry is only used on the output stage – it doesn’t form part of the input addition.

    A0-A3 come from the INPUT selection circuitry, so can represent either register A or B, the state of the IN switches, or a fixed zero (0) value. B0-B3 come directly from D0-D3 from the ROM contents as selected by the ROM addressing logic.

    The COUT (carry) flag goes into a flip-flop and the active LOW version of the output is used as the carry flag in the LOAD# decoding logic to support the “JUMP IF NOT CARRY” instruction. So returning to the logic of #LOAD3, we have:

    COUT  /C  D4  D6  D7  LOAD3
    0     1   X   1   1   0      -> Dst = PC
    X     X   1   1   1   0      -> Dst = PC

    Hence a jump will only happen (i.e. the PC gets loaded) in two cases: if D4, D6 and D7 are all 1 (unconditional); or if D4 = 0 and D6 and D7 are both 1 (conditional), provided the CARRY flag is NOT set by the adder, resulting in /C = 1.

    Some of the ROM instructions require D0-D3 to be zero in which case the adder is effectively taking the input (A, B, IN, 0) and loading it into the destination register (A, B, OUT, PC).

    Notice that the adder does not use the carry in (CIN). This is tied to zero. Apparently this was left floating on an earlier revision of the board, which caused spurious results!
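
    As a quick worked example of the adder stage, here is a Python sketch of a 4-bit add with carry out. The names add4 and carry_n are made up; carry_in defaults to 0 to mirror CIN being tied to zero, and carry_n simply inverts COUT to give the /C level described above (the flip-flop that stores it is not modelled here).

```python
def add4(a, b, carry_in=0):
    """4-bit add, returning (sum, COUT) like a 74HC283."""
    total = (a & 0xF) + (b & 0xF) + (carry_in & 1)
    return total & 0xF, total >> 4


def carry_n(cout):
    """/C as seen by the LOAD3 logic: HIGH when no carry occurred."""
    return 1 - cout


print(add4(7, 8))   # (15, 0): fits in 4 bits, no carry
print(add4(15, 1))  # (0, 1): overflows, carry is set
print(add4(9, 0))   # (9, 0): immediate of zero passes the input straight through
```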

    Putting it all Together

    The complete truth table for the SEL, D4-7 and LOAD signals is as follows.

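    Instr     D7-D4   SEL_B  SEL_A   D4  D5  D6  D7   LD0/A  LD1/B  LD2/OP  LD3/PC
    ADD A,i   0000    L      L       0   0   0   0    0      1      1       1
    MOV AB    0001    L      H       1   0   0   0    0      1      1       1
    IN A      0010    H      L       0   1   0   0    0      1      1       1
    MOV A,i   0011    H      H       1   1   0   0    0      1      1       1
    MOV BA    0100    L      L       0   0   1   0    1      0      1       1
    ADD B,i   0101    L      H       1   0   1   0    1      0      1       1
    IN B      0110    H      L       0   1   1   0    1      0      1       1
    MOV B,i   0111    H      H       1   1   1   0    1      0      1       1
    -         1000    L      H       0   0   0   1    1      1      0       1
    OUT B     1001    L      H       1   0   0   1    1      1      0       1
    -         1010    H      H       0   1   0   1    1      1      0       1
    OUT i     1011    H      H       1   1   0   1    1      1      0       1
    -         1100    L      H       0   0   1   1    1      1      1       =C
    -         1101    L      H       1   0   1   1    1      1      1       0
    JNC       1110    H      H       0   1   1   1    1      1      1       =C
    JMP       1111    H      H       1   1   1   1    1      1      1       0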

    Returning to our instruction table, we can see how the decoding of the D4-D7 lines leads to enacting the various commands. In particular, we can now expand the table to show how the SEL and LOAD logic results in selecting the source and destination registers as follows:

    Instruction    D7-D4   D3-D0   INPUT   OUTPUT
    ADD A, data    0000    data    A       A
    MOV A, B       0001    0000    B       A
    IN A           0010    0000    IN      A
    MOV A, data    0011    data    0       A
    MOV B, A       0100    0000    A       B
    ADD B, data    0101    data    B       B
    IN B           0110    0000    IN      B
    MOV B, data    0111    data    0       B
    OUT B *        1000    0000    B       OUT
    OUT B          1001    0000    B       OUT
    OUT data *     1010    data    0       OUT
    OUT data       1011    data    0       OUT
    JNC B *        1100    data    B       /C: PC/none
    JMP B *        1101    data    B       PC
    JNC            1110    data    0       /C: PC/none
    JMP            1111    data    0       PC

    As per the table, we can also now infer the missing, or duplicate, instructions (marked * above).

    In this table, the output will always be the addition of the INPUT and D3-D0, so everywhere 0 is specified for D3-D0 then in reality a value could be placed here instead. But then the instruction would take on a different meaning.

    For example, MOV A, B is really MOV A, B+data, which really only makes sense when data is set to 0 otherwise overflows are very likely to occur.

    It is also worth noting that SEL_A depends on either D4 or D7, and when SEL_A is set to 1 the input can only be either register B or zero. However, to output to OUT or PC, D7 has to be set. This means that instructions that act on OUT or PC can only take an input from register B or zero.

    The two JMP B instructions are going to be of limited use too. They are essentially JMP to B+data instructions. There are probably some creative uses of such instructions, but for simplicity, keeping to the “0” versions that just depend on the immediate data is probably best.
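
    Pulling the pieces together, the whole machine fits in a short Python sketch. This is a behavioural model built only from the relationships worked out above, not a verified emulator: the step function and the state dictionary are invented for the example, and it assumes the carry flip-flop is updated on every tick, so a conditional jump sees the carry from the previous instruction.

```python
def step(state, rom, in_switches=0):
    """Advance the model one clock tick.

    state is a dict with keys A, B, OUT, PC and C (the latched carry).
    rom is a list of 16 eight-bit words. A new state dict is returned.
    """
    word = rom[state["PC"]]
    command, data = word >> 4, word & 0xF
    d4, d5, d6, d7 = [(command >> n) & 1 for n in range(4)]

    # INPUT selection: SEL_A = D4 OR D7, SEL_B = D5
    sel_a, sel_b = d4 | d7, d5
    sources = [state["A"], state["B"], in_switches & 0xF, 0]
    source = sources[(sel_b << 1) | sel_a]

    # Adder with carry in tied to zero
    total = source + data
    result, cout = total & 0xF, total >> 4

    # OUTPUT selection, using /C from the previously latched carry
    carry_n = 1 - state["C"]
    new = dict(state)
    new["PC"] = (state["PC"] + 1) & 0xF   # default: count to next address
    if not d7:
        new["B" if d6 else "A"] = result
    elif not d6:
        new["OUT"] = result
    elif d4 | carry_n:
        new["PC"] = result                # jump taken
    new["C"] = cout
    return new


rom = [0x73, 0x12, 0x90, 0xF3] + [0x00] * 12
state = {"A": 0, "B": 0, "OUT": 0, "PC": 0, "C": 0}
for _ in range(4):
    state = step(state, rom)
print(state)  # {'A': 5, 'B': 3, 'OUT': 3, 'PC': 3, 'C': 0}
```

    The example program is MOV B, 3; then MOV A, B with the data bits set to 0010; then OUT B; then JMP 3. The second instruction leaves 5 in A rather than 3, showing that MOV A, B really is MOV A, B+data.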

    Utility Blocks

    There is one section of the circuit that hasn’t been considered yet. There is a block that provides the clock and reset circuitry.

    The clock is based on a Schmitt trigger oscillator and can run on automatic or on manual trigger. There are two selectable speeds: 1Hz or 10Hz.

    Both the clock and reset signals feed into the four registers and the carry flip-flop.

    The remaining block is the power. The board has to be powered from 5V, either via the micro-USB socket or directly via a 2-pin jumper header.

    Conclusion

    I have one on order. I’m looking forward to building it and giving it a go!

    I really like the LEDs on the deluxe version, but that is a bit too much for me just for some messing around. I am wondering, though, how difficult it would be to attempt my own version with a few extra LEDs.

    Assuming I manage to get one built and working, I’ll have a poke about at some signals and see what the art of the possible might be.

    Kevin

    #4bit #cpu #LOAD0 #LOAD3 #TD4

  6. CW: WaveBoy: Now Available As a Pre-Release!
    Here it is! It's at the point where I think I'm happy enough for folks to try it out. I only have a few PCBs available but plan to have more made. The firmware is in a state I consider it usable though I do have some future plans for it. The original concept is basically there albeit without a few creature comforts (on-module wave generation algorithms for example). But it sounds just as I wanted it to and I think folks will really enjoy it.

    It pairs exceptionally well with the Doepfer Wasp (A-124) module since the intentional aliasing really seems to drive the resonance and bring out that signature wasp sound.

    It's been a long road and there's more to do but hoping folks give this a look see! Available on our store now! (bitbybitsynths.com). Be aware there's a week or so turn-around time to assemble the module since I'm currently making these to order. In the future I hope to have a kit version so folks that are handy with an iron can assemble the through-hole parts themselves (the SMD parts will be pre-soldered).

    #WaveBoy #Eurorack #Modular #NowAvailable #Chiptune #GameBoy #4bit #lofi