PSCD32 Development Diary

Erasing the condition

Earlier, I said I noticed that there was a possible race condition between the high-priority interrupt responding to the CD32 polling sequence by sending the CD32 reply word (a 16-bit value in memory used as a response) through SPI, and the lower-priority repeating polling of the PS2 which forms the CD32 reply word as it analyses the received message. Since the CD32 polling handler retrieves the CD32 reply word directly from the structure in memory where the analysed state is constructed, there's a possibility (I thought) that at some point a half-formed message would be present in this variable. This could be fixed by only writing to this location with completely valid messages, which currently isn't done. Let's take a look at the source code.
/* Header response before pad type is determined. */ typedef struct __attribute__((packed)) _PERIPHERAL_RESPONSE_GENERIC { unsigned char no_use; unsigned char peripheral_type; unsigned char spacer_0x5a; } PERIPHERAL_RESPONSE_GENERIC; /* Digital controller 0x41 response. */ typedef struct __attribute__((packed)) _PERIPHERAL_RESPONSE_0x41_DIGITAL_CONTROLLER { PERIPHERAL_RESPONSE_GENERIC generic; /* 1 16-bit word extra data after header. */ union { struct __attribute__((packed)) { unsigned select:1; unsigned blank2:1; unsigned blank1:1; unsigned start:1; unsigned u:1; unsigned r:1; unsigned d:1; unsigned l:1; }; unsigned char dpad_buttons; }; union { struct __attribute__((packed)) { unsigned l2:1; unsigned r2:1; unsigned l1:1; unsigned r1:1; unsigned bT:1; unsigned bO:1; unsigned bX:1; unsigned bS:1; }; unsigned char shape_shoulder_buttons; }; } PERIPHERAL_RESPONSE_0x41_DIGITAL_CONTROLLER; /** * Stores the complete state of a PS2 controller. * Buttons are stored as 1 idle, 0 pressed. * (Same 'switch grounded' set up as received from the pad.) */ typedef struct _PS2_CONTROLLER_STATE { union { struct { unsigned zero:8; /* Stores a constant 0 for CD32 shift reply. */ unsigned one:1; /* Stores a constant 1 for CD32 shift reply. */ unsigned start:1; unsigned l1:1; unsigned r1:1; unsigned bS:1; unsigned bT:1; unsigned bX:1; unsigned bO:1; }; unsigned int CD32reply; /* Contains a word 0bBRYGRLP100000000 */ ... some lines skipped /** * Given a peripheral response byte sequence r, populate * the memory controller state c with the extracted contents. * This also implicitly formats the face buttons into * the correct word format for the CD32 SPI shift. */ void PS2_CONTROLLER_STATE_analyse_received_data(PS2_CONTROLLER_STATE *c, const PS2_SPI_PERIPHERAL_RESPONSE *r) { ... some lines skipped /* These constants form part of the CD32 word. */ c->one = 1; c->zero = 0; switch (c->peripheral_present) { case (PERIPHERAL_TYPE_0x41_DIGITAL_CONTROLLER): { const PERIPHERAL_RESPONSE_0x41_DIGITAL_CONTROLLER *b = &(r->digital); c->u = b->u; c->d = b->d; c->l = b->l; c->r = b->r; c->select = b->select; c->start = b->start; c->l3 = 1; /* Do not exist. */ c->r3 = 1; c->l1 = b->l1; c->r1 = b->r1; c->l2 = b->l2; c->r2 = b->r2; c->bT = b->bT; c->bO = b->bO; c->bX = b->bX; c->bS = b->bS; c->lx = 0x80; /* Do not exist. */ c->ly = 0x80; c->rx = 0x80; c->ry = 0x80; } break;
As you can see, the code that extracts the button states from the response is very neat and tidy... though those repeated field names are possibly asking for a copy-paste error to sneak in somewhere. An attempt at a solution, either a macro that removes the repeated token or some kind of silly in-depth routine with indexes mapping things back and forth, would probably break it further. The instruction set of the PIC doesn't allow these bitfield to bitfield assignments to be performed with a single instruction. It'd be a pretty impressively huge instruction if it existed! Here's the disassembly of the assignment block, as given by MPLAB X:
432: c->l2 = b->l2; 000594 9041C1 MOV.B [W1+4], W3 000596 61C1E1 AND.B W3, #0x1, W3 000598 984073 MOV.B W3, [W0+7] 433: c->r2 = b->r2; 00059A 9041C1 MOV.B [W1+4], W3 00059C D10183 LSR W3, W3 00059E 6181E1 AND W3, #0x1, W3 0005A0 984803 MOV.B W3, [W0+8] 434: 435: c->bT = b->bT; 0005A2 9041C1 MOV.B [W1+4], W3 0005A4 DE19C4 LSR W3, #4, W3 0005A6 6181E1 AND W3, #0x1, W3 0005A8 DD19CD SL W3, #13, W3 0005AA A1D002 BCLR W2, #13 0005AC 718102 IOR W3, W2, W2 0005AE 780802 MOV W2, [W0] 436: c->bO = b->bO; 0005B0 9041C1 MOV.B [W1+4], W3 0005B2 DE19C5 LSR W3, #5, W3 0005B4 DD19CF SL W3, #15, W3 0005B6 A1F002 BCLR W2, #15 0005B8 718102 IOR W3, W2, W2 0005BA 780802 MOV W2, [W0] 437: c->bX = b->bX; 0005BC 9041C1 MOV.B [W1+4], W3 0005BE DE19C6 LSR W3, #6, W3 0005C0 6181E1 AND W3, #0x1, W3 0005C2 DD19CE SL W3, #14, W3 0005C4 A1E002 BCLR W2, #14 0005C6 718102 IOR W3, W2, W2 0005C8 780802 MOV W2, [W0] 438: c->bS = b->bS; 0005CA 9040C1 MOV.B [W1+4], W1 0005CC FB8081 ZE W1, W1 0005CE DE08C7 LSR W1, #7, W1 0005D0 DD08CC SL W1, #12, W1 0005D2 A1C002 BCLR W2, #12 0005D4 710801 IOR W2, W1, [W0]
For fields that aren't part of the CD32 reply word, a small instruction sequence retrieves the bit, shifts it into the correct position, ANDs it to a single bit, and stores it. For fields that are part of the CD32 reply word (face buttons, shoulder buttons, Start), they're retrieved, shifted, ANDed (as the literal range for AND is 5-bit 0-31), shifted to the high bit position in the word, the bit in the register-resident copy of the reply word is blanked, the new bit is ORed into the register, and the contents of the register are stored. Throughout the sequence the value of the CD32 reply word is stored in W2 and repeatedly updated and written into the reply word in the structure. So there's never an instant where the reply word is invalid - it's always being completely overwritten with a valid word, but there's microsecond differences as each button is updated in turn within it. Assuming the complete PS2 polling sequence takes constant time p and begins at time t, that means that the CD32 reply word fields are updated one after another at a series of very near instaneous intervals after some point t + p. If I used a variable to store the CD32 reply word instead of manipulating the fields of the word in the structure directly, the instruction sequence would be near identical, except the MOV W2, [W0] would appear once at the end of the series. If that 7 * FCY interval (350 nanoseconds) makes you blow up in Super Stardust, I'll eat a hat.