Wednesday 16 September 2015

Commodore PET 8032 repair - Overclocked to death!

This is an old post, preserved for reference.
The products and services mentioned within are no longer available.

Commodore PETs never fail to find new and interesting ways to fail. This is a new one on me. It started as a normal repair, black screen, no beep, the usual 8032 faults, but took a lot longer than usual.
So far, so good, looks OK in general. One 4116 has been removed, and some tracks damaged and repaired on the back. I checked those out, no problems with that, all continuity fine.
My usual first tests are power, clock, reset, all the necessary things to get started. Power was OK, all the voltage rails reading in tolerance. Clock and reset however, were a little odd.
The two traces shown are the reset line and the clock out from the CPU. The top one is the reset line, and it's really not meant to do that, but I'll come back to that. The first thing to deal with is the clock. It looks fine, a nice clean 2MHz clock. The thing is, the PET is meant to run at 1MHz.
The frequency counter confirms this, 2MHz. Everything else tells me it should be 1MHz. I tracked this down to a 74LS393 used as the clock divider. This was meant to take the 16MHz clock (which was measured at 16MHz) and divide it down to 8MHz, 4MHz, 2MHz and 1MHz for the system clock. It had other ideas, and its outputs were 16MHz, 8MHz, 4MHz and 2MHz. I presume the first stage had failed and passed the full 16MHz down to the second. Replacing that returned the system to its non-overclocked state. I don't know how long it had run overclocked to 2MHz, but many of the parts would not have been designed for that.
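To make the arithmetic concrete, here is a minimal Python sketch of a four-stage divide-by-two chain. It is only a model of the behaviour described above: the function name and the idea of treating a failed stage as a straight pass-through are assumptions for illustration, not a description of how the 74LS393 fails internally.

```python
def divider_outputs(input_hz, stages=4, failed_stages=()):
    """Return the output frequency of each stage in a divide-by-two chain.

    A stage listed in failed_stages is modelled (as an assumption) as
    passing its input straight through instead of halving it.
    """
    outputs = []
    freq = input_hz
    for stage in range(stages):
        if stage not in failed_stages:
            freq //= 2  # a healthy stage halves the frequency
        outputs.append(freq)
    return outputs


healthy = divider_outputs(16_000_000)
faulty = divider_outputs(16_000_000, failed_stages={0})

print([f // 1_000_000 for f in healthy])  # [8, 4, 2, 1]
print([f // 1_000_000 for f in faulty])   # [16, 8, 4, 2]
```

With the first stage passing its input through, every later tap is one step too fast, so the tap that should supply the 1MHz system clock supplies 2MHz instead.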
The reset was still pulsing though. On the PET it's a 555 timer set to generate a short reset pulse. This is buffered by a 74LS04, and then a 7417 open collector buffer is used to supply the reset line to the CPU, the I/O chips and the video CRTC. Tracing the signal through, the 555 was fine, and the signal out of the 74LS04 was OK, but the output of the 7417 was being pulsed low. It's an open collector output, so it cannot drive high; it relies on a pull up resistor. This allows you to add other OC outputs or switches to the same line to reset the system.
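The shared open collector line can be thought of as a wired-AND: it is only high when nothing connected to it is sinking current. The following Python sketch models that logic only. The function name and the device labels are hypothetical, and a line with no pull up is simply treated as "not reliably high" rather than modelling a floating voltage.

```python
def wired_line_is_high(pulling_low, pull_up_fitted=True):
    """Logic level of a shared open collector line.

    pulling_low maps a device name to True when that device is sinking
    the line. Any single device pulling low wins; otherwise the pull up
    resistor (if fitted) takes the line high.
    """
    if any(pulling_low.values()):
        return False
    return pull_up_fitted  # nothing pulling low: high only if a pull up exists


# Hypothetical snapshot: the buffer has released the line, but a damaged
# chip elsewhere on the line is still holding it low.
devices = {"7417 buffer": False, "CRTC": False, "keyboard PIA": True}
print(wired_line_is_high(devices))  # False

devices["keyboard PIA"] = False     # faulty part removed
print(wired_line_is_high(devices))  # True
```

This is why a line like this has to be traced by removing parts one at a time: the buffer can be working perfectly while any other chip on the line holds it down.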
I checked the resistor (R15) that pulls it high, and it measured fine. Putting another resistor in parallel changed the signal a bit, but not much. I replaced it anyway, along with the 7417. This made no difference. With the 7417 removed, I could see the line was being pulled down elsewhere. Another gate on the same chip is used to generate the IFC signal on the IEEE-488 bus, which was also fine. The only other things on there are the five 40 pin chips. With the CPU removed, something was still pulling the line down, so I started with the usual candidate, the 6545. That didn't change it, so next was the keyboard 6520. That seemed to fix the reset problem. With those replaced, the PET wasn't booting, so I installed a ROM/RAM board to bypass the onboard ROM and RAM. Still nothing. Checking the IRQ line, that was also doing odd things. I removed the 6522 and the other 6520 and that seemed to fix that. Once those were all replaced, it booted with a chirp and something on the display.
When you get banding like that in blocks of 8 or 16 etc., it is often down to the address multiplexers. Checking the outputs of these, there were a few stuck and some doing strange things. I removed and tested the three 74LS157s and all were faulty. Replacing those changed the display, so progress was being made. Replacing the 74LS166 got rid of the flickering around the edges and finally a display.
The screen was full of @ symbols (screen code 0x00), but there was some writing on the screen, and you can just make out 'commodore' in there. Some of the bits were wrong. I checked around various parts of the board, and it seemed that some of the data lines were always low. The ROM/RAM board has an isolated databus for its ROM and RAM, so that wasn't affected. So it was something on board, one of the buffers, the RAM or the ROM. The only socketed ROM chip was UD7, the editor ROM, and that was a bit warm, so I removed that and it fixed some of the faults, and checking with the PET tester ROM, it initially looked OK.
But if you check carefully, you can see it's not right. It's repeating chunks of the character set. Where are the numbers? Checking the PETSCII codes of what was displayed and what should have been, it appeared that bit 5 was stuck low. I desoldered the other 4 ROMs and tried again.
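The effect of a stuck bit on the displayed character set can be sketched in a few lines of Python. This is an illustration under assumptions: it uses a simplified table of the first 64 PET screen codes (with ASCII `^` and `_` standing in for the up and left arrows), and the helper names are made up for the example.

```python
# Simplified glyph table for screen codes 0x00-0x3F: @, A-Z, punctuation,
# then space, symbols and digits. '^' and '_' stand in for the PET's arrows.
GLYPHS = (
    "".join(chr(c) for c in range(0x40, 0x60))    # codes 0x00-0x1F
    + "".join(chr(c) for c in range(0x20, 0x40))  # codes 0x20-0x3F
)


def stuck_low(code, bit):
    """Return the byte as it would read with one data bit held at 0."""
    return code & ~(1 << bit) & 0xFF


def glyph(code):
    """Look up the glyph for a screen code in the simplified table."""
    return GLYPHS[code & 0x3F]


expected = "".join(glyph(c) for c in range(0x30, 0x3A))
displayed = "".join(glyph(stuck_low(c, 5)) for c in range(0x30, 0x3A))
print(expected)   # 0123456789
print(displayed)  # PQRSTUVWXY

# The whole 64 character run collapses to the first 32 glyphs shown twice.
row = "".join(glyph(stuck_low(c, 5)) for c in range(0x40))
print(row[:32] == row[32:])  # True
```

Codes 0x30 to 0x39 all have bit 5 set, so with that bit held low they land on 0x10 to 0x19, which is why letters appear where the numbers should be and the set looks like repeated chunks.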
Excellent, so all the video side was now working. The ROM/RAM board was still required, as the ROMs had been removed and the onboard RAM didn't appear to be working. With the ROM/RAM board, I can separately enable the upper and lower bank of memory, so I replaced the lower bank, which allowed it to boot, but left the on board upper bank enabled. Using the POKE and PEEK methods I described in a previous repair, I was getting results that showed that 5 of the 8 bits were faulty. I was hoping it would be better than that. It could be addressing issues causing that, or it could indeed mean that 5 RAM chips were faulty. To test this, I removed one chip and tested it on my Spectrum DRAM tester.
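The general idea behind that kind of test can be sketched as follows. This is not the BASIC from the earlier repair, just a Python simulation of the same principle: write known patterns, read them back, and accumulate which bits ever differ. The `SimulatedRam` class and its stuck-bit masks are invented for the example. Since a 4116 is a 16K x 1 bit part, each faulty data bit in a bank points at one chip.

```python
class SimulatedRam:
    """Toy RAM bank whose reads have some bits stuck low or high."""

    def __init__(self, size, stuck_low_mask=0x00, stuck_high_mask=0x00):
        self.cells = [0] * size
        self.stuck_low_mask = stuck_low_mask
        self.stuck_high_mask = stuck_high_mask

    def poke(self, addr, value):
        self.cells[addr] = value & 0xFF

    def peek(self, addr):
        value = self.cells[addr]
        value &= ~self.stuck_low_mask   # these bits always read as 0
        value |= self.stuck_high_mask   # these bits always read as 1
        return value & 0xFF


def faulty_bits(ram, addresses, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Return a mask of data bits that ever read back differently."""
    bad = 0
    for addr in addresses:
        for pattern in patterns:
            ram.poke(addr, pattern)
            bad |= ram.peek(addr) ^ pattern
    return bad


# Hypothetical bank: bits 1, 2 and 4 stuck low, bits 3 and 7 stuck high.
ram = SimulatedRam(16, stuck_low_mask=0b00010110, stuck_high_mask=0b10001000)
mask = faulty_bits(ram, range(16))
print(f"{mask:08b}", bin(mask).count("1"))  # 10011110 5
```

A test like this cannot tell a dead chip from an addressing fault that affects the same bit, which is why pulling one chip and testing it off the board is a sensible next step.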
Yes, that's faulty. Putting a working DRAM chip into the PET confirmed this as it now looked like 4 of 8 bits were faulty. I replaced the other 4 and found the same, the chips were faulty, and the replacements reduced the error count. I then had the whole upper 16K bank working.
The lower was also faulty, so I swapped the bank select resistors so I could work on the lower bank, but mapped to the upper 16K, leaving the lower 16K working so I could get into BASIC. Again, I was hoping for better, but the results showed 7 of the 8 bits were faulty. Given that, I removed and tested all the remaining RAM, and in the end only 3 of the original chips were working, plus the one that had been replaced previously.
With a complete set of replacement RAM, I could disable the RAM on the ROM/RAM board and I let it run through various diagnostics. I had to load these from tape as the IEEE-488 port wasn't working. In the 8296 diagnostic, the video RAM fault is reported because the 8296 has 4K of video RAM and the editor ROM is different, but it's still a useful stability test.
With the RAM working, I tried replacing the ROMs and most of them seemed to cause the bit 5 issue. I also tried a working set of ROMs, and although they didn't corrupt the screen, it wouldn't boot.
Checking the enable pins, they were all wrong, and tracing back through that found various faults in the address decoding circuitry. The 74154 was stuck with one output low, even with the NOP generator feeding in all possible addresses. Replacing that fixed the enables for most of the chips, but the enable for UD7 is gated to mask out the I/O area. Various issues were found with the decoding chips there; some of the signals seemed to stick at 1V for a while, rather than being clearly high or low. Replacing several other logic gates around that seemed to fix that, and I was able to run it without the ROM/RAM board for the first time.
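The decoder check can be modelled in Python. A 74154 is a 4-to-16 decoder with active low outputs, so for any input exactly one output should be low. The sketch below assumes the decoder is fed from the top four address lines (one output per 4K block); that wiring, the function names and the choice of output 11 as the stuck one are illustrative assumptions, not readings from this board.

```python
def decode_74154(select, stuck_low=()):
    """Active low outputs of a 4-to-16 decoder (True = high, False = low).

    Outputs listed in stuck_low are forced low whatever the input is.
    """
    outputs = [n != (select & 0xF) for n in range(16)]
    for n in stuck_low:
        outputs[n] = False
    return outputs


def find_stuck_low(decoder):
    """Sweep one address per 4K block and report outputs low when unselected."""
    stuck = set()
    for addr in range(0x0000, 0x10000, 0x1000):
        select = addr >> 12  # assumed: top four address bits drive the decoder
        for n, level in enumerate(decoder(select)):
            if not level and n != select:
                stuck.add(n)
    return sorted(stuck)


def faulty_decoder(select):
    """Hypothetical decoder with output 11 stuck low."""
    return decode_74154(select, stuck_low=(11,))


print(find_stuck_low(decode_74154))    # []
print(find_stuck_low(faulty_decoder))  # [11]
```

Sweeping every address, as the NOP generator does on the real board, is what separates an output that is legitimately selected from one that never goes high.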
The burn-in tests ran for several hours without a problem. So what's left? The IEEE-488 is still not working. I checked the MC3446N buffer chips and guess what, all were faulty. Does nothing work on this board? With those replaced, it was loading files fine from a PET microSD board.
The final tally then, with the surprising exception of the video RAM and character ROM, pretty much everything else on the board was faulty. I retested the parts and most would not let the system boot as they were pulling down reset, IRQ or data lines, or doing odd things with the addressing. Something really bad happened to this board in the past. I don't know if it was static or a power spike or aliens. It may just have been bad luck that the clock divider failed and overclocked several parts to death, and the address decoding faults caused lots of chips to try to write at the same time and burn each other out.
Given the cost of a full set of replacement ROM and RAM chips, I'll be returning this board with empty ROM and RAM sockets and a ROM/RAM board installed. This keeps the cost down and the owner still has the option to fit a set of 4116s and EPROMs in future.

Update:
The board is now back with its owner. Here is his video showing the board being reinstalled in the case.
Also see their videos covering restoring the case and keyboard.