I have a project on the bench that has been dragging on for a while, and I just want to get it finished.
I can see it being finished in one of two ways.
1) I get it working, and working well, and get it into production
2) I conclude to my satisfaction that it's not going to work so I can move on to something else
It is one of those projects that is always going to be doomed to failure if you are not careful - trying to do something in software which would be better done in hardware.
And yes, the answer is easy, just do it in hardware.
Well, as usual, there are various problems with that. Modern microcontrollers can do an awful lot and can be of great benefit in terms of cost and board space and simplifying things.
And also the hardware is not necessarily available. Some of it went out of production in the early 1980s, some of it only recently, but it doesn't seem like any of it is likely to come back.
Until they make a 3D printer which can produce custom silicon chips, it's a software solution.
(I am sure that will happen sometime in the next 5 years, people are already making their own silicon. That is assuming we don't end up in some kind of technological dark age where knowledge is feared and banned by the powers that be ....)
What I need to do
I need a microcontroller to do some fast IO. I need it to detect a signal, read some data and respond to it within a short window.
My microcontroller of choice here is the AVR128DA48 or one of that series. These run at 24MHz and have mostly single cycle instructions, specifically including single cycle access to the IO ports.
That means I should be able to run up to 24 million instructions per second.
In this application, I am waiting for a pulse from a 1MHz system, and so being 24 times faster, I should have 24 instructions to do it?
Well, not quite.
The pulse is actually only active when the clock is high, only half a cycle, but that's fine, that is still 12 instructions.
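As a quick check of that arithmetic, here is a minimal sketch in C. The function name cycles_in_window and the idea of expressing the window in nanoseconds are my own for illustration, and it assumes one instruction per clock cycle, which is only mostly true on the AVR.

```c
#include <stdint.h>

/* Whole CPU cycles that fit inside a window of window_ns nanoseconds.
   Assumes one instruction per clock cycle (hypothetical helper, not
   taken from the project code). */
uint32_t cycles_in_window(uint32_t cpu_hz, uint32_t window_ns)
{
    /* Widen to 64 bits so cpu_hz * window_ns cannot overflow. */
    return (uint32_t)(((uint64_t)cpu_hz * window_ns) / 1000000000ULL);
}
```

At 24MHz, a full 1MHz cycle (1000ns) gives cycles_in_window(24000000, 1000) = 24, and the half cycle (500ns) gives 12.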
I got my code down to 10 instruction cycles and tried it.
It worked most of the time.
Not good enough.
Checking deeper, it turns out I don't have 12 cycles, the pulse is asymmetric, so I only get about 8-10.
The first instructions are along the lines of "is the pin low? No? Loop back and test again", which takes up to three of the cycles, depending on which cycle of the loop the pulse edge falls on. In some cases it just misses the change, takes the full 10 cycles, and is a bit late.
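That jitter can be modelled very roughly as below. This is a simplified sketch, not the real code: it assumes the pin is sampled exactly once per pass of the loop, and the names poll_delay and worst_poll_delay are made up for the example. The real figures depend on the actual instruction sequence.

```c
#include <stdint.h>

/* Cycles between an edge arriving and the polling loop sampling it.
   Assumes the pin is sampled at cycles 0, L, 2L, ... where L is
   loop_cycles (must be at least 1). */
uint32_t poll_delay(uint32_t edge_cycle, uint32_t loop_cycles)
{
    uint32_t phase = edge_cycle % loop_cycles;
    return (phase == 0u) ? 0u : loop_cycles - phase;
}

/* Worst case over all phases: the edge arrives just after a sample. */
uint32_t worst_poll_delay(uint32_t loop_cycles)
{
    return loop_cycles - 1u;
}
```

Under this model a three cycle loop adds anything from 0 to 2 cycles before the rest of the code even starts, which is a big chunk of an 8 to 10 cycle budget.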
Here I am monitoring an LED signal I am using for testing. It is activated last, so if that still falls within the pulse, it should be safe.
But if it comes too late, it is likely the preceding pulse will also have been missed.
I have tried various things to streamline the code, but I think it is about as optimal as it can be.
There are various elements to this:
- detect the pulse
- work out the response
- respond
- clear and loop back for the next one.
I think I can save a few cycles on steps 3 and 4 by using some additional hardware, and maybe optimise the processing, but that only gets me close to the 8 cycles I have.
I can't work out the response in advance as it will be different every time.
I would prefer to have a lot more clear space than that. 24 cycles to run 10 instructions would have been reasonable, 8 cycles to run 8 instructions is too tight, especially given the likely variation in the host hardware: some machines may be 7, some may be 9, etc.
What about interrupts?
Just as a side note, could I save time on the detection using an interrupt?
Well, no.
It looks like it is going to be about 6 cycles before it even gets to my code, so that's not an option.
An Alternative Option
I put that to the side for a while and got on with some other things, including building a Minstrel ZXpand that had been ordered.
Whilst programming the microcontroller for that, I wondered if the PIC18F4525 used on the ZXpand might be better, as it has a couple of features that might help.
I covered my early adventures with PIC microcontrollers in a previous post, but I glossed over the point where I switched my allegiance from Microchip to Atmel.
(get ready to mark off the first "bloated framework" on your Dave rants Bingo card, possibly the first of many)
I had been working on a project and was getting frustrated by the PIC development environment, generating too much code, and code which didn't build or didn't work.
Whilst I had put that to one side, this thing called an Arduino started to appear, and that used the Atmel ATmega328, and I started looking into that and was impressed.
Since then I have been sticking with the Atmel family (which was going great until Microchip bought them out, but that's a rant for another day).
PIC a number
I wonder what's been going on with the PICs, maybe there is something in the PIC world that would help?
The chip in the ZXpand only works up to 20MHz, oh well, maybe not.
But then I noticed some other chips which said they would run up to 64MHz.
Now you're talking, 64MHz internal oscillator, that would mean I have 32 instructions for the half cycle, with the asymmetry, maybe 24, and my code is 10, so that would be a good clearance.
We're back in business.
Oh great, there are dozens of part numbers and no clear indication of what the differences are.
I mean, who put the marketing department in charge of a technical website? Who is that sort of table aimed at? It's just nonsense. Can Microchip please find someone with a grey beard and let them write a useful version of that.
OK, it's getting late in the day, but if I order a few of these now I will get them in the morning. So I scatter bought a selection of the Curiosity Nano boards, which are development boards with integrated programmers, which means I don't need to bother messing about with PICKits or Snaps etc.
Testing Methodology
You get an ology and you're a scientist.
The most important thing here is the IO operations, so to test things, I wrote a little bit of code which toggles an IO pin as fast as it can.
Loops can throw things out, so I have been doing eight instructions that turn the output pin on and then off alternately.
I have written versions in assembler with a register holding the value to write, but here the compiler optimised that to some single cycle instructions, so I am happy enough to stick with C for now.
I am not concerned about the overhead at the start or the loop, as I can just measure the speed of the 4 pulses. It gives 4 pulses at top speed, then a gap, then another 4.
Out of the box, the AVRxDx chips run at 4MHz, so when I run the code, it turns the output on and then off 4 times in 8 cycles, so I see a run of pulses at 2MHz.
With a bit of extra code to change the clock, it can get to 24MHz.
The clock settings are locked, so you need to set a register to enable access for a few cycles, so that bit has to be in assembler.
(you also need to set RAMPZ if using the 128K chip, but I am just testing here)
Now I get a train of pulses at 12MHz, which is as fast as I can change them with a 24MHz clock.
If these new chips run at 64MHz, that should mean similar test code would generate a stream of pulses at 32MHz.
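The prediction boils down to one division. A minimal sketch, with the function name pulse_hz and the cycles_per_write parameter being my own labels: one write turns the pin on, the next turns it off, so a full pulse costs two writes.

```c
#include <stdint.h>

/* Pulse frequency from back-to-back pin writes.
   cycles_per_write is 1 for the single cycle port writes on the AVR;
   for any other chip it is an assumption until measured. */
uint32_t pulse_hz(uint32_t cpu_hz, uint32_t cycles_per_write)
{
    return cpu_hz / (2u * cycles_per_write);
}
```

That gives 2MHz at the default 4MHz clock and 12MHz at 24MHz, matching what the scope shows, and 32MHz at 64MHz, but only if one write really does take one clock cycle.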
Now I just need to wait for the postman......
Lots of red boxes, I tried to cover all the eventualities.
These are all Curiosity Nano boards, quite neat development boards with a built in programmer debugger and pins for all the signals, and available with lots of different microcontrollers on them.
The PIC18F57Q43 seemed to do everything I would need.
I will spare you all the faffing around with different systems like Atmel Start and Microchip MCC, Microchip Studio and MPLAB X, automatically generated code and hand turned from a blank page. Suffice to say I tried every combination I could think of, and the best I could get was 8MHz.
I was using the simplest code with the same sort of loop as before.
Checking the listing file, it is indeed using the BCF and BSF instructions that I would like to use.
And double checking, those instructions are 1 cycle instructions, highlighted below.
It is supposed to be running at 64MHz, but the pulses are only showing up at 8MHz. If the instructions are "single cycle", as the listing file suggests, does that mean the chip is running at only 16MHz?
Looking around it finally dawned on me, on the PIC, clock cycles and instruction cycles are not the same.
They really should have had that in big flashing lights at the start.
Each instruction cycle is actually 4 clock cycles, so 64MHz is 16MIPS.
On the AVR they are the same, 1 instruction cycle is 1 clock cycle, so 24MHz is 24MIPS.
That means these PICs are slower than the AVR.
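To put numbers on that, here is a small sketch. The name instruction_rate_hz is invented for the example, and the divide by 4 and divide by 1 figures are the ones described above.

```c
#include <stdint.h>

/* Instructions per second for a core that needs clocks_per_instruction
   oscillator clocks for each single cycle instruction
   (4 on these PICs, 1 on the AVR). */
uint32_t instruction_rate_hz(uint32_t osc_hz, uint32_t clocks_per_instruction)
{
    return osc_hz / clocks_per_instruction;
}
```

instruction_rate_hz(64000000, 4) is 16 million, against 24 million for instruction_rate_hz(24000000, 1). Two instructions per pulse at 16MIPS is the 8MHz seen on the scope, and it leaves only 8 instructions in a 500ns half cycle, before allowing for the asymmetry.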
Well, that was a bit of a waste of time.
I had started thinking about a couple of projects I could revisit with the extra speed, and the better peripherals the PIC chips have over the AVR, but sadly they just aren't going to be fast enough.
They might come in handy one day on something that doesn't need as much speed, but for this application they aren't going to help.
But wait, what's this other red box I bought?
I don't really want to move to an ARM chip, I would like to stay 8-bit if I can, but there are some ATSAMC microcontrollers which might do the job. These run at 48MHz (it says there is a 64MHz option, but that seems to be a different part number with a -64 on the end), but either way, that's got to be better than 24MHz or 16MHz.......
As part of the scatter buying exercise, I had picked up the ATSAMD21 Curiosity Nano. That's not ideal, I would prefer the ATSAMC21 as that is 5V, and the ATSAMD21 is 3.3V, but I thought I could at least get a proof of principle test working and look at other options later.
Credit here for this very useful table in the datasheet which explains the major differences between versions.
Why can't they do that for the other series of chips?
This is what they show for the AVR series. Same nonsense as for the PICs.
More expletives and trying different development environments and toolchains etc. Endless updates and completely reinstalling one of the development studios which broke itself doing even more updates.
5.86MHz is the best I could get out of it. How did it even get there? That isn't an integer multiple?
I think that is running at 48MHz, but it is using internal PLLs generated from 32.768kHz, and I am not quite sure I got the configuration right. How did it get down to 5.88MHz? Confused.
It does have an 8MHz fixed oscillator. I know it's not using that as the best I could hope to get from that would be 4MHz toggling if it were one clock cycle per instruction (which I doubt it is).
I remembered I had looked at the ATSAMC series before, I couldn't find any of the boards I had, but I did find a sole IC.
(how I found one single IC in my house is quite a miracle, finding the PCBs as well would be too much to ask).
The ATSAMC series have two advantages over the ATSAMD as far as I can see. They run at 5V, and they have an internal 48MHz oscillator as well as the 8MHz one. Maybe I had the configuration of the PLL wrong? With a fixed 48MHz oscillation there should be no problem getting full speed.
Can I just swap the ATSAMD on the Curiosity Nano board?
Is that chip smaller or further away?
Ah no then. Although both are available in 48 pin and 64 pin versions (and many other variations), the board is 48 pin and the chip is 64 pin. Of course they are different. I wouldn't be that lucky.
OK, time to see if the jungle website can rush me a breakout board I can use for a 64 pin 0.5mm TQFP.
Anyway, at 1AM I ordered an overpriced breakout board from Mr Bezos, which he promised to bring to my door in the morning.
(how is that "compatible with Arduino", in what conceivable way would you use that with an Arduino? Oh never mind. )
Seems to fit, it's the right pitch, even if the pin numbering and pad ordering is completely wrong as the board is designed for TQFP 100.
Add sockets for all 64 pins and a few decoupling capacitors (and the obligatory power LED).
I just need a programmer.
Luckily, the Curiosity Nano boards can do that, if you cut some of the traces on the back to disconnect the onboard programmer from the original microcontroller.
Not the neatest, but it should work. (It's JTAG, so more wires than the PIC or AVR)
Again, lots of messing around with code generation and bloated frameworks etc.
I also had to reconfigure the Curiosity Nano onboard programmer to suit the different part. I haven't needed to do that on the AVR versions; I built a programmer using a Curiosity Nano that I have used for a variety of parts.
The ATSAM one wasn't having any of it, so I had to change it. I found pyedebuggerconfig which allowed me to change the target chip type (and host device voltage limit) and I was away, it was talking to the chip.
Great.
Now for the code.
Well, that's helpful. Thank you Atmel Start.
I managed to get some code built in MPLAB X, which also required a bit of fiddling to get it to switch to the 48MHz oscillator, but I got there, and here is the fastest I managed to get it to twiddle IO pins.....
8MHz.
Bugger.
I don't know much about ARM assembly, but the compiler got it down to a single instruction for each bit toggle.
I can only assume each on/off pulse is taking six clock cycles (48MHz divided by 8MHz), so about three per instruction, and I couldn't find any obvious alternatives that would be faster.
So after all that, it is also slower than the chip I started with.
(even the 64MHz model would end up 10 and a bit MHz, still slower than the AVRs)
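Working back from the scope reading: 8MHz pulses from a 48MHz clock means six clock cycles per complete on/off pulse, so about three per pin write if each write is one instruction. A short sketch of that, with both function names invented for the example, and assuming the cost per pulse would stay the same on the faster part:

```c
#include <stdint.h>

/* Clock cycles consumed by one complete on/off pulse,
   worked back from a measured pulse frequency. */
uint32_t cycles_per_pulse(uint32_t cpu_hz, uint32_t measured_pulse_hz)
{
    return cpu_hz / measured_pulse_hz;
}

/* Pulse rate to expect at a different clock, assuming the same
   number of cycles per pulse (an assumption, not a measurement). */
uint32_t projected_pulse_hz(uint32_t cpu_hz, uint32_t cycles)
{
    return cpu_hz / cycles;
}
```

cycles_per_pulse(48000000, 8000000) is 6, and projected_pulse_hz(64000000, 6) comes out at roughly 10.67MHz, which is the "10 and a bit" figure and still short of the 12MHz from the AVR.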
Any other options?
I have previously looked at some different ARM chips, including the STM32F405. Those run at 168MHz, so should be fast enough. However, that is a whole new framework and toolchain to learn from a different manufacturer and probably a step too far.
Conclusion
None of these chips are fast enough to do what I need to do in the time available.
A frustrating but necessary exercise in due diligence.
This project, in this form, is not viable with the parts I have looked at.
I think I have to give up on this project.
Unless something changes or a new part comes along that will do the job (or I find there is one that I have somehow overlooked).
But, on the positive side, I can clear my bench and start on something new in the morning.
Post Script
Whilst putting this post together, I had a look for photos of the board I first tried the SAMC20 with.
I also found I had edited some of the photos, so I must have written a blog post about it, but this was in the dark ages of 2021 when the public blog was offline and I was writing only for Patreon.
Thank you to the Patrons that stayed with me through that and kept me (just about) sane.
I think this was the main photo for that article - I don't think it went very well.
In that post, I ended with a photo of a note I left myself.
If only
1) I had remembered that
or
2) I had found the box.
Oh well.
Those who do not learn from history are doomed to repeat it.....
Post-Post Script
Since posting this on Patreon last month, I have managed to get the project working, not the ideal version I was hoping for, but still quite nice. It's a bit messy, hopefully the production version will be a bit neater. (ignore the board, that was just an old board used as a base to hold it all together.)
Adverts
My Tindie store is still open, the Christmas deadline has passed for everywhere other than the UK (if you are very quick).
Although that doesn't seem to be a problem the way things have been recently.
Patreon
You can support me via Patreon, and get access to advance previews of development logs on new projects like the Mini PET II and Mini VIC and other behind the scenes updates. This also includes access to my Patreon only Discord server for even more regular updates.