Tynemouth Software: AVR Command Line Hello World in C

I have planned to write this sort of post several times, but for whatever reason it has never worked out. I would take all the screenshots and code saves etc. and by the time I came to write it all up, I would have lost them, or forgotten what I took them for.

This time I wrote it up as I was going, so I finally get to complete it.

I have found I don't get along with the many and varied development environments, frameworks, plugins etc., and I have ranted about them ad nauseam in the past.

A lot of the time, I just want something nice and simple. Hello World, a flashing LED etc. It's not always easy to get that without hundreds or even thousands of lines of irrelevant code being added to your project.

Here I present the minimalistic approach I take when I want what I consider to be the easy option. (I appreciate that my idea of "easy" and "painless" is very different to others', but this is what works for me.)

I am going back to plain text files and command line compiles, and a simple flashing LED.

Development Board

I am using the AVR64DD32 Curiosity Nano board. This is one of many of this type of board, a microcontroller with its pins connected to two rows of 0.1" spaced pads, and an integrated programmer / debugger at the end, all powered from a USB connector (microUSB on the older boards, USB C on the newer ones).

You can use these in various ways. The full development environment offers powerful in-system debugging and monitoring, but today I am going to use what they call "drag and drop" programming.

When you plug in the Curiosity Nano, it appears as a composite USB device with many features, including USB serial and a USB storage device.

When you open that drive, you see a few files, mostly information about that board and links to the product page which contains links to example projects on github.

The status.txt file is magically generated and shows info about the current settings. It also works the other way: if you copy a .hex file into that folder, it programs that into the device. Very neat.

The .hex file is cached in the file view, although it doesn't really exist, so you can't read it back. But you can overwrite it with a new version as you go.

If you unplug the device and plug it back in, any hex files will disappear as they were never really there.

Occasionally I get an error saying "Read Only Filing System", but that resolves by unplugging and plugging back in. (I think it happens when I am too quick with copying a new version on there)

Writing the Code

Now all I need to do is generate a .hex file.

I will start with a nice simple flashing LED, written in C.

The development board has an LED wired to pin PF5.

// Blinking LED
#define F_CPU 4000000UL

#include <avr/io.h>
#include <util/delay.h>

int main(void)
{
    PORTF_DIR = (1<<5);
    while(1)
    {
        PORTF_OUTTGL = (1<<5);
        _delay_ms(500);
    }
}

That is about as minimal as you can get.

Building

I am going to make a bash script to build this (you can do a makefile if you find that easier)

I first create a build directory to hold the output files away from the source.

mkdir build

At the start of the build script, I change into that folder and then clear any temporary files.

cd build
rm -f *.o
rm -f *.d
rm -f hello.*

With older chips, support was built into avr-gcc, so the command line would be something like

avr-gcc -Wall -g -Os -mmcu=atmega1284p -o LEDTest.bin LEDtest.c

However, the newer AVR chips are not included, so you need to reference library files. You can install these packs separately, but if you have MPLAB X installed, they are already there.

I found the easiest way to get the appropriate build lines is to copy bits from the build output of one of the development environments. Set up a simple project with just a main.c and look at the build output.

The line I took was as follows. This will vary based on system, paths, versions etc. so don't expect this to work on your system.

/usr/bin/avr-gcc -mmcu=avr64dd32 \
  -I "/opt/microchip/mplabx/v6.25/packs/Microchip/AVR-Dx_DFP/2.7.321/include" \
  -B "/opt/microchip/mplabx/v6.25/packs/Microchip/AVR-Dx_DFP/2.7.321/gcc/dev/avr64dd32" \
  -x c -c -D__AVR64DD32__ \
  -funsigned-char -funsigned-bitfields -O1 \
  -ffunction-sections -fdata-sections -fpack-struct -fshort-enums \
  -Wall -MD -MP -MF main.o.d -MT main.o.d -MT main.o \
  -o main.o ../main.c -DXPRJ_default=default

(ah, it used to be so much simpler in the olden days says the old man yelling at clouds...... I should have that gif on speeddial, ah but then the youth of today probably don't know what speeddial was...... insert second copy of old man yells at clouds gif)

There is a lot going on. I tidied it slightly by having all the object files in the current directory and using ../main.c for the single reference to the source file.

The next step is to link it

/usr/bin/avr-gcc -mmcu=avr64dd32 \
  -B "/opt/microchip/mplabx/v6.25/packs/Microchip/AVR-Dx_DFP/2.7.321/gcc/dev/avr64dd32" \
  -D__AVR64DD32__ -Wl,-Map="hello.map" \
  -o hello.elf main.o -DXPRJ_default=default \
  -Wl,--defsym=__MPLAB_BUILD=1 -Wl,--gc-sections \
  -Wl,--start-group -Wl,-lm -Wl,--end-group

Again, taken from the build output with the paths tidied up. I am not sure how many of those options are actually necessary, but I have left them all in for the moment. Maybe one day I will look at streamlining that?

That generates an .elf file. The final step is to convert the .elf file into the .hex file we need.

/usr/bin/avr-objcopy -O ihex hello.elf hello.hex
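If a copy to the board ever seems to do nothing, it can be worth checking the .hex file itself. The following is a minimal sketch, assuming the standard Intel HEX record layout (colon, length, address, type, data, checksum) and a line with no trailing newline. The function name ihex_record_ok is made up for illustration and is not part of any toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Return 1 if one Intel HEX record (":LLAAAATT<data>CC") is well formed:
 * it starts with ':', the length byte matches the amount of data, and all
 * bytes including the checksum add up to zero modulo 256.
 * Hypothetical helper; expects the line without a trailing newline.
 */
int ihex_record_ok(const char *line)
{
    assert(line != NULL);
    size_t len = strlen(line);

    /* The shortest record is ":LLAAAATTCC", 11 characters. */
    if (len < 11 || line[0] != ':' || (len - 1) % 2 != 0)
        return 0;

    size_t nbytes = (len - 1) / 2;
    unsigned sum = 0;
    unsigned count = 0;

    for (size_t i = 0; i < nbytes; i++) {
        int hi = hex_digit(line[1 + 2 * i]);
        int lo = hex_digit(line[2 + 2 * i]);
        if (hi < 0 || lo < 0)
            return 0;
        unsigned byte = (unsigned)(hi * 16 + lo);
        if (i == 0)
            count = byte;      /* first byte is the data length */
        sum += byte;
    }

    /* 1 length + 2 address + 1 type + data + 1 checksum */
    if (nbytes != count + 5)
        return 0;

    return (sum & 0xFF) == 0;
}
```

The last line of hello.hex should be the end-of-file record, :00000001FF, which passes this check.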

And an optional step generates a listing file.

/usr/bin/avr-objdump -h -S hello.elf > hello.lss

Finally, I clean up the build files I don't need to keep. The full script is as follows:

#!/bin/bash
cd build
rm -f *.o
rm -f *.d
rm -f hello.*

# Compile
/usr/bin/avr-gcc -mmcu=avr64dd32 \
  -I "/opt/microchip/mplabx/v6.25/packs/Microchip/AVR-Dx_DFP/2.7.321/include" \
  -B "/opt/microchip/mplabx/v6.25/packs/Microchip/AVR-Dx_DFP/2.7.321/gcc/dev/avr64dd32" \
  -x c -c -D__AVR64DD32__ \
  -funsigned-char -funsigned-bitfields -O1 \
  -ffunction-sections -fdata-sections -fpack-struct -fshort-enums \
  -Wall -MD -MP -MF main.o.d -MT main.o.d -MT main.o \
  -o main.o ../main.c -DXPRJ_default=default

# Link
/usr/bin/avr-gcc -mmcu=avr64dd32 \
  -B "/opt/microchip/mplabx/v6.25/packs/Microchip/AVR-Dx_DFP/2.7.321/gcc/dev/avr64dd32" \
  -D__AVR64DD32__ -Wl,-Map="hello.map" \
  -o hello.elf main.o -DXPRJ_default=default \
  -Wl,--defsym=__MPLAB_BUILD=1 -Wl,--gc-sections \
  -Wl,--start-group -Wl,-lm -Wl,--end-group

# Convert to .hex and generate the listing
/usr/bin/avr-objcopy -O ihex hello.elf hello.hex
/usr/bin/avr-objdump -h -S hello.elf > hello.lss

# Clean up
rm -f *.o
rm -f *.d

The build.sh file is saved and made executable with

chmod +x build.sh

Now I just run ./build.sh and hopefully get no errors.

./build.sh

Then just copy the .hex file to the mounted folder to program it. Your paths will vary unless your username is also Dave. Hello Dave.

cp build/hello.hex /media/dave/CURIOSITY/

All being well, the LED will start flashing away at about 1 Hz.

Checking your work

You can double check with a logic analyser. The Curiosity Nano comes with a set of pin headers, I don't know what they get up to locked in those red boxes, but they always seem to be stuck together and a little difficult to separate without breaking them.

The holes in the PCB are a little offset, which is a neat feature. That holds the pins in place without soldering. Handy for some quick testing, but I would still solder for anything permanent.

And there we have a square wave, approximately 500ms high, 500ms low. It is slightly longer due to the time taken to toggle the LED and jump back to the start of the loop.

That was a trivial exercise, but a useful one to get started.

Sometimes you just need a sanity check. Can I still get the LED to flash? OK, so the code is actually being updated and is running.

Adding More Functionality

Now it is over to you to add in your code.

If you are just doing "if this input, then do that output" etc., away you go. If you need timers or ADCs or glue logic or events or whatever, you will need to set those up.

Depending on what you are doing, it is sometimes easier to read the datasheet and work out which registers you need to set up for your application. It is often only a few.

But getting started, it can be easier to use things like Atmel Start or MCC Melody (or whatever the next flavour of the month is).

You can use those to create a dummy project and generate a load of code, and then pick and choose the bits you need to add to your simple code.

Atmel Start

I generally prefer Atmel Start as that has a web front end, so you don't need to install or update anything, although unfortunately it has stopped being updated itself. It supports things up to the AVR DA and DB series, but has not been updated to include the DD or Ex series for example.

It has a graphical view for linking things together which is helpful for clocks and lookup tables and events.

The code it generates is a good starting point. It comments out the code for bits you don't need to change (most registers default to 0 at power on, so there is no point in overwriting 0 with 0).

Take the bits you need to change, they show all the defines and most of them have comments.

MCC Melody

The current version is MCC Melody in MPLAB X. This is a Java based platform which I always struggle with. The user interface seems somehow fragile, and it locks up too easily. Like it is a picture of a user interface that checks for mouse clicks when it feels like it.

The code this generates does not use defines, it just gives you the calculated value.

Some modules have the option to comment out the = 0 lines, but not all do, so by default you get a load of unnecessary code.

I am also not a fan of the way they remove the leading zeroes. Instead of 0x00 they use 0x0, which just looks wrong to me. The same with 16 bit numbers, 0x02ff is written as 0x2ff.

However you choose to generate it, you can add the bits you need to your code.

Most of these peripherals do their own thing, so once you have them set up the processor is free to do other things, so I often just leave it looping around flashing the LED. It's reassuring to see the heartbeat. I also occasionally change the period of the timer to make sure it has been updated.

A Bit of Assembler?

I was going to stick with C for this, but there is one bit which always needs assembler.

Unlike previous Atmel chips, where the clock settings are programmed by fuses, the Microchip chips always start with the internal oscillator enabled. That can be quite useful if you accidentally set the fuses to use an external crystal and you don't have one connected (ask me how I know).

The default clock on these devices is the internal high frequency oscillator running at 4MHz.

The line at the start of the main.c file tells the delay function about that, so it knows how many cycles it will need to waste to get to the required 500ms.

#define F_CPU 4000000UL
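To put a number on that, here is a minimal sketch of the arithmetic, written as ordinary C you can run on a PC. The function name cycles_for_delay is made up for illustration. It is not how _delay_ms is implemented, it just shows how many cycles have to be burned for a given delay at a given clock.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of CPU cycles that elapse in `ms` milliseconds at `f_cpu` Hz.
 * Hypothetical helper for illustration only.
 */
uint64_t cycles_for_delay(uint32_t f_cpu, uint32_t ms)
{
    assert(f_cpu > 0);
    return ((uint64_t)f_cpu * ms) / 1000u;
}
```

At 4MHz, a 500ms delay works out as 2,000,000 cycles. The delay code has no way of measuring the clock, it just trusts whatever F_CPU says.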

To change the clock speed, you need to write to a register. However, it is not as easy as a simple assignment as that register is protected, so you need a wrapper function to unlock it first.

The register is only unlocked for a few cycles, so this needs to be written in assembler, as C compilers have a habit of moving things around to optimise the code, so the order of instructions can change.

I am surprised this is not a library function, like _delay_ms as used above, but it does not seem to be.

The code generated by both Atmel Start and Microchip MCC has a lot going on to deal with lots of different conditionals and various macros to deal with formatting it as a C callable assembler function.

I usually distil that down to a much simpler function.

#include <avr/io.h>

.global protected_write_io
.section .text_protected_write_io, "ax", @progbits
.type protected_write_io, @function

protected_write_io:
    movw r30, r24   // Load addr into Z
    out CCP, r22    // Start CCP handshake
    st Z, r20       // Write value to I/O register
    ret             // Return to caller

.size protected_write_io, . - protected_write_io

What it is doing is setting the Z register to the address of the register to be written (Z is a 16 bit register made out of two 8 bit registers, r30 and r31).

Once the Z register is preloaded with the address at r24 and r25, the magic unlock code is written to the CCP (Configuration Change Protection) port. That unlocks the protected registers for a few cycles and then the value in r20 is written to the register pointed to by Z.

The registers are sort of like the zero page in a 6502, so in this case ST Z, r20 is a bit like a zero page indirect LDA, r20 STA (r30,X).

That does mean I have a separate protected_io.s file and also a header file with the function definition for the generic function, and a second function wrapper for the more specific version. Which is not ideal, it just means extra files and some extra lines in the build script.

extern void protected_write_io(void *addr, uint8_t magic, uint8_t value);

static inline void ccp_write_io(void *addr, uint8_t value)
{
    protected_write_io(addr, CCP_IOREG_gc, value);
}

Then you just need the actual call in your code to change the value to 16MHz or 24MHz etc as required.

ccp_write_io((void *)&(CLKCTRL.OSCHFCTRLA), CLKCTRL_FRQSEL_24M_gc);

N.B. watch out for CLKCTRL_FRQSEL_24M_gc, it seems to be CLKCTRL_FREQSEL_24M_gc in the code generated by Atmel Start but the current libraries seem to have lost the E in FREQ.

Inline Assembler

I have been meaning to look at rewriting that with inline assembly. This seems to be a good time.

static void ccp_write_io(void *addr, uint8_t value)
{
    uint8_t magic = CCP_IOREG_gc;
    asm volatile (
        "out %0, %1 \n\t"
        "st %a2, %3 \n\t"
        :
        : "I" (_SFR_IO_ADDR(CCP)), "r" (magic), "z" (addr), "r" (value)
    );
}

I have made it a single function. It took me a while to work out the best way to use the inline assembler syntax. It's a little convoluted, but you write the assembly in a similar style to printf statements, with %0 etc. placeholders for variables.

I would point you at the official documentation, but for some reason all the code on that page is missing.

https://onlinedocs.microchip.com/oxy/GUID-317042D4-BCCE-4065-BB05-AC4312DBC2C4-en-US-2/GUID-E152F8C1-EEE2-4A9D-A728-568E1B02F740.html

See this version instead which still has the code intact.

https://www.nongnu.org/avr-libc/user-manual/inline_asm.html

The four parameters are passed as %0 through to %3, and those refer to the items that appear after the colon (the values after the string in a printf statement). The first section is for return values, in this case empty. The second is for parameters. There is also a third for any registers you need to tell the compiler that you have used, which I am not using here as I am only using the ones already in the parameter list.

The type is in quotes, "I" is a 6-bit positive integer constant (as is required by the out statement). "r" is a register, so this means that the variable magic is copied into a register, and the name of that register is inserted into the code in place of the %1. The compiler chooses an appropriate register for you.

"z" tells the compiler not to choose one, and forces it to use the Z register (r30 and r31 as previously mentioned). %2 would have been replaced with the name of the first register, r30, but %a2 gives the Z as required for the st Z.

Looking at the code which is generated (in the .lss file)

ldi r24, 0xD8
ldi r25, 0x24
ldi r30, 0x68
ldi r31, 0x00
out 0x34, r24
st Z, r25

The first four lines are loading the registers and then the last two lines are the unlock and the register write.
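The constants in that listing can be decoded by hand. Here is a small sketch that does the same on a PC, assuming the frequency select field sits in bits 5:2 of OSCHFCTRLA with 24MHz as code 0x9 (worth checking against the datasheet for your part). The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the bytes loaded into r31 (high) and r30 (low) into the address held in Z. */
uint16_t z_address(uint8_t r31_high, uint8_t r30_low)
{
    return (uint16_t)(((uint16_t)r31_high << 8) | r30_low);
}

/* Shift a frequency select code into bits 5:2 (assumed field position). */
uint8_t frqsel_field(uint8_t code)
{
    assert(code < 16);
    return (uint8_t)(code << 2);
}
```

So r31:r30 = 0x00:0x68 is address 0x0068, the register being written, and 0x24 in r25 is code 0x9 shifted up two bits. The 0xD8 in r24 is the unlock value and 0x34 is the CCP port.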

That is a few lines shorter than the one generated by the code in the separate file, as there the parameters are passed in different registers and have to be copied into r30 and r31. Here I can specify that the values are placed directly into the registers I need, which saves the unnecessary step.

You can of course just cut and paste the code above rather than having to understand it. That is what I intend to do next time.

If you upload that now, the LED will flash faster as the clock is now running faster.
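How much faster can be worked out on a PC. This is a minimal sketch, assuming the delay loop simply burns the number of cycles that F_CPU tells it to. The function name actual_delay_us is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Real duration, in microseconds, of a delay that was compiled for
 * `assumed_hz` but is actually running at `real_hz`.
 * Hypothetical helper for illustration only.
 */
uint32_t actual_delay_us(uint32_t assumed_hz, uint32_t real_hz, uint32_t requested_ms)
{
    assert(real_hz > 0);
    /* Cycles the delay loop was built to burn. */
    uint64_t cycles = ((uint64_t)assumed_hz * requested_ms) / 1000u;
    /* How long those cycles really take at the true clock. */
    return (uint32_t)((cycles * 1000000u) / real_hz);
}
```

With F_CPU still at 4MHz and the chip running at 24MHz, the 500ms delay comes out at about 83ms, so the LED flashes at about 6Hz rather than 1Hz.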

In order to get it back to the correct time, you also need to change the #define F_CPU at the start.

#define F_CPU 24000000UL

Rebuild and re-copy and it should be back to 1Hz.

(this is a slightly earlier version where I hard coded the CCP port before I worked out how to pass that properly, and also went for 16MHz rather than 24MHz)

More Assembler

This next section also goes into the assembler weeds a bit, but I think it is important to see how seemingly insignificant changes to the C code can make big differences to the size and speed of the assembly generated.

Normally when you use the code builders or something like the Arduino, you get nice wrapper functions, e.g.

LED_Toggle();

That isolates you from a lot of the underlying implementation, which in many cases is what you want.

In the early days of Arduino, I remember the IO instructions used to generate a load of code, I think they are a bit more streamlined these days.

The AVR ones usually map down to single instructions, in this case wrapped in a do-while(0), which makes the macro behave as a single statement wherever it is used.

#define LED_Toggle() do { PORTF_OUTTGL = 0x20; } while(0)
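A minimal sketch of what the do-while(0) wrapper buys you in an unbraced if/else, runnable on a PC. The variable sketch_outtgl is a made up stand-in for the port register, and toggle_or_clear is a hypothetical function, neither comes from the generated code.

```c
#include <assert.h>
#include <stdint.h>

/* Made up stand-in for the output state, so this runs without the hardware. */
uint8_t sketch_outtgl = 0;

/* Same shape as the generated macro, but toggling the stand-in variable. */
#define LED_Toggle() do { sketch_outtgl ^= 0x20; } while (0)

/*
 * The macro call followed by a semicolon is exactly one statement,
 * so it slots into an if/else without braces.
 */
uint8_t toggle_or_clear(int toggle)
{
    if (toggle)
        LED_Toggle();
    else
        sketch_outtgl = 0;
    return sketch_outtgl;
}
```

With a bare { ... } block as the macro body instead, the semicolon after LED_Toggle() would end the if statement and the else would not compile.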

You can duplicate things like that if you find it helpful, but all you need is the important bit.

PORTF_OUTTGL = (1<<5);

I have used (1<<5) as it is pin PF5, but the autogenerated code just goes for 0x20. Up to you, you could also use 0b00100000, or 32 (or even 040 if you are perverse). Whatever works for you, they all compile down to the same thing.

ldi r24, 0x20
sts 0x04A7, r24
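If you want to convince yourself that the different spellings really are the same value, here is a quick sketch you can compile on a PC. The function name pf5_masks_agree is made up for illustration, and the 0b binary form relies on a GCC extension in C versions before C23.

```c
#include <assert.h>
#include <stdint.h>

/* Checked at compile time: the shift and the hex constant are the same number. */
static_assert((1 << 5) == 0x20, "PF5 mask should be 0x20");

/* Return 1 if every spelling of the PF5 bit mask has the same value. */
int pf5_masks_agree(void)
{
    const uint8_t shifted = (1 << 5);
    const uint8_t hex     = 0x20;
    const uint8_t binary  = 0b00100000; /* GCC extension before C23 */
    const uint8_t decimal = 32;
    const uint8_t octal   = 040;

    return shifted == hex && hex == binary &&
           binary == decimal && decimal == octal;
}
```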

Some of the code uses a dot rather than an underscore. They are pretty much interchangeable, but subtly different in the code which is generated.

PORTF.OUTTGL = (1<<5);

This generates a less efficient implementation using a redirector. That can be more efficient if you are running a series of this sort of instruction, but is worse by a few lines for a one off.

ldi r30, 0xA0
ldi r31, 0x04
ldi r24, 0x20
std Z+7, r24
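The Z+7 in that listing is just the position of OUTTGL within the port's block of registers. Here is a sketch of that on a PC, using a made up struct (port_regs_sketch_t) that mirrors what I understand to be the first eight PORT registers. It is not the real header definition, and the base address is the one loaded into r30 and r31 above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made up mirror of the first eight PORT registers, one byte each. */
typedef struct {
    uint8_t DIR, DIRSET, DIRCLR, DIRTGL;
    uint8_t OUT, OUTSET, OUTCLR, OUTTGL;
} port_regs_sketch_t;

/* Base address of PORTF, taken from ldi r30, 0xA0 / ldi r31, 0x04. */
#define PORTF_BASE_SKETCH 0x04A0u

/* Address of OUTTGL: the base plus its offset within the struct. */
uint16_t outtgl_address(void)
{
    return (uint16_t)(PORTF_BASE_SKETCH + offsetof(port_regs_sketch_t, OUTTGL));
}
```

The offset comes out as 7 and the address as 0x04A7, which is the same address the underscore version writes to directly with sts.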

With some registers, you can also access shortcut "virtual ports" which produce even more efficient code.

VPORTF_DIR = (1<<5);

The code is 1 word and 1 cycle shorter. That can make a difference when you are doing low level pin toggling. Especially if you preload registers with the things you want to write, your I/O calls can come down to a single cycle out instruction.

ldi r24, 0x20
out 0x14, r24

If that isn't enough, you can also use the "set" and "clear" versions which will only change certain bits. Useful if you want to turn on one output without knowing what the rest are set as, or having to do a read, modify, write sequence.

PORTF_DIR |= (1<<5);

Could be replaced with the more efficient version.

PORTF_DIRSET = (1<<5);

One of the purposes of a simple test build like this is you can try out the different options in isolation, and test them out and also look at the code generated. (well, I say simple, I suppose I should prefix that with "relatively")

How did it go?

In this case, I had been using the MCC generated code as a starting point, but it just wasn't working. Going back to the minimal code and adding a few lines to set the appropriate registers, I was able to see the correct code was being generated and nothing else was interfering with it. That actually turned out to be a silicon bug in the chip I was using, and when I changed to a different one, it started working as designed.

There was a longer, rantier version of that conclusion in the original version of this on Patreon as I had wasted a lot of time on that and another problem, but I will keep things civil on here.

Adverts

My Tindie store contains all sorts of kits, test gear and upgrades for the ZX80, ZX81, Jupiter ACE, and Commodore PET.

https://www.tindie.com/stores/tynemouth/

Patreon

You can support me via Patreon, and get access to advance previews of blog posts, and progress updates on new projects like the Mini PET II and Mini VIC and other behind the scenes updates. This also includes access to my Patreon only Discord server for even more regular updates.

https://www.patreon.com/tynemouthsoftware

Sunday, 22 February 2026
