Fred wrote: An interesting idea, but one that will only mask bugs IMO. I think if something gets out of place this needs to be obvious and quickly found and fixed.
The practice isn't for the code per se; it's for the processor chip and hardware. The CMOS gates in the processor can change state for myriad reasons other than errant code.
If you are interested in using it to catch bugs, it can be used for that too. The code in the housekeeping section is set up to compare the 'constants' in the housekeeping section with their matching registers. If something doesn't match, it's possible to know which constant didn't match its physical CMOS counterpart. These 'physical errors' can be categorized as critical or non-critical, for instance. Particular critical errors might need to trigger a controlled safe shutdown.
During development you can just log the error, or set it up to trigger an interrupt, or something.
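As a rough illustration of the housekeeping compare described above, here is a minimal sketch in C. The register names, the expected values, and the critical/non-critical flags are all made up for the example, and ordinary variables stand in for real memory-mapped registers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-ins for memory-mapped configuration registers. */
volatile uint8_t port_dir_reg  = 0xF0;
volatile uint8_t timer_cfg_reg = 0x07;

typedef struct {
    volatile uint8_t *reg;  /* register to verify */
    uint8_t expected;       /* the 'constant' it should hold */
    bool critical;          /* assumed severity classification */
} hk_entry;

/* Housekeeping table: one entry per register/constant pair. */
const hk_entry hk_table[] = {
    { &port_dir_reg,  0xF0, true  },
    { &timer_cfg_reg, 0x07, false },
};

#define HK_COUNT (sizeof hk_table / sizeof hk_table[0])

/* Returns the table index of the first mismatch, or -1 if all match. */
int hk_check(void)
{
    for (size_t i = 0; i < HK_COUNT; i++) {
        if (*hk_table[i].reg != hk_table[i].expected) {
            return (int)i;
        }
    }
    return -1;
}
```

The caller can look up hk_table[index].critical to decide between logging the event and starting a controlled shutdown.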
Two other common ruggedizing techniques that have a use here:
'call counting': Set up a register and increment it as the first instruction of each called routine. Then decrement the register as the first instruction after the return to the main code. At certain points in the main loop, check that 'call count' register for the appropriate value. If it's wrong, maybe log the event, or take some fail-safe action, depending on where in the main code the register showed the wrong value.
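A minimal sketch of call counting in C, assuming a plain global variable in place of a dedicated register. The routine names (read_sensors, update_outputs, main_loop_pass) are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Stand-in for the 'call count' register. */
volatile uint8_t call_count = 0;

void read_sensors(void)
{
    call_count++;   /* first instruction of the called routine */
    /* ... real work would go here ... */
}

void update_outputs(void)
{
    call_count++;   /* first instruction of the called routine */
    /* ... real work would go here ... */
}

/* One pass of the main loop. Returns true if the count is balanced
 * at the checkpoint, false if a routine was entered or left abnormally. */
bool main_loop_pass(void)
{
    read_sensors();
    call_count--;   /* first instruction after the return */
    update_outputs();
    call_count--;   /* first instruction after the return */
    return call_count == 0;   /* checkpoint: expected value is 0 */
}
```

If execution ever lands inside a routine without a proper call, the increment has no matching decrement and the checkpoint fails.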
'lost program watchdog timer reset': Loosely described, this means filling each line in the unused portion of the program code space with a call to an endless loop somewhere. That endless loop will trip the watchdog timer and cause a reset. But you can get creative with it too and do other things. For example, capture the state of the output registers, force a return to re-initialize, and then go back to the main loop. That may even happen without the user knowing it.
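The fill step is normally done by the assembler or linker, but the idea can be sketched in C as a tool that pads a program image. TRAP_WORD is a made-up placeholder for whatever instruction encoding means "call the endless loop" on the target, and fill_unused is a hypothetical helper, not part of any real toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding of "call trap_loop"; the real value
 * depends entirely on the target's instruction set. */
#define TRAP_WORD 0xDEADu

/* Fill every word from used_words up to total_words with TRAP_WORD.
 * Returns the number of words filled. */
size_t fill_unused(uint16_t *image, size_t total_words, size_t used_words)
{
    size_t filled = 0;
    for (size_t i = used_words; i < total_words; i++) {
        image[i] = TRAP_WORD;
        filled++;
    }
    return filled;
}
```

With the image padded this way, a program counter that wanders into unused space hits a call to the trap loop instead of whatever erased memory happens to decode to.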
- Jim
Edit: Often some pieces of the ruggedizing code are written in assembly, so the user knows exactly what they're getting.