Post by John Rumm
Only tenuous links to DIY for this one, but anyone with an interest
in electronics, high power broadcast kit and software may find this
https://wiki.diyfaq.org.uk/index.php/Hardware_and_software_fault_finding_-_an_interesting_bug
Let me know if it needs more explanatory stuff to de-geek the
technical bits...
Couldn't you have disabled interrupts during the critical 25us window?
I think the answer to that is "Yes, but"....
By the time I had figured out there was an interaction with interrupt
handling, I had actually found the cause of the problem anyway, so
there was no point in doing a workaround at that point.
Disabling interrupts may have had other knock-on effects. Plus the
whole "leaving a hardware design error in there" bit is sure to bite
someone sooner or later.
I suppose, as I sit in both camps, that given a few lines of extra code
versus scalpel, soldering iron and wire, the few lines of code win every
time.
With an understanding of the actual failure mode, the safest bodge
would probably have been setting up a reserved area of address space in
ROM that was not available for code, located on the address space
window that shared a numeric equivalence with the IO device's location
in IO address space. That would mean that during an interrupt, you
would then *know* you would never be interrupting code running at any
of the "danger" addresses.
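
To make that concrete, here is a minimal host-side sketch in Python of the sort of check you could run against a link map. Everything in it is an assumption for illustration: the IO device address, the section names and sizes, and the function names are all made up and do not come from the real board.

```python
# Hypothetical numbers: an IO device at IO addresses 0x0300-0x030F.
IO_DEVICE_BASE = 0x0300
IO_DEVICE_SIZE = 0x10


def danger_window(io_base, io_size):
    """Memory addresses numerically equal to the device's IO addresses."""
    return range(io_base, io_base + io_size)


def code_in_danger(sections, window):
    """Return the names of code sections that overlap the danger window.

    Each section is (name, start_address, length_in_bytes).
    """
    hits = []
    for name, start, length in sections:
        end = start + length  # half-open: first address past the section
        if start < window.stop and window.start < end:
            hits.append(name)
    return hits


DANGER = danger_window(IO_DEVICE_BASE, IO_DEVICE_SIZE)

# Hypothetical link map, not taken from any real build.
link_map = [
    ("vectors", 0x0000, 0x0100),
    ("main_loop", 0x0100, 0x0208),  # runs up to 0x0308, inside the window
    ("isr", 0x0400, 0x0080),
]

offenders = code_in_danger(link_map, DANGER)
print("sections to relocate:", offenders)
```

Any section reported here would need moving, so that the window holds no executable code.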
Post by Fredxx
Obviously a risk analysis of the knock-on effect would be required.
That is where it gets messy - you would probably need to identify all
peripherals mapped into IO space, look at how "tight" the address
decoding logic was for each (if you are not short of IO space, then
it is not uncommon to only partially decode the peripheral so that it
gets a bigger space than it really needs - but that also tends to mean
that its control registers are then "echoed" several times at different
locations), and block those out as well. It might mean you then run out
of available ROM space or available contiguous ROM space for the code!
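
A rough Python sketch of that bookkeeping, under stated assumptions: a 16 bit IO space, two invented peripherals with invented decode masks, and a hypothetical 32K ROM at 0x0000. The helper names are mine. It lists every address at which a partially decoded peripheral responds, blocks the matching ROM addresses, and shows what is left for code.

```python
def echoes(base, decoded_mask, space_bits=16):
    """All IO addresses that select a device.

    An address selects the device if it matches `base` on every bit the
    decoder actually examines (the 1 bits of `decoded_mask`).
    """
    want = base & decoded_mask
    return [a for a in range(1 << space_bits) if (a & decoded_mask) == want]


def to_ranges(addresses):
    """Collapse a sorted address list into half-open (start, end) runs."""
    runs = []
    for a in addresses:
        if runs and runs[-1][1] == a:
            runs[-1] = (runs[-1][0], a + 1)
        else:
            runs.append((a, a + 1))
    return runs


def free_regions(rom_start, rom_end, blocked):
    """ROM regions still usable for code once blocked runs are removed."""
    free, cursor = [], rom_start
    for start, end in sorted(blocked):
        start, end = max(start, rom_start), min(end, rom_end)
        if start >= end:
            continue  # blocked run lies wholly outside the ROM
        if start > cursor:
            free.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < rom_end:
        free.append((cursor, rom_end))
    return free


# Hypothetical peripherals: (base address, address bits the decoder checks).
peripherals = [
    (0x0300, 0xFF00),  # only A15..A8 decoded: responds at 0x0300-0x03FF
    (0x1000, 0xF000),  # only A15..A12 decoded: responds at 0x1000-0x1FFF
]

hit = sorted({a for base, mask in peripherals for a in echoes(base, mask)})
blocked = to_ranges(hit)
free = free_regions(0x0000, 0x8000, blocked)
largest = max(end - start for start, end in free)

for start, end in free:
    print(f"free ROM: {start:#06x}-{end - 1:#06x}")
print(f"largest contiguous block: {largest:#x} bytes")
```

With looser decoding the blocked runs grow quickly, which is exactly how you end up short of contiguous ROM.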
Post by John Rumm
(had the new CPU board hardware already been in service in multiple
customer sites round the world, then you might argue that a software
only "patch" might be preferable, but the software could only be
updated by a physical EPROM swap - so that would mean an engineer
visit anyway, at which point the mod wire patch would be easy enough
to do)
Software updates are always easier to sell than "we made a mistake and
need to hack away at your board!"
I think in real life it would be easier to just swap out the whole PCB
for one with the latest software and all hardware mods in place rather
than risk on site mods or chip changes (especially with the production
boards, where the ROM might be soldered in and the board might have a
conformal coating).
Still, fortunately, this was all during hardware / software integration
- one of the points of which is not only to get the thing working, but
also to fix stuff and remove as many unwanted "features" as possible
*before* it gets into production or is delivered to a customer.
Post by Fredxx
It's not uncommon to disable interrupts to allow for atomic-like code
executions.
Indeed - although that is not usually because your hardware is prone to
writing to random IO addresses :-)
--
Cheers,
John.
/=================================================================\
| Internode Ltd - http://www.internode.co.uk |
|-----------------------------------------------------------------|
| John Rumm - john(at)internode(dot)co(dot)uk |
\=================================================================/