Wednesday, May 30, 2012

Arduino Mega2560 - Wrapping up the Bootloader Problem

In various posts (here, here, here) I've whined about and discussed the problems with the bootloader implemented on the Mega2560 board.  This is a great little board with enough memory and IO ports to do a ton of things around the house.  Someday, I hope to implement a swimming pool controller with one of these.  However, the bootloader has two critical problems.  First, if you have three exclamation points in a row in the code you're loading, the loader jumps into an on-board monitor and the upload doesn't complete.  This is easy to find and correct IF (big if here) you know about it and the sequence !!! isn't hiding in data or code.  The first time I encountered this problem I had a data table that had 0x21, 0x21, 0x21 in it.  It took a hex dump and a search to find the problem.  The second problem is that the watchdog timer would hang up in a loop.  The symptom here is that the watchdog times out, and then fires again before you can get to the first instruction of your sketch, meaning you have to power down the board and get a new sketch in place before the timer expires.  If you have a really short watchdog timer, this is hard.
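For anyone who wants to automate the hex dump and search described above, here is a minimal Python sketch. It assumes the compiled sketch is available as an Intel HEX file (the format the Arduino IDE produces), and it does not verify record checksums. The function names parse_intel_hex and find_triple_bang are made up for this illustration; they are not part of any Arduino tool.

```python
def parse_intel_hex(text):
    """Return {absolute_address: byte} for the data records in Intel HEX text.

    Checksums are NOT verified; this is only meant for searching.
    """
    memory = {}
    base = 0  # upper address bits, set by type 02 / 04 records
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(":"):
            continue
        raw = bytes.fromhex(line[1:])
        count = raw[0]
        offset = (raw[1] << 8) | raw[2]
        rectype = raw[3]
        data = raw[4:4 + count]
        if rectype == 0x00:      # data record
            for i, value in enumerate(data):
                memory[base + offset + i] = value
        elif rectype == 0x01:    # end of file
            break
        elif rectype == 0x02:    # extended segment address
            base = ((data[0] << 8) | data[1]) << 4
        elif rectype == 0x04:    # extended linear address
            base = ((data[0] << 8) | data[1]) << 16
    return memory


def find_triple_bang(memory):
    """Return sorted addresses where three consecutive bytes are 0x21 ('!')."""
    return sorted(
        addr for addr in memory
        if memory.get(addr) == 0x21
        and memory.get(addr + 1) == 0x21
        and memory.get(addr + 2) == 0x21
    )
```

To use it, read the .hex file into a string, pass it to parse_intel_hex, and hand the result to find_triple_bang. Any address it reports is a place where the old bootloader could drop into its monitor. Because the bytes are keyed by address, a run of 0x21 that straddles two records is still found.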

I came up with a workaround for the watchdog timer problem and just tried to avoid the !!! problem while waiting impatiently for the Arduino developers to fix the problems.  Every week or two, I posted to the developers' mailing list my continuing concern about this problem.  I even contacted the author of the bootloader for help.  Eventually (last month), after almost a year of whining, a fixed bootloader was made available for the board.  I tested it, it worked, and now I have it running in my house controller.  The new bootloader isn't part of any official release and isn't burned into the boards coming out of production, but one can get it and put it on the board to overcome these problems.

Notice that there isn't any automatic way to get the fix in place; you have to get the code and burn it on the board yourself.  Or maybe you have a friend that can do it for you.  I've used both the LadyAda USBTinyISP and the Atmel AVRISP MkII to program the new bootloader.  LadyAda's device throws an error when you use it because it isn't perfect for loading into large memory spaces.  However, if you just ignore the error and get on with it, it will work.  The error looks something like:

avrdude: verification error, first mismatch at byte 0x3e000
         0x0d != 0xff
avrdude: verification error; content mismatch

Of course, there are lots of other errors possible, but my experience is that the one above can be ignored.  When you use the AVRISP MkII with the Arduino IDE, you have to use the drivers that come with the IDE; don't load the drivers that you get with the programmer.  Yes, I made all these mistakes getting the board to work for me.  There is also the possibility of using another Arduino as an ISP.  You'll have to hunt for the proper code to do it though; I didn't use this method.  Take a look at the Arduino Forum at arduino.cc to find more information.

The bootloader is available as a hex file in the Arduino repository at: https://github.com/arduino/Arduino-stk500v2-bootloader/tree/master/goodHexFiles  Just put this in the Arduino bootloader directory, rename it properly and load it onto the board.  Sounds easy, right?   ... Right?

So, now we have this wonderful little board containing tons of IO pins, three serial ports, and plenty of memory to work with.  It can even be made to recover on its own with some watchdog programming.

Tuesday, May 22, 2012

Let's Talk About Smoke Detectors a Bit

I've been having a continuing problem with smoke detectors.  My house is wired for them and each one will connect to the others such that an event in one room will set off the entire house.  That's nice as a protection feature, but not when they false alarm.  The experts call this a 'nuisance alarm'.  Nuisance is right; at 2:00 AM when you get a fire alarm, the dogs shoot outside and hide and I have to wander the house trying to figure out which one is messing up.  Then after ripping the offending device out of the ceiling and dropping it on something horizontal nearby, I have to go round up the dogs and convince them to come back inside.

Sure, I've heard all the arguments about safety.  How a fire that starts in one room can trap you in the house and the earlier it's detected, the safer one is.  Fine, but don't tell me that at 2:00 AM while I'm wandering around the house trying to see that little red light on the smoke detector.  It's easy to check for a fire, just sniff or check out the room in the dark.  Never had a fire, but have had a few hundred false alarms.  Clearly, I'm not understanding something here.

In order to pass the final inspection the contractor put in ionization smoke alarms with battery backup.  All the alarms have power provided, and have a separate wire running such that they can communicate with each other.  Nice setup if the devices actually worked as advertised.  Within the first week, I disconnected the one in the kitchen.  Seems every time I cooked, it went off.  About a month later, I disconnected the one in my den.  Seems it went off every time I turned on the ceiling fan.  After a year, I disconnected the one in the master bedroom.  It went off with the ceiling fan also.  Eventually, I only had one hooked up; it was in an unused bedroom.  Over time I would reconsider and hook them back up (new batteries of course) and start the entire process of getting disgusted with them and disconnecting each one in turn.

There was a turning point last year when I decided to use the evaporative cooler in my garage to cool the house during part of the year.  I turn on the cooler and open the door from the house to the garage and allow the air to flow through the house.  I can control which rooms are cooled by opening or closing doors such that the space humans are in is comfortable.  This makes it nice since I can have open doors and enjoy the outdoors without getting too hot.  However, the freaking smoke alarms reacted to the increased humidity and sounded off.  In disgust, I replaced every one of them with a newer model that didn't have a battery.  Those batteries never last as long as they say and chasing down a battery failure in the middle of the night is as big a pain as a nuisance alarm.

This lasted about a week and I had to disconnect the one in the kitchen again; then the den, then the master bedroom.  Over time, I had to disconnect a number of them to get a night's sleep.  Clearly, I wasn't doing something right so I talked to my neighbors.  Without exception, they had disconnected most of their house's smoke detectors.  One guy actually modified his to have the little light on, but broke the sounder.  Seems his wife wanted the protection of the detector, but complained about it sounding off in the middle of the night.  He saw the modification as his best solution to his particular problem (as in wife).

OK, so did some research on the detectors themselves.  There are two extremely common forms of detectors, ionization and optical.  The ionization ones react quickly to smoke while the optical ones take a bit longer.  The ionization ones, being more sensitive, tend to nuisance alarm a little more often.  Fine, I'll try a couple of the optical.  After installing them and having success, I relaxed.  Then came evaporative cooler time; nuisance alarms started all over again.  Back to the drawing board, I tried various experiments on both kinds and just couldn't figure out why they were firing off in the middle of the night and being nice during the day.  Never did.  However when I read many, many complaints similar to mine, I decided that, not only was I not the only person in the world to hate these things, I was only one of a silent majority.  We're silent because there just isn't a good solution out there.

See, these devices are subject to a ton of possibilities.  Increased humidity such as evaporative cooling or a bathroom shower will cause both the optical and ionization devices to fire.  A random spider will set off an optical device.  Dust will cause both kinds to go off.  The list just goes on and on.  People on forums and in reviews love to say, vacuum it out every three months to be sure they stay clean.  Ever tried to drag a Dyson up a 10 foot ladder to vacuum out a smoke detector?  I used to go up and take the thing down, carry it out to the garage where the compressor is and blow the darn thing out every three months.  Didn't do a bit of good.  Of you folks that have smoke detectors in or near the kitchen, how many of them are still connected?  There's a bunch of folk out there that have given up on the ones near the kitchen since they seem to fire every time there is company over.  Cooking and smoke detectors don't mix.

But, there is hope.  There is a device out there that may help my particular problem, a heat detector.  These things don't detect smoke, they detect changes in temperature.  Since heat rises, they will fire when the temperature hits 135F or rises 10 degrees in a short period.  Sure, proper ventilation will keep the temperature constant in a room and delay these things going off, but I will get to sleep and have a lot more protection than from a device sitting on a bedside table because it sounded off for no reason two nights in a row.
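The two trip conditions described above (a fixed 135F limit, or a 10 degree rise in a short period) can be sketched in a few lines of Python. This is only an illustration of the logic, not how any real detector is built. The 60 second window is an assumption standing in for "a short period", and the name heat_alarm is made up for the example.

```python
def heat_alarm(samples, fixed_limit_f=135.0, rise_limit_f=10.0, window_s=60.0):
    """Return True if a list of (time_s, temp_f) readings would trip the alarm.

    samples must be in time order.  window_s is a hypothetical value for
    "a short period"; real detectors define their own rate-of-rise rating.
    """
    for i, (t_now, temp_now) in enumerate(samples):
        # Fixed-temperature trip: the reading itself is at or over the limit.
        if temp_now >= fixed_limit_f:
            return True
        # Rate-of-rise trip: compare against every earlier reading that
        # still falls inside the window.
        for t_old, temp_old in samples[:i]:
            if t_now - t_old <= window_s and temp_now - temp_old >= rise_limit_f:
                return True
    return False
```

For example, readings of 75F, 80F and 86F taken 30 seconds apart trip the rate-of-rise check, while the same temperatures spread five minutes apart do not.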

I'm not going to bore you with the technical details of how the various kinds of devices work and the options available; Google is your friend if you want more information on them.  But, here's my plan.  I'm going to replace the kitchen, den, and master bedroom devices with temperature detectors.  Then let them run a while.  I'll leave either optical or ionization devices in the other areas until they give me problems and decide what to do based on the results.  While I'm at it, I'll alarm the garage with a temperature detector as well since it doesn't have anything at all (local code didn't require it).

Note that since my house is mostly electric, I don't have any concerns about carbon monoxide.  Additionally, since I leave the doors open and fans running most of the year, radon doesn't worry me either.  See, there are advantages to living in the sticks in the desert.

Edit July 8, 2012: Well, I've had a heat detector in place where I was getting the most false alarms for a couple of months now.  No problems at all.  I put it in the kitchen and now I can burn dinner without having to mess with a false alarm.  So, anyone stumbling across this post, here's a solution that can maintain a modicum of safety without dragging you out of bed at 2 AM.

Monday, May 14, 2012

Power Monitor Failure - Part 3

OK, fine, I didn't fix it.  It failed again and this time I watched it for around 30 minutes.  I used an XBee hooked into the USB port of my laptop and watched the data coming in.  Each line from the Power Monitor was broken off at the 40 character point, then a couple of seconds later, the rest of the line came in.  About half the time some message from some other sensor came in between the first 40 characters and the rest of the data.  Changing baud rates didn't seem to help.

I went through the Digi documentation and prowled around their site looking for something related to this kind of problem and all I came up with was various warnings about waiting three character periods for the packet to be transmitted and transmit packets being sent when the serial Tx buffer was full.  Neither of these seemed to be the problem.  The data was cutting off in the middle of a line.  I wanted to try the Serial.write() method, but since it takes days to fail I wasn't too hopeful of getting anything concrete short term.

However, at least according to the Digi documentation, using the API mode would guarantee the entire message would be sent as one unit.  So, I stole code from the Acid Timer and switched the XBee programming to API mode 1.  Yes, API mode 1.  For those of you just starting out with XBee, there is a misconception about the API modes.  In API mode 1, you don't have to escape any character at all.  In API mode 2, you have to escape the XON and XOFF flow control characters (0x11 and 0x13), the escape character itself (0x7D), and the actual 0x7E that is used to start a packet.  This is to allow for software flow control, but I haven't looked into doing that at all.  So, in API mode 1, you can do anything that you can do in API mode 2, except easier.  Trust me on this, I've seen many, many people get confused by this, including the authors of the various how-to guides.
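To make the difference between the two API modes concrete, here is a small Python sketch that wraps a block of frame data in an XBee API frame, with and without escaping. It is not the code from the Acid Timer; the function name build_api_frame is invented for the example, and the frame data is treated as opaque bytes, so nothing here depends on which frame type or XBee series is in use.

```python
START_DELIMITER = 0x7E
ESCAPE = 0x7D
# Bytes that must be escaped in API mode 2: start delimiter, escape, XON, XOFF.
NEEDS_ESCAPE = {0x7E, 0x7D, 0x11, 0x13}


def build_api_frame(frame_data, escaped=False):
    """Wrap frame_data in an API frame.

    escaped=False gives an API mode 1 frame, escaped=True an API mode 2 frame.
    """
    length = len(frame_data)  # length counts unescaped frame data only
    checksum = 0xFF - (sum(frame_data) & 0xFF)
    body = bytes([length >> 8, length & 0xFF]) + bytes(frame_data) + bytes([checksum])
    if escaped:
        out = bytearray()
        for value in body:
            if value in NEEDS_ESCAPE:
                # Escape byte followed by the original byte XOR 0x20.
                out += bytes([ESCAPE, value ^ 0x20])
            else:
                out.append(value)
        body = bytes(out)
    # The start delimiter itself is never escaped.
    return bytes([START_DELIMITER]) + body
```

With frame data 0x23 0x11, mode 1 produces 7E 00 02 23 11 CB, while mode 2 produces 7E 00 02 23 7D 31 CB because the 0x11 (XON) has to be escaped. The mode 1 frame is the same bytes without the extra pass, which is the whole point of the paragraph above.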

At any rate, the device is back in service using API mode 1 and sending data just fine.  It's been a few hours and no problems have come up.  I'll check on it reasonably often over the next couple of weeks to see if I have to do something else.

One other thing that isn't made clear in the documentation or any of the how-to guides I've looked at.  You can mix and match API mode and transparent mode on these devices.  For example the XBee I use to monitor what is going on is configured for transparent mode and it can see the broadcast sentences from every device broadcasting just fine.  The packet headers and such are stripped off and the data is delivered out the serial line.  This is a nice way to see what is going on over your network.

The other parts of this problem are described here and here.

Wednesday, May 9, 2012

Power Monitor - Failure Part 2

Back at this post I briefly described a seeming failure with the power monitor.  I wasn't able to get enough information at the time to truly isolate the cause so I just made a repair that might help to see if the problem would go away.  Today, it happened again.  This time I gathered some data and think I have it figured out.

The power monitor is not failing, the House Controller isn't decoding the XBee data fast enough to handle the asynchronous data coming in.  OK, that's not correct either.  What appears to be happening is that the data isn't getting off the XBee board quickly enough and is getting messed up.  When I set up an XBee and monitored the traffic at 9600 baud, I got data from the various devices, but they were intermixed with each other such that no single line of data was correct.  What was happening was that the House Clock and the Power Monitor were sending so close to each other they were messing each other up.  I changed the baud rate on the XBee I was using to 57600 and each line was distinct and there were no problems any more.
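As a rough feel for what the baud rate change buys, here is a bit of arithmetic in Python. It assumes the usual 8N1 serial framing (10 bits on the wire per character), which is an assumption and not something stated above, and the 60 character line length is just an example figure. It doesn't explain why the lines were getting intermixed; it only shows how long a line ties up the serial port at each speed.

```python
def line_time_ms(n_chars, baud, bits_per_char=10):
    """Time in milliseconds to clock n_chars out of a serial port.

    bits_per_char=10 assumes 8N1 framing: 1 start bit, 8 data bits, 1 stop bit.
    """
    return n_chars * bits_per_char / baud * 1000.0


# Compare the two baud rates for a hypothetical 60 character line.
for baud in (9600, 57600):
    print(baud, "baud:", round(line_time_ms(60, baud), 1), "ms")
```

Under those assumptions, a 60 character line takes about 62.5 ms at 9600 baud and about 10.4 ms at 57600, so each message occupies the serial link for roughly a sixth of the time.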

Best I can tell from the XBee documentation, this isn't supposed to happen.  Collisions can happen, but they would get retried.  Don't have a clue what is actually going on.  However, setting the House Controller baud rate for the XBee port to 57600 eliminated the problem.  Of course, I had to reprogram the XBee for 57600 baud as well.  Not too hard a change, but annoying since it (apparently) shouldn't have happened in the first place.  I also noticed that the firmware version on my XBee was out of date and there have been three updates.  I went through the release notes on the updates and there were a ton of changes to store-forward and comm port communications.  Any of these kinds of things could be causing the problem I'm seeing.  So, I now have a fully updated XBee module in the Controller.

So, as usual, I have changed a number of things and don't really know if the problem is fixed.  I'll just have to wait a couple of weeks to see if it comes back.  I didn't update all the XBees I have running, just the one in the controller since that is the collection point and the device that was having trouble.  As I update the other devices over time, I'll update their firmware as well.

Update: it failed again.  Details here.