Friday, March 27, 2009

Software Safety blog reader Michael Barr has a new article, "Bug-killing standards for firmware coding", on Embedded.com, where he discusses "Ten bug-killing rules". Michael also has his own blog.

There is also an interesting discussion going on in the comments section related to the article.

I even added a comment of my own:

Dale Shpak wrote:

" I have debugged millions of lines of code and have encountered the following type of error many times:

while (condition);

{

/* Execute conditional code */

}"

If you put this in your .emacs file:

(global-cwarn-mode 1)

Errors such as "if(condition);" and "while(condition);", as well as "if( x = 0 )" type errors are highlighted.

No need to use the One True Brace style when you are using the One True Editor... :-)

Also, MISRA 21.1(a)/2004 requires the use of static analysis tools, which would never let an always-executing "conditional" pass.

MISRA doesn't say much about style. It does say braces will always be used. I say that they should clearly show the nesting. Path coverage testing is hard enough without playing "find the matching brace" (EMACS helps out here too).

Saturday, March 14, 2009

On a personal note, yesterday I received my re-certification papers from the American Society for Quality (ASQ). I am now officially certified for three more years as a Certified Software Quality Engineer (CSQE).
The CMMI Product Team has released Technical Report CMU/SEI-2009-TR-001:
"CMMI for Services (CMMI-SVC) is a model that provides guidance to service provider organizations for establishing, managing, and delivering services. The model focuses on service provider processes and integrates bodies of knowledge that are essential for successful service delivery."

Long hours link to dementia risk

Something I really like from the Agile Methodology is the 40 hour week. BBC News is reporting:
Long working hours may raise the risk of mental decline and possibly dementia, research suggests. The Finnish-led study was based on analysis of 2,214 middle-aged British civil servants. It found that those working more than 55 hours a week had poorer mental skills than those who worked a standard working week. The American Journal of Epidemiology study found hard workers had problems with short-term memory and word recall.
"This should say to employers that insisting people work long hours is actually not good for your business" - Professor Cary Cooper, University of Lancaster

Saturday, February 28, 2009

In C are you a Righty or Lefty?

Do you write your code, like almost everyone does, like this (Those are Zeros if you have a funky font):
if( x == 0 ){...}
or do you do it correctly and do it this way?:
if( 0 == x ){...}
Why is the latter the correct way? It prevents you from making this mistake:
if( x = 0 ){...}
"Unless Debugging is an Obsession", put the constants on the left in any conditional test. Also use a lot of parentheses if there is more than one condition in the test; you can never have too many parentheses. When you put the constant on the left, the compiler will refuse to compile the mistake at all, because you cannot assign a value to a constant. Putting the constants on the right may elicit a warning if you are using a good quality compiler, if you're lucky. I've been giving out this advice for years. The responses have been interesting:
"I've never made that mistake. I don't need such crutches." -- AVR GCC List
"It does not read right." -- Well known Compiler Guru, in private email.
What is wrong with reading it as "if Zero is equivalent to X"? Do you want to ship products on time and under budget, or do you want to write code in the way that everyone else does?
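As a minimal sketch of the constant-on-the-left habit, the fragment below uses a hypothetical helper, speed_is_zero(), invented only for this illustration. The point it tries to show is in the comment: with the constant on the left, the slip from "==" to "=" cannot compile.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, named only for this illustration. */
bool speed_is_zero(uint16_t speed)
{
    /* Constant on the left: mistyping "==" as "=" would give
     * "0 = speed", which the compiler rejects because a constant
     * cannot be assigned to. Written the other way round,
     * "speed = 0" compiles and the test is silently always false. */
    if (0U == speed) {
        return true;
    }

    return false;
}
```

The behavior is identical to the usual "speed == 0" form; only the failure mode of the typo changes.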

Wanted: Experienced Embedded System Developer with a Brain

"I am a consultant and I am frequently hired by CEO's and CFO's who are at their whits end with the 'kids' that got hired by the other kids that got the job then decided the lights were brighter and more sparkley someplace else..." --- by FlyingGuy (989135) on SlashDot.org.

That seemed like a good introduction to this real want ad I saw on Craigslist this week. I have all of the experience they are looking for. Would you sign up based on this ad (not that I'm looking right now)?

EE / Embedded Control Hardware / Software Robot Instrumentation

Needed: One damn hot engineer to finish a robotics project for a very established company in East Pittsburgh Area.

This is a full time position but if you are some hot talented Carnegie Mellon University Robotics student we'll consider part time, as long as you perform and deliver ( unlike the previous degreed graduated CMU student.)

This is a robotics project but the robotics are simple. The little robot is designed to carry instrumentation into a tight, hot crack where no instrumentation has gone before.

Personal Requirement:

1. A brain.
2. A watch.
3. A cell phone that you answer.
4. Ability to give up girl friend for being paid professionally.
5. Working with us professionally between the hours of 7am and 7pm, and not the reverse.

Professional Requirements

An excellent understanding and experience with digital circuit design, layout and interfacing. You had darn well better know how to lay out circuit boards and use a hot air rework station to put down SMD if you have to. You need a full understanding of VLSI circuits as well as discrete circuitry. Motor control and instrumentation associated with robotics. Servo Motors, DC Motors, Step motors Motors, Encoders etc..

A phenomenal understanding of the ATMEL AVR type of chipsets and supporting circuits and an excellent command of the C language used for writing code for those chips. You must have a complete mastery of all of the chips features, A/D, I/O, all TX/RX methods, Counter Timers etc..because they are all in use. Reading the articles in Make Magazine do NOT count. Read the first line again; Phenomenal Understanding.

I would hope that you also have a competent ability to write software in a windows environment for the display of the data the robot sends back. Even if its liberty basic / visual basic that is ok, but we'd prefer a full C++ development environment expertise.

You need to have enough understanding of analog electronics to digitize, transmit, store and display the information as well as the use of DC power supplies and supporting instrumentation such as digital storage oscilloscopes. Don't go getting a funny look on your digital experience face if someone asks you about the impedance of your connection.

You must be able to produce and provide documentation. Schematics, illustrations, photo documentation of progress, component lists etc... so they don't have to be extorted from you if you no longer work for us.

You will be signed up with a non-disclosure and confidentiality agreement. You will have a police & background check performed on you as well as drug testing. No criminal history and no history of drug use. Period.

I personally don't care if you are a student, have a BS, MS, or a Ph.D. What we need is ability and capability along with a high desire (even desperation) to work and finish the project. We are looking or talent and I personally was probably doing assembly language programming and building circuits by hand before you were even dumping in your diapers...and I'll be the chief person interviewing you. Come prepared IF you make it to the interview process.

As with any project, there is a point where it ends but of course...what project have you seen that ever ends. A success of a project always moves to improvements and expansion of that project so there is the very very real potential for this to be full time unending employment. Full professional pay, full benefits, vacation time, medical, house, picket fence, 2.5 kids etc.

The work environment is professional in every sense. Nice office, large lab and work area, new Dell Computers for everything, excellent people to work for and just good natured and nice all around. No jag offs trying to make a joke at your expense. We guarantee that.

So you have a choice. The red pill or the blue pill. If you decide the blue pill than please wake up tomorrow and forget all about this. If you decide the red pill then please send back an email with your interest as well as your resume, experience, links to your website with photos /video of accomplishments/project (not your cat) etc...

The USA is basically in a depression and there millions of people out there with extreme talent looking for professional positions so if you want this position you had better make your submittal good.

Thank you, Steve.

Article: http://pittsburgh.craigslist.org/egr/1049407629.html

Steve sums up my view, and the views of many of today's HR departments. Some of the HR blogs indicate that they have turned into babysitting services, to keep the newly degreed young people from moving on when they are hit with the least bit of negativity.

Like FlyingGuy in the introduction, I do my own part time consulting gig. I get called in to clean up the mess left by people with lots of letters after their name.

I once went in to clean up a project that was designed by a committee of people spread all over the world. The unit was large moving equipment; if something went wrong, people might die. It was composed of several different CPU modules communicating on a proprietary bus. Each module's software was written by a different group in a different part of the world.

The operator's requested speed was input in Feet Per Minute. The output to a Variable Frequency Drive was in tenths of a Hertz. The tachometer feedback was in RPM, and to top it off all the internal calculations were done in Radians Per Second.

The first thing I did to get the project back on track was to adopt a standardized variable naming convention that included the units. For example, the Operator Request became operator_request_fpm_u16. You then knew immediately you were dealing with Feet Per Minute, and that it was a 16-bit unsigned variable. After the variable name cleanup many of the bugs became self-documenting: when you saw something like "operator_request_fpm_u16 / vfd_hz_s32" in the code, you knew there was a problem that needed fixing...
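A rough sketch of how that naming convention can carry through to the conversion routines follows. It is not the project's code: every function and parameter name is hypothetical, and the drum radius is an assumed mechanical detail added only so that a linear speed can be turned into an angular one.

```c
#include <stdint.h>

/* Hypothetical names following the convention <meaning>_<unit>_<type>. */

#define PI_F32 (3.14159265f)

/* Tachometer feedback: revolutions per minute -> radians per second. */
float tach_rpm_to_rad_per_s_f32(uint16_t tach_feedback_rpm_u16)
{
    /* One revolution is 2*pi radians; one minute is 60 seconds. */
    return ((float)tach_feedback_rpm_u16 * 2.0f * PI_F32) / 60.0f;
}

/* Operator request: feet per minute -> radians per second.
 * drum_radius_ft_f32 is an assumed parameter for this sketch. */
float operator_fpm_to_rad_per_s_f32(uint16_t operator_request_fpm_u16,
                                    float drum_radius_ft_f32)
{
    /* Linear speed divided by radius gives angular speed (rad/min),
     * then divide by 60 to get rad/s. */
    return ((float)operator_request_fpm_u16 / drum_radius_ft_f32) / 60.0f;
}
```

With the units in both the argument names and the function names, a call that feeds an _rpm_ value into a function expecting _fpm_ stands out in a code inspection.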

What have your experiences with hiring people been? Do you turn away people with experience in favor of people with degrees?

Sunday, February 1, 2009

Embedded Systems A Volatile Business

At the end of "Embedded systems - a volatile business" Jack Ganssle says: "Bob Paddock sent me the link [Embedded System Compilers generate dangerous code], and thereby wrecked my day."

Always happy to wreck a day, by pointing out Software Safety issues. :-)

I believe that Jack and I see eye-to-eye on most embedded system issues, but I have to disagree in one area. In his column "Skip bugging to speed delivery" Jack stated: "We only inspect new code because there just isn't time to pour over a million lines of stuff inherited in an acquisition."

Development times are always shorter than we want, so not looking at inherited code may seem like a good shortcut to save time. I do not see it that way. Just because a bug is old does not mean that it can be ignored. The Zune problem, which we have already covered here, is a perfect example. The Board Support Package was 'inherited', so it seems no one bothered to inspect the code.

Atmel's new TouchLib product shows us another example: code that can't be inspected at all. TouchLib is only available as a binary library file. The source code is not available, even under NDA (I asked), to allow for inspection. In Atmel's view this is their way of protecting their Intellectual Property.

If you take the trouble to actually read the three different TouchLib licenses, from registering, installing and in the TouchLib archive, all three more or less say: "If this product screws up it is not our problem".

Why would I want to use code that I cannot verify as safe and correct? My company is the one that would have to deal with potential warranty issues, calls from angry customers, etc. Should I just tell them "Sorry, we used software that we got off the Internet for free, but have no idea how it works"? What would your reaction to that be as a customer? I don't think you'd be very happy. Unhappy customers don't come back as paying customers.

Saturday, January 3, 2009

Does Time Keeping become a safety issue during Leap Seconds in a Leap Year?

At the end of 2008 we had the unusual event of a double time leap: 2008 was both a Leap Year and ended with a Leap Second.

Wikipedia covers how to calculate Leap Years quite well, so I will not duplicate what should be very well known rules here. Sadly, it seems programmers don't know these rules. After all, as far back as 1886 Christian Zeller came up with Zeller's Congruence to calculate the day of the week, and he got it right without even knowing what a computer was.
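For reference, the Gregorian rule fits in a few lines of C. This is only a sketch with a function name of my own choosing, not code from any of the packages discussed here: a year is a leap year if it is divisible by 4, except century years, which must also be divisible by 400.

```c
#include <stdbool.h>

/* Gregorian leap year rule. The name is chosen for this sketch. */
bool is_leap_year(unsigned int year)
{
    if (0U == (year % 400U)) {
        return true;    /* 1600, 2000, 2400 ... */
    }

    if (0U == (year % 100U)) {
        return false;   /* 1700, 1800, 1900, 2100 ... */
    }

    return (0U == (year % 4U));
}
```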

The event that most are talking about today is the locking up of Microsoft's Zune 30 gigabyte devices.

While most people are bashing Microsoft for the problem, the problem really lies in the i.MX31 Board Support Package from Freescale. After registering you can download the i.MX31: Multimedia Applications Processor Board Support Package (FSL-WCE500-14-BSP), with full source code.

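The now infamous file rtc.c, as reported on many other Zune related web sites, may be found in WINCE500\PLATFORM\COMMON\SRC\ARM\FREESCALE\PMIC\MC13783\RTC. The lockup bug is in the function ConvertDays. On day 366 of a leap year the loop condition "days > 365" is true but the inner test "days > 366" is false, so nothing changes and the loop spins forever; it could only move on if the year had a day 367, and there will never be 367 days in *any* year: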

    while (days > 365)
    {
        if (IsLeapYear(year))
        {
            if (days > 366)
            {
                days -= 366;
                year += 1;
            }
        }
        else
        {
            days -= 365;
            year += 1;
        }
    }

The line should have read "366 == days" or my preference "days >= 366". A simple Code Inspection should have caught a simple-minded bug like this during a design review. We can only assume, based on this code and other questionable constructions in the same file (I did not look at other files), that no such inspections or reviews were done. It is also clear that they didn't even run something as inexpensive as Lint on their code. Doing so would have weeded out several of the non-executable paths that are present.
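One way to write a loop that always terminates is sketched below. This is not Freescale's code or their fix; the function and type names are hypothetical, and it assumes that days is a 1-based count starting from January 1 of the origin year (so day 366 of a leap year is December 31). Under that assumption a "days >= 366" comparison would end the loop but leave days at 0, so this sketch stops as soon as the remaining days fit inside the current year instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
struct year_and_day {
    uint32_t year;         /* calendar year */
    uint32_t day_of_year;  /* 1-based: 1 = January 1 (assumed) */
};

static bool is_leap_year(uint32_t year)
{
    return ((0U == (year % 4U)) && (0U != (year % 100U))) ||
           (0U == (year % 400U));
}

static uint32_t days_in_year(uint32_t year)
{
    return is_leap_year(year) ? 366U : 365U;
}

/* Converts a 1-based day count from January 1 of origin_year into a
 * year and a day within that year. Every pass through the loop
 * subtracts at least 365 days, so the loop must terminate. */
struct year_and_day convert_days(uint32_t days, uint32_t origin_year)
{
    struct year_and_day result;
    uint32_t year = origin_year;

    while (days > days_in_year(year)) {
        days -= days_in_year(year);
        year += 1U;
    }

    result.year = year;
    result.day_of_year = days;
    return result;
}
```

With an origin of 1980, day count 10593 comes out as day 366 of 2008, the case that hung the original loop.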

Beyond the Zune, there have been Leap Second related crashes of various systems, such as some Linux Kernel versions. So what I want to concentrate on is the Leap Second. A Leap Second may be inserted or removed to keep the common UTC time standard in sync with the Earth's rotation.

What is more insidious than the rtc.c code above is that any product based on the MC13783 Power Management and Audio Circuit chip, as used in the crashing Zunes, is doomed from the start, because at the hardware level it is impossible to support Leap Seconds correctly. To be fair to Freescale, most second-counting clock chips have the same problem.

"4.1.2.2.1 Time and Day Counters

The real time clock runs from the 32 kHz clock. This clock is divided down to a 1 Hz time tick which drives a 17 bit time of day (TOD) counter. The TOD counter counts the seconds during a 24 hour period from 0 to 86,399 and will then roll over to 0. When the roll over occurs, it increments the 15-bit DAY counter. The DAY counter can count up to 32767 days..."

According to the National Institute of Standards and Technology on Radio Controlled Clocks (page 27), a properly functioning clock would tick from 23:59:59 to 23:59:60, then to 00:00:00 during a Leap Second event. So a single day could have 86,401 seconds, numbered 0 through 86,400.

A clock that is not capable of displaying 23:59:60 would have two consecutive displays of 23:59:59.

Leap Seconds do not always add a second; they can also subtract one, so a 'day' could correctly have only 86,399 seconds, numbered 0 through 86,398.
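To make the arithmetic concrete, here is a small sketch of a second-of-day counter that allows for a day of variable length. It is not tied to the MC13783 or to any real clock chip; the type and function names are hypothetical, and the leap adjustment is assumed to arrive from some outside source such as a Leap Second table.

```c
#include <stdint.h>

/* Hypothetical broken-down time of day for this sketch. */
struct tod_hms {
    uint8_t hour;
    uint8_t minute;
    uint8_t second;  /* 0..59, or 60 during an inserted Leap Second */
};

/* leap_adjust_s: +1 for an inserted Leap Second, -1 for a removed
 * one, 0 for an ordinary day. */
uint32_t seconds_in_day(int32_t leap_adjust_s)
{
    return (uint32_t)(86400 + leap_adjust_s);
}

/* Converts a second-of-day count into hours, minutes and seconds.
 * Count 86,400 exists only on a day with an inserted Leap Second and
 * is reported as 23:59:60 rather than rolling over to 00:00:00.
 * Counts above 86,400 are not expected and are not handled here. */
struct tod_hms tod_to_hms(uint32_t tod_s)
{
    struct tod_hms t;
    uint32_t extra_s = 0U;

    if (tod_s > 86399U) {
        extra_s = tod_s - 86399U;
        tod_s = 86399U;
    }

    t.hour = (uint8_t)(tod_s / 3600U);
    t.minute = (uint8_t)((tod_s % 3600U) / 60U);
    t.second = (uint8_t)((tod_s % 60U) + extra_s);
    return t;
}
```

A counter that rolls over at a fixed 86,399, as in the datasheet excerpt above, has no place to put that extra count.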

From a software perspective, something like libtai is worth considering. It supports two time scales: (1) TAI64, covering a few hundred billion years with 1-second precision; (2) TAI64NA, covering the same period with 1-attosecond precision. Both scales are defined in terms of TAI, the current international real time standard. This works as long as the Leap Second tables are properly kept up to date, which presents problems of its own.

If you really want to dig into issues of Leap Seconds then check out the LEAPSECS -- Leap Second Discussion List. Also, if you are in any way interested in the precision of time keeping, then check out the Time Nuts Discussion List at the LeapSeconds site.

I am not aware of any loss of life, or loss of major income, due to any of the Leap Second problems that occurred this time. Will we be able to say the same for the next Leap Second that occurs?