Shoddy Software, Subs and Spacecraft

From an article about the Australian Collins class submarines, in The Australian:
The combat system remains the outstanding issue, with the Government set to sign off on a multi-million dollar replacement later this year.

That will be based on the United States Navy Combat Control System Mark 2.

The combat system links the submarine's sonar, navigation and weapon systems, displaying all information on colour computer consoles in the control room.

When this system was specified in the late 1980s, the ADF was paddling uncharted waters as there was no comparable system in service in any navy.

Submariners call this the "legacy system" and say its 1980s computer architecture and archaic processing power could never hope to deal with the vast amount of information available.

By 1999 the problems had become clear. The McIntosh-Prescott report, prompted by "dud sub" headlines, said the combat system was nowhere near providing acceptable capability even six years after their launch.

The government agreed to fast track augmentation of the combat systems aboard submarines Dechaineux and Sheean using newer technology, courtesy of the US Navy.

Dechaineux captain Lieutenant Commander Simon Rusiti, who has experienced both systems, says the augmented model is a substantial improvement and the new system will be even better.

One obvious manifestation is what is called the contact evaluation plot - a log of sonar contacts and course changes - which under the old system, used by submariners for half a century, involved pencil notations on a long roll of paper. It's now fully computerised.

The old combat system glitched frequently and simply wasn't capable of handling multiple contacts, limiting its usefulness in a hostile situation.

"This is definitely a vast improvement over the original system," Commander Rusiti said this week.

"We are now capable of working in high contact areas. There are parts of South-East Asia where we have seen up to 600 fishing boat contacts in an hour.

"Some times we get tiny glitches but only once in the last six months have we had to reboot the combat system."

The full capabilities of the Collins' sensors and combat system, like much submarine technology, remains a closely guarded secret.
Oh Dear. Oh Dearie, Dearie me. Being able to deal with 600 contacts/hour is not great performance by my standards. 6 months between resets is good though, or rather, is adequate.

Let's just say that the STN-Atlas ISUS-90 system (see picture to right) that I worked on 10 years ago now has rather better performance. So now we have to make do with a second-rate system from the US, purely for political reasons. STN-Atlas will get paid off by the Australian taxpayer, and the same mob that made such an appalling mess of the Collins will get yet another go at getting it right.

From a Parliamentary Report on the Collins Class:
One of the recommendations of the report by Malcolm McIntosh and John Prescott on the problems of the Collins submarines, delivered in mid-1999, was that its combat system should be replaced by a proven, off-the-shelf product. This recommendation was accepted by the Government; the evaluation of potential suppliers had reportedly been concluded in favour of the STN Atlas ISUS 90-55 system when, in July 2001, the Minister suspended the program.

The reason was the Government's desire to maximise opportunities for closer cooperation with the USA on submarines. At the time, the decision created some confusion about the potential impact of ANZUS alliance issues on future Defence procurement. In some areas, it was seen as questioning the balance between the concepts of 'alliance' and 'self reliance', at the centre of the development of defence policy since the 1976 white paper. Further, the decision is likely to delay the program by about a year. As a result of it, the Government will have to manage the risk arising from the situation that, as yet, no combat system is available within the parameters of the policy on cooperation.

The only potentially suitable system, the CCS Mk2, produced by the American company Raytheon, is used by United States Navy (USN) nuclear powered submarines. There are sufficient differences between these and a conventional submarine of the Collins type to make the transition neither simple nor assured. The risks to be managed include integration with the existing systems on the Collins class, modifications to work in the less well-supported environment of a conventional submarine and avoiding pressures to include non-essential system enhancements. A system successfully developed to avoid these problems will be unique to RAN service.

Nonetheless, a trouble-free development cannot be assured. Raytheon, for instance, has been unable to satisfactorily conclude its contract to upgrade the Royal Australian Air Force (RAAF) AP-3C maritime surveillance aircraft, elements of which are now running 42 months late. The lessons of the recent history of Defence procurement are that neither sponsorship by the US Armed Forces nor development by corporate America can guarantee success in systems integration programs.
Ah well, enough of Subs. What about Spacecraft? From Spaceflight Now:
"And in that we realized that we had this reset problem. Based on just kind of the hunch of our lead software architect, he believed that the problem was probably associated with the mounting of flash and initialization. There is a hardware command that we can send that bypasses the software where we can actually tell the hardware to not allow us to mount flash on initialization. When we the next day actually sent the command to do that, software initialized normally and was behaving like the software that we had always known. It was a fantastic moment."

"Once we got into the mode where we could command the vehicle to get into a software state that we understood, then we were able to collect data. That is the path that we are on right now."

"Right now, our most likely candidate for the issue has been narrowed down a little bit. It is really an issue with the file system in flash. Essentially, the amount of space required in RAM to manage all of the files we have in flash is apparently more than we initially anticipated."

"We have been collecting data and collecting data thanks to (the science team) and we have lots and lots of files on the spacecraft. That's good -- we intended to have lots and lots of files on the spacecraft. This is a new problem that we encountered based on having many files. "

"We are currently in a much more specific debugging activity. Today (Monday), we started to dump out some of flash. We are actually loading a script that we get kind of the task trace on the software and identify exactly where the problem was in the code so we can make sure that our hunch is correct."
Ten out of Ten for some excellent debugging at an interplanetary distance, but minus several million for piss-poor testing and bloody awful software engineering with the mass memory. More Data than expected = Corrupt the whole system. Bleah. So much for "Software should be ductile, not brittle": slowly abrading away under problems rather than shattering into a thousand fragments at the first hitch. Never mind, the Systems Engineering was good, as the system as a whole can be made to recover in a relatively straightforward way - scrap the contents of the flash memory, reformat, and restart. A few weeks and Spirit will be fully operational.
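
To make the ductile-versus-brittle point concrete, here is a minimal sketch in Python. It is emphatically not the rover's flight software: the FileIndex class, the 64-byte cost per file record and both mount methods are invented for illustration. All it assumes is what the quote above says, namely that managing the files in flash takes RAM, and that there turned out to be more files than the RAM budget allowed for.

```python
from dataclasses import dataclass, field

# Assumed RAM cost of one in-memory file record (illustrative figure only).
ENTRY_BYTES = 64


class OutOfRam(Exception):
    """Raised when the metadata table would exceed its RAM budget."""


@dataclass
class FileIndex:
    """Hypothetical in-RAM index of the files held in flash."""

    ram_budget: int                          # bytes of RAM reserved for metadata
    entries: list = field(default_factory=list)

    @property
    def capacity(self) -> int:
        # How many file records fit inside the RAM budget.
        return self.ram_budget // ENTRY_BYTES

    def mount_brittle(self, flash_files):
        """Index every file or fail outright; a failure here means a reset."""
        for name in flash_files:
            if len(self.entries) >= self.capacity:
                raise OutOfRam(f"no room to index {name!r}")
            self.entries.append(name)

    def mount_ductile(self, flash_files):
        """Index what fits, report the rest, and stay up in a degraded mode."""
        self.entries = list(flash_files[: self.capacity])
        skipped = len(flash_files) - len(self.entries)
        return {
            "indexed": len(self.entries),
            "skipped": skipped,
            "degraded": skipped > 0,
        }
```

With a budget of eight records and ten files in flash, mount_brittle raises OutOfRam part-way through initialisation, while mount_ductile comes up with eight files indexed and two flagged as skipped, which leaves the system alive enough to be told what to delete.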

I bet the Software Architect is kicking himself though - he knew there was a problem in this area, he just didn't know he knew it till it happened. Been there, done that, and There But For The Grace of God Go I. Nice recovery though - everyone makes mistakes, but only the really good engineers can fix them. I prefer my mistakes to be found before the thing goes into service though. Less embarrassing that way.

