A brief history of space missions lost to human errors
When seemingly benign code can demolish your rocket, a planetary spacecraft, and millions of dollars.
A space mission is an immensely complex undertaking. Hundreds to thousands of people from several distinct fields of expertise must come together to orchestrate thousands of moving parts and functions to make a space machine behave as expected. As such, a failure to communicate, or to get even a seemingly minuscule aspect of the mission right, can cascade into effects that can and have led to mission loss.
Here’s a brief history of some major missions lost to very avoidable human errors, as opposed to programmatic failures like those of the Space Shuttle.
Mariner 1 blows up
On July 22, 1962, an Atlas rocket launched successfully with NASA’s Mariner 1 spacecraft encapsulated in its fairing. NASA had hoped to be the first to successfully fly by Venus and counter the Soviet Union’s lead in the Space Race. Less than five minutes into the flight, it was clear Mariner 1 wouldn’t even reach Earth orbit.
A range safety officer tracking the rocket’s flight detected an unplanned yaw-lift (northeast) maneuver, a sign that its guidance system had faltered. The rocket was unable to steer itself as intended and was heading towards a crash, either in the North Atlantic shipping lanes or, worse, in an inhabited area. The officer had no choice but to order the destructive abort. The rocket exploded along with Mariner 1, and $80 million ($720 million in 2021 dollars) was gone.
It turned out that a programmer had left a hyphen out of an equation while entering its hand-transcribed code into the rocket’s computer. This fed false information to the guidance system, and ultimately all that was left was debris. The human error was fixed for Mariner 2, which launched shortly after and conducted the first successful flyby of Venus.
Phobos 1 goes haywire
In 1988, the Soviet Union launched the ambitious Phobos 1 spacecraft to study Mars’ moons and land a probe on the largest of them, Phobos. But the mission was doomed before it could even reach Mars. On September 2 of that year, mission operators lost contact with the spacecraft and never heard from it again.
An investigation of prior commands sent to the spacecraft revealed that software uploaded on August 29 was missing a single character. This put the spacecraft in a steering test mode, normally used only on Earth, which also deactivated the spacecraft’s attitude thrusters. Phobos 1 could no longer orient its solar arrays towards the Sun, and when it ran out of battery power, communication with it was lost. The test mode was reachable at all because the flight software still contained code that was supposed to be removed after testing on Earth. Moreover, the spacecraft’s operating system had no safeguards against such mistakes, sealing the spacecraft’s fate.
Likewise, when the scientifically promising Japanese Hitomi X-ray satellite was spinning out of control in Earth orbit in 2016, an incorrect, previously uploaded command fired its thrusters in a way that sped up Hitomi’s spin instead of slowing it down. The spacecraft broke apart into over ten pieces.
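The kind of safeguard Phobos 1 lacked can be sketched in a few lines. This is only an illustration, not a reconstruction of any real flight software: the command names, the Phase enum, and the validate_command function are all hypothetical. The idea is simply that a command meant for ground testing is refused once the spacecraft is in flight, and that anything unrecognized is refused too.

```python
from enum import Enum, auto


class Phase(Enum):
    GROUND_TEST = auto()
    FLIGHT = auto()


# Hypothetical command names, invented for this sketch.
KNOWN_COMMANDS = {"STEERING_TEST", "POINT_ARRAYS_AT_SUN", "SEND_TELEMETRY"}
GROUND_ONLY_COMMANDS = {"STEERING_TEST"}


class CommandRejected(Exception):
    """Raised when a command must not run in the current mission phase."""


def validate_command(name: str, phase: Phase) -> str:
    """Return the command name if it is safe to run, otherwise raise."""
    if name not in KNOWN_COMMANDS:
        # A mistyped command should never fall through to something else.
        raise CommandRejected(f"unknown command: {name!r}")
    if phase is Phase.FLIGHT and name in GROUND_ONLY_COMMANDS:
        raise CommandRejected(f"{name} is only allowed during ground tests")
    return name
```

With a check like this in front of the command executor, a steering test requested in flight would raise CommandRejected instead of switching off the attitude thrusters.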
A rocket self-destructs because it thinks it’s going too fast
On June 4, 1996, Europe’s then-new heavy-lift Ariane 5 rocket launched successfully. Merely 37 seconds into the flight, the rocket flipped 90 degrees, and the onboard computer triggered the self-destruct mechanism just two seconds later. Pieces of the expensive rocket and the four science satellites onboard slammed into the ground, in full view of people on a nearby beach!
The Ariane 5 rocket was brand new, but it was still running some code from its predecessor, Ariane 4. The investigation revealed that some of that old code wasn’t properly adapted for the more powerful Ariane 5. Just before the rocket flipped, it attempted as usual to convert its sideways velocity from a 64-bit “floating point” format to a 16-bit “signed integer” format. At that point the value was too large to fit in 16 bits, so the computer could neither store it nor act on it. The flip that followed caused the rocket’s two solid boosters to swing out and nearly detach from the body, which triggered the self-destruct.
Usually, code guards against such conversion errors with detection and recovery mechanisms, and indeed such protections were present on the Ariane 5’s computer as well. However, in this particular case, the engineers had decided that the specific velocity in question could never grow large enough to become a real problem, so they left it unprotected. That was only true for the Ariane 4.
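To make the conversion problem concrete, here is a minimal sketch in Python. It is not the Ariane flight code; the function names are made up for illustration. A 16-bit signed integer can only hold values from -32768 to 32767, so a protected conversion has to either report a value outside that range or clamp it.

```python
INT16_MIN = -(2**15)    # -32768
INT16_MAX = 2**15 - 1   # 32767


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, raising if the value does not fit."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return result


def to_int16_saturating(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping out-of-range values."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))
```

The checked version forces the caller to decide what happens on overflow, while the saturating version keeps running with the largest representable value. Which one is appropriate depends on what the number is used for; the point is that the choice is made deliberately rather than left to an assumption about how large the input can get.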
An imperial spacecraft destroyed on Mars
The loss of NASA’s Mars Climate Orbiter is a human error on an organizational scale like no other. The spacecraft launched successfully in 1998 and cruised to Mars just fine until September 1999. On nearing the red planet, it was supposed to brake and enter orbit. But the spacecraft’s trajectory took it too close to Mars and it disintegrated in the upper atmosphere.
It was later realized that the two mission engineering teams communicating with each other had been using different units for their calculations. The mission navigation team was using metric units, as expected, but the spacecraft manufacturer Lockheed Martin supplied its figures in conventional imperial units. This seemingly trivial mismatch of unconverted values obliterated the spacecraft, and $300 million ($500 million in 2021 dollars) with it.
A similar mix-up over unit conversions almost cost NASA and ESA their SOHO solar observatory, which has thankfully gone on to be a highly successful mission.
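One common defense against this kind of mismatch is to never pass bare numbers between teams or programs, and instead wrap each value in a type that records its unit. The sketch below assumes the quantity being exchanged is thruster impulse, given in pound-force seconds on one side and newton-seconds on the other, which is how the Mars Climate Orbiter mismatch is usually described. The Impulse class and total_impulse function are hypothetical names for illustration.

```python
from dataclasses import dataclass

# One pound-force is defined as exactly 4.4482216152605 newtons.
NEWTONS_PER_POUND_FORCE = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse, always stored internally in newton-seconds."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert once, at the boundary, so nothing downstream can mix units.
        return cls(value * NEWTONS_PER_POUND_FORCE)


def total_impulse(burns: list[Impulse]) -> Impulse:
    """Sum a list of thruster firings into a single impulse."""
    return Impulse(sum(b.newton_seconds for b in burns))
```

Because every value has to enter through from_newton_seconds or from_pound_force_seconds, whoever supplies the number must state its unit. A figure of 10 pound-force seconds then becomes about 44.5 newton-seconds instead of being silently read as 10.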
Dropping a satellite on the floor
An equally embarrassing mistake came in 2003, when Lockheed Martin workers dropped NASA’s NOAA-N' Earth observation satellite on the floor. The NASA investigation report revealed that during a standard test, a technician had removed 24 bolts that helped hold the spacecraft in place but failed to document it. A team that came later was moving the spacecraft from a vertical to a horizontal position for another test but failed to check that all the bolts were in place, as the procedure required. The spacecraft fell on the floor and was badly damaged.
Lockheed Martin agreed to forfeit all profit from the project and paid $30 million for repairs. However, the total repair cost was $135 million, and the U.S. government paid the difference.
A hard landing for Genesis
Sample collection missions are hard, but NASA’s Genesis spacecraft, launched in 2001, had already successfully collected samples of the solar wind and was about to bring them to Earth in 2004. When the sample-carrying probe was descending through Earth’s atmosphere, its parachute never deployed, and it crashed into the Utah desert. Many of its sample collectors shattered upon impact, and some remaining samples were contaminated by the desert air. Scientists worked hard to safely recover the remaining intact samples for study. The mission’s primary objectives were eventually met, but the overall scientific output was, of course, lower than expected.
NASA’s failure report in 2009 revealed that Lockheed Martin, the spacecraft manufacturer, had installed the probe’s accelerometers in an inverted position, which confused the spacecraft’s navigation system and kept the parachute from deploying.
A similar assembly mistake, inverted cables on the European Vega rocket, moved the upper stage engine nozzles in the wrong direction during a 2020 flight. This misled the upper stage’s guidance system, so it couldn’t orient itself. The upper stage tumbled and failed to reach its intended orbit, losing the satellites onboard.
As we embark on increasingly technologically complex space endeavors such as sustainably living on the Moon, exploring the outer solar system moons, bringing samples from Mars, and more, the increased functionality demanded by such missions will likely make human errors more probable.
The general industry direction of late has thus been to reduce human errors by reliably automating as many parts of missions as possible, taking advantage of modern software. This is especially true for mission operations, a major element that still relies heavily on humans being in the loop. So when humans are exploring the Moon again, the mission control center will likely look and operate very differently.
Thank you Epsilon3 for supporting me, which allowed me to work on this article. The article’s theme has been inspired by my other piece, “Space grade electronics: How NASA Juno survives near Jupiter”. The next article in this series is “Past mistakes to avoid in our grand return to the Moon this decade”.
Epsilon3 is a web-based platform for managing complex spacecraft operational procedures that provides coordination, assurance, and risk reduction to prevent failures like some of the ones mentioned here and more.