Words by Lachlan Haycock
On 3 June 1985, a 61-year-old woman in Georgia, United States, was receiving follow-up radiation treatment after a lumpectomy to remove a malignant breast tumor. The radiation machine involved in the procedure was a Therac-25 unit produced by Atomic Energy of Canada Limited (AECL), and one of 11 units that had been installed across Canada and the US.
The patient was set to receive a 10 megaelectron volt (10 million electron volt) electron treatment to the clavicle area. She later reported that, after the machine was turned on, she felt a “tremendous force of heat” and a “red-hot sensation” near the treatment area. Afterwards, although the skin was warm to the touch, no marks were visible.
On 26 July 1985, a 40-year-old woman receiving treatment for cervical cancer in Ontario, Canada, reported feeling a burning sensation, or “tingling shock”, in her hip following a dose from another Therac-25. The machine had initially shut down and displayed an error message, leading the operator to make five successive attempts at restarting it to administer the dose.
Following the fifth attempt, the machine shut down and a hospital technician was called, who did not identify any fault in its setup. Four days later, the patient was hospitalised with burning and swelling in the treatment area.
Crucially, the lack of hardware interlocks or supervisory circuits meant bugs in the software went unchecked, and both patients received massive radiation overdoses. In the case of the Georgia patient, the dose was later estimated at up to 20,000 rad. Therac-25 machines were linked to at least four other similar incidents until 1987, when the units were recalled for inspection and modification. Of the patients who received an overdose, four died and two were left with lifelong injuries.
Two models preceded the Therac-25: the Therac-6 and the Therac-20. After the recall of Therac-25 units, it was found that related issues were also present in the Therac-20 software, but its hardware safety interlocks – not present in the newer model – prevented software bugs from triggering catastrophic failures. The Therac-25 design had no such backstop, so the same kind of bugs led to overdoses.
Nancy Leveson, a software engineering and systems safety expert from Massachusetts Institute of Technology (MIT) who has extensively studied the Therac-25 incidents, said that, compared with the Therac-20, the Therac-25 was a more compact and economical machine because it used a novel “double-pass” concept for electron acceleration.
In an appendix to her 1995 book Safeware: system safety and computers, she wrote that “a double-pass accelerator needs much less space to develop comparable energy levels because it folds the long physical mechanism required to accelerate the electrons, and it is more economical to produce”.
The Therac-25 was a dual-mode linear accelerator that could deliver either photons at 25 MeV or electrons at different energy levels. The model was designed for computer control from the outset, while its predecessors were designed around machines with a history of both computer and non-computer control. As an example, although the Therac-25 relied on software for monitoring the electron-beam scanning and the safe operation of the machine, the Therac-20 had independent protective circuits and mechanical interlocks for those respective tasks.
The key difference between the two approaches was the reliance on software alone, rather than independent hardware, to check the positioning of the equipment before a dose was delivered, particularly that of the turntable used to place accessory devices in the path of the electron beam.
“The computer is responsible for positioning the turntable (and for checking the turntable position) so that a target, flattening filter and X-ray ion chamber are directly in the beam path,” Leveson said. “With the target in place, electron bombardment produces X-rays. The X-ray beam is shaped by the flattening filter and measured by the X-ray ion chamber.”
The operator’s interface also played a part. The machine’s original design required operators, having positioned the patient on the treatment table, to set up the machine by hand in the treatment room and then enter the same information (beam type, energy level, dose rate, field size and more) at the computer console. The software then checked the details entered at the console against those set at the machine. After feedback that the process took too long, the software was modified so that operators could copy the treatment site data in bulk by pressing the carriage return key.
“The Therac-25 could shut down in two ways after it detected an error condition,” Leveson said. “One was a treatment suspend, which required a complete machine reset to restart. The other, not so serious, was a treatment pause, which only required a single key command to restart the machine. If a treatment pause occurred, the operator could press … ‘proceed’ and resume treatment quickly and conveniently. The previous treatment parameters remained in effect, and no reset was required.”
This pause-and-proceed feature, which could be used up to five times before the machine automatically suspended treatment, was evident in the case of the Ontario patient described above.
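The trade-off in that design can be sketched in a few lines of Python. This is an illustrative model only, not the Therac-25’s actual code: the class and method names are hypothetical, and it assumes that five pauses are allowed before the next fault forces a suspend.

```python
class TreatmentSession:
    """Hypothetical model of the pause/suspend behaviour described above."""

    def __init__(self, max_pauses=5):
        # Assumption: five pauses are tolerated, as the article describes.
        self.max_pauses = max_pauses
        self.pause_count = 0
        self.state = "treating"

    def fault(self):
        """Record a detected error and return the resulting state."""
        if self.pause_count < self.max_pauses:
            # Minor shutdown: a single keypress resumes treatment.
            self.pause_count += 1
            self.state = "paused"
        else:
            # Too many pauses: only a full reset can restart the machine.
            self.state = "suspended"
        return self.state

    def proceed(self):
        """Resume after a pause, keeping the previous parameters."""
        if self.state != "paused":
            raise RuntimeError("proceed is only valid after a pause")
        self.state = "treating"
        return self.state


session = TreatmentSession()
for attempt in range(5):
    session.fault()      # machine pauses and shows an error
    session.proceed()    # operator presses 'proceed'
print(session.fault())   # the next fault suspends treatment
```

In a model like this, every pause looks the same to the operator and costs one keypress to clear, which may help explain why a repeated fault was easy to wave through.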
The patient in Ontario died of cancer on 3 November 1985, having received between 13,000 and 17,000 rad. It was found that, had she not died, she would have needed a total hip replacement due to the overexposure. The Georgia patient, meanwhile, eventually needed to have her breast removed due to the radiation burns.
Two separate incidents at the East Texas Cancer Centre in Tyler, Texas, in 1986 triggered a response that saw the scope of the problem start to be revealed. The first involved a man attending a follow-up after the removal of a tumor on his back. He ended up dying from severe complications from radiation exposure, including a lesion on his left lung and paralysis of multiple body parts. The second patient died after entering a coma and suffering neurological damage.
After the second incident, a staff physicist at the centre, Fritz Hager, conducted a series of tests to identify the cause of the malfunction.
Data entry to the Therac-25 machine was done using up and down keys on a VT-100 video terminal console. If an operator, having set the machine to enter X-ray mode, for example, switched to electron mode within the eight seconds before the machine finished its setting-up process, the turntable would not be left in the correct position.
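The timing problem described here resembles what programmers call a race condition. The following Python sketch is a simplified, hypothetical model rather than a reconstruction of the Therac-25 software: the function name, the mode labels and the `recheck` option are all assumptions made for illustration. It treats the turntable as being positioned from a snapshot taken when setup starts, so an edit made inside the eight-second window changes the screen but not the hardware.

```python
SETUP_SECONDS = 8.0  # setup window quoted in the article


def run_setup(initial_mode, edits, recheck):
    """Return (screen_mode, turntable_mode) once setup has finished.

    edits:   list of (seconds_after_setup_start, new_mode) operator edits
    recheck: if True, compare screen and hardware after setup and
             reposition the turntable when they disagree
    """
    screen_mode = initial_mode
    # The turntable is positioned from the request as it stood at t = 0.
    turntable_mode = initial_mode

    for seconds, new_mode in sorted(edits):
        screen_mode = new_mode  # the display always shows the latest edit
        if seconds >= SETUP_SECONDS:
            # Assumption: edits made after setup are handled normally.
            turntable_mode = new_mode
        # Edits inside the window are silently missed by the hardware.

    if recheck and turntable_mode != screen_mode:
        turntable_mode = screen_mode  # the mismatch is caught and corrected

    return screen_mode, turntable_mode


# A quick edit three seconds into setup leaves screen and hardware at odds.
print(run_setup("xray", [(3.0, "electron")], recheck=False))
# Re-reading the request after setup removes the mismatch.
print(run_setup("xray", [(3.0, "electron")], recheck=True))
```

The sketch makes one narrow point: when a request is read once and then acted on over several seconds, anything that changes in between is invisible unless the software looks again before the beam is enabled.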
After conferring with a staff physicist in Chicago, Hager identified the same bug in the software of the previous Therac model, the Therac-20. Critically, though, that unit’s hardware interlock included a fuse that would blow when the error occurred. The Therac-25 had no such protection: nothing stopped the beam when the malfunction occurred, leading to fatal overdoses.
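The principle behind such an interlock, a check that does not depend on the control software being right, can be illustrated in Python. This is only a sketch under stated assumptions: the Therac-20’s protection was a hardware circuit, not code, and the position labels and function names here are hypothetical. The point is that the beam is permitted only when an independently sensed turntable position matches the commanded mode.

```python
# Hypothetical labels: which turntable position each beam mode requires.
REQUIRED_POSITION = {
    "xray": "xray_position",
    "electron": "electron_position",
}


def beam_permitted(commanded_mode, sensed_position):
    """Independent check of the commanded mode against a sensor reading.

    sensed_position is assumed to come from a sensor on the turntable
    itself, not from the control software's record of where it put it.
    """
    required = REQUIRED_POSITION.get(commanded_mode)
    return required is not None and required == sensed_position


def fire_beam(commanded_mode, sensed_position):
    """Refuse to turn the beam on if the interlock check fails."""
    if not beam_permitted(commanded_mode, sensed_position):
        raise RuntimeError("interlock open: turntable does not match mode")
    return f"beam on in {commanded_mode} mode"


print(fire_beam("xray", "xray_position"))
try:
    fire_beam("xray", "electron_position")
except RuntimeError as err:
    print(err)
```

A check like this does not fix the underlying bug. It turns a silent mismatch into a refusal to fire, which is the job the fuse did in the Therac-20.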
In Leveson’s opinion, the failure of the Therac-25 software to prevent the overdoses reveals a fundamental misunderstanding of what a safety-critical system is – one where severe injury or death is possible if a malfunction occurs.
“There were no independent checks that the machine was operating correctly,” she wrote in a 1993 paper that was republished in 2017.
“Such verification can’t be assigned to operators without providing them with some means of detecting errors. Operators are often blamed for medical device accidents when the problems were actually in the machine design.
“The Therac-25 had a probabilistic risk assessment – including an update after one of the early accidents – that led to dangerous complacency. Many (perhaps most) industries today make the same types of assumptions based on probabilistic risk assessments. In general, such calculations often exclude aspects of the problem that are difficult to quantify (such as software requirement inadequacies) but which might have a larger impact on safety than the quantitative factors that are included.”
Sometimes, increasing the ease of use of a computer interface inadvertently means reducing the level of safety ensured by that interface, Leveson said.
“Eliminating multiple data entry and assuming that operators would check the values carefully before pressing the return key was unrealistic for the Therac-25, and for most systems. I have been involved in reviews of several newer safety-critical system interfaces and have been surprised by how many included unsafe features.
“Accidents are seldom simple. They usually involve a complex web of interacting events with multiple contributing technical, human, organisational and regulatory factors. We aren’t learning enough today from the events, nor focusing enough on preventing them. It’s time for computer science practitioners to be better educated about engineering for safety.”
This article was originally published in the May 2025 issue of create with the headline “Killer software”.
The Therac-25: 30 Years Later, Nancy G Leveson, IEEE Computer Society.
Killed By A Machine: The Therac-25, Hackaday.
Safeware: System Safety and Computers, Nancy G Leveson, Addison-Wesley, 1995.