A. M. Dowell, III, P.E.
Rohm and Haas Texas Incorporated
Deer Park, TX
D. C. Hendershot
Rohm and Haas Company
Bristol, PA
Prepared for Presentation at the American
Institute of Chemical Engineers
31st Annual Loss Prevention Symposium Houston, TX
Paper 44c March 13, 1997
Published in Process Safety Progress 16, 3 (Fall, 1997), pp. 132-139.
AIChE shall not be responsible for statements or opinions
contained in papers or printed in its publications.
''The chief cause of problems is solutions.''
- Eric Sevareid
In the course of chemical process and plant
design, engineers will identify potential hazardous incidents.
These potential incidents may be identified through special
hazard analysis reviews and procedures, or by the design team in
the course of design activities. To manage and control those
hazards, the team will modify the initial design, often by adding
protective devices and systems - alarms, interlocks
or active protective systems. However, any change in a system,
even a change intended to prevent or mitigate a potential
hazardous incident, also has the potential to introduce new
hazards, or new mechanisms by which existing hazards can result
in an incident. A number of case studies illustrating this point
will be reviewed. The examples illustrate the importance of a
management of change program, which must consider all changes
including the addition of safety devices and systems, and which
must thoroughly consider all potential effects on the system.
When we add a new safety device onto an existing system, or to the design of a new system, the desired result is increased safety. That is certainly the intent of the designer when he adds the safety device. But any change, even the addition of a new safety feature, has the potential to introduce new failure modes and scenarios. The designer, who is focusing on a particular hazard or failure mode when he specifies the new safety device, may not recognize other potential new failure modes. In some cases, the new failure scenarios introduced by the safety device may be more serious than the failure scenarios the device is intended to prevent. In these cases the system with the safety device may actually have a higher risk than the original design. A designer must always remember that "no good deed goes unpunished" (Powers, 1989). The good deed of adding a new safety device has the potential to punish by introducing new failure scenarios. A modified system must be thoroughly reviewed to ensure that all failure modes have been considered in the design. In this paper we present a number of case studies that show how incidents or potential incidents may actually be caused by safety devices. Some of these case studies are based on actual incidents. Others were identified in the course of process hazard analysis studies before any incident actually occurred - the preferred way to identify potential incidents!
Focusing on a single aspect of a problem, and resolving that
issue without reviewing the impact of the solution on the entire
system has resulted in failures of engineered systems for many
years. Our first example was reported by Galileo (1638), as
described by Petroski (1992, 1994). In Galileo's time, stone
columns for building projects needed to be stored for some period
of time prior to use. If the column was stored on the ground, the
parts of the column in contact with the ground would tend to
stain and discolor from contact with the soil. The discoloration
was difficult to remove, resulting in an unsightly appearance of
the building in which the column was finally used. To prevent
discoloration, it was common practice to store the columns off
the ground, supported by piles of timbers or stones at each end [Figure 1(a)]. Sometimes a column would break in the
middle under its own weight [Figure 1(b)]. A
worker, seeing the failed columns, suggested that an additional
protective feature be added to the column storage system - a
third support in the middle, as shown in Figure 1(c).
This would prevent the failure of the column in the middle by
providing additional support. Everybody thought this was a great
idea, and so a third, center support was provided under the
columns.
What actually happened to a number of the columns after they
had been stored for several months using the additional support?
It was found that many of them still broke in the middle, but in
many cases the failure mode was different. The supports were made
from timbers or stones, and they would settle from the weight of
the columns as they sat outside in the weather. When the columns
were first set on the supports, each of the three supports was in
load-bearing contact with the column, and the system worked as
intended. However, as time went by and the supports and ground
under the column settled, it was extremely unlikely that the
three supports would all settle at exactly the same rate. After
some time, the actual support of most of the columns was as shown
in Figure 1(d) - held up by only two supports,
either the supports on each end as in the original design, or by
one end support and one middle support. In fact, some columns
were actually balanced on the center support, not in contact with
either end support. The center support was an "add on"
device which was newer, and did not deteriorate and settle as
quickly as the older end supports. The columns held up by one end
support and the middle support, or by the middle support alone, broke by the failure mode
shown in Figure 1(e).
In this example, an additional safety device, the third column
support, was added with the expectation that it would reduce the
frequency of column failures. In fact it did not - the columns
could still fail by the original failure mode, and a new failure
mode was introduced by the presence of the "safety feature".
It is even possible that the columns would be more prone to
failure by the new failure mode [Figure 1(e)].
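The comparison between the original and the new failure mode can be made concrete with elementary beam theory. The following is a minimal sketch, assuming the column is an idealized uniform beam of length `length` and weight per unit length `w`. The function names and numerical values are illustrative and do not come from Galileo or Petroski. Resting on its two ends, the column sees its peak bending moment at midspan. Once the supports settle so that the middle support carries the load, each half of the column acts as a cantilever, and the peak moment appears over the middle support with the opposite sign.

```python
def midspan_moment_end_supported(w, length):
    """Peak bending moment for a uniform beam resting on its two ends.

    Positive sign means sagging (tension on the bottom face).
    """
    return w * length**2 / 8.0


def moment_over_middle_support(w, length):
    """Bending moment over the middle support when it carries the column.

    Each half is a cantilever of length/2 under its own weight.
    Negative sign means hogging (tension on the top face).
    """
    overhang = length / 2.0
    return -w * overhang**2 / 2.0


# Hypothetical values, for illustration only.
w = 2.0       # weight per unit length, kN/m
length = 6.0  # column length, m

sagging = midspan_moment_end_supported(w, length)
hogging = moment_over_middle_support(w, length)
print(f"ends only:      {sagging:+.1f} kN*m at midspan")
print(f"middle support: {hogging:+.1f} kN*m over the support")
```

Under these idealized assumptions the two moments have the same magnitude, so a middle support that ends up carrying the column does not reduce the peak bending load. It only moves the tension from the bottom face to the top face.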
This example illustrates the importance of a full understanding of how a system works when designing safety features. The addition of a new safety feature changes the system. While the new safety feature provides protection against a particular failure mechanism, any change in a system, even the addition of a safety device, introduces new failure modes and mechanisms. It is essential that the entire system be reviewed from the standpoint of its complete functional requirements after any change is made, even the addition of a safety device. If the designers had understood the failure mechanisms which actually occurred, they could have identified other potential solutions, or additional safety precautions to protect against all failure modes.
A process required that Raw Material A, a highly reactive,
corrosive, and toxic chemical, be fed at a controlled rate to a
reactor. The flow rate was extremely critical - if the flow rate
exceeded a specified value there would be a potential for a rapid
runaway reaction. The original design included a metering pump
for Raw Material A followed by two flow meters to monitor the Raw
Material A flow rate, as shown in Figure 2. A
high flow rate on either flow meter (FAH 1 or FAH 2) would close
a shutoff valve and stop the metering pump. The focus of the
designer's attention when specifying this design was the hazard
of a runaway reaction in the reactor.
A quantitative risk analysis (QRA) of this system was done, in part because of the potential runaway reaction hazard. The QRA evaluated the system very broadly, including all identified hazards:
All phases of plant operation were also considered:
The QRA results indicated that the second flow meter reduced the risk of a runaway reaction by less than 0.5%. The QRA also indicated that the startup and shutdown phases were major contributors to the risk, and that spurious trips caused by failure (false high indication) of the second flow meter might actually increase the overall risk of runaway reaction (although the actual number of potential spurious trips was not estimated). Furthermore, the flow meter required maintenance and calibration, with a potential exposure of operators and mechanics to Raw Material A, a very toxic and corrosive material, which could cause serious injury. Overall the conclusion was that the addition of the second safety device - flow meter FAH 2 - would in fact result in an increase in the risk of the system when all potential hazards and plant operating modes were considered. The second flow meter was eliminated from the final design.
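The trade-off the QRA weighed can be sketched numerically. The following is a simplified illustration, not the model used in the study. It assumes a beta-factor treatment of common-cause failures and considers only two contributions: a real high-flow demand that the trip fails to catch, and spurious trips that force an extra shutdown and restart. Every input value is hypothetical, and the maintenance exposure to Raw Material A is left out.

```python
def system_pfd(pfd_per_meter, n_meters, beta):
    """Probability of failure on demand for a 1-out-of-N high-flow trip.

    Simple beta-factor split: a fraction `beta` of each meter's failures
    is common to all meters; the remainder is independent.
    """
    common = beta * pfd_per_meter
    independent = ((1.0 - beta) * pfd_per_meter) ** n_meters
    return common + independent


def runaway_frequency(n_meters, demand_rate, pfd_per_meter, beta,
                      spurious_rate_per_meter, runaway_per_restart):
    """Estimated runaway frequency (per year) from the two contributions."""
    from_failed_trip = demand_rate * system_pfd(pfd_per_meter, n_meters, beta)
    # Either meter reading falsely high trips the system, so the
    # spurious trip rate scales with the number of meters.
    from_spurious_trips = n_meters * spurious_rate_per_meter * runaway_per_restart
    return from_failed_trip + from_spurious_trips


# Hypothetical inputs, chosen only to illustrate the trade-off.
inputs = dict(
    demand_rate=0.01,             # real high-flow events per year
    pfd_per_meter=0.01,           # chance one meter misses a real event
    beta=0.1,                     # common-cause fraction
    spurious_rate_per_meter=0.5,  # false high-flow trips per year per meter
    runaway_per_restart=1e-3,     # chance a forced restart leads to a runaway
)

one_meter = runaway_frequency(1, **inputs)
two_meters = runaway_frequency(2, **inputs)
print(f"one meter:  {one_meter:.2e} per year")
print(f"two meters: {two_meters:.2e} per year")
```

With these made-up inputs the second meter raises the total, because the added spurious trips outweigh the small gain in trip reliability. Different inputs could reverse the result, which is why the comparison has to be made for the whole system rather than for the runaway hazard alone.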
The following example is not new - the issues are well known
and appropriate solutions are well documented in industry
standards such as API 520 (API, 1990) and the ASME vessel code (ASME,
1995). However, the phenomenon is not clear to many engineers,
and we have explained it many times in the course of process
hazard analysis studies. Therefore it is useful to document the
concern and appropriate solution as another example of a
protective feature introducing a new hazard to a system.
A rupture disk is frequently installed in series with a
pressure relief valve, as shown in Figure 3, for
a number of reasons. These might include:
In the above situations, the vessel could be protected from
overpressurization by using a rupture disk alone. However, the
rupture disk-relief valve combination offers the potential to
minimize the discharge of material in the event of a vessel
overpressure. The relief valve can close, stopping the discharge,
when the vessel pressure returns to normal. With a rupture disk alone, once the disk
bursts, the flow will continue until the vessel pressure reaches
ambient pressure.
These are "good deeds" - when specifying the design
of Figure 3, the designer is focused on ensuring
that the relief valve will work if there is a demand on it, and
also on minimizing the discharge of hazardous material to the
downstream treatment equipment and, potentially, the outside
environment.
But the designer must also consider the possible punishment
for this good deed. If not properly designed, the system in Figure 3 can nearly double the pressure
at which the relief system will activate. Assume that the normal
pressure in the vessel is P1. The pressure between the
rupture disk and the relief valve, P2, is assumed to
be ambient (0 psig), as is the pressure downstream of the relief
valve, P3. Assume that the rupture disk and relief
valve are both designed to open somewhat above the normal
operating pressure - say at SP (SP > P1). Figure 4 shows the pressures at various points in
the system during normal operation for an example case. This
system will function as designed - if a process upset results in
the vessel pressure P1 increasing to SP, the rupture
disk will burst, and the relief valve, set to open at SP, will
also open, protecting the vessel from overpressure. The pressures
at various locations in the system during an emergency
overpressure situation are shown in Figure 5 and
Figure 6. When the pressure in the vessel falls
to less than SP, the relief valve should close, minimizing the
release of hazardous material.
What can go wrong? Consider what happens if there is a small,
pinhole leak in the rupture disk. The small leak causes a
pressure increase in the pipe between the relief valve and the
rupture disk. P2 will no longer be 0 psig, but will
eventually increase until it is equal to the vessel pressure, P1.
The pressure will remain in the piping between the rupture disk
and the relief valve because the relief valve set pressure of SP
has not been exceeded, so the relief valve will not open. The
pressures at various locations in the system in case of a pinhole
leak in the rupture disk are as shown in Figure 7.
Now what happens when a system upset results in an increase in
the vessel pressure P1? A rupture disk is a
differential pressure device - it bursts when the pressure on the
upstream side exceeds the downstream pressure by the specified
bursting pressure.
Before the relief devices open, and assuming no back pressure
from the relief discharge header (P3 = 0), the pressure in the vessel, P1, can be
written as:
P1 = dp(rupture disk) + dp(relief valve)    (P3 = 0)
dp(rupture disk) = P1 - P2, that is, the pressure in the vessel minus the pressure between the disk and the valve.
dp(relief valve) = P2 - P3, that is, the pressure between the disk and the valve minus the ambient pressure.
For the rupture disk to burst,
P1 - P2 = SP, or 50 psi
For the relief valve to open,
P2 - P3 = SP, or 50 psi
If there is a pinhole leak in the rupture disk, P2
(between the disk and valve) approaches P1, the vessel
pressure, and can be as high as SP without opening the relief valve.
The disk then does not burst until P1 exceeds P2 by a further SP. Substituting,
P1 in the vessel = (P1 - P2) + (P2 - P3)    (P3 = 0)
                 = SP + SP
                 = 50 + 50
                 = 100 psig, in the worst case.
Figure 8 and Figure 9 show
the pressures that may occur at various locations in the system
during an overpressure event if there is a pinhole leak in the
rupture disk. The pressure in the vessel can rise to 90 psig
or higher for this example case, sufficient to burst the
expansion joint on the pipe connected to the vessel, which is
rated for 60 psig.
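The effect of trapped pressure on the relieving pressure can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the simple model used above: the disk bursts when P1 exceeds P2 by its burst differential, and the valve opens when the pressure under it exceeds P3 by its set pressure. The function name is illustrative, and the 40 psig trapped-pressure case is an assumed normal operating pressure that is not stated in the text.

```python
def vessel_pressure_at_relief(disk_burst_dp, valve_set, p2, p3=0.0):
    """Vessel pressure P1 (psig) at which both devices have opened.

    disk_burst_dp: differential pressure that bursts the disk (psi)
    valve_set:     relief valve set pressure above P3 (psi)
    p2:            pressure trapped between disk and valve before the upset (psig)
    p3:            back pressure downstream of the relief valve (psig)
    """
    p1_to_burst_disk = p2 + disk_burst_dp
    p1_to_open_valve = p3 + valve_set  # once the disk is gone, P2 equals P1
    return max(p1_to_burst_disk, p1_to_open_valve)


SP = 50.0                 # psi, from the example in the text
EXPANSION_JOINT = 60.0    # psig rating, from the text

cases = [
    ("intact disk", 0.0),
    ("pinhole leak, P2 at assumed 40 psig operating pressure", 40.0),
    ("pinhole leak, P2 just below valve set pressure", 50.0),
]
for label, p2 in cases:
    p1 = vessel_pressure_at_relief(SP, SP, p2)
    flag = "exceeds" if p1 > EXPANSION_JOINT else "within"
    print(f"{label}: relief at {p1:.0f} psig ({flag} expansion joint rating)")
```

In this model any trapped pressure above 10 psig is enough to carry the vessel past the 60 psig expansion joint rating before the relief system acts.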
For a pinhole leak in the rupture disk, some have suggested
that a slow increase in the vessel pressure, P1, would
allow time for the relief valve to open slightly when P2
reaches the relief valve set pressure, SP. However, the relief
valve will reseat with P2 still at the relief valve
set pressure, SP. And, the vessel pressure P1 can
still increase to as much as twice SP, because the rupture disk
with a pinhole leak prevents P2 from increasing as
rapidly as P1. Thus, a slow increase followed by a
rapid increase in the vessel pressure may give the highest
pressure in the vessel.
As stated, this concern is well known, although surprisingly
unfamiliar to many engineers. Some ways of dealing with this
issue include:
Both of these alternatives are recognized as acceptable design
options by API 520 (API, 1990) and the ASME vessel code (ASME,
1995). Both require a management system to ensure that the
protective features are not compromised by plugging of the hole
or failure of the instruments or alarms. It is also essential
that personnel understand the reason for the protective systems
so that they know the proper response for an alarm or observation
of high pressure between the rupture disk and relief valve, and
so that the systems are not defeated by a future change.
Note the trade-off between monitoring the relief valve outlet for fugitive emissions and monitoring the pressure in the pipe between the rupture disk and relief valve.
A plastics manufacturing plant included a grinder to eliminate
oversize plastic particles from the final product. The plastic
powder was being conveyed to and from the grinder by an air
conveying system, and there was a potential for a dust explosion.
Because of the location of the grinder and its associated piping,
it was not practical to protect the system with explosion vents,
and a chlorofluorocarbon (CFC) suppression system was designed to
protect the equipment against dust explosion, following the
design requirements of NFPA 69 (NFPA, 1992). Figure
10 shows the grinder and its suppression system as a general
schematic. The pressure sensor was designed to detect the onset
of a dust explosion and rapidly release the CFC into the grinder
and its associated piping to quench the explosion before it
generated enough pressure to damage the equipment. The
suppression system is a safety feature designed to protect the
equipment and personnel should a dust explosion occur.
This system operated without incident for many years, and the
suppression system was never challenged - no dust explosion
occurred. Then, after a number of years, the grinder did explode,
and the cause was the suppression system itself! Process upsets
elsewhere in the manufacturing facility resulted in water getting
into the plastic grinding system. The water accumulated in the
bottom of the piping below the grinder and eventually reached a
point where the water pressure was sufficient to activate the CFC
suppression system. Because the pipe and ducts were partially
filled with water, and the plastic powder was wet and did not
flow freely, the CFC suppressing agent was unable to flow easily
through the system. The pressure of the CFC released into the
grinder was sufficient to overpressurize it, resulting in failure
of the grinder. Fortunately, the area was unoccupied at the time
and there were no injuries.
Again, this is an example of the importance of identifying and evaluating all potential failure modes when designing a system. The explosion suppression system was designed and specified assuming that the grinder and its associated piping would be filled with dry, free-flowing powder if the system was triggered. The scenario in which water entered the system, both triggering the suppression system and simultaneously restricting the ability of the CFC suppressing agent to flow freely through the process, resulting in overpressurization of the grinder, was not recognized. The result of this incident was a redesign of this particular system, and a thorough review of other similar systems throughout the company to search for similar hazards (Bernard et al., 1997).
Thirty to fifty years ago in the USA, chemical plants
frequently had a local vent for each tank and distillation column.
In this example from Dowell (1996), a feed tank of API design was
vented to a lifting lid for both normal and emergency venting (Figure 11). The distillation column of 1
atmosphere (15 psig) design had its normal vent to the product
tank. Since one of the compounds in the column was toxic with a
TWA of 10 ppm, the column's emergency vent went to a tall stack.
As the concern for the environment increased during the 1960-1990
period, the plant installed collection headers for the normal
vents in the process and sent them to a flare at the perimeter of
the unit. This approach was a good deed to eliminate the
emissions from normal operations of emptying and filling tanks,
ambient temperature changes, etc. The feed tank emergency vent
continued to be the lifting lid, since the API design of the tank
would withstand only a few ounces of pressure. By 1990, the
system had evolved to that shown in Figure 12.
However, the environmental good deed introduced a new hazard.
The changes had interconnected the normal vents of the distillation
column and the feed tank. When there was a pressure excursion in
the distillation column - say, from low coolant flow to the
condenser - the small pipe in the long run to the flare did not
have enough capacity to relieve all the pressure. The lifting lid
on the feed tank opened at a few ounces of pressure, venting
toxic gas in the tank farm at an elevation of 6 m (about 20
feet). Releases from the lifting lid affected personnel in this
and neighboring units with an acute health hazard, requiring
evacuations (Dowell, 1994).
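The weak-link behavior of the interconnected header can be illustrated with a simple comparison of set pressures. The following is a rough sketch, assuming that back pressure in the shared vent header reaches every connected device equally. The 4 oz/in^2 lid setting is a hypothetical stand-in for "a few ounces", and the column relief is assumed to be set at the 15 psig column design pressure.

```python
OUNCES_PER_POUND = 16.0


def opened_devices(header_pressure_psig, set_pressures_psig):
    """Return the names of devices whose set pressure has been reached."""
    return [
        name
        for name, set_pressure in set_pressures_psig.items()
        if header_pressure_psig >= set_pressure
    ]


set_pressures_psig = {
    # Assumed 4 oz/in^2; the text says only "a few ounces".
    "feed tank lifting lid": 4.0 / OUNCES_PER_POUND,
    # Assumed equal to the 15 psig column design pressure.
    "column emergency relief to tall stack": 15.0,
}

for pressure in (0.1, 1.0, 15.0):
    opened = opened_devices(pressure, set_pressures_psig)
    print(f"{pressure:5.1f} psig: {opened if opened else 'nothing open'}")
```

Under these assumptions, a back pressure of even 1 psig lifts the feed tank lid long before the column's own emergency relief to the tall stack would open, so the release occurs at the low-elevation lid.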
To avoid such incidents, unit procedures specified a quick
column shut-down for high column pressure or temperature.
Additionally, one cause of high distillation column pressure had
been identified as low coolant flow and a low coolant flow
interlock (shown by "FL" in a diamond) had been added
to trip the steam to the column. Presumably an earlier incident had been
caused by low coolant flow. However, several other causes of high
column pressure existed and there were no interlocks to protect
against them.
An additional good deed to increase the column design pressure
to 3.3 atm (50 psig) was planned in the early 1990s. The increase
in column design pressure and relief valve set pressure would
make the release from the feed tank worse. Process hazard
analysis studies identified additional causes of high column
pressure, and interlocks were installed for high coolant
temperature, high column temperature, and high column pressure.
These interlocks tripped the steam and are shown as diamonds
within ovals in Figure 12.
In 1996, inherently safer design features were installed as
shown in Figure 13 - the feed tank was replaced
with a new tank designed for 3.3 atm (50 psig). The feed tank
normal vent continued to go to the flare and the feed tank
emergency vent was routed to the tall stack via a dedicated line.
This system:
This case study teaches:
New hazards can be introduced as an unanticipated side effect
of a modification which was originally designed as a safety
feature. Some other examples from various areas of technology
include:
There is no substitute for a full understanding of how a system works.
Any change to a system, including adding a safety feature (a good deed), introduces new failure modes and mechanisms (punishment). First-pass intuitive analysis may give an inaccurate perspective - remember the stone columns and the pinhole leak in the rupture disk. The changed system must be thoroughly reviewed to understand the new failure modes and to protect against them. It is more effective to gain this understanding before the change is made (management of change, process hazard analysis) than to figure it out during an incident investigation.
To Stan Anderson, Kathy Pearson-Dafft, Frank Worley and Don Zolotorofe for their insightful review.
Although we believe the information contained in this paper is factual, no warranty or representation, expressed or implied, is made with respect to any or all of the content thereof, and no legal responsibility is assumed therefore. The examples shown are adapted from actual experience and are simply for illustration; as such they do not necessarily represent Rohm and Haas Company guidelines. The readers should use data, methodology, and guidelines that are appropriate for their situations.
American Petroleum Institute (API) RP 520 (1990). Sizing, Selection, and Installation of Pressure-Relieving Devices in Refineries. Part I - Sizing and Selection. 5th Edition. (July). Washington, D. C.: American Petroleum Institute.
American Society of Mechanical Engineers (ASME) (1995). Boiler & Pressure Vessel Code. Section VIII, Division I, p. 93. New York: American Society of Mechanical Engineers.
Bernard, L., F. Brodie, D. Ludwig, A. Ness, and K. Weidner (1997). "Vent Access Restriction for Solids Handling Systems." Proceedings of the 31st Annual Loss Prevention Symposium, March 10-13, 1997, Houston, TX, Paper 40b. New York: American Institute of Chemical Engineers.
Dowell, A. M., III, (1994). "Low Cooling Tower Level Causes Evacuation." Process Safety Progress, 13, 1, (July), 114-122.
Dowell, A. M., III, (1996). "Vent Systems: Life Cycle and Inherently Safer Concepts." International Conference and Workshop on Process Safety Management and Inherently Safer Processes, sponsored by the Center for Chemical Process Safety, October 8-11, 1996, Orlando, FL, Workshop F: Case Studies on Inherent Safety: Cost Benefit Analysis; Life Cycle Cost.
Galileo (1638). Dialogues Concerning Two New Sciences, trans. H. Crew and A. de Salvio. New York: McGraw-Hill (Republished in 1963).
National Fire Protection Association (NFPA) NFPA 69 (1992). Explosion Prevention Systems. Quincy, MA: National Fire Protection Association.
O'Donnell, J., and J. R. Healey (1996). "Feds Poised for Emergency Air-Bag Action." USA Today (November 11) Sec. B, p. 1.
Petroski, H. (1992). "History and Failure." American Scientist 80, 6 (November-December), 523-26.
Petroski, H. (1994). Design Paradigms: Case Histories of Error and Judgment in Engineering. New York: Cambridge University Press.
Powers, G. J. (1989). "A Short Course on Risk and Reliability Assessment by Fault Tree Analysis," Carnegie Mellon University, Pittsburgh, PA. (April 25).
Riemerman, P. A. (1994). "Keeping Cool Gets Costlier." The Intelligencer, Doylestown, PA (July 10) Sec. A, p. 6.
Figure 1: Failure modes of columns supported
off the ground to prevent discoloration from contact with the
soil
Figure 2: Proposed metering system for Raw Material A
Figure 3: Emergency relief system with rupture disk and relief valve in series
Figure 4: Pressure during normal operation with intact rupture disk
Figure 6: Pressure graph for intact rupture disk prior to overpressure
Figure 7: Pressure during normal operation with a pinhole leak in the rupture disk
Figure 8: Pressure graph for rupture disk with a pinhole leak
Figure 10: Plastics grinder with explosion suppression system
Figure 11: Original design with local normal vents