Sunday, August 17, 2008

Half a HAZOP!

Since ICI told the world about its hazard and operability studies in the early 1970s, HAZOP studies, as they have come to be called, have been one of the preferred tools for risk assessment in the process industries. A HAZOP study uses a systematic team approach to investigate the causes and consequences of relevant deviations from normal operations, in order to improve the ability of the process to handle such deviations.

Over the years this approach has been adopted in many fields outside the process industries. Recently, while looking for references to the early descriptions of the method by ICI, I came across a publication which claimed to describe the use of HAZOP in the study of intelligent traffic systems. This article, "Application of Hazard and Operability Studies (HAZOP) to ISA and Speed Humps in a Built-up Area", was presented at the e-SAFETY conference in Lyon, France in September 2002. The paper presents a deviation matrix for two viewpoints on a traffic situation: a) that of a single participant in the situation, and b) the situation as a whole. Already here the parallel to HAZOP studies as known in the process industries appears weak. The process analogy to these two viewpoints would be a) that of a single process operator or other process worker (technician, contractor, engineer), and b) that of the process unit or plant. I don't know whether these two viewpoints would be relevant to the process industry or not. It could be argued that the first viewpoint would be a personal safety viewpoint, and the second a process safety viewpoint.

The article by Jagtman and Heijer does, however, limit itself to presenting causes of deviations. For the traffic speed adjustment system the parameters considered are speed, direction, location, attention and travel time. The deviations are no (none), too high, too low, wrong, fail of, part of, unknown and unexpected. The matrix of relevant deviations is relatively sparse, with less than 50% of the parameter and deviation combinations considered relevant by the authors. This, and the fact that the article doesn't consider the consequences of the deviations, makes me think that maybe a HAZOP study is not the best tool for studying this system. Through 10 years of teaching risk assessment to chemical engineering students at a university in Denmark, I have always struggled with explaining to students when the different qualitative and quantitative tools - HAZOP, What-if, FTA, ETA, FMEA and QRA - should be used, and which tool to start with. Generally I have recommended starting with What-if or HAZOP studies (HAZOP is easier with limited process knowledge - especially if you use a functional approach during study preparation).
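To make the structure of such a deviation matrix concrete, here is a minimal sketch in Python. The parameters and deviation guide words are those listed above; the particular cells marked as relevant are invented for illustration and are not the actual entries from the Jagtman and Heijer paper.

```python
from itertools import product

# Parameters and deviations (guide words) from the traffic HAZOP described above.
parameters = ["speed", "direction", "location", "attention", "travel time"]
deviations = ["no (none)", "too high", "too low", "wrong",
              "fail of", "part of", "unknown", "unexpected"]

# Hypothetical selection of relevant (parameter, deviation) pairs.
relevant = {
    ("speed", "too high"), ("speed", "too low"),
    ("direction", "wrong"), ("location", "wrong"),
    ("attention", "no (none)"), ("attention", "part of"),
    ("travel time", "too high"),
}

total = len(parameters) * len(deviations)   # 5 parameters x 8 deviations = 40 cells
density = len(relevant) / total             # sparse: well under half the cells
print(f"{len(relevant)} of {total} cells considered relevant")

# A full HAZOP record would attach causes AND consequences to each relevant cell;
# recording only causes, as the paper does, leaves half the structure empty.
study = {cell: {"causes": [], "consequences": [], "safeguards": []}
         for cell in product(parameters, deviations) if cell in relevant}
```

The point of the last dictionary is exactly the "half a HAZOP" complaint: the data structure has a slot for consequences whether or not the team fills it in.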

However, the analysis of causes of deviations in a traffic control system presented by Jagtman and Heijer indicates that maybe FTA and ETA would have been better tools. Since the article only presents causes of deviations but does not discuss consequences, it is really only half a HAZOP study. I think a better choice of tool would be the Fault Tree Analysis (FTA) approach, in which identifying all causes of a top event, e.g. failure of the traffic control system, is the goal of the analysis. ETA, or Event Tree Analysis, could complete the study by showing possible consequences of particular failures.
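For readers less familiar with FTA: a fault tree combines basic events through AND and OR gates up to a single top event. A minimal sketch in Python, with the basic event names invented for illustration (they are not taken from the paper):

```python
# Minimal fault-tree evaluation. The top event is "traffic control
# system fails"; the basic events feeding it are hypothetical.

def or_gate(*inputs):
    """Gate output occurs if ANY input event occurs."""
    return any(inputs)

def and_gate(*inputs):
    """Gate output occurs only if ALL input events occur."""
    return all(inputs)

def system_fails(sensor_fails, controller_fails,
                 primary_power_lost, backup_power_lost):
    # Power is only lost if BOTH supplies fail (AND gate),
    # but any of the three branches alone causes the top event (OR gate).
    power_lost = and_gate(primary_power_lost, backup_power_lost)
    return or_gate(sensor_fails, controller_fails, power_lost)

# Losing only the primary power supply does not trigger the top event...
assert not system_fails(False, False, True, False)
# ...but losing both power supplies does.
assert system_fails(False, False, True, True)
```

The value of the tree is that it forces the analyst to enumerate all cause paths to one clearly stated top event, which is exactly what a causes-only deviation matrix does not do.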

Thus it is important not to consider HAZOP studies, or any other tool such as layer of protection analysis, as a universal tool for all risk assessment studies. Rather, the tool among the many available which best serves the purpose of the study should be chosen. Choosing the right tool for a study requires insight both into the system to be studied and into the different tools available for risk assessment.

Monday, August 11, 2008

Equipment to be banned from plants!

For more than 25 years I have felt that sight glasses don't belong in chemical plants. Three years ago the list was expanded with blowdown drums venting directly to the atmosphere. Why? Because there are inherently safer alternatives available.

I would also like to see these kinds of equipment eliminated from the textbooks used to educate new engineers at our universities. Maybe authors could introduce a chapter about historical and obsolete equipment in textbooks about chemical unit operations and textbooks about instrumentation.

Recently I was greatly surprised by the article "Level: A visual concept" in the April issue of ISA's InTech. The article appeared in the section Automation Basics, and was actually based on the book "Industrial Pressure, Level and Density Measurement", soon to be available from the ISA bookstore. My surprise was caused by the fact that the article appeared to advocate traditional sight glasses with a glass tube in a metal shield for pressures up to a few hundred PSI. My first job was at a Canadian chemical plant in Sarnia, and when I joined in the early eighties all sight glasses had already been taken out of service. Why? Because sight glasses are a major hazard to the workers who attempt to use them to determine a drum level. Today alternatives are available using steel tubes - even with remote readout of the level. Such inherently safer alternatives should be used wherever a sight glass was called for 30 years ago.

In the mid 90's a refinery at Milford Haven in southwest England burned for several days when a blowdown drum was overfilled with liquid and the gas outlet fractured due to the liquid head. Three years ago another blowdown drum overfilled at a refinery in Texas City, resulting in an explosion and fire which killed 15 workers. Blowdown drums are necessary to protect our refineries and chemical plants. Usually you don't want to use them, since anything which leaves the plant through the blowdown drum is a loss. Therefore it is easy to forget about a piece of equipment which is not in use during normal plant operations. At Milford Haven the gas pipe from the drum was corroded past its useful life. Both at Milford Haven and at Texas City the control room operator had no information about the level in the blowdown drum.

This indicates a common problem: our plants contain equipment which is only in use during abnormal operational situations. This includes flare systems, safety valves, and the piping associated with these systems. However, these systems should be maintained as well as the reactor or distillation tower. Only then can we be sure that they will protect our lives and our plant in an emergency situation.

So if you have sight glasses with glass tubes, then replace them with inherently safer, more modern sight glasses. If you have blowdown drums in your plant, then ensure that they are tied to your flare system, and that this equipment is well maintained so it works when needed. Only then can you limit the consequences of process safety events in your plant.

Sunday, August 10, 2008

The biggest danger to your plant!

There have been different efforts at establishing what is most important. For example, the Copenhagen Consensus among economists attempts to decide what is the best investment society can make for the future of mankind. Likewise the British Government has attempted to decide what is the biggest risk to British society. It finds that the biggest risk is not terrorism, but a flu epidemic. The question which immediately comes to mind is: what is the biggest danger to your plant?

Many facilities, especially in North America, have since 9/11 performed several security assessment studies and reported the results to the ACC and possibly also the Department of Homeland Security. These studies aim to proactively assess the likelihood of a terrorist attack on a particular facility and the potential consequences of such an attack. During a visit to Baton Rouge a few years back I noticed that the result is a reinforced perimeter around the plants, using e.g. concrete blocks to prevent large trucks from driving through the fence. I have yet to see similar efforts at facilities in my own country, although the attack on Glasgow Airport a few years ago has already resulted in the placement of concrete blocks around other airports, such as at Terminals 2 and 3 at Copenhagen Airport. This indicates that some security measures spread quicker around the world than others.

Many years ago, at a biweekly safety seminar in a major Canadian oil company, the engineering manager asked the audience: what is the most important thing about you for the company? There were many and varied suggestions. The manager's own answer was: your health!

Without good health you may not be able to go to work. This usually means something will not get done. If a process operator calls in sick, then most companies have plans in place for calling in a replacement, so process safety is not compromised - at least short term. If an engineer calls in sick, then the likely consequence is usually that some development work or some maintenance work gets postponed - at least short term this is not a process safety concern.

Now let us assume that a flu epidemic strikes your area. Unfortunately current vaccines are not effective against this particular virus, and within a short period a quarter of the population is sick with this flu. Can your company cope with a quarter of your employees being sick at the same time?

So maybe you should be as concerned with employee health as you are with process safety!

Friday, August 08, 2008

Is more regulation the road to better process safety?

Recently the CSB chairman John Bresland called for OSHA to adopt CSB recommendations on a comprehensive combustible dust standard. It seems that on both sides of the Atlantic politicians and regulators are very quick to suggest more regulation after a process safety incident, such as the explosion and fire at Imperial Sugar Company in Port Wentworth, Georgia, on February 7th this year.
Is more regulation really the way forward? Does regulation really improve process safety? I don't believe it does. Even companies with excellent process safety systems, such as e.g. the Dow Chemical Company, experience process safety events from time to time, which makes employees ask the question: is it really worth the effort, when we still experience these accidents?
Increasing process safety, in my view, is mostly about using one's own common sense. Or as one major Canadian oil company put it: "Safety is the art of working properly".

Nonetheless I read on TMCnet.com an excerpt from an interview with the CEO of Imperial Sugar. In it he says among other things: "We have treated worker safety as our top priority ... and will continue to do so." I think that this focus is completely wrong! On another web site this CEO seems to claim that his company's lack of knowledge about the dangers of dust explosions is due to inadequate federal guidelines on the handling of dust. I think that indicates missing the point as far as responsibility is concerned! Isn't it the responsibility of the CEO to ensure that he hires people with the skills necessary to safely operate his company?

At BP's Texas City refinery there was also a focus on worker safety prior to the 2005 explosion and fire, which killed 15 people. Both the CSB report and the Baker Panel's report (also found on the CSB web site) on that event seem to indicate that there should have been more focus on process safety. I completely agree.

A focus on process safety will ensure that hazardous substances and equipment are operated and maintained in such a way that workers cannot be injured. The result of failing process safety is that workers can be injured. We need to avoid that possibility!

So in my view the best thing which can be done for worker safety is to have the CEO focus on process safety. That way the CEO protects the shareholders from losses such as those involved in rebuilding the Port Wentworth plant after the February fires and explosions. That way the CEO also protects the employees from the consequences of fires and explosions, such as injuries and loss of employment, and hopefully he/she also hires the people with the skills necessary for safe operation of his/her facility.

Now, how do we get the CEO to focus on the right thing? The CCPS has created a 10-minute presentation for CEOs about the importance of process safety. The EFCE Working Party on Loss Prevention and Safety Promotion is developing a video with the same purpose, and the CCPS is developing seminars with a similar focus according to its web site. Will this do the job?

I don't know.

Thursday, August 07, 2008

What to be alarmed about?

As a young engineer with a major integrated oil company in Canada I had little conception of when an alarm was needed and what was required to implement one. However, that situation was quickly corrected by a more experienced instrumentation engineer with whom I worked on computer applications for an ethylene unit.
At the time, more than 20 years ago, he told me that if we generate an alarm, then the operator must act on it. The least he should do would be to acknowledge the alarm. For good alarms, however, the engineer should also provide suggested responses to the alarm. He then added: if you cannot come up with a suggested response, then forget about the alarm, since all it will do is contribute to operator frustration. So instead of generating alarms or messages to the operator when our advanced computer control applications for some reason could not do what they were supposed to do, we implemented graceful degradation. That meant that on failure of a computer control application it degraded transparently to standard Honeywell TDC 2000 control strategies - and those worked all the time.
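The "no alarm without a suggested response" rule can even be enforced at configuration time. A minimal sketch in Python; the tag names, messages and responses are invented for illustration, not taken from any real system:

```python
from dataclasses import dataclass

@dataclass
class Alarm:
    tag: str
    message: str
    suggested_responses: list  # what the operator should actually do
    acknowledged: bool = False

def configure_alarm(tag, message, suggested_responses):
    """Refuse to configure an alarm that has no defined operator response."""
    if not suggested_responses:
        raise ValueError(f"alarm {tag!r} rejected: no operator response defined")
    return Alarm(tag, message, list(suggested_responses))

# An alarm with defined responses is accepted...
hi_level = configure_alarm(
    "LI-101-HI", "Blowdown drum level high",
    ["Stop feed to flare header", "Verify drum pumpout is running"])

# ...while a response-less "nuisance" alarm is rejected at design time,
# before it can ever frustrate an operator.
try:
    configure_alarm("XX-999", "Computer application hiccup", [])
except ValueError as err:
    print(err)
```

Rejecting the alarm at configuration time puts the burden where it belongs: on the engineer defining the alarm, not on the operator receiving it.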

With this philosophy about alarms, this and other units at the site were able to achieve alarm rates below the current EEMUA recommendation of 1 alarm every 5 minutes - without any use of the alarm management (read: alarm filtering / inhibition / removal) applications which have become popular in recent years.
However, there is a fundamental question which we did not address at the time in the mid-eighties: what should we alarm about in our refineries, petrochemical plants and pharmaceutical plants?
A few studies have addressed special situations. I am thinking of PCA for monitoring batch operations, or crude mass balance simulations to discover when we are trying to put too much stuff where it should not be, such as during the startup of the raffinate splitter at BP's Texas City refinery in March 2005.

Fundamentally, an alarm should be generated when an operator action is needed because a system goal cannot be achieved - or, as M.Sc. student Tolga Us at the Technical University of Denmark recently expressed it, when the goal is under threat. So to decide when to generate alarms we need to look at the goal of our system or subsystem, and specifically at when there is a danger or chance that this goal cannot be fulfilled.
So let us look at the application of this principle to a type of unit which I am somewhat familiar with: an ethylene gas cracker. The purpose of the ethylene gas cracker is to convert X kg/hr ethane to Y kg/hr ethylene. Notice the quantification! Without quantification we cannot define deviation from a goal - this of course also applies to power plants generating electricity.

So the purpose of our gas cracker is to convert X kg/hr of ethane-rich natural gas to Y kg/hr of almost pure ethylene and of course some by-products. This process is essentially two connected processes: the gas cracking furnace, in which the actual chemical conversion is performed, and the so-called light end, in which the compounds from the furnaces are separated into pure or almost pure compound streams.
One of the by-products of the gas cracking process is coke. Coke builds up in the furnace tubes, and if there is too much coke in the tubes they could block, and conversion will cease. So one way the goal of conversion could be prevented is by too much coke in the furnace tubes. The amount of coke generated depends among other things on the temperatures in the furnace. Hence wrong temperatures in the furnace could also prevent the conversion process, by speeding up coke formation in the furnace tubes. So some things to be alarmed about in the conversion process are too much coke in the furnace tubes and incorrect furnace temperatures.
What does this mean in terms of goal fulfilment? If the coke build-up in the cracking furnace is too high, then the ethylene production goal is threatened. However, if the furnace temperature deviates from normal, then the ethylene production goal could be threatened in the future.

In practice the coke build-up is monitored by monitoring the pressure drop over the transfer line heat exchanger at the end of the cracking tube. If this pressure drop exceeds a certain value, which depends on many factors such as feed composition, cracking severity, etc., then the operator should be alerted that a de-coke is called for. Similarly, if the temperature(s) deviate from normal, then the operator probably should be alerted, so the situation does not develop into a coke build-up situation. Possible interventions could be adjustment of the secondary air flow to the furnace or adjustment of the gas-feed firing ratio.
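The pressure-drop check above can be sketched in a few lines of Python. The threshold correlation and all numbers below are invented placeholders; a real limit would come from the licensor or from plant experience with feed composition and cracking severity.

```python
# Goal-based alarm check for coke build-up, monitored via the pressure
# drop (dP) over the transfer line exchanger (TLE). All coefficients are
# hypothetical illustrations, not real furnace data.

def dp_limit_kpa(feed_ethane_frac, severity):
    """Hypothetical dP limit: richer feed and higher cracking severity
    coke the tubes faster, so we alert at a lower pressure drop."""
    base_kpa = 120.0
    return base_kpa - 30.0 * feed_ethane_frac - 20.0 * severity

def check_decoke(dp_kpa, feed_ethane_frac, severity):
    """Return an alert message with a suggested response, or None."""
    limit = dp_limit_kpa(feed_ethane_frac, severity)
    if dp_kpa > limit:
        return (f"ALERT: TLE dP {dp_kpa:.0f} kPa exceeds limit "
                f"{limit:.0f} kPa - schedule furnace de-coke")
    return None

print(check_decoke(95.0, feed_ethane_frac=0.9, severity=0.8))  # near end of run
print(check_decoke(40.0, feed_ethane_frac=0.9, severity=0.8))  # freshly de-coked
```

Note that the alarm carries its suggested response ("schedule furnace de-coke") with it, in line with the alarm philosophy discussed earlier, and that it fires only when the production goal is actually threatened.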

So in conclusion: we need to be alarmed about situations which threaten the fulfilment of one or more system goals, such as production goals. I believe that this would limit the number of alarms configured on a particular system, and hence some features of alarm management systems may not be needed.

Wednesday, August 06, 2008

What does it take to survive a double digit fatality event?

Recently The ChronicleHerald reported that disaster still looms at the BP Texas City refinery (unfortunately the story is no longer available online - not even cached by Google!). This made me think about other companies which have experienced double digit fatality events.

The first which comes to mind is Union Carbide. After the 1984 Bhopal disaster the company struggled for some years before the remains were bought by the Dow Chemical Company. Another that comes to mind is Nypro Ltd, the owner/operator of the plant that exploded at Flixborough ten years earlier. I seem to recall that the company initially attempted to relocate production to Eastern Europe, but eventually the effort was abandoned.

Then there is the Phillips 66 polyethylene plant explosion in Pasadena, Texas, on October 23rd, 1989, which killed 23 people - among them most of the people knowledgeable about the facility. At the 2003 SACHE Workshop for professors at ExxonMobil's Baton Rouge facility, Angela Summers of SIS-TECH reported that the follow-up on the 1989 explosion in Pasadena, and other explosions at the site during the nineties, eventually involved more than 3,300 action items - among them a completely new management at all levels of the plant. The result appears to be a culture change, but it has certainly taken some time: more than 10 years.

Then there is BP's Texas City refinery, where the Associated Press story reported by ChronicleHerald.ca wants us to believe that another disaster is just around the corner. Could that really be true? Within half a year of the event in 2005 the BP board had removed all but one person in the command line from the manager of the refinery to the CEO of the company. BP has also started a world-wide education programme in process safety for all their employees.

From reading the many excellent reports about the BP Texas City refinery, such as BP's accident investigation report, the OSHA report on the event, the CSB report on the event, and the Baker Panel report, it seems clear that a culture change was called for at BP's Texas City refinery. I don't believe that a culture change at a major refinery is easily accomplished. However, I recall a presentation by a Canadian company at the CCPS conference in Toronto about a month after 9/11. The company reported that before a new CEO arrived they were experiencing many small fires and explosions in their facilities. Then a new CEO arrived, who demanded to have a report on his desk every day before 10 AM about any fires or explosions the previous day - no matter how small. This action by the CEO put focus on what most considered nuisance events, and within a relatively short period of time - less than six months, I believe - these small fires and explosions had been virtually eliminated. That was a quick culture change.

To me it appears that a similar culture change is called for within BP. BP is much larger than the Canadian company mentioned above. Some would argue that it would be impossible to implement a similar reporting system within the BP organization, and that it would take CEO time away from other important business issues. However, the CEO is responsible for the survival of the company, and many people believe that BP would not survive another event like the March 2005 explosion at Texas City. The CEO must put focus on process safety, and by demanding daily reporting of process safety related events on his desk he does exactly that!

As did the CEO of the Canadian company!

Sunday, August 03, 2008

Cyber Security or Process Safety?

Recently ISA's InTech magazine printed the article "Peril in the pipeline", with the subtitle "Cyber security could have precluded gasoline rupture at Washington pipeline" (InTech, June 2008). The background for the article was the rupture of a 16-inch buried pipeline running through Whatcom Falls Park in Bellingham, Washington, on June 10, 1999. The rupture resulted in a fireball travelling 1½ miles downstream from the rupture location and killed 3 people.
Since pipelines are classified as a transportation activity, the accident was investigated by the NTSB, which on October 8, 2002 issued the pipeline accident report "Pipeline Rupture and Subsequent Fire in Bellingham, Washington June 10, 1999". If you read just the InTech article you could easily be left with the impression that the accident was caused by the pipeline owner performing development work on its SCADA system without adequate protection of the running system from the development activities, and that on the day of the pipeline rupture this development work had resulted in degraded responsiveness of the pipeline monitoring system. Unless you read the InTech article very carefully you are left with the impression that "the accident resulted from the database development work that was done on the SCADA system" on the day of the accident.
Wait a minute! This can't be. One cannot design a pipeline such that the only protection from an overpressure event is a remote SCADA system. There must be independent protective systems which protect a pipeline from an overpressure event! Actually, if one digs into the NTSB report, one discovers that the database development work is just the 5th probable cause of the accident listed. The other four are:
  1. damage done to the pipe during a water treatment plant modification project and inadequate inspection work during the project;
  2. inaccurate evaluation of inline pipeline inspection results, which led to the company's decision not to excavate and examine the damaged section of pipe;
  3. failure to test, under approximate operating conditions, all safety devices associated with a products facility before activating the facility; and
  4. failure to investigate and correct the conditions leading to the repeated unintended closing of an inlet block valve to the products facility.
In my view these four probable causes of the pipeline rupture all fall under the heading of process safety. Clearly the pipeline owner did not have adequate process safety procedures in place, and that resulted in the pipeline rupture. The cyber security issue, i.e. the SCADA system degradation on the day of the event, was just a contributing factor, which prevented operators from intervening in a situation where automatic systems should have prevented the event.

Nonetheless, it is not good practice to perform systems development work on an operating SCADA or BPCS, especially when this is done without the benefit of the security features built into systems such as the VAX multiuser operating system. Development of system software or even key applications should not be done on operating SCADA or BPCS systems without a prior assessment of the risk using established MOC procedures.