Sunday, December 30, 2012

Read original postings carefully before replying!

On LinkedIn there are many groups with excellent discussions of issues relating to process safety, such as emergency response. One of the several groups that I follow is EHS Professionals. Even in this group it is a good idea to read the original post carefully before responding. Sometimes the original poster is not looking for advice or discussion, even though at first glance it would appear so. About two weeks ago the following post appeared:

Emergency Management in Petrochemical Plants

Looking for training on managing emergency cases in petrochemical plans such as fires, explosions, chemical/gas spills. Contact

Thursday, September 13, 2012

CSB adds nice features to investigation reports

Lately I have read the CSB report on the three process safety events at DuPont's Belle site in the Kanawha Valley in January 2010. What a change it was from one of the first CSB reports I read - the one on the explosion and fire at Morton International, which was later acquired by Dow Chemical. The report about the toxic releases at the DuPont site was issued early in the fall of 2011 after less than two years of investigation.
This report includes two features which I have not seen before: a logic tree covering all three releases and showing root causes for each of the events, and a timeline diagram showing the company history with details of the three recent events. I wonder what information could be gained from a logic tree of all events investigated in a given year? Would such a logic tree give information about the types of systems that failed? And maybe this changes from year to year based on external factors.
The Kanawha Valley became known in the process safety community when 18 chemical companies in the Valley compared worst case scenarios and made the information public, not only at a town hall meeting but also by having plant managers face residents at a local mall the following Saturday. The LEPC at the time commended the companies for making the information public. Since then there have been a number of events which got the attention of the CSB, such as the explosion and fire at Bayer Crop Science, which almost damaged an MIC tank. According to CSB calculations the phosgene release at DuPont in January 2010 led to unhealthy off-site concentrations.
Every CSB investigation report contains valuable information about process safety events for others to learn from. However, the very specific nature of the investigations means that general lessons are somewhat difficult to extract. Often the failing systems, such as design, training, emergency response, incident reporting, etc., are not explicitly stated. Neither is it clearly stated what systems the recommendations are aimed at. And navigational features to easily jump from e.g. the executive summary to the recommendations of interest are to a large extent missing. Hopefully this will improve.

Tuesday, July 24, 2012

Longer than most engineering educations!

How long does it take to learn from a major process safety event? The answer is longer than most engineering educations!
Yesterday the CSB stated in a news item published after the first day of hearings in Houston that at the time of the Gulf blowout companies such as BP and Transocean, industry associations and government offshore regulators had not learned critical lessons from the BP Texas City Refinery explosion five years earlier. That supports the viewpoint stated earlier here and on other platforms, that a key problem with process safety is that industry does not learn well enough from past industry disasters - not least from the accidents of others.
At today's hearing we heard that the US offshore industry is behind the European offshore industry in collecting and sharing key performance indicators, and in learning from these shared indicators. This is taking place on a voluntary basis. Even in Norway, however, regulators monitor the change in risk level from year to year, while decisions that lead to disasters are taken from day to day or even hour to hour. Annual changes in risk level indicators are of little use here - in my view.
What would be a good indicator for learning from the bad fortune of others? Would stories help? Stories about the fathers or mothers or sons or daughters - people like you and me - killed or injured by process safety events.

Saturday, July 21, 2012

What can data do?

On July 19th in the Houston Chronicle the chairperson of the CSB, Rafael Moure-Eraso, wrote under the heading "Better safety data could help prevent oil industry disasters" about the explosion and fire at BP's Texas City Refinery in March 2005, and the explosion and fire on Deepwater Horizon while it was working for BP in the Gulf of Mexico in April 2010. Mr. Moure-Eraso argues that these disasters show the need for developing a system of meaningful safety indicators to alert about safety problems before a disaster strikes. I think this is a very fine goal, and such safety indicators can - if correctly designed and used - very likely help site management and others in the company make the right process safety decisions.
However, Mr. Moure-Eraso also wants these safety indicators - designed for use in a specific plant - to be reported to the public and regulators. But what powers and knowledge do regulators and the public have to get involved in the management of a particular process plant? In my view: not much! So what purpose does publishing the indicators outside the plant serve? I am pretty sure that if the purpose is to compare different plants in the same area, then the indicators will lose their value to site management.
It has to be remembered that the management of BP's Texas City Refinery carried out all the HAZOPs and other studies which regulatory compliance required. What they failed to do was act on the recommendations contained in the reports from the studies. For example, site management failed to replace the atmospheric blowdown drum with a tie-in to the site flare system when the opportunity was there, at a cost of less than 100 meters of piping, before the event in March 2005. If site management doesn't have the knowledge to act at such times, then I don't think any safety indicators will help, no matter how good they are.
What in my opinion really is needed is work on the safety culture at each and every process plant, from the plant floor workers through site management to corporate management. Only when the safety culture is developed will the number of process safety events decline. This is an issue which the CSB never or rarely seems to address in its investigations and studies. It is time for a change!

Monday, July 09, 2012

Process Safety is more than tracking a number

About a week ago the CSB announced that it will hold hearings in Houston later this month to release the preliminary findings in connection with the Macondo Blowout and Explosion in the Gulf of Mexico in 2010. At the hearings experts will discuss the importance of effective process safety indicators. The first day will focus on refining and petrochemical indicators defined by ANSI/API RP 754 Process Safety Performance Indicators for the Refining and Petrochemical Industries. RP 754 is a result of CSB's investigation into the BP Texas City Explosion and Fire in 2005. On the second day of the hearings the focus will be on safety indicators related to the well blowout and explosion.
Unfortunately both the 2005 explosion at Texas City and the 2010 blowout in the Gulf of Mexico involved the same company: BP. As pointed out previously in this blog, ExxonMobil faced a very similar situation at a location not far from Macondo a few years earlier. ExxonMobil was fortunate enough to have the management systems and decision making systems in place to make the decision to abandon the well. There was a significant economic loss, but it was far from the loss BP is experiencing after the blowout or the loss Exxon experienced with the Exxon Valdez. During times when decisions involving hundreds of millions of dollars are to be made, there is no time for calculation of safety performance indicators, review of safety performance indicators, and board meetings to make the right decision for the company. The culture must be in place to involve people at all levels of the company in making the decision.
Everyone has the technology for the necessary communication among the decision makers. The question to ask is whether the necessary culture to involve the decision makers is also there. At ExxonMobil it was in place some years earlier. BP's CEO was informed after the blowout.
I believe that the CSB focuses on the wrong issues. Process Safety Performance Indicators, whether they are labelled lagging or leading, are all based on what happened in the past. No indicators will create a safety culture. BP performed all the necessary HAZOPs at Texas City, but the company did not act on the recommendations in the HAZOPs. I wish that the CSB in the future would focus on two things:
  1. Safety Culture - how to create it, how to sustain it, and how to measure it.
  2. Training at all levels of the organisation from owner to operator.
A common feature of many CSB investigations over the past decade has been that there was either a lack of training or a lack of culture. The indicators were there if anyone cared to calculate them and use them. Let us get real process safety improvements by focusing on "the art of working properly", as they said in PetroCanada.

Wednesday, May 30, 2012

OODA or Business Intelligence in Process Safety

OODA is an acronym which I learned today at the IDC Business Intelligence Conference (sorry about the Danish) at the Scandic Sydhavnen Hotel in Copenhagen. Attending this one day event was a real learning experience. More about OODA later.
Business Intelligence (BI) is the art of extracting valuable information from normally large amounts of business data and taking action to improve the business. The basis is usually a data warehouse, on top of which are placed user-friendly analysis and reporting tools. Some companies, like the Danish retail chain Imerco, perform the analysis in almost real time. Others, like the financial institution Nykredit, update the electronic drill down reports overnight. But why should this be of any interest to process safety?
Process plants are probably some of the world's largest producers of data, most of which is never turned into information. Data is collected for the operator to monitor the plant, but only in case of a process safety event is the data analyzed. I think the process industry misses out on a major opportunity for continuous improvement of its plants.
How many process plants ask questions such as: Was our process running better or worse today than yesterday? During which shifts is the process most stable? Or most optimal? What influence does the engineer's once-daily adjustment of the secondary air flow to a cracking furnace have? These are just some of the questions I would like information about if I were a plant manager or a shift superintendent or a board operator. Is BI already used in the running of process plants? I would very much like to read about it.
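As a rough illustration of how the shift question could be answered from logged data, here is a minimal Python sketch. The shift boundaries, the function names and the temperature readings are all made up for the example; a real plant would pull the readings from its historian and use its own shift schedule. Stability is simply taken to be the standard deviation of the variable within each shift, which is only one of several possible measures.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def shift_of(ts: datetime) -> str:
    """Map a timestamp to a shift name (hypothetical 8-hour shifts)."""
    if 6 <= ts.hour < 14:
        return "day"
    if 14 <= ts.hour < 22:
        return "evening"
    return "night"


def stability_by_shift(readings):
    """Group (timestamp, value) pairs by shift.

    Returns {shift: (mean, population standard deviation)}.
    """
    groups = defaultdict(list)
    for ts, value in readings:
        groups[shift_of(ts)].append(value)
    return {shift: (mean(vals), pstdev(vals)) for shift, vals in groups.items()}


def most_stable_shift(readings) -> str:
    """Return the shift with the smallest spread in the logged variable."""
    stats = stability_by_shift(readings)
    return min(stats, key=lambda shift: stats[shift][1])


# Made-up furnace outlet temperatures (deg C), for illustration only.
readings = [
    (datetime(2012, 5, 29, 7), 850.0),
    (datetime(2012, 5, 29, 10), 851.0),
    (datetime(2012, 5, 29, 15), 845.0),
    (datetime(2012, 5, 29, 20), 858.0),
    (datetime(2012, 5, 29, 23), 849.0),
    (datetime(2012, 5, 30, 3), 853.0),
]
print(most_stable_shift(readings))
```

The same grouping could be done per day instead of per shift to compare today with yesterday.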
OODA stands for Observe, Orient, Decide, Act, and it is the cycle of steps which makes continuous improvement possible, not just at the management level but also in the operator's running of the plant or in the instrument technicians' calibrations. The ideas of OODA were first used in the US Air Force in the 70's. It is in my opinion about time we used it to improve the safety of our plants.
By just monitoring process variables it is very difficult to see if you are getting closer to a process safety event or not. However, by analyzing the data we can uncover relations not visible to the naked eye, and stop safety issues before they become safety problems or safety events. What do you think?
So let us start using BI tools on the data collected in our process plants and give our employees, from the manager to the operator, daily reports which allow drill down learning. Some will likely argue that our data are not good enough. I would tend to disagree. There are outliers in all datasets, whether from a chemical plant or a financial institution. You need to deal with that in order to perform a meaningful analysis.
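One common way of dealing with outliers before an analysis is to flag them with a robust statistic rather than the ordinary mean and standard deviation, which the outliers themselves distort. The Python sketch below uses the modified z-score based on the median and the median absolute deviation (MAD), with the frequently quoted cut-off of 3.5. The function name and the flow readings are invented for the example, and the cut-off is a rule of thumb and not something derived from plant data.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Mark values whose modified z-score exceeds the threshold.

    The score is 0.6745 * (x - median) / MAD, where MAD is the median
    absolute deviation. Returns a list of booleans, one per value.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical; nothing can be scored.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Made-up flow readings with one obvious transmitter spike.
flows = [10.1, 10.0, 9.9, 10.2, 10.0, 55.0]
print(flag_outliers(flows))
```

Whether a flagged value is then dropped, replaced or investigated is a decision for someone who knows the process; a spike may be a bad transmitter or the first sign of a real problem.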

Friday, May 25, 2012

Exposure at biochemical plants

Yesterday I attended a half day seminar at the Danish National Research Center for the Working Environment (NRCWE) on microbial air pollution. My reason for attending was that I am currently involved in risk assessment of a facility for production of protein from waste methane streams. Naturally such a place has the potential to expose workers to biological agents of different kinds, both during normal production, e.g. dust from the drying operation, and in abnormal situations, during which the microorganisms decide to produce undesired byproducts.
I was happy to discover during the first presentation by Susanne Høyer from the Danish Working Environment Authority that the regulation of biological agents is not that different from the regulation of chemical agents. It is a question of workplace assessment and evaluation of the risk of exposure. And biological agents are assigned to classes based on the consequences of exposure. That sounded quite familiar.
However, the third presentation of the afternoon was an eyeopener. Anne Mette Madsen talked about organic dust toxic syndrome (ODTS) among workers involved in the cleaning of grass seeds (Denmark is a major producer of grass seeds used on many football stadiums around the world, e.g. those used during WC in South Africa - but not on the home stadium of FC NorthZealand - Danish Soccer Champions 2012). Pictures of both problem samples and normal samples before and after cleaning were shown. The problem samples looked a bit more dirty before cleaning, but otherwise there was no difference. The biological analysis showed no difference between problem samples and normal samples. So unfortunately a clear cause for the ODTS could not be identified.
The story about the grass seed workers shows the complexity of exposure to biological agents. As a consumer you would probably start buying seeds from another supplier to avoid an exposure problem, but as an employee you are in a more difficult position. Changing jobs is not that easy these days. And employers of operators for biochemical plants probably should conduct regular health checks of their operators, from before they start to some time after they stop working.

Friday, May 18, 2012

Will the state be any safer?

Yesterday the CSB commended the State of Massachusetts for new tough hazardous storage and processing rules. The background for the changed rules was a recommendation in a CSB report on an explosion at an ink and paint facility in Danvers. There were no injuries at the site, but a number of local residents had to be treated at a hospital. The CSB writes in their commendation:
The CSB investigation found that CAI had increased its quantities of flammable liquids over the years. The additional quantities went undetected by the local authorities who had not inspected the facility for over four years prior to the time of the incident.
Clearly the owners of the site did not understand the dangers of storing large quantities of flammable liquids at their site, or the procedures and systems that need to be in place to ensure that this takes place safely. Nonetheless the CSB report on the explosion contains 11 recommendations directed at (number of recommendations in brackets) the General Court of the State of Massachusetts (2), the Office of Public Safety, Department of Fire Services of the State of Massachusetts (4), the Town of Danvers (1), NFPA (2), the International Code Council (1), and finally the company (1). The latter essentially states that CAI should comply with relevant federal laws. As of today 20% of the recommendations have been closed.
So what changes has the State of Massachusetts made to its local laws? The CSB describes the changes as follows:
At the time of the accident mandatory notification by companies to local authorities that a facility had increased in quantities of flammable materials from the initial amount listed in the permit was not well enforced. Therefore, the Board recommended that Massachusetts require companies storing and handling flammable materials to amend their licence and re-register with state or local authorities when increasing their quantities of flammable materials; they must also verify compliance with local, state fire codes and hazardous chemical regulations.
Essentially this is prescriptive medicine - and hence more enforcement. Where will this get us if an explosion caused by lack of understanding of proper process safety standards can unleash such a change in state laws? And 10 recommendations to different organisations outside the company?
I would suggest that it is time to switch from this very prescriptive approach to regulation of hazardous facilities to a performance oriented approach, requiring the companies to demonstrate that they operate safely. This is also what DNV recommended after the Deepwater Horizon disaster.

Tuesday, May 15, 2012

Why safety culture is important!

I am continuously looking for good stories as to why safety culture and top management backing is so important for having a good process safety record. Here is one from a discussion on LinkedIn:

Is using the Audit stick to beat the organization into submission the right tool? Let's see.... You can get a horse to walk forward by twisting its tail. You may get kicked a few times and pooped on, but the horse will usually walk forward. When you stop twisting the tail, it will stop walking forward. You can also get a horse to walk forward by leading it with the reins. This usually does not carry the same negative effects of being on the back end of a horse. If you lead the horse by using the reins and reward it for following, the horse will eventually follow without you holding the reins. Now apply the horse story to using the Auditing tail twisting and real Management support of leading by being in front. The value of Auditing as a way of improving performance is short term, unsupported and will eventually tail! Leading and rewarding will yield continued success. In practice, both techniques are being used. Guess which ones succeed long term and have more employees "buy-in".
I totally agree with this story, but as usual with stories you need to consider the cultural environment in which you use them. If your audience has limited knowledge of horses, then you may need to modify the story. I can highly recommend joining LinkedIn groups and enjoying stories such as this one, and other sharing of information. However, based on yesterday's post: You need to be selective about which groups you join, and be ready to leave if the benefits of your reading decline.

Sunday, May 13, 2012

Are LinkedIn discussion groups keeping on track?

I am active in quite a number of groups on LinkedIn related to safety. I very much enjoy the discussions, which I often find provide me with perspectives that I would not have thought about myself. Therefore I get concerned when a group owner or moderator needs to remind us that the rules of copyright also apply to those discussions. Since I enjoy the challenge of searching for information on the WWW, I probably have on one or two occasions provided a link which I should not have provided due to copyright issues. However, I really get sad when I see someone post things to a discussion group without seeking the views of others or information. The following are my comments on such a case.
The following is a series of postings on a LinkedIn group. This particular discussion was started under a heading relating to HAZOP, and in a later posting the author stated that the purpose was to share experience about HAZOP. So let us analyse what experience has been shared up to the point where I decided to unfollow the discussion.
The initial posting was:
HAZOP study is part and parcel of PHA intended to examine and evaluate overall process design through continuous brainstorming sessions and to find any operational deviations and process interactions which may lead to an hazardous environment which leads to hazardous environments and operability problems including
1. Leading to safety, occupational health and environmental ailing to personnel
2.damage to asset/equipment/environmenta.
3.environmental emmissions
4.operability and manitainability problems etc.
The first sentence gives an incorrect view of what HAZOP studies are. The purpose of a HAZOP study is to identify hazards and operability events in a proposed design caused by deviations from normal operation, and to evaluate the possible consequences of such events. Of the four bullet points listed in the first posting only the last one has some relevance to a HAZOP study. What experience is shared here?
The second posting was:
Plant non-availability / limitation and lack of product quality / production loss;
 Environmental emissions;
 Demolition / Decommissioning / Abandonment reviews;
 Construction and commissioning hazards.
I wonder what the relationship is between the first line and the three bullet points? And again: What experience is shared here?
The third posting was:

Even though HAZOP essentially performed when the design is essentially complete and P&ID's were developed and contains all essential information. Even though HAZOP can be applied on different stages of a project as below. 
a. conceptual design phase and capital porject phase where the design content and major system components are decided still the design and documentation to conduct HAZOP not ready. Eventhough it is necessary to identify all the major hazards at this stage itself to facilitate their consideration in the design process to facilitate future HAZOP studies. 
It is very critical phase where a detailed design is developed, methods of operations and respective documentation is intact.the ideal period for HAZOP is just before the design is frozen as it elaborates and helps to prepare a meaningful questioning crieteria upon which meaningful answers can be obtained. 
c. Detail Design Engineering Phase(DEED): Here HAZOP is conducted in between finalyzed design and before issuing engineering drawings for P&ID's, C&E etc. with approval for construction. 
    d.Installation and commissioning Phase: 
    Wherever the operation sequences are critical and commissioning and operation can be hazardous or there is a substantial change in the design at the fog end of the design stage it is advisable to have HAZOP review before system start up. 
e.Operations PHASE: HAZOP Study CAN be considered during Operational Phase before implementing any changes to the existing system (for example, through MOC Procedure) that could affect the safety / operability or impact the environment. 
NOTE: HAZOP Study conducted after finalization of FEED will improve the process safety to a great extent and will help in reducing high risk recommendations at a later stage.
The first paragraph in this posting makes it sound as if HAZOP is another of those things that need to be ticked off during a process design. Much like environmental impact evaluation seems to be in some countries.
At least in the major projects which I have been involved with, design drawings were released in a more or less continuous stream to sub-contractors. Hence the HAZOP studies, and there were indeed several, were scheduled to coincide with the move from one project phase to the next, e.g. from design to start of construction or from construction to commissioning/startup. Usually three different HAZOP studies were performed.
At any stage in the design process there is sufficient information to conduct the form of HAZOP outlined in the early papers published by the people at ICI, such as H.G. Lawley's "Operability Studies And Hazard Analysis" (Chem. Eng. Progr. 70(4), pp. 45-56, 1974) or C.D. Swann and M.L. Preston's "Twenty-five years of HAZOPs" (J. Loss Prev. Process Ind. 8(6), pp. 349-353, 1995).
The fourth posting was:
The process of HAZOP analysis is based on a “guide word examination”, which is a deliberate search for deviations from the design intent of study nodes / process section.
 The review shall follow a structured step by step format in detail.
a) The complete process needs to be studied is divided into various study nodes.
b) For each node, various parameters, guidewords and deviations are considered.
c) For each deviation, causes are identified (if any cause is not credible, it is ignored).
d) For each credible cause, consequences are identified assuming no safeguards are present.
e) For each consequence, existing protections are identified.
f) Assess severity & likelihood and identify Risk Rank for the consequence.
g) After considering existing protection, if the risk level is considered unacceptable, recommendations for mitigation of the risks are made
This posting makes me wonder if the poster has ever participated in a HAZOP study, or acted as a HAZOP study leader. It reads as if it was based on the single lecture on HAZOP included in many academic process design courses, given by a lecturer without any practical experience with HAZOP either as a participant or as a leader. How is the process divided into nodes? How are guidewords and parameters selected? Also, most writers agree that the HAZOP study process has 3 phases: 1) A pre-meeting or preparation phase, 2) A HAZOP team meeting phase, and 3) A post-meeting or follow up phase.
Excellent examples of relevant combinations of guidewords and parameters have been published by e.g. American Home Products. And others may be found by googling "hazop guide word tables". However, before these can be used one needs to adapt them to the type of process under consideration. For biochemical plants one has to add for example contamination.
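To make the idea of adapting a guideword table concrete, here is a small Python sketch that builds a list of candidate deviations from guidewords and parameters. The lists are generic illustrations and are not taken from American Home Products or any other published table; the set of combinations screened out as not meaningful, and the choice to treat contamination as a deviation of its own rather than as a parameter, are likewise assumptions made for the example.

```python
from itertools import product

# Generic illustrative lists, not a published guideword table.
GUIDEWORDS = ["no", "more", "less", "reverse", "other than"]
PARAMETERS = ["flow", "temperature", "pressure", "level"]

# Combinations a team would normally screen out as not meaningful
# (an assumption for this example).
NOT_MEANINGFUL = {
    ("reverse", "temperature"),
    ("reverse", "pressure"),
    ("reverse", "level"),
}


def deviations(parameters, guidewords, excluded=frozenset(), extra=()):
    """Combine each guideword with each parameter, skip excluded pairs,
    and append process-specific deviations given in `extra`."""
    combined = [
        f"{g} {p}"
        for p, g in product(parameters, guidewords)
        if (g, p) not in excluded
    ]
    return combined + list(extra)


generic = deviations(PARAMETERS, GUIDEWORDS, NOT_MEANINGFUL)
# Adapted list for a biochemical plant: contamination is added.
biochemical = deviations(
    PARAMETERS, GUIDEWORDS, NOT_MEANINGFUL, extra=["contamination"]
)
print(len(generic), len(biochemical))
```

The point of the sketch is only that the table is a starting list which the study leader edits for the process at hand; deciding which combinations are meaningful for a given node remains the work of the HAZOP team.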
The reason for this comment is, that I am sad to see the quality of discussions in the LinkedIn groups being reduced to just sharing of poor textbook content.

Friday, April 27, 2012

Are we preaching to the converted?

Today CSB came out with a statement in which chairman Rafael Moure-Eraso promised the following:
recommit CSB to this important mission: preventing accidents by investigating them thoroughly and making the results public along with critical safety recommendations aimed at saving lives and protecting the public and the environment
Unfortunately experience shows that investigation reports and recommendations do little to change the industry or even a single company. Just look at the March 2005 BP Texas City Refinery Explosion and Fire. It was investigated by OSHA, CSB, and a special panel. Only CEOs change the actions of companies!

Less than 2 years after I left Exxon Chemical Canada the Exxon Valdez ran aground in Prince William Sound in Alaska, and created the largest and longest lasting oil spill in arctic waters to date. The CEO and board took action. Since alcohol was involved in the event, all employees at North American sites were subjected to random tests for alcohol in the blood when they showed up at work. These tests were performed by an external company. Then in the early nineties the Exxon board introduced OIMS, and had Lloyd's of London regularly certify the quality of this system.

A few years before I returned to an academic career in Denmark the CEO of Dow Chemical had challenged his company to reduce a number of safety performance parameters by 90% before 1995. Much later some of my friends at Dow told me that at the time of the commitment they had no idea how this should be done. Just that it could be done. Much like JFK's statement about putting a man on the moon before the end of the decade. Dow followed up their initial goals (not all were reached) with new and more challenging ones in both 1995 and 2005. Unlike ExxonMobil they state their goals publicly and provide quarterly progress reports.

So how can we get more CEOs and company boards to act like those of ExxonMobil and Dow? It does appear that boards and CEOs of major companies are aware of the issues and do take action. The question is how to make the CEOs of start-ups and small shops aware of the need for thorough process safety analysis, and then acquire the necessary expertise to execute a HAZOP and take action on it.

Monday, April 23, 2012

Fixing the real problem!

This morning I read an article (in Danish) about researchers at DTU Informatics finding low security in the Zigbee protocol, which is being developed for wireless communication with devices in your home, e.g. turning on your stove or your coffee maker remotely. Apparently the protocol could allow hackers to turn on your stove without anything being on it. That is of course a potential fire hazard.
However, fixing the security of the communications protocol does not fix the root cause of the problem. For the stove the root cause is that an element can be turned on without anything being on it. Most of us have at one time or another been standing in front of our stove, and turned on another element than the one we intended to. This can happen because the stove has no pot-on-stove sensor built into each of the elements. Similarly, many coffee makers have no protection against being turned on without water in the reservoir.
So even though there may be a security problem with the Zigbee protocol, the root causes of the potential hazards which this problem allows hackers to exploit are safety problems with the particular devices. These devices now have to be re-designed with remote operation in mind.
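A re-designed element could, for example, refuse to heat unless a pot is detected, no matter where the command comes from. The Python sketch below shows that interlock idea only; the class, its attribute names and the pot sensor itself are hypothetical and do not describe any real stove or any part of the Zigbee protocol.

```python
from dataclasses import dataclass


@dataclass
class StoveElement:
    """One stove element with a hypothetical pot-detection sensor."""

    pot_detected: bool = False  # reading from the assumed pot sensor
    heating: bool = False

    def request_on(self, source: str) -> bool:
        """Handle an 'on' command from the local knob or a remote device.

        The interlock is the same for every source: no pot, no heat.
        Returns True if the element was turned on.
        """
        if not self.pot_detected:
            return False
        self.heating = True
        return True

    def pot_removed(self) -> None:
        """Sensor callback: switch the element off when the pot is lifted."""
        self.pot_detected = False
        self.heating = False


element = StoveElement()
print(element.request_on("remote"))  # no pot on the element
element.pot_detected = True
print(element.request_on("remote"))  # pot present
```

Because the check sits in the device and not in the communication layer, a compromised remote command can do no more than a mistaken turn of the knob.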
The issues are no different when you introduce remote wireless monitoring and control in your chemical plant or refinery!

Thursday, April 05, 2012

Will the rest of the world follow?

In Houston, Texas at the 8th Global Congress on Process Safety it was announced that the AIChE has recommended, and ABET has approved, that proficiency in all chemical process hazards be required in a broad range of engineering disciplines. For those who don't know: ABET is an organisation which reviews and approves the curricula for engineering education at US universities.
This means that most if not all US universities will make the necessary changes to their engineering curriculum to comply with the new requirements before their curriculum is up for the next ABET review. So within the next few years we can expect that chemical engineers and other engineers educated at a US university will have basic process safety proficiency.
I think this is equivalent to the change that happened with chemical engineering education when it went from pure description of processes to become an engineering science with fundamental courses in thermodynamics, transport phenomena, process control, modelling etc. It is a big fundamental change. And the US is lucky that its engineering curricula are reviewed and approved by an organisation independent of the universities providing the education. Europeans and others around the world are not that fortunate!
In Europe and probably many other parts of the world the university alone decides what goes into an engineering curriculum. Granted, in Denmark an education has to be approved by the Department of Education, but that approval does not involve a review of the curricula for the individual courses. Therefore it is a bit difficult to see what incentive universities outside the US have to follow the good recommendations from the AIChE.
FEANI does offer the title EUR ING to graduates from European universities. However, this title has no requirements for the particular courses followed, just the duration of the university studies.
So how can Europe ensure that its engineers are as well educated in process safety as their American colleagues? We can hope that progressive universities will adapt their B.Sc. and M.Sc. engineering curricula to include a process safety requirement. We can ask organisations such as EFCE - European Federation of Chemical Engineering and EPSC - European Process Safety Center to promote the requirement of proficiency in process safety for European engineers. In the United Kingdom there is a system to make it happen. It is called the chartered engineer. Organisations like the IChemE and their sister organisations can make it a requirement for membership to have a proficiency in process safety. Will they do that? What will happen in continental Europe?

Causes of process safety events

In an InTech "Tips and Strategies for Managers" article titled "Alarm management: Stop designing for failure", Eddie Habibi and Bill Hollifield put forward two interesting postulates:
  1. Engineers design for failure.
  2. All accidents are caused by human error.
To the first one, I would simply say: Of course engineers design for failure. All engineered devices, even chemical plants and their equipment, are designed from a specification. This specification tells the engineer how many times a control valve is expected to open or close before it fails, or what temperature and pressure a vessel should be able to withstand and for how long. Similarly, a light bulb is designed with a certain number of hours of use in mind. Designing according to specifications such as these is what engineers do. Similarly, they design bridges for a certain load, and high rises for a certain wind and snow load. So if engineers should stop designing for failure, then they should stop designing - period!

Failure is inherent in engineering design. Therefore companies implement systems for monitoring their processes and systems for preventive maintenance to exchange pieces of equipment before they fail. Process alarms are just one tool to alert the operator to process safety events, which may be caused by a failing piece of equipment. The trick is to discover failures and weaknesses before they cause major process safety events, such as the explosion and fire at Texaco's refinery at Milford Haven on July 24, 1994. Good alarm systems do that!

They alert the maintenance people to equipment in need of inspection before the equipment fails. They alert the process operator to plant situations in need of attention before the process automatically shuts down. The challenge is to design these alerting systems such that only the necessary alerts are produced. Here alarm system guidelines, such as those from the EEMUA and the Norwegian Petroleum Directorate, and books such as "Alarm Management: Seven Effective Methods for Optimum Performance" by Habibi and Hollifield, can help. Actual engineering approaches to when and how to design an alarm are an active field of research.

Now to the second postulate. I agree that all accidents are caused by human error. The challenge is to not stop the accident investigation at the first human error discovered. Quite often this first human error is just the tip of the iceberg. For example, at BP's Texas City refinery during the startup of the raffinate splitter on the evening of March 22, 2005, the first human error was that the operator filled the bottom of the splitter to 100% of the bottom level indicator. The startup procedure called for filling to 50% of the bottom level indicator.

As we now know from the accident investigation report from BP and the reports from OSHA, CSB and the Baker Panel, there was more to this process safety event than this initial error. There were errors in supervision, errors in maintenance, errors in process improvement, errors in allocation of funds, errors in training of people, errors in management, etc. However, all of these errors are also human errors!

In my country we have experienced a number of train accidents, some with fatalities and some without. A common factor in the investigation of these events appears to be that the investigation stops when an error by the train driver is discovered and no errors are found in the train and signal equipment. This is rather unfortunate, since the chance of discovering errors in training, errors in train design, errors in signal design, etc. is missed.

So, yes! All accidents are indeed caused by human error. To learn from the accident we must however discover human errors on all levels in the organisation from the operator to the CEO.

Tuesday, March 20, 2012

Process Maintenance and Process Safety

The December issue of Hydrocarbon Processing features an editorial by Reliability Editor Heinz P. Bloch titled "Spare parts availability and the need for non-OEM options". In the editorial Bloch mentions a somewhat dated article with facts about failures in chemical plants:
  • 25% of all failures are preventable but not prevented
  • 15% of all failures are predictable but not predicted
  • 20% of all failures are predicted but not acted upon to undertake repair
  • 25% of all failures are predicted and machines stopped to do repairs
  • 15% of all failures are neither preventable nor predictable
Since these data predate the turn of the millennium, the question naturally is: Do they still hold? The short answer is yes! However, the definitions need to be updated, so here are the updated definitions:
  • 25% of all failures are preventable but not prevented because of an arbitrary decision that is simply not rooted in knowledge or experience
  • 15% of all failures are predictable but not predicted
  • 20% of all failures are predicted but equipment is not stopped to undertake repair, e.g. because an expert's request to shut down was overruled by someone in authority
  • 25% of all failures are predicted and equipment is shut down for repair, but only with restorative maintenance efforts instead of proactive upgrades.
  • 1% of all failures are neither preventable nor predictable. The share is this small because human beings make the decision to build in earthquake zones and choose what building code to apply. Levees may or may not be built, and may or may not be maintained. The 1% covers situations where an event in a neighboring unit spreads to another unit and results in a failure (the originating event is most likely in one of the other categories!).
Now try to substitute either "process safety events" or "occupational safety events" for "failures" in the above descriptions. What would be the percentages in each category for those events? Do you think the percentages have changed over the last 20 years? Recall that during that period process safety managers have learned abbreviations like RMP, MOC, SIL, LOPA and many more.

Is more process safety always good?

Jalila Essaïdi discusses the idea of making human skin tougher, so it can e.g. withstand the impact of a bullet. Her work is part of the 2.6g 329m/s project, which concerns performance standards for bulletproof vests, in collaboration with Forensic Genomics Consortium Netherlands.
On the surface it sounds good if soldiers could be better protected from the impact of bullets by toughening the skin, just like today they are wearing silk underwear as part of their battle dress to protect against splinters from roadside bombs. However, the discussion on Jalila's blog site already shows that the world is not that simple. One commenter remarks, e.g., that the tougher skin also makes it more difficult for a doctor to get inside you to make life-saving fixes.
I am just wondering if we can also get too much process safety at our chemical plants, and if safety sometimes gets in the way of other things on the agenda, e.g. sustainability, although on the surface process safety and sustainability should go hand in hand. But overall this discussion shows the importance of the MOC process now implemented in most process plants.

Thursday, March 15, 2012

Mainframes doing process control - again!

Last week an old friend and co-worker from my days in the Canadian petrochemical industry visited us here in Denmark for a couple of days. One of the days he and I drove to Kalundborg while the ladies were chatting and looking at Frederiksborg Castle.
Naturally, given our common background in the oil industry, the talk turned to process control computers. My friend mentioned a new installation involving more than 50 servers, some probably virtualized, but certainly more than one box. My friend also told me that one of his recent projects was life time extension of some mainframes for process control installed in the early 70's. That must have been either IBM 360 or IBM 370 systems. They probably cost a fortune at the time. But wait a minute! Think about the per year cost! These machines have been in constant use for about 30 years, and their useful life was being extended - probably by 5 or more years.
In light of this I said "I can't understand why one is not using modern editions of the mainframe like the z90 for process control these days". The machine is powerful enough in the smallest possible edition to host hundreds of virtualized machines. And with the extender or sidecar there is even the possibility of having blades running Windows software if a particular application requires that. My friend agreed that this sounded like a good idea. I added that it would have the added benefit that the hardware would have the muscle to run very complex real time simulations, which could be updated with real time process measurements. The possibilities appear limitless.
And I think the saving in cabling would be considerable. Of course there is a small problem. You will need engineers with slightly different skills than those maintaining Intel based hardware. But you have the same education problem if your SCADA is changed from Siemens to Honeywell to ABB.
What do you think?

Tuesday, February 14, 2012

Is process safety all about execution?

Last evening I attended a presentation at the IDA - the Danish Society of Engineers - about the Deepwater Horizon. The presentation was by Graham Bennett from DNV. Unfortunately the full report on DNV's investigation is no longer publicly available. Yesterday's presentation ended with a description of the ongoing effort by authorities both in the US and the EU to create new regulations which aim to prevent another similar event. That has been the mode of operation of authorities since Bhopal, since Piper Alpha, since Seveso. But does it work?
One could argue that the regulations have worked, since the world has not seen another Bhopal - although during the flooding in central Europe in the summer of 2002 we came very close to such an event during a major toxic gas release. In his presentation Graham Bennett also pointed to similarities in the lack of effective emergency management and communication on Piper Alpha and Deepwater Horizon. Both were supposedly designed to survive the type of events they experienced. They did not, due to a lack of efficient decision making during the initial phases of the emergency.
In his presentation Graham Bennett also mentioned that in 2006 ExxonMobil was drilling a deepwater well not far from the Macondo formation. That was Blackbeard, which was abandoned at a depth of more than 32000 feet because Exxon drillers felt it was not safe to continue after the rig experienced pressure shocks. At the time the decision to abandon the well escalated to the level of the CEO within hours. The CEO had the guts to make the decision to abandon the well with a loss of almost 200 M$ and not risk employee lives and company image.
I think this is the main difference between companies like ExxonMobil and Dow Chemicals and other major players in the industry: A lack of effective means of escalating a decision to the top level of the company. I base this on my initial chats about OIMS in the 90's. I learned about OIMS during an afternoon patio conversation with a friend from the days at University of Alberta. He explained the basic ideas to me, and through former colleagues at Imperial Oil I got in contact with people who were able to explain OIMS to me both in a research laboratory environment and in the settings of a refinery and chemical plant. At one of these meetings, after talking about OIMS for several hours, I was told that the OIMS manual had been declared company proprietary, so I could not have a copy even for teaching purposes, but that BP had a very similar system, and their manual was freely available on that company's website. Therefore my conclusion: The difference is not in the operations management systems - or whatever each company calls them - as such, but in how the execution works in day to day operational decision making throughout all levels of the organisation. It's about the connection between the ground floor and the top floor!

Friday, February 03, 2012

Are you paying attention to Oracle?

Do you remember that Oracle a few years ago acquired Sun Microsystems? What has happened since, and should the process control community pay attention? These questions popped into my head when I attended "The Extreme Performance Tour" hosted by Oracle at the Tycho Brahe Planetarium here in Copenhagen a few days ago.
A very short time after the takeover was legally completed, Oracle announced the Exadata - the first end-to-end engineered system to run the Oracle database. However, in my view Oracle was just playing catch up. For years you have been able to buy IBM mainframes custom engineered to run the DB2 database. So the news is that now there is competition in this very special market of large databases with easy access to the data from anywhere. Since then Oracle has also engineered the Exalogic, probably to compete with Websphere. So also here competition has increased.
The process control community is increasing its usage of simulators and modelling. Often these are systems custom made for a particular plant. At least one large user of process control systems, such as those from ABB or Honeywell, decided several years ago that they are better off adding hard disk capacity for history data than spending any money on consolidation of these data. However, usually the history data still reside on the process control network. This location of the process history data to some extent limits the access to the data and the use of the data.
If the process history data resided on a large corporate computer, such as an IBM mainframe or an Oracle Exadata, then controlled access to the data, both in-house and for collaborators in engineering companies and universities, would be much easier. Even though process control computers today are standard off-the-shelf hardware and run standard off-the-shelf software, many users are for good reason limiting external access to the process control network.
However, how could such large process history databases be used? They could be used, for example, to compare refinery performance over the last two turnaround cycles. Retail companies have for many years used so-called business intelligence software to compare sales during the last two Easter periods. Such analysis of process data could reveal periods of improved or degraded performance. Another possible usage of high frequency process history data is development of process models that are fitted to the actual history data from the plant. So I think the process control community should pay attention to Oracle, or their competitors!
Some large companies may already have the necessary data processing capacity in house to explore the information hidden in the process history data. So what is stopping you?
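To make the turnaround comparison a little more concrete, here is a minimal sketch in Python of how such an analysis might look. Everything in it is an assumption for illustration: the throughput readings, the cycle dates and the helper name mean_by_cycle are hypothetical and do not come from any real plant or product.

```python
from datetime import date
from statistics import mean

# Hypothetical throughput readings (date, value in % of design capacity).
# These numbers are invented for illustration only.
history = [
    (date(2005, 6, 1), 96.0),
    (date(2006, 6, 1), 98.0),
    (date(2007, 6, 1), 100.0),
    (date(2009, 6, 1), 101.0),
    (date(2010, 6, 1), 103.0),
]

# Hypothetical turnaround cycles: name -> (first day, last day).
cycles = {
    "previous cycle": (date(2004, 1, 1), date(2007, 12, 31)),
    "current cycle": (date(2008, 1, 1), date(2011, 12, 31)),
}


def mean_by_cycle(history, cycles):
    """Average the readings that fall inside each cycle's date range."""
    result = {}
    for name, (start, end) in cycles.items():
        values = [value for day, value in history if start <= day <= end]
        # A cycle without readings gets None rather than a misleading zero.
        result[name] = mean(values) if values else None
    return result


averages = mean_by_cycle(history, cycles)
change = averages["current cycle"] - averages["previous cycle"]
print(averages, change)
```

With real history data the same grouping could be done per unit or per shift, which is the kind of comparison retail companies make between two Easter periods.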