This page demonstrates our approach to component-based hazard analysis using an example. For the concept and references, please use this link.
To outline our approach, we use an example from the RailCab research
project at the University of Paderborn. The project was founded in 1998
to develop a new railway system that combines the advantages of public
and individual transport in terms of cost and fuel efficiency as well
as flexibility and comfort. The system is characterized by autonomous
vehicles that travel on demand instead of trains that run on a fixed
schedule. For real-life validation of the complex system, a test track
at a scale of 1:2.5 was built at the University of Paderborn in 2002.
One particular problem is to reduce the energy consumption due to air
resistance by coordinating the autonomously operating RailCabs so that
they form convoys whenever possible. Such convoys are formed on demand
and require a small distance between the RailCabs. In convoy mode, the
leader sends position reference values to the following RailCab, which
the follower uses to control its speed. Additionally, the system
employs a doubly-fed linear drive that uses magnetic fields in the
RailCab and on the tracks for acceleration. To accelerate and brake,
the RailCab sends set values for the speed, direction, and amplitude of
the magnetic field wave to the controller on the track side.
Figure 1 shows the communication structure between the RailCabs and the
inverter modules on the tracks, which contain the magnetic field wave
controller. In this demonstration, we use a simplified version of the
system architecture of the 1:2.5 test track.
We start the demonstration by showing the component type definition
for the speed control component of a RailCab (upper part of Figure 2).
The speed control component has several ports for input data, for
example port inPos for position reference values from the leader
RailCab and port inSensor for speed sensor values. It also has a port
ex for the connection to the execution hardware and an output port out
for sending set values to the magnetic field controller.
In the next step, we present the deployment diagram shown in Figure 3.
The upper part contains the simplified system architecture for a single
RailCab. The lower right part shows the controller for the magnetic
field wave on the track.
After showing the deployment diagram, we specify the hazard conditions. It is a hazardous situation if a RailCab drives in a convoy at the wrong speed. Therefore, a value failure on the out port of the speed control component instance is specified as a hazard condition. As a second hazard, we specify that the magnetic field wave controller computes wrong set values for the actuator, because the resulting incorrect acceleration of the RailCab may lead to an accident. This hazard condition is modeled as a value failure at the in port of the actuator component instance.
After modeling the system architecture, its failure propagation, and the hazard conditions, we employ the top-down analysis to determine which component errors result in the two hazards described above. Figure 4(a) shows the wizard page with the relevant results of the top-down analysis.
The next step is to compute the probability of the hazards. Figure 5 shows the wizard page for specifying the error probabilities. The next wizard page (see Figure 6) shows the computed hazard probability. The hazard concerning the wrong value on the outgoing port of the speed control component has a probability of 0.0100999899.
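
As a rough illustration of how such a hazard probability can be obtained, the following sketch combines several error probabilities with an OR gate, assuming that the errors are statistically independent. The three input values are assumptions chosen for illustration; the values actually used in the demonstration are the ones entered on the wizard page in Figure 5, and the tool's computation may differ from this sketch.

```python
from math import prod


def or_probability(error_probabilities):
    """Probability that at least one of several independent errors occurs."""
    return 1.0 - prod(1.0 - p for p in error_probabilities)


# Hypothetical probabilities of three errors that each lead to the hazard.
# These are NOT taken from the demonstration; they are illustrative inputs.
assumed_error_probabilities = [0.01, 0.0001, 0.000001]

hazard_probability = or_probability(assumed_error_probabilities)
print(f"{hazard_probability:.10f}")
```

With these assumed inputs, the result rounds to 0.0100999899, which matches the value reported above. The hazard probability is dominated by the largest single error probability, so that error is the natural first candidate for improvement.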
The top-down analysis provides us with the relevant errors that lead
to the hazards. We use the bottom-up analysis to determine the
propagation paths of two of those errors through the system
architecture and to search for possible improvement points.
We inject an error into the distance sensor and simulate the failure
propagation. In the simulation, the injected value error is propagated
to the distance control component and finally to the speed control
component by the specified and automatically derived failure
propagations. There, the failure is not propagated any further and no
hazard occurs. We additionally inject a value error into the data eagle
component. A simulation of the failure propagation based on both
injected errors then leads to a value failure on the out port of the
speed control component, which fulfills the hazard condition. This
simulated failure propagation is shown in Figure 7.
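
The bottom-up simulation described above can be pictured as a forward traversal of a failure propagation graph. The following is a minimal sketch under stated assumptions, not the tool's implementation: the graph, the location names, and the port name inDist are hypothetical, and the rule that the speed control component masks failures arriving from the distance control component is simply encoded as an empty successor list.

```python
from collections import deque

# Hypothetical failure propagation graph: failure location -> locations it causes.
# An empty list means the failure is not propagated any further.
PROPAGATION = {
    "distance_sensor.error": ["distance_control.in"],
    "distance_control.in": ["speed_control.inDist"],
    "speed_control.inDist": [],  # assumed to be masked in the speed control
    "data_eagle.error": ["speed_control.inPos"],
    "speed_control.inPos": ["speed_control.out"],
}

HAZARD = "speed_control.out"  # value failure on the out port


def simulate(propagation, injected_errors):
    """Return every failure location reachable from the injected errors."""
    reached = set(injected_errors)
    queue = deque(injected_errors)
    while queue:
        current = queue.popleft()
        for successor in propagation.get(current, []):
            if successor not in reached:
                reached.add(successor)
                queue.append(successor)
    return reached


sensor_only = simulate(PROPAGATION, ["distance_sensor.error"])
both_errors = simulate(PROPAGATION, ["distance_sensor.error", "data_eagle.error"])

print(HAZARD in sensor_only)  # hazard condition not fulfilled
print(HAZARD in both_errors)  # hazard condition fulfilled
```

In this sketch, the first run stops at the speed control component, while the second run reaches the out port and therefore fulfills the hazard condition.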
The value error in the data eagle communication hardware ultimately
leads to the hazard condition. The data eagle device already employs
checksums to detect wrong values, but values can still be changed in
such a way that the checksum remains valid.
To further reduce the probability that faulty values are received and
remain undetected, we employ an acceptance test that checks whether
received values are reasonable. We therefore apply the "Insert
Acceptance Test" transformation to the deployment model. Figure 8 shows
a part of the resulting deployment model.
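
To make the idea of an acceptance test concrete, the following sketch checks whether a received position reference value is reasonable by testing it against a value range and against the previously accepted value. The class name, the limits, and the choice of these two checks are assumptions for illustration; the demonstration does not specify which checks the inserted acceptance test performs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AcceptanceTest:
    """Plausibility check for received values (all limits are hypothetical)."""

    min_value: float = 0.0    # assumed smallest valid position
    max_value: float = 500.0  # assumed largest valid position
    max_step: float = 2.0     # assumed largest plausible change between samples
    last_accepted: Optional[float] = None

    def accept(self, value: float) -> bool:
        """Return True and remember the value if it passes both checks."""
        in_range = self.min_value <= value <= self.max_value
        plausible_step = (
            self.last_accepted is None
            or abs(value - self.last_accepted) <= self.max_step
        )
        if in_range and plausible_step:
            self.last_accepted = value
            return True
        return False


test = AcceptanceTest()
results = [test.accept(v) for v in (10.0, 11.5, 250.0, -3.0, 12.0)]
print(results)
```

A value that passes the checksum but jumps implausibly far from the last accepted value, such as 250.0 in this run, is rejected. Such a test cannot catch every wrong value, which is why a residual probability of undetected errors remains.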
The second hazard condition is a wrong value sent to the actuator by
the wave control component. One source of this hazard is a value error
in the hardware that executes the wave control component. To reduce the
likelihood of this hazard, we create redundant copies of the wave
control component, which are executed on different hardware.
This change to the model is accomplished by applying the "Triple
Modular Redundancy" transformation. The transformation creates two
clones of the wave control component. All incoming connections of the
original wave control component are connected to multipliers, which
distribute the values to each of the three wave controls. The outgoing
values are then combined by the voter. Figure 9 shows a part of the
resulting deployment diagram. Note that the transformation does not
deploy the cloned wave control components, because the automatic
deployment of software components to execution hardware is not part of
the TMR transformation. We therefore add new execution hardware and
deploy the cloned component instances as the next step in this
demonstration.
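
The runtime behavior of the resulting structure can be sketched as follows: a multiplier distributes each incoming value to three replicas, and a voter forwards the majority of their outputs. This is a minimal sketch under stated assumptions. The functions wave_control and faulty_wave_control are hypothetical stand-ins for the real controller, and exact-match majority voting is only one possible voting strategy; continuous values would typically need a tolerance.

```python
from collections import Counter


def multiply(value, copies=3):
    """Distribute one incoming value to every replica."""
    return [value] * copies


def vote(values):
    """Return the value produced by a strict majority of the replicas."""
    winner, count = Counter(values).most_common(1)[0]
    if count * 2 <= len(values):
        raise ValueError("no majority among replica outputs")
    return winner


def wave_control(set_value):
    """Hypothetical stand-in for the wave control computation."""
    return set_value * 2


def faulty_wave_control(set_value):
    """Replica on hardware with a value error (hypothetical fault)."""
    return set_value * 2 + 7


replicas = [wave_control, faulty_wave_control, wave_control]
inputs = multiply(21)
outputs = [replica(x) for replica, x in zip(replicas, inputs)]
result = vote(outputs)
print(outputs, result)
```

The single faulty replica is outvoted by the two correct ones, so the value error of one execution hardware no longer reaches the actuator. Two simultaneous, identical faults would still defeat the voter.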
After applying both transformations, we repeat the top-down analysis. We assume that the acceptance test fails to detect a wrong value with a probability of 0.01. For the transformed architecture, the quantitative analysis computes a significantly lower probability (0.0002009898) for the hazard concerning the wrong value on the outgoing port of the speed control component. This is a direct result of using an acceptance test.
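
The effect of the acceptance test on the hazard probability can be illustrated with a small calculation: a wrong value now only contributes to the hazard if the error occurs and the acceptance test misses it. The sketch below assumes statistical independence and uses hypothetical error probabilities; only the missed-detection probability of 0.01 is taken from the text.

```python
from math import prod


def and_probability(probabilities):
    """Probability that all independent events occur together."""
    return prod(probabilities)


def or_probability(probabilities):
    """Probability that at least one independent event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


MISSED_DETECTION = 0.01  # from the text: acceptance test misses a wrong value

# Hypothetical inputs, chosen for illustration only.
assumed_communication_error = 0.01
assumed_other_errors = [0.0001, 0.000001]

# The communication error is only harmful if the acceptance test misses it.
undetected_error = and_probability([assumed_communication_error, MISSED_DETECTION])
hazard_with_test = or_probability([undetected_error, *assumed_other_errors])
print(f"{hazard_with_test:.10f}")
```

With these assumed inputs, the result rounds to 0.0002009898, the value reported for the transformed architecture. The acceptance test scales the dominant contribution down by a factor of 100, after which the remaining errors are of comparable size.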
After demonstrating the application of the transformations, we look at the specification of the "Triple Modular Redundancy" (TMR) transformation. Figure 10 shows the transformation diagram for the "Duplicate component instance" transformation, which is later used by the TMR transformation to clone a software component instance.
The signature at the start node shows that the transformation takes
two parameters: the software component instance that is to be
duplicated and a name prefix for the new instance. The transformation
returns the duplicate software component instance. It consists of three
activities. The first activity begins with the comp object, which has
been passed as an argument and is thus already bound. Next, the type of
the component instance identified by comp and the diagram in which comp
is deployed are bound. After that, a new software component instance of
the same type is created. The new component's name is composed of the
prefix and the original component's name.
The next activity is a for-each activity. For every match of the
specified structure that can be found in the architectural model, its
each-time transition is traversed. In this case, this happens for every
discrete port instance attached to the original component instance. The
each-time transition leads to the third activity, which creates a new
discrete port instance on the duplicate component instance. Since this
is done for all ports of the original component instance, the
transformation finally returns a duplicate of the original component
instance with the same number of port instances.
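
The three activities can be paraphrased in ordinary code. The following sketch is an assumption-laden approximation of the transformation diagram: the classes PortInstance, ComponentInstance, and DeploymentDiagram, their fields, and the example port names are hypothetical and do not reflect the actual metamodel. Copying the port names to the duplicate is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class PortInstance:
    name: str


@dataclass(eq=False)
class DeploymentDiagram:
    instances: List["ComponentInstance"] = field(default_factory=list)


@dataclass(eq=False)
class ComponentInstance:
    name: str
    component_type: str
    diagram: DeploymentDiagram = field(repr=False)
    ports: List[PortInstance] = field(default_factory=list)


def duplicate_component_instance(comp, prefix):
    """Create a duplicate of comp with the same type and number of ports."""
    # Activity 1: comp is already bound; bind its type and diagram,
    # then create a new instance of the same type with a prefixed name.
    duplicate = ComponentInstance(
        name=prefix + comp.name,
        component_type=comp.component_type,
        diagram=comp.diagram,
    )
    comp.diagram.instances.append(duplicate)
    # Activity 2 (for each): visit every port instance of the original.
    for port in comp.ports:
        # Activity 3: create a matching port instance on the duplicate.
        duplicate.ports.append(PortInstance(port.name))
    return duplicate


diagram = DeploymentDiagram()
wave_control = ComponentInstance(
    "waveControl", "WaveControl", diagram,
    [PortInstance("in"), PortInstance("out")],  # hypothetical port names
)
diagram.instances.append(wave_control)

clone = duplicate_component_instance(wave_control, "copy1_")
print(clone.name, [p.name for p in clone.ports])
```

Note that the sketch copies only the component instance and its ports. Connections and the deployment to execution hardware are left untouched, which matches the division of work between this transformation and its callers.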
Figure 11 shows a section of the "Triple Modular Redundancy" transformation. The first activity creates two copies of a given software component instance by calling the "Duplicate component instance" transformation. The component instance to which the TMR technique is to be applied is used as an argument for both transformation calls (represented by the rounded rectangles). After that, multipliers for the incoming values and voters for the outgoing values of the component instances are created, and the port instances are rewired accordingly.