Architecture Matters

While there are many solutions for a given problem, they often carry more complexity than needed. That complexity reflects the learning process: in the beginning, the information looks confused, overlapping and poorly structured.

Gradually, order emerges as insight is gained. This comes from analyzing how things really work and taking a step back. In engineering we call this formalization. It comes down to building abstract models that help to get rid of the confusing details. Complexity is a sign of a problem not well understood.

Architecture is the forgotten art of engineering. Clean architectures make a difference.

StarFish © Fault Resilient computing

There was a time when mechanical solutions were dominant. They were bulky and heavy but mostly inherently safe: because they operated in the continuous domain, they had the benefit of graceful degradation.

Now digital electronics and software dominate because they make things smarter, more flexible, cheaper and lighter. The issue is that we are now in the digital domain, where at every clock pulse (often billions per second) a single glitch can make things go wrong.

The solution again lies in the architecture, and in concurrency. Concurrency provides more performance, but also redundancy when needed.

Such a safe architecture cannot be put as a layer on top of an unsafe one. The underlying basis must be correct by design. Therefore OpenComRTOS was formally developed as well as formally proven.

Less code also means a lower probability of error. Scalability by design also means that transparent parallel processing is built in, which makes it straightforward to distribute the work for redundancy. Having no clumsy middleware to deal with gives better performance.
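
To make the idea of concurrency as a source of redundancy more concrete, here is a minimal sketch in Python. It is not OpenComRTOS code and does not use its API. It only illustrates the general technique of running the same computation on several replicas concurrently and taking a majority vote. The names run_redundant, healthy, glitchy and crashed are hypothetical and exist only for this illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_redundant(replicas, value):
    """Run every replica concurrently on the same input and majority-vote the results."""
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(replica, value) for replica in replicas]
        results = []
        for future in futures:
            try:
                results.append(future.result())
            except Exception:
                # A crashed replica simply contributes no vote.
                pass
    if not results:
        raise RuntimeError("all replicas failed")
    winner, votes = Counter(results).most_common(1)[0]
    # Require a strict majority of all replicas, not just of the survivors.
    if votes * 2 <= len(replicas):
        raise RuntimeError("no majority among replicas")
    return winner


def healthy(x):
    # Hypothetical correct replica.
    return x * 2


def glitchy(x):
    # Hypothetical faulty replica: simulates a single bit flip in the result.
    return (x * 2) ^ 1


def crashed(x):
    # Hypothetical replica that has stopped working altogether.
    raise RuntimeError("replica down")
```

With three replicas, one glitch or one crash is masked: both run_redundant([healthy, healthy, glitchy], 21) and run_redundant([healthy, crashed, healthy], 21) still return the correct value, while two simultaneous faults are reported as an error instead of producing a wrong answer.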

Safe Systems = SIL4

Making systems safe by design is not easy. It requires thinking up front about behavior that should never happen. Even when this is done properly, there is still the so-called common mode failure. It is a reminder that the unthinkable can happen, and that sooner or later it will.

Therefore we disagree with safety thinking that relies on so-called fail-safe states when a fault is detected. An impaired system that is no longer fully functional is no longer a safe system.

In safety terms, systems should be designed for SIL4, so that when impaired by a fault they remain fully functional in SIL3 mode.
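
The difference between dropping to a fail-safe state and continuing in a degraded but fully functional mode can be sketched as follows. This is an illustrative Python model under our own assumptions, not the StarFish implementation: the class RedundantChannels and the mode labels "nominal" and "degraded" are hypothetical, and mapping them to SIL4 and SIL3 is only one reading of the statement above. Actual integrity levels are established by a safety assessment, not by a label in code.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class RedundantChannels:
    """Keeps delivering a voted output after one channel has failed."""
    channels: dict                      # name -> callable computing the output
    excluded: set = field(default_factory=set)

    def mode(self):
        # Illustrative labels only: "nominal" = all channels in service,
        # "degraded" = at least one channel excluded, output still cross-checked.
        return "nominal" if not self.excluded else "degraded"

    def step(self, value):
        outputs = {}
        for name, compute in self.channels.items():
            if name in self.excluded:
                continue
            try:
                outputs[name] = compute(value)
            except Exception:
                self.excluded.add(name)  # a crashed channel is taken out of service
        if len(outputs) < 2:
            raise RuntimeError("fewer than two healthy channels: cannot cross-check")
        winner, votes = Counter(outputs.values()).most_common(1)[0]
        if votes * 2 <= len(outputs):
            raise RuntimeError("channels disagree without a majority")
        # Exclude dissenting channels so later steps run on the agreeing ones.
        self.excluded.update(n for n, out in outputs.items() if out != winner)
        return winner, self.mode()


# Hypothetical three-channel system in which channel "c" is faulty.
system = RedundantChannels({
    "a": lambda x: x + 1,
    "b": lambda x: x + 1,
    "c": lambda x: x + 2,
})
```

In this sketch the first call to system.step(...) outvotes and excludes channel "c". Later calls keep producing the full, cross-checked output on the two remaining channels instead of stopping, which is the behavior argued for above. Only a further fault, which the two remaining channels can detect but no longer mask, forces a stop.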