November 2018 - Edition 04, Risk and Uncertainty

Introduction

I have already slipped on my own resolution to write a newsletter a month. I was fighting the slip last month, but I am fully embracing the miss this time and completely skipping over October. In this edition I’ll cover uncertainty and risk in engineering, with a high-level look at how they affect engineers from both a technical and a programmatic perspective.

Demetri's Corner

As mentioned in the introduction, I missed last month’s newsletter. I’d like to chalk that up to how incredibly busy I am, but that would be untrue. I am fully engaged at this point and could very well justify being so busy that I don’t have time to write a newsletter, but instead I’m wasting time on the minutiae of administration and hardware setup. I have made little progress in terms of growth at this point, but I’m trying to convince myself that I’m laying the foundation so that I don’t have to worry about these minutiae in the future.

I’ve come to realize that my analogy of an “umbrella” in my first newsletter needs adjustment. At this point, the main chute is fully open and I’m free to look at the surroundings to find a good landing spot. Seems good, right? But I’ve come to realize that as a business owner, the parachute is actively unstitching itself as you descend, so if you don’t maintain it mid-flight, you still can plummet to the ground, the ground being retirement. So all that freedom of looking around from great heights is burdened by the thought that the ground is still very far away…

Now that you’ve heard about my inability to make progress or keep resolutions, it’s on to the good stuff!

Today's Subject - Risk and Uncertainty

Risk and uncertainty are everywhere we look. An engineer is even more aware of this than most (other than perhaps an insurance adjuster) and realizes that every action in this world is laden with these aspects. As a starting point, it’s probably best to give a definition of each.

Risk – I looked this up in the Webster dictionary, but those definitions seem to be focused on either physical harm or insurance. For an engineer, it’s neither. A risk is something unforeseen which could change the anticipated response. In the product design space, risk is centered around either something you didn’t foresee happening, or something you couldn’t foresee happening. An example of the former might be that you didn’t consider a chemical interaction with a structural member, which weakens it to the point where it breaks when loaded in an unconventional, but allowable, manner. The latter would be characterized by extreme loading of a piece of equipment due to an event which has never occurred in the past and is not within loading specifications. Both are extremely difficult to quantify. The latter is especially troublesome as there is no basis for its existence.

Uncertainty – Again, the dictionary fails our purposes in this case. Uncertainty can best be characterized by the lack of control over fine details regarding the solution. That could be the amount of flexure a fixture may possess, the control over dimensions during manufacturing, the mechanical properties of a spring, etc.

Definitions aside, we want to explore how this impacts our work as engineers. The most rudimentary approach to addressing these items is to use conservatisms in design (accounting for uncertainties up to some reasonable point) and safety factors (accounting for risk). The amount of risk and uncertainty that remains after these factors are integrated into the design can be considered “residual”.

Often risk and uncertainty are “stacked” in the most conservative manner, which results in a very robust, but overdesigned piece of equipment. Stacking the risk and uncertainties together also gives an incomplete understanding of residual risk and uncertainty as interactions are not taken into account, which results in driving down residual risk even further than required. In many cases this is adequate and appropriate. It requires the least amount of design iteration to ensure that the result is “safe” and “effective” at the expense of product cost, weight, speed, etc. For a calculation, this may result in extremely restrictive operating envelopes or acceptance criteria.

A more nuanced approach would include modeling the risks and uncertainties as well as their interactions and treating the problem probabilistically. This requires detailed understanding of these items and complex modeling. Inputs for uncertainty models are generally easier to come across than for risk modeling, as uncertainties tend to be based on variability that is documented. Additionally, risk items tend to be more binary, and can cause significant changes in the results depending on how they are modeled.
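To make the difference between stacking and a probabilistic treatment concrete, here is a minimal sketch using a dimensional tolerance stack-up. The three tolerance values, the assumption that each tolerance represents three standard deviations of independent normal variation, and all function names are illustrative assumptions, not data from a real design.

```python
import math
import random

# Hypothetical tolerances (mm) for three stacked parts; illustrative only.
TOLERANCES = [0.10, 0.05, 0.20]

def worst_case_stack(tols):
    """Stacked approach: every part sits at its extreme at the same time."""
    return sum(tols)

def rss_stack(tols):
    """Root-sum-square: assumes independent, roughly normal variation."""
    return math.sqrt(sum(t * t for t in tols))

def monte_carlo_stack(tols, trials=50_000, coverage=0.9973, seed=1):
    """Sample each part as normal with tolerance = 3 sigma (an assumption)
    and return the half-width containing `coverage` of simulated stack-ups."""
    rng = random.Random(seed)
    totals = sorted(
        abs(sum(rng.gauss(0.0, t / 3.0) for t in tols)) for _ in range(trials)
    )
    return totals[int(coverage * (trials - 1))]

print("worst case :", worst_case_stack(TOLERANCES))
print("RSS        :", rss_stack(TOLERANCES))
print("Monte Carlo:", monte_carlo_stack(TOLERANCES))
```

Under these assumptions the probabilistic estimates come out noticeably tighter than the worst-case sum, which is the overdesign margin that stacking hides. The simulation is only as good as the distributions fed into it.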

Implementation of Risk and Uncertainty

Two types of implementation are considered here as illustrative examples: implementation during planning of overall project execution, and implementation during design of a product.

Project Execution

Many don’t think that engineers plan projects, but that would be a disservice to engineers and the industry. Engineers should be well versed and included in all project planning that has technical aspects. I’ll talk about this soapbox in a separate newsletter.

The example used here is planning a project for installation of a new component into an existing system.

Let us start with some of the risks we may encounter:

  • The new component is not available when needed for installation,
  • The new component, when delivered, is not up to specification, and
  • There is an interference in installation of the new component.

For uncertainties let us consider the following variabilities:

  • The amount of time required to install the new component,
  • The amount of weld material required to perform the fit-up, and
  • The time at which the new component arrives.

Let us stop and take a look at the risks. The first two are fairly straightforward and can be quantified. For the first, we can look at the past history of the supplier and their rate of defaulting. For the second, we can survey other recipients of the components to assess the quality of the vendor. The last item is different in that it captures any instance of interference which makes installation as planned impossible. This could be whether it fits through the door, the distance between bolt holes, etc. Since this is so generic, quantification is difficult.

For the uncertainties, similarly, the first two can be fairly easily quantified. There will be some facility data on the variability in work time required to install similar components. For the second, the amount of welding is known and the variability in weld material can be fairly closely controlled and understood. The third item is also easy enough to quantify, but note its connection to the first risk. The risk is that the component cannot be delivered when necessary, which implies that we won’t perform the installation at all. The uncertainty assumes the component is still delivered within the window, in time for installation.

Here’s where it gets interesting. We could easily model all of these risks and uncertainties in schedules and cost estimates. But there would be some considerations that we need to make.

  1. Do we take into account the third risk at all? It is binary and if it doesn’t fit we’re sunk. In this case, we probably need to do everything we possibly can to lower that risk to below a level where we would account for it. Then we need to add all of that risk reduction as part of the overall plan and not try to quantify the risk at all.
  2. How do we define the cut-off between the first risk and third uncertainty? They are clearly connected, as the maximum allowable uncertainty is determined by the overall schedule allowance. In this case, we would have to work the problem backward. If we had a schedule that included delivery date and installation time, we could model the uncertainty in installation time (first uncertainty) and then determine the maximum amount of variability in delivery date we can accept and still make the installation window. The first risk then becomes nuanced to, “The new component is not delivered within ?? days of the delivery date.”
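Working the problem backward, as described in the second consideration, can be sketched in a few lines. Everything numeric here is hypothetical: the 14-day window, the triangular distribution for installation time (4 to 9 days, most likely 5), and the 95% confidence level are assumptions chosen only to show the mechanics.

```python
import random

def max_delivery_slip(window_days, install_samples, confidence=0.95):
    """Largest delivery slip (days) that still leaves enough time to
    install, at the requested confidence in the installation duration."""
    ordered = sorted(install_samples)
    # Installation duration we must plan for at the chosen confidence.
    idx = min(len(ordered) - 1, int(confidence * len(ordered)))
    planned_install = ordered[idx]
    return max(0.0, window_days - planned_install)

rng = random.Random(7)
# Hypothetical installation-time model: 4 to 9 days, most likely 5.
install_samples = [rng.triangular(4.0, 9.0, 5.0) for _ in range(20_000)]

slip = max_delivery_slip(window_days=14.0, install_samples=install_samples)
print(f"Component must arrive within {slip:.1f} days of the planned date")
```

The resulting number is what would fill in the blank in the restated risk. In practice the installation-time samples would come from facility data on similar components rather than an assumed distribution.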

Once you’ve made these considerations and performed the modeling, you have a more complete picture of what to expect during the installation and will be more prepared going forward.

Design of a Product

Product design is what most people think of when they consider mechanical engineering, though it’s often a very small slice. All the same, it’s very visible, and therefore important to understand the inherent risks and uncertainties which can derail the design.

For this example, consider the case of a client providing a concept of a product that performs functions X and Y and must be developed under constraints A and B. All designs are a balance between function and constraints, so this could be a bridge, a trashcan, or a car.

For this example we will consider the following risks:

  • The required functions were not well defined and get “changed” by the client during the development process, or additional understanding of the functions requires that they are redefined,
  • The function required by the client cannot be met within the constraints given, and
  • Testing of the design after completion shows that the product does not work as intended due to unforeseen interactions.

For uncertainties we will consider the following variabilities:

  • Dimensions of critical features during manufacturing,
  • Measurement of functions during testing, and
  • Properties of materials used for construction.

Considering the risks, the first risk is one that we always have to deal with. It is difficult to quantify the amount of pain this could entail, but generally it is based on a fuzzy understanding of the client, complexity of the problem, and past experience solving similar problems. In some cases many of these factors are relatively unknown and unknowable. The second risk is one that entails the ability to relax the constraints. The ability to do so, and the effort required, are likely well understood if the problem is well understood. The last risk is much like the last risk in the previous example – a bucket for things we don’t know may happen. Since we don’t know how they could happen, there is little way we can mitigate this risk.

The uncertainties are much easier to understand in this example than the previous. The first two are classical uses of uncertainty in manufacturing and measurement. The last item may be easy enough to quantify, and whether that is worthwhile should be determined.

So how would we consider these risks and uncertainties? The first two uncertainties are straightforward and can be modeled probabilistically based on the uncertainty in manufacturing methods (including accuracy of measurement equipment) and the uncertainty in the equipment used for final testing. Alternatively, they can be dealt with more simply by setting absolute bounds of acceptance, should the design afford that flexibility. Similarly, data likely exists which can provide insight into material properties, but even more powerfully, this uncertainty could be removed entirely if the design prescribes the use of materials certified to have a certain material property.
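As a rough illustration of modeling the first two uncertainties together, the sketch below combines manufacturing variation and measurement error into a single distribution and asks how often a measured dimension lands inside the acceptance bounds. The normal, independent distributions and every number used are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def acceptance_probability(nominal, lower, upper, sigma_mfg, sigma_meas):
    """Probability that a measured dimension falls inside [lower, upper],
    assuming independent normal manufacturing and measurement variation."""
    combined = NormalDist(nominal, sqrt(sigma_mfg**2 + sigma_meas**2))
    return combined.cdf(upper) - combined.cdf(lower)

# Hypothetical 25.00 mm feature with +/- 0.10 mm acceptance bounds.
p = acceptance_probability(
    nominal=25.00, lower=24.90, upper=25.10,
    sigma_mfg=0.03,   # assumed process standard deviation (mm)
    sigma_meas=0.04,  # assumed gauge standard deviation (mm)
)
print(f"Expected fraction of parts measured in tolerance: {p:.4f}")
```

Note that in this made-up case the measurement equipment contributes more variation than the process itself, which is the kind of insight that setting absolute bounds alone would not reveal.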

For risks, we’ve already talked about the difficulty of quantifying all but the second item. In most cases, an attempt to quantify those is a waste of time, and the risk must be accepted as is. That is why there is so much aversion to doing leading-edge engineering: that is where these risks are greatest, and most enterprises are not comfortable with being unable to perform any mitigation.

So what we’ve learned here is that the design itself can be well controlled, including the uncertainties in complex interactions. This may be complex and tedious, but well within the expected range of skills of an engineer. In this case the willingness to take on unmitigated risk (significant in some cases) is the major stumbling block.

Summary

We could have put up a hundred more examples, but I’m already over my page limit, so these will have to suffice. What we have explored is how risk and uncertainty can be quantified and be significant in very different ways for varying types of engineering activities. Beyond having to be able to quantify risks, the role of an engineer includes accepting those risks which cannot be effectively mitigated, determining how to account for uncertainty, and mitigating risks when appropriate. This all seems very complex – and it is – but engineers do it all the time. When you put together a project plan, or a design, or a safety calculation, every decision is based on the inputs available, including the confidence you have in them. We will never have thought of everything, but on our good days, with a risk-informed approach to our work, we will have captured those which potentially threaten life, utility, and efficiency.

Dose of Aphorisms

I made this one up just for this newsletter. If I wanted one theme to come out of all of the words above, it’s that we encounter risk and uncertainty in everything we do. Understanding that will make us better engineers and embracing the complexity of the world around us makes our work more interesting and valuable.

We live in an uncertain world, but if we didn’t, it wouldn’t be worth living.

Explanation of Fields in the SMARRT form submission

Reference Scenario Inputs:


Number of People Infected – How many potential members of the gathering are infectious. The simulation starts when they enter (time=0).

Type of Activity – Impacts the number of particles spread as aerosols per respiration. More strenuous activities result in more viral particles being released.

Air Changes per Hour – This is the air exchange rate with fresh air for the volume of air being breathed by the gathering. If you use forced air exchange, you can calculate the number of air changes per hour for your specific situation.

Space Floor Area and Ceiling Height – These are used to calculate the total space volume.
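For a space with forced air exchange, the air changes per hour can be estimated from the fresh-air flow rate and the space volume. The sketch below is not the form's internal calculation; the function names, the metric units, and the example values are assumptions for illustration.

```python
def space_volume(floor_area_m2, ceiling_height_m):
    """Total space volume (m^3) from floor area and ceiling height."""
    return floor_area_m2 * ceiling_height_m

def air_changes_per_hour(fresh_air_flow_m3_per_h, volume_m3):
    """Number of times per hour the space volume is replaced by fresh air."""
    return fresh_air_flow_m3_per_h / volume_m3

# Hypothetical room: 50 m^2 floor, 3 m ceiling, 600 m^3/h of fresh air.
volume = space_volume(50.0, 3.0)
ach = air_changes_per_hour(600.0, volume)
print(f"Volume: {volume} m^3, air changes per hour: {ach}")
```

If the flow rate is known in cubic feet per minute, multiply by 60 and divide by the volume in cubic feet instead.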

Duration Infectious Person is Present – This is how long the infectious person stays in the space after their initial entry. For the reference scenario, this defines the end of the simulation.

Gathering Scenario Inputs:

See the reference scenario for all inputs up to Time of space entry.

Time of Space Entry and Exit – These values represent when you enter and leave the space, measured from the moment the infectious person enters (time=0). For example, if you show up fifteen minutes late, but stay an hour after the end of a one hour party, the Duration Infectious Person is Present is 60 minutes, the Time of Space Entry is 15 minutes, and the Time of Space Exit is 120 minutes.
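The timing in the example above can be checked with a small helper that splits your stay into the time shared with the infectious person and the time spent after they have left. The function and its names are hypothetical and are not part of the form; all times are in minutes from the infectious person's entry.

```python
def exposure_timeline(infectious_duration, entry, exit_time):
    """Split a visit into minutes shared with the infectious person and
    minutes spent in the space after they have left."""
    if exit_time < entry:
        raise ValueError("exit time must not be earlier than entry time")
    # Time both people are in the space together.
    shared = max(0, min(exit_time, infectious_duration) - entry)
    # Time you remain after the infectious person has gone.
    after = max(0, exit_time - max(entry, infectious_duration))
    return shared, after

shared, after = exposure_timeline(infectious_duration=60, entry=15, exit_time=120)
print(f"Shared: {shared} min, after departure: {after} min")
```

For the party example this gives 45 minutes shared and 60 minutes afterward, which together account for the full 105 minutes between entry and exit.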