Reliability Engineering: Maximize System Dependability and Business Success


Reliability is a crucial factor for the success of laboratory and diagnostic equipment, yet it remains widely misunderstood and often inadequately tested. Discover in the first part of our two-part series what reliability truly means in the context of these high-stakes products. This article provides a comprehensive overview of how reliability is defined.

 

By emphasizing the importance of incorporating reliability measures early in the product development process, this article aims to equip engineers, quality assurance professionals, and product managers with a solid understanding of how to ensure that laboratory and diagnostic equipment remains robust and dependable throughout the product lifecycle.

 

Statistical methods quantify how a system performs over time

Reliability is defined as the probability that a system will perform without failure for a specific period of time under certain functional and environmental conditions. This probability is usually derived from reliability data obtained in a field or test setting.

The histogram plot (Figure 1) shows the analysis of fictional field reliability data of a centrifuge.


Figure 1. Histogram of fictional field reliability data of a centrifuge

The data show how many centrifuges failed within a certain time range, or bin.

For example, 1,000 centrifuges failed between 27,000 and 54,000 hours (red bar), while 2,500 centrifuges failed between 54,000 and 81,000 hours (green bar), and so on. Depending on the spread and sample size of the data, the bin size of the histogram may be refined to better represent the sample distribution.
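As a rough illustration of the binning described above, the following Python sketch counts failures per time range. The failure times are made-up values, not the data behind Figure 1, and the 27,000-hour bin width is only assumed here because it matches the ranges quoted in the example.

```python
from collections import Counter

def bin_failures(failure_times_h, bin_width_h):
    """Count failures per bin; bin k covers [k * width, (k + 1) * width)."""
    counts = Counter(int(t // bin_width_h) for t in failure_times_h)
    return {(k * bin_width_h, (k + 1) * bin_width_h): counts[k]
            for k in sorted(counts)}

# Hypothetical failure times in hours (illustrative only)
failure_times_h = [30_000, 45_000, 60_000, 70_000, 80_000, 95_000, 110_000]
histogram = bin_failures(failure_times_h, bin_width_h=27_000)

for (lo, hi), n in histogram.items():
    print(f"{lo:>7,} to {hi:>7,} h: {n} failures")
```

Choosing a smaller `bin_width_h` refines the histogram, at the cost of fewer failures per bin.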

 



Figure 2. Histogram of fictional field reliability data of a centrifuge with decreased bin size and approximated probability density function


Progressively reducing the bin size “smooths” the histogram. Ideally, once the bar heights are normalized so that the total area equals 1, the histogram approximates the probability density function (PDF) f(t) of the data, shown in red (Figure 2).
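To make the step from counts to a density estimate concrete, here is a minimal Python sketch. It assumes complete data (every unit has failed) and uses hypothetical failure times and a hypothetical 13,500-hour bin width. Each count is divided by the sample size and the bin width, so the bars enclose a total area of 1, as a probability density must.

```python
def histogram_density(failure_times_h, bin_width_h):
    """Map each bin's left edge to an estimated density (per hour)."""
    counts = {}
    for t in failure_times_h:
        left_edge = (t // bin_width_h) * bin_width_h
        counts[left_edge] = counts.get(left_edge, 0) + 1
    n_total = len(failure_times_h)
    return {edge: c / (n_total * bin_width_h)
            for edge, c in sorted(counts.items())}

# Hypothetical failure times in hours (illustrative only)
failure_times_h = [30_000, 45_000, 60_000, 70_000, 80_000, 95_000, 110_000]
bin_width_h = 13_500

density = histogram_density(failure_times_h, bin_width_h)
total_area = sum(d * bin_width_h for d in density.values())
print(f"area under the normalized histogram: {total_area:.3f}")
```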

 

System reliability ensures high performance, operational efficiency, and customer satisfaction.


Figure 3. Total cumulative number of failed centrifuges over the lifetime

Another way to analyze the reliability data is to arrange it cumulatively, showing how the total number of failed centrifuges increases over the lifetime, thus yielding the cumulative distribution function (CDF) (Figure 3).

 

“Smoothing” the relative cumulative plot estimates the failure probability F(t), shown in red (Figure 4).


Figure 4. Failure probability at a given lifetime
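The relative cumulative count can be computed directly from the raw failure times. The sketch below is one simple way to do this, assuming complete data without censoring. The failure times are hypothetical.

```python
from bisect import bisect_right

def empirical_failure_probability(failure_times_h, t):
    """Fraction of units that have failed by time t (complete data only)."""
    ordered = sorted(failure_times_h)
    return bisect_right(ordered, t) / len(ordered)

# Hypothetical failure times in hours (illustrative only)
failure_times_h = [30_000, 45_000, 60_000, 70_000,
                   80_000, 95_000, 110_000, 130_000]

for t in (54_000, 81_000, 108_000):
    f_t = empirical_failure_probability(failure_times_h, t)
    print(f"F({t:,} h) = {f_t:.3f}")
```

With real field data, units that are still running at the time of analysis would need a censoring-aware estimator, which this sketch does not cover.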

 

This failure probability is the likelihood that a centrifuge will have failed by a given lifetime. For example, based on the given failure probability, a centrifuge has a 50% probability of having failed by a lifetime of 108,000 hours. Conversely, the reliability R(t) is the probability that the centrifuge will not have failed by a given lifetime (Figure 5). Reliability is the complement of the failure probability, R(t) = 1 – F(t). As the failure probability increases, the reliability decreases over the lifetime.


Figure 5. Reliability decreases with progressing lifetime

 

The failure rate λ(t), also known as hazard function or hazard rate, is another commonly used term in reliability engineering. It is defined as the instantaneous likelihood at time t that a system will fail in the next time interval, given that it has survived up until that moment. Mathematically, it is the ratio of the probability density function f(t) to the reliability function R(t). While the failure probability F(t) accumulates the likelihood of failure over time, λ(t) reflects the immediate risk of failure at a specific time.
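In symbols, the quantities introduced so far can be summarized as follows. This is a compact restatement of the definitions above, assuming lifetimes start at t = 0. The integration variable τ is introduced here only for notation.

```latex
F(t) = \int_0^t f(\tau)\,\mathrm{d}\tau, \qquad
R(t) = 1 - F(t), \qquad
\lambda(t) = \frac{f(t)}{R(t)}
```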

Figure 6 shows the relationship between these functions.


Figure 6. Relationship between f(t), F(t), R(t) and λ(t)

 

The probability density function f(t) represents the distribution of failure times across the centrifuge population. At a specific lifetime tx, the failure probability F(tx) corresponds to the shaded area under the curve to the left of tx (gray), while the reliability R(tx) is represented by the area to the right of tx (green). For this distribution, the failure rate λ(t) increases over time because the ratio of f(t) to R(t) increases.
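The ratio f(t) / R(t) can be evaluated numerically. The sketch below assumes a normally distributed lifetime with a mean of 108,000 hours, chosen to match the 50% failure point quoted earlier. The standard deviation of 27,000 hours is a purely hypothetical value.

```python
from statistics import NormalDist

# Assumed lifetime model: mean from the 50% example, sigma hypothetical
lifetime = NormalDist(mu=108_000, sigma=27_000)

def failure_rate(t):
    """Failure rate lambda(t) = f(t) / R(t), with R(t) = 1 - F(t)."""
    reliability = 1.0 - lifetime.cdf(t)
    return lifetime.pdf(t) / reliability

for t in (54_000, 108_000, 162_000):
    print(f"t = {t:>7,} h: F = {lifetime.cdf(t):.3f}, "
          f"lambda = {failure_rate(t):.2e} per hour")
```

For this assumed model the failure rate rises steadily with age, which is the behavior described above.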

Although it may seem intuitive that the failure rate increases continuously with progressing lifetime, it often follows a characteristic 'bathtub' curve (Figure 7).

 

Predict and improve system reliability and quality


Figure 7. Bathtub curve: failure rate over time

 

The 'bathtub' curve is composed of three distinct phases and reflects the overall reliability behavior of many systems over time, see Table 1.

Table 1. The three distinct failure phases of the bathtub curve

  • Early failures: Initially, the failure rate is high due to manufacturing defects or design flaws (e.g., improper assembly or faulty components). As these ‘infant’ failures are addressed, the rate decreases.
  • Random failures: After the early phase, the failure rate remains relatively low and constant, reflecting random failures caused by unpredictable external factors.
  • Wear-out failures: Over time, components such as bearings, motors, or rotors wear out due to repeated stress and friction, leading to an increase in the failure rate. Regular maintenance can delay this, but eventually, the centrifuge will need repair or replacement.

 

The Weibull distribution is a handy tool for the assessment of system reliability

The probability density functions and their corresponding histograms shown earlier follow a normal distribution, which is commonly observed in data from natural and technological contexts. A normal distribution is symmetrical about its mean: the mean, median, and mode coincide, and the spread is the same on both sides, described by a single standard deviation (sigma).

However, reliability data do not necessarily follow this symmetrical pattern and are often skewed. The Weibull distribution, shown in red (Figure 8), is particularly useful in reliability engineering because it models this skewness effectively. Although it may share the same mean and sigma values as the normal distribution (shown in blue), the Weibull distribution allows for varying failure rates, including early-life and wear-out failures. This makes it ideal for analyzing real-world reliability data where the risk of failure changes over time.


Figure 8. Normal distribution vs. Weibull distribution (same mean and sigma)


The shape of the Weibull distribution is determined by its shape parameter (often denoted as β or b) and scale parameter (often denoted as α or η). The diagram (Figure 9) illustrates how the probability density function f(t) of the Weibull distribution adopts different shapes as the shape parameter varies, while the scale parameter remains constant. With β = 5.0, f(t) resembles a normal distribution, displaying a more symmetrical spread. With β = 1.5, however, the distribution becomes significantly skewed, with its peak shifted toward earlier lifetimes and a long tail toward later ones. Note that the failure rate still increases over time for any β > 1; a decreasing failure rate, typical of early-life failures, requires β < 1.


Figure 9. Density functions for Weibull distributions with different shape parameters
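The effect of the shape parameter can be reproduced with a few lines of Python. The sketch assumes the common two-parameter form of the Weibull density, with shape β and scale η. The scale value of 1,000 hours is hypothetical, and the function names are illustrative.

```python
import math

def weibull_pdf(t, beta, eta):
    """f(t) = (beta/eta) * (t/eta)**(beta-1) * exp(-(t/eta)**beta), t > 0."""
    z = t / eta
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))

def weibull_mode(beta, eta):
    """Time at which the density peaks (valid for beta > 1)."""
    return eta * ((beta - 1) / beta) ** (1 / beta)

eta = 1_000.0  # hypothetical scale parameter in hours
for beta in (1.5, 5.0):
    peak = weibull_mode(beta, eta)
    print(f"beta = {beta}: density peaks near {peak:.0f} h "
          f"(f = {weibull_pdf(peak, beta, eta):.2e} per hour)")
```

With the scale held constant, the peak for β = 1.5 lies well below η, while for β = 5.0 it sits close to η. This matches the skewed and near-symmetrical shapes described above.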

 

When plotting the failure probability F(t) of the Weibull distribution for different β values with the same scale parameter, the curves intersect at a particular point (Figure 10). This point (highlighted by the intersection of the red dashed lines) marks the characteristic life, which equals the scale parameter η: the time at which 63.2% of units are expected to have failed, regardless of β. It is a critical reference for understanding system reliability.


Figure 10. Failure probability for Weibull distributions with different shape parameters

 

The reliability function R(t) of the Weibull distribution is the complement of the failure probability, R(t) = 1 – F(t). All curves intersect at the characteristic life, the time at which 36.8% of units are expected to have survived (Figure 11).


Figure 11. Reliability functions for Weibull distributions with different shape parameters
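The common intersection point can be checked numerically. The sketch below assumes the two-parameter Weibull failure probability F(t) = 1 – exp(–(t/η)^β) and a hypothetical scale parameter of 1,000 hours. At t = η the exponent is –1 for every β, so F(η) = 1 – 1/e.

```python
import math

def weibull_failure_probability(t, beta, eta):
    """F(t) = 1 - exp(-(t/eta)**beta) for the two-parameter Weibull."""
    return 1.0 - math.exp(-((t / eta) ** beta))

eta = 1_000.0  # hypothetical characteristic life in hours
for beta in (0.5, 1.0, 1.5, 5.0):
    f_eta = weibull_failure_probability(eta, beta, eta)
    # Every beta gives F(eta) = 0.632 and R(eta) = 0.368
    print(f"beta = {beta}: F(eta) = {f_eta:.3f}, R(eta) = {1.0 - f_eta:.3f}")
```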

 

When analyzing how the failure rate λ(t) of the Weibull distribution changes with the shape parameter, it becomes evident how the Weibull distribution models distinct phases of the bathtub curve (Table 2 and Figure 12).

Table 2. Weibull distribution models distinct phases of the bathtub curve

  • For β < 1: The failure rate decreases, meaning failures occur more frequently early on (early-life failures).
  • When β = 1: The failure rate remains constant, making the Weibull distribution equivalent to the exponential distribution, which models random failures.
  • For β > 1: The failure rate increases as the system ages, indicating wear-out failures.

 

This adaptability makes the Weibull distribution suitable for a wide range of reliability scenarios.


Figure 12. Weibull distribution can model the failure rates corresponding to distinct stages of the bathtub curve
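The three cases in Table 2 follow from the Weibull failure rate λ(t) = (β/η)(t/η)^(β–1), which is f(t) / R(t) for the two-parameter form. The following sketch compares the rate at an early and a late point in time. The scale parameter and the two time points are hypothetical values chosen for illustration.

```python
def weibull_failure_rate(t, beta, eta):
    """lambda(t) = (beta/eta) * (t/eta)**(beta - 1), for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1)

def trend(beta, eta, t_early=100.0, t_late=2_000.0):
    """Classify the failure rate by comparing an early and a late time."""
    early = weibull_failure_rate(t_early, beta, eta)
    late = weibull_failure_rate(t_late, beta, eta)
    if late < early:
        return "decreasing"
    if late > early:
        return "increasing"
    return "constant"

eta = 1_000.0  # hypothetical characteristic life in hours
for beta in (0.5, 1.0, 3.0):
    print(f"beta = {beta}: {trend(beta, eta)} failure rate")
```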

 

 

Typically, reliability data are illustrated using a Weibull probability plot (Figure 13).

In this plot, the Weibull cumulative distribution function, or failure probability F(t), is linearized through a particular transformation. This makes it easier to interpret failure patterns and assess system reliability over time.
In this transformation, the x-axis is plotted on a logarithmic scale representing time, while the y-axis represents the log of the negative log of the reliability, ln(–ln(1 – F(t))), so that data following a Weibull distribution fall on a straight line.
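The linearization can be derived in two steps, assuming the two-parameter Weibull failure probability with shape β and scale η. Taking the logarithm twice turns the failure probability into a straight line in ln t, with slope β and intercept –β ln η.

```latex
F(t) = 1 - \exp\!\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]
\;\Longrightarrow\;
\ln\!\bigl(-\ln\bigl(1 - F(t)\bigr)\bigr) = \beta \ln t - \beta \ln \eta
```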

 

The black dots represent individual data points, while the red line corresponds to the fitted Weibull distribution. The blue shaded area indicates the confidence interval.

If the confidence level is 0.9, there is a 90% probability that the true lifetime for a given failure probability (or, equivalently, for a given reliability) lies within the confidence limits.

Additionally, there is a 95% probability that the true lifetime is to the right of the left (95%) confidence limit, and a 5% probability that it is to the right of the right (5%) confidence limit.


 Figure 13. Weibull probability plot

To determine whether the Weibull distribution is a good fit for a given dataset, the correlation coefficient of the Weibull probability plot can be used: it measures how well the transformed data points align with the fitted straight line, and a value close to 1 indicates a good fit. The fit itself is obtained with techniques such as Maximum Likelihood Estimation (MLE) or Least Squares Fitting. Additionally, goodness-of-fit tests such as Anderson-Darling, Kolmogorov-Smirnov, and Chi-squared are used to further validate the fit by comparing observed data to the expected Weibull distribution.
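A least-squares fit on the probability plot, together with its correlation coefficient, might look like the sketch below. Several choices here are assumptions and not taken from the article: the failure times are hypothetical, the data are complete (no censoring), and the plotting positions use Bernard's median rank approximation, (i – 0.3) / (n + 0.4). This is a rank regression, not MLE.

```python
import math

def median_ranks(n):
    """Bernard's approximation for the median rank of failure i out of n."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

def fit_weibull_probability_plot(failure_times):
    """Least-squares line through the points (ln t, ln(-ln(1 - F)))."""
    times = sorted(failure_times)
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - f)) for f in median_ranks(len(times))]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                    # slope of the line = shape parameter
    intercept = mean_y - beta * mean_x  # equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)   # characteristic life
    r = sxy / math.sqrt(sxx * syy)      # Pearson correlation coefficient
    return beta, eta, r

# Hypothetical complete failure times in hours (illustrative only)
failure_times_h = [410, 620, 790, 950, 1_120, 1_300, 1_510, 1_800]
beta, eta, r = fit_weibull_probability_plot(failure_times_h)
print(f"beta = {beta:.2f}, eta = {eta:.0f} h, r = {r:.3f}")
```

A correlation coefficient close to 1 only shows that the points are nearly collinear on the plot. The goodness-of-fit tests mentioned above remain a useful complement.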

 

Summary

This article explains the key concepts in reliability engineering, such as:

  • Probability density functions and cumulative distribution functions
  • Reliability, failure probability and failure rate
  • Use of the Weibull distribution to model failure rates

With specific examples related to laboratory systems, the article guides you through practical approaches for analyzing failure trends, including the bathtub curve.

 

Ensure your automated systems remain robust and dependable

If you want to improve the reliability of your existing automated systems or integrate reliability engineering into the design and development process for new products, we can help you achieve high-performance products, customer satisfaction, and business success.

Contact Lukas Vaut and discover how our deep application knowledge and technical expertise can help you achieve the reliability you need.


👉 Don't miss the second part of Lukas' article, which gives guidance on selecting testing methods for repairable and non-repairable systems.
Follow us on LinkedIn and be the first to hear about it!

 


Reliability Engineering Expert Dr. Lukas Vaut

Lukas Vaut is a Senior Systems Engineer at HSE•AG, bringing a wealth of experience in life sciences, pharmacy, and engineering to his role. He holds a BSc and MSc in Biosciences and a PhD in Health Technology from the DTU (Technical University of Denmark); his research focused on innovative 3D-printed microdevices for oral drug delivery.

Before joining HSE•AG in July 2023, Lukas deepened his knowledge of reliability engineering by taking up positions as a Development and Requirements Engineer at Resolve Biosciences and the Evident Technology Center Europe.
In addition to his technical expertise, he is a Green Belt® in Reliability Engineering and an IREB® Certified Professional in Requirements Engineering, ensuring robust and efficient system designs.

 

 
