Guided root cause analysis of machine failures - Status 2022
report
Today’s complexity of high-tech systems makes diagnosing system failures a tough task for service engineers. Increasing product variability and the fast market introduction of new system generations prevent the build-up of expertise that served service engineers in the past. Traditionally, system knowledge is transferred to the service organization through service manuals and training. This turns out to be inadequate in a complex world in which customers expect high system availability. Our goal is to transfer design knowledge to the service organization in the form of computational models, so that service engineers have an actionable tool to assist them in their diagnostic task. A major part of our research is to create these computational models in a structured, scalable, and maintainable way that fits into the system development way of working. The basic idea is to:
1. Define the input/output behavior of every component type used in the system, both its normal behavior and its behavior in every failure mode. A prior probability needs to be established for every failure mode.
2. Compose a system model with the component descriptions as building blocks following the physical or functional structure of the system.
3. Define a set of tests (observations, measurements, service actions) that can help in diagnosis.
Based on these descriptions, we automatically create a computational diagnostic model that can:
1. List the most suspected components that have failed, including the uncertainty associated with these hypotheses.
2. List the best tests that will increase the accuracy of the diagnosis the most, i.e. reduce the uncertainty the most at the least cost and effort.
Feeding the test results into the model iteratively leads to an improved diagnosis until a service action is appropriate. Next to the model, we developed a prototype user interface oriented toward service engineers to convey the ideas and way of working. This idea sounds easy and is certainly not new, but applying the methodology in practice has many pitfalls. Besides outlining the methodology in detail, this report also describes our solutions to the stumbling blocks we came across while applying the methodology to an industrial printer use case.
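The diagnose-and-test loop summarized above can be illustrated with a minimal Python sketch. It is not the report's model: it assumes a single fault at a time, binary pass/fail tests, a Bayes update of the fault hypotheses, and test selection by expected entropy reduction per unit cost. All component names, priors, likelihoods, and costs are hypothetical values chosen for illustration.

```python
import math

# Hypothetical single-fault hypotheses "component:failure_mode" with
# illustrative priors, conditional on a failure having been reported.
priors = {
    "pump:blocked": 0.4,
    "pump:leaking": 0.2,
    "heater:open_circuit": 0.4,
}

# Hypothetical tests: a cost and P(outcome == "fail" | hypothesis).
tests = {
    "check_pressure": {
        "cost": 1.0,
        "p_fail": {"pump:blocked": 0.95, "pump:leaking": 0.80,
                   "heater:open_circuit": 0.05},
    },
    "measure_heater_resistance": {
        "cost": 3.0,
        "p_fail": {"pump:blocked": 0.02, "pump:leaking": 0.02,
                   "heater:open_circuit": 0.98},
    },
}


def entropy(belief):
    """Shannon entropy (bits) of a distribution over hypotheses."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def outcome_likelihood(test, outcome, hypothesis):
    p_fail = tests[test]["p_fail"][hypothesis]
    return p_fail if outcome == "fail" else 1.0 - p_fail


def update(belief, test, outcome):
    """Bayes update of the belief given one observed test outcome."""
    unnorm = {h: p * outcome_likelihood(test, outcome, h)
              for h, p in belief.items()}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}


def expected_gain(belief, test):
    """Expected reduction in entropy (bits) from running the test."""
    before = entropy(belief)
    gain = 0.0
    for outcome in ("fail", "pass"):
        p_outcome = sum(p * outcome_likelihood(test, outcome, h)
                        for h, p in belief.items())
        if p_outcome > 0:
            gain += p_outcome * (before - entropy(update(belief, test, outcome)))
    return gain


def best_test(belief, remaining):
    """Pick the test with the highest expected gain per unit cost."""
    return max(remaining,
               key=lambda t: expected_gain(belief, t) / tests[t]["cost"])


def diagnose(observe, threshold=0.9):
    """Run tests until one hypothesis exceeds the threshold or tests run out."""
    belief = dict(priors)
    remaining = set(tests)
    while remaining and max(belief.values()) < threshold:
        test = best_test(belief, remaining)
        belief = update(belief, test, observe(test))
        remaining.discard(test)
    return sorted(belief.items(), key=lambda kv: kv[1], reverse=True)


# Usage: simulate a service visit in which the heater has failed.
observed = {"check_pressure": "pass", "measure_heater_resistance": "fail"}
ranking = diagnose(lambda test: observed[test])
for hypothesis, probability in ranking:
    print(f"{hypothesis}: {probability:.3f}")
```

The returned ranking corresponds to the list of most suspected components with their uncertainty, and `best_test` corresponds to the list of best next tests. A model composed from component descriptions would replace the hand-written likelihood tables used here.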
TNO Identifier
981331
Publisher
TNO
Collation
82 P.