This paper introduces a novel method called extended phase space topology (EPST) for machinery diagnostics and pattern recognition. In particular, the research focuses on fault detection and diagnostics of rolling element bearings. The proposed method is based on mapping the vibrational response onto the density space and approximating the density using orthogonal functions. The method has been applied to vibration data of a rotating machine where the data were measured by proximity probes. The method was applied to two operating conditions: constant operating speed and variable operating speed. As will be shown, the proposed feature extraction method has an outstanding capability in characterizing the system response and diagnosing the system. The method is evidently robust to noise, does not depend on expert knowledge about the system, requires no feature ranking or selection, and can easily be applied in an automated process. Finally, a comparison with utilization of statistical features is performed for each operating condition, which demonstrates that the proposed method performs better than the traditional statistical methods.

## Introduction

Rolling element bearing defects are one of the major sources of breakdown in rotating machinery. Hence, monitoring the condition of bearings and diagnostics of their associated faults is essential for production efficiency and system safety. Techniques for fault detection and diagnostics of bearings have gained considerable attention among researchers across the globe. The key to a robust and accurate prediction is to be able to extract an efficient set of features that can characterize the system response in a unique way and provide as much information as possible about the intrinsic dynamics of the system. These features should be informative and nonredundant. That is to say, a feature set is ideal if the system response can be reconstructed from it. Various time [1–5], frequency [6–11], and time-frequency [6,12–15] features have been developed and used in the literature, each of which has its own strengths and weaknesses. Conventional time-domain statistical features are simple to calculate and are not problem exclusive; i.e., they can be applied to any dynamical system regardless of its nature. However, they do not provide detailed information about the dynamics of the system, and therefore, often show poor performance when the dynamics is complicated. Frequency and time-frequency features, on the other hand, are more capable of extracting detailed information from the response of the system; however, they rely on expert knowledge and have to be carefully designed and adapted to a specific problem. This motivates the design of features that are efficient and rich in information, are easy to compute and implement, and are generalizable in a sense that they can be applied to diverse systems with minimum reliance on expert knowledge.

This paper introduces a novel feature extraction method, that we call extended phase space topology (EPST), to address the stated problem and demonstrates the implementation of the approach in order to detect and isolate bearing defects in a rotating machine. The developed method is based on the quantitative characterization of the topology of the rotating shaft orbit plots. As will be shown, not only does the proposed method show superior performance over common statistical features, but can also be applied to diverse dynamical systems with minimum requirement for a priori knowledge about the nature of the system or adaptation to a specific problem.

The proposed method in this paper is applied to the vibration signals of a rotor measured by proximity probes in order to detect and diagnose bearing faults. Four different bearing conditions are considered including the nondefective bearing, and bearings with inner race (IR), outer race (OR), and ball (B) defects. Features were then extracted from the data using the developed method and used to train an artificial neural network (ANN) for classifying the defects.

The proposed feature extraction method in this paper is based on the shape analysis of the phase space trajectory. As is well known, the phase space is a space in which all states of a system are represented as the system evolves in time, and a phase portrait is a visual representation of the trajectory of the dynamic system. For the two-dimensional case, the phase space will simplify into a phase plane. The topology of the phase space trajectory provides valuable information regarding the dynamics of a system in a qualitative fashion [16]. Much work has been devoted to extracting information from these topological patterns in order to compare attractors [17–22]. In the context of rotating machinery, an orbit plot (at a particular location on the shaft) can be regarded as a projection of phase space into two dimensions, i.e., the vibration of the shaft in two perpendicular directions. Just as phase portraits, the shape of orbit plots represents the behavior of the rotating system in a qualitative fashion, which can be of critical importance in analysis and diagnostics of rotating machinery. The interpretation of this qualitative visualization, however, requires expert knowledge [23], and is moreover not usable in subsequent analysis. Hence, quantitative characterization of this topology would be beneficial in describing the condition of the system. If this topology can be described or ideally reconstructed by a set of numbers, it can be incorporated in an automated process for monitoring, detection, and isolation of anomalies and faults in the system. Therefore, our chief concern here is to extract a set of features that can quantify the phase space topology.

In earlier work [24,25], we developed and introduced the method of “phase space topology” and demonstrated how it can be employed to characterize the topology of the phase space trajectory with quantitative measures. This method, which is based on the probability density distribution of the time series, was used to extract features from the phase plane response of a nonlinear oscillator in periodic and multiperiodic domains and to estimate the parameters of the system. This paper presents an extension to the previous work and generalizes the applicability of the method and improves its performance and efficiency by enhancing the mathematical approach.

The rest of this paper is organized as follows: In Sec. 2, the mathematical details of the proposed feature extraction method are introduced. Section 3 describes the experimental setup and measurement of data used in this study. Section 4 is devoted to analysis of the response of the system and description of how the orbit plots are affected by various factors such as bearing defects and speed. In Sec. 5, the process of feature extraction and classifier training is described and the final prediction results are presented for two operating condition cases. In addition, a comparison with the statistical feature approach is performed. Finally, Sec. 6 summarizes and concludes this paper.

## Feature Extraction Method

The method of phase space topology [24,25] is based on the transformation of phase space into the density space and characterizing the density with quantitative measures. It was shown that, depending on the geometry and shape of the phase space, the density diagram contains peaks of various heights and sharpness at multiple locations, an example of which is shown in Fig. 1. This stems from the fact that the dynamical system occupies more time at specific regions of the space causing higher densities in those regions. The properties of the peaks in the density diagrams including the location, height, and sharpness of the peaks were used as features in the initial approach. Despite the success of this approach, the need to search for the peaks in the density diagrams makes it difficult or sometimes even impractical to implement, especially for systems with noisy or more complex phase space patterns. The upgraded version of this approach, which will be described below, is based on approximating the density distribution with Legendre polynomials.

### Kernel Density Estimation.

**X**= (

*x*

_{1},

*x*

_{2},…,

*x*) be an independent and identically distributed sample data drawn from a distribution with an unknown density function

_{n}*f*. The shape of this function can be estimated by its kernel density estimator ($\u0302$ indicates that it is an estimate, and

*h*indicates that its value can depend on

*h*)

*h*> 0 is a smoothing parameter called the bandwidth, and

*K*(.) is the kernel function, which satisfies the following requirements:

*x*be a state of the system and $yd=fh\u0302(x)$, its density computed using the kernel density estimator.

*y*is then approximated with Legendre orthogonal polynomials. Legendre polynomials can directly be obtained from Rodrigues' formula, which is given by

_{d}where **Z** = (**y*** _{d}* −

**f**) is the residual vector,

*N*is the number of points in the density function,

**a**= (

**y**

*−*

_{d}*E*{

**y**

*}), and*

_{d}**b**= (

**f**−

*E*{

**f**}).

*E*{.} is the expected value.

## Experimental Setup

The proposed method in this study was implemented on data collected from a rotating fault simulator machine shown in Fig. 2. This machine is a test rig with multiple parts that can be conveniently assembled and disassembled to study a variety of different machinery defects. It basically consists of a motor-driven shaft mounted on two bearings. Shafts and bearings with different sizes and conditions can be used. The tests in this study were conducted with a balanced shaft and a mass load of 5 kg applied to it. GE/Bently Nevada 7200 series proximity probe sensors were used to measure the vertical and horizontal vibration displacement of the shaft close to the bearing housing. The data collected from the transducers were processed through the corresponding conditioning units and were digitized using a National Instruments NI USB 6363 data acquisition system.

Data were collected for four bearing conditions including healthy (H), and bearings with IR, OR, and B defects as shown Fig. 3. For every defect condition, data were collected at 19 different speeds ranging from 300 rpm to 3000 rpm with alternating increments of 120 and 180 rpm (e.g., 300, 420, 600, 720, 900, …, 3000), at a sampling rate of 10 kHz, and for the time length of 5 s. In order to obtain sufficient data, ten sets of data were collected at each rotating speed for the total number of 760 sampled signals.

## Analysis of Orbit Plots

Figure 4 shows the obtained orbit plots for all four bearing conditions at various speeds. As can be seen, the shapes of the orbit plots change with the condition of the bearing as well as the rotational speed of the shaft. This observation suggests that diagnostics of bearings could be performed by characterizing the rotation profile of the shaft.

One quick observation from Fig. 4 is that the orbit plots tend to get more circular as the rotational speed increases. This transition demonstrates that the vibration of the shaft moves from the multiperiodic domain to the periodic domain, which can lead to loss of frequency information. This stems from the fact that at higher speeds, the flexibility of the shaft will dominate in proximity probe signals and mask frequencies coming from other sources including the bearings. In other words, the variation of the shape of the orbit plots with respect to the bearing condition at higher speeds, e.g., 3000 rpm becomes less significant, which could possibly lead to a less accurate diagnosis.

As shown in earlier contributions by the authors [24,25], the phase plot topology has a definitive relationship to the density curve of the corresponding states. Every loop in the orbit will produce two peaks in the density plot due to the higher concentration of data points at the returning points (local minimum and maximum of signals). As a result, a periodic motion, which is characterized by a circular or oval phase plot orbit, will have two peaks in the density plots of the corresponding axes. Similarly, extra harmonics in the response will produce additional loops or distortions in the orbits, which will result in the creation of additional peaks in the density plots. The properties of these peaks such as the location, height, and sharpness can characterize the response and can be used as features for diagnostics.

Figure 5 shows the density of the horizontal axis of the orbit plots for the same speed and bearing conditions presented in Fig. 4. As an example, the orbit plot of the healthy system at 300 rpm in Fig. 4 has an oval shape with several distortions in the middle. It can be seen in Fig. 5 that it has been translated into two higher peaks at both ends and several softer and shorter peaks in the middle of the density plot. As the speed increases moving to the right in Fig. 4, the response loses some harmonics and the orbit becomes more circular and smooth. Consequently, the peaks in the density curves of Fig. 5 disappear gradually when increasing the speed. The other plots can be described in a similar way. In addition, the dotted lines in Fig. 5 correspond to the density functions approximated with Legendre polynomials according to the method described in Sec. 2. The RMSE ranges from 0.0031 to 0.0245. It is also apparent from the figures that the fitting has been accurate.

## Fault Classification

Shaft rotational speed is an important parameter as the dynamic response of the system depends on it. Many traditional bearing fault detection techniques involve pattern recognition, which is effective only at one operating speed and requires retraining the classifier each time the rotational shaft speed changes. This limitation motivates the need for a new method that is effective under variable operating speeds. The current study investigates different bearing configurations under two operating conditions: (A) constant operating speed and (B) variable operating speed.

In order to achieve this goal, the classification problem was initially performed by training and testing the classifier on the same set of speeds. The classifier was trained at 19 rotational speeds, and then tested on the same set of rotational speeds. The second step involved generalizing this diagnostic approach to variable operating speeds. In this step, the classifier was trained on one set of speeds and then tested on another different set of speeds. The detailed description of the analysis of both of the above mentioned procedures is provided in the sequel.

### Case A: Constant Operating Speed.

As discussed in Sec. 4, the density function of the horizontal vibration signal for every speed and bearing condition was approximated using Legendre polynomials. The order of the polynomial was selected based on the best fit between the estimated density function and the approximated density function. Root-mean-square error and Pearson's correlation coefficient were calculated to compute the quality of the fit. Legendre polynomials of order 20 were used to approximate the estimated density functions. The coefficients of the Legendre polynomials were computed for each of the 760 sampled signals using the least-squares method as shown in Eq. (10). The computed coefficients for each case were saved in a vector of 21 arrays (using only the horizontal vibration signal), which was used as a feature input to train an ANN classifier. For better illustration, the first nine coefficients computed for four bearing conditions at the rotating speed of 300 rpm have been presented in Table 1. Since the rotation speed has a high impact on the response of the dynamic behavior, it was used as an additional feature, making the total number of features equal to 22. With 760 sampled data available, 50% of the sampled data (380 samples) were used for training the ANN, 25% of the sampled data (190 samples) were used for validation of the training algorithm and the remaining 25% (190 samples) were used to test the classifier. The total number of selected neurons was ten and the backpropagation algorithm was used to train the artificial neural network. The backpropagation algorithm, which is a popular method for optimizing the weights of the neural network in order to correctly map inputs to outputs, was used. The algorithm works by propagating an input forward through the network layers to the output layer where the calculated output is compared with the desired output. The error values of the calculated output and the desired output are computed and propagated backward. These errors are traced back to each associated neuron in order to update the weights.

Orthogonal function coefficients | |||||||||
---|---|---|---|---|---|---|---|---|---|

Class | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |

H | 1.374 | −0.286 | −0.217 | 0.218 | −0.954 | 0.119 | −0.878 | −0.064 | 0.120 |

IR | 1.677 | −0.032 | −0.949 | −0.078 | −1.450 | 0.349 | 0.343 | −0.437 | 0.728 |

OR | 1.910 | −0.163 | −1.098 | 0.496 | −0.843 | −0.369 | −0.834 | −0.115 | 0.640 |

B | 2.023 | −0.805 | −1.294 | 1.166 | −1.604 | −0.177 | 0.545 | −0.291 | 0.820 |

Orthogonal function coefficients | |||||||||
---|---|---|---|---|---|---|---|---|---|

Class | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |

H | 1.374 | −0.286 | −0.217 | 0.218 | −0.954 | 0.119 | −0.878 | −0.064 | 0.120 |

IR | 1.677 | −0.032 | −0.949 | −0.078 | −1.450 | 0.349 | 0.343 | −0.437 | 0.728 |

OR | 1.910 | −0.163 | −1.098 | 0.496 | −0.843 | −0.369 | −0.834 | −0.115 | 0.640 |

B | 2.023 | −0.805 | −1.294 | 1.166 | −1.604 | −0.177 | 0.545 | −0.291 | 0.820 |

The performance of the classification model is presented by means of confusion matrices. In general, in a confusion matrix, the predicted classes are compared with the actual classes. Each row of the matrix represents the results of prediction for the corresponding class at that row, while each column represents the actual class. Elements in the main diagonal of the matrix represent the correct classified prediction for each corresponding class. These elements are known as true positives. For a specific row, all elements excluding the element on the main diagonal are the misclassified prediction for the corresponding class, which are known as false positives. The false negative for a specific class is defined as the summation of elements on its corresponding column, excluding the element on the main diagonal.

The classifier performance can be analyzed using certain evaluation matrices derived from the confusion matrix, such as accuracy, sensitivity, and precision. The overall accuracy of the classifier is the correct prediction rate represented in the prediction table as the bottom right element. Sensitivity is the true positive rate and precision is the correct positive prediction divided by the total number of positive predictions. Sensitivity and precision for each predicted class are calculated and shown in each prediction table. Table 2 shows the predictions for training and test data using the neural network classifier. In each element of the confusion matrix, the upper number represents the number of cases and the lower number represents the absolute percentage of these cases with respect to the total number of cases. For example, in the training confusion matrix, element (1, 1) has 98 healthy cases, which represents 25.8% of the total cases. As can be seen, the classifier has been able to predict all defects with 100% accuracy, 100% precision, and 100% sensitivity with no misclassification. This result is remarkable for several reasons.

(I) | |||||||
---|---|---|---|---|---|---|---|

Training confusion matrix | Sensitivity | Precision | |||||

Output class | H | 98 | 0 | 0 | 0 | 100.0% | 100.0% |

25.8% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 82 | 0 | 0 | 100.0% | 100.0% | |

0.0% | 21.6% | 0.0% | 0.0% | ||||

OR | 0 | 0 | 95 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 25.0% | 0.0% | ||||

B | 0 | 0 | 0 | 105 | 100.0% | 100.0% | |

0.0% | 0.0% | 0.0% | 27.6% | ||||

H | IR | OR | B | Accuracy 100.0% | |||

Target class |

(I) | |||||||
---|---|---|---|---|---|---|---|

Training confusion matrix | Sensitivity | Precision | |||||

Output class | H | 98 | 0 | 0 | 0 | 100.0% | 100.0% |

25.8% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 82 | 0 | 0 | 100.0% | 100.0% | |

0.0% | 21.6% | 0.0% | 0.0% | ||||

OR | 0 | 0 | 95 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 25.0% | 0.0% | ||||

B | 0 | 0 | 0 | 105 | 100.0% | 100.0% | |

0.0% | 0.0% | 0.0% | 27.6% | ||||

H | IR | OR | B | Accuracy 100.0% | |||

Target class |

(II) | |||||||
---|---|---|---|---|---|---|---|

Test confusion matrix | Sensitivity | Precision | |||||

Output class | H | 44 | 0 | 0 | 0 | 100.0% | 100.0% |

23.2% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 51 | 0 | 0 | 100.0% | 100.0% | |

0.0% | 26.8% | 0.0% | 0.0% | ||||

OR | 0 | 0 | 48 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 25.3% | 0.0% | ||||

B | 0 | 0 | 0 | 47 | 100.0% | 100.0% | |

0.0% | 0.0% | 0.0% | 24.7% | ||||

H | IR | OR | B | Accuracy 100.0% | |||

Target class |

(II) | |||||||
---|---|---|---|---|---|---|---|

Test confusion matrix | Sensitivity | Precision | |||||

Output class | H | 44 | 0 | 0 | 0 | 100.0% | 100.0% |

23.2% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 51 | 0 | 0 | 100.0% | 100.0% | |

0.0% | 26.8% | 0.0% | 0.0% | ||||

OR | 0 | 0 | 48 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 25.3% | 0.0% | ||||

B | 0 | 0 | 0 | 47 | 100.0% | 100.0% | |

0.0% | 0.0% | 0.0% | 24.7% | ||||

H | IR | OR | B | Accuracy 100.0% | |||

Target class |

Note: (I) Training data and (II) test data.

First, it shows that combining the EPST method with the proximity sensor data can resolve the challenges in identifying faults at low rotational speeds (below 10 Hz). Second, no a priori knowledge of the system was included in the features. This implies that the EPST approach can be conveniently applied to diverse dynamical systems in an automated process, with minimal need for adaptation and reliance on expert knowledge about the system. Conventional bearing analyses search for specific characteristics of the system such as ball pass frequencies but this study did not require any additional analysis because the method functioned well without it. We note with caution that it may well be the case that other operating conditions require additional feature combinations. Finally, no feature ranking or feature selection algorithm [26] was employed to select the optimal feature set. Due to the fact that the effect of coefficients in function approximation decreases by increasing the number of orthogonal functions, the calculated coefficients are *naturally* ranked by their order of significance.

### Case B: Variable Operating Speed.

For this part of the study, horizontal and vertical vibration data were used for every speed and bearing condition to construct the density function. The estimated density functions were then approximated using Legendre polynomials of order 20. As in case A, the order of the approximated density function was selected based on root-mean-square error and Pearson's correlation coefficient. The first 15 Legendre polynomial coefficients in each direction were used as a feature set. The shaft rotational speed was added to the feature set to produce an input vector of 31 arrays for each sampled data. The feature vector was used as an input to train the artificial neural network classifier. The neural network was modeled with ten neurons and the back propagation algorithm. The classifier was trained using the extracted features from vibration data for different bearing conditions and for four rotational speeds. The rotational speeds that were selected to train the classifier are: the machine operating range boundaries (300 and 3000 rpm) and two middle speeds (1200 and 2400 rpm). The available vibration data of 160 total samples for different bearing conditions at these speeds were used for training the artificial neural network. The remaining 600 samples obtained at the other speeds (e.g., 420, 600,…, 2820 rpm) were used for testing the trained classifier. Figure 6 shows a flowchart for the algorithm of case B.

The classification results represented as a confusion matrix for the test data for four bearing conditions are shown in Table 3. As can be seen, the classifier has 96.7% overall accuracy with 20 misclassifications out of 600 predictions. These results indicate a high prediction rate of the classifier for the four bearing conditions. Most of the misclassified predictions are for bearings with ball defects. For a better understanding of the classifier performance, sensitivity and precision were calculated for each bearing condition and are shown in Table 3. These evaluation matrices represent a measure of the classification performance for each bearing condition.

Test confusion matrix | Sensitivity | Precision | |||||
---|---|---|---|---|---|---|---|

Output class | H | 150 | 0 | 0 | 0 | 100.0% | 100.0% |

25.0% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 142 | 0 | 0 | 94.7% | 100.0% | |

0.0% | 23.7% | 0.0% | 0.0% | ||||

OR | 0 | 0 | 139 | 1 | 92.7% | 99.3% | |

0.0% | 0.0% | 23.2% | 0.2% | ||||

B | 0 | 8 | 11 | 149 | 99.3% | 88.7% | |

0.0% | 1.3% | 1.8% | 24.8% | ||||

H | IR | OR | B | Accuracy 96.7% | |||

Target class |

Test confusion matrix | Sensitivity | Precision | |||||
---|---|---|---|---|---|---|---|

Output class | H | 150 | 0 | 0 | 0 | 100.0% | 100.0% |

25.0% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 142 | 0 | 0 | 94.7% | 100.0% | |

0.0% | 23.7% | 0.0% | 0.0% | ||||

OR | 0 | 0 | 139 | 1 | 92.7% | 99.3% | |

0.0% | 0.0% | 23.2% | 0.2% | ||||

B | 0 | 8 | 11 | 149 | 99.3% | 88.7% | |

0.0% | 1.3% | 1.8% | 24.8% | ||||

H | IR | OR | B | Accuracy 96.7% | |||

Target class |

For example, for the ball defect condition, the true positive value is 149, the false positive value is 19, and the false negative value is 1. Using these values, the precision can be calculated as 88.7% and the sensitivity as 99.3%. The high sensitivity indicates that for the actual ball defect, the classier was efficient in predicting the condition, but the precision was somewhat lower because other bearing classes were misclassified as a ball defect. In summary, the classifier has both high sensitivity and precision for the majority of the bearing conditions. Moreover, in the case of treating the classification problem as a binary problem, where the objective is to distinguish between healthy and defective bearings, the accuracy of the classifier is virtually perfect since there were zero misclassifications. In the end, the classifier is extraordinarily effective in predicting the bearing conditions independently of the rotational speed variations.

### Comparison With Statistical Features.

For purposes of comparison, the performance of the EPST features is compared with the performance of statistical features. In order to achieve this, the vibration data were used to extract statistical features including both simple and high order statistics. Nine statistical features were computed including: (1) mean, (2) mode, (3) root-mean-square (RMS), (4) standard deviation (SD), (5) minimum, (6) peak value (PV), (7) crest factor (CF), (8) kurtosis, and (9) skewness. Samples of the statistical features in the horizontal direction at 300 rpm for different bearing conditions are listed in Table 4. Due to the significance of the shaft rotational speed, it has been considered as an additional feature to the statistical feature set. The extracted feature set was used as an input to train a neural network classifier. The feature extraction was performed on two operating conditions: (A) constant operating speed and (B) variable operating speed.

Statistical features | |||||||||
---|---|---|---|---|---|---|---|---|---|

Class | Mean | Mode | RMS | SD | Min | PV | CF | Kurtosis | Skewness |

H | 3.0868 | 2.9445 | 3.0892 | 0.1214 | 2.7451 | 3.4214 | 1.1075 | 1.7670 | 0.1848 |

IR | 3.0897 | 2.8952 | 3.0931 | 0.1447 | 2.8140 | 3.4063 | 1.1013 | 1.8033 | 0.1622 |

OR | 3.1566 | 2.8772 | 3.1627 | 0.1956 | 2.7361 | 3.6327 | 1.1486 | 1.7345 | 0.1006 |

B | 2.023 | −0.805 | −1.294 | 1.166 | −1.604 | −0.177 | 0.545 | −0.291 | 0.820 |

Statistical features | |||||||||
---|---|---|---|---|---|---|---|---|---|

Class | Mean | Mode | RMS | SD | Min | PV | CF | Kurtosis | Skewness |

H | 3.0868 | 2.9445 | 3.0892 | 0.1214 | 2.7451 | 3.4214 | 1.1075 | 1.7670 | 0.1848 |

IR | 3.0897 | 2.8952 | 3.0931 | 0.1447 | 2.8140 | 3.4063 | 1.1013 | 1.8033 | 0.1622 |

OR | 3.1566 | 2.8772 | 3.1627 | 0.1956 | 2.7361 | 3.6327 | 1.1486 | 1.7345 | 0.1006 |

B | 2.023 | −0.805 | −1.294 | 1.166 | −1.604 | −0.177 | 0.545 | −0.291 | 0.820 |

In the first case, the artificial neural network was trained and tested on the same rotational speeds. Only the horizontal vibration data were used to extract the statistical features from the 760 sampled data. Table 5 shows the results of the bearing classification using these features. The confusion matrix results imply that the overall accuracy of the classification model is 89.5%. This is significantly lower compared to the accuracy obtained with the EPST features. The classifier is efficient for predicting a healthy bearing and a bearing with an outer race defect. On the other hand, all the misclassified predictions are for bearings with inner race and ball defects. In contrast, the EPST classifier was able to distinguish between all bearing conditions with a virtually perfect prediction rate.

(I) | |||||||
---|---|---|---|---|---|---|---|

Training confusion matrix | Sensitivity | Precision | |||||

Output class | H | 103 | 0 | 0 | 0 | 100.0% | 100.0% |

27.1% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 82 | 0 | 10 | 84.5% | 89.1% | |

0.0% | 21.6% | 0.0% | 2.6% | ||||

OR | 0 | 0 | 86 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 22.6% | 0.0% | ||||

B | 0 | 15 | 0 | 84 | 89.4% | 84.8% | |

0.0% | 3.9% | 0.0% | 22.1% | ||||

H | IR | OR | B | Accuracy 93.4% | |||

Target class |

(I) | |||||||
---|---|---|---|---|---|---|---|

Training confusion matrix | Sensitivity | Precision | |||||

Output class | H | 103 | 0 | 0 | 0 | 100.0% | 100.0% |

27.1% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 82 | 0 | 10 | 84.5% | 89.1% | |

0.0% | 21.6% | 0.0% | 2.6% | ||||

OR | 0 | 0 | 86 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 22.6% | 0.0% | ||||

B | 0 | 15 | 0 | 84 | 89.4% | 84.8% | |

0.0% | 3.9% | 0.0% | 22.1% | ||||

H | IR | OR | B | Accuracy 93.4% | |||

Target class |

(II) | |||||||
---|---|---|---|---|---|---|---|

Test confusion matrix | Sensitivity | Precision | |||||

Output class | H | 46 | 0 | 0 | 0 | 100.0% | 100.0% |

24.2% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 35 | 0 | 12 | 81.4% | 74.5% | |

0.0% | 18.4% | 0.0% | 6.3% | ||||

OR | 0 | 0 | 49 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 25.8% | 0.0% | ||||

B | 0 | 8 | 0 | 40 | 76.9% | 83.3% | |

0.0% | 4.2% | 0.0% | 21.1% | ||||

H | IR | OR | B | Accuracy 89.5% | |||

Target class |

(II) | |||||||
---|---|---|---|---|---|---|---|

Test confusion matrix | Sensitivity | Precision | |||||

Output class | H | 46 | 0 | 0 | 0 | 100.0% | 100.0% |

24.2% | 0.0% | 0.0% | 0.0% | ||||

IR | 0 | 35 | 0 | 12 | 81.4% | 74.5% | |

0.0% | 18.4% | 0.0% | 6.3% | ||||

OR | 0 | 0 | 49 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 25.8% | 0.0% | ||||

B | 0 | 8 | 0 | 40 | 76.9% | 83.3% | |

0.0% | 4.2% | 0.0% | 21.1% | ||||

H | IR | OR | B | Accuracy 89.5% | |||

Target class |

Note: (I) Training data and (II) test data.

In the second case, the artificial neural network was trained for four rotational speeds including 300, 1200, 2400, and 3000 rpm. The trained classifier then was tested on the other remaining 15 rotational speeds (e.g., 420, 600,…, 2820 rpm). The vibration data in both directions were used to extract the statistical features. The statistical features plus the rotational speed for each data sample were used as an input to the artificial neural network (19 features in total). A total of 160 samples were used for training and a 150 samples for each bearing condition were used for testing the classifier. The test results are presented as a confusion matrix, which is shown in Table 6. The overall accuracy of the classification model is 86%, which is also significantly lower than the EPST classifier accuracy. Results indicate that healthy bearing class was misclassified with other bearing classes (such as bearings with inner race or ball defects). However, other bearing classes were not misclassified as a healthy bearing class. This implies that the classifier has high precision and less sensitivity in predicting the healthy bearing class. Also, the classifier is effective for the outer race defect class where results show high sensitivity and precision similar to case A. The lowest performance of the classifier was for predicting the ball defect condition, where the sensitivity and the precision were found to be 71.3% and 78.1%, respectively. In summary, the statistical feature approach was successful in detecting and identifying bearings with the outer race defect condition, but it faced challenges with detecting and identifying bearings with inner race and ball defects.

Test confusion matrix | Sensitivity | Precision | |||||
---|---|---|---|---|---|---|---|

Output class | H | 132 | 0 | 0 | 0 | 88.0% | 100.0% |

22.1% | 0.0% | 0.0% | 0.0% | ||||

IR | 10 | 128 | 0 | 43 | 85.3% | 70.7% | |

1.7% | 21.3% | 0.0% | 7.2% | ||||

OR | 0 | 0 | 150 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 25.0% | 0.0% | ||||

B | 8 | 22 | 0 | 107 | 71.3% | 78.1% | |

1.3% | 3.7% | 0.0% | 17.8% | ||||

H | IR | OR | B | Accuracy 86.2% | |||

Target class |

Test confusion matrix | Sensitivity | Precision | |||||
---|---|---|---|---|---|---|---|

Output class | H | 132 | 0 | 0 | 0 | 88.0% | 100.0% |

22.1% | 0.0% | 0.0% | 0.0% | ||||

IR | 10 | 128 | 0 | 43 | 85.3% | 70.7% | |

1.7% | 21.3% | 0.0% | 7.2% | ||||

OR | 0 | 0 | 150 | 0 | 100.0% | 100.0% | |

0.0% | 0.0% | 25.0% | 0.0% | ||||

B | 8 | 22 | 0 | 107 | 71.3% | 78.1% | |

1.3% | 3.7% | 0.0% | 17.8% | ||||

H | IR | OR | B | Accuracy 86.2% | |||

Target class |

## Comparison With Conventional Methods

Conventionally, accelerometers are used as vibration sensors for bearing analysis and fault diagnostics [1,7,27,28]. Alternatively, in many real applications such as turbomachinery, it is a standard practice to install proximity probes for shaft data acquisition and balancing purposes [29–31]. Moreover, some studies have used proximity probes (noncontact sensors) for condition monitoring of rolling element bearings [32–36]. The effectiveness of the used sensor depends on different factors including the rotational shaft speed, the bearing defect type, and the location of the mounted sensor. The vibrations produced by the bearing defects transmit through multiple components and the sensor location is critical in capturing different information about the defect. Accelerometers usually work best in high speed ranges while proximity probes are more effective at low speeds. Additionally, accelerometers, which are usually installed on the bearing housing, are preferred for outer race bearing defect diagnostics. Alternatively, proximity probes can be configured to measure the shaft deflection or the bearing outer race by using rolling element bearing activity monitor mounting technique [37–39]. In the former configuration, proximity probes work better for inner race bearing defect diagnostics and for the latter configuration they work better for the outer race bearing defect diagnostics by measuring the vibration deflection directly. This improves the signal to noise ratio, because there is no intermediate structure in the signal path.

In our past work [40], we have applied conventional methods such as fast Fourier transform, envelope spectrum, and discrete wavelet transform on the same test rig and over the same operating conditions. Notwithstanding the above statement about accelerometers being best suited for outer race defect, we used it to diagnose bearings with an inner race defect. Features were extracted from the acceleration data, then mutual information was used as a ranking technique, and the optimal feature subset corresponding to the highest classification accuracy was determined. An overall accuracy of 97.1% was achieved using this procedure. In Ref. [41], we have also conducted the same approach using outer race defect seeded to the same fault simulator operated over the same conditions. Results indicate a 97.3% overall classification accuracy. In Ref. [26], we investigated healthy bearings and bearings with inner race and outer race defects tested over the same range of operating speeds as this work. Conventional feature extraction techniques such as fast Fourier transform, envelope transform, discrete wavelet transform were used. An overall classification accuracy of 97% was obtained for the three bearing conditions.

In contrast to the previous work, this paper investigates four bearing conditions including healthy bearing and bearings with inner race, outer race, and ball defects using proximity sensors. This work also studies two operating conditions: constant operating speed and variable operating speed. An overall classification accuracy of 100% was obtained for the constant operating speed condition and 96.7% for the variable operating speed condition.

To summarize, the proposed approach is not to replace the conventional methods but to explore the information that can be obtained from different kinds of vibration sensory data and to combine these information as a sensor fusion approach. Even though the results of the classification performance are remarkable, it is not clear if it can be generalized to field conditions. There are no generally reported success stories with high accuracy in field conditions. Our future work will in fact involve optimally combining information from accelerometers and proximity probes for such situations.

## Conclusion

The feature extraction method (EPST), which was introduced in this paper, is based on the density distribution of the vibration signal. We showed that the density function can be characterized and quantitative features can be extracted to describe the behavior of the system using the approximation of the density with orthogonal functions. Rolling element bearings with various conditions were studied in order to detect and identify their status. Results show remarkable performance regarding constant operating speeds, particularly low rotational shaft speeds. Furthermore, the performance of the EPST method is effective independent of the operating speed variations. These results are significant due to the importance of the shaft rotational speed and its strong influence on the dynamic system behavior. The EPST features were compared with the statistical features, and the results indicate an improvement on the classification performance for the two operating conditions in the study. The results showed that the proposed method has outstanding performance in describing the dynamics of the system, and it can also be applied to various dynamical systems with minimum knowledge about the signatures of the response under various defective conditions.

Like conventional statistical features, and unlike many other features, which are designed for specific systems, the EPST method can be applied to various time series regardless of the nature of the system. The method is robust to noise, does not depend on expert knowledge about the system, and requires no feature ranking or selection as the obtained features are automatically ranked during the process. Furthermore, the EPST method can easily be applied in an automated process.

## Acknowledgment

We deeply appreciate this support and are humbled by ONR's enthusiastic recognition of the importance of this research.

## Funding Data

U.S. Office of Naval Research (Grant No. ONR N0014-15-1-2311) with Capt. Lynn Petersen as the Program Manager.