Introduction
The electrical activity of the working heart is presented in the form of a 12-lead electrocardiogram (ECG). The ECG signal reflects depolarization, i.e. the change of potential from negative to positive that spreads across the myocardium, followed by repolarization. The activity of the atria is marked by the deflection called the P-wave. Depolarization produces a deflection from the isoelectric line. If the depolarization vector is consistent with the direction of a bipolar lead, or is oriented towards a unipolar lead, the deflection is positive. In the opposite situation, the deflection is negative. When the vector spreads perpendicularly to the direction of a bipolar lead, or parallel in the case of unipolar leads, no deflection is formed, which results in isoelectric fragments in the ECG. Given the complexity of the heart's conduction system and its myocardial components, this never fully happens in practice; nevertheless, it helps to explain the presence of flattened, irregular ECG fragments. After analysing the coordinates of the signal, it was found that the duration of the P-wave differs significantly from that obtained with a less precise methodology [1, 2]. Despite the more accurate assessment, the results and conclusions of that work were still doubted [3, 4]. To support the present findings, the team designed an original, specially calibrated automatic algorithm that analyses every millisecond of the recording. A precisely measured P-wave duration is highly useful in defining the total atrial activation time, which is essential for advanced diagnosis.
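The rule for bipolar leads described above can be illustrated with a short sketch. It assumes a simplified model that is not part of the original text: the deflection is taken as the projection of a single cardiac vector onto the lead axis in the frontal plane. The function name and the angle values are hypothetical and chosen only for illustration.

```python
import math


def lead_deflection(vector_angle_deg: float, vector_magnitude: float,
                    lead_angle_deg: float) -> float:
    """Projection of a cardiac vector onto a bipolar lead axis (simplified model)."""
    # Assumption: the deflection is proportional to magnitude * cos(angle between
    # the vector and the lead axis).
    angle = math.radians(vector_angle_deg - lead_angle_deg)
    return vector_magnitude * math.cos(angle)


# Hypothetical atrial vector pointing at +60 degrees with unit magnitude.
aligned = lead_deflection(60.0, 1.0, 60.0)         # vector along the lead axis
perpendicular = lead_deflection(60.0, 1.0, 150.0)  # vector at 90 degrees to the lead
opposite = lead_deflection(60.0, 1.0, 240.0)       # vector against the lead axis
```

In this model the aligned case gives a positive deflection, the opposite case a negative one, and the perpendicular case a value that is zero up to rounding, which corresponds to an isoelectric fragment.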
The study aimed to compare manual P-wave duration measurements with those taken automatically in a wide, unselected population of patients with atrioventricular nodal reentrant tachycardia (AVNRT), atrial flutter (AFL) and atrial fibrillation (AF). An additional goal was to show that P-wave durations are equal in different ECG leads, regardless of the type of arrhythmia.
Material and methods
Seventy-two patients (31 males, 41 females) aged 62.8 ± 14.27 years were included in the study. The patients were divided into 3 subgroups of equal size (24 patients each) according to the type of arrhythmia: AF, AFL and AVNRT. The measurements were taken twice within those subgroups: first manually, which was treated as the gold standard, and then automatically with the originally developed algorithm. The manual measurements were taken 3 times in all leads by 2 independent investigators blinded to each other's results and to the clinical data of the patients. The LabSystem Pro electrophysiological system was used for the manual measurements, with a paper speed of 200 mm/s and a magnification of 64–128×. Because the record is stored as vector graphics, the researchers were able to zoom in without any loss of quality. The electrophysiological system allowed the record to be analysed at a resolution of 1 px per 1 ms on a 4K TV screen. For automatic measurements, the team used specially designed software, automatic precise P-wave assessment (APPA). The algorithm was calibrated to imitate the skills of the most experienced human researchers and to keep the measurements repeatable. The signal was analysed every 1 ms, and the algorithm was set to detect the rise of the signal from the isoelectric line and to define it as the beginning of the P-wave. In some cases, the recording was so distorted by overlapping artifacts that the algorithm could not objectively detect the beginning of the P-wave. Such cases were excluded from the study, as the intention was to keep the results as objective as possible. The measurements were compared and analysed between the subgroups. The construction of the algorithm results in a slight deviation of ± 10 ms in the measurements. Because of small fluctuations of the isoelectric line, which occur in every case, the algorithm starts to calculate the beginning of the P-wave after 10 ms, provided that the predetermined condition is met.
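The idea of confirming the onset only after 10 ms can be sketched as follows. This is not the APPA implementation, whose detection condition is not given in the text. The deviation threshold, the baseline value, the synthetic signal and the function names are assumptions made for illustration, and the end of the wave is handled here by a mirrored rule.

```python
from typing import Optional, Sequence


def find_onset(samples: Sequence[float], baseline: float, threshold: float,
               hold_ms: int = 10) -> Optional[int]:
    """Index (ms) starting the first run of `hold_ms` samples off the baseline."""
    run = 0
    for i, value in enumerate(samples):
        if abs(value - baseline) > threshold:
            run += 1
            if run == hold_ms:
                return i - hold_ms + 1  # the run began hold_ms - 1 samples earlier
        else:
            run = 0  # short fluctuations of the isoelectric line are ignored
    return None


def find_offset(samples: Sequence[float], baseline: float, threshold: float,
                onset: int, hold_ms: int = 10) -> Optional[int]:
    """Index (ms) starting the first run of `hold_ms` samples back on the baseline."""
    run = 0
    for i in range(onset, len(samples)):
        if abs(samples[i] - baseline) <= threshold:
            run += 1
            if run == hold_ms:
                return i - hold_ms + 1
        else:
            run = 0
    return None


# Synthetic signal sampled every 1 ms (hypothetical values, not study data):
# flat line, a 3 ms disturbance, flat line, a 110 ms wave, flat line.
signal = [0.0] * 50 + [0.1] * 3 + [0.0] * 20 + [0.1] * 110 + [0.0] * 30
onset = find_onset(signal, baseline=0.0, threshold=0.03)
offset = find_offset(signal, baseline=0.0, threshold=0.03, onset=onset)
duration_ms = offset - onset
```

In this example the 3 ms disturbance is ignored because it does not persist for 10 ms, and the duration of the synthetic wave is recovered.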
Similarly, the measurement ends 10 ms after the termination condition is reached. Therefore, assuming that the algorithm may distort the measurement at the beginning and at the end by 5 ms on average, the statistical error was 10 ms. This can be supported by probability theory: after, e.g., 1 million random draws [real random numbers; the probability density function with a normal distribution N(0, σ²)] of numbers from 1 to 10, the mean is 5. It follows that the average error does not exceed 10 ms (5 ms at the beginning of the P-wave and 5 ms at the end) [5, 6]. Errors can in general be divided into systematic, random and excessive. These types of errors are not considered here, because each examination concerns exactly one patient and his or her health condition during the examination. Moreover, the algorithm does not analyse disturbed periods; it marks them as damaged and rejects them automatically, without analysing why they occurred. Therefore, it can be concluded that the only statistical error is 10 ms. There is, of course, a probability of an error of 20 ms, but it is just as probable as an error of 0 ms (no error). From a mathematical point of view, over many patients these cases neutralize each other, although they may occur in individual cases. Most importantly, the present study relies on a series of studies, which makes the measurements more objective from the statistical point of view.
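The error argument above can be checked with a small simulation. The sketch assumes, purely for illustration, that the error at each end of the P-wave is independent and uniformly distributed between 0 and 10 ms. This is a simpler model than the distribution mentioned in the text and is an assumption made here, as are the function name and the seed.

```python
import random
import statistics
from typing import List


def simulate_total_error(n_trials: int = 100_000, window_ms: float = 10.0,
                         seed: int = 0) -> List[float]:
    """Sum of an onset error and an offset error, each uniform on [0, window_ms]."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    return [rng.uniform(0.0, window_ms) + rng.uniform(0.0, window_ms)
            for _ in range(n_trials)]


errors = simulate_total_error()
mean_error = statistics.fmean(errors)  # close to 5 ms + 5 ms under this model
smallest, largest = min(errors), max(errors)
```

Under this model the mean total error is close to 10 ms, while individual values range between 0 and 20 ms, and the two extremes are equally unlikely.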
Statistical analysis
For quantitative variables, basic descriptive statistics were calculated (M, mean; SD, standard deviation; Me, median; Q1, lower quartile; Q3, upper quartile; Min, minimum value; Max, maximum value), and the conformity of their distributions with the theoretical normal distribution was checked using the Shapiro-Wilk W test. Comparisons were performed with Student's t-test or the Mann-Whitney U test for independent groups, or with the Kruskal-Wallis ANOVA for multiple comparisons. Categorical variables were presented as numbers and percentages and compared with the chi-square test. Correlations between the studied parameters were assessed using Pearson's correlation coefficient or Spearman's rank correlation coefficient, according to the statistical properties of the data. The statistical analysis was performed using STATISTICA v.13.3 (StatSoft, Inc., Tulsa, USA). P-values less than 0.05 were considered significant.
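The descriptive statistics listed above can be computed with a few lines of code. The sketch below uses the Python standard library rather than STATISTICA, so the quartile definition may differ from the one used in the study. The function name and the sample values are hypothetical.

```python
import statistics
from typing import Dict, Sequence


def describe(values: Sequence[float]) -> Dict[str, float]:
    """Descriptive statistics named as in the text (M, SD, Me, Q1, Q3, Min, Max)."""
    # statistics.quantiles uses the "exclusive" method by default; other software
    # may define quartiles differently and give slightly different Q1 and Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "M": statistics.fmean(values),
        "SD": statistics.stdev(values),  # sample standard deviation
        "Me": statistics.median(values),
        "Q1": q1,
        "Q3": q3,
        "Min": min(values),
        "Max": max(values),
    }


# Hypothetical P-wave durations in ms (not study data).
durations = [110, 114, 118, 120, 122, 126, 130, 134]
summary = describe(durations)
```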
Results
The clinical and demographic characteristics of the patients in the 3 study subgroups are presented in Table 1. The data include the age, sex and comorbidities of the patients for the 3 types of atrial arrhythmia.
Variable          Group                                                          P-value
                  AVNRT (n = 24)       AFL (n = 24)         AF (n = 24)
                  n        %           n        %           n        %
Sex                                                                              < 0.05
  Women           20       27.78       10       13.89       11       15.28
  Men             4        5.56        14       19.4        13       18.06
Comorbidities                                                                    < 0.05
  HT              15       62.5        17       70.8        18       75.0
  DM              2        8.3         5        20.3        6        25.0
  CKD             2        8.3         3        12.5        2        8.3
  IHD             3        12.5        5        20.3        5        20.3
  HF              2        8.3         3        12.5        3        12.5
Age (years)                                                                      < 0.01
  Mean ± SD       55.3 ± 12.03         64.9 ± 12.38         68.3 ± 15.12
Table 2 presents the statistics of the P-wave durations measured manually and automatically in the 3 subgroups of atrial arrhythmias.

                  The duration of the P-wave [ms]                                P-value
                  Manual               Automatic            dMA
AVNRT             n = 24               n = 24               n = 24               0.045
  Mean ± SD       113.8 ± 9.8          110.6 ± 9.1          –3.2 ± 7.4
  Median (IQR)    114 (107–121)        109 (103–119)        –2 (1–6)
  Min–max         95–134               98–128               –29–7
AFL               n = 24               n = 24               n = 24               0.032
  Mean ± SD       123.4 ± 7.5          120.6 ± 8.1          –2.8 ± 7.8
  Median (IQR)    122 (119–127)        121 (115–125)        –5 (–7–3)
  Min–max         113–147              106–138              –25–17
AF                n = 24               n = 24               n = 24               0.021
  Mean ± SD       134.9 ± 13.2         129.7 ± 8.0          –3.2 ± 7.4
  Median (IQR)    132 (129–147)        129 (123–136)        –2 (1–6)
  Min–max         115–156              115–147              –29–7
All patients      n = 72               n = 72               n = 72               < 0.001
  Mean ± SD       124.0 ± 13.5         120.3 ± 11.4         –3.4 ± 9.1
  Median (IQR)    122 (115–130)        122 (115–128)        –2 (1–8)
  Min–max         95–156               98–147               –30–20
The longest P-wave durations (mean, median and minimum–maximum values) are found in the AF subgroup, for both manual and automatic measurements. The shortest are found in the AVNRT subgroup. The results are supported by a significance test.
Figure 1 presents the results of the Wilcoxon signed-rank test for the differences in P-wave duration in all patients and in the groups of patients with AVNRT, AFL and AF. In each of the 3 subgroups, the median P-wave duration is slightly higher with the manual methodology than with the automatic measurement. For all patients analysed together, the difference between the manual and automatic methodology is significant at p < 0.001.
Figure 2 presents the differences in age between the studied groups of patients. The differences were analysed with the Kruskal-Wallis test and post-hoc Dunn's tests in the 3 groups. The subgroup of patients with AF is the oldest, which corresponds with the longest P-wave durations found in the same group. Similarly, the subgroup of patients with AVNRT is the youngest, which corresponds with the shortest P-wave durations.
Figure 3 shows the difference, and the absolute value of the difference, between manually and automatically measured P-wave durations in the 3 study groups, analysed with the Kruskal-Wallis ANOVA test.
In terms of the difference in milliseconds, the widest interquartile range (25th–75th percentile) was observed in the AF subgroup, and a slightly narrower one in AFL. In the AVNRT subgroup, however, the range of the difference was distinctly narrower. The absolute value of the difference between manually and automatically determined P-wave duration was visibly greater in the AF subgroup than in AFL and AVNRT.
Figure 4 presents a graphical interpretation of the results of the multivariate analysis of the difference between manual and automatic measurements in the 3 study groups. The first pair of charts reveals a similar profile of the difference in measurements with respect to the type of arrhythmia and age. In the whole study group, the difference between manual and automatic measurements is higher in men. The difference is close to 0 for women and men in the age groups 25–57 and 58–70 years, respectively.
Figure 5 presents the Bland-Altman index and the Bland-Altman plot showing the relation between the differences in P-wave durations measured manually and automatically and the average duration of the P-wave. The mean difference between the manual and automatic methodology is 3.72 ms, and the highest density of results lies between 110 and 130 ms.
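The quantities behind a Bland-Altman analysis can be sketched in a few lines. The example assumes the common definition (mean difference and limits of agreement at ± 1.96 standard deviations) and takes the difference as manual minus automatic. The function name and the paired values are hypothetical and are not the study data.

```python
import statistics
from typing import List, Sequence, Tuple


def bland_altman(manual: Sequence[float],
                 automatic: Sequence[float]) -> Tuple[List[float], float, float, float]:
    """Return pairwise means, the bias and the lower/upper limits of agreement."""
    diffs = [m - a for m, a in zip(manual, automatic)]        # y-axis of the plot
    means = [(m + a) / 2.0 for m, a in zip(manual, automatic)]  # x-axis of the plot
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return means, bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired P-wave durations in ms.
manual_ms = [114, 122, 132, 120, 128]
automatic_ms = [110, 121, 129, 118, 122]
means, bias, lower, upper = bland_altman(manual_ms, automatic_ms)
```

Plotting each pairwise difference against the corresponding mean, with horizontal lines at the bias and at the two limits, gives the usual form of the plot.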
Discussion
The most important achievement of this research is that APPA has proven to be similarly effective and precise in measuring P-wave duration to an experienced researcher. This means that the results of this research, based on manual measurements, are not distorted by the subjectivity of the researchers [7, 8]. An accurate P-wave duration is an essential basis for calculating the total atrial activation time, which can be used as a predictive factor for AF. In the following discussion, the study findings are analysed, and the methodology is compared with others in use since 1997. The most fundamental principle of electrocardiography is that all electrophysiological phenomena begin and end at the same time and place, regardless of the lead. Different leads should be treated as different perspectives on the same signal. This is logical and follows from the physical properties of electric signals spreading in space and time. Nevertheless, in 1997, Dilaveris [9] introduced the concept of P-wave dispersion as the difference between the longest and the shortest P-wave in two different leads, wrongly assuming that such a difference occurs at all. The duration of the P-waves was determined using a ruler, a magnifying glass and an ECG recorded at a paper speed of 50 mm/s and a calibration of 1 mV/cm. A maximum P-wave duration of 110 ms and a dispersion of 40 ms had a positive predictive value of 89%. The dispersion in the study group was 49 ± ms and in the control group 28 ± ms, which was statistically significant. In the following years, P-wave dispersion became a popular parameter, which resulted in many scientific papers based on an incorrect methodology that is still being copied [10, 11]. The results of research based on the opposite methodology were presented in 2015 at the Europace conference by the team of Zimmer et al. [12]. 
The authors took the measurements first with the settings 50 mm/s, 8×, and then with more accurate settings: 200 mm/s, 128–256×. For this purpose, they used the properties of vector graphics, in contrast to the measurements taken by Dilaveris, who used raster graphics. With the less precise settings, the dispersion was 45.14 ms, while with the more precise settings it was 1.24 ms, so it practically disappeared. The results also showed a direct correlation between Pmax and Pdisp, which meant that dispersion could not be an independent parameter. This discovery was confirmed by the work published in 2020 [12]. Using the methodology of Zimmer [12], the P-wave dispersion was 44.1 ± 16.8 ms (50 mm/s, 8×) and 2.8 ± 3.4 ms (200 mm/s, 64–128×). A particularly interesting case is the work of Yamada et al. [13], who published their results only two years after Dilaveris had introduced the theory of P-wave dispersion. The authors used automatic software for the measurements, and the P-wave dispersion was on average 26.6 ± 9.5 ms in the study group and 14.8 ± 6.7 ms in the control group. These values were much lower than the results presented by Dilaveris, but the growing popularity of the P-wave dispersion theory diminished the impact of those findings.
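The definition of P-wave dispersion discussed above (the longest minus the shortest P-wave duration across leads) can be written down directly. The per-lead durations below are hypothetical values chosen to mimic a coarse and a precise reading of the same beat. They are not measurements from any of the cited studies, and the function name is an assumption.

```python
from typing import Dict


def p_wave_dispersion(durations_ms: Dict[str, float]) -> float:
    """Difference between the longest and the shortest P-wave duration across leads."""
    return max(durations_ms.values()) - min(durations_ms.values())


# Hypothetical readings of one beat in five leads (ms).
coarse = {"I": 100, "II": 120, "III": 90, "aVF": 110, "V1": 80}
precise = {"I": 121, "II": 122, "III": 121, "aVF": 122, "V1": 120}

coarse_dispersion = p_wave_dispersion(coarse)    # 120 - 80
precise_dispersion = p_wave_dispersion(precise)  # 122 - 120
```

Because the dispersion depends only on the two extreme readings, any imprecision in locating the onset or end of the wave in a single lead feeds directly into it.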
The methodology based on automatic software became an inspiration for the study team. The goal was to create an automatic algorithm with a measurement accuracy comparable to that reached with the manual methodology. With the manual methodology, the level of accuracy reached using the properties of vector graphics was 1 pixel per 1 millisecond. The algorithm needed to reflect the skills of an experienced researcher without being any more or less precise. After multiple analyses of the ECG records, the form of the algorithm was decided. The main assumption was to use the properties of vector graphics for recording the signal. The ECG graph was formed from coordinate points filtered every 1 ms. The biggest advantage of vector graphics is that graphs can be enlarged indefinitely without loss [14]. In contrast to raster graphics (e.g. a regular ECG paper printout), vector graphics are fully scalable, with no loss of quality when the proportions change [15]. A scalable ECG record can adapt its quality to a given resolution, which is very important for making a precise measurement, especially in the case of P-waves. For example, structurally damaged atria, as in paroxysmal AF, produce flat and long P-waves, which require higher precision in measurement. Conversely, in less severe arrhythmias, the P-waves are shorter and more distinct. This observation was confirmed by the study results. Excessively long P-waves hide their own actual duration (self-hiding), which was recently described in the work by Mercik et al. [2]. This means that the longer the P-wave is (most probably indicating interatrial conduction disorders), the more difficult it is to assess its actual duration, regardless of the technology used, as was also reflected in this research. 
To support this statement: the mean difference between automatic and manual measurements was 3.72 ms in all patients, including all P-wave durations. It follows that the described dissimilarities could not come from the algorithm, but from the phenomenon of self-hiding, which is a real problem in taking objective measurements. In extreme cases, the P-waves may be long, flat and irregular to such an extent that they seem short when zoomed out. In such cases, the flat and regular parts of the P-waves are averaged to suit the given resolution and appear to be an integral part of the isoelectric line. In the authors' opinion, this was the case in the study by Nielsen et al. [16]. Those authors stated that not only a long P-wave duration but also a short one was related to a higher risk of AF. Without detailed information on the methodology, one can suspect that the P-waves classified as “very short” (< 89 ms) resulted from insufficient precision of measurement. Based on electrophysiological knowledge, a short and regular P-wave profile occurs with quick, physiological signal conduction, which is unlikely to be related to a higher risk of AF. This view is also supported by the numbers obtained in the present research. In the AF subgroup, the minimum and maximum P-wave durations were 115–147 ms (automatic measurements) and 115–156 ms (manual measurements). Based on those results, in the authors' opinion, there is no such category as a “very short” P-wave, in particular in older adults; if such P-wave durations are obtained, a more accurate methodology is required, which would be an interesting direction for future research.
To summarize, the APPA algorithm proved to be practically as accurate in its measurements as an experienced researcher using vector graphics. A correct methodology for assessing the P-wave is essential for making the right diagnosis in clinical practice. Determining the precise P-wave duration and accurately assessing P-wave morphology, including interatrial conduction, remain goals for future research. A detailed analysis of these variables may increase the chances of determining a new parameter for predicting recurrent atrial fibrillation in clinical practice, as the current ones are insufficient.
Conclusions
The automatic precise P-wave assessment algorithm is comparably accurate to an experienced researcher in taking the measurements.
APPA can be used for scientific purposes to analyse data saved in the form of signal coordinates filtered every 1 ms.
The use of an automatic algorithm does not increase the precision of measurements per se, but increasing the number of analysed P-waves per patient, which is part of the algorithm's methodology, makes the final values more reliable.
After increasing the precision of measurements, the differences between the minimal and maximal duration of the P-waves in different leads decrease to negligible values.
Structural damage to the atria results in self-hiding of the actual duration of the P-waves in the ECG. In clinical practice, this can lead to a wrong interpretation of the atrial damage.
Study limitations
A significant limitation of this study is its innovative nature, which makes it impossible to compare the present results in the field of automatic measurement with those of other authors. Moreover, the software cannot yet be offered to a wide range of users, because a commercial version with a graphical interface that would simplify its use and the analysis has not been produced. The software also requires data in the form of coordinates, which cannot be obtained from all electrophysiological systems.
Conflict of interest
The authors have no conflicts of interest to declare. All coauthors have seen and agree with the contents of the manuscript and there is no financial interest to report.
Funding
The authors have no funding to declare.
Abstract Introduction. The electrophysiological activity of the heart is recorded and presented in the form of an electrocardiogram (ECG). Precise measurement of the P-wave is necessary for the correct assessment of signal conduction within the atria. To validate precise manual measurements, the research team created automatic software adapted to precise P-wave measurements (automatic precise P-wave assessment, APPA). The aim of this study is to show that the automatic algorithm has comparable effectiveness in the precise measurement of P-wave duration. Material and methods. The study group consisted of 72 patients (31 men, 41 women) aged 62.8 ± 14.27 years, undergoing various electrophysiological procedures. The P-wave was measured twice: first manually at a paper speed of 200 mm/s, 64–128× (precise measurement), and then automatically using the APPA system, which filters the signal every 1 ms. Results. No statistical differences were found between manual and automatic measurements. The mean difference between the two methods is 3.72 ms. The median P-wave duration was slightly higher for manual measurements in all types of arrhythmia. The largest difference occurred in patients with atrial fibrillation. The smallest difference occurred in the 110–130 ms range of P-wave duration. Conclusions. Measurements made by APPA and manually are equally accurate, which confirms the earlier results obtained by the authors. The measurement algorithm is characterized by high reliability of results and can be used for scientific purposes. Structural destruction of the atria leads to self-hiding of the actual duration of the P-waves in the ECG. With greater precision of measurement, the differences between the minimal and maximal duration of the P-waves in different leads decrease to negligible values.
Key words: P-wave duration, automatic algorithm, software, P-wave measurements Folia Cardiologica 2022; 17, 2: 73–81