Nathaniel F. Watson, MD, MSc1 • Chris R. Fernandez, MS2 • Sam Rusk, BS2 • Yoav N. Nygate, MSc2 • Nick Glattard, MS2 • Fred Turkington, BS2 • Justin Mortara, PhD2
Introduction
The photoplethysmogram (PPG) raw waveform is the basis for both pulse rate and oximetry measurements during polysomnography (PSG) and Home Sleep Apnea Tests (HSATs).
The PPG has also recently become ubiquitous as the basis of continuous measurement in the most widely adopted consumer sleep technologies, particularly smartwatches.
In this study, we clinically validate AI performance for interoperable, PPG-based epoch-by-epoch Sleep-Wake staging (PPG-SW), Total Sleep Time (PPG-TST), and Respiratory Rate (PPG-RR) against 1) PSG-based panel scoring by technologists (RPSGTs) and 2) PSG-based AI scoring (EEG-SW, EEG-TST, Effort Belt-RR).
Methods
We applied stratified random sampling with proportionate allocation to a database of N>10,000 retrospective PSGs.
We controlled for the following factors to establish a representative set of adult PSG studies (N=100) from which PPG samples were obtained:
- Obstructive sleep apnea severity
- Sleepiness
- Medical diagnoses, including sleep, psychiatric, neurologic, neurodevelopmental, cardiac, pulmonary, and metabolic disorders
- Medications, including benzodiazepines, antidepressants, stimulants, opiates, and sedative-hypnotics
- Demographics, including sex, age, BMI, weight, and height
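Stratified random sampling with proportionate allocation can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function name, the `stratum_of` callback, and the largest-remainder rounding rule are all assumptions introduced here.

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(records, stratum_of, n_total, seed=0):
    """Draw n_total records, giving each stratum a share proportional to its size.

    `stratum_of` is a hypothetical callback that maps a record to its stratum key
    (for example, an OSA severity band). Assumes n_total <= len(records).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    population = len(records)
    # Ideal (fractional) allocation per stratum.
    quotas = {k: n_total * len(v) / population for k, v in strata.items()}
    # Round down, then hand out the remaining slots to the largest remainders
    # so the allocations sum exactly to n_total (an assumed rounding rule).
    alloc = {k: int(q) for k, q in quotas.items()}
    leftover = n_total - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1

    sample = []
    for k, members in strata.items():
        # Simple random sampling without replacement inside each stratum.
        sample.extend(rng.sample(members, alloc[k]))
    return sample
```

In practice a stratum key would combine several of the controlled factors (for example, severity band and age group), which this sketch leaves to the caller.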
Double-blinded scoring was prospectively collected for each PSG from 3 experienced RPSGTs randomized from a pool of 6 scorers. The reference RR was set to the mode when at least two scorers agreed on the RR value, and to the median otherwise.
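The RR consensus rule (mode when two scorers agree, median otherwise) can be written compactly. The function name `consensus_rr` is hypothetical, and the sketch assumes each epoch has exactly three integer RR scores.

```python
from collections import Counter
from statistics import median

def consensus_rr(scores):
    """Return the panel RR for one epoch from three scorers' values."""
    value, count = Counter(scores).most_common(1)[0]
    # If at least two scorers gave the same value, that value is the mode.
    if count >= 2:
        return value
    # All three disagree: fall back to the median.
    return median(scores)
```

For example, scores of 14, 14, and 17 brpm resolve to 14 by mode, while 12, 15, and 16 brpm resolve to 15 by median.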
Results
AI EEG-SW demonstrated 96%/94%/95% Sensitivity/Specificity/Accuracy compared to 2/3 majority PSG staging, and AI PPG-SW demonstrated 90%/89%/90% Sensitivity/Specificity/Accuracy compared to the same PSG panel.
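As an illustration of how epoch-by-epoch agreement with a 2/3 majority reference can be computed, the sketch below uses hypothetical helper names and encodes sleep as 1 and wake as 0. Treating sleep as the positive class is an assumption; the abstract does not state which class is positive.

```python
def majority_label(votes):
    """2/3 majority of three scorers' labels (1 = sleep, 0 = wake)."""
    return 1 if sum(votes) * 2 > len(votes) else 0

def sleep_wake_metrics(reference, predicted):
    """Sensitivity, specificity, and accuracy with sleep (1) as the positive class.

    Assumes both classes appear at least once in `reference`.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(r == 1 and p == 1 for r, p in pairs)
    tn = sum(r == 0 and p == 0 for r, p in pairs)
    fp = sum(r == 0 and p == 1 for r, p in pairs)
    fn = sum(r == 1 and p == 0 for r, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy
```

The reference sequence would be built by applying `majority_label` to each epoch's three scorer labels before comparing it with the AI output.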
AI EEG-TST achieved a Pearson correlation coefficient (R-value) of 0.968, and AI PPG-TST achieved an R-value of 0.873, compared to 2/3 majority PSG-TST.
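A sketch of the TST comparison is given below. The helper names are hypothetical, and the 30-second epoch length is an assumption (the abstract does not state the epoch duration). TST is taken as the summed duration of sleep epochs, and the R-value is the standard Pearson correlation across studies.

```python
from math import sqrt

def tst_minutes(epoch_labels, epoch_seconds=30):
    """Total sleep time in minutes from per-epoch labels (1 = sleep, 0 = wake).

    epoch_seconds=30 is an assumed epoch length, not taken from the abstract.
    """
    return sum(epoch_labels) * epoch_seconds / 60

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

Here `x` would hold the per-study reference TST values and `y` the corresponding AI estimates, one pair per PSG.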
When compared to the RR panel consensus across N=282 one-minute RR scoring epochs of PSG, AI Effort Belt-RR was within 2 breaths per minute (brpm) of the consensus in 93.6% of epochs, with an average difference of 0.992 brpm, and AI PPG-RR was within 2 brpm in 92.2% of epochs, with an average difference of 0.996 brpm.
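The two RR agreement figures can be computed as in this sketch. The function name is hypothetical, and reading "average difference" as the mean absolute difference is an assumption; the abstract does not say whether the difference is signed.

```python
def rr_agreement(reference, estimated, tolerance=2.0):
    """Fraction of epochs within `tolerance` brpm, and mean absolute difference."""
    diffs = [abs(r - e) for r, e in zip(reference, estimated)]
    within = sum(d <= tolerance for d in diffs) / len(diffs)
    mean_abs_diff = sum(diffs) / len(diffs)
    return within, mean_abs_diff
```

Each element of `reference` is the panel consensus RR for a one-minute epoch, and each element of `estimated` is the AI output for the same epoch.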
Conclusions
The study shows that interoperable AI analysis performs robustly in evaluating PPG-based epoch-by-epoch sleep-wake stages, total sleep time, and respiratory rate, demonstrating state-of-the-art accuracy when compared to a prospective, double-blinded PSG scoring panel.
This work has implications for consumer sleep technology, HSAT accuracy, and inpatient sleep monitoring, and may support the growth of HSATs by increasing total sleep time accuracy and reliability.