A'ber is built on validated machine learning over published dysarthria corpora and on the established motor learning literature for rehabilitation of motor speech disorders. The severity classifier is trained and evaluated on the UA-Speech and TORGO corpora. Centroid speaker normalization runs before Cleanlab supervised label cleaning; together, the two steps let the classifier generalize across speakers without overfitting to any single recording environment.
The classifier consumes WavLM features over the acoustic battery and emits per-dimension severity estimates rather than a single collapsed score. Per-dimension output preserves the diagnostic signal that single-number scoring discards. The pipeline enforces speaker normalization before Cleanlab so that label cleaning operates on speaker-invariant features. The Cleanlab cap is computed as a percentage of flagged samples, not of total class size, which prevents the historical regression where overconfident clean labels squeeze out genuine borderline cases.
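The ordering described above (normalize per speaker, then cap label-cleaning removals relative to the flagged set) can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not A'ber's pipeline: the function names, the `cap_fraction` value, and the per-class reading of the cap are assumptions, and the flagged list stands in for whatever Cleanlab returns rather than calling Cleanlab itself.

```python
from collections import defaultdict


def centroid_normalize(features, speakers):
    """Subtract each speaker's mean feature vector (centroid) from that speaker's samples."""
    sums = {}
    counts = defaultdict(int)
    for vec, spk in zip(features, speakers):
        if spk not in sums:
            sums[spk] = [0.0] * len(vec)
        sums[spk] = [s + v for s, v in zip(sums[spk], vec)]
        counts[spk] += 1
    centroids = {spk: [s / counts[spk] for s in total] for spk, total in sums.items()}
    return [
        [v - c for v, c in zip(vec, centroids[spk])]
        for vec, spk in zip(features, speakers)
    ]


def capped_removals(flagged, labels, cap_fraction=0.5):
    """Choose which flagged samples to drop.

    flagged: list of (sample_index, label_quality_score); lower score = more suspicious.
    The cap is a fraction of the *flagged* samples in each class (assumed reading),
    not a fraction of the class size.
    """
    by_class = defaultdict(list)
    for idx, score in flagged:
        by_class[labels[idx]].append((score, idx))
    removed = []
    for items in by_class.values():
        items.sort()  # most suspicious first
        limit = int(len(items) * cap_fraction)
        removed.extend(idx for _, idx in items[:limit])
    return sorted(removed)


# Hypothetical toy data: two speakers, then four samples with three flagged in class 0.
normalized = centroid_normalize([[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]], ["a", "a", "b"])
removed = capped_removals([(0, 0.1), (1, 0.4), (2, 0.2), (3, 0.05)], [0, 0, 0, 1])
```

Because the limit scales with the number of flagged samples, a class with few flagged items loses few or none of them, regardless of how large the class is.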
Exercise selection runs through five sequential layers. Deterministic rules first filter the candidate set against clinical guards (audio quality gate, session one validity, device baseline). Kalman filters then track per-dimension skill with confidence intervals that tighten as evidence accumulates. Thompson Sampling selects from the remaining candidates, weighted by their expected information gain. Elo difficulty calibration matches selected exercises to the patient's current level. Bayesian cross-patient pooling warm-starts patients whose individual evidence is still sparse by borrowing from the population prior for similar severity profiles.
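As a rough illustration of how the five layers might fit together, the sketch below implements a textbook version of each one. Every name, field, and constant here is hypothetical. In particular, the Thompson step samples from Beta posteriors over a binary payoff, a common simplification standing in for the information-gain weighting described above, and the pooling step is a simple shrinkage toward a population mean.

```python
import random


def passes_guards(candidate):
    """Layer 1: deterministic clinical guards (field names are hypothetical)."""
    return candidate["audio_ok"] and candidate["session_valid"] and candidate["device_ok"]


def kalman_update(mean, var, observation, obs_var, process_var=0.0):
    """Layer 2: one scalar Kalman step for a random-walk skill model."""
    prior_var = var + process_var
    gain = prior_var / (prior_var + obs_var)
    new_mean = mean + gain * (observation - mean)
    new_var = (1.0 - gain) * prior_var  # shrinks as evidence accumulates
    return new_mean, new_var


def thompson_pick(posteriors, rng):
    """Layer 3: draw once from each Beta(alpha, beta) posterior, pick the largest draw."""
    draws = {name: rng.betavariate(a, b) for name, (a, b) in posteriors.items()}
    return max(draws, key=draws.get)


def elo_expected(patient_rating, exercise_rating):
    """Layer 4: standard Elo probability that the patient succeeds at the exercise."""
    return 1.0 / (1.0 + 10 ** ((exercise_rating - patient_rating) / 400.0))


def elo_update(patient_rating, exercise_rating, success, k=24.0):
    """Move both ratings by k * (outcome - expected); success is 1 or 0."""
    delta = k * (success - elo_expected(patient_rating, exercise_rating))
    return patient_rating + delta, exercise_rating - delta


def pooled_mean(patient_mean, n_obs, population_mean, prior_strength=5.0):
    """Layer 5: shrink a sparse patient estimate toward the population prior."""
    return (n_obs * patient_mean + prior_strength * population_mean) / (n_obs + prior_strength)


# Hypothetical candidates and posteriors.
candidates = [
    {"name": "sustained_vowel", "audio_ok": True, "session_valid": True, "device_ok": True},
    {"name": "ddk_rate", "audio_ok": False, "session_valid": True, "device_ok": True},
]
eligible = [c["name"] for c in candidates if passes_guards(c)]
posteriors = {name: (2.0, 2.0) for name in eligible}
choice = thompson_pick(posteriors, random.Random(0))
skill_mean, skill_var = kalman_update(0.0, 1.0, observation=1.0, obs_var=1.0)
```

With zero observations `pooled_mean` returns the population mean unchanged, and the patient's own data dominates as `n_obs` grows, which is the warm-start behavior the text describes.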
Every exercise is grounded in the motor learning literature on speech rehabilitation. Feedback frequency calibration follows the Schmidt and Lee framework, transitioning from knowledge of performance (rich, real-time feedback while the patient is learning the movement) to knowledge of results (faded summary feedback once the patient is consolidating). Cadence differs by sub-track because the literature shows that respiration and articulation respond to different feedback densities, and the engine honors those differences instead of applying one cadence across the board.
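One way to picture the fading schedule is a small lookup keyed by sub-track. The sketch below is hypothetical throughout: the phase names, the `SUMMARY_EVERY` values, and the function itself are illustrative assumptions, not A'ber's calibrated cadences.

```python
# Hypothetical cadences: during consolidation, give a summary every N trials.
SUMMARY_EVERY = {"respiration": 3, "articulation": 5}


def feedback_for_trial(sub_track, phase, trial_number):
    """Return the feedback type for a trial, or None when feedback is withheld.

    phase: "acquisition" (learning the movement) or "consolidation".
    trial_number: 1-based index of the trial within the session.
    """
    if phase == "acquisition":
        # Rich, real-time feedback on every trial.
        return "knowledge_of_performance"
    if trial_number % SUMMARY_EVERY[sub_track] == 0:
        # Faded summary feedback at the sub-track's own cadence.
        return "knowledge_of_results"
    return None


schedule = [feedback_for_trial("respiration", "consolidation", t) for t in range(1, 7)]
```

Keeping the cadence in a per-sub-track table makes it easy to tune respiration and articulation independently.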
Model development uses the UA-Speech corpus from the University of Illinois at Urbana-Champaign and the TORGO corpus from the University of Toronto, both publicly released for dysarthria research with appropriate consent. UA-Speech is primarily word- and command-level, so it cannot yield metrics such as maximum phonation time or articulation rate; TORGO contributes the sustained-vowel and connected-sentence material that completes the eight-dimension coverage. The two corpora are never pooled within a single metric. The IEEE SLT manuscript covering the full methodology is in preparation.
Related: dysarthria therapy overview, for clinicians.