OpenMic · FluentPlay
01 / 09
PAD v2 · Speech signal analysis

What the signal tells us

Every word spoken generates a structured record of motor planning stability — at the session, word, syllable, and phoneme level.

Per-phoneme resolution · Adaptive personal baseline · Longitudinal tracking · SLP override
02 / 09
Resolution stack

Four levels of analysis, every utterance

01 Session: PAD score · stability floor · event cost · WPM · convergent-event count · acoustic-event rate · smoothness index · PAD variance (σ) · 3D PAD profile
02 Word: PAD score · gap · duration · syllable rate · confidence · event flags · challenge tag · POA · chunk position
03 Syllable: duration vs. speaker median · stress position · prolongation detection · voicing onset timing
04 Phoneme: per-phoneme duration · intra-phoneme repetition · acoustic onset · coda analysis · IPA mapping

Most speech tools report a single score. OpenMic reports a structured acoustic record at four resolution levels — any of which can be the target of measurement, practice, or research.
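
One way to picture the four-level record is as nested data, sketched below. This is an illustration only, not OpenMic's actual schema: the class and field names are hypothetical, and computing a session figure as the plain mean of word scores is an assumption.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Set


@dataclass
class Phoneme:
    symbol: str                # e.g. "B", "IH" (hypothetical field names throughout)
    duration_ms: float
    voicing_onsets: int = 1    # more than 1 would suggest intra-phoneme repetition


@dataclass
class Syllable:
    phonemes: List[Phoneme] = field(default_factory=list)
    stressed: bool = False

    @property
    def duration_ms(self) -> float:
        # Syllable duration is derived from its phonemes.
        return sum(p.duration_ms for p in self.phonemes)


@dataclass
class Word:
    text: str
    pad: float
    syllables: List[Syllable] = field(default_factory=list)
    event_flags: Set[str] = field(default_factory=set)


@dataclass
class Session:
    words: List[Word] = field(default_factory=list)

    def mean_pad(self) -> float:
        # Assumption: a simple average of word-level PAD scores.
        return mean(w.pad for w in self.words)
```

Because each level contains the next, a query can stop at whichever resolution is the target: a session average, a single word's flags, or one phoneme's duration.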

03 / 09
Session-level view

The full picture of a session's motor stability

PAD over session · per word
Disfluency feature stream · frame-level acoustic state
betty butter bought some butter but she said butter's bitter if I put it in my batter it will make my batter bitter but a bit of better butter will make my batter better
3D PAD Profile · session result
Stability 81 · Stable-onsets 100 · Mean PAD 84 · WPM 45 · Smoothness 76 · Voiced 91
Session PAD: 84
Floor: 93
Convergent events: 5
Acoustic events: 7
Challenge word hits: 0
Top POA: Bilabial (5)
Words: 35
Duration: 17s
04 / 09
Word-level analysis

Every word is a scored motor event

"bitter" PAD 68 · 2 syl · 495ms/syl
Base score: 100
Duration 495ms/syl (baseline 265ms): Penalty
Repetition detected: Penalty
Recognition insertion → part-word: Penalty
Final PAD: 68
Flagged: Repeated onset · convergent
Place of articulation: Bilabial
  • Inter-word gap — silence before word onset, compared to that speaker's rolling median
  • Syllable rate — duration per syllable vs. speaker baseline; prolongation fires at 1.8×
  • Event flags — delayed-onset, extended-voicing, repeated-onset, filler, articulation-deviation, omission — scored independently
  • Convergence rule — fires on multiple independent signals, not a single threshold; engineered for low false-flag rate
  • Challenge word tag — feared words scored at elevated sensitivity; any flag = convergent-event classification
  • POA mapping — onset phoneme cross-referenced against user's declared motor difficulty zones
  • Dual-source detection — recognition layer word timing + DFS acoustic stream; each catches what the other misses
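
The convergence rule and the challenge-word rule in the list above can be sketched as a single predicate. This is a minimal illustration under stated assumptions, not the product's implementation: the function name and the string spellings of the flags are hypothetical, and only the rule itself (two or more independent event types, or any flag on a challenge word) comes from the slides.

```python
from typing import Iterable

# The six event types named on the slides; the string spellings are an assumption.
EVENT_TYPES = {
    "delayed-onset",
    "extended-voicing",
    "repeated-onset",
    "filler",
    "articulation-deviation",
    "omission",
}


def is_convergent(flags: Iterable[str], is_challenge_word: bool = False) -> bool:
    """Return True when a word's flags amount to a convergent event."""
    active = set(flags) & EVENT_TYPES   # ignore unknown labels, count each type once
    if is_challenge_word:
        return len(active) >= 1         # any flag on a challenge word is enough
    return len(active) >= 2             # otherwise require two independent types
```

Counting distinct types rather than raw detections means a single noisy detector that fires twice cannot trigger the classification by itself.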
05 / 09
Phoneme-level resolution

The instability inside the word

The acoustic signal resolves to individual phonemes — duration, voicing onset, and intra-phoneme repetition. A word can score 99% recognition confidence and still contain a detectable motor planning failure at the phoneme level.

Phonemes heard — "bitter"
B 419ms · IH · T · ER · ER repeat
⚠ Intra-phoneme repetition · Coda B · 419ms · 1.8× word mean
Duration bar · phoneme timeline

What phoneme-level data unlocks

  • Intra-phoneme repetition — two voicing onsets within a single phoneme window; invisible to word-level scoring alone
  • Coda vs. onset asymmetry — where in the syllable the instability fires; critical for POA-targeted practice
  • Duration outliers per phoneme — which specific sound is held, not just that the word was slow
  • IPA mapping — phoneme identity cross-referenced against declared POA difficulty zones
  • Research relevance — per-phoneme duration and voicing onset data matches the variables measured in auditory feedback perturbation studies — now available in naturalistic speech, continuously
Why this matters for stuttering: Most stuttering occurs at word-initial consonants, particularly bilabials and velars. Phoneme-level resolution tells you exactly which sound is failing, not just which word.
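
Intra-phoneme repetition, defined above as two voicing onsets within a single phoneme window, can be illustrated with a small sketch. It assumes the acoustic stream has already been reduced to one voiced/unvoiced decision per frame for the window; that input format and both function names are hypothetical.

```python
from typing import Sequence


def count_voicing_onsets(voiced_frames: Sequence[bool]) -> int:
    """Count unvoiced-to-voiced transitions in a phoneme's frame window."""
    onsets = 0
    previously_voiced = False   # treat the start of the window as unvoiced
    for voiced in voiced_frames:
        if voiced and not previously_voiced:
            onsets += 1
        previously_voiced = voiced
    return onsets


def has_intra_phoneme_repetition(voiced_frames: Sequence[bool]) -> bool:
    # Two or more onsets inside one phoneme window suggests a repeated attempt.
    return count_voicing_onsets(voiced_frames) >= 2
```

A real detector would likely need a minimum gap or smoothing step so that a single dropped frame is not counted as a second onset; that refinement is left out here.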
06 / 09
Personalized baseline

Scoring is relative to the speaker, not a norm

Adaptive baseline

Rolling 30-word window

Gap, duration, and rate deviations are scored against that speaker's own rolling median. A naturally slow speaker and a fast speaker can both score 100.
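
A rolling median over the last 30 words can be sketched as follows. The class and method names are hypothetical, and the choice to report a ratio against the median is an assumption made for illustration; only the 30-word window and the use of the speaker's own median come from the slide.

```python
from collections import deque
from statistics import median
from typing import Optional


class RollingBaseline:
    """Tracks one speaker's recent values (e.g. ms per syllable) per word."""

    def __init__(self, window: int = 30):
        # deque with maxlen discards the oldest value once the window is full.
        self._values = deque(maxlen=window)

    def update(self, value: float) -> None:
        self._values.append(value)

    def baseline(self) -> Optional[float]:
        # No baseline exists until at least one word has been observed.
        return median(self._values) if self._values else None

    def ratio(self, value: float) -> Optional[float]:
        """How many times the speaker's own median this value is."""
        base = self.baseline()
        if not base:
            return None
        return value / base
```

With a baseline of 265ms per syllable, a word at 495ms per syllable yields a ratio of about 1.87. A slow speaker and a fast speaker who each produce a word at their own median both get a ratio of 1.0, which is why both can score 100.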

Challenge words

Feared word sensitivity

User-declared feared words scored at elevated sensitivity. Any event flag on a challenge word is classified as a convergent event — surfacing anticipatory motor load directly.

birthday · butter · business
Motor difficulty zones

POA mapping

User declares which places of articulation cause motor difficulty. Any word whose onset phoneme uses a declared POA is scored with elevated sensitivity — independent of challenge word tags.
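
The two sensitivity triggers (declared challenge words and declared places of articulation) are independent, which a short sketch can make concrete. The lookup table below covers only a handful of consonants, uses ARPAbet-style symbols as an assumption, and the function name is hypothetical.

```python
from typing import Optional, Set

# Partial onset-phoneme to place-of-articulation table (illustrative, not complete).
POA_BY_ONSET = {
    "P": "bilabial", "B": "bilabial", "M": "bilabial",
    "F": "labiodental", "V": "labiodental",
    "T": "alveolar", "D": "alveolar", "S": "alveolar", "Z": "alveolar", "N": "alveolar",
    "K": "velar", "G": "velar",
}


def poa_of(onset_phoneme: str) -> Optional[str]:
    return POA_BY_ONSET.get(onset_phoneme.upper())


def elevated_sensitivity(
    word: str,
    onset_phoneme: str,
    challenge_words: Set[str],
    declared_poas: Set[str],
) -> bool:
    """True if either the challenge-word tag or the POA zone applies."""
    is_challenge = word.lower() in challenge_words
    poa = poa_of(onset_phoneme)
    in_declared_zone = poa is not None and poa in declared_poas
    return is_challenge or in_declared_zone
```

For a speaker who declares bilabials as a difficulty zone, "bitter" would be scored with elevated sensitivity even though it is not on their challenge word list.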

In practice: The PAD score for a given word reflects that speaker's motor planning reality — accounting for their typical rate, feared vocabulary, and known articulatory challenge zones. Progress is measured against the speaker's own history, longitudinally, across sessions.

07 / 09
Waveform inspection layer

Any frame, any moment: isolate and listen

Audio waveform · word region (start and end handles)
Isolated 159ms · drag handles to adjust · click to start new isolation
Markers: recognition layer · acoustic onset
Repeated onset (recognition insertion)

Intra-phoneme acoustic detection: Phoneme F contains 2 voicing onsets in its 170ms audio window — the recognition layer assigned a single phoneme, but the acoustic signal shows 2 separate productions (F-F).

⚠ Also classified as stutter
Convergence rule(s) triggered · legacy classification

What this layer enables

  • Frame-level isolation — drag handles to select any window within a word's audio region; inspect exactly the moment of instability
  • Recognition vs. acoustic comparison — two markers on the waveform show where the recognition layer fired vs. where the acoustic onset actually occurred; gap between them is measurable
  • Intra-phoneme detection — when the acoustic signal shows two productions inside a window the recognition layer classified as one phoneme, the discrepancy is flagged and the region is isolatable
  • Playback modes — play the isolated selection, play the full canonical word, or play the word as produced; compare what was intended vs. what was delivered
  • Self-analysis — speaker can hear the exact frame where the motor plan broke down; not a score but a direct auditory confrontation with the event
  • Clinical use — SLP can isolate, annotate, and use the waveform as a teaching surface; ground truth override applies to the same word
"for" · PAD 70 · 1 syl · 410ms/syl
A single-syllable function word at 410ms, 1.55× the speaker's baseline, with 2 voicing onsets inside the F phoneme window. The recognition layer heard one phoneme. The acoustic signal recorded two attempts.
08 / 09
Acoustic event taxonomy + SLP override layer

Six event types. Auto-detected. SLP-correctable.

Delayed onset

Building state >400ms without voicing onset. (SLP label: block.)

Extended voicing

Duration >1.8× speaker median per syllable. (SLP label: prolongation.)

Repeated onset

Word or part-word; intra-phoneme detection. (SLP label: repetition.)

Filler

Planning load signal, not motor disruption.

Articulation deviation

Phoneme substitution or distortion. Relevant for motor speech disorders.

Omission

Word skipped vs. scripted reference. Completeness signal.

Convergent-event flag fires only on two or more event types, or any flag on a challenge word. The SLP applies the clinical label via override.
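
The two timing-based event types in the taxonomy above carry explicit thresholds, so they can be sketched directly. The function name and its parameters are hypothetical, and treating "building state without voicing onset" as a single duration in milliseconds is a simplifying assumption.

```python
from typing import Set

DELAYED_ONSET_MS = 400.0        # from the slide: building state >400ms
EXTENDED_VOICING_RATIO = 1.8    # from the slide: >1.8x speaker median per syllable

# SLP-facing labels for the auto-detected event names, as listed on the slide.
SLP_LABEL = {
    "delayed-onset": "block",
    "extended-voicing": "prolongation",
    "repeated-onset": "repetition",
}


def timing_flags(
    pre_voicing_ms: float,
    ms_per_syllable: float,
    median_ms_per_syllable: float,
) -> Set[str]:
    """Return the timing-based event flags raised for one word."""
    flags = set()
    if pre_voicing_ms > DELAYED_ONSET_MS:
        flags.add("delayed-onset")
    if ms_per_syllable > EXTENDED_VOICING_RATIO * median_ms_per_syllable:
        flags.add("extended-voicing")
    return flags
```

Against a 265ms baseline the extended-voicing cutoff is 477ms per syllable, so a word at 495ms per syllable is flagged while a word at 410ms per syllable is not.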

Ground truth override
✓ Mark Fluent · Stutter · Block · Prolongation · Repetition · Filler · Articulation · Omission

Override propagates immediately through all counters, filters, and session metrics. Every data point is editable — auto-detection is a starting point, not a verdict.
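
One simple way to get immediate propagation is to derive every counter from the word records on demand, with the override taking precedence over the automatic label. The sketch below assumes that design; the record fields and function name are hypothetical and do not describe how OpenMic stores its data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class WordRecord:
    text: str
    auto_label: str                  # what auto-detection assigned, e.g. "repetition"
    override: Optional[str] = None   # set by the SLP; None means no override

    @property
    def label(self) -> str:
        # The SLP's ground-truth label wins whenever one exists.
        return self.override if self.override is not None else self.auto_label


def label_counts(words: Iterable[WordRecord]) -> Counter:
    """Recompute per-label counts from the current effective labels."""
    return Counter(w.label for w in words)
```

Because nothing is cached, setting `override` on one record changes the result of the next `label_counts` call, and the original `auto_label` stays available for later comparison.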

Roadmap: speaker model training. SLP override assignments accumulate into a speaker-specific recognition model, improving accuracy on that speaker's idiolect and reducing the false-flag rate over time.
09 / 09
What can be practiced and measured

Every dimension is a target for improvement

Signal
  • Session PAD score
  • Stability floor
  • PAD variance (σ)
  • Stable-onset rate
  • Convergent-event frequency
  • Event rate by type
Timing
  • Words per minute
  • Syllable rate (ms/syl)
  • Inter-word gap distribution
  • Prolongation frequency
  • Block duration
  • Phoneme duration outliers
Context
  • Challenge word hit rate
  • POA-specific event rate
  • Stressed vs. unstressed instability
  • Session-over-session PAD trend
  • Scripted vs. unscripted comparison
  • Warm-up vs. fatigue window analysis

The instrument doesn't prescribe what to practice. It exposes every dimension of the motor planning signal — and tracks progress on whichever dimensions the speaker, clinician, or researcher chooses to target.