speechtype

Typography that follows your voice.

TypeScript · Zero dependencies · React + Vanilla JS

When text is read aloud via speech synthesis, there is no visual indicator of which word is being spoken — the connection between audio and typography is severed. speechType wraps words in spans and listens to Web Speech API boundary events, applying typographic emphasis to each word as it is spoken.

Live demo — press Speak or step with the Word slider

Controls: Word · Active weight 700 · Inactive opacity 0.45 · Rate 0.9× · Transition 80 ms

Every word spoken carries its own weight — a breath of meaning that arrives, lingers, and fades. To read aloud is to make language visible in time, each syllable a footprint left in the air. The voice knows where it is. The page does not.

Press Speak to hear the paragraph read aloud; each word is emphasised as it is spoken. Use the Word slider to step through manually. This is useful for accessibility tools, language learning, teleprompters, and any interface where voice and text need to stay connected.

How it works

The audio-text gap

When the browser speaks text via SpeechSynthesisUtterance, words are heard but not seen. There is no CSS property that tracks synthesis progress — the visual and the auditory are decoupled by design. speechType closes that gap.

Boundary events drive emphasis

The Web Speech API fires boundary events when speech reaches a word or sentence boundary. Each event reports a character index (charIndex) into the utterance text. speechType maps that index to the correct word span and applies typographic emphasis (wider tracking, heavier weight, larger optical size) while dimming all other words.
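
The index-to-word mapping can be sketched as follows. This is a minimal illustration, not the library's code: indexWords and wordIndexAt are hypothetical names, and it assumes words are split on whitespace in the same way the spans were created.

```typescript
// Hypothetical helpers for illustration; not part of the speechType API.
interface WordRange {
  start: number // index of the word's first character in the utterance text
  end: number   // index one past the word's last character
}

// Record where each whitespace-separated word starts and ends.
function indexWords(text: string): WordRange[] {
  const ranges: WordRange[] = []
  const pattern = /\S+/g
  let match: RegExpExecArray | null
  while ((match = pattern.exec(text)) !== null) {
    ranges.push({ start: match.index, end: match.index + match[0].length })
  }
  return ranges
}

// Binary search for the last word that starts at or before charIndex.
// Returns -1 when charIndex falls before the first word.
function wordIndexAt(ranges: WordRange[], charIndex: number): number {
  let lo = 0
  let hi = ranges.length - 1
  let found = -1
  while (lo <= hi) {
    const mid = (lo + hi) >> 1
    if (ranges[mid].start <= charIndex) {
      found = mid
      lo = mid + 1
    } else {
      hi = mid - 1
    }
  }
  return found
}
```

In a browser, a boundary handler on a SpeechSynthesisUtterance could check that event.name is 'word' and then pass event.charIndex to wordIndexAt to pick the span to emphasise.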

Imperative for performance

During active speech, emphasis is applied imperatively via applySpeechType — bypassing React state and re-renders entirely. The boundary event fires, the span style changes, and the frame is painted. No scheduling, no batching delay.
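
To show what imperative emphasis might look like, here is a rough sketch that writes styles straight onto the word spans. The signature of applySpeechType is not documented here, so emphasizeWord is a hypothetical stand-in; the option names and defaults are taken from the Options table, and the exact CSS properties the library sets are an assumption.

```typescript
// Hypothetical stand-in for illustration; not the library's applySpeechType.
interface EmphasisOptions {
  activeTracking: number  // em
  activeWeight: number    // wght axis
  activeOpsz: number      // opsz axis
  inactiveOpacity: number
  transitionMs: number
}

// Structural type so the sketch also runs outside a browser.
// A DOM element's style object is expected to satisfy it.
interface Styleable {
  style: {
    opacity: string
    letterSpacing: string
    fontVariationSettings: string
    transition: string
  }
}

const emphasisDefaults: EmphasisOptions = {
  activeTracking: 0.06,
  activeWeight: 700,
  activeOpsz: 24,
  inactiveOpacity: 0.45,
  transitionMs: 80,
}

// Emphasise the span at activeIndex and dim the rest, with no framework state involved.
function emphasizeWord(
  spans: Styleable[],
  activeIndex: number,
  options: Partial<EmphasisOptions> = {},
): void {
  const o = { ...emphasisDefaults, ...options }
  const ms = `${o.transitionMs}ms`
  spans.forEach((span, i) => {
    const active = i === activeIndex
    span.style.transition =
      `opacity ${ms}, letter-spacing ${ms}, font-variation-settings ${ms}`
    span.style.opacity = active ? '1' : String(o.inactiveOpacity)
    span.style.letterSpacing = active ? `${o.activeTracking}em` : ''
    span.style.fontVariationSettings = active
      ? `"wght" ${o.activeWeight}, "opsz" ${o.activeOpsz}`
      : ''
  })
}
```

Because the function only touches inline styles, calling it from a boundary handler changes the spans without triggering a React render.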

React or vanilla JS

SpeechTypeText and useSpeechType manage span preparation and React-driven emphasis. startSpeechType is the vanilla JS entry point — it wraps, speaks, and syncs, returning a stop function.

Usage

TypeScript + React · Vanilla JS

Drop-in component

import { SpeechTypeText } from '@liiift-studio/speechtype'

<SpeechTypeText activeWordIndex={activeWordIndex}>
  The voice knows where it is. The page does not.
</SpeechTypeText>

Start speech synthesis and sync emphasis

import { startSpeechType } from '@liiift-studio/speechtype'

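// querySelector can return null, so guard before passing the element on.
const el = document.querySelector<HTMLElement>('p')
if (!el) throw new Error('No <p> element found')
const stop = startSpeechType(el, { rate: 0.9, activeWeight: 700 })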

// Later — cancel speech and restore styles:
// stop()
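
Since startSpeechType returns a stop function, a Speak/Stop button can be built around it. The sketch below is one possible approach: createSpeechToggle is a hypothetical helper, and it relies only on the "start returns stop" shape described above.

```typescript
// Hypothetical helper for illustration; not part of the speechType API.
// `start` is any function that begins speaking and returns a stop function.
function createSpeechToggle(start: () => () => void): () => boolean {
  let stop: (() => void) | null = null

  // Returns true when speech was started, false when it was stopped.
  return function toggle(): boolean {
    if (stop) {
      stop()
      stop = null
      return false
    }
    stop = start()
    return true
  }
}
```

A click handler could then be wired up as createSpeechToggle(() => startSpeechType(el, { rate: 0.9 })). Note that this sketch does not know when speech finishes on its own; whether the library exposes an end callback is not stated here, so after a natural finish the first click would call stop before a second click restarts speech.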

Hook — manual control with your own active index

import { useSpeechType } from '@liiift-studio/speechtype'
import { useRef } from 'react'

// Inside a React component:
const ref = useRef<HTMLParagraphElement>(null)
useSpeechType(ref, activeWordIndex, { inactiveOpacity: 0.4 })
return <p ref={ref}>Every word spoken carries its own weight.</p>

Options

Option           Default  Description
activeTracking   0.06     Letter-spacing on the active word, in em.
activeWeight     700      wght axis value on the active word.
activeOpsz       24       opsz axis value on the active word.
inactiveOpacity  0.45     Opacity of words not currently spoken.
transitionMs     80       CSS transition duration in ms for style changes.
rate             0.9      Speech rate (0.1–10). Only applies when using startSpeechType.
pitch            1        Speech pitch (0–2).
volume           1        Speech volume (0–1).
as               'p'      HTML element to render. (SpeechTypeText only)
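
The speech options have documented ranges, so it can be worth normalising user input before passing it on. The sketch below merges partial options with the defaults listed above and clamps each value to its range. resolveSpeechOptions is a hypothetical helper, and whether the library clamps values itself is not stated here.

```typescript
// Hypothetical helper for illustration; not part of the speechType API.
interface SpeechOptions {
  rate: number   // 0.1 to 10
  pitch: number  // 0 to 2
  volume: number // 0 to 1
}

// Defaults as listed in the options above.
const speechDefaults: SpeechOptions = { rate: 0.9, pitch: 1, volume: 1 }

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value))
}

// Fill in missing values from the defaults, then clamp to the documented ranges.
function resolveSpeechOptions(options: Partial<SpeechOptions> = {}): SpeechOptions {
  return {
    rate: clamp(options.rate ?? speechDefaults.rate, 0.1, 10),
    pitch: clamp(options.pitch ?? speechDefaults.pitch, 0, 2),
    volume: clamp(options.volume ?? speechDefaults.volume, 0, 1),
  }
}
```

For example, resolveSpeechOptions({ rate: 20 }) yields a rate of 10 with pitch and volume left at their defaults.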