hama v1.5.0: phoneme-to-grapheme, a SIMD engine, and fp16 weights

Version 1.5.0 adds a third modality to hama. Alongside grapheme-to-phoneme (G2P) and phoneme ASR, there is now phoneme-to-grapheme (P2G): give it a sequence of phonemes and it writes text. The release also makes the WebAssembly engine roughly 4x faster at P2G decode, halves the size of the new model by storing its weights as float16, and ships a live speech-to-text demo that runs end to end in your browser.

New P2G modality: P2GModel in Python and P2GNodeModel / P2GBrowserModel in TypeScript (hama-js/p2g and hama-js/p2g/browser). It is a decoder-only PrefixLM character transformer — effectively the inverse of G2P — reimplemented from scratch in the Zig engine with a KV-cached greedy decoder.
No ONNX for P2G: its weights are converted straight from the PyTorch checkpoint to a .hama package, and the engine forward pass is validated stage-by-stage against PyTorch, reproducing the reference token ids and text exactly on a committed golden corpus.
Faster engine: the projection and matmul kernels are now hand-vectorized with explicit SIMD, and the WASM build enables simd128. P2G decode is roughly 4x faster in the browser and Node, while the engine binary stays tiny (~39 KB). G2P and ASR run on the same vectorized kernels and speed up as well, with no change to any output.
Smaller weights: the P2G model ships as float16, halving it from 29 MB to 14.6 MB (the Python wheel drops from ~41 MB to ~23 MB). The engine upcasts to float32 at load, so golden parity is exact. G2P and ASR have shipped as fp16 since v1.4.0.
Public APIs are unchanged — existing G2PModel / ASRModel and G2PNodeModel / ASRNodeModel code keeps working. Python hama and TypeScript hama-js are aligned on version 1.5.0.
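
To make the float16 point above concrete, here is a minimal sketch in plain Python (standard library only) of why storing weights as fp16 halves their size and why upcasting them to float32 at load adds no further error. It does not use hama or the .hama format; the weight values and the helper names to_fp16_bytes and fp16_to_fp32 are made up for illustration.

```python
import struct

def to_fp16_bytes(x: float) -> bytes:
    """Round x to the nearest IEEE 754 half-precision value (2 bytes)."""
    return struct.pack("<e", x)

def fp16_to_fp32(b: bytes) -> float:
    """Upcast one stored half to single precision (hypothetical load step)."""
    half = struct.unpack("<e", b)[0]
    # Pass the value through a 4-byte float to show it survives unchanged.
    return struct.unpack("<f", struct.pack("<f", half))[0]

# Made-up weights, not taken from any hama model.
weights = [0.1, -1.5, 3.14159, 1e-3, 0.0]

fp32_blob = b"".join(struct.pack("<f", w) for w in weights)  # 4 bytes each
fp16_blob = b"".join(to_fp16_bytes(w) for w in weights)      # 2 bytes each
size_ratio = len(fp16_blob) / len(fp32_blob)

# Upcast every stored half, then round back to fp16 and compare bytes.
halves = [fp16_blob[i:i + 2] for i in range(0, len(fp16_blob), 2)]
upcast = [fp16_to_fp32(h) for h in halves]
round_trip = b"".join(to_fp16_bytes(v) for v in upcast)
lossless = round_trip == fp16_blob

# Rounding happens once, when a weight is stored as fp16.
store_error = abs(upcast[0] - weights[0])

print(size_ratio, lossless, store_error > 0)
```

Every float16 value is exactly representable in float32, so the upcast step recovers the stored weights bit for bit; the only rounding is the one-time conversion when the weights are written as fp16.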

You can try the whole pipeline yourself: the new speech-to-text demo records your microphone, runs phoneme ASR, then P2G — entirely on-device, no servers. The text-to-phonemes demo on the home page shows the G2P direction.