SSES Robotics Release 8 Documentation
Release 8 modifications are Edge AI improvements focused on automatic speech recognition (ASR). Specifically, an R&D version of a small language model (SLM) has been implemented to identify and correct “sound-alike” word errors in ASR output. The method is independent of the ASR model (e.g. Kaldi, Whisper, or others) and is suitable for the SWaP (size, weight, and power) constraints imposed by very small form factors operating wholly or partially self-contained (i.e. disconnected from the cloud).
Small form-factor, self-contained operation is crucial for robotics use cases that require verbal communication, such as factory floors, first responders, stalled or disabled autonomous vehicles, and drones operating in remote areas.
Edge AI Functional Requirements
Below are Edge AI functional requirements for Release 8 modifications.
Edge AI Robotics Data Flow
In the data flow below, the SLM is inserted as a separate module. A primary objective is to allow independent training of co-processing models, including the ASR model and any other necessary language model. It’s worth noting that DeepSeek uses a similar approach, and we might expect modular architectures to become prevalent in robotics and physical AI applications.
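As a rough illustration of the modular idea, the sketch below chains an ASR stage and a correction stage behind plain callables so that either can be retrained or swapped independently. All names here (make_pipeline, stub_asr, stub_slm_corrector) are hypothetical and are not taken from the Release 8 code. The stub corrector uses a fixed lookup table purely as a stand-in for the SLM.

```python
from typing import Callable, List

# Hypothetical interface: a text stage is any callable mapping text to text.
TextStage = Callable[[str], str]


def make_pipeline(asr: Callable[[bytes], str],
                  stages: List[TextStage]) -> Callable[[bytes], str]:
    """Compose an ASR front end with zero or more text post-processing stages."""
    def run(audio: bytes) -> str:
        text = asr(audio)
        for stage in stages:
            text = stage(text)
        return text
    return run


def stub_asr(audio: bytes) -> str:
    # Stand-in for any ASR model (Kaldi, Whisper, ...); ignores its input
    # and returns a transcript containing sound-alike errors.
    return "turn write at the next isle"


def stub_slm_corrector(text: str) -> str:
    # Stand-in for the SLM (assumption): a fixed lookup, not a trained model.
    fixes = {"write": "right", "isle": "aisle"}
    return " ".join(fixes.get(word, word) for word in text.split())


pipeline = make_pipeline(stub_asr, [stub_slm_corrector])
print(pipeline(b""))  # prints "turn right at the next aisle"
```

Because the pipeline depends only on the callable signatures, replacing the ASR model or the corrector does not require changes to the other stage.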
Robotics Small Form-Factor Hardware Requirements
Below are examples of small server platforms that meet the small form-factor and self-contained operation requirements.
Sound-Alike Word Training
Below is a summary of sound-alike word training used in Release 8 modifications.
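For illustration only, one common way to build training data for a sound-alike corrector is to corrupt clean sentences by swapping in sound-alike words, producing (corrupted, clean) pairs. The sketch below assumes that approach; it is not the Release 8 training procedure. The HOMOPHONES table and the function names are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical sound-alike table; a real system would use a much larger list.
HOMOPHONES: Dict[str, List[str]] = {
    "right": ["write"],
    "aisle": ["isle"],
    "brake": ["break"],
}


def corrupt(sentence: str, rng: random.Random, p: float = 0.5) -> str:
    """Replace each word that has a sound-alike with probability p."""
    out = []
    for word in sentence.split():
        alternatives = HOMOPHONES.get(word)
        if alternatives and rng.random() < p:
            out.append(rng.choice(alternatives))
        else:
            out.append(word)
    return " ".join(out)


def make_pairs(sentences: List[str], seed: int = 0) -> List[Tuple[str, str]]:
    """Return (corrupted, clean) pairs for training a corrector."""
    rng = random.Random(seed)  # seeded for reproducible data sets
    return [(corrupt(s, rng), s) for s in sentences]


pairs = make_pairs(["turn right at the aisle", "apply the brake"])
for corrupted, clean in pairs:
    print(f"{corrupted!r} -> {clean!r}")
```

Each pair keeps the clean sentence as the target, so a model trained on these pairs learns to map ASR-style sound-alike errors back to the intended words.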