Speech Recognition Grammars

Does anyone have any experience using speech recognition systems that take grammars specified using W3C’s SRGS format?

More specifically, does anyone know of any openly available grammars that I can get my hands on? I don’t really care what the domain is. I’d just like to get a fairly complex grammar that I can use for testing purposes.


No, but I did some experimentation with Nuance and VXML around 10 years ago for telephony and NGN apps, so when you do start playing around, I’d be interested in hearing how you get on. Even with the tech available at the time, we did get a basic mail-reader done purely through voice, but it was pretty painful at times. In general I felt DTMF was a much better choice for a telephony interface. We had to be very careful to make all the grammar choices clearly distinct from one another at every level, and even then there were lots of recognition errors.

For fun (knowing it wouldn’t work) I implemented a letters-of-the-alphabet voice-keyboard mode for spelling out words, and since so many of the letters sound so similar, you can imagine how badly that turned out. “A B C” --> “H V Z”…

Obviously technology has moved on since then, and there’s lots more CPU around, but even so, when you have random users muttering random things into a voice UI, I’m sure you still have to accept a sadly large number of failed or mistaken parses.

Ya, the tech has developed a huge amount in the past 10 years.

To give you an idea, I threw together a little system that actually works decently, in a few hours, using Microsoft’s SAPI. Their API is pretty cool, and it’s freely available. If you have Windows Vista or 7, you already have it.

They actually have a feature built in that specifically handles letter-by-letter dictation, of the type you describe.

The tough stuff comes when you’re dealing with really complex grammars, and you need to semantically interpret what was said (rather than just take dictation). This is why I’m looking for examples… I’m pretty sure they exist for some domains, but for some reason I’m not having much luck finding them.
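For anyone who hasn’t seen the format, here is a rough sketch of the kind of thing I mean: a tiny SRGS XML grammar with semantic tags, plus a few lines of Python that only check it is well-formed and that every local rule reference resolves. The drink-ordering domain, the rule names and the check_grammar helper are all made up for illustration, and the tag contents assume the W3C semantics/1.0 tag format. A real test grammar would be far larger, and this does not run any recognition.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"

# Hypothetical toy grammar: "I would like a large coffee", "small tea", etc.
# The <tag> elements assign semantic values (assumes tag-format semantics/1.0).
GRAMMAR = """
<grammar version="1.0" xml:lang="en-US" root="order"
         xmlns="http://www.w3.org/2001/06/grammar"
         tag-format="semantics/1.0">
  <rule id="order" scope="public">
    <item repeat="0-1">I would like</item>
    <item repeat="0-1">a</item>
    <ruleref uri="#size"/>
    <ruleref uri="#drink"/>
    <tag>out.size = rules.size; out.drink = rules.drink;</tag>
  </rule>
  <rule id="size">
    <one-of>
      <item>small <tag>out = "S";</tag></item>
      <item>medium <tag>out = "M";</tag></item>
      <item>large <tag>out = "L";</tag></item>
    </one-of>
  </rule>
  <rule id="drink">
    <one-of>
      <item>coffee <tag>out = "coffee";</tag></item>
      <item>tea <tag>out = "tea";</tag></item>
    </one-of>
  </rule>
</grammar>
"""


def check_grammar(xml_text):
    """Parse the grammar and report rule ids plus any unresolved local refs."""
    root = ET.fromstring(xml_text.strip())  # raises ParseError if malformed
    rule_ids = [r.get("id") for r in root.iter(f"{{{SRGS_NS}}}rule")]
    refs = [r.get("uri", "") for r in root.iter(f"{{{SRGS_NS}}}ruleref")]
    # Only local references ("#name") can be checked against this file.
    dangling = [u for u in refs if u.startswith("#") and u[1:] not in rule_ids]
    return rule_ids, dangling


rule_ids, dangling = check_grammar(GRAMMAR)
print("rules:", rule_ids)
print("dangling references:", dangling)
```

The semantic part is what gets hard at scale: each rule has to hand back a value that the parent rule can combine, and that is exactly where a large ready-made grammar would be useful for testing.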

If you want to play around with it for free, I recommend the .NET implementation of MS’s speech API. It’s pretty easy to use, and powerful.

Stack Overflow could probably give you better advice, FYI.