com.iristick.smartglass.core

Interface VoiceGrammar

All Known Subinterfaces:
VoiceCommandDispatcher

public interface VoiceGrammar
This interface represents a grammar for voice commands.

A grammar is a description of the speech pattern(s) that the voice recognition engine should detect. A pattern consists of a single token or a group of patterns.

A grammar is constructed with a builder. Builder objects are obtained from VoiceGrammar.Builder.create().

Tokens

Tokens form the basis of speech. A token is an indivisible unit of speech, typically consisting of one or more words separated by spaces. Special characters, such as punctuation, are ignored.

Groups

Tokens and patterns may be grouped together to form more complex patterns. There are two kinds of groups:
  • Patterns in a sequential group must be said one after the other in order of appearance for the entire group to be recognized.
  • Patterns in an alternative group provide options.
    Exactly one of the members of the group must be said for the entire group to be recognized.

Groups may be arbitrarily nested to form more complex patterns of speech.

At creation, each grammar has an implicit sequential root group.

Multiplicity

Every group has a multiplicity, which defines how many times in a row the group must be said. It is expressed as an inclusive lower and upper bound. The lower bound is always finite; the upper bound may be infinite (see REPEAT_UNBOUNDED).
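As a rough illustration of how groups and multiplicity interact, the following stand-alone Java sketch models patterns as small matchers. It is not the recognition engine's implementation and uses none of the SDK classes: PatternModel, token, group, accepts, and the UNBOUNDED sentinel are hypothetical names, and case-insensitive word matching is a simplifying assumption. Punctuation handling is not modeled.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-alone model of grammar patterns; not part of the SDK. */
final class PatternModel {
    /** Stand-in for an infinite upper bound (an assumption of this model). */
    static final int UNBOUNDED = -1;

    interface Pattern {
        /** Returns every word index at which a match starting at 'start' can end. */
        Set<Integer> ends(String[] words, int start);
    }

    /** A token: one or more words that must be said together. */
    static Pattern token(String text) {
        String[] parts = text.trim().split("\\s+");
        return (words, start) -> {
            Set<Integer> out = new TreeSet<>();
            if (start + parts.length > words.length) return out;
            for (int i = 0; i < parts.length; i++) {
                if (!parts[i].equalsIgnoreCase(words[start + i])) return out;
            }
            out.add(start + parts.length);
            return out;
        };
    }

    /** A sequential or alternative group repeated between min and max times. */
    static Pattern group(boolean sequential, int min, int max, Pattern... members) {
        return (words, start) -> {
            // Repetitions that consume no words never reach new positions, so
            // an unbounded group needs at most max(min, words.length) rounds.
            int limit = (max == UNBOUNDED) ? Math.max(min, words.length) : max;
            Set<Integer> results = new TreeSet<>();
            Set<Integer> frontier = new TreeSet<>();
            frontier.add(start);
            if (min == 0) results.add(start); // zero repetitions are allowed
            for (int rep = 1; rep <= limit && !frontier.isEmpty(); rep++) {
                Set<Integer> next = new TreeSet<>();
                for (int pos : frontier) next.addAll(once(sequential, members, words, pos));
                if (rep >= min) results.addAll(next);
                frontier = next;
            }
            return results;
        };
    }

    /** Matches the members of a group exactly once, starting at 'pos'. */
    private static Set<Integer> once(boolean sequential, Pattern[] members,
                                     String[] words, int pos) {
        Set<Integer> current = new TreeSet<>();
        if (!sequential) {
            // Alternative group: any single member may match.
            for (Pattern m : members) current.addAll(m.ends(words, pos));
            return current;
        }
        // Sequential group: every member must match, in order.
        current.add(pos);
        for (Pattern m : members) {
            Set<Integer> next = new TreeSet<>();
            for (int p : current) next.addAll(m.ends(words, p));
            current = next;
        }
        return current;
    }

    /** True if the whole sentence is matched by the root pattern. */
    static boolean accepts(Pattern root, String sentence) {
        String[] words = sentence.trim().split("\\s+");
        return root.ends(words, 0).contains(words.length);
    }
}
```

In this model, group(false, 2, 3, token("next")) accepts "next next" and "next next next", but rejects a single "next" and four in a row.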

Listening for voice commands

When building the grammar, a callback must be added to the grammar with VoiceGrammar.Builder.setCallback(VoiceEvent.Callback, Handler). This callback is invoked for every recognized voice command defined by the grammar.

A grammar may be activated for listening with Headset.startVoice(VoiceGrammar) and deactivated with Headset.stopVoice(VoiceGrammar).

Locales

Tokens are assumed to be in the currently configured locale, or in English if that locale is not supported. This holds true even when the locale is changed while the grammar is activated for listening. The engine simply reinterprets the grammar according to the new locale. If this is not desired, a new grammar must be constructed.

Additional remarks

  • For best results, it is recommended to use voice commands consisting of at least two or three syllables. Commands that are prefixes of other commands from a different grammar should also be avoided.
  • To preserve resources, it is recommended to create as few grammars as possible and to group voice commands together if they do not need to be disabled separately.
  • Large grammars may be expensive to construct. It is recommended to reuse grammar objects throughout the lifetime of the app.
  • This object is not thread-safe.
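
The advice about prefixes can be checked mechanically before grammars are built. The sketch below is a hypothetical helper, not part of the SDK. It assumes that "prefix" means a word-wise prefix and that each grammar's commands are available as plain strings; CommandPrefixCheck, conflicts, and isWordPrefix are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for spotting prefix clashes between two grammars' commands. */
final class CommandPrefixCheck {
    /** Returns "a -> b" entries where command a is a word-wise prefix of command b. */
    static List<String> conflicts(List<String> first, List<String> second) {
        List<String> found = new ArrayList<>();
        for (String a : first) {
            for (String b : second) {
                if (isWordPrefix(a, b)) found.add(a + " -> " + b);
                if (isWordPrefix(b, a)) found.add(b + " -> " + a);
            }
        }
        return found;
    }

    /** True if 'shorter' consists of the leading words of 'longer' and is strictly shorter. */
    static boolean isWordPrefix(String shorter, String longer) {
        String[] s = shorter.trim().toLowerCase(Locale.ROOT).split("\\s+");
        String[] l = longer.trim().toLowerCase(Locale.ROOT).split("\\s+");
        if (s.length >= l.length) return false;
        return Arrays.equals(s, Arrays.copyOf(l, s.length));
    }
}
```

For example, comparing a grammar containing "take picture" with another containing "take picture now" reports one clash, while "stop" and "take picture" report none.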

Construction example

A grammar is constructed by adding tokens and groups onto a stack. The following series of operations constructs a grammar that describes a common greeting:

 
 VoiceGrammar.Builder builder = VoiceGrammar.Builder.create();
 VoiceGrammar grammar = builder
     .addToken("Good")
     .addToken("morning")
     .setCallback(myCallback, null)
     .build();
 
 
As a final step in the construction, a callback is set and the grammar is built, after which it is ready for use. The builder should now be discarded. It is not possible to create multiple grammar objects from the same builder instance.

Groups allow creating more complex speech patterns. A new group is opened by calling one of the push*Group methods on the grammar builder (pushAlternativeGroup or pushSequentialGroup) and closed by calling popGroup. Whenever a token is added or a group is pushed, it is added to the top-most group on the stack.

Initially, the builder has a single sequential group on the stack that may not be popped off the stack. Building upon the previous example, a grammar may be constructed to describe several greetings for different times of the day. Indentation is used to visualize the grouping.

 
 VoiceGrammar.Builder builder = VoiceGrammar.Builder.create();
 VoiceGrammar grammar = builder
     .addToken("Good")
     .pushAlternativeGroup()
       .addToken("morning")
       .addToken("afternoon")
       .addToken("evening")
     .popGroup()
     .setCallback(myCallback, null)
     .build();
 
 

Based on this grammar, the voice command engine will recognize the following sentences:

  • "Good morning"
  • "Good afternoon"
  • "Good evening"

Often, the person to whom the greeting is addressed is explicitly mentioned in the greeting. By adding another alternative group that is repeated zero or one times, we can add an optional list of names to the greeting.

 
 VoiceGrammar.Builder builder = VoiceGrammar.Builder.create();
 VoiceGrammar grammar = builder
     .addToken("Good")
     .pushAlternativeGroup()
       .addToken("morning")
       .addToken("afternoon")
       .addToken("evening")
     .popGroup()
     .pushAlternativeGroup(0, 1)
       .addToken("Jack")
       .addToken("Anna")
     .popGroup()
     .setCallback(myCallback, null)
     .build();
 
 

Right now, we have to inspect the text of the recognized voice command to determine who was greeted. This operation is not always trivial and may become quite complex if voice commands need to be translated. Instead, each token of interest can be given a numerical tag. When a tagged token is part of the recognized voice command, its tag is added to VoiceEvent.getTags().

 
 VoiceGrammar.Builder builder = VoiceGrammar.Builder.create();
 VoiceGrammar grammar = builder
     .addToken("Good")
     .pushAlternativeGroup()
       .addToken("morning")
       .addToken("afternoon")
       .addToken("evening")
     .popGroup()
     .pushAlternativeGroup(0, 1)
       .addToken("Jack", 0)
       .addToken("Anna", 1)
     .popGroup()
     .setCallback(myCallback, null)
     .build();
 
 

When the user says "Good morning Anna", the contents of the list of tags will be {1}. If the user says "Good evening", this list will be empty.
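
A callback could then map the tags back to a name without looking at the recognized text. The following is a minimal sketch, assuming the tags are available as an int array; the actual return type of VoiceEvent.getTags() is not shown on this page, and GreetingTags and greetedName are hypothetical names.

```java
/** Hypothetical helper; the tag values mirror the example grammar above. */
final class GreetingTags {
    static final int TAG_JACK = 0;
    static final int TAG_ANNA = 1;

    /** Returns the greeted name, or null if the command carried no name tag. */
    static String greetedName(int[] tags) {
        for (int tag : tags) {
            switch (tag) {
                case TAG_JACK: return "Jack";
                case TAG_ANNA: return "Anna";
                default: break; // ignore tags this helper does not know
            }
        }
        return null;
    }
}
```

Because the lookup depends only on the tags, it keeps working if the token text is later translated.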

As a final example, consider less formal greetings. These may be added to the grammar by nesting the first part in another alternative group, e.g.,

 
 VoiceGrammar.Builder builder = VoiceGrammar.Builder.create();
 VoiceGrammar grammar = builder
     .pushAlternativeGroup()
       .addToken("Hi")
       .addToken("Hello")
       .pushSequentialGroup()
         .addToken("Good")
         .pushAlternativeGroup()
           .addToken("morning")
           .addToken("afternoon")
           .addToken("evening")
         .popGroup()
       .popGroup()
     .popGroup()
     .pushAlternativeGroup(0, 1)
       .addToken("Jack", 0)
       .addToken("Anna", 1)
     .popGroup()
     .setCallback(myCallback, null)
     .build();
 
 

Note the use of a sequential group to prevent recognizing "Good Jack" as a way to greet Jack.
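
To see which sentences this final grammar describes, the groups can be expanded by hand. The stand-alone sketch below does this with plain lists; it does not use the SDK, and GreetingSentences is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical enumeration of the sentences described by the final example grammar. */
final class GreetingSentences {
    static List<String> all() {
        // First alternative group: "Hi", "Hello", or the nested sequential group.
        List<String> openings = new ArrayList<>(Arrays.asList("Hi", "Hello"));
        for (String time : Arrays.asList("morning", "afternoon", "evening")) {
            openings.add("Good " + time); // "Good" is always followed by a time of day
        }
        // Second alternative group with multiplicity 0..1: the name may be omitted.
        List<String> names = Arrays.asList("", " Jack", " Anna");
        List<String> sentences = new ArrayList<>();
        for (String opening : openings) {
            for (String name : names) sentences.add(opening + name);
        }
        return sentences;
    }
}
```

The expansion yields 5 openings times 3 name options, so 15 sentences in total. "Good Jack" is not among them, because "Good" only occurs inside the sequential group.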

Nested Class Summary

Modifier and Type Interface and Description
static class  VoiceGrammar.Builder
This class constructs immutable voice command grammar objects.

Field Summary

Modifier and Type Field and Description
static int REPEAT_UNBOUNDED
Groups whose maximum repetition is set to this value may be repeated an unlimited number of times within a single voice command.

Method Summary

Modifier and Type Method and Description
void release()
Releases resources associated with this VoiceGrammar object.

Field Detail

REPEAT_UNBOUNDED

static final int REPEAT_UNBOUNDED
Groups whose maximum repetition is set to this value may be repeated an unlimited number of times within a single voice command.
See Also:
VoiceGrammar.Builder.pushAlternativeGroup(int, int), VoiceGrammar.Builder.pushSequentialGroup(int, int)

Method Detail

release

void release()
Releases resources associated with this VoiceGrammar object.

It is good practice to call this method when you're done using the VoiceGrammar to free up memory.

Warning

If the grammar is still in use by a headset, releasing the grammar may lead to undefined behavior.
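
One way to avoid releasing a grammar that is still in use is to route activation and release through a small wrapper that stops listening first. The sketch below is an assumption about how an app might do this, not SDK code: GrammarLike and HeadsetLike are hypothetical stand-ins that mirror only the method names mentioned on this page, and GrammarSession is an illustrative name. Like the grammar itself, the wrapper is not thread-safe and is meant to be used from a single thread.

```java
/** Hypothetical stand-in mirroring VoiceGrammar.release(). */
interface GrammarLike {
    void release();
}

/** Hypothetical stand-in mirroring Headset.startVoice and Headset.stopVoice. */
interface HeadsetLike {
    void startVoice(GrammarLike grammar);
    void stopVoice(GrammarLike grammar);
}

/** Tracks activation so that the grammar is always stopped before it is released. */
final class GrammarSession {
    private final HeadsetLike headset;
    private final GrammarLike grammar;
    private boolean active;
    private boolean released;

    GrammarSession(HeadsetLike headset, GrammarLike grammar) {
        this.headset = headset;
        this.grammar = grammar;
    }

    /** Activates the grammar for listening; does nothing if it is already active. */
    void start() {
        if (released) throw new IllegalStateException("grammar already released");
        if (!active) {
            headset.startVoice(grammar);
            active = true;
        }
    }

    /** Deactivates the grammar if it is currently active. */
    void stop() {
        if (active) {
            headset.stopVoice(grammar);
            active = false;
        }
    }

    /** Stops listening if needed, then releases the grammar exactly once. */
    void close() {
        if (released) return;
        stop();
        grammar.release();
        released = true;
    }
}
```

With this wrapper, calling close() on an active session issues stopVoice before release, and a second close() is a no-op.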