speech recognition
I am trying to use the NSSpeechRecognizer class; my code is below. The speechRecognizer:didRecognizeCommand: delegate method never gets called. I am speaking very clearly into the mic. Can anyone see any errors in the code? What else could it be if not a code error?
- (id)initWithName:(NSString *)name {
self = [self initWithWindowNibName:name];
synth = [[NSSpeechSynthesizer alloc] init];
[synth setDelegate:self];
NSArray *cmds = [NSArray arrayWithObjects:@"Sing", @"Jump", @"Roll", nil];
recogn = [[NSSpeechRecognizer alloc] init];
[recogn setCommands:cmds];
[recogn setDelegate:self];
[recogn setListensInForegroundOnly:NO];
[recogn setBlocksOtherRecognizers:YES];
return self;
}
- (IBAction)sayCommand:(id)sender {
if ([sender state] == NSOffState) {
[synth startSpeakingString:@"What do you want me to do?"];
[recogn startListening];
}
}
- (void)speechRecognizer:(NSSpeechRecognizer *)sender didRecognizeCommand:(id)aCmd {
[recogn stopListening];
if ([(NSString *)aCmd isEqualToString:@"Jump"]) {
[synth startSpeakingString:@"How High?"];
}
[clickButton setState:NSOnState];
}
-(void)dealloc {
[synth release];
[recogn release];
[super dealloc];
}
Hi
The code looks about right to me. I'm just going off to the movies, so I haven't time at the moment to investigate more carefully. Does the machine say this "What do you want me to do?" business? If that's working, it's kind of hard to see any difference between how you set up the Synth and the Recognizer. How odd! Have a google around and if you find the fix could you let us know? Otherwise I'll have a look tomorrow.
I did a Google search and the code looks right. And the machine does ask "What do you want me to do?"
Is there a setting in the main plist file that has to be set?
I don't think so. The stuff in Info.plist is used by interface builder and the bundle code to locate resources and stuff.
Why did you bring it up? You've obviously got something in mind.
That's GREAT (I'm trying to speak clearly). That's awesome - thanks for the update.
Does anybody know if there is a more dynamic way of having the app understand speech? I want to be able to say "computer, set alarm for 9 am tomorrow". The way the current system is set up, I would have to hand code the accepted command for every time and every day. That's 24 * (a lot) commands, which is much more code. I want the app to pick out key words.
If I've understood this correctly, you provide an array of key words and the recognizer calls your delegate to report every word he recognizes, right?
Can't you just say something along the lines of this (pure pseudo code)?
cmdArray = [ "one", "two", "three" ....... "monday", "tuesday", .... , "pm" , "tomorrow" ] ;
// hour and day are int members of your class
// set day once, before listening starts, so a later word does not reset it
day = (get current day of the week)
listenFor(cmdArray, callback)
callback(word)
{
index = (get the 1-based position of word in cmdArray );
switch ( index ) {
case 1 : case 2 : ....... : hour = index ; break ; // number words
case 13: case 14: ..... : day = index -13 ; break ; // day names
case 19 : hour += 12 ; break ; // "pm"
case 20 : day += 1 ; break ; // "tomorrow"
}
}
I don't see a lot of alternative to typing in the words that have to be said. However you could actually say:
"two pm saturday"
"six tomorrow"
"saturday pm two"
Or have I missed the point? Are you wanting the computer to know the English word we say for a number like 7? Or does it only recognize one word at a time?
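The pseudo code above could be fleshed out roughly as follows. This is only a sketch in plain C (which compiles unchanged inside an Objective-C .m file); the vocabulary, the AlarmState struct and the function names are all made up for illustration and are not part of any Apple API. It keeps the "pm" flag separate from the hour so that "two pm saturday" and "saturday pm two" give the same result.

```c
#include <string.h>

/* Hypothetical vocabulary: the hours 1-12, the seven day names, two modifiers. */
static const char *const kWords[] = {
    "one", "two", "three", "four", "five", "six",
    "seven", "eight", "nine", "ten", "eleven", "twelve",
    "monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday",
    "pm", "tomorrow"
};
enum { kWordCount = sizeof kWords / sizeof kWords[0] };

typedef struct {
    int hour12;  /* 1-12, or 0 if no hour word has been heard yet */
    int pm;      /* nonzero once "pm" has been heard */
    int day;     /* 0 = monday ... 6 = sunday; start it at today's day */
} AlarmState;

/* Returns the 0-based index of word in kWords, or -1 if it is not listed. */
static int word_index(const char *word) {
    for (int i = 0; i < kWordCount; i++) {
        if (strcmp(kWords[i], word) == 0) return i;
    }
    return -1;
}

/* Call once per recognized word, for example from the delegate method. */
static void alarm_feed(AlarmState *s, const char *word) {
    int i = word_index(word);
    if (i < 0)        return;                      /* not in the vocabulary */
    else if (i < 12)  s->hour12 = i + 1;           /* "one" .. "twelve" */
    else if (i < 19)  s->day = i - 12;             /* "monday" .. "sunday" */
    else if (i == 19) s->pm = 1;                   /* "pm" */
    else              s->day = (s->day + 1) % 7;   /* "tomorrow" */
}

/* Hour on a 24-hour clock, combining the hour word and the "pm" flag. */
static int alarm_hour24(const AlarmState *s) {
    return s->hour12 % 12 + (s->pm ? 12 : 0);
}
```

To use it, initialize an AlarmState with today's day, pass each recognized word to alarm_feed (in the delegate that would be something like [aCmd UTF8String]), and read alarm_hour24 and day once the user has finished speaking.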
It only passes one statement to the delegate method. Not an array, I think.
I want the computer to recognize key words, then apply those words to an action. So if I say "set the alarm for 9 in the morning" or "new alarm 9 morning", I want it to sort through the spoken words, recognize the key words (set or new, alarm, 9, morning) and then do what it needs to do to set an alarm for the appropriate time.
So if it sees that "set" is one of the key words in the array of words, it will run a function that sets the alarm. Then, when it looks for what time to set the alarm, it knows that I said a number and a part of the day.
In the current method, you have to set an array of commands. The problem is that it won't pick key words out of what is spoken; the spoken command has to match a predefined command exactly.
Hi R
I pasted your code and got it working. Very interesting. Gosh, you are right - it's very fussy about recognizing words. I don't suppose my Scottish accent helps - however I'm amazed at how 'clearly' I have to speak.
*** Don't capitalize the words and it works much better: @"jump", NOT @"Jump" ***
Anyway, I've modified your delegate to start listening after every word. So although it recognizes words 'one at a time', I think it does an OK job at parsing the stuff.
- (void)speechRecognizer:(NSSpeechRecognizer *)sender didRecognizeCommand:(id)aCmd
{
// [clickButton setEnabled:YES];
// [recogn stopListening];
NSLog(@" didRecognizeCommand %@",aCmd) ;
if ([aCmd isEqualToString:@"jump"]) {
[synth startSpeakingString:@"How High?"];
}
if ([aCmd isEqualToString:@"roll"]) {
[synth startSpeakingString:@"Over beethoven?"];
}
if ([aCmd isEqualToString:@"sing"]) {
[synth startSpeakingString:@"Something simple?"];
}
[recogn startListening];
}
I wonder how to set this up so that I can listen for multiple commands (multiple single words).
Well, it can recognize little sentences as well as words.
- (IBAction)sayCommand:(id)sender
{
[recogn setDisplayedCommandsTitle:@"What do you want me to do?"];
NSArray *cmds = [NSArray arrayWithObjects:@"sing, sing a song", @"jump", @"roll", @"one", nil];
[recogn setCommands:cmds];
[synth startSpeakingString:@"What do you want me to do?"];
[recogn startListening];
}
- (void)speechRecognizer:(NSSpeechRecognizer *)sender didRecognizeCommand:(id)aCmd
{
NSLog(@" didRecognizeCommand %@",aCmd) ;
// start listening again so the next command is picked up
[recogn startListening];
if ([aCmd isEqualToString:@"jump"]) {
[synth startSpeakingString:@"How High?"];
}
if ([aCmd isEqualToString:@"roll"]) {
[synth startSpeakingString:@"Over beethoven?"];
}
if ([aCmd isEqualToString:@"sing, sing a song"]) {
[synth startSpeakingString:@"sing out loud"];
}
if ([aCmd isEqualToString:@"one"]) {
[synth startSpeakingString:@"in the morning?"];
}
}
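As the list of commands grows, the chain of if statements in the delegate could be replaced by a lookup table. The following is just a sketch in plain C (usable as-is from an Objective-C .m file); the Reply struct and the reply_for function are hypothetical names, and the strings are the ones from the delegate above.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *command;  /* must match a string given to setCommands: exactly */
    const char *reply;    /* what the synthesizer should say in response */
} Reply;

static const Reply kReplies[] = {
    { "jump",              "How High?" },
    { "roll",              "Over beethoven?" },
    { "sing, sing a song", "sing out loud" },
    { "one",               "in the morning?" },
};

/* Returns the reply for a recognized command, or NULL if there is none. */
static const char *reply_for(const char *command) {
    for (size_t i = 0; i < sizeof kReplies / sizeof kReplies[0]; i++) {
        if (strcmp(kReplies[i].command, command) == 0) return kReplies[i].reply;
    }
    return NULL;
}
```

In the delegate you would pass [aCmd UTF8String] to reply_for and, if the result is not NULL, hand it to the synthesizer via [NSString stringWithUTF8String:]. The comparison is case sensitive, so the table entries have to be spelled the same way as the registered commands.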
Understanding sentences from words is exactly what a compiler does when it's reading your code. A compiler normally does its work in two steps:
1) It recognizes words (tokens in CS speak).
2) It uses a state machine which responds to each token.
The pseudo code I posted about last night illustrates a simple state machine that remembers the day and hour to which you want the alarm set.
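To make the two steps concrete, here is a rough sketch in plain C (it also compiles inside an Objective-C .m file). Everything in it is an assumption made for illustration: the token kinds, the tiny vocabulary, the fixed order verb, "alarm", hour, part of day, and the idea that the spoken words are available as one space-separated string. NSSpeechRecognizer itself only reports the commands you registered, so in practice the words would arrive one at a time through the delegate.

```c
#include <string.h>

typedef enum { TOK_UNKNOWN, TOK_VERB, TOK_ALARM, TOK_HOUR,
               TOK_MORNING, TOK_EVENING } TokenKind;
typedef struct { TokenKind kind; int value; } Token;

/* Step 1: turn one word into a token. Filler words become TOK_UNKNOWN. */
static Token classify(const char *word) {
    static const char *const hours[] = {
        "one", "two", "three", "four", "five", "six",
        "seven", "eight", "nine", "ten", "eleven", "twelve" };
    Token t = { TOK_UNKNOWN, 0 };
    if (!strcmp(word, "set") || !strcmp(word, "new")) t.kind = TOK_VERB;
    else if (!strcmp(word, "alarm"))                  t.kind = TOK_ALARM;
    else if (!strcmp(word, "morning"))                t.kind = TOK_MORNING;
    else if (!strcmp(word, "evening"))                t.kind = TOK_EVENING;
    else {
        for (int i = 0; i < 12; i++) {
            if (!strcmp(word, hours[i])) { t.kind = TOK_HOUR; t.value = i + 1; }
        }
    }
    return t;
}

typedef enum { WAIT_VERB, WAIT_ALARM, WAIT_HOUR, WAIT_PART, DONE } State;
typedef struct { State state; int hour24; } Parser;

/* Step 2: the state machine. A token that does not fit the state is skipped. */
static void parser_step(Parser *p, Token t) {
    switch (p->state) {
    case WAIT_VERB:  if (t.kind == TOK_VERB)  p->state = WAIT_ALARM; break;
    case WAIT_ALARM: if (t.kind == TOK_ALARM) p->state = WAIT_HOUR;  break;
    case WAIT_HOUR:
        if (t.kind == TOK_HOUR) { p->hour24 = t.value % 12; p->state = WAIT_PART; }
        break;
    case WAIT_PART:
        if (t.kind == TOK_MORNING)      p->state = DONE;
        else if (t.kind == TOK_EVENING) { p->hour24 += 12; p->state = DONE; }
        break;
    case DONE: break;
    }
}

/* Runs both steps over a phrase. Returns 1 and stores the hour on success. */
static int parse_alarm(const char *phrase, int *hour24) {
    char buf[128];
    strncpy(buf, phrase, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    Parser p = { WAIT_VERB, 0 };
    for (char *w = strtok(buf, " "); w != NULL; w = strtok(NULL, " ")) {
        parser_step(&p, classify(w));
    }
    if (p.state != DONE) return 0;
    *hour24 = p.hour24;
    return 1;
}
```

With this sketch, both "set the alarm for nine in the morning" and "new alarm nine morning" end in the DONE state with hour 9, because words such as "the", "for" and "in" are classified as unknown and skipped.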
Thank you for making this contribution to the Forum. I've personally found this very interesting. I've written a tutorial and published it on my web site at:
http://clanmills.com/articles/cocoatutorials/
The documentation for the tutorial is at http://clanmills.com/files/Listen.pdf
The code is available at http://clanmills.com/files/Listen.zip
All bugs are mine. All good ideas are routers. The contribution of routers is acknowledged. Comments welcome.
In addition to discussing the speech recognizer and synthesizer, this tutorial contains a discussion about Cocoa and Objective-C and the difficulties involved in learning the system. It also contains a discussion about the relationship between Xcode, Interface Builder and the NIB file.
The tutorial also includes a custom build step to illustrate how the 'About Box' can be updated on every build to record the date and time of the build.