Voice to Text

conor1 · September 7, 2017, 8:48am

Essentially it’s your demo with minor text display adjustment plus the offending voiceToText.startRecording(nil, nil, 5000,5000,5000)

conor1 · September 7, 2017, 10:06am

Update. It seems that the timeout is delayed if the voice recognition needs time to work out what is said. I spoke gibberish into the app, paused and spoke proper words. Because my gibberish was still being processed, the pause was ‘accepted’ and my proper words after the pause also accepted.

Probably not of any use, of course…

Scott_Harrison · September 7, 2017, 5:09pm

It looks like Google either removed this feature to set inputSilenceLength or it is a bug. Either way, it looks like it is not working for anyone.

I have also tried a couple of apps that use this Android API, and they seem to have the same issue. I cannot do anything on my end to fix this Google issue.

Sources:

https://stackoverflow.com/a/28628826

https://stackoverflow.com/a/17675098

conor1 · September 8, 2017, 3:01pm

A subnote to this issue. I was testing again with some music playing in the background, and the recording continued for longer than usual. Obviously the API is taking more time to figure out that talking has stopped. Not the required answer, of course.

So we need Google to fix this.

Scott_Harrison · September 8, 2017, 4:35pm

You might try stopping the background music if you can.

sirmania · November 13, 2017, 9:33am

Hi Scott!

Your plugin is awesome!!!

Is it possible to add support for Apple TV? Apple TV 4th generation and 4K has a microphone in the remote control (for Siri). If Apple TV can be supported it would be a game changer for me B)

Scott_Harrison · November 13, 2017, 4:06pm

Unfortunately, Apple's Speech framework is only supported on iOS. I don’t even believe you can access the Apple TV remote microphone anyway.

sirmania · November 14, 2017, 9:57am

Ok!

I'm using an iPad and sharing the screen with the Apple TV, so I have access to the microphone on the iPad. This works, but it would be much better to run the app on the Apple TV without the iPad. Well, well… maybe one day :rolleyes:

dmarques42 · February 16, 2018, 6:18am

I have my Android phone (6.0.1) working well with this plugin, but I have a timing problem.

Context: my app is for helping young children vocalize; a word is read to them, then they say it back.

Problem: No matter what I have tried, I cannot get any sound out after voiceToText.startRecording(…) – it blocks all audio, and I have tried audio.play as well as a texttospeech app. If I can fix this, that might do it for me.

That wouldn’t be fatal except for the delay. I now play the phrase (eg, “Say bubble”), and start recording as soon as that completes. But then the recording does not actually start for 1+ seconds after I send the startRecording message, and children are not that patient. I need to get this gap down to 200-300ms or less.

I did note that doing an initial start/stop at app startup shortens the time for the first start after that, but they are still far too long.

So, 2 problems: 1) the time between calling startRecording and when it accepts sound is much too long for my use, and 2) I cannot start the recording ahead of time because it blocks all audio output.

From the specs (and other comments) it seems that blocking all audio is probably a bug on my side? (I use “false” for the second argument, but it makes no difference what I put there, all sound is stopped.)

Phone is alcatel idol 3, but these children will not have high end phones/tablets anyway.

Scott_Harrison · February 16, 2018, 6:23am

Unfortunately there is no good way to handle this; it is really an Android problem. Sorry to hear you are having these issues. You could try using visual cues, or use the native Google sounds that play.

Scott_Harrison · February 16, 2018, 6:25am

As for why audio stops, the OS stops the audio no matter what. The parameter is designed to stop the Google sound effect from playing.

dmarques42 · February 18, 2018, 2:02am

OK, thank you for the answer. I think I have improved it a lot by abandoning texttospeech (instructions of what word to say; I just use pre-recorded batch transactions from Amazon Polly) and just playing audio. That makes it a shorter and more reliable gap. We already use visuals, but these are very young kids (3-5) who have various attention issues, so the instructions “Harry, say bubble” is critical.

But now, today, voiceToText does not work at all. It was all working fine the past two days, but today, I get no error, I never get the ‘started’ e.response, and never get any recording. Did something happen to any servers or something? Is there a way for me to debug this? There is nothing at all in the log, just my own record that I did init and start (and stop), but nothing more.

Nothing is printed out, how do I figure out if it is my problem? I changed a lot of things in my code, but none of it in the routines that call voiceToText. Any suggestions?

Scott_Harrison · February 18, 2018, 2:11am

I am not sure what is happening, but are you connected to the internet?

dmarques42 · February 18, 2018, 3:28am

Yes, lots of stuff wouldn’t work if that were a problem. But I borrowed back a different device that has the version I made yesterday and it works, so it clearly is something interacting with code I changed today (mostly replacing textToSpeech with audio.play). Do I have to worry about audio channels? Now that I know the problem is somewhere in my app, I will track it down. But I am concerned that it stopped working after I changed other things.

Anyway, thank you for your help. Feel free to suggest anything else I should check. I will try to roll back to get it working again.

dmarques42 · February 18, 2018, 3:42am

Solved it. Somehow, microphone permissions were turned off, even though they were explicitly requested at install. If my grandson had been playing with the phone, I would have blamed him, but it must have been something I did. Anyway, no problem.

My delay is much shorter and more reliable using audio.play instead of textToSpeech, might be enough to get by with it. We’ll see with user testing.

dmarques42 · February 18, 2018, 11:49pm

I have another question, about options for getting phonemes.

The children I am working with have language issues, so their pronunciation is a bit off sometimes (eg, ‘puh’ instead of ‘up’). I put these in my own hitlist for comparison, but I never get ‘puh’ back, obviously because it is not in any lexicon. Is there a way to get just the phonemes, so that I can do the word matching myself? Almost all my matching is single words with very specific context (at least so far). I tried setting the language to a phony one, but that didn’t work.

Thanks for any suggestions. This is working pretty well for me now, though can’t use it for these kids unless I can get more raw or submit my augmented lexicon.
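
For what it's worth, here is a rough sketch of the "do the word matching myself" side, in case it helps anyone with the same problem. It does not get phonemes out of the plugin (I know of no option for that); it only takes whatever text comes back and picks the closest entry in a hitlist by edit distance. The names editDistance, closestHit and maxDistance are my own hypothetical helpers, not part of voiceToText, and the comparison is byte-wise, so it assumes plain ASCII words.

```lua
-- Hypothetical helpers: none of these names come from the voiceToText plugin.

-- Levenshtein edit distance between two strings (byte-wise, ASCII words only).
local function editDistance(a, b)
    local la, lb = #a, #b
    if la == 0 then return lb end
    if lb == 0 then return la end

    -- prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
    local prev = {}
    for j = 0, lb do prev[j] = j end

    for i = 1, la do
        local cur = { [0] = i }
        for j = 1, lb do
            local cost = (a:sub(i, i) == b:sub(j, j)) and 0 or 1
            cur[j] = math.min(prev[j] + 1,        -- deletion
                              cur[j - 1] + 1,     -- insertion
                              prev[j - 1] + cost) -- substitution or match
        end
        prev = cur
    end
    return prev[lb]
end

-- Return the hitlist entry closest to `heard` and its distance,
-- or nil if no entry is within maxDistance edits.
local function closestHit(heard, hitlist, maxDistance)
    heard = heard:lower()
    local best, bestDist
    for _, word in ipairs(hitlist) do
        local d = editDistance(heard, word:lower())
        if d <= maxDistance and (bestDist == nil or d < bestDist) then
            best, bestDist = word, d
        end
    end
    return best, bestDist
end

-- Example: the recognizer returned "Pup" while the target word was "up".
local match, dist = closestHit("Pup", { "up", "bubble" }, 1)
print(match, dist)
```

With single-word targets, a small maxDistance (1 or 2) seems like a reasonable starting point, but that threshold is a guess and would need tuning with real recordings.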

alistair.crompton · February 19, 2018, 1:54pm

Hi,

thanks for this plugin that does a great job!

I was wondering whether there is a way to disable the default system sounds that play when recording starts and stops?

Scott_Harrison · February 19, 2018, 7:43pm

Yes, on Android it is voiceToText.startRecording(nil, true)

dmarques42 · February 19, 2018, 9:44pm

>>Yes, on Android it is voiceToText.startRecording(nil, true)

When I do that, it turns off the startRecording sound, but not the stopRecording sound.

alistair.crompton · February 20, 2018, 10:02am

Thanks, works perfectly!