Read-it-to-me Story Books and interactive text

I’m taking an existing paper children’s book and converting it to an interactive book (iPad app). Since there are lots of books in the app store like this, I want to make sure we’re offering the best features of all of them.

One feature I really liked from Broderbund’s old Living Books titles was the way they handled the text. When the narrator reads the text, the words light up or are highlighted as they are spoken.

After the narrator finishes reading, you can tap on any word and hear that word spoken. (We will be recording the entire sentence for reading the story to the child, and also recording each word separately to implement this “tap word” feature.)

I’d also like to add multiple-language support at some point.

Since we’ll be doing quite a few books, I’d like to come up with a general case solution that will make it easy to implement this. Has anyone already done this in Corona who would be willing to share code?

Does anyone have any suggestions on the best ways to implement this?

I’d like to keep the text as text rather than as graphics. I’m thinking of taking the word-wrap code and converting it to look for word boundaries and somehow make each of those words touchable (maybe as a separate object, maybe as an array). Highlighting could either be done behind the word (as with a marker) or by displaying the same word behind the first word but larger and in a highlight color. Or maybe just change the color of the word (or scale it up slightly). Or several of these.
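To make the word-boundary idea concrete, here is a minimal sketch in plain Lua that splits a sentence into words and gives each one its own position, wrapping when a line fills up. The function name layoutWords, its parameters, and the measure callback are all hypothetical; in Corona the callback might create a text object and read its width, but that part is an assumption and is faked here with a fixed-width font so the sketch runs anywhere.

```lua
-- Split a sentence into words and assign each one an x/y position,
-- wrapping to a new line when the next word would exceed maxWidth.
-- `measure` is a caller-supplied function returning a word's width in
-- pixels (hypothetical helper, not a Corona API).
local function layoutWords(text, maxWidth, lineHeight, spaceWidth, measure)
    local words = {}
    local x, y = 0, 0
    for word in text:gmatch("%S+") do
        local w = measure(word)
        -- Wrap only if something is already on this line.
        if x > 0 and x + w > maxWidth then
            x = 0
            y = y + lineHeight
        end
        words[#words + 1] = { text = word, x = x, y = y, width = w }
        x = x + w + spaceWidth
    end
    return words
end

-- Example with a fake fixed-width font: every character is 10 pixels wide.
local function fixedWidth(word) return #word * 10 end
local words = layoutWords("Not a creature was stirring", 120, 36, 10, fixedWidth)
for i, w in ipairs(words) do
    print(i, w.text, w.x, w.y)
end
```

Each entry in the returned table could then back one touchable text object, with the stored width reused for a marker-style highlight behind the word.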

I’d probably have to get exact timing on the reading of the sentence so I know at which millisecond each word is spoken so we can match the highlighting to it. Of course, having a tool that would let us mark word beginnings and spit out a file with those marker points would help a lot.

Any thoughts would be appreciated here!

Thanks!! [import]uid: 9905 topic_id: 3622 reply_id: 303622[/import]

Hi David,
I am a fan of Broderbund’s Living Books, especially Just Grandma and Me, Aesop’s Fables - Tortoise and the Hare, and Arthur’s Teacher Trouble. I am not sure whether the other titles, like The New Kid on the Block and Where’s Ruff’s Bone, were released.

There were plenty of interactive titles that came after that, but none came close to the quality of Broderbund’s.

Now, off the top of my head, you can set up each word as text and set an event listener on that, so when someone taps that, it will play the sound of that word.

As for highlighting, there are a few ways of managing that. One is to draw a rectangle behind the text in a lighter colour and grow it to the width of the word using transition.to; you would have to time the words as with any karaoke text. The timings can be stored in a table.

Alternatively, you can have a table of timings with x and y co-ordinates stored; the code then highlights and un-highlights each piece of text based on the timestamp.
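A rough sketch of that timestamp lookup, in plain Lua. The field names (startMs, endMs), the sample values, and the function activeWordAt are made up for illustration. In Corona you could probably call something like this from a per-frame listener with the elapsed playback time, but that wiring is an assumption and is left out.

```lua
-- Each entry holds a word, its on-screen position, and the time span
-- (in milliseconds from the start of the narration) during which it is
-- spoken. All values here are invented for illustration.
local timings = {
    { word = "WAS",   x = 297, y = 455, startMs = 0,   endMs = 300 },
    { word = "the",   x = 360, y = 455, startMs = 300, endMs = 450 },
    { word = "night", x = 410, y = 455, startMs = 450, endMs = 900 },
}

-- Return the index of the word being spoken at elapsedMs, or nil if the
-- narrator is between words or the narration has finished.
local function activeWordAt(list, elapsedMs)
    for i, entry in ipairs(list) do
        if elapsedMs >= entry.startMs and elapsedMs < entry.endMs then
            return i
        end
    end
    return nil
end

print(activeWordAt(timings, 350))   -- index of "the"
print(activeWordAt(timings, 1200))  -- nil: narration is over
```

Whenever the returned index changes, the previous word gets un-highlighted and the new one highlighted at its stored x and y.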

Using Objective-C I spent a lot of time trying to do something similar; now you can do that even more easily with Corona. To create the table of timings, you can start the mp3 with the text displayed and, as soon as each word starts, tap to mark its start boundary, then store the results to a plist.

To get a better idea, download one of the TapTap-type games or the RockStar ones; they will show you how this can be implemented.
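The tap-to-mark idea could look roughly like this in plain Lua. The names newTapRecorder, tap and result are hypothetical, and the clock is passed in as a function so the sketch runs outside Corona; on the device it could be any millisecond timer. Writing the result out to a plist or other file is not shown.

```lua
-- Build a recorder that stores one timestamp per tap, relative to the
-- moment recording started. `clock` is any function returning the
-- current time in milliseconds (supplied by the caller).
local function newTapRecorder(words, clock)
    local startedAt = clock()
    local marks = {}
    local recorder = {}

    -- Call once each time the narrator begins the next word.
    -- Returns false once every word already has a mark.
    function recorder.tap()
        local index = #marks + 1
        if index > #words then return false end
        marks[index] = { word = words[index], startMs = clock() - startedAt }
        return true
    end

    -- Return the list of { word, startMs } marks collected so far.
    function recorder.result()
        return marks
    end

    return recorder
end

-- Simulated run with a fake clock; the first value is the start time.
local fakeTimes = { 1000, 1350, 1640, 1750 }
local tick = 0
local function fakeClock()
    tick = tick + 1
    return fakeTimes[tick]
end

local rec = newTapRecorder({ "was", "the", "night" }, fakeClock)
rec.tap(); rec.tap(); rec.tap()
for _, m in ipairs(rec.result()) do
    print(m.word, m.startMs)
end
```

Hand-tapped marks will lag the audio a little, so the stored values would probably need a small fixed offset subtracted before use.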

cheers,

Jayant C Varma [import]uid: 3826 topic_id: 3622 reply_id: 11062[/import]

I would love to see some code to implement an easy text-mp3 sync method.

I’ve been using this method, which is far from convenient: you tap the first words, which plays the mp3 and starts a series of transition.to calls that animate the text. Since no one could tell me how to change setTextColor, I just duplicated the text and swapped the alpha in order to simulate a color change.

So this should be the how-NOT-to-do-it example, I guess. Sorry.

Download example. (Link should be direct, tell me if you have to wait.)
[lua]
media.playSound("1.mp3")
media.stopSound()

local bg = display.newRect( 0, 0, display.contentWidth, display.contentHeight )
bg:setFillColor(255,255,255)

-- Text
local t1 = display.newText("WAS the night", 297, 455, "Baskerville", 30)
t1:setTextColor( 60, 60, 60 )
t1.alpha=1
local t2 = display.newText("before Christmas,", 482, 455, "Baskerville", 30)
t2:setTextColor( 60, 60, 60 )
t2.alpha=1
local t3 = display.newText("when all through", 703, 455, "Baskerville", 30)
t3:setTextColor( 60, 60, 60 )
t3.alpha=1
local t4 = display.newText("the house", 297, 491, "Baskerville", 30)
t4:setTextColor( 60, 60, 60 )
t4.alpha=1
local t5 = display.newText("Not a creature", 297, 528, "Baskerville", 30)
t5:setTextColor( 60, 60, 60 )
t5.alpha=1
local t6 = display.newText("was stirring,", 479, 528, "Baskerville", 30)
t6:setTextColor( 60, 60, 60 )
t6.alpha=1
local t7 = display.newText("not even", 630, 528, "Baskerville", 30)
t7:setTextColor( 60, 60, 60 )
t7.alpha=1
local t8 = display.newText("a mouse;", 740, 528, "Baskerville", 30)
t8:setTextColor( 60, 60, 60 )
t8.alpha=1
local t9 = display.newText("The stockings", 297, 564, "Baskerville", 30)
t9:setTextColor( 60, 60, 60 )
t9.alpha=1
local t10 = display.newText("were hung", 470, 564, "Baskerville", 30)
t10:setTextColor( 60, 60, 60 )
t10.alpha=1
local t11 = display.newText("by the chimney", 604, 564, "Baskerville", 30)
t11:setTextColor( 60, 60, 60 )
t11.alpha=1
local t12 = display.newText("with care", 795, 564, "Baskerville", 30)
t12:setTextColor( 60, 60, 60 )
t12.alpha=1
local t13 = display.newText("In hopes", 297, 600, "Baskerville", 30)
t13:setTextColor( 60, 60, 60 )
t13.alpha=1
local t14 = display.newText("that St. Nicholas", 408, 600, "Baskerville", 30)
t14:setTextColor( 60, 60, 60 )
t14.alpha=1
local t15 = display.newText("soon would", 615, 600, "Baskerville", 30)
t15:setTextColor( 60, 60, 60 )
t15.alpha=1
local t16 = display.newText("be there;", 760, 600, "Baskerville", 30)
t16:setTextColor( 60, 60, 60 )
t16.alpha=1

-- Second Text
local tt1 = display.newText("WAS the night", 297, 455, "Baskerville", 30)
tt1:setTextColor( 255, 0, 0 )
tt1.alpha=0
local tt2 = display.newText("before Christmas,", 482, 455, "Baskerville", 30)
tt2:setTextColor( 255, 0, 0 )
tt2.alpha=0
local tt3 = display.newText("when all through", 703, 455, "Baskerville", 30)
tt3:setTextColor( 255, 0, 0 )
tt3.alpha=0
local tt4 = display.newText("the house", 297, 491, "Baskerville", 30)
tt4:setTextColor( 255, 0, 0 )
tt4.alpha=0
local tt5 = display.newText("Not a creature", 297, 528, "Baskerville", 30)
tt5:setTextColor( 255, 0, 0 )
tt5.alpha=0
local tt6 = display.newText("was stirring,", 479, 528, "Baskerville", 30)
tt6:setTextColor( 255, 0, 0 )
tt6.alpha=0
local tt7 = display.newText("not even", 630, 528, "Baskerville", 30)
tt7:setTextColor( 255, 0, 0 )
tt7.alpha=0
local tt8 = display.newText("a mouse;", 740, 528, "Baskerville", 30)
tt8:setTextColor( 255, 0, 0 )
tt8.alpha=0
local tt9 = display.newText("The stockings", 297, 564, "Baskerville", 30)
tt9:setTextColor( 255, 0, 0 )
tt9.alpha=0
local tt10 = display.newText("were hung", 470, 564, "Baskerville", 30)
tt10:setTextColor( 255, 0, 0 )
tt10.alpha=0
local tt11 = display.newText("by the chimney", 604, 564, "Baskerville", 30)
tt11:setTextColor( 255, 0, 0 )
tt11.alpha=0
local tt12 = display.newText("with care", 795, 564, "Baskerville", 30)
tt12:setTextColor( 255, 0, 0 )
tt12.alpha=0
local tt13 = display.newText("In hopes", 297, 600, "Baskerville", 30)
tt13:setTextColor( 255, 0, 0 )
tt13.alpha=0
local tt14 = display.newText("that St. Nicholas", 408, 600, "Baskerville", 30)
tt14:setTextColor( 255, 0, 0 )
tt14.alpha=0
local tt15 = display.newText("soon would", 615, 600, "Baskerville", 30)
tt15:setTextColor( 255, 0, 0 )
tt15.alpha=0
local tt16 = display.newText("be there;", 760, 600, "Baskerville", 30)
tt16:setTextColor( 255, 0, 0 )
tt16.alpha=0

local function stopNBC()
    media.stopSound("1.mp3")
end

function t1:tap( event )
    media.playSound("1.mp3")
    transition.to(t1, { time=900, alpha=0 } )
    transition.to(tt1, { time=900, alpha=1 } )
    transition.to(t1, { delay=920, time=900, alpha=1 } )
    transition.to(tt1, { delay=920, time=900, alpha=0 } )
    transition.to(t2, { delay=800, time=1000, alpha=0 } )
    transition.to(tt2, { delay=800, time=1000, alpha=1 } )
    transition.to(t2, { delay=2000, time=400, alpha=1 } )
    transition.to(tt2, { delay=2000, time=400, alpha=0 } )
    transition.to(t3, { delay=2100, time=700, alpha=0 } )
    transition.to(tt3, { delay=2100, time=700, alpha=1 } )
    transition.to(t3, { delay=2900, time=400, alpha=1 } )
    transition.to(tt3, { delay=2900, time=400, alpha=0 } )
    transition.to(t4, { delay=2900, time=700, alpha=0 } )
    transition.to(tt4, { delay=2900, time=700, alpha=1 } )
    transition.to(t4, { delay=3800, time=400, alpha=1 } )
    transition.to(tt4, { delay=3800, time=400, alpha=0 } )
    transition.to(t5, { delay=4100, time=700, alpha=0 } )
    transition.to(tt5, { delay=4100, time=700, alpha=1 } )
    transition.to(t5, { delay=4800, time=400, alpha=1 } )
    transition.to(tt5, { delay=4800, time=400, alpha=0 } )
    transition.to(t6, { delay=5000, time=600, alpha=0 } )
    transition.to(tt6, { delay=5000, time=600, alpha=1 } )
    transition.to(t6, { delay=5600, time=400, alpha=1 } )
    transition.to(tt6, { delay=5600, time=400, alpha=0 } )
    transition.to(t7, { delay=5600, time=700, alpha=0 } )
    transition.to(tt7, { delay=5600, time=700, alpha=1 } )
    transition.to(t7, { delay=6300, time=400, alpha=1 } )
    transition.to(tt7, { delay=6300, time=400, alpha=0 } )
    transition.to(t8, { delay=6200, time=700, alpha=0 } )
    transition.to(tt8, { delay=6200, time=700, alpha=1 } )
    transition.to(t8, { delay=7100, time=400, alpha=1 } )
    transition.to(tt8, { delay=7100, time=400, alpha=0 } )
    transition.to(t9, { delay=7400, time=600, alpha=0 } )
    transition.to(tt9, { delay=7400, time=600, alpha=1 } )
    transition.to(t9, { delay=8000, time=400, alpha=1 } )
    transition.to(tt9, { delay=8000, time=400, alpha=0 } )
    transition.to(t10, { delay=8300, time=600, alpha=0 } )
    transition.to(tt10, { delay=8300, time=600, alpha=1 } )
    transition.to(t10, { delay=8900, time=400, alpha=1 } )
    transition.to(tt10, { delay=8900, time=400, alpha=0 } )
    transition.to(t11, { delay=9200, time=500, alpha=0 } )
    transition.to(tt11, { delay=9200, time=500, alpha=1 } )
    transition.to(t11, { delay=9700, time=300, alpha=1 } )
    transition.to(tt11, { delay=9700, time=300, alpha=0 } )
    transition.to(t12, { delay=9900, time=500, alpha=0 } )
    transition.to(tt12, { delay=9900, time=500, alpha=1 } )
    transition.to(t12, { delay=10400, time=400, alpha=1 } )
    transition.to(tt12, { delay=10400, time=400, alpha=0 } )
    transition.to(t13, { delay=10600, time=700, alpha=0 } )
    transition.to(tt13, { delay=10600, time=700, alpha=1 } )
    transition.to(t13, { delay=11300, time=400, alpha=1 } )
    transition.to(tt13, { delay=11300, time=400, alpha=0 } )
    transition.to(t14, { delay=11600, time=1100, alpha=0 } )
    transition.to(tt14, { delay=11600, time=1100, alpha=1 } )
    transition.to(t14, { delay=12600, time=400, alpha=1 } )
    transition.to(tt14, { delay=12600, time=400, alpha=0 } )
    transition.to(t15, { delay=12900, time=700, alpha=0 } )
    transition.to(tt15, { delay=12900, time=700, alpha=1 } )
    transition.to(t15, { delay=13800, time=400, alpha=1 } )
    transition.to(tt15, { delay=13800, time=400, alpha=0 } )
    transition.to(t16, { delay=13500, time=700, alpha=0 } )
    transition.to(tt16, { delay=13500, time=700, alpha=1 } )
    transition.to(t16, { delay=14200, time=400, alpha=1 } )
    transition.to(tt16, { delay=14200, time=400, alpha=0 } )
end
t1:addEventListener( "tap", t1 )
[/lua] [import]uid: 10426 topic_id: 3622 reply_id: 11065[/import]

I’ve made great progress on this test. You can download the project here:

http://www.ElectricEggplant.com/corona/SoundTextSync.zip

I started with cmontesino’s code and sound sample, used Audacity to label each word and also split up the sounds (you can export separate audio files for each labeled section), and to create the table of timings for each word.

I’ve included the Audacity files (the .aup file that’s included, along with the folder) so you can try this out and see how it works.

For this test, you can click on the black box to the left of the text to play the entire recording (with word highlighting synced to the audio), or you can click on each word to hear it by itself and see it highlighted.
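For anyone who wants the general shape without opening the zip, here is one way a timing table could drive the highlighting. This is a plain-Lua sketch, not code taken from the project: buildEvents and the field names are hypothetical, and the sample times are the first three Audacity labels rounded to milliseconds. Each resulting event could presumably be handed to a delayed timer callback in Corona, which is not shown.

```lua
-- Turn a table of word timings into a flat, time-ordered list of
-- highlight-on and highlight-off events.
local function buildEvents(timings)
    local events = {}
    for i, t in ipairs(timings) do
        events[#events + 1] = { atMs = t.startMs, index = i, on = true }
        events[#events + 1] = { atMs = t.endMs,   index = i, on = false }
    end
    -- Sort by time; when an "off" and an "on" coincide, run the "off"
    -- first so two words are never highlighted at once.
    table.sort(events, function(a, b)
        if a.atMs ~= b.atMs then return a.atMs < b.atMs end
        if a.on ~= b.on then return not a.on end
        return a.index < b.index
    end)
    return events
end

-- Sample timings (milliseconds), invented field names.
local sampleTimings = {
    { word = "was",   startMs = 359, endMs = 643 },
    { word = "the",   startMs = 643, endMs = 753 },
    { word = "night", startMs = 753, endMs = 1210 },
}

local events = buildEvents(sampleTimings)
for _, e in ipairs(events) do
    print(e.atMs, sampleTimings[e.index].word, e.on and "on" or "off")
end
```

The same table can serve the tap-a-word feature: the tapped word's index picks the snippet to play, and the highlight stays on for that entry's endMs minus startMs.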

For a real production, I would record each page’s contents twice… once just read straight through (as we have here), and once with clear pauses between each word so they’re enunciated clearly and don’t blend into the next word.

I used cmontesino’s idea of switching between different colored text to highlight, but you could also highlight behind the text using the width of the word. And if we ever get fancier ways to display text (like with outlines or glows), we could come up with fancier ways to display.

I’d love to hear your feedback/comments on this.
[import]uid: 9905 topic_id: 3622 reply_id: 11276[/import]

I figured that was the only way to do it right now. A straight read-through and separate word sounds. If you’re patient, the blog says they’re updating the sound features, though. [import]uid: 11024 topic_id: 3622 reply_id: 11283[/import]

It might be easier to use a single sound file for all the words (rather than one for each) if we had a way to start and stop the sound at a specific point. Also, this really needs volume control. The single word sounds are at a much lower volume than the whole sentence even though they’re all from the same source and are all at the same volume. I assume we’ll have more control over that when the new code comes out. [import]uid: 9905 topic_id: 3622 reply_id: 11292[/import]

This is amazing! Thanks for sharing.

I tried it on my iPad and it works great. I’d make the button a little bit bigger; at times it doesn’t respond. I had to convert the mp3 files in the snippets folder to caf and change the main.lua file to match (media.playEventSound("snippets/"..name..".caf")). I don’t know how to wrap this in an if system.getInfo("model") == "iPad" then kind of check so that Android devices are handled separately.
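One possible shape for that check, as a plain-Lua sketch. The helper snippetPath is hypothetical, and the mapping (caf on iOS, mp3 everywhere else) is only an assumption. I believe system.getInfo("platformName") reports "iPhone OS" on iOS devices and "Android" on Android, but please verify that against the docs for your Corona build; the platform name is passed in as a plain string here so the sketch runs outside Corona.

```lua
-- Pick an audio snippet path from a word and a platform name.
-- Assumption for illustration: .caf on iOS, .mp3 everywhere else.
local function snippetPath(word, platformName)
    local ext = "mp3"
    if platformName == "iPhone OS" then
        ext = "caf"
    end
    return "snippets/" .. word .. "." .. ext
end

-- In Corona the second argument would come from
-- system.getInfo("platformName"); literal strings are used here.
print(snippetPath("night", "iPhone OS"))
print(snippetPath("night", "Android"))
```

Checking the platform rather than the model avoids listing every iPad and iPhone variant by name.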

Just in case, I’ll share a quick way to convert a folder full of .mp3 to .caf files.

  1. Open Terminal.
  2. Go to the folder full of mp3s (in Terminal: cd "folder-path-here").
  3. Paste the following into Terminal and hit enter:

#!/bin/bash
for f in *.mp3; do
  echo "Processing $f file..."
  afconvert -f caff -d LEI16@44100 -c 1 "$f" "${f/mp3/caf}"
done

  4. Hit enter once more and enjoy. [import]uid: 10426 topic_id: 3622 reply_id: 11305[/import]

Here’s a capture from my iPad. I made the button a little bit bigger and added “Output FPS and texture memory usage”

http://img816.imageshack.us/img816/7610/ipadexample.png

I erased the fps="60" setting from config.lua; it’s too slow if you add a big background image and physics. The default 30fps works great.
And if you’re using physics, remember to support only one orientation, for example only "landscapeRight" or only "portrait". You don’t want the screen going everywhere. [import]uid: 10426 topic_id: 3622 reply_id: 11307[/import]

cmontesino, glad you like it! Fun working with it. Thanks for the script for sound conversion too. [import]uid: 9905 topic_id: 3622 reply_id: 11326[/import]

Jayant and cmontesino, thanks for the starting ideas and code!

Jayant, I just took a look at my copy of Audacity (an open source sound editing program) and found that I can select a section of sound and apply a “label” to it. I can do this for each word/phrase, and then there’s a command to output this info to a text file! So taking the first part of the mp3 file that cmontesino provided, and marking each word, I get:

[text]
0.359055 0.643383 was
0.643383 0.752740 the
0.752740 1.210217 night
1.210217 1.600257 before
1.600257 2.456886 christmas
2.456886 2.646438 and
2.646438 3.018252 all
[/text]

The numbers are the start and end points of each word in seconds (multiply by 1000 to get milliseconds), followed by my name for the label (where I used the actual word). Using this, I can go through all the audio and get the in/out points of each word, put that into a table, and use that to trigger the word highlighting.
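As a first stab at getting that label file into a table, here is a small plain-Lua parser. The function name parseLabels and the field names are my own choices for illustration. It assumes one label per line in the form start, end, label, separated by whitespace (spaces or tabs), with times in seconds, and it converts them to rounded milliseconds.

```lua
-- Parse Audacity label-track text (start, end, label per line, times in
-- seconds) into a table of entries with times in milliseconds.
local function parseLabels(text)
    local entries = {}
    for line in text:gmatch("[^\r\n]+") do
        local startSec, endSec, label =
            line:match("^%s*([%d%.]+)%s+([%d%.]+)%s+(.-)%s*$")
        if startSec then  -- skip lines that do not look like a label
            entries[#entries + 1] = {
                word    = label,
                startMs = math.floor(tonumber(startSec) * 1000 + 0.5),
                endMs   = math.floor(tonumber(endSec) * 1000 + 0.5),
            }
        end
    end
    return entries
end

local sample = [[
0.359055 0.643383 was
0.643383 0.752740 the
0.752740 1.210217 night
]]

local labels = parseLabels(sample)
for _, e in ipairs(labels) do
    print(e.word, e.startMs, e.endMs)
end
```

Because each label's end is the next label's start, the resulting table has no gaps between consecutive words, which keeps the highlight from flickering off between them.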

I’ll do some work on this during the week and post what I’ve come up with.

cmontesino, I think you’re right that you can’t change the text color within a transition.

If we do have to use two versions of the text, I think the transition.dissolve command is cleaner than changing the alpha. It essentially does the same thing:

 transition.dissolve(t1,tt1,500,1000)  
 transition.dissolve(tt1,t1,500,2200)  

Again, thanks for the pointers! [import]uid: 9905 topic_id: 3622 reply_id: 11082[/import]

I just uploaded a new version, taking cmontesino’s feedback into consideration. This one includes both mp3 and caf files, with the caf files being selected. I also simplified a few things in the table and made the hot ‘button’ larger. I also removed the 60 fps line, so it now defaults to 30fps (which is fine).

Tested on the iPad and works great.

Note that on the simulator, the sounds for the individual touched words are much quieter than when the full sentence is read. I think this must be a simulator bug. It sounds fine on the iPad.

Same URL to download:
http://www.ElectricEggplant.com/corona/SoundTextSync.zip

Also added to the code exchange library:
http://developer.anscamobile.com/code/soundtextsync-read-it-me-story-books-and-interactive-text
[import]uid: 9905 topic_id: 3622 reply_id: 11825[/import]

Hi David,

Thanks for sharing your insights and code! It seems a great solution and well engineered.

I am trying to download the zip to test it on my iPad, but the download stalls after the first 45 KB.
Can you wake up the server somehow? :wink:

Thanks again.

Werner
[import]uid: 12222 topic_id: 3622 reply_id: 13324[/import]

Hi wernerramaekers… I just tried downloading it… worked fine for me. Want to try again? If it still doesn’t work, email me at David at ElectricEggplant.com and I can just email it to you (7.1MB). [import]uid: 9905 topic_id: 3622 reply_id: 13367[/import]

Hi David,

It works again and I was able to download the zip.

Works great, thanks for sharing. I hope to contribute to the Corona community soon.

Werner [import]uid: 12222 topic_id: 3622 reply_id: 13382[/import]

wernerramaekers - excellent! Glad it can help and looking forward to your contributions. [import]uid: 9905 topic_id: 3622 reply_id: 13384[/import]

What encoding should the CAF files be? Signed 16-bit, for example, or something different?

Cheers
Chris [import]uid: 9457 topic_id: 3622 reply_id: 14985[/import]

Chris – yes, I just used standard 16 bit AIFF files and renamed them as CAF. Note now, though, that with the new sound routines in the current version, you’re not stuck with CAF. You can go with any of the formats supported:
http://developer.anscamobile.com/partner/audionotes [import]uid: 9905 topic_id: 3622 reply_id: 15090[/import]