Saturday, December 4, 2010

Google acquires Phonetic Arts

Google just announced that they have acquired Cambridge startup Phonetic Arts. I am very excited and happy for my friends there. It is great to see another Cambridge-based speech company being recognized for its great work (after Entropic and, more recently, Spinvox). Maybe I should have stayed in England after all.

For most of my career I have worked on speech recognition (a.k.a. speech-to-text), but recently, now that speech recognition has become mainstream, I have become convinced that there is a lot of untapped potential in Text-to-Speech (TTS). I look forward to seeing what the PA team will come up with now that they have the greater resources of Google at their disposal.

Speech Recognition for mobile apps

Over the last year I have been working on bringing our Dragon speech recognition to mobile apps on iPhone and Android. We built a super-simple SDK that allows developers to add speech to their app in 5 minutes (drag the framework into Xcode and add 3 function calls). We are working with a lot of large and small app developers, and it's cool to see all the interesting ideas people come up with. The three biggest apps that have launched so far are:

- Siri Assistant (recently acquired by Apple): Book tables in restaurants, find movies, look up trivia on the web, all by voice.
- Amazon Price Check: A great way to quickly check prices on Amazon. You can use voice, scan a barcode or take a picture to find products.
- Ask: Focuses on questions and answers. The app lets you speak a query and dictate answers by voice.

Check them out on the app store and stay tuned for more interesting apps coming soon.

Monday, December 7, 2009

Dragon Dictation for iPhone

After many months of development I am really glad to finally see our Dragon Dictation app available for download on the iTunes App Store. The app allows you to dictate messages on your iPhone and send them via email/SMS or cut and paste them into any other app. We use the same recognition engine as Dragon NaturallySpeaking. See the demo video my boss made:

Give the app a try. If you like it, just imagine how cool it would be if this was built into the iPhone OS and you could dictate anywhere you can type with the virtual keyboard...

Saturday, December 5, 2009

Why Apple acquired the Lala team

I never really paid much attention to the online music space, but Apple's acquisition of Lala made me look around. Everyone assumes that Lala's deals with the record studios and Google will self-destruct due to change-of-control clauses in those contracts. So why would Apple buy them?

Just compare the sharing/embedding features offered by iTunes and Lala and think about which version you are more likely to use on Twitter or embed on your blog. Clearly the Lala engineering team thought about how to do this, whereas the iTunes version is a typical implementation by a large company that felt it needed to check the "social media integration" checkbox. I am looking forward to seeing what these guys will do inside Apple.



Sunday, November 1, 2009

Mobile turn-by-turn navigation gets interesting

As predicted, Google has announced their own free turn-by-turn navigation app. Clearly Google has been working hard to integrate various map data sources, from government data (like the US census TIGER data) to user-supplied data and Google's Streetview data. I am really not a big fan of companies like TeleAtlas and Navteq, who controlled all the map data for decades, but I don't like Google's approach of exploiting user-supplied data and destroying businesses just because they can. Daniel Lyons sums it up very nicely.

The voice search looks pretty good but at this point plenty of companies have that technology. The TTS doesn't sound so hot.

I really hope that Apple will come up with their own solution, maybe based on OpenStreetMap and Cloudmade, especially since Google's is a US-only solution at this point.

Saturday, August 8, 2009

Gmail push notification on iPhone

This morning I saw that an iPhone app has appeared on the App Store that provides iPhone push notifications for Gmail. After looking into this a bit, it looks like the GPush app sends your username and password to the GPush server, which then keeps an IMAP IDLE connection to Gmail open on your behalf. Whenever Gmail signals a new email to the GPush server, GPush triggers an iPhone push notification via the Apple Push Notification service (APNS).

I really don't like the idea of some unknown company having my Gmail credentials and being logged into my Gmail account all the time, so I built a proof-of-concept of the same thing that I can run on my own server. I used the following code:

Python imaplib2 from

example IMAP IDLE server from

APNS code based on examples from Apple iPhone developer forum postings

You can find the proof-of-concept code for client and server on github

To try this, you need an iPhone developer account, a Push Notification provisioning profile, and a server that runs Python 2.6.
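
To illustrate the two protocol pieces involved, here is a minimal sketch. It is not the proof-of-concept code from the repository, and it is written for Python 3 rather than 2.6. It assumes the server spots new mail by watching for the untagged `EXISTS` response that an IMAP server sends during IDLE, and that the notification is packed in the simple binary format of the original APNS provider interface (command byte 0, token length, 32-byte device token, payload length, JSON payload of at most 256 bytes), which Apple has since retired in favor of an HTTP/2 API. The function names and the sample device token are made up for illustration; opening the TLS connections to Gmail and to Apple is left out.

```python
import json
import re
import struct

# Payload size limit of the legacy binary APNS interface (assumed here).
MAX_PAYLOAD_BYTES = 256

# While idling, an IMAP server announces a changed mailbox size with an
# untagged response such as b"* 23 EXISTS".
_EXISTS_RE = re.compile(rb"^\* (\d+) EXISTS\s*$")


def message_count_from_idle_line(line):
    """Return the message count if `line` is an EXISTS response, else None."""
    match = _EXISTS_RE.match(line)
    return int(match.group(1)) if match else None


def build_apns_packet(device_token_hex, alert, badge=None):
    """Pack one notification in the legacy 'simple' binary format.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    token = bytes.fromhex(device_token_hex)
    if len(token) != 32:
        raise ValueError("device token must be 32 bytes (64 hex digits)")

    aps = {"alert": alert}
    if badge is not None:
        aps["badge"] = badge
    payload = json.dumps({"aps": aps}, separators=(",", ":")).encode("utf-8")
    if len(payload) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds %d bytes" % MAX_PAYLOAD_BYTES)

    # Network byte order: command (1 byte), token length (2 bytes),
    # token (32 bytes), payload length (2 bytes), then the payload itself.
    header = struct.pack("!BH32sH", 0, len(token), token, len(payload))
    return header + payload


if __name__ == "__main__":
    # Pretend the IDLE connection just delivered this line from Gmail.
    count = message_count_from_idle_line(b"* 23 EXISTS\r\n")
    if count is not None:
        packet = build_apns_packet("ab" * 32, "New mail in Gmail", badge=count)
        print(len(packet), "bytes ready to write to the APNS socket")
```

Running your own copy of this loop means your Gmail password only ever lives on a server you control; the price is that you have to keep that server and its push certificate alive yourself.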

Wednesday, July 1, 2009

iPhone Voice Control primetime TV ad

Apple has started showing ads for iPhone Voice Control on US primetime TV. For my colleagues and me this has been a dream for a long time: a handset OEM advertising speech recognition as a standout feature!

The ad is not too different from the one we shot two years ago. :-)