Just about a year ago I questioned the opinion of many analysts and bloggers who suggested that Siri was losing the personal digital assistant battle to Google Assistant and Alexa. I reasoned that Apple didn’t care about that ‘battle’; instead, it only needed to improve Siri enough to sell more of its devices. If only Apple made the right investments in Siri, that is, strengthened its apparently weak, some may say funny, ability to understand natural language, and fundamentally improved its Natural Language Processing (NLP).

It turns out Apple didn’t read my post ☺. Siri seems worse in some respects. For all of Apple’s R&D investment in 2018, I continue to be flummoxed when I ask Siri for an estimated time of arrival while I’m driving and it asks me to look at the screen. ☹ You know I’m driving, why are you asking me to look at the screen?

In the meantime, Apple launched the HomePod in February of 2018, and two features stood out: the sound quality and Siri. I decided before the holidays to give it a try, and indeed what a sound, wow! Not as good as my smallish Polk stereo speakers, but way better than most smart speakers and as good as or better than the more expensive Google Home Max. As for the virtual assistant functionality, i.e., Siri, that’s another story.

The first drawback I found is that the only way to control the speaker is through Siri. I could, of course, control it through my iPhone, but it would have been nice to untangle the two so that the music keeps playing when I leave the house with my iPhone. So, because I want the HomePod to play all the time regardless of where my iPhone is, I have been negotiating with Siri about what I want it to play.
I first created my playlist and picked a title that I thought would be unique enough for Siri to understand. To my surprise, it could not, regardless of how slowly or clearly I enunciated. It was not a speech recognition issue; I figured it was the same old NLP weakness and that Siri hasn’t improved at all. With the HomePod, though, understanding natural language should be much easier; Apple, I guess, just decided to use Siri “as is” instead of looking at the big picture. I just want to play music; how hard can it be to understand a playlist title?

I simply could not get Siri to play my playlist; it kept wanting to play something else. I then changed the playlist title, making sure it had at least one keyword that none of the other playlists or albums on Apple Music had. Siri still couldn’t do it and kept playing one of the other playlists on Apple Music. I finally figured it out. My playlist title had three keywords, and the Apple Music playlists that Siri kept playing had two; two of the three keywords in my title were the same as the ones on Apple Music. The moment I changed my playlist title to two unique keywords, I was finally able to play my playlist. Whew! It should have been a simple probabilistic problem for Siri to solve: the keywords in my title matched all three spoken keywords (a 100% match), while only two of the keywords in the other playlists matched the three spoken words (a 67% match); yet Siri kept choosing the weaker match.

Most likely Siri, being a cloud service, uses the same flawed NLP model regardless of how you’re interacting with it. As in the example of using Siri while driving, the device and context should give Siri the data it needs to be smarter. At the risk of sounding anachronistic, there are well-known query-answer rule-based systems that work quite well with simple statistical scoring. I may be oversimplifying the HomePod use case, but for playing music, which, given its beautiful sound, I believe is what Apple intends it to be used for mostly, at least for now, why doesn’t Apple forgo Siri’s current algorithms and use a much more straightforward and predictable one for Music? Statistical algorithms can get quite sophisticated too, computing bi-directional similarity scores between titles and query, accounting for keyword weights, and other variables, but even then they will produce more predictable results and a better user experience (a rough sketch of this kind of scoring appears at the end of this post). Furthermore, Apple Music already has an excellent, actively maintained taxonomy to start with, which a rule-based system could use to implement inference logic and ‘look very smart.’ All of this would be enormously easier and cheaper for Apple to implement. They could divert a tiny fraction of the Siri budget to buy a company that does just that. All of us music fans would be grateful for the improved user experience.

Sometimes we technologists want to use the most leading-edge technology or system to solve a problem, or to generalize the underlying method or algorithm so that it addresses all possible scenarios. I have preached this too when I knew it would work, but also to never lose track of the ‘big picture’ and the problem we’re trying to solve. For this particular case, the HomePod playing music, and for the sake of the HomePod’s success, a more straightforward and traditional rule-based system is the better approach. NLP is still Siri’s Achilles’ heel, and unless Apple knows how to fix it for all its intended uses, it’s time to focus on the specific use cases.
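Since I brought up keyword matching and similarity scoring, here is a minimal sketch, in Swift, of the kind of straightforward statistical scoring I have in mind: split the spoken request and each candidate title into keywords, compute the overlap in both directions, and pick the best-scoring title. The playlist names and the exact formula are my own illustration, not anything Siri or Apple Music actually implements.

```swift
import Foundation

// Splits a phrase into lowercase keywords.
func keywords(_ phrase: String) -> Set<String> {
    return Set(phrase.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty })
}

// Crude bi-directional similarity: the fraction of query keywords found in
// the title, averaged with the fraction of title keywords found in the query.
func score(query: String, title: String) -> Double {
    let q = keywords(query), t = keywords(title)
    guard !q.isEmpty, !t.isEmpty else { return 0 }
    let overlap = Double(q.intersection(t).count)
    return (overlap / Double(q.count) + overlap / Double(t.count)) / 2
}

// Hypothetical example: my three-keyword playlist versus an Apple Music
// playlist that shares two of those keywords.
let spoken = "morning jazz favorites"
let candidates = ["Morning Jazz Favorites", "Morning Jazz"]
let best = candidates.max { score(query: spoken, title: $0) < score(query: spoken, title: $1) }
print(best ?? "no match")   // prints "Morning Jazz Favorites", the exact match wins
```

With these illustrative titles, the exact-match playlist scores 1.0 and the two-keyword playlist scores about 0.83, so the scorer picks my playlist, which is precisely the behavior I expected from Siri.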