Apple looks to geolocation for enhanced speech recognition

Posted in General Discussion · edited January 2014
As the Siri virtual assistant stands to gain hordes of new users around the world with the iPhone 5s and 5c, Apple is looking to improve voice-recognition accuracy through geolocation and localized language models.

Source: USPTO


Apple's iOS devices already boast a number of voice-recognizing input features, including Siri and speech-to-text, but worldwide availability makes accurate representation of some local dialects and languages an issue.

Looking to improve the situation, Apple is investigating ways to integrate location data with language modeling to create a hybrid system that can better understand a variety of tongues. The method is outlined in a patent filing published by the U.S. Patent and Trademark Office on Thursday, titled "Automatic input signal recognition using location based language modeling."

Apple proposes that a number of local language models can be constructed for a desired service area. A form of such a system is already in use with Siri's language selection, which allows users to choose from various models, like English (United States) and English (United Kingdom).

However, the method can have the opposite effect and further complicate recognition. Apple explains:
That is, input signals that are not unique to a particular region may be improperly recognized as a local word sequence because the language model weights local word sequences more heavily. Additionally, such a solution only considers one geographic region, which can still produce inaccurate results if the location is close to the border of the geographic region and the input signal corresponds to a word sequence that is unique in the neighboring geographic region.
Apple's new invention hybridizes local language models by weighting them according to location and speech input, then merging them with other localized or global models. Global models capture general language properties and high-probability word strings commonly used by native speakers.

In some embodiments, the local language model is first identified by geography, which is governed by service location thresholds. This first model is merged with the global version of the language and compared against input words or phrases that are statistically more likely to occur in the specified region.

Location information can be used to pick out word sequences that have a low likelihood of occurrence globally, but a higher one in a certain location. The document offers the phrase "goat hill" as an example. The input may have a low probability of being spoken globally, in which case the system may determine the speaker is saying "good will." However, if geolocation is integrated, the system may recognize that a nearby store is called Goat Hill, leading it to determine that input is the more likely word string.
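In rough terms, the weighting the filing describes resembles a linear interpolation of a global model with a location-weighted local one. The sketch below is purely illustrative - the phrases, probabilities, and weights are invented, and this is not Apple's actual method:

```python
def hybrid_score(phrase, global_probs, local_probs, local_weight):
    """Interpolate global and local phrase probabilities.

    local_weight in [0, 1] reflects confidence that the speaker is
    inside the local model's region (e.g. derived from GPS).
    """
    p_global = global_probs.get(phrase, 0.0)
    p_local = local_probs.get(phrase, 0.0)
    return (1 - local_weight) * p_global + local_weight * p_local

# Globally, "good will" is far more common than "goat hill"...
global_probs = {"good will": 0.010, "goat hill": 0.0001}
# ...but near a store named Goat Hill, the local model boosts it.
local_probs = {"good will": 0.010, "goat hill": 0.050}

def best_candidate(candidates, local_weight):
    return max(candidates,
               key=lambda p: hybrid_score(p, global_probs, local_probs,
                                          local_weight))

# With no location signal the global model dominates ("good will" wins);
# near the hypothetical Goat Hill store, the local model wins out.
no_location = best_candidate(["good will", "goat hill"], local_weight=0.0)
near_store = best_candidate(["good will", "goat hill"], local_weight=0.8)
```

With `local_weight=0.0` the recognizer falls back to plain global behavior, which is how such a system could degrade gracefully when no location fix is available.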



Location data can be gathered via GPS, cell tower triangulation, and other similar methods. Alternatively, a user can manually enter a location into a supported device. The system's language assets include databases, recognition modules, a local language model selector, a hybrid language model builder, and a recognition engine.

Combining the location data with local language models involves a "centroid," or a predefined focal point for a given region. Examples of the so-called centroid can be an address, building, town hall, or even the geographic center of a city. When the thresholds surrounding centroids overlap, "tiebreaker policies" can be implemented to weight one local language model higher than another, creating the hybrid language model.
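The centroid-and-threshold selection could be sketched roughly as follows. The region names, coordinates, radii, and the priority-based tiebreaker are all hypothetical stand-ins for the patent's "tiebreaker policies":

```python
import math

def distance(a, b):
    # Euclidean distance on a flat plane; a real system would use
    # great-circle (haversine) distance on latitude/longitude.
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical regions: each has a centroid and a service-area threshold.
regions = [
    {"name": "Oakland", "centroid": (0.0, 0.0), "radius": 10.0, "priority": 2},
    {"name": "Berkeley", "centroid": (8.0, 0.0), "radius": 10.0, "priority": 1},
]

def select_local_model(location):
    """Pick the local language model whose centroid threshold contains
    the device location. Overlapping regions are resolved by a simple
    tiebreaker: nearer centroid first, then higher priority."""
    candidates = [r for r in regions
                  if distance(location, r["centroid"]) <= r["radius"]]
    if not candidates:
        return None  # fall back to the global model alone
    candidates.sort(key=lambda r: (distance(location, r["centroid"]),
                                   -r["priority"]))
    return candidates[0]["name"]
```

A point midway between two overlapping centroids, such as `(4.0, 0.0)` here, is equidistant from both, so the priority tiebreaker decides which local model gets the heavier weight.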

Illustration of overlapping regions and centroids.


While it is unknown if Apple will one day use the system in its iOS product line, current technology does allow for such a method to be implemented. Cellular data can be leveraged for database duties, while on-board sensors and processors would handle location gathering, language recognition and analysis, and hybrid model output.

Apple's location-based speech recognition patent application was first filed in 2012 and credits Hong M. Chen as its inventor.

Comments

  • Reply 1 of 13
    gatorguy Posts: 24,213 member
    In a quick read it doesn't seem to be all that different from this already issued patent.
    https://www.google.com/patents/US8219384

    IMHO It's unfortunate that the USPTO allows a patent for something like this anyway.
  • Reply 2 of 13
    mjtomlin Posts: 2,673 member
    Quote:
    Originally Posted by Gatorguy View Post

    In a quick read it doesn't seem to be all that different from this already issued patent.
    https://www.google.com/patents/US8219384

    IMHO It's unfortunate that the USPTO allows a patent for something like this anyway.

    Yes, they are very similar in the abstract, but that's not what gets patented. It's the implementation and method (the claims) that do, and in this example they may be unique enough to allow for two separate patents.

    And why shouldn't it be patented? If you work hard to come up with a way to carry this out, why can't you protect it?

  • Reply 3 of 13
    mjtomlin Posts: 2,673 member
    Double post.

  • Reply 4 of 13
    Marvin Posts: 15,322 moderator
    mjtomlin wrote: »
    And why shouldn't it be patented? If you work hard to come up with a way to carry this out, why can't you protect it?

    He doesn't like it when Apple patents things because it means his best bud Google can't copy it without fear of being sued. Every patent Apple applies for, it's always 'oh they don't have it yet, we'll see if they get it, they shouldn't because it's obvious or oh, look someone else has a patent that looks the same', as if anybody cares one way or the other. If Google does it first, that's ok though because they are perfect in every way and have never sued anybody, not even through a subsidiary. You'll also notice this thread has nothing to do with Google - a certain someone apparently never brings up Google, Google, Google, Google, Google, Google, Google, Google unless it's explicitly brought up by someone else.
  • Reply 5 of 13
    gatorguy Posts: 24,213 member
    mjtomlin wrote: »
    Yes, they are very similar in the abstract, but that's not what gets patented. It's the implementation and method (the claims) that do, and in this example they may be unique enough to allow for two separate patents.

    And why shouldn't it be patented? If you work hard to come up with a way to carry this out, why can't you protect it?

    That's an entirely different and way too involved discussion for here. There are lots of rational arguments for treating software or mathematical algorithms differently than real property, which has been patentable for hundreds of years. In a nutshell, though, regarding this specific patent application: there are so many similar patents for the same general idea, dating back 20 years or more, that it hardly qualifies as "inventive" IMO. Just more clutter in the patent system.
  • Reply 6 of 13
    gatorguy Posts: 24,213 member
    Marvin wrote: »
    He doesn't like it when Apple patents things because it means his best bud Google can't copy it without fear of being sued.

    Thanks for speaking for me, but if you really read my comment, it was that not even Google should have received a patent for it, IMO. Nor Micron. Nor IBM. Nor Avaya.

    EDIT: Of all those, the Google one just happened to be the closest match to Apple's application and thus the most pertinent. If Avaya's had been the closer match, I would have linked it instead. The name on the patent isn't what's important. It's that changing a detail or two shouldn't qualify for a patent as though it's truly new and inventive. But it doesn't change what I originally said anyway.

    "IMHO It's unfortunate that the USPTO allows a patent for something like this anyway."
  • Reply 7 of 13

    "ya awl is on mah propty... I gahn ta shu you"

    Tennessee: "you are on my property... I am going to shoot you"

    Texas: "Your oil is on my property... I am going to sue you"

  • Reply 8 of 13

    When I read this, my first thought was a "bubba" iPhone: camouflage outer case, stuffed dead animal for wallpaper, Larry the Cable Guy as the voice of Siri, Fox News replacing all other news apps, and instead of an Apple logo on the back, the GOP elephant. Should sell by the million in the South. FYI, I live in Dallas.

  • Reply 9 of 13
    This can't come soon enough.
    My native language is Spanish, but living in Montevideo, the Spanish of the Río de la Plata is way different from that spoken in places like Mexico or Chile, not to mention Spain itself.

    It's easier for me to just use English instead of forcing my Spanish to sound Mexican.
  • Reply 10 of 13
    Quote:

    Originally Posted by Gatorguy


    IMHO It's unfortunate that the USPTO allows a patent for something like this anyway.


    I agree.

    I don't pretend to understand the US patent system, but I'd like to think that this would fail in the UK on the grounds of being 'obvious' to a practitioner in the field. Pronunciation varies by region (usually called dialects), and regions do not have sharp delineation but blend into each other (people and TV programmes move around unless you build a Berlin Wall). So we apply some weighting (fuzzy logic) to the recognition data. I don't think it's a deep insight (unlike particular implementations, which might well be).

    "Just more clutter in the patent system" ...that costs everyone and possibly prevents the small guys from playing at all.

    And in the north of England, Goat Hill becomes go t'hill - go to the hill  :)

  • Reply 11 of 13
    ajmas Posts: 601 member
    This may be okay as a default setup, but hopefully it would be easy to override. Not everyone who lives in a given area is from that location, and so does not necessarily have a local accent. This is even more true of people who travel or are expats.

    This reminds me of another thing that gets on my nerves: you are logged into a website that is in English in your home country, for example, and you have specified English as your native language in your site profile, but the site insists on being smart and using the local language for your geographic location - so why did I specify my language in my profile? For me, this is an example of a technological solution trying to be too smart and failing because of it. Hopefully Apple doesn't fall into this mentality here.
  • Reply 12 of 13
    mcdave Posts: 1,927 member
    As an expat, I hear you. Though sometimes this works in my favour, as Siri can't understand the Kiwis. Good as a default, but it needs to allow an override.
    ajmas wrote: »
    This may be okay as a default setup, but hopefully it would be easy to override. Not everyone who lives in a given area is from that location, and so does not necessarily have a local accent. This is even more true of people who travel or are expats.

    This reminds me of another thing that gets on my nerves: you are logged into a website that is in English in your home country, for example, and you have specified English as your native language in your site profile, but the site insists on being smart and using the local language for your geographic location - so why did I specify my language in my profile? For me, this is an example of a technological solution trying to be too smart and failing because of it. Hopefully Apple doesn't fall into this mentality here.
  • Reply 13 of 13

    As everyone knows, this is the era of globalization and the people of the world are connected with each other, so the geolocation idea would be great.
