#1 |
|
Kasper's Automated Slave
Join Date: Nov 1997
Posts: 6,159
|
Apple proposes acoustic separation for iPhone conference calls
Apple, in the second of two interesting patent filings revealed this week, discusses techniques for improving the iPhone's ability to serve as a multi-party communication environment, in which participants on conference calls can be assigned to virtual positions in order to improve clarity.
The technique is particularly suited for communication devices with at least two speakers available for audio output, such as an iPhone with a connected pair of earphones or a headset. When a conference call is initiated, participants would be presented with a graphical user interface on the iPhone for managing the virtual locations of the participants.
"The visual indication for at least one of the participants can be assigned to a different one of the visually distinct regions, thereby causing an audio sound associated with the participant to be spatially adapted to originate from a virtual location corresponding to the visually distinct region," Apple said in the filing. "To assist the user of the device in determining and distinguishing the different participants in the multi-party call, directional audio processing can be utilized so that the different sources of audio for the call can be directionally placed in a particular location with respect to the headset. As a result, the user of the device hears the other participants in the multi-party call as sound sources originating from different locations."
In one implementation, Apple said, the assignment to the default positions is automatic, based either on the participants' geographic positions or on the order in which the participants joined the multi-party call.
"Next, a participant position screen is displayed," Apple continued with its explanation. "The participant position screen can enable a user to alter the position of one or more of the participants to the multi-party call. Here, the participant position screen is displayed such that a user of the portable communication device can manipulate or otherwise cause one or more of the positions associated with the participants to be changed. In doing so, the user, in one embodiment, can cause the physical movement of a representation of a participant on the participant position screen. Here, a decision determines whether a reposition request has been made. When the decision determines that a reposition request has been made, the associated participant is moved to the specified position."
All the participants on an iPhone conference call could also share media items such as "songs, albums, audiobooks, playlists, movies, music videos, photos, computer games, podcasts, audio and/or video presentations, news reports, and sports updates." In particular, the patent filing contains considerable discussion of multi-party voice calls with concurrent audio playback.
"One aspect of the invention pertains to a wireless system that supports both wireless communications and media playback," Apple said. "The wireless communications and the media playback can be concurrently supported. Consequently, a user is able to not only participate in a voice call but also hear audio playback at the same time."
In such instances, another graphical user interface would be presented on the iPhone's screen to allow each user to "blend" the two audio sources to their individual liking, independent of one another.
"The display screen includes a blend control. The blend control allows a user of the portable electronic device to alter the blend (or mixture) of audio from audio playback and audio from a voice call. [...] The blend control includes a slider that can be manipulated by a user towards either an audio end or a call end. As the slider is moved towards the audio end, the audio playback output gets proportionately greater than the voice call output. On the other hand, when the slider is moved towards the call end, the voice call output gets proportionally greater than the audio playback output. For example, the position of the slider can represent a mixture of the audio playback output and the voice call output with each amplified similarly so that the mixture is approximately 50% audio."
"The audio for each can be altered such that the audio from the incoming call and the audio from the media playback are perceived by a listener (when output to a pair of speakers, either internal or external) as originating from different virtual locations. The different virtual locations can be default positions or user-specified (during playback or in advance). [...] The sender or recipient of the audio sounds pertaining to a media item can be permitted to separately control the volume or amplitude of the audio sounds pertaining to the media item. As a result, the mixture or blend of the audio sounds pertaining to media items as compared to audio sounds pertaining to the voice call can be individually or relatively controlled."
The September 2006 filing, titled "Audio processing for improved user experience," is credited to Apple employees Michael Lee and Derek Barrentine. |
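As a rough illustration of the blend control quoted above, the sketch below mixes a media track and a call track according to a single slider value. The filing as quoted does not specify a gain law, so the linear crossfade, the function name `blend`, and the 0.0 to 1.0 slider range are all assumptions made for this example.

```python
def blend(media, call, slider):
    """Mix two equal-length sample sequences with one slider.

    slider = 0.0 is the "audio end" (media playback only),
    slider = 1.0 is the "call end" (voice call only),
    slider = 0.5 gives both sources the same gain.
    The linear gain law is an assumption, not taken from the filing.
    """
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be between 0.0 and 1.0")
    if len(media) != len(call):
        raise ValueError("media and call must have the same length")
    media_gain = 1.0 - slider  # grows as the slider moves to the audio end
    call_gain = slider         # grows as the slider moves to the call end
    return [media_gain * m + call_gain * c for m, c in zip(media, call)]


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to make the gains visible.
    print(blend([1.0, 1.0], [0.0, 0.0], 0.25))
```

With `slider=0.5` both sources get the same gain, which corresponds to the roughly even mixture mentioned in the quote.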
|
|
|
|
|
#2 |
|
Registered User
Join Date: May 2005
Posts: 8,456
|
Must've come out of their work with 3D stereo separation with Soundtrack/Final Cut Pro or Steve's movie involvement. This would be interesting applied to stereo audio over a phone call. Very interesting. I wonder what kind of bandwidth/data rate is needed for effective spatial audio over a phone?
"The natural progress of things is for liberty to yield, and government to gain ground."
—Thomas Jefferson Proud AAPL stock owner. |
|
|
|
|
|
#3 |
|
Registered User
Join Date: Apr 2006
Location: The Ansible
Posts: 11,854
|
Looks like Apple is utilizing their strengths to take another chunk out of corporate. As much as I hate the cost and, IMO, the pointlessness of conference calls, they are very popular. This method could be done on the cheap using 3G and WiFi with ease and perhaps even allow for simple keynote presentations and images to be sent to the device like with iChat A/V.
Last edited by solipsism; 04-10-2008 at 03:14 PM.. |
|
|
|
|
|
#4 |
|
Registered User
Join Date: Dec 2006
Location: dit doe
Posts: 733
|
iDucker, iDe-esser, iLimiter, iSurroundMixer
|
|
|
|
|
|
#5 |
|
Registered User
Join Date: Jun 2005
Location: Philadelphia
Posts: 472
|
Jesus, did some kid scrawl on a few napkins when they came up with this patent?
That being said, the best just keeps on (potentially) getting better. |
|
|
|
|
|
#6 | |
|
Registered User
Join Date: Jun 2005
Location: Philadelphia
Posts: 472
|
Quote:
|
|
|
|
|
|
|
#7 |
|
Registered User
Join Date: Jun 2003
Location: Tinton Falls, NJ
Posts: 702
|
Yes! Yes! OMG YES FTW!
As someone who spends a couple hours a day in audio conferences, lack of positional audio is a huge, huge frustration. It makes a lot of conversations turn into an unintelligible jumble. Giving each member a position is a great first step, but I'd love to see stereo/surround microphones specially built for audioconferencing and a protocol to match. |
|
|
|
|
|
#8 | |
|
Privileges Revoked
Join Date: Aug 2006
Posts: 1,890
|
Quote:
|
|
|
|
|
|
|
#9 |
|
Registered User
Join Date: Nov 2004
Location: The kool-aid stand...
Posts: 2,188
|
OmniGraffle?
Hardcore.
|
|
|
|
|
|
#10 |
|
Registered User
Join Date: Mar 2006
Posts: 154
|
There's nothing sexy about conference calls.
|
|
|
|
|
|
#11 |
|
Registered User
Join Date: Oct 2007
Location: Los Angeles, Kahleefornyah
Posts: 226
|
|
|
|
|
|
|
#12 |
|
Registered User
Join Date: Oct 2007
Location: Los Angeles, Kahleefornyah
Posts: 226
|
|
|
|
|
|
|
#13 |
|
Registered User
Join Date: Feb 2006
Posts: 6
|
Cool enough, but...
Not patentable. R & D from all major telco and cell phone companies has been ongoing for years.
The ARM is fully capable of positional audio, and many companies already provide 3D audio optimized for ARM (and other) processors, including QSound and Beatnik. Someone asked about bandwidth requirements... there are no extra bandwidth requirements for 3D audio. All you need is two speakers and a position, and the filters do the rest.
Gregor |
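To illustrate the point about bandwidth: positioning can be applied on the handset to an ordinary mono stream, so nothing extra has to be transmitted. The sketch below uses plain constant-power panning, which is a much simpler stand-in for the filtering that 3D audio engines apply. The name `pan_mono` and its azimuth convention are hypothetical, chosen only for this example.

```python
import math


def pan_mono(samples, azimuth_deg):
    """Place a mono signal between two speakers with constant-power panning.

    azimuth_deg runs from -90 (hard left) through 0 (centre) to +90 (hard right).
    Returns a (left, right) pair of sample lists.
    This is level panning only; it does not model the ear-shaped filtering
    needed for above/below or behind placement.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between -90 and 90")
    # Map the azimuth onto a quarter circle so left^2 + right^2 stays 1.
    theta = math.radians((azimuth_deg + 90.0) / 2.0)
    left_gain = math.cos(theta)
    right_gain = math.sin(theta)
    left = [left_gain * s for s in samples]
    right = [right_gain * s for s in samples]
    return left, right


if __name__ == "__main__":
    # The same mono samples placed 45 degrees to the right.
    print(pan_mono([1.0, 0.5, -0.5], 45.0))
```

The input is the same single channel a normal call delivers; only the position is extra, and that lives on the device.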
|
|
|
|
|
#14 | |
|
Registered User
Join Date: May 2002
Posts: 834
|
Quote:
|
|
|
|
|
|
|
#15 | |
|
Privileges Revoked
Join Date: Aug 2006
Posts: 1,890
|
Quote:
|
|
|
|
|
|
|
#16 |
|
Registered User
Join Date: Feb 2006
Posts: 6
|
Well, it's a bit more complex than simple balance control per person. Sound can actually be placed behind a person (although not perfectly), but directly above and below works quite well.
|
|
|
|
|
|
#17 | |
|
Registered User
Join Date: Feb 2006
Posts: 6
|
Quote:
Gregor |
|
|
|
|
|
|
#18 |
|
Registered User
Join Date: Jan 2002
Posts: 277
|
Really? How many ears do you have?
It's quite possible to do 3D sound with two audio sources. Clever frequency and harmonic processing will give very good 3D spatialization. Speakers will work OK, but headphones should work much better, as the processing then knows the sources are directly at your ears. This is a pretty cool idea. |
|
|
|
|
|
#19 |
|
Privileges Revoked
Join Date: Aug 2006
Posts: 1,890
|
Yea. I think most of these software patents are "obvious", but patents haven't been about protecting ideas for a long time.
|
|
|
|
|
|
#20 |
|
Registered User
Join Date: May 2002
Posts: 834
|
Hence the word "basically". The main point is that the iPhone has to be the aggregator of the conference call for this to be useful; otherwise it has no idea which sound to assign to which position.
|
|
|
|
|
|
#21 |
|
Registered User
Join Date: May 2005
Posts: 8,456
|
SRS simulates 3D audio with only 2 speakers, so be advised.
|
|
|
|
|
|
#22 |
|
Registered User
Join Date: Apr 2004
Posts: 271
|
I have been on far too many conference calls to count. I see no redeeming value with acoustic separation. I am listening for what is important in a conference call—not whether Betty or Bob are pleasantly acoustically separated. Business has fundamentals; this is just bordering on the ridiculous. Now, in an entertainment situation...that's an entirely different matter.
|
|
|
|
|
|
#23 |
|
Registered User
Join Date: Oct 2005
Posts: 614
|
Bye bye RIM
|
|
|
|
|
|
#24 | |
|
Registered User
Join Date: May 2002
Posts: 834
|
Quote:
|
|
|
|
|
|
|
#25 | |
|
Registered User
Join Date: Apr 2004
Posts: 271
|
Quote:
|
|
|
|
|
|
|
#26 | |
|
Registered User
Join Date: May 2002
Posts: 834
|
Quote:
I'm talking about in a 'corporate environment', and yes I'm familiar. That said, having this in 'stereo' on a mobile phone with headsets would still be VERY valuable. I am often on a conf. call in a lounge or similar where it is quiet enough to utilize what this type of approach would offer. But, again, current standard telephony would not allow this as it is a single channel of audio. (Along with other limitations such as frequency range, phase alignment, etc. I am aware of how '3D audio' works). |
|
|
|
|
|
|
#27 |
|
Registered User
Join Date: Oct 2007
Posts: 4
|
Don't know about anyone else, but before I saw the image showing the UI, I was thinking of a more TouchFLO-style interface for positioning the sources than the top-down diagram they have there.
Sort of like iChat, which I think has a sort of 3D layout when you share a presentation or document - imagine that, but with you and two other participants in a conference. One could be video, the other maybe a contact picture if it was voice only (or any combination of video/contact picture, obviously). Tap, hold and drag the video/picture to swap positions. The audio doesn't jump across or cut out as you drag, but is 3D-positioned from start to finish, a la Creative EAX, but maybe simpler and in software, and the overall appearance is like everyone is facing inwards in a triangle as you look into the screen (with all the usual TouchFLO Appley black reflect-i-ness going on around it).
Bit of a tangent, but I'm wondering whether current 3G infrastructure can even support a handset aggregating a video call with two others, either voice or video. Would the video participants have to receive half-resolution video on the downlink to fit two video channels into one call (assuming we're not talking about building two radios into the handset)? And how would you get two separate channels of audio coming down, as others have mentioned? I always thought GSM (and I guess 3G) conference calling was handled at the operator end and you always got the pre-mixed audio down a mono audio channel (assuming they can't go in and rewrite the audio codecs and GSM protocols at this point).
Who knows. Anyway, if not, they could always just do all this over WiFi instead - wonder if the SDK allows the necessary kind of access to do this, eh? |
|
|
|
|
|
#28 | |
|
Registered User
Join Date: Apr 2004
Posts: 271
|
Quote:
|
|
|
|
|
|
|
#29 | |
|
Registered User
Join Date: Jan 2007
Location: Vienna, VA
Posts: 214
|
Quote:
With decent stereo separation, it will be much easier to separate the voices - just like you can do in a face-to-face meeting. The really interesting thing here is going to be getting carriers involved. When you make a conference call over land lines, the sound from the various parties is mixed in the central office (or at a PBX or a conference bridging center). In that case, the phone won't be able to separate the streams and reposition them. If, however, you receive each party's sound as a separate data stream, then this system shouldn't be that hard to implement. I've already seen this feature in standalone video conferencing systems. (Doesn't iChat also do this to some extent when you have a multi-way video chat?) Does anyone know where the audio is mixed for GSM-based conference calls? If it's mixed at a centralized location, then I think this feature will require changes to the carrier's infrastructure in order to make it all work. |
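The difference between the two cases can be sketched as follows. If a bridge sums all parties into one mono signal, the handset receives a single source and has nothing left to separate; if each party arrives as its own stream, the handset can apply its own left/right gains per participant. The function names, the participant names, and the gain pairs below are illustrative assumptions, not anything described by a carrier or by the filing.

```python
def bridge_mix(streams):
    """Sum per-participant mono streams into one mono signal,
    the way a central bridge would before sending it to the handset."""
    return [sum(frame) for frame in zip(*streams.values())]


def handset_mix(streams, gains):
    """Mix separate per-participant streams into stereo on the handset.

    streams: dict of participant name -> list of mono samples
    gains:   dict of participant name -> (left_gain, right_gain)
    Returns a (left, right) pair of sample lists.
    """
    n = min((len(s) for s in streams.values()), default=0)
    left = [0.0] * n
    right = [0.0] * n
    for name, samples in streams.items():
        left_gain, right_gain = gains[name]
        for i in range(n):
            left[i] += left_gain * samples[i]
            right[i] += right_gain * samples[i]
    return left, right


if __name__ == "__main__":
    # Hypothetical two-party call: each party speaks in a different frame.
    streams = {"betty": [1.0, 0.0], "bob": [0.0, 1.0]}
    print(bridge_mix(streams))  # one channel, both voices merged
    print(handset_mix(streams, {"betty": (1.0, 0.0), "bob": (0.0, 1.0)}))
```

After `bridge_mix`, the per-participant information is gone, which is why the mixing location matters for this feature.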
|
|
|
|
|
|
#30 | |
|
Registered User
Join Date: Apr 2004
Posts: 271
|
Quote:
|
|
|
|
|
|
|
#31 | |
|
Registered User
Join Date: Feb 2007
Posts: 666
|
Quote:
Instead of "mano a mano" conference calls they will be "mono a mono". So much for that "stereo" effect one enjoys out of their earplugs/headsets. Alright, "mano a mano" is Spanish for hand to hand. What?! You wanted the Spanish version of "ear to ear"? "oido a oido" for you men or "oreja a oreja" for you women. It just wouldn't sound right "pun" wise. |
|
|
|
|
|
|
#32 |
|
Registered User
Join Date: Apr 2006
Location: The Ansible
Posts: 11,854
|
|
|
|
|
|
|
#33 |
|
Registered User
Join Date: Mar 2004
Location: Australia
Posts: 969
|
Sounds right to me. Only the aggregator gets to hear the separated voices... the others on the call get it all combined. Of course, if the iPhone is the only phone in the conference with stereo headset then who cares... but the long term plan would have to be to separate them out for all participants - much like an iChat video conference.
|
|
|
|
|
|
#34 | |
|
Registered User
Join Date: Jul 2007
Posts: 38
|
How does this hurt in any way?
Quote:
Btw, if this issue prevents a person from reading a 400+ book just to have a conference call, then more power to them! Additionally, a lot of conference calls are not even done with people from your own company. Are you gonna hang up on your client who is giving you half your business because he is not courteous? Also, basic etiquette does not help identify who is speaking when you are speaking to 3 or 4 complete strangers, whose voices possibly sound similar (very common especially in international calls). |
|
|
|
|
|
|
#35 |
|
Global Moderator
Join Date: Jun 2004
Location: .US
Posts: 9,127
|
That's a good suggestion for advanced civilizations that aren't based on naked apes in suits, but this is Earth here, that's the best we have. I doubt that you'll get a whole lot of the suited naked apes to go along with it.
|
|
|
|
|
|
#36 | |
|
Registered User
Join Date: Apr 2006
Location: The Ansible
Posts: 11,854
|
Quote:
BTW, how can something be in a suit and be naked at the same time? |
|
|
|
|
|
|
#37 |
|
Global Moderator
Join Date: Jun 2004
Location: .US
Posts: 9,127
|
|
|
|
|
|
|
#38 | |
|
Privileges Revoked
Join Date: Aug 2006
Posts: 1,890
|
Quote:
|
|
|
|
|
|
|
#39 |
|
Global Moderator
Join Date: Jun 2004
Location: .US
Posts: 9,127
|
That's not really true because of the psychoacoustics used. The audio spectrum can be adjusted because sounds that come from above and below sound different depending on how they hit our ears: the outer ears reflect and scatter sounds in different ways depending on position. The software mimics that effect to make it sound like it's "out there".
Last edited by JeffDM; 04-11-2008 at 11:12 AM.. |
|
|
|
|
|
#40 | |
|
Privileges Revoked
Join Date: Aug 2006
Posts: 1,890
|
Quote:
|
|
|
|
|