Apple proposes acoustic separation for iPhone conference calls

post #1 of 69
Thread Starter 
Apple, in the second of two interesting patent filings revealed this week, discusses techniques for improving the iPhone's ability to serve as a multi-party communication environment, in which participants on conference calls can be assigned to virtual positions in order to improve clarity.

The technique is particularly suited for communication devices with at least two speakers available for audio output, such as an iPhone with a connected pair of earphones or headset.

When a conference call is initiated, participants would be presented with a graphical user interface on the iPhone for use in managing the virtual locations for the plurality of participants.

"The visual indication for at least one of the participants can be assigned to a different one of the visually distinct regions, thereby causing an audio sound associated with the participant to be spatially adapted to originate from a virtual location corresponding to the visually distinct region," Apple said in the filing.

"To assist the user of the device in determining and distinguishing the different participants in the multi-party call, directional audio processing can be utilized so that the different sources of audio for the call can be directionally placed in a particular location with respect to the headset. As a result, the user of the device hears the other participants in the multi-party call as sound sources originating from different locations. "

In one implementation, Apple said the assignment to the default positions is automatic, based either on the participants' geographic positions or on the order in which the participants joined the multi-party call.
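
As a rough illustration of the join-order default described above, a minimal Python sketch might spread callers evenly across an arc in front of the listener. The function name `default_azimuths`, the 120-degree arc, and the sign convention (negative is left of center) are assumptions made for illustration; the filing as quoted here does not specify them.

```python
def default_azimuths(participants, spread_degrees=120.0):
    """Assign each participant a default azimuth in degrees, in join order.

    Hypothetical helper: negative values are left of center, positive are
    right. The first caller to join lands at the far left of the arc.
    """
    count = len(participants)
    if count == 0:
        return {}
    if count == 1:
        # A single caller sits straight ahead.
        return {participants[0]: 0.0}
    step = spread_degrees / (count - 1)
    start = -spread_degrees / 2.0
    return {name: start + i * step for i, name in enumerate(participants)}


if __name__ == "__main__":
    # Callers listed in the order they joined the call.
    print(default_azimuths(["Betty", "Bob", "Carol"]))
```

A user-driven reposition, as the filing goes on to describe, would then amount to overwriting one entry in the returned mapping.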

"Next, a participant position screen is displayed," Apple continued with its explanation. "The participant position screen can enable a user to alter the position of one or more of the participants to the multi-party call. Here, the participant position screen is displayed such that a user of the portable communication device can manipulate or otherwise cause one or more of the positions associated with the participants to be changed. In doing so, the user, in one embodiment, can cause the physical movement of a representation of a participant on the participant position screen. Here, a decision determines whether a reposition request has been made. When the decision determines that a reposition request has been made, the associated participant is moved to the specified position."

All the participants on an iPhone conference call could also share media items such as "songs, albums, audiobooks, playlists, movies, music videos, photos, computer games, podcasts, audio and/or video presentations, news reports, and sports updates."



In particular, the patent filing contains considerable discussion of multi-party voice calls with concurrent audio playback. "One aspect of the invention pertains to a wireless system that supports both wireless communications and media playback," Apple said. "The wireless communications and the media playback can be concurrently supported. Consequently, a user is able to not only participate in a voice call but also hear audio playback at the same time."

In such instances, another graphical user interface would be presented on the iPhone's screen to allow each user to "blend" the two audio sources to their individual liking, independent of one another.

"The display screen includes a blend control. The blend control allows a user of the portable electronic device to alter the blend (or mixture) of audio from audio playback and audio from a voice call. [...] The blend control includes a slider that can be manipulated by a user towards either an audio end or a call end. As the slider is moved towards the audio end, the audio playback output gets proportionately greater than the voice call output. On the other hand, when the slider is moved towards the call end, the voice call output gets proportionally greater than the audio playback output. For example, the position of the slider can represent a mixture of the audio playback output and the voice call output with each amplified similarly so that the mixture is approximately 50% audio."
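
The blend control described in the filing can be pictured as a crossfade between two audio sources. The Python sketch below is only one possible reading: it assumes a linear mapping with the slider at 0.0 for the call end and 1.0 for the audio end, and the names `blend_gains` and `blend` are hypothetical. The filing as quoted does not say which gain curve Apple would use.

```python
def blend_gains(slider):
    """Map a slider position to (playback_gain, call_gain).

    Assumed convention: 0.0 is the call end, 1.0 is the audio end,
    and 0.5 gives an even mixture of the two sources.
    """
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be between 0.0 and 1.0")
    return slider, 1.0 - slider


def blend(playback, call, slider):
    """Mix two equal-rate sample sequences according to the slider."""
    playback_gain, call_gain = blend_gains(slider)
    return [playback_gain * p + call_gain * c for p, c in zip(playback, call)]


if __name__ == "__main__":
    music = [1.0, 1.0, 1.0]
    voice = [0.0, 0.0, 0.0]
    # Slider a quarter of the way toward the audio end.
    print(blend(music, voice, 0.25))
```

Because each listener would hold their own slider value, the same two sources can be mixed differently on every handset, which matches the "independent of one another" behavior described above.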



"The audio for each can be altered such that the audio from the incoming call and the audio from the media playback are perceived by a listener (when output to a pair of speakers, either internal or external) as originating from different virtual locations. The different virtual locations can be default positions or user-specified (during playback or in advance). [...] The sender or recipient of the audio sounds pertaining to a media item can be permitted to separately control the volume or amplitude of the audio sounds pertaining to the media item. As a result, the mixture or blend of the audio sounds pertaining to media items as compared to audio sounds pertaining to the voice call can be individually or relatively controlled."

The September 2006 filing, titled "Audio processing for improved user experience," is credited to Apple employees Michael Lee and Derek Barrentine.
post #2 of 69
Must've come out of their work with 3D stereo separation with Soundtrack/Final Cut Pro or Steve's movie involvement. This would be interesting applied to stereo audio over a phone call. Very interesting. I wonder what kind of bandwidth/data rate is needed for effective spatial audio over a phone?

Proud AAPL stock owner.

 

GOA

post #3 of 69
Looks like Apple is utilizing their strengths to take another chunk out of corporate. As much as I hate the cost and, IMO, the pointlessness of conference calls they are very popular. This method could be done on the cheap using 3G and WiFi with ease and perhaps even allow for simple keynote presentations and images to be sent to the device like with iChat A/V.
Dick Applebaum on whether the iPad is a personal computer: "BTW, I am posting this from my iPad pc while sitting on the throne... personal enough for you?"
post #4 of 69
iDucker, iDe-esser, iLimiter, iSurroundMixer
post #5 of 69
Jesus, did some kid scrawl on a few napkins when they came up with this patent?

That being said, the best just keeps on (potentially) getting better.
post #6 of 69
Quote:
Originally Posted by solipsism View Post

Looks like Apple is utilizing their strengths to take another chick out of corporate. As much as I hate the cost and, IMO, the pointlessness of conference calls they are very popular. This method could be done on the cheap using 3G and WiFi with ease and perhaps even allow for simple keynote presentations and images to be sent to the device like with iChat A/V.

What chick would that be? Eliot Spitzer's Kristen?

post #7 of 69
Yes! Yes! OMG YES FTW!

As someone who spends a couple hours a day in audio conferences, lack of positional audio is a huge, huge frustration. It makes a lot of conversations turn into an unintelligible jumble. Giving each member a position is a great first step, but I'd love to see stereo/surround microphones specially built for audioconferencing and a protocol to match.
post #8 of 69
Quote:
Originally Posted by SpamSandwich View Post

Must've come out of their work with 3D stereo separation with Soundtrack/Final Cut Pro or Steve's movie involvement. This would be interesting applied to stereo audio over a phone call. Very interesting. I wonder what kind of bandwidth/data rate is needed for effective spatial audio over a phone?

Can't be all that 3D with only two speakers. At best it's 2D.
post #9 of 69
OmniGraffle?
Hard-Core.
post #10 of 69
There's nothing sexy about conference calls.
post #11 of 69
Quote:
Originally Posted by quinney View Post

iDucker, iDe-esser, iLimiter, iSurroundMixer


iCompressor, iDelay, iGate, iVerb
post #12 of 69
Quote:
Originally Posted by echosonic View Post

iCompressor, iDelay, iGate, iVerb

iPan, iGain, iChorus, iPitch
post #13 of 69
Not patentable. R & D from all major telco and cell phone companies has been ongoing for years.

The ARM is fully capable of positional audio, and many companies already provide 3d audio optimized for ARM (and other) processors including QSound and Beatnik.

Someone asked about bandwidth requirements...there are no extra bandwidth requirements for 3D audio. All you need is two speakers and a position and the filters do the rest.

Gregor
post #14 of 69
Quote:
Originally Posted by SpamSandwich View Post

Must've come out of their work with 3D stereo separation with Soundtrack/Final Cut Pro or Steve's movie involvement. This would be interesting applied to stereo audio over a phone call. Very interesting. I wonder what kind of bandwidth/data rate is needed for effective spatial audio over a phone?

As I understand this, no bandwidth/data rate changes are required. This is basically like assigning a different balance level to each source on the conference call. I assume this can only work if the iPhone is the aggregator of the callers - i.e. you call one person, put them on hold, call another, etc. Then it can assign each source to a different virtual location. If you are simply part of another conference call then I don't see how the iPhone could do anything about this, as it would have to 'recognize' voices or have a tag sent by the conference center each time a speaker changed.
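
The "different balance level per source" idea in the post above can be sketched as a constant-power pan applied to each caller's mono stream before summing into left and right channels. This is only a minimal Python illustration, assuming the handset receives each caller as a separate stream (the aggregator case described above). The names `pan_gains` and `mix_to_stereo` are hypothetical, and a real spatializer would likely use more elaborate filtering than a simple pan.

```python
import math


def pan_gains(pan):
    """Return (left_gain, right_gain) for pan in [-1.0, 1.0].

    -1.0 is hard left, 0.0 is center, +1.0 is hard right. The
    constant-power law keeps left^2 + right^2 equal to 1.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan onto 0..pi/2
    return math.cos(angle), math.sin(angle)


def mix_to_stereo(sources):
    """Mix (mono_samples, pan) pairs into one (left, right) pair of lists."""
    length = max((len(samples) for samples, _ in sources), default=0)
    left = [0.0] * length
    right = [0.0] * length
    for samples, pan in sources:
        left_gain, right_gain = pan_gains(pan)
        for i, sample in enumerate(samples):
            left[i] += left_gain * sample
            right[i] += right_gain * sample
    return left, right


if __name__ == "__main__":
    # Two callers: one placed to the left, one to the right.
    caller_a = ([0.2, 0.4, 0.1], -0.5)
    caller_b = ([0.3, 0.0, 0.3], 0.5)
    print(mix_to_stereo([caller_a, caller_b]))
```

If the voices arrive already summed into one mono channel, there is nothing left to pan separately, which is the limitation the post raises.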
post #15 of 69
Quote:
Originally Posted by gbrandt View Post

Not patentable. R & D from all major telco and cell phone companies has been ongoing for years.

The ARM is fully capable of positional audio, and many companies already provide 3d audio optimized for ARM (and other) processors including QSound and Beatnik.

Someone asked about bandwidth requirements...there are no extra bandwidth requirements for 3D audio. All you need is two speakers and a position and the filters do the rest.

Gregor

From what I understand from the picture they are patenting the concept and user interface that allows someone to manage conference calls this way.
post #16 of 69
Well, it's a bit more complex than simple balance control per person. Sound can actually be placed behind a person (although not perfectly) but directly above and below works quite well.
post #17 of 69
Quote:
Originally Posted by mydo View Post

From what I understand from the picture they are patenting the concept and user interface that allows someone to manage conference calls this way.

You are probably right. But they may still have issues. I've seen demo programs that allow you to move a speaker (person) around at will.

Gregor
post #18 of 69
Quote:
Originally Posted by mydo View Post

Can't be all that 3D with only two speakers. At best it's 2D.

Really? How many ears do you have?

It's quite possible to do 3D sound with two audio sources. Clever frequency and harmonic processing will give very good 3D spatialization. Speakers will work ok, but headphones should be much better as they know the sources are directly at your ears.

This is a pretty cool idea.
post #19 of 69
Yea. I think most of these software patents are "obvious" but patents haven't been about protecting ideas for a long time.
post #20 of 69
Quote:
Originally Posted by gbrandt View Post

Well, it's a bit more complex than simple balance control per person. Sound can actually be placed behind a person (although not perfectly) but directly above and below works quite well.

Hence the word "basically". The main point is that the iPhone has to be the aggregator of the conference call for this to be useful otherwise it has no idea which sound to assign to which position.
post #21 of 69
SRS simulates 3D audio with only 2 speakers, so be advised.

Proud AAPL stock owner.

 

GOA

post #22 of 69
I have been on far too many conference calls to count. I see no redeeming value with acoustic separation. I am listening for what is important in a conference call - not whether Betty or Bob are pleasantly acoustically separated. Business has fundamentals; this is just bordering on the ridiculous. Now, in an entertainment situation...that's an entirely different matter.
post #23 of 69
Bye bye RIM
post #24 of 69
Quote:
Originally Posted by CREB View Post

I have been on far too many conference calls to count. I see no redeeming value with acoustic separation. I am listening for what is important in a conference call - not whether Betty or Bob are pleasantly acoustically separated. Business has fundamentals; this is just bordering on the ridiculous. Now, in an entertainment situation...that's an entirely different matter.

If this could actually be offered by a real conferencing system I would disagree strongly. On conference calls you not only need to hear 'what' but also 'by whom'. Without that information a tremendous amount of context of the meaning is often lost, leading to miscommunication. If everyone has significantly different vocal characteristics then all is well, but if two or more people, or several pairs of people, have similar vocal characteristics you find yourself asking 'who was that' or, if you don't want to interrupt the flow, simply letting it go. This, in principle, would be extremely valuable but, with standard telephony, you don't have even the possibility of two-channel transmission to make this possible.
post #25 of 69
Quote:
Originally Posted by physguy View Post

If this could actually be offered by a real conferencing system I would disagree strongly. On conference calls you not only need to hear 'what' but also 'by whom'. Without that information a tremendous amount of context of the meaning is often lost, leading to miscommunication. If everyone has significantly different vocal characteristics then all is well, but if two or more people, or several pairs of people, have similar vocal characteristics you find yourself asking 'who was that' or, if you don't want to interrupt the flow, simply letting it go. This, in principle, would be extremely valuable but, with standard telephony, you don't have even the possibility of two-channel transmission to make this possible.

You're using a bloody mobile phone! C'mon get real. It's a bloody mobile phone used in various environments—because it's mobile—not some cozy conference room where I'd use the office phone versus a mobile phone for obvious reasons. How many people here have actually worked in major corporate environments?
post #26 of 69
Quote:
Originally Posted by CREB View Post

You're using a bloody mobile phone! C'mon get real. It's a bloody mobile phone used in various environments - because it's mobile - not some cozy conference room where I'd use the office phone versus a mobile phone for obvious reasons. How many people have actually worked in major corporate environments?

Do you read before you write?

I'm talking about in a 'corporate environment', and yes I'm familiar. That said, having this in 'stereo' on a mobile phone with headsets would still be VERY valuable. I am often on a conf. call in a lounge or similar where it is quiet enough to utilize what this type of approach would offer. But, again, current standard telephony would not allow this as it is a single channel of audio. (Along with other limitations such as frequency range, phase alignment, etc. I am aware of how '3D audio' works).
post #27 of 69
Don't know about anyone else but before I saw the image showing the UI, I was thinking of a more touchflo style interface for positioning the sources than the top-down diagram they have there?

Sort of like iChat, which I think has a sort of 3d layout when you share a presentation or document - imagine that, but with you and two other participants in a conference - one could be video, the other maybe a contact picture if it was voice only (or any combination video/contact picture obviously..) - tap, hold and drag the video/picture to swap positions -audio doesn't jump across or cut out as you drag, but is 3d positioned from start to finish, a-la Creative EAX, but maybe simpler and in software - and the overall appearance is like everyone is facing inwards in a triangle as you look into the screen? (with all the usual touchflo Appley black reflect-i-ness going on around it )

Bit of a tangent, but I'm wondering whether current 3G infrastructure can even support a handset aggregating a video call with two others either voice or video..? Would the video participants have to receive half-resolution video on the downlink to fit two video channels into one call (assuming we're not talking about building two radios into the handset)? and how would you get two separate channels of audio coming down as others have mentioned? I always thought GSM (and I guess 3G) conference calling was handled at the operator end and you always got the pre-mixed audio down a mono audio channel? (assuming they can't go in and rewrite the audio codecs and GSM protocols at this point)

Who knows/anyway, if not, they could always just do all this over wifi instead - wonder if the SDK allows the necessary kind of access to do this eh?
post #28 of 69
Quote:
Originally Posted by physguy View Post

Do you read before you write?

I'm talking about in a 'corporate environment', and yes I'm familiar. That said, having this in 'stereo' on a mobile phone with headsets would still be VERY valuable. I am often on a conf. call in a lounge or similar where it is quiet enough to utilize what this type of approach would offer. But, again, current standard telephony would not allow this as it is a single channel of audio. (Along with other limitations such as frequency range, phase alignment, etc. I am aware of how '3D audio' works).

With all due respect...I simply do not buy it. Given the myriad of corporate environments, all the way from the plush office to being in the most adverse of field conditions, I prefer something more purpose-built. In the field I carry a military spec mobile phone (because it has to work for all the right reasons); at the office I carry a different phone. It is what is being said versus whom the hell said it that is important to most serious business people. I wonder what Warren Buffett would have to say about all this nonsense? For that matter I dare you to ask Steve Jobs if he gives a true rat-arse about this as he runs Apple (I seriously doubt it, as I have read about Jobs and have spoken with friends who have worked with him).
post #29 of 69
Quote:
Originally Posted by CREB View Post

I have been on far too many conference calls to count. I see no redeeming value with acoustic separation. I am listening for what is important in a conference call - not whether Betty or Bob are pleasantly acoustically separated. Business has fundamentals; this is just bordering on the ridiculous. Now, in an entertainment situation...that's an entirely different matter.

Think of what happens when you're on a many-way conference call and a few people all try to speak at once. The voices all get muddled and you can't make out anything that was said.

With decent stereo separation, it will be much easier to separate the voices - just like you can do in a face-to-face meeting.

The real interesting thing here is going to be getting carriers involved. When you make a conference call over land lines, the sound from the various parties is multiplexed in the central office (or at a PBX or a conference bridging-center). Under that circumstance, then the phone won't be able to separate the streams and reposition them.

If, however, you receive each party's sound as a separate data stream, then this system shouldn't be that hard to implement. I've already seen this feature in standalone video conferencing systems. (Doesn't iChat also do this to some extent when you have a multi-way video chat?)

Does anyone know where the audio is mixed for GSM-based conference calls? If they're mixed at a centralized location, then I think this feature will require changes to the carrier's infrastructure in order to make it all work.
post #30 of 69
Quote:
Originally Posted by shamino View Post

Think of what happens when you're on a many-way conference call and a few people all try to speak at once. The voices all get muddled and you can't make out anything that was said.

With decent stereo separation, it will be much easier to separate the voices - just like you can do in a face-to-face meeting.

Simply a matter of basic etiquette versus trying to assimilate garbled information. A good read and use of Robert's Rules provides for better meetings, and conference calls than this iPhone feature will ever provide.
post #31 of 69
Quote:
Originally Posted by Booga View Post

Yes! Yes! OMG YES FTW!

As someone who spends a couple hours a day in audio conferences, lack of positional audio is a huge, huge frustration. It makes a lot of conversations turn into an unintelligible jumble. Giving each member a position is a great first step, but I'd love to see stereo/surround microphones specially built for audioconferencing and a protocol to match.

You beat me to my post.

Instead of "mano a mano" conference calls they will be "mono a mono"

So much for that "stereo" effect one enjoys out of their earplugs/headsets.

Alright "mano a mano" is Spanish for hand to hand. What?! you wanted the Spanish version of "ear to ear"? "oido a oido" for you men or "oreja a oreja" for you women. It just wouldn't sound right "pun" wise.

Ten years ago, we had Steve Jobs, Bob Hope and Johnny Cash.  Today we have no Jobs, no Hope and no Cash.

post #32 of 69
Quote:
Originally Posted by Rot'nApple View Post

Alright "mano a mano" is Spanish for hand to hand. What?! you wanted the Spanish version of "ear to ear"? "oido a oido" for you men or "oreja a oreja" for you women. It just wouldn't sound right "pun" wise.

Technically it's oral to aural.
Dick Applebaum on whether the iPad is a personal computer: "BTW, I am posting this from my iPad pc while sitting on the throne... personal enough for you?"
post #33 of 69
Quote:
Originally Posted by physguy View Post

As I understand this, no bandwidth/data rate changes are required. This is basically like assigning a different balance level to each source on the conference call. I assume this can only work if the iPhone is the aggregator of the callers

Sounds right to me. Only the aggregator gets to hear the separated voices... the others on the call get it all combined. Of course, if the iPhone is the only phone in the conference with stereo headset then who cares... but the long term plan would have to be to separate them out for all participants - much like an iChat video conference.
post #34 of 69
Quote:
Originally Posted by CREB View Post

Simply a matter of basic etiquette versus trying to assimilate garbled information. A good read and use of Robert's Rules provides for better meetings, and conference calls than this iPhone feature will ever provide.

Directional sound is not new technology, and has been implemented by many companies. How would having directional sound "hurt" in any way? First of all, it can easily be made completely optional. Secondly, you don't necessarily need to place someone behind you and someone else in front of you. If you are in a conference call with two people, and instead of both sounding like they are speaking from the same place (e.g. front), one sounds like he/she is speaking from slightly left of front and the other from slightly right of front, how would this be any worse than what you have now? On the other hand, it will make it very easy to identify who is saying what even if the voices sound similar.

Btw, if this feature saves a person from reading a 400+ page book just to have a conference call, then more power to them! Additionally, a lot of conference calls are not even done with people from your own company. Are you gonna hang up on your client who is giving you half your business because he is not courteous? Also, basic etiquette does not help identify who is speaking when you are speaking to 3 or 4 complete strangers, whose voices possibly sound similar (very common especially in international calls).
post #35 of 69
Quote:
Originally Posted by CREB View Post

Simply a matter of basic etiquette versus trying to assimilate garbled information. A good read and use of Robert's Rules provides for better meetings, and conference calls than this iPhone feature will ever provide.

That's a good suggestion for advanced civilizations that aren't based on naked apes in suits, but this is Earth here, that's the best we have. I doubt that you'll get a whole lot of the suited naked apes to go along with it.
post #36 of 69
Quote:
Originally Posted by JeffDM View Post

That's a good suggestion for advanced civilizations that aren't based on naked apes in suits, but this is Earth here, that's the best we have. I doubt that you'll get a whole lot of the suited naked apes to go along with it.

It was also Earth when the apes were suited.


BTW, how can something be in a suit and be naked at the same time?
Dick Applebaum on whether the iPad is a personal computer: "BTW, I am posting this from my iPad pc while sitting on the throne... personal enough for you?"
post #37 of 69
Quote:
Originally Posted by solipsism View Post

It was also Earth when the apes were suited.

BTW, how can something be in a suit and be naked at the same time?

Make that hairless apes.
post #38 of 69
Quote:
Originally Posted by Gustav View Post

Really? How many ears do you have?

It's quite possible to do 3D sound with two audio sources. Clever frequency and harmonic processing will give very good 3D spatialization. Speakers will work ok, but headphones should be much better as they know the sources are directly at your ears.

This is a pretty cool idea.

Not when they are stuck in your ear. In that situation the sound is only coming from a single point that is fixed with respect to the head. So the only parameter that can change is the blend between right and left and hence 2D only.
post #39 of 69
Quote:
Originally Posted by mydo View Post

Not when they are stuck in your ear. In that situation the sound is only coming from a single point that is fixed with respect to the head. So the only parameter that can change is the blend between right and left and hence 2D only.

That's not really true because of the psychoacoustics used. The audio spectrum can be adjusted because sounds that come from above and below sound different due to how they hit our ears: the lobes reflect and scatter sounds in different ways depending on position. The software mimics that effect to make it sound like it's "out there".
post #40 of 69
Quote:
Originally Posted by JeffDM View Post

That's not really true because of the psychoacoustics used. The audio spectrum can be adjusted because sounds that come from above and below sound different due to how they hit our ears: the lobes reflect and scatter sounds in different ways depending on position. The software mimics that effect to make it sound like it's "out there".

Yes but when you have an ear bud stuck in your ear there is no "above" or "below". Your lobes are out of the equation because the buds are stuck in the ear past the lobe.