Canon: No camera can truly capture video for Apple Vision Pro
Canon and other camera companies have been exploring 3D, VR, and AR for a while now, but the Apple Vision Pro represents a very different challenge.
Executives from camera company Canon see a new business opportunity and potential market for a camera system that can create immersive video content for Apple's Vision Pro. At present, however, none of their cameras can yet handle the resolution and refresh rate Apple's headset requires.
Speaking to camera site PetaPixel at the CP+ camera show in Yokohama, Japan last week, Canon officials said they believe they already have part of the puzzle -- a 5.2mm f/2.8 L dual fisheye lens designed specifically for producing VR content. The challenge is that the company doesn't yet have a camera with the refresh speed needed to match the Vision Pro's high-resolution screens.
Some of the immersive environments Apple already supplies for the Apple Vision Pro have moving elements, but they are believed to consist of a mix of computer-generated high-resolution static imagery and footage from what was likely an 8K video system from camera maker RED.
Other companies would like to produce a camera system that could create images capturing real-world environments at the Vision Pro's resolution and refresh rate without resorting to computer graphics. They foresee market demand for tools that can quickly create such environments.
Canon's Yasuhiko Shiomi believes this would require a camera capable of "100-megapixel resolution at 60 frames per second." It's the refresh rate that is currently hard to reach in combination with a resolution that high. It would amount to roughly 14K video -- a 3.5-times jump in horizontal resolution over the current 4K standard, and about 12 times the pixels.
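As a rough back-of-envelope illustration, here is what those numbers imply. The 14000 x 7000 frame shape and the 10-bit 4:2:2 encoding below are assumptions for the sake of the arithmetic, not figures Canon has published:

```python
# Rough arithmetic behind Canon's "100 megapixels at 60 fps" target.
# The 14000 x 7000 frame and 10-bit 4:2:2 encoding are assumptions
# for illustration; Canon has not published a target format.

UHD_4K = (3840, 2160)           # consumer 4K standard, ~8.3 MP
TARGET = (14000, 7000)          # hypothetical "14K" frame, ~98 MP

def megapixels(w, h):
    return w * h / 1e6

print(f"4K frame:  {megapixels(*UHD_4K):.1f} MP")       # 8.3 MP
print(f"14K frame: {megapixels(*TARGET):.1f} MP")       # 98.0 MP
print(f"Linear gain: {TARGET[0] / UHD_4K[0]:.1f}x")     # ~3.6x horizontally

# Uncompressed data rate at 60 fps, 10-bit 4:2:2 (20 bits/pixel average):
rate_gbps = TARGET[0] * TARGET[1] * 20 * 60 / 1e9
print(f"Raw data rate: ~{rate_gbps:.0f} Gbit/s")        # ~118 Gbit/s
```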
One system that can already produce video to these specs does exist -- the Sphere in Las Vegas uses the "Big Sky" camera, an 18K video system. However, it costs millions of dollars and requires a crew of 12 to operate, making it impractical for the emerging VR production market.
"At the moment, Canon can't cater to that level of a requirement," Shiomi said. However, Senior Managing Executive Officer and Deputy Head of the Imaging Group at Canon, Go Takura, noted that "technically, theoretically, we can do that."
"The problem is whether we can come up with the products that can be commercially viable and a price can be affordable enough for the customers to be able to buy them," Takura added.
Canon already has a sensor capable of 100-megapixel resolution, but at present it cannot read out at the required 60 frames per second.
"We are polishing our technology so that we can provide the high resolution for the VR purposes," Shiomi said. "So we will continue trying to improve our technology so that we can improve both resolution and speed with a good balance."
Comments
What am I missing?
If you just put on the AVP and look around, the environments -- for instance the one with the mountains and a lake -- look super realistic, almost like you are actually there. To produce that same feeling with a camera means capturing a 360-degree view at that same resolution and refresh rate. The lowest refresh rate is 90Hz. Now consider that the sphere required covers not only 360 degrees around you but also looking up and down -- how many 4K views would that take? My estimate, based on how much space spatial videos currently take up in your view, is something like 10, give or take 2 (rough check below). And since you need one for each eye, that's about 20 views, or roughly "80K" worth of pixels. 80K at 90fps is an incredible amount of storage/bitrate.
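Here's a back-of-envelope check of that estimate using solid angles. The 100° x 55° field of view per 4K frame is my assumption for a wide 16:9 view; the headset's exact per-eye FoV isn't published:

```python
import math

# Back-of-envelope check: how many 4K views tile the full sphere?
# The 100° x 55° FoV per 4K view is an assumption (a wide 16:9 view).

def rect_solid_angle(h_deg, v_deg):
    """Solid angle (steradians) subtended by a rectangular field of view."""
    a = math.radians(h_deg) / 2
    b = math.radians(v_deg) / 2
    return 4 * math.asin(math.sin(a) * math.sin(b))

sphere = 4 * math.pi                        # full sphere, ~12.57 sr
tile = rect_solid_angle(100, 55)            # one 4K view, ~1.45 sr
print(f"4K views per sphere: {sphere / tile:.1f}")        # ~8.7, i.e. ~10

# Two eyes, 90 fps, ~8.3 MP per 4K view:
pixels_per_second = 2 * (sphere / tile) * 3840 * 2160 * 90
print(f"~{pixels_per_second / 1e9:.0f} billion pixels/second")  # ~13
```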
For comparison, the best out there right now are 8K 360 videos, but IMHO they look low-resolution when spread across the full immersive sphere required for looking everywhere. 16K feels like it would be at the acceptable level. So I think it will be years, if not decades, before we get a video format and cameras that can support what the AVP is capable of, but having 16K would, I think, be pretty amazing.
Cheers,
Damon
I'm a filmmaker and have worked in VR in the past, so I can give some insight.
The reason the resolution is apparently so high is that this is for 180VR films. The videos occupy half of a sphere (180º). Though the Apple displays are 3.6k pixels wide per eye, that covers only roughly a 105º FoV; so 3.6k / 105 * 180 = 6.2k resolution per eye.
If you're recording both frames on one sensor -- which is how it's done with the Canon Dual Fisheye lens, and which is the easiest way to keep the lenses at an inter-pupillary distance of 60mm (roughly the distance between our eyes) -- then you need a resolution of 12.4k (horizontal) x 6.2k (vertical) = 77MP. There is also some resolution loss given that the fisheyes don't project onto the full sensor -- they project just two circles side by side on a rectangular sensor -- so I would imagine 100MP would be roughly right to retain resolution across the scene.
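Spelling that arithmetic out (the 3.6k display width and 105º FoV figures are my estimates from above, not official Apple specs):

```python
# Per-eye resolution needed to match the display's pixel density over 180°.
# The 3600 px width and 105° FoV are estimates, not official Apple specs.

display_px = 3600               # approximate per-eye horizontal pixels
display_fov = 105               # approximate horizontal FoV, degrees

px_per_degree = display_px / display_fov
per_eye = px_per_degree * 180                   # ~6171 px, i.e. ~6.2k
print(f"Per-eye horizontal resolution: {per_eye:.0f} px")

# Dual fisheye on a single sensor: two 180° image circles side by side.
sensor_w = 2 * per_eye                          # ~12.3k
sensor_h = per_eye                              # ~6.2k
print(f"Sensor: {sensor_w:.0f} x {sensor_h:.0f} px "
      f"= {sensor_w * sensor_h / 1e6:.0f} MP")  # ~76 MP before circle losses
```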
As to frame rate: cinema is 23.98fps with a 180º shutter, which means the shutter is actually closed half the time and open the other half. That leads to a certain strobing which we subconsciously associate with the world of cinema. Nobody really knows why this is so powerful, but maybe it helps remove us a bit from the scene, so our brains treat it more as something we're observing rather than something we're part of. Tbh I'm not really sure.
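The shutter-angle rule is easy to formalize: exposure time is the fraction of each frame interval that the shutter is open.

```python
# Shutter angle to exposure time: the shutter is open for
# (angle / 360) of each frame interval.

def exposure_seconds(fps, shutter_angle=180):
    return (shutter_angle / 360) / fps

print(f"Cinema, 23.98 fps: 1/{1 / exposure_seconds(23.98):.0f} s")  # ~1/48 s
print(f"Immersive, 60 fps: 1/{1 / exposure_seconds(60):.0f} s")     # 1/120 s
```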
But with immersive video, we want to do the opposite. Rather than emphasise detachment, we want to emphasise immersion. And so we want to shoot at a frame-rate which is roughly at the upper end of what the human eye can discern, removing as much strobing as possible. That means roughly 60fps. The fact that there are two frames being shown, one for each eye, doesn't alter this equation. It still needs to be 60fps per eye.
In theory you could presumably also create some sort of periscope system so that the two sensors can be entirely detached; but I imagine this would be very costly.
Looking at the BTS shots of the Apple cameras, they interestingly don't follow this inter-pupillary distance rule. Nor does the iPhone 15 Pro, for that matter. The Vision Pro isn't available in my region, so I haven't had a chance to see what these spatial videos look like, but I wonder if there is some interesting computational work happening to correct for this. That sort of computational photography work -- which essentially repositions the lenses in software by combining the image data with depth data -- is definitely implemented in how the Vision Pro does its video pass-through, where the perspective of the cameras at the front of the headset is projected back to where the user's eyes are.
If there is a computational element going on here, then that's hugely interesting, because a) it effectively solves the issue of needing to use one sensor, and b) it opens up intriguing possibilities of allowing a little bit of head movement, with corresponding perspective changes (i.e. true spatial video rather than just 3D video -- what the industry calls 6DoF).
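To make that concrete, here's a minimal sketch of the kind of depth-based reprojection I'm speculating about. It's purely illustrative -- a real pipeline would also handle occlusions, disocclusion holes, and lens distortion:

```python
import numpy as np

# Toy sketch of depth-based view reprojection: shift each pixel as if
# the camera had moved sideways. Purely illustrative.

def horizontal_disparity(depth_m, focal_px, baseline_m):
    """Pixel shift for a pure sideways camera translation.

    For a horizontal move of baseline_m metres, each pixel shifts by
    disparity = focal_px * baseline_m / depth (closer objects move more).
    """
    return focal_px * baseline_m / depth_m

H, W = 480, 640
depth = np.full((H, W), 2.0)        # made-up scene: flat surface 2 m away
shift = horizontal_disparity(depth, focal_px=500.0, baseline_m=0.06)
print(shift[0, 0])                  # 15.0 px shift to "move" the lens 60 mm
```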
Cheers!
Damon
Wouldn't 4K capture be enough? If you think about it, 4K is just one locked view -- a 4K portion of the whole 360-degree view. The fact that you can look around everywhere means it actually takes closer to 100MP to capture the entire scene.
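A quick check of that figure, assuming roughly the display-matching pixel density estimated in the earlier comment (about 34 px per degree):

```python
# Quick check of the "closer to 100 MP" claim: spread display-matching
# pixel density over a full 360° x 180° equirectangular frame.
# The ~34 px/degree figure (3600 px over ~105°) is an assumption.

px_per_degree = 3600 / 105
w = px_per_degree * 360             # full circle, ~12.3k px
h = px_per_degree * 180             # pole to pole, ~6.2k px
print(f"{w:.0f} x {h:.0f} px = {w * h / 1e6:.0f} MP per eye")   # ~76 MP
```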
I suspect AI will be quite good at making older movies 3D, and can likely upscale OK from lower-resolution capture too. I see some advancement there, but no definite solution just yet.
All in all, AR/VR video is still very much in its infancy. No great professional systems are available yet, but I do think the AVP entering the market has sent the likes of Canon into the labs, which is great to know.
180VR could prove to be THE format for movies, though -- it seems like it might be hard to do narrative action if your audience is more interested in something happening in the completely opposite direction, and there is also the problem of having the entire production staff in the picture...
It does seem like the tech still has some hurdles to jump, which is cool.
Sony and Canon probably need to do a lot more than just hardware going into the future. If mere humans are under AI pressure, then companies like Sony, Canon, Intel, AMD, Nvidia, and Qualcomm are under pressure from lacking an in-house OS and software ecosystem to combine with their hardware. The top of the pyramid -- an OS and software ecosystem combined with chip design and engineering ability -- is the only place to be in the future, and that includes AI.
The breakthroughs needed to get all the software and hardware into a pair of glasses will squeeze out several companies. Intel, AMD, and Nvidia are already out, and Qualcomm, without an in-house OS and ecosystem, is at the mercy of third parties like Samsung, Google, and Meta.
The only reason Apple is still in the so-called AI game is its in-house OS and software ecosystem combined with in-house chip design and engineering ability.