Apple's first AI research paper wins prestigious machine learning award
Apple's first publicly issued academic paper, computer vision research published in December, recently won a Best Paper Award at the 2017 Conference on Computer Vision and Pattern Recognition (CVPR), one of the most sought-after prizes in the field.
Considered one of the most influential conferences in the field according to the h-index, a metric for ranking scholarly output, CVPR in July selected Apple's paper as one of two Best Paper Award winners.
According to AppleInsider reader Tom, who holds a PhD in machine learning and computer vision, the CVPR award is one of the most sought after in the field.
This year, the conference received a record 2,680 valid submissions, of which 2,620 were reviewed. Reviewers whittled that number down to 783 accepted papers, 71 of which were granted long oral presentations. Apple's submission ultimately made its way to the top of the pile, an impressive feat considering it was the company's inaugural showing.
CVPR's second Best Paper Award went to Gao Huang, Zhuang Liu, Laurens van der Maaten and Kilian Q. Weinberger for their research on "Densely Connected Convolutional Networks." Research for the paper was conducted by Cornell University in collaboration with Tsinghua University and Facebook AI Research.
Titled "Learning from Simulated and Unsupervised Images through Adversarial Training," Apple's paper was penned by computer vision expert Ashish Shrivastava and a team of engineers including Tomas Pfister, Oncel Tuzel, Wenda Wang, Russ Webb and Apple Director of Artificial Intelligence Research Josh Susskind. Shrivastava presented the research to CVPR attendees on July 23.
As detailed when it saw publication in December, Apple's public research paper describes techniques of training computer vision algorithms to recognize objects using synthetic images.
According to Apple, models trained solely on real-world images are often less efficient than those leveraging synthetic data because computer-generated images come pre-labeled. For example, a synthetic image of an eye or hand is annotated as such, while real-world images depicting similar objects are unknown to the algorithm and must be annotated by a human operator.
As noted by Apple, however, relying completely on simulated images might yield unsatisfactory results, as computer-generated content is sometimes not realistic enough to provide an accurate learning set. To help bridge the gap, Apple proposes a method of refining a simulator's output through SimGAN, a take on "Simulated+Unsupervised learning." The technique combines unlabeled real image data with annotated synthetic images using Generative Adversarial Networks (GANs), or competing neural networks.
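The core idea can be illustrated with a toy sketch (hypothetical code, not Apple's implementation): a "refiner" nudges synthetic samples toward the real-data distribution, while a self-regularization term keeps each refined sample close to its synthetic original so the annotations remain valid. Real SimGAN uses a learned convolutional refiner and discriminator; here both are collapsed into a single learnable shift on 1-D data to show the competing objectives.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "images": the synthetic data is systematically offset from
# the real distribution, mimicking an unrealistic simulator.
real = rng.normal(loc=0.0, scale=1.0, size=1000)       # unlabeled real data
synthetic = rng.normal(loc=3.0, scale=1.0, size=1000)  # labeled synthetic data

shift = 0.0  # refiner parameter: a single learnable correction
lr = 0.05    # learning rate
lam = 0.1    # weight of self-regularization (stay near the synthetic input)

for _ in range(200):
    refined = synthetic + shift
    # Adversarial stand-in: push the refined mean toward the real mean
    # (in a full GAN this gradient would come from a discriminator).
    adv_grad = refined.mean() - real.mean()
    # Self-regularization: penalize drifting far from the synthetic input,
    # which is what preserves the synthetic labels.
    reg_grad = lam * shift
    shift -= lr * (adv_grad + reg_grad)

refined = synthetic + shift
# refined.mean() is now much closer to real.mean() than synthetic.mean() was
```

The regularization weight `lam` controls the trade-off: higher values keep refined images closer to the simulator output (safer labels), lower values let them match the real distribution more aggressively.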
In its study, Apple applied SimGAN to the evaluation of gaze and hand pose estimation in static images. The company says it hopes to one day extend S+U learning beyond static images to support video input.
Like other Silicon Valley tech companies, Apple is sinking significant capital into machine learning and computer vision technologies. Information gleaned from such endeavors will likely enhance consumer-facing products like Siri and augmented reality apps built using ARKit. The company is also working on a variety of autonomous solutions, including self-driving car applications, that could make their way to market in the coming months or years.
"We're focusing on autonomous systems," Apple CEO Tim Cook said in a June interview. "It's a core technology that we view as very important. We sort of see it as the mother of all AI projects."
Comments
It's like picking kids out of school in the old Soviet Union based on various skills and then sending them to special schools, training them, splitting them by skill, etc., until you end up with a lot of highly trained athletes in multiple fields.
Looking for overall training, I believe, is a mistake and can lead to a lot of inefficiencies in the short term. You're basically doing premature optimization.
Those pools with various skillsets could then be used to automatically train agents with more integrated skillsets, so the various "pools" work more efficiently; maybe even have a pool of workers acting as facilitators or balancing resources internally.
As to whether Apple 'understands' AI or not, well of course it does, but so does everybody else working in the field. Whether they are 'behind' or not is something completely different. I have no idea but they say the proof is in the pudding so maybe we should just wait and see.
Just take whatever 'people online' say with a grain of salt.
In the real world of teaching and learning, good teaching means scaffolding the lessons so the foundations are built before a real problem is thrown at the students. The Siri group seems to be modeling the teaching of neural networks on brain research and good teaching methods.
BTW, my reading suggests that Siri is no longer just the voice recognition application that Apple delivers, but has evolved into the name of their AI group. Also note that the training paper is about image recognition with the focus on eye recognition, and hand recognition. The eye recognition is asking the system where the eye focus is, and the hand recognition is asking the system to recognize hand gestures. You can guess where all this AI is leading.
As an example of that, it would be easy to look at MS and Google as "leading" in AR, up until Apple innovated with APIs and a wide platform of some 500 million iOS devices to develop for and distribute to, even before the iPhone 8 arrives. I don't expect Google to lag more than a single generation behind Apple's ARKit, and to catch up fast on the Android OS platform, but where will MS widely apply HoloLens, considered to be leading in AR technology?
On the other hand, I would give MS a considerable advantage in VR, today anyway, simply because the large Windows PC and Xbox platforms are the largest base of devices with the performance necessary for development and use.
I use Wolfram Mathematica and WolframAlpha, both within Mathematica and as the standalone app, and I find it quite poor at handling natural-language questions: first because it doesn't understand the wording, and second because its repository of information is very limited.
"They all suck about the same" is likely because their knowledge repositories all "suck about the same".
Here is the disconnect: computer science studies and AI are all the rage, but the mundane, unheralded, unsexy, heads-down task of collecting and cleaning data is where the resources are truly needed.
Personally, that seems to be the biggest technical issue I’ve ever had with Siri, so I expect HomePod will have people claiming that Siri magically became good all of a sudden and only for HomePod.
The other major issue I’ve had is Apple’s poor feature stepping for a server-side service and lack of periodic updates to clue users in to new and updated uses—something Amazon has been great at doing. There are likely many great uses for Siri that I’ve never known about, or have forgotten after trying them early on and not getting the desired or expected response.