If it's encrypted end to end, how is it possible to detect CSAM?
It's scanned on-device before it's sent, or upon reception. Which is itself a problem: your device suddenly has unremovable spyware installed, and that's not OK. What happens when Putin adds a picture of one of his adversaries, or someone who has wronged him, to the signature database? Suddenly that person is flagged under the guise of possessing CSAM, smeared, and will probably disappear. No amount of "checks" will stop a dictator in his own country from doing something like that. Who controls the signature database? Guaranteeing that nobody has slipped a false signature in is essentially impossible.
Slippery slope arguments can be made about anything. And totalitarian countries don't need "evidence" if they want to throw someone in prison or out of a window. That's the whole point of totalitarianism: no checks or balances.
No they don't, but they also might not know who is passing info about the dictator, be that Putin, Assad, Kim Jong-un, Xi Jinping, etc. With this, they simply add to the database a picture of, say, Putin's face, or even the word "Putin", "dictator", "Winnie the Pooh" or anything else a dictator may deem unsavoury, and bam: they have instant knowledge of anyone who might be sending something not wholly supportive of the "motherland". See how it's a problem? And see how rapidly a thing like this can turn into a privacy-snatching mass surveillance tool?
If it's encrypted end to end, how is it possible to detect CSAM?
That question is way above my pay grade, but my GUESS is that, as Apple mentioned in the original announcement, even an encrypted photo can be algorithmically searched for a certain pattern, in this case human nudity.
Now lots of perfectly innocent pictures contain some level of nudity, so (again, guessing here) that is combined with who the image was sent to and/or uploaded by. If any of the recipients or senders is a child account, that plus the nudity might flag it, and then law enforcement could be contacted to obtain a court order to decrypt.
The basic concept of CSAM scanning doesn't involve searching for patterns in your images, nudity or otherwise. The way it worked was to compare hashes of your images against a database of hashes of known CSAM images. The database came from the National Center for Missing and Exploited Children, which maintains hashes of images that were proven in criminal cases to be of minors.
The concerns about having CSAM itself on our devices as part of the detection were unwarranted and based on a misunderstanding of how the system works. A potentially valid concern is the probability of hash collisions; I recall Apple's response to that was that they weren't going to alert on single matches.
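To make that concrete, here's a rough sketch of the matching idea. This is not Apple's actual code: the database contents, function names, and threshold value are all made up, and a real system would use perceptual hashes rather than a cryptographic one.

```python
import hashlib

# Hypothetical set of hashes of known CSAM images (the real list would come from NCMEC).
KNOWN_BAD_HASHES = {"<hash of known image 1>", "<hash of known image 2>"}  # placeholders

MATCH_THRESHOLD = 30  # only escalate after many matches, never on a single hit


def image_hash(image_bytes: bytes) -> str:
    """Stand-in hash of an image's contents (a real system uses a perceptual hash)."""
    return hashlib.sha256(image_bytes).hexdigest()


def count_matches(library: list[bytes]) -> int:
    """How many images in the library match the known-bad database."""
    return sum(1 for img in library if image_hash(img) in KNOWN_BAD_HASHES)


def should_escalate(library: list[bytes]) -> bool:
    # A single hit is ignored; only crossing the threshold would trigger human review.
    return count_matches(library) >= MATCH_THRESHOLD
```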
That's not entirely true. Whilst it did compare hashes, Apple's CSAM scanner used fuzzy matching, so false positives were possible, which is why they had a human reviewer.
"The hashing technology, called NeuralHash, analyzes an image and converts it to a unique number specific
to that image. Only another image that appears nearly identical can produce the same number; for
example, images that differ in size or transcoded quality will still have the same NeuralHash value." https://www.apple.com/child-safety/pdf/CSAM_Detection_Technical_Summary.pdf
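For anyone wondering how two files of different sizes can hash to the same value: perceptual hashes are derived from what the image looks like, not from its bytes. This is a toy "average hash", nothing to do with NeuralHash itself, just to show why resizing or recompressing barely moves the hash while an unrelated photo lands far away. File names are hypothetical.

```python
from PIL import Image  # Pillow


def average_hash(path: str, size: int = 8) -> int:
    """Toy perceptual hash: shrink to 8x8 grayscale, one bit per pixel above the mean."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (p > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# A resized/recompressed copy stays within a few bits of the original, while an
# unrelated photo differs in roughly half of its 64 bits.
# hamming(average_hash("photo.jpg"), average_hash("photo_resized.jpg"))  # small
# hamming(average_hash("photo.jpg"), average_hash("other.jpg"))          # large
```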
In a truly E2E system (iCloud Photos is not, unless you turn it on), hash scanning doesn't work, as Apple never sees the original content on their servers. The only way to do it is to scan on device before anything goes anywhere. Apple's original announcement around this effort was somewhat convoluted (but actually pretty smart) and didn't trigger on single matches (for the false-positive reasons you mention), but it did have to do with scanning actual content, not just hashes, which is why there's probably confusion.
As I understood it, your device would have a CSAM hash database and your device would hash your images, so the detection would happen on your device using data on your device. In any case, the technique involved hashes, not trying to determine the image content.
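That matches my reading of the technical summary linked above: the hashing and lookup happen on the device, and what leaves the device is the encrypted photo plus an encrypted "safety voucher", not the photo content for scanning. A very loose sketch of where each step sits; the real design used private set intersection and threshold secret sharing, which this completely glosses over, and every name and value here is invented.

```python
ON_DEVICE_HASH_DB = {0x3F2A9C01, 0x7B44D210}  # made-up hash values shipped with the OS


def perceptual_hash(image_bytes: bytes) -> int:
    """Placeholder for the on-device perceptual hash (NeuralHash in Apple's summary)."""
    return hash(image_bytes) & 0xFFFFFFFF  # toy stand-in, NOT the real algorithm


def prepare_upload(image_bytes: bytes) -> dict:
    """Everything content-related happens here, on the device, before upload."""
    matched = perceptual_hash(image_bytes) in ON_DEVICE_HASH_DB
    return {
        "ciphertext": b"<encrypted photo>",       # the server never sees the plaintext
        "safety_voucher": {"matched": matched},   # in reality the voucher is encrypted too,
        # and only becomes readable server-side once enough matches accumulate
    }
```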
I'm trying to think back to the last time this came up and we looked into it for editorial purposes. IIRC it's either actually illegal (there are a lot of really unintuitive laws around this stuff) or at the very least a very bad idea to distribute that database to the phones in the first place, as it can be extracted and used to circumvent detection. There were also concerns about 'poisoning' the database with other content, especially in countries with fewer (or no) freedom of speech protections.
There was a fuzzy ML implementation approach akin to Microsoft's PhotoDNA included in the initial pitch as well. It wasn't just a dumb hash-based solution, as it (like Microsoft's in-use PhotoDNA) is designed to detect changes (cropping, mirroring, color shifting, etc.) in the content while still identifying it. I don't recall off the top of my head if they ever released a full whitepaper on the approach. Someone else can dig it up if they care (edit: elijahg did link the PDF above).
I believe it took about half a week after researchers got hold of Apple's implementation for testing to make it reliably generate false positives, which meant the whole thing was ultimately useless. Some of these projects are still up on GitHub, like https://github.com/anishathalye/neural-hash-collider . That's why Apple really decided not to do it in the end. It was a clever approach from a privacy-respecting perspective; it just didn't work well enough and had some external world-facing issues to boot.
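The collision problem in a nutshell: once someone can craft a harmless image whose hash equals a database entry, hash-only matching can't tell it apart from a real hit. A trivial illustration with made-up numbers:

```python
KNOWN_BAD_HASHES = {0x3F2A9C01}        # hypothetical database entry

adversarial_image_hash = 0x3F2A9C01    # crafted benign image colliding with that entry

flagged = adversarial_image_hash in KNOWN_BAD_HASHES  # True, i.e. a false positive
# The match threshold and human review were meant to absorb accidental collisions,
# but reliably manufactured ones (as in the linked project) undermine the premise.
```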
The lawsuit itself is fraught with problems. Keep reading AppleInsider and we'll keep you updated on its progress 😁
In theory, having the hash database on the phone wouldn't matter; they're just hashes. All it would really do is let someone work out how much they'd need to modify an image to avoid a match (or to flood Apple with false positives), which they could figure out from what Apple released anyway. I suppose people could also check whether an image they possess is already in the database, but I can't think of any other downsides to storing the DB on the phone.
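To put a number on the "how much to modify" point, a toy example with made-up hashes and a made-up threshold; real matchers compare perceptual hashes by bit distance in roughly this way.

```python
MATCH_THRESHOLD = 5                      # hypothetical "close enough" cutoff, in bits

flagged_hash  = 0b1011_0110_0001_1101    # entry from the on-device database
my_image_hash = 0b1011_0110_1001_1101    # one bit away: would count as a match


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


would_match = hamming(my_image_hash, flagged_hash) <= MATCH_THRESHOLD  # True
# With the database (and threshold) in hand, an uploader knows edits only need to flip
# more than 5 hash bits to sail past a naive matcher, which is the circumvention worry,
# even though the hashes themselves reveal nothing about the images.
```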
If it's encrypted end to end, how is it possible to detect CSAM?
Photos are not E2EE unless you have Advanced Data Protection, AFAIK. But that doesn't matter for CSAM detection anyway. As I understand it, the photo data would be scanned using hashes of verified child porn images.
The lawsuit is absurd on its face. It's not about whether Apple should or shouldn't have implemented the feature. That's a debate we can all have. It's about whether Apple deciding not to proceed makes it *legally liable* for what happened to a class. The argument that Apple products are knowingly defective because the company chose not to implement a controversial feature is just laughable. As everything is obviously a financial calculation, Apple will fight this one all the way.
Say I'm a pedophile and take a picture of a naked kid. Obviously that pic is not in any database of child porn yet. Then I try to upload it to a CSAM-enabled iCloud. The AI flags it as possible child porn, so an Apple employee takes a look to be sure? How does the reviewer know the age of my subject? How does the reviewer know that I'm a pedophile rather than a horny teen taking pictures of my consenting 17-year-old boyfriend/girlfriend? Hell, maybe it's a picture of me.
Also, regarding this lawsuit, the plaintiff isn't even a user of Apple products (or at least not the ones in question). Some third party shared their pic using iCloud. They could have done the same thing by spinning up their own website. It's impossible for me to see how Apple bears any responsibility here.
The AI doesn't use contextual awareness to work out what's in the photo. It just hashes and compares with a database. Presumably the database is updated every time law enforcement finds a device with CSAM on it, so even if that photo *is* shared, the creator won't get caught until someone they've shared it with (or the creator themselves) is caught and has their photos added to the database. And the creator won't get caught if they delete the photo, either. Messages, OTOH, asks before a child receives (and maybe sends, not sure) a picture containing nudity, judged by on-device AI.