Apple rumored to launch new Macs in October

22july2013 · October 23, 2023 9:38PM

Marvin said:

22july2013 said:

Marvin said:

22july2013 said:
Does anyone know of any software that can show me the load on my Neural Engines? Does anyone know why Apple's Activity Monitor refrains from reporting that data?

This app shows load on the Neural Engine:

https://github.com/tlkh/asitop

The percentage isn't accurate but it will show when it's being used:

https://github.com/tlkh/asitop/issues/57

Wow, I'll check into that! That will help me. Thanks.

It needs both Python and pip to be installed first. This is going to take a while. Python requires GCC, and GCC requires Xcode. This could take days.

This one is prebuilt:

https://github.com/op06072/NeoAsitop
https://github.com/op06072/NeoAsitop/releases
https://github.com/op06072/NeoAsitop/releases/download/v2.9/neoasitop.zip

Open a terminal to a large enough size, drag in the binary and hit enter.

That worked. I'm very grateful for your advice. Now I will check the load on my Neural Engine, and report it here.

In my preliminary test, my Neural Engine usage peaks at 7% when I do my OCR task, which is still higher than 0% when I'm not running my app, but you did say the percentage wasn't accurate. At least now I have something to examine, thanks.

edited October 2023

9secondkox2 · October 23, 2023 9:52PM

canukstorm said:

9secondkox2 said:

macxpress said:

Afarstar said:

Hope this is true. I’m desperate to update my 10-year-old iMac.

So why not just get what's available now? It's already waayyy faster than what you have and will easily last another 10yrs.

Horrid advice. Buy a two year old iMac when a new one is just around the corner? You’re kidding surely.

More like just wait with what you have until the iMac is updated. An M3 will smoke the M1.

People thought the A17 Pro would smoke the A16 because of 3nm but that's not the case.

It’s… in a phone.

And the GPU does smoke the A16.

I didn’t say M3 smokes M2, btw. I said it smokes M1, as the actual context of my comments pertained to the 24-inch iMac, which is only available with M1. Which the M2 is already a nice upgrade from.

So if you were paying attention, you’d have compared the A17 Pro to the A15. But you didn’t, because… you know. But hey, thanks for participating.

edited October 2023

elijahg · October 25, 2023 10:45PM

22july2013 said:

elijahg said:

22july2013 said:

What I want is better NE (Neural Engine) core speeds, because my software seems to be bound by the speed of the NEs. (It's hard to be sure, since the Apple Activity Monitor in macOS doesn't tell you about the NE load.) Here's what Apple's processors claim to have:

11 trillion ops/sec -- M1
15.8 trillion ops/sec -- M2 Pro, M2 Max, M2, A15
17 trillion ops/sec -- A16
22 trillion ops/sec -- M1 Ultra (apparently 2x the M1 speed because the M1 Ultra is two M1 Max dies tied together, each with the same Neural Engine as the M1. I'm not sure if real world performance scales perfectly.)
31.6 trillion ops/sec -- M2 Ultra (apparently 2x the M2 speed because the M2 Ultra is two M2 Max dies tied together, each with the same Neural Engine as the M2. I'm not sure if real world performance scales perfectly.)

Looking at this data, there was a roughly 44% speed bump from M1 to M2 (11 to 15.8 trillion ops/sec), which is definitely okay, but the other increases came only from doubling the number of NE cores. I'm hoping the M3 chip has a true speed bump, in which case I'm likely to replace my M1 Mac with an M3 Mac.
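
For anyone who wants to check the ratios behind that comparison, here is a minimal Python sketch. It uses only the throughput figures listed in the post; the dictionary and function names are made up for illustration.

```python
# Peak Neural Engine throughput in trillions of ops/sec, as listed in the post.
NE_TOPS = {
    "M1": 11.0,
    "M2": 15.8,
    "A16": 17.0,
    "M1 Ultra": 22.0,
    "M2 Ultra": 31.6,
}


def speedup_percent(old: str, new: str) -> float:
    """Return the percentage increase in peak throughput from `old` to `new`."""
    return (NE_TOPS[new] / NE_TOPS[old] - 1.0) * 100.0


# Compare the generational step with the two "doubled" Ultra chips.
for old, new in [("M1", "M2"), ("M1", "M1 Ultra"), ("M2", "M2 Ultra")]:
    print(f"{old} -> {new}: +{speedup_percent(old, new):.1f}%")
```

With these figures the M1-to-M2 step comes out at about 43.6%, and each Ultra is exactly double its base chip. These are peak numbers, so they say nothing about how real workloads scale.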

Does anyone know of any software that can show me the load on my Neural Engines? Does anyone know why Apple's Activity Monitor refrains from reporting that data?

Out of interest, what do you use that requires heavy use of the neural engine?

I do two things that (as far as I can tell) are using the Neural Engine. One of them is the macOS built-in optical character recognition, which I access through the macOS Shortcuts app. I take screenshots of videos with this app and gather all the text that appears in them; essentially, I'm running OCR in an infinite loop. As far as I know, the OCR code in macOS is using the Neural Engine. One of the reasons I think it's running in the Neural Engine is that I don't see any load on the CPU. So where would that OCR be occurring? Probably in the Neural Engine. That's what I'm doing. As to which videos I'm doing that on, I won't tell you. Trade secret.

Most (not all) of this kind of thing still works on Intel Macs, so maybe they're emulating the Neural Engine on Intel. Or it's not actually using the Neural Engine at all (as you point out, it's only 7% used, apparently).

22july2013 · October 25, 2023 10:51PM

Both of your points are smartly made and viable theories. I can't disprove your ideas. They might be right. It's nice to have a thread where people aren't in rude disagreement with each other. Too many people this week have called me an idiot.

edited October 2023

Marvin · October 25, 2023 11:33PM

Core ML can run on the CPU/GPU/NPU and picks the fastest. Apple says the peak FP16 throughput of the ANE on A15 is 15.8 TFLOPS. The GPUs in Max or Ultra chips could potentially outperform the Neural Engine on some tasks:

https://machinelearning.apple.com/research/neural-engine-transformers

https://www.photoroom.com/inside-photoroom/core-ml-performance-2022

Image: https://a.storyblok.com/f/191576/1352x622/477730e504/chart.png

The NPU usage percentage is not a true utilization counter: it is estimated from the NPU's measured power draw relative to an assumed 8 W maximum. For OCR across multiple framebuffers, processing/generating the uncompressed buffers will be taking some of the processing time.
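
As a rough illustration of that power-based estimate, here is a minimal Python sketch. It is not asitop's actual code: the function name, the clamping to 0-100, and the sample wattage are assumptions, and only the 8 W reference comes from the post.

```python
ANE_MAX_POWER_W = 8.0  # assumed full-load NPU power, per the 8 W reference in the post


def ane_usage_percent(power_w: float, max_power_w: float = ANE_MAX_POWER_W) -> float:
    """Estimate NPU usage as measured power over an assumed maximum, clamped to 0-100."""
    if max_power_w <= 0:
        raise ValueError("max_power_w must be positive")
    return max(0.0, min(100.0, power_w / max_power_w * 100.0))


# A hypothetical 0.56 W reading maps to roughly 7% "usage".
print(f"{ane_usage_percent(0.56):.1f}%")
```

Because the figure is derived from power rather than from a hardware counter, a workload that uses the NPU efficiently could show a low percentage even while it is busy.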

The Neural Engine is 16-core, so 1 core = 6.25%. It could be that the model used isn't able to run in parallel, or it just finishes so quickly it doesn't need to use more of the NPU. Processing frames in batches could potentially make it run faster.
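
The per-core arithmetic can be sketched in a few lines of Python. The helper names are hypothetical, and treating a usage percentage as a number of busy cores is only a rough reading, since the reported figure is an estimate.

```python
ANE_CORES = 16  # Neural Engine core count stated in the post


def share_per_core(cores: int = ANE_CORES) -> float:
    """Percentage of the whole Neural Engine that one core represents."""
    return 100.0 / cores


def cores_equivalent(usage_percent: float, cores: int = ANE_CORES) -> float:
    """Convert a reported usage percentage into an equivalent number of busy cores."""
    return usage_percent / share_per_core(cores)


print(share_per_core())       # 6.25
print(cores_equivalent(7.0))  # 1.12
```

So the 7% peak reported earlier in the thread is about what one fully busy core would look like, which fits the idea that the OCR model may not be running in parallel.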

The newer Mac systems do OCR as part of the system now. In the above graph image, dragging over the text will highlight it after a short delay. Even though it's a static image, it runs OCR and lets you copy/paste text from an image. It happens fairly quickly and doesn't register much processor usage on CPU/GPU/NPU.

edited October 2023
