zimmie

About

Username
zimmie
Joined
Visits
169
Last Active
Roles
member
Points
2,737
Badges
1
Posts
651
  • Apple Silicon Mac Pro could combine two M1 Ultra chips for speed

    zimmie said:
    On the topic of RAM, there's nothing inherent in the M1's design which precludes off-package RAM. They have a RAM controller on the chips, and the RAM controller is shared between CPU and GPU cores, but there's no fundamental reason the RAM couldn't be in DIMMs. Apple just hasn't chosen to do that. They might do so with the Mac Pro, or they might not. I don't see them doing a tiered memory structure, though. They just went to significant lengths to do away with NUMA concerns on the M1 Ultra.
    With the introduction of the Mac Studio, I wouldn't be surprised to see the Mac Pro go primarily rackmount. Very few people need more computing power than the Mac Studio offers at their desks. Almost everyone who does need more computing power is in an environment where they can rack the computer in a closet. For example, recording studios, film studios, scientific labs, and so on all have 19" rack space for other equipment, so putting specialist workstations in there isn't a stretch. That said, rackmount would mostly be relevant for a box with several full-height, full-length PCIe slots (e.g., to add hardwired audio and video inputs), and I'm not yet convinced Apple is interested in that at all. They might say the future is a rackmount interface which connects to the system via Thunderbolt. I'd be curious to know what they have seen the current rackmount Mac Pro doing.

    The in-package memory has much tighter timing tolerances because the memory configuration is fixed, the signal driver power levels are lower, and the distances are very short.  I would imagine that their memory controller takes full advantage of those facts, and cuts a lot of the complicated corners that dealing with DIMM slots creates.  So, I disagree:  I do not think their memory controller could support out-of-package memory without some serious work, and it would represent a large power increase and performance impact.

    What I suggested isn't a software-visible tiered memory structure.  It is just an alternative fast backing store for the existing virtual memory system that all software currently works with (currently backed by flash memory).  Implementing this would require no changes to the M1 architecture and very little OS change.  The Mac Pro has a very small market (especially with the Mac Studio now taking a chunk of it), so custom work to support it doesn't make a lot of sense for Apple.  That's a big reason why I think the M1 Ultra is what we will see in the Mac Pro.  And probably just one of them, as going multi-chip means a lot of specialized added hardware design work that I don't think they want to do.

    The M1 Ultra has a pretty amazing amount of compute, after all.  Bump the clock rate a little and you give it a small edge over the Mac Studio.

    A rack-mountable full-sized case (but still a desktop workstation) which can hold lots of extra drives, memory, and PCIe cards would differentiate it from the Mac Studio.

    I doubt they will do this, but one thing they could do fairly easily in such a form factor is put the M1 Ultra motherboard itself on a PCIe card so one case could hold multiple of them (the case becomes just a PCIe backplane then).  How such a machine would be used becomes more challenging, though, and would take them away from their preferred programming model of a single shared memory space for many CPU/GPU cores.  Without a lot of OS work, such a machine would look like several Macs on a high-speed network... that has some uses, but gets pretty obscure and way out of the consumer space.  Then again, it is the "Mac Pro" so who knows?

    They objectively do not have tighter timings. The on-package RAM on the base M1 is LPDDR4X. Exactly the same standard has been used for off-package RAM in the Intel MacBook Pro models for the last several years. The M1 Pro and Max use LPDDR5, which has slotted variants. HBM2 is fast enough to require that the RAM be on-package, but they're not using that. Putting the LPDDR4X or LPDDR5 on-package saves a few microwatts from the shorter traces, but that's it.

    Off-package RAM as a first-tier swap level could be done, but would give developers inconsistent memory performance. Again, Apple just went to ridiculous lengths to engineer away NUMA specifically because inconsistent memory performance isn't good enough. I don't see them adding it back in when they could instead just connect their memory controllers to DIMM slots. Sure, DDR5 DIMMs are rare right now, but that's not a limitation as far as Apple is concerned.

    I wouldn't be surprised to see an M2 Ultra as the first M2-family chip, introduced in the Mac Pro at WWDC.
  • Apple Silicon Mac Pro could combine two M1 Ultra chips for speed

    My guess is that the Mac Pro will use the same M1 Ultra as the Mac Studio does.  The difference will be in the system around the SoC.  With a larger form factor, they have more cooling potential and could bump up the clock rates a little... but really, the M1 Ultra is a monster as it is (both in terms of size and performance).  I would just take what Ternus said at face value: this is already the last of the M1 series.  And I think we will see a Mac Pro that uses it.

    So what could differentiate the Mac Pro?  In a word:  expandability.

    1) PCIe slots.  The M1 Ultra seems to have plenty of I/O capacity, and a fast PCIe bridge chip would easily enable a lot of expansion potential.

    2) Drive bays.  The Mac Pro would have the same built-in super fast SSD, but in a large case a whole lot of additional storage can be accommodated.

    3) RAM.  This is where it gets tricky.  The Apple Silicon approach is to use in-package memory, and there are real constraints on how much can be put into a single package.  Some Pros just need more than can fit into a single package, or more than is worth building in the TSMC production run.  So conventional DIMMs are needed to supplement the super fast in-package memory.  The question is, how does macOS use it?  Apple seems to want to keep the programming model simple (i.e. CPU/GPU shared memory with a flat/uniform 64-bit virtual address space), so having some fast vs. slow areas of memory doesn't seem like the direction they want to go in (although they could, and just rely on the M1 Ultra's ENORMOUS caches).

    They are already doing virtual memory paging to flash, however... so why not do virtual memory paging to the DIMMs instead?  Big DMA data transfers between in-package and on-DIMM memory across the very fast PCIe 5.0 lanes would ensure that the available bandwidth is used as efficiently as possible, and the latency is masked by the big (page-sized) transfers.  A 128GB working memory (the in-package RAM) is huge, so doing VMM to get to the expanded pool is not as bad as you might think.  Such a memory scheme may even just sit on PCIe cards so buyers only need to pay for the DIMM slots if they really need it.  Such "RAM disk" cards have been around for ages, but are usually hampered by lack of direct OS support... an issue Apple could fix easily in their kernel.
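    A quick back-of-the-envelope check on that paging idea.  All of the numbers below are my own rough assumptions (a 16 KB virtual memory page, ~64 GB/s of raw PCIe 5.0 x16 bandwidth, ~80 µs for a flash read), so treat this as a sketch of the latency argument, not anything Apple has published:

        // Rough comparison: moving one VM page from a DRAM card over
        // PCIe 5.0 x16 vs. paging it in from flash.  All figures are
        // assumptions for illustration, not measured values.
        let pageBytes = 16_384.0          // 16 KB page, as on Apple Silicon
        let pcie5x16BytesPerSec = 64e9    // ~64 GB/s raw, before protocol overhead
        let flashReadLatency = 80e-6      // ~80 µs for an NVMe flash read

        let pageTransferTime = pageBytes / pcie5x16BytesPerSec
        print("16 KB page over PCIe 5.0 x16: \(pageTransferTime * 1e9) ns")  // ~256 ns
        print("Same page from flash: \(flashReadLatency * 1e9) ns")          // 80,000 ns
        // Even with DIMM access latency on top, a DRAM-backed swap tier is
        // two to three orders of magnitude faster than paging to flash.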
    On the topic of RAM, there's nothing inherent in the M1's design which precludes off-package RAM. They have a RAM controller on the chips, and the RAM controller is shared between CPU and GPU cores, but there's no fundamental reason the RAM couldn't be in DIMMs. Apple just hasn't chosen to do that. They might do so with the Mac Pro, or they might not. I don't see them doing a tiered memory structure, though. They just went to significant lengths to do away with NUMA concerns on the M1 Ultra.

    With the introduction of the Mac Studio, I wouldn't be surprised to see the Mac Pro go primarily rackmount. Very few people need more computing power than the Mac Studio offers at their desks. Almost everyone who does need more computing power is in an environment where they can rack the computer in a closet. For example, recording studios, film studios, scientific labs, and so on all have 19" rack space for other equipment, so putting specialist workstations in there isn't a stretch. That said, rackmount would mostly be relevant for a box with several full-height, full-length PCIe slots (e.g., to add hardwired audio and video inputs), and I'm not yet convinced Apple is interested in that at all. They might say the future is a rackmount interface which connects to the system via Thunderbolt. I'd be curious to know what they have seen the current rackmount Mac Pro doing.

    I called the idea of using one high-end die in the laptops, then multiple high-end dies in the high-performance desktops, but I expected it to be the M1 Max Duo or something rather than a whole new name. I did get the dGPU very wrong, though. I didn't quite understand at that time just how powerful Apple's GPU cores are.
  • Apple launches the 27-inch Apple Studio Display with 5K, speakers, camera

    KBuffett said:
    Does anyone know if this can be used with a PC?
    This is the LG 5K in an aluminum shell, which is itself the 5K iMac minus the Mac in a plastic shell.

    One interesting quirk: the stands are not officially swappable. You have to choose between the stand and the VESA mount when you buy it, and only the VESA-mount version supports portrait orientation.
  • Benchmarks show that Intel's Alder Lake chips aren't M1 Max killers

    Serious question:

    What does an Intel Core i9 do that requires it to be so much less power efficient than an AS M1 Max in the same processing circumstances?

    Presumably there's a reason why it draws so much more current to achieve the same ends? Are there features in it that are not replicated in the M1 Max? 

    I'm assuming the architecture is radically different, but what stops Intel from changing to that architecture?
    Power is consumed when a transistor switches from 0 to 1 or 1 to 0. Switching is controlled by clock cycles. The more switching, the more power is consumed.
    Well, that is presumably a given. And possibly at a slightly lower level than I was alluding to. More specifically, is there some processing step or overall design feature that Intel gets wrong? Or does it do more 'stuff' that the M1 doesn't do? Is it required to support legacy ways of doing stuff that the M1 is free from?
    Short answer is that the x86 and amd64 instruction sets are old, and they tried to be everything to everyone. As a result, they are extremely complex. Instructions and data are put together (for example, the instruction to move data, the source register or value, and the destination register are strung together into one bit sequence), and instructions have variable lengths. Determining which part is the instruction and which part is the data basically requires a whole tiny CPU by itself.

    Intel's "Core" line was built in part because they were having trouble getting older designs to go faster. They built a new internal architecture which is a lot simpler, then added a sort of translation layer which takes the more complex instructions and breaks them into "micro-operations". That approach has served them well, but there's only so much you can do in hardware without removing instructions and simplifying what the processor offers to software.

    ARM is radically simpler than x86.
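    To make that concrete, here is a toy decoder in Swift. The opcode table is invented for illustration (it is not real x86 or ARM encoding), but it shows the structural problem: with variable-length instructions you cannot know where instruction N+1 starts until you have decoded instruction N, while fixed-width boundaries are known up front and can be found in parallel.

        // Variable-length: the opcode byte determines how many operand
        // bytes follow, so finding boundaries is inherently serial.
        func decodeVariable(_ bytes: [UInt8]) -> [Range<Int>] {
            var boundaries: [Range<Int>] = []
            var i = 0
            while i < bytes.count {
                let length: Int
                switch bytes[i] {
                case 0x00...0x3F: length = 1   // no operands
                case 0x40...0x7F: length = 2   // one 8-bit operand
                case 0x80...0xBF: length = 3   // one 16-bit operand
                default:          length = 5   // one 32-bit operand
                }
                boundaries.append(i..<min(i + length, bytes.count))
                i += length  // must finish this instruction to find the next
            }
            return boundaries
        }

        // Fixed-width: every instruction is 4 bytes, so every boundary
        // is known before decoding even starts.
        func decodeFixed(_ bytes: [UInt8]) -> [Range<Int>] {
            stride(from: 0, to: bytes.count, by: 4).map { $0..<min($0 + 4, bytes.count) }
        }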

    Well if one needs GPU performance - I am running into software for example that simply will not run (bricked) without a beefy GPU...

    from the Macworld article:

     59,774   Apple M1 Max 32-core GPU
    143,594   nVidia 3080 Ti

    2.4x the score, presumably not 'within margin of error'

    Even more pronounced seem the desktop options (AMD), with the relatively inexpensive nVidia 3060 outperforming PassMark scores for many higher-priced cards, as well as having 12GB VRAM:  www.bestbuy.com/site/evga-nvidia-geforce-rtx-3060-xc-gaming-12gb-gddr6-pci-express-4-0-graphics-card/6454329.p?skuId=6454329

    I understood Apple is working on a 'boost' option which may help, and will presumably also ramp up the power and fan requirements...?
    This gets a little complicated. With Apple's "unified memory", their GPU cores have access to everything in the whole pool of up to 64 GB of RAM. A lot of non-gaming uses of GPUs involve manipulating huge datasets. If the data you're working with is bigger than can fit in the card's VRAM (12 GB for the 3080 Ti, 24 GB for the 3090), you're basically going to be swapping between VRAM and normal RAM. That seriously hurts performance and is why Nvidia has been making their compute-specific cards (Tesla, until that brand was retired in 2020) with 12+ GB per GPU since 2014.

    For math across large datasets, the M1 Max can actually beat the RTX 3080 just because it doesn't have to spend so much time shuffling data around.
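    To put rough numbers on that (my own example figures, not a benchmark): suppose a 40 GB dataset is processed on a card with 12 GB of VRAM over a PCIe 4.0 x16 link.

        // Rough illustration of the shuffling cost.  Dataset size, VRAM
        // size, and link speed are all assumptions for the sake of example.
        let datasetGB = 40.0
        let vramGB = 12.0
        let pcie4x16GBps = 32.0  // ~32 GB/s raw for PCIe 4.0 x16

        let overflowGB = max(0.0, datasetGB - vramGB)
        let shuffleSecondsPerPass = overflowGB / pcie4x16GBps
        print("Extra transfer per pass over the data: \(shuffleSecondsPerPass) s")  // ~0.875 s
        // With unified memory the GPU addresses the whole dataset directly,
        // so this per-pass shuffle cost never shows up.

    That cost gets paid on every sweep over the data, which is why the gap grows with dataset size.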

    This is also the idea behind AMD's Radeon Pro SSG (Solid State Graphics). They added a 2 TB NVMe SSD to use as on-card swap space for VRAM. It's meant for video editing and allows you to keep a huge chunk of the video all on the card.
  • EU carriers want Apple's Private Relay blocked

    dcgoo said:

    Agree! What is the difference between already existing VPN services and Apple Private Relay?
    Difference is Private Relay only applies to traffic from the Safari browser.  A VPN would apply to the entire device.
    This is not accurate. Private Relay covers all DNS requests by default, and all HTTP requests made by URLSession and a few other APIs. Safari uses one of the covered APIs, so its HTTP requests are sent over Private Relay if it is enabled. All browsers on iOS are basically Safari skins, so they also use Private Relay if enabled.
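    As a minimal illustration in Swift (example.com is just a placeholder host): nothing Private Relay-specific appears in the code, because requests made through URLSession are routed over the relay automatically whenever the user has it enabled.

        import Foundation

        // An ordinary URLSession request.  With Private Relay enabled on
        // the device, this traffic goes through the relay without any code
        // changes; with it disabled, the same code connects directly.
        let url = URL(string: "https://example.com/")!
        let task = URLSession.shared.dataTask(with: url) { data, response, error in
            if let data = data {
                print("Received \(data.count) bytes")
            }
        }
        task.resume()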

    Part of the problem is people have been conflating "VPN" and "proxy" for over a decade. The overwhelming majority of "VPN" services are actually acting as proxies. They just happen to use VPN technologies for the client-to-proxy leg. Private Relay is also a proxy service, and it's valid to compare it to others.