Who would use 64 bit ints on the 970?


Comments

  • Reply 21 of 41
    The 4 gigabyte RAM maximum that the 970 promises will be a godsend for Lightwave, Final Cut, and Shake. Add the 970, a 15k HD, and a wicked 900 MHz system bus, and the Xeon will run out in a state of terror.
  • Reply 22 of 41
    [quote]Originally posted by os10geek:

    <strong>The 4 gigabyte RAM maximum that the 970 promises will be a godsend for Lightwave, Final Cut, and Shake. Add the 970, a 15k HD, and a wicked 900 MHz system bus, and the Xeon will run out in a state of terror. </strong><hr></blockquote>



    4 gigabyte maximum? I'm not sure what you mean... the G4 has 4 gigabyte address spaces, and it's just the Apple hardware that limits them to 2 GB (although the OS may steal some of the address space away from the app, but that can usually be reduced, and hopefully it's not a full 2 GB!).



    The 970 will support 4096 GB of physical memory (where 1 GB == 2^30 bytes). That ought to make anybody run out in a state of terror... including Apple's memory controller. At $250/GB (roughly current DDR333 prices at 1GB/DIMM) that much RAM would cost a cool million US$ and would occupy 4096 slots. It's going to be a little while before RAM density is high enough to make 42-bit addressing a problem. And when that happens they just have to tweak the FSB protocol to use larger addresses, since the CPU is already using 64 bits everywhere else... the "unused" space is about 4 million times the size of the "limited" 42-bit address space.



    Lots of room to grow.
  • Reply 23 of 41
    [quote]Originally posted by AirSluf:

    <strong>



    Not true, by a long shot. Your thesis only holds if the CPU is upgraded but nothing else--only data beholden to the 64-bit integer will be affected as you say. Not to mention it is more likely those applications you would already compare against are using a 64-bit long already and would see little or no real change by recoding them to 64-bit ints.



    ]</strong><hr></blockquote>



    I'm arguing against the average application being recompiled so that sizeof(long) == 8. It just won't happen. You'll note that I specifically mentioned that programs that already do 64-bit arithmetic with two 32-bit values will probably get a win by using a 64-bit long. (ints are, almost without exception, 32-bit; long is the 64-bit integer type.)



    Also, even if the rest of the architecture is upgraded (i.e., the memory bus), you will still find 32-bit applications getting a win over the same application which uses 64-bit longs. That's because you're transferring half the data over the bus, regardless of how wide it is. Less is faster in this case.



    As Programmer says, though, it's not a big deal on any machine - maybe 5-10% at worst. For applications that don't need the 64-bit address space (or 64-bit long integers, for extra precision) though, it's a loss of performance that isn't necessary. Clamouring for 64-bit versions of most of your favourite applications is short-sighted at best.
  • Reply 24 of 41
    123 Posts: 278 member
    [quote]Originally posted by HoserHead:

    <strong>

    As well, application startup is likely to be slower, because the application's binary itself will be larger. (Constants stored in the executable will be doubled in size.)

    </strong><hr></blockquote>







    what constants exactly?
  • Reply 25 of 41
    [quote]Originally posted by HoserHead:

    <strong>



    I'm arguing against the average application being recompiled so that sizeof(long) == 8. It just won't happen. You'll note that I specifically mentioned that programs that already do 64-bit arithmetic with two 32-bit values will probably get a win by using a 64-bit long. (ints are, almost without exception, 32-bit; long is the 64-bit integer type.)



    Also, even if the rest of the architecture is upgraded (i.e., the memory bus), you will still find 32-bit applications getting a win over the same application which uses 64-bit longs. That's because you're transferring half the data over the bus, regardless of how wide it is. Less is faster in this case.



    As Programmer says, though, it's not a big deal on any machine - maybe 5-10% at worst. For applications that don't need the 64-bit address space (or 64-bit long integers, for extra precision) though, it's a loss of performance that isn't necessary. Clamouring for 64-bit versions of most of your favourite applications is short-sighted at best.</strong><hr></blockquote>



    I think you misunderstand the problems associated with 64 bit integers.



    Basically Apple can make (in C terms) either long 64 bits wide, or only long long 64 bits wide (which is the current status in most compilers). The former means all longs change size: you get extra memory use, changed structure sizes, and you break code that implicitly assumes long is 32 bits. The latter leaves ints and longs as before and gives you long long to play with when you really want 64 bits; however, you break any code that implicitly assumes sizeof(long) == sizeof(pointer). Whichever way you do it you'll break a fair bit of code, which will then need rewriting to work under 64 bits.



    I suspect Apple will make sizeof(long) = 32 bits, sizeof(long long) = 64 bits.



    Performance change in the code due to 64-bit integers will be minimal, as you don't need to use them if you don't want to, even in 64-bit mode, since 8-, 16-, and 32-bit integers will still exist. However, all pointers will be 64-bit, and this will increase bus traffic etc. and slightly slow down the code.



    michael
  • Reply 26 of 41
    airsluf Posts: 1,861 member
  • Reply 27 of 41
    [quote]Originally posted by 123:

    <strong>







    what constants exactly?</strong><hr></blockquote>



    Implicit constants, like memory offsets and such in assembly code, in addition to assignment constants like



    [code]long i = 5;[/code]



    When long is 64 bit, the constant 5 to be loaded into the register needs to be a full 64 bits wide, which means that you're using 4 extra bytes of storage, on disk and in memory, for the same constant.
  • Reply 28 of 41
    [quote]Originally posted by mmicist:

    <strong>

    Basically Apple can make (in C terms) either long 64 bits wide, or only long long 64 bits wide (which is the current status in most compilers). The former means all longs change size: you get extra memory use, changed structure sizes, and you break code that implicitly assumes long is 32 bits. The latter leaves ints and longs as before and gives you long long to play with when you really want 64 bits; however, you break any code that implicitly assumes sizeof(long) == sizeof(pointer). Whichever way you do it you'll break a fair bit of code, which will then need rewriting to work under 64 bits.

    </strong> <hr></blockquote>



    Most unix code is already 64-bit safe. It's not just sizeof(long) that changes, it's sizeof(void *) - and assumptions that sizeof(int) == sizeof(void *) used to be rampant, too.



    No matter what way it's implemented, it isn't necessarily a trivial matter to make your code utilise 64-bit integer math correctly.



    [quote] <strong>

    I suspect Apple will make sizeof(long) = 32 bits, sizeof(long long) = 64 bits.

    </strong><hr></blockquote>



    This is already the case in gcc.



    I suspect that compiling with -m64, which will not be the default, will link against a different set of 64-bit libraries and compile such that sizeof(long) == 8. In this way you can have both 64-bit and 32-bit applications installed in parallel. This is also, incidentally, the way most Unixes do it.
  • Reply 29 of 41
    [quote]Originally posted by AirSluf:

    <strong>

    Again, if the architecture changes then the assumptions don't exactly work. If the bus is wider then there is no penalty for more data; the entire width marches across in the same number of clock cycles. One of the things we have to remember as well is how the CPU handles the differences between 64- and 32-bit ops. It uses a single bit flag to trip a mode change, causing the CPU to disregard the high-order bits; I haven't read anything about it adjusting memory timing or making any other elaborate changes which would conserve memory ops between the two modes. This leads me to believe the CPU always assumes memory timing for a full 64-bit register fill regardless of mode, which would seem to wipe out any 32-bit code speed gains. It does fully keep with IBM's stance, though, that there is no inherent penalty for running 32-bit code through a 64-bit processor.



    </strong><hr></blockquote>



    Depending on the architecture and how memory operations are queued, 32-bit applications could conceivably have a 2x speed increase in memory operations. My previous messages have assumed that the memory controller operated on bytes (or maybe 32-bit words); in this case, 32 bit operations would have an advantage. In the case that it's intrinsically working on 64-bit words, 32-bit values would be padded out with zeroes and would have no advantage.
  • Reply 30 of 41
    airsluf Posts: 1,861 member
  • Reply 31 of 41
    [quote]Originally posted by Programmer:

    <strong>



    4 gigabyte maximum? I'm not sure what you mean... the G4 has 4 gigabyte address spaces, and it's just the Apple hardware that limits them to 2 GB (although the OS may steal some of the address space away from the app, but that can usually be reduced, and hopefully it's not a full 2 GB!).



    The 970 will support 4096 GB of physical memory (where 1 GB == 2^30 bytes). That ought to make anybody run out in a state of terror... including Apple's memory controller. At $250/GB (roughly current DDR333 prices at 1GB/DIMM) that much RAM would cost a cool million US$ and would occupy 4096 slots. It's going to be a little while before RAM density is high enough to make 42-bit addressing a problem. And when that happens they just have to tweak the FSB protocol to use larger addresses, since the CPU is already using 64 bits everywhere else... the "unused" space is about 4 million times the size of the "limited" 42-bit address space.



    Lots of room to grow. </strong><hr></blockquote>



    This was mentioned in the linked article as one of the reasons Intel will not migrate 64-bit to consumer desktops quickly. Intel argues that most consumers don't need more than 4GB of memory. Don't know if I agree, but that was cited.
  • Reply 32 of 41
    airsluf Posts: 1,861 member
  • Reply 33 of 41
    amorph Posts: 7,112 member
    [quote]Originally posted by scottiB:

    <strong>



    This was mentioned in the linked article as one of the reasons Intel will not migrate 64-bit to consumer desktops quickly. Intel argues that most consumers don't need more than 4GB of memory. Don't know if I agree, but that was cited.</strong><hr></blockquote>



    They have a point, to some extent: MS advises that you only run one application per server (because Windows scales poorly, and MS wants you to buy lots of licenses). I hope Windows doesn't still allot a mere 1GB of logical memory per process, but if so that's one more reason.



    However, if you're running Linux, BSD or Solaris x86, you can use all the RAM you can find. Even with the 2GB per process limit to logical memory, they scale up well enough that you can run multiple applications.
  • Reply 34 of 41
    123 Posts: 278 member
    [quote]Originally posted by HoserHead:

    <strong>[code]long i = 5;[/code]

    When long is 64 bit, the constant 5 to be loaded into the register needs to be a full 64 bits wide, which means that you're using 4 extra bytes of storage, on disk and in memory, for the same constant.</strong><hr></blockquote>



    not true! let's see:



    "When long is 64 bit, the constant 5 to be loaded into the register needs to be a full 64 bits wide"



    no, the register needs to be 64 bits wide, but not the constant. In this case (5), the constant will be 16 bits wide and loaded by a load-immediate (li) mnemonic.



    "which means that you're using 4 extra bytes of storage, on disk and in memory, for the same constant"



    you need the 4 extra bytes only if you store the whole 64-bit register in main memory and this has nothing to do with the binary or constants that will be "doubled in size".
  • Reply 35 of 41
    123 Posts: 278 member
    [quote]Originally posted by HoserHead:

    <strong>



    Depending on the architecture and how memory operations are queued, 32-bit applications could conceivably have a 2x speed increase in memory operations. My previous messages have assumed that the memory controller operated on bytes (or maybe 32-bit words); in this case, 32 bit operations would have an advantage. In the case that it's intrinsically working on 64-bit words, 32-bit values would be padded out with zeroes and would have no advantage.</strong><hr></blockquote>



    It is true that at least 64 bits are read from memory at a time, usually 4x64 or 8x64 bits in so-called bursts. On some DDR boards 4x128 or 8x128 bits are read; some graphics cards even have a 256-bit memory bus. However, this doesn't change the fact that memory is still addressed BYTE-WISE. Even if the bus is 128 bits wide, you can store 4 32-bit values in 16 bytes (and not 64). Of course, this also means that a (big, otherwise there's almost no advantage) set of 32-bit pointers will load faster than 64-bit pointers.
  • Reply 36 of 41
    123 Posts: 278 member
    [quote]Originally posted by AirSluf:

    <strong>



    There we have it, the separations between applied theory and implementation details. It seems to me from what I've read the 970 will act much more like the latter vice the former.</strong><hr></blockquote>



    what???
  • Reply 37 of 41
    [quote]Originally posted by 123:

    <strong>[code]long i = 5;[/code]

    When long is 64 bit, the constant 5 to be loaded into the register needs to be a full 64 bits wide, which means that you're using 4 extra bytes of storage, on disk and in memory, for the same constant.</strong><hr></blockquote>



    not true! let's see:



    "When long is 64 bit, the constant 5 to be loaded into the register needs to be a full 64 bits wide"



    no, the register needs to be 64 bits wide, but not the constant. In this case (5), the constant will be 16 bits wide and loaded by a load-immediate (li) mnemonic.

    [/qb]<hr></blockquote>



    You are indeed correct. However, any 64-bit constants will need to be represented using 64-bit instructions.



    As well, when compiled in 64-bit mode all addresses are represented using the full 64 bits (otherwise, what's the point?); this will be the major cause of larger programs.
  • Reply 38 of 41
    123 Posts: 278 member
    [quote]Originally posted by HoserHead:

    <strong>However, any 64-bit constants will need to be represented using 64-bit instructions.

    </strong><hr></blockquote>



    What do you mean by "64-bit constants"? Small constants (<=32 bit, isn't that what we're talking about?) are loaded into 64-bit registers the very same way as into 32-bit registers (lis + addi/ori/subi etc.) and they use the same amount of space (2x16 bits in 2 immediate instructions).



    Big constants are a bit different because you can't access the high word of the 64-bit register directly. You have to load the low word and shift it. In some cases this may lead to bigger code because more instructions are needed to load one 64-bit register (5) than two 32-bit registers (4). Of course, this only affects code size, cache, and initial load time; as soon as you start calculating something, the 64-bit implementation is much faster.

    [EDIT: BTW, this illustrates another inefficiency of constant 64-bit pointers]



    [quote]Originally posted by HoserHead:

    As well, when compiled in 64-bit mode all addresses are represented using the full 64 bits (otherwise, what's the point?); this will be the major cause of larger programs.

    [/QB]<hr></blockquote>



    Some addresses can be stored relative (<=32-bit) to a base address (64-bit), so it's not "all addresses". But in general you are right (and especially for vtables, as has been explained before).



    [ 02-24-2003: Message edited by: 123 ]
  • Reply 39 of 41
    airsluf Posts: 1,861 member
  • Reply 40 of 41
    The issue of constants is irrelevant -- if a 32-bit program needed a 64-bit constant it would take 64 bits. If a 64-bit program needs a 32-bit constant, it will take 32 bits. The size of the constant determines how much space it takes, not the size of the register. The 32-bit program, however, will require 2 registers and more loads & math ops to keep track of a 64-bit value, compared to a 64-bit program. In this case the 64-bit program will be smaller and faster. It is only with pointers that the 64-bit program suffers. Unfortunately there are lots of pointers.



    The size of the register context increases by about 30% without AltiVec, and 14% with AltiVec. The effect of this depends heavily on the number of interrupts & context switches per second, and you'd have to talk to an OS engineer to find out the real impact of that.