Clockless CPUs

Posted in Future Apple Hardware, edited January 2014
MacGamer has a link to a <a href="http://www.techreview.com/magazine/oct01/tristram.asp" target="_blank">fascinating article</a> about a movement away from having a clock govern the speed of a CPU. Anyone know anything about this or how long it might take for such CPUs to reach the Mac market?

Comments

  • Reply 1 of 15
    That is an interesting little article with some bland off-the-shelf analogies mixed in. I would definitely like to see some more specifics on this "clockless" technology before I pass judgement. It seems to me that this article was written just to give the geek-wannabes something to "ooh" and "ahh" about.



    Maybe some smart guy over at Ars will write up a 14-page report for us.
  • Reply 2 of 15
    amorph Posts: 7,112 member
    I posted <a href="http://www.xsorbit.com/users/flamingo/index.cgi?board=Future_Hardware&action=display&num=1000321459" target="_blank">a thread about this</a> over at BadFlamingo a while back, which links to <a href="http://www.techreview.com/magazine/oct01/tristram.asp" target="_blank">this more informative article on techreview.com</a>. I'm not familiar enough with chip tech to offer a 14-page explanation, but I can offer an overview, which is based on the above-linked article.



    In any processor, the time it takes to perform a given calculation is determined by the number of logic gates the electrical current has to go through, the physical distance across the chip that it has to travel, and the strength of the current (this is one reason why the x86 desktop chips consume so much power). Different operations take different amounts of time to resolve. Because of the nature of the technology, there will be a "value" in the result register while the calculation is in progress, but it will fluctuate randomly. When it stabilizes, the calculation is finished.



    The traditional solution to determining when calculations are finished is to make the whole chip march to a clock. This means that every operation has a fixed amount of time to decide on its final answer, and when that time is up, every part of the chip whose action depends on that answer can be certain that the answer it's reading is correct. Obviously, the clock has to be set to allow the slowest operation on the chip time to complete. (There are workarounds: for instance, on the 7450 G4 the vector permute instruction requires two clock cycles.) The flip side of this is that if an operation is really fast, it finishes its calculation long before the next clock tick and sits idle. "Clockless" chips are designed to prevent that idleness.
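

    If it helps, here's a toy sketch in Python of that tradeoff - the delay numbers are completely made up, it just shows how the slowest operation dictates the clock and how much time the fast ones spend waiting:

        # Toy model: each operation has a propagation delay in nanoseconds.
        # The delays are invented purely for illustration.
        op_delay_ns = {
            "integer_add": 0.4,
            "integer_multiply": 1.1,
            "vector_permute": 1.6,   # the slowest op on this imaginary chip
        }

        # A single global clock must give the slowest operation time to settle.
        clock_period_ns = max(op_delay_ns.values())
        print(f"clock period = {clock_period_ns} ns "
              f"({1.0 / clock_period_ns:.2f} GHz)")

        # Every faster operation finishes early and then just waits for the tick.
        for op, delay in op_delay_ns.items():
            idle = clock_period_ns - delay
            print(f"{op:16s} finishes in {delay} ns, idles {idle:.1f} ns per cycle")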



    The idea is dead simple: you let everything in the chip run at its natural speed. The implementation is tricky, because if you have, say, an instruction in the pipeline waiting on an integer add (fast) and a vector permute (slow), there's no clock to tell it when the values are readable - and if it reads a value too soon, it gets random garbage. So there has to be some way for an instruction to raise a flag or send a message saying "I'm done," and possibly also one saying "wait, I'm not done." Once you've decided which scheme to implement, you have to make sure it's efficient enough that it doesn't squander the speed you've gained by freeing faster instructions from the tyranny of the clock.
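

    To make that concrete, here's a minimal sketch in Python, with threads standing in for circuit blocks - nothing like real silicon, and the names are purely illustrative - of the "raise a flag when I'm done" idea:

        import random
        import threading
        import time

        class AsyncStage:
            """A functional unit that signals completion instead of obeying a clock."""
            def __init__(self, name, work):
                self.name = name
                self.work = work                  # the "calculation" this stage performs
                self.result = None
                self.done = threading.Event()     # the "I'm done" flag

            def run(self, operand):
                self.done.clear()                 # "wait, I'm not done"
                time.sleep(random.uniform(0.01, 0.05))   # natural, variable delay
                self.result = self.work(operand)
                self.done.set()                   # "I'm done" - the result is now stable

            def read(self):
                self.done.wait()                  # consumer blocks on the flag, not a clock tick
                return self.result

        adder = AsyncStage("integer_add", lambda x: x + 1)
        threading.Thread(target=adder.run, args=(41,)).start()
        print("adder produced", adder.read())     # prints 42 once the flag goes up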



    If you're implementing a complex instruction that can be broken up into a lot of steps, you can also adjust the lengths of the connecting wires so that the timing works out naturally - one instruction never needs the result of another before it's ready. Obviously, doing this for millions upon millions of logic gates and wires is delicate work. And so, even though the concept of a clockless chip is simpler than that of a clocked chip, the implementation is more difficult. This is why almost all chips are clocked.
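

    A trivial sanity check of that delay-matching idea, again with made-up numbers:

        # Delay matching: the consumer's path is padded so the producer's result
        # is always stable before it gets sampled.  Numbers are invented.
        producer_gate_delays_ns = [0.3, 0.5, 0.4]   # gate delays along the producing path
        consumer_path_delay_ns = 1.4                # deliberately lengthened wire to the consumer

        producer_total = sum(producer_gate_delays_ns)
        assert consumer_path_delay_ns >= producer_total, "consumer would read garbage"
        print(f"producer settles in {producer_total:.1f} ns, "
              f"consumer samples after {consumer_path_delay_ns:.1f} ns - safe")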



    The other advantage to clockless designs is that they only consume the power necessary to perform the calculations they're currently working on. On a clocked chip, every part of the processor consumes power as if it was running. So the clockless designs are more efficient as well as faster.
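

    As a toy illustration only (the energy numbers are invented), the power argument boils down to something like this:

        # Energy per cycle for each unit when it toggles, in made-up nanojoules.
        unit_energy_nj = {"int_alu": 1.0, "fpu": 2.0, "vector": 3.0, "load_store": 1.5}
        active_this_cycle = {"int_alu"}     # only the integer ALU is doing real work

        clocked = sum(unit_energy_nj.values())                         # the clock toggles everything
        clockless = sum(unit_energy_nj[u] for u in active_this_cycle)  # idle units draw ~nothing

        print(f"clocked chip:   {clocked} nJ this cycle")
        print(f"clockless chip: {clockless} nJ this cycle")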
  • Reply 3 of 15
    airsluf Posts: 1,861 member
  • Reply 4 of 15
    powerdoc Posts: 8,123 member
    [quote]Originally posted by AirSluf:

    <strong>



    Close but not quite; this is very dependent on specific chip architecture. I agree the clock inputs will always have power applied, but that does not require all other modules to flollop about under a full undirected voltage draw. Clockless chips, if sloppily designed, can also have this problem.



    The complexity of a clockless chip is orders of magnitude higher in order to ensure the timing issues you raised are properly handled. In an architecture course, a discussion of a clockless design vs. a clocked, microcoded 8-bit chip showed that even at that pathetic level the difficulties were VERY expensive to overcome, given the added transistors required to play timing traffic cop.</strong><hr></blockquote>



    So we can conclude that clockless chips in Macs, or in any desktop computer, are not for tomorrow.



    [Chilling]
  • Reply 5 of 15
    outsider Posts: 6,008 member
    How about they append an extra bit to a register that tells the CPU whether the information contained in the register is ready to be used in processing or is still waiting to be computed? That would require the units involved to devote circuitry to setting and checking that bit, but it's a way.
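

    Something like this, maybe - a rough Python sketch of the idea, obviously nothing like how a real register file is built:

        class TaggedRegister:
            """A register with one extra 'ready' bit saying whether its value is usable yet."""
            def __init__(self):
                self.value = 0
                self.ready = False          # the appended bit

            def begin_write(self):
                self.ready = False          # value is now in flight - don't trust it

            def finish_write(self, value):
                self.value = value
                self.ready = True           # computation done, value is stable

            def read(self):
                if not self.ready:
                    raise RuntimeError("operand not ready - the consumer must stall")
                return self.value

        r3 = TaggedRegister()
        r3.begin_write()
        # ... some functional unit works on it at its own pace ...
        r3.finish_write(7)
        print(r3.read())                    # safe: the ready bit is set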
  • Reply 6 of 15
    amorph Posts: 7,112 member
    AirSluf wrote:



    [quote]Close but not quite; this is very dependent on specific chip architecture. I agree the clock inputs will always have power applied, but that does not require all other modules to flollop about under a full undirected voltage draw. Clockless chips, if sloppily designed, can also have this problem.<hr></blockquote>



    Yes, but clockless chips, if sloppily designed, will be lucky to work at all. The point is that not only is the possibility there (while it isn't in clocked designs), it's being aggressively implemented in the current designs and touted as a major advantage. It's a feature that's real and available now. Since computers are only getting smaller and less tethered, this advantage will only grow more significant.



    So even if it isn't true by default (what is, if you assume incompetent engineering, other than failure?), it's close enough to be counted as a given advantage of the design philosophy.
  • Reply 7 of 15
    airsluf Posts: 1,861 member
  • Reply 8 of 15
    airsluf Posts: 1,861 member
  • Reply 9 of 15
    [quote]Originally posted by AirSluf:

    <strong>As an aside on the article, asynchronous does not necessarily mean clockless, so they are mixing their metaphors quite liberally. Most of the applications mentioned are DSPs, which do a small number of complex operations very fast. Asynchronous or clockless operation for those types of chips is a very large advantage, as the processes are fixed and well defined, making it much easier to use simpler tricks and fewer transistors to make sure they have deterministic results.</strong><hr></blockquote>

    Are you suggesting that instead of one CPU that does everything with the same clock, we build a bunch of smaller CPUs, each with its own specific purpose and clock, that work together and happen to be on the same die? That sounds like a good idea. If nothing else, it would get people used to not depending on the clock for everything. I bet it could prove to be a valuable transitional step between fully clocked CPUs and fully unclocked CPUs.
  • Reply 10 of 15
    [quote]Originally posted by Whisper:

    <strong>

    Are you suggesting that instead of one CPU that does everything with the same clock, we build a bunch of smaller CPUs, each with its own specific purpose and clock, that work together and happen to be on the same die? That sounds like a good idea. If nothing else, it would get people used to not depending on the clock for everything. I bet it could prove to be a valuable transitional step between fully clocked CPUs and fully unclocked CPUs.</strong><hr></blockquote>





    No, he's suggesting looking at extending designs already in use, such as Intel's Pentium 4 ALU, which runs at twice the speed of the common clock. This could be applied to additional MPU components such as fetch/decode, schedulers, and non-ALU execution units. Each could have its own independent clock. But again, with asynchronous operation you run into the complexity of ensuring your signals are still arriving at the proper destination in time and aren't too late or waiting around somewhere.
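

    As a toy picture of that in Python - the ratios are invented, apart from the 2x ALU already mentioned - each unit just ticks at its own multiple of a common base clock:

        # Units ticking at different multiples of a 1 GHz base clock.
        base_clock_ghz = 1.0
        ticks_per_base_cycle = {"fetch_decode": 1, "scheduler": 1, "int_alu": 2}

        events = []
        for cycle in range(3):                       # three base-clock cycles
            for unit, ratio in ticks_per_base_cycle.items():
                for sub in range(ratio):
                    t_ns = (cycle + sub / ratio) / base_clock_ghz
                    events.append((t_ns, unit))

        for t_ns, unit in sorted(events):            # the int_alu ticks twice as often
            print(f"t={t_ns:.1f} ns  {unit} ticks")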
  • Reply 11 of 15
    airsluf Posts: 1,861 member
  • Reply 12 of 15
    amorph Posts: 7,112 member
    This just occurred to me - maybe it's the way the P4 does it - but it seems to my non-engineer self that you could have the slower parts of the chip post on the trailing edge of the clock, and have the faster parts post on both the leading and trailing edges, like DDR. Perhaps you could have quad-pumped sections as well. Everything's still on the same clock, so the timing issues are a little more manageable.



    Hmm.



    Still, what if a double-pumped part of the chip posts an answer that a single-pumped part requires on the leading edge of the clock? It still has to lie idle for one half-tick, or else it needs two result registers - "up" and "down", and some way of telling its slower brethren which one their answer's in.
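

    In toy Python form - purely illustrative, no real chip labels its latches this way - that "up"/"down" scheme might look like:

        class DoublePumpedUnit:
            """Posts a result on either half of the clock cycle."""
            def __init__(self):
                self.slots = {"up": None, "down": None}    # one result slot per half-tick

            def post(self, half, value):
                self.slots[half] = value      # half is "up" (leading) or "down" (trailing)
                return half                   # tell the slower consumer where to look

        class SinglePumpedConsumer:
            """Only samples on the leading edge, so it must be told which slot to read."""
            def read(self, unit, which_slot):
                return unit.slots[which_slot]

        alu = DoublePumpedUnit()
        tag = alu.post("down", 99)            # the fast unit finished on the trailing edge
        consumer = SinglePumpedConsumer()
        print(consumer.read(alu, tag))        # prints 99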
  • Reply 13 of 15
    airsluf Posts: 1,861 member
    .
  • Reply 14 of 15
    Hey James... Thanks for all the keystrokes. Good work.

    -will
  • Reply 15 of 15
    amorph Posts: 7,112 member
    AirSluf, Eskimo, Whisper - thanks for all the clarification!



    This thread rocks.