
Re: [oc] Beyond Transmeta...

To: <cores@opencores.org>
Subject: Re: [oc] Beyond Transmeta...
From: "Jim Dempsey" <tapedisk@ameritech.net>
Date: Tue, 11 Feb 2003 12:19:59 -0600
References: <200302091606.h19G6oN02285@www.lampret.com> <3E47A032.2070704@comsys.se> <00d201c2d12c$cd6f8800$0202a8c0@SARAH> <200302110853.12808.markom@opencores.org>
Reply-To: cores@opencores.org
Sender: owner-cores@opencores.org

----- Original Message -----
From: "Marko Mlinar" <markom@opencores.org>
Sent: Tuesday, February 11, 2003 1:53 AM
Subject: Re: [oc] Beyond Transmeta...

<snip>

> In practice, you would find it hard to make a multiplier that would fit your
> purpose, also your logic would switch many times, consuming more power than
> standard circuits, and not to speak of multi-phase clock issues.

There is not much standard about the circuit. So you invent a new serial multiplier:

abcd x efgh

Becomes: (set typeface to courier)

    abcd (h) +
   abcd  (g) +
  pppppp     +
  abcd   (f) +
 ppppppp     +
 abcd    (e) +
pppppppp

Where (x) indicates a conditional pull of the bit stream through an adder
(else a pull of 0's). The product is fully complete in 11 clocks, but is
available for use after 1 clock. Note that the lsb of the product is
immutable after 1 clock, the 2nd lsb is immutable after 2 clocks, and so
on; i.e. each bit of the product is available for additional operations as
it emerges. Therefore, if you were to incorporate the multiply above into a
multiply and accumulate operation (MAC), i.e.

result = (abcd x efgh) + ijkl

then the addition of ijkl can begin after only 1 clock tick of the bitstream.
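
To make the bit-by-bit behaviour concrete, here is a rough behavioural
sketch in Python. It is not a description of any real circuit: the names
(bits_lsb_first, serial_add, serial_multiply, serial_mac, from_bits) are
made up for illustration, and it assumes unsigned operands, LSB-first
streams and no pipeline registers between adders, so it will not reproduce
the 11-clock figure quoted above.

```python
def bits_lsb_first(value, n):
    """Yield the n low bits of value, least significant bit first."""
    for i in range(n):
        yield (value >> i) & 1


def serial_add(xs, ys):
    """1-bit serial adder: one full-adder cell plus a carry flip-flop."""
    carry = 0
    for x, y in zip(xs, ys):
        total = x + y + carry
        yield total & 1        # sum bit leaves the adder on this clock
        carry = total >> 1     # carry is held for the next clock


def serial_multiply(a, b, width=4):
    """Return a stream of the 2*width bits of a*b, LSB first."""
    n = 2 * width
    stream = bits_lsb_first(0, n)
    for i in range(width):
        # Row i is 'a' shifted left by i when bit i of b is set, else 0's
        # (the conditional pull marked (x) in the diagram).
        row = (a << i) if (b >> i) & 1 else 0
        stream = serial_add(stream, bits_lsb_first(row, n))
    return stream


def serial_mac(a, b, c, width=4):
    """(a * b) + c, adding c to each product bit as it emerges."""
    n = 2 * width
    return serial_add(serial_multiply(a, b, width), bits_lsb_first(c, n))


def from_bits(bits):
    """Reassemble an LSB-first bit stream into an integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

Because the streams are lazy generators, next(serial_mac(13, 11, 7)) pulls
exactly one bit through every adder and returns the final lsb of the
result, while from_bits(serial_mac(13, 11, 7)) drains the whole stream and
gives 13 * 11 + 7 = 150.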

Re: power. It could be much less than with conventional means.

The multiply requires 4 1-bit serial adders, each performing 4 additions,
which is 16 1-bit cell operations and no latch operations. The routing
logic is not illustrated above, so that would increase power consumption.

The traditional multiply would require perhaps 4 4-bit adder operations,
4 4-bit latch operations and 4 9-bit shift register operations (plus
additional operations): at least 16 + 16 + 36 = 68 1-bit cell operations.
This indicates that bitstream could consume about 1/4 the power of
conventional means (at least for this example).

Using the assumption that the bitstream can clock at word width times the
rate of the parallel implementation, the traditional method computes the
MAC in ((4 adds + 4 shift/latch) + add) x 4, or 36, clock times of the
bitstream method. Not as good as the 50x shown earlier.
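
The back-of-envelope numbers above can be rechecked with a few lines of
Python. This only restates the estimates given in this message for the
4-bit example; the constant names are invented here, and counting 1-bit
cell operations is only a crude stand-in for power.

```python
WORD_WIDTH = 4  # bits per operand in the abcd x efgh example

# Bitstream multiply: 4 serial adders, each doing 4 one-bit additions.
BITSTREAM_OPS = 4 * 4

# Conventional multiply, per the estimate in the text:
ADDER_OPS = 4 * 4   # 4 operations on a 4-bit adder
LATCH_OPS = 4 * 4   # 4 operations on a 4-bit latch
SHIFT_OPS = 4 * 9   # 4 operations on a 9-bit shift register
CONVENTIONAL_OPS = ADDER_OPS + LATCH_OPS + SHIFT_OPS

# Conventional MAC: (4 adds + 4 shift/latch) + 1 add, where each parallel
# step is assumed to last WORD_WIDTH bitstream clocks.
PARALLEL_MAC_CLOCKS = ((4 + 4) + 1) * WORD_WIDTH

if __name__ == "__main__":
    print("cell operations:", BITSTREAM_OPS, "vs", CONVENTIONAL_OPS)
    print("ratio:", round(BITSTREAM_OPS / CONVENTIONAL_OPS, 3))
    print("parallel MAC, in bitstream clocks:", PARALLEL_MAC_CLOCKS)
```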

Also note, as you go wider in word width the parallel method must slow
down for carry propagation, whereas the bitstream does not.
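
As a rough illustration of the carry point, the sketch below measures how
far a carry ripples in a plain ripple-carry adder. The function name
ripple_carry_chain is invented here, and a simple ripple-carry design is an
assumption: real parallel adders often use carry-lookahead or similar
schemes to shorten this path.

```python
def ripple_carry_chain(a, b, width):
    """Longest run of adjacent bit positions a carry travels through
    when a + b is computed by a width-bit ripple-carry adder."""
    longest = run = 0
    carry = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        # Carry out is generated (x and y) or propagated (carry in, x != y).
        carry = (x & y) | (carry & (x ^ y))
        run = run + 1 if carry else 0
        longest = max(longest, run)
    return longest


if __name__ == "__main__":
    for width in (4, 8, 16, 32):
        # Worst case: all ones plus one, the carry crosses every cell.
        print(width, ripple_carry_chain((1 << width) - 1, 1, width))
```

In the worst case the chain is as long as the word, so the settling time of
this kind of parallel adder grows with width, whereas a serial adder passes
through a single full-adder cell per clock whatever the word width.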

There are a lot of unknowns here, so don't be so quick to assume anything
about power consumption. A general rule of thumb, though, is that if you
can generate the same result with less work you will consume less power.

> But even when leaving aside the implementation issues, you have will problems
> with loops, function calls and sw model, especially with PLD idea.

Why think in terms of loops and function calls? Go out of the box.
Start with a clean sheet of paper.

> There is also problem of debugging.

Initial debugging would be done through emulation, not unlike what you do
now (synthesis). When the routing is proven, it would then be incorporated
into the larger project and tested again.

Jim Dempsey

