The speed of high-radix digit-recurrence dividers is mainly determined by the
hardware complexity of the quotient-digit selection function. In this paper we
present a scheme that combines the area efficiency of bundled data with
data-dependent computation time.
In this scheme the selection function is very simple and may be implemented
using a fast adder. This function speculates the result digit; when the
speculation is incorrect, the quotient and the residual must be corrected.
When the residual satisfies certain constraints, it is also possible to switch
to a higher radix, computing a fraction of the next digit in advance.
This results in a division scheme with a variable iteration time and a variable
number of iterations, and hence with asynchronous behaviour.
Several designs were realized and compared in terms of both execution time and
area. The fastest unit considered is a radix-64 divider that may switch to
radix 128 or 256.
Our evaluations show that area $\times$ delay savings of $25\%$ to $65\%$
compared to equivalent synchronous designs may be achieved.
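
The speculate-then-correct idea can be illustrated with a small software model. The sketch below is only an assumption-laden illustration, not the selection function of the paper: the name `speculative_divide`, the parameter `kept_bits`, and the truncation-based estimate are all hypothetical, and the switch to a higher radix is not modelled. Each iteration guesses a quotient digit from truncated operands and then repairs the digit and the residual if the guess was wrong, so the amount of work per digit depends on the data.

```python
def speculative_divide(x, d, radix=64, digits=4, kept_bits=8):
    """Digit-recurrence division with a speculated digit and a correction step.

    Returns (quotient, residual, corrections) such that
    x * radix**digits == quotient * d + residual and 0 <= residual < d.
    All names and the estimate below are illustrative assumptions.
    """
    if not 0 <= x < d:
        raise ValueError("expected 0 <= x < d")

    # Number of low-order bits ignored by the cheap digit estimate.
    drop = max(d.bit_length() - kept_bits, 0)
    d_est = d >> drop  # truncated divisor, never zero because d >= 1

    quotient, residual, corrections = 0, x, 0
    for _ in range(digits):
        shifted = residual * radix
        # Speculation: estimate the digit from truncated operands only.
        digit = (shifted >> drop) // d_est
        residual = shifted - digit * d
        # Correction: the guess was too large, step the digit down.
        while residual < 0:
            digit -= 1
            residual += d
            corrections += 1
        # Correction: the guess was too small, step the digit up.
        while residual >= d:
            digit += 1
            residual -= d
            corrections += 1
        quotient = quotient * radix + digit
    return quotient, residual, corrections
```

A call such as `speculative_divide(355, 1130, radix=64, digits=3)` returns the three radix-64 quotient digits packed into one integer, the final residual, and the number of correction steps taken. The correction count varies with the operands, which is the software analogue of the variable iteration time described above.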