Presents a new method of performing division in hardware and explores different ways of implementing it. This method involves computing a preliminary estimate of the quotient by splitting the dividend, performing division of each of the parts in parallel and merging them. The estimate is refined iteratively to get the final quotient. This method is significantly fast since it carries out parallel operations to compute the preliminary quotient and makes use of a fast multiplier to refine the result. It is possible to pipeline the execution of the unit yielding further increase in throughput. Speed estimates show that this method yields a much higher throughput than other fast methods, while area and latency are comparable.