Multiscalar processors use a new implementation paradigm for extracting large quantities of instruction-level parallelism (ILP) from ordinary high-level language programs. Both control and data dependences are handled aggressively by executing instructions speculatively, based on predictions made at run time by hardware. Unlike other known ILP processing paradigms (such as superscalar and VLIW), the ``frontend'' of a multiscalar processor speculatively distributes large chunks of conventional program code (or ``tasks'') to a number of parallel processing elements without stopping to look at the individual instructions contained within a task, which may include any number of conditional branches and procedure calls. Each of the parallel processing elements then operates on its task using its own program counter and its own physical copy of the single logical register file. Data dependences are resolved by a combination of hardware and software, with hardware given more responsibility than in currently used ILP paradigms.
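The task-distribution idea can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Kestrel design: the names `Task`, `ProcessingElement`, and `dispatch`, the four-entry register file, the single add-immediate instruction form, and the round-robin assignment are all hypothetical. The model runs tasks one after another, so it captures only the logical ordering and the hand-off of register values between processing elements; it does not model overlapped execution, prediction, or recovery from misspeculation.

```python
from dataclasses import dataclass, field

NUM_REGS = 4  # assumed size of the logical register file (illustrative only)


@dataclass
class Task:
    name: str
    # Each instruction is (dest, src, imm), meaning regs[dest] = regs[src] + imm.
    instrs: list


@dataclass
class ProcessingElement:
    pe_id: int
    pc: int = 0  # each element has its own program counter
    regs: list = field(default_factory=lambda: [0] * NUM_REGS)

    def run(self, task, regs_in):
        # Take a private physical copy of the single logical register file.
        self.regs = list(regs_in)
        self.pc = 0
        while self.pc < len(task.instrs):
            dest, src, imm = task.instrs[self.pc]
            self.regs[dest] = self.regs[src] + imm
            self.pc += 1
        return list(self.regs)


def dispatch(tasks, num_pes):
    """Hand whole tasks to elements in round-robin order.

    The dispatcher never inspects the instructions inside a task; it only
    passes the task along and forwards the resulting register values.
    """
    pes = [ProcessingElement(i) for i in range(num_pes)]
    regs = [0] * NUM_REGS
    trace = []
    for i, task in enumerate(tasks):
        pe = pes[i % num_pes]
        regs = pe.run(task, regs)
        trace.append((task.name, pe.pe_id))
    return regs, trace


tasks = [
    Task("A", [(0, 0, 5)]),  # r0 = r0 + 5
    Task("B", [(1, 0, 2)]),  # r1 = r0 + 2, depends on task A
    Task("C", [(2, 1, 1)]),  # r2 = r1 + 1, depends on task B
]
final_regs, trace = dispatch(tasks, num_pes=2)
```

In this toy run, task B reads a register produced by task A on a different element, which is the kind of inter-task data dependence that the hardware and compiler must resolve together in a real multiscalar design.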
The multiscalar paradigm is fundamentally new and contains a large variety of promising and interesting design alternatives. A natural consequence is that it also requires new, never-before-designed hardware structures and compiler algorithms. Hardware and software designs will be evaluated to provide an accurate picture of the performance capabilities of the multiscalar paradigm and to highlight the significant design issues that must be overcome for the paradigm to enter the mainstream. A design project for a specific multiscalar implementation forms the basis for the investigation. This implementation, named Kestrel, will have a detailed instruction set architecture (ISA), a detailed hardware logic design, and working compilers. The Kestrel processor will be evaluated through several layers of simulation, ranging from high-level models written in C, to logic-level models written in Verilog, to circuit-level SPICE models for key elements.