Simple basic-block instruction scheduler


This is an example of a "list scheduler" for basic blocks.
The programs have a built-in matrix multiply example (C = A Bt, i.e. A times the transpose of B).
The size of the matrix problem is specified by cmd-line params.
A program param specifies the max number of registers available.
No registers are "pinned" in these examples. All vars are assumed to come from memory and need to be returned to memory. (This is not true of the "more results" item, listed at the bottom of the page.)
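
As a rough illustration of what a list scheduler for a basic block does, here is a minimal Python sketch. It is not one of the programs listed on this page. The op names, the latency table and the single-issue machine model are all assumptions made for the example, and it ignores the register limit entirely. It builds the straight-line ops for C = A Bt, then repeatedly issues the ready op with the longest latency-weighted path to the end of the block.

```python
from collections import namedtuple

# One straight-line operation: destination name, opcode, source names.
Op = namedtuple("Op", "dest opcode srcs")

# Assumed result latencies in cycles (illustrative, not measured values).
LATENCY = {"load": 2, "mul": 3, "add": 1, "store": 1}


def matmul_block(n):
    """Emit the basic block for C = A * B^T on n x n matrices.

    Every element of A and B is loaded once; every element of C is stored.
    Ops are emitted in an order where producers precede consumers.
    """
    ops = []
    for i in range(n):
        for k in range(n):
            ops.append(Op(f"a_{i}_{k}", "load", ()))
    for j in range(n):
        for k in range(n):
            ops.append(Op(f"b_{j}_{k}", "load", ()))
    for i in range(n):
        for j in range(n):
            acc = None
            for k in range(n):
                prod = f"p_{i}_{j}_{k}"
                # B is indexed [j][k] because the example multiplies by B^T.
                ops.append(Op(prod, "mul", (f"a_{i}_{k}", f"b_{j}_{k}")))
                if acc is None:
                    acc = prod
                else:
                    total = f"s_{i}_{j}_{k}"
                    ops.append(Op(total, "add", (acc, prod)))
                    acc = total
            ops.append(Op(f"c_{i}_{j}", "store", (acc,)))
    return ops


def list_schedule(ops):
    """Return {op index: issue cycle} for a single-issue machine."""
    producer = {op.dest: idx for idx, op in enumerate(ops)}
    preds = [[producer[s] for s in op.srcs] for op in ops]
    succs = [[] for _ in ops]
    for idx, ps in enumerate(preds):
        for p in ps:
            succs[p].append(idx)

    # Priority = longest latency-weighted path from this op to the block end.
    prio = [0] * len(ops)
    for idx in reversed(range(len(ops))):
        tail = max((prio[s] for s in succs[idx]), default=0)
        prio[idx] = LATENCY[ops[idx].opcode] + tail

    issue = {}
    remaining = set(range(len(ops)))
    cycle = 0
    while remaining:
        # An op is ready once all of its inputs have been produced.
        ready = [
            idx for idx in remaining
            if all(p in issue and issue[p] + LATENCY[ops[p].opcode] <= cycle
                   for p in preds[idx])
        ]
        if ready:
            # Highest priority first; ties go to the earlier op.
            best = max(ready, key=lambda idx: (prio[idx], -idx))
            issue[best] = cycle
            remaining.remove(best)
        cycle += 1  # an empty ready list means a stall cycle
    return issue


if __name__ == "__main__":
    block = matmul_block(3)
    cycles = list_schedule(block)
    for idx in sorted(cycles, key=cycles.get):
        op = block[idx]
        print(f"{cycles[idx]:3d}  {op.opcode:5s} {op.dest} {' '.join(op.srcs)}")
```

Replacing the priority key with plain program order gives a first-come-first-served variant, and allowing more than one issue per cycle would be one way to approximate a wider machine. Both are changes to this sketch, not descriptions of the linked programs.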

I've used similar simple programs to generate world-beating matrix multiply subroutines (e.g. sustained 4 GFLOPS on 2 GHz AMD CPUs).

FCFS scheduler
Prioritised scheduler
Pipeline scheduler
Super-scalar scheduler
Super-pipeline scheduler
VLIW scheduler
Example 3x3 matrix multiply from FCFS scheduler
Example 3x3 matrix multiply from prio scheduler
Example 3x3 matrix multiply from pipeline scheduler
Example 3x3 matrix multiply from SS scheduler
Example 3x3 matrix multiply from super-pipeline scheduler
Example 3x3 matrix multiply from VLIW scheduler
Schedule translated into "3dnow code"
Just drop it into your favourite C code timing loop.
Schedule translated into "sse1 code"
Just drop it into your favourite C code timing loop.
Execution results (speeds in MFLOPS for 3 architectures)
More speed results for this code translated into 500x500 MATMULs

Kym Horsell /
Kym@KymHorsell.COM

ADVISORY: Email to these sites is filtered. Unsolicited email may be automajically re-directed to the relevant postmaster.