Simple Basic-block instruction scheduler
This is an example of a "list scheduler" for basic blocks.
The programs have a built-in matrix multiply example (C = A Bt, i.e. A times the transpose of B).
The size of the matrix problem is specified by command-line parameters.
A program param specifies the max number of registers available.
No registers are "pinned" in these examples. All vars
are assumed to come from and need to be returned to memory.
(This is not true of the "more results" item, listed at the bottom
of the page).
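
To make the "everything comes from and returns to memory" assumption concrete, here is a minimal sketch of the straight-line basic block for C = A Bt. It is an illustration only, not code from the programs listed on this page: the function name matmul_bt_block, the (dest, opcode, sources) tuple layout and the virtual register names are all hypothetical, and the register limit is ignored.

```python
def matmul_bt_block(n):
    """Return the straight-line operations for C = A * B^T on n x n matrices.

    Each operation is a tuple (dest, opcode, sources). Every input element is
    loaded from memory into a (hypothetical) virtual register and every result
    is stored back, so nothing is assumed to be pinned in a register.
    """
    ops = []
    # Load every element of A and B once.
    for i in range(n):
        for k in range(n):
            ops.append((f"a{i}_{k}", "load", (f"A[{i}][{k}]",)))
            ops.append((f"b{i}_{k}", "load", (f"B[{i}][{k}]",)))
    for i in range(n):
        for j in range(n):
            acc = None
            for k in range(n):
                # B is used transposed: C[i][j] = sum over k of A[i][k] * B[j][k]
                prod = f"p{i}_{j}_{k}"
                ops.append((prod, "mul", (f"a{i}_{k}", f"b{j}_{k}")))
                if acc is None:
                    acc = prod
                else:
                    total = f"s{i}_{j}_{k}"
                    ops.append((total, "add", (acc, prod)))
                    acc = total
            # Write the finished dot product back to memory.
            ops.append((f"C[{i}][{j}]", "store", (acc,)))
    return ops


block = matmul_bt_block(3)
print(len(block))  # 72 operations for the 3x3 case
```

For the 3x3 case this gives 18 loads, 27 multiplies, 18 adds and 9 stores. A list like this is the kind of input a basic-block scheduler reorders.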
I've used similar simple programs to generate world-beating matrix
multiply subroutines (e.g. a sustained 4 GFLOPS on 2 GHz AMD CPUs).
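
As a rough idea of what such a program does, here is a minimal sketch of a list scheduler for one basic block. It is not the code behind the links on this page; it is a generic textbook-style version written under stated assumptions. The names list_schedule, ops, latency, width and prioritised are hypothetical, the latencies are made up, the priority is the longest latency-weighted path to the end of the block, and the register limit is not modelled. The prioritised flag is only my reading of the difference between first-come-first-served and priority ordering.

```python
def list_schedule(ops, latency, width=1, prioritised=True):
    """Greedy list scheduler for one basic block.

    ops:      dict name -> (opcode, tuple of predecessor names), in program order
    latency:  dict opcode -> cycles until the result is usable (assumed >= 1)
    width:    number of operations that may issue per cycle
    Returns a dict name -> issue cycle. The dependence graph must be acyclic.
    """
    # Build the successor lists of the dependence graph.
    succs = {name: [] for name in ops}
    for name, (_, preds) in ops.items():
        for p in preds:
            succs[p].append(name)

    # Priority of an op: longest latency-weighted path from it to the block end.
    height = {}

    def longest_path(name):
        if name not in height:
            tail = max((longest_path(s) for s in succs[name]), default=0)
            height[name] = latency[ops[name][0]] + tail
        return height[name]

    for name in ops:
        longest_path(name)

    issue = {}
    cycle = 0
    while len(issue) < len(ops):
        # Ready list: unissued ops whose inputs have all completed by this cycle.
        ready = [
            name for name, (_, preds) in ops.items()
            if name not in issue
            and all(p in issue and issue[p] + latency[ops[p][0]] <= cycle
                    for p in preds)
        ]
        if prioritised:
            # Stable sort: ties keep program order.
            ready.sort(key=lambda name: -height[name])
        for name in ready[:width]:
            issue[name] = cycle
        cycle += 1
    return issue


# One dot-product step, s = a*b + c*d, with made-up latencies.
latency = {"load": 2, "mul": 3, "add": 1}
ops = {
    "a": ("load", ()),
    "b": ("load", ()),
    "p": ("mul", ("a", "b")),
    "c": ("load", ()),
    "d": ("load", ()),
    "q": ("mul", ("c", "d")),
    "s": ("add", ("p", "q")),
}
fcfs = list_schedule(ops, latency, prioritised=False)
prio = list_schedule(ops, latency, prioritised=True)
print(fcfs["s"], prio["s"])
```

On this small example the final add issues at cycle 9 in program (first-come-first-served) order and at cycle 8 with priorities, because the priority version issues the last load before the first multiply. With width=2 it issues at cycle 6.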
- FCFS scheduler
- Prioritised scheduler
- Pipeline scheduler
- Super-scalar scheduler
- Super-pipeline scheduler
- VLIW scheduler
- Example 3x3 matrix multiply from FCFS scheduler
- Example 3x3 matrix multiply from prio scheduler
- Example 3x3 matrix multiply from pipeline scheduler
- Example 3x3 matrix multiply from SS scheduler
- Example 3x3 matrix multiply from super-pipeline scheduler
- Example 3x3 matrix multiply from VLIW scheduler
- Schedule translated into "3dnow code"
  (just drop it into your favourite C code timing loop)
- Schedule translated into "sse1 code"
  (just drop it into your favourite C code timing loop)
- Execution results (speeds in MFLOPS for 3 architectures)
- More speed results for this code translated into
500x500 MATMULs
Kym Horsell /
Kym@KymHorsell.COM
ADVISORY:
Email to these sites is filtered. Unsolicited email may
be automagically re-directed to the relevant postmaster.