SLCentral - Your logical choice for computing and technology

    Fundamentals Of Multithreading
    Author: Paul Mazzucco
    Date Posted: June 15th, 2001

    Applications Of Multithreading: Redundancy Is Faster?

    To compensate for the fact that IA-64 can't execute instructions out of order, one of the features Intel chose to use was predication. The idea is to do the work twice, once for each of the two possible outcomes of a branch (an if/else statement), and to keep only the result from the path that turns out to be correct. This is actually useful: because IA-64 calls for in-order processing, most functional units would otherwise sit idle.
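
    As a rough illustration of the idea (a Python sketch, not IA-64 code; the function names are made up for this example), both arms of an if/else can be computed unconditionally, with a predicate deciding which result is kept:

```python
def branchy_abs_or_double(x):
    """Ordinary control flow: only one arm of the if/else executes."""
    if x < 0:
        return -x
    else:
        return 2 * x


def predicated_abs_or_double(x):
    """Predication-style version: both arms execute, one result is kept."""
    p = int(x < 0)          # predicate: 1 if the "if" arm is the real one, else 0
    if_result = -x          # work for the "if" arm, done unconditionally
    else_result = 2 * x     # work for the "else" arm, done unconditionally
    # Select without branching: the arm whose predicate is 0 contributes nothing.
    return p * if_result + (1 - p) * else_result


if __name__ == "__main__":
    for value in (-3, 0, 4):
        print(value, branchy_abs_or_double(value), predicated_abs_or_double(value))
```

    The second version always does twice the arithmetic, which only pays off when the extra work lands on hardware that would have been idle anyway.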

    For some branches, there is no good way to predict which path to take. An extension of SMT, called Threaded Multi-Path, does at the thread level what predication does at the instruction level: instead of guessing on a hard-to-predict branch, it follows both paths at once and discards the unused result.[15]
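
    The same "do both, discard one" idea can be sketched at a coarser grain. The toy below is only an assumption-laden illustration (multipath_branch, add_ten and negate are hypothetical names, and copying a dictionary stands in for giving each path its own hardware thread context):

```python
import copy


def multipath_branch(state, condition, taken_path, fallthrough_path):
    """Run both sides of a branch on private state copies, keep the right one."""
    # Fork: each path gets its own copy of the state, standing in for a
    # separate hardware thread context.
    taken_state = taken_path(copy.deepcopy(state))
    fallthrough_state = fallthrough_path(copy.deepcopy(state))
    # The branch condition is resolved last; the losing copy is simply dropped.
    return taken_state if condition(state) else fallthrough_state


def add_ten(state):
    """Hypothetical work on the taken path."""
    state["x"] += 10
    return state


def negate(state):
    """Hypothetical work on the fall-through path."""
    state["x"] = -state["x"]
    return state


if __name__ == "__main__":
    start = {"x": 5}
    end = multipath_branch(start, lambda s: s["x"] > 3, add_ten, negate)
    print(start, end)
```

    Because each path works on its own copy, the discarded path leaves no trace in the surviving state.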

    Another processing concept was originally known as "Cooperative Redundant Threads" and is now called Slipstream processing. A Slipstream processor actually does the work of the whole program twice! And yet, it ends up running faster. The name stems from NASCAR, of all places.

    Slipstreaming works by using two threads that start out exactly the same: the A-stream (advance stream) and the R-stream (redundant stream). The R-stream remains an unmodified thread of the original program, while the hardware identifies instructions that don't have any apparent effect and strips them from the A-stream.
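
    The text doesn't spell out which instructions count as having no apparent effect. As one assumed example, a register write whose value is never used before being overwritten can be dropped without changing the outcome. The sketch below (strip_dead_writes and the trace format are invented for illustration, and this is software, not the hardware mechanism) walks a straight-line trace backwards and keeps only the writes that matter:

```python
def strip_dead_writes(trace, live_out):
    """Drop writes whose value never reaches anything that matters.

    trace: list of (dest, sources) tuples in program order.
    live_out: set of registers whose final values are needed.
    """
    live = set(live_out)
    kept = []
    # Walk backwards so we know, at each write, whether a later read needs it.
    for dest, sources in reversed(trace):
        if dest in live:
            kept.append((dest, sources))
            live.discard(dest)      # this write satisfies the later reads
            live.update(sources)    # and in turn needs its own inputs
        # Otherwise nothing downstream reads this value, so it is skipped.
    kept.reverse()
    return kept


if __name__ == "__main__":
    full_trace = [
        ("r1", ("r0",)),   # overwritten below before anyone reads it
        ("r2", ("r0",)),
        ("r1", ("r2",)),
        ("r3", ("r1",)),
    ]
    for instruction in strip_dead_writes(full_trace, live_out={"r3"}):
        print(instruction)
```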

    Because the shortened A-stream runs slightly ahead of the R-stream and communicates with it through a delay buffer, the R-stream is able to get information about how the program will execute, even before it executes! In a sense, this is a real-time version of the feedback-driven compiling that Intel uses with IA-64, where a program is compiled, run, and profiled, and the profile is then handed back to the compiler to produce a faster executable.

    The A-stream, now shortened, tells the unmodified R-stream which branch it should take. The A-stream runs faster because it is a smaller executable; current techniques have shown a decrease in instruction count by up to 50% on average.[16] The R-stream runs faster because many of its branches are already resolved by the time it needs the answers, so it has to fall back on branch prediction even less often.
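
    The interplay between the two streams can be mimicked in a few lines of Python. This is a toy model under stated assumptions, not the scheme from [16]: the names reference_run, slipstream_run and lead are hypothetical, the A-stream is reduced to nothing but the branch conditions, and the sketch does not model how the hardware shortens the A-stream or what happens if the A-stream is wrong.

```python
from collections import deque


def reference_run(values):
    """The original program: branch on each value, with no outside help."""
    total = 0
    for v in values:
        if v % 2 == 0:
            total += v
        else:
            total -= v
    return total


def slipstream_run(values, lead=4):
    """Toy A-stream/R-stream pair sharing a FIFO delay buffer."""
    if lead < 1:
        raise ValueError("the A-stream must run at least one branch ahead")
    delay_buffer = deque()
    a_pos = 0           # how far the A-stream has progressed
    total = 0
    hints_used = 0
    for r_pos, v in enumerate(values):
        # A-stream: stay up to `lead` branches ahead, recording each outcome.
        while a_pos < len(values) and a_pos < r_pos + lead:
            delay_buffer.append(values[a_pos] % 2 == 0)
            a_pos += 1
        # R-stream: does the full work, taking the branch outcome from the
        # buffer instead of guessing it.
        take_even_path = delay_buffer.popleft()
        hints_used += 1
        if take_even_path:
            total += v
        else:
            total -= v
    return total, hints_used


if __name__ == "__main__":
    data = [3, 8, 5, 2, 7]
    print(reference_run(data), slipstream_run(data))
```

    In this model every branch in the R-stream arrives with its outcome already known, which is the effect the delay buffer is meant to provide.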

    The original way to build a slipstream processor was to use a 2-way CMP chip with two simple CPUs, each with half the execution resources of the more robust processor that one would normally design. With this approach, the Slipstream CMP processor achieved a speedup of ~12% over the larger, traditional superscalar core[16] (though some programs did run substantially slower than on the superscalar). Thus there are ways of using the second CPU in a CMP processor even when there are no additional threads to run. Additional performance increases are possible because the two smaller CPUs are able to run faster due to less complexity in each.

    Another approach is to use a base SMT architecture, which is an extension of the large superscalar, and to run the A-stream and R-stream on that. The key here is that an SMT processor with no additional threads to run would normally act as a regular superscalar; Slipstreaming attempts to speed up that single thread by entirely different means than DMT. An interesting comparison would be between the performance (and design issues) of DMT and Slipstreaming on an SMT processor (as both are base SMT processors), and between a DMT processor and a similarly equipped CMP Slipstreaming processor.
