SLCentral - Your logical choice for computing and technology
    Fundamentals Of Multithreading
    Author: Paul Mazzucco
    Date Posted: June 15th, 2001

    Coarse-Grained Multithreading

    As in the comparison between a traditional superscalar processor and a CMP, a fair comparison requires that each processor be allotted the same number of functional units, the same cache and cache-line sizes, and so on. Again, we use the handy graphics as a means of comparing two processing paradigms:


    'a' represents the traditional superscalar, while 'b' represents a coarse-grained multithreading architecture.

    A CMP places its processors on the same physical die, can share the L2 cache (whether it is on-die or off), and executes two or more threads at the same time, depending upon the number of processors on the die. Coarse-grained multithreading (CMT) architectures do not execute threads simultaneously. Instead, CMT improves the usage of the functional units by executing one thread for a certain number of clock cycles and then switching to another. The efficiency is improved due to a decrease in vertical waste. Vertical waste describes cycles in which none of the functional units are working because the running thread has stalled.
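
    To make the idea of vertical waste concrete, here is a minimal sketch that counts it in a grid of issue slots. The grids, the 4-wide machine, and the function names are made up for illustration, and the cost of switching threads is ignored.

```python
def vertical_waste(issue_grid):
    """Count cycles in which no functional unit did any work."""
    return sum(1 for cycle in issue_grid if not any(cycle))


def utilization(issue_grid):
    """Fraction of all issue slots that were actually used."""
    total_slots = sum(len(cycle) for cycle in issue_grid)
    used_slots = sum(sum(cycle) for cycle in issue_grid)
    return used_slots / total_slots


# Hypothetical 4-wide machine observed for 6 cycles.
# Each row is one cycle; 1 = functional unit busy, 0 = idle.
single_thread = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],  # thread stalled: a wasted cycle
    [0, 0, 0, 0],  # still stalled
    [1, 0, 0, 0],
    [0, 0, 0, 0],  # stalled again
    [1, 1, 1, 0],
]

# Same machine, but a second thread runs while the first is stalled.
coarse_grained = [
    [1, 1, 0, 0],  # thread A
    [1, 0, 0, 0],  # thread B
    [1, 1, 0, 0],  # thread B
    [1, 0, 0, 0],  # thread A
    [1, 1, 1, 0],  # thread B
    [1, 1, 1, 0],  # thread A
]

print(vertical_waste(single_thread), utilization(single_thread))
print(vertical_waste(coarse_grained), utilization(coarse_grained))
```

    Note that the unused slots inside a busy cycle are untouched in the second grid: in this sketch, switching threads only fills the cycles that were completely empty.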

    When switching to another thread, the processor saves the state of the current thread (i.e., it saves where instructions are in the pipeline and which units are being used) and switches to another one. It does so by using multiple register sets.[4] This is advantageous because a thread can often only go for so long before it encounters a cache miss or runs out of independent instructions to execute. A CMT processor can only execute as many different threads in this way as it has hardware support for: it can only hold as many threads as there are physical locations in which to store the state of their execution. An n-way CMT processor would therefore have the ability to store the state of n threads.
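
    The role of the multiple register sets can be sketched roughly as follows. This is a toy model, not a description of any real processor: the class names, the 8-register file, and the round-robin switch are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HardwareContext:
    """One register set plus a program counter: the saved state of one thread."""
    thread_id: str
    pc: int = 0
    registers: list = field(default_factory=lambda: [0] * 8)  # assumed size


class CoarseGrainedCore:
    """Toy n-way CMT core: it can hold the state of at most n threads."""

    def __init__(self, n_contexts):
        self.n_contexts = n_contexts
        self.contexts = []
        self.active = 0

    def attach(self, thread_id):
        # A thread can only be resident if a physical register set is free.
        if len(self.contexts) >= self.n_contexts:
            raise RuntimeError("no free hardware context")
        self.contexts.append(HardwareContext(thread_id))

    def current(self):
        return self.contexts[self.active]

    def switch(self):
        # Switching only selects another register set; the old thread's
        # state stays where it is, so nothing is copied out to memory.
        self.active = (self.active + 1) % len(self.contexts)
        return self.current()


core = CoarseGrainedCore(n_contexts=2)
core.attach("A")
core.attach("B")
core.current().pc = 40           # thread A makes some progress
core.switch()                    # thread B is now running
core.current().pc = 7
core.switch()                    # back to thread A, state intact
print(core.current().thread_id, core.current().pc)
```

    Attaching a third thread to this 2-way core raises an error, which mirrors the point above: the number of resident threads is bounded by the number of physical register sets.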

    A variation on this is to simply execute one thread until it experiences a cache miss (usually an L2 cache miss), at which point the processor switches to another thread. This has the advantage of simplifying the logic needed to rotate the threads through a processor, as it simply switches to another one as soon as the running thread stalls. The penalty of waiting for the requested block to be transferred back into the cache is then alleviated, because the processor does useful work for another thread in the meantime. This is similar to the hit-under-miss (or hit-under-multiple-miss) [5] caching scheme used by many processors, but it differs because it operates on threads instead of upon instructions. The MAJC architecture makes use of CMP, and it also uses a form of CMT in which it switches threads on a cache miss, with support for 4 threads in this manner. [6] The MAJC architecture also has a few more tricks up its sleeve for multithreading, which will be discussed later. The APRIL architecture, circa 1990, was also to use CMT.[4]
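
    The switch-on-miss policy can be sketched with a small cycle-counting model. Everything here is a simplifying assumption: each thread is a string of one-cycle operations ("c" for compute, "m" for a load that misses), a miss blocks its thread for miss_latency cycles, and a thread switch costs switch_penalty cycles paid after any waiting. The numbers are illustrative, not measurements of any real processor.

```python
def run_switch_on_miss(threads, miss_latency, switch_penalty):
    """Cycles needed to finish all threads on one core that switches on a miss."""
    pcs = [0] * len(threads)        # next operation of each thread
    ready_at = [0] * len(threads)   # cycle at which each thread may run again
    cycle = 0
    current = 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        unfinished = [i for i, t in enumerate(threads) if pcs[i] < len(t)]
        if current not in unfinished or ready_at[current] > cycle:
            # The running thread is done or blocked: take the unfinished
            # thread that becomes ready soonest, waiting for it if necessary.
            nxt = min(unfinished, key=lambda i: (ready_at[i], i))
            cycle = max(cycle, ready_at[nxt])
            if nxt != current:
                cycle += switch_penalty
                current = nxt
        op = threads[current][pcs[current]]
        pcs[current] += 1
        cycle += 1                  # every operation occupies one cycle
        if op == "m":
            ready_at[current] = cycle + miss_latency
    # Also wait for any miss that is still outstanding at the end.
    return max([cycle] + ready_at)


threads = ["ccmcc", "ccmcc"]        # hypothetical instruction streams

# Baseline: run each thread alone, one after the other.
one_at_a_time = sum(run_switch_on_miss([t], 10, 1) for t in threads)

# Switch-on-miss: both threads resident, switch when the running one misses.
switch_on_miss = run_switch_on_miss(threads, 10, 1)

print(one_at_a_time, switch_on_miss)  # 30 20
```

    In this toy setup, most of one thread's miss latency is overlapped with the other thread's work. The benefit shrinks as the switch penalty grows relative to the miss latency, which fits the article's remark that the switch is usually made on an L2 miss.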

    CMT has two advantages over CMP: it doesn't sacrifice single-thread performance, and there is less hardware duplication (less hardware is halved to make the two processors "equal" to a comparable CMT).
