SLCentral - Your logical choice for computing and technology


    The Fundamentals Of Cache
    October 17th, 2000
    How Cache Helps Performance

    Why is it advantageous to use an exclusive cache? Consider Intel's Celeron2 and AMD's Duron. The Celeron2 has 32 KB of L1 cache, split equally between data and instruction caches, and a unified, on-die 128 KB L2 cache. Add the two together and we have 160 KB of on-die cache, right? Technically, yes. However, because the design is inclusive, the information in the L1 is duplicated in the L2. The chip therefore has 32 KB of L1 and 96 KB of effective L2 (32 KB of what the L2 stores is the same as what the L1 stores), for a total of 128 KB of useful cache.

    Contrast that with the Duron, which has 128 KB of L1 cache, split equally between data and instruction caches, and a unified, on-die L2 cache of just 64 KB. What!?! Only 64 KB? If the Duron were designed inclusively, the L2 would be pointless: an inclusive L2 that is equal to, or smaller than, the L1 can hold nothing that the L1 does not already hold. So AMD made the design exclusive (as they did with the Thunderbird). Exclusive designs in general do not have this problem (the never-released Joshua version of the Cyrix III was another example). The Duron has 128 KB of L1 plus 64 KB of L2, and because neither contains the same information as the other, the two simply add up to 192 KB of effective on-die cache. The Duron's is obviously larger.
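
    The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name effective_cache_kb is made up for this example, and it assumes the simple model used in the text, where an inclusive L2 spends part of its capacity on a full copy of the L1.

```python
def effective_cache_kb(l1_kb, l2_kb, inclusive):
    """Amount of distinct information the L1 + L2 pair can hold, in KB."""
    if inclusive:
        # The L2 keeps a copy of everything in the L1, so only the part of
        # the L2 beyond the L1's size adds new capacity (never less than 0).
        return l1_kb + max(l2_kb - l1_kb, 0)
    # Exclusive: no line lives in both levels, so the sizes simply add up.
    return l1_kb + l2_kb


celeron2 = effective_cache_kb(32, 128, inclusive=True)
duron = effective_cache_kb(128, 64, inclusive=False)
duron_if_inclusive = effective_cache_kb(128, 64, inclusive=True)

print("Celeron2 (inclusive):", celeron2, "KB")
print("Duron (exclusive):", duron, "KB")
print("Duron if it were inclusive:", duron_if_inclusive, "KB")
```

    The last line shows why a 64 KB inclusive L2 behind a 128 KB L1 would add nothing in this model.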

    So, let us consider this again using some diagrams (not to scale of course):


    [Diagram: Inclusive L2]

    [Diagram: Exclusive L2 - L1 relationship]

    One point must be made about the nature of the cache's exclusivity (the exclusive diagram above has been changed from its first version to reflect it): in AMD's Duron, the exclusivity applies only to the L1/L2 relationship. That is, the information in the L1 is not duplicated in the L2, but it is still duplicated in main memory (and potentially, though not always or often, on the hard drive).
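
    To make the L1/L2 relationship concrete, here is a toy model in Python. It is a sketch under stated assumptions, not a description of AMD's actual hardware: the class name ExclusiveHierarchy, the least-recently-used replacement policy, and the rule that a line evicted from the L1 is moved into the L2 are all choices made for this example. Main memory is not modelled; it is assumed to hold a copy of everything.

```python
from collections import OrderedDict


class ExclusiveHierarchy:
    """Toy two-level cache in which no line is ever in both L1 and L2."""

    def __init__(self, l1_lines, l2_lines):
        # Both capacities are assumed to be at least 1.
        self.l1 = OrderedDict()  # ordered oldest -> newest (assumed LRU)
        self.l2 = OrderedDict()
        self.l1_lines = l1_lines
        self.l2_lines = l2_lines

    def access(self, addr):
        if addr in self.l1:
            self.l1.move_to_end(addr)  # mark as most recently used
            return "L1 hit"
        if addr in self.l2:
            del self.l2[addr]  # moved, not copied: keeps the levels exclusive
            result = "L2 hit"
        else:
            result = "miss"  # line comes from main memory, which keeps its copy
        self._fill_l1(addr)
        return result

    def _fill_l1(self, addr):
        if len(self.l1) >= self.l1_lines:
            victim, _ = self.l1.popitem(last=False)  # evict oldest L1 line
            if len(self.l2) >= self.l2_lines:
                self.l2.popitem(last=False)  # L2 full: drop its oldest line
            self.l2[victim] = True  # the L1 victim now lives only in the L2
        self.l1[addr] = True


h = ExclusiveHierarchy(l1_lines=2, l2_lines=2)
results = [h.access(a) for a in (0, 1, 2, 3, 0)]
print(results)
print("L1:", list(h.l1), "L2:", list(h.l2))
```

    With two lines in each level, the hierarchy holds four distinct lines at once, which is the additive behaviour described for the Duron.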

    Moving on from the Celeron2 and Duron, let's now look at their more powerful siblings, the AMD Thunderbird Athlon and the Intel Pentium III Coppermine. The Thunderbird has an on-die, exclusive 256 KB L2 cache, and the Coppermine Pentium III features an on-die, inclusive 256 KB L2 cache. Caches take up enormous numbers of transistors. The Pentium III went from about 9.5 million transistors to about 28 million just from adding 256 KB of L2 cache. The Athlon had about 22 million, and adding 256 KB of L2 cache made the Thunderbird weigh in at a hefty 37 million. In both of these top-of-the-line x86 processors, a large fraction of the transistors is there solely to feed the hungry execution units.
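
    How large are those fractions? A quick calculation from the approximate transistor counts quoted above gives a rough idea. It assumes, as the text does, that the whole increase in transistor count is due to the added L2 cache; the function name cache_share is made up for this example.

```python
def cache_share(before_millions, after_millions):
    """Fraction of the final transistor count added along with the L2."""
    return (after_millions - before_millions) / after_millions


# Approximate counts (in millions) before and after adding 256 KB of L2.
chips = {
    "Pentium III (Coppermine)": (9.5, 28.0),
    "Athlon (Thunderbird)": (22.0, 37.0),
}

for name, (before, after) in chips.items():
    print(f"{name}: {cache_share(before, after):.0%} of transistors")
```

    That works out to roughly two thirds of the Coppermine and about 40% of the Thunderbird.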

    Because these caches take up so much space, they increase die sizes significantly, which is not a good thing because die size plays a crucial role in yields. In fact, the formula for die yield can be expressed as:

    Die yield = (1 + ((defects per mm^2) * (die area in mm^2)) / n)^(-n)

    where n is a parameter that corresponds roughly to the number of masking levels in the CMOS process, and wafer yield is assumed to be 100% (Hennessy and Patterson, 12).

    As the formula shows, the larger the die, the lower the yield. I bring this up only because the original Athlon on a 0.18 micron process was 102 mm^2, while the Thunderbird was about 20% larger at 120 mm^2. This increase in die size is bad because it reduces yields. From the standpoint of economy vs. performance, then, there need to be good reasons to put the L2 cache on-die, and there are.
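
    To see what the formula implies for these two die sizes, it can be evaluated directly. The defect density (0.005 defects per mm^2) and the value n = 3 used below are assumptions picked purely for illustration; they are not figures for AMD's actual process, so only the direction of the result should be taken seriously.

```python
def die_yield(defects_per_mm2, die_area_mm2, n):
    """Die yield from the formula above, assuming 100% wafer yield."""
    return (1 + (defects_per_mm2 * die_area_mm2) / n) ** -n


ASSUMED_DEFECTS_PER_MM2 = 0.005  # hypothetical defect density
ASSUMED_N = 3                    # hypothetical value of the parameter n

athlon = die_yield(ASSUMED_DEFECTS_PER_MM2, 102, ASSUMED_N)
thunderbird = die_yield(ASSUMED_DEFECTS_PER_MM2, 120, ASSUMED_N)

print(f"Original Athlon (102 mm^2): {athlon:.1%}")
print(f"Thunderbird (120 mm^2): {thunderbird:.1%}")
```

    Under these assumed values the yield falls from roughly 62% to roughly 58%. A different defect density or n changes the numbers, but the larger die always yields worse.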

    Now that Intel has moved the majority, and AMD the entirety, of its production to socketed chips, neither really needs to put its CPUs on expensive PCBs. The cache was formerly placed on the cartridge but off-die, and those cache chips have now been replaced with on-die cache. There is another benefit besides cost, however: performance.

    Despite the common misconception, electricity does not flow at the speed of light. It certainly travels far faster than the speed of sound, but electrical signals propagate at a finite speed below that of light, and this fact has an impact on the design and performance of processors. Why mention this? Remember that computers deal with information only as low and high voltages. The speed of any given part of a computer is, at the very least, bounded by the speed at which electricity can be transmitted across whatever medium carries it. The ultimate performance bottleneck is therefore the speed at which electricity can move. This is also the case for cache.
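
    A back-of-the-envelope calculation shows why this matters for cache placement. The sketch below computes how far a signal could travel during one clock cycle. The 1 GHz clock and the velocity fractions are assumed, illustrative values, and the model ignores every other source of delay in real wiring, so treat the results as upper bounds on distance rather than measurements.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def distance_per_cycle_cm(clock_hz, velocity_fraction):
    """Distance in cm a signal covers in one clock cycle.

    velocity_fraction is the signal speed as a fraction of the speed of
    light (an assumed value supplied by the caller).
    """
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    if not 0 < velocity_fraction <= 1:
        raise ValueError("velocity_fraction must be in (0, 1]")
    metres = SPEED_OF_LIGHT_M_PER_S * velocity_fraction / clock_hz
    return metres * 100


ASSUMED_CLOCK_HZ = 1e9  # hypothetical 1 GHz processor

for fraction in (1.0, 0.5, 0.1):  # hypothetical signal speeds
    cm = distance_per_cycle_cm(ASSUMED_CLOCK_HZ, fraction)
    print(f"at {fraction:.0%} of light speed: {cm:.1f} cm per cycle")
```

    Even at the full speed of light, a signal covers only about 30 cm in one cycle at 1 GHz, and a request to the cache has to make a round trip. The closer the cache sits to the execution units, the fewer cycles are spent simply waiting for signals to arrive.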

    Article Navigation
    1. Introduction/Cache & Architecture Terminology Part 1
    2. Cache & Architecture Terminology Part 2
    3. How Cache Helps Performance
    4. Cache & The Evolution Of Form Factors
    5. Real-World Architectural Designs Part 1
    6. Real-World Architectural Designs Part 2
    7. How Cache Sizes Affect Yields/An Inside For What Might Have Been...
    8. Conclusion/Bibliography
    Article Info
    Author: Paul Mazzucco
    Company: N/A
    Copyright © 1998-2007 SLCentral. All Rights Reserved. Legal | Advertising | Site Info