
07-28-01, 07:20 PM
Registered User
Join Date: Jun 2001
Location: California
Posts: 23
Has anyone else tried the DDR solution for the P4 yet?
I have to say that I think the whole idea of the P4 was based on extreme memory bandwidth...
I have been able to use the P4 with a DDR solution for a while, and the performance of the 1.7GHz part is comparable to a Duron on a DDR chipset. In other words, it sucks big time...
So, has anyone tried it yet, and what is your opinion?
BTW, I am still waiting for opinions regarding the MP solution of the P4...
Patrick Palm
PC Resources
__________________
No question is too dumb if you are prepared for a technical answer...

07-28-01, 07:32 PM
Student-for-life
Join Date: Dec 2000
Location: College Park, Maryland
Posts: 1,294
We (SystemLogic.net staff) are trying to see if VIA will supply us with a motherboard for testing, but I'm not at all surprised about the poor performance. Though I don't have the performance characteristics of the DDR version at hand, I'm working on an article discussing the P4's performance, with relation to bandwidth in particular. I think you'll like that one.
As for MP in the P4... regardless of the type of on-chip multithreading (most think it to be SMT; only a few believe in the possibility of CMP), I think that the P4's multiprocessing is "nicer" than the Athlon 4's, at least at a low level. The reason is the P4's use of a write-through L1 cache: you only have to snoop the L2 cache and not worry about the L1, because the most up-to-date data is already in the L2 (this would be perfect for CMP, and was my reason for believing Jackson technology to be CMP).
Because the Athlon uses an exclusive design, SMP support requires the snooping of both the L1 and L2, so the requirements are more complex. I can't go much beyond that, because I haven't looked at it enough yet.
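To make the snooping argument concrete, here is a toy Python model. It is only a sketch under the post's own assumptions: with a write-through L1, every line held in L1 is also current in L2, while in an exclusive design a line sits in either L1 or L2 but never both. The `Core` class and its method names are made up for illustration and do not describe either CPU's actual coherency logic.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Toy two-level cache; each level is just a set of line addresses."""
    exclusive: bool = False
    l1: set = field(default_factory=set)
    l2: set = field(default_factory=set)

    def fill(self, line):
        """Bring a line into the hierarchy."""
        self.l1.add(line)
        if not self.exclusive:
            # Assumed write-through model: L2 always holds a current copy.
            self.l2.add(line)

    def evict_l1(self, line):
        """Drop a line from L1; an exclusive design moves it into L2."""
        self.l1.discard(line)
        if self.exclusive:
            self.l2.add(line)

    def snoop(self, line):
        """Return (hit, levels_probed) for a snoop from another CPU."""
        if self.exclusive:
            # The line may be in either level, so both must be checked.
            return (line in self.l1 or line in self.l2), ["L1", "L2"]
        # L2 is assumed to cover everything in L1, so one probe is enough.
        return (line in self.l2), ["L2"]


write_through = Core(exclusive=False)
write_through.fill(0x40)
exclusive = Core(exclusive=True)
exclusive.fill(0x40)
print(write_through.snoop(0x40), exclusive.snoop(0x40))
```

In this model the exclusive core has to consult two tag arrays per snoop, which is the extra complexity the post refers to.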
__________________
paul@pleaseohpleasedontspamme.slcentral.com
A mathematician is a device for turning coffee into theorems -- P. Erdos

07-28-01, 07:40 PM
Registered User
Join Date: Jun 2001
Location: California
Posts: 23
Quote:
Because the Athlon uses an exclusive design, SMP support requires the snooping of both the L1 and L2, so the requirements are more complex. I can't go much beyond that, because I haven't looked at it enough yet.

The Athlon is an exclusive design, that is correct, but does that mean that command snooping through the caches is a necessary task? Nope, you can do a reference call on each, store it in the chipset (as it does not require much memory), and access that from the L1 bridge before making the call in the first place...
You seem to be quite familiar with VIA's chipsets. Have you been involved?
Patrick
__________________
No question is too dumb if you are prepared for a technical answer...

07-28-01, 07:48 PM
Student-for-life
Join Date: Dec 2000
Location: College Park, Maryland
Posts: 1,294
Quote:
The Athlon is an exclusive design, that is correct, does that mean that command snooping through caches is a necessary task... Nope, you can do a reference call on each, store it in chipset (as it does not require much memory) and access that from L1 bridge before making the call in the first place...
You seem to be a bit used to VIA's chipsets, have you been involved?

I don't understand how you wouldn't have to snoop both caches... Care to clue me in, from the beginning if necessary? I don't understand what you mean by a "reference call."
BTW -- I'm not sure if I know what you mean about being involved with VIA chipsets -- if you're insinuating what I believe you are, I'm quite flattered. I haven't even started my sophomore year yet...
Paul
__________________
paul@pleaseohpleasedontspamme.slcentral.com
A mathematician is a device for turning coffee into theorems -- P. Erdos

07-28-01, 07:58 PM
Registered User
Join Date: Jun 2001
Location: California
Posts: 23
Quote:
I don't understand how you wouldn't have to snoop both caches...Care to clue me in from the beginning, if necessary? I don't understand what you mean by a "reference call."

Simplified for everyone who does not know (obviously does not include you):
You can access the caches via a high-speed path, the chipset memory controller. If necessary, you can store references to the lines each processor has cached within that controller, making them easier to access. As the controller still has to respond to the CPU call, you can put a bit of extra memory into it that stores references to earlier calls...
This means that the CPU that has the cached info can either share it directly (if CPU-to-CPU calls are allowed by the chipset), or you can send the data directly to whichever CPU cached it latest...
VIA has some problems with this idea; PCRES does not. I thought you worked for VIA because your description of the Athlon cache subsystem sounds a lot like theirs... Meaning, I thought you were involved in VIA's chipset design...
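One way to read the idea above is as a small directory kept in the chipset (often called a snoop filter) that remembers which CPU last cached each line, so a request can be routed without probing every cache. The Python sketch below illustrates that reading only. `SnoopFilter` and its methods are hypothetical names, and nothing here is taken from an actual VIA or PCRES design.

```python
class SnoopFilter:
    """Toy chipset-side directory: line address -> CPU that last cached it."""

    def __init__(self):
        self.owner = {}

    def record_fill(self, cpu, line):
        """Note that `cpu` has just cached `line`."""
        self.owner[line] = cpu

    def record_evict(self, cpu, line):
        """Forget the entry once the recorded owner drops the line."""
        if self.owner.get(line) == cpu:
            del self.owner[line]

    def lookup(self, requester, line):
        """Return the other CPU holding `line`, or None if memory can serve it."""
        holder = self.owner.get(line)
        if holder is None or holder == requester:
            return None
        return holder


directory = SnoopFilter()
directory.record_fill(cpu=0, line=0x1000)
print(directory.lookup(requester=1, line=0x1000))
```

The cost of this approach is the extra storage in the controller and the need to keep it in step with every fill and eviction, which a model this small does not attempt to show.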
Patrick
__________________
No question is too dumb if you are prepared for a technical answer...

08-04-01, 04:16 PM
Student-for-life
Join Date: Dec 2000
Location: College Park, Maryland
Posts: 1,294
Going back on topic, I had all this to say about why, in particular, the Pentium 4 sucks up bandwidth like there's no tomorrow.
The basic point: the large line size boosts FPU performance and other streaming-type applications, and the excess bandwidth helps a great deal; however, in integer tasks, especially code that "jumps" a lot, the large line size hurts. Thus massive bandwidth is needed to act as damage control.
Of course, read the above for a more thorough discussion.
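The line-size trade-off can be illustrated with a little arithmetic. The Python sketch below counts how many bytes a cache would pull from memory for a given access pattern and what fraction of them the program actually touches. The 64-byte and 128-byte line sizes, the 4-byte access width, and the two access patterns are assumptions chosen for illustration, and the model ignores cache capacity and prefetching.

```python
def traffic(addresses, line_size, access_bytes=4):
    """Return (bytes_fetched, fraction_used) for a list of byte addresses.

    Every distinct line touched is assumed to be fetched exactly once.
    """
    addresses = list(addresses)
    lines = {addr // line_size for addr in addresses}
    fetched = len(lines) * line_size
    used = len(set(addresses)) * access_bytes
    return fetched, used / fetched


streaming = range(0, 1024, 4)          # walk 1 KB sequentially
jumpy = range(0, 256 * 4096, 4096)     # one 4-byte access every 4 KB

for size in (64, 128):
    print(size, traffic(streaming, size), traffic(jumpy, size))
```

In this model the sequential walk uses every byte it fetches at either line size, while the scattered pattern fetches twice as much with 128-byte lines for the same useful data. That doubling is the "damage" the extra bandwidth has to absorb.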
__________________
paul@pleaseohpleasedontspamme.slcentral.com
A mathematician is a device for turning coffee into theorems -- P. Erdos

08-04-01, 05:56 PM
Registered User
Join Date: Jun 2001
Location: California
Posts: 23
Nice article... :-)
Patrick
__________________
No question is too dumb if you are prepared for a technical answer...

08-05-01, 04:54 AM
Student-for-life
Join Date: Dec 2000
Location: College Park, Maryland
Posts: 1,294
Hehe, I got a reply from Anand Lal Shimpi (whoa, yeah, I wasn't expecting that to happen), and apparently his experiences with DDR on the P4 don't reflect anything similar to yours, Patrick.
Patrick, are you at liberty to say which chipset you've had experience with? If not, I totally understand.
__________________
paul@pleaseohpleasedontspamme.slcentral.com
A mathematician is a device for turning coffee into theorems -- P. Erdos
Last edited by Paul : 08-05-01 at 05:57 AM.

08-07-01, 05:40 PM
Registered User
Join Date: Jun 2001
Location: California
Posts: 23
Quote:
Originally posted by Paul
Hehe, I got a reply from Anand Lal Shimpi (whoa, yeah, I wasn't expecting that to happen), and apparently his experiences with the DDR on the P4 don't reflect anything similar to yours, Patrick.
Patrick, are you at liberty to say which chipset you've had experience with? If not, I totally understand.

Giving Anand his due respect, I have to say that his tests are hardly what I would call professional...
The lack of bandwidth clearly shines through if you test the P4 on anything that it does better than the Athlon...
Of course, this is because of the lack of available bandwidth...
I am not saying that Anand is wrong in this case, I am just saying that he is making the same mistake so many others have made: not using high-bandwidth programs to check it out...
BTW, my stability tests are also my benchmarks. I do not use synthetic benchmarks; I have assembled three weeks of tests using tasks recorded from live situations, so I do believe that my benchmarks are pretty accurate...
Patrick
__________________
No question is too dumb if you are prepared for a technical answer...

08-07-01, 05:54 PM
Student-for-life
Join Date: Dec 2000
Location: College Park, Maryland
Posts: 1,294
Understood (I hope you didn't take it to mean I was challenging your results).
The thing that I'm most interested in is the much higher P4 frequencies, where the gap between CPU speed and main memory becomes even more apparent, and the latencies of RDRAM and DDR start to look very similar from the CPU's point of view. I really want to do a CPU scaling article comparing the P4 on RDRAM and on DDR, but the lack of a platform (on both counts) means I'll have to forgo this article, for quite a while at least.
I'll be interested in which benchmarks Anand actually does use... and how well it scales with CPU speed (but I won't tell him that 'cause I want to write another article).
Thanks for the reply,
Paul
__________________
paul@pleaseohpleasedontspamme.slcentral.com
A mathematician is a device for turning coffee into theorems -- P. Erdos

08-07-01, 06:25 PM
Registered User
Join Date: Jun 2001
Location: California
Posts: 23
If you need any tests done, I might have the results ready for you already...
The thing is, the P4 was designed to have a high-bandwidth path to memory. The lower latency that comes with DDR means close to nothing (in my tests), and using anything less than dual-channel RDRAM is a mistake...
In the future, maybe combining it with DDR II, which is a serial architecture much like RDRAM, will be a solution, but I doubt that you will see any gains in performance or any lower price...
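For a rough sense of the bandwidth gap being described, peak theoretical bandwidth is simply channels times bus width times transfer rate. The Python sketch below applies that formula. The part figures used (16-bit PC800 RDRAM channels at 800 MT/s, a single 64-bit PC2100 DDR channel at 266 MT/s) are the commonly quoted numbers for that era and are assumptions added here, not figures from the thread.

```python
def peak_bandwidth_gb_s(channels, bus_bits, mega_transfers_per_s):
    """Peak theoretical bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return channels * bytes_per_transfer * mega_transfers_per_s / 1000


dual_rdram = peak_bandwidth_gb_s(channels=2, bus_bits=16, mega_transfers_per_s=800)
single_ddr = peak_bandwidth_gb_s(channels=1, bus_bits=64, mega_transfers_per_s=266)
print(dual_rdram, single_ddr)
```

Under these assumptions dual-channel PC800 works out to 3.2 GB/s against roughly 2.1 GB/s for single-channel PC2100. These are peak numbers only and say nothing about latency or sustained throughput.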
Patrick
__________________
No question is too dumb if you are prepared for a technical answer...
All times are GMT -8.