As you can see, there are numerous design tradeoffs when choosing a cache implementation. Some choices have worked out well, as with the P6 and Athlon derivatives. In other instances, a memory design perhaps tried to go too far and forced a company out of certain markets, even though the general architecture had the potential to do well.
Looking ahead, it is unlikely that we will keep seeing larger and larger L1 caches on CPUs (HP PA-RISC processors being the exception), because L2 cache can now be integrated on-die economically. Cyrix discussed this in a PDF file about the defunct Jalapeno*, which was to have a smaller L1 cache than Cyrix's previous design. Their reasoning was that a large cache cannot offer low latency while maintaining high clock speeds, and low latency is what the processor needs from an L1 cache; the on-die L2 cache would supply the high hit rates.
Intel is following this concept with the 32KB L1 in the Itanium. While the L1 Icache (Trace Cache) for the P4 is about 96KB in size, it's effectively about a 16KB Icache (I won't get into this, because there is an excellent discussion of it here), paired with a small, very low-latency (2 cycle) 8KB L1 Dcache. Both of these processors have high-bandwidth L2 caches on-die. The "bigger is better" mentality regarding cache caught on easily among those trying to gain a better understanding of computer architecture, because ever since an L1 cache was introduced to the x86 world with the 486, the size of this cache had always increased, at least until now. This turn of events will do little but confuse those who were led astray.
* A shout-out to PDF file collectors: I have "121507 (MDR Jalapeno).pdf", but I lost the one I'm referring to, which is a different one put out by Cyrix themselves... anyone want to send it to me?
Hennessy, Patterson. "Computer Architecture: A Quantitative Approach," 1996.
"AMD Athlon™ Processor and AMD Duron™ Processor with Full-Speed On-Die L2 Cache Enabling an Innovative Cache Architecture for Personal Computing." http://www.amd.com/products/cpg/athlon/pdf/cache_wp.pdf June 19, 2000.