Some Assembly Required: The CPU (Golden Ram III)

Commercially-assembled and sold computers are a pain. They come with all sorts of junk programs (why in the world do they package both MSN and AOL onto your desktop? That’s just dumb marketing…), lock down your BIOS and other sensitive areas, try to pawn off old parts (seriously, who needs a CD-ROM these days?), and aren’t fully customizable to your needs and only your needs. And last of all, you’re often paying twice or three times as much as you need to be.

The only thing that really separates a commercially-assembled computer from a self-assembled one is the “security” of a warranty and oftentimes, a support hotline that you can call when the BSOD hits the fan, if you know what I mean (though sometimes there’s a per-minute fee for this service). In reality, however, these services do next to nothing for you, and if you’re tech-savvy enough, chances are you won’t ever use them in your lifetime. When you buy individual parts for a self-assembled computer, they always come with a manufacturer’s warranty, oftentimes longer than the 1 or 3 years that the computer dealer offers. And if you ever run into software trouble, that is what your friendly neighborhood computer geek exists for. Outside of our perpetual quests to find a girlfriend, all of us really have nothing to do besides brood and assist people through their computer-related quandaries. Which is more or less what I’m doing here, but on a larger scale. I’m single and really desperate!

Okay, that was a bit awkward… The two main benefits of assembling your own computer are the ability to choose whichever parts best suit your needs (as opposed to choosing between a company’s “Home”, “Media Center”, “Gamer”, and my personal favorite, the “EXTREEEMME!!!” version) and the savings: because you’re choosing your own parts and assembling the whole thing yourself, your final price tag will end up several hundred dollars less than what you would’ve paid for a machine of the same capabilities. And that’s exactly what I’ll be teaching you about in this article: which parts to choose, what all those confusing terms mean (native resolution? CAS Latency?), and where to find all these parts at discount prices. This is article one of a multi-part series. (We’ll see how long I can milk this [whispers from the background] What!? We’re a non-profit paper!?! Oy vey…)

Performance Need Levels

The first step of designing your new PC is deciding what your price range is, and what your needs are. (I’ll pretend I didn’t just hear someone shout “iMac”.) I’ll be basing most of my configurations and recommendations on generalized “needs” (simple “MSOffice ‘n Internet”, “Digital Media Editing,” etc.). Budgets vary wildly, but as a rule, nobody needs more than $1000, and for high school students, the price really shouldn’t be above $600 (not including speakers, monitor, and operating system, or OS). Remember this, parents, when your kid starts begging you for $400 videocards that they “have to have”.

I’ve divided needs into the following levels, each of which includes all necessary computer components except for monitor, speakers, and an OS:

Super Frugal Deluxe ($250): Don’t really use computers all that much, but just need something to keep up with e-mail, internet research, and Microsoft Office things? For people who use computers on a “need” basis rather than “want” basis. This is absolutely bare-bones, and represents the minimum you can have for a working Windows XP computer.

Modesty is a Virtue ($400): For those who use their computer casually, but don’t engage in any taxing use such as gaming or encoding. For those who maybe want a music or video collection that won’t lag when played, and won’t sputter and die on you if you try to multi-task with AIM, Outlook, WinAmp, and 6 IE screens at once. (As the above might)

Decent Enough ($600): This PC will perform admirably (enough) for most tasks, whether it be gaming, multi-tasking, image editing, media encoding, etc. For the basic, non-gaming user, you won’t need anything more than this, though this will be able to run any 2D games without a hitch, and most 3D games at a respectable quality and framerate (though this depends largely on your videocard).

Make Your Neighbors Jealous ($1000): Okay, this probably isn’t too descriptive of a term for a computer. I’ll deem this the ‘Unnecessary’ computer, because it sports a lot of performance headroom and won’t need upgrading for a long while. The only people that might have a use for such high performance are serious gamers who spend a lifetime dedicated to their computer or lesser, insecure computer geeks who need to compensate for their lack of real computer prowess by buying expensive hardware.

Kick-Arse (>$1500): Fghweghads censor. Well anyhoo, anything in this range is just silly power. There is absolutely no need for this kind of performance, and in geek circles bringing something like this to a LAN party is the nerdy cock-fight equivalent of tricking out a car to bring to one of those very masculine street races.

Section I: Oh My Barton, it’s the Gigahertz Myth!

The first component you’ll want to decide on is the Central Processing Unit (CPU, also known as the “Processor” or if you’re really lazy, just “Proc”). What you choose for your CPU will determine the rest of your basic components. The only two names you need to know about are Intel, maker of the Pentium 4, and Advanced Micro Devices (AMD), maker of the Sempron and Athlon64/FX.

Athlon This, Pentium That

Processors are all essentially the same. Athlons can do anything a Pentium can do, and vice versa—the only difference is the speed at which tasks are accomplished. The one exception is processors that have 64-bit extensions. These processors are capable of running 64-bit operating systems and programs, which, depending on your needs and uses, could be potentially useful or virtually worthless. For an explanation of 64-bit processing, as compared to 32-bit processing (which is what all other consumer-level CPUs have), see The 64-bit Story on Page 7. In a nutshell, however, 64-bit extensions at the present are virtually useless because all current consumer software is 32-bit, making no use of the 64-bit extensions in 64-bit processors. In the future, however, all software will eventually convert to a 64-bit process, so investing in a 64-bit CPU now will get you a processor that’ll be able to handle software 2 or 3 years down the line—but even then, it’s not as if a 32-bit proc you buy now will stop clicking and drop dead.

Not All Clock Cycles are Equal

The two players in the CPU market are Intel and AMD. Intel, the more mainstream and “brand-name” company, produces the Pentium 4, which is the CPU most widely used in household computers. AMD currently makes two lines of processors, the Sempron and Athlon64. Both companies also produce lower-end models, like the Celeron, AthlonXP, and the like, but, for performance reasons to be explained, consumers should try to avoid those. Outside of price, and the memory and motherboard choices, the only difference between AMD and Intel chips in general is the “coolness” factor, the inevitable allure of brand name recognition; with a Pentium 4 most people will go “Cool, a Pentium 4!” With AMD, most people will just go “Athlo-wah?”

The three most common performance terms that you will hear about CPUs are the clock speed or clock frequency (measured in some variant of hertz), the cache (kilobytes), and the front-side bus (or FSB, usually measured in megahertz). Clock speed is how many cycles the CPU runs per second (one hertz is one cycle per second). In the simplest picture, each cycle executes one command, so a 3 gigahertz processor (3 billion hertz) can execute 3 billion commands every second. You’ll notice, however, that AMD does not designate its chips by clock speed, but by their “Performance Index” (PI): the 2500+, 3000+, etc. The actual clock speeds of the AthlonXP 2500+ and AthlonXP 3000+ are only 1.83 GHz and 2.16 GHz; yet, as their PI ratings indicate, they perform roughly on par with a 2.5 GHz and a 3.0 GHz P4 (this is not exactly correct, but read on).

And this is where the “Megahertz Myth” comes in. For a very long while, clock speed was the sole indicator of how powerful a processor was. A 1.20 GHz Thunderbird or Pentium III was presumed faster than a 1.19 GHz Thunderbird/PIII, or any Athlon or PIII slower than 1.2 GHz for that matter. This was largely true, because, up until the introduction of the Pentium 4 and AthlonXP, the Intel and AMD chips of yesteryear worked on very similar architectures, and the 1.2 GHz PIII did about as much computation per clock cycle as the 1.2 GHz Thunderbird. When the P4 and AthlonXP (and later, the Athlon64) emerged, however, they debuted with drastically different iterations of the standard x86 design, and as a result, a clock cycle came to mean something different for each brand in terms of computation. 1 hertz on a Pentium 4 was not the same as 1 hertz on an AthlonXP, performance-wise.

The two main differences were the length of each CPU’s pipeline, and the Instruction Set Architecture (ISA) of each chip. The P4 had a long, 20-stage pipeline, which enabled it to ratchet up clock speeds (the P4 began at a humble 1.5 GHz and now sits at a stratospheric 3.8 GHz). The AthlonXP sported a shorter, 10-stage pipeline, but packed more processing units into each stage, allowing it to do more work per cycle. So, while the clock speed of a P4 might be 1.5 times that of an AthlonXP, each one of those P4 clock cycles processes half as much as one AthlonXP clock cycle (these numbers are arbitrary). Therefore, despite the massive clock speed advantage, the final output of the P4 was less than that of the AthlonXP.

The efficiency also has to do with each chip’s ISA. The AthlonXP’s architecture allows it to get to a result faster. For example, say that, using only the input data 3, the processor has to get to the output value 27. A P4 would have to execute 3x3x3, requiring 3 pieces of data and an instruction (multiply) to be executed twice, while the AthlonXP, in its more efficient architecture, has the ability to use the exponent function ^, allowing it to execute 3^3, using only 2 pieces of data and one instruction (^). (Note that this is an extreme oversimplification, to the degree of near-analogy.) There are many other nuances and complications, but simply put, an Athlon clock cycle gets more computation done than a Pentium 4 clock cycle. (This is also the reason why I don’t recommend any of the companies’ lower-end models: no matter how high the clock speed on a Celeron is, it will never come close to a P4 or Athlon64 because its architecture is simply inferior.)
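For readers who like to see the arithmetic, here is a minimal sketch of the clock speed versus work-per-cycle tradeoff. The function name and the specific clock speeds are made up for illustration; only the 1.5x clock and half-the-work-per-cycle ratios come from the text, and those were arbitrary to begin with.

```python
def relative_throughput(clock_ghz, work_per_cycle):
    """Work done per second, in arbitrary units: clock speed times work per cycle."""
    return clock_ghz * work_per_cycle

# Illustrative numbers only: the "P4" here runs at 1.5 times the clock speed
# of the "AthlonXP" but gets half as much done in each cycle.
athlon = relative_throughput(clock_ghz=2.0, work_per_cycle=1.0)
p4 = relative_throughput(clock_ghz=3.0, work_per_cycle=0.5)

print(f"AthlonXP: {athlon:.2f}")
print(f"P4:       {p4:.2f}")
print(f"P4 relative to AthlonXP: {p4 / athlon:.2f}")
```

With these made-up inputs the higher-clocked chip ends up with three quarters of the output, which is the whole point of the myth: clock speed alone tells you nothing until you know how much each cycle accomplishes.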

The second most important piece of the CPU is the amount of cache it has. The data that a CPU accesses is stored in three main places: the hard drive, the Random-Access Memory (RAM), and the CPU’s cache. Cache is basically the quick-access data “warehouse” of the CPU, where the most vital data is stored so that the CPU can get to it quickly. Think of the CPU as a factory that processes materials (data). The cache is the warehouse across the street. RAM would be the bigger warehouse upstate. And your hard drive would be the huge warehouse overseas. When your CPU processes data, it is of course much more efficient to fetch the data from the cache right next to it than to drive all the way upstate to your RAM warehouse, or to have it shipped from your hard drive warehouse overseas. Why not have a 160 GB cache and ditch the RAM and hard drive altogether, you ask? Well, the high price of inner-city real estate prevents you from having a huge cache, and while it may be slower, it is a lot cheaper to put that 512 mega… err, acre warehouse in a remote rural area than smack-dab in the middle of the city on prime land.

There are different “levels” of cache, denoted by L1, L2, and sometimes L3. These are just variations, with L1 being the warehouse across the street, L2 being the slightly larger warehouse five blocks down, and L3 being the even bigger warehouse at the outskirts of town. Unlike clock speed, cache works the same for all CPUs, and more is better.

Below is a chart detailing typical access times and sizes for each type of memory. With a 1.0 GHz processor, which runs 1 billion cycles per second, the 10-60 nanosecond (ns) delay of system memory (a nanosecond is a billionth of a second) would cause you to lose 10-60 clock cycles while your CPU waited for data to be transferred. With a faster 3 GHz processor (3 billion cycles/second), that delay would rise to 30-180 wasted clock cycles. So the amount of cache is very important, and becomes increasingly so as the clock speed of your processor rises.


Type                  Access Times (ns)         Typical Sizes
Level 1 Cache (L1)    2-8                       8 - 128 KB
Level 2 Cache (L2)    5-12                      512 KB - 2 MB
System Memory         10-60                     64 MB - 1024 MB
Hard Drive            3,000,000 - 10,000,000    20 - 400 GB
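The wasted-cycle figures can be checked with a few lines of code. This is only a rough sketch: the function name is invented for illustration, the latency ranges are the ones from the chart, and it assumes the CPU simply sits idle for the whole wait, which real processors try hard to avoid.

```python
def wasted_cycles(latency_ns, clock_ghz):
    """Clock cycles that pass during a wait. 1 GHz is one cycle per nanosecond."""
    return latency_ns * clock_ghz

# Access time ranges in nanoseconds, taken from the chart.
access_times_ns = {
    "Level 1 Cache": (2, 8),
    "Level 2 Cache": (5, 12),
    "System Memory": (10, 60),
}

for clock_ghz in (1.0, 3.0):
    print(f"At {clock_ghz} GHz:")
    for name, (fastest, slowest) in access_times_ns.items():
        low = wasted_cycles(fastest, clock_ghz)
        high = wasted_cycles(slowest, clock_ghz)
        print(f"  {name}: {low:.0f} to {high:.0f} cycles")
```

A 60 ns trip to system memory costs 60 cycles at 1 GHz but 180 cycles at 3 GHz, so the faster the chip, the more each trip past the cache hurts.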

The last spec of a CPU that you need to concern yourself with is its Front-Side Bus (FSB). The front-side bus is the interface that connects your CPU with the motherboard and system memory. Its speed determines how fast data can be transferred from your motherboard and memory to your CPU. The faster the better, especially when you consider the HUGE differential between the FSB and the clock speed (while a Pentium 4 can run at 3.8 GHz, it is severely bottlenecked by its minuscule 800 MHz FSB). The more important point to consider here, however, is how it will affect your system memory. Ideally, you want a 1:1 ratio between your FSB and your system memory. If your AthlonXP has a 333 MHz bus, then your RAM should also be 333 MHz. 266 MHz RAM would mean that you’re not using your FSB to full capacity, while 400 MHz RAM would exceed the bandwidth of the FSB, and your motherboard would automatically clock it down to 333 MHz.
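The matching rule boils down to "the slower of the two sets the pace". Here is a small sketch of that idea using the 333 MHz AthlonXP example; the function names are hypothetical, and this models only the simple rule described above, not what any particular motherboard actually does.

```python
def effective_memory_mhz(fsb_mhz, ram_mhz):
    """The slower of the FSB and the RAM determines the speed you actually get."""
    return min(fsb_mhz, ram_mhz)

def describe_match(fsb_mhz, ram_mhz):
    """Say in words how well the RAM matches the FSB."""
    if ram_mhz < fsb_mhz:
        return "FSB not used to full capacity"
    if ram_mhz > fsb_mhz:
        return "RAM clocked down to the FSB speed"
    return "1:1 match"

fsb = 333  # the AthlonXP bus speed from the example
for ram in (266, 333, 400):
    speed = effective_memory_mhz(fsb, ram)
    print(f"{ram} MHz RAM on a {fsb} MHz FSB: runs at {speed} MHz ({describe_match(fsb, ram)})")
```

Paying extra for 400 MHz RAM on a 333 MHz bus buys nothing under this rule, which is why the 1:1 match is the sensible target.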

Section II: Analysis of the Types

See The Chart on Page 6 for more information

The Pentium 4

If you choose to go the Intel route, the two processors you need to concern yourself with are the Northwoods (P4C) and the Prescotts/Irwindales (P4E). The Willamette is obsolete, and the Pentium 4 Extreme Editions (which include the Gallatin and Smithfield cores) cater to an extremely high-end market and cost, at a minimum, in excess of $1000. The recommendation here, based solely on speed, is the Northwood core with the 800 MHz FSB (P4C) or, at higher speeds (the P4C maxes out at 3.4 GHz), the Irwindale. The Prescott/Irwindale’s lengthy 31-stage pipeline (compared to the Northwood’s 20) enables, and operates more efficiently at, higher clock speeds, which is what allows it to reach 3.8 GHz while the Northwood has topped out at 3.4 GHz. That said, at lower speeds (lower than, say, 3.5 GHz), the Northwood cores actually outperform the Prescotts, and for most people, a Prescott in excess of 3.5 GHz is beyond both their performance needs and their budget.

Compared to the AMD CPUs, similarly priced Pentium 4s (not necessarily a 3.2 GHz P4 vs. a 3200+ Athlon64, but a $200 P4 vs. a $200 Athlon64) usually perform slower, especially at the budget (sub-$150) and high-end (greater than $350) levels. The one exception is media editing, where Hyper-threading (HT) and SIMD (Single Instruction, Multiple Data) ISA extensions like SSE2 and SSE3 give Pentium 4s a great advantage over AMD processors (which lack HT, SSE3, and for AthlonXPs, SSE2) when editing digital video or large batches of audio and image files.

Pentium 4s are designed for two different socket types: the older Socket 478, and the new LGA775. Different types of CPUs fit into different types of sockets, much like our appliances won’t fit into the European 220-volt sockets. If possible, go with the LGA775, as it is the only interface that Intel will continue developing for (all future, faster CPUs will only be available on LGA775, and all the new motherboard technology, like SLI, SATAII, PCI-Express, etc. will likely only be available on LGA775 motherboards). However, the Northwood P4Cs, only available on Socket478, perform faster than the similarly clocked Prescott P4Es, so if you don’t plan to upgrade your CPU for the next 5 years (and simply do an entire system overhaul, instead of piecemeal upgrades), then it may be better to go with a Socket478 Northwood P4C.

The Athlons

AMD makes two main lines of processors: the Sempron and the Athlon64. AMD also has an older chip, the AthlonXP, which was replaced by the Sempron but is still sold today. Instead of being rated by their clock speed, AMD processors use “Performance Index” ratings. The numbers, ranging from 1500+ to 3800+, are, officially, the supposed equivalent to an AMD Thunderbird (the generation preceding the AthlonXP) at that MHz frequency. Thus, a 2500 MHz (2.5 GHz) Thunderbird would perform the same as a 2500+ AthlonXP or Athlon64. However, it is pretty pointless to compare processors relative to an outdated CPU, since consumers want to know how the Athlons stack up against the Pentium 4s of today. From my research (not the official word from AMD), the Performance Index of the AthlonXPs is roughly comparable with the clock speed of a Pentium 4 at 533 MHz FSB, and slightly behind that of the Pentium 4s at 800 MHz FSB, which are in turn slightly slower than the Athlon64s. A more comprehensive study based on gaming benchmarks, done by Andrew K., provides more insight into the relative efficiency of the different processor cores (see Comparative Performance Rating Index).

In the AMD hierarchy, the AthlonXPs and Semprons represent the lower class of budget CPUs, and the Athlon64 is more or less the standard AMD for serious computer use. For budget users, either the AthlonXP Barton or the Sempron Palermo would be the best choice, depending on the price range. The AthlonXP is a very good ‘budget’ chip; it provides decent performance and can be found almost anywhere for less than $100, a price range that Pentium 4s and Athlon64s don’t even come close to. The Sempron Palermo (a completely different processor from the Sempron Thoroughbred B, which is taken from old AthlonXP processors and rebranded, and performs much worse than an AthlonXP of the same rating) is actually based on the Athlon64 cores, albeit with many features (including 64-bit extensions) disabled. Its performance is slightly better than that of the AthlonXPs, yet still lags far behind that of the Athlon64s. The Sempron Palermos are priced right between the sub-$100 level of the AthlonXPs and the mid-$100 level of the Athlon64s, so the best choice may be to just go with a slightly slower AthlonXP Barton and save yourself $20-30, or spend an extra $20-30 and upgrade to a much faster Athlon64.

The Athlon64s and AthlonFXs currently represent the gold standard of processors. They are the only widely available processors that support 64-bit processing, and for their prices, far exceed Pentium 4s in most applications (except media editing). The AthlonFXs are the more expensive higher-end models, but are simply faster Athlon64s.

If you decide to purchase an Athlon64, be sure to find the Socket 939 variety. As with Intel’s Socket 478, no further development will occur on the older Socket 754 or Socket 940 varieties, and unlike the LGA775, Socket 939 processors are superior to Socket 754 and Socket 940 in every aspect. (If you must know, Socket 754s lack support for dual-channel memory, and Socket 940s force you to use special registered, or ECC buffered, memory, which is slower and much more expensive.) Additionally, Socket 939 motherboards are the only AMD mobos that support newer technologies like SLI, SATAII, and PCI-Express.

Section III: The Recommendations

Where to Buy

For CPUs, a good place to start is online. Sites like www.tigerdirect.com, www.zipzoomfly.com, and especially www.newegg.com are good places to shop, and make handy price references when you really start looking around for deals. If you’re in the market for a mid-range or higher processor, you’ll probably find the best prices at an online site. Also note that, for California, online stores (at least those mentioned above) don’t charge tax, and the shipping for sites like NewEgg and ZipZoomFly is either extremely cheap or free, so buying online can save you 7-8% over a brick and mortar store.

Although they’re on a hiatus at the moment, the Robert Austin Computer Shows, usually held every three weeks at the Cow Palace, feature vendors with some really great prices on many of the lower end processors (like AthlonXPs). For some odd reason, they charge an $8 admission fee, but if you sign up (don’t worry, they don’t spam), they’ll e-mail you tickets for free admission to every upcoming show.

The best choice, if you’re more concerned about getting something that works cheaply, rather than trying to eke out every last drop of performance, is to look for bundles that include CPU, motherboard, and sometimes even RAM, in the same package. TigerDirect offers some bundle deals for really great prices (albeit with rebate). Also, you may want to check out Fry’s (whose stores are mostly located around the South and East Bay). They usually have a couple of really great deals (one Intel, one AMD) every week. Pick up one of their ads (every week in the Friday edition of the San Jose Mercury) or browse through their online store (www.outpost.com – click under the “Advertised Specials” section at the right).

Lastly, if you’re purchasing a CPU, make sure you differentiate between “Retail Box” and “OEM”. OEM, or “white box”, products are direct from the manufacturer, and although they often cost less, they come with a shorter warranty (usually 30 days, compared to the 3-year warranties of retail CPUs), and don’t include a heatsink/fan. Depending on your level of security, and the reputation of the store, the warranty may or may not be a big issue. For most processors, unless they are defective from the start (which you will notice within those 30 days), the chances of them going kaput are slim (unless you try to overclock), so you likely won’t need the 3-year warranty anyway. However, if you’re buying from a store you’re unsure of, or some place with a difficult return policy, it may be better to opt for the retail box. As for the heatsink/fan (which you will need), be aware that finding one for an OEM proc will cost you around $10-15, so if the cost of a retail box is only around $15 more than the OEM, it’d probably be better to go for the retail, to get the 3-year warranty and avoid the hassle of finding a heatsink/fan.
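The OEM versus retail decision is simple enough to put into a few lines. This is just a sketch of the rule of thumb above: the function name and the chip prices in the example are made up, and the default $15 heatsink/fan cost is the upper end of the range quoted in the text.

```python
def cheaper_option(oem_price, retail_price, heatsink_price=15):
    """Compare an OEM chip plus a separately bought heatsink/fan against the retail box.

    Ties go to the retail box, since it also carries the longer warranty.
    """
    oem_total = oem_price + heatsink_price
    if retail_price <= oem_total:
        return "retail", retail_price
    return "oem", oem_total

# Hypothetical prices, for illustration only.
print(cheaper_option(oem_price=100, retail_price=112))  # retail box wins
print(cheaper_option(oem_price=100, retail_price=140))  # OEM plus heatsink wins
```

The comparison only covers the sticker price. How much the 3-year warranty is worth to you on top of that is a judgment call the code can’t make.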

Frugal: At this stage, all you want is something that will run Windows XP. While the OS will work on pretty much any processor (at the lower end, minimum requirements are more dependent on memory and available hard drive space), the most cost efficient processors at this budget range are the AthlonXP series. Assuming that you’re going with the AthlonXP, the 1600+ through 2400+ models are the best choice for value. They should all cost in the $40-70 range (Nobody buys these lower-end models anymore, because those that would be in the low-end market don’t know what an AMD is, and the computer enthusiasts who do know have moved on to higher performance models). Pentium 4s are not recommended for this price range, because of their extremely high costs (For $70, you could get an AthlonXP 2400+, while you would be very lucky to find even a 1.7 GHz Willamette Pentium 4 with a 400 MHz FSB).

Recommendation:

AthlonXP 2400+ Thoroughbred B Socket A ($70)

Modest: At this price range the gap in price-performance between Intel and AMD becomes steep. While AthlonXPs up to the 3000+ (Barton core) can be had for less than $100, a similarly performing P4 3.06 GHz FSB533 costs about $200, twice as much for the same performance. The best Pentium 4 that fits under this range’s upper limit of $125, a 2.4 GHz FSB533, doesn’t even come close to the AthlonXP, much less an Athlon64 that can be had for only $20 more. For AMD, almost any Barton core AthlonXP fits well into this price range, and even the lower-end Bartons, such as a 2500+ ($75), will run virtually anything that Windows XP or modern games can throw at them.

Recommendation:

AthlonXP 3000+ Barton Socket A ($100)

Decent: At this point, for AthlonXP users, there’s not much room to go. The AthlonXP 3200+ Barton provides more power than most people will ever have use for, yet costs only $125. Beyond that, one would have to make a jump to the Athlon64, at $145 and up (which necessitates additional costs in a more expensive motherboard, and, if you’re going all-out for a PCI-Express mobo, a new videocard too). However, PCI-Express is an inevitable interface that all motherboards and videocards will have to upgrade to eventually. If you plan to use the same motherboard for a long time, and especially if you plan to use your computer for gaming, it would probably be worth it to splurge for a PCI-Express motherboard (and a PCI-Express videocard, if you don’t have one) so that your computer will be able to support videocards and other expansion cards in the future.

On the Intel side of things, many serviceable processors exist in the price range, like the 2.66 GHz P4B ($145) or the 2.8 GHz P4E ($170), albeit still a bit slower or a bit more expensive than the Athlon64 3000+. However, if you intend to use the computer for the aforementioned media editing programs, a Northwood and especially a P4E Prescott (which has both Hyper-threading and SSE3 capabilities) should equal or even surpass the performance of the Athlon64.

Recommendations:

Athlon64 3000+ Winchester Socket 939 ($145)

-or-

Pentium 4E 2.8 GHz Prescott LGA775 ($170)

Unnecessary: If you’re looking for something in this level, I’ll assume you use your computer for serious work. At this point, your choice isn’t about which processor gives you more performance for the price at the moment. You’re not going to notice the performance difference between a 3.4 GHz Pentium 4 and an Athlon64 3500+ in any program right now: you’re talking about 15-second differences in a 10-minute encode, and a 50 frames per second delta doesn’t matter when you’re in the 200 fps range (the human eye can’t detect anything above 75 fps anyway, and 30 fps is adequate for almost anybody). And if you’re a simpleton reading this, there is no way in the world that you are going to boost your Microsoft Word performance. That said, you should start looking toward the future, and building a system that can accommodate any future technologies and upgrades, since your rig won’t be obsolete for a very long time. Socket 939 and LGA775 are a must, and if possible, try to go for the Hyper-threading and SSE3 capable processors, as software in the future will be optimized and designed to utilize both of these features to a greater extent. Assuming you possess some semblance of financial restraint, the best choices here are one of the lower end Venice or San Diego cores, or the fastest Prescott or Irwindale core you can afford. At the same price, the Athlon cores are slightly faster overall, but users who intend to use SIMD-intensive programs (because I’m getting tired of typing ‘media editing apps’) should find the Pentiums to be superior.

Recommendations:

Athlon64 3500+ Venice Socket 939 ($275)

-or-

Pentium 4E 550 3.4 GHz Prescott LGA775 ($280)

-or-

Pentium 4E 640 3.2 GHz Irwindale LGA775 ($280)

Kick-Arse: I really don’t expect (nor encourage) anyone to buy anything in this range. Whereas the previous “Unnecessary” category was, well… pretty unnecessary, everything in this range is downright extravagant, ostentatious, and financially irresponsible. That said, the main reason I even have this category is to explain all the awesomenation components and technology that none of us can ever afford. The “top dogs” for each company are the Athlon64 FX-55 San Diego (at $800) and the Pentium 4 3.73 GHz “Extreme Edition” FSB1066 Irwindale (at $1000; see what I mean now?). Which is better? As always, it depends on what you use your computer for. The Hyper-threading Pentium 4 should still be able to beat the AthlonFX (even with the SSE3 advantage leveled by San Diego) in media apps. However, the AthlonFX performs better all around, and, even if money isn’t an object in this range, the $200 discrepancy between the Athlon64 FX-55 and the Pentium 4 3.73 GHz is still a major factor. You could spend the $200 to tack on 2 gigs of extra RAM, or to put toward that second GeForce 6800 Ultra in SLI mode, or to spend on a 74 GB Raptor, or on an entire phase-change cooling system that enables you to chill the proc down to a nice 5° C, and then overclock the FX-55 past the 3 GHz barrier, whereupon it would thoroughly trounce the Pentium 4 3.73 GHz Irwindale.

Assuming, however, that you’re looking for the best system performance, and not simply the fastest CPU, on a limited budget you would probably find a better deal with a slower Pentium 4. While the Athlon64s and AthlonFXs scale pretty gradually with performance, there is a huge drop-off in price between the upper-level Pentium 4 Expensive Editions and the “non-extreme” Pentium 4 Prescotts (500s) and Irwindales (600s) that isn’t justified by the slight performance gain. By opting for a $600 Irwindale instead of the $1000 “Extreme” Irwindale, you could buy a second videocard to run in SLI mode, improving your performance (in games, at least) almost twofold (theoretically, anyway. In reality, SLI will give you anywhere from a 50 to 70% performance gain), which is far superior to the 5-10% gain between the Extreme and non-Extreme Irwindales. If you go for a non-Extreme Pentium 4, both the 3.8 GHz 570J Prescott and the 3.6 GHz 660 Irwindale perform about the same, but the Prescott lacks 64-bit extensions, limiting its capabilities for the future.

For Athlons in the same range as the non-Extreme Pentium 4s, the Athlon64 4000+ San Diego is the best choice, performing at the same level as the 3.6 GHz 660 Irwindale and the 3.8 GHz 570J Prescott, but at $500, over $100 cheaper.

Recommendations:

Athlon64 FX-55 San Diego Socket 939 ($800)

-or-

Pentium 4 “Extreme Edition” 3.73 GHz Irwindale LGA775 ($1020)

-or-

Athlon64 4000+ San Diego Socket 939 ($500)

-or-

Pentium 4E 660 3.6 GHz Irwindale LGA775 ($620)

Next up, the amazing, intricate, and insanely complex world of Motherboards.

The Chart

The world of processors is a bit confusing, what with Pentium 4 cores that have the same letter designation, and Athlons with radically different FSBs, caches, and clock speeds, yet still have the same “Performance Index”. Below is a chart that will hopefully clear things up.

The Performance Rating (PR, not to be confused with AMD’s “Performance Index” rating system) is an index value to compare performance across different types of processors. It is based mostly on data from gaming benchmarks, and although it shouldn’t be taken as 100% accurate, it is useful in gauging the performance between different processors. Performance scales with clock frequency, so simply multiply the processor’s clock speed by the PR index of the processor type to get the performance rating (note that this is the actual clock speed, not the “3200+” Performance Index of Athlons). For example, a 3.2 GHz P4E Prescott has a Performance Rating of 3680 (3200 MHz * 1.15), compared to a 3200+ (2.2 GHz) Winchester that has a Performance Rating of 3982 (2200 MHz * 1.81), so the 3200+ Winchester performs faster than the 3.2 GHz Prescott.
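If you would rather let a computer do the multiplying, here is a minimal sketch of the Performance Rating calculation. The function name is made up; the clock speeds and PR index values (1.15 for the Prescott 500 series, 1.81 for the Winchester) are the ones listed in the chart.

```python
def performance_rating(clock_mhz, pr_index):
    """Performance Rating = actual clock speed in MHz times the core's PR index."""
    # round() cleans up the tiny floating-point error in the multiplication
    return round(clock_mhz * pr_index)

prescott = performance_rating(3200, 1.15)    # 3.2 GHz P4E Prescott
winchester = performance_rating(2200, 1.81)  # 3200+ (2.2 GHz) Athlon64 Winchester

print(f"P4E Prescott 3.2 GHz:      {prescott}")
print(f"Athlon64 3200+ Winchester: {winchester}")
print("Faster:", "Winchester" if winchester > prescott else "Prescott")
```

The same two-argument call works for any row of the chart, as long as you feed it the real clock speed rather than the “3200+” style rating.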


                                                    Cache (KB or MB)
Processor         Core              Front-side Bus  L1    L2     L3   Process  Additional Features††  Socket Type       PR Index†††
P4                Willamette        400 MHz         20K   256K   -    180nm    -                      Socket478         0.90
P4A               Northwood         400 MHz         20K   512K   -    130nm    -                      Socket478         1.00
P4B               Northwood         533 MHz         20K   512K   -    130nm    -                      Socket478         1.06
P4C               Northwood         800 MHz         20K   512K   -    130nm    HT                     Socket478         1.16
P4A               Prescott          533 MHz         28K   1024K  -    90nm     SSE3                   Socket478         1.05
P4E (500 Series)  Prescott          800 MHz         28K   1 MB   -    90nm     SSE3, HT               Socket478/LGA775  1.15
P4EE              Gallatin          800 MHz         20K   512K   2MB  130nm    HT                     Socket478/LGA775  1.23
P4E (600 Series)  Irwindale         800 MHz         20K   2MB    -    90nm     SSE3, HT, 64-bit       LGA775            1.21
P4EE              Irwindale         1066 MHz        20K   2MB    -    90nm     SSE3, HT, 64-bit       LGA775            1.23
PD                Smithfield        800 MHz         56K   2MB    -    90nm     SSE3, 64-bit, DC       LGA775            2X 0.99
PDEE              Smithfield        800 MHz         56K   2MB    -    90nm     SSE3, HT, 64-bit, DC   LGA775            2X 1.01
AthlonXP          Palomino          266 MHz         128K  256K   -    180nm    No SSE2                Socket A          1.17
AthlonXP          Thoroughbred A/B  266 MHz         128K  256K   -    130nm    No SSE2                Socket A          1.18
AthlonXP/Sempron  Thoroughbred B    333 MHz         128K  256K   -    130nm    No SSE2                Socket A          1.22
AthlonXP          Barton            333 MHz         128K  512K   -    130nm    No SSE2                Socket A          1.45
AthlonXP          Barton            400 MHz         128K  512K   -    130nm    No SSE2                Socket A          1.47
Sempron           Palermo           400 MHz         128K  128K   -    90nm     No SSE2                Socket754         1.5
Sempron           Palermo           400 MHz         128K  256K   -    90nm     No SSE2                Socket754         1.53
Athlon64          Newcastle         400 MHz         128K  512K   -    130nm    64-bit                 Socket754         1.72
Athlon64          Clawhammer        400 MHz         128K  1024K  -    130nm    64-bit                 Socket754         1.75
AthlonFX          Sledgehammer      800 MHz         128K  1024K  -    130nm    64-bit                 Socket940         1.82
Athlon64          Newcastle         800 MHz         128K  512K   -    130nm    64-bit                 Socket939         1.78
Athlon64          Winchester        800 MHz         128K  512K   -    90nm     64-bit                 Socket939         1.81
Athlon64/FX       Clawhammer        800 MHz         128K  1024K  -    130nm    64-bit                 Socket939         1.82
Athlon64          Venice            800 MHz         128K  512K   -    90nm     SSE3, 64-bit           Socket939         1.82
Athlon64/FX       San Diego         800 MHz         128K  1024K  -    90nm     SSE3, 64-bit           Socket939         1.83

The Athlon64 and AthlonFX processors use a special kind of front-side bus known as HyperTransport. The actual ceiling of the HyperTransport ranges from 1600-2000 MHz, but because current Athlon64s and AthlonFXs support only DDR400 memory, the operating speed is equivalent to 400 or 800 MHz.

The fabrication process refers to the size of the transistors. Shrinking the transistors lowers the voltage they need (you’ll often see this listed as the VCore of a CPU), which lowers the power consumption and thus the heat generated, allowing the processor to run cooler and faster. A smaller transistor size is always preferable, although it is not as great a factor in performance as clock speed, front-side bus, or cache.

†† HT stands for Hyper-threading. DC stands for Dual-Core. See Additional CPU Capabilities below.

††† The Performance Rating Index was devised and researched by Andrew K. For the full datasheet (which contains the Performance Rating Index, along with a comprehensive list of processors), go to www.juhsd.net/flask/goldenram/download.

Additional CPU Capabilities

The 64-bit Story

The difference between 32-bit computing and 64-bit is essentially that 64-bit can process more data at a time. As some of you may know, data in computers is represented in binary code, a series of 0’s and 1’s. In a 1-bit program, there would be two possible values, 0 and 1. If you advanced to 2-bit, you would have 4 possible combinations of 0 and 1, and therefore 4 possible values, as follows:

00 (= 0 in decimal)

01 (= 1)

10 (= 2)

11 (= 3)

If you keep going, you would soon reach the conclusion that n-bit data gives you 2^n possible values. So, 3-bit would give you 2^3 = 8 values, 4-bit would give you 2^4 = 16, and so on. (Think of x-bit as all possible binary integers with x digits.) In memory, each possible value serves as the address of 1 byte. In today’s 32-bit processors, you would have 2^32 ≈ 4.3 billion possible combinations/values of 0 and 1, and therefore 4.3 billion addressable bytes, or approximately 4 gigabytes. With 32-bit processing, 4 gigabytes is the maximum amount of memory one can have in their computer, and 4 gigabytes is also the maximum possible file size wherever file sizes are stored as 32-bit numbers. (Although I believe XP Pro has a software workaround. That, however, is not true 64-bit, and thus won’t perform as such.)

With 64-bit, the theoretical number of possible values would be 2^64 ≈ 18 billion billion, which works out to about 18 exabytes (18 billion gigabytes) of addressable memory. Now, you may be asking, who in the world needs more than the 4 gigs currently available under 32-bit, much less 18 billion gigs? (err...abytes… “gigs” and “megs” are just abbreviated jargon for gigabytes and megabytes. GB and MB are the initialisms, respectively.) Years ago, however, people were saying the same about the jump from 16-bit to 32-bit. Back then, the ceiling was a measly 64 KB; it would be impossible to achieve anything near what computers do today with only that amount. No doubt, in 2020, when the Sims 9 (or Duke Nukem Forever, for that matter) debuts with its 23rd expansion pack with those hefty half-terabyte textures, people will be looking the same way at our measly 4 gigs.
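If you’d rather let the computer do the counting, here is a minimal Python sketch of the powers of two discussed above. The helper name `possible_values` is made up for this illustration, and it counts a gigabyte the binary way (2^30 bytes), which is an assumption on my part.

```python
def possible_values(bits):
    """Number of distinct values (or byte addresses) that fit in `bits` binary digits."""
    return 2 ** bits

for bits in (1, 2, 3, 4, 16, 32, 64):
    print(f"{bits:>2}-bit: {possible_values(bits):,} values")

# Express the 32-bit and 64-bit ceilings in binary gigabytes (2**30 bytes each).
GIB = 2 ** 30
print(possible_values(32) // GIB, "GB addressable with 32 bits")
print(possible_values(64) // GIB, "GB addressable with 64 bits")
```

Counted in binary gigabytes, the 64-bit ceiling comes out to roughly 17 billion rather than 18 billion; the 18 billion figure uses decimal gigabytes of one billion bytes each.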

Currently, consumer-level computing has no real use for 64-bit. It does, however, have uses in the professional market, where server and database sizes, which stretch into the terabytes, would absolutely crush mere 32-bit processors. (OK, not “crush” exactly, but it would be excruciatingly slow.) Today, the only consumer-level tasks that stretch the limits of 32-bit computing are digital video editing, work with extremely large 2D images, and to some extent, the current generation of games (Doom 3, Half-Life 2, Everquest 2) that employ very high-resolution textures. Real 64-bit programs, however, won’t begin to emerge until at least 2006, after a 64-bit OS has settled in among the majority of households, and a full transition to a 64-bit standard likely won’t be complete until two or three years after that, depending on who you ask. Shortly after that point, however, virtually every 32-bit processor will become obsolete, unable to run any of the modern 64-bit programs.

In a nutshell, 64-bit processors will let you process data in excess of 4 gigabytes. Under 4 gigabytes, 64-bit and 32-bit will run virtually the same, but over it, 32-bit won’t run at all (or at least not fast enough to be anywhere near usable). No doubt, 64-bit is the future, but that future is a ways off. 32-bit will do everything just as well as 64-bit in the present and near future, and for really simple MSOffice/e-mail types, or those who replace their CPUs within 2 years anyway, a 32-bit processor would be serviceable. For everyone else, it would probably be wise to invest in a 64-bit processor now, to avoid being stuck with an essentially useless CPU once 64-bit becomes mainstream.

Seeing Dual-Core

The latest trend in the CPU industry, although it has not yet come fully to fruition, is the emergence of ‘dual-core’ processors: twin processors that sit physically on the same die and share the same memory controller and FSB. For those unfamiliar with computers, having dual cores, just like having dual processors or running in Hyper-threading mode (explained next), improves performance when running multiple applications or highly repetitive tasks. While this will greatly improve repetitive applications like media editing (theoretically twice as fast), it is absolutely useless in things such as games, which have very little parallelism in their code. In general use, a dual-core is actually slower than a single-core processor, because of speed limitations due to extra heat and limited bandwidth on a shared memory controller. And unlike Hyper-threading, which is a free capability built into the chip, a dual-core processor will likely cost much more than a regular single-core processor at the same speed. Thus, for the average user who is looking for better all-around performance, the $200 premium of a dual-core would be much better spent on a better video card, more memory, or even a faster processor, any of which would increase performance much more than an extra core.
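To put rough numbers on the “theoretically twice as fast” claim, here is a small Python sketch using Amdahl’s law, a standard rule of thumb that this article doesn’t otherwise use. The function name `speedup` and the parallel fractions below are illustrative guesses, not measurements, and the sketch ignores the heat and memory bandwidth penalties mentioned above.

```python
def speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only part of the work can be split across cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# The fractions below are illustrative guesses, not benchmark results.
for label, fraction in [("fully parallel job", 1.0),
                        ("half-parallel job", 0.5),
                        ("almost entirely serial job", 0.05)]:
    print(f"{label}: {speedup(fraction, 2):.2f}x on two cores")
```

Only a job that can be split perfectly reaches the full 2x; the less of the work that can run in parallel, the closer the second core gets to doing nothing at all.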

However, if you think a dual-core processor might be useful (you do a lot of digital content creation working with video, audio, images, etc.), but you need a quick solution now, going with AMD is the better choice. Because the memory controllers for Athlon64 X2s are integrated onto the die (Intel’s are on the motherboard), the memory and motherboard you buy for an Athlon64 today will be fully compatible if you decide to upgrade to an X2 sometime in the future. The same cannot be said, however, for the Intel Smithfields, which, although they will work with the current chipsets, will only be fully compatible with the new Glenwood and Lakeport motherboard chipsets.

Double your threads with a single core!

Hyper-threading is a new technology developed by Intel, wherein a single processor operates as a virtual dual-processor system. If you had a 3.0 GHz P4, for example, under Hyper-threading mode it would show up as two logical processors that share the single core’s resources (very roughly, think of it as two separate 1.5 GHz processors). This has some benefits in multi-tasking (encoding a video file while playing a game, for instance) and parallel tasks (where you’re applying the same instruction to a large amount of data; audio encoding and Photoshop filters, for example). In anything else, you won’t see a performance difference, and in some cases, you’ll even see a performance decrease when Hyper-threading mode is on (though you can disable it when you don’t need it). Unless you do a lot of video editing or Photoshop (music encoding never takes more than a few minutes anyway, so the boost is negligible), there’s no reason to shell out extra money to get Hyper-threading (then again, if the price is the same, it doesn’t hurt to have it either).

Streaming Single Instruction, Multiple Data Extensions

SSE2 and SSE3 are ISA extensions that provide additional instructions (like the exponent instruction mentioned earlier) to improve processing efficiency. Both are SIMD (Single Instruction, Multiple Data) instruction sets, and are designed, as the name implies, to apply the same instruction to large sets of data in parallel. Like 64-bit and dual-core processing, the SSE2 and SSE3 extensions are only useful if applications are specifically programmed to take advantage of them. The implementation of SSE2 is already widespread, especially in media editing programs, which is evident in the heavy performance advantage that Pentium 4s possess over AthlonXPs. Although it will be employed in future programs, SSE3 is unlikely to revolutionize performance to the extent that previous ISA extensions, like MMX and SSE2, did. To have both is always nice, but presently, only SSE2 has a significant effect on CPU performance, and even then, it only contributes to the performance of media editing applications.
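As a rough mental model of what “single instruction, multiple data” buys you, here is a toy Python sketch. It does not use real SSE instructions; it simply counts how many “instructions” it would take to add two lists one pair at a time versus four pairs at a time (an SSE register is 128 bits wide, enough for four 32-bit floats). The names `scalar_add`, `simd_style_add`, and `LANES` are made up for this illustration.

```python
LANES = 4  # an SSE register is 128 bits wide: four 32-bit floats

def scalar_add(a, b):
    """Add element by element: one 'instruction' per pair of values."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions

def simd_style_add(a, b, lanes=LANES):
    """Add `lanes` pairs per 'instruction', the way a packed SIMD add would."""
    result, instructions = [], 0
    for i in range(0, len(a), lanes):
        # One packed 'instruction' handles a whole group of values at once.
        result.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        instructions += 1
    return result, instructions

a = [float(i) for i in range(16)]
b = [1.0] * 16
print(scalar_add(a, b)[1], "scalar instructions")
print(simd_style_add(a, b)[1], "SIMD-style instructions")
```

Both versions produce the same sums; the SIMD-style one just gets there in a quarter of the steps, which is why the benefit only shows up in programs written to process their data in batches like this.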
