Sunday, 2 June 2013

The Core i7-4770K Review: Haswell Is Faster

Intel's Haswell architecture is finally available in the flagship Core i7-4770K processor. Designed to drop into an LGA 1150 interface, does this new quad-core CPU warrant a complete platform replacement, or is your older Sandy Bridge-E system better?

Do you know what it’s like to be at the top of your game, the nearest competitor several strides behind? Well, maybe not. But Intel sure does. When it comes to desktop CPUs, the company’s top-end parts continue to stave off AMD's best efforts. That applies to raw performance and efficiency.
We love fast, and we love efficient. But we also like to see healthy competition driving innovation. And again, on the desktop, there’s not enough of that to push Intel. Ivy Bridge-based CPUs are generally a small step up from the generation prior. And although the Sandy Bridge architecture included a number of notable improvements, unprecedented integration gave away Intel’s growing focus on mobility. Even as we got our hands on great features like Quick Sync, Intel was chiseling away at its enthusiast equity by limiting overclocking to K-series SKUs.
Expect more of the same from Haswell. You're going to see notable per-clock performance improvements, faster graphics, and additional features able to accelerate specific workloads. But you’re also going to witness a clumsy handling of overclocking (again), some strange decisions on the graphics side (again), and incremental gains that’ll have some of us upgrading our desktops, but more folks looking for Haswell-powered mobile platforms.
That's entirely by design, by the way. An emphasis on power is front and center with Haswell. And as a result, this architecture is going to span the broadest range of devices Intel has ever touched with one design. But I’ll argue that enthusiasts on the desktop take a back seat to make it all possible.

Meet Haswell, Now Known As Intel’s Fourth-Gen Core Architecture

Intel is rolling out the details of its Haswell-based processors in a staggered launch. The company plans to ship multiple variations of the architecture across a number of different interfaces, from very low-power segments to very performance-sensitive ones. However, the only arrangement emerging today is the quad-core SoC. Technically, Intel is talking desktop and mobile, though we’re deliberately focusing on the Core i7-4770K desktop CPU. I published a preview of Core i7-4770K’s performance almost three months ago, and that story has some information about Intel’s plans as well.
Haswell-based quad-core processors will ship in two configurations to cover the mobile and desktop markets. Only one is ready today, though. That chip features the HD Graphics 4600 engine, also known as GT2. The second, with Iris Pro Graphics 5200 (or GT3e), is coming later. Intel's engineers claim that Iris Pro scales incredibly well given a lofty power ceiling and enough cooling. However, CPUs endowed with the higher-end graphics engine are BGA-only, meaning they’re soldered down. So, enthusiasts buying LGA 1150-equipped motherboards will only find Core i7 and Core i5 CPUs with four cores and HD Graphics 4600 (technically, there’s also a 35 W Core i5 with fewer cores, but it’s still under wraps).
This implementation of Haswell is composed of 1.6 billion transistors, up from a comparable Ivy Bridge configuration’s 1.4 billion. Optimized expressly for Intel’s 22 nm node, the die measures 177 square millimeters, just slightly larger than quad-core Ivy Bridge at 160 mm².
Put Ivy Bridge and Haswell right next to each other and you might have a difficult time telling them apart. After all, there’s “only” a 200 million-transistor delta separating the two. That 14% growth in transistor count largely comes from a 25% increase in graphics resources compared to last generation.
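For anyone who wants to check the arithmetic, here is a quick sketch in Python that uses only the transistor counts and die sizes quoted above. The density figure is our own derived number, not something Intel published, and the variable names are purely illustrative.

```python
# Figures quoted in the article (approximate, as published).
ivy_bridge_transistors = 1.4e9
haswell_transistors = 1.6e9
ivy_bridge_die_mm2 = 160
haswell_die_mm2 = 177


def percent_growth(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


transistor_growth = percent_growth(ivy_bridge_transistors, haswell_transistors)
die_growth = percent_growth(ivy_bridge_die_mm2, haswell_die_mm2)

# Derived value: millions of transistors per square millimeter.
ivy_bridge_density = ivy_bridge_transistors / ivy_bridge_die_mm2 / 1e6
haswell_density = haswell_transistors / haswell_die_mm2 / 1e6

print(f"Transistor growth: {transistor_growth:.1f}%")
print(f"Die area growth:   {die_growth:.1f}%")
print(f"Density (M/mm^2):  {ivy_bridge_density:.2f} -> {haswell_density:.2f}")
```

The transistor count grows a little faster than the die area, so density creeps up slightly even though both chips are built on the same 22 nm node.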
That’s not to say the processor cores go untouched. Intel says it put specific emphasis on speeding up both today’s legacy code and the applications we’ll see in the future. To that end, larger buffers enlarge the out-of-order window, which means instructions that would have previously waited for execution can be located and processed sooner. Haswell’s window is 192 instructions. Sandy Bridge was 168. Nehalem was 128. The Haswell branch predictor is improved, too. This is something Intel manages to do every generation, and for good reason, since it simultaneously enables better performance and reduces the work wasted when a branch is predicted incorrectly. Previously, Intel’s architecture was able to execute six operations per clock cycle. However, Haswell gets two additional ports (one integer ALU and one store), enabling up to eight operations per cycle. And workloads with large data sets should see a benefit from a larger L2 TLB.
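To put those window sizes in perspective, the following sketch computes the generation-over-generation growth from the numbers above. It only covers the three architectures named in the text, and the dictionary and variable names are our own.

```python
# Out-of-order window sizes (in instructions) as quoted in the article.
window_entries = {"Nehalem": 128, "Sandy Bridge": 168, "Haswell": 192}

names = list(window_entries)
window_growth = {}
for previous, current in zip(names, names[1:]):
    old, new = window_entries[previous], window_entries[current]
    window_growth[current] = (new - old) / old * 100

# Peak operations per clock: six before Haswell, eight with it.
ops_per_cycle_growth = (8 - 6) / 6 * 100

for name, growth in window_growth.items():
    print(f"{name}: window grew {growth:.1f}% over its predecessor in this list")
print(f"Peak operations per cycle grew {ops_per_cycle_growth:.1f}%")
```

These are ceilings on what the core can track and dispatch, not measured speed-ups. Real code rarely keeps every port busy.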
All of those changes add up to a significant improvement in Haswell’s IPC compared to Ivy Bridge. That’s where we expect most of the speed-up in general-purpose apps to come from this generation, since the top-end Core i7-4770K runs at the same 3.5 GHz base clock as the Core i7-3770K.
Sure enough, when we set five different processors (employing four different architectures) to the same constant 4 GHz, we see, first, how much more work Intel gets done compared to AMD and, second, a steady progression forward in Intel’s performance.
In addition to the two execution ports Intel adds to Haswell, ports one and two now feature 256-bit Fused Multiply-Add units, doubling the number of peak theoretical floating-point operations per cycle. Integer math gets a big boost as well from AVX2 instruction support.
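The "doubling" claim can be worked through with simple lane counting. The sketch below assumes an FMA counts as two floating-point operations (a multiply and an add), and that the previous generation offered one 256-bit multiply unit plus one 256-bit add unit per core. That second point is our assumption for comparison, not something stated above. The function name and the 3.5 GHz, four-core chip-level estimate are illustrative only.

```python
VECTOR_BITS = 256  # width of an AVX register


def flops_per_cycle(units, element_bits, ops_per_lane):
    """Peak floating-point operations per core per clock cycle."""
    lanes = VECTOR_BITS // element_bits
    return units * lanes * ops_per_lane


# Haswell: two FMA units, each lane does a multiply and an add per cycle.
haswell_sp = flops_per_cycle(units=2, element_bits=32, ops_per_lane=2)  # 32
haswell_dp = flops_per_cycle(units=2, element_bits=64, ops_per_lane=2)  # 16

# Assumed predecessor layout: one multiply unit and one add unit,
# so each unit contributes a single operation per lane.
previous_sp = flops_per_cycle(units=2, element_bits=32, ops_per_lane=1)  # 16
previous_dp = flops_per_cycle(units=2, element_bits=64, ops_per_lane=1)  # 8

# Illustrative chip-level ceiling: 4 cores at the 3.5 GHz base clock.
haswell_peak_sp_gflops = haswell_sp * 4 * 3.5

print(f"Single precision: {previous_sp} -> {haswell_sp} FLOPs/cycle/core")
print(f"Double precision: {previous_dp} -> {haswell_dp} FLOPs/cycle/core")
print(f"Theoretical SP peak at base clock: {haswell_peak_sp_gflops:.0f} GFLOPS")
```

These are theoretical peaks that only apply to code compiled to use FMA instructions. Existing binaries won't see this gain.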
Of course, multiplying the architecture’s compute potential means little if you can’t get data into the core fast enough. So, Intel also made a number of changes to its caches. Haswell’s L1 and L2 caches are the same size as they were in Ivy Bridge (there’s a 32 KB L1 data, 32 KB L1 instruction, and 256 KB L2 cache per core). Bandwidth to the caches is as much as doubled, though, and we’ll see in our synthetic testing that the L1D is indeed quite a bit faster. Intel claims that it can do one read every cycle from the L2 (versus one read every other cycle in Ivy Bridge), but we aren’t able to replicate those figures in our own testing.
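To see what Intel's L2 claim would mean on paper, here is a rough calculation. It assumes each L2 read delivers a full 64-byte cache line and that the core runs at the 3.5 GHz base clock. Both are our assumptions for illustration, and the results are theoretical ceilings rather than anything we measured.

```python
CACHE_LINE_BYTES = 64  # assumed size of one L2 read
CLOCK_HZ = 3.5e9       # Core i7-4770K base clock


def l2_read_bandwidth_gbs(reads_per_cycle):
    """Theoretical per-core L2 read bandwidth in GB/s."""
    return CACHE_LINE_BYTES * reads_per_cycle * CLOCK_HZ / 1e9


haswell_l2_gbs = l2_read_bandwidth_gbs(1.0)     # one read every cycle
ivy_bridge_l2_gbs = l2_read_bandwidth_gbs(0.5)  # one read every other cycle

print(f"Ivy Bridge L2 (claimed rate): {ivy_bridge_l2_gbs:.0f} GB/s per core")
print(f"Haswell L2 (claimed rate):    {haswell_l2_gbs:.0f} GB/s per core")
```

Treat the gap between these two numbers as the upper bound of the claimed improvement, not as an expected benchmark result.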

Model | Cores / Threads | Base Freq. | Max. Turbo | L3 | HD Graphics | Graphics Max Freq. | TDP | Price
Fourth-Gen Core i7 Family
4770T | 4/8 | 2.5 GHz | 3.7 GHz | 8 MB | 4600 | 1,200 MHz | 45 W | $303
4770S | 4/8 | 3.1 GHz | 3.9 GHz | 8 MB | 4600 | 1,200 MHz | 65 W | $303
4770 | 4/8 | 3.4 GHz | 3.9 GHz | 8 MB | 4600 | 1,200 MHz | 84 W | $303
4770K | 4/8 | 3.5 GHz | 3.9 GHz | 8 MB | 4600 | 1,250 MHz | 84 W | $339
4770R | 4/8 | 3.2 GHz | 3.9 GHz | 6 MB | Iris Pro 5200 | 1,300 MHz | 65 W | N/A
4765T | 4/8 | 2.0 GHz | 3.0 GHz | 8 MB | 4600 | 1,200 MHz | 35 W | $303
Fourth-Gen Core i5 Family
4670T | 4/4 | 2.3 GHz | 3.3 GHz | 6 MB | 4600 | 1,200 MHz | 45 W | $213
4670S | 4/4 | 3.1 GHz | 3.8 GHz | 6 MB | 4600 | 1,200 MHz | 65 W | $213
4670K | 4/4 | 3.4 GHz | 3.8 GHz | 6 MB | 4600 | 1,200 MHz | 84 W | $242
4670 | 4/4 | 3.4 GHz | 3.8 GHz | 6 MB | 4600 | 1,200 MHz | 84 W | $213
4570 | 4/4 | 3.2 GHz | 3.6 GHz | 6 MB | 4600 | 1,150 MHz | 84 W | $192
4570S | 4/4 | 2.9 GHz | 3.6 GHz | 6 MB | 4600 | 1,150 MHz | 65 W | $192

The Core i7-4770K gives us an 8 MB shared L3 cache, similar to Core i7s before it. Although the Sandy and Ivy Bridge designs employed a single clock domain that kept the cores and L3 running at the same speed, Haswell decouples them. Our cache bandwidth benchmark reveals a slight hit to L3 throughput, though improvements elsewhere in the System Agent keep the results fairly even.
Haswell offers the same 16 lanes of PCI Express 3.0 connectivity as Ivy Bridge, and validated memory data rates up to 1,600 MT/s. The desktop line-up’s thermal targets are quite a bit different as a result of Intel’s fully-integrated voltage regulator, but an upper bound of 84 W isn’t extreme by any stretch and a floor of 35 W is pretty familiar.
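Those connectivity specs translate into peak bandwidth figures with a bit of arithmetic. The sketch below uses the standard PCI Express 3.0 signaling rate of 8 GT/s per lane with 128b/130b encoding. For memory, it assumes a dual-channel configuration with 64-bit channels, which is our assumption and not something stated above.

```python
# PCI Express 3.0: 8 GT/s per lane, 128b/130b line encoding.
PCIE3_TRANSFERS_PER_SEC = 8e9
PCIE3_ENCODING_EFFICIENCY = 128 / 130
PCIE_LANES = 16

# One bit per transfer per lane; divide by 8 to convert bits to bytes.
pcie_gbs_per_direction = (
    PCIE3_TRANSFERS_PER_SEC * PCIE3_ENCODING_EFFICIENCY * PCIE_LANES / 8 / 1e9
)

# Memory: 1,600 MT/s validated data rate.
MEMORY_TRANSFERS_PER_SEC = 1600e6
BYTES_PER_TRANSFER = 8  # assumed 64-bit channel
MEMORY_CHANNELS = 2     # assumed dual-channel setup

memory_gbs = (
    MEMORY_TRANSFERS_PER_SEC * BYTES_PER_TRANSFER * MEMORY_CHANNELS / 1e9
)

print(f"PCIe 3.0 x16: {pcie_gbs_per_direction:.2f} GB/s per direction")
print(f"Memory at 1,600 MT/s: {memory_gbs:.1f} GB/s peak")
```

Both values are theoretical maximums. Protocol overhead and real access patterns keep sustained throughput below them.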
All of Intel’s upgradable processors now drop into an LGA 1150 interface, meaning any decision to adopt Haswell is also going to require a motherboard purchase, at least. So, before you drop several hundred dollars on a brand new platform, let’s figure out if Core i7-4770K is worth the investment.
via tom
