Qualcomm's Next-Gen Krait 400 & Krait 300 Announced in Snapdragon 800 & 600 SoCs
by Anand Lal Shimpi & Brian Klug on January 7, 2013 9:53 PM EST
We've been hinting at this for a while, both on the Podcast and in our most recent power analysis piece, but today it's very official: Qualcomm is announcing the next two versions of its Krait architecture.
Krait is the codename for Qualcomm's custom ARMv7 microprocessor. The 3-wide out-of-order design has dominated the smartphone landscape since its introduction last year. Unlike what we saw with the Scorpion/Krait transition, Qualcomm is going to keep Krait fresh with more frequent updates.
The first two updates come today: Krait 300 and Krait 400.
Krait 300
In usual Qualcomm fashion, we're missing good depth on exactly what these new revisions deliver. This is one area where Qualcomm really needs to emulate Intel: we know more about Haswell than we do about the original Krait.
That being said, here's what we do know. Krait 300 is still built on TSMC's 28nm LP process, just like the original Krait. The pipeline remains unchanged, but Qualcomm is able to squeeze higher clocks out of the core. It's unclear whether we're simply talking about voltage scaling or a combination of that and improvements to timing, yields and layout. Whereas the current Krait core tops out at around 1.5GHz, Krait 300 will run at up to 1.9GHz.
Another big addition to the architecture: Krait 300 now features a hardware data prefetcher that preemptively grabs data out of main memory and brings it into the L2 cache. The original Krait core had no L2 prefetchers.
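To make the benefit of prefetching into L2 concrete, here is a minimal toy sketch. Qualcomm hasn't disclosed how Krait 300's prefetcher actually works, so the next-line policy, the 64-byte line size, the unbounded cache and the 4KB streaming workload are all assumptions chosen purely for illustration. The model also ignores timing entirely: it counts demand misses only.

```python
LINE_BYTES = 64  # ASSUMED cache line size; not stated in the article


def count_misses(addresses, prefetch_next_line):
    """Count demand misses in a toy, unbounded L2 model."""
    cached_lines = set()
    misses = 0
    for addr in addresses:
        line = addr // LINE_BYTES
        if line not in cached_lines:
            misses += 1
            cached_lines.add(line)
        if prefetch_next_line:
            # Hypothetical next-line policy: pull the following line in
            # before the core asks for it.
            cached_lines.add(line + 1)
    return misses


# Hypothetical workload: walk a 4KB buffer four bytes at a time.
stream = range(0, 4096, 4)
print("misses without prefetch:", count_misses(stream, False))
print("misses with prefetch:   ", count_misses(stream, True))
```

On a purely sequential walk like this one, the toy prefetcher hides every miss after the first. Real workloads are less regular and a real prefetcher has to fetch early enough to matter, so treat this as the best case rather than a prediction.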
Single threaded IPC improvements are the name of the game with Krait 300 and like all good evolutions to microprocessor architectures, the new Krait improves branch prediction accuracy. Since there's no increase to pipeline depth, improved branch prediction directly results in improved IPC (and better power efficiency).
Both Qualcomm and ARM have been very vague about what types of instructions can be executed out of order, but Krait 300 can execute more instructions out of their original program order. Building a robust OoOE (Out of Order Execution Engine) is very important to driving higher performance, and being able to reorder more types of instructions directly impacts single threaded performance.
Krait 300 now supports forwarding between pipelines, although it's not clear whether or not the previous architecture lacked any ability to forward data between stages.
Finally, Krait 300 improves FP and JavaScript performance. Once again, it's not clear how. I've asked Qualcomm whether there have been any changes to the execution units in Krait 300 to enable these improvements. In general I believe we're looking at around a 15 percent increase in performance at the same clock frequency, for a jump of 20 to 30 percent overall with the clock increases. This isn't necessarily enough to close the gap between Krait 300 and ARM's Cortex A15; however, Krait 300's power profile should be much better. Compared to Atom, the Krait 300 improvements should be enough to at least equal performance if not surpass it, but not necessarily significantly.
Krait 400
If Krait 300 is the new performance mainstream successor to Krait, Krait 400 is the new ultra high end part. Using TSMC's new 28nm HPm process (High-K + Metal Gate, optimized for low power at peak performance), Krait 400 can run at up to 2.3GHz. The 400 series core inherits all of the improvements from Krait 300 but adds a couple more.
The move to 28nm HPm necessitates a redesign of circuits and some relayout, but Qualcomm also improved the memory interface on the core. Krait 400 enjoys lower latency to main memory and a faster L2 cache.
The New Snapdragons
The new Kraits will find their way into new Snapdragon platforms, now numbered 200, 400, 600 and 800 (the old S1 - S4 labels are gone). As always, higher numbers mean better performance, but you'll still need to rely on the internal part numbers to know what's really inside.
Today Qualcomm announced the Snapdragon 800, which implements four Krait 400 cores running at up to 2.3GHz, an Adreno 330 GPU and Qualcomm's 3rd generation LTE baseband (9x25) all on a single die. Snapdragon 800 is the part formerly (or still internally) known as MSM8974 which we've seen rumblings about numerous times.
Qualcomm tells us that the Adreno 330 will offer roughly 50% more graphics performance over Adreno 320, and an almost 2x increase in compute performance.
The integrated 9x25 3rd generation LTE baseband enables support for UE Category 4 LTE with up to 150Mbps downstream. This is the same IP block as in MDM9x25, and likewise MSM8974/Snapdragon 800 will be available in all the usual variants (CDMA2000/WCDMA/LTE, WCDMA/LTE, and finally no modem).
Snapdragon 800 also integrates 802.11ac baseband, a new feature of modern Qualcomm SoCs, just like 8960 and the previous S4 family.
Snapdragon 800 also includes a 2x32 LPDDR3 memory interface.
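For a rough sense of what a 2x32 interface means in bandwidth terms, the sketch below multiplies bus width by transfer rate. The 2x32 configuration comes from the announcement; the 1600 MT/s data rate is an assumption picked as a plausible LPDDR3 speed, not a figure stated here, so the result is a theoretical ceiling under that assumption.

```python
def peak_bandwidth_gb_per_s(channels, bits_per_channel, megatransfers_per_s):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = channels * bits_per_channel // 8
    return bytes_per_transfer * megatransfers_per_s * 1e6 / 1e9


# 2 channels x 32 bits is from the article; 1600 MT/s is an ASSUMED data rate.
bandwidth = peak_bandwidth_gb_per_s(channels=2, bits_per_channel=32,
                                    megatransfers_per_s=1600)
print(f"theoretical peak: {bandwidth:.1f} GB/s")
```

Sustained bandwidth in a shipping device will be lower than any theoretical peak, since the CPU, GPU and video blocks all share the same interface.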
On the video side, the SoC supports encode/decode of 4K HD content at 30 fps.
Also being announced today is the Snapdragon 600. This part integrates four Krait 300 cores running at up to 1.9GHz. Adreno 320 handles GPU duties, although with an increased clock speed. Compared to the current Snapdragon S4, the 600 is expected to improve performance by up to 40% if you combine IPC and frequency increases.
The new Snapdragon 600 is also known by the part number APQ8064T, and was formerly known as the Snapdragon S4 Pro.
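As a back-of-the-envelope check on the "up to 40%" figure, the sketch below splits it into a clock component and a per-clock component. It assumes the simplest possible model, performance = IPC x clock, and assumes the current part tops out at roughly 1.5GHz (the approximate peak for today's Krait). Neither assumption is from Qualcomm, and real workloads rarely scale perfectly with frequency.

```python
def clock_gain(old_ghz, new_ghz):
    """Ratio of new peak clock to old peak clock."""
    return new_ghz / old_ghz


def implied_ipc_gain(total_gain, old_ghz, new_ghz):
    """Per-clock ratio needed to reach total_gain if perf = IPC * clock."""
    return total_gain / clock_gain(old_ghz, new_ghz)


# 1.9GHz and the 40% ceiling are from the article; 1.5GHz is the article's
# approximate top clock for the current Krait core (an assumed baseline here).
freq = clock_gain(1.5, 1.9)
ipc = implied_ipc_gain(1.40, 1.5, 1.9)
print(f"clock alone:            +{(freq - 1) * 100:.1f}%")
print(f"implied per-clock gain: +{(ipc - 1) * 100:.1f}%")
```

Under this model most of the headline gain comes from frequency, with the remainder left for the per-clock improvements.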
Final Words
Qualcomm really is the one to beat when it comes to smartphone SoCs. Its excellent baseband integration combined with a very good balance of power and performance on the CPU/GPU side make for a platform that's difficult to outperform.
With Krait 300/400, Qualcomm is really evolving its Krait architecture the right way. The update comes at the right time after the original Krait, and improves performance in the right way. A religious focus on improving single threaded performance, generation over generation, without blowing through your power budget is the only way to do this. Qualcomm gets it.
Krait was good, but Krait 300/400 are likely going to continue to carry that flag through 2013. More importantly, Qualcomm has hinted numerous times that it has a "pipeline of Kraits" lined up for the future.
Tablets and larger devices are really where Qualcomm will have its work cut out for it. Between Intel's x86 offerings and ARM's Cortex A15, Qualcomm's strengths still apply, but it's going to face more strenuous competition.
Today's announcements are a welcome update. Qualcomm is gearing up for a war and is definitely making the right moves. If it can keep up this aggressive cadence, Krait can easily become a fixture in the ultra mobile space.
22 Comments
pugster - Monday, January 7, 2013
Not much bump of GPU power compared to their predecessor, unlike Nvidia's Tegra 4
akmittal - Tuesday, January 8, 2013
Nvidia jumped from 40nm to 28nm and from Cortex A9 to A15, so that results in a bump. Qualcomm was already using the latest 28nm technology, and Krait was very close to Cortex A15. So we can call this performance increase huge.
varad - Tuesday, January 8, 2013
"Krait was very close to Cortex A15":
You might want to back that up. Here are some numbers measured by Anandtech:
http://www.anandtech.com/show/6425/google-nexus-4-...
And the bump in Tegra 4's GPU perf is independent of the process transition. They seem to have bumped the number of GPU cores to 6x in Tegra 3.
akmittal - Tuesday, January 8, 2013
By close to A15, I meant much better than Cortex A9
Activate: AMD - Tuesday, January 8, 2013
"And the bump in Tegra 4's GPU perf is independent of the process transition. They seem to have bumped the number of GPU cores to 6x in Tegra 3."
What? It's directly related. By moving to 28nm nV was able to fit A15 AND more shaders in the same die area as their 40nm process. They couldn't have used that many shaders without ballooning die size and TDP. So actually, the bump in Tegra 4's GPU perf is hugely dependent on the process transition
Assimilator87 - Monday, January 7, 2013
"Qualcomm tells us that the Adreno 330 will offer roughly 50% more graphics performance over Adreno 220, and an almost 2x increase in compute performance."
Did you mean "over Adreno 320"?
akmittal - Tuesday, January 8, 2013
Yup they mean Adreno 320 not 220..
Brian Klug - Tuesday, January 8, 2013
Oops that's right, fixed!
-Brian
deltatux - Monday, January 7, 2013
@pugster: unlike NVIDIA, Qualcomm's Adreno is one of the leading mobile GPUs out there, they don't need to play catch-up like NVIDIA does. NVIDIA's GeForce ULPs found in Tegra 2 and 3 were rather lacking for a while. Qualcomm can keep the 800 series very competitive by just improving graphics by 50% to compete with NVIDIA.
mayankleoboy1 - Tuesday, January 8, 2013
With the Adreno330, can they beat PowerVr in Apple SoC's?
2013 is definitely going to be exciting for ARM SoC's. We have Exynos 5xxx, T4 and Krait 300.
And of course, Apple.