28 nm AMD APU Lineup is Complete, On Course for H1 2014 Launch

Advanced Micro Devices, Inc. (AMD) announced a pair of sporty upcoming x86-based, 28 nanometer (nm) processors that will front its mobile processing efforts.  It also announced a new HTPC-geared CPU+GPU combo chip that’s sure to please budget shoppers.  Officially these chips won’t launch until the 2014 Consumer Electronics Show (CES).  But AMD has revealed enough to paint a very interesting picture of the 2014 chip market as it tries to capitalize on its sales strengths and improve upon its struggling lines.

I. Trinity and Llano Kick Things Off

Since their 2010 introduction, AMD’s “accelerated processing units” (APUs), the company’s marketing term for PC-geared system-on-a-chip (SoC) designs, have sold well and have been very competitive in some niches.  But as mobile users’ battery life expectations have risen, AMD has struggled with power consumption.  Even as Intel Corp. (INTC) pioneered industry-leading node technology, AMD’s third-party fab partners struggled to keep pace in terms of die shrinks.

Ultimately this struggle has had more of an effect on the lower-end, mobile-geared APU stock than on AMD’s desktop-geared offerings.

At the high end of AMD’s APU line are the A8 and A10 brands, the former of which debuted in June 2011 with the launch of the first-generation Llano APU and the latter of which debuted in Oct. 2012 as a late add-on to the Trinity series.

Since day one AMD’s APU line competed for very specific market niches — budget laptops with no discrete mobile GPU card — and leaned heavily on price as a selling point.  But AMD’s graphical leadership made this formula not only work, but flourish as AMD’s chips rivalled even low-end discrete mobile GPUs at a reduced bill of materials net cost for the GPU+CPU.

Pictured top to bottom: Brazos, Trinity (middle), and Tahiti (from which Trinity’s on-die GPU is partially derived) [Image Source: Jason Mick/Global Tech News LLC]
Last year’s follow-up to Llano, the 32 nm Trinity line of APUs, added an improved on-die GPU (sometimes referred to as a “dGPU”), which fell somewhere between the Radeon HD 6000 and HD 7000 series in architecture.  Trinity also ditched the aging K10 architecture for a leaner, enhanced Bulldozer core, code-named Piledriver.  Power fell to between 65 and 100 watts.

II. Richland Sees Gains in Budget Desktop Space, But Struggles in Notebook Market

This year AMD once more refreshed the A8/A10-branded chip line with Richland cores.  The Richland refresh kept Trinity’s Piledriver core design relatively unchanged, but AMD’s experience with the 32 nm node allowed it to bump clock speeds roughly 10 percent.
  AMD previews Richland @ CES 2013 [Image Source: Brandon Hill/Global Tech News]
Graphics-wise, Richland featured a new Neptune on-die graphics processing unit (dGPU), but despite being branded as the Radeon 8000 Series, this GPU was based on the aging VLIW-4 core design introduced with the Radeon HD 6000 series and replaced in the Radeon 7000 series by Graphics Core Next (GCN).  This isn’t to say that Richland’s GPU didn’t evolve on a separate parallel path; it did.  However, the lack of GCN also had some decided downsides.  Most notably, GCN is a SIMD architecture, fundamentally different from the MIMD-style approach used by VLIW-4.

Richland uses the older VLIW-4 MIMD GPU compute units, whereas Kaveri and Volcanic Islands use the new SIMD GCN 1.1 compute units. [Image Source: AMD via WCCFTech]
Ultimately, SIMD would seem to hold some advantages in terms of GPU computing, given that both AMD and NVIDIA Corp. (NVDA) have adopted this approach in their high-end GPUs.  That said, AnandTech’s tests showed Richland to actually perform quite well in compute, indicating that while its branch of the Radeon 8000 tree was decidedly different, it was not grossly inferior.
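
One common way to picture the difference: a VLIW-4 unit relies on the compiler finding up to four independent operations within a single thread to fill each instruction bundle, while a SIMD unit applies one operation across many work-items in lockstep. The Python sketch below is only a toy model of that idea, not a description of AMD's actual compiler or hardware scheduler. The function names, the greedy packing strategy, and the 16-lane default are illustrative assumptions.

```python
from math import ceil
from typing import Dict, List, Set


def pack_vliw(deps: Dict[str, Set[str]], width: int = 4) -> List[List[str]]:
    """Greedily pack one thread's operations into bundles of at most `width`.

    `deps` maps each operation name to the set of operations it depends on.
    An operation may only be issued once all of its dependencies were issued
    in an EARLIER bundle (toy model: every operation takes one bundle).
    """
    issued: Set[str] = set()
    remaining = list(deps)
    bundles: List[List[str]] = []
    while remaining:
        ready = [op for op in remaining if deps[op] <= issued]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        bundle = ready[:width]
        bundles.append(bundle)
        issued.update(bundle)
        remaining = [op for op in remaining if op not in issued]
    return bundles


def vliw_utilization(bundles: List[List[str]], width: int = 4) -> float:
    """Fraction of VLIW slots that actually hold an operation."""
    return sum(len(b) for b in bundles) / (len(bundles) * width)


def simd_utilization(work_items: int, lanes: int = 16) -> float:
    """Fraction of SIMD lanes doing useful work for one instruction.

    The lane count is an illustrative assumption; utilization here depends
    on how many work-items exist, not on dependencies inside one thread.
    """
    passes = ceil(work_items / lanes)
    return work_items / (passes * lanes)


# Four independent operations fill a single 4-wide bundle completely.
independent = {"a": set(), "b": set(), "c": set(), "d": set()}
# A dependent chain (b needs a, c needs b, d needs c) leaves 3 of 4 slots idle.
chain = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"c"}}

print(vliw_utilization(pack_vliw(independent)))  # 1.0
print(vliw_utilization(pack_vliw(chain)))        # 0.25
print(simd_utilization(64))                      # 1.0
```

In this model the dependent chain wastes three quarters of the VLIW slots, while the SIMD unit stays full as long as there are enough work-items to cover its lanes. That is one plausible reason compute-style workloads tend to favor the SIMD approach.
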
While it was announced in January, Richland didn’t become available until this past June, a launch window that put it head-to-head with Intel’s 22 nm Haswell chips.
Given the larger node size, Richland chips were cheaper ($105-150 USD versus $182-339 USD).  However, Intel’s Haswell SoCs were much faster CPU-wise.  Haswell U series Core i5 chips also offered aggressive power efficiency, consuming as little as 11.5 watts on the mobile end and 28 watts on the desktop end.  By contrast, Richland’s top-performing chips only achieved a 17-watt envelope in mobile chips and a 45-watt envelope on the desktop side, due largely to AMD’s larger transistors (32 nm versus 22 nm).
In terms of graphics, Intel and AMD took somewhat different roads.  AMD’s dGPU featured an external MXM memory package (2 GB), which was larger than the “Crystalwell” embedded memory found in the new Intel “Iris” dGPUs, but also slower.  Ultimately, Intel’s eDRAM+DRAM approach delivered roughly comparable bandwidth to AMD’s pure MXM GDDR5 solution, with Intel’s solution being slightly superior from a reliability and power standpoint.

Iris Pro finally gives AMD’s APUs a real challenge, graphically, but it’s only available for ultrabooks. [Image Source: Intel]
The key area where Richland has done well is in finding price-sensitive niches in which to compete, including the budget PC market.

Intel disappointed many when it decreed that its high-end Iris Pro 5200 GPUs (the only Haswell dGPUs to feature Crystalwell) would be only available as a ball-grid array (BGA) packaged design for select ultrabooks.  Haswell desktop chips were stuck with the lower-end HD 4600 graphics solution, which Richland handily beat in terms of graphical performance [source].  For buyers of machines with a discrete GPU, this wasn’t a major concern, but for those building or buying leaner outfitted rigs (e.g. a cheap home theater PC) this was a major letdown.

By July a handful of laptops with Iris Pro (HD 5200) graphics (found in “R” and “HQ” series Core i5 and i7 processors) had popped up.  Intel’s Iris Pro GPU turned the tables, outperforming Richland by anywhere from 25 to 40 percent in gaming benchmarks [source; source], albeit at a higher unit cost that ultimately translated to higher laptop prices.

AMD’s Q3 earnings report reflected this reality.  The company wrote:
  • Computing Solutions segment revenue decreased 6 percent sequentially and decreased 15 percent year-over-year. The sequential and year-over-year declines were due to decreased notebook and chipset unit shipments, partially offset by an increase in desktop unit shipments.

  • Operating income was $22 million, compared with operating income of $2 million in Q2 2013 and an operating loss of $114 million in Q3 2012. The Q3 2012 operating loss included an inventory write-down of approximately $100 million primarily consisting of first generation A-Series accelerated processing units (APUs).

III. Meet Kaveri, the Fourth Generation Chip in the A8/A10 Line

At this point you might be wondering — didn’t AMD promise to launch 28 nm A8/A10 chips this year? Indeed, AMD had hoped to ship Kaveri — the successor to Richland (and first AMD 28 nm APU) in H2 2013.  
A Kaveri chip (far left) beside an unpackaged Richland chip (middle); a Steamroller core is on the right. [Image Source: Overclock.net (left); AMD via ExtremeTech (right)]
Ultimately that release date slipped to January 2014.  Kaveri features AMD’s new Steamroller core design, which packs 3 more ALUs per core along with other changes meant to improve both parallelism and single-core performance.

The GPU onboard Kaveri is at last merged with the mainline Radeon series.  It features 512 GCN 1.1 SIMD cores (stream processors) packaged into 8 compute units (CUs).  These are the same cores found in the latest and greatest Radeon Rx 2xx Series GPUs (codenamed Volcanic Islands) and in AMD’s upcoming chips for the Microsoft Corp. (MSFT) Xbox One and Sony Corp. (TYO:6758) PlayStation 4.

AMD makes an interesting comparison of its CPU-versus-GPU die-space ratio versus Intel’s:

… of course it’s somewhat unclear whether AMD is talking about the Iris/Iris Pro Haswell chips or (more likely) the HD 4600.

These major changes mean a shift in socket as well.  The new socket is dubbed FM2+.  Hence, Kaveri won’t work as a drop-in replacement for Richland in existing FM2 motherboards.

Kaveri swaps out Richland‘s FM2 socket for FM2+. [Image Source: WCCF Tech]
After a back-and-forth debate in the rumor mill about whether it was feasible to pool GDDR5 and DDR3, the believers won out, as AMD has indeed adopted this novel technology for Kaveri.  The 28 nm chip features a new “hUMA” (heterogeneous Uniform Memory Access) architecture, as outlined in slides which leaked in April 2013.

[Image Source: AMD via Bit-Tech]
hUMA supports up to 32 GB of pooled memory in total (including support for four DDR3 DIMMs), which in practice will likely often mean 16 GB of DDR3 and 2-4 GB of GDDR5.

[Image Source: AMD via Bit-Tech]
A footnote reveals more data on the upcoming Kaveri A10-7850K APU.  The slide points to a CPU clock speed of 3.7 GHz and a GPU clock speed of 720 MHz.  This A10 is clearly a desktop part, consuming 95 watts.  A leak from July indicates AMD will also release a mobile chip with a 1.8 GHz CPU clock (2.3 GHz turbo clock) and a 500 MHz GPU clock.
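
For a rough sense of scale, the figures quoted for the desktop part (512 stream processors in 8 compute units, with a 720 MHz GPU clock) can be turned into a theoretical peak throughput. The sketch below assumes each stream processor retires one fused multiply-add, counted as 2 floating-point operations, per clock. That is a common convention for quoting peak numbers, not a figure taken from AMD's slide, and real-world throughput will be lower.

```python
# Figures quoted in the article for the desktop Kaveri part.
STREAM_PROCESSORS = 512
COMPUTE_UNITS = 8
GPU_CLOCK_MHZ = 720

# ASSUMPTION: one fused multiply-add (2 FLOPs) per stream processor per clock.
FLOPS_PER_SP_PER_CLOCK = 2


def stream_processors_per_cu(stream_processors: int, compute_units: int) -> int:
    """Stream processors contained in each compute unit."""
    return stream_processors // compute_units


def peak_gflops(stream_processors: int, clock_mhz: float,
                flops_per_clock: int = FLOPS_PER_SP_PER_CLOCK) -> float:
    """Theoretical single-precision peak in GFLOPS (an upper bound only)."""
    return stream_processors * clock_mhz * flops_per_clock / 1000.0


print(stream_processors_per_cu(STREAM_PROCESSORS, COMPUTE_UNITS))  # 64
print(round(peak_gflops(STREAM_PROCESSORS, GPU_CLOCK_MHZ), 2))     # 737.28
```
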

[Image Source: AMD via Bit-Tech]
Also interesting is the inclusion of a Cortex-A5 processor, a tweaked licensed intellectual property (IP) core from ARM Holdings plc (LON:ARM).  While the x86 Steamroller CPU and SIMD GCN 1.1 GPU cores are still doing most of the heavy lifting, the added ARM coprocessor gives AMD a low-power tool for “console class gaming sound and movie theater surround processing”.

This becomes very interesting when you consider the sales commentary in AMD’s earnings report.  Clearly AMD has recognized that the power-hungry A8/A10 chips have struggled to find acceptance in laptops, but have been embraced by HTPC users on a budget (as evidenced by the increase in desktop sales).  This is a very smart move, as AMD is adding value for one of its products’ biggest target audiences.

Overall Kaveri is a major jump in terms of core design for AMD.  Not only does it unify the until-now disparate GPU/APU graphics core trees, but it goes a step further, putting forth, in essence, a single common platform for game console chips, a broad spectrum of APUs, and GPUs.  This is the first generation that’s featured this unification, so it’s important not to judge the gains too hastily.  But it’s definitely the start of big things.

IV. AMD’s Mobile Successors, “Beema” and “Mullins”

In related news, AMD also announced two new processor designs, Beema and Mullins, which will fill in the A4/A6 product line, targeting laptops and tablets.

Mullins notably packs a razor-thin 2 watt TDP, with an onboard GPU.  That’s a major improvement over the most efficient Temash chips, which consumed 3.9 watts.  Given AMD’s reputation for aggressive pricing, these gains may transform it into a serious competitor for tablets and perhaps even phablet-style smartphones.  Versus Temash, which saw very weak adoption, Mullins gives AMD a fighting chance against Intel and ARM chipmakers (who, it’s worth noting, are stuck on the same node).

AMD’s Temash chip seemed promising, but was largely ignored by OEMs. [Image Source: PC World]
Beema is slightly more power hungry, slotting into a 10-25 watt envelope, a modest improvement over the 17-35 watt envelope of Trinity.
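
To put the power figures quoted for Mullins and Beema in proportion, the short sketch below works out the percentage reductions. It treats the quoted wattages as directly comparable, which is an assumption, since TDP ratings are design targets rather than measured consumption. The helper name is hypothetical.

```python
def percent_reduction(old_watts: float, new_watts: float) -> float:
    """Percentage by which the new power figure undercuts the old one."""
    return (old_watts - new_watts) / old_watts * 100.0


# Mullins (2 W) versus the most efficient Temash chips (3.9 W).
mullins_vs_temash = percent_reduction(3.9, 2.0)

# Beema (10-25 W) versus Trinity (17-35 W), at both ends of the envelope.
beema_low_end = percent_reduction(17, 10)
beema_high_end = percent_reduction(35, 25)

print(round(mullins_vs_temash, 1))  # 48.7
print(round(beema_low_end, 1))      # 41.2
print(round(beema_high_end, 1))     # 28.6
```

On these numbers Mullins roughly halves the power budget of its predecessor, while Beema trims the envelope by somewhere between a quarter and two fifths.
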

Both processors are anchored by the fresh Puma core design, which replaces the Jaguar cores found in AMD’s Temash and Kabini platforms, the chips that comprise AMD’s current E1/E2-branded lineup as well as much of the A4/A6-branded lineup.  Like Kaveri, both mobile-minded chips pack a licensed ARM Cortex-A5 processor.  In this case the coprocessor is used to enhance mobile security via ARM’s TrustZone technology, which allows for secured financial transactions online via hardware-level encryption.

Mark Papermaster, AMD’s chief technology officer and senior vice president, voiced AMD’s determination to turn around its mobile APU offerings after a rocky 2013 for its APUs.  He comments, “AMD is establishing excellent momentum this year in the low-power, mobile computing market and with ‘Mullins’ and ‘Beema’ coming in 2014 we are not standing still.  AMD aims to deliver a set of platforms in the first half of next year that will outperform the competition in graphics and total compute performance in fanless tablets, 2-in-1s and ultrathin notebooks.”
Fiscally AMD has performed well in 2013, and the year has been a turning point for the company.  Now with the new Steamroller, Puma, and GCN 1.1 CPU/GPU cores anchoring AMD’s entire lineup, and with a consistent approach that offers the same kinds of computing IP (from ARM processors down to the processor layout) in a diverse range of product families, AMD’s vision is coalescing as well.
It should be a very interesting 2014 for AMD.  Even if it can’t live up to its ambitious mobile hopes, it should see some small gains at least, and Kaveri should make a big splash in the budget desktop space, sustaining growth, assuming AMD continues to deliver come shipping time in H1.
