Welcome to closedsrc.org, a blog containing random postings and ramblings.

A new look at cloud scale-out with ARM processors

Posted: November 2nd, 2010

With the rise of smartphones and the growing dependence on web applications and the cloud, one processor architecture has led the charge: ARM. Processors based on the ARM architecture have found their way into Apple iPhone, iPad and iPod Touch devices, Android smartphones and tablets, upcoming Chrome OS netbooks and tablets, home network routers and home multimedia hubs. Some of the reasons for the proliferation of ARM-based processors include low cost, low-to-very-low power consumption, decent processing power, and an open development environment.

As the need for additional web applications and cloud resources increases, and as hosting providers and datacenters need to make better use of power and cooling, I think it is time for the ARM architecture to become a formidable competitor in the server market. Granted, I am not alone in thinking about this, as various server builders are already considering their options. For the past several years, the IT marching orders have been all about consolidation and virtualization. Heck, I have been a proponent of server virtualization and have invested in a virtualized infrastructure career path.

Virtualization has been a hot topic, as it allows workloads running on under-utilized servers to be pooled together onto fewer servers that provide more I/O and higher availability. Hypervisors, such as VMware ESXi, Xen and Hyper-V, have also improved dramatically over the years, mostly in terms of efficiency and manageability. The problems with virtualization in a very large cloud environment are the cost of management tools, the lack of network visibility or the cost of exposing it (VN-Link, VMware Virtual Distributed Switching, etc.), and the dependence on relatively expensive servers. Another consideration is that you are currently limited to roughly 24 cores/48 threads for every unit of rack space (a twin-node 1U server with dual six-core Xeon processors per node), while each twin-node server can draw more than 500W of power.

Another option would be to purchase an UltraSPARC-T3 server with 16 cores and 128 threads, filled with 128GB of RAM and running Solaris 10 with numerous zones or virtual partitions. For many applications, this option would work out just fine and would allow for a fully supported Java stack from top to bottom. Unfortunately, it comes at the cost of moderately high power consumption, the requirement of FBD modules (something that Intel moved away from with the Nehalem-based Xeon processors), potential floating point bottlenecks, and, a deal-breaker for some, the fact that it is an Oracle product.

So, what would a cloud environment built with ARM-based processors look like? One way I see it working is a special rackmount chassis filled with small single-board computers, each containing a dual- or multi-core ARM-based processor, 1-2GB of on-board memory, 2-8GB of flash memory and dual Ethernet ports. The chassis could either aggregate all of the Ethernet ports onto a 1/10Gbps Ethernet switch or provide a pass-through module. A 3U chassis could hold 20 modules, a couple of fans and redundant power supplies, yet draw less than 300W of power in total, which would provide a great thread/power ratio.
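To put rough numbers on that thread/power ratio, here is a minimal Python sketch that uses only the figures quoted in this post: 48 threads and 500W for a twin-node 1U Xeon server, and 20 nodes and 300W for the 3U ARM chassis. The assumptions are mine: each ARM core is counted as one hardware thread, the 500W and 300W values are treated as exact even though the post gives them as "more than" and "less than", and a thread on one architecture is not equivalent to a thread on the other. The names `Option` and `arm_chassis` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One way of filling rack space, described by the figures in the post."""
    name: str
    rack_units: int
    threads: int
    watts: float

    @property
    def threads_per_watt(self) -> float:
        return self.threads / self.watts

    @property
    def threads_per_rack_unit(self) -> float:
        return self.threads / self.rack_units


def arm_chassis(cores_per_node: int, nodes: int = 20,
                watts: float = 300.0, rack_units: int = 3) -> Option:
    """Hypothetical 3U ARM chassis; assumes one hardware thread per core."""
    return Option(f"3U ARM chassis ({cores_per_node}-core nodes)",
                  rack_units, nodes * cores_per_node, watts)


# Twin-node 1U server: 2 nodes x 2 sockets x 6 cores x 2 threads = 48 threads.
XEON_TWIN = Option("Twin-node 1U Xeon", rack_units=1, threads=48, watts=500.0)

if __name__ == "__main__":
    for opt in (XEON_TWIN, arm_chassis(2), arm_chassis(4)):
        print(f"{opt.name:32s} "
              f"{opt.threads_per_watt:.3f} threads/W  "
              f"{opt.threads_per_rack_unit:.1f} threads/U")
```

With these inputs, the dual-core chassis comes out ahead on threads per watt but well behind on threads per rack unit, and the power advantage widens as the core count per node goes up. Since the post's power figures are bounds rather than measurements, treat the output as a ballpark only.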

Would a dual-core ARM-based processor be able to handle the load? If the node is tasked with being on the front line of fire and does not have to handle intensive database queries or floating point processing, then the processor would be up to the task. Even if each node has a local cache stored in one or more SQLite databases, there would be more than enough processing cycles for such tasks.

Why go with such a small amount of memory and flash-based storage? Simple: the targeted workload would be spread across dozens of nodes and the actual operating system footprint would be pretty small. Considering that you can get a thin LAMP stack on a Debian or Ubuntu server build down to around 1GB, one can surmise that an optimized, stripped-down build could be done in less than 300MB. Heck, the nodes could PXE boot and pull the necessary data over the network, which would reduce or eliminate the need for on-board storage.

How would 20 ARM-based server nodes be able to fit into a 3U chassis? It has been done before, even with nodes that pull more than 15-20W each. Both Sun Microsystems and Compaq had 3U blade server chassis that supported 20 blades (each with one UltraSPARC IIe/IIi or one Pentium M processor, 1-2GB of RAM and one 2.5″ hard drive bay) and an Ethernet switch or pass-through module. One can use the same blade setup, use smaller and more efficient power supplies and cooling (as the need for power and cooling will be a lot less than 20W per blade), update the switch to support Gigabit downstream ports and 10Gbps Ethernet uplink ports, and reduce the chassis depth. I would even bet that one could find a way to fit 20 nodes in 2U of space without sacrificing any functionality or availability.

The other benefit of building out a cloud with a bunch of ARM-based server nodes is that the impact of losing one or two nodes would be minimal, compared to losing one or two virtualization hosts or UltraSPARC-T3 servers. If you were the Borg, would you rather lose a couple of drones or a Borg cube during an attack?
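The "drones versus cube" point can be made concrete with a small Python sketch. It assumes, for simplicity, that every unit in a pool carries an equal share of the load. The 240-node pool matches a rack of 12 chassis with 20 nodes each, while the pool of 10 virtualization hosts is a purely hypothetical figure chosen for contrast. The function name `capacity_lost` is also made up for illustration.

```python
def capacity_lost(total_units: int, failed_units: int) -> float:
    """Fraction of a pool's capacity lost when some equal-sized units fail."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    if not 0 <= failed_units <= total_units:
        raise ValueError("failed_units must be between 0 and total_units")
    return failed_units / total_units


if __name__ == "__main__":
    # 240 thin ARM nodes (12 chassis x 20 nodes), two nodes down.
    print(f"thin nodes: {capacity_lost(240, 2):.1%} of capacity lost")
    # Hypothetical pool of 10 large virtualization hosts, two hosts down.
    print(f"big hosts:  {capacity_lost(10, 2):.1%} of capacity lost")
```

Under these assumptions, losing two thin nodes costs under 1% of the pool, while losing two of ten large hosts costs 20%. A real deployment would also have to account for uneven load and for failures that take out a whole chassis or switch.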

As with any solution to a problem, there is always a list of concerns and gotchas. One significant downside is the cost to manage and maintain a growing cloud environment, in the form of the increasing complexity of managing hundreds or thousands of nodes and the ever-increasing need for network ports or switch layers. If each 3U chassis has a network switch and a rack can hold 12 chassis (leaving room for supporting equipment), that adds 12 switches per rack that would need to be maintained (if tasked with VLAN trunking and/or routing). With pass-through modules instead, that means 240 cables (one per node, double if both ports are used) that will need to be strung to the end of the row. Either way, that’s a lot more switch ports or switches that now have to be managed.
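The per-rack arithmetic is easy to play with in a few lines of Python. This is only a sketch: the chassis and node counts come from the paragraph above, while `uplinks_per_switch` is an assumed value (the post does not say how many uplinks each chassis switch would have), and the function name is hypothetical.

```python
def rack_network_footprint(chassis_per_rack: int = 12,
                           nodes_per_chassis: int = 20,
                           ports_used_per_node: int = 1,
                           passthrough: bool = False,
                           uplinks_per_switch: int = 2) -> dict:
    """Count the switches and cables a rack of thin nodes adds to the network.

    uplinks_per_switch is an assumption, not a figure from the post.
    """
    nodes = chassis_per_rack * nodes_per_chassis
    if passthrough:
        # Every used node port gets its own cable run to the end of the row.
        return {"nodes": nodes,
                "switches_to_manage": 0,
                "cables_to_end_of_row": nodes * ports_used_per_node}
    # One embedded switch per chassis; only its uplinks leave the rack.
    return {"nodes": nodes,
            "switches_to_manage": chassis_per_rack,
            "cables_to_end_of_row": chassis_per_rack * uplinks_per_switch}


if __name__ == "__main__":
    print(rack_network_footprint())
    print(rack_network_footprint(passthrough=True))
    print(rack_network_footprint(passthrough=True, ports_used_per_node=2))
```

With the defaults, the switched design means 12 more switches to manage per rack, while the pass-through design means 240 cables, or 480 if both Ethernet ports on each node are used.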

Another downside to using ARM-based servers for a cloud environment is the lack of support for Windows Server. To some, that’s not even an issue, as many web applications that run in the cloud already run on some form of Linux or BSD operating system and are written in an interpreted language (PHP, Python, Ruby) or Java. It does mean that you will be locked out of scaling out web applications built on the .NET Framework. At this point, virtualization is the only means of scaling out Windows Server for distributed public or private web applications.

One other point that needs to be considered is that, while many of the common web application stacks are available in source form, web applications that are distributed within a cloud environment as binaries will need to be recompiled and tested for the ARM architecture. Applications that depend on floating point performance also may not perform at the same level as they would on a server node with an x86 processor, though the same concern applies to UltraSPARC T1/T2/T3 servers (due to the limited number of floating point units available). That may be a concern in the short term, but it can be remedied in future generations of ARM-based processors.

While Intel has moved away from making ARM-based processors, that doesn’t mean Intel is out in the cold. One can easily replace all instances of “ARM-based” with “Intel Atom” and almost all of the arguments above will still be valid; not to mention, the Intel Atom processor runs x86 instructions natively. The only problem right now is that an Intel Atom processor still requires more power and more supporting resources (such as external I/O controllers) than a typical ARM-based processor aimed at server workloads would.

All of that being said, a cloud environment can contain one or more of the options mentioned above and is not limited to just virtualization, consolidation or massive scale-out of thin nodes. Virtualization definitely has its advantages in that it can be used for almost all workloads, and massively-threaded servers have been racking up performance benchmark records in some important areas.

Filed under: Blog Post