Tag Archives: Hardware Bytes

ARMy of Servers and Project Moonshot

In my first blog post about building out servers with ARM processors, I had mentioned that one could build a high-density scale-out server infrastructure by fitting 20 blades into a 3U chassis:

How would 20 ARM-based server nodes be able to fit into a 3U chassis? It has been done before, even with nodes that pull more than 15-20W each. Both Sun Microsystems and Compaq had a 3U blade server chassis that supported 20 blades (with one UltraSPARC IIe/IIi or one Pentium M processor, 1-2GB of RAM and one 2.5″ hard drive bay) and an Ethernet switch or pass-through module. One can use the same blade setup, use smaller and more efficient power supplies and cooling (as the need for power and cooling will be a lot less than 20W per blade), update the switch to support Gigabit downstream ports and 10Gbps Ethernet uplink ports, and reduce the chassis depth. I would even bet that one could find a way to fit 20 nodes in 2U of space without sacrificing any functionality or availability.

Well, HP has taken that idea and ratcheted it up to a very impressive scale. Project Moonshot crams 72 server nodes into a 2U half-width tray housing eighteen Calxeda EnergyCards and four external 10Gb/s XAUI ports. Four of those trays can slot into a 4U SL6500 chassis, for a total of 288 nodes (72 nodes per 1U). Each EnergyCard contains four EnergyCore processors with up to 4GB per processor and 4 SATA ports, all while drawing 25 watts. In turn, each EnergyCore processor comes in two-core and four-core Cortex-A9 configurations and has a high-throughput fabric switch built in. The fabric switch provides multiplexed access to five 10Gb/s XAUI ports and six 1Gb/s SGMII ports, all of which are wrapped by three 10GbE MAC ports. Each EnergyCore also provides five SATA ports, several PCIe controllers and an SD/eMMC controller (say, for booting an operating system).
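The density figures multiply out as follows. This is only a quick sanity check in Python using the numbers quoted in this post; the variable names are mine, and splitting the 25W card figure evenly across its four processors is an assumption, not something HP or Calxeda has stated.

```python
# Back-of-the-envelope check of the Moonshot density figures quoted in this post.
CARDS_PER_TRAY = 18       # Calxeda EnergyCards per 2U half-width tray
NODES_PER_CARD = 4        # EnergyCore processors (server nodes) per card
TRAYS_PER_CHASSIS = 4     # trays per 4U SL6500 chassis
CHASSIS_HEIGHT_U = 4
WATTS_PER_CARD = 25       # per EnergyCard, as quoted

nodes_per_tray = CARDS_PER_TRAY * NODES_PER_CARD
nodes_per_chassis = nodes_per_tray * TRAYS_PER_CHASSIS
nodes_per_u = nodes_per_chassis // CHASSIS_HEIGHT_U

# Assumption: the 25W is split evenly across the four nodes on a card.
watts_per_node = WATTS_PER_CARD / NODES_PER_CARD
# Power drawn by the EnergyCards alone in a fully populated chassis.
card_watts_per_chassis = CARDS_PER_TRAY * TRAYS_PER_CHASSIS * WATTS_PER_CARD

print("nodes per tray:", nodes_per_tray)
print("nodes per chassis:", nodes_per_chassis)
print("nodes per U:", nodes_per_u)
print("watts per node (even split):", watts_per_node)
print("card watts per chassis:", card_watts_per_chassis)
```

That works out to about 6.25W per node and roughly 1.8kW for the EnergyCards in a full chassis, before counting fans, the fabric and power supply losses.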

While such an impressive setup may not immediately fit into common enterprise workloads, don’t be surprised to see these things popping up at places where companies need to run an enormous number of light-to-moderate duty workloads that can be scaled out across thousands of threads, and where the cost of power and cooling is at a premium. Both Ubuntu and Fedora can be used, though Windows Server 8 might be an option if Microsoft deems it to be worth the time and money.

Facebook and Open Compute Project

I just want to get this out in the open: I am not a Facebook (or Twitter) user, nor have I been thrilled with the fast-and-loose nature of Facebook’s handling of public or private user information.

In an interesting twist to Facebook’s “openness”, Facebook has released details and documents of their Open Compute Project, which is the basis for their recent datacenter located in Prineville, Oregon. The specifications have been released by Facebook under the Open Web Foundation Agreement, while the design and implementation files are released under the Creative Commons Attribution 3.0 license. As a hardware nut who has always been interested in datacenter design and architecture, I found that the released server, rack and power component specifications made my day.

To me, it is very nice to see what was done to make the datacenter as lean and mean as possible, particularly the power supply used by each server node. The 450W power supply has two inputs, a nominal 277VAC input as primary and a 48VDC input as a backup. For typical servers, you would have two separate AC or DC power supplies that should pull power from two different sources (such as independent UPS or circuits). There are inherent inefficiencies with this setup, as you have to deal with additional losses due to having two sets of conversion components, increased cooling, and an external interposer.

Right now, the two server boards that have had their designs and specifications released are both dual-socket boards. The first board is a high-memory-capacity board that can take two AMD Opteron 6100-series processors and has 24 memory slots evenly distributed between the two sockets. The second board uses Intel Xeon 5500/5600-series processors and has the common 18 memory slot setup (9 per socket, 3 slots per channel) found in most 1U and 2U servers. Both boards are custom designs and only have the required components included, such as SATA, USB and Gigabit Ethernet. Efficiency, cost savings and simplicity are the reasons for the stark nature of these boards, and they are the priorities for large scale-out compute systems.

Another unique feature of both boards is how power is provided; rather than using the more common ATX-style connectors (which are available for testing purposes), the designs call for an edge-mounted connector of the kind more likely found in Cisco Catalyst 6500-series modules or various blade servers. Again, simplicity and efficiency are critical.

As intriguing as the hardware is, it may not be as practical in more common small or medium business environments. Nonetheless, some of the design elements are already found in blade servers or industrial applications. It is the datacenter design elements that will have more of an impact in the near future, as IT continues its march towards cost and energy efficiency.

Edit: BTW, how cool are LED lights powered over Ethernet?

MacBook Air 13″: Why not a MacBook Pro 13″?

While I was doing my due diligence on getting a netbook-like device that would complement my primary laptop, a Lenovo ThinkPad W510, I did consider a MacBook Pro 13″ with Thunderbolt. The price of a MacBook Pro 13″ with the base 2.3GHz Core i5, 4GB of RAM, 128GB SSD, iWork and AppleCare would be nearly the same as that of the MacBook Air 13″ that I purchased.

So, why did I choose to go with a slower and non-upgradeable device (the SSD can be upgraded to a 256GB version, but that’s about it)? Simple: the MacBook Air is roughly 1.5 pounds lighter and roughly one-third of an inch thinner at its thickest part (a good 0.84 inch difference at the thinnest point, but that’s not a true apples-to-apples comparison, and that pun was intended). Let’s not forget that the MacBook Air 13″ display has a native resolution of 1440×900 while the MacBook Pro 13″ display has a native resolution of 1280×800. That difference is noticeable and the extra pixels are very useful when working through a stack of photos in Lightroom 3. Another thing that I haven’t missed on my MacBook Air is an optical drive. If I’m on the go and need to pop in a CD or DVD, I’ll probably pick up a portable drive (probably not the one from Apple); while at home, I can use the remote disc feature and use the optical drive in my ThinkPad or another computer.

If I had to give up my ThinkPad W510, which I wouldn’t, and needed to downsize, then I would have opted for another ThinkPad, this time a T-series with a 14″ display rather than go with a MacBook Pro. Yes, there is still some overlap between the new MacBook Air and my ThinkPad W510 for some tasks, like browsing or reading e-mail; both of those tasks are nicer on the MacBook Air due to its feather-weight and form factor. For heavy duty tasks, like virtualization, large batches of photo editing or workflows, the W510 wins hands down.

First impressions: Apple MacBook Air 13″

In my last couple of posts, I kind of hinted that I got a new device and did not want to spoil the fun until I had a chance to get settled with it. Well, as you can see in the title of this post, the new device is in fact an Apple MacBook Air 13″.

Now, some of you might be ready to pelt me with rotten tomatoes or are wondering why in the world I would buy a new Apple product after lambasting Apple on their treatment of iPhone users. Well, my reasons and rationalizations may not be enough to cover all of the concerns and questions.

For the past year, I have been wanting to pick up a netbook, so that I could have a lighter-weight alternative to my 6-odd pound Lenovo ThinkPad W510 mobile workstation. The idea is that I could carry a netbook around with me so that I can download photos that I’ve taken with my Nikon D300, then preview them and delete the ones that I didn’t want.

I looked at the various 11″ netbooks from Asus, Acer, Lenovo, Dell and HP, both products targeted at consumers and those targeted at business users. I even considered a couple of netbooks in tablet/slate form, but none of them really had the same robust feel as a proper Lenovo ThinkPad T or X series or HP EliteBook series laptop. The screens that were pretty cruddy for inspecting photos, the mostly plastic shells and not-so-impressive screen hinges, the 2GB memory limit (some business-class netbooks could take 4GB, but would also come with Windows 7 Professional), the meh-tastic stock SSDs, and other niggles all failed to inspire confidence in something I would spend my money on.

Ever since the current generation of MacBook Air laptops came out, I have been interested in what it could potentially offer. I really liked the 13″ screen (the lower resolution of the 11″ model was befitting for a premium netbook, but was not good enough), the lightweight yet sturdy industrial design, and the fact that it actually has a good, non-clicky keyboard. The two things that had kept me from seriously considering the MacBook Air were the cost and the fact that it is an Apple product.

So back I went to look at what Dell, Lenovo or HP could offer with their ultra-portable laptops, and I even considered a convertible tablet from Lenovo or Dell. The problem that I ran into was the base cost of the ultra-portable laptop or tablet, without 4GB of RAM or an SSD. Considering that whatever I wanted to buy would have to be able to take a little bit of abuse now and then, the chassis had to be made from metal and/or alloy and the machine had to have a decent SSD. Even with a Dell Premier discount from work, an E4310 with the base Core i5-560M, 4GB of RAM, 128GB mobility SSD, 3-year warranty and Intel 802.11n wireless would still cost a bit more than the MacBook Air 13″ that I wanted (more on that in a bit). Also, the E4310 would end up about 1 pound heavier than the MacBook Air 13″, including the AC adapter. I’m also still not sold on the overall build quality or the performance of the Dell SSD. Of course, I could order it with a base hard drive, then buy and slap in a mid-range OCZ Vertex 2 SSD, and it would end up costing about the same (+/- $50).

Two Sundays ago, I stopped by a local Apple Store, again, to take a look at and play with a MacBook Air 13″. I was sold on the form factor, the unibody aluminum case, the (glossy) screen and the overall performance. I was still skittish about the price, but remembered that I had an Apple iPhone gift card that could be used on an iPhone or any other product at an Apple Store or Apple’s online store. That helped lessen the hit to the wallet.

By the time I went home, I decided to go with the MacBook Air 13″ base model (which has a Core 2 Duo 1.86GHz/6MB cache, 2GB of RAM and a 128GB SSD), but wanted to upgrade the RAM to 4GB and add a USB Ethernet adapter, iWork suite and requisite AppleCare. The total ended up just under $1800 with 2-3 day shipping and before applying the gift card.

The laptop arrived Friday morning, at which point I immediately and carefully opened the box, plugged it in to charge up the battery and connected the USB Ethernet adapter to the guest network at work. I powered it on and the Apple startup sound boomed a bit too loud (I feel dirty each time I hear it), then it proceeded to walk me through the initial setup. Once that was all done with, I opted to start pulling down the latest updates, which totaled around 230MB and took about 45 minutes to download and install. I rebooted it and was back at the desktop in under a minute. That’s faster than the time it takes for my ThinkPad to accept my fingerprint, power up and get to the “Windows is starting” animation. Granted, my ThinkPad has to spin up a hard drive and do a quick scan of the four 4GB memory modules :D

I have been using the MacBook Air 13″ as my primary laptop over the past couple of days, making sure that I downloaded and installed Firefox 4 RC, Google Chrome, Sequel Pro and Cyberduck, and went through all of the System Preferences to get everything set up the way that I wanted. I also caved in and configured the default Mail client to access my personal and Gmail mailboxes, as well as using iChat as the IM client of choice. I had to install and trust the self-signed CA certificate that I use on my personal servers so that the various browsers and applications would be happy. I have run into a couple of glitches with the iChat application and the Google Talk tunnel I have set up on my personal Jabber server. I also set up the Address Book and iCal applications to sync up with my calendar and contacts hosted by Google.

The low weight and thin form factor almost make my geeky want of any tablet disappear. Sure, it is larger and weighs more than a Motorola Xoom, Samsung Galaxy Tab or the Apple iPad/iPad 2 and takes a bit longer to “boot up”, but it is so much more usable for personal and work purposes.

It has been several years since I used Mac OS X on a regular basis, so I had to re-learn the OS X nuances and differences compared to Windows 7 and the GNOME desktop that comes with Ubuntu 10.04 LTS. I also have not had a chance to install Nikon Transfer and Adobe Photoshop Lightroom 3, but that’ll have to wait for another day or two.

I have yet to use the new Apple App Store for OS X, as I have found all of the applications that I care for on the projects’ websites. The only application equivalent that I haven’t really found yet is something akin to Foobar 2000 on Windows. I tried Clementine, but it kept on throwing errors when trying to play FLAC files off of my file server or ones that I’ve copied to the local disk. I would prefer not to have to deal with iTunes or QuickTime, so I may have to switch over to VLC as my music player.

I am also still getting used to the multi-touch gestures and the two-finger “right-click”. The trackpad itself is quite nice and is a step up from either of my ThinkPad laptops (sorry IBM/Lenovo). It almost negates the need for a trackball for most tasks, but may not be ideal for working with digital photography workflow applications.

All in all, I am impressed with the MacBook Air 13″ and I feel that it is almost worth the cost. The only cost that I really have to complain about is the AppleCare for the laptop. $249 isn’t a bad price, but I’m kind of spoiled on the Dell or HP business support contracts and the fact that AppleCare does not really provide an on-site option.

Is it a perfect device for my needs? I can’t really say right now, but ask me again in about six months. Am I feeling buyer’s remorse yet? No, not yet. Can I recommend it to everyone? Not really, as it is not the best bang for the buck for a device that is only used for e-mail, web browsing and watching the occasional movie or TV show. If you want something that feels like a professional-class laptop and weighs just about 3 pounds, the MacBook Air in either form should at least be considered.

…and yes, I am still looking forward to replacing my iPhone 3G with an HTC Evo Shift 4G.

Mini-review: D-Link DAP-2553

Both my primary laptop and my new redacted support dual-band 802.11n wireless and I have not been thrilled with the performance of 802.11g in general. Another issue with being stranded in the 2.4GHz band is that there are a lot of access points in my area and trying to get a decent, clear channel is nigh impossible.

So, I started shopping for a dual-band 802.11n (draft or not) access point and really wanted to pick up a Cisco Aironet 1140 802.11a/b/g/n access point. I deal with Cisco Aironet 1231G and 1242G access points on a daily basis at work, so rooting around in IOS would not be an issue for me. The problem? It would cost a blistering $650-670 and that is without a maintenance subscription. Way too much and would not work too well, as it is meant to be hung on a wall or clipped to the support bars of a false ceiling.

I also considered Cisco’s entry SMB wireless access point, the AP 541N, but I did not hear a lot of good things about it, nor would I want to pay the Cisco premium for a device on which I could not jump into an IOS, PIX/ASA or NX-OS shell. I also looked at a Cisco WAP4410N and, while it is promising in terms of features, it does not support the 5GHz band that I so wanted.

So I finally settled on the D-Link DAP-2553 dual-band access point, and it was about the right price. I got the access point last night and started setting it up so that the first SSID would run on the 5GHz band, then ran into a lovely limitation: the access point can only use one of the two bands, but not both at once. Lame. For now, I have to leave my older 802.11g access point up and running for devices that do not support the 5GHz band, and that includes: iPhone 3G, Evo 4G, the iPhone 3G replacement (Evo Shift 4G), Sony PSP, Squeezebox, Nintendo Wii and Nintendo DS/DSi XL.

While not optimal, it does allow me to at least somewhat segregate the non-802.11n devices so that they do not destroy the bandwidth for the 802.11n devices. Also, I do not really care about squeezing out the last 1Mbps on those devices either.

With the latest firmware available for the DAP-2553, I was able to finish setting up the access point so that it would use my home NTP server (had to be entered as an IP address rather than hostname). I haven’t spent too much time to see if the device supports sending logs out to a Syslog server, something that my 802.11g access point can do.

In terms of performance, I still need to test copying a large ISO from my file server to my two laptops. I did notice a drop in overall latency when on wireless, versus wired, when working over SSH connections to my servers at home. Both devices see a full signal from the access point (whether the access point sees a full signal from the devices, I don’t know yet) and negotiate at the full 300Mbps speed. Even though the run from the new access point to the switch and the long haul to the main switch are all Gigabit (I only have one 100Mbps switch, and that’s the one integrated into my Cisco PIX 501 firewall), my DSL connection is still like a stupidly small straw.

At the end of the day, I am disappointed that, unlike a proper Cisco Aironet, the D-Link device cannot use both the 2.4GHz and 5GHz bands at once, which would let me collapse all of my wireless devices on to one access point. One of these days, I’ll pick up a proper Cisco Aironet or equivalent device that can run both bands and do proper roaming.

Sixteen-Core Intel Atom for Server Workloads?

Recently, Microsoft made some news by asking for a special, multi-core (sixteen cores to be exact) version of Intel’s efficient Atom processors to be used in servers. After thinking about it for a couple of days, the idea made a lot of sense. I know, it’s rare that I agree with Microsoft :)

Why so many cores? If you consider the kinds of workloads that application servers must deal with, the server must handle a large number of connections and requests, and it tends to idle while waiting for data to be crunched by other application or database servers. The actual data crunching that the application servers, particularly web app and content servers, need to do before sending back the results is not all that difficult. The large number of cores would facilitate the large number of threads required to handle many thousands of requests per minute.
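One way to put rough numbers on this is Little’s law: the average number of requests in flight equals the arrival rate multiplied by the average time each request spends in the server. The sketch below uses made-up placeholder figures (30,000 requests per minute, 50 ms per request, 5 ms of which is CPU work); they are not measurements from any real server, and the function name is mine.

```python
def average_in_flight(requests_per_minute: float, seconds_per_request: float) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    requests_per_second = requests_per_minute / 60.0
    return requests_per_second * seconds_per_request

# Hypothetical workload: each request holds a thread for 50 ms in total,
# but only 5 ms of that is CPU work; the rest is waiting on a backend.
REQUESTS_PER_MINUTE = 30_000
TOTAL_SECONDS = 0.050
CPU_SECONDS = 0.005

threads_in_flight = average_in_flight(REQUESTS_PER_MINUTE, TOTAL_SECONDS)
cores_busy = average_in_flight(REQUESTS_PER_MINUTE, CPU_SECONDS)

print(f"threads in flight on average: {threads_in_flight:.1f}")
print(f"cores' worth of CPU actually busy: {cores_busy:.1f}")
```

With these placeholder numbers, about 25 requests are in flight at any moment while only about 2.5 cores’ worth of CPU is busy, which is the shape of workload where many modest hardware threads are a better fit than a few fast ones.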

Why use the somewhat lackluster Atom processor? The Atom processor may be a bit anemic for desktop or laptop duties, where you have numerous workloads going on at once, including rendering graphics, playing music or videos, web browsing and photo editing. On a web server, the in-order execution of the Atom processor does not have as much impact on an individual request level. Another benefit of using an Atom processor core over a Xeon core is power consumption. A desktop or server-optimized dual-core Atom processor has a TDP of less than 15W, versus a dual or quad-core Xeon’s TDP of 60-80W (even more when you look at the X models).
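To make the power comparison a bit more concrete, here is a rough per-core split of the TDP figures quoted above. Dividing a package TDP evenly by core count is a simplification on my part, since TDP also covers caches, memory controllers and other uncore logic, and the 15W Atom figure is an upper bound (“less than 15W”).

```python
def watts_per_core(tdp_watts: float, cores: int) -> float:
    """Naive even split of a package TDP across its cores."""
    return tdp_watts / cores

# Figures quoted in the post; the even split is an assumption.
atom_dual_core = watts_per_core(15, 2)                        # upper bound
xeon_quad_core = (watts_per_core(60, 4), watts_per_core(80, 4))
xeon_dual_core = (watts_per_core(60, 2), watts_per_core(80, 2))

print("Atom, dual-core: under", atom_dual_core, "W per core")
print("Xeon, quad-core:", xeon_quad_core, "W per core")
print("Xeon, dual-core:", xeon_dual_core, "W per core")
```

Even on this crude basis, an Atom core comes in at under half the power of a core in a 60W quad-core Xeon.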

By taking advantage of the low power requirements of each Atom core, some of the latest fabrication processes and the proliferation of serial interconnects (PCI Express, SATA/SAS, 10GbE), building sixteen Atom cores plus memory controllers and I/O controllers on to one processor package is not too difficult to do. In fact, I put together a basic diagram of what such a processor might look like:

The processor package would include five or seven dies, four of which would each contain four 64-bit capable Atom cores with HyperThreading and an intermediate memory and I/O crossbar. The other three dies could be combined into one, with a central component providing buffered memory interfaces (or SMI in Intel terminology), IPMI for management, and high-speed links to two I/O hubs. Each I/O hub would provide external I/O interfaces, such as PCI Express, 6Gbps SAS/SATA and four 2.5Gbps 8b/10b links. The four 2.5Gbps 8b/10b links can be joined together to provide one 10Gb Ethernet port or four 1Gb Ethernet ports. The only other components a server manufacturer would need to include are a SoC for remote management (see: ILO, DRAC and ILOM) and possibly a USB controller to provide local media or serial console access by way of a converter.
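The link arithmetic can be checked quickly. The sketch below assumes that the 2.5Gbps figure is the payload rate per lane, which is how XAUI works (four lanes, each signalling at 3.125 Gbaud with 8b/10b encoding); if 2.5Gbps were instead the raw line rate, the usable total would be lower. The function and constant names are mine.

```python
from fractions import Fraction

# 8b/10b sends 10 line bits for every 8 data bits.
LINE_BITS_PER_DATA_BIT = Fraction(10, 8)
LANES = 4

def line_rate_gbaud(payload_gbps: Fraction) -> Fraction:
    """Signalling rate needed to carry a given payload rate over 8b/10b."""
    return payload_gbps * LINE_BITS_PER_DATA_BIT

# Mode 1: four lanes bonded into one 10Gb Ethernet port (XAUI-style).
lane_payload_10g = Fraction("2.5")   # assumption: 2.5Gbps is payload, not line rate
bonded_payload = LANES * lane_payload_10g
lane_line_rate_10g = line_rate_gbaud(lane_payload_10g)

# Mode 2: each lane run as an independent 1Gb Ethernet port.
lane_payload_1g = Fraction(1)
lane_line_rate_1g = line_rate_gbaud(lane_payload_1g)

print("bonded payload (Gbps):", float(bonded_payload))
print("per-lane line rate, 10GbE mode (Gbaud):", float(lane_line_rate_10g))
print("per-lane line rate, 1GbE mode (Gbaud):", float(lane_line_rate_1g))
```

Under that assumption the same four lanes give either one 10Gbps port or four 1Gbps ports, with the lanes clocked at 3.125 or 1.25 Gbaud respectively.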

To some, this discussion may trigger a sense of deja vu. This has in fact been discussed and done before, except with UltraSPARC processor cores rather than Atom processor cores. The resulting product line is the UltraSPARC T series of processors. The first generation was the UltraSPARC T1, which had eight cores sharing an I/O crossbar, memory controller and floating point unit. Each in-order processing core had the facilities to handle four threads concurrently, for a total of 32 threads. It is kind of a coincidence that a sixteen-core Atom processor would also be able to handle 32 threads with the help of HyperThreading.

The UltraSPARC T1 debuted to mixed reviews: it performed beautifully in naturally multi-threaded environments but suffered under heavy, single-threaded application workloads. The Atom processor ran into some of the same criticism, which was exacerbated by the fact that the first Atom processors only had one core and HyperThreading only partially helped when an additional thread was introduced to the workload.

Sun later improved on the design with the UltraSPARC T2, which integrated not only a PCI Express controller, but also a dual-port 10Gb Ethernet controller, and used fully buffered memory (a bit less efficient than DDR3 via SMI buffers, but it helped reduce pin counts). The limit of four concurrent threads per core was lifted to eight, and the shared floating point unit was replaced with one unit per core (which is then shared across the eight threads per core). A second version of the UltraSPARC T2 would later come out to support multiple sockets, at the expense of the 10Gb Ethernet controller, which migrated from being on-package to being located on the system board.
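The thread counts mentioned in the last few paragraphs are easy to tabulate. In this small sketch the class and field names are mine, the sixteen-core Atom is of course hypothetical, and the eight-core count for the UltraSPARC T2 is not stated in this post (it is the core count of the full T2 part).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chip:
    name: str
    cores: int
    threads_per_core: int

    @property
    def hardware_threads(self) -> int:
        # Total threads the chip can run concurrently.
        return self.cores * self.threads_per_core

chips = [
    Chip("UltraSPARC T1", cores=8, threads_per_core=4),
    # Core count for the T2 is supplied here, not taken from the post.
    Chip("UltraSPARC T2", cores=8, threads_per_core=8),
    # Hypothetical part: sixteen Atom cores with two-way HyperThreading.
    Chip("16-core Atom (hypothetical)", cores=16, threads_per_core=2),
]

for chip in chips:
    print(f"{chip.name}: {chip.cores} cores x {chip.threads_per_core} "
          f"threads = {chip.hardware_threads} hardware threads")
```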

With the re-designed processor, the UltraSPARC T2 continued to beat up other processors in thread-heavy workloads and even conquered several key Oracle benchmarks. The processor still had a slight weakness in single-threaded applications, but that was mostly hidden by an increase in clock speed. The processor was improved once more, now in the form of the UltraSPARC T3.

In short, the idea of creating a processor with many, many in-order cores, each able to handle two or more concurrent threads, is not a new idea, nor is it one doomed to fail. In fact, such a processor might be a significant boon for those looking to consolidate and/or virtualize web front-end or web application servers.

Intel, please heed Microsoft’s call and build this processor. If not Intel, will you do it, AMD?

P.S.: I know this is a departure from my recent advocacy of building ARM processors explicitly for server workloads, but the two are not mutually exclusive. In fact, many of the ARM processor designs are based on an in-order execution design and require very little power to run. Having both an Atom-based design (or a Bobcat-based design if AMD were to join in) and an ARM-based design would ignite much needed innovation and competition in the server market. Also, an Atom-based design would allow Microsoft Windows-based servers to be deployed.