It’s been about two and a half years since Enkitec took delivery of our first Exadata. (I blogged about it here: Weasle Stomping Day) Getting our hands on Exadata was very cool for all of us geeks. A lot has changed since then, but we’re still a bunch of geeks at heart, so this week we indulged our geekdom once again with the delivery of our Big Data Appliance (BDA). In case you haven’t heard about it, Oracle has released an engineered system that is designed to host “Big Data” (which is not my favorite term, but I’ll have to save that for some other time). The Hadoop ecosystem has taken off in the last couple of years and this is Oracle’s initial foray into the arena. The BDA comes loaded with 18 servers, each sporting 36 terabytes of storage, for a whopping total of 648 terabytes. It also comes with Cloudera’s distribution of Hadoop (and software from various other open source projects that are part of the Hadoop ecosystem). We’re very excited to start working with Cloudera and Oracle on this new approach to managing large data sets. Anyway, here are a few pictures:
The rack’s pretty heavy with all the disk drives. One of the delivery guys said he had a full rack of EMC drives that actually fell through the floor of the office building they delivered it to (no one was hurt). Fortunately we didn’t have any mishaps. And at a couple of thousand pounds, we will not be moving it around to see how it looks next to the coffee table (like we do with slightly less heavy pieces of furniture at home).
This is me and my buddy Pete Cassidy (Oracle Instructor Extraordinaire) messing around.
Another picture of me and Pete. Not as good as the other one, but I love the shoes!
Tim Fox loves the big power cables.
The BDA also has a handy beer shelf (this is the top secret new feature).
The BDA cabinet has a lock, and of course the keys were in a well-labeled plastic bag. I had Andy Colvin hold up the label so I could take his picture. I called the shot “DoorKeyAndy”, which seemed appropriate ;)
We started on an interesting mad scientist kind of project a couple of days ago.
One of our long time customers bought an Exadata last month. They went live with one system last week and are in the process of migrating several others. The Exadata has an interesting configuration. The sizing exercise done prior to the purchase indicated a need for 3 compute nodes, but the data volume was relatively small. In the end, a half rack was purchased and all four compute nodes were licensed, but 4 of the 7 storage servers were not licensed. So it’s basically a half rack with only 3 storage servers.
Meanwhile, we had been talking with them about Hadoopie kind of stuff. They are in the telecom space and are interested in pulling data via a packet sniffer that captures info directly from the TCP traffic. During the talks we discussed hardware requirements for building a Hadoop cluster, as they didn’t really have any spare hardware available to test with. That’s when the crazy science project idea was born. Someone (who shall remain nameless) suggested that we build the pilot Hadoop cluster on the 4 unused storage nodes from the Exadata half rack. Since the storage servers use basically the same hardware as the Oracle Big Data Appliance (BDA), it’s kind of like having a mini BDA. Of course the storage servers have slower CPUs and a little less memory, so it’s not apples to apples, but the servers do have InfiniBand and the same 3T drives, so it’s pretty similar. And since they already had the servers sitting there …
So now we have a mini Hadoop cluster installed (CDH3) with 3 data nodes (roughly 100T raw storage). We also set up the Oracle Big Data Connectors on one of the Exadata compute nodes, which allows us to create external tables on files stored in HDFS. Pretty cool. Let the games begin!
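For anyone wondering where the “roughly 100T” comes from, here is a quick back-of-the-envelope sketch. It assumes 12 drives per storage server (which matches the 36 terabytes per BDA server mentioned in the other post) and an HDFS replication factor of 3, which is the usual default but not something confirmed for this cluster. The function names are just made up for illustration.

```python
# Rough capacity math for the mini Hadoop cluster (a sketch, not a sizing tool).

DATA_NODES = 3         # from the post: 3 data nodes
DRIVES_PER_NODE = 12   # assumption: 12 drives per storage server
DRIVE_TB = 3           # from the post: 3T drives
REPLICATION = 3        # assumption: HDFS default replication factor


def raw_tb(nodes, drives_per_node, drive_tb):
    """Total raw disk capacity in terabytes."""
    return nodes * drives_per_node * drive_tb


def usable_tb(raw, replication):
    """Approximate capacity left once every block is stored `replication` times."""
    return raw / replication


if __name__ == "__main__":
    raw = raw_tb(DATA_NODES, DRIVES_PER_NODE, DRIVE_TB)
    print(f"raw: {raw} TB, usable at {REPLICATION}x replication: "
          f"{usable_tb(raw, REPLICATION):.0f} TB")
```

Under those assumptions the raw number lands a bit over 100T, and the amount of unique data the cluster can hold is about a third of that, before leaving any room for temp space or the OS.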
Oh, and by the way, I’ll probably be talking about this project a bit at E4 (Enkitec Extreme Exadata Expo) on Aug. 13-14 in Dallas.