Hadoop-based Data Analytics on IBM SmartCloud Enterprise
Cloud computing and big data analytics are two areas of technology that are currently receiving a great deal of attention:
* Cloud computing provides the benefits of elasticity, on-demand access to resources and utility-like billing.
* Big data processing and analytics using Hadoop provide a framework that takes advantage of these resources by distributing the workload across a cluster of computers.
This article explains how to get started using Hadoop on IBM® SmartCloud Enterprise. You will learn how to set up a three-node cluster and verify that it is working. With the cloud and Hadoop, you can process large amounts of structured or unstructured data in a timely manner. Hadoop was not designed for virtualized environments such as those the cloud provides, and a job run on physical nodes is likely to perform better than the same job run on virtualized cloud nodes. Nevertheless, the cloud offers an environment that is easy to set up and cost effective, and it lets any kind of user run a Hadoop job and work with big data, which was not possible for most users in the past.