The normal way of using Hadoop, Spark, Pig and similar tools is to type the commands to install them yourself; we have published numerous guides on this website for installing each of them over SSH. A second way is to use a ready-to-use third-party service where they come preinstalled, essentially like IBM's free Demo Cloud. A third way is to use a virtual appliance that ships with those software packages preconfigured, so it is easy to run on localhost or a remote server. In this guide we are talking about the third way. Here is how to get started with the Hortonworks Sandbox for ready-made Hadoop, Spark, Pig and so on, avoiding the repeated typing of installation commands every time we need a fresh installation for whatever reason. Why Cloudera and Hortonworks exist is discussed in a separate article. Manual installation over SSH suits selective work, such as learning or a heavy-duty project; for everyday work with a fresh installation, something like the Hortonworks Sandbox is practical.
Hortonworks Sandbox For Ready-Made Hadoop, Spark, Pig etc
The Hortonworks HDP Sandbox includes Apache Hadoop, Apache Spark, Apache Hive, Apache HBase and many more Apache data projects, whereas the Hortonworks HDF Sandbox covers Apache NiFi, Apache Kafka, Apache Storm, Druid and Streaming Analytics Manager. Commonly we need Hortonworks HDP. It is normal to compare Hortonworks with Cloudera :
We are most likely interested in running the Hortonworks Sandbox on a virtual machine – either on localhost or on a remote server. We can get the Hortonworks Sandbox working on VirtualBox, VMware or Docker. These links will help you :
At least 4 GB of RAM is needed for the basic applications; Ambari, HBase, Storm, Kafka or Spark need at least 10 GB of RAM.
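Before deciding which services to enable, it helps to check how much RAM the host actually has. A minimal sketch for Linux hosts follows; it reads `/proc/meminfo`, so it will not work on macOS or Windows, and the thresholds are simply the 4 GB and 10 GB figures mentioned above :

```shell
#!/bin/sh
# Print total system RAM in whole GB by reading /proc/meminfo (Linux only)
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
total_gb=$((total_kb / 1024 / 1024))
echo "Total RAM: ${total_gb} GB"

# Compare against the sandbox guidance: 4 GB basic, 10 GB for heavy services
if [ "$total_gb" -lt 4 ]; then
  echo "Below the 4 GB minimum for basic sandbox use"
elif [ "$total_gb" -lt 10 ]; then
  echo "Enough for basic use, but not for Ambari, HBase, Storm, Kafka or Spark"
else
  echo "Meets the 10 GB guidance for the heavier services"
fi
```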
There is also a PDF for newbies on using Hortonworks on VirtualBox :
We guess this much information is enough to get started.
Follow the Author of this article :