Install Apache Hadoop on Ubuntu on Single Cloud Server Instance

By Abhishek Ghosh, January 21, 2017

Previously, we talked about the Apache Hadoop framework. Here is how to install Apache Hadoop on Ubuntu on a single cloud server instance in stand-alone mode, with minimum system requirements and commands. Apache Hadoop is designed to run on standard dedicated hardware that provides the best balance of performance and economy for a given workload.

 

Where Will I Install Apache Hadoop?

 

For a cluster, two quad-core or hexa-core (or higher) CPUs running at least 2 GHz with 64 GB of RAM are expected. We are installing a single-node cluster, for which a virtual instance with a minimum of 6-8 GB of RAM is practical. You can try a VPSDime 6 GB OpenVZ instance at $7/month. However, Hadoop is written in Java, and OpenVZ is not exactly great for running Java applications; the host can kick you out if you drive their machine to a high load average. If you want VMware, Aruba Cloud is cost effective and great. You can do testing and learning work on OpenVZ, but it is not practical to run high-load work on it.

 

Steps To Install Apache Hadoop on Ubuntu on Single Cloud Server Instance

 

We will install a single-node Hadoop cluster on Ubuntu 16.04 LTS. First, prepare the system :

cd ~
apt update
apt upgrade
apt install default-jdk

OpenJDK is the default Java Development Kit on Ubuntu 16.04. Now check the Java version :

java -version

Sample output :

openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~16.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

We will create a group named hadoop and add a user named hduser :

sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
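
As a quick optional check (not in the original steps) that the user and group were created :

id hduser   # the groups list should include hadoop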

Next, we will install extra software, switch to hduser, generate a key, and set up passwordless SSH for hduser on localhost :

apt install ssh rsync
su hduser
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
exit
exit  # leave the test SSH session, then the hduser shell, returning to the sudo-capable user
sudo adduser hduser sudo
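
To confirm that passwordless SSH works for hduser, an optional check; BatchMode makes ssh fail instead of prompting for a password :

su hduser -c 'ssh -o BatchMode=yes localhost true' && echo 'passwordless SSH for hduser works'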

Here are the releases of Apache Hadoop :

http://hadoop.apache.org/releases.html
https://dist.apache.org/repos/dist/release/hadoop/common/

Apache Hadoop 2.7.3 is the latest stable release at the time of publishing this guide. We will do these steps :

wget https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar xvzf hadoop*
rm hadoop-2.7.3.tar.gz
cd hadoop-2.7.3
sudo mkdir -p /usr/local/hadoop
sudo mv * /usr/local/hadoop
sudo chown -R hduser:hadoop /usr/local/hadoop

/usr/bin/java is a symlink to /etc/alternatives/java, which is a symlink to the default Java binary. We need the correct value for JAVA_HOME :

readlink -f /usr/bin/java | sed "s:bin/java::"

If the output is :

/usr/lib/jvm/java-8-openjdk-amd64/jre/

then we should open :

nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

and adjust :

/usr/local/hadoop/etc/hadoop/hadoop-env.sh
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
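
If you prefer to set this value non-interactively, a minimal sketch using GNU sed (it assumes the readlink output shown above; verify the path on your own system first) :

JAVA_PATH=$(readlink -f /usr/bin/java | sed "s:bin/java::")
# rewrite the existing (possibly commented-out) JAVA_HOME line in hadoop-env.sh
sed -i "s|^#\?export JAVA_HOME=.*|export JAVA_HOME=${JAVA_PATH}|" /usr/local/hadoop/etc/hadoop/hadoop-env.sh
grep "^export JAVA_HOME" /usr/local/hadoop/etc/hadoop/hadoop-env.sh   # confirm the change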

Now if we run :

/usr/local/hadoop/bin/hadoop

We will get output like :

Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  CLASSNAME            run the class named CLASSNAME

Up to this step is the minimum, basic setup of Apache Hadoop on Ubuntu on a single cloud server instance. Hadoop is now ready to be configured.

 

Configuring Apache Hadoop

 

We need to modify the following files to get a complete Apache Hadoop setup:

~/.bashrc
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
/usr/local/hadoop/etc/hadoop/core-site.xml
/usr/local/hadoop/etc/hadoop/mapred-site.xml.template
/usr/local/hadoop/etc/hadoop/hdfs-site.xml

Run :

update-alternatives --config java
nano ~/.bashrc

Add these :

~/.bashrc
#HADOOP START
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre   # from the readlink output above
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP END

Save the file.
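These exports only take effect in a new shell; to apply them to the current session (a small addition to the original steps), reload the file :

source ~/.bashrc
echo $HADOOP_INSTALL   # should print /usr/local/hadoop

Then run :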

javac -version
which javac
readlink -f /usr/bin/javac

Note the values; /usr/bin/javac is the output of the which javac command. Run :

nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Modify :

/usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre

The value above comes from the previous outputs; do not blindly copy-paste. Save the file. Now do these :

sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp
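
An optional ownership check; a wrongly owned temporary directory is a common cause of later failures :

ls -ld /app/hadoop/tmp   # should show hduser and hadoop as owner and group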

Open :

nano /usr/local/hadoop/etc/hadoop/core-site.xml

Modify :

/usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
 
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
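
Optionally, confirm that Hadoop picks up the new value (this assumes the PATH exports from ~/.bashrc above are active in your shell) :

hdfs getconf -confKey fs.default.name   # expected: hdfs://localhost:54310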

Run :

cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
nano /usr/local/hadoop/etc/hadoop/mapred-site.xml

Modify :

/usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>
</configuration>

Run :

mkdir -p /usr/local/hadoop_store/hdfs/namenode
mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hadoop_store

Open :

nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Modify :

/usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
<property>
   <name>dfs.namenode.name.dir</name>
   <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
   <name>dfs.datanode.data.dir</name>
   <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
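
A similar optional check for the HDFS settings (same assumptions as before) :

hdfs getconf -confKey dfs.replication            # expected: 1
hdfs getconf -confKey dfs.namenode.name.dir      # expected: file:/usr/local/hadoop_store/hdfs/namenode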

Try to run :

cd ~
hadoop namenode -format

The above command must be executed before we start using Hadoop. These commands are really intended for a physical server. You can read this guide :

https://wiki.apache.org/hadoop/Virtual%20Hadoop

The last command can fail under a given host-virtualisation technology. For that reason, in the last step we will show how to use the bundled MapReduce example program; if the format command fails, you can still use Hadoop in that way. Since new users often have a limited budget, we tried to emulate a physical server for learning while also offering a universally working example.


Now, from a fresh SSH session, we can start Hadoop :

sudo su hduser
cd /usr/local/hadoop/sbin && ls
start-all.sh
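
To verify that the daemons actually started, jps from the JDK lists the running Java processes (the PIDs will differ; this assumes start-all.sh above succeeded) :

jps
# typical entries include NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager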

Once the daemons are running, on localhost you can browse to :

http://localhost:50070/
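
On a headless cloud server without a browser, a quick check from the shell (assuming the daemons are running) :

curl -s http://localhost:50070 | head -n 5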

Replace localhost with the server's fully qualified domain name to view the page remotely. We have successfully configured Hadoop to run in stand-alone mode. Next, we will run the example MapReduce program. Run :

mkdir ~/input
cp /usr/local/hadoop/etc/hadoop/*.xml ~/input
/usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep ~/input ~/grep_example 'principal[.]*'
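
Once the job finishes, the results are written to the output directory given above. To view them (assuming the job completed without errors) :

cat ~/grep_example/*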

It is not possible to cover more in this guide; you may read further here :

https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
