Apache Flink is a big data processing engine that can run in both streaming and batch mode. data Artisans is the company founded by the original creators of Flink. Flink started as a project called Stratosphere, which was forked, and the fork became Apache Flink. Flink can be deployed on a local machine, on a cluster (it can run on YARN), or in the cloud. The core of Apache Flink is a distributed streaming dataflow engine written in Java and Scala. Flink natively supports the execution of iterative algorithms. Programs for Flink can be written in Java, Scala, Python, and SQL. Flink has no data storage system of its own; it provides data source and sink connectors for Apache Kafka, HDFS, Apache Cassandra, Elasticsearch, Amazon Kinesis, and others. Apache Beam is a shared programming model for which Flink is one of the backends.
Doesn't Apache Spark do a similar job? Yes, Flink is not alone. That is why we published an article named Apache Spark Alternatives To Overcome Integrity Issues. From that previous article:
Apache Flink is considered a powerful competitor to Apache Spark. Spark is based on resilient distributed datasets (RDDs). Flink is optimized for cyclic or iterative processes by using iterative transformations on collections. Flink is also a strong tool for batch processing.
However, this article is not a comparison. Let us move on to the steps to install Apache Flink on Ubuntu Server. We say “Ubuntu Server” to mean “no GUI”; you may use a local machine or even Ubuntu Bash on Windows 10 to test.
An Apache Hadoop installation is not mandatory to use Flink. Hadoop is needed only if you plan to run Flink on YARN or to process data stored in HDFS.
Steps To Install Apache Flink on Ubuntu Server
Let us update and upgrade as the root user:
apt update -y && apt upgrade -y
Wait (do not run the next commands until we say to start). Normally we have to install the Java runtime (JRE) with this command:
apt install default-jre
Next we install the Java development kit (JDK):
apt install default-jdk
Next we set JAVA_HOME with the following command (it applies to the current shell only; add the same line to ~/.bashrc to make it permanent):
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
Then check the value with the command below:
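As a rough sketch of what the sed expression above produces, and of one way to confirm that the resulting directory really contains a Java binary, consider the following. The sample path and the two helper names (derive_java_home, check_java_home) are illustrative assumptions, not part of the original steps.

```shell
# Derive a JAVA_HOME candidate from a resolved java binary path,
# using the same sed expression as the export command above.
derive_java_home() {
    printf '%s\n' "$1" | sed "s:bin/java::"
}

# Report whether a directory looks like a usable JAVA_HOME.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "ok: $1"
    else
        echo "missing: no executable at $1/bin/java"
        return 1
    fi
}

# Hypothetical path; on a real machine use: readlink -f /usr/bin/java
derive_java_home "/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java"
# prints /usr/lib/jvm/java-8-openjdk-amd64/jre/
```

On the server itself, check_java_home "$JAVA_HOME" should then report ok if the variable was set correctly.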
If you already ran the above steps on the machine, you need not run the commands below. In this example, we add the webupd8team PPA on a fresh machine:
apt install python-software-properties
sudo add-apt-repository ppa:webupd8team/java
apt update -y && apt upgrade -y
and then run the installer. Flink 1.4 and later require Java 8, so install the Java 8 package from this PPA:
apt install oracle-java8-installer
Download the binary distribution of Apache Flink from the official Apache Flink downloads page.
Flink has binary releases marked with a Hadoop version which come bundled with binaries for that Hadoop version. The binary release without bundled Hadoop can be used without Hadoop or with a Hadoop version that is installed in the environment. So read that webpage carefully.
This is an example without Hadoop (notice the file name):
Here is an example with Hadoop (notice the file name):
## LINK to file
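The file name tells you which variant you downloaded. As a small sketch, the helper below splits an archive name into Flink version, bundled Hadoop version and Scala version. The function name describe_flink_archive is hypothetical, and the naming pattern is assumed from the archive used in this tutorial.

```shell
# Split a Flink binary archive name into its parts.
# Assumed naming pattern:
#   flink-<version>-bin-[hadoop<NN>-]scala_<scala version>.tgz
describe_flink_archive() {
    name=${1%.tgz}                 # drop the extension
    rest=${name#flink-}            # drop the "flink-" prefix
    version=${rest%%-bin-*}        # text before "-bin-"
    variant=${rest#*-bin-}         # text after "-bin-"
    scala=${variant##*scala_}      # text after "scala_"
    case $variant in
        hadoop*) hadoop=${variant%%-scala_*}; hadoop=${hadoop#hadoop} ;;
        *)       hadoop=none ;;
    esac
    echo "flink=$version hadoop=$hadoop scala=$scala"
}

describe_flink_archive "flink-1.5.0-bin-hadoop28-scala_2.11.tgz"
# prints flink=1.5.0 hadoop=28 scala=2.11
```

A name with no hadoop part after "-bin-" is reported as hadoop=none, which corresponds to the release without bundled Hadoop.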
As an example of installation, these are the steps:
tar -xzvf flink-1.5.0-bin-hadoop28-scala_2.11.tgz
## we can run
# tar -xzvf flink*
## as command
## start session
## stop session
Here are the commands:
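A minimal sketch of starting and stopping a local session, assuming the Flink 1.5 layout in which bin/start-cluster.sh and bin/stop-cluster.sh are the scripts for a local cluster. FLINK_HOME and the wrapper name flink_session are hypothetical; point FLINK_HOME at the directory the tar command unpacked.

```shell
# FLINK_HOME is an assumed variable; the default below matches the
# directory name expected from the flink-1.5.0 archive.
FLINK_HOME=${FLINK_HOME:-./flink-1.5.0}

# Start or stop a local Flink cluster, failing clearly if the
# expected script is not where we think it is.
flink_session() {
    case $1 in
        start) script="$FLINK_HOME/bin/start-cluster.sh" ;;
        stop)  script="$FLINK_HOME/bin/stop-cluster.sh" ;;
        *)     echo "usage: flink_session start|stop" >&2; return 2 ;;
    esac
    if [ ! -x "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    "$script"
}
```

Run flink_session start to bring the session up and flink_session stop to shut it down. Calling the two scripts directly from inside the unpacked directory works just as well.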
Open the web frontend at http://localhost:8081 and make sure everything is up and running. The web frontend should report a single available TaskManager instance. The version you installed has its own official tutorials with examples.
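On a server without a browser, the TaskManager check on port 8081 can also be done from the command line. The sketch below assumes that the web frontend exposes an /overview endpoint returning JSON with a "taskmanagers" field, and that curl is installed. The helper name taskmanager_count and the sample response are illustrative.

```shell
# Extract the "taskmanagers" count from /overview-style JSON on stdin.
taskmanager_count() {
    sed -n 's/.*"taskmanagers"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Sample response (illustrative values, not captured from a real cluster)
sample='{"taskmanagers":1,"slots-total":1,"slots-available":1,"jobs-running":0}'
printf '%s\n' "$sample" | taskmanager_count
# prints 1

# Against a running cluster:
#   curl -s http://localhost:8081/overview | taskmanager_count
```

A result of 1 matches the single TaskManager instance the web frontend should report.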
That ends this tutorial.