The goal of this article, Overview of Cloud Based Big Data Platforms And Tools From IBM, is to classify the Big Data tools offered by IBM and to give readers an idea of which Big Data tools IBM has. Since our experience is as users, this article may help prospective users and developers test these tools easily or plan a complete setup. Among so many products … [Read more...]
Create Data Science Environment on Cloud Server With Docker
This guide is not directly related to typical Big Data tools like Hadoop. In this guide we will configure Docker and Jupyter on a cloud server. Here are the steps and commands to create a data science environment on a cloud server with Docker for data analysis, starting with a blank server with SSH access. Users following this guide must have adequate knowledge so that they do not need to read guides like … [Read more...]
Advantages & Disadvantages of Using IBM Big Data Analytics On Cloud
Merely discussing the advantages & disadvantages would produce just a list. To discuss and study the advantages & disadvantages of using IBM Big Data analytics on cloud in detail, we need to try to understand the strategy of the company providing the service, get an overview of the major commonly used products, and analyze the documentation and free resources offered. The term Big Data, which is … [Read more...]
What is Fog Computing, Fog Networking, Fogging
Fog computing is a system-wide architecture made up of near-user edge devices, useful for seamlessly deploying resources and services for computing, data storage, and networking over the infrastructure that connects the Cloud to the Internet of Things (IoT). Fog computing is an extension and improvement of the Cloud paradigm in support of IoT applications that must comply with precise … [Read more...]
Install Apache Drill on Ubuntu 16.04 LTS Single Cloud Server
Apache Drill is a SQL query engine for Hadoop, NoSQL, cloud storage and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. As an example, we can join a collection in MongoDB with a directory of event logs in Hadoop. Apache Drill features a JSON data model which enables queries on complex and nested data. Drill … [Read more...]
Installing Apache Airflow On Ubuntu, CentOS Cloud Server
Apache Airflow possibly needs a small introduction. It is basically a workflow management tool that is useful on servers running Big Data tools. The Airflow scheduler executes tasks on an array of workers while following the specified dependencies. There are command line utilities. Luigi, Azkaban, Oozie and others are built on similar technology. Luigi is simpler in scope than Apache Airflow. Here are the steps for installing … [Read more...]
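To illustrate what "executing tasks while following the specified dependencies" means, here is a minimal sketch in plain Python. It does not use the Airflow API; the task names and the `run_in_dependency_order` helper are hypothetical, and the standard library's `graphlib` (Python 3.9+) stands in for a real scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_in_dependency_order(deps):
    """Return task names in an order that respects every dependency."""
    executed = []
    sorter = TopologicalSorter(deps)
    for task in sorter.static_order():
        # A real scheduler would hand the task to a worker at this point.
        executed.append(task)
    return executed


order = run_in_dependency_order(dependencies)
print(order)
```

A task is only reached after everything it depends on has been processed, which is the same ordering guarantee a workflow scheduler provides across its workers.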
Real-time Big Data Analytics in Health Care Using Tools From IBM
The goal of this article is to give readers a basic technical understanding of big data applications in the health care system, as well as information for developing or integrating various technologies with a likely positive return. This article, Real-time Big Data Analytics in Health Care Using Tools From IBM, does not cover the other areas of application of Big … [Read more...]
Apache Solr vs. Elasticsearch For WordPress Search
We have shown in a guide how to install Apache Solr, which you will likely do on a cloud server instance for testing or development purposes. We have also shown how to install Elasticsearch with Hadoop. Apache Solr vs. Elasticsearch for WordPress search is a complicated topic, as the number of plugins is a factor we need to consider. Elasticsearch is basically packaged by the company Elastic. Apache Solr is a direct … [Read more...]
Install Apache SystemML Machine Learning System on Ubuntu
Apache SystemML provides a system for machine learning using big data on top of Apache Spark. Previously we described how to install Apache Mahout for building a machine learning platform. Here is another solution for machine learning: Apache SystemML. In this guide we will show you how to install the Apache SystemML machine learning system on Ubuntu 16.04. SystemML needs minimal guidance to get … [Read more...]
How To Install Anaconda on Ubuntu 16.04
Conda is a binary package manager used by Anaconda and other Python and R systems. Anaconda is commonly used for data processing, scientific computing, and predictive analytics by data scientists, developers, analysts, and people working in DevOps. Anaconda has a collection of over 700 F/OSS packages. Conda is the CLI utility of Anaconda. Here is how to install Anaconda on Ubuntu 16.04. Here is … [Read more...]
How To Install Apache Solr 6.x on Ubuntu 16.04
Solr is pronounced "solar". It is an enterprise search platform built on Apache Lucene. Elasticsearch is a similar type of software. It is built on the same Apache Lucene and is more commonly used with Kibana, Beats, and Logstash as the Elastic Stack for searching, analyzing and visualizing data. Here is how to install Apache Solr 6.x on Ubuntu 16.04. The Apache Solr search platform can be integrated with … [Read more...]