In an older guide, we provided the steps to install Apache Ambari. Apache Tez is an extensible YARN framework for building interactive data processing applications. Tez can also be used by other Hadoop components such as Apache Hive and Apache Pig to improve query performance. Tez can also be installed through the Ambari UI instead of the CLI. Here are the details on how to install Apache Tez on one node … [Read more...]
How To Install Kylin on Ubuntu 16.04 Server (Cloud/VPS/Dedicated)
Apache Kylin is a distributed analytics engine that provides a SQL interface and OLAP on Hadoop. It was originally developed by eBay Inc. In previously published guides we discussed how to install Apache Hue on Ubuntu Server. We also have separate guides on how to install Apache Hadoop, Ambari, and Apache Pig. In this guide, we will show the steps on how to install Kylin on Ubuntu … [Read more...]
Machine Learning in Medical Diagnosis : GitHub Projects
Previously, we talked about logically structuring a medical application for mobile or web. Here are some GitHub projects around machine learning in medical diagnosis. A few applications of AI in medical diagnostics are already in use. Machine learning and AI are growing more slowly here than in core technical fields because of messy data, a lack of free data and somehow modern … [Read more...]
What is Teradata? What is the Importance of Teradata in Big Data?
So far, we have discussed many basic matters around Big Data and data analytics, the Big Data software solutions available under the umbrella of the Apache Foundation, and services offered by companies like IBM. In the Big Data field we often hear the name Teradata. What is Teradata? What is the importance of Teradata in Big Data? This article will discuss the basics of Teradata. Teradata is a relational … [Read more...]
What is Data Mart? Difference of Data Mart With Data Warehouse
In earlier publications on this website, we already discussed some of the basic, must-know topics around Big Data. Those related to the current topic include Theoretical Foundations of Big Data, Data Lake, Data Refining, Difference Between Data Lake and Data Warehouse, and ETL (Extract, Transform, Load), to mention a few. We often hear discussion around the data mart. What is a data mart? In simple language, … [Read more...]
What Is Data Mining? Examples of Data Mining Software
The term data mining is a bit misleading, because it is about gaining knowledge from existing data and not about generating the data itself. What is data mining? Data mining is the systematic application of statistical methods to large databases with the aim of identifying new patterns and trends. The mere capture, storage, and processing of large amounts of data is sometimes referred to as buzzword … [Read more...]
Getting Started with Microservices
Over the years, more and more companies have been shying away from a single, monolithic system for obvious reasons. While a monolithic approach is efficient for smaller web services and applications, it is not the best approach when users have complex needs and bigger systems to run. Companies with more complex needs are now relying on microservices as a solution. Microservices bring several … [Read more...]
How To Install PostgreSQL on Ubuntu 16.04 LTS
Long-time readers may recall that we had guides for PostgreSQL with WordPress. New and old readers alike may recall our recent guide on how to install MongoDB. PostgreSQL is time-tested and often compared head to head with MongoDB. Here is how to install PostgreSQL on Ubuntu 16.04 LTS Server via SSH. In this article we will avoid the comparison and only discuss the steps to install, configure and secure … [Read more...]
How To Install RabbitMQ on Ubuntu 16.04 LTS
RabbitMQ is a message broker that supports the Advanced Message Queuing Protocol (AMQP). Message brokers in computer networks are software applications that communicate by exchanging formally defined messages over protocols for message-oriented middleware. Previously we talked about Apache Kafka, which is broadly a similar type of software. Here are the steps on how … [Read more...]
Hortonworks Sandbox For Ready-Made Hadoop, Spark, Pig etc
The usual way of using Hadoop, Spark, Pig, etc. is to type the commands to install them. We have published numerous guides on this website on installing each of them via SSH. The second way is to use a third-party, ready-to-use service where they are already installed; such services are essentially like IBM's free Demo Cloud. The third way is to use a virtual appliance that has that software preconfigured so … [Read more...]
Building Cognitive Applications with Data Science Experience
All of us are aware of the practice of dividing the workload among specialists. However, when developing free and open source applications, and from time to time smaller commercial applications, that division of specialized expertise often gets blurred and one person handles all parts alone. Such use cases need better free tools (like an IDE) to provide a kind of helping hand. Just a quick recapitulation of roles - … [Read more...]