
The Customize Windows

Technology Journal


By Abhishek Ghosh October 13, 2017 10:38 am Updated on October 13, 2017

Big Data Analytics Solutions: On-Premise versus in the Cloud


The objective of this article, Big Data Analytics Solutions: On-Premise versus in the Cloud, is not limited to comparing on-premise and cloud Big Data analytics solutions. It also serves other goals, such as clarifying the limitations of our existing guides, which target localhost or test setups.

Various websites conventionally compare on-premise and cloud Big Data analytics solutions as if only enterprises need them. In reality, that is not valid. Many types of users need different Big Data analytics solutions for different purposes, for example free software development.

We regularly publish guides on how to install and configure various Big Data tools, and we also discuss cloud-based Big Data tools from vendors like IBM Bluemix. The trend in using such services is now towards “in the cloud”, whereas earlier it was “on-premise”. Our “how to install Apache Hadoop” guides are meant for readers to learn the basics; they are not exactly for production use.


In this article, we will discuss how to choose between a “ready-to-use service” and a “how to install Apache Hadoop” approach for production. A production setup can serve various needs, from development to an enterprise solution. Whatever the need is, three situations can arise when we divide data and computing power:

  1. All computation and data on-premise
  2. All computation and data on a specific cloud
  3. Some computation and data on-premise and some computation and data on a specific cloud
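
The three situations above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Deployment` enum, the `classify` function and the component names are our own hypothetical names, not part of any Big Data tool.

```python
from enum import Enum


class Deployment(Enum):
    ON_PREMISE = "all computation and data on-premise"
    CLOUD = "all computation and data on a specific cloud"
    HYBRID = "some on-premise and some on a specific cloud"


# The only two locations considered in this article.
VALID_LOCATIONS = {"on-premise", "cloud"}


def classify(component_locations):
    """Map {component name: location} to one of the three situations."""
    locations = set(component_locations.values())
    if not locations:
        raise ValueError("at least one component is required")
    unknown = locations - VALID_LOCATIONS
    if unknown:
        raise ValueError(f"unknown locations: {sorted(unknown)}")
    if locations == {"on-premise"}:
        return Deployment.ON_PREMISE
    if locations == {"cloud"}:
        return Deployment.CLOUD
    return Deployment.HYBRID


# Hypothetical setup: raw data stays in-house, the Spark jobs run in the cloud.
setup = {"raw data": "on-premise", "spark jobs": "cloud"}
print(classify(setup).value)
```
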
Table of Contents

  • Introduction
  • Why We Publish How to Install Guides
  • Distributions and Appliances for Data Science
  • Comparison with PaaS
  • On-Premise versus Cloud Big Data Analytics Solutions
  • Conclusion

 

Big Data Analytics Solutions: Why We Publish How to Install Guides

 

Our guides on how to install and configure various Big Data tools are comparable to our guides on installing a LAMP or LEMP server (for example, to run WordPress). No real ready-to-use platform is practical in that use case. So, in web hosting, it is normal to select a server with root access, which, depending on the need, can be a cloud server, VPS, dedicated server or colocation server. Almost universally, everyone follows some guide (or their own!) to create a basic setup and then gradually optimizes many aspects of that LAMP or LEMP server. It is true that ready-to-use “WordPress hosting” exists. But it comes nowhere near offering the latest tweaks or components, and its intended audience is newcomers who are willing to pay more.

Web hosting is a separate niche from Big Data tools, although it appears similar when we publish “how to” guides. For LAMP or LEMP servers, we do publish guides on how to tweak zillions of components (compare: we have guides on how to enable OCSP stapling on Apache, how to enable TLS False Start on Nginx, how to enable caching on Percona MySQL and so on). We never publish guides on how to optimize a Hadoop or Spark installation. Most of our guides are around Docker-based deployments.

  1. We regularly publish guides on how to install and configure various servers for web hosting in production environments.
  2. We regularly publish guides on how to install and configure various Big Data tools, but on “commodity hardware” for the development needs of various people.
  3. Our guides on installing free Big Data software are not what “Big Data” is.

One part of our website assists with sysadmin work. There is a difference between the work of a sysadmin and the work of a data scientist. Ultimately, one can easily find a list of up-to-date blogs on the Big Data niche from the providers themselves. IBM’s Data Science Courses and BigDataHub are more or less brand-neutral.

Ready-to-Use Distributions and Appliances for Data Science Are Common

For development and testing needs such as “on-premise”, there are at least two types of options:

  1. Guides using the original source code
  2. Guides using ready-to-use distributions and appliances for data science, like Cloudera or Hortonworks

The second situation is not very popular for configuring a LAMP or LEMP server. Well-known scripts like Centminmod actually use solutions derived from the “original” source code. All web hosts provide guides on how to install and configure a LAMP server because each of them has different types of clients and different hardware configurations, and the whole setup is fully open to the public internet via port 80 and port 443 at minimum. In data science, it is common for company B to use company A’s optimized package. IBM using and promoting tools from Cloudera, Hortonworks or Oracle may sound odd, but it is more practical than forking already-tweaked software. In Big Data analytics, unlike with a LAMP/LEMP server, the optimization part is extremely complicated and time-consuming. The only segment where a similar situation arises for a LAMP or LEMP server is the database software, where there are three options for MySQL:

  1. Original MySQL
  2. Percona MySQL
  3. MariaDB (a MySQL fork)

Well, MySQL is where “Big Data” starts: if you look at the service catalogue of IBM Bluemix, MySQL sits towards the Cloud Big Data Services. Obviously, as with Big Data, there are Database as a Service offerings for websites and mobile platforms.
So, as with Big Data software, tweaking MySQL shows some resemblances:

  1. MySQL has various storage engines
  2. MySQL has various tools, like those offered by Percona, to optimize the settings
  3. MySQL gives WordPress users the biggest headache
  4. MySQL performance is compared while running on optimized hardware

Before the advent of Docker, installing the whole system by hand was the gold standard for a LAMP or LEMP server, and it is still the standard. LAMP/LEMP is a kind of in-between segment, as it is not obvious what the web hosting platform is built for. Normally we configure it for WordPress, as that is the most common use of LAMP/LEMP, and we deliberately keep things easy for those who are learning to work over SSH.

Comparison of Cloud Based Big Data Analytics Solutions with PaaS

 

The architectural requirements for building a Big Data analytics solution go beyond software packages or installing them. The tweaking part involves minor or major in-house coding. That need for in-house coding has resulted in a huge number of open source software contributions in the field of Big Data. Platform as a Service (PaaS) is quite comparable with cloud-based Big Data analytics solutions. However, PaaS is mostly not meant for running a high-load production website but for development, and it is probably better suited to testing or to use as a backend hidden from the public. As PaaS had, and still has, limitations for a typical LAMP/LEMP stack, various newer approaches like Function as a Service (FaaS) are evolving.

We introduce the topic of PaaS for one big reason: IBM Bluemix. At present, IBM Bluemix is not only a PaaS but a PaaS-like platform delivering various cloud services. Part of IBM Bluemix uses open source software to deliver its own Platform as a Service (PaaS) and separate services. With a PaaS like IBM Bluemix, the platform usually has all the solutions for its users, which automatically invites comparison of its cloud-based Big Data analytics solutions with PaaS, although they are closer to SaaS. Big Data software solutions on IBM Bluemix are not exactly comparable with the old IBM Bluemix PaaS, which mainly provided an application hosting platform. We discuss this because old guides on the same website often confuse new users.

Comparison of On-Premise versus Cloud Big Data Analytics Solutions

 

From the above points, it is clear why developers favor cloud Big Data analytics solutions: the need for on-premise resources, such as own servers and an IT team dedicated to the job, is far too high. The same is true for small to medium-sized companies.

The public cloud offers clear benefits for Big Data deployments: self-service, agility, elasticity, and a pay-as-you-go model. The CapEx model for on-premise deployments takes weeks or months to get into production because of legal issues, procuring servers and racks, configuring storage and networking, and allocating a power backup system, to name a few. By comparison, the on-demand model of cloud Big Data analytics solutions can be very attractive, even if it has not really been put to the test.
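
To make the CapEx versus pay-as-you-go trade-off concrete, here is a minimal Python sketch that finds the month in which the cumulative cloud bill catches up with the cumulative on-premise cost. All figures and function names are hypothetical assumptions for illustration; real pricing depends on the vendor and the workload.

```python
def cumulative_on_premise_cost(months, capex, monthly_opex):
    """Upfront hardware purchase plus running costs (power, staff, etc.)."""
    return capex + monthly_opex * months


def cumulative_cloud_cost(months, monthly_fee):
    """Pay-as-you-go: no upfront cost, only the recurring bill."""
    return monthly_fee * months


def break_even_month(capex, monthly_opex, monthly_cloud_fee, horizon_months=120):
    """First month in which the cloud has cost at least as much as on-premise.

    Returns None if that never happens within the horizon, which is the case
    whenever the monthly cloud fee is not higher than the on-premise opex.
    """
    for month in range(1, horizon_months + 1):
        cloud = cumulative_cloud_cost(month, monthly_cloud_fee)
        on_premise = cumulative_on_premise_cost(month, capex, monthly_opex)
        if cloud >= on_premise:
            return month
    return None


# Hypothetical figures: 60,000 upfront and 1,000 per month on-premise,
# against 3,500 per month in the cloud.
print(break_even_month(capex=60_000, monthly_opex=1_000, monthly_cloud_fee=3_500))
```

With these assumed numbers, the cloud stays cheaper for the first two years. A developer or small company with a shorter horizon lands clearly on the cloud side, while a long-running, steady workload shifts the balance towards on-premise.
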

A cloud data analytics solution with ready-made integrations can be set up rapidly for all data sources. However, like any cloud-based service, cloud-based Big Data analytics solutions depend on the provider for uptime and other software-related matters. IBM Bluemix in particular has a lot of documentation, a dedicated blog on data science (as we mentioned above), community support on sites like Stack Exchange, and free services like Demo Cloud, which basically make it a fit for many users. The list of cloud-based Big Data analytics solution providers becomes limited to those:

  1. Who have enough free resources to use
  2. Who have a pay-as-you-go model of service
  3. Who have a lot of documentation
  4. Who have a lot of examples or demos on GitHub

And there are not exactly many such vendors on this earth.
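
The four criteria above work as a simple filter. The sketch below applies them to a list of vendors; the `Vendor` class and the vendor names are placeholders of our own, not an assessment of any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vendor:
    name: str
    free_resources: bool    # enough free resources to use
    pay_as_you_go: bool     # pay-as-you-go model of service
    documentation: bool     # a lot of documentation
    github_examples: bool   # a lot of examples or demos on GitHub


def shortlist(vendors):
    """Keep only the vendors that satisfy all four criteria."""
    return [
        v.name
        for v in vendors
        if v.free_resources and v.pay_as_you_go
        and v.documentation and v.github_examples
    ]


# Hypothetical vendors, for illustration only.
candidates = [
    Vendor("Vendor A", True, True, True, True),
    Vendor("Vendor B", False, True, True, True),
    Vendor("Vendor C", True, True, False, False),
]
print(shortlist(candidates))
```
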

Apart from a purely cloud solution, as in the three situations we listed at the beginning of this article, there is the hybrid cloud strategy for Big Data and analytics. Connecting on-premise environments and/or a dedicated cloud with a public cloud can balance data security and convenience. Data movement is one of the biggest challenges for a hybrid cloud strategy, and special consideration must be given to reducing latency and maintaining performance. Location, dedicated connections, streaming, traffic optimization and workload optimization are a few points to consider.
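
Why data movement is such a challenge becomes clearer with a rough estimate of the transfer time. The Python sketch below is a back-of-the-envelope calculation only: the `efficiency` factor for protocol overhead and shared links is an assumed value, and the function name is our own.

```python
def transfer_hours(data_gb, link_mbps, efficiency=0.8):
    """Rough time to move data_gb (decimal gigabytes) over a link_mbps link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable; real links vary widely.
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")
    bits = data_gb * 8 * 1000**3                      # gigabytes to bits
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_second / 3600


# Hypothetical example: 1 TB of data on links of different speeds.
for mbps in (100, 1_000, 10_000):
    print(f"{mbps:>6} Mbps: {transfer_hours(1_000, mbps):.1f} hours")
```

Estimates like this help decide whether a dataset should be moved at all, or whether the computation should run where the data already lives.
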

 

Conclusion

 

No single cloud environment optimizes all criteria. This is where IBM has an optimized provisioning worksheet, which balances the trade-offs between public, private, and hybrid cloud architectures:

[Image: Big Data Analytics Solutions: On-Premise versus in the Cloud]

Any cloud-based service carries minor inherent risks, even when provided by the best provider on this earth. Frankly, regular backups do the trick for a service or company that is not exactly large. The case of a larger enterprise with its own datacenter differs hugely from that of an independent developer. Its CIO needs to calculate whether outsourcing to a third-party service will return a better ROI than the cost of purchasing hardware, the time to set up, and recruiting skilled and semi-skilled manpower. We really cannot advocate for specialized segments, regardless of their size, such as healthcare or government agencies, where adherence to various standards and the geolocation of data may be important in a country. Such segments demand paid consultancy to decide on and arrange their own solution.



About Abhishek Ghosh

Abhishek Ghosh is a Businessman, Surgeon, Author and Blogger. You can keep touch with him on Twitter - @AbhishekCTRL.

About This Article

Cite this article as: Abhishek Ghosh, "Big Data Analytics Solutions: On-Premise versus in the Cloud," in The Customize Windows, October 13, 2017, March 29, 2023, https://thecustomizewindows.com/2017/10/big-data-analytics-solutions-premise-versus-cloud/.
