The Customize Windows

Technology Journal


By Abhishek Ghosh September 12, 2017 9:08 am Updated on September 12, 2017

Overview of Cloud Based Big Data Platforms And Tools From IBM

The goal of this article is to classify the Big Data tools offered by IBM and to give readers an idea of what those tools are. As our experience is that of users, this article may help prospective users and developers to test the tools easily or to plan a complete setup. With so many products offered by IBM under various brand names, a new user may find it difficult to identify which one is Hadoop, or to understand IBM’s logic in arranging its product ranges. We carried out comprehensive research, drawing on publications by IBM as well as on external sources that contain reputable and objective information.

Big Data is a new approach to data processing. The analysis of large amounts of data has been made possible only by the technological progress of recent years. Through Big Data we can gain insights and understand connections that go far beyond the possibilities of existing technologies. According to current calculations, the volume of data available worldwide doubles every two years. This enormous data growth results from the digitization of content in areas such as the Internet and mobile communications, industry and traffic, and from sources such as social media, credit and customer cards, surveillance cameras, aircraft and vehicles, intelligent home control systems, sensor systems for controlling production facilities and so on. It leads to new applications and business models in social networks, in the financial industry (financial transactions, exchange data), in the energy sector (consumption data) and in healthcare (gene analysis, telemonitoring). In many areas of science (e.g. geology, genetics, proteomics, climate research and nuclear physics), large data volumes are also processed to create model calculations and evaluations. The examples in our older articles are mostly about the analysis of server logs, which are readily available to the majority of developers. Other websites also offer plenty of scripts (such as Apache Pig scripts) for further testing.
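
As a rough illustration of what "doubling every two years" means in practice, the short Python sketch below computes the growth factor after a given number of years. The function name and the sample years are our own choices for illustration, and the two-year doubling period is simply the figure quoted above, not a measured value.

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Return how many times larger the data volume is after `years`,
    assuming it doubles once every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Sample horizons chosen only for illustration.
    for years in (2, 4, 10):
        print(f"after {years} years: x{growth_factor(years):g}")
```

Under this assumption, ten years of growth corresponds to five doublings, which is why the volume grows so much faster than a linear estimate would suggest.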

 

Table of Contents

  • 1 Introduction
  • 2 Previous Articles
  • 3 Classification of Cloud Based Big Data Platforms
  • 4 Free To Test Services
  • 5 Conclusion

 

Relationship with Our Previously Published Articles

 


In our previously published articles, we discussed the advantages and disadvantages of the Big Data tools offered by IBM, real-time Big Data analytics in health care, and how to build a Big Data solution on the cloud. As a quick recap, those articles described the strategies we observed IBM adopting: IBM seems to focus on solutions that enable growth, make things easier and cheaper, and let the workforce adapt dynamically to ever-changing needs. To achieve these goals, IBM possibly follows these principles:

 

  • Wider usage of Open Source software
  • Innovation through cloud computing, business analytics, acquisitions and other strategic initiatives
  • Enterprise-wide automation and integration of business processes
  • IBM hardware specific optimizations for better performance

 

Our Classification of Cloud Based Big Data Platforms and Tools From IBM

 

We can classify the tools offered by IBM into:

  1. IBM Big Data Platform
  2. Analytic Applications

From the point of view of usage, the IBM Big Data Platform can be further divided into six sub-types, which are our main area of interest:

  1. Hadoop System
  2. Stream Computing
  3. Data Warehousing
  4. Information Integration and Governance
  5. User Interfaces
  6. Accelerators

A big part of the Big Data Platform is the Hadoop System. IBM’s InfoSphere BigInsights product is open source Hadoop with added storage, security, performance optimization, development tooling and visualization. These points are worth mentioning for a reason. For example, the storage is IBM’s Shared-Nothing Cluster parallel file system, which can replace HDFS and offers full POSIX compliance. IBM Big SQL is a ready-to-use platform with almost all of the commonly used open source Big Data tools installed, from Hadoop, Spark and Pig to MySQL.

From the point of view of usage, the Analytic Applications can be further divided into:

  1. BI/Reporting
  2. Exploration/Visualization
  3. Functional App
  4. Industry App
  5. Predictive Analytics
  6. Content Analytics

 

Free To Test BigSQL/Demo Cloud From IBM

 

For practical reasons, we need to test any service before committing to pay as a regular customer. We have found two free ways to test the IBM Big Data tools:

  • Free tier of IBM Bluemix with time limit
  • Forever free Demo Cloud, which is BigSQL

For advanced users who need only the Big Data analytics tools, we found Demo Cloud to be the superior and quicker option over Bluemix for testing, as it readily offers easy SSH access as well as a web UI. Readers can check our practical articles on the free Demo Cloud (which is how we tested), such as our initial guide on Demo Cloud and our example tutorial on using Apache Pig on Demo Cloud for server log analysis.
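
To give an idea of what server log analysis looks like, here is a minimal sketch written in Python rather than Apache Pig. It is not the script from our tutorial. It assumes the standard Apache access log format, and the sample lines, the regular expression and the function name are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout (Apache common/combined log format):
# host ident user [time] "request" status size ...
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_by_status(lines):
    """Count requests per HTTP status code; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("status")] += 1
    return counts


# Hypothetical sample data; a real run would read the lines from an access log file.
SAMPLE = [
    '203.0.113.5 - - [12/Sep/2017:09:08:00 +0000] "GET / HTTP/1.1" 200 5120',
    '203.0.113.5 - - [12/Sep/2017:09:08:02 +0000] "GET /missing HTTP/1.1" 404 312',
    '198.51.100.7 - - [12/Sep/2017:09:09:10 +0000] "GET /feed/ HTTP/1.1" 200 20480',
    'not a log line',
]

if __name__ == "__main__":
    for status, hits in count_by_status(SAMPLE).most_common():
        print(status, hits)
```

The same grouping and counting is what a Pig script would express with its load, group and count steps. On a single small log file a script like this is enough, and the Hadoop tools become interesting when the logs no longer fit on one machine.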

The disadvantage of Demo Cloud is simple: data will be flushed at regular intervals, whereas on Bluemix this is predictable. Bluemix does have everything, but it can take time to understand. Furthermore, Demo Cloud has some question-and-answer support on IBM’s site as well as on StackOverflow.

It took us a good while to realize what is included in Bluemix and why things like BigSQL are also separate products from IBM. The separate products are enterprise-grade services.

Regarding the Analytic Applications, we have some easy examples of implementation, such as a WordPress plugin that can analyze emotion. That plugin was well appreciated in the WordPress community. As time goes on, community-developed free plugins will become more common.

 

Conclusion

 

Obviously, there is an ongoing cost for using the tools from IBM, and it is practical to compare the cost-benefit ratio with that of self-hosted open source Big Data software. As the costs of virtual servers, dedicated servers and bandwidth are steadily decreasing, the number of parameters to compare is shrinking.
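
Such a comparison can be sketched as simple arithmetic. The Python example below is only an illustration under our own assumptions: the class, its fields and every figure in it are hypothetical placeholders, not real prices from IBM or any hosting provider.

```python
from dataclasses import dataclass


@dataclass
class SelfHosted:
    """Hypothetical monthly cost items for a self-hosted setup."""
    server_monthly: float        # rent for dedicated or virtual servers
    bandwidth_monthly: float     # traffic charges
    sysadmin_hours: float        # hours per month spent on maintenance
    sysadmin_hourly_rate: float  # cost of one hour of sysadmin work

    def monthly_cost(self) -> float:
        return (self.server_monthly
                + self.bandwidth_monthly
                + self.sysadmin_hours * self.sysadmin_hourly_rate)


def cheaper_option(self_hosted: SelfHosted, managed_monthly: float) -> str:
    """Return which option costs less per month: 'self-hosted', 'managed' or 'equal'."""
    own = self_hosted.monthly_cost()
    if own < managed_monthly:
        return "self-hosted"
    if own > managed_monthly:
        return "managed"
    return "equal"


if __name__ == "__main__":
    # All figures are made-up placeholders.
    own = SelfHosted(server_monthly=120.0, bandwidth_monthly=20.0,
                     sysadmin_hours=10.0, sysadmin_hourly_rate=40.0)
    print(own.monthly_cost(), cheaper_option(own, managed_monthly=450.0))
```

The point of the sketch is that the sysadmin time is part of the cost of self-hosting. Once it is counted, a cheap server is not automatically the cheaper option.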

Moreover, there are definite reasons why we do not publish guides on Rackspace or HP Cloud. Many things cannot be written, but as an intelligent user or developer you will understand the reasons for our preference at some point. Rackspace now charges an extra $50 for having an account. How can we talk about Rackspace Cloud Files anymore? We had to write Python scripts to upload images to HP Cloud’s object storage. Those stories are only about web hosting, not about complicated matters like running a Big Data analysis platform.

IBM sells its hardware and networking resources with pre-installed software as a service. The software is mainly well-known open source Big Data software. It is not difficult for a system administrator to measure performance and run benchmarks to compare a self-hosted Hadoop installation with IBM-hosted Hadoop on the cloud.

A big extra gain of using Hadoop-like Big Data tools as a service from IBM rather than self-hosting is the availability of IBM’s experienced, skilled employees to answer questions. Other matters, such as a lighter burden of sysadmin work to keep your own Hadoop servers running and less downtime for server maintenance, become further decisive factors in choosing the tools offered by IBM.

IBM-optimized tools also have the advantages of faster project deployment without the need for setup, reliability of the distributed system through redundancy, hardware optimization, specialized products like IBM Watson, ready-to-use visualizations and so on.
Will you use IBM’s Big SQL, or a cheap dedicated server with huge RAM on which you install the needed open source software following standard guides like ours? The answer really depends on your needs. We have never seen a cheap dedicated server provider offer instant help for troubleshooting hardware issues. Depending on the need, we accept that inconvenience for the sake of cost reduction. The cheap way does not always work.

If you have enough sysadmins who are used to data analysis, self-hosting is possibly not a bad choice. But basically, this kind of “Software as a Service” always has the advantage of being a managed, ready-to-use service with a near-zero chance of downtime and with network security management taken care of. IBM’s Big Data tools are good in the same way that these days we prefer DNS or email as “Software as a Service”. One big reason to prefer these services is to protect ourselves from hackers.


About Abhishek Ghosh

Abhishek Ghosh is a Businessman, Surgeon, Author and Blogger. You can keep in touch with him on Twitter - @AbhishekCTRL.


Cite this article as: Abhishek Ghosh, "Overview of Cloud Based Big Data Platforms And Tools From IBM," in The Customize Windows, September 12, 2017, February 9, 2023, https://thecustomizewindows.com/2017/09/overview-cloud-based-big-data-platforms-tools-ibm/.
