The Customize Windows

Technology Journal

By Abhishek Ghosh August 28, 2018 4:26 pm Updated on August 28, 2018

Knowledge Discovery in Databases : Part III

In Part I of Knowledge Discovery in Databases, we discussed database systems, the fundamentals of statistics and Big Data, and the fundamentals of knowledge discovery in databases. In Part II, we discussed the process and the methods of knowledge discovery in databases. In this third part, we discuss the methods of knowledge discovery in databases in detail and look at application examples.

Again, knowledge discovery in databases is a good way to extract knowledge from data. As more data is collected in all areas, more and more knowledge can likely be extracted. The new knowledge can then be used for further evaluations. However, the statistical basis of any new knowledge must be checked in order to avoid serious mistakes in the future.

 

Methods of Knowledge Discovery in Databases

 

The main goal, as described, is the analysis of data and the recognition of patterns. Generalization serves to derive a compact set of data from all the available information. It can be divided into two kinds. In manual generalization, the data in the database is restricted step by step. In automatic generalization, an algorithm is given specific parameters. The algorithm then analyzes and sorts the data using those parameters and performs the necessary generalization steps automatically.

The FIGR algorithm works by counting how often each value occurs and then sorting the values on the basis of these simple counts. This reduces the total number of distinct values and thus produces a correspondingly more compact set of data.
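The count-and-sort idea can be illustrated with a minimal Python sketch. This is not the FIGR algorithm itself. It only assumes that rare values are replaced by a common placeholder, and the function name `generalize_rare_values`, the `keep` parameter and the city data are hypothetical.

```python
from collections import Counter

def generalize_rare_values(values, keep=2, placeholder="OTHER"):
    """Keep the `keep` most frequent values; replace all others with a placeholder."""
    counts = Counter(values)
    # most_common() sorts the values by descending count.
    frequent = {value for value, _ in counts.most_common(keep)}
    return [v if v in frequent else placeholder for v in values]

# Hypothetical attribute column with four distinct values.
cities = ["Berlin", "Berlin", "Munich", "Hamburg", "Munich", "Berlin", "Bonn"]
generalized = generalize_rare_values(cities, keep=2)
print(generalized)
# ['Berlin', 'Berlin', 'Munich', 'OTHER', 'Munich', 'Berlin', 'OTHER']
```

The column shrinks from four distinct values to three, at the price of losing the distinction between the rare values.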

Clustering

Clustering describes a method in which relationships between different data objects are established. On the one hand, these relationships may lie in matching properties. On the other hand, the data is also sorted according to the largest possible differences. The goal is therefore to combine elements that are as similar as possible into one group, while the groups themselves are as different from each other as possible. Clustering can be subdivided further. The subareas include cluster analysis based on multivariate statistical methods and competitive strategies, that is, clustering with artificial neural networks.

Cluster analysis with multivariate statistical methods

The multivariate statistical methods are further subdivided into hierarchical and partitioning methods. Hierarchical methods are in turn divided into agglomerative and divisive methods. The agglomerative methods merge the objects and data step by step. Theoretically, this procedure can be continued until only one class is left. The divisive methods work in the opposite direction: starting from one class, further subclasses are formed. A problem with the hierarchical methods is that wrong assignments can be made and can no longer be corrected afterwards. As a result, these methods can only be used to a limited extent. Above all, they are suitable for finding hierarchies or outliers. The partitioning methods look for optimal partitions in the data. This means that data objects that are as coherent as possible are searched for and grouped together. The objects can then be moved between partitions according to specified target criteria.
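As a rough illustration of the agglomerative approach, the following Python sketch merges one-dimensional points step by step. It assumes single linkage (the distance between two clusters is the distance between their closest members), which is only one of several possible merge criteria. The function name and the data are hypothetical.

```python
def agglomerative(points, n_clusters):
    """Merge the two closest clusters (single linkage) until n_clusters remain."""
    clusters = [[p] for p in points]  # start: every point is its own cluster
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]

data = [1.0, 1.2, 5.0, 5.1, 9.0]
print(agglomerative(data, 3))  # [[1.0, 1.2], [5.0, 5.1], [9.0]]
```

A merge is never undone in this loop, which is exactly why a wrong assignment in a hierarchical method cannot be corrected later. The isolated value 9.0 stays alone for a long time, which hints at why such methods are useful for spotting outliers.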

Clustering with Artificial Neural Networks

For clustering with artificial neural networks, a support vector clustering algorithm is used in combination with a support vector machine. The general idea is to search, using mathematical functions, for a hyperplane that separates two classes. This hyperplane should also have the property that its distance to the nearest points of each class is maximal. The points closest to the hyperplane serve as support vectors, while the other data points have no influence on the result.
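The notions of margin and support vector can be made concrete with a small Python sketch. It does not train a support vector machine. It only evaluates a given candidate hyperplane w·x + b = 0 by computing the distance of every point to it. The data points and both candidate hyperplanes are hypothetical.

```python
import math

def signed_distance(w, b, x):
    """Signed distance of point x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin_and_support_vectors(w, b, points, tol=1e-9):
    """Return the smallest distance to the hyperplane and the points attaining it."""
    distances = [abs(signed_distance(w, b, p)) for p in points]
    margin = min(distances)
    support = [p for p, d in zip(points, distances) if d - margin <= tol]
    return margin, support

# Hypothetical 2-D data: one class on the left, the other on the right.
points = [(1.0, 1.0), (0.0, 2.0), (3.0, 1.0), (4.0, 0.0)]

margin_a, support_a = margin_and_support_vectors((1.0, 0.0), -2.0, points)  # line x = 2
margin_b, support_b = margin_and_support_vectors((1.0, 0.0), -1.5, points)  # line x = 1.5
print(margin_a, support_a)  # 1.0 [(1.0, 1.0), (3.0, 1.0)]
print(margin_b, support_b)  # 0.5 [(1.0, 1.0)]
```

Both lines separate the two groups, but x = 2 leaves the larger margin. Moving the points (0.0, 2.0) or (4.0, 0.0) slightly would not change that margin, since only the closest points determine it.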

Classification

In classification, the data is assigned to predefined classes. This is in contrast to the clustering methods, where the classes themselves are first discovered. A simple example of classification is the granting of loans. Based on existing lending records, it is decided whether a loan is granted or not.

Classification can be subdivided into two tasks. The first is the pure assignment of objects to classes, based on the attribute values of the individual objects. Only the second task can properly be called KDD: here, explicit knowledge about the classes is generated.
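The first task, assigning an object to a predefined class based on its attribute values, can be sketched in Python with a k-nearest-neighbour vote. The article does not prescribe a particular classifier, so this choice is an assumption. The loan records, given as (income, existing debt) in thousands, are invented for illustration.

```python
import math
from collections import Counter

def knn_classify(records, query, k=3):
    """Assign `query` to the majority class among its k nearest labelled records."""
    by_distance = sorted(records, key=lambda r: math.dist(r[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical lending records: ((income, existing debt), decision)
records = [
    ((60, 5), "approve"),
    ((55, 10), "approve"),
    ((70, 20), "approve"),
    ((20, 15), "reject"),
    ((25, 30), "reject"),
    ((30, 25), "reject"),
]

print(knn_classify(records, (58, 8)))   # approve
print(knn_classify(records, (22, 20)))  # reject
```

Such a vote only performs the assignment. It does not produce explicit knowledge about the classes, which would be the second task described above.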

Association analysis

Association analysis is based on finding rules between co-occurring data items. An example of a rule is an “if A and B then C” relationship. As a basis, the Apriori algorithm can be used to find frequent item sets. It relies on a monotonicity property of frequent item sets: each subset of a frequent item set must itself be frequent.

The frequent item sets are then determined in order of size: first the single-element sets, then the two-element sets, and so on. This step is repeated until no more frequent item sets are found. The association analysis then derives the most common rules from them.
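The level-wise search can be sketched in a few lines of Python. This is a minimal, unoptimized version of the Apriori idea. The baskets and the support threshold are hypothetical, and `min_support` is taken here as an absolute number of transactions.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for all item sets in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {item for t in transactions for item in t}
    current = count({frozenset([i]) for i in items})  # frequent 1-item sets
    frequent = dict(current)
    k = 2
    while current:
        # Join step: unite frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent

baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]
result = apriori(baskets, min_support=3)
print(result[frozenset({"bread", "butter"})])  # 3
```

With these baskets, every single item and every pair is frequent, while the three-element set occurs in only two baskets and is dropped. The loop then stops because no frequent sets of that size remain.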

A classic example of this is shopping basket analysis. It analyzes which items are purchased together with other items. These are then displayed to other customers as “Customers who bought this item also bought” products. This leads to targeted advertising, as there is a likelihood that similar products are also relevant to these customers. This example will be discussed in more detail in the application examples. Within association analysis, a connection to classification can also be established.
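A very simple form of the “also bought” recommendation can be sketched in Python by counting which items share a basket with a given item. Real shop systems are considerably more elaborate. The baskets and the function name `also_bought` are hypothetical.

```python
from collections import Counter

def also_bought(baskets, item, top=2):
    """Rank items by how often they share a basket with `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            counts.update(other for other in basket if other != item)
    # Sort by descending count, then alphabetically so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top]

baskets = [
    {"bike", "helmet", "lock"},
    {"bike", "helmet"},
    {"bike", "pump"},
    {"helmet", "gloves"},
]
print(also_bought(baskets, "bike"))  # [('helmet', 2), ('lock', 1)]
```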

Regression analysis

Regression analysis is used to determine relationships within existing data. Various regression methods are used to make forecasts from the existing data. This method is mainly used when certain information is available but related values are missing. The stronger the dependency between the variables, the more accurate the forecast can be.
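As one example of a regression method, the following Python sketch fits a straight line by ordinary least squares and uses it to estimate a missing value. The article does not name a specific method, so the choice of a linear model is an assumption, and the area and rent figures are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical known pairs: living area in square metres and monthly rent.
area = [30, 50, 70, 90]
rent = [400, 600, 800, 1000]

slope, intercept = fit_line(area, rent)
predicted = slope * 60 + intercept  # estimate the missing rent for 60 square metres
print(slope, intercept, predicted)  # 10.0 100.0 700.0
```

The invented data lies exactly on a line, so the forecast is exact. With weaker dependency between the variables, the points scatter around the line and the forecast becomes less reliable.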

 

Application examples

 

There are already many everyday examples in which KDD is used successfully. They cover both commercial and non-commercial topics. Some of them are explained in the following sections. The aspect of data protection is also taken up here.

Privacy

As the amount of data keeps growing, data protection is becoming more important. The collected data, some of it personal, is analyzed and evaluated without the knowledge of the users. By linking different databases, it is possible to obtain precise behavior patterns and information about individuals.

Currently, however, generally only group behavior is analyzed, which is why no conclusions are drawn about individuals. Nevertheless, closer attention must be paid to privacy in this context, so that no personal analysis takes place in the future.

Geology

Today, more and more analyses are being carried out on current changes in climate and land use. More and more data is collected and stored when creating simulation models. This data can then be evaluated, for example, to predict climate change or other future events.

Among other things, such systems analyze geological data. Based on the evaluated data, typical routes of cyclones are predicted.

Marketing

The best-known examples of KDD are marketing applications. In this case, for example, customer data are analyzed and, based on this, the advertising is adapted. This can be done based on user behavior as well as the location or other available data.

In web shops, the knowledge gained appears, among other things, in the well-known statements “What other items do customers buy after viewing this item?” or “Other recommendations for you:”. This is a significant benefit for the seller, as advertising can be tailored to different customers. Such associations can also be formed by linking purchases to certain locations. For example, it might be found that many people from a certain area are looking for a bicycle on a website. On the basis of this data, a provider of bicycles or bicycle accessories could specifically favor this area for its advertising.

A similar adaptation takes place today in television. Analyses determine which audience is following a program at what point in time. These analyses then enable specific advertising to be placed for the relevant audience. An example of the application in marketing is the Spotlight system. This system analyzes sales quantities and reveals correlations between changes in these quantities and, for example, simultaneous price changes.
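How a correlation between price changes and changes in sales quantity might be measured can be sketched in Python with the Pearson correlation coefficient. This does not describe how the Spotlight system works internally. The coefficient is one common choice, and the percentage figures are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical weekly changes in percent for one product.
price_change = [-10, -5, 0, 5, 10]
sales_change = [18, 11, 1, -9, -21]

r = pearson(price_change, sales_change)
print(round(r, 3))  # -0.996
```

A value close to -1 indicates that, in this invented data, sales fall almost in step with rising prices. A correlation alone does not show that the price change caused the change in sales.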

 

Conclusion

 

Overall, knowledge discovery in databases is a good way to extract knowledge from data. As more data is collected in all areas, more and more knowledge can likely be extracted. The new knowledge can then be used for further evaluations. However, the statistical basis of any new knowledge must be checked in order to avoid serious mistakes in the future.

Data can be stored almost without limit in today’s world. As a result, modern technologies generate more and more data containing potential knowledge. Ever newer and more powerful computers make more complex evaluations possible, which can extract previously inaccessible knowledge from the data.
The process of KDD will probably change even more areas of our lives in the future because of the higher data volumes and will thus become more and more important. New knowledge also constantly leads to further knowledge. At the same time, other topics will change as well. In terms of data protection in particular, the ever-increasing volumes of data and the growing relationships between them will cause a great deal of change.

About Abhishek Ghosh

Abhishek Ghosh is a Businessman, Surgeon, Author and Blogger. You can keep in touch with him on Twitter - @AbhishekCTRL.

About This Article

Cite this article as: Abhishek Ghosh, "Knowledge Discovery in Databases : Part III," in The Customize Windows, August 28, 2018, April 1, 2023, https://thecustomizewindows.com/2018/08/knowledge-discovery-in-databases-part-iii/.
