The Customize Windows

Technology Journal

By Abhishek Ghosh, November 30, 2023 9:48 am (updated on November 30, 2023)

Data Mining: An Overview

Data mining is the systematic application of statistical methods to large data sets with the aim of identifying new cross-connections and trends. Because of their size, such data sets are processed using computer-aided methods. In practice, the term data mining, strictly a sub-term, has come to be applied to the entire process of so-called “knowledge discovery in databases” (KDD), which also includes steps such as pre-processing and evaluation, while data mining in the narrower sense refers only to the actual analysis step of that process.

The term data mining is somewhat misleading, because it is about extracting knowledge from already existing data and not about generating the data itself. Nevertheless, the concise designation has prevailed. The mere collection, storage and processing of large amounts of data is also sometimes labelled with the buzzword data mining. In a scientific context, it primarily refers to the extraction of knowledge that is “valid (in the statistical sense), hitherto unknown and potentially useful” “for the determination of certain regularities, laws and hidden relationships”. It is defined as “a step in the KDD process that consists of applying data analysis and discovery algorithms that provide a special collection of patterns (or models) of the data under acceptable efficiency limitations.” Inferring (hypothetical) models from data is called statistical inference.

Many of the methods used in data mining originate from statistics, especially multivariate statistics, and are often only adapted in their complexity for use in data mining, frequently by approximating them at the expense of accuracy. The loss of accuracy is often accompanied by a loss of statistical validity, so that from a purely statistical point of view the procedures can sometimes even be “wrong”. For data mining applications, however, experimentally verified benefit and acceptable runtime are often more important than statistically proven correctness.

Machine learning is also closely related, but in data mining the focus is on finding new patterns, while in machine learning the primary aim is for the computer to automatically recognize known patterns in new data. However, a clean separation is not always possible: if, for example, association rules are extracted from the data, this corresponds to a typical data mining task, yet the extracted rules also serve the goals of machine learning. Conversely, unsupervised learning, a subfield of machine learning, is very closely related to data mining. Machine learning methods are often used in data mining and vice versa.
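To make the association-rule example more concrete, here is a minimal sketch in Python that scores a single rule by its support and confidence. The shopping baskets, the item names and the helper names `count`, `support` and `confidence` are all invented for illustration; real rule mining would also have to search over candidate rules, which this sketch does not attempt.

```python
# Toy transactions, invented for illustration only.
TRANSACTIONS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also contain the consequent."""
    base = count(antecedent, transactions)
    if base == 0:
        return 0.0
    return count(set(antecedent) | set(consequent), transactions) / base
```

For the rule "bread implies butter", `support({"bread", "butter"}, TRANSACTIONS)` gives the share of baskets containing both items, and `confidence({"bread"}, {"butter"}, TRANSACTIONS)` gives the share of bread baskets that also contain butter.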

Research in the field of database systems, especially index structures, plays a major role in data mining when it comes to reducing complexity. Typical tasks such as searching for nearest neighbors can be significantly accelerated with the help of a suitable database index and the runtime of a data mining algorithm can be improved as a result.
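As an illustration of how an index can speed up nearest-neighbor search, the following sketch builds a small k-d tree in plain Python and compares it with a linear scan. The k-d tree is only one possible index structure and was chosen here as an assumption; the article does not name a specific one. The function names are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # the median becomes the split point
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, query, depth=0, best=None):
    """Return the stored point closest to query, skipping subtrees that cannot win."""
    if node is None:
        return best
    point, left, right = node
    if best is None or math.dist(query, point) < math.dist(query, best):
        best = point
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if abs(diff) < math.dist(query, best):
        best = nearest(far, query, depth + 1, best)
    return best


def nearest_brute_force(points, query):
    """Linear scan over all points, used as a reference."""
    return min(points, key=lambda p: math.dist(query, p))
```

After a one-time `tree = build_kdtree(points)`, each call to `nearest(tree, query)` can typically skip large parts of the data, whereas `nearest_brute_force` always inspects every point. How large the gain is depends on the data and its dimensionality.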

Information retrieval (IR) is another field that benefits from the insights of data mining. To put it simply, this is about the computer-aided search for complex content, but also about the presentation for the user. Data mining methods such as cluster analysis are used here to improve the search results and their presentation to the user, for example by grouping similar search results. Text mining and web mining are two specializations of data mining that are closely related to information retrieval.
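Grouping similar search results can be sketched with very little code. The example below uses greedy grouping by Jaccard similarity of the words in each title, which is a deliberately simple stand-in for a full cluster analysis and not a method named in the article. The result titles, the threshold of 0.3 and the function names are assumptions made for illustration.

```python
def tokens(text):
    """Lower-cased set of words in a result title."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_results(titles, threshold=0.3):
    """Assign each title to the first group whose leader is similar enough."""
    groups = []  # each group is a list of titles; its first entry acts as the leader
    for title in titles:
        for group in groups:
            if jaccard(tokens(title), tokens(group[0])) >= threshold:
                group.append(title)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([title])
    return groups


# Invented search results for an ambiguous query.
RESULTS = [
    "jaguar car dealer prices",
    "jaguar big cat habitat",
    "used jaguar car prices",
    "big cat habitat conservation",
]
```

Calling `group_results(RESULTS)` separates the car-related titles from the animal-related ones, so that a result page could present them as two blocks. The outcome of this greedy approach depends on the order of the input and on the threshold, which is one reason real systems use more robust clustering methods.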

Data collection, i.e. the systematic gathering of information, is an important prerequisite for obtaining valid results with data mining. If the data was collected in a statistically improper manner, it may contain a systematic error, which the data mining step then finds as a pattern. The result may then not be a property of the observed objects but an artifact of the way in which the data was collected.

Process of Data Mining

Data mining is the actual analysis step of the Knowledge Discovery in Databases process. The steps of this iterative process are, in rough outline:

  • Focus: data collection and selection, but also the determination of existing knowledge
  • Preprocessing: data cleansing, which integrates sources and eliminates inconsistencies, for example by removing or completing incomplete data sets
  • Transformation: conversion into the appropriate format for the analysis step, for example by selecting attributes or discretizing the values
  • Data mining: the actual analysis step
  • Evaluation: assessment of the patterns found by an expert and checking whether the goals have been achieved

In further iterations, knowledge that has already been found can now be used (“integrated into the process”) to obtain additional or more accurate results in a new run.
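The steps listed above can be sketched as a chain of small functions. This is a toy illustration under stated assumptions, not a reference implementation: the records, the age bins, the counting "analysis" and the `min_count` threshold are all invented, and the threshold merely stands in for the judgment of a human expert in the evaluation step. The iteration described above is not modelled.

```python
from collections import Counter

# Invented raw records: (customer age, purchased product); None marks a missing value.
RAW = [
    (23, "game"), (35, "book"), (None, "book"), (41, "book"),
    (19, "game"), (52, None), (38, "book"), (27, "game"),
]


def select(records):
    """Focus: choose the data to analyse (here simply all records)."""
    return list(records)


def preprocess(records):
    """Preprocessing: drop records with missing values."""
    return [r for r in records if None not in r]


def transform(records):
    """Transformation: discretize the age into two bins."""
    return [("under_30" if age < 30 else "30_plus", product)
            for age, product in records]


def mine(records):
    """Data mining: count how often each (age group, product) pair occurs."""
    return Counter(records)


def evaluate(patterns, min_count=2):
    """Evaluation: keep only patterns seen at least min_count times."""
    return {p: n for p, n in patterns.items() if n >= min_count}


def run_kdd(records, min_count=2):
    """Run all five steps once, in order."""
    return evaluate(mine(transform(preprocess(select(records)))), min_count)
```

`run_kdd(RAW)` returns the frequent (age group, product) pairs that survive cleaning and thresholding. In a further iteration, such findings could feed back into the selection or transformation step, for example by choosing finer age bins.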
