The Customize Windows

Technology Journal

By Abhishek Ghosh February 21, 2021 6:44 pm Updated on February 21, 2021

What Does Data Cleansing Mean?

Data cleansing covers various methods for removing and correcting data errors in databases or other information systems. The errors may be, for example, incorrect (wrong from the start or outdated), redundant, inconsistent, or incorrectly formatted data. Key steps in data cleansing are duplicate detection (detecting and merging records that describe the same thing) and data fusion (merging and completing incomplete data). Data cleansing contributes to improving information quality. However, information quality also depends on many other characteristics of data sources (credibility, relevance, availability, costs, etc.) that cannot be improved by data cleansing.

 

Data Cleansing Process

 

The process of cleaning up the data is divided into five successive steps:

  • Make a backup copy of the file/table
  • Data Quality – Setting Data Requirements
  • Analysis of the data
  • Standardization
  • Cleanup of the data

Data Quality Requirements

High-quality, reliable data must meet certain requirements:

  • valid data: same data type, certain maximum values, etc.
  • complete data
  • uniform data: same unit (e.g. currency, weight, length)
  • integral data: the data must be protected from intentional and unintentional manipulation.
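As a rough illustration, the first three requirements can be written down as automated checks. The sketch below is only one possible way to do it: the order record, its field names, the maximum quantity and the allowed currency are all assumptions made for the example. Protection against manipulation (integrity) is an access-control matter and is not covered here.

```python
from datetime import date

# Hypothetical requirements for an illustrative "order" record.
MAX_QUANTITY = 1000              # assumed upper bound for a valid quantity
ALLOWED_CURRENCIES = {"EUR"}     # assumed single, uniform currency
REQUIRED_FIELDS = ("customer", "quantity", "currency", "order_date")


def check_record(record):
    """Return a list of requirement violations for one record (empty if none)."""
    problems = []

    # Complete: every required field is present and not empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"incomplete: {field} is missing")

    # Valid: expected data type and maximum value.
    quantity = record.get("quantity")
    if quantity is not None:
        if not isinstance(quantity, int) or isinstance(quantity, bool):
            problems.append("invalid: quantity is not an integer")
        elif not 0 < quantity <= MAX_QUANTITY:
            problems.append("invalid: quantity out of range")

    order_date = record.get("order_date")
    if order_date not in (None, "") and not isinstance(order_date, date):
        problems.append("invalid: order_date is not a date")

    # Uniform: all amounts use the same unit (here, the same currency).
    currency = record.get("currency")
    if currency not in (None, "") and currency not in ALLOWED_CURRENCIES:
        problems.append(f"not uniform: currency {currency}")

    return problems
```

Calling check_record on every record and collecting the non-empty results gives a simple checklist of what has to be fixed.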

Analysis of data

Once the requirements have been clarified, the data must be checked, for example with the help of checklists, to determine whether it is of the required quality.

Standardization of data before cleanup

For a successful cleanup, the data must first be structured and then standardized. Structuring brings the data into a uniform format. For example, a date is converted into a uniform date format (01.09.2009), or composite data is broken down into its components, such as a customer's name into the components Salutation, Title, First Name and Last Name. In most cases, such structuring is not trivial and is carried out with the help of complex parsers.
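The two structuring examples can be sketched in a few lines of Python. This is a deliberately simple illustration, not one of the complex parsers mentioned above: the list of accepted date formats and the salutation and title lists are assumptions, and real names (multi-part surnames, suffixes, other languages) need far more care.

```python
from datetime import datetime

# Assumed input formats; extend the list for the data at hand.
KNOWN_DATE_FORMATS = ("%d.%m.%Y", "%Y-%m-%d", "%d/%m/%Y")
SALUTATIONS = {"Mr.", "Mrs.", "Ms."}   # illustrative value lists
TITLES = {"Dr.", "Prof."}


def normalize_date(text):
    """Return the date as DD.MM.YYYY, or None if no known format matches."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%d.%m.%Y")
        except ValueError:
            continue
    return None


def split_name(full_name):
    """Split a name into salutation, title, first name and last name."""
    parts = full_name.split()
    salutation = parts.pop(0) if parts and parts[0] in SALUTATIONS else ""
    title_parts = []
    while parts and parts[0] in TITLES:
        title_parts.append(parts.pop(0))
    # The last remaining word is treated as the last name (a simplification).
    last_name = parts.pop() if parts else ""
    return {
        "salutation": salutation,
        "title": " ".join(title_parts),
        "first_name": " ".join(parts),
        "last_name": last_name,
    }
```

Values that normalize_date cannot parse come back as None, so they can be reported and handled in the cleanup step instead of being guessed.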

During standardization, the existing values are mapped to a standardized value list. This can be done, for example, for academic titles or company name suffixes.
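Mapping values to a standardized value list can be as simple as a lookup table. The sketch below assumes a small, hand-made list of title variants. The entries are illustrative only, and unknown values are returned as None so that they can be reviewed rather than silently changed.

```python
# Illustrative value list: each known variant maps to one standard value.
TITLE_VARIANTS = {
    "dr": "Dr.",
    "dr.": "Dr.",
    "doctor": "Dr.",
    "prof": "Prof.",
    "prof.": "Prof.",
    "professor": "Prof.",
}


def standardize(value, variants):
    """Map a raw value to its standard form; return None if it is unknown."""
    key = " ".join(value.lower().split())   # ignore case and extra spaces
    return variants.get(key)
```

The same function works for company name suffixes or any other field, given a suitable value list.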

Cleaning up data

There are six methods to clean up the data that can be applied individually or in combination:

  • Derive from other data: The correct values are derived from other data (e.g. the salutation from the gender).
  • Replace with other data: The corrupted data is replaced by other data (e.g. from other systems).
  • Use default values: Default values are used instead of the incorrect data.
  • Remove incorrect data: The data is filtered out and not processed further.
  • Remove duplicates: Duplicates are identified through duplicate detection, the non-redundant data is consolidated from the duplicates, and a single data set is formed from them.
  • Split merged data: In contrast to the removal of duplicates, data that was incorrectly merged is separated again.
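Four of these methods (removing incorrect data, removing duplicates, deriving values and using default values) are combined in the following sketch. It rests on several assumptions: the customer records are plain dictionaries with hypothetical field names, a record without a plausible e-mail address counts as incorrect, and two records are duplicates when their e-mail addresses match exactly after trimming and lowercasing. Real duplicate detection usually needs fuzzy matching.

```python
SALUTATION_BY_GENDER = {"f": "Ms.", "m": "Mr."}   # assumed mapping
DEFAULT_COUNTRY = "unknown"                        # assumed default value


def clean(records):
    """Filter, de-duplicate and complete a list of customer dictionaries."""
    merged = {}
    for rec in records:
        # Remove incorrect data: skip records without a usable e-mail address.
        email = (rec.get("email") or "").strip().lower()
        if "@" not in email:
            continue

        # Remove duplicates: records with the same e-mail are consolidated;
        # non-empty values fill the gaps of the record seen first.
        kept = merged.setdefault(email, {"email": email})
        for field, value in rec.items():
            if field != "email" and value and not kept.get(field):
                kept[field] = value

    for rec in merged.values():
        # Derive from other data: salutation from the gender.
        if not rec.get("salutation"):
            rec["salutation"] = SALUTATION_BY_GENDER.get(rec.get("gender"), "")
        # Use default values for a country that is still missing.
        if not rec.get("country"):
            rec["country"] = DEFAULT_COUNTRY
    return list(merged.values())
```

Defaults are applied only after the duplicates have been consolidated, so that a real value from a later duplicate is not blocked by a placeholder.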

Storage of the faulty data

Before cleaning up the data, you should save a copy of the original, erroneous data and not simply delete it after the cleanup. Otherwise, the changes could not be traced, and the process would not be audit-proof.

An alternative is to store the corrected value in an additional column. Because this requires additional disk space, the approach is recommended only when a few columns of a record need to be corrected. Another option is to store the corrected record in an additional row, which increases the storage requirement even more and is therefore practical only when a small number of records need to be corrected. The last option, for a large number of columns and rows to correct, is to create a separate table.
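The separate-table option could look roughly like the following sketch, which uses Python's built-in sqlite3 module with an in-memory database. The table and column names (customer, customer_correction, city) and the sample rows are made up for the example. The point is only that the original value stays in place while the correction is logged next to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT);
    -- Separate table: one row per corrected value, original data untouched.
    CREATE TABLE customer_correction (
        customer_id INTEGER REFERENCES customer(id),
        column_name TEXT,
        old_value   TEXT,
        new_value   TEXT
    );
    INSERT INTO customer VALUES (1, 'Muinch'), (2, 'Berlin');
""")


def record_correction(conn, customer_id, column_name, new_value):
    """Log a correction without overwriting the original value."""
    # column_name is interpolated into SQL, so accept only known columns.
    if column_name not in {"city"}:
        raise ValueError(f"unknown column: {column_name}")
    old_value = conn.execute(
        f"SELECT {column_name} FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO customer_correction VALUES (?, ?, ?, ?)",
        (customer_id, column_name, old_value, new_value),
    )


record_correction(conn, 1, "city", "Munich")

# Cleaned view: the corrected value where one exists, else the original.
rows = conn.execute("""
    SELECT c.id, COALESCE(k.new_value, c.city)
    FROM customer c
    LEFT JOIN customer_correction k
           ON k.customer_id = c.id AND k.column_name = 'city'
    ORDER BY c.id
""").fetchall()
```

Because the customer table itself is never updated, every correction can be reviewed or undone later, which is what keeps the process traceable.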

About Abhishek Ghosh

Abhishek Ghosh is a Businessman, Surgeon, Author and Blogger. You can keep in touch with him on Twitter - @AbhishekCTRL.

About This Article

Cite this article as: Abhishek Ghosh, "What Does Data Cleansing Mean?," in The Customize Windows, February 21, 2021, June 30, 2022, https://thecustomizewindows.com/2021/02/what-does-data-cleansing-mean/.

Source:The Customize Windows, JiMA.in
