The Customize Windows

Technology Journal

By Abhishek Ghosh February 21, 2021 6:44 pm Updated on February 21, 2021

What Does Data Cleansing Mean?

Data cleansing includes various methods for removing and correcting data errors in databases or other information systems. The errors may consist, for example, of incorrect (originally incorrect or outdated), redundant, inconsistent, or incorrectly formatted data. Key steps in data cleansing are duplicate detection (detecting and merging records that represent the same data) and data fusion (merging and completing patchy data). Data cleansing contributes to improving the quality of information. However, information quality also depends on many other characteristics of data sources (credibility, relevance, availability, costs, etc.) that cannot be improved by means of data cleansing.

 

Data Cleansing Process

The process of cleaning up the data is divided into five successive steps:

  • Make a backup copy of the file/table
  • Definition of the data quality requirements
  • Analysis of the data
  • Standardization
  • Cleanup of the data

Data Quality Requirements

High-quality and reliable data must meet certain requirements:

  • valid data: correct data type, values within defined limits (e.g. maximum values), etc.
  • complete data: no required values are missing
  • uniform data: same unit (e.g. currency, weight, length)
  • data with integrity: data must be protected from intentional and/or unintentional manipulation.
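As a rough illustration, the validity and completeness requirements above can be written down as executable checks. The following is a minimal Python sketch, not a definitive implementation: the field names (name, birth_date, weight_kg), the limit of 500 kg, and the DD.MM.YYYY date format are assumptions made for the example. Integrity (protection from manipulation) is an access-control concern and is not covered here.

```python
from datetime import datetime

# Hypothetical requirements for a customer record; the field names and
# limits are assumptions for illustration only.
REQUIRED_FIELDS = ("name", "birth_date", "weight_kg")
MAX_WEIGHT_KG = 500.0


def check_record(record):
    """Return a list of human-readable requirement violations."""
    problems = []

    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    # Validity: correct data type and value within the allowed range.
    weight = record.get("weight_kg")
    if weight not in (None, ""):
        if isinstance(weight, bool) or not isinstance(weight, (int, float)):
            problems.append("weight_kg is not a number")
        elif not 0 < weight <= MAX_WEIGHT_KG:
            problems.append("weight_kg out of range")

    # Validity: the date must parse in the agreed format.
    birth_date = record.get("birth_date")
    if birth_date not in (None, ""):
        try:
            datetime.strptime(birth_date, "%d.%m.%Y")
        except (TypeError, ValueError):
            problems.append("birth_date is not DD.MM.YYYY")

    return problems
```

A record that meets all requirements yields an empty list. Any other result names the rules that were violated, which can feed directly into the analysis step.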

Analysis of data

Once the requirements have been clarified, the data must be checked, for example with the help of checklists, to determine whether it is of the required quality.
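One simple way to support such a check is to profile the data before cleaning it. The sketch below is only an assumption about how this could look in Python: it counts missing values and distinct values per field, so that inconsistent spellings become visible. The function name profile and the sample field currency are hypothetical.

```python
from collections import Counter


def profile(records, fields):
    """Count missing values and distinct values per field."""
    report = {}
    for field in fields:
        values = [r.get(field) for r in records]
        # Treat None and the empty string as missing.
        missing = sum(1 for v in values if v in (None, ""))
        distinct = Counter(v for v in values if v not in (None, ""))
        report[field] = {"missing": missing, "distinct": dict(distinct)}
    return report


rows = [{"currency": "EUR"}, {"currency": "eur"}, {"currency": ""}, {}]
report = profile(rows, ["currency"])
```

In this example the report shows two missing values and two spellings of the same currency, which points to a uniformity problem that the later steps need to handle.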

Standardization of data before cleanup

For a successful cleanup, the data must first be standardized. For this purpose, the data is first structured and then standardized. Structuring brings the data into a uniform format: for example, a date is brought into a uniform date format (01.09.2009), or composite data is broken down into its components, e.g. a customer’s name into the components Salutation, Title, First Name and Last Name. In most cases, such structuring is not trivial and is carried out with the help of complex parsers.
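The two structuring examples can be sketched in a few lines of Python. This is a deliberately simple illustration and far from the complex parsers used in practice: the list of accepted input date formats and the sets of salutations and titles are assumptions, and the name splitter treats the last token as the last name, which does not hold for every name.

```python
from datetime import datetime

# Input formats assumed to occur in the source data; extend as needed.
KNOWN_DATE_FORMATS = ("%d.%m.%Y", "%Y-%m-%d", "%d/%m/%Y")
SALUTATIONS = {"Mr", "Mrs", "Ms"}
TITLES = {"Dr", "Prof"}


def to_uniform_date(text):
    """Return the date as DD.MM.YYYY, or None if no known format matches."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%d.%m.%Y")
        except ValueError:
            continue
    return None


def split_name(full_name):
    """Split a name into salutation, title, first name, and last name."""
    tokens = [t.rstrip(".") for t in full_name.split()]
    parts = {"salutation": "", "title": "", "first_name": "", "last_name": ""}
    if tokens and tokens[0] in SALUTATIONS:
        parts["salutation"] = tokens.pop(0)
    if tokens and tokens[0] in TITLES:
        parts["title"] = tokens.pop(0)
    if tokens:
        # Simplification: the last remaining token is the last name.
        parts["last_name"] = tokens.pop()
    parts["first_name"] = " ".join(tokens)
    return parts
```

For instance, to_uniform_date("2009-09-01") returns "01.09.2009", and split_name("Mr. Dr. John Smith") yields the four components separately.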

During standardization, the existing values are mapped to a standardized value list. This can be done, for example, for academic titles or company name suffixes.
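Mapping to a standardized value list is essentially a lookup. The following Python sketch shows the idea under stated assumptions: the two value lists are hypothetical and would have to be replaced by lists that fit the actual data, and the company helper only looks at the last word of the name.

```python
# Hypothetical standardized value lists; real lists depend on the data set.
TITLE_MAP = {"dr": "Dr.", "doctor": "Dr.", "prof": "Prof.", "professor": "Prof."}
COMPANY_SUFFIX_MAP = {
    "inc": "Inc.",
    "incorporated": "Inc.",
    "ltd": "Ltd.",
    "limited": "Ltd.",
}


def standardize(value, mapping):
    """Map a raw value to its standardized form, or return it unchanged."""
    key = value.strip().rstrip(".").lower()
    return mapping.get(key, value.strip())


def standardize_company(name):
    """Standardize only the trailing suffix of a company name."""
    head, _, tail = name.strip().rpartition(" ")
    if not head:
        # Single-word name: there is no suffix to standardize.
        return name.strip()
    return f"{head} {standardize(tail, COMPANY_SUFFIX_MAP)}"
```

Values that are not in the list are returned unchanged rather than guessed, so they remain visible for manual review.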

Cleaning up data

There are six methods to clean up the data that can be applied individually or in combination:

  • Derive from other data: The correct values are derived from other data (e.g. the salutation from the gender).
  • Replace with other data: The corrupted data is replaced by other data (e.g. from other systems).
  • Use default values: Default values are used instead of the incorrect data.
  • Remove incorrect data: The data is filtered out and not processed further.
  • Remove duplicates: Duplicates are identified through duplicate detection, the non-redundant data is consolidated from the duplicates, and a single data set is formed from them.
  • Split summary: In contrast to the removal of duplicates, incorrectly summarized data is separated again.
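Several of the methods above can be illustrated with short Python functions. This is a sketch under assumptions, not a complete cleanup tool: the gender-to-salutation mapping and the default country value are hypothetical, and duplicates are detected by exact match on a normalized key, which is much simpler than real duplicate detection.

```python
DEFAULT_COUNTRY = "unknown"  # assumed default value
SALUTATION_BY_GENDER = {"female": "Ms", "male": "Mr"}  # assumed mapping


def derive_and_default(record):
    """Derive a missing salutation from gender and apply a default value."""
    cleaned = dict(record)
    if not cleaned.get("salutation"):
        cleaned["salutation"] = SALUTATION_BY_GENDER.get(cleaned.get("gender"), "")
    if not cleaned.get("country"):
        cleaned["country"] = DEFAULT_COUNTRY
    return cleaned


def remove_incorrect(records, is_valid):
    """Filter out records that fail the given validity check."""
    return [r for r in records if is_valid(r)]


def merge_duplicates(records, key_fields):
    """Group records by a normalized key and consolidate each group."""
    merged = {}
    for record in records:
        key = tuple(str(record.get(f) or "").strip().lower() for f in key_fields)
        target = merged.setdefault(key, {})
        for field, value in record.items():
            # Keep the first non-empty value seen for each field.
            if value not in (None, "") and target.get(field) in (None, ""):
                target[field] = value
    return list(merged.values())


people = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": ""},
    {"name": "ada lovelace ", "email": "ADA@example.com", "phone": "123"},
]
consolidated = merge_duplicates(people, ["name"])
```

Here the two records are recognized as the same person and consolidated into one record that keeps the first name and email and fills in the phone number from the second record.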

Storage of the faulty data

Before cleaning up the data, you should save the original, erroneous data as a copy and not simply delete it after the cleanup. Otherwise, the adjustments would not be traceable, and the process would not be audit-proof.

An alternative is to store the corrected value in an additional column. Because additional disk space is required, this approach is recommended only when a few columns in a record need correction. Another option is to store the corrected record in an additional row, which increases the storage requirement even more and is therefore practical only when a small number of records need correction. When a large number of columns and rows need correction, the last option is to create a separate table.
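The separate-table option can be sketched with Python's built-in sqlite3 module. The table names customers and corrections, their columns, and the helper apply_correction are assumptions made for this example. The idea is simply that every correction records the old and the new value before the data is changed.

```python
import sqlite3


def apply_correction(conn, row_id, column, new_value):
    """Correct one value and log the original in a separate table."""
    # Column names cannot be bound as parameters, so allow only known ones.
    if column not in ("name", "city"):
        raise ValueError(f"unexpected column: {column}")
    old_value = conn.execute(
        f"SELECT {column} FROM customers WHERE id = ?", (row_id,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO corrections (row_id, column_name, old_value, new_value) "
        "VALUES (?, ?, ?, ?)",
        (row_id, column, old_value, new_value),
    )
    conn.execute(
        f"UPDATE customers SET {column} = ? WHERE id = ?", (new_value, row_id)
    )
    conn.commit()


# In-memory database with one faulty record, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute(
    "CREATE TABLE corrections (row_id INTEGER, column_name TEXT, "
    "old_value TEXT, new_value TEXT)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'Lodnon')")
apply_correction(conn, 1, "city", "London")
```

After the call, the customers table holds the corrected city, while the corrections table still contains the original faulty value, so the change remains traceable.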

About This Article

Cite this article as: Abhishek Ghosh, "What Does Data Cleansing Mean?," in The Customize Windows, February 21, 2021, March 22, 2023, https://thecustomizewindows.com/2021/02/what-does-data-cleansing-mean/.
