Information extraction (IE) is the engineering application of methods from practical computer science, artificial intelligence and computational linguistics to the problem of automatically processing unstructured information in order to acquire knowledge about a pre-defined domain. A typical example is the extraction of information about merger events, where instances of the relation merge(company1, company2, date) are extracted from online messages. Information extraction is of great importance because a lot of information, for example on the Internet, is available only in unstructured (not relationally modeled) form, and information extraction makes this knowledge more accessible.
Information extraction could also be described as targeted text extraction. Information extraction systems are always tailored to at least one specific subject area, and usually to particular areas of interest (scenarios) within a more general field (domain). For example, in the domain ‘Business News’, a possible scenario would be ‘Personnel change in a management position’.
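The merger example can be made concrete with a minimal sketch. It assumes that merger reports follow one fixed sentence shape ("X merged with Y on <date>"). The pattern, the sample sentence and the company names are hypothetical, and a real system would need far more robust linguistic analysis than a single regular expression.

```python
import re
from typing import List, NamedTuple


class Merge(NamedTuple):
    """One instance of the relation merge(company1, company2, date)."""
    company1: str
    company2: str
    date: str


# Hypothetical surface pattern: capitalized company names around
# "merged with", followed by a date such as "3 March 2021".
COMPANY = r"[A-Z][\w&]*(?: [A-Z][\w&]*)*"
MERGE_PATTERN = re.compile(
    rf"(?P<company1>{COMPANY}) merged with "
    rf"(?P<company2>{COMPANY}) on "
    r"(?P<date>\d{1,2} [A-Z][a-z]+ \d{4})"
)


def extract_mergers(text: str) -> List[Merge]:
    """Return every merge relation instance found in unstructured text."""
    return [
        Merge(m["company1"], m["company2"], m["date"])
        for m in MERGE_PATTERN.finditer(text)
    ]


message = "Analysts were surprised when Acme Corp merged with Globex Inc on 3 March 2021."
print(extract_mergers(message))
```

Sentences that do not match the assumed shape yield no relation instances, which shows how narrowly such a system is tied to its scenario.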

Possible Applications of Information Extraction
Information extraction is an independent field of research and must be distinguished from related fields. Text extraction aims at a comprehensive summary of the content of a text. Comprehensive automatic text summarization is problematic because even human readers will never agree completely on the most important aspects of a text unless it has been specified in what respect the information should be important. Text clustering means the automatic grouping of texts without predefined categories, whereas text classification means assigning texts to predefined groups. Information retrieval can refer to the search for documents within a set of documents (full-text search) or, taken literally, to the more general task of retrieving information. Data mining generally refers to the “process of recognizing patterns in data”.
---
In general, two types of application can be distinguished. On the one hand, the extracted data can be intended directly for a human reader. Examples include a system that, for test purposes, forwards information extracted from e-mails as SMS, or a system that displays information extracted from the hits of a search engine, such as the positions offered in job advertisements.
On the other hand, the data can be intended for further machine processing, whether for storage in databases, for text categorization or classification, or as a starting point for comprehensive text extraction. If the information sought consists of several individual pieces of information, the area of application imposes certain requirements on the information extraction system: for machine processing the information must be available in structured form, whereas an unstructured result can be sufficient when it is processed directly by humans.
If the information sought does not consist of further individual pieces of information, as in the case of proper name recognition (named-entity recognition), this distinction is superfluous.
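The difference between structured and unstructured results can be illustrated with a small sketch based on the job advertisement example. The `JobOffer` record, the pattern and the sample sentence are assumptions made for illustration only; they do not describe any particular system.

```python
import re
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class JobOffer:
    """Structured result: one slot per individual piece of information."""
    position: str
    location: str


# Hypothetical pattern for sentences like "... seeking a <position> in <City>".
JOB_PATTERN = re.compile(
    r"seeking an? (?P<position>[\w ]+?) in (?P<location>[A-Z]\w+)"
)


def extract_structured(sentence: str) -> Optional[JobOffer]:
    """Result for machine processing, e.g. storage in a database."""
    m = JOB_PATTERN.search(sentence)
    return JobOffer(m["position"], m["location"]) if m else None


def extract_unstructured(sentence: str) -> Optional[str]:
    """Result for a human reader: the relevant sentence, passed on unchanged."""
    return sentence if JOB_PATTERN.search(sentence) else None


ad = "We are seeking a senior data engineer in Berlin."
print(asdict(extract_structured(ad)))  # separate fields, ready for a database row
print(extract_unstructured(ad))        # plain text, sufficient for a reader
```

Both functions find the same information. Only the structured variant separates the individual pieces so that a program can use them without reading the sentence again.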
Conclusion
Information extraction systems can be used for a variety of tasks, from the automatic analysis of job advertisements to the preparation of a general text extraction. Depending on the requirements, the systems deliver structured or unstructured results. They also differ greatly in linguistic depth, from extraction by targeted summarization with pure sentence filtering, where the only semantic guidance is a word list, to systems with analysis modules for all levels of language (phonology, morphology, syntax, semantics, possibly also pragmatics).
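The shallowest approach, pure sentence filtering against a word list, can be sketched in a few lines. The word list below is a hypothetical one for the scenario ‘Personnel change in a management position’, and the sentence splitting is deliberately naive.

```python
import re
from typing import List, Set

# Hypothetical word list for the scenario
# 'Personnel change in a management position'.
WORD_LIST = {"appointed", "resigned", "successor", "ceo", "chairman"}


def filter_sentences(text: str, word_list: Set[str] = WORD_LIST) -> List[str]:
    """Keep only the sentences that contain at least one word from the list."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = []
    for sentence in sentences:
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        if tokens & word_list:
            kept.append(sentence)
    return kept


report = (
    "The weather was mild. "
    "Jane Doe was appointed CEO of Acme. "
    "Shares rose slightly."
)
print(filter_sentences(report))
```

No morphological, syntactic or semantic analysis takes place here. The result is an unstructured, targeted summary whose quality depends entirely on the word list.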
In some areas, our limited understanding of how natural language works causes development to stagnate. However, because information extraction is a more limited task than complete text comprehension, solutions that meet the requirements are often possible in the sense of “appropriate language engineering”, perhaps especially in combination with neighboring fields.