Ubiquitous networking over the Internet means that large volumes of data, both structured and unstructured, are exchanged between devices. Most business-driven companies now focus on more efficient process design and on profit maximization. To position the company and its specially developed products better on the market, data is collected and evaluated in order to respond more precisely to customer wishes and requirements; real-time availability and time-to-market are key terms here. Meeting these requirements efficiently and completely calls for the evaluation of both structured and unstructured data. The analysis of unstructured data plays an even more important role than that of structured data, which is already topically meaningful. By relating and categorizing unstructured data, new connections and insights can be gained that are not directly apparent.
Because companies hope to gain added value and competitive advantages from these insights, big data methods are now represented in almost every area of life and of the market. This ubiquity raises a number of questions, among them: is big data the same in every market segment and area of life, or does each segment require its own approach?
Not every market segment places the same requirements on big data. Depending on the industry in which big data is used, the evaluation scenarios may differ in their requirements or weighting. The accurate collection and provision of large data volumes is therefore a significant requirement that a service user places on a service provider or on a big data method. To meet these requirements reliably, the application and market area should be clearly defined and its, in part special, demands known.
The term big data is widespread nowadays and is often associated with the evaluation of huge amounts of data. While this association is correct, it does not do justice to the term's versatility. The technical possibilities and the drive to network all electronic means of communication with one another produce increased data traffic containing much partly unstructured data that makes no sense at first glance. With big data mechanisms, measurable parameters can be evaluated objectively, making connections and changes visible.
The focus of this series of articles is the structuring of the different topics of big data; it is intended to illustrate the variety of application areas. To begin with, basic topics that serve the understanding of the subject are described; the topic of big data is then subdivided into three logically connected application areas, each illustrated with examples. We do have an in-depth article on the basics of big data, but it is too extensive for the present series.
In presenting each outline point, it should become clear how the selected example is assigned to the respective field of application, and that the topic can be categorized into many logical fields of application. The explanatory notes indicate which data are useful in which application areas. The presentation covers a cross section of these topics; there is no critical evaluation of the individual subtopics, but the examples do raise some opportunities and risks that are necessary for the overall picture.
After the introduction, the definitions and basics of big data follow. The different strategies of big data are then discussed, and the series concludes with the market suitability of big data: whether the topic is relevant in every area and how pronounced big data evaluations are. Due to the rapid pace of technological advance, differences in media usage across generations are to be expected, so the use of big data will depend on the age structure of the target group. The structuring has therefore been done in three age groups, and the big data methods encountered are assigned to the respective age groups in a reasoned way.
It should be clear that the needs of individuals have an impact on public life. The collection and analysis of data that we would today associate with big data occurred much earlier in human history.
In World War II, large volumes of data were already being evaluated. Military messages were encrypted, but the encryption could be circumvented by finding the initial position of the encryption rotors. With the invention of the Colossus machine, 5,000 characters per second could be processed by reading in data automatically.
Over the course of the twentieth century, the volumes of data to be processed grew ever larger, and new inventions were needed to meet the new requirements. In 1997, NASA scientists Michael Cox and David Ellsworth were working on a new system because conventional computer systems could no longer handle the huge amounts of data involved; the problem of big data is described explicitly in a report by the two scientists. This report gave the problem of huge data volumes and their processing a common name. Today, big data is closely associated with information technology, as large companies such as Cisco and SAP address these issues; Cisco, for example, predicted that Internet traffic would double every two years. Since the turn of the 21st century, the fields of application of big data projects have become ever wider and more diverse. Managers in a wide range of companies recognize the importance of such data analysis for their business and the competitive advantages that can be derived from it. Because the data can be used in so many ways, the fields of application have multiplied, which can redefine job descriptions and make analysts more important to companies.
In the historical context, it can therefore be said that the evaluation of data for new insights existed long before the term itself. Today, big data is repeatedly associated with other fields of application, which also makes it harder to delimit. Since 2008, the term has been closely linked to the IT environment; the many computer systems needed for such analyses explain this association, but big data offers far more applications than IT analytics alone.
Big data literally means “large amounts of data”. One might conclude from this that merely storing large amounts of data gives the technology its name. Far more important, however, is the added value of big data analyses: the evaluation and the newly gained findings, which can be used in different ways in different industries. Big data itself is therefore defined not by the size or quantity of the stored data, but by the amount of data used to reach a result, one that could be achieved only with this much data and not with less.
The corporate strategies behind such big data analytics are not uniform across industries. Companies in the banking and insurance industry are concerned with risk assessment and fraud detection: patterns of past fraud should be detected earlier so that, ideally, fraud can be prevented entirely. For companies in the automotive industry, the analysis of data streams is in the foreground; production plants are monitored in order to be able to act faster if necessary. Even within a single company, different big data strategies can be found. A marketing department, for example, is more interested in the evaluation of social media and forum contributions in order to derive trends and requirements at an early stage.
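The fraud-detection use case mentioned above can be sketched with a toy example. This is an illustrative simplification, not any bank's actual method, and the transaction amounts are invented: a transaction is flagged when its amount deviates strongly from the mean of the observed amounts.

```python
# Toy fraud-detection sketch: flag transactions whose amount has a
# z-score above a threshold. Real systems use far richer features;
# this only illustrates the idea of detecting deviations in a data stream.
from statistics import mean, stdev

def flag_outliers(amounts, threshold=2.5):
    """Return indices of amounts whose z-score exceeds the threshold."""
    mu = mean(amounts)
    sigma = stdev(amounts)
    return [i for i, a in enumerate(amounts)
            if sigma > 0 and abs(a - mu) / sigma > threshold]

# Hypothetical card transactions in euros; the last one is anomalous.
transactions = [12.5, 40.0, 23.9, 18.2, 35.0, 27.4, 9.99, 31.0, 5000.0]
print(flag_outliers(transactions))  # flags index 8, the 5000.0 entry
```

A fixed z-score threshold is the simplest possible rule; the point is only that an automated analysis of the transaction stream can surface candidates for review much earlier than a manual audit.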
Despite the differences between industries, there are common components that generally make up big data analytics. In discussions of big data, five terms beginning with the letter “V” recur:
- Variety : “Variety” stands for the type of data used. Not every piece of stored information is of the same quality, and nothing can be generalized, because different analyses have different requirements. Data types are distinguished as structured, semi-structured and unstructured.
Structured data is characterized by a uniform structure and can be assigned easily. Unstructured data complicates automatic analysis because it cannot be categorized and cannot be placed in a semantic context at first glance. The vast majority of data available in companies is unstructured; during content processing, unstructured data is automatically transformed into a structured form. Semi-structured data contains both structured and unstructured information. A conventional e-mail is a good example: while the header has a structure with, among other things, sender and recipient lines, the actual e-mail message (the body) is unstructured.
- Velocity : Velocity describes the speed at which evaluated data becomes available. Parallel processing will be required in the future to ensure real-time analysis, so that data is available immediately rather than with a delay. For some data, however, real-time processing cannot be guaranteed, because the amount of data is very large and not all of it fits into a given grid during the initial analysis.
- Veracity : Veracity stands for the quality of the requested or evaluated data. The higher the quality of the data, the more meaningful it is and the better the findings can be put into practice. When evaluating user contributions in social media, for example, subjective moods can change the orientation of the contributions and thus alter the analysis result.
- Volume : This is the objective amount of data, expressed in a unit of data volume, that is stored and used for big data analysis. It can quickly reach the petabyte range or beyond. IBM is currently working on projects that will later be able to store and evaluate twice the current daily traffic of the entire Internet.
- Value : This is the real added value that is to be created through analysis and evaluation. Value describes the generation of new findings through the combination of data sources. The data cannot only be used internally but can also be offered to third parties for sale, creating new business models: a company generates profit not only by selling products but also by selling the generated data.
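The e-mail example given under “Variety” can be made concrete with a short sketch using Python's standard `email` library. The addresses and message text are invented; the point is only to show that the header fields parse as structured key/value pairs while the body remains free text.

```python
# Semi-structured data: an e-mail's header is structured (key/value
# fields such as From, To, Subject), while its body is unstructured text.
from email.parser import Parser

raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Quarterly report\n"
    "\n"
    "Hi Bob, the numbers look good this quarter. Talk soon!\n"
)

msg = Parser().parsestr(raw)
print(msg["From"])                # structured field: alice@example.com
print(msg["Subject"])             # structured field: Quarterly report
print(msg.get_payload().strip())  # unstructured body text
```

An analysis pipeline can query the structured fields directly (sender, date, subject), whereas the body would first have to be transformed into a structured form, for example by text mining, as described above.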
This ends the first part. The second part discusses the marketability of big data and its applications in selected sectors.
Illustration reproduced in unmodified form for academic purposes from https://www.semanticscholar.org/paper/Real-time-processing-technologies-in-big-data%3A-Ibtissame-Yassine/19545841f6810cde52b0ab5bbcbe1a8a474ba7fe