In the first three parts on the fields of application of Big Data, we discussed the topics that fall under the heading “Business Life Application”: (i) research and product development, (ii) financial and risk controlling, (iii) production, (iv) marketing and sales, and (v) distribution and logistics. In this part, we discuss the use cases related to personal life, including employment.
Children and adolescents
Even though children and adolescents are not yet fully able to work, they hold a significant share of purchasing power and are a large and promising target group, for big data methods in particular. Big Data and the related field of Social Media Analytics are especially relevant to this age segment because of its high online density. Social media differs from conventional mass media, such as radio and TV, in that it takes place exclusively digitally. An online study from 2014 reports that 100% of 14- to 19-year-olds are online regularly; 17 years earlier, the figure in this age segment was still around 6%. This growth is due to increasing digitization and a lower barrier to entry compared to traditional communication media.
It can be deduced that children and adolescents are best reached online today. Their digital presence produces usage data that can be evaluated so that administration interfaces and marketing strategies can be aligned more effectively with the results. The different ways in which users can interact on different social platforms can also be evaluated differently: a click on a Like button on Facebook can be rated differently than a self-written comment, which is unstructured content. The Facebook data shown in the graph give a numerical overview of these different kinds of use.
Social Media Analytics analyzes collected data and digital usage information in order to improve online presence and communication with potential customers on digital platforms. In social media, the same platforms come up again and again: those on which most users are registered, or those that stand out as innovations in viral media. Brian Solis, an analyst in the digital industry, shows in a chart that the range of digital platforms is far larger and can be categorized more finely. The chart provides a map-like overview of the communication channels in social media. It divides typical user behavior into three activities that characterize everything people do on the Internet: Listening, Learning and Adapting. It assigns the different industries to these activities and shows in which areas of a person’s life they can be used. This creates fields to which the individual online services and providers can be assigned.
The reason for the wide range of digital platforms is that nearly half of children and adolescents are registered on six or more platforms and are regularly active there. However, this number does not mean that they subjectively neglect their privacy. On the contrary, Internet users are calling for privacy to be treated online in the same way as it is for providers and users outside the Internet. In addition, the majority of users are aware of the potential risks of the Internet and are accordingly careful about entering personally identifying data.
Each individual area of social media can be analyzed separately. Although every platform produces usage data, the nature and extent of that data differ from platform to platform. Because usage behavior differs between platforms, cross-platform analysis can yield insights that a single platform cannot; these are usually more promising and meaningful than evaluations of just one digital platform. The average active user base of 936 million users, published by Facebook in March 2015 as part of its quarterly report, gives a rough idea of how much user data would be generated even if each user posted only a single status message per day. In addition, according to Figure 5, every 20 minutes users produce about 10 million comments, 8 million clicks on the “Like” button and almost 3 million uploaded photos. This underlines the enormous potential of Social Media Analytics and of appropriate responses to its results.
With such a high volume of data, there is a high probability that even unstructured data can be used to gain new insights for specific requirements. Regarded as an individual, the user shares some identifying data in his profile. Such data may include private details and preferences, such as preferred sports or musical genres. The user may also publish news, statements about personal well-being or other content as an exercise of freedom of expression. In addition, the user’s clicks on the “Like” button allow content to be presented in a personalized way. For children and adolescents, advertising can be targeted because a connection can be found between the user’s profile and the things he has liked. Users have the opportunity to get attention and to voice their opinions on certain topics. Since children and adolescents in particular have an increased need for communication, certain products can be positioned through targeted marketing.
Employment & Security
Many people of working age today are digitally networked. They tend to have a greater need to conduct digital transactions than people who are not of working age or who are dependents. In certain situations, the user is not aware that these transactions rely on digital connections that can be manipulated. Fraudsters can easily exploit this, for example to obtain data from third parties or to deliberately deceive someone. Fraud Detection deals with such fraud attempts in everyday life, and fraud prevention consists of taking precautionary measures against them. The thematic range of Fraud Detection is very broad: “insurance companies, online dating agencies, social networks and in principle all platforms on which people can either disguise their identity or skillfully manipulate through various steps” are at risk of fraud.
Analyzing the collected data is meant to reveal anomalies that make it more likely that a possible fraud is uncovered. Through systematic analysis, known patterns are recognized and stored. For a known procedure to be identified as fraud, procedures are categorized so that certain patterns can afterwards be assigned to the category “fraud”. When private individuals or end users are deceived, a company’s economic goals, and in extreme cases its continued existence, are at risk: a user who becomes aware of fraud in connection with a completed transaction could turn away from the company’s offers or products.
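The combination of anomaly detection and known patterns described above can be sketched in a few lines. This is only an illustration under stated assumptions: the transaction fields (`amount`, `card_country`, `ip_country`, `seconds_since_last`), the two example rules and the z-score threshold are hypothetical and do not come from any real fraud detection system.

```python
from statistics import mean, stdev

# Hypothetical "known patterns": each name maps to a rule over one transaction.
KNOWN_PATTERNS = {
    "country_mismatch": lambda tx: tx["card_country"] != tx["ip_country"],
    "rapid_repeat": lambda tx: tx["seconds_since_last"] < 5,
}


def amount_is_anomalous(amount, history, threshold=3.0):
    """Return True if the amount lies more than `threshold` standard
    deviations away from the mean of the user's past amounts."""
    if len(history) < 2:
        return False  # not enough data to judge
    sigma = stdev(history)
    if sigma == 0:
        return amount != history[0]
    return abs(amount - mean(history)) / sigma > threshold


def classify(tx, history):
    """Assign a transaction to 'fraud', 'review' or 'ok'.

    Two or more matched indicators count as 'fraud', one as 'review'.
    This cut-off is an assumption made for the sketch.
    """
    matched = [name for name, rule in KNOWN_PATTERNS.items() if rule(tx)]
    if amount_is_anomalous(tx["amount"], history):
        matched.append("unusual_amount")
    if len(matched) >= 2:
        category = "fraud"
    elif matched:
        category = "review"
    else:
        category = "ok"
    return category, matched
```

A call such as `classify(tx, past_amounts)` returns both the category and the list of matched indicators, so that a stored pattern can later be traced back to the rule that triggered it.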
Rigidly defined fraud indicators, which leave little room for dynamic adjustment, are to be replaced by near-real-time analysis, which allows a much faster reaction to change. The added value lies in detecting a fraudulent situation early on the basis of known patterns, so that countermeasures can be taken before the fraud is carried out and causes economic damage. Fraud Detection tries to detect as many of these fraud attempts as possible. It cannot be expected, however, that all fraud will be recognized, because no criterion other than biometric data can uniquely identify a human being. Thus, there will always be ways to assume a false identity digitally.
The phrase “Content is King”, coined by Bill Gates in 1996, means that in the digital age the content itself is what matters most to the user. Nineteen years later, however, this guiding principle may have lost some of its importance. Today it is not only the content that is crucial; the content also has to be placed in a meaningful everyday context to satisfy users. Working people in particular, who travel a lot and rely on geographical data in certain situations, will demand such a context for their data.
Geographical positioning is now technically possible on any smartphone through its built-in transmitter and receiver modules, via GPS. GPS also helps to meet the demand for usage-based content: “via GPS, the customer is navigated in the selected supermarket on request, until he stands in front of the correct shelf”. What the user or buyer in the supermarket needs at that moment can be determined with an increased hit rate by analyzing data. Location-based services help to structure information and bring it into a meaningful context. An example of such a service is the smartphone app from Foursquare. The number of check-ins allows conclusions about usage behavior: it has increased sixfold, from one to six million.
A check-in is the virtual counterpart of a real geographic place where users can register their presence. If a check-in location does not yet exist, users can create it themselves. By using such apps, customers can, on the one hand, show in social networks where they currently are and thus share this with their digital contacts. On the other hand, some providers offer discount and coupon promotions for check-ins, since a check-in can mean advertising both for the app operator and for the owner of the location. Both the number of check-ins at a location and their timing yield statistics on visitor flows, which make it possible to predict the number of visitors depending on the time of day. Users in turn benefit from such apps in the form of special daily or time-of-day offers. The number of check-ins at Foursquare over the past three years indicates that more and more users are using such providers. This use affects both business and private life, because a user can just as easily check in at a restaurant during a business lunch.
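The visitor-flow statistics mentioned above can be illustrated with a small sketch that counts check-ins per hour of the day for one location. It assumes that check-ins are available as ISO 8601 timestamp strings; this format and the function names are assumptions made for the example, not the data model of Foursquare or any other provider.

```python
from collections import Counter
from datetime import datetime


def checkins_per_hour(timestamps):
    """Count check-ins for each hour of the day (0 to 23).

    `timestamps` is an iterable of ISO 8601 strings such as
    '2015-03-02T12:15:00' (an assumed input format).
    """
    hours = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return {hour: hours.get(hour, 0) for hour in range(24)}


def busiest_hours(timestamps, top=3):
    """Return the `top` hours with the most check-ins, busiest first."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [hour for hour, _ in hours.most_common(top)]
```

Aggregated over several weeks, such hourly counts could serve as a simple baseline forecast of how many visitors to expect at a given time of day.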
In this context, Big Data does not just mean evaluating data from a single smartphone app. Rather, it means putting the data into context, for example by linking it with the user’s social media contributions. Based on previous check-ins by friends from the user’s social media friends list, leisure activities in the surrounding area can be suggested. If the mood expressed in the user’s contributions is also evaluated, it can be deduced whether the user is currently looking for something calm or something sociable. The benefits of these evaluation options go in two directions: “at the end of this process, not only is there a satisfied customer, but also a unique set of new usage data, on the basis of which predictive analyses are possible”. Thus, by analyzing GPS data, forecasts can be made about where potential customers will cluster geographically in relation to specific locations.
By 2019, data throughput is expected to have grown exponentially to as much as 135 exabytes per month. This indicates that the number of people with access to the Internet will increase immensely in the future.
Many non-analysts may ask themselves at this point who has an interest in analyzing whether an Internet user gets from website A to website C via website B or takes the direct route from A to C. However, numerous groups are interested in such results and can benefit from them. In short, all companies want to make their processes more efficient, and those with their own website are especially keen to keep as much as possible of the 166-minute residence time on the Internet mentioned above on their own online presence and not lose users to other websites. The trend in website design is moving from standardized websites to personalized websites, where the customer is addressed directly with advertising. To gain such insights and to be able to personalize websites, the Internet behavior of users has to be examined. Two terms in particular should be explained in more detail: cookies and clickstream.
Cookies are small files stored in the Internet browser that make it possible to recognize the same user again, including on another website. This allows an identity to be traced across various websites.
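How a cookie carries an identifier can be sketched with Python’s standard `http.cookies` module. The cookie name `visitor_id`, the 30-day lifetime and both function names are hypothetical choices for this example; real tracking setups vary widely.

```python
import uuid
from http.cookies import SimpleCookie


def issue_visitor_cookie(visitor_id=None):
    """Build the value of a Set-Cookie header that stores a visitor ID.

    A random ID is generated if none is given. The name 'visitor_id'
    and the 30-day lifetime are assumptions made for this sketch.
    """
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id or uuid.uuid4().hex
    cookie["visitor_id"]["path"] = "/"
    cookie["visitor_id"]["max-age"] = 60 * 60 * 24 * 30
    return cookie["visitor_id"].OutputString()


def read_visitor_id(cookie_header):
    """Extract the visitor ID from a request's Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel is not None else None
```

The server sends the header once, and the browser returns the same identifier with every later request to that domain. Recognition across different websites typically relies on the same third party being embedded on each of them.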
In addition to cookie tracking, there is clickstream analysis, which can be described as a “very precise indicator of the behavior of a specific user segment”. Clickstream analysis tries to group behavior and user interaction on web pages in order to create specific user groups that can be selected according to certain characteristics. With the help of clickstream analysis, “beaten tracks can be found on which the Internet users move through the web page structure more often than usual”.
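Finding such “beaten tracks” can be illustrated by counting how often visitors move directly from one page to another. The sketch assumes that each session is already available as an ordered list of page names; the function names are hypothetical.

```python
from collections import Counter


def transition_counts(sessions):
    """Count direct page-to-page moves across all sessions.

    `sessions` is an iterable of page lists, each in visiting order.
    """
    counts = Counter()
    for pages in sessions:
        # Pair every page with the page visited right after it.
        counts.update(zip(pages, pages[1:]))
    return counts


def beaten_tracks(sessions, top=3):
    """Return the `top` most frequent transitions with their counts."""
    return transition_counts(sessions).most_common(top)
```

Applied to the earlier question of whether users go from site A to site C directly or via site B, the counts for `("A", "C")` and for `("A", "B")` followed by `("B", "C")` give a first answer.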
A well-known tool for clickstream analysis is the freely available Apache Hadoop programming framework. The data analyzed and correlated with Apache Hadoop are semi-structured log files that users generate when they visit a web page. These log files, like the cookies mentioned above, make the visitor of the website clearly identifiable.
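What “semi-structured log files” look like in practice can be shown with a plain Python sketch; it does not use Hadoop itself. It assumes that the web server writes its access log in the Common Log Format and uses the client host as a stand-in for the visitor, which is a simplification. The function name and the choice to keep only successful requests are assumptions for the example.

```python
import re
from collections import defaultdict

# Common Log Format (assumed):
# host ident authuser [timestamp] "METHOD path protocol" status bytes
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def clickstreams(lines):
    """Group successfully requested paths by client host.

    Lines that do not match the assumed format, and requests with a
    status other than 200, are skipped. Paths keep their log order.
    """
    streams = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match["status"] == "200":
            streams[match["host"]].append(match["path"])
    return dict(streams)
```

A framework such as Hadoop would distribute this kind of grouping across many machines, but the underlying step, turning raw log lines into one click sequence per visitor, is the same.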
As a result, cookies and clickstream analysis are big data analytics tools that can be used to evaluate user behavior from the unstructured content that users inevitably leave behind on a web page. In an online shop, the exact path to buying a product, including any detours, can be traced. If the path to purchase is too long, the website can be redesigned in response to shorten the way to product acquisition.
Conclusion of this Part
This part mostly discussed uses that are already commonly known. However, some use cases remain to be discussed. In the next part, we will cover them together with the small remaining part on private life. The sixth part will end this series with a conclusion.