Data-Driven Journalism is the process of obtaining, building, filtering, analyzing and presenting databases, with the aim of generating news. It is a practice derived from Precision Journalism, proposed by Philip Meyer in the 1970s, and Computer-Assisted Reporting (CAR). The first reference to Data Journalism, a term derived from Data-Driven Journalism, was made by programmer Adrian Holovaty (2006), in an article published on his personal website and entitled A fundamental way newspaper sites need to change. In that article, Holovaty recommends incorporating database management techniques into the daily routine of newsrooms, as a way to facilitate the reuse of the information collected in the daily work of reporting.
Through data collection using social science techniques and through database analysis, this specialty of journalism seeks to introduce elements of the scientific method into the routine of news production, which, it is argued, would result in greater objectivity and accuracy in the news. It is mainly a production routine, defined by the following steps: data collection, filtering, visualization and narration. Unlike the traditional journalist, who protects their sources, the data journalist gives as many people as possible access to the data.
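The four steps named above (collection, filtering, visualization, narration) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the dataset, the city names, and the function names `collect`, `filter_rows`, `summarize`, `visualize` and `narrate` are all hypothetical and are not taken from any newsroom tool.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data standing in for a collected dataset.
RAW_CSV = """city,year,unemployment_rate
Springfield,2020,7.5
Springfield,2021,6.1
Shelbyville,2020,
Shelbyville,2021,5.2
Ogdenville,2021,n/a
"""

def collect(text):
    """Collection: parse raw CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def filter_rows(rows):
    """Filtering: keep only rows whose rate parses as a number."""
    clean = []
    for row in rows:
        try:
            rate = float(row["unemployment_rate"])
        except ValueError:
            continue  # drop empty or non-numeric values
        clean.append({"city": row["city"], "year": int(row["year"]), "rate": rate})
    return clean

def summarize(rows):
    """Analysis feeding the later steps: average rate per city."""
    by_city = {}
    for row in rows:
        by_city.setdefault(row["city"], []).append(row["rate"])
    return {city: mean(rates) for city, rates in by_city.items()}

def visualize(summary):
    """Visualization: a crude text bar chart, one '#' per rounded point."""
    return [f"{city:<12} {'#' * round(rate)} {rate:.1f}"
            for city, rate in sorted(summary.items())]

def narrate(summary):
    """Narration: turn the numbers into sentences, highest rate first."""
    ranked = sorted(summary.items(), key=lambda kv: kv[1], reverse=True)
    return [f"{city} averaged {rate:.1f}% unemployment." for city, rate in ranked]

rows = filter_rows(collect(RAW_CSV))
summary = summarize(rows)
chart = visualize(summary)
sentences = narrate(summary)
for line in chart + sentences:
    print(line)
```

Publishing `RAW_CSV` alongside the generated sentences would correspond to the practice of giving readers access to the underlying data.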
A second description of this routine is The inverted pyramid of data journalism, published by journalist Paul Bradshaw on his weblog in 2011. The author initially describes a four-step process:
- Compilation (compile)
- Cleaning (clean)
- Contextualization (context)
- Combination (combine)
In addition to these steps, which relate to production routines specific to data-driven journalism, there is the final step of communication (communicate), which in turn unfolds into six steps or characteristics.
The process can also be split up into several steps. While the steps leading to results can differ, a basic distinction can be made by looking at six phases:
- Finding: Text mining in web content
- Cleaning: Filtering and transforming the data in preparation for visualization
- Visualization: Displaying the pattern
- Publication: Integrating the visuals, attaching data to stories
- Distribution: Enabling access on a variety of devices
- Measurement: Tracking usage of data stories over time
Since 2014, several software companies have marketed software that automatically writes articles based on large volumes of data. In the coming years, this could reduce the work of the data journalist to reformulating machine-generated text.