What Data Mining News Coverage Can Teach Us About Global Events
What can massive data mining of global news coverage tell us about the economic sanctions against Russia or the global reaction to China’s economic slowdown? The Cross Country Emerging Markets Unit at BBVA Research recently produced a report summarizing their work on tracking geopolitical and social events. In the report, they used big data analysis that draws in part on my GDELT Project, which monitors and live-translates local news media in almost every country of the world, in 65 languages and in real time, and codes that coverage by theme, emotion, and event. Two of those analyses focus on Russian sanctions and the Chinese economy, both of which news mining offers a powerful new lens onto.
Looking first at Russia, the team at BBVA produced a network diagram of how countries are mentioned in context with one another in coverage of the economic sanctions against Russia. The visualization below takes their diagram and re-visualizes it using modularity, PageRank, and Force Atlas 2 layout algorithms to tease out additional detail on its structure. Each country is displayed as a node, and the thickness of the line drawn between any pair of countries indicates how frequently those two countries are mentioned together in news coverage of the Russian sanctions. Countries mentioned together more frequently are displayed closer to each other in the image. In essence, this offers a proxy for how countries are contextualized with respect to the sanctions. To make the visualization more readable, only the 1,000 strongest connections are shown.
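To make the idea of a co-mention network concrete, here is a minimal Python sketch of how such edge weights could be counted and trimmed to the strongest connections. It is not the BBVA or GDELT pipeline: the function name `cooccurrence_edges`, the input format (one list of country names per article), and the toy articles are all assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(articles, top_n=1000):
    """Count how often each pair of countries appears in the same article.

    `articles` is an iterable of lists of country names (a hypothetical
    input format). Returns the `top_n` heaviest (pair, count) edges,
    heaviest first.
    """
    counts = Counter()
    for countries in articles:
        # Deduplicate and sort so that (A, B) and (B, A) share one key
        # and a country repeated within an article is counted once.
        for pair in combinations(sorted(set(countries)), 2):
            counts[pair] += 1
    return counts.most_common(top_n)


# Toy data, invented for illustration only; not real coverage.
articles = [
    ["Russia", "Ukraine", "United States"],
    ["Russia", "Ukraine", "Germany"],
    ["Russia", "China"],
    ["Russia", "Ukraine"],
]
edges = cooccurrence_edges(articles, top_n=3)
strongest_pair, strongest_count = edges[0]
```

In this toy input the heaviest edge is the Russia and Ukraine pair, which appears in three of the four articles. The `top_n` cutoff plays the same role as the 1,000-edge limit described above: it drops weak connections so the stronger structure stays readable.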
Each country node is sized by its “PageRank” score, which is the same algorithm used by Google to rank web pages. In the context of the graph above, it captures the overall “importance” of each country in all coverage of Russian sanctions. Predictably, the United States and European partners like Germany and France play pivotal roles, and Ukraine is mentioned as the reason for the sanctions, but also clear are the central roles of countries like China and Iraq.
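For readers curious how a PageRank score can be computed on a weighted co-mention network, the following is a minimal sketch using plain power iteration. It is an illustration under stated assumptions, not the tool used to produce the visualization: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy edge weights are all choices made here.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank on an undirected graph via power iteration.

    `edges` maps (a, b) pairs to co-mention weights (hypothetical input).
    Every node is assumed to have at least one edge, so there are no
    dangling nodes to handle.
    """
    # Build a symmetric weighted adjacency map.
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight

    nodes = list(neighbors)
    n = len(nodes)
    # Total edge weight attached to each node.
    strength = {node: sum(neighbors[node].values()) for node in nodes}
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbor passes on its rank in proportion to edge weight.
            incoming = sum(
                rank[nb] * w / strength[nb] for nb, w in neighbors[node].items()
            )
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Toy weights, invented for illustration only; not real coverage.
toy_edges = {
    ("Russia", "Ukraine"): 3,
    ("Russia", "United States"): 2,
    ("Russia", "Germany"): 2,
    ("Russia", "China"): 1,
    ("Germany", "United States"): 1,
}
scores = pagerank(toy_edges)
top_country = max(scores, key=scores.get)
```

The scores sum to 1, so each can be read as a share of overall "importance". A node scores highly when it is heavily connected to other nodes that are themselves well connected, which is why a country at the hub of the coverage ends up drawn largest.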
A technique called “modularity finding” was used to find the natural clusters within the network – those countries that are mentioned more often with each other than with the others. This yielded two clusters, one colored in red and the other in blue. The blue cluster is especially significant, as it contains many of the countries that oppose the Russian sanctions, such as India, China, Iraq, and Turkey. It also includes nations like Jordan, with which Russia has signed major economic deals in spite of the sanctions, and those like Israel, which declined to recognize the sanctions.
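Modularity-based clustering searches for the grouping of nodes that maximizes a quality score, usually written Q, which compares the edge weight falling inside each cluster with what would be expected if edges were placed at random. The sketch below only computes Q for a given grouping; it does not perform the search itself, and it is not the implementation used for the visualization. The function name `modularity` and the six-node toy graph are assumptions for illustration.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of a weighted undirected graph.

    `edges` maps (a, b) pairs to weights; `community_of` maps each node
    to a cluster label. Both formats are hypothetical.
    """
    total_weight = sum(edges.values())
    internal = {}  # edge weight with both endpoints inside each cluster
    degree = {}    # summed weighted degree of each cluster's nodes
    for (a, b), weight in edges.items():
        ca, cb = community_of[a], community_of[b]
        degree[ca] = degree.get(ca, 0) + weight
        degree[cb] = degree.get(cb, 0) + weight
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + weight
    # Observed share of internal weight minus the share expected by chance.
    return sum(
        internal.get(c, 0) / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    )


# Toy graph: two triangles joined by a single bridge edge (C, D).
toy_edges = {
    ("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 1,
    ("D", "E"): 1, ("E", "F"): 1, ("D", "F"): 1,
    ("C", "D"): 1,
}
two_clusters = {"A": "red", "B": "red", "C": "red",
                "D": "blue", "E": "blue", "F": "blue"}
one_cluster = {node: "all" for node in two_clusters}

q_split = modularity(toy_edges, two_clusters)
q_single = modularity(toy_edges, one_cluster)
```

Splitting the toy graph into its two triangles scores higher than lumping every node into a single cluster, which scores zero. A modularity-finding algorithm effectively tries many such groupings and keeps the one with the highest Q.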