We have two panels presenting on ‘Socialising Big Data: The in/vulnerabilities of digital data-objects’ at the CRESC Annual Conference to be held at SOAS, London, next week (4th-6th September).
On Thursday 5th September, papers will include: Evelyn Ruppert, From Probability to Plausibility: In/vulnerable numbers; Ana Gross and Celia Lury, Drawn Numbers: Secrets, Public Statistics and Representational Crises; and Stephanie Alice Baker, Predicting a Crisis on Twitter: From causality to emergence.
On Friday 6th September, Ruth McNally and Adrian Mackenzie will be presenting on: Everything is in the Data: Reading the contemporary DNA archives.
From Probability to Plausibility: In/vulnerable numbers
Evelyn Ruppert
Department of Sociology
Goldsmiths, University of London
Definitions of ‘big data’ in circulation focus on the volume, variety, velocity and granularity of data and the novel computational analytics that these demand. But rather than definitions, these capture qualities of data that are brought into being by specific practices: those that generate new kinds of data (online activities, mobile phone use, transactions with corporations and governments, crowdsourcing, digitisation, etc.) and those that configure and analyse data in innovative ways (formatting, cleaning, linking, mining, correlating, inferring, curating, storing, searching, tracing, sharing, visualising, etc.). In this way the understanding of the ‘bigness’ of data is extended to a multitude of practices that generate and analyse data in innovative ways. For one, it encompasses governing practices that link data-objects such as numbers from myriad sources and analyse them in ways that are inventive of particular styles of thinking and governing. I examine one such invention, the plausibility estimate, in relation to the circulation, movement and assembling of population numbers. I argue that plausibility brings to the fore an historical tension in the notion of probability highlighted by Ian Hacking: that probability is both about belief and about how often things happen in the world. I suggest that, while still appealing to statistical procedures, plausibility calls for more belief and faith in numbers. It also legitimises reasonable guesses and precautionary principles and in so doing enlarges a space of discretionary control. So, while plausibility seemingly makes the certainty of assembled numbers vulnerable to challenge, invulnerability is secured by introducing and legitimising criteria such as reasonableness, trustworthiness, believability and credibility.
Drawn Numbers: Secrets, Public Statistics and Representational Crises
Ana Gross and Celia Lury
Centre for Interdisciplinary Methodologies, University of Warwick
Since 2007 the Consumer Price Index (CPI) produced by the Institute of National Statistics and Census (INDEC) in Argentina has been at the centre of a national and international legal and technical controversy. The figures produced are deemed false (they are described as ‘drawn’ numbers) by a range of actors, and their ability to represent the real fluctuation of prices as found in the wild, that is, prices observed and objectively annotated in their supposedly natural, situated states, is called into doubt. This paper explores this controversy as a crisis of representativeness that serves to expose the significance of the secrets deemed legally necessary to establish trustworthy relations between statistics and the real. To explore this vulnerability, we focus on the operation of the Statistical Secret Act in securing the (non-)trustworthiness of the CPI. As Emmanuel Didier (2004) observes, the validity of public statistics is inseparable from the creation and maintenance of a barrier between the public and the operations of calculation. In the case of the CPI produced by INDEC, this barrier was breached: the secret identity of retail outlets, required to preserve the ‘wild’ state in which the fluctuation of prices may be naturally encountered and annotated, was not protected. As a result, the stabilisation of consumer goods as ‘generics’, a process necessary for the measurement of pure prices, was not possible.
Predicting a Crisis on Twitter: From causality to emergence
Stephanie Alice Baker
Department of Sociology
Goldsmiths, University of London
Big data analytics are increasingly employed for risk management and crisis communication purposes to predict and respond to civil unrest. Computer-generated technologies that measure public sentiment towards, and engagement with, crises on social media platforms are thought to provide valuable insights regarding the potential for future disorder. This paper explores the limits of predictive analyses with regard to issues of trust, power, memory and emotion. It conducts qualitative analysis of tweets posted during the 2011 English riots to examine how digital modalities reconfigure power relations between vulnerable and invulnerable populations as collectives seek to enact social change. Theories of power tend to be conceived in terms of governmentality and surveillance; by contrast, this paper focuses on the performative qualities of the digital, exploring how these technologies altered the ways in which the riots were perceived, experienced and enacted.
Much of the power of these digital modes of communication is derived from the visual. Visual imagery disseminated on digital devices played a fundamental role in how spectators experienced the events, challenging the legitimacy of more traditional forms of media and articulating that which could not otherwise be articulated (both politically and symbolically). I propose that investigating the impact of social media on the recent riots, and the capacity for digital communication to materialise social relations more generally, requires recognising the new media ecology in which social movements emerge. There is an emphasis here not only on what is spoken, but also on what is unspoken, as well as on the unpredictable nature of these events as the result of complex, interactive relationships, both online and off. This more reflexive approach to social media’s role in riots shifts the question from one of causality to emergence, a concern with the role digital devices and social media platforms play in the organisation of contemporary sociality.
Everything is in the Data: Reading the contemporary DNA archives
Ruth McNally and Adrian Mackenzie
Department of Sociology,
Lord Ashcroft International Business School,
How do we read or even look at large-scale scientific data? Genomics is the name of the loose coalescence of scientific techniques that attempts to leverage biological understandings from large amounts of DNA sequence data. The enormous scientific investment in genomic DNA sequence data highlights both the hopes for a pattern-based knowledge of living things, and the costs of committing to large-scale sequence data archives as the foundation of scientific knowledge. This paper discusses both the forensic good of sequence data archives, and the many complications that such archives generate. It suggests that in genomic data archives, in contrast to many industry, business and government settings, we can read traces of shifts in scientific practice that re-make and re-shape what counts as biological knowledge. We suggest, therefore, that reading sequence archives with an eye for patterns of change in file formats, in metadata provision and indeed in the copying of data between repositories can help highlight the social life of data, databases, instruments and software.