This paper reflects on the potential for stimulating mutually desirable interactions between two groups of participants in the digital information commons. In the digital era the scale and speed of information production and circulation are increasing and the potential applications of information for solving societal problems are vast (Lindblom and Cohen 1979; Castells 2009). In the scholarly literature much research attention focuses on the emerging open collaborative culture involving the participation of numerous individuals and loosely connected online groups (Cross et al. 2002; Jenkins 2006; Tapscott and Williams 2007; Albors et al. 2008; Baym 2010). Separate from this is another prolific literature focusing on the development of an open inquiry model by formal science professionals as they respond to the potential of digital online platforms and to public demand for greater transparency and accountability.
For example, the UK Royal Society (2012, 7) says that ‘open inquiry is at the heart of the scientific enterprise’, but openness is not seen as an unqualified good. The Society also says that ‘there are legitimate boundaries of openness which must be maintained in order to protect commercial value, privacy, safety and security’ (Royal Society 2012, 9). Information activities should be organised to meet ‘the requirements of accessibility, intelligibility, assessability and usability’ (Royal Society 2012, 39). Thus, ‘intelligent openness’ implies that information resources ‘must be intelligible to those who wish to scrutinize them; data must be assessable so that judgements can be made about their reliability and the competence of those who created them; and they must be usable by others’ (Royal Society 2012, 7). ‘Others’, however, are admitted as ‘qualified observers’, thereby maintaining the privileged position of formal science. Indeed, within open science projects, access may be granted only to those explicitly deemed to be able to advance the aims of a particular project (David et al. 2010).
The ever-expanding information commons has long been associated with hopes that it will eventually yield universally distributed ‘collective intelligence’ (Teilhard de Chardin 1955/1999; Lévy 1997, 13; von Ahn 2005), or, at least, greater interaction between formal science and more informal online groups. Mutually beneficial collaboration between them could result in novel capacities for managing the digital information commons.1 This paper focuses on two areas where the norms and routines of these groups differ, often creating barriers to collaboration. The first of these is the norms that influence authority, that is, control over the structuring and use of digital information. The second is norms with respect to the maintenance or curation of digital information.
The context of digital information is being shaped by a world of Web 3.0 technologies with their growing potential for creatively linking data. In contrast to the now familiar Web 2.0 which drew attention to user generated content and online interactivity, the Web 3.0, or ‘metaverses’ of social media and online interactivity, is beginning to inter-penetrate and to draw upon virtual (physical) spaces, data tagging, real time modelling, and autonomous machine computation. The result seems to be capabilities for augmenting existing data, simulating real world events, and supporting new forms of human action (Smart et al. 2007). As these developments unfold, it will be neither possible nor desirable for organisations and loosely connected online groups to endeavour to meet, or even identify, all of the information needs of their stakeholders (Powell et al. 2012, 11). Opportunities for collaboration are numerous and it is by no means obvious where authority should be lodged for the curating and linking of data. Large-scale scientific projects have always had to contend with these issues, but as the Internet plays host to increasingly vast data resources, they are becoming more salient for everyone.
The norms of authority and information maintenance or curation are key design features for the effective management of the digital information commons. At present, much digital information is inaccessible, is not retained, or is not structured in a useable way; it does not accumulate as ‘useful knowledge’ (Mokyr 2002). Hess and Ostrom (2007) note that there is typically a variety of routines for managing a common. These often become accepted as standard patterns of interaction, that is, they become ‘social technologies’ (Nelson and Sampat 2001). In formal science, the predominant social technology can be designated as constituted authority involving relatively formal norms and procedures for the accumulation of such knowledge. For loosely connected online groups, the accepted social technology of authority is more likely to give precedence to routines that are less formal. The social technology is more fluid and is better designated as adaptive authority in contexts where the aim is to generate information for immediate application. Such information is often relatively ephemeral or transient; it is either not integrated with information generated by formal science or, when it is, it is integrated according to the norms of formal science. The focus in this paper is on the latent potential for collaboration between those with quite different initial aims. With few incentives or resources, much digital information accumulated by loosely connected online groups remains fragmented and unavailable for future use either by those contributing the data or by other groups because it is held in ways that do not coincide with their requirements for data re-use and processing.
To examine opportunities for and barriers to greater collaboration between formal science and loosely connected online groups, illustrations of differences in their respective social technologies are taken from crowdsourcing initiatives in the fields of astronomy, the environment, and crisis and emergency response; areas in which formal science and informal online groups are devising novel ways to manage the digital information commons.2 The discussion highlights who initiates them, their affinity to different social technologies of authority, and whether they curate information. It is based on desk research, the author’s participation in a long-term research programme on the management of open digital information, and the experience of an activist who participated in several of the examples.3 The selection is intended to draw attention to a range of ways that issues of authority and data curation are managed. The discussion is not intended to present a fully elaborated set of case studies. It serves to highlight where tensions are present and whether there are reasons to expect changes in social technologies that will accommodate the interests of both groups. Commons-based information initiatives such as Wikipedia and open source software projects are not discussed because the dynamic and transactional nature of these projects differs from the data curation and re-use issues central to the analysis here and because these projects already are collaborative and have received considerable attention in the literature (Dalle et al. 2005; O’Reilly 2005; Pentzold 2011).
The following section highlights the way formal science and the activities of loosely connected online groups privilege different social technologies and approaches to curation in the digital information commons. The next section highlights features of crowdsourcing initiatives that illustrate some of the key differences, emphasising the tensions that arise when different social technologies come into contact as well as identifying the contexts in which there is reason to consider how collaboration might be fostered. This is followed by a discussion of challenges and opportunities for collaboration and suggestions for incentives that could foster the evolution of novel social technologies that would serve as design features for managing the increasing wealth of data in the digital information commons. The conclusion emphasises the need for a research agenda that bridges between largely separate discussions in the domains of research on formal science and on the ways loosely connected online groups are responding to the opportunities created by rapid changes in the extent and features of the digital information commons.
2. Social technologies and the information commons
Social values are embedded within hardware, software and applications, leading scholars to characterize them as ‘social technologies’ (Katz and Rice 2002). This usage of the term highlights values in the digital world, but it does not draw attention to norms and rules that become accepted within different groups and then influence their practices. Nelson and Sampat (2001, 40) suggest that patterns of human interaction employing standard institutions, that is norms and rules, can be designated as ‘social technologies’, that is, as institutions that have ‘come to be regarded by the relevant social group as standard in a context’. They influence how people act and interact especially where ‘the effective coordination of interaction is key to accomplishment’ (p. 40). This definition is employed here because it highlights issues of coordination and emphasises that standard social technologies influence the ways in which ‘actors get things done … by making certain kinds of transactions, or interactions more generally attractive or easy, and others difficult or costly’ (Nelson and Sampat 2001, 39–40). The idea that standardized social technologies are held in place by prevailing norms and rules is helpful for thinking about the difficulties that come to the fore when disparate social technologies come into contact with each other.
The routines of formal science and of loosely connected online groups involve different norms and procedures for managing digital information in the commons. It is important to examine these when they come into proximity with each other if collaboration is to be fostered. Technological change is a factor that spurs change in social technologies. Nelson and Sampat’s examples come from mass production and synthetic dyestuffs, but the transition to a rich information environment with rapid technological innovation is also likely to yield novel social technologies for managing coordinated action in the digital information commons. Because social technologies ‘limit choices regarding how to do things’ (Nelson and Sampat 2001, 44), an examination of these developments can provide insight into measures that might encourage the evolution of new design features for collective action in the information commons (Ostrom 1990). Standard patterns of interaction for managing information in formal science are developing largely separately from those favoured by loosely connected online groups, partly because of their initially different purposes. Greater collaboration between them could enhance the usefulness of such information, but this is hampered by their distinctive social technologies.
In formal science the social technology is formal and takes the form of constituted authority. There is explicit reference to historically constituted norms for processing, maintaining and accessing information. The priority is to enable digital information to be purposefully shared among those who meet particular criteria so that ‘propositions are tested by consensuality’ (Mokyr 2002, 5). Formal science is becoming more open due to funder requirements for publicly accessible publication of research and online tools are expanding the technical practices of disclosure, creating many new data repositories that might, in principle, support collaboration with those who are not accredited as professional scientists. Nevertheless, information disclosure or ‘sharing’ is motivated by contests for priority (acknowledgement of first discovery). Although claims of priority require full and accurate disclosure (Dasgupta and David 1994), the nature and extent of information disclosure is governed by professional norms rather than a presumption of completeness or comprehensivity. In formal science, there is an effort to minimise the tacit component of knowledge by curating information so that errors can be identified, theories can be tested, and information can be reused. Researchers are necessarily reticent to disclose all that they know while being obligated to disclose enough for others to interrogate and replicate their results in principle (David and Steinmueller 2013). Openness is therefore a qualified good and there is an effort to assess the reliability and competence of those who create information to retain some degree of control over information and its application, even when it is located in the digital information commons.
The conception of the information commons that is familiar to many loosely connected online groups is rather different. They are more likely to favour the social technology of adaptive authority. Groups are likely to reach decisions on the basis of informal discussion, the procedures for decision making are more fluid, and choices about how to organize and classify data are relatively impromptu or set by the affordances of open digital platforms, rather than by standards established by science professionals. There may be informal or semi-formal hierarchies of authority including moderators, administrators, and initiators of groups, implicit obligations for reciprocity, and norms that facilitate trust, but there is less attention to professional accreditation and priority. In these groups, information is commons-based ‘when no one uses exclusive rights to organize effort or capture its value’ (Benkler 2004, 1110). The hallmark is ‘collaboration among large groups of individuals, sometimes in the order of tens or even hundreds of thousands, who cooperate effectively’ (Benkler and Nissenbaum 2006, 394). The online participatory culture is seen as giving rise to a ‘cognitive surplus, newly forged from previously disconnected islands of time and talent’ (Shirky 2010, 29). While there may be power struggles over values and the roles of participants (Mateos Garcia and Steinmueller 2008; Berdou 2011), loosely connected online groups are evolving procedures and routines such as licences and governance structures, and they have a different understanding of ‘openness’ than is typical of formal science.
The ideas of constituted and adaptive authority refer to ideals and the social technologies employed by formal science and informal groups may be less differentiated in practice. Science, engineering, medical research, the social sciences and the humanities are becoming increasingly information-intensive and distinctions between qualified and ‘unqualified’ producers and users of information are blurring. However, access to information resources is discussed mainly with respect to technical issues regarding the effective use of digital platforms, the cost structures of research, and the technical problems of expanding access to data to professionals and ‘amateurs’ (Dutton and Jeffreys 2010), and less so with respect to issues of authority. With the increasing granularity, modularity and fragmentation of online information (Benkler and Nissenbaum 2006) and the progression of technological innovation towards Web 3.0, it is timely to consider how social technologies are changing alongside changes in ‘physical’ technologies.
Formal science is embracing some of the features of adaptive authority, but qualms persist about losing control of information and who is qualified to offer a view on the meaning of data. Change is evident, for example, in initiatives to foster ‘citizen science’ or ‘science by the people’ (Silvertown 2009, 4). The fact that there are barriers to collaboration between formal science and loosely connected online groups is suggested by the observation of Haklay (2011, np) that citizen science ‘can only exist in a world in which science is socially constructed as the preserve of professional scientists in academic institutions and industry’. This contrasts with ‘research in the wild’ where concerned (online) groups might be treated ‘as (potentially) genuine researchers, capable of working cooperatively with professional scientists’ (Callon 2003; Callon and Rabeharisoa 2003, 195). Stodden (2010) notes that citizen science is regarded as complementary or even subordinate to formal science. Thus, formal science sometimes adopts norms that resemble those typical of loosely connected online groups, and even adopts a similar discourse about ‘research in the wild’ (EPSRC 2012), but its social technology remains that of constituted authority.
Formal science and loosely connected online groups differ in the way that they approach the accumulation of information (and data). In formal science, digital curation has a special meaning referring to ‘maintaining, preserving, and adding value to digital research data throughout its lifecycle … in trusted digital repositories [which] may be shared’ (Digital Curation Centre 2012). Its aim is the accumulation of stocks of useful knowledge. In contrast, loosely connected online groups are more likely to be interested in content creation, usually with fewer resources. In these groups, content curation refers to practices of aggregating, distilling, sifting and selecting information for relatively short term application and gives less emphasis to preservation, storage and reuse (Bruns 2010). Less effort is made to validate and organize information for purposeful sharing. These groups are generating relatively ephemeral information although some of the digital platforms they use do have a capacity for data preservation and reuse. Nevertheless, the emphasis is on timely sharing and application by others, and not only with those accredited as members of their groups. Yet these data could be linked with other information to yield novel data sets and enhanced interaction between these groups and formal science professionals.
Formal science and loosely connected online groups are increasingly facing what has been called a ‘data deluge’. The so-called ‘big data’ era is one in which ‘vast volumes of scientific data are captured and generated by large scientific facilities, new sensors and instruments, interconnected networks, e-commerce, and computer models’ (Codata 2012, np). Initially referring mainly to situations where large-scale data processing tests the limits of current technology, the term is being expanded to include a wider range of data related activities involving the mixing of data generated from disparate sources by or for both formal science and loosely connected online groups (Mayer-Schönberger and Cukier 2013). The move of formal science into the digital information commons is partly motivated by an effort to resist the further enclosure of information (Boyle 2008), but it adheres to the technology of constituted authority from which it derives privilege.
For some, there would appear to be relatively few possibilities for productive interaction between formal science and loosely connected online groups because the social technology of formal science is seen as ‘top down’ suggesting exploitative power relations, and the social technology of loosely connected online groups is seen as being ‘bottom up’, open and the result of consensual power (Shirky 2010). However, perceived incompatibilities may be diminishing as socio-technical controversies become more prominent. ‘Faced with the exceptional’ (Callon 2003, 40), explanations are more likely to be sought by people who do not know each other and have no pre-existing consensus about the ‘standard’ norms for validating, interpreting or curating information. The ‘overflowings’ of those historically excluded from the production, circulation and application of knowledge may be creating a stronger impetus to foster collaboration with formal science, even if many groups resist designation as those to whom knowledge is offered by science (Callon 2003).
The use of new digital technologies and platforms, e.g. Twitter, Facebook, and Google, is generating vast amounts of data. Some data are public and are generated by open source online tools including Ushahidi and OpenStreetMap (Gill and Bunker 2010; Goldfine 2011), while others are generated by proprietary tools such as Google’s Map Maker, TeleAtlas’s Map Insight and Navteq’s Map Reporter. Although they are unevenly distributed globally (Graham 2011), tools for geodata collection, data aggregation, analysis and publication are increasingly widely available (Okolloh 2009; Chilton 2010; Haklay 2010; Berdou et al. 2012). As David (2005, 20) points out, ‘it is important to notice that there is a region in which the two can overlap’, in his case in the context of freely shared and proprietary information. Similarly, it is important to understand the potential for overlap between formal science and loosely connected online groups when their respective activities are located within the digital information commons. In the wake of rapid technological change and contestations over the framing of social problems, a new paradigm (Kuhn 2000) or social technology for managing the commons is needed to reap the potential benefits of the information activities of both. Hess and Ostrom (2007, 13) note that there is a variety of approaches to managing the information commons and that the ‘outcomes of the interactions of people and resources can be positive or negative or somewhere in between’. To achieve positive outcomes, ‘strong collective-action and self-governing mechanisms’ are needed (Hess and Ostrom 2007, 5), but it is not always clear how best to stimulate collaboration.
Relatively little is known about the overlaps between the social technologies of formal science and loosely connected online groups, with the exception of developments in ‘citizen science’ and open software development communities. Activity in the digital information commons involving crowdsourcing offers an opportunity to consider differences in their social technologies, the opportunities and barriers to collaboration, and incentives that might guide the emergence of new design principles for collective action in the digital information commons (Hess 2012).
3. Crowdsourcing in scientific and social practice
The availability of digital tools for addressing socio-technical controversies, natural and human disasters, and emergencies is making it feasible for both formal science and loosely connected online groups to generate ‘useful knowledge’. As the culture of free time changes with the spread of the Internet and the mobile phone, there is a vast resource of people who can engage in information production, processing and application. Crowdsourcing was defined initially as ‘the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call’ (Howe 2006, np), for example, the paid-for activity supported by Amazon Mechanical Turk. The meaning of crowdsourcing has since been extended to apply to activities aligned with open source software principles (Howe 2008; Malone et al. 2009). It now refers to any activity where a task is issued to an open or undefined online community (Brabham 2012, 395).
If differences in the social technologies of formal science and loosely connected online groups are downplayed, it might be assumed that the ubiquity of digital tools for crowdsourcing would favour collaboration without the need for interventions to encourage novel approaches. The experience of the United Nations’ Global Pulse initiative, however, confirms that social technologies do present barriers to collaboration. Global Pulse is a ‘real-time big data’ initiative launched in 2009 to employ innovations in digital technologies to ‘help decision-makers gain a real-time understanding of how crises impact vulnerable populations’ (UN Global Pulse 2012a, i). It aims to integrate data from formal science and loosely connected online groups using mobile call logs, mobile banking transactions, user-generated content (blog posts and Tweets), and online searches, with data sets collected by governments, the science community, and United Nations agencies. The crowdsourcing elements of this initiative are intended to complement official statistics, survey data, and information from early warning systems. The aim was to combine data to create verified information resources, providing feedback to policy makers and practitioners, reducing the time between information collection and action.
Both formal science and loosely connected online groups were expected to participate, but collaboration has proven to be difficult. Early on there were tensions over the control of information. For example, when data were held by companies, formal science institutions and governments, this led to legal challenges, inter-organisational competitiveness and secrecy. There were concerns about reputation on the part of some organisations when their data were to be integrated with data sourced from those they regarded as departing from trusted data management practices. The sourcing of data by and from loosely connected online groups proved to be difficult because of privacy concerns with respect to data contributors (UN Global Pulse 2012b). Tensions between the social technologies informing the practices of different participants are not surprising. The issue here is that there were few signs that these barriers to collaboration were being acknowledged from the outset or that measures were being put in place to create incentives to overcome them.
Global Pulse brought together organisations that privilege adaptive authority such as OpenStreetMap which is geared to responding to local events and crises using crowdsourced information that can lead to rapid action (Meier 2009). In contrast, many United Nations organisations handle information that is controlled and managed in line with the information standards of constituted authority (UN CITO 2012). Choi Soon-Hong, UN Assistant Secretary-General and Chief Information Technology Officer, observes that ‘organisations involved in crises often develop what we call “point solutions”, instead of “integrated solutions”, to manage crisis information’ (Stauffacher et al. 2011, 5); that is, solutions that are based on relatively ephemeral information as compared to scientifically verified information. As an experienced crowdsourcing practitioner put it, ‘balancing top-down and bottom-up requires more serious reflection than it’s previously been given’ (Currion 2011, 40).
In crisis and emergency situations, the emphasis is typically on responses to people’s immediate needs with information that can assist them in acting appropriately, regardless of how ephemeral the information is (Fung et al. 2007, 53; Hargreaves and Hattotuawa 2010). Those generating information in line with the priorities of formal science invest time and resources in validating information and arranging for control over its release. This initiative could have fostered a novel design for managing the digital information commons, but its ambition was scaled back in terms of achieving integrated data. It highlights the tensions between social technologies especially those associated with negotiating information access, rights to release data held by traditional science (and governmental) institutions, and ensuring that the rights of local populations are respected. In cases such as Global Pulse where adaptive authority is employed, actors will need incentives for making their data available so that they and others can make use of mixed data, for example, for health, the environment, emergency or crisis responses. If fragmentation as a result of the modularity of digital platforms and incompatibility of standards for data curation proliferates, there will be relatively fewer opportunities for applying the world’s rapidly accumulating digital information resources than otherwise could be the case.
Another project further illustrates the challenges of collaboration involving formal science and loosely connected online groups, in this case, highlighting the difficulty of achieving data integration. The Young Lives Linked Data Demonstrator project (Powell et al. 2012) aimed to process data from the Young Lives project hosted by the University of Oxford. This project follows 12,000 children over 12 years in Peru, India, Vietnam and Ethiopia. It uses household and child surveys, household and community data related to health, education, employment and income, family status, and welfare as well as some crowdsourced data. The demonstrator project sought to convert some of the data into open linked data sets to increase its accessibility. Linking data requires identifying the concepts represented in the data and choosing vocabularies, terms and identifiers. It involves technical knowledge, domain expertise, and knowledge of a linked data eco-system. Some progress was made but organizations such as the World Health Organization did not publish data that could be linked and, even when they did, they used different conventions from those employed by the Young Lives project. In order to link data the demonstrator project researchers had to follow the conventions and data modelling decisions of science-based organizations. They had neither the resources nor the expertise to evolve mutually agreed conventions suitable to external agencies.
This project highlights tensions between formal science and loosely connected online groups around data managed according to different social technologies. Other examples of initiatives involving crowdsourcing for a variety of purposes demonstrate differences in approach and highlight the need for incentives to spark collaboration in those areas where it could be expected to enhance social welfare.
In formal science, the crowdsourcing initiative LHC@home is managed by the social technology of constituted authority with the aim of achieving curated stocks of digital information. This initiative involves volunteers who offer the unused capacity of their personal computers to enable the European Organization for Nuclear Research (CERN) to run simulations using data from the Large Hadron Collider. There is little if any incentive to provide access to these data so that they might be available for reuse by loosely connected online groups, even those that might be interested in pursuing the meaning of these data. GalaxyZoo engages volunteers in the analysis of imagery from NASA’s Hubble Space Telescope archive. Started by the Oxford University astrophysics group, it is part of the Citizen Science Alliance (CSA), a collaboration among scientists, software developers and educators who develop and manage projects using the time and abilities of a distributed online community to generate scientific results. Oldweather, also supported by CSA, is sponsored by a coalition of organisations including the UK Met Office, National Maritime Museum, the Atmospheric Circulation Reconstructions over the Earth project at Oxford University, and NOAA (National Oceanic and Atmospheric Administration) in the United States. Lessons about fostering novel ways of managing the digital information commons in a way that bridges between loosely connected online groups and formal science are unlikely to be found through further study of these kinds of crowdsourcing initiatives, yet these citizen science initiatives claim the attention of scholars in the research literature (Cooper et al. 2007; Wilderman 2007; Haklay 2011; Wiggins and Crowston 2011).
A blended or hybrid approach to managing the digital information commons, mixing the technologies of constituted and adaptive authority, or formal science and ‘research in the wild’, is visible in initiatives such as WideNoise which aims to tackle noise pollution and to alert citizens to the urban soundscape. Mobilised by Everywhereaware (Enhance Environmental Awareness through Social Information Technologies), it is supported by the ISI Foundation, a private research institute in Italy, a consortium of European universities, and by European Commission funding. Information is managed with reference to formal science (constituted authority). This initiative does generate relatively ephemeral information, but it has the ambition and resources to invest in data curation (within a scientific knowledge model). Smart Citizen aims to enable residents to respond to environmental air quality and noise pollution issues. This project is supported by the Institute for Advanced Architecture of Catalonia, Fab Lab Barcelona, and Hanger, an art research and production centre, a mix of organisations that adheres to formal science for data curation, aiming to manage information generated by a loosely connected online group in this way. The CSA sponsored Zooniverse is an online hub for ‘citizen science’ projects (mainly about space) and PyBossa, hosted by the University of Geneva and supported by the Open Knowledge Foundation and CERN’s Citizen Cyberscience Centre, provides a digital platform where anyone can launch projects involving human cognition and it adheres to the conventions of formal science.
Again, these crowdsourcing initiatives offer few hints about the potential for collaboration that respects the non-science social technology of many loosely connected online groups. The mobilizers in these examples are mainly formal science or science-related institutions that support ‘open’ science. They target pools of dispersed participants with varying skills who may be interested in games, prizes, recognition, or in their perceived contribution to a scientific enterprise. Information is managed according to the norms of formal science. Of course, digital information generated by and for formal science is not always curated when data are collected and it may not be published, amended or corrected. There are always struggles to obtain funding for ongoing data maintenance (Merali and Giles 2005). Loosely connected online groups such as Wikipedia, OpenStreetMap and some open source software groups are engaged in crowdsourcing and spend substantially to maintain their digital information resources for future uses, but they generally are not perceived to be doing so according to the norms of formal science. They are relatively well resourced compared to the less well established crowdsourcing initiatives of the groups that we turn to next.
In crowdsourcing initiatives responding to environmental problems and natural disasters and mobilised by loosely connected online groups the alignment is typically with the social technology of adaptive authority and thus with a contrasting view of the role of content curation as compared to digital (science) curation. For example, Radiation Map is a monitoring and mapping initiative of volunteer participants in the Russian Far East that was mobilised after the tsunami in Japan and the Fukushima radiation leaks (Plantin 2011). Citizens took radiation readings, analysed the data to assess the risk of radiation, challenged media reports and recorded areas of contamination in Russia. The information was relatively ephemeral in that there was no link to Russian science or government institutions, but there were limited efforts to curate some of the content.
Let’s Do It World links mapping and monitoring of waste and illegal dumping with local citizen clean-ups. As an illustration of an open source map, World Waste Map supports the coordination of clean-up teams. In this case, the information is not time critical and it is curated content, but it is not validated by the norms of formal science. It is supported by business system software companies, a passenger and cargo shipping company, and non-governmental organisations, the former three of which might be expected to align with the technology of constituted authority and to have interests in controlling the release of information about illegal dumping if it creates risks for their reputations. Nevertheless, they support open data and the adaptive authority of loosely connected online groups. The aim of these initiatives is to facilitate offline action by collecting actionable data; it is not to collect scientifically validated data. The structure (time, location and other categories) and process of data management differ from those of digital curation. Although the data could theoretically be linked with data generated by formal science, in practice such data linking is costly and these groups have little incentive to pursue it.
A well-documented case of crowdsourcing is the deployment of the Ushahidi platform following the Haiti earthquake to collect information and visualize data (Gao et al. 2011). Participants’ messages were used primarily by international relief organizations. Mobilized by Ushahidi and the International Network of Crisis Mappers, organisations preferring the technology of adaptive authority, it was also sponsored by Tufts University and the UN Office for the Coordination of Humanitarian Affairs. Although it privileged adaptive authority, it faced the challenge of managing large quantities of data and included organisations typically aligned with constituted authority. The use of Ushahidi’s open platform provided a way of breaking the monopoly on crisis data previously held by large organisations such as the Red Cross and the United Nations, yielding data which enhanced situation awareness for small non-governmental organisations without the resources to collect or manage data independently.
Tensions were present in this case, however, partly because of the mingling of different social technologies of authority which influenced how things were done and made some interactions easier than others (Nelson and Sampat 2001, 39–40). For example, crowdsourced data were collected from local populations and served as information for evidence-based policy making, but the results were not always made accessible to the local communities that provided them, which would have allowed them to participate in responses to the crisis that was affecting them. Local people’s contributions are usually translated so that they can be incorporated into larger data sets. Once translated, information loses its context and its meaning is frequently lost. In the Haitian case, numerous text messages were sourced that contained information that did not fit into the conventions of an online form, e.g. name, age, gender, location. Translators discarded this information because it contained too little detail to send to rescue teams (Sutherlin 2013). The social technology of constituted authority certainly helped ‘actors get things done’ (Nelson and Sampat 2001, 39), but it also influenced, in a very consequential way, which interactions were made easier and which more difficult.
Another initiative using the Ushahidi platform is Russian Fire, aimed at facilitating emergency response aid during life threatening wildfires in western Russia (Asmolov 2010). Mobilised by volunteers, the principal purpose was not mapping or curating information, but facilitating crowd-to-crowd responses. Activists mobilised volunteer fire fighters and produced instructions facilitated by an Ushahidi based Help Map. Information was aggregated and organized by category, geolocation and time. The participants were not motivated to digitally curate (in the sense of formal science) information since their purpose, as in the Waste Map case, was to facilitate action. And although the data remain online and can be used by others, there was no incentive to create an open data resource that could be linked with data sets generated by formal science. It is difficult to encourage loosely connected online groups to curate their data on platforms such as Ushahidi. The platforms can become overloaded and in this specific example, there was no archive function (2010). The issue is not that those who collect data should be responsible for scientific data curation when that is not their purpose. Instead, the issue is the incommensurability of the paradigms or social technologies.
The initiators of these crowdsourcing projects are providing actionable data for the short term. Without incentives and resources, they are unlikely to see their data as a resource for later research that could benefit from the linking of data sets sourced by those adhering to different social technologies. Nevertheless, open digital platforms such as Ushahidi or Wikis and crisis maps have been described as creating ‘a new draft of history’ (Giridharadas 2010) and they are opening the ‘floodgates of information’ (Meier 2010). In contrast to formal science crowdsourcing initiatives that are oriented to producing scientifically validated information, the initiatives mobilised by loosely connected online groups in response to conflict or crisis situations aim to create an evidence base and action plans for applications that are temporary or ephemeral (Callon 2003; Haklay 2011); ‘hastily formed networks’ (Denning 2006; Yap 2011) may dissipate when the issue becomes less salient. Nevertheless, the data could be of value to formal science and public authorities with a remit for social action if incentives for collaboration were to be put in place.
4. Challenges and opportunities
The preceding discussion confirms that crowdsourcing initiatives give greater or lesser emphasis to the technologies of constituted or adaptive authority as their means of managing the digital information commons. Given their different purposes and motivations, they emphasize information curation according to the norms of formal science or the collection of more immediately applicable and ephemeral information. Their data may be curated on digital platforms and open to all, but there is frequently little feasibility of linking these data. As suggested by Global Pulse, when there is an explicit ambition to combine different approaches, tensions start to become evident.
The technology of constituted authority involves an effort to control and validate information in accordance with the values and routines of formal science. The technology of adaptive authority favours dispersed initiative, fluidity and rapid action. In some cases, individuals may have multiple identities, participating in both formal science and loosely connected online groups, but the distinguishing features of their respective engagements in the digital information commons are prominent. To encourage the curation of relatively ephemeral information as a record of historical events and as a resource for future analysis, participants in loosely connected online groups might acquire the skills for managing data in line with formal science. But rapid technological innovation and the spread of crowdsourcing are changing this singular approach. The digital platforms in use today and emerging Web 3.0 developments ‘entail a constellation of methods, materials, interpretations, conventions, understandings, skills, theories and social relations that collectively constitute a socio-technical system or ensemble’ (Hackett 2011, 28). The availability of these tools tells us little about how power relations will be negotiated or whether users will favour new approaches to collaboration in the digital commons (Quinn and Bederson 2011; Yap 2011).
Some suggest that crowdsourcing, even when it occurs in an open information environment, is fostering a ‘new elite’ that is ‘wary of overtly signalling the power dimensions of crowdsourcing to those drawn to the call’ (Wexler 2011, 15). Thus, the challenges of managing the digital information commons go beyond those related to the rise of an invisible college of science which encourages global digital interactivity and increasing use of digital tools (Wagner 2008). Much research using data of all kinds is being conducted in the private sector with very limited access for scholars working in the digital information commons. Companies are using their own transaction-generated data and they are accessing data from loosely connected online groups. The future is likely to see a radical mix of methods of data collection and analysis, some aligned with the social technology of constituted authority and others much less so (Savage and Burrows 2007). Research councils are starting to invest in infrastructures to manage the information commons in response to the ‘big data’ era and questions about these issues are increasingly on the agenda for debate especially with regard to whose data is to be included or excluded and on what terms (Boyd and Crawford 2012, 664).
In addition, if social welfare benefits from synergies between the activities of formal science and loosely connected online groups are to be increased, it will be necessary to do more than liberate digital information from the prevailing copyright regime. Within formal science there is intense debate about the enlargement of the public domain and provisions for ‘fair use’ of copyright protected information (DuLong de Rosnay and Carlos De Martin 2012, xvi). However, as discussed above, the information commons is not as ‘open’ as is sometimes suggested. Formal science and loosely connected online groups have different views of what is open, on what terms and for what purposes. Adherents to the technology of constituted authority may limit access to their information resources or they may bypass or exclude data generated by those groups that do not conform to their norms. If data generated by loosely connected online groups start to be mixed in the commons alongside scientific data, such groups may be charged with degrading the information commons if they do not operate in line with formal science standards of information verification.
Loosely connected online groups also may represent a threat to professional science or to other established institutions if they start to compete for resources required to curate their information. This is particularly so in the development area where both governments and larger non-governmental organisations have underinvested in their capacities for managing digital information within and external to their organisation (Powell et al. 2012). If they start to allocate financial resources to support these activities, other groups are likely to claim that their own resources are threatened. Research councils are building open data infrastructures in a period of declining funding, and resourcing the capacity to host existing formal science data is already a challenge. Extending capacity to handle potentially interlinked data resources generated by other groups is likely to heighten competition for scarce funds.
Nevertheless, there is much potential for fostering collaboration. There is an increasing subscription by formal science professionals and loosely connected online groups to the principles of open access to digital information. This is creating a favourable environment for discussion about how to foster design principles to encourage collaboration. In the case of some of the examples discussed in this paper, the potential benefits of access to curated data sets on locations and responses to wildfires, the actions taken to mitigate damage at sites of toxic pollutants, or information about relocations of populations and their welfare in times of crisis, could be of future use if they are linked with formal science data about the quality of the environment or population migration, for instance.
Among the possible means of creating incentives for collaboration is the use of research contracts to finance the curation of data sourced by different means. However, if these enforce the technology of constituted authority they are unlikely to succeed. If organisations (public and private) that help to finance crowdsourcing by loosely connected online groups to address immediate social problems were to encourage these groups to deposit their data in open collective repositories, a contractual approach would help to create incentives for collaboration. If public agencies do not move into this space, private sector intermediaries undoubtedly will as they realise the potential for revenue accruing from linking and analysing data.
As different groups respond to the challenges of a wealth of data, the norms for accrediting those deemed qualified to access data are likely to change. Mayer-Schönberger and Cukier (2013) suggest that ‘algorithmists’, trained in computer science, mathematics and statistics, will become the future data analysts. The technology of constituted authority, albeit with a somewhat changed skills base, could become the principal means of building trust in their work. However, if new routines, skill sets and standards are developed with the participation of representatives of loosely connected online groups, there is scope for the emergence of novel approaches in the commons. Experience shows that trust can be built up in a variety of ways (Benkler and Nissenbaum 2006). It does not need to rely mainly on constituted authority. What might replace or complement existing social technologies is not something that can be predicted because it will evolve through the negotiated practices of those involved in data analytics and in devising new applications of linked data sets. It is clear, however, that these developments will have implications for what information management practices prevail in the open digital information commons in the future.
The future design principles for managing the digital information commons will reflect accommodations between the social technologies of authority in use today. Formal science and loosely connected online groups are generating and applying data in response to many societal challenges. Both could benefit from the accumulation of useful knowledge derived from their respective activities, despite differences in their initial purposes.
Loosely connected online groups that produce relatively ephemeral information find their information useful for their purposes. When it is maintained, it can be used by others for social action because it is good enough for a specific purpose or better than available alternatives. In some cases, such as the Russian Fire example in this paper, that information would not be available if it had to be processed according to the social technology of formal science. However, new forms of collaboration in the digital information commons would help to ensure that the future development of innovative technologies and data analytics methods does not exclude such data sets, enhancing the potential for linking data in creative ways and stimulating measures for ensuring that future data sets are not the preserve of formal science or locked up by the private sector.
The digital information commons is a space for many disparate interest groups. Novel social technologies will evolve and change will be contested, given the different interests and approaches. If useful knowledge historically has been associated with the social technologies of formal science, the future might bring hybridisation of existing approaches. Alternatively, it could see the rise of a new elite that distances itself increasingly from the activities of loosely connected online groups to protect its authority or to further enclose new data compilations.
The standard ways of interacting online privileged by formal science and loosely connected online groups differ for many good reasons. However, the likelihood that data will be sourced and recombined in different ways seems certain to raise considerable barriers to collaboration in the digital information commons. These barriers could be reduced if attention is given to the differences among prevailing social technologies of authority in the digital world. Research investigating co-evolving ‘physical’ and social technologies leading to practical suggestions for change of benefit to both formal science and loosely connected online groups is needed.
Such a research agenda will need to bridge the gap between the currently largely separate scholarly and policy debates around the accommodation of formal science to the ‘open’ data agenda and the similar debates about the contributions by loosely connected online communities to the resolution of societal problems. The pressing question is how evolving social technologies will influence interactions among human beings and their digital technologies, which data analyses and applications will become difficult and costly and not pursued, and which will be encouraged and sustained in the interests of maximizing the social welfare contributions by those participating in the digital information commons. A research agenda in response to this question would take up the agenda suggested by Ostrom (1990) that is premised on the principle that collective action in a commons requires a framework of rules that encourages respect for the norms of disparate commons-based groups.