PRINTED FROM OXFORD HANDBOOKS ONLINE. © Oxford University Press, 2018. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

date: 05 August 2020

Navigating the Ethics of Big Data in Public Health

Abstract and Keywords

“Big data,” which encompasses massive amounts of information from both within the health sector (such as electronic health records) and outside the health sector (social media, search queries, cell phone metadata, credit card expenditures), is increasingly envisioned as a rich source to inform public health research and practice. This chapter examines the enormous range of sources, the highly varied nature of these data, and the differing motivations for their collection, which together challenge the public health community in ethically mining and exploiting big data. Ethical challenges revolve around the blurring of three previously clearer boundaries: between personal health data and nonhealth data; between the private and the public sphere in the online world; and, finally, between the powers and responsibilities of state and nonstate actors in relation to big data. Considerations include the implications for privacy, control and sharing of data, fair distribution of benefits and burdens, civic empowerment, accountability, and digital disease detection.

Keywords: big data, electronic health records, digital disease detection, data sharing, social media, public health ethics, privacy

Introduction

Digitalization and advanced methods of storing, curating, mining, and analyzing data have ushered in the era of big data. “Big data” is a broadly defined phenomenon that seems to touch upon nearly all aspects of human activity, and to have the potential of revolutionizing modern life (Mayer-Schönberger and Cukier, 2013). There are two fundamental aspects to big data. The first is the plurality of sources from which data are generated. They span a broad spectrum from Internet clicks, to social media activity, geolocation, and the use of apps, to genomic data and health and credit records. The second aspect is the data analytics and data mining techniques deployed to distill meaning from the data themselves. Such tools enable important inferences and identification of non-obvious patterns in human behavior as well as other structures in organizations and networks. This understanding allows the development of predictive models, and it is precisely this predictive power that makes big data valuable (Vayena and Gasser, 2016b; Pentland et al., 2013).

Big data is widely thought to offer an unprecedented opportunity to improve both public health research and practice (Shaw, 2014). The digital transformation we are witnessing is unprecedented in several ways, including the amount and diversity of data that are generated and stored; the possibilities of linking diverse data for analysis, even if some of these sets do not appear to have any bearing on health (e.g., cell phone metadata, Internet clicks); the speed with which data can be analyzed; and the predictive power that such analyses can yield. Furthermore, big data makes available information that may previously have been public in a technical sense but is now made available widely and, in many cases, at no cost to the user (Google Street View is an example, as is access to tax and real estate records).

Despite the excitement about big data in public health, attention has been drawn to various challenges such applications pose (Vayena, Mastroianni, and Kahn, 2012; Goodman and Meslin, 2014; Vayena et al., 2015; Mittelstadt and Floridi, 2015). Some of these challenges are technical in nature, such as interoperability, analysis of unstructured data, and filtering “noise” from data (Khoury and Ioannidis, 2014; Palfrey and Gasser, 2012). As a result, much of the promise of big data in public health still remains in the domain of imagined possibility. Nonetheless, this chapter will describe and anticipate ethical challenges for big data in the public health domain, with a special focus on the following non-exhaustive list of concerns: privacy, control and sharing of data, roles of state and nonstate actors, fair distribution of benefits and burdens, civic empowerment, and accountability.

It is worth noting the complex interplay between the technical and ethical concerns. For example, it could be argued that, given the current limited uptake of big data in public health, addressing trade-offs between privacy and public benefit is not yet called for. On the other hand, it might be argued that there is a moral responsibility to take measures that will enable big data approaches to fulfill their as yet unrealized potential. We can readily see that a dilemma may arise: big data cannot fulfill its public health potential because of ethical constraints (e.g., privacy concerns), which are in turn justified by a claim that technical approaches to address those ethical constraints (e.g., privacy protections) are not yet sufficiently reliable.

The Big Data Phenomenon and its Potential for Public Health

Big Data from the Health Care System

Hopes about big data were initially fueled by the advent and increasing adoption of electronic health records (EHRs) (Ross, Wei, and Ohno-Machado, 2014; Birkhead, Klompas, and Shah, 2015; Hoffman and Podgurski, 2013). EHRs are rich sources of data, including not just the patients’ health data, but basic demographic data as well. EHR systems may have additional functionalities and even allow for some practice-level statistics. Therefore, analysis of EHR data can be used to carry out quality improvement, “real-world” research, and public health surveillance for communicable and noncommunicable diseases (Ross et al., 2014). In some countries, institutions that have established EHR systems are obliged to submit data to the health authorities, including lab results, syndromic surveillance, and immunization data. Several studies have also demonstrated the potential of EHR data mining for pharmacovigilance. More recent studies have shown that analysis of open notes on health records can pick up adverse events earlier than standard pharmacovigilance, leading to faster examination of suspected adverse events (LePendu et al., 2013).

Routine electronic reporting from health system EHRs results in calculation of population-level immunization rates. A system dubbed ESP (Electronic Support for Public Health) that sits alongside and communicates with EHRs has been used to monitor events of potential public health significance such as vaccine adverse events (Baker et al., 2015). However, it is fair to say that the full potential of EHRs in public health has not yet been adequately realized due to a variety of factors, including lack of interoperability across EHR systems, lack of data analytics specific to public health, lack of standard implementation practices, and unanticipated challenges in the disruption of work flow in the clinical context (Cresswell and Sheikh, 2012).

Big Data beyond the Health Care System

Another promising use of big data in public health draws on data sources from outside the health care system. Internet searches, social media, cell phone metadata, and other sources can be mined for a variety of public health purposes. For example, predictive analytics (data analytics tools that make predictions about future events) can be used to identify communities, neighborhoods, and eventually individuals that are in need of health services. Early examples of this have been reported by the Chicago Department of Public Health, which used predictive analytics to develop microtargeting campaigns for mammography screening (Bhatt and Mansour, 2015). Furthermore, computational analyses of opinions expressed in social media (sentiment analysis) can point to the specific types of health information that would be most relevant for these communities. On the basis of such information, health promotion programs can be better targeted and result in better outcomes (Ayers et al., 2016). Examples of applications in the public health sphere are increasing and include complex issues such as vaccination coverage and utilization of mental health services, including interventions for suicide prevention (Costa-Roberts, 2015).

Perhaps most notably, big data sources outside the health care system can be used as a source of early and reliable information for public health surveillance. This activity is better known as “digital disease detection” (DDD) (Brownstein et al., 2009). (Other terms that have been used for this category of surveillance include “informal” or “unstructured” surveillance, “epidemic intelligence,” and “event-based” surveillance, or EBS.) Many examples of DDD systems have been developed over approximately twenty-five years. ProMED-mail (the International Society for Infectious Diseases’ Program for Monitoring Emerging Diseases), for example, is an expert-moderated system that is widely used in the global public health community (Yu and Madoff, 2004), and HealthMap is an automated system that retrieves data on disease outbreaks via the Internet (Harvard Medical School and Boston Children’s Hospital, 2018).

Other innovations have used alternative data sources. A notable and well-documented example was Google Flu Trends (and later Google Dengue Trends), which used geolocalized search query data to detect or predict disease outbreaks. The ever-expanding sources of data that have been studied for disease-detection purposes now include cell phone usage patterns, Wikipedia access, online access to medical textbooks, and queries (Wesolowski et al., 2014; Santillana et al., 2014; McIver and Brownstein, 2014).

Syndromic surveillance, based on collected reports of symptoms of illness (e.g., fever, headache, vomiting), is often categorized as part of traditional or official public health systems. One innovative syndromic approach, outside of the formal health system, aggregates self-reports of symptoms to monitor potential disease events. Sometimes called “participatory surveillance,” this approach includes systems such as Flu Near You in the United States and Salud Boricua in Puerto Rico, whose results closely parallel those of more established public health surveillance.

Finally, another concept that can potentially impact public health is that of the digital phenotype. It is based on the idea of the extended phenotype, first introduced by Richard Dawkins, who argued that our phenotype cannot be limited strictly to our biological processes (Dawkins, 1982). Instead, our interactions with the environment and the ways in which we modify it are part of our wider phenotype. Capturing and understanding these interactions allows for greater understanding of how we function, and therefore of our health status. Jain et al. (2015) take this idea further, opening the notion of the phenotype to our daily digital interactions. We interact with personal digital technologies, we modify them and they affect us, and they constitute a major part of our environment; as such, they are natural extensions of our phenotypes. In this understanding, the digital phenotype serves as a normative guide that aims to offer (a) an alternative, but not exclusive, approach to the standard biomedical paradigm that typically starts with a fixed hypothesis about biological processes and aims to collect evidence to refute or confirm it; and (b) a more unified take on big data, whereby all data that can be captured about, or from, a person can contribute to understanding biology, health, and disease. The “digital phenotype” gives an answer to several calls over recent years to exploit all kinds of data in relation to the individual (Murdoch and Detsky, 2013; Ayers et al., 2014). Several epistemic arguments have been offered in support of such efforts, including the identification of new hypotheses, real time insight, and assumption-free insight (Weber et al., 2014).

Key Ethical Challenges

Ethical challenges to adopting the methods described above revolve around the blurring of three previously clearer boundaries: between personal health data and nonhealth data; between the private and the public sphere in the online world; and, finally, between the powers and responsibilities of state and nonstate actors in relation to big data. In this section, we explore some key ethical questions that arise in the new context in which these familiar divisions can no longer be assumed to be fixed. Towards that end, Figure 31.1 articulates features of big data and specifies their potential public health applications, identifying key ethical issues that may arise from big data’s use in the public health sector.

Figure 31.1 Key Ethical Issues in Big Data and Public Health. The inner circle contains different features of big data, the middle sections refer to possible applications of big data in the field of public health, and the broadest circle lists key ethical issues arising from the use of big data in the public health sector.

Privacy

Privacy is often cited as the most difficult ethical challenge in uses of big data (Gasser, 2015). Several reasons account for this difficulty. First, to increase utilization of data, access should be granted to several users (researchers or public health practitioners) for uses that were not necessarily intended or anticipated when the data were collected. Widening access to data in this way can aggravate the risk of breaches of confidential information. Second, routine measures adopted in the past to ensure privacy protection, such as de-identification or even anonymization, may no longer be adequate to protect the confidentiality of data (Rothstein, 2010; de Montjoye et al., 2015). As a result of advanced analytical and computational methods, linking different data sets increases the risk of re-identification (Sweeney, 2000). Third, a primary mechanism for assuring the protection of privacy has been respect for individual consent (Vayena and Gasser, 2016a). However, the consent model is stretched to the breaking point when what is in prospect is consenting to unknown uses and linkages of data by various unknown users (Vayena, Mastroianni, and Kahn, 2013; Koenig, 2014; Cate and Mayer-Schönberger, 2013). A powerful feature of big data is that data from diverse data sources are linked and analyzed for an infinite variety of purposes that cannot be foreseen at the time that consent is obtained from an individual. It is therefore impossible for an individual to make an informed decision about the risks and benefits of data use when both the purpose of use and future users are unknown. Finally, underlying all these challenges is the deeper challenge of articulating a method to protect the confidentiality of private information that does not unduly inhibit the use of information for the common good.
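The linkage risk described above can be made concrete with a small sketch. The Python example below is illustrative only: every record is fabricated, and the choice of quasi-identifiers (postal code, birth year, sex) and the function name `reidentify` are assumptions made for this illustration, not a method taken from the studies cited. It shows how a health data set with names removed can still be matched to a separate named data set whenever a combination of ordinary attributes is unique in both.

```python
from collections import Counter

# Fabricated "de-identified" health records: names removed, quasi-identifiers kept.
health_records = [
    {"zip": "02138", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02138", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1980, "sex": "F", "diagnosis": "migraine"},
    {"zip": "02139", "birth_year": 1980, "sex": "F", "diagnosis": "hypertension"},
]

# Fabricated public roster that carries names and the same quasi-identifiers.
public_roster = [
    {"name": "Person A", "zip": "02138", "birth_year": 1975, "sex": "F"},
    {"name": "Person B", "zip": "02138", "birth_year": 1975, "sex": "M"},
    {"name": "Person C", "zip": "02139", "birth_year": 1980, "sex": "F"},
    {"name": "Person D", "zip": "02139", "birth_year": 1980, "sex": "F"},
]

# Assumed quasi-identifiers for this illustration.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(health, roster):
    """Link records whose quasi-identifier combination is unique in both sets."""
    health_counts = Counter(key(r) for r in health)
    roster_counts = Counter(key(r) for r in roster)
    # Only consulted for keys that occur exactly once in the roster.
    names = {key(r): r["name"] for r in roster}
    matches = []
    for record in health:
        k = key(record)
        if health_counts[k] == 1 and roster_counts[k] == 1:
            matches.append((names[k], record["diagnosis"]))
    return matches


matches = reidentify(health_records, public_roster)
for name, diagnosis in matches:
    print(name, "->", diagnosis)
```

In this toy data, two of the four health records are linked to a name because their attribute combination is unique, while the two records that share a combination remain ambiguous. Real data sets differ in scale and in the attributes available, so the sketch conveys only the logic of linkage, not a measure of actual risk.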

Complicating all these challenges is the way in which the distinction between the public and private spheres has been called into question in recent years (Gleibs, 2014; Zimmer, 2010). Studies of social media users have shown that being online and sharing information about oneself is not necessarily the same as having decided to “go public” with the same information (O’Brien et al., 2015). Users may have expectations of privacy despite being active online, and this includes the expectation that they are not being tracked or having personal data (beyond actual postings on a social medium) used and shared by the owners of the interface they are using (e.g., Facebook sharing with Google).

An example that illustrates these tensions is the use of Twitter data for public health surveillance. It is unlikely that those who tweet are aware of all the ways their tweets may be put to use (e.g., as public health surveillance or pharmacovigilance). Although Twitter data are in the public domain, questions have been raised about the ethics of their use (Rivers and Lewis, 2014). Two different arguments have been constructed: one relates to the earlier point about people who tweet being unaware of the uses of their data, including potential risks that such uses may carry; the second, which represents a more European perspective on the value of privacy, relates to the point that even if people made their personal data publicly available, that does not automatically translate into unconditional permission for use by anyone for any purpose. The latter point is of particular interest in the following sense: What an individual makes publicly available, such as a symptom she experiences, like a “headache,” will be combined with similar data from others; algorithms will then be applied to the data; and novel information will be derived that reveals more about her—for example, that she may be suffering from a specific health condition. It is not the data point “headache” that this person considers private (she made this publicly accessible), but rather the end result of the analysis of this and other data points—in this example, that she may be suffering from a specific health condition. Given the stigma associated with many diseases and conditions, real harm can occur to individuals and groups, if either can be identified in the reporting on results.

Despite such concerns, the ease with which such data sets can be analyzed and the increasing evidence about their potential value make it tempting for various organizations to utilize them. Consider the example of suicide prevention (O’Dea et al., 2015). Big social media platforms (e.g., Facebook, Reddit’s SuicideWatch) have developed suicide prevention programs. In 2014 Twitter partnered with Samaritans, a suicide prevention charity, on an app (Samaritans Radar) that screened tweets for suicide-relevant content and alerted followers of the person considered to be at risk. The project was enthusiastically launched, only to be suspended a week after the launch due to concerns about privacy and stigmatization. This example highlights the tension between achieving some public health goals and respecting the right to privacy (Lee, 2014; Samaritans, 2014).

Control of Data and Data Sharing

A number of ethical issues arise from the question of who controls data sets, including the mechanisms that determine how data may be shared. Sharing of data for public health purposes, even within the same nation, is challenging, but it is even more so across international borders. Public health emergencies often bring the difficulties in data sharing to the forefront. The barriers to sharing data include logistical, political, economic, legal, and ethical factors. Notably, there is no global framework for data sharing in public health, or even specifically for surveillance.

Moreover, many data sources that constitute big data, and that have potentially valuable public health uses, typically belong to private companies, such as those that operate social media platforms or search engines. One question that requires deeper exploration is what obligations, if any, such private companies have to make their data (or metadata) accessible to public health researchers or authorities for purposes such as public health surveillance. Many such companies treat personal data that they have in their possession as a key business asset. This in turn raises further questions about the conditions that need to be met in order for access to be granted, and about the means by which possible uses of data may be communicated to the public. In the post-Snowden era of concern about state surveillance, public concerns about the legitimacy of state surveillance for security purposes may negatively affect public acceptance of public health surveillance. Therefore, it is crucial to distinguish the role (and/or intent) of public health surveillance from other forms of surveillance, whether conducted by state or nonstate actors.

Nonstate Actors in Public Health

No adequate account of the ethics of big data in public health is possible without an extended analysis of the nature and capabilities of the agents involved. The state, of course, is a key player in public health. Others include powerful nonstate actors (e.g., corporations), some with a global reach in their activities and with the capacity to draw on levels of resources that surpass those possessed by many states. In a space with a plurality of highly diverse actors, it is all the more important to explore the question of the public health role that can be properly assigned to each. Consider, for example, that a private company running analytics on large data is able to discover (perhaps without even seeking to do so) a pattern of disease related to a behavior, a region threatened by an outbreak, or even a particular individual as a source of an infectious disease outbreak. Is this company obliged to report its findings to health authorities? Or is it obliged to communicate its findings to the broader public, or to individuals who are specially affected? What are the duties of nonstate actors to anticipate such concerns through incorporating a public health dimension into their due diligence procedures (Kahn et al., 2014)? As argued in the United Nations’ Guiding Principles on Business and Human Rights, nonstate actors, such as corporations, have responsibilities toward global justice and human rights quite independent of their legal duties (UN OHCHR, 2011). Therefore, some nonstate actors may bear specific responsibilities to contribute to public health activities (e.g., corporations collecting data that can be used for surveillance in a country that is lacking a state surveillance system), while they simultaneously have the duty to respect individual rights.

Finally, beyond the role of nonstate actors as subjects of norms, it is also important to consider the role they play in the process of norm creation in the sphere of big data. This role partly arises out of the fact that powerful nonstate actors are often in possession of the data sets, have special control and expertise regarding their handling (e.g., encryption), and operate globally and hence are able to influence standard-setting across a variety of jurisdictions. Naturally, the assumption of a quasi-legislative role by unaccountable nonstate actors motivated by profit maximization gives rise to a serious concern about the legitimacy of the ensuing norms. Serious questions arise as to whether nonstate actors should have any such role and, if they do, under what conditions, relating to such factors as accountability and transparency.

Harm Mitigation

Various kinds of harms potentially arise when individuals or communities are identified within the context of some public health activity. Such harms may include the infringement of individual rights, discrimination, and stigmatization, which may result in economic and social sanctions, social exclusion, and damaged family and social relationships, among other harms. The risk of these harms is in various ways exacerbated because big data approaches to public health are in an early stage of technical development and remain prone to errors such as misidentification and false predictions. Some of the most commonly discussed methodological challenges for big data in public health involve difficulties in filtering out “noise” from data and validating and adjusting algorithms, including the entire spectrum of relevant data versus data that may give a skewed result. An example often cited with respect to the methodological issues is Google Flu Trends, which initially appeared to be very successful in identifying influenza trends in the United States, doing so as accurately as, but much faster than, the US Centers for Disease Control and Prevention (CDC). Eventually, however, the Google flu algorithms failed to deliver similarly accurate results, becoming a landmark case of the potential hype and risks associated with big data analytics in the public health sphere (Cook et al., 2011; Butler, 2013; Lazer et al., 2014).

The question of redressing harms in public health has also been raised in the opposite way: If people are harmed by inaction or, in the case of big data, failure to use available technologies and means to improve public health, what means are available to individuals or groups for redress? Allen et al. (2013) have argued that there is no redress mechanism for such harms. This argument can be extended to the big data context, particularly as novel methods are developed that enhance the value of big data analytics in advancing the public’s health.

Fair Distribution of Benefits

Big data uses for public health can generate both benefits (e.g., improvements in health) and burdens (e.g., risk of privacy breaches) for individuals and communities. This leads to a series of questions about whether the burdens and benefits of big data approaches to public health are fairly distributed. In 2013 the National Health Service (NHS) in England introduced an initiative aiming to collect patients’ records from general practitioners in a centralized database. The data would be used for research as well as for quality improvement in service provision. However, many critics felt that the burden that might accrue to individual users of the NHS if their confidential data were disclosed was not balanced against the benefit that would likely accrue to profit-making companies that were provided access to their health data. These concerns were consolidated into a strong public opposition to the initiative, which led to its discontinuation in 2016 (UK Department of Health, 2016). Such imbalances will likely arise in the future when big data collected in resource-poor settings is used to benefit those in high-resource settings.

Civic Empowerment

Big data offers the prospect of an actively engaged citizenry, one that voluntarily takes up opportunities to make valuable contributions to public health that serve the common good. Participatory models may counteract the negative image that, in the perspective of big data analysis, individuals are reduced to merely the source of valuable data to be used for the benefit of someone other than themselves. We should be wary of a potential dark side, however, to the rhetoric of empowerment. This is the risk that under the banner of giving people enhanced options one is in fact burdening them with responsibilities that they are not equipped to discharge. For example, in a future scenario in which the delivery of health care depends on a widespread use of participatory methods, those ill-equipped to make use of those methods (e.g., elderly people who are not adept at using smart phones or other technologies) may enjoy only an illusory form of empowerment (Robinson et al., 2015).

Accountability

The foregoing is a non-exhaustive list of the kinds of considerations that are central to the regulation of the use of big data in public health. In addition to these “first-order” concerns, it is necessary also to bring into play “second-order” institutional mechanisms of accountability (Vayena et al., 2018). These mechanisms monitor the extent to which the first-order concerns are being addressed and provide methods for improving performance and avenues for seeking appropriate remedies in the case of failure. These accountability mechanisms should apply to both state and nonstate actors, although their nature may differ in line with differences in the kinds of actors, their goals in using big data, the degree of power they enjoy, and so on.


Conclusion

New streams of data, from both inside and outside the health care system, along with techniques to analyze them, will only increase. Our interconnected devices that automatically communicate and exchange data within networks are already offering a glimpse of the future of ubiquitous and continuous data generation and capture. These technologies can offer benefits to public health. However, for them to serve the purpose of public health successfully, they need to be carefully balanced against potential harms, including the loss of privacy and the diminishment of autonomy and equity (Vayena and Gasser, 2016b). Public health must respect individual human rights (Childress et al., 2002), but it must also draw its moral legitimacy from considering the common good (Tasioulas and Vayena, 2016). Respect for individual rights and promotion of the common good of health are not solely matters of state responsibility. Nonstate actors are increasingly playing important roles in data control and data uses, which can impact the conduct of public health research and activity. Therefore, their responsibilities in relation to their interaction with state actors need to be clearly spelled out. Ultimately, public health success will rely on the public’s trust and cooperation, and for that, transparency and accountability are crucial. Big data in public health cannot afford the shadow of opacity about potential harms relating to privacy, discrimination, and fair distribution of benefits. It needs to identify, balance, and mitigate them in order to legitimate its application and harness its immense potential.


Allen, J., Hulman, D., Meslin, E. M., and Stanley, F. 2013. “Privacy Protectionism and Harms to Health: Is There Any Redress for Harms to Health?” Journal of Law and Medicine 21: 473–485.Find this resource:

Ayers, J. W., Althouse, B. M., and Dredze, M. 2014. “Could Behavioral Medicine Lead the Web Data Revolution?” JAMA 311(14): 1399–1400. doi:10.1001/jama.2014.1505.Find this resource:

(p. 364) Ayers, J. W., Westmaas, J. L., Leas, E. C., Benton, A., Chen, Y., Dredze, M. et al. 2016. “Leveraging Big Data to Improve Health Awareness Campaigns: A Novel Evaluation of the Great American Smokeout.” JMIR Public Health and Surveillance 2(1): e16. doi:10.2196/publichealth.5304.Find this resource:

Baker, M. A., Kaelber, D. C., Bar-Shain, D. S., Moro, P. L., Zambarano, B., Mazza, M., et al. 2015. “Advanced Clinical Decision Support for Vaccine Adverse Event Detection and Reporting.” Clinical Infectious Diseases 61(6): 864–870. doi:10.1093/cid/civ430.Find this resource:

Bhatt, J., and Mansour, R. 2015. “How Chicago Is Using Predictive Analytics to Improve Public Health.” Paper presented at the Annual Meeting of the American Public Health Association, November 3. this resource:

Birkhead, G. S., Klompas, M., and Shah, N. R. 2015. “Uses of Electronic Health Records for Public Health Surveillance to Advance Public Health.” Annual Review of Public Health 36: 345–359.Find this resource:

Brownstein, J. S., Freifeld, C. C., and Madoff, L. C. 2009. “Digital Disease Detection—Harnessing the Web for Public Health Surveillance.” New England Journal of Medicine 360(21): 2153–2157. doi:10.1056/NEJMp0900702.Find this resource:

Butler D. 2013. “When Google Got Flu Wrong.” Nature 494(7436): 155–156.Find this resource:

Cate, F. H., and Mayer-Schönberger, V. 2013. “Notice and Consent in a World of Big Data.” International Data Privacy Law 3(2): 67–73. doi:10.1093/idpl/ipt005.Find this resource:

Childress, J. F., Faden, R. R., Gaare, R. D., Gostin, L. O., Kahn, J., Bonnie, R. J., et al. 2002. “Public Health Ethics: Mapping the Terrain.” Journal of Law, Medicine & Ethics 30(2): 170–178.Find this resource:

Cook, S., Conrad, C., Fowlkes, A. L., and Mohebbi, M. H. 2011. “Assessing Google Flu Trends Performance in the United States during the 2009 Influenza Virus A (H1N1) Pandemic.” PLoS ONE 6(8): e23610. doi:10.1371/journal.pone.0023610.Find this resource:

Costa-Roberts, D. 2015. “South Korea Announces App to Combat Student Suicide.” PBS NewsHour, March 15. this resource:

Cresswell, K., and Sheikh, A. 2012. “Effective Integration of Technology into Health Care Needs Adequate Attention to Sociotechnical Processes, Time and a Dose of Reality.” Letter to the Editor. JAMA 307(21): 2255. doi:10.1001/jama.2012.3520.Find this resource:

Dawkins, R. 1982. The Extended Phenotype: The Gene as the Unit of Selection (San Francisco: W.H. Freeman).

de Montjoye, Y. A., Radaelli, L., Singh, V. K., and Pentland, A. 2015. “Unique in the Shopping Mall: On the Reidentifiability of Credit Card Metadata.” Science 347(6221): 536–539. doi:10.1126/science.1256297.

Gasser, U. 2015. “Perspectives on the Future of Digital Privacy.” ZSR II(134): 426–427.

Gleibs, I. H. 2014. “Turning Virtual Public Spaces into Laboratories: Thoughts on Conducting Online Field Studies Using Social Network Sites.” Analyses of Social Issues and Public Policy 14: 352–370.

Goodman, K. W., and Meslin, E. M. 2014. “Ethics, Information Technology, and Public Health: Duties and Challenges in Computational Epidemiology.” In Public Health Informatics and Information Systems, 2nd ed., edited by J. A. Magnuson and P. Fu, 191–209 (New York: Springer).

Harvard Medical School and Boston Children’s Hospital. 2018.

Hoffman, S., and Podgurski, A. 2013. “Big Bad Data: Law, Public Health, and Biomedical Databases.” Journal of Law, Medicine & Ethics 41 (Suppl. 1): 56–60.

Jain, S. H., Powers, B. W., Hawkins, J. B., and Brownstein, J. S. 2015. “The Digital Phenotype.” Nature Biotechnology 33(5): 462–463.

Kahn, J. P., Vayena, E., and Mastroianni, A. C. 2014. “Opinion: Learning as We Go: Lessons from the Publication of Facebook’s Social-Computing Research.” Proceedings of the National Academy of Sciences 111(38): 13677–13679. doi:10.1073/pnas.1416405111.

Khoury, M. J., and Ioannidis, J. P. A. 2014. “Big Data Meets Public Health: Human Well-Being Could Benefit from Large-Scale Data if Large-Scale Noise Is Minimized.” Science 346(6213): 1054–1055.

Koenig, B. A. 2014. “Have We Asked Too Much of Consent?” Hastings Center Report 44(4): 33–34.

Lazer, D., Kennedy, R., King, G., and Vespignani, A. 2014. “The Parable of Google Flu: Traps in Big Data Analysis.” Science 343(6176): 1203–1205.

Lee, N. 2014. “Trouble on the Radar.” Lancet 384(9958): 1917.

LePendu, P., Iyer, S. V., Bauer-Mehren, A., Harpaz, R., Mortensen, J. M., Podchiyska, T., et al. 2013. “Pharmacovigilance Using Clinical Notes.” Clinical Pharmacology and Therapeutics 93(6): 547–555. doi:10.1038/clpt.2013.47.

Mayer-Schönberger, V., and Cukier, K. 2013. Big Data: A Revolution That Will Transform How We Live, Work, and Think (London: John Murray).

McIver, D. J., and Brownstein, J. S. 2014. “Wikipedia Usage Estimates Prevalence of Influenza-Like Illness in the United States in Near Real-Time.” PLoS Computational Biology 10(4): e1003581. doi:10.1371/journal.pcbi.1003581.

Mittelstadt, B. D., and Floridi, L. 2015. “The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts.” Science and Engineering Ethics 22(2): 303–341. doi:10.1007/s11948-015-9652-2.

Murdoch, T. B., and Detsky, A. S. 2013. “The Inevitable Application of Big Data in Health Care.” JAMA 309(13): 1351–1352. doi:10.1001/jama.2013.393.

O’Brien, D., Ullman, J., Altman, M., Gasser, U., Bar-Sinai, M., Nissim, K., et al. 2015. Integrating Approaches to Privacy Across the Research Lifecycle: When Is Information Purely Public? Berkman Center Research Publication No. 2015–2017 (Cambridge, Mass.: Berkman Center for Internet & Society). doi:10.2139/ssrn.2586158.

O’Dea, B., Wan, S., Batterham, P. J., Calear, A. L., Paris, C., and Christensen, H. 2015. “Detecting Suicidality on Twitter.” Internet Interventions 2(2): 183–188.

Palfrey, J., and Gasser, U. 2012. Interop: The Promise and Perils of Highly Interconnected Systems (New York: Basic Books).

Pentland, A., Reid, T. G., and Heibeck, T. 2013. Big Data and Health: Revolutionizing Medicine and Public Health. Report of the WISH Big Data and Health Working Group.

Rivers, C. M., and Lewis, B. L. 2014. “Ethical Research Standards in a World of Big Data.” F1000Research 3.

Robinson, L., Cotten, S. R., Ono, H., Quan-Haase, A., Mesch, G., Chen, W., et al. 2015. “Digital Inequalities and Why They Matter.” Information, Communication and Society 18(5): 569–582. doi:10.1080/1369118X.2015.1012532.

Ross, M. K., Wei, W., and Ohno-Machado, L. 2014. “ ‘Big Data’ and the Electronic Health Record.” Yearbook of Medical Informatics 9(1): 97. doi:10.15265/IY-2014-0003.

Rothstein, M. A. 2010. “Is Deidentification Sufficient to Protect Health Privacy in Research?” American Journal of Bioethics 10(9): 3–11. doi:10.1080/15265161.2010.494215.

Samaritans. 2014. “Samaritans Launches Twitter App to Help Identify Vulnerable People.” Press Release, October 29.

Santillana, M., Nsoesie, E. O., Mekaru, S. R., Scales, D., and Brownstein, J. S. 2014. “Using Clinicians’ Search Query Data to Monitor Influenza Epidemics.” Clinical Infectious Diseases 59(10): 1446. doi:10.1093/cid/ciu647.

Shaw, J. 2014. “Why ‘Big Data’ Is a Big Deal.” Harvard Magazine, March-April.

Sweeney, L. 2000. “Simple Demographics Often Identify People Uniquely.” Data Privacy Working Paper 3 (Pittsburgh: Carnegie Mellon University).

Tasioulas, J., and Vayena, E. 2016. “Public Health and Human Rights.” JAMA 316(1): 103–104. doi:10.1001/jama.2016.5244.

UK Department of Health. 2016. “Review of Health and Care Data Security and Consent.” Written statement to Parliament, July 6.

UN OHCHR (United Nations Office of the High Commissioner for Human Rights). 2011. Guiding Principles on Business and Human Rights (Geneva: United Nations).

Vayena, E., and Gasser, U. 2016a. “Between Openness and Privacy in Genomics.” PLoS Medicine 13(1): e1001937.

Vayena, E., and Gasser, U. 2016b. “Strictly Biomedical? Sketching the Ethics of the Big Data Ecosystem in Biomedicine.” In Ethics of Biomedical Big Data, edited by B. Mittelstadt and L. Floridi, 17–40 (Cham, Switzerland: Springer).

Vayena, E., Mastroianni, A., and Kahn, J. 2012. “Ethical Issues in Health Research with Novel Online Sources.” American Journal of Public Health 102(12): 2225–2230. doi:10.2105/AJPH.2012.300813.

Vayena, E., Mastroianni, A., and Kahn, J. 2013. “Caught in the Web: Informed Consent for Online Health Research.” Science Translational Medicine 5(173): 173fs6. doi:10.1126/scitranslmed.3004798.

Vayena, E., Salathé, M., Madoff, L. C., and Brownstein, J. S. 2015. “Ethical Challenges of Big Data in Public Health.” PLoS Computational Biology 11(2): e1003904. doi:10.1371/journal.pcbi.1003904.

Vayena, E., Dzenowagis, J., Brownstein, J. S., and Sheikh, A. 2018. “Big Data Implications for Public Health.” Bulletin of the World Health Organization 96: 66–68.

Weber, G. M., Mandl, K. D., and Kohane, I. S. 2014. “Finding the Missing Link for Big Biomedical Data.” JAMA 311(24): 2479–2480. doi:10.1001/jama.2014.4228.

Wesolowski, A., Buckee, C. O., Bengtsson, L., Wetter, E., Lu, X., and Tatem, A. J. 2014. “Commentary: Containing the Ebola Outbreak—the Potential and Challenge of Mobile Network Data.” PLoS Currents Outbreaks, September 29.

Yu, V. L., and Madoff, L. C. 2004. “ProMED-mail: An Early Warning System for Emerging Diseases.” Clinical Infectious Diseases 39(2): 227–232.

Zimmer, M. 2010. “ ‘But the Data Is Already Public’: On the Ethics of Research in Facebook.” Ethics and Information Technology 12: 313–325. doi:10.1007/s10676-010-9227-5.

Further Reading

Crawford, K., and Schultz, J. 2014. “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms.” BCL Review 55: 93.

The Economist. 2014. “Ebola and Big Data: Waiting on Hold.” The Economist, October 25.

European Commission. 2014. The Use of Big Data in Public Health Policy and Research. Background information document. A report from the Directorate-General for Health and Consumers Unit D3 eHealth and Health Technology Assessment (Brussels: European Commission).

Freifeld, C. C., Brownstein, J. S., Menone, C. M., Bao, W., Filice, R., Kass-Hout, T., et al. 2014. “Digital Drug Safety Surveillance: Monitoring Pharmaceutical Products in Twitter.” Drug Safety 37(5): 343–350. doi:10.1007/s40264-014-0155-x.

Nahass, T. A., and Nahass, R. G. 2012. “Electronic Health Record Technology.” JAMA 307(21): 2255. doi:10.1001/jama.2012.3520.

O’Neill, O. 2013. “Can Data Protection Secure Personal Privacy?” In Genetic Privacy: An Evaluation of the Ethical and Legal Landscape, edited by T. S. H. Kaan and C. W. L. Ho, 25–40 (London: Imperial College Press).

Vayena, E., Dzenowagis, J., Brownstein, J. S., and Sheikh, A. 2018. “Policy Implications of Big Data in the Health Sector.” Bulletin of the World Health Organization 96(1): 66. doi:10.2471/BLT.17.197426.

White, R. W., Tatonetti, N. P., Shah, N. H., Altman, R. B., and Horvitz, E. 2013. “Web-Scale Pharmacovigilance: Listening to Signals from the Crowd.” Journal of the American Medical Informatics Association 20: 404–408. doi:10.1136/amiajnl-2012-001482.