<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="discussion" dtd-version="1.2" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">2694-1473</journal-id>
<journal-title-group>
<journal-title>Journal of the Society for Clinical Data Management</journal-title>
</journal-title-group>
<issn pub-type="epub">2694-1473</issn>
<publisher>
<publisher-name>Society for Clinical Data Management</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.47912/jscdm.234</article-id>
<article-categories>
<subj-group>
<subject>Opinion paper</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>A Privacy Nihilist Perspective on Clinical Data Sharing: Open Clinical Data Sharing is Dead, Long Live the Walled Garden</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Starren</surname>
<given-names>Justin</given-names>
</name>
<degrees>MD, PhD, FACMI</degrees>
<email>Justin.starren@northwestern.edu</email>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Rasmussen</surname>
<given-names>Luke V.</given-names>
</name>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Schneider</surname>
<given-names>Daniel H.</given-names>
</name>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Nannapaneni</surname>
<given-names>Prasanth</given-names>
</name>
<xref ref-type="aff" rid="aff-1">1</xref>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Michelson</surname>
<given-names>Kelly</given-names>
</name>
<xref ref-type="aff" rid="aff-1">1</xref>
<xref ref-type="aff" rid="aff-3">3</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>Northwestern University Feinberg School of Medicine, US</aff>
<aff id="aff-2"><label>2</label>Northwestern Memorial HealthCare, US</aff>
<aff id="aff-3"><label>3</label>Ann &amp; Robert H. Lurie Children&#8217;s Hospital of Chicago, US</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2023-09-26">
<day>26</day>
<month>09</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>3</volume>
<issue>3</issue>
<elocation-id>4</elocation-id>
<history>
<date date-type="received" iso-8601-date="2022-12-15">
<day>15</day>
<month>12</month>
<year>2022</year>
</date>
<date date-type="accepted" iso-8601-date="2023-05-25">
<day>25</day>
<month>05</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2023 The Author(s)</copyright-statement>
<copyright-year>2023</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>SCDM publishes JSCDM content in an open access manner under an Attribution-Non-Commercial-ShareAlike (CC BY-NC-SA) license. This license lets others remix, adapt, and build upon the work non-commercially, as long as they credit SCDM and the author and license their new creations under the identical terms. See <uri xlink:href="https://creativecommons.org/licenses/by-nc-sa/4.0/">https://creativecommons.org/licenses/by-nc-sa/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://www.jscdm.org/articles/10.47912/jscdm.234/"/>
<abstract>
<p>Clinical data sharing, combined with deep learning, and soon quantum computing, has the potential to radically accelerate research, improve health care, and lower costs. Unfortunately, those tools also make it much easier to use the data in ways that can harm patients. This article will argue that the vast amounts of data collected by data brokers, combined with advances in computing, have made reidentification a serious risk for any clinical data that is shared openly. The new National Institutes of Health data sharing policy acknowledges this new reality by directing researchers to consider controlled access for any individual-level data. The clinical data sharing community would be well-advised to follow the lead of the physics and astronomy communities and create a &#8220;walled garden&#8221; approach to data sharing. While the investment will be significant, this approach provides a better combination of access and privacy. Some design considerations for walled gardens are discussed. The article concludes with a list of recommended actions that can be taken by individuals and institutions today.<sup><xref ref-type="bibr" rid="B1">1</xref></sup></p>
</abstract>
<kwd-group>
<kwd>Manage Clinical Research Data</kwd>
<kwd>Define/document data handling process</kwd>
<kwd>Secure Data</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec>
<title>Introduction</title>
<p>Clinical data sharing combined with deep learning, and soon quantum computing, has the potential to radically accelerate research, improve health care, and lower costs. Unfortunately, those tools also make it much easier to use the data in ways that can harm patients. These competing forces are creating a perilous time for clinical data sharing. Future generations will judge informatics and data science by how we balance those forces.<sup><xref ref-type="bibr" rid="B1">1</xref></sup></p>
<p>The term &#8220;clinical data sharing&#8221; has many potential interpretations. This article focuses on the sharing of clinical data for research, rather than to support clinical care or for public health. However, we will draw from incidents that include clinical care to illuminate the challenges and potential pitfalls involved in data sharing in this sector. This article also focuses on sharing data during &#8220;normal times&#8221; rather than during a public health emergency.<sup><xref ref-type="bibr" rid="B2">2</xref></sup> Further, we exclude mandatory data reporting to accrediting or governmental programs,<sup><xref ref-type="bibr" rid="B3">3</xref></sup> and considerations about knowledge sharing (such as pay-to-publish and open access journals).</p>
<p>Human Rights Watch has declared data privacy a human right,<sup><xref ref-type="bibr" rid="B4">4</xref></sup> as has the United Nations High Commissioner for Human Rights.<sup><xref ref-type="bibr" rid="B5">5</xref></sup> The ethical mandate for health information privacy traces back at least to Hippocrates.<sup><xref ref-type="bibr" rid="B6">6</xref></sup> Researchers are ethically bound to respect and protect the privacy rights of research participants.<sup><xref ref-type="bibr" rid="B7">7</xref></sup> Unfortunately, as we will discuss later, data, including clinical data, are frequently used for reasons beyond their original purpose, potentially violating privacy rights. We propose that open clinical data sharing (data that is freely available to any and all users without authentication or repercussions<sup><xref ref-type="bibr" rid="B8">8</xref></sup>) cannot adequately protect the privacy rights of research participants. We further propose that clinical data sharing for research should limit sharing to known and trusted entities, with severe penalties for data misuse. Doing less risks losing the trust of patients and research subjects.</p>
</sec>
<sec>
<title>There is no privacy</title>
<p>In 1999, the chief executive of Sun Microsystems, Scott McNealy, famously declared, &#8220;You have zero privacy anyway. Get over it.&#8221;<sup><xref ref-type="bibr" rid="B9">9</xref></sup> While possibly hyperbolic two decades ago, his statement becomes truer with each passing year. Unlike many other countries, including Canada, Japan, and those in the European Union, the United States (US) has no overarching law that protects personal information.<sup><xref ref-type="bibr" rid="B10">10</xref></sup> Instead, there are specific laws that cover specific data types, including driver&#8217;s licenses,<sup><xref ref-type="bibr" rid="B11">11</xref></sup> educational records,<sup><xref ref-type="bibr" rid="B12">12</xref></sup> credit reports,<sup><xref ref-type="bibr" rid="B13">13</xref></sup> video rental records,<sup><xref ref-type="bibr" rid="B14">14</xref></sup> and data produced by covered health care entities.<sup><xref ref-type="bibr" rid="B15">15</xref></sup> The US regulates privacy based on the entity creating the data rather than the content of the data. For comparison, web search data on the term &#8220;diabetes&#8221; would be protected in Canada as health information, but not in the US because it was not generated by a covered entity.<sup><xref ref-type="bibr" rid="B15">15</xref></sup></p>
<p>Few patients or research subjects understand this nuanced distinction between source and content. Many incorrectly assume that sensitive data about their health is automatically protected. An analogy to how the US regulates data would be if chemicals were regulated based on the company that produced them. Under this approach, a chemical produced under the auspices of a pharmaceutical company would be regulated as a drug, but the same chemical produced under the auspices of a food company would not (eg, the cocaine produced as a byproduct of Coca-Cola production<sup><xref ref-type="bibr" rid="B16">16</xref></sup> would be unregulated). This is not how we regulate chemicals. It should not be the way we regulate data.</p>
<p>Also underappreciated is the vast amount of unregulated data, including medical data, that is collected on every individual. Data brokers &#8212; companies that collect, aggregate, and resell vast amounts of personal and sensitive data &#8212; are virtually unregulated. Justin Sherman, co-founder of Ethical Tech,<sup><xref ref-type="bibr" rid="B17">17</xref></sup> in testimony to the US Senate noted:</p>
<disp-quote>
<p>Data brokers gather your race, ethnicity, religion, gender, sexual orientation, and income level; major life events like pregnancy and divorce; medical information like drug prescriptions and mental illness; your real-time smartphone location; details on your family members and friends; where you like to travel, what you search online, what doctor&#8217;s office you visit, and which political figures and organizations you support.<sup><xref ref-type="bibr" rid="B18">18</xref></sup></p>
</disp-quote>
<p>In <italic>Our Bodies, Our Data</italic>,<sup><xref ref-type="bibr" rid="B19">19</xref></sup> Adam Tanner profiles the multibillion-dollar business of selling medical records. He observes, &#8220;medical data miners cross-reference anonymized patient dossiers with named consumer profiles from data brokers,&#8221; noting that one can easily purchase a fully identified list (ie, name, address, phone number, etc.) of people with a given disease, such as, &#8220;clinical depression, irritable bowel syndrome, erectile dysfunction, even HIV.&#8221;<sup><xref ref-type="bibr" rid="B19">19</xref></sup></p>
<p>What data brokers do is legal. Even when behavior is clearly illegal, penalties are slight. Cambridge Analytica, a British consulting company, collected personal data on millions of Facebook users without their consent and used the data for political advertising to support the 2016 Trump presidential campaign.<sup><xref ref-type="bibr" rid="B20">20</xref></sup> While the company was prosecuted and went bankrupt, the punishments for individuals involved were minimal. The CEO was banned from serving as a corporate director for seven years; no one was incarcerated.<sup><xref ref-type="bibr" rid="B21">21</xref></sup> Although Facebook (Meta) was given a $5 billion fine by the Federal Trade Commission,<sup><xref ref-type="bibr" rid="B20">20</xref></sup> this was less than 6% of its revenue and Meta&#8217;s stock price did not fall, which suggests that the stock market viewed this as &#8220;business as usual.&#8221;</p>
<p>Many assume incorrectly that the Health Insurance Portability and Accountability Act (HIPAA) protects all clinical data. Data brokers have found ways around HIPAA.<sup><xref ref-type="bibr" rid="B19">19</xref></sup> Moreover, most clinical data sharing for research does not involve HIPAA-regulated data. Data transferred to researchers are no longer regulated under HIPAA. Even so, most research-related sharing involves removal of the 18 HIPAA-designated identifiers.<sup><xref ref-type="bibr" rid="B22">22</xref></sup> This is thought by some to render the data &#8220;safe&#8221; and freely sharable without restriction. Some institutions do not even consider HIPAA de-identified data to be human subject data.</p>
</sec>
<sec>
<title>The re-identification problem</title>
<p>In our experience, not only do many researchers not fully understand the &#8220;HIPAA 18&#8221; identifiers and fail to correctly remove them, but also &#8220;HIPAA 18&#8221; censoring does not make clinical data unidentifiable. Under HIPAA, &#8220;The covered entity also must have no actual knowledge that the remaining information could be used alone or in combination with other information to identify the individual.&#8221;<sup><xref ref-type="bibr" rid="B22">22</xref></sup> Sherman noted that the sheer volume of data now available makes re-identification quite easy.<sup><xref ref-type="bibr" rid="B18">18</xref></sup> For example, the spacing of days between individual data entries (no actual dates) in a clinical data warehouse of over six million patients contained patterns of spacings that were unique for many patients (including author JS). Since care for JS occurred at specific locations, a cross-reference to cellphone location data could easily re-identify JS&#8217;s data, in the same way that January 6 rioters have been identified.<sup><xref ref-type="bibr" rid="B23">23</xref></sup></p>
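<p>The spacing example above can be illustrated with a minimal sketch. It assumes each patient is represented only by day offsets from a first record (no calendar dates); the toy data and the function names <monospace>gap_signature</monospace> and <monospace>unique_patients</monospace> are hypothetical and are not taken from the warehouse analysis described above.</p>
<preformat>
```python
from collections import Counter


def gap_signature(visit_days):
    """Return the tuple of day gaps between consecutive records (no actual dates)."""
    ordered = sorted(visit_days)
    return tuple(later - earlier for earlier, later in zip(ordered, ordered[1:]))


def unique_patients(visits_by_patient):
    """Return the IDs of patients whose gap signature is shared by no one else."""
    signatures = {pid: gap_signature(days) for pid, days in visits_by_patient.items()}
    counts = Counter(signatures.values())
    return sorted(pid for pid, sig in signatures.items() if counts[sig] == 1)


# Hypothetical toy data: day offsets from each patient's first record.
toy_visits = {
    "p1": [0, 7, 14],
    "p2": [0, 7, 14],       # same spacing as p1, so neither is unique
    "p3": [0, 3, 45, 46],
    "p4": [0, 30],
}

print(unique_patients(toy_visits))
```
</preformat>
<p>In this toy set, any patient returned by <monospace>unique_patients</monospace> could in principle be singled out by someone who knows the spacing of that person&#8217;s visits from another source, even though no dates or names are present.</p>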
<p>Since Latanya Sweeney re-identified the medical data of Governor Weld of Massachusetts,<sup><xref ref-type="bibr" rid="B24">24</xref></sup> a string of papers has illustrated the ability to re-identify individuals from supposedly anonymized data sets.<sup><xref ref-type="bibr" rid="B25">25</xref>,<xref ref-type="bibr" rid="B26">26</xref></sup> There is a veritable arms race between developers of anonymization algorithms and developers of re-identification algorithms. It is reasonable to assume that any individual-level data can be re-identified, if not today, then soon.</p>
</sec>
<sec>
<title>Threats to health care data privacy are increasing</title>
<p>On top of increased cyberattacks that have targeted health care data,<sup><xref ref-type="bibr" rid="B27">27</xref></sup> recent political events have accelerated the need to protect patient and research subject privacy.<sup><xref ref-type="bibr" rid="B28">28</xref></sup> Open data are available not only to researchers but also to corporations and to the government. Police use public genetic databases to search for suspected criminals,<sup><xref ref-type="bibr" rid="B29">29</xref></sup> and can subpoena newborn genetic screening results.<sup><xref ref-type="bibr" rid="B30">30</xref></sup> Concerns about governmental access to clinical records are increasingly acute, following the Supreme Court&#8217;s Dobbs v. Jackson Women&#8217;s Health Organization decision.<sup><xref ref-type="bibr" rid="B31">31</xref></sup> Some state Attorneys General are already attempting to subpoena privileged medical records, as has occurred in Indiana.<sup><xref ref-type="bibr" rid="B32">32</xref></sup> This behavior is not new. In 2004, after Congress passed the Partial-Birth Abortion Ban Act, Attorney General John Ashcroft subpoenaed medical records from multiple hospitals, including New York Presbyterian (NYP) and Northwestern Memorial Hospital (NMH). In the case of NYP, the hospital refused; the judge ruled against the hospital and later found the hospital in contempt. In the case of NMH, the hospital successfully blocked the subpoena, but the government appealed. The government later dropped the subpoenas when it became clear that NYP and the other hospitals were ready and willing to fight this all the way to the Supreme Court.<sup><xref ref-type="bibr" rid="B33">33</xref></sup></p>
<p>These examples are relevant to clinical data sharing for research for several reasons. First, the government is legally allowed to obtain data from third parties that it cannot obtain directly. For example, the US government can bulk purchase data that it is legally forbidden to collect directly without a warrant, such as cellphone location data.<sup><xref ref-type="bibr" rid="B34">34</xref></sup> A government attorney, like Ashcroft, looking for evidence of a crime, could analyze openly shared clinical data and would only need to re-identify a fraction of the individuals to argue that a crime had been committed. Second, the Indiana case reinforces that this risk is not merely hypothetical. The International Classification of Diseases, 10th Revision, Clinical Modification (ICD10-CM) contains many reproductive health codes (<xref ref-type="table" rid="T1">Table 1</xref>) that indicate activities considered &#8220;crimes&#8221; in certain states. Similarly, the Systematized Nomenclature of Medicine (SNOMED)<sup><xref ref-type="bibr" rid="B35">35</xref></sup> contains roughly 100 codes related to elective or attempted abortions. Normally, researchers do not purge reproductive health codes from shared data sets.</p>
<table-wrap id="T1">
<label>Table 1</label>
<caption>
<p>ICD10-CM Codes Related to Reproductive Health Activities that are Illegal in some Jurisdictions.</p>
</caption>
<table>
<thead>
<tr>
<td align="left" valign="top"><bold>Code</bold></td>
<td align="left" valign="top"><bold>Definition</bold></td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">O04.X</td>
<td align="left" valign="top">Complications following (induced) termination of pregnancy</td>
</tr>
<tr>
<td align="left" valign="top">O07.X</td>
<td align="left" valign="top">Failed attempted termination of pregnancy</td>
</tr>
<tr>
<td align="left" valign="top">O04.82</td>
<td align="left" valign="top">Renal failure following (induced) termination of pregnancy</td>
</tr>
<tr>
<td align="left" valign="top">O99.32</td>
<td align="left" valign="top">Drug use complicating pregnancy, childbirth, and the puerperium</td>
</tr>
<tr>
<td align="left" valign="top">Z33.2</td>
<td align="left" valign="top">Encounter for elective termination of pregnancy</td>
</tr>
<tr>
<td align="left" valign="top">10A0</td>
<td align="left" valign="top">Abortion, Products of Conception</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>The new NIH Data Sharing Policy is a step in the right direction</title>
<p>The new NIH Data Sharing Policy<sup><xref ref-type="bibr" rid="B36">36</xref></sup> incorporates many of the concerns described above. It instructs, &#8220;Researchers should consider whether access to scientific data derived from humans, even if de-identified and lacking explicit limitations on subsequent use, should be controlled.&#8221; The U.S. Department of Health and Human Services Secretary&#8217;s Advisory Committee on Human Research Protections (SACHRP) states these concerns even more forcefully, declaring, &#8220;Increasingly, the protections afforded by removing the eighteen identifying data elements cited in HIPAA have become out of date, as technological advances and the combining of data sets increase the risk of re-identification.&#8221;<sup><xref ref-type="bibr" rid="B37">37</xref></sup> The SACHRP further states, &#8220;Genomic data are also particularly susceptible to re-identification.&#8221; The SACHRP makes several recommendations including controlled access for data from human participants, and stronger measures to deter misuse.</p>
</sec>
<sec>
<title>The walled garden approach to data sharing</title>
<p>The approaches to protecting privacy in shared research data can be divided into three broad categories: distributed computing, data-centric, and process-centric. With distributed computing, the data are not actually shared. Instead, each site typically creates identical data structures. Queries and algorithms are then run locally against those structures, and the results are aggregated. The Observational Health Data Sciences and Informatics (OHDSI) network is one of the largest examples of distributed computing.<sup><xref ref-type="bibr" rid="B38">38</xref></sup> While very useful for simple queries, this approach can be limiting for machine learning because of the necessity for every site to support the correct computer environment for the algorithm.</p>
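<p>The distributed pattern described above can be sketched as follows. This is a simplified illustration, not the OHDSI implementation: the site names, record fields, and the functions <monospace>local_count</monospace> and <monospace>federated_count</monospace> are assumptions made for the example.</p>
<preformat>
```python
def local_count(site_records, predicate):
    """Run the query inside one site; only the aggregate count leaves the site."""
    return sum(1 for record in site_records if predicate(record))


def federated_count(sites, predicate):
    """Combine per-site counts without pooling any row-level data."""
    per_site = {name: local_count(records, predicate) for name, records in sites.items()}
    return per_site, sum(per_site.values())


# Hypothetical sites that share an identical record structure.
sites = {
    "site_a": [{"age": 67, "dx": "diabetes"}, {"age": 45, "dx": "asthma"}],
    "site_b": [{"age": 71, "dx": "diabetes"}, {"age": 80, "dx": "diabetes"}],
}


def older_diabetic(record):
    """Example query: patients aged 65 or over with a diabetes diagnosis."""
    return record["dx"] == "diabetes" and record["age"] >= 65


per_site, total = federated_count(sites, older_diabetic)
print(per_site, total)
```
</preformat>
<p>The same query function must be able to run at every site, which is the practical constraint noted above for more demanding algorithms.</p>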
<p>The data-centric approach is to create data that &#8220;can defend itself.&#8221;<sup><xref ref-type="bibr" rid="B39">39</xref></sup> The goal is to modify or &#8220;harden&#8221; the data to the point that they can be shared without restrictions because the risk of reidentification is sufficiently low. Many approaches have been used, including removing certain data types, such as the 18 HIPAA identifiers; censoring cells when the number of subjects is below a certain threshold; and censoring extreme values. As we now know, these approaches reduce, but do not eliminate, reidentification risk.</p>
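<p>Two of the hardening steps named above, small-cell censoring and censoring of extreme values, can be sketched in a few lines. The threshold of 11 and the cap of 90 are illustrative assumptions, not values prescribed by this article, and the function names are hypothetical.</p>
<preformat>
```python
def suppress_small_cells(cell_counts, threshold=11):
    """Replace counts below the threshold with None so small groups are not published."""
    return {cell: (n if n >= threshold else None) for cell, n in cell_counts.items()}


def top_code(values, cap):
    """Censor extreme values by capping each value at a maximum."""
    return [min(value, cap) for value in values]


# Hypothetical aggregate table and age column.
counts = {"dx_A": 240, "dx_B": 3, "dx_C": 11}
ages = [34, 67, 102]

print(suppress_small_cells(counts))
print(top_code(ages, cap=90))
```
</preformat>
<p>Both steps lower the chance that a rare combination points to one person, but neither addresses patterns spread across many ordinary-looking values, which is why such hardening reduces rather than removes risk.</p>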
<p>Recently, there has been considerable interest in synthetic data as a way to truly anonymize data (that is, to make it never re-identifiable) while preserving its inferential scientific value. Synthetic data comes in two forms: fully synthetic and partially synthetic. Fully synthetic data are generated completely <italic>de novo</italic>, without relying on preexisting data, and are intended to look like real data, but they may have little inferential value and consequently little research value. The Medicare Claims Synthetic Public Use Files (SynPUFs) are examples of fully synthetic data.<sup><xref ref-type="bibr" rid="B40">40</xref></sup> Partially synthetic data are derived from real data with the intent of preserving the inferential value while reducing the reidentification risk.<sup><xref ref-type="bibr" rid="B41">41</xref>,<xref ref-type="bibr" rid="B42">42</xref></sup> Although some claim that synthetic data is immune from reidentification risk,<sup><xref ref-type="bibr" rid="B43">43</xref></sup> partially synthetic data is vulnerable to several risks, including membership inference,<sup><xref ref-type="bibr" rid="B44">44</xref></sup> in which an adversary infers that a target individual is in the data set and can thereby infer other facts about that individual. Stadler et al. evaluated five different algorithms for generating synthetic data from the <italic>All of Us</italic> data set and found that synthetic data did not provide a better tradeoff between privacy and utility than traditional deidentification techniques.<sup><xref ref-type="bibr" rid="B45">45</xref></sup> Stadler&#8217;s work suggests that synthetic data will not be a &#8220;magic bullet&#8221; that slays the reidentification monster.</p>
<p>Given this, should researchers simply stop sharing data, even though data sharing can accelerate research and save lives? No, which brings us to process-centric approaches. These approaches assume that the data can be reidentified and, instead, focus on process controls and contractual obligations to ensure that the data recipient will not attempt to reidentify individuals or use the data for purposes other than those intended, and impose penalties as a deterrent. Research institutions are very familiar with process-centric approaches in the form of Data Use Agreements (DUAs) and Material Transfer Agreements (MTAs). Typically, these are bilateral agreements between two organizations. There are 142 medical schools that receive NIH grants; bilateral agreements between each pair of these institutions would require slightly over ten thousand separate agreements and would likely involve massive duplication of research data. In addition, DUAs and MTAs are typically limited to a single project, potentially resulting in hundreds of thousands of separate agreements. To be efficient and to reduce infrastructure duplication, large-scale, multi-institution clinical data sharing for research typically involves consortia of multiple institutions and a single, shared technology infrastructure. We call this the &#8220;walled garden&#8221; approach.</p>
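<p>The agreement count above follows from simple combinatorics: the number of distinct pairs among n institutions is n(n &#8722; 1)/2. The short sketch below checks the figure for 142 schools. The comparison case, one agreement per member with a shared consortium, is an assumption made for illustration rather than a description of any existing consortium.</p>
<preformat>
```python
import math


def bilateral_agreements(n_institutions):
    """Number of distinct institution pairs, that is, n choose 2."""
    return math.comb(n_institutions, 2)


def hub_agreements(n_institutions):
    """Assumed consortium model: one agreement per member with the shared garden."""
    return n_institutions


n = 142
print(bilateral_agreements(n))  # 10011
print(hub_agreements(n))        # 142
```
</preformat>
<p>Because pairwise agreements grow quadratically with the number of institutions while the assumed hub model grows linearly, the gap widens as more sites join, before per-project agreements are even counted.</p>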
<p>Physics and astronomy provide examples of the walled garden approach. The Large Hadron Collider (LHC) project, held up as the archetypal big data sharing example, rapidly sends petabytes of data to collaborators globally.<sup><xref ref-type="bibr" rid="B46">46</xref></sup> However, this is not <italic>open</italic> data sharing. Data sharing occurs only within the consortium. The first release of open data was in 2021, roughly a decade after the data was originally collected.<sup><xref ref-type="bibr" rid="B47">47</xref></sup> The Laser Interferometer Gravitational-Wave Observatory (LIGO)<sup><xref ref-type="bibr" rid="B48">48</xref></sup> has a similar approach: there is extensive sharing within the consortium, with clear rules for data use and consequences for misuse, but little data release outside the walls. With a walled garden approach, data is shared only with known and trusted individuals and institutions that are held accountable if trust is misplaced. In the biomedical domain, the <italic>All of Us</italic> research project<sup><xref ref-type="bibr" rid="B49">49</xref></sup> and the National COVID Cohort Collaborative (N3C)<sup><xref ref-type="bibr" rid="B50">50</xref></sup> are among the best-known examples of walled gardens.</p>
</sec>
<sec>
<title>Walled garden design considerations</title>
<p>For walled gardens to succeed, several considerations must be addressed. First, developing and maintaining the garden will involve significant infrastructure investment. For the LHC, LIGO, <italic>All of Us</italic>, and N3C, the data management infrastructures were massive multi-year, multimillion-dollar endeavors that required large development teams and ongoing multimillion-dollar operations expenditures. Entities such as the European Organization for Nuclear Research (CERN), the NIH, or the National Science Foundation have resources of that scope. Second, someone must build the walls. Access to the garden should be through national centralized identity management. Only a large, national entity, such as the NIH, has identity management systems of sufficient scale. Third, the governance of the garden must be trusted and trustworthy. There should be clearly articulated principles and allowed uses for the data. For example, will for-profit researchers be granted access? What are researchers&#8217; obligations for disseminating results derived from the shared data? Whether the governance is appointed or democratic, public or private, the users of the garden must have confidence that data within the garden will only be used for appropriate purposes. Fourth, rules without punishment are not deterrents. The penalties for data misuse must, as SACHRP recommends, be severe enough to be an effective deterrent. Rule violations should be considered scientific misconduct and addressed accordingly. Scientific misconduct is much more common than we often like to admit.<sup><xref ref-type="bibr" rid="B51">51</xref></sup> In physics or astronomy, getting thrown out of the LHC or LIGO consortia as a result of bad behavior could be career ending. Penalties should also anticipate that not all garden users will be traditional academics, and that purely academic penalties may not be sufficient. Each email that violates the CAN-SPAM Act<sup><xref ref-type="bibr" rid="B52">52</xref></sup> can result in a penalty of $46,517 with no limit to the total fine. Clinical data privacy violations are generally considered more serious violations than spam email, and the penalties should reflect that. Finally, to maximize the scientific and societal benefit, entry to and use of the garden must be affordable. Some current gardens, such as <italic>All of Us</italic>, charge for cloud compute time above a basic allocation. User fees should never make access to the garden so onerous or expensive that only major corporations and elite universities can pass through the metaphorical gate.</p>
</sec>
<sec>
<title>What can data managers do today?</title>
<p>Adequate general purpose walled gardens for potentially identifiable biomedical data are not widely available today. So, what can individuals and institutions do now?</p>
<list list-type="bullet">
<list-item><p>Convene institutional leadership to establish acceptable thresholds for reidentification risk and criteria for evaluating that risk. For example, some institutions are comfortable with the level of protection provided by current synthetic data approaches; others are not.</p></list-item>
<list-item><p>Establish processes for evaluating data sets prior to any sharing outside the institution. For example, who evaluates a data set prior to release? At Northwestern University, central review is now required for any clinical data.</p></list-item>
<list-item><p>Reify these criteria and processes in publicly available policies.</p></list-item>
<list-item><p>Identify data sharing repositories that meet institutional criteria for various risk levels. For example, PhysioNet is a repository for biomedical data that has the ability to enforce DUAs and that requires users to receive human subjects research training prior to data access.<sup><xref ref-type="bibr" rid="B53">53</xref></sup> We have used this at Northwestern for sharing data that, though HIPAA safe-harbor deidentified, presented a reidentification risk that was adjudged too high for open sharing.<sup><xref ref-type="bibr" rid="B54">54</xref></sup></p></list-item>
<list-item><p>Ensure that study consent documents honestly communicate that absolute anonymity cannot be guaranteed. Telling our research participants otherwise would be disingenuous and unethical.</p></list-item>
<list-item><p>Increase awareness of reidentification risk among researchers, data managers, and data analysts, making it everyone&#8217;s responsibility to consider the implications of reidentification risk.</p></list-item>
<list-item><p>Encourage the developers of institutional data sharing software to support DUAs and the validation of external users.</p></list-item>
<list-item><p>Encourage relevant governmental bodies to support the development and operation of appropriate walled gardens to accelerate research through the sharing of clinical data.</p></list-item>
<list-item><p>Follow the reidentification literature and periodically reevaluate institutional criteria and policies.</p></list-item>
</list>
</sec>
<sec>
<title>Conclusion</title>
<p>In conclusion, we would suggest that Scott McNealy&#8217;s privacy nihilist view was half right.<sup><xref ref-type="bibr" rid="B9">9</xref></sup> With everything happening today, we have close to zero privacy. But we do not need to &#8220;get over it&#8221; and give up. We can and should do better. The first step is to stop believing that we can truly anonymize data.</p>
</sec>
</body>
<back>
<sec>
<title>Competing Interests</title>
<p>The authors have no competing interests to declare.</p>
</sec>
<ref-list>
<ref id="B1"><mixed-citation publication-type="journal"><label>1.&#160;</label><string-name><surname>Horvitz</surname> <given-names>E</given-names></string-name>, <string-name><surname>Mulligan</surname> <given-names>D</given-names></string-name>. <article-title>Data, privacy, and the greater good</article-title>. <source>Science</source>. <year>2015</year>; <volume>349</volume>(<issue>6245</issue>): <fpage>253</fpage>&#8211;<lpage>255</lpage>. DOI: <pub-id pub-id-type="doi">10.1126/science.aac4520</pub-id></mixed-citation></ref>
<ref id="B2"><mixed-citation publication-type="journal"><label>2.&#160;</label><string-name><surname>Subbian</surname> <given-names>V</given-names></string-name>, <string-name><surname>Solomonides</surname> <given-names>A</given-names></string-name>, <string-name><surname>Clarkson</surname> <given-names>M</given-names></string-name>, et al. <article-title>Ethics and informatics in the age of COVID-19: challenges and recommendations for public health organization and public policy</article-title>. <source>J Am Med Inform Assoc</source>. <year>2021</year>; <volume>28</volume>(<issue>1</issue>): <fpage>184</fpage>&#8211;<lpage>189</lpage>. DOI: <pub-id pub-id-type="doi">10.1093/jamia/ocaa188</pub-id></mixed-citation></ref>
<ref id="B3"><mixed-citation publication-type="webpage"><label>3.&#160;</label><collab>Centers for Medicare and Medicaid Services</collab>. <article-title>Hospital Inpatient Quality Reporting Program</article-title>. CMS.gov. Published December 1, 2021. Accessed April 4, 2023. <uri>https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HospitalQualityInits/HospitalRHQDAPU</uri></mixed-citation></ref>
<ref id="B4"><mixed-citation publication-type="webpage"><label>4.&#160;</label><string-name><surname>St. Vincent</surname> <given-names>S</given-names></string-name>. <article-title>Data Privacy is a Human Right. Europe is moving toward Recognizing that</article-title>. <source>Foreign Policy in Focus</source>. Published April 19, 2018. Accessed December 1, 2022. <uri>https://fpif.org/data-privacy-is-a-human-right-europe-is-moving-toward-recognizing-that/</uri></mixed-citation></ref>
<ref id="B5"><mixed-citation publication-type="webpage"><label>5.&#160;</label><collab>United Nations</collab>. <source>The Right to Privacy in the Digital Age. Report of the United Nations High Commissioner for Human Rights</source>. <year>2021</year>. Accessed December 1, 2022. <uri>https://documents-dds-ny.un.org/doc/UNDOC/GEN/G21/249/21/PDF/G2124921.pdf?OpenElement</uri></mixed-citation></ref>
<ref id="B6"><mixed-citation publication-type="webpage"><label>6.&#160;</label><collab>Greek Medicine</collab>. <article-title>History of Medicine Division, National Library of Medicine</article-title>. Published February 7, 2012. Accessed December 2, 2022. <uri>https://www.nlm.nih.gov/hmd/greek/greek_oath.html</uri></mixed-citation></ref>
<ref id="B7"><mixed-citation publication-type="journal"><label>7.&#160;</label><string-name><surname>Arellano</surname> <given-names>AM</given-names></string-name>, <string-name><surname>Dai</surname> <given-names>W</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ohno-Machado</surname> <given-names>L</given-names></string-name>. <article-title>Privacy Policy and Technology in Biomedical Data Science</article-title>. <source>Annu Rev Biomed Data Sci</source>. <year>2018</year>; <volume>1</volume>(<issue>1</issue>): <fpage>115</fpage>&#8211;<lpage>129</lpage>. DOI: <pub-id pub-id-type="doi">10.1146/annurev-biodatasci-080917-013416</pub-id></mixed-citation></ref>
<ref id="B8"><mixed-citation publication-type="webpage"><label>8.&#160;</label><article-title>What is Open Data? Open Data Handbook</article-title>. Accessed December 1, 2022. <uri>https://opendatahandbook.org/guide/en/what-is-open-data/</uri></mixed-citation></ref>
<ref id="B9"><mixed-citation publication-type="webpage"><label>9.&#160;</label><string-name><surname>Sprenger</surname> <given-names>P</given-names></string-name>. <article-title>Sun on Privacy: &#8220;Get Over It.&#8221;</article-title> <source>Wired</source>. Published online January 26, 1999. Accessed December 1, 2022. <uri>https://www.wired.com/1999/01/sun-on-privacy-get-over-it/</uri></mixed-citation></ref>
<ref id="B10"><mixed-citation publication-type="webpage"><label>10.&#160;</label><collab>Privacy Laws Around the World</collab>. <article-title>pdpEcho</article-title>. Published December 1, 2022. Accessed December 1, 2022. <uri>https://pdpecho.com/privacy-laws-around-the-world/</uri></mixed-citation></ref>
<ref id="B11"><mixed-citation publication-type="webpage"><label>11.&#160;</label><article-title>The Drivers Privacy Protection Act (DPPA) and the Privacy of Your State Motor Vehicle Record</article-title>. Epic.org. Accessed December 1, 2022. <uri>https://epic.org/dppa/</uri></mixed-citation></ref>
<ref id="B12"><mixed-citation publication-type="webpage"><label>12.&#160;</label><collab>U.S. Department of Education</collab>. <chapter-title>Family Educational Rights and Privacy Act (FERPA)</chapter-title>. <publisher-name>U.S. Department of Education</publisher-name>. Published August 25, 2021. Accessed December 1, 2022. <uri>https://www2.ed.gov/policy/gen/guid/fpco/ferpa/index.html</uri></mixed-citation></ref>
<ref id="B13"><mixed-citation publication-type="webpage"><label>13.&#160;</label><collab>Bureau of Justice Assistance</collab>. <chapter-title>Fair Credit Reporting Act</chapter-title>. <publisher-name>Bureau of Justice Assistance</publisher-name>. Accessed December 1, 2022. <uri>https://bja.ojp.gov/program/it/privacy-civil-liberties/authorities/statutes/2349</uri></mixed-citation></ref>
<ref id="B14"><mixed-citation publication-type="webpage"><label>14.&#160;</label><article-title>Video Privacy Protection Act. Wikipedia</article-title>. Accessed December 1, 2022. <uri>https://en.wikipedia.org/wiki/Video_Privacy_Protection_Act</uri></mixed-citation></ref>
<ref id="B15"><mixed-citation publication-type="webpage"><label>15.&#160;</label><collab>U.S. Department of Health and Human Services</collab>. <article-title>HIPAA Home. U.S. Department of Health and Human Services</article-title>. Accessed December 1, 2022. <uri>https://www.hhs.gov/hipaa/index.html</uri></mixed-citation></ref>
<ref id="B16"><mixed-citation publication-type="webpage"><label>16.&#160;</label><article-title>Coca-Cola Formula. Wikipedia</article-title>. Accessed December 1, 2022. <uri>https://en.wikipedia.org/wiki/Coca-Cola_formula</uri></mixed-citation></ref>
<ref id="B17"><mixed-citation publication-type="webpage"><label>17.&#160;</label><collab>Ethical Tech Research Policy Education</collab>. <article-title>Ethical Tech</article-title>. Accessed December 13, 2022. <uri>https://ethicaltech.duke.edu</uri></mixed-citation></ref>
<ref id="B18"><mixed-citation publication-type="webpage"><label>18.&#160;</label><collab>U.S. Senate Committee on Finance</collab>. <article-title>Data Brokerage and Threats to U.S. Privacy and Security Written Testimony</article-title>. Accessed December 1, 2022. <uri>https://www.finance.senate.gov/imo/media/doc/Written%20Testimony%20-%20Justin%20Sherman.pdf</uri></mixed-citation></ref>
<ref id="B19"><mixed-citation publication-type="book"><label>19.&#160;</label><string-name><surname>Tanner</surname> <given-names>A</given-names></string-name>. <source>Our Bodies, Our Data: How Companies Make Billions Selling Our Medical Records</source>. <publisher-name>Beacon Press</publisher-name>; <year>2017</year>.</mixed-citation></ref>
<ref id="B20"><mixed-citation publication-type="webpage"><label>20.&#160;</label><article-title>Facebook&#8211;Cambridge Analytica data scandal</article-title>. Wikipedia. Accessed December 1, 2022. <uri>https://en.wikipedia.org/wiki/Facebook&#8211;Cambridge_Analytica_data_scandal</uri></mixed-citation></ref>
<ref id="B21"><mixed-citation publication-type="webpage"><label>21.&#160;</label><string-name><surname>Davies</surname> <given-names>R</given-names></string-name>. <article-title>Former Cambridge Analytica chief receives seven-year directorship ban</article-title>. <source>The Guardian</source>. Published September 24, 2020. Accessed December 1, 2022. <uri>https://www.theguardian.com/uk-news/2020/sep/24/cambridge-analytica-directorship-ban-alexander-nix</uri></mixed-citation></ref>
<ref id="B22"><mixed-citation publication-type="webpage"><label>22.&#160;</label><collab>National Institutes of Health</collab>. <chapter-title>How Can Covered Entities Use and Disclose Protected Health Information for Research and Comply with the Privacy Rule?</chapter-title> <publisher-name>HIPAA Privacy Rule</publisher-name>. Accessed December 1, 2022. <uri>https://privacyruleandresearch.nih.gov/pr_08.asp</uri></mixed-citation></ref>
<ref id="B23"><mixed-citation publication-type="webpage"><label>23.&#160;</label><string-name><surname>Hall</surname> <given-names>M</given-names></string-name>. <article-title>The DOJ is creating maps from subpoenaed cell phone data to identify rioters involved with the Capitol insurrection</article-title>. <source>Business Insider</source>. Published March 24, 2021. Accessed December 1, 2022. <uri>https://www.businessinsider.com/doj-is-mapping-cell-phone-location-data-from-capitol-rioters-2021-3</uri></mixed-citation></ref>
<ref id="B24"><mixed-citation publication-type="webpage"><label>24.&#160;</label><string-name><surname>Meyer</surname> <given-names>M</given-names></string-name>. <article-title>Law, Ethics &amp; Science of Re-identification Demonstrations</article-title>. Bill of Health. Accessed December 1, 2022. <uri>https://blog.petrieflom.law.harvard.edu/symposia/law-ethics-science-of-re-identification-demonstrations/</uri></mixed-citation></ref>
<ref id="B25"><mixed-citation publication-type="book"><label>25.&#160;</label><string-name><surname>Narayanan</surname> <given-names>A</given-names></string-name>, <string-name><surname>Shmatikov</surname> <given-names>V</given-names></string-name>. <chapter-title>Robust De-anonymization of Large Sparse Datasets</chapter-title>. In: <source>2008 IEEE Symposium on Security and Privacy (Sp 2008)</source>. <publisher-name>IEEE</publisher-name>; <year>2008</year>; <fpage>111</fpage>&#8211;<lpage>125</lpage>. DOI: <pub-id pub-id-type="doi">10.1109/SP.2008.33</pub-id></mixed-citation></ref>
<ref id="B26"><mixed-citation publication-type="journal"><label>26.&#160;</label><string-name><surname>Malin</surname> <given-names>B</given-names></string-name>, <string-name><surname>Sweeney</surname> <given-names>L</given-names></string-name>. <article-title>How (not) to protect genomic data privacy in a distributed network: using trail re-identification to evaluate and design anonymity protection systems</article-title>. <source>J Biomed Inform</source>. <year>2004</year>; <volume>37</volume>(<issue>3</issue>): <fpage>179</fpage>&#8211;<lpage>192</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jbi.2004.04.005</pub-id></mixed-citation></ref>
<ref id="B27"><mixed-citation publication-type="webpage"><label>27.&#160;</label><string-name><surname>Southwick</surname> <given-names>R</given-names></string-name>. <article-title>Cyberattacks in healthcare surged last year, and 2022 could be even worse</article-title>. <source>Chief Healthcare Executive</source>. Published January 24, 2022. Accessed December 11, 2022. <uri>https://www.chiefhealthcareexecutive.com/view/cyberattacks-in-healthcare-surged-last-year-and-2022-could-be-even-worse</uri></mixed-citation></ref>
<ref id="B28"><mixed-citation publication-type="journal"><label>28.&#160;</label><string-name><surname>Clayton</surname> <given-names>EW</given-names></string-name>, <string-name><surname>Emb&#237;</surname> <given-names>PJ</given-names></string-name>, <string-name><surname>Malin</surname> <given-names>BA</given-names></string-name>. <article-title>Dobbs and the future of health data privacy for patients and healthcare organizations</article-title>. <source>J Am Med Inform Assoc</source>. <year>2023</year>; <volume>30</volume>(<issue>1</issue>): <fpage>155</fpage>&#8211;<lpage>160</lpage>. Erratum in: J Am Med Inform Assoc. 2023; 30(1): 208. DOI: <pub-id pub-id-type="doi">10.1093/jamia/ocac155</pub-id></mixed-citation></ref>
<ref id="B29"><mixed-citation publication-type="journal"><label>29.&#160;</label><string-name><surname>Kaiser</surname> <given-names>J</given-names></string-name>. <article-title>A judge said police can search the DNA of 1 million Americans without their consent. What&#8217;s next?</article-title> <source>Science</source>. Published online November 7, 2019. DOI: <pub-id pub-id-type="doi">10.1126/science.aba1428</pub-id></mixed-citation></ref>
<ref id="B30"><mixed-citation publication-type="webpage"><label>30.&#160;</label><string-name><surname>Grant</surname> <given-names>C</given-names></string-name>. <chapter-title>Police Are Using Newborn Genetic Screening to Search for Suspects, Threatening Privacy and Public Health</chapter-title>. <publisher-name>ACLU News and Commentary</publisher-name>. Published July 26, 2020. Accessed December 1, 2022. <uri>https://www.aclu.org/news/privacy-technology/police-are-using-newborn-genetic-screening</uri></mixed-citation></ref>
<ref id="B31"><mixed-citation publication-type="webpage"><label>31.&#160;</label><collab>Supreme Court of the United States</collab>. <source>Dobbs v. Jackson Women&#8217;s Health</source>. (Supreme Court of the United States 2022). Accessed December 1, 2022. <uri>https://www.supremecourt.gov/opinions/21pdf/19-1392_6j37.pdf</uri></mixed-citation></ref>
<ref id="B32"><mixed-citation publication-type="webpage"><label>32.&#160;</label><string-name><surname>Sasani</surname> <given-names>A</given-names></string-name>, <string-name><surname>Stolberg</surname> <given-names>SG</given-names></string-name>. <article-title>Indiana Attorney General Asks Medical Board to Discipline Abortion Doctor</article-title>. <source>New York Times</source>. Published November 30, 2022. Accessed December 1, 2022. <uri>https://www.nytimes.com/2022/11/30/us/indiana-attorney-general-abortion-doctor.html</uri></mixed-citation></ref>
<ref id="B33"><mixed-citation publication-type="webpage"><label>33.&#160;</label><string-name><surname>Frieden</surname> <given-names>T</given-names></string-name>. <chapter-title>U.S. drops fight to get abortion records</chapter-title>. <publisher-name>CNN.com Law Center</publisher-name>. Published June 1, 2004. Accessed December 1, 2022. <uri>https://www.cnn.com/2004/LAW/04/27/abortion.records/</uri></mixed-citation></ref>
<ref id="B34"><mixed-citation publication-type="webpage"><label>34.&#160;</label><string-name><surname>Cyphers</surname> <given-names>B</given-names></string-name>. <chapter-title>How the Federal Government Buys Our Cell Phone Location Data</chapter-title>. <publisher-name>Electronic Frontier Foundation</publisher-name>. Published June 13, 2022. Accessed December 1, 2022. <uri>https://www.eff.org/deeplinks/2022/06/how-federal-government-buys-our-cell-phone-location-data</uri></mixed-citation></ref>
<ref id="B35"><mixed-citation publication-type="webpage"><label>35.&#160;</label><collab>National Library of Medicine</collab>. <chapter-title>SNOMED CT Browsers</chapter-title>. <publisher-name>National Library of Medicine</publisher-name>. Published December 5, 2022. Accessed December 5, 2022. <uri>https://www.nlm.nih.gov/research/umls/Snomed/snomed_browsers.html</uri></mixed-citation></ref>
<ref id="B36"><mixed-citation publication-type="webpage"><label>36.&#160;</label><collab>National Institutes of Health</collab>. <article-title>Final NIH Policy for Data Management and Sharing</article-title>. Published October 29, 2020. Accessed December 1, 2022. <uri>https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html</uri></mixed-citation></ref>
<ref id="B37"><mixed-citation publication-type="webpage"><label>37.&#160;</label><collab>U.S. Department of Health and Human Services</collab>. <chapter-title>Attachment A &#8211; NIH Data Sharing Policy</chapter-title>. <publisher-name>Office for Human Research Protections</publisher-name>. Published September 17, 2020. Accessed December 5, 2022. <uri>https://www.hhs.gov/ohrp/sachrp-committee/recommendations/august-12-2020-attachment-a-nih-data-sharing-policy/index.html</uri></mixed-citation></ref>
<ref id="B38"><mixed-citation publication-type="journal"><label>38.&#160;</label><string-name><surname>Hripcsak</surname> <given-names>G</given-names></string-name>, <string-name><surname>Duke</surname> <given-names>JD</given-names></string-name>, <string-name><surname>Shah</surname> <given-names>NH</given-names></string-name>, et al. <article-title>Observational Health Data Sciences and Informatics (OHDSI): Opportunities for Observational Researchers</article-title>. <source>Stud Health Technol Inform</source>. <year>2015</year>; <volume>216</volume>: <fpage>574</fpage>&#8211;<lpage>578</lpage>. PMID: 26262116; PMCID: PMC4815923.</mixed-citation></ref>
<ref id="B39"><mixed-citation publication-type="webpage"><label>39.&#160;</label><collab>Medicare Claims Synthetic Public Use Files (SynPUFs)</collab>. CMS.gov. Published December 1, 2021. Accessed April 4, 2023. <uri>https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/SynPUFs</uri></mixed-citation></ref>
<ref id="B40"><mixed-citation publication-type="webpage"><label>40.&#160;</label><collab>Medicare Claims Synthetic Public Use Files (SynPUFs)</collab>. CMS.gov. Published December 1, 2021. Accessed April 4, 2023. <uri>https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/SynPUFs</uri></mixed-citation></ref>
<ref id="B41"><mixed-citation publication-type="journal"><label>41.&#160;</label><string-name><surname>El Emam</surname> <given-names>K</given-names></string-name>, <string-name><surname>Mosquera</surname> <given-names>L</given-names></string-name>, <string-name><surname>Fang</surname> <given-names>X</given-names></string-name>. <article-title>Validating a membership disclosure metric for synthetic health data</article-title>. <source>JAMIA Open</source>. <year>2022</year>; <volume>5</volume>(<issue>4</issue>): <elocation-id>ooac083</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1093/jamiaopen/ooac083</pub-id></mixed-citation></ref>
<ref id="B42"><mixed-citation publication-type="journal"><label>42.&#160;</label><string-name><surname>Kuo</surname> <given-names>NIH</given-names></string-name>, <string-name><surname>Polizzotto</surname> <given-names>MN</given-names></string-name>, <string-name><surname>Finfer</surname> <given-names>S</given-names></string-name>, et al. <article-title>The Health Gym: synthetic health-related datasets for the development of reinforcement learning algorithms</article-title>. <source>Sci Data</source>. <year>2022</year>; <volume>9</volume>(<issue>1</issue>): <fpage>693</fpage>. DOI: <pub-id pub-id-type="doi">10.1038/s41597-022-01784-7</pub-id></mixed-citation></ref>
<ref id="B43"><mixed-citation publication-type="webpage"><label>43.&#160;</label><string-name><surname>Platzer</surname> <given-names>M</given-names></string-name>. <article-title>AI-based Re-Identification Attacks &#8211; and how to Protect Against Them</article-title>. Mostly.ai. Published April 22, 2022. Accessed April 3, 2023. <uri>https://mostly.ai/blog/synthetic-data-protects-from-ai-based-re-identification-attacks/</uri></mixed-citation></ref>
<ref id="B44"><mixed-citation publication-type="journal"><label>44.&#160;</label><string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Yan</surname> <given-names>C</given-names></string-name>, <string-name><surname>Malin</surname> <given-names>BA</given-names></string-name>. <article-title>Membership inference attacks against synthetic health data</article-title>. <source>J Biomed Inform</source>. <year>2022</year>; <volume>125</volume>: <elocation-id>103977</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1016/j.jbi.2021.103977</pub-id></mixed-citation></ref>
<ref id="B45"><mixed-citation publication-type="webpage"><label>45.&#160;</label><string-name><surname>Stadler</surname> <given-names>T</given-names></string-name>, <string-name><surname>Oprisanu</surname> <given-names>B</given-names></string-name>, <string-name><surname>Troncoso</surname> <given-names>C</given-names></string-name>. <article-title>Synthetic Data &#8212; Anonymisation Groundhog Day</article-title>. <year>2022</year>;(arXiv:2011.07018). Accessed April 3, 2023. <uri>http://arxiv.org/abs/2011.07018</uri></mixed-citation></ref>
<ref id="B46"><mixed-citation publication-type="webpage"><label>46.&#160;</label><collab>CERN, the European Organization for Nuclear Research</collab>. <article-title>The Network Challenge. CERN</article-title>. Accessed June 27, 2023. <uri>https://home.cern/science/computing/network</uri></mixed-citation></ref>
<ref id="B47"><mixed-citation publication-type="webpage"><label>47.&#160;</label><article-title>CMS releases heavy-ion data from 2010 and 2011. opendata CERN</article-title>. Published December 21, 2021. Accessed December 1, 2022. <uri>https://opendata.cern.ch/docs/cms-releases-heavy-ion-data</uri></mixed-citation></ref>
<ref id="B48"><mixed-citation publication-type="journal"><label>48.&#160;</label><string-name><surname>Abramovici</surname> <given-names>A</given-names></string-name>, <string-name><surname>Althouse</surname> <given-names>WE</given-names></string-name>, <string-name><surname>Drever</surname> <given-names>RWP</given-names></string-name>, et al. <article-title>LIGO: The Laser Interferometer Gravitational-Wave Observatory</article-title>. <source>Science</source>. <year>1992</year>; <volume>256</volume>(<issue>5055</issue>): <fpage>325</fpage>&#8211;<lpage>333</lpage>. DOI: <pub-id pub-id-type="doi">10.1126/science.256.5055.325</pub-id></mixed-citation></ref>
<ref id="B49"><mixed-citation publication-type="journal"><label>49.&#160;</label><collab>The All of Us Research Program Investigators</collab>. <article-title>The &#8220;All of Us&#8221; Research Program</article-title>. <source>N Engl J Med</source>. <year>2019</year>; <volume>381</volume>(<issue>7</issue>): <fpage>668</fpage>&#8211;<lpage>676</lpage>. DOI: <pub-id pub-id-type="doi">10.1056/NEJMsr1809937</pub-id></mixed-citation></ref>
<ref id="B50"><mixed-citation publication-type="journal"><label>50.&#160;</label><string-name><surname>Haendel</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Chute</surname> <given-names>CG</given-names></string-name>, <string-name><surname>Bennett</surname> <given-names>TD</given-names></string-name>, et al. <article-title>The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment</article-title>. <source>J Am Med Inform Assoc</source>. <year>2021</year>; <volume>28</volume>(<issue>3</issue>): <fpage>427</fpage>&#8211;<lpage>443</lpage>. DOI: <pub-id pub-id-type="doi">10.1093/jamia/ocaa196</pub-id></mixed-citation></ref>
<ref id="B51"><mixed-citation publication-type="journal"><label>51.&#160;</label><string-name><surname>Fanelli</surname> <given-names>D</given-names></string-name>. <article-title>How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data</article-title>. <string-name><surname>Tregenza</surname> <given-names>T</given-names></string-name> (ed.), <source>PLoS ONE</source>. <year>2009</year>; <volume>4</volume>(<issue>5</issue>): <elocation-id>e5738</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1371/journal.pone.0005738</pub-id></mixed-citation></ref>
<ref id="B52"><mixed-citation publication-type="webpage"><label>52.&#160;</label><collab>Federal Trade Commission</collab>. <chapter-title>CAN-SPAM Act: A Compliance Guide for Business</chapter-title>. <publisher-name>Federal Trade Commission</publisher-name>. Published January 1, 2022. Accessed December 2, 2022. <uri>https://www.ftc.gov/business-guidance/resources/can-spam-act-compliance-guide-business</uri></mixed-citation></ref>
<ref id="B53"><mixed-citation publication-type="journal"><label>53.&#160;</label><string-name><surname>Moody</surname> <given-names>GB</given-names></string-name>, <string-name><surname>Mark</surname> <given-names>RG</given-names></string-name>, <string-name><surname>Goldberger</surname> <given-names>AL</given-names></string-name>. <article-title>PhysioNet: a research resource for studies of complex physiologic and biomedical signals</article-title>. <source>Comput Cardiol</source>. <year>2000</year>; <volume>27</volume>: <fpage>179</fpage>&#8211;<lpage>182</lpage>. PMID: 14632011</mixed-citation></ref>
<ref id="B54"><mixed-citation publication-type="journal"><label>54.&#160;</label><string-name><surname>Markov</surname> <given-names>N</given-names></string-name>, <string-name><surname>Gao</surname> <given-names>CA</given-names></string-name>, <string-name><surname>Stoeger</surname> <given-names>T</given-names></string-name>, et al. <article-title>SCRIPT CarpeDiem Dataset: demographics, outcomes, and per-day clinical parameters for critically ill patients with suspected pneumonia</article-title>. PhysioNet. DOI: <pub-id pub-id-type="doi">10.13026/5PHR-4R89</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>