Note: This essay is the full draft of my graduate capstone paper submitted on April 17, 2022 to the Master of Science in Information Security and Privacy program at UT Austin.
It’s a vastly abridged version of a longer essay on identity and privacy in the metaverse.
Unless you’re very, very pressed for time, I recommend the longer one, not only because it is more complete, but because I cleaned up the tone to fit my Mirror audience instead of academia.
The advent of identity federations managed by large technology companies generated a lucrative opportunity to aggregate and monetize user data for secondary use, a business model commonly referred to as surveillance capitalism. This produced a multitude of societal harms, such as behavior manipulation, concealed influence, preference falsification, political polarization, and real-world violence. The situation is exacerbated by weak privacy protections for user data: frameworks that are at once overly prescriptive and insufficiently adaptive to encompass the shifting privacy boundaries of users in their digital contexts.
The emergence of distributed ledgers known as blockchains – where data is public by default and design – greatly increases the attack surface for digital privacy infringement. Privacy protection, meanwhile, remains largely unaddressed because both privacy governance and identity management fall outside the custody of the users whose experience of privacy is most affected.
Protecting consumers against manipulation and surveillance capitalism, particularly in emergent environments such as the metaverse, requires rearchitecting privacy and identity management to adaptively safeguard user agency and autonomy in context.
This paper examines how conceptions of privacy have evolved over time and offers recommendations that include self-sovereign identity management as well as steps to ensure that users’ privacy experiences and consent preferences provide confidentiality, interoperability, and agency. The paper concludes with a call to action to the engineering and technical community involved in building blockchain technologies to urgently address the primitives that lead to surveillance capitalism – absence of user-centric identity management and overexposure of user telemetry – in order to avoid recreating surveillance capitalism on the blockchain and in emerging digital environments.
Policy and governance have historically lagged the privacy encroachments of advertising-supported technology platforms such as Facebook (now Meta), Google, and Twitter. These companies compete for attention and engagement to generate revenue, devising intentionally addictive user experiences and adversarial product interfaces designed to maximize user information capture. This data exhaust, or the trail of personal telemetry left behind by Internet users with little insight into or agency over their digital footprints, is then sold to advertisers, malicious actors, and anyone willing to pay for the datasets, giving rise to the pernicious business of surveillance capitalism.
Data exhaust is valuable to marketers and information operatives because it powers behavior models used to predict and manipulate outcomes. These outcomes can be as innocuous as influencing a purchase, or something much more alarming, such as polarizing an electorate and swaying a vote.
There is growing acknowledgement among lawmakers, policy experts, technologists, and consumers alike that platforms that feed on user telemetry are correlated with and often causally linked to political polarization, online toxicity, isolation and depression, and distrust in democratic institutions (Harris, 2019). The balkanization of public consensus resulting from behavior manipulation and targeted narrative warfare spills over into real-world violence (Smith, 2021) and has, according to former National Security Advisor Lieutenant General H.R. McMaster, become a serious national security threat (Harris & Raskin, 2022).
Regulations have failed to stem the tide of data exhaust that powers surveillance capitalism. This is in part due to the pacing problem: innovation moves faster than its effects can be felt and described with sufficient precision to devise mitigation strategies – and by that time, the harm is already done (Thierer, 2020). Even then, regulatory remedies only address symptoms, not underlying causes, so harms perpetuate in forms not captured in the new law. Ambiguity about what exactly privacy is and how much privacy citizens should enjoy further complicates the task of devising solutions for a problem that resists definition (Cohen, 2012).
Although pacing problems and definitional ambiguities do not serve society well, they are not the crux of the issue. The root cause of surveillance capitalism is a failure to build the Internet with a persistent, portable, and composable identity layer that allows users to self-custody their privacy decisions and self-govern how they connect, to what services, and under which conditions. The best available proxy for digital identity became the email address – and later, the social media account – which associated users to a data store of attributes and interactions owned and operated by technology companies – such as Google and Meta – provisioning those accounts to users.
Since this model of identity management separated users from custody over their data, privacy governance has focused on how companies should handle their users’ data, and not what users should be able to do with their own data. But privacy is a continuum of people’s ever-shifting boundaries and preferences that ebb and flow according to situation, environment, time, and mood. These characteristics defy the prescriptive, uniform definitions promulgated by centrally managed regulatory frameworks and legal remedies. Contexts shift while frameworks remain static, leaving user privacy boundaries vulnerable to exploitation.
The Cambridge Analytica scandal laid bare the need for better protections against surveillance capitalism – not only for the information security of the sensitive personally identifiable information (PII) that platforms manage, but for the autonomous agency and cognition of the person described by that PII. To reimagine privacy governance from first principles, we must rearchitect how we issue, manage, and govern digital identity – and place users at the center.
With the rise of public blockchains, this challenge is now even more pressing. Smart contract signatures and financial transactions are immutably stored on public ledgers for anyone, including advertisers, data brokers, and malicious actors, to track, analyze, and model, exacerbating the data exhaust problem that powers surveillance capitalism on platform businesses.
Internet users require robust protections across the full landscape of digital experiences, a daunting and infinitely complex undertaking at which centrally managed governance has demonstrably failed. Unless we give users full custody of their identities – and therefore their data, privacy preferences, and access controls – we will not only replicate but multiply the same problems that gave us Cambridge Analytica in the nascent blockchain, cryptocurrency, and metaverse technologies collectively referred to as Web3.
Until philosopher Helen Nissenbaum described privacy as a contextual spectrum determined by situated informational norms (2010), most experts and lawmakers had sought to articulate a precise universal definition – and to locate its value outside the individual’s subjective experience. A monolithic description, or at least a quantifiable evaluation of the purpose it serves in society, was deemed necessary for devising governance. But while laws require precise signifiers and calculable consequences, people’s privacy boundaries are variable and idiosyncratic, so any singular definition will necessarily remain simplistic, vague, and incomplete.
Samuel Warren and Louis Brandeis offered perhaps the first contribution to legal scholarship on privacy. Their Harvard Law Review polemic against the intrusions of the media, which had reported on a private wedding ceremony without permission (reporting made possible, ironically, by the “disruptive” technology of the day: the invention of photo cameras), influenced over a century of privacy jurisprudence (1890). They argued that just as “a man’s house is his castle,” the private facts, thoughts, and emotions that constitute “inviolate personality” should likewise be shielded from public view. Their impassioned plea foreshadowed modern-day concerns about the impact of disruptive technologies on private life. It also laid the groundwork for the subsequent establishment of privacy torts that enumerate four specific harms: intrusion upon seclusion, public disclosure of private facts, false light, and appropriation of name or likeness (Prosser, 1960).
Attempts to define the legal boundaries of inherently subjective experience have largely proven too rigid and limiting to be practicable; this is especially true for digital experiences. Scholars have sought to broaden the scope beyond the narrow definitions of intrusive injuries and torts by connecting the presence of privacy to positive ideals of self-expression, actualization, and “free moral and cultural play” (Cohen, 2012), autonomous agency and sound decision-making (Burkert, 1998), and the conditions necessary for the developmental formation of personality (Rosen & Santesso, 2011). Ethicist Shannon Vallor, for example, argues that “digital media mechanisms [that] undermine our self-control, cognitive autonomy, and moral agency” make it “harder, not easier, for us to choose well” (2016). This freedom to soundly evaluate options and make agentic, self-sovereign choices is, according to political scientist Priscilla Regan, a public good critical “to the flourishing of liberal societies” (Nissenbaum, 2010).
Scholars have also defined privacy by its inverse: the absence of privacy causes material societal harms. Daniel Solove writes that people need privacy not to conceal illicit activity, but to avert social disciplining effects, without which their choices fall subject to decisional interference (2007). Because they generally fear the disapproval of their peers, individuals in surveilled spaces self-censor, falsify their preferences, and communicate ideas that differ from their true perspectives, generating a distorted view of reality. The accumulated misrepresentations of people’s real thoughts and sentiments achieve increasingly genuine social acceptance and normalization over time, leading to what philosopher Jeffrey Reiman calls psychopolitical metamorphosis (Nissenbaum, 2010).
By placing our survival needs for social approbation above our developmental needs for individual agency, surveilled spaces invite confirmation bias and groupthink. Individuals become more susceptible to social pressure and propaganda, especially if shared by those they wish to emulate or impress. Thus, even if people have nothing whatsoever to hide, surveillance materially alters perceived social norms and expressed behaviors.
While there is widespread agreement that privacy shares a causal link with autonomy and self-determination, its subjective and context-dependent nature does not lend itself to quantification or uniform governance. Yet it is precisely this nebulous space of shifting norms and expectations that privacy management must somehow locate and defend without collapsing the context and nuance necessary for rigorous, agentic governance.
The individual in question has the most granular just-in-time context necessary to formulate an appropriate defensive response, so, logically, privacy governance should reside within the individual’s purview. But absent the requisite identity layer to make such self-custody possible, digital identity shifted to the next available proxy: email addresses and social logins. Management of those digital identities likewise shifted to the custodians of those proxies: the technology companies issuing email and social login credentials. This awkward workaround for user-centric identity necessitated prescriptive and deterministic regulatory governance that, predictably, has failed to maintain the contextual integrity of users’ privacy boundaries, while leaving companies to amass vast stores of user telemetry and PII for exploitation by hackers, third parties, and the platforms themselves.
The National Institute of Standards and Technology (NIST) Privacy Framework, for example, set out to “build customer trust” by “future-proofing products and services” (2020). It proposed to do this essentially by predicting the future: defining information inventory strategies and data processing policies, subdividing data types and users into categories, and prescribing response and communication protocols across the entire surface area of consumer-facing technology. That this framework was built to help companies comply with requirements rather than to help users preserve their cognitive autonomy is the first clue to why its relevance and applicability are limited.
More importantly, it is not difficult to imagine how attempts to “future-proof” technology by anticipating every emergent data type and interaction flow a user might face in some hypothetical future product create system design contradictions for software architects and unintended complications for users. The practice of writing long and inscrutable Terms of Service (TOS) and End-User License Agreements (EULA) is the direct result of the tension between compliance requirements and irreconcilable design contradictions. Since companies cannot anticipate every possible context that a user might face any more than a governance framework can, they settled on a workaround. By notifying users and obtaining their consent to relinquish data and decision rights in exchange for service, companies got off the hook for safeguarding contextual integrity while passing the burden of privacy management to users without giving them the tools to manage their boundaries.
Of course, companies have no expectation that users will actually read, understand, or make rational and agentic choices about EULAs or TOS due to impracticable time and expertise barriers (Kröger et al, 2021). This purely performative practice has nonetheless become a widespread convenient fiction that neutralizes public concerns and allows companies to satisfy compliance without meaningfully protecting privacy. As a result, companies have been incentivized to treat privacy as a legal condition to be met while ignoring the intent of privacy management: to preserve the moral autonomy, cognitive consent, and contextual integrity of individuals whose unprotected digital telemetry leaves them vulnerable to surveillance and concealed influence.
Centralized, one-size-fits-all, exogenous technical frameworks separate users from agency over their digital identities and data. They promulgate a deterministic conception of privacy that holds constant across all contexts, settings, environments, and technological futures.
Realistic privacy governance would offer individuals latitude to modulate how much privacy they desire and when, and to respond in context to their shifting privacy boundaries. More privacy is not always preferable, nor is it always possible. Users might, for example, wish to simplify the labyrinthine process of collecting and transferring patient histories by lowering their privacy settings to expediently share records between medical offices. On the other hand, users would almost certainly want more privacy for their Amazon purchasing history or before placing a large bid on a non-fungible token (NFT).
Since the only party with sufficiently rich information to make an agentic, context-informed choice is the individual in question, responsive privacy governance must center end users, not companies.
Nearly every issue in privacy governance stems from the same origin: the absence of self-owned, user-centric identity and access controls. We evolved effective identifiers for websites and endpoints, but not for the people using them.
For users to interact with websites, companies began issuing local accounts with usernames and passwords. This siloed approach to digital identity meant that users had to create unique accounts for every site, leading to a poor user experience and creating massive breach liabilities for companies whose only interest was to grant users access, not to manage their accounts and PII. The prospect of getting out of the business of storing sensitive data and having to manage expensive cybersecurity regimes to fend off hackers became attractive to companies, who were happy to outsource the entire thing to bigger players with more ambitious plans for PII.
This gave rise to federated identity, an opportunity to both streamline fragmented user experiences across identity silos and monetize vast quantities of user telemetry for secondary use. Providers such as Facebook, Google, and Amazon entered the identity space to become the “trusted” middlemen of digital identity credentials, offering users a way to log in with their pre-existing accounts, while shifting the responsibility for information security from individual businesses to federations equipped with the vast resources of technology platforms.
Importantly, this centralization of identity into federations abstracted interaction and identity decisions away from users and their context. Users ended up with “tens or hundreds of fragments of themselves scattered across different organizations, with no ability to control, update or secure them effectively” (Tobin & Reed, 2016), perfecting the conditions for surveillance capitalism.
If the goal is to return agency, moral autonomy, and cognitive consent to digital users, then users need the tools to modulate their own access controls and preferences across infinite contexts. An exhaustively descriptive and infinitely flexible expression of all possible choices across all systems and futures is impossible only if one considers building it in a centralized way, where governance takes place exogenously, outside the locus of the user, through universal practices and compliance requirements imposed from outside.
But people want different things. Exogenous, centralized privacy management cannot possibly give people the precise level of privacy they want because that is tantamount to predicting the future. The only parties capable of formulating just-in-time responses to infinite environments, attributes, transmission principles, and contexts – and modulating how much signal they emit in response – are end users themselves.
Self-sovereign identity (SSI) is an approach to identity management that empowers users to self-custody their own identities, data, and privacy decisions. SSI can eliminate centralized middlemen and the overexposure of PII to federations by binding encrypted identity attributes to a user-controlled key pair, so that only the user or a designated third party can access them. The flow of information between parties happens only with the cryptographic consent of the identity owner whose credentials are requested. In its ideal state, SSI allows users to “log in” to any product, service, game, metaverse, or protocol – irrespective of the user’s chosen SSI tool or wallet – and to transact while minimizing data exchange.
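To make this flow concrete, the sketch below shows an issuer attesting to attributes held in the user’s own wallet and the holder releasing only the single credential a verifier requests, and only with explicit consent. All names are illustrative, and HMAC stands in for the public-key signatures a real SSI wallet and issuer would use; this is a teaching sketch, not a credential format.

```python
# A sketch of holder-controlled credential release, with illustrative names.
# HMAC stands in for real public-key signatures; nothing here is production SSI.
import hashlib
import hmac
import json

ISSUER_KEY = b"university-signing-key"  # stand-in for the issuer's private key

def attest(issuer_key: bytes, claim: dict) -> dict:
    """Issuer binds a claim to the holder's identifier and signs it (stand-in)."""
    payload = json.dumps(claim, sort_keys=True).encode()
    signature = hmac.new(issuer_key, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "attestation": signature}

# The holder's wallet stores several issuer-attested credentials locally;
# no federation or platform holds a copy.
wallet = {
    "over_18": attest(ISSUER_KEY, {"subject": "did:example:alice", "over_18": True}),
    "degree":  attest(ISSUER_KEY, {"subject": "did:example:alice", "degree": "BSc"}),
    "address": attest(ISSUER_KEY, {"subject": "did:example:alice", "address": "123 Main St"}),
}

def present(wallet: dict, requested: str, holder_consents: bool):
    """Release exactly one requested credential, and only with the holder's consent."""
    if not holder_consents or requested not in wallet:
        return None
    return wallet[requested]  # nothing else in the wallet is exposed

# A game asks only whether the player is an adult; degree and address stay private.
print(present(wallet, "over_18", holder_consents=True))
```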
Significant obstacles stand in the way of ecosystem-wide adoption of SSI. For an identity to be useful, trusted identity providers must agree to issue their credentials to the user’s identity or namespace, and verifying parties must be satisfied that the levels of assurance followed by issuers satisfy security criteria. But for identity providers to undertake the effort to develop credentials, the credentials themselves must first be accepted by enough verifying parties, or their use case becomes too narrow to pursue. In turn, for credentials to see wide adoption, enough issuers must first agree to develop and issue them.
This cold-start problem in SSI requires urgent resolution, because absent user-centric identity management, the problems of federated identity will compound in Web3. Indeed, Web3 has already encountered familiar privacy intrusion problems. Decentralized autonomous organizations (DAOs), for example, struggle with preference falsification in governance proposals. Since votes are stored on-chain and visible to others, DAO members self-discipline their preferences to abide by the group’s prevailing norms. In decentralized finance (DeFi), bad actors can manipulate outcomes in what is called a sandwich attack. Since every pending trade is publicly visible before it confirms, attackers monitor the queue of unconfirmed transactions (the mempool) for large orders. They then issue two orders of their own: one just before the victim’s transaction, which drives up the price the victim pays, and one just after, which sells into that inflated price for a profit. Finally, in online gaming, users find themselves in increasingly immersive environments where their behaviors – from how long they stare at an object to which users they interact with – become inputs for predictive modeling. The surface area for manipulation in the metaverse increases by orders of magnitude from what users face today on social networks because augmented and virtual reality interfaces produce richer, more descriptive telemetry, including biometric markers.
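Returning to the sandwich attack, the toy model below walks through the arithmetic against a constant-product market maker. The pool, reserves, and trade sizes are invented, and fees, slippage limits, and gas costs are ignored; it is meant only to show why front-running and back-running a publicly visible trade is profitable.

```python
# Toy constant-product AMM (x * y = k), with no fees, illustrating the
# sandwich attack described above. All numbers are illustrative.

class Pool:
    def __init__(self, token_reserve: float, usd_reserve: float):
        self.x = token_reserve   # reserve of some token
        self.y = usd_reserve     # reserve of a stablecoin

    def buy_token(self, usd_in: float) -> float:
        """Swap usd_in stablecoins for tokens; returns tokens received."""
        k = self.x * self.y
        new_y = self.y + usd_in
        new_x = k / new_y
        tokens_out = self.x - new_x
        self.x, self.y = new_x, new_y
        return tokens_out

    def sell_token(self, tokens_in: float) -> float:
        """Swap tokens for stablecoins; returns stablecoins received."""
        k = self.x * self.y
        new_x = self.x + tokens_in
        new_y = k / new_x
        usd_out = self.y - new_y
        self.x, self.y = new_x, new_y
        return usd_out

pool = Pool(token_reserve=1_000_000, usd_reserve=1_000_000)

# 1. The attacker sees the victim's pending 100,000 buy and front-runs it with 50,000.
attacker_tokens = pool.buy_token(50_000)
# 2. The victim's trade now executes at a worse price than originally quoted.
victim_tokens = pool.buy_token(100_000)
# 3. The attacker back-runs, selling into the price the victim's trade pushed up.
attacker_usd = pool.sell_token(attacker_tokens)

print(f"victim received  {victim_tokens:,.0f} tokens for 100,000")
print(f"attacker profit  {attacker_usd - 50_000:,.0f} with no market risk")
```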
Because Web3 alters the fundamentals of how people transact value and construct meaning, many observers ascribe to it all manner of unrealistic hopes, including the naive idea that users will own their data simply by exiting extractive platforms that monetize attention. In fact, Web3 does nothing to end surveillance capitalism because the primitives that drive surveillance capitalism – the missing identity layer and unchecked data exhaust – are left unaddressed. Even if we do away with centralized platforms altogether and shift all business to decentralized protocols, the only thing that changes is where user telemetry gets stored: in the cloud or on the public ledger. Indeed, the privacy implications of Web3 are worse, not better, because even if platforms no longer own user data, all of a user’s transactions become a matter of public record, exposing them to targeted monitoring and surveillance by anyone who wishes to perform a rudimentary Etherscan or Chainalysis search.
SSI provides an essential component for architecting a coherent privacy framework that places the locus of control within the user’s purview. A working SSI ecosystem would furthermore have three key attributes: confidentiality, interoperability, and agency.
To end surveillance capitalism, technology must stem the flow of unprotected, widely available data exhaust. While SSI allows users to reduce their aggregate digital footprint, it provides zero confidentiality for signals emitted once a smart contract is signed or a transaction appended to a blockchain. Those signals leave a traceable, publicly viewable trail of remittances, holdings, purchases, votes, and interactions, enabling precise reconstruction of identity to model behavior and manipulate outcomes.
To mitigate the accumulation of so much sensitive telemetry on public ledgers, a privacy-preserving decentralized identity ecosystem would need to provide confidentiality by obscuring the details of smart contract signatures and blockchain transactions, or by altogether breaking the link between interconnected public keys. There is a need for research, development, and investment into encryption techniques such as zero-knowledge proof cryptography, which allows a party to prove that a statement is true without revealing the underlying information. By taking advantage of selective disclosure and least privilege access, SSI can help users reduce their data exhaust, making surveillance more time-intensive and less lucrative to pursue.
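As a rough illustration of the commit, challenge, and response structure behind such proofs, the sketch below implements a Schnorr-style proof of knowledge of a discrete logarithm, made non-interactive with the Fiat–Shamir heuristic. The modulus, base, and nonce sizes are illustrative choices of mine; production systems rely on vetted libraries and far richer proof systems (such as zk-SNARKs), not hand-rolled code like this.

```python
# Minimal sketch of a zero-knowledge proof of knowledge (Schnorr-style,
# non-interactive via Fiat-Shamir). Parameters are illustrative only.
import hashlib
import secrets

p = 2**255 - 19   # illustrative prime modulus
g = 2             # illustrative base

def challenge(*values) -> int:
    """Fiat-Shamir: derive the verifier's challenge by hashing the transcript."""
    digest = hashlib.sha256("|".join(str(v) for v in values).encode()).digest()
    return int.from_bytes(digest, "big")

def prove(x: int):
    """Prove knowledge of x such that y = g^x mod p, without revealing x."""
    y = pow(g, x, p)                    # the public statement
    r = secrets.randbelow(2**512)       # nonce far larger than c*x, to mask x
    t = pow(g, r, p)                    # commitment
    c = challenge(g, y, t)
    s = r + c * x                       # response: g^s = t * y^c (mod p)
    return y, (t, s)

def verify(y: int, proof) -> bool:
    t, s = proof
    c = challenge(g, y, t)
    return pow(g, s, p) == (t * pow(y, c, p)) % p

x = secrets.randbelow(2**128)           # the prover's secret (e.g. a private attribute)
y, proof = prove(x)
print(verify(y, proof))                 # True: the verifier is convinced without seeing x
```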
Interoperable acceptance is a precondition for ecosystem-wide adoption of SSI, which requires that all companies, applications, protocols, and platforms agree to use a common set of data rails and not lock users into their own proprietary ways of handling data. A familiar analogy for interoperability is email: all providers use the same Simple Mail Transfer Protocol (SMTP), without which users could not communicate across servers. To that end, the World Wide Web Consortium, an international standards body led by web inventor Tim Berners-Lee, has released a set of common standards for decentralized identifiers (DIDs) and verifiable credentials (VCs) that the entire ecosystem can use to interoperate (World Wide Web Consortium, 2021).
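To give a sense of what these standards describe, the sketch below shows the rough shape of a DID and a verifiable credential under the W3C data model, written as Python literals. The field names follow the published specifications; the identifiers, dates, and signature value are placeholders.

```python
# Illustrative shapes of a decentralized identifier (DID) and a verifiable
# credential (VC) under the W3C data model. Values are placeholders.

holder_did = "did:example:ebfeb1f712ebc6f1c276e12ec21"   # did:<method>:<method-specific id>

credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential", "UniversityDegreeCredential"],
    "issuer": "did:example:76e12ec712ebc6f1c221ebfeb1f",   # the issuing institution's DID
    "issuanceDate": "2022-04-17T00:00:00Z",
    "credentialSubject": {
        "id": holder_did,                                   # bound to the holder, not a platform account
        "degree": {"type": "BachelorDegree", "name": "Bachelor of Science"},
    },
    "proof": {
        "type": "Ed25519Signature2020",
        "verificationMethod": "did:example:76e12ec712ebc6f1c221ebfeb1f#key-1",
        "proofValue": "z3FXQjecWufY",                       # issuer's signature (placeholder)
    },
}
```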
Interoperability is critical because it not only eliminates silos but alleviates the information overload and unrealistic time and expertise requirements that arise when users are asked to consent to TOS and EULAs (or, in the case of blockchains, to “do their own research” by reading smart contract code, a popular and cynical version of the notice-and-consent paradigm that is emerging in Web3). Users cannot and will not read technical agreements; nor will they ever become privacy experts – and not because they are lazy but because the idea itself is impractical. This is a design constraint that technologists must stop ignoring, minimizing, and sidestepping.
For SSI to provide value, users would need ways to “set and forget” default preferences across categories of similar transactions and experiences without getting bogged down in code and contracts – while also protecting these preference groupings against exploitation by malicious code. Digital products and services would, by extension, be obligated to read and abide by these default preferences, adjusting the experience they provide accordingly instead of expecting users to lower their privacy boundaries. Unless the entire SSI ecosystem operates on the same, standardized data rails, SSI will not be an improvement over the consent theater of federated identity that it replaces.
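A minimal sketch of what such defaults might look like appears below. The category names, purposes, and evaluation rule are hypothetical, assuming only that wallets and the services reading them share a common preference vocabulary.

```python
# A sketch of "set and forget" privacy defaults held in the user's wallet.
# Categories, purposes, and the deny-by-default rule are illustrative only.
from dataclasses import dataclass

@dataclass
class Preference:
    share: bool            # may this category of data be released at all?
    purposes: set          # purposes the user consents to, e.g. {"care_coordination"}
    ask_each_time: bool    # require an explicit prompt despite the default?

# Defaults the user sets once and reuses across services.
defaults = {
    "medical_records":  Preference(share=True,  purposes={"care_coordination"}, ask_each_time=True),
    "purchase_history": Preference(share=False, purposes=set(),                 ask_each_time=False),
    "wallet_balance":   Preference(share=False, purposes=set(),                 ask_each_time=False),
}

def evaluate(category: str, purpose: str) -> str:
    """Decide how the wallet answers a service's data request: deny by default."""
    pref = defaults.get(category)
    if pref is None or not pref.share or purpose not in pref.purposes:
        return "deny"
    return "prompt_user" if pref.ask_each_time else "allow"

# A clinic asks for records to coordinate care; a marketplace asks for purchase history.
print(evaluate("medical_records", "care_coordination"))   # prompt_user
print(evaluate("purchase_history", "ad_targeting"))       # deny
```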
Finally, for users to become the ultimate arbiters of their online lives, they must be able to modulate their privacy boundaries entirely at their discretion, requiring a rich menu of controls to adapt their privacy preferences in response to shifting norms and surrounding context. This calls for clear, easily navigable user interfaces that present transparent controls for privacy options. Since very little interaction design research has been conducted on SSI interfaces and identity wallets, this is an area of Web3 that is ripe for innovation and invention.
This paper aims to stimulate discourse among the engineers, founders, and visionaries building Web3 about the urgency of rethinking privacy and identity management. Deterministic, exogenous privacy frameworks not only limit the possibility space for invention, but quickly fall out of relevance as system designers devise creative workarounds that most often take the form of high-friction user experiences. This is especially true of Web3, a space that is experiencing rapid innovation with consequences that fall outside the scope of existing governance.
There is a fatal flaw in the naive logic that blockchain technologies will, on their own, address the harms that pervade our dominant platforms: most of those harms are rooted in ineffective privacy protections and poorly designed identity management, not extractive business models. The promise of a new business model does not by itself address the underlying primitives that create surveillance capitalism: data exhaust, behavior tracking and aggregation, and digital identities abstracted from their true owners. The business model of attention-driven economics is merely a symptom of surveillance capitalism, not its source. So long as the surplus telemetry that emerges from ineffective and outdated privacy frameworks and identity governance enables the monetization of data exhaust for commercial gain, surveillance capitalism will persist.
Unless we address these failure modes by architecting a confidential, interoperable, and agentic self-sovereign identity ecosystem, there will be no material difference between extractive platform businesses and the decentralized versions that hope to supplant them. Systematic commercialization of attention will merely shift from platforms to protocols, yielding the same predations we have grown weary of today, indistinguishable except in degree: reidentification, targeting, and concealed influence in even more immersive, inescapable, pervasive, and immutable forms.
But while Web3 exposes users to more risk, it presents a unique opportunity to abandon outmoded frameworks in favor of identity and privacy schemes that center individual autonomy and agency. This is an opportunity that nobody, least of all those building Web3, can afford to ignore.
Burkert, H. (1998). “Privacy-Enhancing Technologies: Typology, Critique, Vision.” In Technology and Privacy: The New Landscape (ed. Philip E. Agre and Marc Rotenberg).
Cohen, J. E. (2012). Configuring the Networked Self: Law, Code, and the Play of Everyday Practice (Illustrated ed.). Yale University Press.
Gavison, R. (1980). Privacy and the Limits of Law. The Yale Law Journal, 89(3), 421. https://doi.org/10.2307/795891
Harris, T. (2019). “Tech is ‘Downgrading Humans.’ It’s Time to Fight Back.” Wired. https://www.wired.com/story/tristan-harris-tech-is-downgrading-humans-time-to-fight-back/
Harris, T. and Raskin, A. (Hosts). (2022, January 13). Is World War III Already Here? (No. 45) Guest: Lieutenant General H.R. McMaster [Audio podcast episode]. In Your Undivided Attention. TED. https://www.humanetech.com/podcast/45-is-world-war-iii-already-here
Kröger, J.L., Lutz, O.H.M., and Ullrich, S., (2021, July 7). “The Myth of Individual Control: Mapping the Limitations of Privacy Self-management.” https://ssrn.com/abstract=3881776 or http://dx.doi.org/10.2139/ssrn.3881776
Kuran, T. (1997). Private Truths, Public Lies: The Social Consequences of Preference Falsification (Reprint ed.). Harvard University Press.
Nissenbaum, H. (2010). Privacy in Context: Technology, Policy, and the Integrity of Social Life (1st ed.). Stanford Law Books.
NIST Privacy Framework. (2020, January 16). National Institute of Standards and Technology. https://doi.org/10.6028/NIST.CSWP.01162020
Prosser, W. L. (1960). Privacy. California Law Review, 48(3), 383. https://doi.org/10.2307/3478805
Rosen, D., & Santesso, A. (2011). Inviolate Personality and the Literary Roots of the Right to Privacy. Law and Literature, 23(1), 1–25. https://doi.org/10.1525/lal.2011.23.1.1
Smith, A. (2021, October 25). Facebook whistleblower says riots and genocides are the ‘opening chapters’ if action isn’t taken. The Independent. https://www.independent.co.uk/life-style/gadgets-and-tech/facebook-whistleblower-zuckerberg-frances-haugen-b1944865.html
Solove, D. J. (2007). ‘I’ve Got Nothing to Hide’ and Other Misunderstandings of Privacy. San Diego Law Review, 44, 745. GWU Law School Public Law Research Paper No. 289. https://ssrn.com/abstract=998565
Thierer, A. (2020, June 8). The Pacing Problem and the Future of Technology Regulation. Mercatus Center. https://www.mercatus.org/bridge/commentary/pacing-problem-and-future-technology-regulation
Tobin, A., & Reed, D. (2016, September). The Inevitable Rise of Self-Sovereign Identity. Sovrin Foundation. https://sovrin.org/wp-content/uploads/2017/06/The-Inevitable-Rise-of-Self-Sovereign-Identity.pdf
Vallor, S. (2016). Technology and the Virtues: A Philosophical Guide to a Future Worth Wanting. Oxford University Press.
Warren, S. D., & Brandeis, L. D. (1890). The Right to Privacy. Harvard Law Review, 4(5), 193. https://doi.org/10.2307/1321160
World Wide Web Consortium. (2021, August 3). Decentralized Identifiers (DIDs) v1.0. W3C. https://www.w3.org/TR/did-core/
Zuboff, S. (2020). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power (Reprint ed.). PublicAffairs.