Privacy is one of the most widely discussed, and least understood, topics at the intersection of technology and society. While a number of individual products handle privacy for specific data stores, until recently there was no single platform to enforce privacy considerations across a typically complex modern enterprise. Today, LightBeam is emerging from stealth to address this towering problem, and 8VC is honored to have led their seed round.
As followers of this space are intimately aware, today’s enterprises are responsible for handling more and more data, from more and more sources, feeding numerous downstream workflows and applications, and spread across countless tables. Many of these workflows and analyses drive direct business outcomes. The issue, however, is that the customer assets enabling the relevant use cases often contain PII, PHI, or otherwise sensitive data. To compound the difficulty, there are also more regulations than ever concerning the handling of such data, and these regulations tend to be easier to write than to enforce.
These regulations are identity-centric: everything must be linked to a single identity, typically a person in the privacy context, though the same applies to the different types of assets linked to that identity. Given the right tools, this should enable practitioners to answer questions around data usage, map out which data exists where, apply restrictions and protections to data, and delete data in a centralized, complete manner. In addition to an organization proactively auditing and removing PII as required by regulations, individual consumers should in theory be able to ask an organization questions about their specific data, or request to have it deleted entirely. Beyond these practical considerations, the very concept of privacy is itself heavily identity-centric, and this provided a focal point for LightBeam to build the definitive platform for privacy.
Before taking a closer look at LightBeam, it’s worth reviewing the status quo for data privacy. Today, most of the work that goes into enforcing privacy considerations is manual, with individual analysts detecting and identifying which tables or documents contain PII. As the variety of sources, complexity, scale, and velocity of data lurches upward, keeping up with the existing solution set quickly becomes impossible. This forces enterprises into tunnel vision around the most immediate consequences, and as a result, most of the existing privacy products on the market are purchased by compliance departments to check a specific box, rather than to solve the underlying issue.
LightBeam is the first unified data privacy automation platform that allows enterprises to handle PII and PHI in an identity-centric manner, while paying close attention to the very hard technical problems of disambiguating entities and defining relationships between them in the core data tier. On top of this foundational principle, LightBeam features three pillars of product differentiation:
A rich and holistic approach towards supporting structured and unstructured data (shockingly, most products are only well suited to one or the other).
Automation at the center: LightBeam can extract information from historically hard-to-regulate channels such as Slack, email, and others. This obviously saves a lot of manual effort, but it also raises the stakes significantly for precision, since the system must avoid both false positives and false negatives. The under-the-hood ML likewise cannot be static, given the inevitability of new data coming into the organization, along with new regulations.
The ability to build a notion of identity across multiple sources and a vast number of entity types (not just user/customer attributes) — ultimately constituting a 360-degree view of each customer’s sensitive information. It’s akin to seeing a picture emerge after finally putting all the tiny, disparately shaped puzzle pieces together.
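To make the third pillar concrete, here is a minimal sketch of what linking records from several sources into one identity can look like. It is not LightBeam's method: the records, field names, and the rule that an exact match on email or phone links two records are all assumptions made for illustration, and real entity disambiguation would need fuzzy matching and learned models rather than exact matches.

```python
from collections import defaultdict

# Hypothetical records from different sources; every field name and value
# here is illustrative, not taken from any real system.
RECORDS = [
    {"id": "crm-1", "source": "crm", "email": "ana@example.com", "phone": "555-0100"},
    {"id": "slack-7", "source": "slack", "email": "ana@example.com", "handle": "@ana"},
    {"id": "ticket-3", "source": "support", "phone": "555-0100", "ssn_last4": "1234"},
    {"id": "crm-2", "source": "crm", "email": "bo@example.com"},
]

# Assumption: an exact match on any of these fields means "same identity".
LINK_KEYS = ("email", "phone")


def find(parent, x):
    """Return the root of x's group, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def group_by_identity(records, link_keys=LINK_KEYS):
    """Group record ids that share, directly or transitively, a linking value."""
    parent = {r["id"]: r["id"] for r in records}
    first_seen = {}  # (field, value) -> id of the first record carrying it
    for r in records:
        for key in link_keys:
            value = r.get(key)
            if value is None:
                continue
            owner = first_seen.setdefault((key, value), r["id"])
            # Merge this record's group with the group of the first owner.
            parent[find(parent, r["id"])] = find(parent, owner)
    groups = defaultdict(list)
    for r in records:
        groups[find(parent, r["id"])].append(r["id"])
    return sorted(sorted(ids) for ids in groups.values())


if __name__ == "__main__":
    for identity in group_by_identity(RECORDS):
        print(identity)
```

In this toy data, the support ticket shares no email with the Slack record, yet both land in the same identity because each matches the CRM record on a different field. That transitive linking is what lets a single view of one person's sensitive data emerge from scattered pieces.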
The upshot is that LightBeam enables companies to define thoughtful, fine-grained policies to determine how data flows through the enterprise and where it should, and should not, appear. This, in turn, enables them to service complex workflows for all the different enterprise personas who ask identity and governance questions, and to adhere to ever-changing regulations, current and future.
An ideal platform should reflect both the conviction and the nuance with which human experts treat questions of privacy, and this realization helps frame what made LightBeam so challenging to create. It is simultaneously a systems-architecture problem — balancing issues of raw scale with those of diverse, complex, and often unpredictable data formats — and a machine-learning problem, encompassing all the demands of automatically flagging, detecting, and correlating the right pieces across a vast data landscape, on a per-identity basis.
As you have probably guessed by now, LightBeam’s impressive technology is the brainchild of an equally impressive team. Together, founders Himanshu, Priyadarshi, and Aditya supply an incredible technical background in both systems engineering and machine learning, honed at some of the epicenters of data infrastructure innovation, most recently Nutanix and LinkedIn. We’ve been equally impressed by the elegance of their solutions to extraordinarily complex problems, and their blistering speed of execution.
In the classic 8VC Smart Enterprise tradition, LightBeam represents an investment in solving an entire class of urgent problems, both immediate and downstream, central and adjacent. Even better, we saw an opportunity to back another category-creating company. Just as we invested in Airbyte to solve data movement in enterprises, Acryl Datahub to solve defragmentation through a unified metadata platform, and Yugabyte to solve geo-distributed cloud-scale OLTP storage, we believe that solving privacy in an identity-centric way is a foundational component of the modern data stack. Handling privacy can be extremely complex, but the principles behind it, and the desired outcomes that follow, are not. LightBeam is bringing desperately needed simplicity to this problem space, enabling the technology to work for the human — and not the other way around.
Originally published at https://medium.com on April 7, 2022.