Data Lifecycle: Why Good Data Governance Is Good for Business | Lightbeam
Discover why good data governance is good for business. Greg Vaisberg & Niramayee Sarpotdar share insights from Lightbeam’s Privacy eBook Vol II.
Data Lifecycle - Good Data Governance is Good (for) Business! | Lightbeam
In conversation with Greg Vaisberg and Niramayee Sarpotdar. Discussing our recently published eBook Privacy : Applied. Proactive. Innovative. Volume II and a chapter written by Greg specifically called, 'Data Lifecycle - Good Data Governance is Good (for) Business!'
Transcript
Hello, and welcome to another episode
of the Privacy Pros video series.
Uh, this is an initiative by the SI circle, where SI stands
for Privacy and Security Circle.
As a part of this initiative, we published an ebook
with articles written by leading privacy experts.
We are also doing a series of short interviews
with them about their articles.
And I have one such expert with me today.
His name is Greg Vaisberg.
Currently Greg leads the legal team responsible
for all aspects of privacy and security at, uh, Nutanix Technologies.
Your
Work? Yeah, absolutely.
Uh, so I, like you mentioned, I lead the,
uh, the legal team, uh, here at Nutanix.
Uh, we're an, uh, an enterprise infrastructure company
and, uh, my team's responsible for all of product, uh,
privacy and security part of it.
I've become very passionate about privacy
and specifically the intersection of privacy
and technology, privacy technology,
and as I like to also say operations, which is why I chose,
uh, to, uh, share with our audience, uh, the topic of,
uh, privacy framework
and more information security framework, uh,
that I believe really should inform the way we think about
all data in the company, not necessarily just personal data,
uh, that I think will hold, uh, will not only, uh,
help us gain further customer trust as an industry,
but also enable us to build strong operational ties
that I think are really required now between the operations,
engineering, compliance and legal teams, uh,
and really at any company.
Uh, that makes sense. Thanks, Greg.
So, like you said, you've written an article that's titled,
uh, Data Lifecycle: Good Data Governance is Good for Business.
And in the article, like you said, you, uh,
laid out a pretty typical lifecycle for any information
and, uh, seven core aspects related to it.
You've also provided a framework that organizations can use,
you know, for their records of processing activity,
but just, uh, this framework is useful for
so many other things as well.
So, yeah, can you tell me a little more about, you know,
why this is important and, um, you know,
why this topic specifically?
Oh, absolutely. I, I mean, for
one, it's complicated.
It is, uh, uh, you know, uh, I,
I talk a little bit in the article about, uh, and I,
and I think when I was writing the article,
I think I was having my annual checkup at the doctor,
and, uh, I may have just recently gone to a, a lab
and I thought, well, that's interesting, right?
Uh, a blood sample in a lab never gets lost practically.
We don't always, you know, misplace our keys
or lose our mobile phones.
We keep pretty good track of physical assets.
I've been thinking a lot about the complexity
of tracking data
because data is really a lifeblood of our business today.
Um, and when I say that I don't just mean personal data,
it's very easy for us to think about the, uh,
consumer data usage
that we've been hearing a lot about in the press, uh,
with social media and others and advertising.
But truthfully, any company has data
and any company has data that they need to protect,
whether it's personal or confidential information.
And that could be starting from the very thing
that makes your business, your employees,
you hold personal data about them, um, with respect
to the intellectual property
that they create for the company.
That is also an asset that is confidential
and should be protected.
Um, information that your customers share with you, whether
that is their account login information, workloads
that they process with you, um, if you're a cloud provider
that has information about their customers, um,
or confidential information that they provide as part
of the services you provide them
and enable you to have access
to their confidential information.
And then lastly, any personal data, uh, that,
that you receive as a result of supporting those customers
or supporting your customers' customers or being a processor.
So it's a very rich life cycle and a rich data set.
And I think that one of the reasons
that I proposed this framework
as a general data governance framework,
rather than a privacy governance framework, is
because it's truly applicable to all data,
and it really highlights the difficulty of identifying
what type of data it is and then
consequently how you're going to use it
and how you're gonna enable other users
in the company to use it.
But I think we are at a technologically pivotal point where
we as an industry can actually tackle this.
Um, and I think that only happens
through a close collaboration
with others in an organization.
Uh, this is an operational problem as much
as it is the legal and compliance one.
And if we approach it as such, then I think we emerge with
stronger relationships because we're starting with the why,
and we are providing a clear guidance on the how
to our business partners to actually execute the things
that we need to provide that level of visibility.
Right. I completely agree with you when you said that,
you know, the data that organizations tend
to store, it's so rich, there are so many facets to it.
Uh, it's not always just about,
you know, personal information.
There's just so much to it and that's why it's complex.
But you've laid out a framework in your article,
and you mentioned several times
that the framework is simple, uh,
but we all know that, you know,
the simple things are the ones
that are hardest to implement, right?
So what has your experience been like so far?
You know, what would you say are the top three
or four challenges
that organizations face when they even wanna remotely
implement anything of this sort?
A couple of things. I think one is making sure
that you have a proper whatever system you use, right?
And, and there are myriads
because the data sources are myriad,
but whatever sources you use,
you can always identify the point of origin, right?
And, uh, that really, uh, requires you
to understand your organization.
Where's the data coming from: from customers,
from partners, from leads, and, uh, other sources?
What are the systems that are actually ingesting it, right?
So this really, uh,
you mentioned in the introduction, RoPA, right?
And fundamentally a good RoPA is essentially
this model, right?
If we just look through a, a RoPA lens,
most data should be trackable through that framework.
And really this framework just kind
of double clicks on a few areas
and kind of expands operationally on what may be required.
So I think the challenging thing first is just mapping,
mapping your assets.
What are your ingestion points
and where's the data coming from?
And from there, you can start
to actually parse once you find
where the data's coming into the company,
what rights you're receiving to that data,
and what that data actually is.
Because it's very important
to essentially earmark each piece of data
with the rights you have with respect to it,
and the kind of uses you can make of it.
Because as we all know,
whenever we receive data, we don't
have unfettered rights to it.
It's not something that we created, and
therefore, it's either subject to a licensing agreement,
to a non-disclosure agreement
or subject to privacy regulations
and your data processing agreement
with your customers, for example.
It is subject to obligations to your partners.
Uh, you may be a controller of that data,
you may be a processor of that data,
but there are strict obligations,
and you have to know how you can use that data
before you can make use of it
and make sure you use it within those parameters.
The other difficult part is the ease of replication
of data within any organization today makes it difficult
to know where it's going.
And so mapping where the data resides
and that actually includes access is also quite imperative.
And something that is absolutely aided
by a good relationship with your, uh, IT team,
your engineering team, to be able to map that data.
And also technological solutions that we have today, such
as Lightbeam, that can help with discovering
where your data actually is.
And that foundational set
of elements really then enables the next step,
which is the most difficult: setting up the access
and security controls and the use restrictions,
because recall that each piece of data
that we have can only be used for a specific purpose.
And so in order to be able to technologically implement, um,
access controls, security restrictions
and analytics limitations, you must have earmarked the data
with, with elements that identify what
that data can be used for.
And then you can pass that, those permissions
and limitations onto individuals and systems who utilize
or access that information.
And you know, some, you know, and somewhat thereafter,
because you have identified everything, you know
what you're using it for and you know where it came from,
you are able to keep that data up to date
to make sure you're making good decisions
and you are not using stale data.
And that's really at best, right?
At worst, you are having data that is inaccurate
and you are actually non-compliant with your obligations
to ensure that, uh, your processing activities are accurate. One
often overlooked area is the term.
When do we delete it, right? And how do we delete it?
And given the wealth
of data that companies are processing today, you really have
to think about that upfront.
It is the last piece of the life cycle,
but you must identify by what technological means, whether
that is a piece of data that is required by regulation
to be kept for a period
and then deleted in accordance with regulations
after a certain period, or whether it's data
that, under privacy regulations, you have to establish
and maintain a lawful basis of processing for,
you must have a mechanism that actually implements this.
And each one of these represents a challenge,
but I would say that they're surmountable today
with appropriate tooling,
but really most importantly,
with appropriate operational rigor
and really strong relationships across the organization.
Anybody today, in any organization, can onboard new tools,
can move data around, uh,
and if these controls are not implemented,
it's simply not traceable.
And that traceability gives a tremendous amount
of power, not just for compliance,
but also for strong organizational and business insight.
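The earmarking idea described in this answer can be sketched in code. The following is only a minimal illustration under stated assumptions, not something taken from the interview or the eBook: the `DataAsset` class, its field names, the `check_access` helper, and the example purposes are all hypothetical. It shows one way a piece of data could carry its point of origin, its permitted uses, and its retention term, so that access decisions and deletion dates can be derived from those tags.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataAsset:
    """Hypothetical record: a piece of data earmarked with origin, rights, and term."""
    name: str
    origin: str                    # ingestion point where the data entered the company
    permitted_purposes: frozenset  # uses allowed by contract or regulation
    received_on: date
    retention_days: int            # how long the data may be kept

    def allows(self, purpose: str) -> bool:
        """True if the requested use is within the earmarked purposes."""
        return purpose in self.permitted_purposes

    def delete_after(self) -> date:
        """Last day the data may be kept before it must be deleted."""
        return self.received_on + timedelta(days=self.retention_days)

    def is_expired(self, today: date) -> bool:
        return today > self.delete_after()


def check_access(asset: DataAsset, purpose: str, today: date) -> bool:
    """Allow a use only if the data is still within its term and the purpose is permitted."""
    if asset.is_expired(today):
        return False
    return asset.allows(purpose)


# Hypothetical example: support tickets that may only be used for support.
tickets = DataAsset(
    name="support-tickets",
    origin="customer-support-portal",
    permitted_purposes=frozenset({"support"}),
    received_on=date(2024, 1, 1),
    retention_days=30,
)
```

In this sketch, a request to use `tickets` for `"marketing"` is refused because that purpose was never earmarked, and any request after the retention term is refused regardless of purpose. A real deployment would enforce these rules in the systems that store and serve the data rather than in application code alone.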
Right? So if I'm
hearing you right, there are obviously a lot
of challenges along the way,
but the most important thing here to overcome everything is,
like you said right at the beginning, just to earmark
all of your data, just have as much information
as possible about where it's coming from, where it's going,
what it's being used for, how long is it gonna be around?
The more information you have, the easier it is to,
you know, implement, uh, some of these things, right?
So, um, in your article,
you've laid out a pretty detailed framework, but you've also
provided pretty detailed recommendations, right?
On what should happen around each aspect of data.
So if you had to just very quickly summarize your learnings
and give your colleagues any kind
of advice, what would it be?
Partner with your security organization
and your IT organization, um,
and help them discover all of the assets you have,
do a proper data mapping exercise.
And the data mapping exercise starts with understanding all
of the assets you have that actually touch data
and all of the use cases that you have.
So you have the essential elements
of your record of processing.
If you approach it from a record of processing, like, uh,
perspective, you won't go wrong.
This framework overlays that
and is helpful by providing additional questions.
But if you start with where is my, where is my data?
What, and what data do I have
and what rights I have with it,
and document that you can then establish what you need
for all of the subsequent steps.
How do you secure that data? Who should have access to it?
What are we gonna use that data for?
What can we allow that use of data for
and implement the appropriate cons, uh, appropriate set
of operational measures that are actually going
to be technologically enforced.
So what I'm hearing is, um, basically, yeah, like you said,
look at this like a record of processing activity.
Look at all of your data, everything that you store, uh,
from that lens, and more importantly, uh,
make this extremely collaborative.
Uh, like you said, partnering with the right, uh,
teams within your organization.
IT, engineering especially, that is the key to, you know, um,
a successful, you know, foundation
for any of these processes.
Absolutely. This, this is,
this is absolutely an operations problem,
and it really can bring your teams closer together
for a much stronger partnership across, uh,
uh, all of the functions.
Uh, whether it's marketing, HR, engineering, uh,
product development, and of course legal
and compliance who often lead these functions, um, uh,
sorry, lead these activities.
We can't do it alone. Uh, we absolutely need
that partnership, both from a knowledge perspective as well
as technological perspective.
That makes sense. Greg, thank you so much.
This was really insightful.
And for our audiences, I just wanna say
that this was just a very small snippet
of the article that Greg has written.
I would highly recommend all of you to go ahead and read it.
Like we've discussed even in the interview,
it lays out like a very typical, uh, information
or data lifecycle along with recommendations on
what each organization needs to do to keep in mind
or, you know, aspects that they need
to keep in mind about every aspect of
that information or data.
And like Greg says, um, think of things, um, you know,
from the perspective of how can I maintain a good record
of all my processing activity, right?
So yeah, good data governance indeed is good for business.
So please make sure that you read this chapter
in our latest ebook.
The book also contains several other chapters
by other leading, uh, privacy experts.
So please go ahead.
I would highly recommend you read for content, the privacy
and security
and.