The University of Newcastle, Australia

Data Centric Security

Wednesday, 27 November 2019

Nowadays we update our status and profiles on Facebook, upload snaps to Instagram, search Google for restaurants to go to for dinner, talk on Skype, send email, download apps onto our mobile phones, track steps on our smart watches, scan loyalty cards at supermarkets, and do much more.

Every time we perform such actions, our data (sometimes our personal data) is sent over the Internet, stored in cloud systems, and used by different applications and services belonging to different infrastructure, service and software providers.

At the centre of cyberspace lies the issue of data, which has become the universal currency of the digital economy. Yet if I were to sit at a computer, or use my mobile phone or tablet, and ask, for example, who has my credit card number, the networked system (the Internet) at present would not be able to give me an answer. Though we talk about data becoming the universal currency in this digital world, we are not able to make a data-specific query to the networked system or the Internet infrastructure. What is needed is a different approach to this issue, one with a data centric focus. In such a data centric paradigm, data has to be a fundamental tenet in the specification of systems, services and policies. Such an approach is needed if data owners are to better control what rights others have over their personal data as it moves over the Internet.

In fact, more generally, a data centric focus is needed to address a range of specific data-related issues. Just because data is accessible does not mean that it is trustworthy or reliable enough to base decisions on, or even that it is ethical to access it.
(i) Where does the data come from? For instance, provenance of data plays an important role when scientists are using the data in their analysis to prevent errors in research.
(ii) How do you know where your data is? Do you know who can see the data and modify it without a trace? Tracking the use of personal data is a critical issue when it comes to protection of privacy in a range of applications such as with healthcare data.
(iii) What are the purposes for which the data has been used? This is a fundamental issue as the data is usually collected for some purpose but then is misused for some other purpose.
(iv) Who can aggregate or summarize or embed your data for purposes other than what you specified?
(v) How can security policies be specified on data as it moves over the Internet? How can we achieve large-scale dynamic security management of data and the services that manipulate it?
(vi) How can we reason about different types of data -- such as who (social metadata), where (spatial metadata), when (temporal metadata) and the context metadata -- to enhance the trust and quality of decision making?

Having said this, users typically want personal control over their data, even if they do not always want to exercise it. They will allow agents that they trust to access and process their data. Industry, for its part, usually prefers consistent rules on which to build customer relationships and an agreed set of rules for complying with regulations. Recent regulations such as the EU’s GDPR (May 2018), California’s Consumer Privacy Act (CCPA, Jan 2020) and Australia’s Mandatory Data Breach Regulation (Feb 2018) have specific requirements which can be better addressed by taking a data centric view.

Such a data centric approach requires security mechanisms and infrastructures at different levels.
* At the basic level, it needs a fundamental mechanism that securely couples a data item to its policy (e.g. metadata attributes and rules). A security mechanism must enforce this coupling so that data and policy are logically inseparable: the policy stays with the data when the data is copied.
* At another level, we need data processing agents that check whether the policies are satisfied before and after data usage. This requires trust that the agents will enforce the policies properly, will not change them, will check them at the right time, and so on. As to the policies themselves, there can be prerequisite policies, which must be satisfied before the data can be used, and obligation policies, which apply after data usage. There can also be policies related to auditing and tracking of data and data provenance, as well as to checking whether the data has been used for the right purpose.
* There is also a need for trusted infrastructures, just as the Internet requires its own infrastructures such as DNS. In the data centric security architecture, we need trusted infrastructures that enable mapping from policies to data and, vice versa, from data to policies.
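To make these three levels a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the project's actual design: the names `bind`, `protect`, `use`, `Registry` and `PolicyViolation`, the policy fields `allowed_agents` and `allowed_purposes`, and the shared demo key are all hypothetical. An HMAC tag stands in for the coupling of data to policy, a `use` function plays the role of a trusted data processing agent, and a small in-memory registry stands in for the trusted infrastructure.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, field

# Hypothetical key held by the trusted infrastructure (illustration only).
BINDING_KEY = b"demo-key-not-for-production"


class PolicyViolation(Exception):
    """Raised when a policy check fails or the data/policy coupling is broken."""


def bind(data: bytes, policy: dict) -> str:
    """Compute a tag over data and policy; altering either one invalidates it."""
    payload = hashlib.sha256(data).digest()
    payload += json.dumps(policy, sort_keys=True).encode()
    return hmac.new(BINDING_KEY, payload, hashlib.sha256).hexdigest()


@dataclass(frozen=True)
class ProtectedItem:
    """Level 1: a data item that travels together with its policy and tag."""
    data: bytes
    policy: dict
    tag: str


def protect(data: bytes, policy: dict) -> ProtectedItem:
    return ProtectedItem(data, policy, bind(data, policy))


@dataclass
class Registry:
    """Level 3: stand-in for a trusted infrastructure that tracks data usage."""
    holders: dict = field(default_factory=dict)    # data digest -> agent names
    audit_log: list = field(default_factory=list)  # (digest, agent, purpose)

    def record(self, item: ProtectedItem, agent: str, purpose: str) -> None:
        digest = hashlib.sha256(item.data).hexdigest()
        self.holders.setdefault(digest, set()).add(agent)
        self.audit_log.append((digest, agent, purpose))

    def who_has(self, data: bytes) -> set:
        return self.holders.get(hashlib.sha256(data).hexdigest(), set())


def use(item: ProtectedItem, agent: str, purpose: str, registry: Registry) -> bytes:
    """Level 2: a data processing agent that enforces policy around each use."""
    # Coupling check: the policy must be the one originally bound to the data.
    if not hmac.compare_digest(item.tag, bind(item.data, item.policy)):
        raise PolicyViolation("data and policy have been separated or altered")
    # Prerequisite policies: checked before the data is released.
    if agent not in item.policy["allowed_agents"]:
        raise PolicyViolation(f"agent {agent!r} is not authorised")
    if purpose not in item.policy["allowed_purposes"]:
        raise PolicyViolation(f"purpose {purpose!r} is not permitted")
    # Obligation policy: every permitted use is recorded for auditing.
    registry.record(item, agent, purpose)
    return item.data


registry = Registry()
card = protect(
    b"4111 1111 1111 1111",
    {"allowed_agents": ["bank"], "allowed_purposes": ["payment"]},
)
use(card, "bank", "payment", registry)
print(registry.who_has(card.data))  # {'bank'}
```

In this toy model a data-specific query reduces to a registry lookup. A real system could not rely on a single shared key or on agents reporting honestly, which is why the trust requirements on agents and infrastructure described above matter.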

With such mechanisms in place, we can answer questions like the one posed at the beginning: who has my credit card number? Such mechanisms provide us with a design framework to address issues raised by the GDPR, such as the right of an individual to know how data about him or her is being processed, the right to withdraw consent to processing and the right to have the data erased under certain conditions.

At the Advanced Cyber Security Engineering Research Centre (ACSRC) at the University of Newcastle, we are working on this Data Centric Security project with Data61, with the aim of developing a secure and trustworthy data ecosystem that enables reliable decision making.

Vijay Varadharajan
Global Innovation Chair Professor in Cyber Security
Director: Advanced Cyber Security Engineering Research Centre (ACSRC)
The University of Newcastle, Australia

