|dc.description.abstract||The EU-wide General Data Protection Regulation (GDPR) on personal data takes effect in May 2018 and, pace Brexit, is already being used as the basis of an overhaul of the UK’s Data Protection Act. GDPR introduces fundamental rights for data subjects and strictly limits what data controllers and data processors can and cannot do with personal data. “Research use” may have opt-outs, but these are largely left to national law and/or professional codes of conduct, which are still on the drafting table. Paying attention to GDPR is, therefore, both important and unavoidable.
GDPR Article 4 (3) defines ‘restriction of processing’ as the marking of stored personal data with the aim of limiting their processing in the future. This talk builds on this idea of classifying data with some sort of ‘mark’ and explores the Harvard DataTags system as an approach for managing data.
In the DataTags system, a data tag is a label indicating the level of protection to which a data object should be subject within a repository (or elsewhere). Its Harvard creators, Sweeney, Crosas and Bar-Sinai, introduced the notion of a DataTags repository as a facility which stores and shares data files in accordance with different levels of security, access requirements and auto-generated data use agreements. The system defines security features and access requirements for handling sensitive data. The original American system uses six levels of access, from blue (public data) to crimson (highest level of restriction), and is modelled on various U.S. privacy laws. In effect, attaching a data tag to a data object tells the technical infrastructure which handling requirements apply to that object, taking into account the specific legal obligations for processing these data.
The Harvard authors discuss using a flow chart or decision tree to arrive at a tag for a particular file or data object (we use the terms interchangeably here), and in doing so describe a way to generalise the DataTags system to other privacy and personal data frameworks.
We report on an analysis, from the EUDAT2020 project, of the use of DataTags under the GDPR framework. This work is based on a pilot study carried out during summer 2017 by the Dutch institute Data Archiving and Networked Services (DANS), within the “policy and strategy” arm of EUDAT2020 led by EPCC. The principal goal of EUDAT is to be an Internet-connected distributed platform for research data with a mandate for open access. With that goal in mind, we report that a classification scheme based on DataTags for types of data, personal and otherwise, could be used as a tool for assessing the risks associated with processing certain data within repositories and distributed repository networks like EUDAT.
L. Sweeney, M. Crosas, M. Bar-Sinai, Sharing Sensitive Data with Confidence: The Datatags System. Technology Science [Internet], 2015. http://techscience.org/a/2015101601/
A video of this presentation can be viewed at https://media.ed.ac.uk/media/0_5f8tlvrp||en