Archiving @ FGB: Summary of FGB Archiving Guidelines

Introduction

The following is a summary of the FGB Implementation of the National Guidelines for Archiving of Academic Research for Faculties of Behavioural and Social Sciences. This summary highlights the major concerns when archiving data; in-depth explanation is found in the full guidelines.

NB: The guidelines summarized here refer specifically to data archiving and some elements of data publishing, but they do not address the complexities involved with the reuse of data by third parties.

Why Archiving?

Data and supplementary materials from your research should be archived because:

Archiving promotes research integrity and transparency by:
- Ensuring that research results can be verified, reproduced and, where possible, replicated
- Providing assurances that data have not been inappropriately tampered with after collection
Archiving preserves valuable data and the materials necessary for the proper interpretation of these data well into the future
When you conduct medical research, archiving data is important for the safety of your participants
- The archived data serves as a record of the medical interventions that each participant experienced
- This information is important for safety monitoring and long-term follow-up

Summarized from guideline sections: 1 and 1.1

What to Archive?

You should archive whatever would be necessary to properly interpret your data. This includes the data themselves, but also documentation about the data and the research process, as well as any code or scripts used in your research.

There should be sufficient documentation in the archived materials so that another researcher could reanalyse your data and reproduce your research results without ever contacting you.
Section 2.1.1 of the full guidelines lists the kinds of materials you are expected to archive. Not everything listed will necessarily apply to your research.

Summarized from guideline sections: 2.1, 2.1.1 and 2.1.2

When to Archive?

A data package must be archived whenever you publish a research article. Researchers are expected to manage their data efficiently during the course of their research so that they can archive the data and supporting materials as soon as their research article is accepted for publication.

Additionally, archiving is recommended, but not required, when:

Data collection for a research project has finished, so that the raw data can be stored in a way that prevents (unintentional) modification
A research project is complete, even if the data are not used in any research publications
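One way to check later that raw data stored after collection have not been modified is to record a checksum for every file at the moment of storage. The sketch below is an illustration only and is not part of the FGB guidelines; the function names are hypothetical and the choice of SHA-256 is an assumption. It detects changes rather than preventing them.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing, new or altered compared with the manifest."""
    current = build_manifest(folder)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

The manifest is best kept outside the folder it describes (for example in the project documentation), because a manifest file saved inside the folder would itself be reported as a new file.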

Summarized from guideline sections: 1.1 and 2.2

Where to Archive?

The vast majority of data used within the faculty are considered personal data and should therefore be stored in a secure archive. The default archive at FGB for most situations and data types is YODA. Follow this YODA archiving checklist to ensure that you archive everything that needs to be archived in YODA in the correct way.

If your data are low enough risk and/or can be de-identified* enough that they fall under the “Green” category from the Privacy Risk categorization, then you can use DataverseNL instead of YODA if YODA does not meet your requirements.
If your data are low enough risk and/or can be de-identified* enough that they fall under the “Blue” category from the Privacy Risk categorization, then you can use DataverseNL or an external data repository instead of YODA if YODA does not meet your requirements.
- You should check with the FGB Privacy Champion that your data are indeed “Blue” data before archiving them in an external data repository.
If your data cannot be de-identified to the “Green” or “Blue” category level, it may still be possible to archive the supporting materials (e.g. research code, codebooks, interview scripts, etc.) in archives or data repositories other than YODA. Just make sure these supporting materials don’t contain any personal or confidential information.
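The options above for the research data can be sketched as a small lookup. This is an illustrative summary only: the function name and its parameters are hypothetical, and it does not replace the Privacy Risk categorization or the advice of the FGB Privacy Champion.

```python
def archive_options(privacy_category: str, yoda_meets_requirements: bool = True) -> list[str]:
    """Return the archives this summary allows for the research data."""
    category = privacy_category.strip().lower()
    if yoda_meets_requirements:
        # YODA is the default archive for most situations and data types.
        return ["YODA"]
    if category == "green":
        return ["DataverseNL"]
    if category == "blue":
        # Confirm the "Blue" classification with the FGB Privacy Champion
        # before using an external data repository.
        return ["DataverseNL", "external data repository"]
    # Any other category: the summary names no alternative to a secure archive,
    # so this sketch falls back to the default (an assumption).
    return ["YODA"]
```

For example, `archive_options("Green", yoda_meets_requirements=False)` returns `["DataverseNL"]`. Supporting materials without personal or confidential information are not covered by this function.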

* When de-identifying your data, it is very important that you do not irrevocably modify the raw data. Irrevocable modification defeats the purpose of archiving unadulterated raw data. See How Do I Meet Privacy and Security Requirements? for further explanation.

Summarized from guideline sections: 1.3, 2.1.1 and 3.3

Who Needs to Archive?

Who do the guidelines apply to?

The FGB archiving guidelines apply to all researchers conducting research within the faculty.
They also apply to Bachelor’s or one-year Master’s students if their research results in a research publication.
- For Bachelor’s and one-year Master’s students who don’t publish any research articles, it is still recommended that they provide an informal data package of their work to their supervisor as a way to practice archiving. The supervisor can decide whether these materials need to be preserved.

Who is responsible for archiving?

If you are the first author on a research paper, you are responsible for archiving. If the first author works at another research facility, you should confirm with the first author that the data will be archived.
If data are archived for preservation purposes after data collection or upon completion of a research project, the lead researcher is responsible for archiving.
For research from PhD candidates, Master’s and Bachelor’s students, the supervisor is responsible for archiving; they may delegate the task to be completed by their student, but they remain ultimately responsible for this task.
The final responsibility for all archiving in the faculty lies with the dean.

Who should have access to the archived data?

Whoever is responsible for archiving should have access to the archived data and supporting materials.
- There should also be at least one other person who also has access to the archived materials.

Summarized from guideline sections: 1.2, 2.3 and 2.4

How Long Should Archiving Last?

The duration of archiving depends on the reason you are archiving as well as on other policies and laws.

For research that isn’t subject to the WMO law, the Good Clinical Practice (GCP) Guidelines or the other regulations mentioned in section 1.1 of the full archiving guidelines:
- The data and supporting materials used in the publication of a research article must be archived for 10 years from the date of publication
  - If the data are reused for new research articles, this archiving term should be extended for another 10 years from the new publication date
- If you choose to archive data after data collection is completed or upon the completion of a research project, even if the data have not (yet) been used for a research article, you can determine how long the data should be archived
If you are conducting medical research, the duration of archiving will depend on which laws and regulations apply. To determine which archiving duration applies, see this page from the CCMO or section 3.1 from the full archiving guidelines.
Any data that fall under the “Blue” category from the Privacy Risk categorization and/or supporting materials that don’t contain any personal and/or confidential information can be archived indefinitely, unless any other contracts or agreements apply that limit the archiving duration.
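For research that is not subject to the WMO, GCP or the other regulations mentioned above, the 10-year term and its extension on reuse come down to simple date arithmetic. The sketch below is a minimal illustration; the function name is hypothetical, and the handling of a 29 February publication date is an assumption that the guidelines do not address.

```python
from datetime import date

ARCHIVE_YEARS = 10  # term for data used in a research article (non-WMO, non-GCP)


def archive_until(publication_dates: list[date]) -> date:
    """Return the date until which the data package must be kept.

    Each new article that reuses the data restarts the term, so only the
    most recent publication date matters.
    """
    latest = max(publication_dates)
    try:
        return latest.replace(year=latest.year + ARCHIVE_YEARS)
    except ValueError:
        # 29 February does not exist ten years later; rolling over to
        # 1 March is an assumption made for this sketch.
        return date(latest.year + ARCHIVE_YEARS, 3, 1)
```

For data first published on 3 May 2021 and reused in an article published on 20 November 2024, `archive_until([date(2021, 5, 3), date(2024, 11, 20)])` gives 20 November 2034.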

Summarized from guideline section: 3.1

How Do I Meet Privacy and Security Requirements?

It is important that the archived data are protected, particularly when these data are considered personal, but you must also ensure that the integrity of the raw data is maintained.

Raw data can usually be de-identified, but it’s important that they are not irrevocably altered in the process. Generally, you can de-identify the data up to step 4 of this de-identification guide without irrevocably altering the raw data. Only de-identify raw data to a point where it could be returned to its original state.
You will need to determine whether any directly identifying personal data collected in the course of your research need to be archived** and, if so, for how long. Also determine whether it’s necessary to re-identify any de-identified data.
- This will depend on the nature of your research and what regulations apply. Section 3.2 of the full guidelines explains in detail what is required.
If raw data are separated into directly identifying personal data and de-identified research data, you may:
- Archive the de-identified research data in a less secure archive if the data are sufficiently low risk (see Where to Archive). The personal data must be archived in a separate, highly secure archive. Make sure to cross-reference these separate archiving submissions to each other in your documentation.
- Archive both sets of data in a single, highly secure archive, but submit the personal data separately from the de-identified research data (in other words, create two submissions). Make sure to cross-reference the two submissions to each other in your documentation.
If the raw data cannot be separated into personal data and de-identified research data without irrevocably altering the raw data, archive all of the data together in a secure archive (see Where to Archive).
You may consider encrypting the personal data (or all of the data if the personal data cannot be separated from the research data), but you must ensure that you have a plan for the long-term management of the decryption key. When using YODA as an archive, your best option is to contact the YODA administrator to provide them with a copy of the decryption key, and then print a copy of the key and store it in your department’s paper archive.
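The idea of separating raw data into directly identifying personal data and de-identified research data, without losing the ability to restore the original, can be illustrated as follows. This is a minimal sketch and not the faculty's de-identification procedure: the function names, the `pseudonym` field and the example field names are hypothetical, and the sketch works on copies so the raw records themselves stay untouched.

```python
import secrets


def split_records(raw_records: list[dict], identifying_fields: set[str]):
    """Split copies of the raw records into a key table and research rows."""
    key_table: dict[str, dict] = {}
    research_rows: list[dict] = []
    for record in raw_records:
        pseudonym = secrets.token_hex(8)  # random ID, not derived from the data
        key_table[pseudonym] = {
            k: v for k, v in record.items() if k in identifying_fields
        }
        rest = {k: v for k, v in record.items() if k not in identifying_fields}
        research_rows.append({"pseudonym": pseudonym, **rest})
    return key_table, research_rows


def rejoin_records(key_table: dict[str, dict], research_rows: list[dict]) -> list[dict]:
    """Restore the original records from the key table and the research rows."""
    restored = []
    for row in research_rows:
        rest = {k: v for k, v in row.items() if k != "pseudonym"}
        restored.append({**key_table[row["pseudonym"]], **rest})
    return restored
```

In this sketch the key table corresponds to the personal data that belong in the highly secure submission, and the research rows to the de-identified submission. As long as both parts are kept, `rejoin_records` returns records equal to the originals, which is what makes the split reversible.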

** One reason you may need to preserve directly identifying personal data is if you obtained consent from participants for the use of their data in your research. You must save these consent forms for as long as the data will be archived.

The forms need to be saved to serve as evidence that consent was legally obtained. Depending on the nature of the research, the link between the consent form and the data itself may or may not need to be maintained. Section 3.2 of the full guidelines explains this further. Once the archiving term is complete, assess whether the research data can be destroyed. Once the research data are destroyed the consent forms should also be destroyed.
Even if the consent forms do not include any personal data, it is still necessary to save the consent forms for the same duration as the archived research data.
If paper consent forms were scanned into a digital form, you may not destroy the original paper copies if your research is subject to the WMO. For non-WMO research, you may destroy the paper consent forms after digitization as long as you follow the faculty-approved digitization guidelines. See this statement in the full guidelines for more information.

Summarized from guideline sections: 1.3, 2.1.1, 3.2 and 3.3

Anything else?

Archived data must be persistently findable when the data are used in a research publication. Archived data must also be registered in PURE for VU administrative purposes. FGB researchers can meet these requirements by archiving their data in YODA and publishing those data, even if the data are kept closed or under restricted access to prevent or limit reuse of the data. The metadata published in YODA is in a format that can be easily imported into PURE. The researcher must still create a PURE registration for their dataset, but the process is much easier if the data are published in YODA because of this import functionality.

Summarized from guideline section: 4