
The Summary worksheet

This worksheet contains a simple set of rows describing the dataset and identifying the spreadsheets that contain data tables. Each row is labelled on the left in the first column and then the description data should be typed in the columns to the right. The description below sets out the possible summary metadata in blocks.

Some blocks of fields are mandatory (core, authors, worksheets, keywords) but may include optional fields (such as author ORCID). Other blocks are optional, but contain fields that may be mandatory if the block is used. We've tried to make this as clear as possible below!

Some blocks allow multiple records (e.g. authors and data worksheets) with sets of values in adjacent columns but other blocks only allow a single record (e.g. core fields and geographic extents).
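The row-and-column layout described above can be pictured with a short sketch. It assumes the Summary sheet has already been read into a list of rows (for example with a spreadsheet library); the function names `rows_to_fields` and `block_records` are hypothetical and are not part of safedata_validator, and the second author is invented purely for illustration.

```python
def rows_to_fields(rows):
    """Map each lowercased row label to the list of cells to its right."""
    fields = {}
    for row in rows:
        if not row or row[0] is None:
            continue  # skip blank rows
        label = str(row[0]).strip().lower()
        # Keep cell positions so that records line up across rows of a block
        fields[label] = list(row[1:])
    return fields


def block_records(fields, labels):
    """Read one record per column from the rows of a multi-record block."""
    width = max((len(fields.get(lab, [])) for lab in labels), default=0)
    records = []
    for col in range(width):
        record = {}
        for lab in labels:
            values = fields.get(lab, [])
            record[lab] = values[col] if col < len(values) else None
        # Ignore columns where every cell is empty
        if any(v not in (None, "") for v in record.values()):
            records.append(record)
    return records


rows = [
    ["Title", "Example data for the safedata system"],
    # Hypothetical second author, added only to show adjacent columns
    ["Author name", "Orme, David", "Bloggs, Joe"],
    ["Author affiliation", "Imperial College London", None],
]
fields = rows_to_fields(rows)
authors = block_records(fields, ["author name", "author affiliation"])
```

Single-record blocks such as the core fields would simply read the first value of each row, while multi-record blocks such as authors yield one record per column.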


This sheet must be called "Summary" for dataset validation to work properly. If it is called something similar but not identical to "Summary" (e.g. "summary") validation will fail.

The core fields block

Mandatory block

All fields are mandatory.

This block provides a set of core details for the dataset. You can only provide a single value for each field.

  • Title: This should be a short informative title for the dataset: it will be used as the public title for the dataset so make sure it is clear and grammatical!
  • Description: This will be the public description of the dataset. Note that you can have paragraphs of text within a single cell in Excel, so please do provide a reasonable summary. You will need to use Alt + Enter (or Alt + Shift + Enter on a Mac) to insert a carriage return.
Title Example data for the safedata system
Description This is an example dataset.

The Project ID block

Possibly mandatory block

This block will be mandatory if the data collection you are publishing to uses projects to group datasets. If it does, then you should obtain the relevant project IDs from the organisation's data manager and add them in this block.

This simple block provides project ID codes for the dataset.

  • Project ID: Provide the integer project ID codes for the research projects that this dataset is associated with. Older datasets may use the field name SAFE Project ID, but this is deprecated.
Project ID 1

The access block

Mandatory block

Access status is required for all datasets. Embargo date is only required if the Access status value is 'Embargo'. Access conditions is only required if the Access status value is 'Restricted'.

This block provides the access details for the dataset. You can only provide a single value for each field.

  • Access status: You must enter an Access Status of Open, Embargo or Restricted. We prefer as much data as possible to be open access: see the discussion of data availability.
  • Embargo date: If you choose embargoed access status then you must also enter the date when the embargo will end. This must be an Excel date formatted value and you cannot embargo data for more than two years. Do not provide access conditions: embargoed datasets become freely available when the embargo ends.
  • Access conditions: If you choose restricted access status then you must also provide text describing the access conditions. Do not provide an embargo date - restricted datasets are permanently restricted.
Access status Embargo
Embargo date 03/09/18
Access conditions
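The dependencies between the three access fields can be sketched as a small check. This is only an illustration of the rules stated above, not the validator's own code: the function name `check_access` is hypothetical, status matching is assumed to be case insensitive, and "two years" is approximated as 730 days from today.

```python
from datetime import date, timedelta

# Assumption: "two years" is measured as 730 days from the day of checking
MAX_EMBARGO = timedelta(days=2 * 365)


def check_access(status, embargo_date=None, conditions=None, today=None):
    """Return a list of problems with the access block values."""
    today = today or date.today()
    problems = []
    status = (status or "").strip().lower()

    if status not in ("open", "embargo", "restricted"):
        problems.append("Access status must be Open, Embargo or Restricted")
    elif status == "embargo":
        if embargo_date is None:
            problems.append("Embargo date is required")
        elif embargo_date > today + MAX_EMBARGO:
            problems.append("Embargo cannot exceed two years")
        if conditions:
            problems.append("Do not provide access conditions with an embargo")
    elif status == "restricted":
        if not conditions:
            problems.append("Access conditions are required")
        if embargo_date is not None:
            problems.append("Do not provide an embargo date when restricted")
    return problems
```

An empty list means the three values are consistent with each other; each string in a non-empty list names one rule that was broken.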

The author block

Mandatory block

Author ORCID is optional and the remaining fields are mandatory.

These rows provide contact details for the authors of the data. If the dataset should be credited to more than one author, then provide sets of details in adjacent columns. If you have an ORCID, provide it here: this is a good way to help link all of your academic outputs to you!

Affiliation and email are also optional, but we would very much prefer complete author metadata (name, affiliation, email) for all authors. However, we realise that sometimes this isn't possible: if you're uploading data collected by past students who you've lost contact with, then you might not have these details for every author.

Author names must be formatted as "last name, first name": "Orme, David" not "David Orme". Please provide just the identifier part of the ORCID, as shown below, not the full URL.

Author name Orme, David
Author email
Author affiliation Imperial College London
Author ORCID 0000-0002-7005-1394
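As a rough sketch of the two formatting rules above, the check below tests for a "last name, first name" pattern and for a bare ORCID identifier. The function name `check_author` is hypothetical, and requiring exactly one comma in the name is a simplifying assumption that would reject some legitimate names.

```python
import re

# Four groups of four digits; the final character may be the checksum letter X
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def check_author(name, orcid=None):
    """Return a list of formatting problems for one author record."""
    problems = []
    # Assumption: exactly one comma separating two non-empty parts
    parts = [p.strip() for p in name.split(",")]
    if len(parts) != 2 or not all(parts):
        problems.append('Author name should be "last name, first name"')
    # The ORCID is optional, so only check it when one is given
    if orcid is not None and not ORCID_PATTERN.match(orcid):
        problems.append("ORCID should be the bare identifier, not a URL")
    return problems
```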


The authors provided here will form part of the permanent citation for the published dataset. Authorship on published datasets should be treated in the same way as you would consider authorship on papers: you should include not only the people responsible for physically collecting the data but also other researchers who facilitated the work, such as project supervisors and local collaborators.

The data worksheet block

Mandatory block

The Worksheet external file field is only required if a worksheet entry describes data held in another file. All other fields are mandatory.

Each data worksheet must be described here, but do not include the Taxa and Locations worksheets in this block. As with the authors, you can describe multiple sheets in adjacent columns.

  • The Worksheet name row must contain the label of a worksheet in the workbook: that is, the exact text shown on the worksheet tab at the bottom.
  • The Worksheet title and Worksheet description rows are free text to provide a longer title and a summary description of the contents of a given sheet.
  • You only need to use the Worksheet external file row if a data worksheet describes tabular data held in an external file. The value must then be a filename which appears in the External file block.
Worksheet name | DF | Incidence | Transects
Worksheet title | My shiny dataset | My incidence matrix | Bait trap transect lines
Worksheet description | This is a test dataset | A test dataset too | Attribute table for transect GIS
Worksheet external file |  |  | BaitTrapTransects.geojson
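The cross-references in this block (worksheet names against the workbook tabs, and external file names against the External file block) can be sketched as follows. The function name `check_worksheets` and the dictionary keys are hypothetical, and the sketch assumes the tab labels and declared filenames have already been collected.

```python
def check_worksheets(worksheets, sheet_names, external_files):
    """Cross-check data worksheet records against the workbook and files.

    worksheets: list of dicts with "name" and optional "external" keys
    sheet_names: tab labels found in the workbook
    external_files: filenames listed in the External file block
    """
    problems = []
    for ws in worksheets:
        name, external = ws["name"], ws.get("external")
        if name in ("Taxa", "Locations"):
            problems.append(f"{name} should not be listed as a data worksheet")
        elif name not in sheet_names:
            # Tab labels must match exactly, including case
            problems.append(f"No worksheet tab labelled {name!r}")
        if external and external not in external_files:
            # Data held in another file: that file must be declared
            problems.append(f"{external} is not in the External file block")
    return problems
```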

Keywords block

Mandatory block

This row allows you to enter a set of keywords for the dataset, with one keyword (or short phrase) per cell in the row. Do not use lists of keywords within cells and provide one set of keywords for the whole dataset, not one per data worksheet.

Keywords Keyword 1 Keyword 2

External files block

Optional block

You only need to provide this information if you are also providing data in other file formats. If you do provide this block, all rows are mandatory.

You can include files in other file formats in your data submission as described here. If you do so, then these files must be listed in this block: we use this information to ensure that all the correct datafiles have been uploaded to Zenodo and to provide a description in the Zenodo record.

For each file you must provide the exact filename, which must not contain spaces, and a description of the file.

External file BaitTrapTransects.geojson
External file description Zip file containing 5000 JPEG images of bait trap cards GeoJSON file containing polylines of the bait trap transects

Publication DOI block

Optional block

This block allows you to provide DOIs for publications using the data here. You can add multiple DOIs, one per cell in the row. Please format each DOI as a URL, with the resolver root (such as https://doi.org/) before the DOI, rather than in the form DOI:10.1098/rstb.2011.0049. Some alternative URL roots are also accepted.

Publication DOI
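If your reference manager exports DOIs in the "DOI:" style, a small helper can convert them to URLs before you paste them in. This is a sketch only: the function name `doi_to_url` is hypothetical, and it assumes https://doi.org/ (the standard DOI resolver) as the URL root.

```python
DOI_ROOT = "https://doi.org/"  # assumption: the standard DOI resolver root


def doi_to_url(value):
    """Convert 'DOI:10.xxxx/yyy' or a bare DOI to a URL; leave URLs alone."""
    value = value.strip()
    if value.lower().startswith("doi:"):
        value = value[4:].strip()
    # All DOIs begin with the directory indicator "10."
    if value.startswith("10."):
        return DOI_ROOT + value
    return value
```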

Funders block

Optional block

Although the funders block is optional, you should provide it in most cases, because you must give details of any funding that led to the collection of the data.

The inclusion of this information can be a condition for funders to allow data to be hosted under a common portal. Your data manager should inform you of funding organisations that require this.

Funding body and Funding type are mandatory, but please do provide a reference number and a link if possible.

safedata at the SAFE Project

For funder specific details for the SAFE project see here.

Funding details are provided by completing a block as follows and, as with Authors and Worksheets, you can use multiple columns to acknowledge more than one funder.

Funding body NERC
Funding type Standard grant
Funding reference NE/K006339/1
Funding link

Permits block

Optional block

If you provide permit details, all the fields are required.

Permits are very often required for ecological research. Use this block to record the permits used to collect this data. The permit type value must be one of research, export or ethics. Again, you can use multiple columns to record multiple permits.

Permit type Research
Permit authority Sabah Biodiversity Centre
Permit number ABC-123-456

Unknown permit numbers

With older datasets permit numbers are often no longer readily available, but it remains important to acknowledge permit authorities. While effort should be made to obtain original permit numbers, if they cannot be found "Unknown" should be entered in the "Permit number" field.
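The permit rules can be summarised in a short check. The function name `check_permit` is hypothetical, and matching the permit type without regard to case is an assumption based on the example above using "Research".

```python
PERMIT_TYPES = {"research", "export", "ethics"}


def check_permit(permit_type, authority, number):
    """Return a list of problems for one permit record."""
    problems = []
    # Assumption: the permit type is matched case insensitively
    if str(permit_type or "").strip().lower() not in PERMIT_TYPES:
        problems.append("Permit type must be research, export or ethics")
    if not authority:
        problems.append("Permit authority is required")
    if not number:
        # "Unknown" is an acceptable value, but an empty cell is not
        problems.append('Permit number is required; enter "Unknown" if lost')
    return problems
```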

Extents metadata

It is important to publish dataset metadata using a recognised standard, as this aids dataset discovery. The safedata_validator package provides tools for data managers to generate XML metadata documents following the UK GEMINI standard. These metadata documents are very broad: the contents of the file are mostly contact details and access restrictions, but do also have to include temporal and geographic extents.

Ordinarily, the dataset checking process will calculate these extents automatically from the reported locations for the geographic extent and from any date or datetime fields for the temporal extent. However, if we cannot populate these extents from the datasets, then you will have to provide additional rows in your Summary worksheet that provide extent metadata as described below.

Summary extents with location and time data

It is not an error to provide extent data in the Summary sheet when there is time and location data in the data worksheets. Location and temporal data can be incomplete, and so providing wider extents in the Summary is fine.

If you do this, you will get a warning that both the Summary and data extents exist: this is simply to check that providing both is intentional. The Summary extents must, however, be wider than the data extents.
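The "wider than" requirement amounts to a containment test, sketched below. The function name `summary_covers_data` is hypothetical, and treating equal bounds as acceptable is an assumption. The same comparison works for dates and for decimal degree coordinates.

```python
from datetime import date


def summary_covers_data(summary, data):
    """True if the (low, high) summary extent contains the data extent."""
    s_low, s_high = summary
    d_low, d_high = data
    # Assumption: equal bounds count as covering the data
    return s_low <= d_low and d_high <= s_high


# Hypothetical data extents compared against the Summary extents
dates_ok = summary_covers_data(
    (date(2015, 6, 1), date(2015, 7, 11)),
    (date(2015, 6, 10), date(2015, 7, 1)),
)
longitudes_ok = summary_covers_data((116.75, 117.82), (116.50, 117.00))
```

Here the temporal check passes, while the longitude check fails because the data extend further west than the Summary extent.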

Temporal extents

Potentially mandatory block

If temporal extents cannot be inferred from your dataset (see above) then all rows in this block are required.

The start and end date values must be provided as Excel date formatted cells.

Start Date 01/06/2015
End Date 11/07/2015

Geographic extents

Potentially mandatory block

If geographic extents cannot be inferred from your dataset (see above) then all rows in this block are required.

The geographic extents must be provided as decimal degrees (16.75), not as degrees, minutes and seconds (16° 45' 00") or degrees and decimal minutes (16° 45.00').

West 116.75
East 117.82
South 4.50
North 5.07
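If your coordinates are recorded in degrees, minutes and seconds, they need converting to decimal degrees first. The sketch below shows the arithmetic and a basic sanity check of the four bounds. Both function names are hypothetical, and requiring West to be less than East assumes the dataset does not span the 180° meridian.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0):
    """Convert degrees, minutes and seconds to decimal degrees.

    Negative degrees mean west or south. Values between 0 and -1 degrees
    cannot be expressed this way and need the sign applied separately.
    """
    sign = -1 if degrees < 0 else 1
    return sign * (abs(degrees) + minutes / 60 + seconds / 3600)


def check_bounds(west, east, south, north):
    """Return a list of problems with a geographic bounding box."""
    problems = []
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        problems.append("Longitudes must be between -180 and 180")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        problems.append("Latitudes must be between -90 and 90")
    # Assumption: the extent does not cross the 180 degree meridian
    if west > east:
        problems.append("West must not be greater than East")
    if south > north:
        problems.append("South must not be greater than North")
    return problems
```

For example, `dms_to_decimal(16, 45, 0)` gives 16.75, since 45 minutes is three quarters of a degree.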

safedata at the SAFE Project

For geographic bounds of the SAFE project see here.

Complete example summary table

Project ID 1
Title Example data for the safedata system
Description This is an example dataset.
Access status Embargo
Embargo date 03/09/18
Access conditions
Author name Orme, David
Author email
Author affiliation Imperial College London
Author ORCID 0000-0002-7005-1394
Worksheet name | DF | Incidence | Transects
Worksheet title | My shiny dataset | My incidence matrix | Bait trap transect lines
Worksheet description | This is a test dataset | A test dataset too | Attribute table for transect GIS
Worksheet external file |  |  | BaitTrapTransects.geojson
Keywords Keyword 1 Keyword 2
External file BaitTrapTransects.geojson
External file description Zip file containing 5000 JPEG images of bait trap cards GeoJSON file containing polylines of the bait trap transects
Publication DOI
Funding body NERC
Funding type Standard grant
Funding reference NE/K006339/1
Funding link
Permit type Research
Permit authority Sabah Biodiversity Centre
Permit number ABC-123-456
Start Date 01/06/2015
End Date 11/07/2015
West 116.75
East 117.82
South 4.50
North 5.07