The Summary worksheet

This worksheet contains a simple set of rows describing the dataset and identifying the spreadsheets that contain data tables. Each row is labelled on the left in the first column and then the description data should be typed in the columns to the right. The description below sets out the possible summary metadata in blocks.

Some blocks of fields are mandatory (core, authors, worksheets, keywords) but may include optional fields (such as author ORCID). Other blocks are optional, but contain fields that may be mandatory if the block is used. We've tried to make this as clear as possible below!

Some blocks allow multiple records (e.g. authors and data worksheets) with sets of values in adjacent columns but other blocks only allow a single record (e.g. core fields and geographic extents).

Naming

This sheet must be called "Summary" for dataset validation to work properly. If it is called something similar but not identical to "Summary" (e.g. "summary") validation will fail.
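
If you want to check the sheet name before submitting, the comparison can be sketched as below. This is an illustration only: `has_summary_sheet` and `near_misses` are hypothetical helpers, not part of safedata_validator, and they assume you already have the workbook's sheet names as a plain list of strings.

```python
def has_summary_sheet(sheet_names):
    """Return True only if a sheet is named exactly 'Summary' (case sensitive)."""
    return "Summary" in sheet_names


def near_misses(sheet_names):
    """List sheet names that resemble 'Summary' but would fail an exact match."""
    return [
        name
        for name in sheet_names
        if name != "Summary" and name.strip().lower() == "summary"
    ]
```

Reporting the near misses separately makes it easier to spot a sheet called "summary" or one with a stray leading space.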

The core fields block

Mandatory block

All fields are mandatory.

This block provides a set of core details for the dataset. You can only provide a single value for each field.

  • Title: This should be a short informative title for the dataset: it will be used as the public title for the dataset so make sure it is clear and grammatical!
  • Description: This will be the public description of the dataset. Note that you can have paragraphs of text within a single cell in Excel, so please do provide a reasonable summary. You will need to use Alt + Enter (or Alt + Shift + Enter on a Mac) to insert a carriage return.
| 0 | 1 |
| Title | Example data for the safedata system |
| Description | This is an example dataset |

The Project ID block

Possibly mandatory block

This block will be mandatory if the data collection you are publishing to uses projects to group datasets. If it does, then you should obtain the relevant project IDs from the organisation's data manager and add them in this block.

This simple block provides project ID codes for the dataset.

  • Project ID: Provide the integer project ID codes for the research projects that this dataset is associated with. Older datasets may use the field name SAFE Project ID, but this is deprecated.
| 0 | 1 |
| Project ID | 1 |

The access block

Mandatory block

Access status is required for all datasets. Embargo date is only required if the Access status value is 'Embargo'. Access conditions is only required if the Access status value is 'Restricted'.

This block provides the access details for the dataset. You can only provide a single value for each field.

  • Access status: You must enter an Access Status of Open, Embargo or Restricted. We prefer as much data as possible to be open access: see the discussion of data availability.
  • Embargo date: If you choose embargoed access status then you must also enter the date when the embargo will end. This must be an Excel date formatted value and your organisation will set a maximum embargo length. Do not provide access conditions: embargoed datasets become freely available when the embargo ends.
  • Access conditions: If you choose restricted access status then you must also provide text describing the access conditions. Do not provide an embargo date - restricted datasets are permanently restricted.
| 0 | 1 |
| Access status | Embargo |
| Embargo date | 14-05-2026 |
| Access conditions | |
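
The rules linking the three access fields can be sketched as a small check. This is a minimal illustration, not the safedata_validator implementation: `check_access` is a hypothetical helper, it assumes the embargo date cell has already been read as a Python date, and it assumes the status is matched without regard to case. It does not check the maximum embargo length, which is set by your organisation.

```python
from datetime import date


def check_access(status, embargo_date=None, conditions=None):
    """Return a list of problems with the access block values (hypothetical helper)."""
    problems = []
    status_key = (status or "").strip().lower()

    # Assumption: the status is compared case-insensitively.
    if status_key not in {"open", "embargo", "restricted"}:
        return ["Access status must be Open, Embargo or Restricted"]

    if status_key == "embargo":
        if not isinstance(embargo_date, date):
            problems.append("Embargo date is required and must be a date")
        if conditions:
            problems.append("Do not provide access conditions for embargoed data")
    elif status_key == "restricted":
        if not conditions:
            problems.append("Access conditions are required for restricted data")
        if embargo_date is not None:
            problems.append("Do not provide an embargo date for restricted data")

    return problems
```

For example, `check_access("Embargo", date(2026, 5, 14))` returns an empty list, matching the table above.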

The author block

Mandatory block

Author name is mandatory. Author ORCID, Author email and Author affiliation are optional.

These rows provide contact details for the authors of the data. If the dataset should be credited to more than one author, then provide sets of details in adjacent columns. If you have an ORCID, provide it here: this is a good way to help link all of your academic outputs to you!

Affiliation and email are also optional, but we would very much prefer complete author metadata (name, affiliation, email) for all authors. However, we realise that sometimes this isn't possible: if you're uploading data collected by past students who you've lost contact with, then you might not have these details for every author.

Author names must be formatted as "last name, first name": "Orme, David" not "David Orme". Please provide just the numeric part of the ORCIDs, as shown below, not the full URL http://orcid.org/0000-0002-7005-1394.

| 0 | 1 |
| Author name | Orme, David |
| Author email | d.orme@imperial.ac.uk |
| Author affiliation | Imperial College London |
| Author ORCID | 0000-0002-7005-1394 |
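
The two formatting rules above can be sketched as simple checks. The helpers below are hypothetical and are not taken from safedata_validator. The name check assumes exactly one comma with text on both sides. The ORCID check relies on the standard ORCID layout of four hyphen-separated groups of four characters, where the final character may be X.

```python
import re

# Four groups of four digits; the last character is a checksum that may be X.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def is_valid_author_name(name):
    """Check for 'last name, first name': one comma with text on both sides."""
    parts = [part.strip() for part in name.split(",")]
    return len(parts) == 2 and all(parts)


def is_bare_orcid(value):
    """Check that the value is the bare ORCID identifier, not a full URL."""
    return bool(ORCID_PATTERN.match(value.strip()))
```

With these helpers, "Orme, David" passes and "David Orme" fails, while the full ORCID URL is rejected in favour of the bare identifier.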

Important

The authors provided here will form part of the permanent citation for the published dataset. Authorship on published datasets should be treated in the same way as you would consider authorship on papers: you should include not only the people responsible for physically collecting the data but also other researchers who facilitated the work, such as project supervisors and local collaborators.

The data worksheet block

Mandatory block

The Worksheet external file field is only required if a worksheet entry describes data held in another file. All other fields are mandatory.

Each data worksheet must be described here - do not include the GBIFTaxa worksheet, any sequenced taxa worksheets or the Locations worksheet in this block. As with the authors, you can describe multiple sheets in adjacent columns.

  • The Worksheet name row must contain the label of a worksheet in the workbook: that is, the exact text shown on the worksheet tab at the bottom.
  • The Worksheet title and Worksheet description rows are free text to provide a longer title and a summary description of the contents of a given sheet.
  • You only need to use the Worksheet external file row if a data worksheet describes tabular data held in an external file. The value must then be a filename which appears in the External file block.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| Worksheet name | abundances_1 | abundances_2 | interactions_1 | interactions_2 | interactions_3 | external_1 | external_2 |
| Worksheet title | Abundances example 1 | Abundances example 2 | Interactions example 1 | Interactions example 2 | Interactions example 3 | External file example 1 | External file example 2 |
| Worksheet description | Example worksheet showing the first method for recording species abundances | Example worksheet showing the second method for recording species abundances | Example worksheet showing the first method for recording species interactions | Example worksheet showing the second method for recording species interactions | Example worksheet showing the third method for recording species interactions | Example worksheet showing the second method for including information from external files | Example worksheet showing the first method for including information from external files |
| Worksheet external file | |

Sequenced taxa sheets block

Optional block

You only need to provide this information if you are providing sequenced taxonomy sheets. If you are only providing observed taxonomy (as a GBIFTaxa sheet), or your data doesn't involve taxonomy at all, nothing needs to be provided here.

Metadata has to be provided for each sequenced taxa sheet, so that the taxonomic authority used to generate the taxonomy is recorded. This isn't required for observed taxonomy, as GBIF is the only taxonomic authority that we allow. For sequenced taxonomies we do not restrict you to a particular taxonomic authority, and so the details of the authority need to be recorded.

Each sequenced taxonomy sheet must have its metadata described - do not include metadata for the GBIFTaxa worksheet, the Locations worksheet or data worksheets in this block. As with the data worksheets, you can describe multiple sheets in adjacent columns.

  • The Sequenced taxa sheet name row must contain the label of a worksheet in the workbook: that is, the exact text shown on the worksheet tab at the bottom.
  • The Reference database name row is free text and must contain the name of the reference taxonomy database used.
  • The Reference database version row is also free text and must contain the specific version of the reference database that was used to generate the taxonomy. Sometimes these versions are provided as date stamps: in this case, you must ensure that the cell is formatted as text and not as a date.
  • You can also optionally provide a link to the taxonomic authority or reference database using the Reference database link row. If provided, this value must be a valid link (for example, starting with http or https).
| 0 | 1 | 2 |
| Sequenced taxa sheet name | Sequenced | MoreSeqData |
| Reference database name | Greengenes | SILVA |
| Reference database version | v1.7.9 | 2024-02 |
| Reference database link | |
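
As a rough guide to what a valid link looks like, the sketch below accepts only values with an http or https scheme and a host name. `is_valid_link` is a hypothetical helper built on Python's standard `urllib.parse`. The exact rule applied during validation may differ, so treat this as an assumption rather than the definitive test.

```python
from urllib.parse import urlparse


def is_valid_link(value):
    """Return True if the value has an http(s) scheme and a host name."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

A value such as `https://example.org/refdb` passes, while `www.example.org` fails because it has no scheme.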

Keywords block

Mandatory block

This row allows you to enter a set of keywords for the dataset, with one keyword (or short phrase) per cell in the row. Do not put a list of keywords in a single cell. Provide one set of keywords for the whole dataset, not one per data worksheet.

| 0 | 1 |
| Keywords | Keyword 1 |

External files block

Optional block

You only need to provide this information if you are also providing data in other file formats. If you do provide this block, all rows are mandatory.

You can include files in other file formats in your data submission. If you do so, then these files must be listed in this block: we use this information to ensure that all the correct datafiles have been uploaded to Zenodo and to provide a description in the Zenodo record.

For each file you must provide the exact filename, which must not contain spaces, and a description of the file.

| 0 | 1 | 2 | 3 |
| External file | bait_trap_images.zip | My_raster_1.tiff | My_raster_2.tiff |
| External file description | Zip file containing 5000 JPEG images of bait trap cards | A second raster file containing altitudes | A raster file containing altitudes |
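
The filename rules can be sketched as follows. `check_external_files` is a hypothetical helper, not part of safedata_validator. It flags filenames containing whitespace and, as described in the data worksheet block, any Worksheet external file value that is not listed in this block.

```python
def check_external_files(external_files, worksheet_external_files=()):
    """Return a list of problems with external file names (hypothetical helper)."""
    problems = []

    # Filenames must not contain spaces; other whitespace is flagged too.
    for filename in external_files:
        if any(char.isspace() for char in filename):
            problems.append(f"Filename contains whitespace: {filename!r}")

    # Every Worksheet external file value must appear in the External file row.
    known = set(external_files)
    for filename in worksheet_external_files:
        if filename not in known:
            problems.append(
                f"Worksheet external file not listed in External file block: {filename!r}"
            )

    return problems
```

The three filenames in the table above pass this check, whereas a name such as "my raster.tiff" would be reported.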

Publication DOI block

Optional block

This block allows you to provide DOIs for publications using the data here. You can add multiple DOIs, one per cell in the row. Please format the DOI as a URL using https://doi.org/ before the DOI, so https://doi.org/10.1098/rstb.2011.0049 not DOI:10.1098/rstb.2011.0049. We do also accept http://doi.org/, http://dx.doi.org/ and https://dx.doi.org/ as the root of the URL.

| 0 | 1 | 2 |
| Publication DOI | https://doi.org/10.1098/rstb.2011.0049 |
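
The accepted URL roots can be sketched as a simple prefix check. `normalise_doi` is a hypothetical helper, not part of safedata_validator. It accepts the four roots listed above and rewrites the value to the preferred https://doi.org/ form. It does not check that the DOI actually resolves.

```python
ACCEPTED_ROOTS = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def normalise_doi(value):
    """Return the DOI as an https://doi.org/ URL, or None if the root is not accepted."""
    value = value.strip()
    for root in ACCEPTED_ROOTS:
        # Require some DOI text after the root, not just the root itself.
        if value.startswith(root) and len(value) > len(root):
            return "https://doi.org/" + value[len(root):]
    return None
```

A value such as `DOI:10.1098/rstb.2011.0049` returns `None`, because it is not formatted as a URL.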

Funders block

Optional block

Although the funders block is optional, you should provide it in most cases, as you must provide details of any funding that led to the collection of the data.

The inclusion of this information can be a condition for funders to allow data to be hosted under a common portal. Your data manager should inform you of funding organisations that require this.

Funding body and Funding type are mandatory, but please do provide a reference number and a link if possible.

safedata at the SAFE Project

The funder specific details for the SAFE project are included in the SAFE project details.

Funding details are provided by completing a block as follows and, as with Authors and Worksheets, you can use multiple columns to acknowledge more than one funder.

| 0 | 1 |
| Funding body | NERC |
| Funding type | Standard grant |
| Funding reference | NE/K006339/1 |
| Funding link | https://gtr.ukri.org/projects?ref=NE%2FK006339%2F1 |

Permits block

Optional block

If you provide permit details, all the fields are required.

Permits are very often required for ecological research. Use this block to record the permits used to collect this data. The permit type value must be one of research, export or ethics. Again, you can use multiple columns to record multiple permits.

| 0 | 1 |
| Permit type | Research |
| Permit authority | Sabah Biodiversity Centre |
| Permit number | ABC-123-456 |

Unknown permit numbers

With older datasets, permit numbers are often no longer readily available, but it remains important to acknowledge permit authorities. You should make an effort to obtain the original permit numbers but, if they cannot be found, enter "Unknown" in the "Permit number" field.
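
The permit rules can be sketched as a single check on one permit record. `check_permit` is a hypothetical helper, not part of safedata_validator. It assumes the permit type is matched without regard to case, since the example table uses "Research".

```python
PERMIT_TYPES = {"research", "export", "ethics"}


def _text(value):
    """Return a stripped string for a cell value, or '' for an empty cell."""
    return "" if value is None else str(value).strip()


def check_permit(permit_type, authority, number):
    """Return a list of problems with one permit record (hypothetical helper)."""
    problems = []
    if _text(permit_type).lower() not in PERMIT_TYPES:
        problems.append("Permit type must be one of research, export or ethics")
    if not _text(authority):
        problems.append("Permit authority is required")
    if not _text(number):
        problems.append('Permit number is required: enter "Unknown" if it cannot be found')
    return problems
```

A record with a permit number of "Unknown" passes, because the field is filled in even though the original number has been lost.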

Extents metadata

It is important to publish dataset metadata using a recognised standard, as this aids dataset discovery. The safedata_validator package provides tools for data managers to generate XML metadata documents following the UK GEMINI standard. These metadata documents are very broad: the contents of the file are mostly contact details and access restrictions, but do also have to include temporal and geographic extents.

Ordinarily, the dataset checking process will calculate these extents automatically from the reported locations for the geographic extent and from any date or datetime fields for the temporal extent. However, if we cannot populate these extents from the datasets, then you will have to provide additional rows in your Summary worksheet that provide extent metadata as described below.

Summary extents with location and time data

It is not an error to provide extent data in the Summary sheet when there is time and location data in the data worksheets. Location and temporal data can be incomplete, and so providing wider extents in the Summary is fine.

If you do this, you will get a warning that both the Summary and data extents exist, but this is only to check that providing both is intentional. The Summary extents must, however, be wider than the data extents.
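
The comparison between Summary and data extents can be sketched as an interval check. `summary_covers_data` is a hypothetical helper, not the safedata_validator implementation. It assumes that a Summary extent identical to the data extent is acceptable, and it ignores longitude ranges that cross the 180° meridian.

```python
from datetime import date


def summary_covers_data(summary_extent, data_extent):
    """Return True if the Summary extent contains the extent found in the data.

    Both arguments are (lower, upper) pairs of comparable values, such as
    dates for temporal extents or decimal degrees for latitude or longitude.
    Assumption: an identical extent counts as covering the data.
    """
    summary_low, summary_high = summary_extent
    data_low, data_high = data_extent
    return summary_low <= data_low and data_high <= summary_high


# Illustrative values only: these dates are made up for the example.
summary_dates = (date(2015, 6, 1), date(2015, 7, 11))
data_dates = (date(2015, 6, 10), date(2015, 7, 2))
covered = summary_covers_data(summary_dates, data_dates)
```

The same function works for latitude or longitude pairs, for example `(4.5, 5.07)` for a South to North extent.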

Temporal extents

Potentially mandatory block

If temporal extents cannot be inferred from your dataset (see above) then all rows in this block are required.

The start and end date values must be provided as an Excel date formatted cell.

| 0 | 1 |
| Start date | 01-06-2015 |
| End date | 11-07-2015 |

Geographic extents

Potentially mandatory block

If geographic extents cannot be inferred from your dataset (see above) then all rows in this block are required.

The geographic extents must be provided as decimal degrees (16.75), not as degrees, minutes and seconds (16° 45' 00") or degrees and decimal minutes (16° 45.00').

| 0 | 1 |
| North | 5.07 |
| South | 4.5 |
| East | 117.82 |
| West | 116.75 |
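
If your coordinates are recorded as degrees, minutes and seconds, they need converting to decimal degrees first. The arithmetic is sketched below: `dms_to_decimal` is a hypothetical helper, not part of safedata_validator, and the `negative` flag is an assumed convention for marking southern latitudes and western longitudes.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, negative=False):
    """Convert degrees, minutes and seconds to decimal degrees.

    Set negative=True for southern latitudes or western longitudes. A separate
    flag is used so that values between 0 and -1 degrees keep their sign.
    """
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if negative or degrees < 0 else value
```

For example, 16° 45' 00" converts to 16 + 45/60 = 16.75, and the same call also handles degrees and decimal minutes by leaving the seconds at zero.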

safedata at the SAFE Project

The geographic bounds of the SAFE project are included in the SAFE project details.

Complete example summary table

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| Title | Example data for the safedata system |
| Description | This is an example dataset |
| Project ID | 1 |
| Access status | Embargo |
| Embargo date | 14-05-2026 |
| Access conditions | |
| Author name | Orme, David |
| Author email | d.orme@imperial.ac.uk |
| Author affiliation | Imperial College London |
| Author ORCID | 0000-0002-7005-1394 |
| Worksheet name | abundances_1 | abundances_2 | interactions_1 | interactions_2 | interactions_3 | external_1 | external_2 |
| Worksheet title | Abundances example 1 | Abundances example 2 | Interactions example 1 | Interactions example 2 | Interactions example 3 | External file example 1 | External file example 2 |
| Worksheet description | Example worksheet showing the first method for recording species abundances | Example worksheet showing the second method for recording species abundances | Example worksheet showing the first method for recording species interactions | Example worksheet showing the second method for recording species interactions | Example worksheet showing the third method for recording species interactions | Example worksheet showing the second method for including information from external files | Example worksheet showing the first method for including information from external files |
| Worksheet external file | |
| Keywords | Keyword 1 | Keyword 2 |
| External file | bait_trap_images.zip | My_raster_1.tiff | My_raster_2.tiff |
| External file description | Zip file containing 5000 JPEG images of bait trap cards | A second raster file containing altitudes | A raster file containing altitudes |
| Publication DOI | https://doi.org/10.1098/rstb.2011.0049 |
| Funding body | NERC |
| Funding type | Standard grant |
| Funding reference | NE/K006339/1 |
| Funding link | https://gtr.ukri.org/projects?ref=NE%2FK006339%2F1 |
| Permit type | Research |
| Permit authority | Sabah Biodiversity Centre |
| Permit number | ABC-123-456 |
| Start date | 01-06-2015 |
| End date | 11-07-2015 |
| North | 5.07 |
| South | 4.5 |
| East | 117.82 |
| West | 116.75 |
| Sequenced taxa sheet name | Sequenced | MoreSeqData |
| Reference database name | Greengenes | SILVA |
| Reference database version | v1.7.9 | 2024-02 |
| Reference database link | |