Process established for metadata inventory
Date: March 23, 2023 Author: Thaïsa van der Woude (ISRIC), Emily Toner (ISRIC)
The LSC Hubs project follows the FAIR principles for scientific data management and stewardship and has developed a process to enable project partners to describe their data systematically.
The Land Soil Crop Hubs (LSC Hubs) project follows the FAIR principles for scientific data management and stewardship. The FAIR principles provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse (FAIR) of digital assets. An important FAIR principle is describing data assets using a standard and sharing these records via a central searchable repository (i.e., a metadata catalogue). Therefore, we developed a process to enable project partners to describe their data systematically.
This process includes:
Step 1: Develop a metadata inventory template;
Step 2: Collect initial metadata with project partners;
Step 3: Obtain feedback from project partners;
Step 4: Implement the feedback in the template;
Step 5: Collect additional metadata from project partners;
Step 6: Create an Open Data Kit Form.
In the first step, a metadata inventory template was developed based on the ISO19139:2007 standard. Collecting the metadata in a standardized format allows it to be uploaded smoothly to the LSC Hub. The template presents a subset of common ISO19139:2007 metadata properties, extended with project-specific keywords. The template is an Excel file that collects the metadata properties shown in Table 1 below.
Table 1: Metadata properties based on ISO19139:2007
|Identification||Unique identification of the dataset (a UUID, URN, or URI, such as a DOI) or dataset location (a specific storage location, such as a folder on a personal computer or a shared hard disk)|
|Title||Short meaningful title|
|LSC category||Land Soil Crop domain|
|Data category||Generic data category according to AGROVOC|
|Abstract||Brief description or abstract that describes the dataset|
|Keywords||Keywords, separated by ‘;’. These are important words for the dataset, e.g., variable names, column names, key concepts, etc.|
|Authors||Person (last name(s), initials of first name(s), e.g., Jones, J.A.) or institute|
|Contact - Name||Name of the contact|
|Contact - Organization||Name of the Organization|
|Contact - Role||Role within the organization|
|Contact - Email||Email address of the contact|
|Who fills the form||Name, organization, and contact details. Fill in only if this person is different from the contact person|
|Source||Reference to another dataset that is used as a source for this dataset. List one dataset per line, giving title and date, or provide a DOI|
|Language||Language of the data and metadata. If the metadata is multilingual, multiple languages can be provided|
|Reference system||Spatial projection: drop-down list of options, including ‘unknown’ (the field can also be left empty if it is unknown)|
|Citation||References to articles that cite this dataset. List one citation per line, giving title, authors, and date, or provide a DOI|
|Paper or Report||Reference to a scientific paper or report that used this dataset as a source. Give title and date, or provide a DOI|
|Spatial resolution||Resolution (grid) or scale (vector)|
|Spatial format||Vector, Grid|
|Format||File Format in which the data is maintained or published|
|Extent (geographic)||Geographical coverage (e.g., Global, Africa, Rwanda, Ethiopia, …)|
|Extent (category)||national, county/district/province, catchment, village, plot (farm)|
|Usage constraints||Indicates if there are legal usage constraints (license); free text and/or value from list|
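As an illustration of how a row collected with this template might be checked before upload, here is a minimal Python sketch. The dictionary keys, the choice of required fields, and the sample values are assumptions made for this example; they are not taken from the project's template or scripts. Only the ‘;’ separator for keywords comes from Table 1.

```python
# Hypothetical field names; the real template columns may be named differently.
REQUIRED = ("identification", "title", "abstract")  # assumed, not a project rule


def split_keywords(raw: str) -> list[str]:
    """Split a ';'-separated keyword cell, dropping blanks and stray spaces."""
    return [part.strip() for part in raw.split(";") if part.strip()]


def missing_fields(row: dict[str, str], required=REQUIRED) -> list[str]:
    """Return the required fields that are absent or empty in a row."""
    return [name for name in required if not row.get(name, "").strip()]


# Made-up example row, for illustration only.
row = {
    "identification": "urn:example:dataset-1",
    "title": "Example soil profile dataset",
    "abstract": "",
    "keywords": "soil; pH ;organic carbon;",
}

keywords = split_keywords(row["keywords"])
missing = missing_fields(row)
```

A check like this could flag incomplete rows so they can be returned to the data owner before any conversion takes place.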
After the template was developed, the second step was for the project partners to fill in initial metadata. The template was made available through a central folder with shared editing capability. In the third step, project partners, including the Kenya Agricultural & Livestock Research Organization (KALRO), the Rwanda Agriculture and Animal Resources Development Board (RAB), ICRAF World Agroforestry, and the International Union for Conservation of Nature (IUCN), provided feedback on the metadata inventory template, which was then used to improve it. Our project partners are currently adding further metadata to the newest version of the template.
In addition, our project partners reported that the Excel file cannot be easily distributed to other stakeholders inside or outside their organization. Therefore, we developed an Open Data Kit (ODK) form. ODK is an open-source tool that enables users to fill out forms and submit them to a central database. The metadata can then be automatically retrieved from the database and ingested into the catalogue. The initial ODK form is shown below.
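Separately from the project's own form, which is not reproduced in this sketch, the following Python example illustrates how a few template fields could be expressed in XLSForm, the spreadsheet convention commonly used to author ODK forms. Whether the project authored its form this way is not stated, so this is an assumption. The question names and the `spatial_format` choice list are hypothetical; the column names (`type`, `name`, `label`, `required`, `list_name`) follow the XLSForm convention.

```python
import csv
import io

# Rows for the XLSForm "survey" sheet (hypothetical questions).
survey = [
    {"type": "text", "name": "title", "label": "Title", "required": "yes"},
    {"type": "text", "name": "abstract", "label": "Abstract", "required": "yes"},
    {"type": "select_one spatial_format", "name": "spatial_format",
     "label": "Spatial format", "required": ""},
]

# Rows for the XLSForm "choices" sheet, using the values listed in Table 1.
choices = [
    {"list_name": "spatial_format", "name": "vector", "label": "Vector"},
    {"list_name": "spatial_format", "name": "grid", "label": "Grid"},
]


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize sheet rows as CSV text, one header line plus one line per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def undefined_lists(survey_rows, choice_rows) -> list[str]:
    """Return choice lists used by select_one questions but never defined."""
    defined = {c["list_name"] for c in choice_rows}
    used = {q["type"].split()[1] for q in survey_rows
            if q["type"].startswith("select_one ")}
    return sorted(used - defined)


survey_csv = to_csv(survey)
choices_csv = to_csv(choices)
problems = undefined_lists(survey, choices)
```

An actual XLSForm is an Excel workbook with `survey` and `choices` sheets; the CSV text here only stands in for those sheets to keep the example dependency-free.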
Once the metadata is collected, it will be transformed into ISO19139 records using Python scripts. Relevant and complete metadata records will be published in the data catalogues of the Hubs once these become operational. We are testing this process with an initial catalogue. The next step is to discuss which kind of catalogue each country prefers and to support the countries in implementing it.
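The project's transformation scripts are not shown here. As a rough sketch of what such a conversion could look like, the following Python example builds a partial ISO 19139-style XML record from one metadata row using only the standard library. The input keys and sample values are hypothetical, and the output covers only the identifier, title, abstract, and keywords; a schema-valid record needs further mandatory elements (contact, date stamp, and others) that are omitted here.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 XML encodings.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def add_text(parent, tag, value):
    """Append <gmd:tag><gco:CharacterString>value</...></gmd:tag> to parent."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = value
    return wrapper


def to_iso19139(row):
    """Build a partial metadata record from a row (hypothetical keys)."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    add_text(root, "fileIdentifier", row["identification"])

    info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    ident = ET.SubElement(info, f"{{{GMD}}}MD_DataIdentification")
    citation_wrapper = ET.SubElement(ident, f"{{{GMD}}}citation")
    citation = ET.SubElement(citation_wrapper, f"{{{GMD}}}CI_Citation")
    add_text(citation, "title", row["title"])
    add_text(ident, "abstract", row["abstract"])

    # Keywords arrive as one ';'-separated string, as in the template.
    keywords = [k.strip() for k in row.get("keywords", "").split(";") if k.strip()]
    if keywords:
        kw_wrapper = ET.SubElement(ident, f"{{{GMD}}}descriptiveKeywords")
        md_keywords = ET.SubElement(kw_wrapper, f"{{{GMD}}}MD_Keywords")
        for keyword in keywords:
            add_text(md_keywords, "keyword", keyword)
    return root


# Made-up example row, for illustration only.
record = to_iso19139({
    "identification": "urn:example:dataset-1",
    "title": "Example soil profile dataset",
    "abstract": "Illustrative abstract text.",
    "keywords": "soil; organic carbon",
})
xml_text = ET.tostring(record, encoding="unicode")
```

In practice, the resulting XML would be validated against the ISO 19139 schemas before being ingested into a catalogue.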
More about this project:
LSC Hubs is a four-year project (2021-2024) supported through funding from the European Union’s Development of Smart Innovation through Research in Agriculture (DeSIRA) program, the Dutch Ministry of Foreign Affairs, and a contribution from ISRIC - World Soil Information. The project’s objective is to develop sustainable land, soil, and crop information hubs in national agricultural research organizations in East Africa. Ethiopia, Kenya, and Rwanda will host the information hubs to enhance the effectiveness of national Agricultural Knowledge and Innovation Systems (AKIS) and contribute to rural transformation and climate-smart agriculture.