Convenience function to record dataset metadata — dataset

This function creates a metadata object that is then used to interact with the API.

Usage

dataset_metadata(
  title = NULL,
  name = NULL,
  short_title = NULL,
  notes = NULL,
  tag_string = NULL,
  url = NULL,
  owner_org = NULL,
  geographies = "UNSPECIFIED",
  private = NULL,
  visibility = NULL,
  external_access_level = NULL,
  data_sensitivity = NULL,
  original_id = NULL,
  data_collector = NULL,
  date_range_start = NULL,
  date_range_end = NULL,
  keywords = NULL,
  unit_of_measurement = NULL,
  sampling_procedure = NULL,
  operational_purpose_of_data = NULL,
  `hxl-ated` = NULL,
  process_status = NULL,
  identifiability = NULL,
  geog_coverage = NULL,
  data_collection_technique = NULL,
  linked_datasets = NULL,
  archived = NULL,
  admin_notes = NULL,
  sampling_procedure_notes = NULL,
  response_rate_notes = NULL,
  data_collection_notes = NULL,
  weight_notes = NULL,
  clean_ops_notes = NULL,
  data_accs_notes = NULL,
  ddi = NULL,
  ...
)

Arguments

title: Title(*) - Make sure to include: 'Survey name/title', 'Location', 'Country', and 'Year(s)' in the order indicated.
name: URL(*) - The canonical name of the dataset, e.g. my-dataset.
short_title: Short title - e.g. Short title for the project.
notes: Description(*) - Some useful notes about the data. Please include the number of observations.
tag_string: Tags - e.g. economy, mental health, government.
url: Project URL - Website URL associated with this data project (if applicable).
owner_org: Data container(*) - Use the canonical name of the container (i.e. all lower case), for instance "americas", not "Americas". If you do not use the right container, you will receive an error. The id of the container can also be used.
geographies: Geographies - defaults to "UNSPECIFIED". Values are pulled from a GeoServer web service.
private: Visibility (Private/Public).
visibility: Internal Access Level(*). Allowed values: restricted (Private), public (Internally Visible).
external_access_level: External access level(*). Allowed values: not_available (Not available), direct_access (Direct access), public_use (Public use), licensed_use (Licensed use), data_enclave (Data enclave), open_access (Open access).
data_sensitivity: Data sensitivity - Applies to both Anonymized and Personally identifiable data. Allowed values: yes (Yes), no (No).
original_id: Original ID - If the dataset already has an ID from the source org, DDI, etc.
data_collector: Data Collector(*) - Which organization owns / collected the data. Multiple values are allowed.
date_range_start: Date collection first date - Use dd/mm/yyyy format.
date_range_end: Date collection last date - Use dd/mm/yyyy format.
keywords: Topic classifications(*) - Tags useful for searching for the datasets. Multiple values are allowed. See keywords.
unit_of_measurement: Unit of measurement(*) - Unit of measurement / observation for the dataset.
sampling_procedure: Sampling Procedure. Multiple values are allowed. Allowed values: total_universe_complete_enumeration (Total universe/Complete enumeration), probability_simple_random (Probability: Simple random), probability_systematic_random (Probability: Systematic random), probability_stratified (Probability: Stratified), probability_stratified_proportional (Probability: Stratified: Proportional), probability_stratified_disproportional (Probability: Stratified: Disproportional), probability_cluster (Probability: Cluster), probability_cluster_simple_random (Probability: Cluster: Simple random), probability_cluster_stratified_random (Probability: Cluster: Stratified random), probability_multistage (Probability: Multistage), nonprobability (Non-probability), nonprobability_availability (Non-probability: Availability), nonprobability_purposive (Non-probability: Purposive), nonprobability_quota (Non-probability: Quota), nonprobability_respondentassisted (Non-probability: Respondent-assisted), mixed_probability_nonprobability (Mixed probability and non-probability), other_other (Use if the sampling procedure is known, but not found in the list).
operational_purpose_of_data: Operational purpose of data - Classification of the type of data contained in the file. Multiple values are allowed. Allowed values: participatory_assessments (Participatory assessments), baseline_household_survey (Baseline Household Survey), rapid_needs_assessment (Rapid Needs Assessment), protection_monitoring (Protection Monitoring), programme_monitoring (Programme monitoring), population_data (Population Data), cartography (Cartography, Infrastructure & GIS).
process_status: Dataset Process Status. Allowed values: raw (Raw-Uncleaned), cleaned (Cleaned Only), anonymized (Cleaned & Anonymized).
identifiability: Identifiability. Allowed values: personally_identifiable (Personally identifiable), anonymized_enclave (Anonymized 1st level: Data Enclave - only removed direct identifiers), anonymized_scientific (Anonymized 2nd level: Scientific Use File (SUF)), anonymized_public (Anonymized 3rd level: Public Use File (PUF)).
geog_coverage: Geographic Coverage - e.g. National coverage, or name of the area, etc.
data_collection_technique: Data collection technique(*). Allowed values: nf (Not specified), f2f (Face-to-face interview), capi (Face-to-face interview: Computerised), cami (Face-to-face interview: Mobile), papi (Face-to-face interview: Paper-and-pencil), tri (Telephone interview), eri (E-mail interview), wri (Web-based interview: audio-visual technology enabling the interviewer(s) and interviewee(s) to communicate in real time), easi (Self-administered questionnaire: E-mail), pasi (Self-administered questionnaire: Paper), sasi (Self-administered questionnaire: SMS/MMS), casi (Self-administered questionnaire: Computer-assisted), cawi (Self-administered questionnaire: Web-based), foc (Face-to-face focus group), tfoc (Telephone focus group), obs (Observation), oth (Other).
linked_datasets: Linked Datasets - Links to other RIDL datasets. It supports multiple selections.
archived: Archived(*) - Allows users to indicate if the dataset is archived or active. Allowed values: False (No), True (Yes).
admin_notes: Admin Notes - General. You can use Markdown formatting here.
sampling_procedure_notes: Admin Notes - Sampling Procedure. You can use Markdown formatting here.
response_rate_notes: Admin Notes - Response Rate. You can use Markdown formatting here.
data_collection_notes: Admin Notes - Data Collection. You can use Markdown formatting here.
weight_notes: Admin Notes - Weighting. You can use Markdown formatting here.
clean_ops_notes: Admin Notes - Cleaning. You can use Markdown formatting here.
data_accs_notes: Admin Notes - Access authority. You can use Markdown formatting here.
ddi: DDI.
...: ignored.
`hxl-ated`: HXL-ated. Allowed values: False (No), True (Yes).
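
Several of the arguments above expect a specific string encoding. As a minimal sketch in base R (it does not call the package), the snippet below formats Date values into the dd/mm/yyyy form expected by date_range_start and date_range_end, and checks a value against the allowed values listed for process_status. The helper check_allowed() is hypothetical and is not part of the package.

```r
# Format R Date objects as dd/mm/yyyy strings,
# the form documented for date_range_start and date_range_end.
start <- as.Date("2021-03-01")
end   <- as.Date("2021-04-15")
date_range_start <- format(start, "%d/%m/%Y")
date_range_end   <- format(end, "%d/%m/%Y")

# Hypothetical helper (not part of the package): stop early if a value
# is not among the allowed values documented for an argument.
check_allowed <- function(value, allowed) {
  bad <- setdiff(value, allowed)
  if (length(bad) > 0) {
    stop("Value(s) not allowed: ", paste(bad, collapse = ", "))
  }
  invisible(value)
}

# Allowed values copied from the process_status entry above.
process_status_values <- c("raw", "cleaned", "anonymized")
check_allowed("cleaned", process_status_values)
```

The resulting strings can then be passed as date_range_start and date_range_end. Checking values locally like this is only a convenience; the allowed values listed in this page remain the reference.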

Value

A list with the provided metadata.

Details

All arguments are of type character. Fields tag_string, data_collector, keywords, sampling_procedure, and operational_purpose_of_data accept vectors of multiple values.

Fields marked with a (*) are required for dataset_create() and dataset_update() operations.
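
To illustrate the two points above, here is a small sketch in base R that lists which required fields are still missing from a metadata list before it is sent to the API. The helper missing_required() is hypothetical and not part of the package, and the plain list m only stands in for the value returned by dataset_metadata().

```r
# Names of the arguments marked with (*) in the Arguments section.
required_fields <- c(
  "title", "name", "notes", "owner_org", "visibility",
  "external_access_level", "data_collector", "keywords",
  "unit_of_measurement", "data_collection_technique", "archived"
)

# Hypothetical helper (not part of the package): return the required
# fields that are absent or NULL in a metadata list.
missing_required <- function(metadata) {
  present <- names(Filter(Negate(is.null), metadata))
  setdiff(required_fields, present)
}

# A plain list standing in for the value returned by dataset_metadata().
m <- list(
  title = "Motor Trend Car Road Tests",
  name = "mtcars",
  # Multi-value fields take a character vector.
  data_collector = c("Motor Trend", "Another organization")
)

missing_required(m)
```

An empty result would mean that every field marked with a (*) has been provided.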

Examples

m <- dataset_metadata(title = "Motor Trend Car Road Tests",
                      name = "mtcars",
                      notes = "The data was extracted from the 1974 Motor Trend 
                      US magazine, and comprises fuel consumption and 10 aspects
                      of automobile design and performance for 32 automobiles 
                      (1973–74 models).",
                      owner_org = "americas",
                      visibility = "public",
                      geographies = "UNSPECIFIED",
                      external_access_level = "open_access",
                      data_collector = "Motor Trend",
                      keywords = keywords[c("Environment", "Other")],
                      unit_of_measurement = "car",
                      data_collection_technique = "oth",
                      archived = "False")

m
#> $title
#> [1] "Motor Trend Car Road Tests"
#> 
#> $name
#> [1] "mtcars"
#> 
#> $notes
#> [1] "The data was extracted from the 1974 Motor Trend \n                      US magazine, and comprises fuel consumption and 10 aspects\n                      of automobile design and performance for 32 automobiles \n                      (1973–74 models)."
#> 
#> $owner_org
#> [1] "americas"
#> 
#> $visibility
#> [1] "public"
#> 
#> $external_access_level
#> [1] "open_access"
#> 
#> $data_collector
#> [1] "Motor Trend"
#> 
#> $keywords
#> Environment       Other 
#>        "11"        "54" 
#> 
#> $unit_of_measurement
#> [1] "car"
#> 
#> $geographies
#> [1] "UNSPECIFIED"
#> 
#> $data_collection_technique
#> [1] "oth"
#> 
#> $archived
#> [1] "False"
#>
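
Note that the notes value printed above keeps the line breaks and indentation of the source string literal. If that is not wanted, one option is to assemble the description with paste(). This is a sketch in base R, not something the package requires.

```r
# Build the description from separate pieces; paste() joins them
# with single spaces, so no "\n" or indentation ends up in the string.
notes <- paste(
  "The data was extracted from the 1974 Motor Trend US magazine,",
  "and comprises fuel consumption and 10 aspects of automobile design",
  "and performance for 32 automobiles (1973–74 models)."
)

grepl("\n", notes, fixed = TRUE)
#> [1] FALSE
```

The resulting notes string can be passed to dataset_metadata() in place of the multi-line literal.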