This function checks that the duration of each interview, relative to the household size, exceeds a specified threshold. Optionally, surveys that fall under the threshold can be automatically marked for deletion. Warning: if there are uncorrected mistakes in the survey dates, the survey length may end up expressed in seconds, and this check will not perform well.
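The idea behind the check can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data frame and its column names (`id`, `start`, `end`, `hh_size`) are hypothetical, and the rule applied (minutes per household member below the threshold) is inferred from the description above. Passing `units = "mins"` to `difftime()` fixes the unit explicitly rather than leaving the choice to R.

```r
# Hypothetical toy data; column names are illustrative, not from the package.
toy <- data.frame(
  id      = c("a", "b", "c"),
  start   = as.POSIXct(c("2024-01-01 09:00:00",
                         "2024-01-01 10:00:00",
                         "2024-01-01 11:00:00"), tz = "UTC"),
  end     = as.POSIXct(c("2024-01-01 09:46:00",
                         "2024-01-01 11:30:00",
                         "2024-01-01 11:31:00"), tz = "UTC"),
  hh_size = c(7, 8, 5),
  stringsAsFactors = FALSE
)

# Assumed threshold: minimum acceptable minutes per household member.
threshold <- 10

# Survey length in minutes, with the unit stated explicitly.
toy$survey_length <- as.numeric(difftime(toy$end, toy$start, units = "mins"))

# Duration relative to the household size.
toy$minutes_per_member <- toy$survey_length / toy$hh_size

# Interviews that are too short for their household size.
flagged <- toy[toy$minutes_per_member < threshold,
               c("id", "hh_size", "survey_length")]
print(flagged)
```

With these made-up values, interviews "a" (46 minutes for 7 members) and "c" (31 minutes for 5 members) fall under 10 minutes per member, while "b" (90 minutes for 8 members) does not.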

Usage

isInterviewTooShortForTheHouseholdSize(
  ds = NULL,
  surveyConsent = NULL,
  dates = NULL,
  householdSize = NULL,
  minimumSurveyDurationByIndividual = 10,
  reportingColumns = c(enumeratorID, uniquerespondantID),
  deleteIsInterviewTooShortForTheHouseholdSize = FALSE
)

Arguments

ds

dataset containing the survey (from kobo): labelled data.frame

surveyConsent

name of the field in the dataset where the survey consent is stored: string

dates

names of the fields where the start and end dates of the survey are stored: list of strings (c('start_date','end_date'))

householdSize

name of the field in the dataset where the household size is stored: string

minimumSurveyDurationByIndividual

minimum acceptable survey duration for one individual in minutes: integer

reportingColumns

(Optional, by default it is built from the enumeratorID and the uniquerespondantID) name of the columns from the dataset you want in the result: list of string (c('col1','col2',...))

deleteIsInterviewTooShortForTheHouseholdSize

(Optional, by default set to FALSE) if TRUE, the surveys in error will be marked as 'deletedIsInterviewTooShortForTheHouseholdSize': boolean (TRUE/FALSE)

checkperiod

if not NULL, the number of days before today for which the check should be made

consentForValidSurvey

value defined in the kobo form to acknowledge that the surveyed person gave their consent: string

uniquerespondantID

name of the field where the survey unique ID is stored: string

enumeratorID

name of the field where the enumerator ID is stored: string

Value

result: a list that includes:
* dst: the same dataset as the input, with surveys marked for deletion if errors are found and delete=TRUE (or NULL)
* ret_log: list of the errors found (or NULL)
* var: a list of values (or NULL)
* graph: graphical representation of the results (or NULL)

Examples

load(system.file("sample_dataset.RData", package = "HighFrequencyChecks")) 
ds <- sample_dataset
surveyConsent <- "survey_consent"
dates <- c("survey_start","end_survey")
householdSize <- "consent_received.respondent_info.hh_size"
uniquerespondantID <- "X_uuid"
enumeratorID <- "enumerator_id"
minimumSurveyDurationByIndividual <- 10
reportingColumns <- c(enumeratorID, uniquerespondantID)

result <- isInterviewTooShortForTheHouseholdSize(
  ds = ds,
  surveyConsent = surveyConsent,
  dates = dates,
  householdSize = householdSize,
  minimumSurveyDurationByIndividual = minimumSurveyDurationByIndividual,
  reportingColumns = reportingColumns,
  deleteIsInterviewTooShortForTheHouseholdSize = FALSE
)
knitr::kable(head(result[["ret_log"]], 10))
#> 
#> 
#> |   | enumerator_id|X_uuid                               | HHSize| SurveyLength|
#> |:--|-------------:|:------------------------------------|------:|------------:|
#> |2  |         10052|3c34b46b-96c4-46aa-9d4f-4147527f9042 |      7|     46.09367|
#> |6  |            83|225f1521-c75d-4a4e-b394-0f02ac8e9d8a |      8|     52.15687|
#> |9  |            43|a98f85bc-0752-437d-a513-cc729804c303 |      8|     52.57200|
#> |13 |            45|3f0d915b-7fa6-4262-8783-8d2c477af2e3 |      6|     53.24095|
#> |14 |         30022|93ea674d-29c4-4750-bb48-ce2125a55094 |      7|     58.11412|
#> |16 |            16|67324a24-765c-4ed7-bff0-5c739643b21b |     11|     52.47767|
#> |22 |            26|16016bc3-cec1-410b-a7e5-e8b34a8c4c00 |      9|     41.98670|
#> |23 |            80|0d7c18ad-4606-484b-bbe9-519b7e9eb4db |      5|     31.28085|
#> |24 |         30022|58a431d8-af01-452f-a90f-eba8a7732089 |      7|     39.70487|
#> |25 |         30022|d00e83a0-70f9-4fdd-98e6-efead831ce56 |      7|     65.95892|
print(result[["graph"]])