
Aggregates flows into donor-recipient-year observations and prepares features for an opportunity prediction model. Features include:

  • donor funding cycle timing (mean decision month)

  • historical funding patterns (growth in recent years)

  • simple NLP signals from description/keywords (presence of crisis keywords)

  • sector funding trends (share to global clusters)
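The aggregation step can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data and the column names (`donor`, `recipient`, `year`, `month`, `amount`) are assumptions made for illustration, and only two of the listed features (mean decision month and year-over-year growth) are shown.

```r
# Toy flows; all column names here are hypothetical, not the package's schema
toy_flows <- data.frame(
  donor     = c("A", "A", "A", "A", "B", "B"),
  recipient = c("X", "X", "X", "X", "X", "X"),
  year      = c(2021, 2021, 2022, 2022, 2021, 2022),
  month     = c(3, 5, 4, 6, 11, 12),
  amount    = c(100, 50, 200, 100, 80, 40),
  stringsAsFactors = FALSE
)

# One row per donor-recipient-year: total amount and mean decision month
totals <- aggregate(amount ~ donor + recipient + year, data = toy_flows, FUN = sum)
timing <- aggregate(month ~ donor + recipient + year, data = toy_flows, FUN = mean)
dry <- merge(totals, timing, by = c("donor", "recipient", "year"))
names(dry)[names(dry) == "month"] <- "mean_decision_month"

# Year-over-year growth within each donor-recipient pair (NA for the first year)
dry <- dry[order(dry$donor, dry$recipient, dry$year), ]
pair <- paste(dry$donor, dry$recipient)
dry$growth <- ave(dry$amount, pair,
                  FUN = function(x) c(NA, diff(x) / head(x, -1)))
```

Here growth is measured against the previous year only; a trend over several `lookback_years` would need a longer window.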

Usage

analysis_prepare_opportunity_dataset(
  flows,
  lookback_years = 3,
  crisis_keywords = c("refugees", "refugee", "displacement", "displaced", "returnees",
    "idps", "protection", "conflict", "vulnerable"),
  donor_name = NULL,
  recipient_name = NULL
)

Arguments

flows

Data frame of funding flows to aggregate.

lookback_years

Integer number of past years over which trends are computed (default 3).

crisis_keywords

Character vector of keywords used to flag crisis-related global events in the flow description/keywords (defaults to common displacement and protection terms).
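As a rough illustration of a keyword-presence flag, the sketch below matches the default terms against free-text descriptions. The matching rules (whole words, case-insensitive) and the `descriptions` vector are assumptions; the function's actual matching strategy is not documented here.

```r
crisis_keywords <- c("refugees", "refugee", "displacement", "displaced",
                     "returnees", "idps", "protection", "conflict", "vulnerable")

# Hypothetical flow descriptions, for illustration only
descriptions <- c("Support to IDPs and host communities",
                  "Road rehabilitation programme",
                  "Cash assistance for displaced families")

# Assumed rule: case-insensitive, whole-word match on any keyword
pattern <- paste0("\\b(", paste(crisis_keywords, collapse = "|"), ")\\b")
has_crisis_keyword <- grepl(pattern, descriptions, ignore.case = TRUE, perl = TRUE)
```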

donor_name

Optional donor name to highlight in the plot.

recipient_name

Optional recipient name to highlight in the plot.

Value

A list containing a tibble with one row per donor, recipient, and year plus the computed features, and a plot (accessible as `$plot`).

Details

The tibble in the returned list is a tidy data frame suitable for model training.

Examples

crisis_keywords <- c("refugees", "refugee", "displacement", "displaced",
                     "returnees", "idps",
                     "protection", "conflict", "vulnerable")

result <- analysis_prepare_opportunity_dataset(
  flows,
  lookback_years = 3,
  crisis_keywords = crisis_keywords,
  donor_name = "Germany"
)
print(result$plot)
#> Warning: log-10 transformation introduced infinite values.
#> Warning: Removed 6559 rows containing missing values or values outside the scale range
#> (`geom_point()`).