The Universal Summarizer endpoint is often where users need both quality and resilience.
In practice, summarization workloads are mixed: some inputs are ideal, some are too short, and some fail upstream for reasons outside your script. This guide shows how to keep pipelines reliable while preserving useful output.
library(kagiPro)
conn <- kagi_connection(
  api_key = function() keyring::key_get("API_kagi")
)

kagiPro supports both URL-based and raw-text summarization.
q_url <- query_summarize(
  url = "https://www.example.com/long-article",
  engine = "muriel",
  summary_type = "summary",
  target_language = "EN",
  cache = TRUE
)

q_text <- query_summarize(
  text = paste(
    "Biodiversity underpins ecosystem services such as pollination, soil fertility,",
    "water purification, and climate regulation.",
    "Habitat loss and climate pressure accelerate species decline with consequences",
    "for resilience and human wellbeing."
  ),
  engine = "cecil",
  summary_type = "takeaway",
  target_language = "EN",
  cache = TRUE
)

Both calls return named lists, so execution stays consistent regardless of input style.
out_sum <- "summarize_results"
dir.create(out_sum, recursive = TRUE, showWarnings = FALSE)
kagi_request(
  connection = conn,
  query = q_text[[1]],
  output = out_sum,
  overwrite = TRUE
)

This is the baseline path for standard inputs.
Very short text can trigger server-side failures (for example, minimum document size constraints). If you do not want one bad input to stop the whole pipeline, use graceful mode by setting error_mode = "write_dummy".
q_short <- query_summarize(
  text = "Too short.",
  engine = "cecil",
  summary_type = "summary",
  target_language = "EN"
)
kagi_request(
  connection = conn,
  query = q_short[[1]],
  output = "summarize_short_safe",
  overwrite = TRUE,
  error_mode = "write_dummy"
)

In this mode, the request warns and writes a dummy JSON payload in which the summarize fields are present but empty (data$output = null, data$tokens = 0).
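Short inputs can also be caught locally, before any request is made. The sketch below uses base R only. `split_by_length()` and `min_chars` are hypothetical names, and the 200-character threshold is an illustrative assumption, not a documented API limit.

```r
# Hypothetical helper: partition a named character vector of texts by length.
# The threshold is an assumption for illustration; the real server-side
# minimum is not stated in this guide.
split_by_length <- function(texts, min_chars = 200L) {
  stopifnot(is.character(texts))
  long_enough <- nchar(trimws(texts)) >= min_chars
  list(
    send = texts[long_enough],   # candidates for query_summarize()
    skip = texts[!long_enough]   # too short; handle or log separately
  )
}

texts <- c(
  ok  = paste(rep("Long summarize input text.", 40), collapse = " "),
  err = "short"
)
parts <- split_by_length(texts)
names(parts$send)
names(parts$skip)
```

A local check like this reduces avoidable failures but does not replace error_mode = "write_dummy", since requests can still fail upstream for unrelated reasons.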
A realistic batch often contains both valid and invalid inputs.
q_ok <- query_summarize(
  text = paste(rep("Long summarize input text.", 40), collapse = " "),
  engine = "cecil",
  summary_type = "summary",
  target_language = "EN"
)
q_err <- query_summarize(
  text = "short",
  engine = "cecil",
  summary_type = "summary",
  target_language = "EN"
)
kagi_request(
  connection = conn,
  query = list(ok = q_ok[[1]], err = q_err[[1]]),
  output = "summarize_mixed",
  overwrite = TRUE,
  workers = 1,
  error_mode = "write_dummy"
)

This keeps the successful summary while recording a structured placeholder for the failed item.
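After a mixed run, it can help to list which results are placeholders. The following is a minimal sketch, assuming the jsonlite package is available and that each result is a JSON file with a top-level `data` object holding `output` and `tokens`, as described for the dummy payload. The file layout inside the output directory, and the helper names `is_dummy_payload()` and `find_dummy_results()`, are assumptions rather than part of kagiPro. The demo writes two hand-made files to a temporary directory instead of calling the API.

```r
library(jsonlite)

# Assumption: a payload counts as a placeholder when data$output is
# null or empty and data$tokens is 0 (or missing).
is_dummy_payload <- function(path) {
  payload <- jsonlite::fromJSON(path, simplifyVector = FALSE)
  output <- payload$data$output
  tokens <- payload$data$tokens
  no_output <- is.null(output) || !nzchar(paste(unlist(output), collapse = ""))
  no_tokens <- is.null(tokens) || isTRUE(tokens == 0)
  no_output && no_tokens
}

# Hypothetical helper: flag every JSON file found under a results directory.
find_dummy_results <- function(dir) {
  files <- list.files(dir, pattern = "\\.json$", recursive = TRUE,
                      full.names = TRUE)
  data.frame(
    file = files,
    is_dummy = vapply(files, is_dummy_payload, logical(1), USE.NAMES = FALSE)
  )
}

# Self-contained demo: two hand-written files stand in for real output.
demo_dir <- file.path(tempdir(), "summarize_demo")
dir.create(demo_dir, recursive = TRUE, showWarnings = FALSE)
writeLines('{"data": {"output": "A short summary.", "tokens": 42}}',
           file.path(demo_dir, "ok.json"))
writeLines('{"data": {"output": null, "tokens": 0}}',
           file.path(demo_dir, "err.json"))

status <- find_dummy_results(demo_dir)
status$file[status$is_dummy]
```

The flagged files identify which inputs to fix or re-run; the remaining files hold usable summaries.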
kagi_request_parquet(
  input_json = "summarize_mixed",
  output = "summarize_mixed_parquet",
  overwrite = TRUE
)

Parquet conversion allows downstream analysis while retaining run-level consistency.
Use workers = 1 while diagnosing warnings, then scale up.
Use error_mode = "write_dummy" for long unattended jobs.
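While diagnosing a sequential run, it can be useful to collect warning messages into a vector instead of letting them scroll past. The sketch below shows the general base R pattern with `withCallingHandlers()`. `fake_request()` is a hypothetical stand-in that warns on short input, and `run_collecting_warnings()` is an illustrative helper; whether kagi_request() warnings can be captured the same way is an assumption based on the statement above that the request warns.

```r
# Hypothetical stand-in for a request that warns instead of failing.
fake_request <- function(name, text) {
  if (nchar(text) < 50) {
    warning(sprintf("item '%s': input too short", name))
  }
  invisible(name)
}

# Run a function over named inputs, recording warnings without stopping.
run_collecting_warnings <- function(inputs, fun) {
  collected <- character(0)
  for (name in names(inputs)) {
    withCallingHandlers(
      fun(name, inputs[[name]]),
      warning = function(w) {
        # Store the message, then suppress the default warning output.
        collected <<- c(collected, conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    )
  }
  collected
}

inputs <- list(
  ok  = strrep("Long summarize input text. ", 40),
  err = "short"
)
msgs <- run_collecting_warnings(inputs, fake_request)
msgs
```

Collected messages can be written to a log alongside the results, so the cause of each placeholder is still known after the job finishes.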