This repo is a topic-focused CTI article acquisition and export front-end for CTIKG / LLM4CTI-style downstream workflows.
It is built to:
It is not:
README.md
docs/OPEN_TOPIC_QUICKSTART.md
docs/SOL_RUNBOOK.md
docs/OUTPUTS_CONTRACT.md
docs/PIPELINE.md
docs/LLM4CTI_NOTEBOOK_BRIDGE.md
docs/LLM4CTI_COMPAT_TEST.md
docs/TROUBLESHOOTING.md
Run:
make open-topic TOPIC="..." PROVIDER=... MODEL=... SCRAPE_MAX=...
Then create notebook handoff artifacts explicitly:
python scripts/export_llm4cti_articles.py --run-dir runs/<SAFE_TOPIC>/<RUN_ID>
Use the staged pattern in docs/SOL_RUNBOOK.md:
topic.yaml once on the login node
Immediate per-run outputs:
runs/<SAFE_TOPIC>/<RUN_ID>/scrape/scraped_corpus.jsonl
runs/<SAFE_TOPIC>/<RUN_ID>/exports/ctikg_input.csv
runs/<SAFE_TOPIC>/<RUN_ID>/data/ctikg_docs_meta.json
runs/<SAFE_TOPIC>/<RUN_ID>/scrape/scrape_log.csv
runs/<SAFE_TOPIC>/<RUN_ID>/manifest.json
Manual post-run notebook handoff:
runs/<SAFE_TOPIC>/<RUN_ID>/llm4cti/Articles.xlsx
runs/<SAFE_TOPIC>/<RUN_ID>/llm4cti/llm4cti_articles.csv
runs/<SAFE_TOPIC>/<RUN_ID>/llm4cti/llm4cti_articles_meta.json
docs/QUICKSTART.md documents the older shared-path workflow under
data/, results/, and exports/.
Prefer the open-topic per-run workflow unless you specifically need the legacy path.
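To confirm that a run directory matches the open-topic per-run layout, a small check along these lines may help. This is a minimal sketch, not part of the repo: the relative paths are copied from the output lists in this document, while the names `IMMEDIATE_OUTPUTS`, `HANDOFF_OUTPUTS`, `missing_outputs`, and `report` are hypothetical helpers introduced only for illustration.

```python
from pathlib import Path

# Relative paths under runs/<SAFE_TOPIC>/<RUN_ID>/, as listed in this document.
IMMEDIATE_OUTPUTS = [
    "scrape/scraped_corpus.jsonl",
    "exports/ctikg_input.csv",
    "data/ctikg_docs_meta.json",
    "scrape/scrape_log.csv",
    "manifest.json",
]

# Written only after the manual export step has been run.
HANDOFF_OUTPUTS = [
    "llm4cti/Articles.xlsx",
    "llm4cti/llm4cti_articles.csv",
    "llm4cti/llm4cti_articles_meta.json",
]


def missing_outputs(run_dir, expected):
    """Return the expected relative paths that are not regular files under run_dir."""
    root = Path(run_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


def report(run_dir):
    """Build a short, human-readable summary of which outputs are missing."""
    lines = []
    for label, expected in (
        ("immediate per-run outputs", IMMEDIATE_OUTPUTS),
        ("notebook handoff artifacts", HANDOFF_OUTPUTS),
    ):
        missing = missing_outputs(run_dir, expected)
        if missing:
            lines.append(f"{label}: missing {', '.join(missing)}")
        else:
            lines.append(f"{label}: all present")
    return "\n".join(lines)
```

Calling `print(report("runs/<SAFE_TOPIC>/<RUN_ID>"))` with a real run directory substituted would show whether the pipeline outputs exist and whether the manual export step has been run yet. The check only tests for file presence; it does not validate file contents.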