Researcher | MCP Service Details | OpenClaw Study


# Researcher

Fast AI research agent in Rust: plans sub-questions, searches the web, scrapes sources in parallel, and writes a comprehensive markdown report.

```
query → planner (LLM) → search+scrape×N → quality filter → dedup → rerank → summarize×M → report (LLM)
```

Why Rust? No GIL: true parallel scraping, concurrent LLM summarization, a ~5 MB static binary, and zero LangChain.

Recommended backend: gemini-2.0-flash via the Google AI OpenAI-compatible endpoint. Local LLM stacks (llama.cpp + quantized models) work but produce noticeably weaker results; small cloud models consistently outperform local quantized models for this pipeline's multi-stage workload.

## Features

- Multi-stage pipeline: LLM-driven query planning, parallel web crawling, concurrent summarization, final report synthesis
- Any OpenAI-compatible LLM: local (llama.cpp, Ollama, vLLM) or cloud (OpenAI, Anthropic via LiteLLM)
- Dual-model routing: optionally route different pipeline stages to different model backends
- Semantic deduplication: TEI embeddings + cosine similarity drop near-duplicate sources before summarization
- Cross-encoder reranking: ms-marco-MiniLM scores and reranks sources by relevance, authority, and content quality
- Domain profiles: pin searches to curated source lists (tech-news, academic, llm-news, shopping, travel, news)
- 6 MCP tools: research, research_person, research_company, research_code, search_jobs, market_insight
- Streaming HTTP API: SSE token stream for the web UI; blocking JSON for MCP and programmatic use
- Job search: finds remote jobs matching your profiles.toml preferences, with optional deep company briefs

## Architecture

```
topic
  │
  ▼
Planner (LLM) ──── generates N sub-questions
  │
  ▼
Crawler (parallel per query)
  ├─ SearXNG search (→ DuckDuckGo fallback)
  └─ scrape URLs concurrently (reqwest + scraper crate)
  │
  ▼
Quality filter ──── min word count, text density
  │
  ▼
Dedup (TEI embed → cosine sim) ──── optional, requires EMBED_BASE_URL
  │
  ▼
Cross-encoder rerank (TEI) ──── optional, requires RERANK_BASE_URL
  │
  ▼
Summarizer (LLM, join_all — all calls concurrent)
  │
  ▼
Publisher (LLM) ──── final markdown report / streaming tokens
```

Two binaries:

- researcher: HTTP server (POST /research, POST /research/stream, GET /) + CLI ()
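The "Quality filter" stage above can be sketched in plain Rust: reject a scraped page when it has too few words or when its extracted text is a tiny fraction of the raw HTML (a boilerplate-heavy page). The `passes_quality` helper and the thresholds (150 words, 10% text density) are illustrative assumptions, not the project's actual code.

```rust
/// Sketch of a quality filter: keep a scraped page only if it has at
/// least `min_words` words and its extracted text makes up at least
/// `min_density` of the raw HTML size (filters boilerplate-heavy pages).
fn passes_quality(text: &str, raw_html_len: usize, min_words: usize, min_density: f64) -> bool {
    let words = text.split_whitespace().count();
    let density = if raw_html_len == 0 {
        0.0
    } else {
        text.len() as f64 / raw_html_len as f64
    };
    words >= min_words && density >= min_density
}

fn main() {
    // A navigation-only page: 3 words of text in 50 KB of HTML → rejected.
    assert!(!passes_quality("Home About Contact", 50_000, 150, 0.1));
    // A substantive article: 300 words, 1500 of 2000 bytes are text → kept.
    let article = "word ".repeat(300);
    assert!(passes_quality(&article, 2_000, 150, 0.1));
    println!("quality filter ok");
}
```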
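The semantic dedup stage (TEI embed → cosine sim) can likewise be sketched with stdlib Rust: embed each source, then drop any source whose cosine similarity to an already-kept one exceeds a cutoff. The `cosine_sim`/`dedup` helpers and the 0.9 threshold are assumptions for illustration; the real pipeline fetches embeddings from a TEI server at `EMBED_BASE_URL`.

```rust
/// Cosine similarity between two embedding vectors.
fn cosine_sim(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Greedy dedup: keep a source only if it is not too similar to any
/// source already kept. Returns the indices of the retained sources.
fn dedup(embeddings: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    let mut kept: Vec<usize> = Vec::new();
    for (i, e) in embeddings.iter().enumerate() {
        if kept.iter().all(|&k| cosine_sim(e, &embeddings[k]) < threshold) {
            kept.push(i);
        }
    }
    kept
}

fn main() {
    let embs = [
        vec![1.0, 0.0],
        vec![0.99, 0.1], // near-duplicate of the first source
        vec![0.0, 1.0],
    ];
    println!("{:?}", dedup(&embs, 0.9)); // → [0, 2]
}
```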

This page is part of the OpenClaw Skills learning series, covering skill installation, category navigation, and links to hands-on practice.
