ArgentOS Business: This feature is part of ArgentOS Business. The architecture is documented here for all users, but full functionality requires a Business license.
Overview
The Knowledge Library is ArgentOS’s Retrieval-Augmented Generation (RAG) system. It provides agents with a structured document store where files can be ingested, chunked, embedded, and searched. Access is controlled through a fine-grained ACL system that governs which agents can read, write, and own which collections.
This is a PG-only feature — it requires PostgreSQL and has no SQLite fallback.
Collections
Collections are named document buckets that organize knowledge by scope and purpose. Example collections:
corporate — Company-wide policies and procedures
department-sales — Sales team playbooks and materials
department-support — Support knowledge base
agent-personal — An individual agent’s personal reference documents
Each collection has an owner agent, ACL grants controlling access, and a collection tag (normalized slug).
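The exact normalization rules for collection tags are not specified here. As a minimal sketch, assuming a tag is lowercased and every run of non-alphanumeric characters collapses to a single hyphen, a hypothetical `normalize_collection_tag` helper could look like this:

```python
import re


def normalize_collection_tag(name: str) -> str:
    """Turn a display name into a slug (assumed rules, not the actual ArgentOS ones)."""
    # Lowercase, then collapse every run of non-alphanumeric characters to one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", name.strip().lower())
    # Drop hyphens left over at either end.
    return slug.strip("-")
```

Under these assumed rules, "Department Sales" and "department_sales" both map to department-sales, so differently typed names resolve to the same collection.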
Document Ingestion
| Format | MIME Types | Notes |
|---|---|---|
| PDF | application/pdf | Text extraction, max page limits |
| DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document | Full text extraction |
| XLSX | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | Cell-level extraction |
| Text | text/plain, text/markdown, text/csv, application/json | Direct content |
| HTML | text/html | Tag stripping |
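One way to picture the format table is as a lookup from MIME type to extraction path. The sketch below is illustrative only: the `SUPPORTED_MIME_TYPES` mapping and the `detect_format` helper are hypothetical names, and only the MIME strings themselves come from the table.

```python
from typing import Optional

# MIME types taken from the table above; the format labels are illustrative.
SUPPORTED_MIME_TYPES = {
    "application/pdf": "pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "docx",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": "xlsx",
    "text/plain": "text",
    "text/markdown": "text",
    "text/csv": "text",
    "application/json": "text",
    "text/html": "html",
}


def detect_format(mime_type: str) -> Optional[str]:
    """Return the ingestion format for a MIME type, or None if unsupported."""
    # Ignore parameters such as "; charset=utf-8" and normalize case.
    base = mime_type.split(";", 1)[0].strip().lower()
    return SUPPORTED_MIME_TYPES.get(base)
```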
Chunking
Documents are split into overlapping chunks for embedding:
| Parameter | Default | Description |
|---|---|---|
| chunkSize | 1800 characters | Maximum characters per chunk |
| overlap | 200 characters | Overlap between adjacent chunks |
Each chunk receives a citation reference in the format filename.pdf#chunk-5, enabling precise source attribution in search results.
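The chunking behavior can be sketched as a sliding window over the extracted text. This is a minimal illustration, not the actual implementation: the `Chunk` type and `chunk_text` function are hypothetical, the split is purely character-based, and chunk numbering is assumed to start at 0 (the source only shows the filename.pdf#chunk-5 format).

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    citation: str  # e.g. "filename.pdf#chunk-5"
    text: str


def chunk_text(text: str, filename: str, chunk_size: int = 1800, overlap: int = 200) -> list:
    """Split text into overlapping character windows with citation references."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    index = 0  # assumption: chunk numbering starts at 0
    while start < len(text):
        piece = text[start:start + chunk_size]
        chunks.append(Chunk(citation=f"{filename}#chunk-{index}", text=piece))
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
        index += 1
    return chunks
```

With the defaults, a 4,000-character document yields three chunks starting at offsets 0, 1600, and 3200, and each chunk repeats the final 200 characters of the one before it.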
Embedding
Chunks are embedded using the configured embedding provider:
- OpenAI: text-embedding-3-small or text-embedding-3-large
- Gemini: Google embedding models
- Ollama: Local embedding models (free)
Embeddings are stored as pgvector columns with HNSW indexes for fast approximate nearest-neighbor search.
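To make clear what the index approximates, here is a brute-force nearest-neighbor search in plain Python. It is a conceptual sketch only: cosine similarity is an assumed metric (the source does not name one), the `cosine_similarity` and `nearest` helpers are hypothetical, and a real deployment would let pgvector do this inside PostgreSQL rather than scanning every vector.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query: list, embeddings: dict, top_k: int = 3) -> list:
    """Exact top-k search: score every stored vector, return the best matches."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

An HNSW index trades a small amount of accuracy for speed by visiting only a fraction of the stored vectors instead of scoring all of them as this loop does.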
Hybrid Search
Search combines two retrieval strategies to improve both recall and precision:
BM25 Keyword Search
PostgreSQL tsvector with GIN indexes provides fast full-text keyword matching. This catches exact terminology, proper nouns, and technical terms that vector search might miss.
Vector Similarity Search
pgvector HNSW indexes enable semantic similarity search. This catches paraphrases, related concepts, and queries that use different terminology than the source documents.
Results from both strategies are merged and ranked. The hybrid approach ensures that exact keyword queries find precise matches, conceptual queries find semantically related content, and rare terms are not lost in vector space.
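The merge step is not described in detail here. One common way to combine two ranked lists is Reciprocal Rank Fusion (RRF), sketched below as an assumption rather than the documented ArgentOS method. The function name, the constant `k=60`, and the chunk IDs are all illustrative.

```python
def reciprocal_rank_fusion(keyword_hits: list, vector_hits: list, k: int = 60) -> list:
    """Merge two ranked lists of chunk IDs; higher fused score ranks first."""
    scores = {}
    for hits in (keyword_hits, vector_hits):
        for rank, chunk_id in enumerate(hits, start=1):
            # Each list contributes 1 / (k + rank); appearing in both lists adds up.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by chunk ID for stable output.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


keyword = ["a.pdf#chunk-1", "b.pdf#chunk-0", "c.pdf#chunk-2"]
vector = ["b.pdf#chunk-0", "d.pdf#chunk-4", "a.pdf#chunk-1"]
merged = reciprocal_rank_fusion(keyword, vector)
```

In this example b.pdf#chunk-0 ranks first because it sits near the top of both lists, while chunks found by only one strategy still appear lower in the merged ranking instead of being dropped.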
ACL Enforcement
The ACL system controls access at the collection level. Every knowledge operation checks permissions before proceeding.
Permission Model
| Permission | Description |
|---|---|
| can_read | Agent can search and view documents in the collection |
| can_write | Agent can ingest new documents into the collection |
| is_owner | Agent has full control: read, write, delete, manage grants |
Grant Resolution
Grants are resolved in this order:
- Exact agent match: Direct grant for the requesting agent
- Alias resolution: main and argent are treated as equivalent
- Wildcard grant: agent_id='*' grants access to all agents
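The resolution order above can be sketched as a lookup that tries each candidate in turn. This is a hedged illustration: the `Grant` type and the `resolve_grant` and `has_permission` helpers are hypothetical, grants are modeled as an in-memory list for one collection rather than database rows, and the sketch assumes the first matching grant wins instead of combining grants.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    agent_id: str
    can_read: bool = False
    can_write: bool = False
    is_owner: bool = False


# main and argent are treated as equivalent.
ALIASES = {"main": "argent", "argent": "main"}


def resolve_grant(agent_id: str, grants: list) -> Optional[Grant]:
    """Return the first grant found: exact match, then alias, then wildcard."""
    by_agent = {g.agent_id: g for g in grants}
    candidates = [agent_id]
    if agent_id in ALIASES:
        candidates.append(ALIASES[agent_id])
    candidates.append("*")
    for candidate in candidates:
        if candidate in by_agent:
            return by_agent[candidate]
    return None


def has_permission(agent_id: str, grants: list, permission: str) -> bool:
    """Check can_read or can_write; owners implicitly hold both."""
    if permission not in ("can_read", "can_write"):
        raise ValueError(f"unknown permission: {permission}")
    grant = resolve_grant(agent_id, grants)
    if grant is None:
        return False  # no grant at all means no access
    return grant.is_owner or getattr(grant, permission)
```

For example, with a read/write grant for argent and a read-only wildcard grant, a request from main resolves to the argent grant through the alias, while any other agent falls through to the wildcard and can only read.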
Fail-Closed vs Fail-Open
The ACL system defaults to fail-closed when PostgreSQL is the configured backend — if ACL tables are unavailable or a query fails, access is denied.
Override with environment variables:
ARGENT_KNOWLEDGE_ACL_FAIL_OPEN=1 — Allow access when ACL check fails
ARGENT_KNOWLEDGE_ACL_FAIL_CLOSED=1 — Deny access when ACL check fails (explicit)
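The effect of these variables can be sketched as a wrapper around the permission check. The sketch makes two assumptions the text does not confirm: only the value 1 enables a flag, and fail-closed takes precedence if both variables are set. The `acl_fails_open` and `check_access` names are hypothetical.

```python
import os
from typing import Callable, Mapping


def acl_fails_open(env: Mapping = os.environ) -> bool:
    """Decide what to do when an ACL check cannot be completed."""
    # Assumption: an explicit fail-closed setting wins over fail-open.
    if env.get("ARGENT_KNOWLEDGE_ACL_FAIL_CLOSED") == "1":
        return False
    # Default is fail-closed unless fail-open is explicitly enabled.
    return env.get("ARGENT_KNOWLEDGE_ACL_FAIL_OPEN") == "1"


def check_access(check: Callable[[], bool], env: Mapping = os.environ) -> bool:
    """Run an ACL check; on failure, fall back to the configured fail mode."""
    try:
        return bool(check())
    except Exception:
        # ACL tables unavailable or query failed: deny unless fail-open is set.
        return acl_fails_open(env)
```

Passing the environment as a parameter keeps the fallback decision easy to test without changing real process variables.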
4-Level Scoping
Knowledge collections can be organized at four hierarchical levels:
| Level | Example | Typical Use |
|---|---|---|
| Global | corporate, policies | Company-wide reference, accessible to all agents |
| Department | department-support, department-sales | Team-specific knowledge bases |
| Agent | agent-argent-personal | Individual agent’s reference documents |
| Worker | worker-exec-research | Task-specific temporary collections |
Scoping is by convention (collection naming) rather than enforced hierarchy. ACL grants control actual access.
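Because the level is carried only by the collection name, tooling that wants to display or group by level has to infer it. A minimal sketch, assuming the prefixes seen in the example names (department-, agent-, worker-) and treating everything else as global; the `infer_scope` helper is hypothetical:

```python
# Prefixes inferred from the example collection names in the table above.
SCOPE_PREFIXES = (
    ("department-", "department"),
    ("agent-", "agent"),
    ("worker-", "worker"),
)


def infer_scope(collection_tag: str) -> str:
    """Guess the scoping level from a collection tag's prefix."""
    for prefix, level in SCOPE_PREFIXES:
        if collection_tag.startswith(prefix):
            return level
    # No recognized prefix: treat as a global collection.
    return "global"
```

The inferred level is purely descriptive. A collection named department-sales is only restricted to the sales team if its ACL grants say so.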
Dashboard Integration
The ConfigPanel provides a full library browser:
- Collection list with ACL indicators (read/write/owner per collection)
- Document browser with search, sort, and filter
- Ingest UI for uploading documents to collections
- ACL manager for granting/revoking agent access
- Reindex controls for regenerating embeddings