ArgentOS Business — This feature is part of ArgentOS Business. The architecture is documented here for all users, but full functionality requires a Business license.

Overview

The Knowledge Library is ArgentOS’s Retrieval-Augmented Generation (RAG) system. It provides agents with a structured document store where files can be ingested, chunked, embedded, and searched. Access is controlled through a fine-grained ACL system that governs which agents can read, write, and own which collections.
This feature is PostgreSQL-only: it requires PostgreSQL and has no SQLite fallback.

Collections

Collections are named document buckets that organize knowledge by scope and purpose. Example collections:
  • corporate — Company-wide policies and procedures
  • department-sales — Sales team playbooks and materials
  • department-support — Support knowledge base
  • agent-personal — An individual agent’s personal reference documents
Each collection has an owner agent, ACL grants controlling access, and a collection tag (normalized slug).

Document Ingestion

Supported Formats

| Format | MIME Types | Notes |
| --- | --- | --- |
| PDF | application/pdf | Text extraction, max page limits |
| DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document | Full text extraction |
| XLSX | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | Cell-level extraction |
| Text | text/plain, text/markdown, text/csv, application/json | Direct content |
| HTML | text/html | Tag stripping |
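
As a rough illustration of how the table above could drive ingestion, the following sketch maps a MIME type to its format label. The names `SUPPORTED_FORMATS` and `detect_format` are hypothetical and not part of ArgentOS; only the MIME types and format labels come from the table.

```python
from typing import Optional

# MIME type -> format label, copied from the Supported Formats table.
SUPPORTED_FORMATS = {
    "application/pdf": "PDF",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "DOCX",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": "XLSX",
    "text/plain": "Text",
    "text/markdown": "Text",
    "text/csv": "Text",
    "application/json": "Text",
    "text/html": "HTML",
}


def detect_format(mime_type: str) -> Optional[str]:
    """Return the format label for a MIME type, or None if unsupported.

    Hypothetical helper: parameters such as "; charset=utf-8" are dropped
    and the type is lowercased before the lookup.
    """
    base = mime_type.split(";", 1)[0].strip().lower()
    return SUPPORTED_FORMATS.get(base)
```

For example, `detect_format("text/html; charset=utf-8")` resolves to the HTML row, while an unlisted type such as `image/png` yields `None` so the caller can reject the upload.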

Chunking

Documents are split into overlapping chunks for embedding:
| Parameter | Default | Description |
| --- | --- | --- |
| chunkSize | 1800 characters | Maximum characters per chunk |
| overlap | 200 characters | Overlap between adjacent chunks |
Each chunk receives a citation reference in the format filename.pdf#chunk-5, enabling precise source attribution in search results.
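
The chunking scheme can be sketched as a sliding character window. This is a minimal illustration under stated assumptions, not the ArgentOS implementation: `Chunk` and `chunk_text` are hypothetical names, chunk numbering is assumed to start at 0, and splitting is assumed to be purely character-based with no sentence or paragraph awareness.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    citation: str  # e.g. "filename.pdf#chunk-5"
    text: str


def chunk_text(text: str, filename: str,
               chunk_size: int = 1800, overlap: int = 200) -> List[Chunk]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks: List[Chunk] = []
    start = 0
    while start < len(text):
        piece = text[start:start + chunk_size]
        # Assumption: the citation index is the zero-based chunk position.
        chunks.append(Chunk(citation=f"{filename}#chunk-{len(chunks)}", text=piece))
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the document
        start += step
    return chunks
```

With the defaults, each window starts 1600 characters after the previous one, so the last 200 characters of one chunk are repeated at the start of the next.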

Embedding

Chunks are embedded using the configured embedding provider:
  • OpenAI — text-embedding-3-small or text-embedding-3-large
  • Gemini — Google embedding models
  • Ollama — Local embedding models (free)
Embeddings are stored as pgvector columns with HNSW indexes for fast approximate nearest-neighbor search.

Hybrid Search

Search combines two retrieval strategies for maximum recall and precision: semantic vector search over the chunk embeddings and keyword matching over the chunk text. Results from both strategies are merged and ranked. The hybrid approach ensures that exact keyword queries find precise matches, conceptual queries find semantically related content, and rare terms are not lost in vector space.
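
This page does not say how the two result lists are merged. One common technique for combining ranked lists is reciprocal rank fusion (RRF), sketched below purely as an illustration; it is an assumption that something of this shape fits here, and the function name and the constant `k = 60` are not taken from ArgentOS.

```python
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(vector_hits: List[str], keyword_hits: List[str],
                           k: int = 60) -> List[Tuple[str, float]]:
    """Merge two ranked lists of chunk citations with RRF (illustrative only).

    Each list is ordered best-first. A chunk's score is the sum of
    1 / (k + rank) over every list in which it appears.
    """
    scores: Dict[str, float] = {}
    for hits in (vector_hits, keyword_hits):
        for rank, citation in enumerate(hits, start=1):
            scores[citation] = scores.get(citation, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken by citation for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A chunk that appears in both lists accumulates two contributions and so tends to outrank a chunk found by only one strategy, which is the behavior a hybrid ranking is after.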

ACL Enforcement

The ACL system controls access at the collection level. Every knowledge operation checks permissions before proceeding.

Permission Model

| Permission | Description |
| --- | --- |
| can_read | Agent can search and view documents in the collection |
| can_write | Agent can ingest new documents into the collection |
| is_owner | Agent has full control: read, write, delete, manage grants |

Grant Resolution

Grants are resolved in this order:
  1. Exact agent match: Direct grant for the requesting agent
  2. Alias resolution: main and argent are treated as equivalent
  3. Wildcard grant: agent_id='*' grants access to all agents
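
The resolution order above can be sketched as a lookup that tries each candidate identity in turn. This is a minimal illustration, not the actual ArgentOS code: `Grant`, `resolve_grant`, and `can_read` are hypothetical names, and the sketch assumes the first matching grant wins rather than merging permissions across several grants.

```python
from dataclasses import dataclass
from typing import List, Optional

# "main" and "argent" are treated as equivalent agent IDs.
ALIASES = {"main": "argent", "argent": "main"}


@dataclass(frozen=True)
class Grant:
    agent_id: str
    can_read: bool = False
    can_write: bool = False
    is_owner: bool = False


def resolve_grant(agent_id: str, grants: List[Grant]) -> Optional[Grant]:
    """Return the first grant matching: exact ID, then alias, then wildcard."""
    by_agent = {g.agent_id: g for g in grants}
    candidates = [agent_id]
    alias = ALIASES.get(agent_id)
    if alias is not None:
        candidates.append(alias)
    candidates.append("*")
    for candidate in candidates:
        if candidate in by_agent:
            return by_agent[candidate]
    return None  # no grant at all: the caller should deny access


def can_read(grant: Optional[Grant]) -> bool:
    """Owners have full control, so is_owner implies read access."""
    return grant is not None and (grant.can_read or grant.is_owner)
```

Under these assumptions, a request from `main` picks up a grant written for `argent` before falling back to any `*` grant on the same collection.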

Fail-Closed vs Fail-Open

The ACL system defaults to fail-closed when PostgreSQL is the configured backend — if ACL tables are unavailable or a query fails, access is denied.
Override with environment variables:
  • ARGENT_KNOWLEDGE_ACL_FAIL_OPEN=1 — Allow access when ACL check fails
  • ARGENT_KNOWLEDGE_ACL_FAIL_CLOSED=1 — Deny access when ACL check fails (explicit)
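
The following sketch shows one way these variables could shape the outcome of a failed ACL check when PostgreSQL is the backend. It is illustrative only: `fail_open_enabled` and `check_access` are hypothetical names, and the precedence when both variables are set is not documented here, so the sketch assumes the explicit fail-closed setting wins.

```python
import os
from typing import Callable, Mapping


def fail_open_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True only when fail-open is requested and fail-closed is not forced."""
    # Assumption: an explicit FAIL_CLOSED=1 overrides FAIL_OPEN=1.
    if env.get("ARGENT_KNOWLEDGE_ACL_FAIL_CLOSED") == "1":
        return False
    return env.get("ARGENT_KNOWLEDGE_ACL_FAIL_OPEN") == "1"


def check_access(lookup: Callable[[], bool],
                 env: Mapping[str, str] = os.environ) -> bool:
    """Run an ACL lookup; on failure, fall back to the configured mode.

    `lookup` stands in for the real ACL query and returns True when the
    agent holds the required permission.
    """
    try:
        return lookup()
    except Exception:
        # ACL tables unavailable or the query failed: deny unless fail-open.
        return fail_open_enabled(env)
```

Note that the fallback applies only when the check itself fails; a lookup that succeeds and reports "no permission" still denies access in either mode.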

4-Level Scoping

Knowledge collections can be organized at four hierarchical levels:
| Level | Example | Typical Use |
| --- | --- | --- |
| Global | corporate, policies | Company-wide reference, accessible to all agents |
| Department | department-support, department-sales | Team-specific knowledge bases |
| Agent | agent-argent-personal | Individual agent’s reference documents |
| Worker | worker-exec-research | Task-specific temporary collections |
Scoping is by convention (collection naming) rather than enforced hierarchy. ACL grants control actual access.
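
Because scoping is a naming convention, a level can be inferred from the collection name alone. The sketch below does this with prefixes inferred from the examples in the table; the `scope_level` helper is hypothetical, and nothing in the system is stated to enforce or rely on such a mapping.

```python
# Prefixes inferred from the example collection names; anything without a
# recognized prefix is treated as Global (an assumption of this sketch).
SCOPE_PREFIXES = (
    ("department-", "Department"),
    ("agent-", "Agent"),
    ("worker-", "Worker"),
)


def scope_level(collection: str) -> str:
    """Classify a collection name into one of the four scoping levels."""
    for prefix, level in SCOPE_PREFIXES:
        if collection.startswith(prefix):
            return level
    return "Global"
```

A helper like this is only useful for display or reporting; actual access still depends on the ACL grants attached to each collection.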

Dashboard Integration

The ConfigPanel provides a full library browser:
  • Collection list with ACL indicators (read/write/owner per collection)
  • Document browser with search, sort, and filter
  • Ingest UI for uploading documents to collections
  • ACL manager for granting/revoking agent access
  • Reindex controls for regenerating embeddings