
Feat/langchain vectorstore#73

Open
alessandrostone wants to merge 15 commits into `main` from `feat/langchain-vectorstore`

Conversation

@alessandrostone
Contributor

InputLayerVectorStore — LangChain VectorStore interface

Summary

Adds InputLayerVectorStore, a LangChain VectorStore implementation backed by InputLayer. This makes InputLayer a drop-in replacement for Chroma, Pinecone, Weaviate, FAISS, etc. in any existing LangChain RAG tutorial or chain — change the import, keep the code.

Why this matters

LangChain's VectorStore is the most common abstraction in the LangChain ecosystem. Hundreds of tutorials, courses, and example projects assume you have a VectorStore. Until now, those users flowed past InputLayer because we only offered a custom Retriever. With this PR, every VectorStore tutorial works with InputLayer by changing one import.

What's new

| Component | File | What it does |
| --- | --- | --- |
| `InputLayerVectorStore` | `integrations/langchain/vectorstore.py` | Full VectorStore implementation with sync + async paths |

Implemented methods

| Method | Sync | Async |
| --- | --- | --- |
| `from_texts` (classmethod, required) | yes | `afrom_texts` |
| `add_texts` (with UUIDs, metadata, explicit ids) | yes | `aadd_texts` |
| `add_documents` | yes | `aadd_documents` |
| `similarity_search` (required) | yes | `asimilarity_search` |
| `similarity_search_by_vector` | yes | `asimilarity_search_by_vector` |
| `similarity_search_with_score` | yes | `asimilarity_search_with_score` |
| `get_by_ids` | yes | `aget_by_ids` |
| `delete` (by ids) | yes | `adelete` |

`as_retriever()` is inherited from the base class and works automatically.

All sync methods go through the run_sync bridge, so they work safely in Jupyter, FastAPI, LangGraph, and any running event loop.

Usage

Install the LangChain extra:

```shell
pip install "inputlayer-client-dev[langchain]"
```

```python
from langchain_openai import OpenAIEmbeddings
from inputlayer.integrations.langchain import InputLayerVectorStore

embeddings = OpenAIEmbeddings()

# Bulk-load — same as Chroma/Pinecone/etc.
store = await InputLayerVectorStore.afrom_texts(
    texts=["Python is a language", "Rust is fast"],
    embedding=embeddings,
    metadatas=[{"category": "lang"}, {"category": "lang"}],
    kg=kg,
    collection_name="my_docs",
)

# Search
docs = await store.asimilarity_search("programming", k=3)

# As a retriever in an LCEL chain — drop-in compatible
retriever = store.as_retriever(search_kwargs={"k": 5})
chain = retriever | prompt | llm | StrOutputParser()
```

How it stores data

A single relation per instance:

```
+<collection>(id: string, content: string, metadata: string, embedding: vector)
```

  • id — UUID by default; user-provided ids supported
  • content — the document text
  • metadata — JSON-encoded to allow arbitrary structure
  • embedding — the dense vector from the embeddings model
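The mapping from a document to one row of that relation can be sketched as follows. The helper name `to_row` and its exact shape are illustrative, not the PR's actual code; only the column layout comes from the schema above.

```python
import json
import uuid

def to_row(text, metadata, embedding, doc_id=None):
    """Illustrative encoding of one document into the
    (id, content, metadata, embedding) tuple described above."""
    return (
        doc_id or str(uuid.uuid4()),   # UUID unless the caller supplies an id
        text,                          # raw document text
        json.dumps(metadata or {}),    # arbitrary metadata as a JSON string
        embedding,                     # dense vector from the embeddings model
    )

row = to_row("Python is a language", {"category": "lang"}, [0.1, 0.2, 0.3])
```

Storing metadata as a JSON string keeps the schema to flat primitive columns while still round-tripping arbitrary nested metadata, which matches the "Metadata as JSON" design note below.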

Distance is computed via InputLayer's cosine/euclidean/dot/manhattan functions in a Datalog query.
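For reference, cosine distance (the default metric in most VectorStore integrations) is the quantity `1 - cos(a, b)`. A plain-Python equivalent of what such a distance function computes, written here only to pin down the math rather than to mirror InputLayer's implementation:

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

cosine_distance([1.0, 0.0], [1.0, 0.0])  # 0.0 — identical direction
cosine_distance([1.0, 0.0], [0.0, 1.0])  # 1.0 — orthogonal
```

Smaller distances mean more similar documents, which is why `similarity_search_with_score` sorts ascending on this value.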

Tests

27 unit tests (tests/test_vectorstore.py) using a mock KG with realistic Datalog parsing:

  • Setup (creation, idempotency, custom collection name)
  • Add (texts, with metadata, explicit ids, documents)
  • Search (basic, k limit, with score, by vector, metadata roundtrip)
  • Get by ids (existing + missing)
  • Delete (by ids + none)
  • from_texts / afrom_texts (with metadatas, requires kg)
  • Sync bridge (all major methods via run_sync)
  • as_retriever (correct type, invokes search)

Example

examples/langchain/ex18_vectorstore.py — end-to-end demo:

  1. from_texts — bulk-load 6 documents with metadata
  2. similarity_search — find most similar docs to a query
  3. similarity_search_with_score — show distances
  4. as_retriever — wrap as a LangChain retriever
  5. Full LCEL chain: retriever → prompt → llm (when LM Studio is available)

Falls back to deterministic fake embeddings when no LLM server is running, so the example always works for demos and CI.
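A deterministic fake embedding can be as simple as hashing the text into a fixed-size vector; the sketch below is one way to do it, not necessarily what the example uses.

```python
import hashlib

def fake_embedding(text, dim=8):
    """Hypothetical deterministic fake embedding: derive a fixed-size
    vector from a hash of the text, so the same text always maps to the
    same vector and no LLM server is needed."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Scale each byte into [0, 1] so the values look like embedding weights.
    return [digest[i % len(digest)] / 255.0 for i in range(dim)]

vec = fake_embedding("Rust is fast")
```

Determinism is the point: identical input text always yields an identical vector, so similarity rankings are stable across demo runs and CI.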

Files changed

  • src/inputlayer/integrations/langchain/vectorstore.py — new (~330 lines)
  • src/inputlayer/integrations/langchain/__init__.py — export InputLayerVectorStore
  • tests/test_vectorstore.py — new (27 tests)
  • examples/langchain/ex18_vectorstore.py — new
  • examples/langchain/runner.py — register example 18

Test plan

  • `uv run pytest tests/test_vectorstore.py -v` — 27 unit tests
  • `uv run python -m examples.langchain.ex18_vectorstore` — runs against a server, demonstrates full lifecycle
  • `uv run python -m examples.langchain.runner 18` — runs via the runner

Design notes

  • Single-collection model: each InputLayerVectorStore instance maps to one relation. To use multiple collections, instantiate the store multiple times with different collection_name values.
  • Metadata as JSON: arbitrary metadata is JSON-encoded into a string column. This is the simplest cross-version-compatible approach until InputLayer supports nested types in schemas.
  • Search is currently a full scan + Python sort: a future optimization could use kg.vector_search() with HNSW indexes for large collections (~100K+ documents). For typical RAG corpora (1K-10K documents) the scan is fast enough.
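The full-scan strategy from the last note can be sketched in a few lines: compute a distance to every stored embedding, then keep the k nearest. The helper names and row shape are illustrative; the real implementation runs the distance function inside a Datalog query.

```python
import heapq
import math

def top_k_scan(query_vec, rows, k=3):
    """Sketch of a full-scan nearest-neighbor search: score every row,
    keep the k smallest distances. Adequate for ~1K-10K documents;
    an HNSW index would replace this at ~100K+ scale."""
    def dist(v):
        dot = sum(x * y for x, y in zip(query_vec, v))
        norm = (math.sqrt(sum(x * x for x in query_vec))
                * math.sqrt(sum(x * x for x in v)))
        return 1.0 - dot / norm
    # heapq.nsmallest avoids sorting the entire collection.
    return heapq.nsmallest(k, rows, key=lambda row: dist(row[1]))

rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.9, 0.1])]
top_k_scan([1.0, 0.0], rows, k=2)  # rows "a" and "c", nearest first
```

The scan is O(n) per query, which is why it stays fast for typical RAG corpora but motivates `kg.vector_search()` with HNSW indexes for large collections.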

… wrapper

  The old sync wrapper crashed inside running event loops (Jupyter, FastAPI,
  LangGraph). The new one uses a dedicated daemon thread with its own loop
  instead — the same pattern httpx uses. This was a prerequisite for the
  LangChain integration. InputLayerRetriever supports raw Datalog queries
  with an {input} placeholder as well as a vector search mode, and
  InputLayerTool exposes KG queries to LangChain agents. Both provide
  native async and sync paths via the run_sync bridge.