Skip to main content

C/C++ SDK

Learn how to install and use KeraDB in your C and C++ applications.

Installation

Using CMake

cd sdks/cpp
mkdir build && cd build
cmake ..
make
sudo make install

Manual Compilation

C Example:

gcc -o myapp main.c -L/path/to/keradb/target/release -lkeradb -I/path/to/keradb/include

C++ Example:

g++ -std=c++11 -o myapp main.cpp -L/path/to/keradb/target/release -lkeradb -I/path/to/keradb/include

Requirements

  • C compiler (GCC, Clang, MSVC)
  • C++11 or higher (for C++ wrapper)
  • CMake 3.10+ (optional)
  • KeraDB native library

Quick Start

C API

#include "keradb.h"
#include <stdio.h>
#include <stdlib.h>

int main() {
// Create database
KeraDB db = keradb_create("mydb.ndb");
if (!db) {
fprintf(stderr, "Failed to create database\n");
return 1;
}

// Insert document
const char* json = "{\"name\":\"Alice\",\"age\":30}";
char* id = keradb_insert(db, "users", json);
if (id) {
printf("Inserted ID: %s\n", id);
keradb_free_string(id);
}

// Find by ID
char* doc = keradb_find_by_id(db, "users", id);
if (doc) {
printf("Found: %s\n", doc);
keradb_free_string(doc);
}

// Close
keradb_close(db);
return 0;
}

C++ API

The C++ wrapper provides a modern, RAII-based interface with automatic memory management and exception handling.

#include "keradb.hpp"
#include <iostream>

int main() {
try {
// Create database
auto db = keradb::Database::create("mydb.ndb");

// Insert document
auto id = db.insert("users", R"({
"name": "Alice",
"age": 30,
"email": "alice@example.com"
})");
std::cout << "Inserted ID: " << id << std::endl;

// Find by ID
auto doc = db.findById("users", id);
std::cout << "Found: " << doc << std::endl;

// Update
auto updated = db.update("users", id, R"({
"name": "Alice Smith",
"age": 31
})");
std::cout << "Updated: " << updated << std::endl;

// Find all
auto users = db.findAll("users");
std::cout << "All users: " << users << std::endl;

// Count
int count = db.count("users");
std::cout << "Total users: " << count << std::endl;

// Delete
db.remove("users", id);

// Sync to disk
db.sync();

} catch (const keradb::Exception& e) {
std::cerr << "Error: " << e.what() << std::endl;
return 1;
}

return 0;
}

C API Reference

Database Operations

FunctionDescription
KeraDB keradb_create(const char* path)Create a new database
KeraDB keradb_open(const char* path)Open an existing database
void keradb_close(KeraDB db)Close database
int keradb_sync(KeraDB db)Sync changes to disk

Document Operations

FunctionDescription
char* keradb_insert(db, collection, json_data)Insert a document
char* keradb_find_by_id(db, collection, doc_id)Find document by ID
char* keradb_update(db, collection, doc_id, json_data)Update a document
int keradb_delete(db, collection, doc_id)Delete a document
char* keradb_find_all(db, collection, limit, skip)Find all documents
int keradb_count(db, collection)Count documents
char* keradb_list_collections(db)List all collections

Memory Management

FunctionDescription
void keradb_free_string(char* s)Free strings returned by KeraDB
Important

Always free strings returned by KeraDB functions using keradb_free_string() to avoid memory leaks.

C++ API Reference

The C++ wrapper provides RAII and exception-safe interface:

Database Class

namespace keradb {
class Database {
public:
// Factory methods
static Database create(const std::string& path);
static Database open(const std::string& path);

// CRUD operations
std::string insert(const std::string& collection, const std::string& json);
std::string findById(const std::string& collection, const std::string& id);
std::string update(const std::string& collection, const std::string& id, const std::string& json);
void remove(const std::string& collection, const std::string& id);

// Query operations
std::string findAll(const std::string& collection, int limit = -1, int skip = 0);
int count(const std::string& collection);
std::string listCollections();

// Database operations
void sync();
void close();
bool isOpen() const;
const std::string& path() const;
};
}

Exception Handling

try {
auto db = keradb::Database::create("mydb.ndb");
db.insert("users", "{\"name\":\"Alice\"}");
} catch (const keradb::Exception& e) {
std::cerr << "KeraDB Error: " << e.what() << std::endl;
}

Vector Database

KeraDB includes a powerful lightweight vector database for similarity search, making it ideal for AI/ML applications, semantic search, recommendations, and RAG (Retrieval-Augmented Generation) systems.

Why Vector Database?

Modern AI applications often need to:

  • Store embeddings from language models (OpenAI, Sentence Transformers, etc.)
  • Find semantically similar documents or images
  • Build recommendation systems
  • Implement RAG for LLM applications

KeraDB's vector database provides all this functionality in a lightweight, embedded package—no separate vector database server required.

Quick Start

#include "keradb.hpp"
#include <iostream>
#include <vector>

int main() {
try {
auto db = keradb::Database::create("vectors.ndb");

// Create a vector collection (384 dimensions for sentence embeddings)
keradb::Database::VectorConfig config(384, keradb::Database::Distance::Cosine);
db.createVectorCollection("embeddings", config);

// Insert a vector with metadata
std::vector<float> embedding(384, 0.0f);
embedding[0] = 0.8f;
embedding[1] = 0.6f;
// ... fill with actual embedding values

auto id = db.insertVector("embeddings", embedding, R"({
"text": "Hello world",
"category": "greeting",
"language": "en"
})");
std::cout << "Inserted vector: " << id << std::endl;

// Search for similar vectors
std::vector<float> query(384, 0.0f);
query[0] = 0.9f;
query[1] = 0.5f;

auto results = db.vectorSearch("embeddings", query, 5);
std::cout << "Similar vectors: " << results << std::endl;

} catch (const keradb::Exception& e) {
std::cerr << "Error: " << e.what() << std::endl;
return 1;
}
return 0;
}

Distance Metrics

KeraDB supports multiple distance metrics for different use cases:

MetricEnum ValueBest ForDescription
CosineDistance::CosineText embeddings, normalized vectorsMeasures angle between vectors (0 = identical, 1 = opposite)
EuclideanDistance::EuclideanSpatial data, image featuresStraight-line distance in vector space
Dot ProductDistance::DotProductPre-normalized vectors, recommendationsInner product of vectors
ManhattanDistance::ManhattanHigh-dimensional sparse dataSum of absolute differences
// Cosine similarity (default) - best for text embeddings
keradb::Database::VectorConfig textConfig(384, keradb::Database::Distance::Cosine);

// Euclidean distance - best for image features
keradb::Database::VectorConfig imageConfig(512, keradb::Database::Distance::Euclidean);

// Dot product - for pre-normalized vectors
keradb::Database::VectorConfig dotConfig(768, keradb::Database::Distance::DotProduct);

// Manhattan distance - for sparse high-dimensional data
keradb::Database::VectorConfig sparseConfig(1024, keradb::Database::Distance::Manhattan);

Vector API Reference

Creating Vector Collections

// VectorConfig constructor
VectorConfig(int dimensions, Distance metric = Distance::Cosine);

// Create collection
void createVectorCollection(const std::string& collection, const VectorConfig& config);

Example:

// Create a collection for OpenAI embeddings (1536 dimensions)
keradb::Database::VectorConfig openaiConfig(1536, keradb::Database::Distance::Cosine);
db.createVectorCollection("openai_embeddings", openaiConfig);

// Create a collection for image features (512 dimensions)
keradb::Database::VectorConfig imageConfig(512, keradb::Database::Distance::Euclidean);
db.createVectorCollection("image_features", imageConfig);

Inserting Vectors

std::string insertVector(
const std::string& collection,
const std::vector<float>& vector,
const std::string& metadata = "" // Optional JSON metadata
);

Example:

// Insert with metadata
std::vector<float> embedding = getEmbedding("The quick brown fox");
auto id = db.insertVector("documents", embedding, R"({
"text": "The quick brown fox",
"source": "document1.txt",
"page": 5,
"category": "animals"
})");

// Insert without metadata
auto id2 = db.insertVector("documents", anotherEmbedding);
// Basic search - returns top K similar vectors
std::string vectorSearch(
const std::string& collection,
const std::vector<float>& query,
int topK = 10
);

// Filtered search - with metadata filter
std::string vectorSearchFiltered(
const std::string& collection,
const std::vector<float>& query,
int topK,
const std::string& filter // JSON filter
);

Example:

// Find 5 most similar vectors
std::vector<float> queryVector = getEmbedding("A fox jumps");
auto results = db.vectorSearch("documents", queryVector, 5);

// Results format (JSON):
// [
// {"id": "abc123", "score": 0.95, "metadata": {"text": "The quick brown fox", ...}},
// {"id": "def456", "score": 0.87, "metadata": {"text": "A red fox runs", ...}},
// ...
// ]

// Filtered search - only "animals" category
auto filtered = db.vectorSearchFiltered("documents", queryVector, 5,
R"({"category": "animals"})");

// Multiple filter conditions
auto multiFilter = db.vectorSearchFiltered("documents", queryVector, 10,
R"({"category": "animals", "page": 5})");

Vector CRUD Operations

// Get a vector by ID
std::string getVector(const std::string& collection, const std::string& vectorId);

// Delete a vector
void deleteVector(const std::string& collection, const std::string& vectorId);

// Update vector metadata (vector values cannot be changed)
void updateVectorMetadata(
const std::string& collection,
const std::string& vectorId,
const std::string& metadata
);

Example:

// Get vector details
auto vectorData = db.getVector("documents", vectorId);
// Returns: {"id": "abc123", "vector": [0.1, 0.2, ...], "metadata": {...}}

// Update metadata
db.updateVectorMetadata("documents", vectorId, R"({
"text": "The quick brown fox",
"reviewed": true,
"timestamp": "2024-01-15T10:30:00Z"
})");

// Delete vector
db.deleteVector("documents", vectorId);

Collection Statistics

// Get collection stats
std::string vectorStats(const std::string& collection);

// Count vectors
int vectorCount(const std::string& collection);

Example:

auto stats = db.vectorStats("documents");
// Returns: {"vector_count": 1000, "dimensions": 384, "distance": "cosine"}

int count = db.vectorCount("documents");
std::cout << "Total vectors: " << count << std::endl;

Complete Vector Search Example

Here's a complete example demonstrating all vector database features:

#include "keradb.hpp"
#include <iostream>
#include <vector>
#include <cmath>

// Helper: Normalize vector for cosine similarity
std::vector<float> normalize(const std::vector<float>& v) {
float norm = 0.0f;
for (float x : v) norm += x * x;
norm = std::sqrt(norm);

std::vector<float> result(v.size());
for (size_t i = 0; i < v.size(); ++i) {
result[i] = v[i] / norm;
}
return result;
}

int main() {
try {
std::cout << "🚀 KeraDB Vector Database Example\n" << std::endl;

// Create database
auto db = keradb::Database::create("vector_demo.ndb");

// Create vector collection (4D for demo, use 384+ for real embeddings)
keradb::Database::VectorConfig config(4, keradb::Database::Distance::Cosine);
db.createVectorCollection("embeddings", config);
std::cout << "✅ Created vector collection" << std::endl;

// Insert vectors representing directions
struct Item {
std::vector<float> vec;
std::string meta;
std::string name;
};

std::vector<Item> items = {
{{1.0f, 0.0f, 0.0f, 0.0f}, R"({"type": "direction", "name": "north"})", "north"},
{{0.0f, 1.0f, 0.0f, 0.0f}, R"({"type": "direction", "name": "east"})", "east"},
{{0.7f, 0.7f, 0.0f, 0.0f}, R"({"type": "direction", "name": "northeast"})", "northeast"},
{{0.5f, 0.5f, 0.5f, 0.5f}, R"({"type": "mixed", "name": "center"})", "center"},
{{-1.0f, 0.0f, 0.0f, 0.0f}, R"({"type": "direction", "name": "south"})", "south"},
};

std::vector<std::string> ids;
std::cout << "\n📥 Inserting vectors..." << std::endl;
for (const auto& item : items) {
auto id = db.insertVector("embeddings", item.vec, item.meta);
ids.push_back(id);
std::cout << " • " << item.name << " -> " << id << std::endl;
}

// Similarity search
std::cout << "\n🔍 Searching for vectors similar to [0.8, 0.6, 0, 0]..." << std::endl;
std::vector<float> query = {0.8f, 0.6f, 0.0f, 0.0f};
auto results = db.vectorSearch("embeddings", query, 3);
std::cout << "Results: " << results << std::endl;

// Filtered search
std::cout << "\n🔍 Filtered search (type=direction only)..." << std::endl;
auto filtered = db.vectorSearchFiltered("embeddings", query, 3,
R"({"type": "direction"})");
std::cout << "Filtered: " << filtered << std::endl;

// Get stats
std::cout << "\n📊 Collection stats:" << std::endl;
std::cout << db.vectorStats("embeddings") << std::endl;
std::cout << "Vector count: " << db.vectorCount("embeddings") << std::endl;

// Update metadata
std::cout << "\n✏️ Updating metadata..." << std::endl;
db.updateVectorMetadata("embeddings", ids[0],
R"({"type": "direction", "name": "north", "primary": true})");
std::cout << "Updated: " << db.getVector("embeddings", ids[0]) << std::endl;

// Delete a vector
std::cout << "\n🗑️ Deleting vector..." << std::endl;
db.deleteVector("embeddings", ids.back());
std::cout << "New count: " << db.vectorCount("embeddings") << std::endl;

// Cleanup
db.sync();
db.close();

std::cout << "\n✨ Done!" << std::endl;

} catch (const keradb::Exception& e) {
std::cerr << "❌ Error: " << e.what() << std::endl;
return 1;
}

return 0;
}

Use Cases

// Store document embeddings
for (const auto& doc : documents) {
auto embedding = getEmbedding(doc.content); // Your embedding model
db.insertVector("docs", embedding, R"({
"title": ")" + doc.title + R"(",
"path": ")" + doc.path + R"("
})");
}

// Search
auto queryEmb = getEmbedding("How to configure logging?");
auto results = db.vectorSearch("docs", queryEmb, 5);

2. RAG (Retrieval-Augmented Generation)

// Retrieve context for LLM
std::string getContext(const std::string& question) {
auto queryEmb = getEmbedding(question);
auto results = db.vectorSearch("knowledge_base", queryEmb, 3);

// Parse results and concatenate relevant text
// ...
return context;
}

// Use with LLM
std::string answer = llm.generate(
"Context: " + getContext(userQuestion) +
"\nQuestion: " + userQuestion
);

3. Image Similarity

// Store image features
for (const auto& image : images) {
auto features = extractFeatures(image); // CNN feature extraction
db.insertVector("images", features, R"({
"path": ")" + image.path + R"(",
"tags": [")" + join(image.tags, "\", \"") + R"("]
})");
}

// Find similar images
auto queryFeatures = extractFeatures(queryImage);
auto similar = db.vectorSearch("images", queryFeatures, 10);

4. Recommendation System

// User-item embeddings
db.insertVector("users", userEmbedding, R"({"user_id": "user123"})");
db.insertVector("items", itemEmbedding, R"({"item_id": "item456", "name": "Product X"})");

// Find items similar to user's taste
auto recommendations = db.vectorSearch("items", userEmbedding, 20);

Best Practices

Performance Tips
  1. Normalize vectors before inserting when using cosine distance
  2. Use appropriate dimensions - higher isn't always better
  3. Batch inserts when loading large datasets
  4. Index frequently searched collections for better performance
Memory Considerations
  • Each vector uses dimensions × 4 bytes of memory
  • 1 million vectors at 384 dimensions ≈ 1.5 GB
  • Consider using smaller embedding models for large datasets
Choosing Distance Metric
  • Cosine: Best for text embeddings (OpenAI, Sentence Transformers)
  • Euclidean: Best for image features, spatial data
  • Dot Product: Use when vectors are pre-normalized
  • Manhattan: Good for sparse, high-dimensional data

Performance Benchmarks

The C++ SDK includes benchmarks comparing KeraDB vs SQLite performance.

Running Benchmarks

Windows (PowerShell):

cd sdks/cpp
.\build-benchmark.ps1 -Run

Benchmark Results

Tested on Windows 11, Intel Core i7, Release build

Small Dataset (100 records)

OperationKeraDB (ms)SQLite (ms)Speedup
INSERT3.0544.0614.5x faster
READ BY ID0.271.003.7x faster
COUNT (100x)0.020.9446x faster
UPDATE1.5233.3622x faster
DELETE0.7545.6261x faster

Large Dataset (10,000 records)

OperationKeraDB (ms)SQLite (ms)Speedup
READ BY ID6.9911.301.62x faster
COUNT (100x)0.011.02137x faster
UPDATE7.3937.695.1x faster
DELETE7.8936.994.7x faster

Performance Summary

KeraDB excels at:

  • COUNT operations - 46-137x faster (in-memory metadata)
  • Single document lookups - 1.6-3.7x faster
  • Updates - 5-22x faster
  • Deletes - 4.5-61x faster

SQLite excels at:

  • Bulk inserts at large scale
  • Table scans (READ ALL)

Building from Source

Prerequisites

  • CMake 3.10+
  • C++11 compatible compiler
  • KeraDB library (build with cargo build --release)

Build Steps

# Build KeraDB library first
cd keradb
cargo build --release

# Build C++ SDK
cd ../sdks/cpp
mkdir build && cd build
cmake ..
make

# Run examples
./basic_example
./basic_c_example

Platform Support

PlatformCompilerStatus
WindowsMSVC 2019+✅ Supported
WindowsMinGW-w64✅ Supported
LinuxGCC 7+✅ Supported
LinuxClang 6+✅ Supported
macOSClang✅ Supported

Next Steps