Hi, I'm Niko
Full-stack engineer building agent platforms — realtime voice agents, agent harnesses, and the infrastructure that lets AI drive real products.

About

I'm a founding team member at Cookiy AI, a realtime user-research platform (eight-figure USD funding; 50k+ respondents and 700+ studies in production). I co-authored AOI (arXiv 2026), a multi-agent framework for autonomous cloud diagnosis.

Work Experience

Skills

TypeScript
Node.js
NestJS
React
Next.js
Python
PostgreSQL
Prisma
Redis
LiveKit
MCP
Docker
GCP / AWS
My Projects

Check out my latest work

I've worked on a range of projects, from realtime voice agents and agent runtimes to small open-source developer tools. Here are a few of my favorites.

Realtime Voice Interview Agent

Production voice agent that runs full user-research interviews. Cascaded STT→LLM→TTS streaming with dynamic TTS chunking, VAD + multilingual turn detection, barge-in and silence recovery — end-to-end first response under one second. Screen-share vision lets the agent watch, pause capture, and probe on-screen behavior in realtime.
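
To illustrate the TTS chunking idea, here is a minimal sketch, not the production code. It assumes one plausible policy: the first chunk may break at any clause boundary so speech can start early, and later chunks break only at sentence ends. The `TtsChunker` class and its method names are hypothetical.

```typescript
// Hypothetical sketch of chunking streamed LLM text for TTS.
// Assumed policy: flush the first chunk at the first clause boundary (lower
// time-to-first-audio), then flush only at sentence boundaries (better prosody).
class TtsChunker {
  private buffer = "";
  private emittedCount = 0;

  // Feed one streamed text delta; returns zero or more chunks ready for TTS.
  push(delta: string): string[] {
    this.buffer += delta;
    const chunks: string[] = [];
    while (true) {
      // A boundary is punctuation followed by whitespace, so "3.5" or a
      // sentence that is still being streamed is never cut in the middle.
      const boundary = this.emittedCount === 0 ? /[.!?,;:]\s/ : /[.!?]\s/;
      const match = boundary.exec(this.buffer);
      if (match === null) break;
      const end = match.index + 1; // keep the punctuation mark in the chunk
      chunks.push(this.buffer.slice(0, end).trim());
      this.buffer = this.buffer.slice(end).replace(/^\s+/, "");
      this.emittedCount++;
    }
    return chunks;
  }

  // Call when the LLM stream ends to emit whatever text is left.
  flush(): string | null {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? rest : null;
  }
}
```

Each returned chunk would be handed to the TTS stream as soon as it is available, while the LLM keeps generating the rest of the reply.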

Python
LiveKit Agents
GCP Cloud Run
ffmpeg
WebRTC

Agent Runtime Harness & cookiy-agent

A pluggable agent runtime built from scratch — single event loop (AgentLoop), unified event stream (AgentEvent), and three swappable contracts (Tool / LlmProvider / SessionStore); no LangChain. On top of it: a Slack-native research agent that parses natural language into tool calls via Gemini function-calling and executes them inside ephemeral E2B sandboxes, with webhook-driven session resume for long tasks.
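
The shape of such a runtime can be sketched as follows. Only the names `AgentLoop`, `AgentEvent`, `Tool`, `LlmProvider`, and `SessionStore` come from the description above; every field, method signature, and the `LlmReply` type are assumptions made for illustration, not the real interfaces.

```typescript
// Hypothetical shapes for the three swappable contracts and the event stream.
type AgentEvent =
  | { type: "llm_text"; text: string }
  | { type: "tool_call"; name: string; args: unknown }
  | { type: "tool_result"; name: string; result: unknown }
  | { type: "done" };

type LlmReply =
  | { kind: "text"; text: string }
  | { kind: "tool_call"; name: string; args: unknown };

interface Tool {
  name: string;
  run(args: unknown): Promise<unknown>;
}

interface LlmProvider {
  next(history: AgentEvent[]): Promise<LlmReply>;
}

interface SessionStore {
  load(sessionId: string): Promise<AgentEvent[]>;
  save(sessionId: string, events: AgentEvent[]): Promise<void>;
}

class AgentLoop {
  private llm: LlmProvider;
  private tools = new Map<string, Tool>();
  private store: SessionStore;

  constructor(llm: LlmProvider, tools: Tool[], store: SessionStore) {
    this.llm = llm;
    this.store = store;
    for (const tool of tools) this.tools.set(tool.name, tool);
  }

  // Single loop: ask the LLM, run any requested tool, repeat until the LLM
  // answers with plain text or the step budget runs out.
  async run(sessionId: string, emit: (event: AgentEvent) => void, maxSteps = 8): Promise<void> {
    const history = await this.store.load(sessionId);
    const record = (event: AgentEvent): void => {
      history.push(event);
      emit(event);
    };
    for (let step = 0; step < maxSteps; step++) {
      const reply = await this.llm.next(history);
      if (reply.kind === "text") {
        record({ type: "llm_text", text: reply.text });
        break;
      }
      record({ type: "tool_call", name: reply.name, args: reply.args });
      const tool = this.tools.get(reply.name);
      const result = tool ? await tool.run(reply.args) : { error: "unknown tool" };
      record({ type: "tool_result", name: reply.name, result });
    }
    // Persisting the history is what would let a later webhook resume the session.
    await this.store.save(sessionId, history);
    emit({ type: "done" });
  }
}
```

Because the loop depends only on the three interfaces, swapping the model vendor, the tool set, or the persistence layer would not touch `AgentLoop` itself.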

TypeScript
Gemini
E2B
Slack
MCP

Cookiy MCP Server & LLM Gateway

Wrapped the entire SaaS as an agent-callable MCP server: 30 decorator-registered tools over SSE / StreamableHTTP / stdio, approved through the official OpenAI Apps review. Unified agent & CLI auth with RFC 8414/9728 OAuth discovery and stateless compact tokens (~75% smaller than JWT). All LLM traffic is governed through Cloudflare AI Gateway: multi-model routing, rate limiting, cost observability, and a ~86% prompt-cache hit rate across 759K+ calls.

NestJS
MCP
OAuth 2.1
Cloudflare AI Gateway
TypeScript

Recruit Platform (0→1)

Built Cookiy's participant-recruitment vertical solo: adapter + runtime-registry pattern normalizing Prolific / CloudResearch / CINT into one domain model (new supplier ≈ one adapter), a composable screener engine (piped questions, min-selections, 3-state screen-in), and a concurrency-safe matching & billing engine — DB transactions, conditional atomic updates, idempotent wallet dual-write with drift=0. 50k+ respondents and 700+ studies served in production.
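
The adapter + runtime-registry pattern can be sketched roughly like this. The `Participant` model, the adapter interface, and especially the raw payload field names are invented for illustration; they are not the real Prolific or CINT schemas, and the production domain model is certainly richer.

```typescript
// Hypothetical domain model shared by every supplier.
interface Participant {
  supplier: string;
  externalId: string;
  country: string;
}

// One adapter per supplier translates that supplier's payload into the model.
interface SupplierAdapter {
  readonly supplier: string;
  toParticipant(raw: Record<string, unknown>): Participant;
}

// Adapters are looked up by name at runtime, so callers never branch on supplier.
class AdapterRegistry {
  private adapters = new Map<string, SupplierAdapter>();

  register(adapter: SupplierAdapter): void {
    if (this.adapters.has(adapter.supplier)) {
      throw new Error(`duplicate adapter: ${adapter.supplier}`);
    }
    this.adapters.set(adapter.supplier, adapter);
  }

  normalize(supplier: string, raw: Record<string, unknown>): Participant {
    const adapter = this.adapters.get(supplier);
    if (!adapter) throw new Error(`no adapter registered for ${supplier}`);
    return adapter.toParticipant(raw);
  }
}

// Made-up payload shapes; the field names below are placeholders.
const prolificAdapter: SupplierAdapter = {
  supplier: "prolific",
  toParticipant: (raw) => ({
    supplier: "prolific",
    externalId: String(raw["participant_id"]),
    country: String(raw["country"]),
  }),
};

const cintAdapter: SupplierAdapter = {
  supplier: "cint",
  toParticipant: (raw) => ({
    supplier: "cint",
    externalId: String(raw["respondentId"]),
    country: String(raw["countryCode"]),
  }),
};

const registry = new AdapterRegistry();
registry.register(prolificAdapter);
registry.register(cintAdapter);
```

Under this layout, onboarding another supplier means writing one more object that satisfies `SupplierAdapter` and registering it; matching and billing code keep working against `Participant` only.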

NestJS
Prisma
PostgreSQL
Redis
React

Claude Code Usage Dashboard

I'm a heavy user of AI coding tools — this dashboard makes it measurable. A GitHub Actions pipeline auto-syncs my daily Claude Code token usage to a public page: ~2B tokens and $1.6k+ API-equivalent cost in a single month, broken down by input/output/cache and per-day cost.

GitHub Actions
TypeScript
Automation

ard-registry

An ard-spec v0.9 conformant Agentic Resource Discovery (ARD) registry in TypeScript — crawls ai-catalog.json, BM25 search, federation between registries, self-publishes via .well-known. Verified with the official conformance CLI.
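
For readers unfamiliar with BM25, here is a small self-contained scorer. It is a sketch of the standard formula (using the common `ln(1 + ...)` IDF variant), not the registry's actual search code; the tokenizer, the default parameters `k1 = 1.2` and `b = 0.75`, and the function names are assumptions.

```typescript
// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Returns one BM25 score per document, in the same order as `docs`.
function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const n = docs.length;
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / Math.max(n, 1);
  const queryTerms = Array.from(new Set(tokenize(query)));

  // Document frequency: in how many documents each query term appears.
  const docFreq = new Map<string, number>();
  for (const term of queryTerms) {
    docFreq.set(term, tokenized.filter((d) => d.indexOf(term) >= 0).length);
  }

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = docFreq.get(term) ?? 0;
      // Rare terms get a higher inverse document frequency.
      const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
      // Term frequency saturates via k1 and is normalized by document length via b.
      const norm = tf + k1 * (1 - b + b * (doc.length / avgLen));
      score += (idf * tf * (k1 + 1)) / norm;
    }
    return score;
  });
}
```

Calling `bm25Scores("mcp registry", descriptions)` over a list of catalog descriptions yields scores that can be sorted in descending order to rank results; documents sharing no term with the query score exactly 0.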

TypeScript
Node.js
BM25

obsidian-image-uploader

Obsidian plugin that auto-uploads pasted images to a GitHub repo and rewrites links to permanent jsDelivr CDN URLs — keeping vaults lightweight while notes stay portable.
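
The link-rewriting step might look roughly like the sketch below. The jsDelivr URL layout (`cdn.jsdelivr.net/gh/<owner>/<repo>@<ref>/<path>`) is the CDN's documented format for GitHub files; the function names, the choice to handle only `![[...]]` embeds, and the `uploaded` map are assumptions, and the upload itself is omitted.

```typescript
// Build a jsDelivr URL for a file in a GitHub repo. Pinning `ref` to a tag or
// commit SHA (rather than a branch) is what would keep the link stable.
function jsDelivrUrl(owner: string, repo: string, ref: string, path: string): string {
  const encodedPath = path.split("/").map(encodeURIComponent).join("/");
  return `https://cdn.jsdelivr.net/gh/${owner}/${repo}@${ref}/${encodedPath}`;
}

// Replace Obsidian image embeds such as ![[Pasted image 1.png]] or
// ![[shot.png|300]] with standard Markdown image links. `uploaded` maps a
// local file name to its CDN URL; embeds that are not in the map are left as-is.
function rewriteImageLinks(markdown: string, uploaded: Map<string, string>): string {
  const embed = /!\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g;
  return markdown.replace(embed, (match: string, name: string) => {
    const url = uploaded.get(name.trim());
    return url ? `![](${url})` : match;
  });
}
```

Because the rewritten note contains only a plain Markdown image link, it renders anywhere Markdown does, and the image file no longer needs to live in the vault.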

TypeScript
Obsidian API
GitHub API
Research

Research & Publications

Alongside product engineering, I work on research about agentic systems — how agents fail, and how those failures become training signals.

AOI: Turning Failed Trajectories into Training Signals for Autonomous Cloud Diagnosis

Under review at SIGKDD 2026 · arXiv preprint · cs.LG / cs.AI

Co-authored a multi-agent framework for automating SRE diagnosis with LLMs: distills expert knowledge into open-source models via GRPO, isolates execution behind a read-write separated architecture, and converts failed operation attempts into training signals — 66.3% success on benchmark tasks, with failure analysis alone contributing +4.8pp.

Contact

Get in Touch

Want to chat? Send me a DM on X (Twitter) with a direct question and I'll respond when I can. I ignore all solicitations.

GitHub
X