Hi, I'm Sanket Nyayadhish

Senior AI Engineer | LLMs × GenAI × Backend × Cloud Infrastructure

Building production LLM systems, scalable backends, and AI-optimized infrastructure

LLMs
GenAI
Backend
Cloud

About Me

I'm a senior engineer passionate about building production-grade AI systems at the intersection of LLMs, distributed backends, and cloud-native infrastructure. My work focuses on turning cutting-edge AI research into reliable, scalable systems that solve real-world problems.

I believe in learning by building, sharing knowledge openly, and pushing the boundaries of what's possible with modern AI technology.

Production AI

Building scalable LLM systems that work in the real world

Innovation

Implementing cutting-edge research in production

Open Source

Contributing to and building open-source projects

Core Competencies

LLMs & GenAI

  • Advanced RAG Systems
  • Fine-tuning (LoRA/QLoRA, RLHF, DPO)
  • Prompt Engineering Patterns
  • Multi-agent Orchestration
  • Vector Databases

AI/ML Stack

PyTorch Transformers LangChain vLLM PEFT TRL LlamaIndex MLflow

Backend & Services

  • Go (Microservices, gRPC)
  • Python (FastAPI, async)
  • Event-Driven Architecture
  • REST, gRPC, GraphQL
  • Observability Tools

AI Infrastructure

Kubernetes Terraform Docker ArgoCD KServe Ray GPU Optimization

Supported Models & Frameworks

Proprietary LLMs

GPT-4, GPT-4 Turbo, Claude 3, Gemini Pro

Open Source LLMs

Llama 3/3.1, Mistral, Phi-3, Qwen 2, Gemma 2

Embedding Models

OpenAI Ada-002, Cohere, BGE, E5

Inference Engines

vLLM, TGI, ONNX Runtime, Triton

Featured Projects

Enterprise RAG System

Production-grade Retrieval-Augmented Generation (RAG) platform with advanced chunking, hybrid search, and multi-LLM support.

  • Hybrid retrieval with reranking
  • Multi-LLM router (OpenAI, Anthropic, local)
  • Streaming responses with citations
  • RAG evaluation framework (RAGAS)
Python FastAPI LangChain Weaviate vLLM
View Project
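
A minimal sketch of the hybrid-retrieval idea behind this project: two independent rankings are fused with Reciprocal Rank Fusion before reranking. The scoring functions and sample documents below are toy stand-ins I made up for illustration; the actual platform delegates retrieval to Weaviate and uses a dedicated reranker.

```python
# Toy hybrid retrieval: fuse a keyword ranking and a "dense" ranking with
# Reciprocal Rank Fusion (RRF). Both scorers are deliberately simplistic
# placeholders for BM25 and embedding similarity.
from collections import defaultdict

DOCS = {
    "d1": "LoRA fine-tuning reduces trainable parameters for large language models.",
    "d2": "Hybrid search combines BM25 keyword matching with dense vector similarity.",
    "d3": "vLLM serves open-source LLMs with paged attention for high throughput.",
}

def keyword_rank(query: str) -> list[str]:
    """Rank documents by naive keyword overlap (stand-in for BM25)."""
    q = set(query.lower().split())
    scores = {doc_id: len(q & set(text.lower().split())) for doc_id, text in DOCS.items()}
    return sorted(scores, key=scores.get, reverse=True)

def dense_rank(query: str) -> list[str]:
    """Rank documents by character-trigram overlap (stand-in for embedding similarity)."""
    grams = lambda s: {s[i:i + 3] for i in range(len(s) - 2)}
    q = grams(query.lower())
    scores = {doc_id: len(q & grams(text.lower())) for doc_id, text in DOCS.items()}
    return sorted(scores, key=scores.get, reverse=True)

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Each document scores sum(1 / (k + rank)) across all input rankings."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

query = "hybrid keyword and vector search"
print(reciprocal_rank_fusion([keyword_rank(query), dense_rank(query)]))
```

The fused list would then be passed to a cross-encoder reranker and on to the selected LLM with citations attached.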

Autonomous LLM Agents

Multi-agent system demonstrating agentic AI patterns with tool use, planning, memory, and orchestration.

  • ReAct pattern with self-reflection
  • Dynamic tool registry
  • Multi-tier memory system
  • Sandboxed execution environment
Python LangGraph GPT-4 Claude Weaviate
View Project
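
To show the shape of the agent loop, here is a stripped-down ReAct-style sketch: the model alternates Thought / Action / Observation until it emits a final answer. The `call_llm` stub and tool names are placeholders of my own; the real project drives this loop with LangGraph, GPT-4/Claude, and a dynamic tool registry.

```python
# Minimal ReAct-style loop with a tiny tool registry. `call_llm` is a canned
# stand-in for a chat-completion call that returns a structured action.
TOOLS = {
    "search_docs": lambda q: f"3 documents matched '{q}'",
}

def call_llm(transcript: str) -> dict:
    """Placeholder LLM: first asks for a search, then answers."""
    if "Observation:" not in transcript:
        return {"thought": "I should look this up.", "action": "search_docs",
                "input": "vLLM paged attention"}
    return {"thought": "I have enough context.",
            "final_answer": "vLLM uses paged attention to batch requests efficiently."}

def run_agent(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = call_llm(transcript)
        transcript += f"\nThought: {step['thought']}"
        if "final_answer" in step:                          # model decided to stop
            return step["final_answer"]
        observation = TOOLS[step["action"]](step["input"])  # execute the chosen tool
        transcript += f"\nAction: {step['action']}({step['input']})\nObservation: {observation}"
    return "Stopped: step budget exhausted."

print(run_agent("How does vLLM achieve high throughput?"))
```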

LLM Fine-tuning Platform

End-to-end platform for fine-tuning open-source LLMs with experiment tracking, evaluation, and deployment.

  • LoRA/QLoRA for Llama 3, Mistral, Phi
  • Comprehensive evaluation suite
  • MLflow + W&B tracking
  • Automatic deployment to vLLM
PyTorch Transformers PEFT MLflow vLLM
View Project
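
For context on the LoRA/QLoRA workflow, this is a small illustrative setup with Hugging Face PEFT; the model name and hyperparameters are examples, not the platform's defaults.

```python
# Illustrative LoRA configuration with PEFT. Loading the Llama 3 checkpoint
# requires accepting its license on the Hugging Face Hub.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

lora_cfg = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()         # typically well under 1% of the base weights
```

The adapted model is then trained as usual, logged to MLflow/W&B, and the merged or adapter weights are pushed to a vLLM endpoint.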

AI Gateway Microservices

High-performance Go backend for LLM service routing, request management, and observability.

  • Multi-provider routing with failover
  • Semantic caching for cost reduction
  • Token-based rate limiting
  • Sub-10ms p99 latency
Go gRPC Redis PostgreSQL Prometheus
View Project
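
Token-based rate limiting here means budgeting LLM tokens per client, not just request counts. The production service implements this in Go on top of Redis; the snippet below is only a single-process Python illustration of the token-bucket concept, with made-up capacities and API keys.

```python
# Concept sketch: each API key holds a bucket of LLM tokens that refills at a
# fixed rate, and a request is admitted only if its estimated token cost fits.
import time
from dataclasses import dataclass, field

@dataclass
class TokenBucket:
    capacity: int                      # max LLM tokens the bucket can hold
    refill_per_sec: float              # tokens restored per second
    tokens: float = 0.0
    updated_at: float = field(default_factory=time.monotonic)

    def __post_init__(self):
        self.tokens = float(self.capacity)

    def allow(self, cost: int) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated_at) * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost        # admit and charge the request
            return True
        return False                   # reject: caller should return HTTP 429

buckets = {"api-key-123": TokenBucket(capacity=10_000, refill_per_sec=100.0)}
print(buckets["api-key-123"].allow(cost=750))    # True: plenty of budget left
print(buckets["api-key-123"].allow(cost=9_500))  # False: would exceed remaining budget
```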

GPU Kubernetes Platform

Production Kubernetes infrastructure optimized for GPU workloads and model serving with GitOps.

  • GPU node pools (T4, A10G, A100)
  • KServe + vLLM operator
  • Ray for distributed training
  • GitOps with ArgoCD
Terraform Kubernetes Helm ArgoCD KServe
View Project

Current Focus

Building agentic workflows with multi-agent collaboration

Optimizing RAG systems for production (latency, cost, quality)

Exploring RLHF and DPO for LLM alignment

GPU cost optimization strategies for model serving

Responsible AI practices and bias mitigation

Let's Connect!

I'm always interested in connecting with fellow engineers, researchers, and builders in the AI space.

Open to:

Collaboration Technical Discussions Mentorship Open Source