Sergey Tovpeko — AI / LLM / MLOps

MLOps / LLM Infrastructure Engineer

Experience

3 years 5 months

Tech Stack

vLLM · Kubernetes/k3s · Python · FastAPI · Rust · Docker · RAG · Ollama · CUDA · Prometheus · pgvector · Go

Work Experience

Mid-level MLOps Engineer

Apr 2025 – May 2026

Designed and deployed a production-ready, fault-tolerant architecture for high-load LLM inference on vLLM and Ollama.
Built MLOps processes from scratch: CI/CD pipelines for delivering ML models to Kubernetes (k3s), observability via Prometheus/Grafana.
Implemented a RAG system on Qdrant/pgvector for semantic search to improve LLM answer accuracy.

DevOps Engineer

Feb 2024 – Apr 2025

Created load testing tooling: CLI wrappers and traffic generators in Go and Python.
Designed external dependency mocking for integration tests in isolated Docker environments.
Redesigned monitoring and configured business metric collection in VictoriaMetrics/ClickHouse.

ETL Programmer

Jul 2023 – Feb 2024

Developed the corporate data bus and optimized Apache Airflow DAGs.
Integrated PostgreSQL, MariaDB, and MinIO, and deployed the ELK Stack for logging.

Programmer

Dec 2022 – Jun 2023

Deployed infrastructure in Yandex Cloud for delivering advertising content to a restaurant network.
Developed server modules in C# and Python (FastAPI) and plugins for iiko POS systems using the iiko SDK.

Projects

Desktop GIS & Telemetry

Designed a desktop application for real-time spatial data monitoring and visualization.
Implemented trajectory analysis algorithms and a streaming telemetry architecture.

AI Voice Assistant Backend

Developed a Python/FastAPI backend for streaming voice data processing and LLM interaction.
Configured low-latency NLP pipelines and prototyped short-term and long-term memory for agents.