Vasan Ramesh
Software Engineer & AI Systems
Building evaluation frameworks for frontier AI models at Meta, and the large-scale infrastructure underneath them.
About
Eight-plus years across AI systems, large-scale distributed infrastructure, and machine-learning pipelines. Currently building evaluation frameworks and benchmark systems for supervised fine-tuning of frontier models, including the coding harness behind Meta’s Muse Spark.
Before that: payments at Walmart scale, performance engineering at Ancestry, and data-pipeline infrastructure at Sprinklr.
Off the clock , , on calm water, and one more game of . Wololo.
Experience
-
Evaluation frameworks and SFT benchmark systems for Meta’s frontier models; building MetaCode, the coding harness for Muse Spark. Architected a persistent, shareable memory system for internal AI agents, with RAG retrieval recall above 90%.
Before that, measurement and data infrastructure across Business AI, Marketing Messages, Commerce Insights, and Ads. Selected work below.
-
Re-architected a payments settlement platform processing $900M a year from monolith to microservices.
Took a $50M/day transaction system from 270 to 4,000+ TPS, at 70% lower cost.
-
Intelligent auto-scaling for Kafka consumers in large-scale ETL pipelines; vector auto-regressive time-series framework for anomaly detection. Employee of the Quarter, Q4 2018.
-
Serverless notification system on Lambda + SQS with a 99.99% delivery rate across four channels.
Selected Work
①Muse Spark
Meta’s frontier AI model, ranked 4th globally on the Artificial Analysis Intelligence Index. Building the SFT evaluation and benchmark systems behind it, and MetaCode, its coding harness.
②Business AI
End-to-end measurement infrastructure for Meta’s business AI agents, quantifying agent impact on real business outcomes and unblocking the product’s general-availability launch.
③Marketing Messages
Led metrics computation and data-processing flows for business messaging. Modeled reporting under privacy constraints, unblocking $6B in incremental revenue.
④Commerce Insights
Consistent cross-channel commerce measurement over data spread across multiple stores: HLL sketches, query optimization, and re-modeled views that cut months off development time.
⑤Ads Measurement
Measurement pipelines at ads scale: graph data-model redesign and Spark optimization, cutting compute by 95% and storage by 96%.
Contact
Working on something interesting in AI, infrastructure, or both?
Let’s talk