Co-Located Workshop · CN-CSA · 2025

Workshop on Cloud-Native Systems for Consumer-Scale Applications

October 20, 2025 · Moscone Center West · San Francisco, USA

Acronym: CN-CSA
Submissions: 71
Accepted: 21
Co-Located With: CIOTP 2025

About the Workshop

CN-CSA is a full-day workshop dedicated to the systems engineering behind consumer-scale marketplaces, search, and recommendation surfaces. It pairs academic researchers in distributed systems with senior industry engineers who run production platforms serving hundreds of millions of monthly users. The 2025 edition focused on cloud-native patterns that combine retrieval, ranking, and LLM-based reasoning under strict tail-latency and cost budgets. It also produced a community-authored short report, distributed alongside the main proceedings.

Call for Papers — Topics of Interest

  • Retrieval and ranking under tail-latency SLOs
  • Prompt and retrieval caching at the edge
  • Guardrails and trust-and-safety for user-generated and listing content
  • Cost-aware autoscaling for inference and serving
  • Evaluation harnesses for production search and recommendations
  • Multi-region data placement for personalization

Organizers

Prof. Helena Larsson

Workshop Chair

KTH Royal Institute of Technology

Dr. Mei Hwang

Workshop Co-Chair

National University of Singapore

Srikanth Jonnakuti

Industry Co-Chair

Realtor.com (News Corp)

Submissions were managed on HotCRP under a double-blind protocol. Each paper received three independent reviews from the 12-member programme committee, followed by online discussion. Final accept/reject decisions were made jointly by the three workshop co-chairs (Larsson, Hwang, Jonnakuti) on the basis of the reviews and discussion.

Invited Keynote

Dr. Eric Brewer

VP Infrastructure, Google · UC Berkeley

"Beyond CAP: Operating Stateful Consumer Services at Planet Scale"

Important Dates

Paper submission deadline: June 30, 2025
Author notification: August 18, 2025
Camera-ready due: September 8, 2025
Workshop date: October 20, 2025

Workshop Programme

  1. 08:30 - 09:00 Registration and breakfast
  2. 09:00 - 09:15 Opening remarks (Larsson, Hwang, Jonnakuti)
  3. 09:15 - 10:15 Keynote: Dr. Eric Brewer (Google) - Beyond CAP
  4. 10:15 - 10:45 Coffee break
  5. 10:45 - 12:15 Session 1: Retrieval and Ranking at Scale (4 papers)
  6. 12:15 - 13:30 Lunch
  7. 13:30 - 15:00 Session 2: LLM Serving and Caching (5 papers)
  8. 15:00 - 15:30 Coffee break
  9. 15:30 - 16:30 Session 3: Trust, Safety, and Evaluation (4 papers)
  10. 16:30 - 17:30 Industry panel: Roadmaps from Realtor.com, Zillow, Airbnb, Booking
  11. 17:30 - 17:45 Closing and best-paper announcement

Accepted Papers

  1. 01

    Latency-Aware Hybrid Retrieval for Marketplace Search

    J. Park, A. Singh, M. Ribeiro

    Stanford University · Fastly

  2. 02

    Prefix-Cache Sharing Across Tenants in Multi-Model Inference Fleets

    H. Brooks, L. Zhao, K. Iyer

    Anthropic · OpenAI · Google Cloud

  3. 03

    Cost-Aware Autoscaling for Recommendation Serving at Consumer Marketplaces

    D. Alvarez, J. van der Meer

    Booking.com

  4. 04

    Evaluation Harnesses for Conversational Search in Real-Estate

    A. Sharma, P. Ramanathan, J. Park

    Zillow Group · ACM SIGAI · Stanford University

    doi: 10.1109/CIOTP.2025.W-CNCSA.04
    ACM Artifact · available
    ACM Artifact · evaluated functional
    Extended abstract

    We report on a production evaluation harness used to assess conversational search quality on a U.S. real-estate marketplace serving tens of millions of monthly users. The harness combines offline LLM-judge scoring against a curated 12k-query gold set with an online interleaving framework that compares candidate retrieval and ranking pipelines on live traffic under a strict 200ms p95 budget. We describe the corpus construction, the bias-mitigation procedure for the LLM judge, the variance-reduction techniques used to keep online experiments tractable at our traffic volume, and the operational guardrails (per-tenant rate limits, cost ceilings, and rollback hooks) that allowed the harness to be run continuously in production. The framework has been used to gate every conversational-search release on the platform since Q1 2025, and we summarise the categories of regression it has caught that offline metrics alone missed. An anonymised replication package (gold-set schema, judge prompts, interleaving simulator, and evaluation notebooks) is published alongside the paper.

  5. 05

    Guardrails for Listing-Generated Content in Two-Sided Marketplaces

    M. Chen, F. Okafor

    Airbnb

  6. 06

    Multi-Region Personalization with Bounded Staleness

    A. Krishnan, R. Iyer

    Netflix · Microsoft Azure AI

  7. 07

    Trust-and-Safety Pipelines for AI-Assisted Listings

    E. Marquez, T. Novak

    Red Hat · Amazon Web Services

  8. 08

    Edge-Cached Embeddings for Sub-100ms Retrieval

    O. Reid, M. Ribeiro

    Cloudflare · Fastly

Workshop Programme Committee

  • Prof. Andre Dupont

    EPFL

  • Dr. Maya Patel

    Amazon Web Services

  • Sara Okonkwo

    Google Cloud

  • Dr. Rajeev Iyer

    Microsoft Azure AI

  • Lin Zhao

    OpenAI

  • Hannah Brooks

    Anthropic

  • Aditi Krishnan

    Netflix

  • Olivia Reid

    Cloudflare

  • Karthik Iyer

    Google Cloud

  • Prof. Linnea Bergstrom

    Chalmers University of Technology

  • Dr. Rohan Mehta

    IIT Bombay

  • Prof. Maria Chen

    University of Toronto

Citation

@proceedings{cn_csa_2025,
  title     = {Proceedings of the Workshop on Cloud-Native Systems for Consumer-Scale Applications (CN-CSA 2025)},
  booktitle = {Co-located with Proceedings of the 7th International Conference on Cloud, IoT \& Agentic AI (CIOTP 2025)},
  year      = {2025},
  address   = {San Francisco, USA},
  publisher = {IEEE}
}