We ran real-world load tests across seven different infrastructure plans—from Grid to Dedicated Split—using realistic conversion rates, bot traffic blends, and ERP-driven API imports. The findings were clear: performance scales predictably with resources, but only if your code, cache, and configuration keep up. This blog post walks through key results, why API load is disproportionately expensive, and what metrics matter most.
How well does Shopware actually perform under load? This question comes up a lot, especially when teams are evaluating new infrastructure or preparing for traffic surges. So we decided to test it ourselves.
We ran structured, production-style load tests across seven different Shopware PaaS infrastructure plans, from entry-level Grid configurations to high-capacity Dedicated Split clusters. The goal was not to break the system, but to simulate real e-commerce usage under pressure: browsing, buying, and backend automation, all with realistic conversion rates and caching behaviour. Here’s what we found.
Many performance tests rely on synthetic scenarios, that is, single endpoints hammered at fixed rates. That's not how real storefronts behave. Our test plan instead combined realistic storefront browsing, checkout flows, and backend API automation such as ERP-driven product imports.
We also introduced realistic conversion rates (~3%), bot-to-user ratios, and caching behaviour. Caches were prewarmed using automated crawlers to mirror real-world deployments.
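To make the traffic shaping concrete, here is a minimal sketch of how a session blend like this can be assigned. The ~3% conversion rate is the one used in the tests; the 20% bot share and 7% API share are illustrative assumptions, not figures from our test plan, and `classifySession` is a hypothetical helper.

```javascript
// Sketch: assign each simulated session a role according to a target
// traffic blend. The ~3% conversion rate matches the test plan; the
// 20% bot share and 7% API share are illustrative assumptions.
function classifySession(rand) {
  if (rand < 0.03) return 'buyer';   // completes checkout (~3% conversion)
  if (rand < 0.23) return 'bot';     // crawler/bot traffic (assumed 20%)
  if (rand < 0.30) return 'api';     // ERP/API automation (assumed 7%)
  return 'browser';                  // plain browsing session
}

// Distribute 10,000 simulated sessions across the roles.
const counts = { buyer: 0, bot: 0, api: 0, browser: 0 };
for (let i = 0; i < 10000; i++) {
  counts[classifySession(Math.random())]++;
}
console.log(counts);
```

In a real k6 script the same idea is usually expressed as weighted scenarios rather than a single classifier function, but the blend itself is the part that matters.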
As infrastructure plans increased in CPU and memory, Shopware handled more concurrent traffic while keeping response times low. Even modest plans performed well under tuned conditions. For example, a Dedicated Grid Host (DGH32) handled over 7,000 orders per day with sub-second p95 TTFB.
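A quick back-of-envelope check shows what that order volume implies in traffic. At the ~3% conversion rate used in the tests, 7,000 orders per day translates to roughly 233,000 sessions per day, or about 2.7 sessions per second on average (real traffic peaks well above the average, of course):

```javascript
// Back-of-envelope: what does 7,000 orders/day imply in sessions,
// given the ~3% conversion rate used in the tests?
const ordersPerDay = 7000;
const conversionRate = 0.03;

const sessionsPerDay = ordersPerDay / conversionRate; // ≈ 233,333
const sessionsPerSecond = sessionsPerDay / 86400;     // ≈ 2.7 on average

console.log(Math.round(sessionsPerDay), sessionsPerSecond.toFixed(1));
```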
API traffic (such as product updates or ERP syncs) generated a disproportionate share of backend load. Even though it represented only 5–10% of total traffic, API requests were responsible for cache invalidations and spikes in CPU usage.
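One practical way to tame this is to batch write operations instead of sending one request per product, so each ERP sync triggers far fewer requests and cache invalidations. The sketch below assumes payloads shaped for Shopware 6's Sync API (`/api/_action/sync`); the helper names and batch size are our own, and actually sending the batches is left out:

```javascript
// Sketch: group ERP product updates into batches so a sync run issues a
// handful of Admin API calls instead of one call per product. Fewer
// write requests means fewer cache invalidations.
function chunkUpdates(updates, batchSize = 100) {
  const batches = [];
  for (let i = 0; i < updates.length; i += batchSize) {
    batches.push(updates.slice(i, i + batchSize));
  }
  return batches;
}

// Shape one batch as a single Sync API operation (Shopware 6's
// /api/_action/sync accepts named operations like this).
function toSyncPayload(batch) {
  return {
    'update-products': {
      entity: 'product',
      action: 'upsert',
      payload: batch,
    },
  };
}

// Example: 250 price updates become 3 requests instead of 250.
const updates = Array.from({ length: 250 }, (_, i) => ({
  id: `product-${i}`,
  price: 9.99,
}));
const batches = chunkUpdates(updates);
console.log(batches.length); // 3
```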
Plans with prewarmed Fastly caches maintained consistent response times, while those tested without proper cache priming showed significantly higher latency. Cache fragmentation and poorly structured endpoints led to MISS responses, which immediately degraded throughput.
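The prewarming itself is conceptually simple: crawl the shop's public URLs once so the edge cache holds a fresh copy before real traffic arrives. Here is a deliberately minimal sketch; the simplified sitemap format and regex parsing are ours, and a production crawler would use a proper XML parser, concurrency limits, and rate limiting:

```javascript
// Sketch: pull URLs from a (simplified) sitemap and request each one so
// the edge cache (e.g. Fastly) is warm before real traffic arrives.
function extractSitemapUrls(xml) {
  const urls = [];
  const re = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = re.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Hypothetical warm-up loop (requires Node 18+ global fetch); a real
// crawler would throttle and parallelise this.
async function prewarm(urls) {
  for (const url of urls) {
    await fetch(url, { method: 'GET' }); // populates the edge cache
  }
}

const sample = `<urlset>
  <loc>https://shop.example/product-a</loc>
  <loc>https://shop.example/product-b</loc>
</urlset>`;
console.log(extractSitemapUrls(sample));
```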
In almost every scenario where performance dropped, the issue traced back to code, configuration, or cache invalidation—not the underlying platform. For example, one plugin left in debug mode introduced disk I/O contention that slowed pages to a crawl.
We used Grafana k6 to simulate traffic and tracked metrics such as p95 TTFB, throughput, CPU utilisation, and cache HIT/MISS rates.
Each test ran for five minutes in an isolated, clean environment. Conversion logic and traffic shaping were applied to mirror real storefront behaviour as closely as possible.