Essential tips to reinforce your app and infrastructure
Video transcript
We utilized ChatGPT to enhance the grammar and syntax of the transcript.
Greg: Welcome to Upsun live stream! Today, we're diving into some essential tips to ensure apps, infrastructure, and everything related to DevOps and the development world are rock solid, especially with the holiday season approaching.
For some of us, it's like, "Wait, it's too early for this." With 80 degrees Fahrenheit outside, and going up into the hundreds, it still feels early for the holidays. I don’t know the exact Celsius conversion, but that’s probably up into the 32-degree range on that scale. I should probably have my little Santa Claus hat on, or something for whatever holiday you celebrate, because the season is coming.
We’re going to focus on stability, how to handle traffic spikes, and more. We might even have a special guest joining us to share their horror stories from Black Friday, along with some tips. But all that aside, my name is Greg Qualls, and I'm a wannabe developer based out of Texas.
And with me, as always (well, not quite, since this is our first time), Thomas is here. I’m terrible at pronouncing names, so I’ll let him introduce himself.
Thomas: Hi, I’m Thomas di Luccio. It’s an Italian name, though I’m French. I’m a developer advocate and a designer-developer, and I’m really excited to talk about this topic today.
Greg: Before we dive in, a note for anyone joining us for the first time: the show is split into a few segments.
First up is our emerging news segment! I love these little bumpers, so shout out to Kirby for making them. Thomas, it looks like you're first. What’s been making headlines for you?
Thomas: Yeah, I've come across a few articles lately discussing the idea that maybe the AI bubble is finally bursting. To be clear, AI is here to stay. It’s an amazing technology, but there’s a sense that it’s time to cool down. A lot of startups have raised significant funds simply by adding AI to their pitch decks.
Greg: Really? I’ve never heard of that. I mean, all the AI on every site is 100% authentic and absolutely necessary, right?
Thomas: Exactly. I mean, do we really need an AI-powered toothbrush? Maybe we’re entering the phase where people ask, “Do we really need this?” It’s about figuring out how to add value for users, not just tacking on AI at a huge cost.
Greg: What fascinates me about this AI bubble is how familiar it feels. I lived through the dot-com bubble (even though I wasn’t heavily involved), and there were similar jokes back then. It was like, "Oh, now everyone has a website. What could anyone need a website for?" Then the bubble burst, but now, if you don’t have a website, you don’t have a business.
I’m curious what the future holds for AI. After the hype dies down, I wonder how AI will actually integrate into the ecosystem. Ten years from now, AI might be as common in apps as websites are for businesses today, but it will be used in the right context—focused on productivity and functionality.
Thomas: Absolutely! I think the same thing could happen with AI: after the initial hype, it will find its proper place. Moving on from the AI bubble, there’s something that caught my attention this week: ChatGPT’s transition from Next.js to Remix. This has been all over my feed and TikTok.
Greg: OpenAI hasn’t directly explained why they made the move, but from what I’ve gathered, and I agree with Wes Bos on this, it seems like they’re moving away from server-side rendering and focusing more on client-side rendering to make things faster. It’s interesting because ChatGPT is such a huge app, and making a framework change like this is a big deal. What’s impressive is that, as a user, the transition was seamless—I didn’t even notice it until people started talking about it.
What do you think, Thomas? Should they have waited until after Black Friday to make this switch?
Thomas: Honestly, I’m always puzzled by this framework frenzy. People are so passionate about their favorite frameworks that they track who’s using what. As an active user of ChatGPT, I don’t really care about which framework they use, as long as it serves its purpose. If they’re happy with the switch, maybe I should spend more time learning how Remix works.
Greg: For me, what’s fascinating is the speculation about why they did it. Sometimes it’s as simple as someone trying it out over the weekend, noticing things run faster, and deciding to switch. Or, maybe one person just likes Remix more and has enough pull within the company to make it happen. There’s always the hunt for some deep technical reason, but sometimes it’s really just about personal preference.
And with that, we’ve covered the emerging news for today. Now, we’re moving into our "Stash of the Day" segment, where we share some tools or resources we’ve found useful recently.
I’ll kick things off! Here’s something fun I found: a Visual Studio Code plugin called Indent Rainbow. As the name suggests, it adds a rainbow of colors to your indents. It’s probably hard to see on screen, but it makes the indents more visually distinct. As I get older, it’s getting harder to track which div belongs to which section, and these colors make it much easier. Plus, it just makes my code look happy!
Thomas: I tried it after you shared it with us, and it definitely fuels my OCD. If you struggle with keeping everything perfectly aligned, this plugin is a game changer, but it can also make you obsess over it even more!
Greg: Exactly! That’s why I combine it with Prettier. Prettier handles the formatting automatically, and then the colors help me quickly scan the code to find what I need. It’s simple and makes coding a bit more enjoyable.
Now, Thomas, I’m excited to hear about your stash of the day.
Thomas: Sure! My stash is Locust.io. I recently worked on a piece for the Blackfire.io blog, and I was exploring options for load testing. That’s when I discovered Locust.io, an open-source project for load testing with Python. I’m not the best Python developer, but I was able to set up load testing scenarios for an application in just a couple of hours. It’s super user-friendly and powerful.
Greg: That’s awesome! I’m familiar with Gatling for load testing, but it’s great to know there’s an open-source alternative like Locust.io. I’ve dabbled in Python too, so this sounds right up my alley. I’ve never actually run a load test myself—only been on the receiving end—so I’ll definitely have to check this out.
Thomas: It’s really straightforward! I ran the tests from my local machine against a remote server. A few simple commands and everything was set up and running. Locust also offers a cloud version if you don’t want to handle the infrastructure yourself.
Greg: I love that it’s open source. You don’t always come across open-source tools for load testing, so I’ll definitely be giving it a try. Now that we’ve shared our stashes of the day, it’s time to dive into the main topic: prepping your app for the holiday rush!
As we mentioned earlier, with the holiday season fast approaching, it's crucial to ensure your app is ready for the increased traffic and demands. Today, we’ll discuss strategies to reinforce your app's stability and keep everything running smoothly during peak times, like Black Friday and Cyber Monday.
This isn’t necessarily a formal webinar; we’re just discussing some key ideas. Some of these might be no-brainers, but they’re worth revisiting, especially as a reminder to start implementing them now. Joining us for this segment is a special guest, Guillaume, who has over 25 years of development experience. He’s worked with various e-commerce companies and has weathered several Black Friday events. Some people have experience, and others have scars, and I think Guillaume might have a few of both!
Guillaume: Thanks for the intro, Greg, and thanks for the reminder about my 25 years of experience—always a great feeling. And yes, I started coding when I was 14, not five, but close enough!
Greg: So, Guillaume, what would be your first tip for getting ready for holiday traffic?
Guillaume: My first tip ties into what Thomas mentioned earlier about load testing. You need to run performance tests to see how your system behaves under heavy load. But you also need enough resources to simulate those users. Most major e-commerce platforms are trying to handle tens of thousands of transactions at once, so you’ll need to stress test at that scale. Tools like Locust Cloud are great because they save you from setting up dozens of AWS instances yourself just to generate fake traffic.
The key is preparation. At the e-commerce agency I worked for, we managed 30 to 40 large retailer websites, mostly in fashion. Black Friday was always a stressful time. Months before the event, we would start preparing and running tests. Defining your testing scenarios is crucial. You want scenarios that match what your users are actually doing, which can be difficult because users do all sorts of things—browsing catalogs for hours, adding tons of items to their carts, removing them, etc. We spent a lot of time working with clients and looking at analytics to figure out realistic test cases, but even then, you can’t predict everything.
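To make the idea of analytics-driven test scenarios a bit more concrete, here is a minimal sketch in Python, using only the standard library. The action names and weights are hypothetical placeholders, not figures from the discussion; in a real load test they would come from your own analytics, and each action would trigger an actual request.

```python
import random
from collections import Counter

# Hypothetical traffic mix: action name -> relative weight.
# In practice these weights would be derived from your own analytics.
SCENARIO_WEIGHTS = {
    "browse_catalog": 60,
    "view_product": 25,
    "add_to_cart": 8,
    "remove_from_cart": 4,
    "checkout": 3,
}


def simulate_session(rng: random.Random, steps: int) -> list[str]:
    """Pick `steps` actions for one simulated user, following the weights."""
    actions = list(SCENARIO_WEIGHTS)
    weights = list(SCENARIO_WEIGHTS.values())
    return rng.choices(actions, weights=weights, k=steps)


def simulate_traffic(users: int, steps: int, seed: int = 42) -> Counter:
    """Count how often each action occurs across all simulated users."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    counts: Counter = Counter()
    for _ in range(users):
        counts.update(simulate_session(rng, steps))
    return counts


if __name__ == "__main__":
    totals = simulate_traffic(users=1000, steps=20)
    for action, count in totals.most_common():
        print(f"{action:18s} {count}")
```

Keeping the mix in one table makes it easy to review with the client and to adjust when the analytics change.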
When the actual Black Friday arrived, the pressure was enormous. Marketing teams, technical teams, and agencies were working 24/7 to keep everything running. Fortunately, with advancements in cloud technology, it's much easier now to provision resources. Back in the day, if a server failed, we’d have to physically drive to a data center and plug in a new one. Now, with cloud providers, you can just spin up new instances. But even that can be tricky. During COVID, for instance, we saw an insane amount of e-commerce traffic, and some cloud providers struggled to keep up with provisioning instances due to hardware shortages.
Greg: You mentioned working with a cloud provider—how crucial is it to scale up in advance? Would you recommend testing your infrastructure's ability to scale before Black Friday?
Guillaume: Absolutely! A few months before Black Friday, you need to start scaling up and stress testing. It’s important to slow down on new feature development during this time—not necessarily a full code freeze, but the focus should shift towards optimizing performance. This means running tests, identifying bottlenecks, and planning for the worst-case scenario.
During one Black Friday, we had a client who scaled up their resources to 1,200 CPUs just for the day. That’s the level of traffic we’re talking about. And it’s not just about the servers. Sometimes, components like Redis, which is single-threaded, become bottlenecks. You need to anticipate these issues and be ready to respond quickly.
Greg: Guillaume, would you say you learned these lessons the hard way through experience?
Guillaume: Oh, definitely. I’ve got plenty of scars to prove it. One time, while working for a ticketing company, we secured a deal with a large venue and were thrilled to launch their new season. Everything was running smoothly until the big rush hit, and the entire system collapsed. We had underestimated the load and didn't properly test for those scenarios. That was a harsh learning experience.
Testing under real-world conditions is critical. Create a clone of your production environment and simulate the same traffic levels you expect during peak times. Use observability tools and monitoring to see what’s breaking and where things are slowing down. You might find that parts of your app behave differently under load than in normal day-to-day traffic.
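As a rough illustration of simulating traffic and watching where things slow down, the sketch below fires concurrent calls and reports latency percentiles. It is an assumption-laden toy: `fake_request` is a hypothetical stand-in that only sleeps, and a real test would call a clone of production through a dedicated load-testing tool.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(i: int) -> float:
    """Stand-in for one HTTP call; returns the observed latency in seconds.

    Hypothetical: replace the sleep with a real request against a clone
    of production (never against production itself).
    """
    start = time.perf_counter()
    time.sleep(0.001 + (i % 10) * 0.0005)  # pretend some requests are slower
    return time.perf_counter() - start


def run_load(total_requests: int, concurrency: int) -> dict[str, float]:
    """Fire requests from a pool of workers and summarize the latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(fake_request, range(total_requests)))
    # quantiles(n=100) returns 99 cut points: index 49 is p50, index 94 is p95.
    cuts = statistics.quantiles(latencies, n=100)
    return {
        "requests": len(latencies),
        "p50_ms": cuts[49] * 1000,
        "p95_ms": cuts[94] * 1000,
        "max_ms": latencies[-1] * 1000,
    }


if __name__ == "__main__":
    for name, value in run_load(total_requests=200, concurrency=20).items():
        print(f"{name}: {value:.2f}")
```

Looking at p95 and the maximum, not just the median, is what tends to reveal the parts of an app that behave differently under load.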
Greg: Thomas, I know you have some thoughts on testing and observability. What best practices would you recommend?
Thomas: Yes, absolutely! Observability is essential. When you're running load tests, use profiling tools like Blackfire or other observability platforms. These tools give you insight into what's happening inside your app, allowing you to pinpoint issues. You might find that certain database queries or functions are the root cause of performance issues under heavy traffic.
One lesson I learned from working at a ticketing company is that your setup needs to be designed for the worst-case scenario, not just for normal traffic. We had issues where the peak activity, like people scanning tickets at a venue, happened at night, and no one was available to scale up the infrastructure. If your system isn't designed to handle scaling automatically, you could be in big trouble.
Greg: That brings up a great point about automated scaling. It sounds like having the right infrastructure and observability tools can make or break your Black Friday.
Guillaume: Absolutely. Automated scaling is key, especially if you're dealing with high-traffic events. You don’t want to rely on manual intervention at 4 AM when your traffic peaks. If your infrastructure can scale automatically based on demand, that’s a huge advantage.
And to Greg’s point, it’s not just about scaling your servers—you need to ensure every part of your system, from your databases to your caching layers, can handle the load. Sometimes it’s the things you don’t expect, like caching issues, that can bring everything down.
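The idea of scaling automatically based on demand can be sketched as a simple proportional rule. This is an illustrative assumption, not how any particular provider implements autoscaling: the function name, the 60% CPU target, and the instance limits are all hypothetical.

```python
def desired_instances(
    current: int,
    cpu_percent: int,
    target_percent: int = 60,
    min_instances: int = 2,
    max_instances: int = 50,
) -> int:
    """Return how many instances are needed to bring CPU back to target.

    Proportional rule: scale the fleet by (observed load / target load),
    rounding up, then clamp the result to the allowed range.
    """
    if current < 1 or target_percent < 1:
        raise ValueError("current and target_percent must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    needed = -(-(current * cpu_percent) // target_percent)
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    print(desired_instances(current=4, cpu_percent=90))    # 6
    print(desired_instances(current=10, cpu_percent=20))   # 4
    print(desired_instances(current=40, cpu_percent=120))  # 50 (capped)
```

The upper limit matters as much as the rule itself: it keeps a runaway metric from provisioning an unbounded fleet at 4 AM.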
Greg: On that note, Guillaume, can you talk more about caching? You mentioned earlier that it's one of the most important things to focus on when preparing for traffic spikes.
Guillaume: Definitely. Caching is one of the best ways to improve performance, especially during high-traffic periods. If you can serve the same page or content to thousands of users from a cache rather than generating it fresh each time, you’ll save a lot of resources and speed up response times.
That said, caching can also be tricky. It’s not just about turning caching on; you need to make sure you’re caching the right things. And when you’re dealing with e-commerce, for example, you want to be careful not to cache dynamic content like user-specific information. But for product pages, category listings, and other static content, caching is a no-brainer.
A lot of performance gains come from smart caching strategies. It reduces the strain on your backend and speeds up the user experience. In fact, one famous statistic from Amazon years ago suggested they could lose millions of dollars for every 100 milliseconds of additional load time. So, you can imagine how critical performance is during a busy shopping event.
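The distinction between cacheable shared content and user-specific content can be illustrated with a small time-to-live cache. This is a minimal in-memory sketch under stated assumptions: `TTLCache`, `product_page`, and `cart_page` are hypothetical names, and a production setup would more likely rely on an HTTP cache, a CDN, or a shared cache service.

```python
import time
from typing import Callable


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store: dict[str, tuple[float, str]] = {}

    def get_or_render(self, key: str, render: Callable[[], str]) -> str:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the expensive render
        value = render()
        self._store[key] = (now + self.ttl, value)
        return value


render_calls: list[int] = []  # records every real render, for illustration


def render_product_page(product_id: int) -> str:
    """Hypothetical expensive render (database queries, templating, ...)."""
    render_calls.append(product_id)
    return f"<h1>Product {product_id}</h1>"


page_cache = TTLCache(ttl=60)


def product_page(product_id: int) -> str:
    # Same output for every visitor, so it is safe to share via the cache.
    return page_cache.get_or_render(
        f"product:{product_id}", lambda: render_product_page(product_id)
    )


def cart_page(user_id: int, items: list[str]) -> str:
    # User-specific content: rendered fresh every time, never shared.
    return f"<p>Cart for user {user_id}: {', '.join(items)}</p>"
```

With this setup, a thousand visitors hitting the same product page within a minute trigger a single render, while each cart is still built per user.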
Greg: Thomas, I know you’re a big proponent of observability. Could you talk more about the role of observability in caching and performance monitoring?
Thomas: Absolutely. Observability plays a huge role in not only catching performance issues but also identifying where your caching might be failing or underperforming. With tools like Blackfire, you can monitor your app in real time, see where bottlenecks are, and even get recommendations on how to fix them.
For example, let’s say your load testing reveals that your database queries are spiking during peak traffic. With observability, you can trace those queries back to specific parts of your code. Maybe there's a query that’s not optimized, or perhaps you’re pulling too much data. The key is to use these insights to make data-driven decisions about where to cache and where to optimize.
Also, observability helps you avoid situations where developers make the wrong assumptions. For instance, a developer might think, "Oh, it’s just a couple of extra database queries—no big deal." But over time, these small changes can add up, especially under heavy traffic, and cause significant performance degradation.
With observability in place, you get a clear picture of how your application behaves under different conditions, allowing you to make proactive changes before they become critical issues.
Greg: Guillaume, what are your thoughts on the role of testing in this? Is there one key thing you'd focus on if a team has limited time to prepare for the holidays?
Guillaume: If you only have time for one thing, I’d focus on caching and optimizing your backend infrastructure. If you can serve as much content as possible from cache, you reduce the load on your servers significantly. But if we’re talking about a second priority, then yes, observability and testing are crucial.
Testing shouldn’t just be about functionality—it’s about performance, too. Every time you release new features, you should be running performance regression tests to make sure they don’t introduce new bottlenecks. Automated tests are great for this because they can run continuously and alert you if something breaks or slows down.
For example, I remember working on an app where a new feature unintentionally added dozens of unnecessary SQL queries. The app still worked fine under normal traffic, but once the load spiked, those queries became a major issue. That’s why testing and observability go hand in hand. You need to know how every part of your app performs under load and have a plan to fix issues before they hit production.
Thomas: Start small but think strategically. You don't have to implement everything at once, but begin with observability and performance tests for your most critical paths—the parts of your app that handle the most traffic or have the biggest impact on user experience.
Define performance thresholds. For example, you might set a maximum number of SQL queries per request or a time limit on how long a specific operation should take. Then, use tools like Blackfire or similar platforms to track these metrics automatically. If something exceeds those limits, it’s a red flag to investigate.
Also, focus on educating your team. Not everyone has the same level of understanding about performance issues or caching strategies. Make sure your developers understand the impact their changes can have under heavy load and how to use the observability tools effectively.
If your team is strapped for time, even small improvements can make a big difference. For example, improving the performance of one frequently used function or reducing the number of database queries on a high-traffic page can drastically reduce load on your servers.
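The performance thresholds described above (a maximum number of SQL queries per request and a time limit) can be sketched as an automated check. This is only an illustration under stated assumptions: `Budget`, `CountingConnection`, and the two listing functions are hypothetical, the database is an in-memory SQLite table, and a profiling tool would normally collect these metrics for you.

```python
import sqlite3
import time
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical per-request limits; tune them to your own baseline."""
    max_queries: int
    max_seconds: float


class CountingConnection:
    """Thin wrapper that counts statements sent through `execute`."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.queries = 0

    def execute(self, sql: str, params: tuple = ()) -> sqlite3.Cursor:
        self.queries += 1
        return self.conn.execute(sql, params)


def check_budget(db: CountingConnection, budget: Budget, handler) -> list[str]:
    """Run `handler(db)` and return a list of budget violations (empty = OK)."""
    before = db.queries
    start = time.perf_counter()
    handler(db)
    elapsed = time.perf_counter() - start
    used = db.queries - before
    problems = []
    if used > budget.max_queries:
        problems.append(f"{used} queries (limit {budget.max_queries})")
    if elapsed > budget.max_seconds:
        problems.append(f"{elapsed:.3f}s (limit {budget.max_seconds}s)")
    return problems


def make_db() -> CountingConnection:
    """Build a small in-memory catalog of 20 products."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO products (id, name) VALUES (?, ?)",
        [(i, f"item {i}") for i in range(1, 21)],
    )
    return CountingConnection(conn)


def listing_one_query_per_item(db: CountingConnection) -> list[str]:
    # The "just a couple of extra queries" pattern: one query per product.
    return [
        db.execute("SELECT name FROM products WHERE id = ?", (i,)).fetchone()[0]
        for i in range(1, 21)
    ]


def listing_single_query(db: CountingConnection) -> list[str]:
    # Same result fetched with one query.
    rows = db.execute("SELECT name FROM products ORDER BY id").fetchall()
    return [name for (name,) in rows]
```

With a budget of five queries, `check_budget` returns an empty list for `listing_single_query` and `["20 queries (limit 5)"]` for `listing_one_query_per_item`, even though both return the same data. Run as part of an automated test suite, a check like this flags the regression before it meets real traffic.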
Greg: Exactly. I think the overall message here is to plan and test well in advance. Whether it's load testing, observability, or optimizing your caching strategy, preparing ahead of time can save you from a lot of headaches when the holiday traffic hits.
Preparation is everything when it comes to handling high traffic during peak times like Black Friday and Cyber Monday. The more you plan ahead, the better you can mitigate the last-minute emergencies that inevitably pop up. One other thing I’d like to add is about the human element in all of this.
When you’re doing load testing and running through your performance checks, you’re not just testing the system—you’re also testing your team. You want to make sure that everyone knows how to handle these situations when they arise. It’s not just about the tech; it's about the processes and communication among your team.
Thomas, Guillaume—do you agree that running these tests helps prep the people as much as the systems?
Guillaume: Absolutely, Greg. Running these tests isn’t just about validating your infrastructure; it’s also about making sure your team knows how to respond in real time. For example, if something breaks during a test, does everyone know what to do? Do they know who to contact? The best way to avoid chaos during the real event is to run through these scenarios in advance.
Thomas: Yeah, 100%. Black Friday is like a fire drill in some ways. You don’t just want your system to perform under load—you want your team to know what to do if something unexpected happens. The more you rehearse these scenarios, the better equipped everyone will be to respond quickly and minimize downtime.
Greg: That’s a great point, and it ties back to the importance of having processes in place. You should have clear backup plans if things go wrong. If something crashes, who’s responsible for fixing it? What’s the backup plan if the primary system fails? These are questions that need to be answered well before the traffic starts hitting your site.
With that in mind, what would be your one last takeaway for teams prepping for high traffic, whether for the holidays or any other major event?
Guillaume: I’d say don’t wait until the last minute. Start your load testing and performance optimization now. Even if you don’t have a lot of time or resources, every bit of optimization you do now will save you headaches later. And definitely make sure you’re leveraging caching and observability tools.
Thomas: My main takeaway is to invest in observability. It’s not just about monitoring—it’s about understanding how your app behaves under stress. If you can see the warning signs early, you can fix issues before they become catastrophic. Also, use your load tests to identify weak points and reinforce them ahead of time.
Greg: Great advice. I’d also add that communication is key—both within your team and with any external vendors you’re working with. Make sure everyone knows what’s happening and has a plan in place. That way, if something goes wrong, it’s not total chaos trying to figure out what to do.
With that said, I think it’s time to move into our poll request segment, where we answer questions from the audience.
Our amazing producer, Celeste, is pulling up questions now. The first one comes from the chat:
“Do you see a frenzy in frameworks, or do you think just a few frameworks will emerge and stick around long term?”
Guillaume: That’s a great question. I think there will always be new frameworks coming and going. Right now, React, Angular, and Vue are the big players, and I think they’ll be around for quite a while. But you also have frameworks like Svelte and Remix gaining popularity. The key is to not get too caught up in the hype. Use the framework that best suits your project’s needs and has a solid community and ecosystem behind it.
Thomas: I agree. There’s definitely a lot of excitement around new frameworks, but I tend to stick with the tried-and-true ones like React. It’s been around for a long time, has a massive community, and tons of resources. That said, it’s always good to keep an eye on emerging frameworks—just don’t switch for the sake of switching.
Greg: Yeah, I’m learning Flask right now, and it’s been great for what I need. For me, it’s less about the framework and more about what you’re comfortable with and what works best for the project at hand.
Greg: Awesome. Let’s move to the next question:
“How do you balance the pressure to release features with the need to maintain performance during high-traffic events?”
Guillaume: This is always a tough one. I think the key is to communicate with your stakeholders—whether that’s your product team, marketing, or whoever—and explain the potential risks of pushing too many new features right before a major event. Ideally, you should implement a feature freeze leading up to Black Friday or any high-traffic period so you can focus on performance and stability.
Thomas: Yeah, I’d say feature freezes are your friend in this case. It’s really tempting to push new features out before a big event, especially if there’s a marketing push behind them. But you have to weigh the risk of something breaking against the potential benefits of the new feature. Sometimes, it’s better to hold off and ensure the system is stable.
Greg: That’s great advice. I think the conversation between development and business teams is critical here. Both sides need to understand the trade-offs involved in releasing new features versus ensuring stability.
Greg: And with that, it looks like we’re wrapping up today’s live stream. Thank you to everyone who joined us for Upsun Live! We hope you found these tips helpful.
A big thank you to our guest, Guillaume, and of course, Thomas. Thank you as well to our producer, Celeste, and Pablo, who handled the technical side behind the scenes. Be sure to check out Upsun.com and Blackfire.io for more resources on app performance and observability.
Stay safe, keep coding, and we’ll see you next time! Take care, everyone.