AWS Lambda Cost Reduction Case Study


How We Cut a Lambda Bill by 96% on an AWS Environment Nobody Thought Was a Problem

The client is a global financial services group with operations across multiple jurisdictions. Their platform runs on AWS, and that's where this story takes place. When we started looking at their environment, Lambda jumped out almost immediately as a major cost driver. That's not something they could have spotted on their own, and it's worth explaining why.

The Visibility Problem

AWS, by default, does not show you cost per individual resource. The console gives you cost by service. You can see "Lambda cost us X this month", but you cannot see which specific Lambda function is responsible for that number, or which invocation pattern is driving it.

The only way to get that level of detail is to export your billing data through CUR (Cost and Usage Report) or the newer FOCUS format, drop it somewhere queryable, and run analysis on top of it. Most engineering teams never do this. It's not that they don't want to — it's that the setup is non-trivial and the analysis on top of it is even harder.
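To give a flavour of what that analysis looks like once the export is in place, here is a minimal sketch. The column names (`line_item_product_code`, `line_item_resource_id`, `line_item_unblended_cost`) are real CUR columns, but the sample rows, function names, and dollar amounts are invented for illustration:

```python
from collections import defaultdict

# Invented sample rows in the shape of CUR line items.
# Real CUR exports carry these columns (among hundreds of others).
cur_rows = [
    {"line_item_product_code": "AWSLambda",
     "line_item_resource_id": "arn:aws:lambda:eu-west-1:123456789012:function:third-party-poller",
     "line_item_unblended_cost": 812.40},
    {"line_item_product_code": "AWSLambda",
     "line_item_resource_id": "arn:aws:lambda:eu-west-1:123456789012:function:report-generator",
     "line_item_unblended_cost": 31.75},
    {"line_item_product_code": "AmazonS3",
     "line_item_resource_id": "some-bucket",
     "line_item_unblended_cost": 12.10},
]

def lambda_cost_by_function(rows):
    """Sum unblended cost per individual Lambda function ARN."""
    totals = defaultdict(float)
    for row in rows:
        if row["line_item_product_code"] == "AWSLambda":
            totals[row["line_item_resource_id"]] += row["line_item_unblended_cost"]
    # Most expensive function first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

top_function, top_cost = lambda_cost_by_function(cur_rows)[0]
```

In practice this query runs over millions of line items in Athena or a warehouse rather than in a Python list, but the shape of the question is the same: break the "Lambda" line on the bill down to individual resources.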

This is the standard starting point for us on any AWS engagement. CUR data is the ground truth.

What We Found

One Lambda function was responsible for a disproportionate chunk of the Lambda spend. When we looked at what it was doing, the picture got worse.

The function was making a call to a third-party API and waiting for the response. That wait was taking up to 14 minutes per invocation. On top of that, the function had been provisioned with a high memory allocation, presumably "to be safe".

Here's the issue: Lambda is billed by time multiplied by memory. Every gigabyte-second of execution costs money. A function that sits idle for 14 minutes waiting on someone else's server, with high memory attached to it the whole time, is the worst possible shape for a Lambda workload. You're paying full price for memory that isn't doing anything, for almost a quarter of an hour at a time.
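To make that billing shape concrete, here is the arithmetic with illustrative numbers. The $0.0000166667 per GB-second rate is AWS's published x86 Lambda price in most regions; the 3 GB allocation is illustrative, not the client's exact configuration:

```python
PRICE_PER_GB_SECOND = 0.0000166667  # published x86 Lambda rate, most regions

def invocation_cost(memory_gb: float, duration_s: float) -> float:
    """Duration-based cost of one Lambda invocation in USD."""
    return memory_gb * duration_s * PRICE_PER_GB_SECOND

# A function idling for 14 minutes at 3 GB:
cost = invocation_cost(memory_gb=3.0, duration_s=14 * 60)
# Roughly $0.042 per invocation, paid every time, for compute that did nothing.
```

Four cents per call sounds harmless until you multiply it by the invocation count of a busy production function.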

The Quick Win

Before touching the architecture, we looked at the memory allocation. The function didn't actually need anywhere near what was provisioned. We dropped the memory by 66%.

Because Lambda pricing is linear in memory, and the duration didn't change (the function spent its time waiting either way), that single configuration change cut the cost of that function by 66% overnight. No code change. No risk. Just a corrected setting.
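The linearity is easy to verify: hold duration fixed and the cost tracks the memory setting one-for-one. A sketch with illustrative numbers (3 GB down to 1 GB, roughly the same proportional cut):

```python
PRICE_PER_GB_SECOND = 0.0000166667  # published x86 Lambda rate, most regions
DURATION_S = 14 * 60  # the wait on the third party doesn't change with memory

def cost(memory_gb: float) -> float:
    return memory_gb * DURATION_S * PRICE_PER_GB_SECOND

before = cost(3.0)  # illustrative original allocation
after = cost(1.0)   # roughly 66% less memory
reduction = 1 - after / before
# reduction is about 0.67: the cost cut matches the memory cut exactly
```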

The Real Fix

The quick win was good, but the architecture was still wrong. A Lambda function is not the right place to wait 14 minutes for a third party. The right pattern is to fire the request, hand off to a queue or a Step Functions workflow, and only resume execution when the third party responds. That way you're not paying for idle compute.
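One way to express that pattern is Step Functions' callback integration. The sketch below is a state machine definition in Amazon States Language (built as a Python dict): the first Lambda fires the third-party request and passes along a task token, then the execution pauses, at no compute cost, until something calls `SendTaskSuccess` with that token. The function names and webhook wiring are assumptions for illustration, not the client's actual setup:

```python
import json

state_machine = {
    "StartAt": "FireRequest",
    "States": {
        "FireRequest": {
            "Type": "Task",
            # .waitForTaskToken suspends the execution until
            # SendTaskSuccess / SendTaskFailure is called with the token.
            "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
            "Parameters": {
                "FunctionName": "fire-third-party-request",  # hypothetical
                "Payload": {
                    "taskToken.$": "$$.Task.Token",
                    "request.$": "$.request"
                }
            },
            "Next": "ProcessResponse"
        },
        "ProcessResponse": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {
                "FunctionName": "process-third-party-response",  # hypothetical
                "Payload": {"response.$": "$"}
            },
            "End": True
        }
    }
}

definition = json.dumps(state_machine)
```

On the callback side, whatever endpoint receives the third party's response calls `SendTaskSuccess` with the stored token and the state machine resumes. The Lambda functions themselves now run for milliseconds instead of minutes.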

Once we worked through the architectural change with their team and the code was reshaped around an event-driven flow, total cost on that workload dropped by more than 96% versus the original baseline.

Why This Matters

Two things had to be true for this saving to surface.

First, you need the cost data at the resource level. Without CUR or FOCUS in place, the high cost of this Lambda was simply invisible. The bill said "Lambda" and the number looked like Lambda numbers usually look. Nothing flagged.

Second, once you have the data, you need someone who can read it and understand what the function is actually doing. Knowing that Lambda is billed on time multiplied by memory is one thing. Knowing that a 14-minute wait on a third party is an architectural smell, not just a cost problem, is another.

This is the difference between a FinOps dashboard and engineering-grade FinOps. A dashboard tells you Lambda is expensive. We tell you which function, why it's expensive, what to change today to cut the bill by two-thirds, and what to change next quarter to cut it by 96%.

That's the kind of gain Koritsu is built to find.