Cloudflare and the Cloud

Cloudflare

Cloudflare has become a popular (or maybe the most popular?) provider for adding useful features like CDN and DNS to websites and APIs. What probably sets it apart from competitors is its sophisticated firewall and DDoS protection, which gives you fairly fine-grained control over who can access your endpoints and protects those endpoints from abuse. Cloudflare also probably has one of the best global networks, helping accelerate requests to your endpoints from anywhere in the world.

Sounds great. So what’s the problem?

Most major cloud providers also optimize their networks in a similar way. They route your ingress and egress over their own private networks so that traffic enters and exits as close to the endpoints you’re calling as possible, spending very little time on the public internet.

Sounds even better. What’s the point of this infodump?

Well, nothing… most of the time… unless you’re very latency sensitive.

Behind Cloudflare

I’m hosting an API exposed via Cloudflare, but my customers expect the absolute tightest latencies. You might be in a similar situation, or you might be a client trying to access an API with the best possible latency.

If latency is such a concern, why am I using Cloudflare at all? Maybe I’m worried about DDoS attacks, or I want to make sure my API can only be accessed from a certain region.

It doesn’t really matter what the exact use case is for the purpose of this article.

To experience what my API consumers observe, I simulate their requests with a simple API client. My API is hosted in GCP us-west2 (Los Angeles), so I host my client code there as well: https://cloud.google.com/about/locations/

Even better, Cloudflare also has a data center in the same region: https://www.cloudflare.com/network/.

I run my client against my GCP endpoint directly, without Cloudflare, and see 1ms baseline latencies. Perfect! Hopefully Cloudflare won’t add much to this, so I run the same test against the Cloudflare-proxied endpoint.

My baseline latency is now over 10ms.

What happened? Is this a Cloudflare-induced cost? If so, it seems a bit excessive.

Before we go ahead and try to debug this, why baseline latencies? It’s because I only care about the best possible latencies to my API. There will be network-induced jitter, which will be worse (sometimes significantly so) if the requests have to travel over the internet to Cloudflare. Managing the tradeoff of better latency vs. worse jitter is out of scope for this article.
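For reference, here is roughly how those baselines are collected: fire a batch of requests and keep the fastest one, which strips out most of the jitter. A minimal sketch using curl’s timing output (the real client is a bit more involved, and %{time_total} includes connection setup, so treat the numbers as indicative):

# fastest of 100 requests, in seconds, taken as the baseline
$ for i in $(seq 1 100); do \
    curl -s -o /dev/null -w '%{time_total}\n' https://my.apibehindcloudflare.com/; \
  done | sort -n | head -1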

Digging deeper

Cloudflare provides a /cdn-cgi/trace endpoint that returns some useful debugging info about how an endpoint proxied behind Cloudflare is being accessed. Running that (anonymized):

$ curl my.apibehindcloudflare.com/cdn-cgi/trace
fl=123456
h=my.apibehindcloudflare.com
ip=123.456.789.10
ts=1678639842.232
visit_scheme=https
uag=curl/7.74.0
colo=SJC
sliver=none
http=http/3
loc=US
tls=TLSv1.3
sni=plaintext
warp=off
gateway=off
kex=X12345

Notice the colo=SJC. SJC is the San Jose Cloudflare data center, meaning the request is hitting San Jose even though I’m making it from LA, to an endpoint in LA, and Cloudflare has an LA location. What gives?
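As an aside, if you only care about which colo you’re hitting, you can grep it straight out of the trace output:

$ curl -s https://my.apibehindcloudflare.com/cdn-cgi/trace | grep '^colo='
colo=SJC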

AWS us-west-1 (N. California) should be pretty close, so let’s try that as well. I now see 15ms baseline latencies. Turns out this is even worse:

$ curl my.apibehindcloudflare.com/cdn-cgi/trace
fl=123456
h=my.apibehindcloudflare.com
ip=123.456.789.10
ts=1678639901.337
visit_scheme=https
uag=curl/7.74.0
colo=SEA
sliver=none
http=http/3
loc=US
tls=TLSv1.3
sni=plaintext
warp=off
gateway=off
kex=X12345

Couldn’t be worse. It’s hitting Cloudflare Seattle.

More digging. Traceroute from GCP:

$ traceroute my.apibehindcloudflare.com
traceroute to my.apibehindcloudflare.com (123.456.789.10), 30 hops max, 60 byte packets
 1  * * *
 2  123.456.789.10 (123.456.789.10)  10.259 ms  10.240 ms  9.739 ms

MTR is equally unhelpful:

$ mtr -rwnzc10 my.apibehindcloudflare.com

AWS turns out to be more helpful:

$ traceroute my.apibehindcloudflare.com
traceroute to my.apibehindcloudflare.com (123.456.789.10), 30 hops max, 60 byte packets
 1  some internal AWS point
 2  some internal AWS point
 3  some internal AWS point
 4  some internal AWS point
 5  some internal AWS point
 6  some internal AWS point
 7  some internal AWS point
 8  some internal AWS point
 9  some internal AWS point
10  some internal AWS point
11  some internal AWS point
12  some internal AWS point
13  some internal AWS point
14  some internal AWS point
15  some internal AWS point
16  some internal AWS point
17  some internal AWS point
18  some internal AWS point
19  some internal AWS point
20  some internal AWS point
21  some internal AWS point
22  some internal AWS point
23  some internal AWS point
24  some internal AWS point
25  some internal AWS point
26  some internal AWS point
27  some internal AWS point
28  123.456.789.10 (123.456.789.10)  33.521 ms  33.837 ms 242.4.195.195 (242.4.195.195)  34.699 ms

Now that’s interesting.

What is going on

The request from AWS takes a detour all the way to Seattle, with 27 hops within AWS itself before it hits Cloudflare. Presumably Cloudflare has some sort of direct connectivity to AWS via its SEA data center (perhaps in the AWS us-west-2 region), and advertises my endpoint there. AWS wants to optimize the network path to the endpoint over its dedicated network instead of the internet, which means routing the request all the way to SEA in the belief that it’s bypassing the internet entirely.

The GCP docs on network tiers suggest that GCP is probably doing something similar (and likely most other cloud providers as well): https://cloud.google.com/network-tiers/docs/overview

GCP has two network tiers: Standard and Premium. The Premium tier claims to deliver traffic between external systems and Google over Google’s own network, with ingress/egress to the internet happening at whichever PoP Google deems optimal.

AWS and GCP are both doing the same thing that Cloudflare is: trying to optimize the network path. In my case, they are optimizing the path my request takes to the endpoint so that the egress is as close as possible to where they think the endpoint is; for my AWS client (us-west-1) that is Cloudflare SEA, and for my GCP client (us-west2) that is SJC. And in doing so, they’re just making things worse.

Fix

Remember the GCP Standard tier?

The GCP Standard tier instead has traffic exit Google’s network at the closest PoP. Turning it on, we see a baseline latency of 2ms!
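For reference, here is roughly what the switch looks like with gcloud; the VM name and zone below are placeholders, and the exact commands depend on how the client’s external IP is configured (the tier can be set project-wide, per-address, or per-instance):

# Assumed project-wide default: newly created resources use the Standard tier
$ gcloud compute project-info update --default-network-tier=STANDARD

# For an existing client VM (hypothetical name/zone), the external access config
# has to be recreated with the new tier; note this usually changes an ephemeral IP
$ gcloud compute instances delete-access-config my-client-vm --zone=us-west2-a
$ gcloud compute instances add-access-config my-client-vm --zone=us-west2-a --network-tier=STANDARD

With the client back on the Standard tier, the trace now shows: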

$ curl my.apibehindcloudflare.com/cdn-cgi/trace
fl=123456
h=my.apibehindcloudflare.com
ip=123.456.789.10
ts=1678641229.124
visit_scheme=https
uag=curl/7.74.0
colo=LAX
sliver=none
http=http/3
loc=US
tls=TLSv1.3
sni=plaintext
warp=off
gateway=off
kex=X12345

Voila! We’ve hit the Cloudflare LAX data center. Turns out that, in this case, Google’s Standard network tier works better than the Premium one.

Traceroute shows something similar: requests now route via if-ae-6-20.tcore1.eql-losangeles.as6453.net before hitting Cloudflare.

I’m not sure what the equivalent of GCP’s Standard tier is on AWS, or whether it’s even possible to get AWS to have my requests exit to the internet at the nearest PoP.

TL;DR

If you’re using Cloudflare in a latency-sensitive environment in the cloud, be aware that your cloud provider and Cloudflare might both be trying to optimize the network path, clashing with each other’s improvements and ultimately resulting in worse performance.

In another article I will show how Cloudflare itself might be adding more unpredictability to this.