GeoLocation Redirection with WAF and CloudFront Functions to Bypass AWS Rate Limit
on Amazon, Aws, Cloudfront, Functions, Lambda@edge, Waf, Cdk, Typescript, Geolocation, Redirection, Rate, Limit
When you serve customers in multiple countries, content can vary by location. For example, if your website supports only a few languages, you might keep static pages with instructions for visitors whose language is not supported.
You can handle geolocation-based redirection on the backend, but this increases compute usage and prevents the CDN from caching the static page content. Handling it on the front end with JavaScript is also possible, but the browser can only route customers by language. To learn the location you would have to call an IP-lookup API, which adds a round trip and latency that customers will notice.
AWS CloudFront supports Lambda@Edge, a lightweight version of Lambda that runs directly on CloudFront's systems. At first sight, Lambda@Edge is a good solution: it is fast, easy to attach to CloudFront, and manageable. But if the website consists of many dynamic pages, problems start to appear.
Firstly, when you attach Lambda@Edge to the / path, customers who do not need redirection still invoke your Lambda on every request, and you will eventually hit the Lambda@Edge rate limit and see 5xx errors.
Secondly, suppose you want to make exceptions so that bots can crawl the site. You would then need to keep updating your Lambda function for different bots. We want to remove bot detection from our code and invoke the function only when redirection is required.
Thirdly, and most importantly, CloudFront itself cannot express basic logic about which paths and conditions should invoke the Lambda. AWS WAF can: it supports regex matching, bot detection, country matching, and combinations of these.
AWS WAF CDK Implementation
The stack below combines regular rules, such as blocking anonymous source IPs, with our redirection rules.
allow-site-assets
First, we allow static assets through the firewall because they may be used outside the website, for example by Android or iPhone application integrations. To keep the backend safe, you can give those paths a separate origin configuration that allows only GET and strips headers and query strings before the request reaches your origin.
To ignore those fields, use the cache policy settings below in the additionalBehaviors part of your CloudFront distribution. Restricting requests to GET is done with the behavior's allowedMethods.
headerBehavior: cdk.aws_cloudfront.CacheHeaderBehavior.none(), // allowList() requires at least one header name
cookieBehavior: cdk.aws_cloudfront.CacheCookieBehavior.none(),
queryStringBehavior: cdk.aws_cloudfront.CacheQueryStringBehavior.none(),
AWS-AnonymousIPList
This is an example of a regular built-in rule that anyone can use. It is not required for the redirection, so you can remove it if you want.
AWS-AWSManagedRulesBotControlRuleSet
Labels from managed rules can be matched with a labelMatchStatement in NAMESPACE scope. To use AWS managed rule labels in your own rules, however, the managedRuleGroupStatement must be evaluated before your rule, because the managed rule group is what attaches the labels to the request. Otherwise the labelMatchStatement will not match.
In our case, the managedRuleGroupStatement is defined in AWS-AWSManagedRulesBotControlRuleSet, and the labelMatchStatement that lets bots crawl without being redirected is located in the geolocationRedirection rule.
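The difference between the LABEL and NAMESPACE scopes can be illustrated with a small local simulation. This is only a sketch: labelMatches is a hypothetical helper, not part of AWS WAF or the CDK, and it simplifies the matching to an exact comparison for LABEL and a prefix comparison for NAMESPACE. The label strings are examples of what Bot Control may attach to a verified crawler.

```typescript
// Hypothetical helper that mimics a label match statement (simplified).
type LabelScope = 'LABEL' | 'NAMESPACE';

function labelMatches(requestLabels: string[], scope: LabelScope, key: string): boolean {
  return requestLabels.some((label) =>
    // LABEL: the whole label must equal the key.
    // NAMESPACE: the label only has to start with the namespace prefix.
    scope === 'LABEL' ? label === key : label.indexOf(key) === 0,
  );
}

// Example labels for a request from a verified search-engine crawler.
const crawlerLabels = [
  'awswaf:managed:aws:bot-control:bot:category:search_engine',
  'awswaf:managed:aws:bot-control:bot:name:googlebot',
  'awswaf:managed:aws:bot-control:bot:verified',
];

const botNamespace = 'awswaf:managed:aws:bot-control:bot:';

// Any label inside the bot namespace is enough to classify the request as a bot.
const crawlerIsBot = labelMatches(crawlerLabels, 'NAMESPACE', botNamespace);

// A request that never passed through the managed rule group carries no labels at all.
const unlabeledIsBot = labelMatches([], 'NAMESPACE', botNamespace);
```

Matching the namespace rather than individual labels means the redirection rule does not have to list every bot name or category itself.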
geolocationRedirection
If all of the following are true:
- the client is not a bot
- the access location is not one of the countries served by the dynamic site
- the path is not an asset or a static page
the system returns a 302 to /redirect, where your redirect function is attached.
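The three conditions are combined with a logical AND of negations. The sketch below restates that decision as a plain function so it can be reasoned about outside of WAF. RequestFacts, dynamicSiteCountries, exemptPrefixes, and shouldRedirect are illustrative names, and the prefix list is shortened for the example.

```typescript
// Facts that WAF would derive from a request (illustrative shape).
interface RequestFacts {
  isBot: boolean;   // a Bot Control label in the bot namespace is present
  country: string;  // two-letter country code of the client
  path: string;     // request URI path
}

// Assumed values for illustration only.
const dynamicSiteCountries = ['TR'];
const exemptPrefixes = ['/redirect', '/uk', '/world-wide'];

function isExemptPath(path: string): boolean {
  // A path is exempt if it equals a prefix or continues it with "/".
  return exemptPrefixes.some((prefix) => path === prefix || path.indexOf(prefix + '/') === 0);
}

function shouldRedirect(req: RequestFacts): boolean {
  const fromDynamicCountry = dynamicSiteCountries.indexOf(req.country) !== -1;
  // Redirect only when every "not" condition holds at the same time.
  return !req.isBot && !fromDynamicCountry && !isExemptPath(req.path);
}
```

For example, a non-bot visitor from DE requesting /products would be redirected, while the same visitor requesting /redirect would not, which is what prevents a redirect loop.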
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
export class AhmetEngineerWAF extends cdk.Stack {
public readonly WafResource: cdk.aws_wafv2.CfnWebACL;
constructor(scope: Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);
const wafRules: Array<cdk.aws_wafv2.CfnWebACL.RuleProperty> = [];
let priorityNumber = 1;
function priorityCounter(): number { priorityNumber++; return priorityNumber * 10;}
const siteAssets = [
'manifest.json',
'assetlinks.json',
'sitemap.xml',
];
wafRules.push({
name: 'allow-site-assets',
priority: priorityCounter(),
statement: {
regexMatchStatement: {
fieldToMatch: { uriPath: {}, },
textTransformations: [ { type: 'NONE', priority: 0,}, ],
//! The redirect paths must be added here to prevent an infinite loop
regexString: '^\\/(\\b(' + siteAssets.join('|') + '))\\/?$',
},
},
action: { allow: {},},
visibilityConfig: {
sampledRequestsEnabled: false,
cloudWatchMetricsEnabled: false,
metricName: 'ahmet-engineer-site-assets',
},
});
// Example rule: this label is only present if the AWS Anonymous IP list managed rule group was evaluated earlier
wafRules.push({
name: 'AWS-AnonymousIPList',
priority: priorityCounter(),
statement: {
labelMatchStatement: {
scope: 'LABEL',
key: 'awswaf:managed:aws:anonymous-ip-list:AnonymousIPList',
},
},
action:{ block:{} },
visibilityConfig: {
sampledRequestsEnabled: false,
cloudWatchMetricsEnabled: false,
metricName: 'AWS-AWSManagedRulesAnonymousIPList',
},
});
wafRules.push({
name: 'AWS-AWSManagedRulesBotControlRuleSet',
priority: priorityCounter(),
statement: {
managedRuleGroupStatement: {
vendorName: 'AWS',
name: 'AWSManagedRulesBotControlRuleSet',
excludedRules: [],
},
},
overrideAction: { count:{} },
visibilityConfig: {
sampledRequestsEnabled: true,
cloudWatchMetricsEnabled: true,
metricName: 'AWS-AWSManagedRulesBotControlRuleSet',
},
});
// The static paths, which do not require redirection
const pathWhiteList = [...siteAssets, 'redirect', 'dynamic\\/images', 'world-wide', 'uk', 'ca', 'gl', 'br'];
// Geo Location Redirection
let geolocationRedirection: cdk.aws_wafv2.CfnWebACL.RuleProperty = {
name: 'geolocationRedirection',
priority: priorityCounter(),
statement: {
andStatement: {
statements: [
// The client is not a bot: if bots were redirected, they could not crawl the dynamic pages of the website.
{
notStatement: {
statement: {
labelMatchStatement: {
scope: 'NAMESPACE',
key: 'awswaf:managed:aws:bot-control:bot:',
},
},
},
},
// The country is not in the list, so only the static pages are available for it
{
notStatement: {
statement: {
geoMatchStatement: { countryCodes: ['TR'], },
},
},
},
// And the path is not an exempt path, i.e. not the redirect path, an asset, or a static page served to everyone
{
notStatement: {
statement: {
regexMatchStatement: {
fieldToMatch: { uriPath: {}, },
textTransformations: [ { type: 'NONE', priority: 0,}, ],
// End on "/" or end of path so that e.g. /careers is not treated as the /ca prefix
regexString: '^\\/(\\b(' + pathWhiteList.join('|') + '))(\\/|$)',
},
},
},
},
],
},
},
action: {
block: {
// Redirect client to redirect Cloudfront function page
customResponse: {
responseCode: 302,
responseHeaders: [{ name: 'location', value: '/redirect' }],
customResponseBodyKey: 'geolocationRedirection-body',
},
},
},
visibilityConfig: {
sampledRequestsEnabled: true,
cloudWatchMetricsEnabled: true,
metricName: 'geolocationRedirection',
},
};
wafRules.push(geolocationRedirection);
// Create New waf
this.WafResource = new cdk.aws_wafv2.CfnWebACL(this, 'ahmetengineerWebACL', {
name: 'ahmet-engineer-waf',
description: 'protecting ahmet.engineer web resource',
defaultAction: {
allow: {},
},
scope: 'CLOUDFRONT',
visibilityConfig: {
cloudWatchMetricsEnabled: false,
metricName: 'ahmet-engineer-waf',
sampledRequestsEnabled: true,
},
rules: wafRules,
customResponseBodies: {
[geolocationRedirection.name + '-body']: {
content: JSON.stringify({
detectedRule: geolocationRedirection.name,
priority: geolocationRedirection.priority,
}),
contentType: 'APPLICATION_JSON',
},
},
});
}
}
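To see which paths the allow-list pattern exempts from redirection, the expression can be exercised locally before deploying. The sketch below assumes that JavaScript's RegExp behaves like the WAF regex engine for this simple pattern. It ends the pattern with `(\/|$)`, so an entry must be followed by a slash or the end of the path and a path such as /careers is not mistaken for the /ca prefix. exemptPaths and isExemptFromRedirect are illustrative names.

```typescript
// Same entries as the allow list in the stack, including the asset file names.
const exemptPaths = [
  'manifest.json', 'assetlinks.json', 'sitemap.xml',
  'redirect', 'dynamic\\/images', 'world-wide', 'uk', 'ca', 'gl', 'br',
];

// An entry must be followed by "/" or by the end of the path.
const exemptPattern = new RegExp('^\\/(' + exemptPaths.join('|') + ')(\\/|$)');

function isExemptFromRedirect(path: string): boolean {
  return exemptPattern.test(path);
}

// Paths that skip the redirect rule.
const redirectPathExempt = isExemptFromRedirect('/redirect');
const ukPageExempt = isExemptFromRedirect('/uk/pricing');
const imageExempt = isExemptFromRedirect('/dynamic/images/logo.png');

// Paths that remain subject to redirection.
const careersExempt = isExemptFromRedirect('/careers');
const productExempt = isExemptFromRedirect('/products/42');
```

With a bare optional slash at the end of an unanchored pattern, /careers would match the `ca` entry and silently bypass the redirect, so a quick check like this is worth running whenever a short path segment is added to the list.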
CloudFront Implementation
We configured AWS WAF to trigger the redirection function only under specific conditions. With the example CDK stack below, we attach the WAF we created to the CloudFront distribution that serves our website.
I am not including the function code in this post; you can find details and examples at CloudFront/example-function-redirect-url.
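For orientation, a redirect handler of this kind could look roughly like the sketch below. It is not the author's function. The country-to-path mapping (landingPathByCountry) and the fallback path are assumptions chosen to line up with the static paths in the WAF allow list, and the interfaces describe only the parts of the CloudFront Functions event and response that are used here. CloudFront Functions run JavaScript, so a TypeScript version like this would have to be compiled to plain JavaScript before being deployed as fn/geo-redirect.js. The sketch also assumes the CloudFront-Viewer-Country header is made available to the function through the cache policy.

```typescript
// Minimal shapes of the viewer-request event and the generated response.
interface HeaderValue { value: string }
interface ViewerRequestEvent {
  request: { uri: string; headers: { [name: string]: HeaderValue | undefined } };
}
interface RedirectResponse {
  statusCode: number;
  statusDescription: string;
  headers: { [name: string]: HeaderValue };
}

// Hypothetical mapping from viewer country to a static landing path.
const landingPathByCountry: { [country: string]: string } = {
  GB: '/uk',
  CA: '/ca',
  BR: '/br',
};
const fallbackPath = '/world-wide';

function handler(event: ViewerRequestEvent): RedirectResponse {
  // Header names arrive in lowercase in the function event.
  const countryHeader = event.request.headers['cloudfront-viewer-country'];
  const country = countryHeader ? countryHeader.value.toUpperCase() : '';
  const target = landingPathByCountry[country] || fallbackPath;

  // Returning a response object short-circuits the request at the edge.
  return {
    statusCode: 302,
    statusDescription: 'Found',
    headers: { location: { value: target } },
  };
}
```

Because every target path is on the WAF allow list, the follow-up request is not redirected again.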
import * as constructs from 'constructs';
import * as cdk from 'aws-cdk-lib';
import path = require('path');
export class AhmetEngineerWeb extends cdk.Stack {
constructor(scope: constructs.Construct, id: string, props: cdk.StackProps) {
super(scope, id, props);
const redirectPageCloudfrontFunction = new cdk.aws_cloudfront.Function(this, 'GeoRedirect', {
functionName: 'GeoRedirect',
code: cdk.aws_cloudfront.FunctionCode.fromFile({
filePath: path.join(__dirname, 'fn/geo-redirect.js'),
}),
});
const appOrigin = new cdk.aws_cloudfront_origins.HttpOrigin("origin.ahmet.engineer", {
protocolPolicy: cdk.aws_cloudfront.OriginProtocolPolicy.HTTPS_ONLY,
});
new cdk.aws_cloudfront.Distribution(this, 'ahmetEngineerDist', {
defaultBehavior: {
origin: appOrigin,
allowedMethods: cdk.aws_cloudfront.AllowedMethods.ALLOW_ALL,
cachePolicy: cdk.aws_cloudfront.CachePolicy.CACHING_DISABLED,
viewerProtocolPolicy: cdk.aws_cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
originRequestPolicy: cdk.aws_cloudfront.OriginRequestPolicy.ALL_VIEWER,
compress: true,
},
additionalBehaviors: {
['/redirect*']: {
origin: appOrigin,
viewerProtocolPolicy: cdk.aws_cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
allowedMethods: cdk.aws_cloudfront.AllowedMethods.ALLOW_GET_HEAD_OPTIONS,
compress: true,
cachePolicy: new cdk.aws_cloudfront.CachePolicy(this, 'GeoRedirector',{
enableAcceptEncodingBrotli: true,
enableAcceptEncodingGzip: true,
cachePolicyName: 'GeoRedirector',
comment: 'redirect path and lambda at edge function',
defaultTtl: cdk.Duration.minutes(10),
headerBehavior: cdk.aws_cloudfront.CacheHeaderBehavior.allowList('CloudFront-Viewer-Country'), // To cache only country based
cookieBehavior: cdk.aws_cloudfront.CacheCookieBehavior.none(),
queryStringBehavior: cdk.aws_cloudfront.CacheQueryStringBehavior.none(),
}),
functionAssociations: [
{
function: redirectPageCloudfrontFunction,
eventType: cdk.aws_cloudfront.FunctionEventType.VIEWER_REQUEST,
},
],
},
},
enableIpv6: true,
minimumProtocolVersion: cdk.aws_cloudfront.SecurityPolicyProtocol.TLS_V1_2_2019,
enabled: true,
comment: 'ahmet.engineer',
domainNames: ["ahmet.engineer", "www.ahmet.engineer"],
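// CloudFront only accepts a web ACL with CLOUDFRONT scope, which is created in us-east-1 (placeholder ARN)
webAclId: "arn:aws:wafv2:us-east-1:123456789012:global/webacl/ahmet-engineer",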
httpVersion: cdk.aws_cloudfront.HttpVersion.HTTP2_AND_3,
// Certificates used by CloudFront must be in us-east-1
certificate: cdk.aws_certificatemanager.Certificate.fromCertificateArn(this, 'ahmetEngineerSSL', "arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012"),
});
}
}
Conclusion
With this design, the CloudFront Function is invoked only under the required conditions. This reduces the number of invocations, and customers no longer run into the rate limit of AWS CloudFront Functions.