skies.dev

How to Host a Gatsby Site on AWS with Terraform

Setting Up to Deploy Our Gatsby Site to AWS Using Terraform

In this article, we are going to look at setting up infrastructure as code using Terraform to host a Gatsby site on AWS.

All the code discussed in this article can be found here.

We will use the following AWS services to host our Gatsby site:

  • S3 to store the static assets produced by the Gatsby build
  • CloudFront for the CDN
  • Lambda@Edge to redirect users to the canonical URL
  • Route53 for DNS
  • AWS Certificate Manager to issue an SSL certificate for our site

The article is supplemented by the excellent YouTube playlist by Guiding Digital. If you find the videos helpful, be sure to give them a like.

Note, most of the Guiding Digital videos demonstrate creating resources from the AWS console. We will, however, be creating our AWS infrastructure using Terraform from the beginning. The videos will still help with understanding what we will be doing in Terraform.

You'll need to give Terraform programmatic access to create resources in AWS. See the AWS documentation if you need help setting this up. In general, it's best practice to follow the principle of least privilege and give Terraform only the permissions it needs to do its job.

For a crash course in what we're going to be doing, start with the following video. Don't worry if not everything makes sense right away. We're going to look at each step in more detail throughout this article.

https://www.youtube.com/watch?v=AYtryLEs9HA

Let's start by setting up S3 and CloudFront.

Hosting a Gatsby Site on AWS Using S3 and CloudFront

https://www.youtube.com/watch?v=pjUrLnWXHB8

We'll create a terraform/ folder in our project's root directory. The name doesn't matter, so call it whatever you want. We just need a place to put our Terraform code so we can use the Terraform CLI to apply changes to our infrastructure.

First, I'll create a locals.tf file. This is where we define values that we reuse throughout our Terraform code, so if something needs to change, we only have to change it in one place.

In this file, we will set

  • s3_bucket_name to the name of our S3 bucket. Use whatever name makes sense to you.
  • region to an AWS region. I recommend us-east-1 because some of the infrastructure we need must be deployed to this region. The videos will mention this as it comes up.

terraform/locals.tf
locals {
  s3_bucket_name = "www.yourdomain.com"
  region         = "us-east-1"
}

Next, we'll create a provider.tf file to tell Terraform which cloud provider we're using.

terraform/provider.tf
provider "aws" {
  region = local.region
}
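
The snippets in this article configure website hosting with an inline website block on aws_s3_bucket, which newer releases of the AWS provider mark as deprecated. As a minimal sketch, assuming you want the code to behave exactly as written, you could pin the provider version next to the provider block. The file name versions.tf and the version constraint are assumptions, not part of the original setup.

```hcl
// terraform/versions.tf (hypothetical file name)
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      // Assumption: stay on the 3.x line, where the inline website block is not deprecated
      version = "~> 3.0"
    }
  }
}
```

Running terraform init after adding this block downloads a provider release that satisfies the constraint.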

Now we can configure our S3 bucket to host our static assets in s3.tf. In this step, we will

  • Define the bucket name as well as the website's index and error document.
  • Turn off the settings that block public ACLs and public bucket policies. We need to do this so people can view our site.
  • Create a bucket policy to grant read-only permission to an anonymous user.
terraform/s3.tf
resource "aws_s3_bucket_public_access_block" "my_bucket_public_access_block" {
  bucket = aws_s3_bucket.my_bucket.id

  block_public_acls   = false
  block_public_policy = false
}

resource "aws_s3_bucket" "my_bucket" {
  bucket = local.s3_bucket_name

  website {
    index_document = "index.html"
    error_document = "404.html"
  }
}

resource "aws_s3_bucket_policy" "my_bucket_policy" {
  depends_on = [aws_s3_bucket_public_access_block.my_bucket_public_access_block]
  bucket = aws_s3_bucket.my_bucket.id
  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Id": "MyBucketPolicy",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "${aws_s3_bucket.my_bucket.arn}/*"
    }
  ]
}
POLICY
}
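
If you are on version 4 or later of the AWS provider, the inline website block is deprecated in favor of a separate resource. The following is a sketch of the equivalent configuration under that assumption. The resource name my_bucket_website is hypothetical.

```hcl
// Sketch for AWS provider v4 or later: takes the place of the inline website block
resource "aws_s3_bucket_website_configuration" "my_bucket_website" {
  bucket = aws_s3_bucket.my_bucket.id

  // Served when a request targets a directory such as /blog/
  index_document {
    suffix = "index.html"
  }

  // Served when the requested object does not exist
  error_document {
    key = "404.html"
  }
}
```

With this variant, the website endpoint is read from aws_s3_bucket_website_configuration.my_bucket_website.website_endpoint rather than from the bucket resource.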

Now, we'll look at creating our CloudFront distribution. First, let's update locals.tf with a TTL value of 1 year, expressed in seconds. We can cache for this long because the files in a static Gatsby build only change when we deploy.

terraform/locals.tf
locals {
  s3_bucket_name        = "www.yourdomain.com"
  region                = "us-east-1"
  cloudfront_ttl        = 31536000
}
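
The value 31536000 is one year in seconds. If you would rather make that visible in the code, Terraform can do the arithmetic for you. This is only a sketch of an alternative way to write the same local, used in place of the literal above.

```hcl
locals {
  // 365 days * 24 hours * 60 minutes * 60 seconds = 31536000 seconds
  cloudfront_ttl = 365 * 24 * 60 * 60
}
```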

Now we'll set up our CloudFront infrastructure in cloudfront.tf. There's a lot here, so please refer to the video above, the Terraform documentation, and the AWS documentation if any of the settings are unclear. Many of these settings are set by default when you create the distribution in the AWS console.

terraform/cloudfront.tf
locals {
  s3_origin_id = "myS3Origin"
}

resource "aws_cloudfront_distribution" "s3_distribution" {
  origin {
    custom_origin_config {
      http_port              = "80"
      https_port             = "443"
      origin_protocol_policy = "http-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
    domain_name = aws_s3_bucket.my_bucket.website_endpoint
    origin_id   = local.s3_origin_id
  }

  enabled         = true
  is_ipv6_enabled = true
  comment         = "My website's CloudFront distribution"

  default_cache_behavior {
    allowed_methods  = ["GET", "HEAD"]
    cached_methods   = ["GET", "HEAD"]
    compress         = true
    target_origin_id = local.s3_origin_id

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = local.cloudfront_ttl
    default_ttl            = local.cloudfront_ttl
    max_ttl                = local.cloudfront_ttl
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }
}

Now we should be able to apply the changes we've created, upload our Gatsby build to S3, and view our page live by visiting the CloudFront URL.
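
To find the CloudFront URL without opening the AWS console, one option is to expose it as a Terraform output. This is a minimal sketch. The file name outputs.tf and the output name are assumptions.

```hcl
// terraform/outputs.tf (hypothetical file name)
output "cloudfront_domain_name" {
  description = "Domain name that CloudFront assigned to the distribution"
  value       = aws_cloudfront_distribution.s3_distribution.domain_name
}
```

Terraform prints outputs at the end of terraform apply, and terraform output cloudfront_domain_name shows the value again later.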

Next, let's look at adding support for our custom domain name.

How to Make Your AWS Hosted Gatsby Site Production Ready

Note that in the video, Phil uses two S3 buckets to handle the redirect from the non-www version of the site to the www version. We will handle redirects a different way, discussed in the next section.

https://www.youtube.com/watch?v=U0ZbJi-OKg4

First, let's update locals.tf to hold our domain name.

terraform/locals.tf
locals {
  s3_bucket_name        = "www.yourdomain.com"
  primary_domain_name   = "www.yourdomain.com"
  alternate_domain_name = "yourdomain.com"
  region                = "us-east-1"
  cloudfront_ttl        = 31536000
}

Now we'll create a route53.tf to create a Route53 hosted zone and the DNS records for the www and root domain names.

terraform/route53.tf
resource "aws_route53_zone" "zone" {
  name = "${local.alternate_domain_name}." // notice the dot at the end
}

resource "aws_route53_record" "www" {
  zone_id = aws_route53_zone.zone.id
  name    = local.primary_domain_name
  type    = "CNAME"
  ttl     = "300"
  records = [aws_cloudfront_distribution.s3_distribution.domain_name]
}

resource "aws_route53_record" "root" {
  zone_id = aws_route53_zone.zone.id
  name    = local.alternate_domain_name
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.s3_distribution.domain_name
    zone_id                = aws_cloudfront_distribution.s3_distribution.hosted_zone_id
    evaluate_target_health = false
  }
}

Important note. After you apply these DNS changes, you will need to go to your domain name provider and update the nameservers to point to Route53. The video above shows an example using Google Domains. Look in the AWS console to see what the Route53 nameservers are.
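
As an alternative to looking in the console, the hosted zone resource exposes its nameservers as an attribute. A minimal sketch, assuming you are happy to add an output (the output name is hypothetical):

```hcl
output "route53_name_servers" {
  description = "Nameservers to enter at your domain name provider"
  value       = aws_route53_zone.zone.name_servers
}
```

After terraform apply, these are the values to enter at your domain name provider.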

Now we'll create acm.tf to use AWS Certificate Manager to issue and validate an SSL certificate.

terraform/acm.tf
resource "aws_acm_certificate" "cert" {
  domain_name               = local.primary_domain_name
  subject_alternative_names = [local.alternate_domain_name]
  validation_method         = "DNS"
}

resource "aws_acm_certificate_validation" "cert" {
  certificate_arn         = aws_acm_certificate.cert.arn
  validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}

resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.cert.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = aws_route53_zone.zone.id
}

Now, we can go back to CloudFront and assign the certificate and domain names to the distribution.

terraform/cloudfront.tf
// ✂️ Unchanged...

resource "aws_cloudfront_distribution" "s3_distribution" {
  origin {
    // ✂️ Unchanged...
  }

  // ✂️ Unchanged...

  aliases = [local.primary_domain_name, local.alternate_domain_name]

  default_cache_behavior {
    // ✂️ Unchanged...
  }

  restrictions {
    // ✂️ Unchanged...
  }

  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate_validation.cert.certificate_arn
    minimum_protocol_version = "TLSv1.2_2018"
    ssl_support_method       = "sni-only"
  }
}

Now we're going to update our S3 bucket policy so that users have to go through CloudFront. This ensures visitors always get the performance benefits of the CDN. It also helps with SEO, because pages that are crawlable at both the S3 URL and our domain could be treated by search engines as duplicate content.

To do this, CloudFront will send a specific value in the Referer header, and S3 will only serve requests that carry it. I recommend using a long, random value for added security, since anyone who knows it can bypass CloudFront.

Let's set the referrer value in locals.tf.

terraform/locals.tf
locals {
  s3_bucket_name        = "www.yourdomain.com"
  primary_domain_name   = "www.yourdomain.com"
  alternate_domain_name = "yourdomain.com"
  region                = "us-east-1"
  cloudfront_ttl        = 31536000
  referer_key           = "uijCZ167tIJjC9jeDp6kbwKa3OF2tfweQSjR"
}
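
Hardcoding the key means it lives in version control. If you would rather not commit it, one option is to let Terraform generate it. The sketch below assumes you add the hashicorp/random provider. The resource name referer_key is hypothetical, and the generated value is still stored in the Terraform state.

```hcl
// Requires the hashicorp/random provider
resource "random_password" "referer_key" {
  length  = 36
  special = false // letters and digits only
}
```

You would then reference random_password.referer_key.result wherever the article uses local.referer_key.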

Now we'll update our bucket policy so that S3 only allows reads when the expected Referer header is passed along.

terraform/s3.tf
// ✂️ Rest of the code is unchanged...

resource "aws_s3_bucket_policy" "my_bucket_policy" {
  depends_on = [aws_s3_bucket_public_access_block.my_bucket_public_access_block]
  bucket = aws_s3_bucket.my_bucket.id
  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Id": "MyBucketPolicy",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "${aws_s3_bucket.my_bucket.arn}/*",
      "Condition": {
        "StringLike": {"aws:Referer": "${local.referer_key}" }
      }
    }
  ]
}
POLICY
}

Now we'll go back and tell CloudFront to add in this Referer header.

terraform/cloudfront.tf
// ✂️ Code unchanged...

resource "aws_cloudfront_distribution" "s3_distribution" {
  origin {
    custom_origin_config {
      http_port              = "80"
      https_port             = "443"
      origin_protocol_policy = "http-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
    domain_name = aws_s3_bucket.my_bucket.website_endpoint
    origin_id   = local.s3_origin_id
    custom_header {
      name  = "Referer"
      value = local.referer_key
    }
  }

  // ✂️ Code unchanged...
}

At this point, we should be able to visit our site from our custom domain name. Moreover, we should not be able to visit the site by visiting the S3 URL unless we pass in the expected header.

Now let's set up a Lambda function to redirect users from the root domain to the www version of our domain.

How to Add Redirects Using Lambda@Edge

https://www.youtube.com/watch?v=RRQ5-M6-8Wc

First, let's write our Lambda function in lambda/redirect/index.js. The name doesn't matter so just pick a name and folder structure that makes sense for your project. The implementation is similar to the one mentioned in the video.

lambda/redirect/index.js
/* eslint-disable lines-around-directive */
/* eslint-disable strict */
'use strict';

exports.handler = (event, context, callback) => {
  const {request} = event.Records[0].cf;
  const {uri} = request;
  const host = request.headers.host[0].value;
  const {querystring} = request;

  // Treat a last path segment containing a dot (e.g. /app.js) as a file so
  // that static assets are not redirected to a trailing-slash URL
  const isFile = /\.[^/]+$/.test(uri);
  const needsSlash = !uri.endsWith('/') && !isFile;

  // Make sure the host starts with "www." and page URLs end with "/"
  if (!host.startsWith('www.') || needsSlash) {
    // Replace with your own canonical origin
    let newUrl = 'https://www.skies.dev';

    // Add path
    if (uri) newUrl += uri;

    // Add trailing slash
    if (needsSlash) newUrl += '/';

    // Add query string
    if (querystring && querystring !== '') newUrl += `?${querystring}`;

    const response = {
      status: '301',
      statusDescription: '301 Redirect for root domain',
      headers: {
        location: [
          {
            key: 'Location',
            value: newUrl,
          },
        ],
      },
    };

    callback(null, response);
  } else {
    callback(null, request);
  }
};

Next, I'm going to use Terraform to zip up the Lambda function so it can deploy the Lambda on our behalf. We'll ask Terraform to zip the code into a dist/ folder. You can optionally add dist/ to your .gitignore to avoid committing the generated files.

.gitignore
dist/

Below we write the Terraform code in lambda.tf to

  • add permissions to our Lambda function via IAM
  • automatically zip the code for deployment
  • publish a new version that can be used by our CloudFront distribution

Let's start by defining the trust policy for the redirect Lambda's IAM role in policies/redirect.json. It allows both the Lambda and Lambda@Edge services to assume the role.

terraform/policies/redirect.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com", "edgelambda.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Now, we can use Terraform's file function to import this policy where we define the Lambda infrastructure.

terraform/lambda.tf
resource "aws_iam_role" "iam_for_lambda" {
  name = "iam_for_lambda"

  assume_role_policy = file("policies/redirect.json")
}

locals {
  lambda_redirect_zip = "../lambda/redirect/dist/redirect.zip"
}

data "archive_file" "redirect_zip" {
  type        = "zip"
  source_file = "../lambda/redirect/index.js"
  output_path = local.lambda_redirect_zip
}

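resource "aws_lambda_function" "redirect_lambda" {
  // Read the path and hash from the archive data source so the zip is built first
  filename         = data.archive_file.redirect_zip.output_path
  function_name    = "CloudFrontRedirect"
  role             = aws_iam_role.iam_for_lambda.arn
  handler          = "index.handler"
  source_code_hash = data.archive_file.redirect_zip.output_base64sha256
  publish          = true
  // nodejs12.x has been retired by AWS; use a runtime that Lambda@Edge currently supports
  runtime = "nodejs20.x"
}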

resource "aws_lambda_permission" "allow_cloudfront" {
  statement_id  = "AllowExecutionFromCloudFront"
  action        = "lambda:GetFunction"
  function_name = aws_lambda_function.redirect_lambda.function_name
  principal     = "edgelambda.amazonaws.com"
}
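
Note that policies/redirect.json is a trust policy: it says who may assume the role, but grants the function no permissions of its own. If you want the function to write logs to CloudWatch, one common option is to attach the AWS-managed basic execution policy. This is a sketch that goes beyond the original setup, and the resource name lambda_logs is hypothetical.

```hcl
// Optional: lets the function create log groups and streams and write log events
resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.iam_for_lambda.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}
```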

Now we'll go back to CloudFront and associate it with this Lambda function.

terraform/cloudfront.tf
// ✂️ Unchanged...

resource "aws_cloudfront_distribution" "s3_distribution" {
  // ✂️ Unchanged...

  default_cache_behavior {
    // ✂️ Unchanged...

    lambda_function_association {
      event_type   = "viewer-request"
      lambda_arn   = aws_lambda_function.redirect_lambda.qualified_arn
      include_body = false
    }
  }

  // ✂️ Unchanged...
}

We now have the infrastructure needed to deploy our Gatsby site to AWS.

Closing Thoughts and Next Steps

In this article, we set up infrastructure as code using Terraform to host our Gatsby site in AWS.

If you thought this article was helpful and you think your network could learn from it, then I would be grateful if you shared it with your network on social media. Be sure to give the Guiding Digital videos a like on YouTube as well! This would help us a lot.

In the next article, we are going to set up a deployment pipeline to automatically deploy changes to AWS using GitHub Actions. Check it out!
