guix-commits

05/14: cdn: Add a CloudFront distribution fronting berlin.


From: Chris Marusich
Subject: 05/14: cdn: Add a CloudFront distribution fronting berlin.
Date: Sat, 29 Dec 2018 02:04:54 -0500 (EST)

marusich pushed a commit to branch master
in repository maintenance.

commit d3600c75b9bf689bc20c7ce3ab6e32ec7d151c1e
Author: Chris Marusich <address@hidden>
Date:   Thu Dec 27 19:17:30 2018 -0800

    cdn: Add a CloudFront distribution fronting berlin.
    
    This is not the final version, but it gives us a good starting point.
    
    * cdn/terraform/main.tf (berlin-mirror): New resource.
    (berlin-mirror-id, berlin-mirror-status, berlin-mirror-domain-name):
    New outputs.
    * cdn/README.org: Update accordingly.
---
 cdn/README.org        | 271 ++++++++++++++++++++++++++++++++++++++++++++++++--
 cdn/terraform/main.tf |  87 ++++++++++++++++
 2 files changed, 350 insertions(+), 8 deletions(-)

diff --git a/cdn/README.org b/cdn/README.org
index 53580e2..a51b396 100644
--- a/cdn/README.org
+++ b/cdn/README.org
@@ -640,7 +640,6 @@ No Guix package yet.  But it's free software, so that's good.
 
 https://www.terraform.io
 
-
 * Initial manual bootstrap
 Create a user named safe-to-delete-admin and attach an IAM policy to
 it that lets it do anything.  We'll delete this in a little bit.
@@ -664,7 +663,9 @@ Now, run "terraform init" in the directory containing the file
 "main.tf", and Terraform will download the AWS provider if you don't
 already have it.
 
-Then run "terraform plan", and you should see something like this:
+Then run "terraform plan", and you should see something like this
+(note that originally, we hard-coded the "profile" in the main.tf
+file, so this command worked at that time):
 
 #+BEGIN_EXAMPLE
 [0] address@hidden:~/maintenance/cdn/terraform
@@ -737,18 +738,62 @@ Note: you have to specify AWS_DEFAULT_REGION or Terraform will ask you
 to enter a region manually, due to this bug:
 https://github.com/terraform-providers/terraform-provider-aws/issues/1767
 
-Cool.  Let's try creating it by running "terraform apply":
+Cool.  Let's try creating it by running "terraform apply".
+
+It worked, hooray!  Now we can update ~/.aws/credentials with the
+newly created access key (you have to decrypt its secret part from the
+output using GnuPG) and then delete the safe-to-delete-admin user
+manually (without using Terraform).  After that, we can control nearly
+all aspects of the AWS account and its resources via IAM users.
+
+** Enable IAM users to view billing information
+
+Some activities cannot be done by an IAM user, even an administrator,
+without taking some manual steps first to allow it.  Read more here:
+
+https://docs.aws.amazon.com/general/latest/gr/aws_tasks-that-require-root.html
+
+These tasks must be performed by the so-called "root user".  The "root
+user" is a term that AWS uses to refer to, essentially, the entity
+that owns and has truly full control over all aspects of the account.
+It is not an IAM user.
+
+One of these activities is viewing billing info, which is useful.
+Let's let IAM users do that:
+
+https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/grantaccess.html#ControllingAccessWebsite-Activate
+
+Once that's done, all administrators can now also view the billing
+information.  In addition, it is now possible to define new IAM
+policies to grant the specific permission to view the billing
+information (but not anything else).  For example, we could create a
+group called "accountants" that contains users who need access to view
+billing information (but nothing else).
 
 * Process
 
+Initial, one-time setup:
+
 - terraform init: to set things up and install the AWS provider if you
   don't have it already.
+
+After that, do this:
+
 - terraform apply: to show the actions Terraform will take, and then
   take them if you say "yes" at the prompt.
 - terraform show: to display information about the state.  In
   particular, this prints out information such as the output from the
   last run, which can be useful.
 
+When creating or updating a CloudFront distribution, "terraform apply"
+will finish quickly.  However, it seems this is an asynchronous
+operation, so the distribution may not return to the "Deployed" state
+for many minutes.  To check on its progress, you can simply run
+"terraform apply" repeatedly (maybe saying "no" at the prompt if it
+doesn't exit immediately with a message saying there are no proposed
+changes), and eventually the distribution should arrive at the desired
+end state.
+
 * Configuration structure
 There can be multiple files (*.tf, *.tfvars), or just one file.  Name
 doesn't matter, as long as it ends in .tf or .tfvars.  We could
@@ -772,8 +817,7 @@ See:
 https://learn.hashicorp.com/terraform/getting-started/variables
 "Note: that the file can be named anything, since Terraform loads all
 files ending in .tf in a directory.  "
-* Problems
-
+* Terraform-specific Problems
 ** Downloads prebuilt binaries
 https://learn.hashicorp.com/terraform/getting-started/build
 By default, "terraform init" downloads and installs "plugin" binaries.
@@ -798,7 +842,7 @@ https://www.terraform.io/docs/state/remote.html
 a collection of 'modules':
 https://registry.terraform.io/
 
-* getting started guide
+* Terraform getting started guide
 A good, brief intro to all main concepts.
 https://learn.hashicorp.com/terraform/getting-started/install
 
@@ -806,10 +850,10 @@ This how-to guide is much better for newcomers than trying to read the
 reference documentation (e.g., for the configuration file syntax)
 first.
 
-* acm specific resources
+* Amazon ACM specific resources
 https://www.terraform.io/docs/providers/aws/d/acm_certificate.html
 https://www.terraform.io/docs/providers/aws/r/acm_certificate_validation.html
-* cloudfront specific resources
+* Amazon CloudFront specific resources
 
 https://www.terraform.io/docs/providers/aws/r/cloudfront_distribution.html
 * IAM Login URL
@@ -858,3 +902,214 @@ Currently, we have all the IAM configuration in Terraform config.  That's great!
 - Package Terraform
 - Package the AWS Provider plugin for Terraform
 - Simplify variable definitions by using .tfvars file?
+- Use origin failover to serve requests via the CDN from berlin
+  first, and hydra second?
+  https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/high_availability_origin_failover.html
+
+* Questions
+** Guix build farm (berlin)
+
+- Does it ever return 3xx (e.g. redirects)?
+- Are there any URLs that are not returning a Cache-Control header but
+  should be?
+
+- What should be cached?  What should not be cached?  We can apply
+  different rules to different URLs according to a pattern language
+  similar to shell globbing.  However, if we set it up to respect the
+  origin's cache-related headers when they are included in the
+  response, we can configure all of this at the origin, independent of
+  the CloudFront distribution.  We can tell CloudFront not to cache
+  a response by having the origin add the "no-cache", "no-store",
+  and/or "private" directives to its Cache-Control header, provided
+  that we configure our CloudFront distribution's minimum TTL to be 0.
+- How long should it be cached?  This can be set at the origin,
+  independent of the CloudFront distribution.
+- Should we include "Cache-Control: max-age" or "Cache-Control:
+  s-maxage" in responses we want to be cached?  It seems the
+  difference only matters when caching results in a web browser.  For
+  our use case, I don't think we need to bother using s-maxage at all.
+- Is it OK to ignore query parameters, headers, and cookies when
+  deciding whether or not to cache?
+
+** CloudFront
+
+- Do we need a "default root object"?  Probably not, but try making a
+  request to the distribution, and see what happens:
+  https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/DefaultRootObject.html
+* Avoiding "Service as a Software Substitute"
+Dave made an awesome Guile module for using CloudFormation:
+
+https://lists.gnu.org/archive/html/guix-devel/2018-12/msg00102.html
+https://gist.github.com/davexunit/db4b9d3e67902216fbdbc66cd9c6413e
+
+We could have used Dave's module.  However, Terraform...
+
+- is mature software - it has been around for years.
+- has a vibrant ecosystem surrounding it already.
+- is popular and is used by lots of people.
+
+Finally, and most importantly: Terraform is free software that you can
+run on your own computer.  On the other hand, CloudFormation is
+essentially a "service as a software substitute" (SaaSS) that solves
+the same problem by offloading the work to a service.  There is no
+good reason to use CloudFormation when we can use or make free
+software like Terraform to do the job for us just as well - maybe even
+better:
+
+https://www.terraform.io/intro/vs/cloudformation.html
+https://www.gnu.org/philosophy/who-does-that-server-really-serve.html
+
+Primarily because CloudFormation is SaaSS, and secondarily because
+Terraform is mature and widely used, I chose to use Terraform.
+
+But if that's the case, then why are we using CloudFront, IAM, etc.?
+Aren't those services, too?  Well, yes.  They are.  But they are not
+SaaSS.  I will try to explain why.
+
+CloudFront is a CDN, and you cannot do what a CDN does by running a
+program on your computer.  To do what a CDN does would require a huge
+investment of capital and people power to build and operate an
+international network of computers.  In this way, a CDN is not SaaSS.
+
+IAM is also a service.  But again, you cannot replace what it does by
+running software on your computer.  IAM is Amazon's way of knowing who
+should be allowed to do what with the Amazon web services that you
+choose to use.  For example, creating an IAM group for administrators,
+and an IAM policy saying they can do anything they want, and adding an
+IAM user to that group named "Chris Marusich", is analogous to calling
+up your electric company and saying, "Please let Chris Marusich do
+whatever he needs to do with this account."  They record the
+information in their own system, and then when Chris calls asking them
+to change a billing address, they do some verification and determine
+that he's allowed to do that.  IAM is the same.  It doesn't replace
+software that you could have run on your own computer; it's an
+integral part of using the Amazon web services, and it has no function
+outside of that.  Therefore, IAM is also not SaaSS.
+
+Generally speaking, although SaaSS is bad because it takes freedom
+away from the computer user, services that are not SaaSS may be bad or
+good depending on the context.  Services are different from software,
+so they must be treated differently.  We shouldn't be afraid to use a
+service if (1) it isn't SaaSS and (2) it makes sense to use that
+particular service in that particular context.
+
+* Using the AWS CLI
+
+The AWS CLI is packaged in Guix.  It's called "awscli".  Here's some
+documentation:
+
+https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html
+https://docs.aws.amazon.com/cli/latest/index.html
+
+It understands many (perhaps all?) of the same environment variables
+that the Terraform AWS provider understands.
+
+Invoke it like this (customize the environment variables as needed):
+
+#+BEGIN_EXAMPLE
+[0] address@hidden:~
+$ AWS_DEFAULT_REGION=us-west-2 AWS_PROFILE=guix aws iam list-users
+{
+    "Users": [
+        {
+            "Path": "/",
+            "UserName": "civodul",
+            "UserId": "AIDAJXYCBKCDPUFEJVA3K",
+            "Arn": "arn:aws:iam::354378008360:user/civodul",
+            "CreateDate": "2018-12-27T07:37:19Z"
+        },
+        {
+            "Path": "/",
+            "UserName": "marusich",
+            "UserId": "AIDAJCXVTZTTRDUOTBAL2",
+            "Arn": "arn:aws:iam::354378008360:user/marusich",
+            "CreateDate": "2018-12-27T07:30:53Z",
+            "PasswordLastUsed": "2018-12-28T01:36:32Z"
+        },
+        {
+            "Path": "/",
+            "UserName": "rekado",
+            "UserId": "AIDAIZK2BC4U6R53UVING",
+            "Arn": "arn:aws:iam::354378008360:user/rekado",
+            "CreateDate": "2018-12-27T07:37:19Z"
+        }
+    ]
+}
+[0] address@hidden:~
+$ 
+
+#+END_EXAMPLE
+
+** Evict objects from CloudFront's cache
+You can evict cached responses from a CloudFront distribution.
+CloudFront refers to this process as "invalidation".  For details, see
+here:
+
+https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html
+
+Note that invalidation costs an additional amount of money, but it is
+negligible if you are only doing a few invalidation requests.
+Notably, "the charge to submit an invalidation path is the same
+regardless of the number of files you're invalidating" - this means
+you can invalidate everything if you want, and it will cost basically
+nothing:
+
+https://aws.amazon.com/cloudfront/pricing/
+
+Here is an example that invalidates all cached objects for a
+distribution with ID E2LCS83UL0PPNA (change the ID and paths as
+needed):
+
+#+BEGIN_EXAMPLE
+[0] address@hidden:~
+$ AWS_DEFAULT_REGION=us-west-2 AWS_PROFILE=guix aws cloudfront create-invalidation --distribution-id E2LCS83UL0PPNA --paths '/*'
+{
+    "Location": "https://cloudfront.amazonaws.com/2017-03-25/distribution/E2LCS83UL0PPNA/invalidation/I2PCH5JZ52HUX7",
+    "Invalidation": {
+        "Id": "I2PCH5JZ52HUX7",
+        "Status": "InProgress",
+        "CreateTime": "2018-12-28T02:43:51.326Z",
+        "InvalidationBatch": {
+            "Paths": {
+                "Quantity": 1,
+                "Items": [
+                    "/*"
+                ]
+            },
+            "CallerReference": "cli-1545965030-886799"
+        }
+    }
+}
+[0] address@hidden:~
+$ 
+#+END_EXAMPLE
+
+You can also check on the invalidation status like so:
+
+#+BEGIN_EXAMPLE
+[0] address@hidden:~
+$ AWS_DEFAULT_REGION=us-west-2 AWS_PROFILE=guix aws cloudfront get-invalidation --id I2PCH5JZ52HUX7 --distribution-id E2LCS83UL0PPNA
+{
+    "Invalidation": {
+        "Id": "I2PCH5JZ52HUX7",
+        "Status": "Completed",
+        "CreateTime": "2018-12-28T02:43:51.326Z",
+        "InvalidationBatch": {
+            "Paths": {
+                "Quantity": 1,
+                "Items": [
+                    "/*"
+                ]
+            },
+            "CallerReference": "cli-1545965030-886799"
+        }
+    }
+}
+[0] address@hidden:~
+$ 
+#+END_EXAMPLE
+
+See the following for details:
+
+https://docs.aws.amazon.com/cli/latest/reference/cloudfront/create-invalidation.html
+https://docs.aws.amazon.com/cli/latest/reference/cloudfront/get-invalidation.html
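
The two invalidation commands in the README text above can be
combined so that you don't have to poll get-invalidation by hand.
This is only a minimal sketch and is not part of the commit: the
function name invalidate_and_wait is hypothetical, and it assumes the
installed awscli provides the "wait invalidation-completed" subcommand
and that AWS_PROFILE and AWS_DEFAULT_REGION are set as in the examples.

```shell
# Hypothetical helper (not in the repository): submit an invalidation
# and block until CloudFront reports that it has completed.
# Usage: invalidate_and_wait DISTRIBUTION_ID PATH...
invalidate_and_wait() {
    dist_id=$1
    shift
    # --query/--output text make the CLI print only the invalidation ID
    # instead of the full JSON document.
    inv_id=$(aws cloudfront create-invalidation \
        --distribution-id "$dist_id" \
        --paths "$@" \
        --query 'Invalidation.Id' --output text) || return 1
    # The waiter polls get-invalidation until the status is "Completed".
    aws cloudfront wait invalidation-completed \
        --distribution-id "$dist_id" --id "$inv_id" || return 1
    # Print the ID so the caller can log it or inspect it later.
    echo "$inv_id"
}
```

For example: AWS_DEFAULT_REGION=us-west-2 AWS_PROFILE=guix
invalidate_and_wait E2LCS83UL0PPNA '/*' (keep the quotes around the
path so the shell does not expand the wildcard).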
diff --git a/cdn/terraform/main.tf b/cdn/terraform/main.tf
index 6900a68..dede45e 100644
--- a/cdn/terraform/main.tf
+++ b/cdn/terraform/main.tf
@@ -156,3 +156,90 @@ output "rekado-access-key-1-id" {
 output "rekado-access-key-1-secret" {
   value = "${aws_iam_access_key.rekado-access-key-1.encrypted_secret}"
 }
+
+# CloudFront
+
+resource "aws_cloudfront_distribution" "berlin-mirror" {
+  enabled = true
+  comment = "Distributed caching proxy for berlin.guixsd.org"
+  origin {
+    domain_name = "berlin.guixsd.org"
+    origin_id = "berlin.guixsd.org"
+    custom_origin_config {
+      http_port = 80 # Required, but not used.
+      https_port = 443
+      # Always use TLS when forwarding requests to the origin.
+      origin_protocol_policy = "https-only"
+      origin_ssl_protocols = ["TLSv1.2"]
+      origin_keepalive_timeout = 60
+      origin_read_timeout = 60
+    }
+  }
+  # The CNAME that will point to this CloudFront distribution.
+  aliases = ["ci.guix.info"]
+  is_ipv6_enabled = true
+  # This is actually the maximum HTTP version to support. See:
+  # https://www.terraform.io/docs/providers/aws/r/cloudfront_distribution.html#http_version
+  http_version = "http2"
+  # Serve requests from all edge locations.
+  price_class = "PriceClass_All"
+  # Do not restrict access.
+  restrictions { geo_restriction { restriction_type = "none" }}
+  # When deleting the distribution, actually delete it.  See:
+  # https://www.terraform.io/docs/providers/aws/r/cloudfront_distribution.html#retain_on_delete
+  retain_on_delete = false
+  default_cache_behavior {
+    # Only allow "read" verbs.
+    allowed_methods = ["GET", "HEAD"]
+    cached_methods = ["GET", "HEAD"]
+    # The origin will compress data when necessary.
+    compress = false
+    # Cache responses that lack a Cache-Control header.
+    default_ttl = 86400 # 1 day
+    # When deciding whether or not to cache a response, ignore any
+    # cookies, headers, or query strings that the client included in
+    # their request.  This should increase the cache hit rate.  In
+    # addition, this also causes CloudFront to omit these values
+    # when forwarding the request to the custom origin. See:
+    # https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/ConfiguringCaching.html
+    forwarded_values {
+      cookies { forward = "none" }
+      query_string = false
+    }
+    # Generally speaking, respect any Cache-Control or Expires
+    # headers that the origin includes in its responses.  The
+    # exception is that if a Cache-Control or Expires header says to
+    # cache the result for more than 1 year, we ignore that and only
+    # cache the result for 1 year at most.  Honestly, though, it
+    # seems unrealistic to expect CloudFront to actually keep the
+    # cached response for an entire year in that case.  See:
+    # https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
+    max_ttl = 31536000 # 365 days
+    min_ttl = 0
+    target_origin_id = "berlin.guixsd.org"
+    viewer_protocol_policy = "https-only"
+  }
+  # TODO: Maybe add more behaviors for specific paths/prefixes.
+  # ordered_cache_behavior {}
+  # TODO: Maybe set a caching behavior for error responses.
+  # custom_error_response {}
+  # TODO: Integrate with ACM.
+  viewer_certificate {
+    cloudfront_default_certificate = true
+    # This is the only option when using the default CloudFront
+    # certificate.  See:
+    # https://www.terraform.io/docs/providers/aws/r/cloudfront_distribution.html#viewer-certificate-arguments
+    # https://docs.aws.amazon.com/cloudfront/latest/APIReference/API_ViewerCertificate.html
+    minimum_protocol_version = "TLSv1"
+  }
+}
+
+output "berlin-mirror-id" {
+  value = "${aws_cloudfront_distribution.berlin-mirror.id}"
+}
+output "berlin-mirror-status" {
+  value = "${aws_cloudfront_distribution.berlin-mirror.status}"
+}
+output "berlin-mirror-domain-name" {
+  value = "${aws_cloudfront_distribution.berlin-mirror.domain_name}"
+}
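
The README notes that "terraform apply" returns before the
distribution reaches the "Deployed" state.  The outputs added above
make it possible to wait for that state explicitly instead of
re-running "terraform apply".  This is a rough sketch, not part of the
commit: wait_for_berlin_mirror is a hypothetical name, it must be run
from the directory containing main.tf, and it assumes that "terraform
output NAME" prints the bare value (newer Terraform releases may print
quoted strings and offer a -raw flag for this).

```shell
# Hypothetical helper (not in the repository): block until the
# berlin-mirror distribution is deployed, then print its domain name.
wait_for_berlin_mirror() {
    # Read the distribution ID from the "berlin-mirror-id" output.
    dist_id=$(terraform output berlin-mirror-id) || return 1
    # The waiter polls get-distribution until the status is "Deployed".
    aws cloudfront wait distribution-deployed --id "$dist_id" || return 1
    # The domain name is what a CNAME such as ci.guix.info would target.
    terraform output berlin-mirror-domain-name
}
```

As with the other commands, set AWS_PROFILE and AWS_DEFAULT_REGION in
the environment before calling it.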

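The README's open questions about which berlin URLs return a
Cache-Control header can be explored from the command line.  This is
a minimal sketch, not part of the commit, assuming curl is installed.
The name show_cache_headers is hypothetical, and the set of headers it
prints is only a guess at what is useful here (X-Cache is the header
CloudFront adds to report a cache hit or miss).

```shell
# Hypothetical helper (not in the repository): print the status line
# and the cache-related response headers for a URL.
# Usage: show_cache_headers URL
show_cache_headers() {
    # -I sends a HEAD request; -sS hides the progress meter but keeps
    # error messages.  tr strips the carriage returns that terminate
    # HTTP header lines.
    curl -sSI "$1" \
        | tr -d '\r' \
        | grep -iE '^(HTTP/|cache-control:|expires:|age:|x-cache:)'
}
```

Running it once against the origin and once against the
distribution's domain name shows whether CloudFront passes the
origin's caching headers through unchanged.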

