Automating mostlygeek.com

Automating things is really complex. Automating an entire publishing pipeline triggered by a GitHub push is really, really complex. However, once set up it’s reliable, nearly zero-maintenance and very cheap to run. As I was writing this post, the AWS and CircleCI sections started growing out of control, so I’ve focused on the things that took the longest to figure out.

I had several goals for mostlygeek.com. Most of them come down to letting me focus on writing instead of publishing and maintenance.

  1. A completely static website. No security updates or database to maintain.
  2. A simple way to write and manage posts.
  3. Automatic uploading of files to the web host.
  4. Hosting I don’t have to manage.
  5. CHEAP!

These are the tools I chose:

Hugo + GitHub

Why Hugo? I chose Hugo because it’s written in Go, which I’m a big fan of. It’s open source, mature and fast. The best part is that it’s fairly opinionated about how content is organized. It does take some effort to learn how to use and configure it. Fortunately there is lots of documentation to get you started.
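
To give a sense of the day-to-day writing workflow, here is a minimal sketch of drafting and previewing a post locally. The posts/ section name is an assumption that depends on your site layout and theme; the hugo new, hugo server and hugo commands themselves are standard.

# scaffold a new post with front matter filled in
hugo new posts/my-new-post.md

# preview locally at http://localhost:1313 with drafts (-D) included
hugo server -D

# build the final static site into ./public, which is what CircleCI publishes
hugo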

Why GitHub? Personal preference, plus the integration with CircleCI, which I use to build and publish the website.

CircleCI

I came to prefer CircleCI’s approach to automation and GitHub integration while working on the Dockerflow project at work. It’s also free for open source (non-private repo) projects. When I push new content to GitHub, it automatically triggers CircleCI to build and publish the website.

When a CircleCI job starts, it looks for a circle.yml file in your repo to tell it what to do. Getting this configured can be quite a chore. You can use mostlygeek.com’s as a starting point:

machine:
  environment:
    #
    # Define environment variables used by the rest of the script
    # These control the operation of the script
    #
    # Note:
    #   $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY are required
    #   but they are defined in CircleCI's UI.
    #
    HUGO_VERSION: 0.30.2
    HUGO_ARCHIVE: hugo_${HUGO_VERSION}_Linux-64bit.tar.gz
    HUGO_SRC: https://github.com/spf13/hugo/releases/download/v${HUGO_VERSION}/$HUGO_ARCHIVE
    CIRCLE_BUILD_DIR: $HOME/$CIRCLE_PROJECT_REPONAME
    PATH: $PATH:$CIRCLE_BUILD_DIR/bin

  post:
    - mkdir -p $CIRCLE_BUILD_DIR/bin

# Install hugo, pygments and s3deploy tools. These are not included
# in CircleCI's environment so they must be installed manually.
dependencies:
  pre:
    - >
      if [ ! -e $CIRCLE_BUILD_DIR/bin/hugo ] || ! [[ `hugo version` =~ v${HUGO_VERSION} ]];
      then
          wget $HUGO_SRC;
          tar xvzf $HUGO_ARCHIVE -C $CIRCLE_BUILD_DIR/bin;
      fi

    # s3deploy simplifies static site deployment to S3
    - go get -v github.com/bep/s3deploy

  # speed things up for future builds
  cache_directories:
    - $CIRCLE_BUILD_DIR/bin

# use CircleCI's test stage to build the static site. If hugo exits
# with an error CircleCI will abort immediately.
test:
  override:
    - hugo -v

deployment:
  s3up:
    # deploy new assets to S3 when the `master` branch is updated.
    branch: master
    commands:

      # updates: http://mostlygeek.com.s3-website-us-east-1.amazonaws.com
      - s3deploy -source=public/ -region=us-east-1 -bucket=mostlygeek.com

      # Invalidate everything in the CDN. Since we do not know what
      # changed it is easier to just dump everything. Not a big
      # deal since mostlygeek.com isn't a massive site.
      #
      # This is an async step. This task does not wait for the CDN
      # to invalidate objects before continuing.
      - aws configure set preview.cloudfront true
      - >
        aws cloudfront create-invalidation
        --distribution-id $CLOUDFRONT_DISTRIBUTION_ID
        --paths '/*'

      # ping myself when the deployment steps have completed
      # only if PUSHOVER credentials have been set in CircleCI's UI
      - >
        [ -z "$PUSHOVER_APP_TOKEN" ] || curl
        --silent
        --form-string "token=$PUSHOVER_APP_TOKEN"
        --form-string "user=$PUSHOVER_USER_KEY"
        --form-string "message=blog deploy complete"
        "https://api.pushover.net/1/messages.json";

There’s some hidden magic you don’t see in the source above: a few secrets that are configured as environment variables through CircleCI’s web interface. These are:

  1. AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, credentials for the IAM user that uploads to S3 (created below)
  2. CLOUDFRONT_DISTRIBUTION_ID, used to invalidate the CDN cache after each deploy
  3. PUSHOVER_APP_TOKEN and PUSHOVER_USER_KEY, optional, for the deploy notification

I skipped the Pushover setup details. I’m lazy, so I use Pushover to notify me when the site’s deployed instead of checking it manually.

Check out my CircleCI deploy logs to see the above configuration in action. You may notice some (many) trial-and-error attempts at fixing weird bugs.
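
When a build misbehaves, it helps to reproduce the same steps on your own machine before pushing again. A minimal sketch, assuming hugo and s3deploy are on your PATH and you export the same AWS credentials locally (note that the s3deploy command really uploads, exactly like the deploy step in circle.yml):

# reproduce the "test" stage: build the site into ./public
hugo -v

# sanity check what would be uploaded
ls public/

# reproduce the deploy step with the same flags as circle.yml
export AWS_ACCESS_KEY_ID=<your key id>
export AWS_SECRET_ACCESS_KEY=<your secret key>
s3deploy -source=public/ -region=us-east-1 -bucket=mostlygeek.com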

Hosting on AWS

I’m going to gloss over a lot of details and instead focus on tips to save you some time. You can fill in the gaps with the AWS documentation (even though it can sometimes be impenetrable). Sorry. :)

My goals for hosting:

  1. Host files in S3
  2. Put CloudFront in front of S3 to make it fast and secure with a TLS certificate
  3. A new AWS IAM user with access limited to only the S3 bucket and CloudFront
  4. Route 53 (R53) to host the DNS

Create an S3 bucket for serving the site

  1. Create a new bucket; I named mine mostlygeek.com. Naming the bucket after your website’s DNS name lets you point a DNS record directly at the bucket. I’m not doing that, opting instead to use CloudFront to serve the site super fast.
  2. Enable static website hosting. Note: you’re going to need the S3 website hosting endpoint later when setting up CloudFront.
  3. Add a bucket policy. Use the one below to save yourself a lot of painful documentation reading. Be sure to replace <BUCKET NAME> with the name of the bucket you just created. If you prefer the command line, there’s a sketch right after the policy.

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "AllowPublicRead",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<BUCKET NAME>/*"
        }
    ]
}
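
For the command-line inclined, here is a rough sketch of steps 2 and 3 with the AWS CLI. The index.html and 404.html document names are assumptions; use whatever your Hugo theme actually generates, and save the policy above as policy.json.

# step 2: enable static website hosting on the bucket
aws s3 website s3://<BUCKET NAME>/ \
    --index-document index.html \
    --error-document 404.html

# step 3: attach the public-read bucket policy
aws s3api put-bucket-policy \
    --bucket <BUCKET NAME> \
    --policy file://policy.json

# confirm the website configuration took; the endpoint itself is
# <BUCKET NAME>.s3-website-<region>.amazonaws.com
aws s3api get-bucket-website --bucket <BUCKET NAME>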

Create an IAM user

In AWS IAM:

  1. Create a new user, with a username like blog_uploader.
  2. Create a set of security credentials and copy/paste them into the CircleCI environment as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The s3deploy tool in circle.yml will automatically use these.
  3. Add a permission policy for blog_uploader so it can upload to S3 and invalidate CloudFront. Just copy/paste mine, replacing <BUCKET NAME> with your S3 bucket name. There’s a CLI sketch for these steps right after the policy.

Copy/Tweak/Paste me:

{
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation",
                "s3:ListBucketMultipartUploads"
            ],
            "Resource": "arn:aws:s3:::<BUCKET NAME>"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:DeleteObject",
                "s3:DeleteObjectVersion",
                "s3:GetObject",
                "s3:GetObjectAcl",
                "s3:GetObjectVersion",
                "s3:GetObjectVersionAcl",
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:PutObjectAclVersion"
            ],
            "Resource": "arn:aws:s3:::<BUCKET NAME>/*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "cloudfront:CreateInvalidation",
            "Resource": "*"
        }
    ]
}
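
If you’d rather script the IAM setup than click through the console, a rough sketch with the AWS CLI. The policy name blog-deploy and the iam-policy.json filename are made up for this example; the policy document is the JSON above.

# step 1: create the deploy user
aws iam create-user --user-name blog_uploader

# step 2: create access keys; copy these into CircleCI as
# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
aws iam create-access-key --user-name blog_uploader

# step 3: attach the inline permission policy saved as iam-policy.json
aws iam put-user-policy \
    --user-name blog_uploader \
    --policy-name blog-deploy \
    --policy-document file://iam-policy.json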

Create a CloudFront Distribution

  1. Create a new CloudFront distribution in the AWS console.
  2. Set the Origin to your S3 bucket. There’s a trick here: since we enabled Static Website Hosting on the bucket, you can use the website hosting DNS name as a custom origin. You can get it from the bucket’s Properties > Static website hosting configuration. For example, this blog’s is mostlygeek.com.s3-website-us-east-1.amazonaws.com. I find this easier than using the plain S3 origin.
  3. Copy/paste the distribution ID into a new CircleCI environment variable named CLOUDFRONT_DISTRIBUTION_ID. This will allow CircleCI to flush CloudFront’s cache when deploying the site. If you need to look the ID up later, see the CLI sketch after this list.
  4. If you want an HTTPS certificate you can request a free one from AWS ACM. There’s a big button and a bunch of hoops to jump through, but it should be pretty straightforward.
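
A minimal sketch for finding the distribution ID and flushing the cache from the CLI, if you’d rather not dig through the console. The --query expression is just one way to trim the output down to the interesting fields.

# list distribution IDs alongside the domain aliases they serve
aws cloudfront list-distributions \
    --query 'DistributionList.Items[].{Id: Id, Aliases: Aliases.Items}'

# manually flush the cache, same as the deploy step in circle.yml
aws cloudfront create-invalidation \
    --distribution-id $CLOUDFRONT_DISTRIBUTION_ID \
    --paths '/*'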

Pointing your DNS at your Cloudfront Distribution

I use Route 53 (R53) for DNS hosting. A really nice feature is being able to create an R53 alias record that points at a CloudFront distribution. This way, when your computer resolves mostlygeek.com, it gets back IP addresses for the closest CloudFront servers instead of a CNAME, and it works at the zone apex where a CNAME isn’t allowed.
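
For reference, here is a rough sketch of creating that alias record with the AWS CLI instead of the console. The hosted zone ID Z2FDTNDATAQYW2 is the fixed ID AWS uses for every CloudFront alias target; <YOUR HOSTED ZONE ID> and dXXXXXXXXXXXXX.cloudfront.net are placeholders for your own zone and distribution.

# create or replace the apex alias record pointing at the CloudFront distribution
aws route53 change-resource-record-sets \
    --hosted-zone-id <YOUR HOSTED ZONE ID> \
    --change-batch '{
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "mostlygeek.com.",
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": "Z2FDTNDATAQYW2",
                    "DNSName": "dXXXXXXXXXXXXX.cloudfront.net.",
                    "EvaluateTargetHealth": false
                }
            }
        }]
    }'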