December 17, 2024

Configuration Management and Infrastructure as Code

CloudFormation Template Errors

  • Invalid Value or Unsupported Resource Property
    • EITHER a parameter is misnamed
    • OR a property name is not supported by the resource type
  • Resource Failed to Stabilize
    1. timeout exceeded, or
    2. AWS service isn't available, or
    3. AWS service was interrupted

Stack Fails to Roll Back

  • most often happens when deployment account has permission to create stacks
    • but lacks permission to modify or delete stacks
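
A minimal sketch of a policy check for the rollback problem above. The policy document, the wildcard resource, and the `missing_actions` helper are illustrative assumptions, not a complete deployment policy; a real rollback also needs permissions on the resources the stack manages.

```python
# Hypothetical policy for a deployment role: rollback needs more than CreateStack.
DEPLOY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudformation:CreateStack",
                "cloudformation:UpdateStack",
                "cloudformation:DeleteStack",
                "cloudformation:DescribeStacks",
            ],
            "Resource": "*",  # assumption: scope this to specific stack ARNs in practice
        }
    ],
}


def missing_actions(policy, required):
    """Return required actions that no Allow statement lists explicitly.

    Simplification: wildcards such as "cloudformation:*" are not expanded.
    """
    granted = set()
    for statement in policy["Statement"]:
        if statement["Effect"] == "Allow":
            actions = statement["Action"]
            granted.update([actions] if isinstance(actions, str) else actions)
    return sorted(set(required) - granted)
```

Calling `missing_actions(DEPLOY_POLICY, ["cloudformation:DeleteStack"])` returns an empty list here; a create-only policy would report the delete action as missing.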

EC2 Auto Scaling

  • add hooks to launching and terminating stages of instance lifecycle
    • hooks can send an SNS notification and hold the instance in a wait state (Pending:Wait or Terminating:Wait)
    • trigger a Lambda function from the SNS topic
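
The hook-to-Lambda flow above can be sketched as follows. This assumes the SNS message body is the JSON lifecycle notification sent by Auto Scaling; the helper name `lifecycle_completion_args` is hypothetical, and the AWS call itself is left out so the sketch runs without credentials.

```python
import json


def lifecycle_completion_args(sns_event, result="CONTINUE"):
    """Build one argument dict per lifecycle notification found in an SNS event."""
    completions = []
    for record in sns_event["Records"]:
        message = json.loads(record["Sns"]["Message"])
        # Messages without a transition (for example test notifications) are skipped.
        if "LifecycleTransition" not in message:
            continue
        completions.append({
            "LifecycleHookName": message["LifecycleHookName"],
            "AutoScalingGroupName": message["AutoScalingGroupName"],
            "LifecycleActionToken": message["LifecycleActionToken"],
            "InstanceId": message["EC2InstanceId"],
            "LifecycleActionResult": result,  # "CONTINUE" or "ABANDON"
        })
    return completions
```

Inside a real Lambda handler, each dict could be passed to the boto3 Auto Scaling client's `complete_lifecycle_action(**args)` once the custom work on the instance is done, which releases the instance from its wait state.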

Elastic Network Interface (ENI)

  • EC2 Auto Scaling does not allow specifying a second ENI in the launch configuration
  • aws:createENI is not a valid SSM automation document action

CloudFormation Resource Replacement

  • RDS DB port changed
    • if the "Port" property on AWS::RDS::DBInstance has an update requirement of "Replacement"
      • the DB will be replaced with a new instance, with possible data loss
      • the DB will have to be restored from backups
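
One way to catch a replacement before it happens is to inspect a change set first. A minimal sketch, assuming a dict shaped like the CloudFormation DescribeChangeSet response; the function name is hypothetical.

```python
def resources_to_be_replaced(change_set):
    """List (logical ID, type, replacement) for updates that may replace a resource."""
    flagged = []
    for change in change_set.get("Changes", []):
        resource = change.get("ResourceChange", {})
        # For Modify actions, Replacement is "True", "False", or "Conditional".
        if resource.get("Action") == "Modify" and resource.get("Replacement") in ("True", "Conditional"):
            flagged.append((
                resource["LogicalResourceId"],
                resource["ResourceType"],
                resource["Replacement"],
            ))
    return flagged
```

If the database shows up in the returned list, take a snapshot or rethink the change before executing the change set.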

CloudFormation DependsOn Resource

  • ensure first stack sends completion signal before starting second stack
  • add CreationPolicy with long timeout so first stack doesn't time out
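
A sketch of the two attributes above, written as a Python dict that serializes to a JSON template. It treats "first" and "second" as two resources in one template, which is an assumption; the logical IDs, AMI ID, and 30-minute timeout are placeholders.

```python
import json

TEMPLATE = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {"ImageId": "ami-12345678", "InstanceType": "t3.micro"},  # placeholder values
            # CloudFormation waits for one success signal, for up to 30 minutes,
            # before marking AppServer as created.
            "CreationPolicy": {"ResourceSignal": {"Count": 1, "Timeout": "PT30M"}},
        },
        "WorkQueue": {
            "Type": "AWS::SQS::Queue",
            # Creation starts only after AppServer has reached CREATE_COMPLETE.
            "DependsOn": "AppServer",
        },
    },
}


def dependencies(template, logical_id):
    """Return the explicit DependsOn entries of a resource as a list."""
    depends_on = template["Resources"][logical_id].get("DependsOn", [])
    return [depends_on] if isinstance(depends_on, str) else list(depends_on)


template_body = json.dumps(TEMPLATE, indent=2)
```

The instance would typically send its signal with the `cfn-signal` helper script at the end of its bootstrap.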

Create Thumbnail Images from Full-Size Images in S3

  • upload full-size image to S3
  • automate creating thumbnail images
    • S3 Event Trigger executes Lambda function
    • Lambda function creates thumbnail and saves it to a different bucket
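
The event-handling half of that flow can be sketched as below. The bucket name and the `thumbnail_jobs` helper are hypothetical, and the actual resizing (which needs an image library) is left out.

```python
from urllib.parse import unquote_plus

THUMBNAIL_BUCKET = "example-thumbnails"  # hypothetical destination bucket


def thumbnail_jobs(s3_event, dest_bucket=THUMBNAIL_BUCKET):
    """Map each uploaded object in an S3 event to a source and target location."""
    jobs = []
    for record in s3_event["Records"]:
        source_bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append({"source": (source_bucket, key), "target": (dest_bucket, key)})
    return jobs
```

Writing to a different bucket matters: if the thumbnail went back into the source bucket, the upload would trigger the function again and could loop indefinitely.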

Fargate Configuration Errors

  • Invalid CPU Setting
  • Invalid Memory Setting
    • ensure values in Task Definition show as supported in documentation
  • Repository Not Found
    • incorrect ECR image specification
    • resolve by correctly updating URI or ARN
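
A sketch of a pre-deployment check for the CPU and memory errors above. The table covers only task sizes up to 4 vCPU and is an assumption to verify against the current Fargate documentation, since larger sizes exist and supported values change over time.

```python
# CPU units -> allowed memory values in MiB (partial table, up to 4 vCPU).
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def check_task_size(cpu, memory):
    """Return "ok" or the kind of configuration error for a cpu/memory pair."""
    cpu, memory = int(cpu), int(memory)  # task definitions store these as strings
    if cpu not in FARGATE_MEMORY_MIB:
        return "Invalid CPU setting"
    if memory not in FARGATE_MEMORY_MIB[cpu]:
        return "Invalid memory setting"
    return "ok"
```

For example, `check_task_size("256", "4096")` reports an invalid memory setting because 4096 MiB is not offered with 256 CPU units.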

Lambda Throttling

  • Lambda Throttle option (sets the function's reserved concurrency to zero) is used to stop and troubleshoot Lambda function endless loops

Step Function Best Practices

  1. Avoid latency when polling for Activity Tasks
  2. Use Amazon S3 ARNs instead of passing large payloads (store the data in S3 and pass a reference)
  3. Avoid reaching the execution history limit of 25,000 events (writing/saving too much history)

OpsWorks

  • allows creating applications with pre-built layer templates
    • create servers, RDS instances, load balancers, etc.
  • allows using Chef-Solo recipes
  • allows multiple access levels, e.g. deploy permissions and manage permissions

EC2 Spread Placement

  • place each EC2 instance on distinct underlying hardware; the group can span different AZs

Elastic Beanstalk

  • rolling deployment
  • roll back by manually redeploying the previous version

Step Functions

  • error scenarios and retry strategies
    • runtime errors can occur in any stage
      • e.g. Lambda function throws an exception
      • e.g. network exception, timeout
    • Runtime errors are handled differently than Failures
      • example: state machine definition errors
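    • When a state reports an error, the default is to:
      • 1) log the error
      • 2) fail the execution entirely, unless the state defines a Retry or Catch field
        • a Retry field defaults to 3 attempts, starting 1 second after the error (IntervalSeconds 1, MaxAttempts 3, BackoffRate 2.0)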
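
A sketch of an explicit retry strategy, written as a Python dict that serializes to an Amazon States Language definition. The state names, Lambda ARN, and error strings are hypothetical; `Retry`, `Catch`, and the fallback values used in `retry_delays` follow the Amazon States Language defaults.

```python
import json

RETRIER = {
    "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
    "IntervalSeconds": 1,
    "MaxAttempts": 3,
    "BackoffRate": 2.0,
}

STATE_MACHINE = {
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-order",  # placeholder
            "Retry": [RETRIER],
            # Once retries are exhausted, route any error to a handler state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
            "End": True,
        },
        "NotifyFailure": {"Type": "Fail", "Error": "ProcessOrderFailed", "Cause": "Retries exhausted"},
    },
}


def retry_delays(retrier):
    """Seconds waited before each retry attempt, applying exponential backoff."""
    interval = retrier.get("IntervalSeconds", 1)
    attempts = retrier.get("MaxAttempts", 3)
    rate = retrier.get("BackoffRate", 2.0)
    return [interval * rate ** attempt for attempt in range(attempts)]


definition = json.dumps(STATE_MACHINE, indent=2)
```

With these settings the task is retried after 1, 2, and 4 seconds before the `Catch` clause takes over.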