AWS Pull Request and Pre-Merge Validation

Lamia Cero

DATA ENGINEER

This post walks through a Bash script that automates a critical part of infrastructure management: it validates and deploys any changes to CloudFormation templates or Step Functions configurations before they are merged into the main branch of the repository. This automation helps maintain the integrity and stability of the infrastructure deployment process.

The Core of Automation

At the heart of the script is the automation of tasks that would otherwise require manual intervention, which reduces the potential for human error. The script is written in Bash and uses several AWS services, including AWS CodeCommit, Amazon SES (Simple Email Service), and AWS CloudFormation.

Script Setup and Initial Configuration

Debug Mode Activation: The script is configured to operate within the AWS environment. Debugging is enabled from the start, which allows the script to log detailed execution steps, aiding in troubleshooting and ensuring transparency of the operations being performed.
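As a rough illustration of what enabling debugging "from the start" can look like in Bash, the sketch below turns on command tracing at the top of the script. The default region value is a hypothetical placeholder, not something taken from the original script.

```shell
#!/usr/bin/env bash
# Trace each command to stderr before it runs (debug mode).
set -x

# Hypothetical default so the AWS CLI calls later in the script have a region.
AWS_REGION="${AWS_REGION:-us-east-1}"
```

With tracing on, every AWS CLI call and its expanded arguments appear in the build log, which is usually enough to see which step failed.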

Pull Request Management Functions

Pull request management is the foundation of the script. Two functions identify the PR to process and the developer to notify:

Extracting PR Numbers: The script uses AWS CodeCommit commands to fetch the latest PR associated with a specific repository. This functionality is vital as it identifies the current PR to be processed for checks, merges, or other necessary actions. By automating this retrieval, the script ensures that actions are always taken on the most recent submissions, maintaining workflow continuity.

# Function to extract PR number
extract_pr_number() {
    # Get the latest pull request associated with the repository
    latest_pr=$(aws codecommit list-pull-requests \
                    --region "$AWS_REGION" \
                    --repository-name "your-repo-name" \
                    --query "pullRequestIds[0]" \
                    --output text)

    # Look up that pull request and read back its ID
    pr_number=$(aws codecommit get-pull-request \
                    --region "$AWS_REGION" \
                    --pull-request-id "$latest_pr" \
                    --query "pullRequest.pullRequestId" \
                    --output text)
    
    echo "$pr_number"
}

Extracting Author Email for Notifications: After identifying the PR, the script extracts the PR author’s email from AWS CodeCommit. This step is critical for direct communication, particularly in scenarios where merge conflicts or other issues that require developer intervention are detected. Automating this process ensures that notifications are timely and reduces manual lookup errors.

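# Function to extract PR email
extract_pr_email() {
    pr_id="$1"

    # ARN of the user who opened the pull request
    author_arn=$(aws codecommit get-pull-request --pull-request-id "$pr_id" --query "pullRequest.authorArn" --output text)

    # Assumption: the last segment of the ARN (the user or session name) is the author's email address
    echo "${author_arn##*/}"
}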

Conflict Checking and Automated Merging

One of the script’s critical capabilities is its automated approach to handling PR merges:

Automated Conflict Detection: Before merging, the script checks for conflicts between the source and destination branches using AWS CodeCommit’s conflict detection features. This automated check is crucial to prevent problematic merges that could lead to broken builds or deployment failures.

mergeable=$(aws codecommit get-merge-conflicts --repository-name "your-repo-name" --source-commit-specifier "${source_commit_id}" --destination-commit-specifier "${destination_commit_id}" --merge-option SQUASH_MERGE | jq -r .mergeable)

Conditional Automated Merging: If no conflicts are detected, the script proceeds with the merge automatically, significantly speeding up the integration process. This automation reduces the manual workload on developers and integration teams, allowing them to focus on more complex tasks.
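The conditional merge can be sketched as follows. This is a minimal illustration rather than the script's actual merge function: the helper name merge_if_mergeable is hypothetical, and it assumes a squash merge to match the SQUASH_MERGE option used in the conflict check.

```shell
# Hypothetical helper: merge a pull request only when CodeCommit reports no conflicts.
merge_if_mergeable() {
    pr_id="$1"
    mergeable="$2"   # "true" or "false", as read from get-merge-conflicts

    if [ "$mergeable" = "true" ]; then
        aws codecommit merge-pull-request-by-squash \
            --pull-request-id "$pr_id" \
            --repository-name "your-repo-name"
    else
        echo "Conflicts detected in PR $pr_id; skipping merge." >&2
        return 1
    fi
}
```

A caller would pass the value captured in the mergeable variable, for example merge_if_mergeable "$pr_id" "$mergeable".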

CloudFormation Integration

The integration with AWS CloudFormation showcases an advanced use case:

Template Validation: Each PR is checked to ensure that any CloudFormation templates modified are valid and error-free.

Change Set Management: The script automates the creation and management of change sets for CloudFormation stacks.

Change Set Management

Understanding Change Sets

Change sets are a crucial feature of AWS CloudFormation that provide a preview of how proposed changes to a stack might impact existing resources or create new ones before the changes are implemented. This allows for:

Risk Assessment: Administrators can review the changes to be applied, helping to avoid unintended consequences.

Approval Processes: Changes can be reviewed and approved in a controlled manner, ensuring only desired modifications are made.

Sequential Updates: Ensures that updates to the infrastructure are made in a systematic, predictable fashion.

This integration is particularly useful in environments where infrastructure needs to be repeatedly and reliably recreated:

Change Set Creation and Management: The script automates the creation of change sets within AWS CloudFormation. This preview capability supports risk assessment, because errors in a template are caught when the change set is created. If a template is missing attributes or has indentation errors, creation of the change set fails and the developer is notified immediately that the changed template contains an error.

Detailed Change Set Execution Monitoring: After initiating a change set, the script monitors its execution status. This monitoring is critical to ensure successful deployments and to quickly address any issues that arise during the execution phase. By automating it, the script saves time and makes deployments more reliable. If the state machine definition has errors that cause the states stack to fail, execution of the change set fails and the developer is notified that the changes contain an error that needs to be fixed.

execute_change_set() {
    stack_name="$1"
    change_set_name="$2"

    # Executes the change set.
    aws cloudformation execute-change-set --region "$AWS_REGION" --stack-name "$stack_name" --change-set-name "$change_set_name" || return 1

    while true; do
        # Queries the execution status of the change set.
        execution_status=$(aws cloudformation describe-change-set --region "$AWS_REGION" --stack-name "$stack_name" --change-set-name "$change_set_name" --query "ExecutionStatus" --output text 2>/dev/null || echo "NOT_FOUND")

        case $execution_status in
        EXECUTE_IN_PROGRESS)
            # Assumed branch: keep polling while the change set is still executing.
            sleep 10
            ;;
        EXECUTE_COMPLETE)
            # Assumed branch: the change set finished successfully.
            echo "Change set executed successfully."
            return 0
            ;;
        EXECUTE_FAILED)
            # Assumed branch: the change set failed to execute.
            echo "Change set execution failed."
            return 1
            ;;
        NOT_FOUND)
            # The change set can no longer be described, so fall back to the stack status.
            stack_status=$(aws cloudformation describe-stacks --region "$AWS_REGION" --stack-name "$stack_name" --query "Stacks[0].StackStatus" --output text)
            echo "Change set not found. Checking stack status: $stack_status"
            if [[ "$stack_status" == "UPDATE_COMPLETE" ]]; then
                echo "Stack update was applied successfully."
                return 0
            elif [[ "$stack_status" == "UPDATE_ROLLBACK_COMPLETE" ]]; then
                echo "Stack update failed."
                return 1
            else
                echo "Change set not found and stack status is not complete: $stack_status"
                return 1
            fi
            ;;
        *)
            echo "Unknown status, handling as error."
            return 1
            ;;
        esac
    done
}
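Before a change set can be executed, it has to be created. The sketch below shows one way that step might look, assuming the template is available as a local file. The function name create_change_set and its argument order are hypothetical. The wait subcommand returns a non-zero status when creation fails, which is where template errors such as missing attributes would surface.

```shell
# Hypothetical helper: create a change set and wait until it is ready to execute.
create_change_set() {
    stack_name="$1"
    change_set_name="$2"
    template_file="$3"

    aws cloudformation create-change-set \
        --region "$AWS_REGION" \
        --stack-name "$stack_name" \
        --change-set-name "$change_set_name" \
        --template-body "file://$template_file" || return 1

    # Blocks until creation completes; exits non-zero if the change set fails.
    aws cloudformation wait change-set-create-complete \
        --region "$AWS_REGION" \
        --stack-name "$stack_name" \
        --change-set-name "$change_set_name"
}
```

A non-zero return here is the natural point to notify the developer instead of calling execute_change_set.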

Breaking Down the Main Function

The function is designed to handle changes submitted through pull requests in a Git repository managed by AWS CodeCommit. When a pull request is created, the script gets triggered and performs the following tasks:

Retrieves the IDs of the source and destination commits involved in the pull request.
Identifies the files changed in the pull request.
Validates and potentially deploys these changes if they include CloudFormation templates or Step Functions configurations.

Detailed Breakdown

Step 1: Fetch Commit IDs

The function starts by identifying the source and destination commits associated with the pull request using the AWS CLI. These commits represent the changes proposed in the pull request and the current state of the base branch to which the pull request is made:

source_commit_id=$(aws codecommit get-pull-request --pull-request-id "$pr_id" --query "pullRequest.pullRequestTargets[0].sourceCommit" --output text)
destination_commit_id=$(aws codecommit get-pull-request --pull-request-id "$pr_id" --query "pullRequest.pullRequestTargets[0].destinationCommit" --output text)
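Both commit IDs come from the same API response, so they can also be read with a single call. The following is a sketch of that variation, not the original script's approach. The function name get_commit_ids is hypothetical, and it relies on the AWS CLI's text output placing the two selected values on one whitespace-separated line.

```shell
# Hypothetical helper: read source and destination commit IDs in one request.
get_commit_ids() {
    pr_id="$1"

    commits=$(aws codecommit get-pull-request \
        --pull-request-id "$pr_id" \
        --query "pullRequest.pullRequestTargets[0].[sourceCommit,destinationCommit]" \
        --output text)

    # Split the whitespace-separated pair into the two variables used later.
    set -- $commits
    source_commit_id="$1"
    destination_commit_id="$2"
}
```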

Step 2: Identify Changed Files

Next, the function lists all files that have been changed in the pull request. It specifically looks for differences between the destination and source commits:

changed_files=$(aws codecommit get-differences --repository-name "your-repo-name" --before-commit-specifier "$destination_commit_id" --after-commit-specifier "$source_commit_id" --query "differences[].afterBlob.path" --output text)

If changes are detected, the function proceeds; if not, it exits, indicating no changes need to be validated.
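The early exit can be expressed as a small guard. This is a sketch under the assumption that an empty result, or the literal text None that the AWS CLI prints for a null value, means there is nothing to validate. The helper name has_changes is hypothetical.

```shell
# Hypothetical helper: succeed only when the diff lists at least one path.
has_changes() {
    changed_files="$1"
    [ -n "$changed_files" ] && [ "$changed_files" != "None" ]
}
```

In the main function this could be used as: has_changes "$changed_files" || { echo "No changes to validate."; exit 0; }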

Step 3: Process Each File

For each file identified in the list of changes, the function does the following:

Retrieves the blob ID of the file to access its content.
Downloads the file content and stores it locally.
Creates a template for the change set from the downloaded file.
Validates the template and/or the state machine definition.
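The first two steps of this list might look like the sketch below. It is an illustration under stated assumptions rather than the original code: the function name fetch_changed_file and its argument order are hypothetical, and it assumes the AWS CLI returns blob content base64-encoded.

```shell
# Hypothetical helper: look up a changed file's blob ID and download its content.
fetch_changed_file() {
    file_path="$1"
    before_commit="$2"
    after_commit="$3"
    out_dir="$4"

    # Blob ID of the file as it exists in the source commit.
    blob_id=$(aws codecommit get-differences \
        --repository-name "your-repo-name" \
        --before-commit-specifier "$before_commit" \
        --after-commit-specifier "$after_commit" \
        --query "differences[?afterBlob.path=='$file_path'].afterBlob.blobId | [0]" \
        --output text)

    local_file="$out_dir/${file_path##*/}"

    # Decode the base64 content into a local file.
    aws codecommit get-blob \
        --repository-name "your-repo-name" \
        --blob-id "$blob_id" \
        --query "content" \
        --output text | base64 -d > "$local_file"

    echo "$local_file"
}
```

The function prints the local path, so a caller can capture it and pass it on to validation.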

Step 4: Validate and Deploy CloudFormation Templates and Step Functions

The function handles two specific types of files differently:

Step Functions Configuration Files: These files are uploaded to an S3 bucket for deployment.

After the file is uploaded to the S3 bucket, the function fetches the CloudFormation template and modifies it in place, updating the Location parameter to point to the S3 location where this pull request's configuration file was uploaded. This ensures that the CloudFormation stack uses the most current version of the Step Functions configuration.

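# template_file: local path of the fetched CloudFormation template (variable name assumed)
sed -i "s@Location: .*@Location: !Sub s3://step-functions-bucket/PR${pr_id}_${file##*/}@g" "$template_file"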
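To make the upload and substitution concrete, the sketch below combines them in one helper. The function name publish_definition and its arguments are hypothetical. It writes through a temporary file instead of using sed -i, which is a portability choice and not something taken from the original script.

```shell
# Hypothetical helper: upload a Step Functions definition and point the template at it.
publish_definition() {
    local_file="$1"      # downloaded Step Functions definition
    template_file="$2"   # local copy of the CloudFormation template
    pr_id="$3"

    s3_uri="s3://step-functions-bucket/PR${pr_id}_${local_file##*/}"

    aws s3 cp "$local_file" "$s3_uri" || return 1

    # Same substitution as above, written via a temp file so it also works
    # where sed lacks GNU-style -i.
    sed "s@Location: .*@Location: !Sub ${s3_uri}@g" "$template_file" > "$template_file.tmp" &&
        mv "$template_file.tmp" "$template_file"
}
```

For pull request 17 and a file named definition.json, a line such as Location: s3://old-bucket/old.json becomes Location: !Sub s3://step-functions-bucket/PR17_definition.json.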

The function then creates a change set for the changed CloudFormation stack and attempts to execute it.

execute_change_set "states-stack" "$change_set_name"
execute_result=$?
if [ $execute_result -ne 0 ]; then
    step_function_executed=false
else
    echo "Change set executed successfully for predefined stack: states-stack"
    step_function_executed=true
fi

CloudFormation Templates: These templates are validated directly using the validate_template function. If the validation passes, further actions (like merging the pull request) might be triggered.

# Function to validate CloudFormation template
validate_template() {
    template_file="$1"

    # Validate the CloudFormation template stored in the local file,
    # and report the result through the function's exit status
    if aws cloudformation validate-template \
        --template-body "file://$template_file" > /dev/null; then
        return 0
    else
        echo "Template validation failed: $template_file"
        return 1
    fi
}

Step 5: Final Decision Making

Based on the results of the validations and deployments:

If all validations and deployments succeed, the pull request will be automatically merged.
If any validation or deployment fails, appropriate actions are taken, such as notifying the developer who created the PR.

# Check results and act accordingly
if [ "$validation_failed" = false ] && [ "$step_function_executed" = true ]; then
    echo "All templates validated and executed successfully. Merging pull request..."
    merge_pull_request "$1"
elif [ "$validation_failed" = false ] && [ "$step_function_files_found" = false ]; then
    echo "CloudFormation template validation successful. Merging pull request..."
    merge_pull_request "$1"
# Additional checks and actions continue...
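For the failure path, a notification through Amazon SES might look like the sketch below. It is an assumption about how such a step could be written, not the original code: the function name notify_pr_author and the NOTIFY_FROM variable are hypothetical, and the sender would need to be an address verified in SES.

```shell
# Hypothetical helper: email the PR author when validation or deployment fails.
notify_pr_author() {
    author_email="$1"
    pr_id="$2"
    reason="$3"

    aws ses send-email \
        --region "$AWS_REGION" \
        --from "$NOTIFY_FROM" \
        --to "$author_email" \
        --subject "PR $pr_id failed pre-merge validation" \
        --text "Pull request $pr_id was not merged: $reason"
}
```

The address passed as the first argument would come from extract_pr_email.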

Conclusion

In summary, the script ties code review directly to deployment. Each pull request is checked for merge conflicts, its CloudFormation templates are validated, and its Step Functions changes are deployed through a change set before the merge happens. When any of these steps fails, the PR author is notified and the change stays out of the main branch.