<h1>No-ssh deployment to EC2 using ansible and AWS Systems Manager</h1>
<p>2024-01-15</p>
<p>In a <a href="/2020/03/25/deploying-with-ansible-systemd/">previous post</a>, we
saw how to deploy an application (small golang service) using ansible
and systemd. In that flow, ansible execution depended upon the remote
server accepting ssh connections. However, there are many
situations in which the remote server does not have an open ssh port
for security reasons (e.g. compliance with security requirements).</p>
<p>In such cases, where there is no direct access to the EC2 instance,
we have the option of using <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/what-is-systems-manager.html">AWS Systems
Manager</a>
as a route of sorts to our remote host. AWS Systems Manager (SSM in
short) enables a multitude of capabilities on a fleet of “managed
nodes”. In our case, a managed node is an EC2 instance that runs the
SSM agent and has the necessary IAM permissions for being part of
SSM’s fleet of managed nodes. We will use SSM’s ability to send a
command to a managed node (our EC2 instance). SSM’s <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/run-command.html">“Run
Command”</a>
functionality offers a variety of presets (called
<a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/documents.html">documents</a>)
for various flows, including one that specifies how to execute an
ansible playbook locally (which is the one we will use).</p>
<p>In this post, we will build a solution step-by-step that will:</p>
<ul>
<li>prepare the AWS stack with all the necessary resources (tool: cloudformation)</li>
<li>perform the deployment of the application and the dependencies (tools: ansible & AWS Systems Manager)</li>
</ul>
<p>The full source code of the solution is hosted in <a href="https://github.com/kkentzo/deployment-ansible-ssm-systemd-demo">this
repository</a>.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>This guide depends on the existence of the following tools:</p>
<ul>
<li><a href="https://github.com/kkentzo/ork/releases/tag/v1.7.3"><code class="language-plaintext highlighter-rouge">ork</code></a>: a workflow automation tool which we will use in order to define the various actions that have to be performed, their dependencies and their content</li>
<li><a href="https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html"><code class="language-plaintext highlighter-rouge">aws-cli</code></a>: the official cli tool for interacting with the AWS APIs</li>
<li><a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html">Session manager plug-in</a> for <code class="language-plaintext highlighter-rouge">aws-cli</code> (optional): open interactive sessions to EC2 servers using AWS Systems Manager</li>
</ul>
<p>The presence of <code class="language-plaintext highlighter-rouge">ansible</code> on the dev machine is not necessary since
<code class="language-plaintext highlighter-rouge">ansible</code> will actually be executed on the remote server.</p>
<h2 id="the-application">The application</h2>
<p>The application that we are going to deploy is a trivial http server
in <code class="language-plaintext highlighter-rouge">golang</code> that just returns a greeting along with an http status of
<code class="language-plaintext highlighter-rouge">200</code> (file <code class="language-plaintext highlighter-rouge">demo.go</code>):</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span>
<span class="k">import</span> <span class="p">(</span>
<span class="s">"flag"</span>
<span class="s">"fmt"</span>
<span class="s">"log"</span>
<span class="s">"net/http"</span>
<span class="p">)</span>
<span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">var</span> <span class="n">port</span> <span class="kt">string</span>
<span class="n">flag</span><span class="o">.</span><span class="n">StringVar</span><span class="p">(</span><span class="o">&</span><span class="n">port</span><span class="p">,</span> <span class="s">"port"</span><span class="p">,</span> <span class="s">"8080"</span><span class="p">,</span> <span class="s">"Specify the service port"</span><span class="p">)</span>
<span class="n">flag</span><span class="o">.</span><span class="n">Parse</span><span class="p">()</span>
<span class="n">http</span><span class="o">.</span><span class="n">HandleFunc</span><span class="p">(</span><span class="s">"/"</span><span class="p">,</span> <span class="k">func</span><span class="p">(</span><span class="n">w</span> <span class="n">http</span><span class="o">.</span><span class="n">ResponseWriter</span><span class="p">,</span> <span class="n">r</span> <span class="o">*</span><span class="n">http</span><span class="o">.</span><span class="n">Request</span><span class="p">)</span> <span class="p">{</span>
<span class="k">var</span> <span class="n">name</span> <span class="kt">string</span>
<span class="k">if</span> <span class="n">name</span> <span class="o">=</span> <span class="n">r</span><span class="o">.</span><span class="n">URL</span><span class="o">.</span><span class="n">Path</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="p">];</span> <span class="n">name</span> <span class="o">==</span> <span class="s">""</span> <span class="p">{</span>
<span class="n">name</span> <span class="o">=</span> <span class="s">"stranger"</span>
<span class="p">}</span>
<span class="n">fmt</span><span class="o">.</span><span class="n">Fprintf</span><span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="s">"hello %s!"</span><span class="p">,</span> <span class="n">name</span><span class="p">)</span>
<span class="p">})</span>
<span class="n">log</span><span class="o">.</span><span class="n">Printf</span><span class="p">(</span><span class="s">"Starting service [port=%s]"</span><span class="p">,</span> <span class="n">port</span><span class="p">)</span>
<span class="n">log</span><span class="o">.</span><span class="n">Fatal</span><span class="p">(</span><span class="n">http</span><span class="o">.</span><span class="n">ListenAndServe</span><span class="p">(</span><span class="n">fmt</span><span class="o">.</span><span class="n">Sprintf</span><span class="p">(</span><span class="s">":%s"</span><span class="p">,</span> <span class="n">port</span><span class="p">),</span> <span class="no">nil</span><span class="p">))</span>
<span class="p">}</span>
</code></pre></div></div>
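<p>Once the binary is running, a quick smoke test can confirm the greeting. The sketch below assumes that <code>curl</code> is installed and that the service is reachable at the given base URL (for example <code>http://localhost:8080</code>, the default port in the code above); the function name <code>check_greeting</code> is only illustrative.</p>

```shell
#!/bin/sh
# Hypothetical helper: fetch the greeting and compare it with the expected text.
# Usage: check_greeting BASE_URL [NAME]
check_greeting() {
    base_url="$1"
    name="$2"
    # the service greets "stranger" when the request path is empty
    expected="hello ${name:-stranger}!"
    actual="$(curl -s "${base_url}/${name}")" || return 1
    if [ "${actual}" = "${expected}" ]; then
        echo "ok: ${actual}"
    else
        echo "unexpected response: ${actual}" >&2
        return 1
    fi
}
```

<p>For example, <code>check_greeting http://localhost:8080</code> expects <code>hello stranger!</code>, while <code>check_greeting http://localhost:8080 bob</code> expects <code>hello bob!</code>.</p>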
<h2 id="solution-overview">Solution Overview</h2>
<p>Our solution is based on the approach that the ansible playbook will
be executed locally on our EC2 server (since there is no ssh
connection to the remote host). AWS SSM will be responsible for
downloading the ansible playbook to the server and executing it. For
that to happen we will need to send the relevant command to SSM over
the AWS API. This command needs the following pieces of information
which we will model in the form of environment variables to a bash
script containing the command:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">INSTANCE_ID</code>: the id of the ec2 instance to which the command needs to be sent</li>
<li><code class="language-plaintext highlighter-rouge">ANSIBLE_PLAYBOOKS_PATH</code>: a link to a zip file in S3 containing the ansible playbooks</li>
<li><code class="language-plaintext highlighter-rouge">PLAYBOOK_FILE</code>: the playbook file to be executed</li>
<li><code class="language-plaintext highlighter-rouge">LOG_GROUP</code>: the AWS Log Group to which the logs of the ansible execution will be sent</li>
<li><code class="language-plaintext highlighter-rouge">AWS_REGION</code>: the AWS region to which we want to send the SSM command</li>
</ul>
<p>The command script essentially checks that these variables are all
defined and subsequently <a href="https://awscli.amazonaws.com/v2/documentation/api/latest/reference/ssm/send-command.html">dispatches the SSM
command</a>
(<code class="language-plaintext highlighter-rouge">scripts/ssm_send_command.sh</code>):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="c"># This script sends a command to AWS SSM that:</span>
<span class="c"># - instructs a particular instance ($INSTANCE_ID)</span>
<span class="c"># - to execute an ansible playbook ($PLAYBOOK_FILE)</span>
<span class="c"># - that is located in an S3 bucket ($ANSIBLE_PLAYBOOKS_PATH)</span>
<span class="c"># - and write the logs to a log group ($LOG_GROUP)</span>
<span class="c"># - the command will be executed in a specific AWS region ($AWS_REGION)</span>
<span class="c"># all these environment variables need to be present for the script to run</span>
<span class="c"># the output of the command can be inspected using aws cli as follows:</span>
<span class="c"># $ aws logs tail $LOG_GROUP --follow</span>
<span class="c"># stop script on command error</span>
<span class="nb">set</span> <span class="nt">-e</span>
<span class="c"># do we have everything that we need?</span>
<span class="o">[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="k">${</span><span class="nv">INSTANCE_ID</span><span class="k">}</span><span class="s2">"</span> <span class="o">]</span> <span class="o">&&</span> <span class="o">{</span> <span class="nb">echo</span> <span class="s2">"INSTANCE_ID is missing"</span><span class="p">;</span> <span class="nb">exit </span>1<span class="p">;</span> <span class="o">}</span>
<span class="o">[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="k">${</span><span class="nv">ANSIBLE_PLAYBOOKS_PATH</span><span class="k">}</span><span class="s2">"</span> <span class="o">]</span> <span class="o">&&</span> <span class="o">{</span> <span class="nb">echo</span> <span class="s2">"ANSIBLE_PLAYBOOKS_PATH is missing"</span><span class="p">;</span> <span class="nb">exit </span>1<span class="p">;</span> <span class="o">}</span>
<span class="o">[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="k">${</span><span class="nv">PLAYBOOK_FILE</span><span class="k">}</span><span class="s2">"</span> <span class="o">]</span> <span class="o">&&</span> <span class="o">{</span> <span class="nb">echo</span> <span class="s2">"PLAYBOOK_FILE is missing"</span><span class="p">;</span> <span class="nb">exit </span>1<span class="p">;</span> <span class="o">}</span>
<span class="o">[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="k">${</span><span class="nv">LOG_GROUP</span><span class="k">}</span><span class="s2">"</span> <span class="o">]</span> <span class="o">&&</span> <span class="o">{</span> <span class="nb">echo</span> <span class="s2">"LOG_GROUP is missing"</span><span class="p">;</span> <span class="nb">exit </span>1<span class="p">;</span> <span class="o">}</span>
<span class="o">[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="k">${</span><span class="nv">AWS_REGION</span><span class="k">}</span><span class="s2">"</span> <span class="o">]</span> <span class="o">&&</span> <span class="o">{</span> <span class="nb">echo</span> <span class="s2">"AWS_REGION is missing"</span><span class="p">;</span> <span class="nb">exit </span>1<span class="p">;</span> <span class="o">}</span>
<span class="c"># run the command</span>
<span class="c"># we use interpolation within single quotes: https://unix.stackexchange.com/a/447974</span>
aws ssm send-command <span class="nt">--document-name</span> <span class="s2">"AWS-ApplyAnsiblePlaybooks"</span> <span class="nt">--document-version</span> <span class="s2">"1"</span> <span class="se">\</span>
<span class="nt">--targets</span> <span class="s1">'[{"Key":"InstanceIds","Values":["'</span><span class="s2">"</span><span class="k">${</span><span class="nv">INSTANCE_ID</span><span class="k">}</span><span class="s2">"</span><span class="s1">'"]}]'</span> <span class="se">\</span>
<span class="nt">--parameters</span> <span class="s1">'{"SourceType":["S3"],"SourceInfo":["{\"path\": \"'</span><span class="s2">"</span><span class="k">${</span><span class="nv">ANSIBLE_PLAYBOOKS_PATH</span><span class="k">}</span><span class="s2">"</span><span class="s1">'\"}"],"InstallDependencies":["True"],"PlaybookFile":["'</span><span class="s2">"</span><span class="k">${</span><span class="nv">PLAYBOOK_FILE</span><span class="k">}</span><span class="s2">"</span><span class="s1">'"],"ExtraVariables":["SSM=True"],"Check":["False"],"TimeoutSeconds":["3600"]}'</span> <span class="se">\</span>
<span class="nt">--timeout-seconds</span> 600 <span class="nt">--max-concurrency</span> <span class="s2">"50"</span> <span class="nt">--max-errors</span> <span class="s2">"0"</span> <span class="se">\</span>
<span class="nt">--cloud-watch-output-config</span> <span class="s1">'{"CloudWatchOutputEnabled":true,"CloudWatchLogGroupName":"'</span><span class="s2">"</span><span class="k">${</span><span class="nv">LOG_GROUP</span><span class="k">}</span><span class="s2">"</span><span class="s1">'"}'</span> <span class="se">\</span>
<span class="nt">--region</span> <span class="s2">"</span><span class="k">${</span><span class="nv">AWS_REGION</span><span class="k">}</span><span class="s2">"</span>
<span class="nb">echo</span> <span class="s2">"Command was sent. Monitor using:"</span>
<span class="nb">echo</span> <span class="s2">"aws logs tail </span><span class="k">${</span><span class="nv">LOG_GROUP</span><span class="k">}</span><span class="s2"> --follow"</span>
</code></pre></div></div>
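<p>Besides tailing the log group, the status of a dispatched command can also be queried through the API. The following is a minimal sketch, assuming that the command id has been captured from the response of <code>aws ssm send-command</code> (it is returned under <code>Command.CommandId</code>, so adding <code>--query "Command.CommandId" --output text</code> to the call would print just the id). The function name <code>ssm_command_status</code> is hypothetical and not part of the repository.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the status of an SSM command on one instance.
# Usage: ssm_command_status COMMAND_ID INSTANCE_ID AWS_REGION
ssm_command_status() {
    aws ssm list-command-invocations \
        --command-id "$1" \
        --instance-id "$2" \
        --region "$3" \
        --query "CommandInvocations[0].Status" \
        --output text
}
```

<p>The printed value is the invocation status reported by SSM (e.g. <code>InProgress</code>, <code>Success</code> or <code>Failed</code>).</p>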
<p>The dispatched SSM command will be received by the specified EC2
instance (which must be registered as an SSM node) and will be
executed locally in the server instance. Among other actions, the
playbook will download the release binary from the corresponding s3
bucket and install it locally on the EC2 instance as a systemd
service.</p>
<p>In general, the workflow is split in two parts. The first part runs on
our local machine / dev laptop (or alternatively in some CI process)
with the objective of uploading the necessary artifacts to S3 and
sending the SSM command. The second part runs on the EC2 server and
consists solely of the ansible playbook execution.</p>
<p>More specifically, the workflow steps are:</p>
<ol>
<li>[laptop] Deploy the ansible playbooks to s3</li>
<li>[laptop] Build the application and upload the binary to s3</li>
<li>[laptop] Send the command to SSM</li>
<li>SSM sends the command to EC2</li>
<li>[ec2] Download and execute (locally) the ansible playbook from s3</li>
<li>[ec2] Download the binary from s3 (part of ansible playbook)</li>
</ol>
<p><img src="/assets/ssm_ansible_workflow.png" alt="Workflow diagram" /></p>
<p>From the above, it is clear that there’s some amount of preparatory
work to be done before that flow can be executed. Before we send the
SSM command, we will need to ensure that:</p>
<ul>
<li>the EC2 instance is set up as an SSM managed node</li>
<li>the ansible playbooks are zipped and uploaded to a specific S3 bucket (to which the EC2 instance has access)</li>
<li>the application binary is uploaded to a specific S3 bucket (to which the EC2 instance has access)</li>
</ul>
<p>Let’s now do that work.</p>
<h3 id="creating-the-aws-resources">Creating the AWS resources</h3>
<p>For the purpose of this post, we will assume that the EC2 instance
already exists, has the appropriate security group and has a public
elastic IP.</p>
<p>The existing instance must also have the <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html">SSM agent installed and
running</a>
(empirically speaking, most EC2 linux images have the agent
pre-installed and enabled).</p>
<p>We will create an AWS stack with the following resources:</p>
<ul>
<li>an S3 bucket <code class="language-plaintext highlighter-rouge">ssm-demo-release-artifacts</code> that will host:
<ul>
<li>the zipped ansible folder with the playbooks and roles</li>
<li>the application binary to be deployed on our server</li>
</ul>
</li>
<li>an AWS instance role that:
<ul>
<li>allows the instance to serve as an SSM managed node</li>
<li>allows access to CloudWatch Logs (ansible logs will be sent there)</li>
<li>allows access to the S3 bucket with the release artifacts</li>
</ul>
</li>
<li>an AWS Cloudwatch Log Group to collect the SSM logs (ansible output)</li>
</ul>
<p>We will express the above in a cloudformation template (<code class="language-plaintext highlighter-rouge">cloudformation/demo.yml</code>):</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">AWSTemplateFormatVersion</span><span class="pi">:</span> <span class="s1">'</span><span class="s">2010-09-09'</span>
<span class="na">Description</span><span class="pi">:</span> <span class="pi">>-</span>
<span class="s">Provision the necessary resources for enabling ansible deployment over SSM</span>
<span class="na">Parameters</span><span class="pi">:</span>
<span class="na">ReleaseArtifactsBucketName</span><span class="pi">:</span>
<span class="na">Type</span><span class="pi">:</span> <span class="s">String</span>
<span class="na">Description</span><span class="pi">:</span> <span class="s">The name of the release artifacts s3 bucket</span>
<span class="na">LogGroupName</span><span class="pi">:</span>
<span class="na">Type</span><span class="pi">:</span> <span class="s">String</span>
<span class="na">Description</span><span class="pi">:</span> <span class="s">The name of the log group for the ansible execution logs</span>
<span class="na">Resources</span><span class="pi">:</span>
<span class="c1"># ======================================</span>
<span class="c1"># === Ansible / Deployment Resources ===</span>
<span class="c1"># ======================================</span>
<span class="c1"># SSM will write the command logs to this log group</span>
<span class="na">AnsibleLogGroup</span><span class="pi">:</span>
<span class="na">Type</span><span class="pi">:</span> <span class="s">AWS::Logs::LogGroup</span>
<span class="na">Properties</span><span class="pi">:</span>
<span class="na">LogGroupName</span><span class="pi">:</span> <span class="kt">!Ref</span> <span class="s">LogGroupName</span>
<span class="na">RetentionInDays</span><span class="pi">:</span> <span class="m">30</span>
<span class="c1"># S3 Bucket for release artifacts</span>
<span class="na">ReleaseArtifactsBucket</span><span class="pi">:</span>
<span class="na">Type</span><span class="pi">:</span> <span class="s">AWS::S3::Bucket</span>
<span class="na">Properties</span><span class="pi">:</span>
<span class="na">BucketName</span><span class="pi">:</span> <span class="kt">!Ref</span> <span class="s">ReleaseArtifactsBucketName</span>
<span class="na">PublicAccessBlockConfiguration</span><span class="pi">:</span>
<span class="na">BlockPublicPolicy</span><span class="pi">:</span> <span class="no">true</span>
<span class="na">BlockPublicAcls</span><span class="pi">:</span> <span class="no">true</span>
<span class="na">IgnorePublicAcls</span><span class="pi">:</span> <span class="no">true</span>
<span class="na">RestrictPublicBuckets</span><span class="pi">:</span> <span class="no">true</span>
<span class="c1"># policy for accessing the release artifacts</span>
<span class="na">ReleaseArtifactsBucketPolicy</span><span class="pi">:</span>
<span class="na">Type</span><span class="pi">:</span> <span class="s">AWS::IAM::Policy</span>
<span class="na">Properties</span><span class="pi">:</span>
<span class="na">PolicyName</span><span class="pi">:</span> <span class="s2">"</span><span class="s">ssm-demo-release-artifacts-access"</span>
<span class="na">PolicyDocument</span><span class="pi">:</span>
<span class="na">Version</span><span class="pi">:</span> <span class="s2">"</span><span class="s">2012-10-17"</span>
<span class="na">Statement</span><span class="pi">:</span>
<span class="pi">-</span>
<span class="na">Effect</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Allow"</span>
<span class="na">Action</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">s3:ListBucket</span>
<span class="pi">-</span> <span class="s">s3:GetObject</span>
<span class="na">Resource</span><span class="pi">:</span>
<span class="pi">-</span> <span class="kt">!Sub</span> <span class="s2">"</span><span class="s">arn:aws:s3:::${ReleaseArtifactsBucketName}"</span>
<span class="pi">-</span> <span class="kt">!Sub</span> <span class="s2">"</span><span class="s">arn:aws:s3:::${ReleaseArtifactsBucketName}/*"</span>
<span class="na">Roles</span><span class="pi">:</span>
<span class="pi">-</span> <span class="kt">!Ref</span> <span class="s">ServerRole</span>
<span class="c1"># Server Role</span>
<span class="na">ServerRole</span><span class="pi">:</span>
<span class="na">Type</span><span class="pi">:</span> <span class="s">AWS::IAM::Role</span>
<span class="na">Properties</span><span class="pi">:</span>
<span class="na">RoleName</span><span class="pi">:</span> <span class="s2">"</span><span class="s">ssm-demo-server-role"</span>
<span class="na">ManagedPolicyArns</span><span class="pi">:</span>
<span class="c1"># enable the instance to serve as an SSM managed node</span>
<span class="pi">-</span> <span class="s">arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore</span>
<span class="c1"># provide access to Cloudwatch logs (for ansible deployment over SSM)</span>
<span class="pi">-</span> <span class="s">arn:aws:iam::aws:policy/CloudWatchLogsFullAccess</span>
<span class="na">AssumeRolePolicyDocument</span><span class="pi">:</span>
<span class="na">Version</span><span class="pi">:</span> <span class="s2">"</span><span class="s">2012-10-17"</span>
<span class="na">Statement</span><span class="pi">:</span>
<span class="pi">-</span>
<span class="na">Effect</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Allow"</span>
<span class="na">Principal</span><span class="pi">:</span>
<span class="na">Service</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s2">"</span><span class="s">ec2.amazonaws.com"</span>
<span class="na">Action</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s2">"</span><span class="s">sts:AssumeRole"</span>
<span class="c1"># Instance Profile -- this must be attached to the EC2 instance</span>
<span class="na">ServerProfile</span><span class="pi">:</span>
<span class="na">Type</span><span class="pi">:</span> <span class="s">AWS::IAM::InstanceProfile</span>
<span class="na">Properties</span><span class="pi">:</span>
<span class="na">InstanceProfileName</span><span class="pi">:</span> <span class="s2">"</span><span class="s">ssm-demo-server-instance-profile"</span>
<span class="na">Roles</span><span class="pi">:</span>
<span class="pi">-</span> <span class="kt">!Ref</span> <span class="s">ServerRole</span>
</code></pre></div></div>
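<p>Before creating the stack, the template can be checked for syntax errors with the <code>validate-template</code> command of the AWS CLI. This is a small sketch that assumes valid AWS credentials in the shell; the function name <code>validate_template</code> is illustrative.</p>

```shell
#!/bin/sh
# Hypothetical helper: ask CloudFormation to validate a local template file.
# Usage: validate_template TEMPLATE_FILE AWS_REGION
validate_template() {
    aws cloudformation validate-template \
        --template-body "file://$1" \
        --region "$2"
}
```

<p>For instance: <code>validate_template cloudformation/demo.yml eu-central-1</code>.</p>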
<p>Once the above cloudformation stack is deployed, we will need to
attach the instance profile that was created
(<code class="language-plaintext highlighter-rouge">ssm-demo-server-instance-profile</code>) to the existing EC2 instance
<a href="https://repost.aws/knowledge-center/attach-replace-ec2-instance-profile">either through the web console or using the CLI
tool</a>.</p>
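<p>With the CLI, the association could look like the sketch below. It assumes that the instance has no instance profile attached yet (an existing association would have to be replaced instead) and that the caller is allowed to pass the role; the function name <code>attach_instance_profile</code> is hypothetical.</p>

```shell
#!/bin/sh
# Hypothetical helper: associate an instance profile with an EC2 instance.
# Usage: attach_instance_profile INSTANCE_ID PROFILE_NAME AWS_REGION
attach_instance_profile() {
    aws ec2 associate-iam-instance-profile \
        --instance-id "$1" \
        --iam-instance-profile "Name=$2" \
        --region "$3"
}
```

<p>For instance: <code>attach_instance_profile i-REPLACE_ME ssm-demo-server-instance-profile eu-central-1</code>.</p>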
<p>Our EC2 instance should now (hopefully) be visible under Systems Manager’s
Fleet Manager (AWS Web Console).</p>
<h3 id="deploying-the-application">Deploying the application</h3>
<p>Having set up the necessary AWS resources, we will now focus on the
ansible playbook that will be executed on the remote host (EC2). We
will follow the pattern established in the <a href="/2020/03/25/deploying-with-ansible-systemd/">previous post</a> which installs the
application as a <code class="language-plaintext highlighter-rouge">systemd</code> service. The difference in the approach is
that we no longer send the binary over ssh but, rather, we first copy
the binary to the s3 bucket (<code class="language-plaintext highlighter-rouge">ssm-demo-release-artifacts</code>) and then
download the binary from within the ec2 server using ansible.</p>
<p>The relevant ansible task makes use of the aws cli tool (which must
exist on the server) and looks like so:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Download artifact to server</span>
<span class="s">ansible.builtin.shell</span><span class="pi">:</span> <span class="pi">|</span>
<span class="s">aws s3 cp /usr/local/bin/demo</span>
<span class="s">chown demo:demo /usr/local/bin/demo</span>
<span class="s">chmod u+x /usr/local/bin/demo</span>
<span class="na">notify</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">Restart demo service</span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">demo</code> user and group are created in the rest of the ansible
playbook which can be found in its entirety in this
<a href="https://github.com/kkentzo/deployment-ansible-ssm-systemd-demo">repository</a>.</p>
<p>The port to which our demo service binds is configurable in
<code class="language-plaintext highlighter-rouge">ansible/demo.yml</code> (ansible variable <code class="language-plaintext highlighter-rouge">demo_app_port</code>).</p>
<h3 id="bringing-it-all-together">Bringing it all together</h3>
<p>Having discussed all the pieces of the solution, we will now automate
the relevant workflows using <code class="language-plaintext highlighter-rouge">ork</code> and express the flow in the form of
<code class="language-plaintext highlighter-rouge">Orkfile</code> tasks (more details <a href="https://github.com/kkentzo/ork">here</a>).</p>
<p>As we saw previously, there are two main workflows:</p>
<ul>
<li>the management (creation/update) of the relevant AWS resources</li>
<li>the deployment / release of the demo application</li>
</ul>
<p>We will express the management of the AWS resources by defining 3
ork tasks: one for creating the CF stack for the first time
(<code class="language-plaintext highlighter-rouge">cloudformation.create</code>), one for describing its status
(<code class="language-plaintext highlighter-rouge">cloudformation.describe</code>) and one for applying updates
(<code class="language-plaintext highlighter-rouge">cloudformation.update</code>). These tasks will make use of the
corresponding <code class="language-plaintext highlighter-rouge">aws-cli</code> functionality:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">global</span><span class="pi">:</span>
<span class="na">env</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">AWS_REGION</span><span class="pi">:</span> <span class="s">eu-central-1</span>
<span class="na">RELEASE_ARTIFACTS_BUCKET</span><span class="pi">:</span> <span class="s">ssm-demo-release-artifacts</span>
<span class="na">ANSIBLE_LOG_GROUP</span><span class="pi">:</span> <span class="s2">"</span><span class="s">/ssm/ansible/demo"</span>
<span class="na">tasks</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">cloudformation</span>
<span class="na">env</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">STACK_TEMPLATE</span><span class="pi">:</span> <span class="s">cloudformation/demo.yml</span>
<span class="na">STACK_NAME</span><span class="pi">:</span> <span class="s">demo-ansible-ssm</span>
<span class="na">tasks</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">create</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">create the cloudformation stack for the first time</span>
<span class="na">actions</span><span class="pi">:</span>
<span class="pi">-</span> <span class="pi">>-</span>
<span class="s">aws cloudformation create-stack</span>
<span class="s">--region ${AWS_REGION}</span>
<span class="s">--stack-name ${STACK_NAME}</span>
<span class="s">--template-body "file://${STACK_TEMPLATE}"</span>
<span class="s">--capabilities CAPABILITY_NAMED_IAM</span>
<span class="s">--parameters</span>
<span class="s">ParameterKey=ReleaseArtifactsBucketName,ParameterValue=${RELEASE_ARTIFACTS_BUCKET}</span>
<span class="s">ParameterKey=LogGroupName,ParameterValue=${ANSIBLE_LOG_GROUP}</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">describe</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">show the current status of the cloudformation stack</span>
<span class="na">actions</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">aws cloudformation describe-stacks --stack-name ${STACK_NAME}</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">update</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">apply changes to the cloudformation stack</span>
<span class="na">actions</span><span class="pi">:</span>
<span class="pi">-</span> <span class="pi">>-</span>
<span class="s">aws cloudformation update-stack</span>
<span class="s">--region ${AWS_REGION}</span>
<span class="s">--stack-name ${STACK_NAME}</span>
<span class="s">--template-body "file://${STACK_TEMPLATE}"</span>
<span class="s">--capabilities CAPABILITY_NAMED_IAM</span>
<span class="s">--parameters</span>
<span class="s">ParameterKey=ReleaseArtifactsBucketName,ParameterValue=${RELEASE_ARTIFACTS_BUCKET}</span>
<span class="s">ParameterKey=LogGroupName,ParameterValue=${ANSIBLE_LOG_GROUP}</span>
</code></pre></div></div>
<p>We can create the stack by running <code class="language-plaintext highlighter-rouge">ork cloudformation.create</code>; the
necessary AWS credentials must be properly set up in the shell before
we execute this command.</p>
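<p>Stack creation is asynchronous, so it can be handy to block until it finishes. A minimal sketch using the <code>wait</code> subcommand of the AWS CLI, assuming the same credentials as above; the function name <code>wait_for_stack</code> is illustrative.</p>

```shell
#!/bin/sh
# Hypothetical helper: wait for stack creation, then print the final status.
# Usage: wait_for_stack STACK_NAME AWS_REGION
wait_for_stack() {
    # returns non-zero if the stack fails to reach CREATE_COMPLETE
    aws cloudformation wait stack-create-complete \
        --stack-name "$1" --region "$2" || return 1
    aws cloudformation describe-stacks \
        --stack-name "$1" --region "$2" \
        --query "Stacks[0].StackStatus" --output text
}
```

<p>For instance: <code>wait_for_stack demo-ansible-ssm eu-central-1</code>.</p>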
<p>Having created these resources, we must now associate our
(pre-existing) EC2 instance with the instance profile
(<code class="language-plaintext highlighter-rouge">ssm-demo-server-instance-profile</code>) by modifying the instance’s IAM
role (e.g. from the web console). We should also verify that our
instance is indeed visible under AWS Systems Manager’s Managed Node
Fleet (it may take a while for the instance to appear under the
fleet).</p>
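<p>The same check can be scripted instead of refreshing the web console. The sketch below polls SSM until the instance reports an <code>Online</code> ping status; the function name <code>wait_for_managed_node</code> and the default number of attempts and delay are assumptions made for illustration.</p>

```shell
#!/bin/sh
# Hypothetical helper: poll SSM until the instance shows up as an online managed node.
# Usage: wait_for_managed_node INSTANCE_ID AWS_REGION [ATTEMPTS] [DELAY_SECONDS]
wait_for_managed_node() {
    instance_id="$1"
    region="$2"
    attempts="${3:-30}"
    delay="${4:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # prints "None" while the instance is not registered yet
        status="$(aws ssm describe-instance-information \
            --filters "Key=InstanceIds,Values=${instance_id}" \
            --region "${region}" \
            --query "InstanceInformationList[0].PingStatus" \
            --output text 2>/dev/null)"
        if [ "${status}" = "Online" ]; then
            echo "instance ${instance_id} is online"
            return 0
        fi
        i=$((i + 1))
        sleep "${delay}"
    done
    echo "instance ${instance_id} did not register in time" >&2
    return 1
}
```

<p>For instance: <code>wait_for_managed_node i-REPLACE_ME eu-central-1</code>.</p>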
<p>We are now ready to deploy and release our application by:</p>
<ul>
<li>building the application (ork task: <code class="language-plaintext highlighter-rouge">build</code>)</li>
<li>uploading our ansible playbook to s3 (ork task: <code class="language-plaintext highlighter-rouge">ansible.deploy</code>)</li>
<li>uploading the binary to S3 and triggering the SSM send command (ork task: <code class="language-plaintext highlighter-rouge">release</code>)</li>
</ul>
<p>Here are the definitions of those tasks in the <code class="language-plaintext highlighter-rouge">Orkfile</code> (we need to
replace the <code class="language-plaintext highlighter-rouge">INSTANCE_ID</code> variable in the <code class="language-plaintext highlighter-rouge">Orkfile</code> with the actual ID
of our EC2 instance):</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">global</span><span class="pi">:</span>
<span class="na">env</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">AWS_REGION</span><span class="pi">:</span> <span class="s">eu-central-1</span>
<span class="na">RELEASE_ARTIFACTS_BUCKET</span><span class="pi">:</span> <span class="s">ssm-demo-release-artifacts</span>
<span class="na">ANSIBLE_LOG_GROUP</span><span class="pi">:</span> <span class="s2">"</span><span class="s">/ssm/ansible/demo"</span>
<span class="na">tasks</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">build</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">build the demo binary</span>
<span class="na">env</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">GOOS</span><span class="pi">:</span> <span class="s">linux</span>
<span class="na">GOARCH</span><span class="pi">:</span> <span class="s">amd64</span>
<span class="na">actions</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">go build -o bin/demo app/demo.go</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ansible.deploy</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">deploy the ansible playbooks to AWS S3</span>
<span class="na">actions</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">zip -r -FS ansible.zip ansible</span>
<span class="pi">-</span> <span class="s">aws s3 cp ansible.zip s3://${RELEASE_ARTIFACTS_BUCKET}/ansible.zip</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">release</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">release the demo application over SSM</span>
<span class="na">env</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">INSTANCE_ID</span><span class="pi">:</span> <span class="s">i-REPLACE_ME_WITH_ACTUAL_INSTANCE_ID</span>
<span class="na">ANSIBLE_PLAYBOOKS_PATH</span><span class="pi">:</span> <span class="s">https://${RELEASE_ARTIFACTS_BUCKET}.s3.${AWS_REGION}.amazonaws.com/ansible.zip</span>
<span class="na">PLAYBOOK_FILE</span><span class="pi">:</span> <span class="s">ansible/demo.yml</span>
<span class="na">depends_on</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">build</span>
<span class="pi">-</span> <span class="s">ansible.deploy</span>
<span class="na">actions</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">aws s3 cp bin/demo s3://${RELEASE_ARTIFACTS_BUCKET}/demo</span>
<span class="pi">-</span> <span class="s">./scripts/ssm_send_command.sh</span>
</code></pre></div></div>
<p>Tasks <code class="language-plaintext highlighter-rouge">build</code> and <code class="language-plaintext highlighter-rouge">ansible.deploy</code> are declared as dependencies of
task <code class="language-plaintext highlighter-rouge">release</code> (see the <code class="language-plaintext highlighter-rouge">depends_on</code> attribute), so it suffices to run <code class="language-plaintext highlighter-rouge">ork
release</code> in order to release the application.</p>
<p>It is worth repeating that the ansible playbook will be executed on
the server, so, once the SSM command is sent to AWS, its progress can
be inspected by tailing the corresponding CloudWatch Log Group like
so:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>aws logs <span class="nb">tail</span> /ssm/ansible/demo <span class="nt">--follow</span>
</code></pre></div></div>
<p>Once the playbook is finished, we should be able to perform an HTTP
request to the service, provided that the instance’s public IP,
security group etc. allow it (these are out of scope in this guide).</p>
<h2 id="summary">Summary</h2>
<p>We have seen how to deploy an application to an AWS EC2 instance
without going over ssh, by using AWS Systems Manager instead. This
approach brings several security-related advantages (IAM authorization,
command auditing etc.). AWS SSM has a lot more features than the Run
Command that we used in this guide and it is worth looking over <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/what-is-systems-manager.html">the
documentation</a>.</p>
<p>This flow can be applied to any application (i.e. not just golang)
that can be packaged in an archive and transferred to EC2 via S3 with
the necessary adjustments in the ansible playbook that is responsible
for the deployment and release of the application on the EC2 instance.</p>
<p>The source code files that were used in this guide (<code class="language-plaintext highlighter-rouge">Orkfile</code>,
cloudformation template, <code class="language-plaintext highlighter-rouge">ssm_send_command</code> script and the ansible
playbook) can be found in <a href="https://github.com/kkentzo/deployment-ansible-ssm-systemd-demo">this
repository</a>.</p>
<p>Hope you enjoyed this!</p>
The golang for-loop gotcha (2021-01-21) https://kkentzo.github.io/2021/01/21/golang-loop-variable-gotcha
<p>This bit me <strong>again</strong> the other day so what better way to exorcize the
resulting bug in production code than to write a small article about
it?</p>
<p>So, let’s raise hands: who among us <strong>has not</strong> written code that
looks somewhat like this trivial example:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span>
<span class="k">import</span> <span class="p">(</span>
<span class="s">"fmt"</span>
<span class="s">"time"</span>
<span class="p">)</span>
<span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">i</span> <span class="o">:=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="m">5</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span> <span class="p">{</span>
<span class="k">go</span> <span class="k">func</span><span class="p">()</span> <span class="p">{</span>
<span class="n">fmt</span><span class="o">.</span><span class="n">Print</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
<span class="p">}()</span>
<span class="p">}</span>
<span class="c">// let's wait a little bit for the goroutines to execute before we exit</span>
<span class="n">time</span><span class="o">.</span><span class="n">Sleep</span><span class="p">(</span><span class="m">100</span> <span class="o">*</span> <span class="n">time</span><span class="o">.</span><span class="n">Millisecond</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>
<p><em>(very few hands raised in the audience)</em> :grin:</p>
<p>Now, one reasonable assumption would be that this program will print
the numbers from 0 to 4 (in some order).</p>
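<p>This assumption would be wrong though; the program consistently prints
the string <code class="language-plaintext highlighter-rouge">55555</code> (if the result is different on your computer, which
it shouldn’t be, then try increasing the sleep duration at the end). Note
that this applies to Go versions before 1.22; since Go 1.22 (for modules
that declare that language version), each iteration of a <code class="language-plaintext highlighter-rouge">for</code> loop
gets its own copy of the loop variable.</p>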
<p>Why is that? The reason is that all goroutines close over the <strong>same</strong>
variable (<code class="language-plaintext highlighter-rouge">i</code> in this case) so, when they are executed, the variable
will have assumed its last value after the completion of the iteration
loop (which is <code class="language-plaintext highlighter-rouge">5</code>).</p>
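<p>The same capture behaviour can be reproduced deterministically, without goroutines or sleeps. The following is a minimal sketch (the helper name <code class="language-plaintext highlighter-rouge">sharedCapture</code> is made up for illustration): the closures are collected in a slice and only called after the loop has finished. The variable <code class="language-plaintext highlighter-rouge">i</code> is deliberately declared outside the <code class="language-plaintext highlighter-rouge">for</code> statement, so that there is a single shared variable regardless of the Go version in use.</p>

```go
package main

import "fmt"

// sharedCapture builds five closures that all capture the same variable i
// and calls them only after the loop has finished.
func sharedCapture() string {
	var i int // declared once, outside the for statement, so it is shared
	var readers []func() int
	for i = 0; i < 5; i++ {
		readers = append(readers, func() int { return i })
	}
	out := ""
	for _, read := range readers {
		out += fmt.Sprint(read()) // every closure sees the final value of i
	}
	return out
}

func main() {
	fmt.Println(sharedCapture()) // prints 55555
}
```

<p>Since there is no concurrency here, this version shows that the surprising output comes from the closures sharing one variable, not from goroutine scheduling.</p>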
<p>We can observe the difference in behaviour when we introduce a small
delay at the end of each iteration like so:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">...</span>
<span class="c">// data race</span>
<span class="k">for</span> <span class="n">i</span> <span class="o">:=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="m">5</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span> <span class="p">{</span>
<span class="k">go</span> <span class="k">func</span><span class="p">()</span> <span class="p">{</span>
<span class="n">fmt</span><span class="o">.</span><span class="n">Print</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
<span class="p">}()</span>
<span class="n">time</span><span class="o">.</span><span class="n">Sleep</span><span class="p">(</span><span class="m">50</span> <span class="o">*</span> <span class="n">time</span><span class="o">.</span><span class="n">Millisecond</span><span class="p">)</span>
<span class="p">}</span>
<span class="o">...</span>
</code></pre></div></div>
<p>The output in this case would be <code class="language-plaintext highlighter-rouge">01234</code> because each goroutine would
most probably be executed before the value of the shared loop variable
<code class="language-plaintext highlighter-rouge">i</code> was incremented by one.</p>
<p>In essence, this data race (which can lead to wicked and hard-to-find
bugs) is caused by the unsafe <strong>sharing of mutable state</strong> (in this
case <code class="language-plaintext highlighter-rouge">i</code>) among the competing concurrent goroutines and is well
described on the interwebs (this
<a href="https://eli.thegreenplace.net/2019/go-internals-capturing-loop-variables-in-closures/">article</a>
for example has an excellent explanation). The official Golang wiki
even lists this situation as <a href="https://github.com/golang/go/wiki/CommonMistakes">one of the two most common golang
mistakes</a> that one
can make (hint: the other mistake also has to do with <code class="language-plaintext highlighter-rouge">for</code> loop
variables).</p>
<p>So, what is the correct way of implementing the use case of
asynchronously processing a series of values in a <code class="language-plaintext highlighter-rouge">for</code> loop using
goroutines? The answer is simply to stop sharing the mutable state,
i.e. copy the state variable (in this case <code class="language-plaintext highlighter-rouge">i</code>) within each iteration
and pass it to each goroutine like so:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">...</span>
<span class="c">// no data race</span>
<span class="k">for</span> <span class="n">i</span> <span class="o">:=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="m">5</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span> <span class="p">{</span>
<span class="k">go</span> <span class="k">func</span><span class="p">(</span><span class="n">n</span> <span class="kt">int</span><span class="p">)</span> <span class="p">{</span>
<span class="n">fmt</span><span class="o">.</span><span class="n">Print</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="p">}(</span><span class="n">i</span><span class="p">)</span>
<span class="p">}</span>
<span class="o">...</span>
</code></pre></div></div>
<p>In this implementation, we are passing <code class="language-plaintext highlighter-rouge">i</code> <strong>by value</strong> as an argument
(<code class="language-plaintext highlighter-rouge">n</code>) into the goroutine; <code class="language-plaintext highlighter-rouge">n</code> is now a goroutine-local copy of <code class="language-plaintext highlighter-rouge">i</code> and
further changes to <code class="language-plaintext highlighter-rouge">i</code> are not reflected on <code class="language-plaintext highlighter-rouge">n</code>.</p>
<p>Alternatively, we could close over an iteration-local <strong>copy</strong> of <code class="language-plaintext highlighter-rouge">i</code>
as follows:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">...</span>
<span class="c">// no data race</span>
<span class="k">for</span> <span class="n">i</span> <span class="o">:=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="m">5</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span> <span class="p">{</span>
<span class="n">n</span> <span class="o">:=</span> <span class="n">i</span>
<span class="k">go</span> <span class="k">func</span><span class="p">()</span> <span class="p">{</span>
<span class="n">fmt</span><span class="o">.</span><span class="n">Print</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="p">}()</span>
<span class="p">}</span>
<span class="o">...</span>
</code></pre></div></div>
<p>It is perhaps worth pointing out that Golang’s <code class="language-plaintext highlighter-rouge">for</code> loop variable
gotcha is not some idiosyncratic language runtime misbehaviour that is
specific to the <code class="language-plaintext highlighter-rouge">for</code> loop but, rather, a classic example of failing
to synchronize concurrent access to a shared resource either by
locking it or copying it (our preferred solution). Here’s an example
of the same problem without a <code class="language-plaintext highlighter-rouge">for</code> loop:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span>
<span class="k">import</span> <span class="p">(</span>
<span class="s">"fmt"</span>
<span class="s">"time"</span>
<span class="p">)</span>
<span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="n">n</span> <span class="o">:=</span> <span class="m">10</span>
<span class="c">// data race</span>
<span class="k">go</span> <span class="k">func</span><span class="p">()</span> <span class="p">{</span>
<span class="n">time</span><span class="o">.</span><span class="n">Sleep</span><span class="p">(</span><span class="m">100</span> <span class="o">*</span> <span class="n">time</span><span class="o">.</span><span class="n">Millisecond</span><span class="p">)</span>
<span class="n">fmt</span><span class="o">.</span><span class="n">Print</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="p">}()</span>
<span class="n">n</span> <span class="o">=</span> <span class="m">11</span>
<span class="n">time</span><span class="o">.</span><span class="n">Sleep</span><span class="p">(</span><span class="m">200</span> <span class="o">*</span> <span class="n">time</span><span class="o">.</span><span class="n">Millisecond</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>
<p>This will print <code class="language-plaintext highlighter-rouge">11</code> instead of <code class="language-plaintext highlighter-rouge">10</code>.</p>
<p>So, when closing over an outer variable in something like a
goroutine, a good tip would be to always think about who can access the
outer variable and if/how that access needs to be synchronized. Passing a
copy of the state is always a safe choice!</p>
How to encrypt the NVS volume on the ESP32 (2020-04-18) https://kkentzo.github.io/2020/04/18/esp32-nvs-encryption
<p>There has been a lot of discussion around embedded device security during the last few years especially after <a href="https://blog.cloudflare.com/inside-mirai-the-infamous-iot-botnet-a-retrospective-analysis/">well-publicized DDoS incidents</a> involving armies of hijacked IoT devices. The demand for higher levels of security has put pressure on manufacturers and software providers to adopt and support modern security protocols in order to mitigate the relevant risks especially given the widening spread of IoT devices.</p>
<p>An IoT device may embed various security-related artifacts in order to enable the use of relevant protocols, for example certificates for communicating securely with remote servers (mqtt, http etc.). It is really easy for a bad actor with physical access to the device to snatch binary images from the device’s storage and search for such pieces of text. The possible leak of a security certificate may lead to escalated attacks into other parts of the infrastructure especially if the relevant permissions are incorrect or relaxed enough. In order to avoid that, we need to ensure that the device’s storage is encrypted and that no sensitive information embedded in the firmware image or elsewhere can be read even with physical access to the device.</p>
<p>In this article, we will demonstrate how to encrypt the non-volatile storage of the <a href="https://www.espressif.com/en/products/hardware/esp32/overview">ESP32</a>. The ESP32 is a very popular choice for building embedded solutions considering the SoC’s capabilities (dual core, wifi, bluetooth), its low price (under $5) and its <a href="https://docs.espressif.com/projects/esp-idf/en/latest/esp32/">comprehensive SDK framework</a> including a tcp/ip stack, http server/client (incl. TLS support), OTA firmware updates etc.</p>
<p>In general, encryption on the ESP32 is supported on the hardware level so as to prevent the recovery of (most) SPI flash contents using physical readouts. The ESP32 has two basic types of partitions: <code class="language-plaintext highlighter-rouge">app</code> partitions, which contain application-related artifacts such as the device firmware, and <code class="language-plaintext highlighter-rouge">data</code> partitions, which contain arbitrary user data. The encryption of <code class="language-plaintext highlighter-rouge">app</code>-type partitions is fairly straightforward and <a href="https://docs.espressif.com/projects/esp-idf/en/latest/esp32/security/flash-encryption.html">well-documented</a>. The process roughly consists of building the firmware with support for encryption, flashing the device and leaving the rest to the bootloader. Partitions marked as <code class="language-plaintext highlighter-rouge">data</code> however are not handled automatically by the bootloader and require a different process with respect to encryption.</p>
<p>The <a href="https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/storage/nvs_flash.html">non-volatile storage of the ESP32</a> is a <code class="language-plaintext highlighter-rouge">data</code>-type partition that uses a portion of the underlying flash over SPI and is typically used for storing key-value pairs. These can include, for example, unique device identification strings, wifi configuration data (incl. passwords), device-specific security certificates etc., in other words stuff that we would like to keep private from curious eyes. The ESP32 supports NVS encryption but, as mentioned before, the process is a little bit more involved.</p>
<p>NVS supports two flavours of encryption:</p>
<ul>
<li>runtime-encryption whereby the application itself generates the key and encrypts/decrypts data on the fly, and</li>
<li>build-time encryption whereby the nvs volume is pre-encrypted and flashed to the device</li>
</ul>
<p>In the runtime-encryption method, the application generates a key using the <a href="https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/storage/nvs_flash.html#_CPPv423nvs_flash_generate_keysPK15esp_partition_tP13nvs_sec_cfg_t">corresponding esp-idf function</a> and uses this key in order to encrypt/decrypt data in the nvs volume at runtime. These can be, for example, wifi or other passwords that may be known only at runtime and not beforehand.</p>
<p>In the build-time encryption method, the NVS partition containing all the necessary key-value pairs is <a href="https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/storage/nvs_partition_gen.html">prepared</a> and encrypted for downloading to the device. This method can be used in cases where the NVS data are known at compile time – examples of such data include unique device IDs, device-specific security certificates etc.</p>
<p>In both cases, a separate partition is necessary for storing the nvs encryption key. ESP-IDF provides a partition subtype for this purpose (type <code class="language-plaintext highlighter-rouge">data</code> and subtype <code class="language-plaintext highlighter-rouge">nvs_keys</code>) and handles its encryption transparently via the bootloader.</p>
<p>The use case that we’d like to demonstrate here is the baking of pre-existing data (unique device ID, security certificates) into the device at compile time. OK, so let’s go through this procedure step-by-step by means of an example.</p>
<p>First, we need a custom partition table which can be configured using <code class="language-plaintext highlighter-rouge">idf.py menuconfig</code> and selecting “Custom Partition Table (CSV)” under the “Partition Table” menu (<code class="language-plaintext highlighter-rouge">CONFIG_PARTITION_TABLE_CUSTOM=y</code> and <code class="language-plaintext highlighter-rouge">CONFIG_PARTITION_TABLE_FILENAME="partitions.csv"</code>). Our <code class="language-plaintext highlighter-rouge">partitions.csv</code> file looks like this:</p>
<pre><code class="language-csv"># ESP-IDF Partition Table
# Name, Type, SubType, Offset, Size, Flags
nvs, data, nvs, , 0x4000,
otadata, data, ota, , 0x2000,
phy_init, data, phy, , 0x1000,
factory, 0, 0, , 1M,
ota_0, 0, ota_0, , 1M,
ota_1, 0, ota_1, , 1M,
nvs_key, data, nvs_keys, , 0x1000, encrypted
</code></pre>
<p>This partition table supports 3 app partitions (ESP’s standard OTA scheme with 1 factory and 2 OTA partitions), one 16KB <code class="language-plaintext highlighter-rouge">nvs</code> partition and one 4KB <code class="language-plaintext highlighter-rouge">nvs_keys</code> partition. We have only specified partition sizes – the offsets are calculated automatically by the tools. The <code class="language-plaintext highlighter-rouge">encrypted</code> flag of the <code class="language-plaintext highlighter-rouge">nvs_key</code> partition instructs the bootloader to automatically encrypt the contents, since we’re going to be storing the nvs encryption keys there. (Of course it’d be great if we could do that for the <code class="language-plaintext highlighter-rouge">nvs</code> partition as well, but this feature is not supported unfortunately as the existence of this article demonstrates…)</p>
<p>Next, we are going to prepare our nvs image locally by specifying its contents using the CSV format as follows:</p>
<pre><code class="language-csv"># NVS csv file
key,type,encoding,value
device_id,data,string,a_unique_value
cert,file,string,./path/to/certificate.pem.crt
pvkey,file,string,./path/to/private.pem.key
</code></pre>
<p>The actual image (<code class="language-plaintext highlighter-rouge">nvs.bin</code>) can be generated from the csv file (<code class="language-plaintext highlighter-rouge">nvs.csv</code>) using the partition generation tool (given a specified <code class="language-plaintext highlighter-rouge">$IDF_PATH</code>):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ $IDF_PATH</span>/components/nvs_flash/nvs_partition_generator/nvs_partition_gen.py generate nvs.csv nvs.bin 0x4000 <span class="c"># not encrypted</span>
</code></pre></div></div>
<p>However, the resulting image <code class="language-plaintext highlighter-rouge">nvs.bin</code> will be unencrypted. If instead of the <code class="language-plaintext highlighter-rouge">generate</code> command we use the <code class="language-plaintext highlighter-rouge">encrypt</code> command, the resulting image will be encrypted and the tool will output a second image (<code class="language-plaintext highlighter-rouge">nvs_keys.bin</code>) with the contents of the encryption key:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ $IDF_PATH</span>/components/nvs_flash/nvs_partition_generator/nvs_partition_gen.py encrypt nvs.csv encrypted_nvs.bin 0x4000 <span class="nt">--keygen</span> <span class="nt">--keyfile</span> nvs_keys.bin
</code></pre></div></div>
<p>These two images can now be flashed to the device using <code class="language-plaintext highlighter-rouge">esptool.py</code> (part of the esp-idf distribution):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>esptool.py <span class="nt">-p</span> PORT <span class="nt">--before</span> default_reset <span class="nt">--after</span> no_reset write_flash 0xa000 encrypted_nvs.bin
</code></pre></div></div>
<p>where <code class="language-plaintext highlighter-rouge">PORT</code> is the serial comm device address (something like <code class="language-plaintext highlighter-rouge">/dev/cu.usbserial-0001</code>) and <code class="language-plaintext highlighter-rouge">encrypted_nvs.bin</code> is the image file that we generated in our previous step. The flash location address <code class="language-plaintext highlighter-rouge">0xa000</code> can be discovered either by inspecting the esp32 serial output (where the partitions are printed out) or by using the <code class="language-plaintext highlighter-rouge">gen_esp32part.py</code> utility on the project’s partition image (e.g. <code class="language-plaintext highlighter-rouge">gen_esp32part.py build/partition_table/partition-table.bin</code>).</p>
<p>The <code class="language-plaintext highlighter-rouge">nvs_keys</code> image also needs to be downloaded to the device:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>esptool.py <span class="nt">-p</span> PORT <span class="nt">--before</span> default_reset <span class="nt">--after</span> no_reset write_flash 0x320000 nvs_keys.bin
</code></pre></div></div>
<p>We’re almost done. On the application side, we now need to initialize the secure nvs volume like so:</p>
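<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include "esp_err.h"</span>
<span class="cp">#include "esp_log.h"</span>
<span class="cp">#include "esp_partition.h"</span>
<span class="cp">#include "nvs_flash.h"</span>
<span class="cp">#include "freertos/FreeRTOS.h"</span>
<span class="cp">#include "freertos/task.h"</span>
<span class="n">esp_err_t</span> <span class="nf">nvs_secure_initialize</span><span class="p">()</span> <span class="p">{</span>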
<span class="k">static</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">nvs_tag</span> <span class="o">=</span> <span class="s">"nvs"</span><span class="p">;</span>
<span class="n">esp_err_t</span> <span class="n">err</span> <span class="o">=</span> <span class="n">ESP_OK</span><span class="p">;</span>
<span class="c1">// 1. find partition with nvs_keys</span>
<span class="k">const</span> <span class="n">esp_partition_t</span> <span class="o">*</span><span class="n">partition</span> <span class="o">=</span> <span class="n">esp_partition_find_first</span><span class="p">(</span><span class="n">ESP_PARTITION_TYPE_DATA</span><span class="p">,</span>
<span class="n">ESP_PARTITION_SUBTYPE_DATA_NVS_KEYS</span><span class="p">,</span>
<span class="s">"nvs_key"</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">partition</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span> <span class="p">{</span>
<span class="n">ESP_LOGE</span><span class="p">(</span><span class="n">nvs_tag</span><span class="p">,</span> <span class="s">"Could not locate nvs_key partition. Aborting."</span><span class="p">);</span>
<span class="k">return</span> <span class="n">ESP_FAIL</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// 2. read nvs_keys from key partition</span>
<span class="n">nvs_sec_cfg_t</span> <span class="n">cfg</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">ESP_OK</span> <span class="o">!=</span> <span class="p">(</span><span class="n">err</span> <span class="o">=</span> <span class="n">nvs_flash_read_security_cfg</span><span class="p">(</span><span class="n">partition</span><span class="p">,</span> <span class="o">&</span><span class="n">cfg</span><span class="p">)))</span> <span class="p">{</span>
<span class="n">ESP_LOGE</span><span class="p">(</span><span class="n">nvs_tag</span><span class="p">,</span> <span class="s">"Failed to read nvs keys (rc=0x%x)"</span><span class="p">,</span> <span class="n">err</span><span class="p">);</span>
<span class="k">return</span> <span class="n">err</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// 3. initialize nvs partition</span>
<span class="k">if</span> <span class="p">(</span><span class="n">ESP_OK</span> <span class="o">!=</span> <span class="p">(</span><span class="n">err</span> <span class="o">=</span> <span class="n">nvs_flash_secure_init</span><span class="p">(</span><span class="o">&</span><span class="n">cfg</span><span class="p">)))</span> <span class="p">{</span>
<span class="n">ESP_LOGE</span><span class="p">(</span><span class="n">nvs_tag</span><span class="p">,</span> <span class="s">"failed to initialize nvs partition (err=0x%x). Aborting."</span><span class="p">,</span> <span class="n">err</span><span class="p">);</span>
<span class="k">return</span> <span class="n">err</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">err</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">app_main</span><span class="p">()</span> <span class="p">{</span>
<span class="n">esp_err_t</span> <span class="n">err</span> <span class="o">=</span> <span class="n">nvs_secure_initialize</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="n">err</span> <span class="o">!=</span> <span class="n">ESP_OK</span><span class="p">)</span> <span class="p">{</span>
<span class="n">ESP_LOGE</span><span class="p">(</span><span class="s">"main"</span><span class="p">,</span> <span class="s">"Failed to initialize nvs (rc=0x%x). Halting."</span><span class="p">,</span> <span class="n">err</span><span class="p">);</span>
<span class="k">while</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span> <span class="n">vTaskDelay</span><span class="p">(</span><span class="mi">100</span><span class="p">);</span> <span class="p">}</span>
<span class="p">}</span>
<span class="c1">// rest of application code goes here</span>
<span class="c1">// ...</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Once the app is built and flashed (using <code class="language-plaintext highlighter-rouge">idf.py encrypted-flash</code>), we’re good to go with our encrypted NVS volume and our IoT device can now be safely deployed to the field.</p>
Deploying a service using ansible and systemd (2020-03-25) https://kkentzo.github.io/2020/03/25/deploying-with-ansible-systemd
<p>You may be a sole developer or member of a small development team with no dedicated ops people. You will probably have a handful of small-ish services, perhaps a few cronjobs and a couple of VPSs to run them on. Or you may have one or more servers at home and would like to automate the deployment of custom or open source tools and services. What are your options?</p>
<p>At one end of the spectrum, there’s the current kubernetes zeitgeist as recommended™ by the internetz. However, it may be that you can’t pay the price (i.e. time) or simply do not have the desire to ride the steep learning curve that this path entails. At the other end of the spectrum, there’s always <code class="language-plaintext highlighter-rouge">rsync</code>/<code class="language-plaintext highlighter-rouge">scp</code> and bash scripts but you’d like something better than that (including process management, logs, infrastructure as code checked into a git repo etc.). So, is there anything worthwhile in between these two extremes?</p>
<p>This article is about how to deploy and run a service on a remote server using ansible and systemd. All the “configuration” that is necessary to do that will be checked into a git repo and will be easily reproducible on an arbitrary set of servers (including your localhost) without the need to log into the servers and do any manual work (apart from setting up passwordless ssh access - but <a href="https://www.digitalocean.com/community/tutorials/how-to-configure-ssh-key-based-authentication-on-a-linux-server">you already have that</a>, right?). Now, a few words about the components that we are going to use.</p>
<p><a href="https://www.ansible.com/">Ansible</a> is a tool for automating task execution on remote servers. It runs locally on your development machine and can connect to a specified set of servers via ssh in order to execute a series of tasks without the need of an “agent” process on the server(s). There’s a <a href="https://docs.ansible.com/ansible/latest/modules/modules_by_category.html">wide variety of modules</a> that can accomplish common tasks such as creating users and groups, installing dependencies, copying files and many more. We will focus on the absolutely necessary in this guide, but for those who would like to do more there’s this <a href="https://serversforhackers.com/c/an-ansible2-tutorial">nice tutorial</a> as well as ansible’s <a href="https://docs.ansible.com/ansible/latest/user_guide/intro_getting_started.html">official documentation</a>.</p>
<p><a href="https://www.freedesktop.org/wiki/Software/systemd/">systemd</a> is the basic foundation of most linux systems nowadays as the replacement of sysvinit and has a wide variety of features including managing processes and services (the feature that we’ll be using for this article).</p>
<p>For our demonstration, we will be using a simple custom service written in Go, which very nicely and conveniently consists of a single statically-linked binary, but the concepts are the same for anything that can be executed on the remote server (this includes programs written in ruby/python/java/dotnet etc.). So, let’s start!</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>We will be needing the following on our local (development) machine:</p>
<ul>
<li>a <a href="https://golang.org/doc/install">working Go installation</a> in order to build our service</li>
<li>the <a href="https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html">ansible tool</a></li>
<li>the <code class="language-plaintext highlighter-rouge">make</code> program (check your system using <code class="language-plaintext highlighter-rouge">which make</code>)</li>
</ul>
<p>I have assumed that you have <a href="https://www.digitalocean.com/community/tutorials/how-to-configure-ssh-key-based-authentication-on-a-linux-server">passwordless ssh access</a> to a remote server running linux (I use Debian Buster but any linux system with sshd and systemd should do).</p>
<p>All the work that follows is checked into <a href="https://github.com/kkentzo/deployment-ansible-systemd-demo">this repo</a> which can be cloned using <code class="language-plaintext highlighter-rouge">git clone https://github.com/kkentzo/deployment-ansible-systemd-demo.git</code>. The repo contains the following components:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">cmd/demo/main.go</code>: our service</li>
<li><code class="language-plaintext highlighter-rouge">demo.yml</code>: the description of our deployment (ansible)</li>
<li><code class="language-plaintext highlighter-rouge">roles/demo</code>: the deployment specifics of the demo service (ansible)</li>
<li><code class="language-plaintext highlighter-rouge">hosts</code>: the inventory list of hosts to which the demo service will be deployed</li>
<li><code class="language-plaintext highlighter-rouge">Makefile</code>: targets for building and deploying the service</li>
</ul>
<h2 id="the-guide">The Guide</h2>
<h3 id="writing-the-service">Writing the service</h3>
<p>Our service is a very simple one: it accepts http requests and responds with a greeting to the client based on the contents of the url path. The code is dead simple:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span>
<span class="k">import</span> <span class="p">(</span>
<span class="s">"fmt"</span>
<span class="s">"log"</span>
<span class="s">"net/http"</span>
<span class="p">)</span>
<span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="n">http</span><span class="o">.</span><span class="n">HandleFunc</span><span class="p">(</span><span class="s">"/"</span><span class="p">,</span> <span class="k">func</span><span class="p">(</span><span class="n">w</span> <span class="n">http</span><span class="o">.</span><span class="n">ResponseWriter</span><span class="p">,</span> <span class="n">r</span> <span class="o">*</span><span class="n">http</span><span class="o">.</span><span class="n">Request</span><span class="p">)</span> <span class="p">{</span>
<span class="k">var</span> <span class="n">name</span> <span class="kt">string</span>
<span class="k">if</span> <span class="n">name</span> <span class="o">=</span> <span class="n">r</span><span class="o">.</span><span class="n">URL</span><span class="o">.</span><span class="n">Path</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="p">];</span> <span class="n">name</span> <span class="o">==</span> <span class="s">""</span> <span class="p">{</span>
<span class="n">name</span> <span class="o">=</span> <span class="s">"stranger"</span>
<span class="p">}</span>
<span class="n">fmt</span><span class="o">.</span><span class="n">Fprintf</span><span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="s">"hello %s!"</span><span class="p">,</span> <span class="n">name</span><span class="p">)</span>
<span class="p">})</span>
<span class="n">log</span><span class="o">.</span><span class="n">Fatal</span><span class="p">(</span><span class="n">http</span><span class="o">.</span><span class="n">ListenAndServe</span><span class="p">(</span><span class="s">":9999"</span><span class="p">,</span> <span class="no">nil</span><span class="p">))</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The code above starts an http server that listens on port 9999. If the url path is the root path (“/”) then the service greets the “stranger”, otherwise it greets whoever is mentioned in the url path (e.g. <code class="language-plaintext highlighter-rouge">GET /world</code> will return “hello world!”).</p>
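<p>The behaviour described above can also be checked without opening a port, by driving the handler in memory with Go’s <code class="language-plaintext highlighter-rouge">net/http/httptest</code> package. What follows is only a sketch and not part of the repo: it assumes that the anonymous handler has been pulled out into a named function (<code class="language-plaintext highlighter-rouge">greet</code>, a made-up name), and the <code class="language-plaintext highlighter-rouge">get</code> helper is likewise hypothetical:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// greet mirrors the anonymous handler of main.go
// (the name is an assumption made for this sketch)
func greet(w http.ResponseWriter, r *http.Request) {
	name := r.URL.Path[1:]
	if name == "" {
		name = "stranger"
	}
	fmt.Fprintf(w, "hello %s!", name)
}

// get sends an in-memory GET request for path to greet
// and returns the response body (hypothetical helper)
func get(path string) string {
	rec := httptest.NewRecorder()
	greet(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(get("/"))      // root path greets the stranger
	fmt.Println(get("/world")) // any other path greets its content
}
</code></pre></div></div>
<p>Running this program prints “hello stranger!” followed by “hello world!”, which matches the two cases described above.</p>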
<p>This file is placed under <code class="language-plaintext highlighter-rouge">cmd/demo</code> as <code class="language-plaintext highlighter-rouge">main.go</code> in our working directory and can be built in executable form (under <code class="language-plaintext highlighter-rouge">bin/</code>) as follows:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>go build <span class="nt">-o</span> ./bin/demo ./cmd/demo/...
</code></pre></div></div>
<p>OK, so now we have our service - how about we deploy it?</p>
<h3 id="deploying-the-service">Deploying the service</h3>
<p>We will use ansible to deploy our service to our remote server as a systemd service unit. As mentioned before, the remote server can be any linux system with ssh and systemd. If you don’t have access to such a system, you can use a tool such as <a href="https://www.virtualbox.org/">VirtualBox</a> in order to set up a <a href="https://www.debian.org/releases/buster/">debian buster</a> system.</p>
<p>We will specify our remote server in our inventory (file <code class="language-plaintext highlighter-rouge">hosts</code>) for use by ansible:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[myservers]
harpo
</code></pre></div></div>
<p>As you can see, this file can declare multiple named server groups (names in <code class="language-plaintext highlighter-rouge">[]</code> brackets can be referenced in other ansible files). We have specified the section <code class="language-plaintext highlighter-rouge">myservers</code> which contains the name of our single server <code class="language-plaintext highlighter-rouge">harpo</code>. In this case, <code class="language-plaintext highlighter-rouge">harpo</code> is an alias defined in our <code class="language-plaintext highlighter-rouge">.ssh/config</code> file as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Host harpo
HostName 12.34.56.789
User USERNAME
IdentityFile ~/.ssh/harpo
</code></pre></div></div>
<p>This configuration facilitates ansible’s access to the remote server (as mentioned before) and assumes that we have <a href="https://www.digitalocean.com/community/tutorials/how-to-configure-ssh-key-based-authentication-on-a-linux-server">correctly set up access</a> for user <code class="language-plaintext highlighter-rouge">USERNAME</code> in the server located in the address <code class="language-plaintext highlighter-rouge">12.34.56.789</code> (replace this with your own server’s IP).</p>
<p>Now that we have specified our remote server, we need to define a role (ansible’s unit for bundling related tasks, handlers and files) for our server as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mkdir roles
$ cd roles
$ ansible-galaxy init demo
</code></pre></div></div>
<p>The above command will generate a file/directory structure under <code class="language-plaintext highlighter-rouge">roles/demo</code> of which the following are relevant to our guide:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">roles/demo/tasks/main.yml</code>: the sequence of tasks to execute on the server</li>
<li><code class="language-plaintext highlighter-rouge">roles/demo/handlers/main.yml</code>: handlers, i.e. actions that run when a task notifies them after making a change</li>
<li><code class="language-plaintext highlighter-rouge">roles/demo/files/</code>: contains the files that we will need to copy to the remote server</li>
</ul>
<p>Let’s start with the latter and define our systemd unit in file <code class="language-plaintext highlighter-rouge">roles/demo/files/demo.service</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
Description=Demo service
[Service]
User=demo
Group=demo
ExecStart=/usr/local/bin/demo
[Install]
WantedBy=multi-user.target
</code></pre></div></div>
<p>As you can see, systemd units are defined simply using a declarative language. In our case, we declare our service executable (<code class="language-plaintext highlighter-rouge">ExecStart</code>) that will run under user <code class="language-plaintext highlighter-rouge">demo</code>. The <code class="language-plaintext highlighter-rouge">[Install]</code> section specifies that our service requires a system state in <a href="https://unix.stackexchange.com/a/506374">which network is up and the system accepts logins</a>.</p>
<p>Now that we have our systemd unit, let’s define <a href="https://docs.ansible.com/ansible/latest/user_guide/playbooks.html">our ansible playbook</a>, starting from file <code class="language-plaintext highlighter-rouge">roles/demo/tasks/main.yml</code>:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">create demo group</span>
<span class="na">group</span><span class="pi">:</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">demo</span>
<span class="na">state</span><span class="pi">:</span> <span class="s">present</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">create demo user</span>
<span class="na">user</span><span class="pi">:</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">demo</span>
<span class="na">groups</span><span class="pi">:</span> <span class="s">demo</span>
<span class="na">shell</span><span class="pi">:</span> <span class="s">/sbin/nologin</span>
<span class="na">append</span><span class="pi">:</span> <span class="s">yes</span>
<span class="na">state</span><span class="pi">:</span> <span class="s">present</span>
<span class="na">create_home</span><span class="pi">:</span> <span class="s">no</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Copy systemd service file to server</span>
<span class="na">copy</span><span class="pi">:</span>
<span class="na">src</span><span class="pi">:</span> <span class="s">demo.service</span>
<span class="na">dest</span><span class="pi">:</span> <span class="s">/etc/systemd/system</span>
<span class="na">owner</span><span class="pi">:</span> <span class="s">root</span>
<span class="na">group</span><span class="pi">:</span> <span class="s">root</span>
<span class="na">notify</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">Start demo</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Copy binary to server</span>
<span class="na">copy</span><span class="pi">:</span>
<span class="na">src</span><span class="pi">:</span> <span class="s">demo</span>
<span class="na">dest</span><span class="pi">:</span> <span class="s">/usr/local/bin</span>
<span class="na">mode</span><span class="pi">:</span> <span class="m">0755</span>
<span class="na">owner</span><span class="pi">:</span> <span class="s">root</span>
<span class="na">group</span><span class="pi">:</span> <span class="s">root</span>
<span class="na">notify</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">Start demo</span>
</code></pre></div></div>
<p>The task file is mostly self-explanatory but a few items need clarifications:</p>
<ul>
<li>each task has a name and references an ansible module that accepts parameters</li>
<li><a href="https://docs.ansible.com/ansible/latest/modules/group_module.html">ansible’s <code class="language-plaintext highlighter-rouge">group</code> module</a> creates the specified group if it does not exist</li>
<li><a href="https://docs.ansible.com/ansible/latest/modules/user_module.html">ansible’s <code class="language-plaintext highlighter-rouge">user</code> module</a> creates users</li>
<li><a href="https://docs.ansible.com/ansible/latest/modules/copy_module.html">ansible’s <code class="language-plaintext highlighter-rouge">copy</code> module</a> copies files that exist locally under <code class="language-plaintext highlighter-rouge">roles/demo/files</code> (such as <code class="language-plaintext highlighter-rouge">demo.service</code> that we created previously) to the remote server</li>
</ul>
<p>Ansible’s <code class="language-plaintext highlighter-rouge">notify</code> directive enqueues a particular handler (<code class="language-plaintext highlighter-rouge">Start demo</code>) to be executed after the completion of all tasks, provided that the notifying task actually changed something on the server. All handlers are defined in file <code class="language-plaintext highlighter-rouge">roles/demo/handlers/main.yml</code>:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Start demo</span>
<span class="na">systemd</span><span class="pi">:</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">demo</span>
<span class="na">state</span><span class="pi">:</span> <span class="s">started</span>
<span class="na">enabled</span><span class="pi">:</span> <span class="s">yes</span>
</code></pre></div></div>
<p>This handler uses <a href="https://docs.ansible.com/ansible/latest/modules/systemd_module.html">ansible’s <code class="language-plaintext highlighter-rouge">systemd</code> module</a> and requires the service to be started and enabled (i.e. started every time the remote server boots).</p>
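<p>One thing to keep in mind is that <code class="language-plaintext highlighter-rouge">state: started</code> does nothing when the service is already running, so a redeployed binary or unit file would not be picked up by a running service. A possible variant, shown here only as a sketch and not as what the repo does, is a handler that reloads the unit files and restarts the service. The handler name <code class="language-plaintext highlighter-rouge">Restart demo</code> is made up for this example and the <code class="language-plaintext highlighter-rouge">notify</code> entries of the tasks would need to reference it:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>---
# hypothetical alternative to the "Start demo" handler
- name: Restart demo
  systemd:
    name: demo
    # restart even if the service is already running
    state: restarted
    enabled: yes
    # re-read unit files in case demo.service has changed
    daemon_reload: yes
</code></pre></div></div>
<p>The trade-off is that every notified change now causes a short interruption of the service, which is usually what we want after shipping a new binary.</p>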
<p>Finally, we complete our ansible configuration by combining our inventory and roles in file <code class="language-plaintext highlighter-rouge">demo.yml</code>:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="pi">-</span> <span class="na">hosts</span><span class="pi">:</span> <span class="s">myservers</span>
<span class="na">become</span><span class="pi">:</span> <span class="s">yes</span>
<span class="na">become_user</span><span class="pi">:</span> <span class="s">root</span>
<span class="na">roles</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">demo</span>
</code></pre></div></div>
<p>Here, we declare that we would like to apply the role <code class="language-plaintext highlighter-rouge">demo</code> that we just defined to the specified host group (<code class="language-plaintext highlighter-rouge">myservers</code> as specified in our inventory file).</p>
<h3 id="wrap-up">Wrap up</h3>
<p>We’re almost there! Let’s wrap up the whole thing in a <code class="language-plaintext highlighter-rouge">Makefile</code> that contains the two targets of interest, build and deploy our service, as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.PHONY: build
build:
env GOOS=linux go build -o ./bin/demo ./cmd/demo/...
.PHONY: deploy
deploy: build
cp ./bin/demo ./roles/demo/files/demo
ansible-playbook -i hosts demo.yml
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">build</code> action compiles our service (for linux) and outputs the executable under <code class="language-plaintext highlighter-rouge">bin/</code>. The <code class="language-plaintext highlighter-rouge">deploy</code> target first builds the service, then copies the executable under the demo role’s files and executes the entire ansible playbook by using the <code class="language-plaintext highlighter-rouge">demo.yml</code> spec.</p>
<p>Now, we can deploy our service by issuing:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>make deploy
</code></pre></div></div>
<p>The output of this command on my machine was as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>make deploy
env GOOS=linux go build -o ./bin/demo ./cmd/demo/...
cp ./bin/demo ./roles/demo/files/demo
ansible-playbook -i hosts demo.yml
PLAY [home] ********************************************************************
TASK [Gathering Facts] *********************************************************
ok: [harpo]
TASK [demo : create demo group] ************************************************
changed: [harpo]
TASK [demo : create demo user] *************************************************
changed: [harpo]
TASK [demo : Copy systemd service file to server] ******************************
changed: [harpo]
TASK [demo : Copy binary to server] ********************************************
changed: [harpo]
RUNNING HANDLER [demo : Start demo] ********************************************
changed: [harpo]
PLAY RECAP *********************************************************************
harpo : ok=6 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
</code></pre></div></div>
<p>We can now test our service using curl:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>curl 12.34.56.789:9999/world
</code></pre></div></div>
<p>where <code class="language-plaintext highlighter-rouge">12.34.56.789</code> needs to be replaced by your remote server’s actual IP. If you see the output “hello world!”, then you made it!</p>
<h3 id="status--monitoring">Status & Monitoring</h3>
<p>We can also have a look at how the <code class="language-plaintext highlighter-rouge">demo</code> process is doing on our remote server by logging in (via ssh) and using the systemd commands <code class="language-plaintext highlighter-rouge">systemctl</code> (control and status) and <code class="language-plaintext highlighter-rouge">journalctl</code> (logs) as follows:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># check the status of our service</span>
<span class="nv">$ </span><span class="nb">sudo </span>systemctl status demo
<span class="c"># tail our service's logs</span>
<span class="nv">$ </span><span class="nb">sudo </span>journalctl <span class="nt">-f</span> <span class="nt">-u</span> demo
</code></pre></div></div>
<h3 id="further-work">Further Work</h3>
<p>This approach can be used to do pretty much anything on one or more remote servers in a consistent and robust manner. Beyond process management, systemd can also be used to schedule events (à la cronjobs) using <a href="https://www.freedesktop.org/software/systemd/man/systemd.timer.html">timer units</a> and manage logs using its own binary journal files and syslog.</p>
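<p>As a rough illustration of a timer unit, the file below would trigger a service once a day. It is a sketch under assumptions: the unit name <code class="language-plaintext highlighter-rouge">demo-cleanup</code> is made up, and a matching <code class="language-plaintext highlighter-rouge">demo-cleanup.service</code> (which a timer activates by default when it shares its name) is assumed to exist next to it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># /etc/systemd/system/demo-cleanup.timer (hypothetical unit name)
[Unit]
Description=Run demo-cleanup once a day

[Timer]
# fire once a day at midnight
OnCalendar=daily
# catch up on a run that was missed while the machine was off
Persistent=true

[Install]
WantedBy=timers.target
</code></pre></div></div>
<p>Such a timer can be copied to the server and enabled using the same <code class="language-plaintext highlighter-rouge">copy</code> and <code class="language-plaintext highlighter-rouge">systemd</code> modules that we used for the service itself.</p>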
<p>Ansible’s apt, shell and copy modules also facilitate the automated installation and configuration of standard software packages, even on the local machine using the “[local]” group name in the inventory file:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[local]
127.0.0.1
</code></pre></div></div>
<p>and executing any playbook using <code class="language-plaintext highlighter-rouge">ansible-playbook</code>’s <code class="language-plaintext highlighter-rouge">--connection=local</code> command argument.</p>
<h2 id="epilogue">Epilogue</h2>
<p><code class="language-plaintext highlighter-rouge">ansible</code> and <code class="language-plaintext highlighter-rouge">systemd</code> are two fantastic tools that allow one to build automated, simple and reproducible operational pipelines quickly and efficiently.</p>
<p>All the contents of the service and the deployment code are in <a href="https://github.com/kkentzo/deployment-ansible-systemd-demo">this repo</a>.</p>
<p>I hope that you enjoyed this guide and found it useful! Please feel free to leave your comments or ask your questions.</p>You may be a sole developer or member of a small development team with no dedicated ops people. You will probably have a handful of small-ish services, perhaps a few cronjobs and a couple of VPSs to run them on. Or you may have one or more servers at home and would like to automate the deployment of custom or open source tools and services. What are your options?A legit use case for the goto abomination2020-02-26T00:00:00+00:002020-02-26T00:00:00+00:00https://kkentzo.github.io/2020/02/26/a-legit-use-case-for-goto<p>Most programming educational material mentions, at some point, the <code class="language-plaintext highlighter-rouge">goto</code> keyword, usually as a despicable abomination that should not exist in the face of this earth. The general consensus is that the <a href="http://www.u.arizona.edu/~rubinson/copyright_violations/Go_To_Considered_Harmful.html">goto statement is considered harmful</a> and the advice is to avoid the use of <code class="language-plaintext highlighter-rouge">goto</code> and to restructure the code so as to eliminate the need for using it. Of course this is sound advice, intended to discourage the creation of code in which a function’s control flow is heavily influenced by jumps to and from various points within its body, making the understanding, testing and debugging of the function really difficult.</p>
<p>Such an evil construct should surely have been banished from all languages by now. However, some widely-used languages still offer the goto statement in their arsenal. These include C, C++, Golang, and C#, while Java interestingly reserves <code class="language-plaintext highlighter-rouge">goto</code> as a keyword but <a href="https://stackoverflow.com/a/4547764">it is not implemented</a>. Most dynamic languages (ruby, python, javascript), on the other hand, do not have a goto statement (although <a href="http://patshaughnessy.net/2012/2/29/the-joke-is-on-us-how-ruby-1-9-supports-the-goto-statement">this was a fun read</a>!).</p>
<p>I had never used the goto statement myself until I started writing C code for a certain kind of embedded device and encountered multiple situations where I wished I had some kind of try/catch/finally construct in the language. The purpose of the latter is to better express the need for ensuring that a bunch of statements will run no matter what, especially on early-exit error paths. Now, C of course does not have such facilities but does have <code class="language-plaintext highlighter-rouge">goto</code>. Can we use it to handle such use cases in a C program in a way that is clean and DRY?</p>
<p>Let’s see a (somewhat contrived) example: suppose that we have to parse a json string of the form <code class="language-plaintext highlighter-rouge">{"name": "John"}</code> in a C program using the well-known library <a href="https://github.com/DaveGamble/cJSON">cJSON</a>. For this purpose, we define a <code class="language-plaintext highlighter-rouge">handler</code> function (that maybe validates and aggregates the attributes in some way - it doesn’t really matter) and a <code class="language-plaintext highlighter-rouge">process_json</code> function that parses the json string and calls <code class="language-plaintext highlighter-rouge">handler</code> for each attribute that is parsed, as follows:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// return 0 for success and -1 for error</span>
<span class="kt">int</span> <span class="nf">handler</span><span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="n">key</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">val</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// some condition gives an error</span>
<span class="k">if</span> <span class="p">(</span><span class="n">key</span> <span class="o">==</span> <span class="nb">NULL</span> <span class="o">||</span> <span class="n">val</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="p">}</span>
<span class="c1">// do something here</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// process the supplied json string</span>
<span class="c1">// returns 0 on success, -1 on error</span>
<span class="kt">int</span> <span class="nf">process_json</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">json</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">rc</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="c1">// here we parse the json and allocate the resources</span>
<span class="c1">// root will need to be freed further down</span>
<span class="n">cJSON</span> <span class="o">*</span><span class="n">root</span> <span class="o">=</span> <span class="n">cJSON_Parse</span><span class="p">(</span><span class="n">json</span><span class="p">);</span>
<span class="c1">// let's grab a reference to the name item</span>
<span class="n">cJSON</span> <span class="o">*</span><span class="n">name</span> <span class="o">=</span> <span class="n">cJSON_GetObjectItem</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="s">"name"</span><span class="p">);</span>
<span class="c1">// and `handle` the inner string</span>
<span class="k">if</span> <span class="p">((</span><span class="n">rc</span> <span class="o">=</span> <span class="n">handler</span><span class="p">(</span><span class="s">"name"</span><span class="p">,</span> <span class="n">cJSON_GetStringValue</span><span class="p">(</span><span class="n">name</span><span class="p">)))</span> <span class="o"><</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// stop processing - we have an error</span>
<span class="c1">// free the resource</span>
<span class="n">cJSON_Delete</span><span class="p">(</span><span class="n">root</span><span class="p">);</span>
<span class="k">return</span> <span class="n">rc</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// happy path -- free the resource</span>
<span class="n">cJSON_Delete</span><span class="p">(</span><span class="n">root</span><span class="p">);</span>
<span class="k">return</span> <span class="n">rc</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Now, the above code works fine but it is a little bit repetitive since we have to free the <code class="language-plaintext highlighter-rouge">root</code> resource in two exit paths. There could also be more paths (e.g. for json attributes to parse) and more resources that are created in the flow and that need to be released before exit. All this repetition is error-prone and makes our function less robust and more susceptible to leaks.</p>
<p>Let’s see how we can use the <code class="language-plaintext highlighter-rouge">goto</code> statement in order to concentrate all of our clean-up code in one place and reduce the likelihood of a leak. The idea is to use an <code class="language-plaintext highlighter-rouge">end</code> label and drive error exit paths to that label using <code class="language-plaintext highlighter-rouge">goto</code> statements as follows:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">process_json</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">json</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">rc</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">cJSON</span> <span class="o">*</span><span class="n">root</span> <span class="o">=</span> <span class="n">cJSON_Parse</span><span class="p">(</span><span class="n">json</span><span class="p">);</span>
<span class="n">cJSON</span> <span class="o">*</span><span class="n">name</span> <span class="o">=</span> <span class="n">cJSON_GetObjectItem</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="s">"name"</span><span class="p">);</span>
<span class="k">if</span> <span class="p">((</span><span class="n">rc</span> <span class="o">=</span> <span class="n">handler</span><span class="p">(</span><span class="s">"name"</span><span class="p">,</span> <span class="n">cJSON_GetStringValue</span><span class="p">(</span><span class="n">name</span><span class="p">)))</span> <span class="o"><</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span> <span class="k">goto</span> <span class="n">end</span><span class="p">;</span> <span class="p">}</span>
<span class="c1">// ... other code here / attribute parsing etc. ...</span>
<span class="nl">end:</span>
<span class="c1">// free the resource</span>
<span class="n">cJSON_Delete</span><span class="p">(</span><span class="n">root</span><span class="p">);</span>
<span class="k">return</span> <span class="n">rc</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>In the above flow, we will reach the <code class="language-plaintext highlighter-rouge">end</code> label either by going through the whole function (happy path) or by exiting prematurely when an error occurs. This way, (a) there is no repetition, (b) we ensure that the cleanup will occur no matter which path is taken, and (c) we improve the readability of the function’s flow by reducing the visual statement clutter. In short, we have a cleaner and safer function.</p>
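<p>The same pattern extends to more than one resource. The following is a minimal sketch, not taken from the original code: the function <code class="language-plaintext highlighter-rouge">make_upper_copy</code> and its two buffers are hypothetical and exist only for illustration. It assumes that every resource pointer is initialized to <code class="language-plaintext highlighter-rouge">NULL</code>, so that the single <code class="language-plaintext highlighter-rouge">end</code> label can release everything safely regardless of how far the function got:</p>

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (illustration only): stores a newly allocated,
 * upper-cased copy of src in *out. Returns 0 on success, -1 on error.
 * On success the caller owns *out and must free() it. */
int make_upper_copy(const char *src, char **out) {
  int rc = -1;           /* assume failure until the happy path completes */
  size_t len = 0;
  char *scratch = NULL;  /* resources start as NULL so cleanup is always safe */
  char *result = NULL;

  if (src == NULL || out == NULL) { goto end; }
  len = strlen(src);

  /* first resource: a scratch buffer holding the upper-cased text */
  if ((scratch = malloc(len + 1)) == NULL) { goto end; }
  for (size_t i = 0; i < len; i++) {
    scratch[i] = (char)toupper((unsigned char)src[i]);
  }
  scratch[len] = '\0';

  /* second resource: the buffer that will be handed to the caller */
  if ((result = malloc(len + 1)) == NULL) { goto end; }
  memcpy(result, scratch, len + 1);

  *out = result;
  result = NULL;         /* ownership moved to the caller, do not free below */
  rc = 0;
end:
  /* single clean-up point: free(NULL) is a no-op, so this is safe on every path */
  free(result);
  free(scratch);
  return rc;
}
```

<p>A caller would invoke <code class="language-plaintext highlighter-rouge">make_upper_copy("goto", &amp;out)</code>, check for a zero return code and eventually <code class="language-plaintext highlighter-rouge">free(out)</code>. Setting <code class="language-plaintext highlighter-rouge">result</code> to <code class="language-plaintext highlighter-rouge">NULL</code> after handing it over is one way to keep a single clean-up block without freeing memory that the caller now owns.</p>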
<p>So, when implementing a C function with the following characteristics/constraints:</p>
<ul>
<li>error conditions are handled before the happy path, and</li>
<li>one or more resources need to be cleaned up upon exit, regardless of whether the exit is premature (error) or normal,</li>
</ul>
<p>then the use of <code class="language-plaintext highlighter-rouge">goto</code> is quite legitimate and results in cleaner, safer, DRYer code.</p>Emacs on MacOS2018-06-26T00:00:00+00:002018-06-26T00:00:00+00:00https://kkentzo.github.io/2018/06/26/emacs-on-osx<p>After almost 15 years of running emacs predominantly on linux in
terminal mode, I am planning on switching my workflow completely to
MacOS. At the same time, I have experimented with the capabilities of
emacs running in a window/graphical system instead of the terminal
interface. Both experiences have been pleasant so far, particularly
the fact that I can now view images in emacs buffers :-O However,
there have been some issues that I have had to address, so here it
goes.</p>
<h2 id="operating-emacs-in-a-window-system">Operating emacs in a window-system</h2>
<p>A lot of flows in emacs depend on things like environment variables
being set correctly. My current bash setup involves a <code class="language-plaintext highlighter-rouge">.profile</code> file
that specifies stuff like the <code class="language-plaintext highlighter-rouge">PATH</code> and <code class="language-plaintext highlighter-rouge">GOPATH</code>, <code class="language-plaintext highlighter-rouge">AWS_*</code> variables,
various aliases, shell customizations (such as <code class="language-plaintext highlighter-rouge">PS1</code>) as well as
<code class="language-plaintext highlighter-rouge">rvm</code>-related functionality. The <code class="language-plaintext highlighter-rouge">.profile</code> file is sourced in
<code class="language-plaintext highlighter-rouge">.bashrc</code>, so a login shell is needed to get all these declarations in
the environment.</p>
<p>One of the first things I noticed when running emacs on the MacOS
window-system (<code class="language-plaintext highlighter-rouge">C-h v window-system</code>) was the fact that the
environment in emacs was pretty much void of all my
customizations. So, for example the <code class="language-plaintext highlighter-rouge">projectile-compile-project</code>
command in a go project failed with errors indicating that the
environment was not set up correctly (<code class="language-plaintext highlighter-rouge">GOPATH</code> not defined, <code class="language-plaintext highlighter-rouge">dep</code>
located in <code class="language-plaintext highlighter-rouge">/usr/local/bin</code> was not found etc.).</p>
<p>And indeed it makes sense for the environment to be void; why should
it be otherwise? When the application starts in a window system, there
is no reason why the custom bash initialization should be executed in
the context of the application. This should be true in all window
systems, not just MacOS (<code class="language-plaintext highlighter-rouge">ns</code>).</p>
<p>It turns out that this problem can be solved using the
<a href="https://github.com/purcell/exec-path-from-shell"><code class="language-plaintext highlighter-rouge">exec-path-from-shell</code></a>
package. This package essentially executes a shell, grabs the values
of certain environment variables (which can be customized as well)
and sets them (<code class="language-plaintext highlighter-rouge">setenv</code>) in the current environment in which emacs is
operating.</p>
<p>Add the following in your emacs initialization file to get the package
working:</p>
<div class="language-lisp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">;; this will make sure that the package is installed during emacs init</span>
<span class="p">(</span><span class="nb">use-package</span> <span class="nv">exec-path-from-shell</span>
<span class="ss">:ensure</span> <span class="no">t</span><span class="p">)</span>
<span class="c1">;; this will initialize the package only when a window-system is detected</span>
<span class="p">(</span><span class="nb">when</span> <span class="p">(</span><span class="nv">memq</span> <span class="nv">window-system</span> <span class="o">'</span><span class="p">(</span><span class="nv">mac</span> <span class="nv">ns</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="nv">exec-path-from-shell-initialize</span><span class="p">))</span>
</code></pre></div></div>
<h2 id="supporting-multiple-system-types-in-initel">Supporting multiple system-types in <code class="language-plaintext highlighter-rouge">init.el</code></h2>
<p>Now, going back to the issue of switching emacs usage from linux to
MacOS, I would very much like to support both systems in a single
<code class="language-plaintext highlighter-rouge">init.el</code> because some usage on linux is to be expected after all. For
functionality that differs across the two systems, emacs provides the
variable <code class="language-plaintext highlighter-rouge">system-type</code> (<code class="language-plaintext highlighter-rouge">C-h v system-type</code>).</p>
<p>For example, to use the <code class="language-plaintext highlighter-rouge">badger</code> theme only on MacOS, one needs to do
the following in <code class="language-plaintext highlighter-rouge">init.el</code>:</p>
<div class="language-lisp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">eq</span> <span class="nv">system-type</span> <span class="ss">'darwin</span><span class="p">)</span>
<span class="p">(</span><span class="nv">load-theme</span> <span class="ss">'badger</span> <span class="no">t</span><span class="p">))</span>
</code></pre></div></div>
<p>So, using the <code class="language-plaintext highlighter-rouge">system-type</code> variable, one can execute lisp code
conditionally upon the type of the system in which emacs is running.</p>