Most applications today deal with some form of sensitive information. The most common examples are database connection strings, API keys, tokens, etc. The web.config might seem like the best place to keep these values, but it definitely is not. In most cases it gets pushed into the source control system as well. If it is a private repository, you at least have one level of security on top of it, but it still exposes sensitive information to anyone who has access to the repository. It’s worse when the repository is public.

Keep sensitive data out of source control

There are different ways you can avoid pushing sensitive data into source control. In this post, I will explore options that I am familiar with.

Use configuration files as template definitions for the configuration data your application requires. Have the actual values stored elsewhere

Azure App Settings

If you are deploying your application as a Web App on Azure, you can store application settings and connection strings in Azure. At runtime, Windows Azure Web Sites automatically retrieves these values for you and makes them available to code running in your website. This removes the need for having sensitive data in the configuration file.

Azure App Settings and Connection Strings
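
In code, these values are read the same way as any other web.config setting, so nothing changes in the application itself. A minimal sketch (the setting names below are placeholders of my own, not from the original post):

// Requires a reference to System.Configuration.
using System.Configuration;

public static class AppConfig
{
    // In Azure, values set under the Web App's Application Settings and Connection Strings
    // override the corresponding web.config entries at runtime, so the same code works
    // locally (reading web.config) and in Azure (reading the portal values).
    public static string ApiKey =>
        ConfigurationManager.AppSettings["ApiKey"];

    public static string DefaultConnection =>
        ConfigurationManager.ConnectionStrings["DefaultConnection"].ConnectionString;
}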

Release Management Tools

Release management tools like Octopus Deploy and Microsoft Release Management perform configuration transformation. They support creating different environments (development, production) with corresponding configurations. When creating a package for an environment, they apply that environment's configuration values.

Release Management Tools - Octopus Deploy

Packaging embeds the configuration values into the configuration file, which makes them available to anyone who has access to the host systems.

Azure Key Vault

Azure Key Vault acts as a centralized repository for all sensitive information. Key Vault stores cryptographic keys and secrets and makes them available over an HTTP API. The objects (keys and secrets) in the key vault have unique identifiers used to retrieve them. Check Azure Key Vault in a Real World Application for more details on how to achieve this. A client application can authenticate with Azure Key Vault using a ClientID/secret or a ClientID/certificate; using a certificate is the preferred approach. To get keys/secrets from the key vault, all you need is the AD Application Id, the client secret or certificate identifier, and the key/secret names. The certificate itself can be deployed separately to the application host.

<appSettings>
  <add key="KeyVaultUrl" value="https://testvaultrahul.vault.azure.net"/>
  <add key="ADApplicationId" value="" />
  <add key="ADCertificateThumbprint" value="" />
  <add key="DbConnectionString" value="SqlConnectionString"/>
  <add key ="ApiToken" value="ApiToken/cfedea84815e4ca8bc19cf8eb943ee13"/>
</appSettings>
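
Below is a minimal sketch of how these settings could be used to fetch a secret at runtime. It assumes the Microsoft.Azure.KeyVault and ADAL (Microsoft.IdentityModel.Clients.ActiveDirectory) NuGet packages and that the authentication certificate is installed in the machine certificate store; the helper class and method names are mine, not part of any SDK.

using System.Configuration;
using System.Linq;
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;
using Microsoft.Azure.KeyVault;
using Microsoft.IdentityModel.Clients.ActiveDirectory;

public static class KeyVaultSecrets
{
    public static async Task<string> GetSecretAsync(string secretName)
    {
        var vaultUrl = ConfigurationManager.AppSettings["KeyVaultUrl"];
        var client = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(GetAccessToken));
        var secret = await client.GetSecretAsync(vaultUrl, secretName);
        return secret.Value;
    }

    // Authenticates the AD application using the certificate identified by its thumbprint.
    private static async Task<string> GetAccessToken(string authority, string resource, string scope)
    {
        var clientId = ConfigurationManager.AppSettings["ADApplicationId"];
        var thumbprint = ConfigurationManager.AppSettings["ADCertificateThumbprint"];
        var certificate = FindCertificateByThumbprint(thumbprint);
        var context = new AuthenticationContext(authority);
        var result = await context.AcquireTokenAsync(resource, new ClientAssertionCertificate(clientId, certificate));
        return result.AccessToken;
    }

    private static X509Certificate2 FindCertificateByThumbprint(string thumbprint)
    {
        // The certificate could equally live in the CurrentUser store; adjust as needed.
        var store = new X509Store(StoreName.My, StoreLocation.LocalMachine);
        store.Open(OpenFlags.ReadOnly);
        try
        {
            return store.Certificates
                .Find(X509FindType.FindByThumbprint, thumbprint, validOnly: false)
                .Cast<X509Certificate2>()
                .First();
        }
        finally
        {
            store.Close();
        }
    }
}

With this in place, the DbConnectionString and ApiToken settings above only hold the names (identifiers) of the secrets in the vault; the secret values themselves never appear in configuration.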

If you are using the client secret to authenticate, then the configuration file will have the secret. In either case, you should follow one of the previous approaches to keep the Application Id and authentication information out of the configuration file. The advantage of using Key Vault is that it is a centralized repository for all your sensitive data across different applications, and you can also restrict access permissions per application.

These are some approaches to keep sensitive information out of source control. What approach do you use? Irrespective of the approach you use, make sure that you don’t check them in!

Since the start of this year, I have been trying to blog to a schedule and publish posts more often. The goal I have set for myself is four posts a month, preferably one each week. I have been sticking to it so far, and I hope it continues. Initially, I did not have an upper limit on the number of posts in a month. In March 2016, I went a bit aggressive and published nine articles. That made me think about setting an upper limit on the number of posts so that I don’t end up setting higher expectations for myself.

Staying Ahead

Having published nine posts also made me realize that I could write faster if required and have posts ready for the future. That helps me stay ahead of the posting schedule and gives me some time off when I need it. But it also presented a new problem: how to manage and schedule posts for the future.

The more I automate the mundane tasks of blogging, the more I can concentrate on the writing part

Jekyll Future flag

Octopress is built over Jekyll and provides all the capabilities that Jekyll does. The future flag in Jekyll indicates whether or not to publish posts or collection documents with a future date. With the flag set to false, Jekyll will not generate posts that have a date in the future. This works perfectly for me: all I need to do is publish a post into the _posts directory once it’s ready, with a date in the future. I have a draft workflow that keeps posts in a _drafts folder and moves them into the _posts folder once ready. I updated the rake script that publishes drafts as posts to take a publish date and use that to update the post date.

task :publish_draft do
  ...
  puts "Publish Date?"
  publishDateString = STDIN.gets.chomp
  publishDate = DateTime.parse(publishDateString)
  ...
  dest = "#{source_dir}/#{posts_dir}/#{publishDate.strftime('%Y-%m-%d')}-#{filename}"
  puts "Publishing post to: #{dest}"
  File.open(source) { |source_file|
    contents = source_file.read
    contents.gsub!(/^thisIsStillADraft:$/, "date: #{publishDate.strftime('%Y-%m-%d')}\ncompletedDate: #{DateTime.now.strftime('%Y-%m-%d %H:%M:%S %z')}")
    ...

The rake script adds the publish date to the post file name, sets the date in the YAML front matter, and moves the post from the _drafts folder to the _posts folder. It also adds a completedDate set to the current time with timezone information, just for reference.

Integrating with Travis CI

I have the deployment of my blog automated via Travis CI, which builds and deploys the site on every commit to the GitHub repository. For future posts, since there might not be a commit on the publish date, I need to trigger the build on those days to publish the scheduled posts. Azure Scheduler enables scheduling requests and provides out of the box support for invoking web service endpoints over HTTP/HTTPS. Travis CI exposes an API to interact with the build system, the same API that the official web interface uses. The API supports triggering builds by making a POST request with an API token and the build details. The API has an existing bug that requires the slash separating the username and repository name in the trigger URL to be encoded (%2F). Azure Scheduler, however, does not like this and treats it as an invalid URL with the below error.

Azure Scheduler Encoded URL error

The only option left was to write this code myself and have it run on a schedule. I chose the approach with the least work involved - Azure Automation. Azure Automation allows you to create runbooks and trigger them automatically on a schedule. It has a pricing plan with 500 minutes of free job run time a month, which meets my requirements. I created a PowerShell runbook and added the token (TravisToken) and the build URL (TravisBuildUrl) as Automation variables used by the script.

$travisBlogTriggerApiUrl = Get-AutomationVariable -Name 'TravisBuildUrl'
$token = Get-AutomationVariable -Name 'TravisToken'

$body = "{""request"": {""message"":""Scheduled Automated build"",""branch"":""master""}}"
$headers = @{
    'Content-Type' = 'application/json'
    'Accept' = 'application/json'
    'Travis-API-Version' = '3'
    'Authorization' = 'token ' + $token
}

Invoke-WebRequest -Method Post $travisBlogTriggerApiUrl -Body $body -Headers $headers -UseBasicParsing

The script runs on a schedule every day and triggers the Travis CI build, which deploys the latest generated site to the Azure Web App that hosts the blog. Any posts scheduled for the current date get picked up by Jekyll and included in the site generation.

Automatic Deployment of Future Posts With Octopress
The scheduler triggers the Travis CI build. For details on how Travis CI is set up, check Continuous Delivery of Octopress Blog Using Travis CI and Docker.

Post to Social Media

With the posts getting deployed automatically, I also want my social networks updated. I already use Buffer to post updates to all my social networks. Buffer is a ‘Write Once, Post Everywhere’ kind of service: it brings all your social media profiles into one place and lets you post to all of them by writing the update just once.

IFTTT (‘If This Then That’) is a service that connects different apps and devices together: as the name says, it runs an action in response to a trigger. IFTTT has many Channels that can act as the source of a trigger. In my case, the trigger is a post getting published, and I can hook into that event using the Feed Channel, which has an option to trigger when a new item appears on the feed. I use this to trigger an update to Buffer. Buffer is available as a channel on IFTTT, but it only allows updating one of the connected accounts in Buffer, which would require a recipe per social media account. Instead, I chose to use Buffer’s update-via-email feature, which lets me have just one recipe in IFTTT to update all of my connected profiles in Buffer.

Trigger Buffer Email When New Post is Published

With the automated publishing of posts and the ability to schedule them, I can concentrate more on the writing part. I no longer have to push out posts manually. I had never thought that I would be scheduling posts for the future, but now that it is happening, it’s a great feeling to have posts for a few weeks ahead all ready to go.

My Morning Routine was the first post to be deployed using this schedule.

For a while now, I have wanted to deploy my blog automatically whenever a new commit is pushed into the associated git repository. I use Octopress as my blog engine and have been tweaking it to fit my blogging workflow. Octopress is a static blog generator built over Jekyll, so any time I make updates to the blog, I need to build it with the accompanying rake tasks and push the generated output (HTML, JavaScript, and CSS) to the Azure Web App that hosts my blog. For this I use the git deployment feature of Web Apps, so just pushing the built output to a git repository (branch) deploys it to my website. As you can see, every time I make a change I have to build the site and push it to the git repository, and this can be automated. Since Octopress is in Ruby, I decided to use Travis CI for the build and deployment.

Local Build Environment with Docker

I am on an older fork of Octopress and have not updated to the latest version, so it has hard dependencies on specific versions of gem packages and on the Ruby and Jekyll versions. Every time I change laptops it’s difficult to set up the blog environment. In the past, I manually installed the dependencies whenever I got a new laptop, and since changing laptops does not happen frequently, I had been putting off scripting this. But now that I had to set up the Travis build environment, I thought of also having a local build environment to test in before pushing up to Travis. Travis provides Docker images that exactly match their build environment.

Setting up Docker is just a few steps:

  1. Install the Docker components
  2. Load the Docker image
docker run -it -p 4000:4000 quay.io/travisci/travis-ruby /bin/bash

Once in the container, you can run the same build scripts that you manually run to deploy and check the site. I had a few issues with the gem packages and fixed them by pinning the package versions. To launch the site hosted in Docker from the host system, I expose the container's port to the host (the -p 4000:4000 flag above). Once I have the local server running in the Docker container (on port 4000), I can access it via localhost:4000 from my host computer.

Post Dates and TimeZones

When building from the container, I noticed that the dates of posts were off by one. Posts dated at the start of a month (like Aug 1) started showing up under July on the archive page. After a bit of investigation, I realized that Jekyll parses the date-time from the post and converts it into the local system time. The container was running in UTC, so when generating the site it converted the post date-times to UTC. All the posts that I had written after coming to Sydney had an offset of +1000 (or +1100) and most were published early in the morning, so they got converted to the previous date. Since I am not that worried about the time zone of a post, I decided to remove it. I removed the timezone information being set for new posts in my rake scripts, and for the existing posts, I removed the timezone information from the date in the YAML front matter. I also set _config.yml to build in UTC irrespective of the system timezone on which the site is built.

Setting up TravisCI

Setting up an automated build on Travis CI is a smooth and easy process. I just added a .travis.yml with ‘rake generate’ as the build script. The post-build script does the following:

  • Clones the current statically generated code from my blog branch.
  • Performs a rake deploy that updates the cloned code with the latest site. I updated the existing rake deploy task to use the GitHub token in the push URL. As I did not want the token to be logged to the Travis console, I redirect the output using &> /dev/null.
language: ruby
rvm:
  - 1.9.3
branches:
  only:
  - master
script:
  bundle exec rake generate;
after_success: |
  if [ -n "$GITHUB_PUSH_URL" ]; then
    cd "$TRAVIS_BUILD_DIR"
    git clone -b blog --single-branch https://github.com/rahulpnath/rahulpnath.com.git _azure &> /dev/null
    bundle exec rake gitdeploy["$GITHUB_PUSH_URL"] &> /dev/null
    echo "Deployed!"
  fi

Every time I make a commit to the GitHub master branch, the automated build triggers and deploys the latest generated site.

Current Blogging Workflow Build Status

Continuous Delivery of Octopress Blog

  • Write posts on my phone or laptop (using Dropbox to sync posts across devices).
  • Publish and push to GitHub from the laptop.
  • Travis build triggered by the GitHub webhook.
  • Travis pushes the generated site back to GitHub (blog branch).
  • Azure Web App triggers an automated deployment from GitHub.

With the automated deployment, I have one less thing to take care of when writing posts. The whole process might feel a bit complicated, but it is not; it is just that I have been tweaking a few things to ease blogging, and since I am a programmer, I like hacking on things. If you are new to blogging, you do not need any of this, so don’t get overwhelmed (if at all you are). All you need to make sure is that you have a blog and that you own the URL.

Yesterday I was late to leave the office, as I had to apply a data fix to some of the systems that we are currently building. We had just migrated a few hundred clients onto the new platform. Invoices generated for the clients had wrong amounts due to some mismatched data used during migration. We had the expected invoices from the old system, which made finding the problem easy. We ran a few scripts to correct the data in the different systems and fixed the issue.

Data Hotfix

WARNING! Normally I do not recommend making any changes directly on a production server. In this case, there was a business urgency and we were forced to do the data fix the same night, for smooth functioning the day after. We still managed to get in some testing in the development environment before running it in production.

It All Starts with a Few

I have repeatedly seen that these kinds of data fixes start with just a few records. Within a short span of time, the affected data set grows drastically, and manual updates are no longer a good solution.

If you have even a second thought about whether to script the fix or not, then you should script it.

Yesterday it started with a data fix for 30 clients, and the fix was relatively small. It could be done either through the UI or the API. Fixing through the UI took around 45 seconds per client, and there were two of us, so it was just a matter of 12-15 minutes to fix. While fixing, one of us found an extra scenario where the same fix needed to be applied. Re-running the query to find such clients blew the number up to 379. At that moment, I stood up and said I was going to script this; there was no way I was doing it manually. Fixing it manually would take about five man-hours, finishing in two and a half hours with the two of us. Writing the script would take around an hour, but that’s just one man-hour.

There is a happiness you get when you script the fix instead of manually plowing through the UI fixing each record.

The script was in C#, written as a test case and invoked from a test runner (which I don’t feel great about now), updating the systems with the data fix. It did its job and fixed all the cases it was supposed to. But I was not happy with the approach I had chosen. Correcting production data through a unit test script does not sound like a robust solution. The reason for choosing tests was that the test project already had all the code required to access the other systems; it was just a matter of changing the configuration values to point to the production system. It was the shortest path to having at least one client updated and verified.

Having it as a test script restricted me from scaling the update process (though I could have done some fancy things to run tests in parallel). It also forced me to hard-code the input data. Logging was harder, and I resorted to Debug.WriteLine to the Visual Studio output window. All of that was the aftermath of choosing the wrong execution method - running it as a test script!

In retrospect, here are a few things that I should have done differently and will do if I am ever in a similar situation again.

Create Stand-alone Executable

Having a stand-alone executable run the script provides the capability to scale up the number of processes as needed. Input can be passed as a file or as an argument to the application, allowing the large data set to be broken into smaller subsets.

Log Error and Success

It’s very much possible that the ‘fix to fix errors’ can itself go wrong or throw exceptions. So handle errors and log appropriate messages so that you can take corrective action. It’s better to log to a file or other durable storage, as that is more foolproof. Logging to the output window (Debug.WriteLine/Console.WriteLine) is not recommended, as there is a risk of accidentally losing the output (with another test run or by closing Visual Studio).

Logging successes is equally important to keep track of fixed records. It helps in cases where the process terminates suddenly while processing a set of data: it gives a record of everything that was successfully processed, so it can be excluded from subsequent runs.

Test

It is very likely that the script has bugs and does not handle all possible cases. So, as with any code, testing the data fix script is mandatory. Preferably test in a development/test environment; if that's not possible, try a small subset of the input in production. In my case, I was able to test in the development environment first, and even then I ran a small subset in production and ended up finding an issue that I could not reproduce in development.

Parallelize if Possible

In cases where the data fixes are independent of each other (which is likely when dealing with large data fixes), the updates can run in parallel. Using non-blocking calls when updating across the network also helps speed up the process, by reducing idle time and improving the overall processing time.

Parameterize Input

Parameterizing the input to the script (console) application helps when you want to scale it out. In my case, updating each client took around 8-10 seconds as it involved calling multiple geographically distributed systems (updating a system in the US from Australia does take a while!). A parameterized application makes it possible to run multiple instances with different input sets updating the data, which speeds up the overall processing time. A minimal sketch pulling these ideas together is shown below.
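
Here is a minimal sketch of what such a stand-alone, parameterized data fix tool could look like, combining the points above: file-based input, success/error logging to durable files, and bounded parallelism. The FixClientAsync method and the file names are placeholders for whatever the actual fix does.

using System;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class DataFixRunner
{
    public static async Task Main(string[] args)
    {
        // Input file with one client id per line, passed as an argument so the
        // data set can be split across multiple running instances.
        var clientIds = File.ReadAllLines(args[0])
            .Where(id => !string.IsNullOrWhiteSpace(id))
            .ToList();

        using (var successLog = new StreamWriter("fixed-clients.log", append: true) { AutoFlush = true })
        using (var errorLog = new StreamWriter("failed-clients.log", append: true) { AutoFlush = true })
        {
            // Limit concurrent updates so the downstream systems are not overwhelmed.
            var throttle = new SemaphoreSlim(10);

            var tasks = clientIds.Select(async clientId =>
            {
                await throttle.WaitAsync();
                try
                {
                    await FixClientAsync(clientId);
                    lock (successLog) successLog.WriteLine(clientId);
                }
                catch (Exception ex)
                {
                    lock (errorLog) errorLog.WriteLine($"{clientId}: {ex.Message}");
                }
                finally
                {
                    throttle.Release();
                }
            });

            await Task.WhenAll(tasks);
        }
    }

    // Placeholder for the actual fix: calls out to the systems that need updating.
    private static Task FixClientAsync(string clientId)
    {
        throw new NotImplementedException();
    }
}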

It’s hard to come up with a solid plan for critical data fixes. It might not be possible to follow all of the points above, and there might be a lot of other things to be done besides these. These are just a few things for reference so that I can stop, take a look, and move on when a similar need arises. Hope this helps someone else too! Drop in a comment if you have any tips for the ‘eleventh hour’ fix!

Humans are creatures of habit, and things work well when made into a routine. What you build into your daily plan defines what you end up achieving in the day and, in turn, in life.

If you are a late night person, you can read ‘morning’ as ‘late night’ - the focus here is routine!

Win your day, create a morning routine

A morning routine is nothing but a sequence of actions regularly followed. I have tried switching my routine to late at night a few times but found that mornings work better for me. This could be different for you, so stick to the time of day that works for you. Before going into what my morning routine looks like (which I started just a week back), I will explain how I made the plan for it.

Brain Dump

At any point in time, there are a lot of things on my mind and things that I keep committing to myself and others. It is not possible to keep up with everything that I wish to do. So the very first thing to do is to dump everything out onto paper and then decide what needs attention. The Incompletion Trigger List helps get everything out of your mind and onto paper. It’s a good idea to block out some time to perform this exercise and give it all the attention it needs. At times it helps to Slow Down to Go Fast.

Slow down, to go faster

Most Important Task (MIT)

If you are following along, I hope the brain dump helped flush out all that was in your mind. This exercise needs to be done occasionally (maybe every 2-3 months) to stay clear and on top of things. After the brain dump, I was sure that I could not do everything on that list. Now comes the hard part of choosing what matters to you and what aligns well with your goals. For the morning routine, I stuck to items from the brain dump that fall under the ‘Personal Projects’ category (as highlighted in the image below).

Choosing tasks for morning routine

These are the items that matter to me and align with the short and long term goals that I have. Below is a part of my list.

Start Youtube channel
Become Pluralsight Author
Blogging
Learn Functional Programming
Learn Ruby
Open Source Contribution
Improving Writing Skills and Language

The key is not to prioritize what’s on your schedule, but to schedule your priorities.

Stephen Covey

Progressing towards all of the items on the list at the same time is not possible, as the time available each day is limited. I usually get around 2-3 hours a day of ‘me time’, provided I wake up at 4 in the morning (more on this shortly). The number of hours you have might differ, and you can choose as many items as you think you can fit in. But three is a good number, as it lets you mix in a few different goals and gives you the flexibility to shuffle between them on a given day. For me, it also means I get roughly 40-60 minutes daily for each item.

Currently, the ones that I have in my morning routine are:

  • Blogging
  • Learn Functional Programming
  • Open Source Contribution

MITs to Mini Habits

Having high-level short and long term goals is good, but it does not provide anything actionable on a daily basis. Approaching them feels overwhelming because they don't give any sense of direction. So it's important that I have small, actionable items that I can work on to progress towards the goal.

Break your goal into the smallest possible task that you can think of, so that you don't feel like skipping it.

For me, the mini habits look like this

  • Write at least one sentence for the blog
  • Read at least one line about Functional Programming
  • Read at least one line of code of an Open Source Project

The idea behind keeping them so small is just to get started. It's very rare that I have stopped writing after one sentence or stopped reading after one line. The trouble is only with getting started - once past that, you can easily carry on for at least 20-30 minutes. Even if I manage 2 out of the 3 tasks above, I take it as a success, which gives me some flexibility each day.

Waking up Tricks

There are days when Resistance beats me, and I don't get up for the routine. I feel low on those days for not being able to progress on my goals, so I try hard to have fewer of them.

  • Alarm Phone Inside Pillow: I use Timely on a spare phone to set alarms. Until a while back, I used to keep the phone at the bedside while sleeping, but I noticed that I often ended up snoozing the alarm, at times without even being fully conscious. So to make sure I wake up to the alarm, I now keep the phone buried inside my pillow with just the vibration on. The vibration forces me to wake up and also removes the need for any alarm sound - my kid and wife don't get disturbed.

  • Wear a Sweater: During winter, at times the cold beat me. It's hard to leave the blankets and wake up to the cold. I started sleeping in a sweater, and I don't feel as cold when I wake up.

  • Rationalize Against Resistance: However hard I try not to rationalize about getting up when the alarm goes off, looking at the snooze button I end up rationalizing anyway. Often I have found that when I try to motivate myself with the tasks I could achieve if I woke up, I end up justifying that they can wait until tomorrow, because there are no hard deadlines or accountability to anyone - it's just me! Now I try just the opposite: think about the Resistance that is trying to keep me in bed and reassure myself that I should not fall for it. The 'me' waking up after sleeping in is not going to like it. So wake up!

My Routine

  1. Wake at 4 am
  2. Brush
  3. Drink Water
  4. Stretching
  5. Review the tasks for the day (Todoist)
  6. Mini Habits (2 or 3)
  7. Wake up wife at 5:45 am
  8. Continue Mini Habits (2 or 3)
  9. Tea
  10. Wake up Gautham at 7 am

Having a morning routine has helped me focus more on things that matter and not wander from one task to another. It has also given a sense of direction to what I do every day, and I spend less time thinking about what to do. I find my days starting gradually rather than rushing in, setting the pace for the day. Hope this helps you too!


It was a busy week with NDC Sydney and a lot of other user group events happening at the same time, since all the international speakers were in town. The conference was three days long with 105 speakers, 37 technologies, and 137 talks. Some of the popular speakers were Scott Hanselman, Jon Skeet, Mark Seemann, Scott Allen, Troy Hunt, and a lot more.

NDC Sydney

Sessions

Each talk was one hour long, and seven talks ran at the same time. The talks that I attended were:

All sessions are recorded and are available here. I hope the NDC Sydney ones will also be up in some time.

Networking

Events like this are a great place to network with other people in the industry, which was one of the reasons I wanted to attend NDC. I am a regular reader of Mark Seemann's (@ploeh) blog, and his ideas resonate with me a lot. I also find his Pluralsight videos and his book, Dependency Injection in .NET, helpful. It was great to meet him in person, and I enjoyed both of his talks on F#.

With Mark Seemann (ploeh)

Sponsors

Most of the event sponsors had their stalls at the conference, promoting their brand (with goodies and t-shirts) and the work they do (a good way to attract talent to the company). There were also raffles for some big prizes like Bose headphones, Das keyboards, drones, coffee machines, Raspberry Pis, etc. I was lucky enough to win a Raspberry Pi 3 from @ravendb.

Won a Raspberry Pi3. Ravendb raffle @ NDCSydney

It’s confirmed that NDC Sydney is coming back next year. If you are in town during that time, make sure you reserve a seat. Look out for the early bird tickets - they are cheaper, and the conference is worth it. Thanks to Readify for sponsoring my ticket; it’s one of the good things about working with Readify.

See you at NDC Sydney next year!

Recently I have been trying to contribute to open source projects, to build the habit of reading other people's code. I chose to start with projects that I use regularly, and AsmSpy is one such project.

AsmSpy is a command line tool to view assembly references. It outputs a list of all conflicting assembly references, that is, where different assemblies in your bin folder reference different versions of the same assembly.

AsmSpy assembly conflicts

I started with an easy issue to get familiar with the code and to confirm that the project owner, Mike Hadlow, accepts Pull Requests (PRs). Mike was fast to approve and merge the changes. There was a feature request to make AsmSpy available as a Chocolatey package. Chocolatey is a package manager for Windows that automates software management. Since AsmSpy is a tool that's not project specific, it makes sense to deliver it via Chocolatey and make installation easier. Mike added me as a project collaborator, which gave me better control over the repository.

Manually Releasing the Chocolatey Package

AsmSpy is currently distributed as a zip package. Chocolatey supports packaging from a URL with the Install-ChocolateyZipPackage PowerShell helper. For the first release, I used this helper to create the Chocolatey package and uploaded it to my account. After fixing a few review comments, the package got published.

choco install asmspy

Automating Chocolatey Releases

Now that I have to manage the AsmSpy Chocolatey package releases, I decided to automate the process of Chocolatey package creation and upload. Since I had already used AppVeyor to automate the ClickOnce deployment of CLAL, I decided to use AppVeyor for this too.

The Goal

I wanted to automatically deploy any new version of the package to Chocolatey. Any time a tagged commit is pushed to the main branch (master), it should trigger a deployment and push the new package to Chocolatey. This gives us the flexibility to control version numbers and decide when we actually want to make a release.

Setting up the Appveyor Project

Since I am now a collaborator on the project, AppVeyor shows the AsmSpy GitHub repository in my AppVeyor account too. Setting up a project is really quick in AppVeyor, and most of it is automatic. Any commit to the repository now triggers an automated build.

Appveyor add new project

After playing around with different AppVeyor project settings and build scripts, I noticed that AppVeyor was no longer triggering builds when commits were pushed to the repository. I tried deleting and re-adding the AppVeyor project, but with no luck.

The AppVeyor team was quick to respond and suggested that the problem could be the webhook URL not being configured in the GitHub repository. The webhook URL for AppVeyor is available under the project's settings. Since I did not have access to the Settings page of the GitHub repository, I reached out to Mike, who promptly added the AppVeyor webhook URL under the GitHub project settings. This fixed the issue of builds not triggering automatically when commits are pushed to the repository.

Github webhook url for appveyor

Creating Chocolatey Package

AppVeyor supports Chocolatey commands out of the box, which makes it easy to create packages on a successful build. I added the nuspec file that defines the Chocolatey package and an after-build script to generate it. AppVeyor exposes environment variables that are set for every build. In the after_build script, I trigger Chocolatey packaging only if the build was triggered by a tagged commit (APPVEYOR_REPO_TAG_NAME). Every build generates the zip package, which can be used to test the current build.

version: 1.0.{build}
build:
  verbosity: minimal
after_build:
- cmd: >-
    7z a asmspy.zip .\AsmSpy\bin\Debug\AsmSpy.exe
    if defined APPVEYOR_REPO_TAG_NAME choco pack .\AsmSpy\AsmSpy.nuspec --version %APPVEYOR_REPO_TAG_NAME%
    if defined APPVEYOR_REPO_TAG_NAME appveyor PushArtifact asmspy.%APPVEYOR_REPO_TAG_NAME%.nupkg -DeploymentName ReleaseNuget
artifacts:
- path: asmspy.zip
  name: Zip Package
- path: '\AsmSpy\bin\*.nupkg'
  name: Nuget Package

Setting up Chocolatey Environment

Since Chocolatey is built on top of the NuGet infrastructure, deploying to it works just like deploying a NuGet package. The NuGet deployment provider publishes packages to a NuGet feed; all you need to provide is the feed URL, the API key, and the package to deploy. I created a NuGet deployment environment with the Chocolatey feed URL, my account API key, and the artifact to deploy.

AppVeyor Chocolatey environment

The project's build settings are configured to deploy to the environment created above for builds triggered by a tagged commit.

deploy:
- provider: Environment
  name: AsmSpy Chocolatey
  on:
    branch: master
    APPVEYOR_REPO_TAG: true

From now on, any tagged commit pushed to the master branch of the repository will trigger a release to Chocolatey. I have not tested this yet as there have been no updates to the tool; I might trigger a test release sometime soon to see if it all works end to end. With this automated deployment, we no longer use the zip URL to download the package in Chocolatey - the exe gets bundled along with the package. There might be some extra build scripts required to support the upgrade scenario for Chocolatey. I will update the post after the first deployment using this new pipeline!

It’s been a year since I moved to Readify in Sydney, Australia. Just as I never believed that one could earn money on the Internet, I never thought that I could easily find a job abroad.

If you are not a developer, this post might not be fully applicable to you. However, the things that helped me might help you too.

How I found a job in Sydney

I was not keen to move abroad with an on-site opportunity from India, given that the overhead of onsite-offshore coordination is a pain. Getting a resident visa for countries like Australia or Canada and finding a job after moving was another option. I didn’t prefer that either, as ending up in a foreign country without a job didn’t look great to me, especially with Gautham. So the only option left was to find an employer who recruits internationally and move on a sponsored work visa. There are a lot of companies looking for people across the world and ready to sponsor a visa. Since there is more money and effort involved in recruiting internationally, I have felt that companies look for something more than just passing an interview process. Here are a few things that helped me find such an employer, and things that I could have done better.

The Third Place

There needs to be a place other than Home and Work where you spend time, often referred to as a Third Place. It could be your blog, Stack Overflow, MSDN forums, GitHub, a podcast, a YouTube channel, social media pages, etc. For me it is primarily this blog, then GitHub, and a bit of the MSDN forums. Having a Third Place increases your chances of landing a job and also acts as a good ‘resume’. I feel a resume is not worth much these days, as you can put whatever you feel like in it and it needs to be validated through an interview. A blog, forum profile, etc. cannot be faked and always speaks to your experience.

There is no reason to believe a resume, as it can always be faked - but a history of events, posts, or articles is hard to fake. A resume should self-validate.

A resume, if it needs to be there at all, should just be a highlight of your experience, with relevant links to your ‘Third Place’. This also helps keep the resume short and clear.

Using Social Media to Your Advantage

Social media is a really powerful way to connect with different people, especially Twitter and LinkedIn. These are good channels to establish relationships with people from different geographies. It’s good to follow and start general conversations with employees of companies that you wish to join. Try to get involved in tweets, messages, or open source projects that they are also involved in. With time, once you become known to them, either you can reach out to them for an opportunity to work together or they might offer you one themselves. Don’t try to fake it or overdo this, as it can affect you adversely. Do this only if you are genuinely interested in what they do.

The Hague - Amsterdam

I landed an interview with eVision through one of my Twitter contacts, Damian Hickey. It just happened that we followed each other, he worked for eVision, and I found the company interesting. One message to him, and two months later I was in Amsterdam attending an interview with them, with an offer of employment a week later. But I ended up not joining them because the visa got delayed for a long period, as I was not fully ‘travel ready’ (more on this below). Still, I enjoyed every bit of the time I spent with ‘Team Tigers’ at eVision.

Reaching out to Companies Directly

Readify was another company that interested me. The company takes pride in its employees and invests a lot in their professional development. The people who were part of the company were another reason I liked Readify - MVPs, book authors, Pluralsight authors, bloggers, musicians, photographers - name it and there was a person with that interest. It is also one of the best consulting companies in Australia. The recruitment process was straightforward and started with the knock-knock challenge, followed by a series of interviews. Everything just fell into place on time, and I ended up joining them and moving to Sydney, Australia on a work visa. Readify is still hiring, and if you are interested and find yourself a match, send me your profile (for no reason but to earn me the referral bonus) - if not, head off to the knock-knock challenge.

Good developers are in great demand, and ‘good’ is relative - you have a place out there; all you need is to reach out!

Look out for companies on LinkedIn or Stack Overflow Careers that sponsor visas, and follow their recruitment process.

Being Travel Ready

As I mentioned above, one of the reasons for not taking up the offer at eVision was the visa getting delayed for a long period. Since all my documents are from India, I had to get them attested and apostilled, which takes around a month. Added to that, neither I nor my wife had a birth certificate (it was not such a common thing in my place when I was born). So I had to first get all the documents and then get them apostilled, which was not going to happen in less than 2-3 months. Even if you have no interest in moving abroad, get all your travel documents ready and keep them handy. The most commonly asked for documents are:

  • Passport
  • Birth certificate
  • Education certificates
  • Work Experience certificates
  • Marriage certificate (if you are married)

Making the Move

One of the biggest challenges I faced while making the move was getting my head around the currency conversion and ensuring that the same standard of living could be maintained after the move. Sites like Expatistan and Numbeo help give an idea of approximate costs. But what worked better for me was not to use the actual currency exchange rate, but to find a ‘Personal Exchange Rate’ (PER) and use that to compare.

Personal Exchange Rate = Disposable Income in Home Country / Disposable Income in Destination Country. Multiply the cost of an item in the destination country by the PER to see how it compares to prices in your home country.
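
A quick worked example with made-up numbers: if your monthly disposable income at home is 50,000 in your home currency and it would be 2,000 in the destination currency after the move, your PER is 50,000 / 2,000 = 25. A coffee that costs 4 in the destination currency then ‘feels like’ 4 x 25 = 100 in your home currency, irrespective of what the official exchange rate says.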

View from my office, Sydney

I moved to Sydney around the same time last year. It was difficult to adjust in the initial days, but soon it all started to fall into a rhythm. It’s been a year with Readify and at my client, and I am enjoying the new experiences.

Often we write, or come across, code that has the business language and the programming language semantics mixed together. This makes it very hard to reason about the code and to fix any issues. It's easier to read code that is composed of smaller individual functions, each doing a single thing.

If you follow the One Level of Abstraction per Function Rule or the Stepdown Rule as mentioned in the book Clean Code (I recommend reading it if you have not already), it is easier to keep the business and programming language semantics separate.

We want the code to read like a top-down narrative. We want every function to be followed by those at the next level of abstraction so that we can read the program, descending one level of abstraction at a time as we read down the list of functions. Making the code read like a top-down set of TO paragraphs is an effective technique for keeping the abstraction level consistent.

Recently, while fixing a bug in one of the applications I am currently working on, I came across code with the business and programming language semantics mixed together. This made it really hard to understand the code and to fix it, so I decided to refactor it a bit before fixing the bug.

Code should be readable

The application is a subscription-based service for renting books, videos, games, etc., and allows customers to have different subscription plans and terms. We are currently migrating away from the custom-built billing module that the application uses to a SaaS-based billing provider, to make invoicing and billing easy and manageable. In code, a Subscription holds a list of SubscriptionTerm items, which specify the different terms that a customer has for a specific subscription. A term typically has a start date, an optional end date, and a price. A null end date indicates that the subscription term is valid for the customer's lifetime in the system.

public class Subscription
{
    public List<SubscriptionTerm> Terms { get; set; }
}

public class SubscriptionTerm
{
    public int Id { get; set; }
    public double Price { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime? EndDate { get; set; }
}

But the new system to which we are migrating does not support subscription terms that overlap each other with a different price. This had to be data fixed manually in the source system, so we decided to perform a validation step before the actual migration. The code below does exactly that and was working fine, until we started seeing that in cases where there was more than one SubscriptionTerm without an end date, and also when the end date of one term was the start date of another, no validation errors were shown.

public bool Validate(Subscription subscription)
{
    var hasOverlappingItems = false;
    foreach (var term in subscription.Terms)
    {
        var otherTerms = subscription.Terms.Where(a => a.Price != term.Price);
        if (otherTerms.Any())
        {
            if (
                (!term.EndDate.HasValue && otherTerms.Any(a => term.StartDate < a.EndDate)) ||
                (otherTerms.Where(a => !a.EndDate.HasValue).Any(a => a.StartDate < term.EndDate)) ||
                (otherTerms.Any(a => term.StartDate <= a.EndDate && a.StartDate <= term.EndDate))
            )
            {
                hasOverlappingItems = true;
                break;
            }
        }
    }

    return hasOverlappingItems;
}

The code, as you can see, is not that readable and is difficult to understand, which increases the chances of me breaking something else while trying to fix it. There were no tests covering this validator, which made it even harder to change. While the algorithm itself for finding overlaps can be improved (maybe a topic for another blog post), we will look at how we can refactor the existing code to improve its readability.

Code is read more than written, so it’s much better to have code optimized for reading

Creating the Safety Net

The first logical thing to do in this case is to protect ourselves with test cases, so that any changes made do not break existing functionality. I came up with the below test cases (the test data shown does not cover all cases) to cover the different possible inputs that this method can take.

[Theory]
[InlineData("10-Jan-2016", "10-Feb-2016", 1, "11-Feb-2016", "10-Dec-2016", 2, false)]
[InlineData("10-Jan-2015", "10-Feb-2015", 1, "20-Jan-2015", "1-Feb-2016", 2, true)]
public void ValidateReturnsExpected(
    string startDate1, string endDate1, double price1,
    string startDate2, string endDate2, double price2,
    bool expected )
{
    // Fixture setup
    var subscription = new Subscription();
    var term1 = createTerm(startDate1, endDate1, price1);
    var term2 = createTerm(startDate2, endDate2, price2);
    subscription.Terms.Add(term1);
    subscription.Terms.Add(term2);
    // Exercise system
    var sut = new OverlappingSubscriptionTermWithConflictingPriceValidator();
    var actual = sut.Validate(subscription);
    // Verify outcome
    Assert.Equal(expected, actual);
    // Teardown
}

All the tests pass, except those covering the scenarios that were causing issues in the destination system, which I was about to fix.

Refactoring for Readability

Now that I have some tests to back up the changes I am about to make, I feel more confident doing the refactoring. Looking at the original validator code, all I see is DATETIME - there is a lot of date manipulation happening, which strongly indicates there is an abstraction waiting to be pulled out. We saw in Thinking Beyond Primitive Values: Value Objects that any time we use a primitive type, we should think more about the choice of type, and that properties that co-exist (like a date range) should be pulled out as Value Objects. The StartDate and EndDate in SubscriptionTerm fall exactly into that category.

public class DateRange
{
    public DateTime StartDate { get; private set; }

    public DateTime? EndDate { get; private set; }

    public DateRange(DateTime startDate, DateTime? endDate)
    {
        if (endDate.HasValue && endDate.Value < startDate)
            throw new ArgumentException("End date cannot be less than start Date");

        StartDate = startDate;
        EndDate = endDate;
    }
}

Since these properties are used in a lot of other places, I did not want to make a breaking change by deleting the existing properties and replacing them with the new DateRange class. So I chose to add a new read-only property, TermPeriod, to SubscriptionTerm, which returns a DateRange constructed from its start and end dates, as shown below.

public DateRange TermPeriod
{
    get
    {
        return new DateRange(StartDate, EndDate);
    }
}

From the existing validator code, what we are essentially trying to check is whether any SubscriptionTerms for a subscription overlap, i.e. whether one TermPeriod falls in the range of another. Introducing an IsOverlapping method on DateRange, to check if it overlaps with another DateRange, seems logical at this stage. I added a few test cases to protect myself while implementing the IsOverlapping method in the DateRange class, including tests to cover the failure scenarios seen before.

Tests for IsOverlapping
[Theory]
[InlineData("10-Jan-2016", "10-Feb-2016", "11-Feb-2016", "10-Dec-2016", false)]
[InlineData("10-Jan-2015", "10-Feb-2015", "20-Jan-2015", "1-Feb-2016", true)]
[InlineData("10-Jan-2015", null, "20-Jan-2016", null,  true)]
[InlineData("28-Jan-16", "10-Mar-16", "10-Mar-16", null, true)]
public void OverlappingDatesReturnsExpected(
    string startDateTime1,
    string endDateTime1,
    string startDateTime2,
    string endDateTime2,
    bool expected)
{
    // Fixture setup
    var range1 = CreateDateRange(startDateTime1, endDateTime1);
    var range2 = CreateDateRange(startDateTime2, endDateTime2);
    // Exercise system
    var actual = range1.IsOverlapping(range2);
    // Verify outcome
    Assert.Equal(expected, actual);
    // Teardown
}
IsOverlapping in DateRange
public bool IsOverlapping(DateRange dateRange)
{
    if (!EndDate.HasValue && !dateRange.EndDate.HasValue)
        return true;

    if (!EndDate.HasValue)
        return StartDate <= dateRange.EndDate;

    if (!dateRange.EndDate.HasValue)
        return dateRange.StartDate <= EndDate;

    return StartDate <= dateRange.EndDate
        && dateRange.StartDate <= EndDate;
}

Given two DateRanges, I can now tell whether they overlap, which can in turn be used to check if two SubscriptionTerms overlap - I just need to check if their TermPeriods overlap. The validator code is now much easier to understand.

IsOverlapping in SubscriptionTerm
public bool IsOverlapping(SubscriptionTerm term)
{
    return TermPeriod.IsOverlapping(term.TermPeriod);
}
Validator after Refactoring
public bool Validate(Subscription subscription)
{
    foreach (var term in subscription.Terms)
    {
        var termsWithDifferentPrice = subscription.Terms.Where(a => a.Price != term.Price);
        // Keep checking the remaining terms instead of returning on the first iteration.
        if (termsWithDifferentPrice.Any(a => a.IsOverlapping(term)))
            return true;
    }

    return false;
}

The code now reads as a set of TO Paragraphs as mentioned in the book Clean Code.

To check if a subscription is valid, check if the subscription has overlapping SubscriptionTerms with a conflicting price. To check if two subscription terms overlap, check if their term periods overlap each other. To check if two term periods overlap, check if the start date of one is before the end date of the other.

Readability of code is important and something we should strive for. The above is just one example of why code readability matters and how it helps us in the long run: it makes maintaining code much easier. Following some basic guidelines, like One Level of Abstraction per Function, allows us to write more readable code. Separating code into small, readable functions covers just one aspect of readability; there are a lot of other practices mentioned in the book The Art of Readable Code. The sample code with all the tests and the validator is available here.

In computing, a newline, also known as a line ending, end of line (EOL), or line break, is a special character or sequence of characters signifying the end of a line of text and the start of a new line. The actual codes representing a newline vary across operating systems, which can be a problem when exchanging text files between systems with different newline representations.

I was using a resource (resx) file to store a large block of comma-separated values (CSV). This key-value text represented the mapping of product codes between an old and a new system. In code, I split the whole text using Environment.NewLine and then by comma to generate the map, as shown below.

AllMappings = Resources.UsageMap
    .Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
    .Select(s => s.Split(new[] { ',' }))
    .ToDictionary(item => item[0], item => item[1]);

It all worked fine on my machine and even on other team members' machines. There was no reason to doubt this piece of code, until in the development environment we noticed that the mapped value in the destination system was always null.

Analyzing the Issue

Since all the other values in the destination system were getting populated as expected, except for this mapping, it was easy to narrow the problem down to the class that returned the mapping value. Initially, I thought this was an issue with the resource file not getting bundled properly. I used dotPeek to decompile the application and verified that the resource file was getting bundled properly and had exactly the same text (visually) as expected.

Resource file disassembled in dotPeek

I copied the resource file text from the disassembled code in dotPeek into Notepad2 (configured to show line endings), and everything started falling into place. The resource text in the build-generated assembly had lines ending with LF (\n), while the one on our development machines had CRLF (\r\n). All the machines, including the build machines, run Windows, and the expected value of Environment.NewLine is CRLF - a string containing "\r\n" for non-Unix platforms, or a string containing "\n" for Unix platforms.

Difference between build generated and development machine resource file

Finding the Root Cause

We use git for our source control, configured to use 'auto' line endings at the repository level. This ensures that the source code, when checked out, matches the line ending format of the machine. We use Bamboo for our builds, with the build servers running Windows. The checked-out files on the build server had LF line endings, which in turn got compiled into the assembly.

The checkout step in Bamboo used the built-in git plugin (JGit), which has certain limitations; it's recommended to use native git to get the full set of git features. JGit also has a known issue with line endings on Windows and checks files out with LF endings. So whenever the source code was checked out on the build server, all line endings in the file ended up as LF before compilation. The resource file therefore had LF line endings in the assembly, and the code could no longer find Environment.NewLine (\r\n) to split on.

Possible Fixes

There are two possible ways to fix this issue:

  • Switch to using native git in the Bamboo build process.
  • Use LF to split the text and trim any excess characters. This removes the dependency on line ending variations and settings between machines, at least until we hit a machine with an entirely different line ending format.

I chose to use LF to split the text and trim any additional characters, while also updating Bamboo to use native git for checkout.

AllMappings = Resources.UsageMap
    .Split(new string[] {"\n"}, StringSplitOptions.RemoveEmptyEntries)
    .Select(s => s.Split(new[] { ',' }))
    .ToDictionary(item => item[0].Trim().ToUpper(), item => item[1].Trim());

Protecting Against Line Endings

The easiest and fastest way this would have come to my notice was to have a unit test in place, which would have failed on the build machine. A test like the one below passes on my local machine but not on the build machine, as UsageMap would not return any value for the destination system.

[Theory]
[InlineData("MovieWeek", "Weekly-Movie")]
[InlineData("Dell15", "Laptop-Group3")]
public void SutReturnsExpected(string sourceSystemCode, string expected)
{
    var sut = new UsageMap();
    var actual = sut.GetDestinationCode(sourceSystemCode);
    Assert.Equal(expected, actual);
}

Since there are different systems with different line endings, and applications with their own line ending settings and issues, there does not seem to be a single fix for all cases. The best I can think of is to protect ourselves with unit tests like this: they fail fast and bring the problem immediately to our notice. Have you ever had to deal with a line endings issue and found better ways to handle it?