Yesterday I was late leaving the office, as I had to apply data fixes to some of the systems that we are currently building. We had just migrated a few hundred clients onto the new platform. Invoices generated for these clients had wrong amounts due to mismatched data used during migration. We had the expected invoices from the old system, which made finding the problem easy. We ran a few scripts to correct the data in the different systems and fixed the issue.
It All Starts with a Few
I have seen it happen repeatedly that these kinds of data fixes start with just a few records. Within a short span of time the affected data size grows drastically, and manual updates stop being a good solution.
If you have second thoughts about whether to script the fix or not, then you should script it.
Yesterday it started with a data fix for 30 clients, and the fix was relatively small. It could be done either through the UI or the API. Fixing through the UI took around 45 seconds each, and there were two of us, so it was just a matter of 12-15 minutes. While fixing, one of us found an extra scenario where the same fix needed to be applied. Re-running the query to find such clients ballooned the number to 379. At this moment, I stood up and said I was going to script this - there was no way I was doing it manually. Fixing it manually would take five man-hours, though with two of us it would finish in two and a half hours. Writing the script would take around an hour, but that’s just one man-hour.
There is a happiness you get when you script the fix instead of manually plowing through the UI, fixing each one.
The script was in C#, written as a test case and invoked from a test runner (which I don’t feel great about now), updating the systems with the data fix. It did its job and fixed all the cases it was supposed to. But I was not happy with the approach I had chosen. Correcting production data through a unit test script does not sound like a robust solution. The reason for choosing tests was that the test project already had all the code required to access the other systems - it was just a matter of changing configuration values to point to the production system. It was the shortest path to having at least one client updated and verified.
Having it as a test script restricted me from scaling the update process (though I could have done some fancy things to run tests in parallel). It also forced me to hard-code the input data. Logging was harder, and I used Debug.WriteLine to write to the VS output window. All of those were the aftermath of choosing the wrong execution method - running it as a test script!
In retrospect, here are a few things that I should have done differently and should be doing if I am ever in a similar situation again.
Create Stand-alone Executable
Having a stand-alone executable run the script provides the capability to scale out the number of processes as needed. Input can be passed as a file or as an argument to the application, allowing a large data set to be broken into smaller subsets.
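A minimal sketch of such an executable might look like this (the type and method names here are assumptions for illustration, not the actual code):

```csharp
// Hypothetical sketch of a stand-alone fixer console application.
using System;
using System.IO;

class DataFixRunner
{
    static void Main(string[] args)
    {
        // Input file path passed as an argument: one client id per line,
        // so a large data set can be split into smaller files and run
        // as several independent processes.
        var clientIds = File.ReadAllLines(args[0]);
        foreach (var clientId in clientIds)
        {
            FixClient(clientId);
        }
    }

    static void FixClient(string clientId)
    {
        // Call the affected systems and apply the fix for this client.
    }
}
```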
Log Error and Success
It’s very much possible that the ‘fix-to-fix-errors’ script can itself go wrong or throw exceptions. So handle errors and log appropriate messages to enable corrective action. It’s better to log to a file or other durable storage, as that is more foolproof. Logging to the output window (Debug.WriteLine/Console.WriteLine) is not recommended, as there is a risk of accidentally losing it (with another test run or by closing VS).
Logging successes is equally important, to keep track of fixed records. It helps in cases where the process terminates suddenly while processing a set of data: it gives a record of all data sets that were successfully processed, so they can be excluded from subsequent runs.
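The error and success logging could be sketched like this (the file names and the FixClient helper are assumptions, not the actual implementation):

```csharp
// Sketch: append successes and failures to durable log files, so that
// completed ids can be excluded from any re-run after a crash.
try
{
    FixClient(clientId);
    File.AppendAllText("success.log", clientId + Environment.NewLine);
}
catch (Exception ex)
{
    File.AppendAllText("error.log",
        $"{clientId}: {ex.Message}{Environment.NewLine}");
}
```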
Test the Script
It is very likely that the script has bugs and does not handle all possible cases. So, as with any code, testing the data fix script is mandatory. Preferably test in a development/test environment; if that is not possible, try a small subset of the input in production. In my case, I was able to test in the development environment first. But I still ran a small subset in production and ended up finding an issue that I could not reproduce in development.
Parallelize if Possible
In cases where the data fixes are independent of each other (which is likely when dealing with large data fixes), each update can run in parallel. Using non-blocking calls when updating across the network also helps, by reducing idle time and improving the overall processing time.
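When the fixes are independent, the loop can be parallelized, for example like this (the degree of parallelism is an arbitrary assumption; tune it for the target systems):

```csharp
// Sketch: run independent per-client fixes in parallel, capping the
// number of concurrent updates so the downstream systems are not flooded.
using System.Threading.Tasks;

Parallel.ForEach(
    clientIds,
    new ParallelOptions { MaxDegreeOfParallelism = 8 },
    clientId => FixClient(clientId));
```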
Parameterize the Input
Parameterizing the input to the script (console) application helps when you want to scale it. In my case, updating each client took around 8-10 seconds, as it involved calling multiple geographically distributed systems (updating a system in the US from Australia does take a while!). A parameterized application makes it possible to run multiple instances with different input sets, speeding up the overall processing.
It’s hard to come up with a solid plan for critical data fixes, and it might not be possible to follow all of the points above. There might also be a lot of other things to be done beyond these. These are just a few things for reference, so that I can stop, take a look, and move on when a similar need arises. Hope this helps someone else too! Drop in a comment if you have any tips for the ‘eleventh hour’ fix!
Humans are creatures of habit, and things work well when made into a routine. It’s what you build into your daily plan that defines what you end up achieving in the day, and in turn in life.
If you are a late night person, you can read ‘morning’ as ‘late night’ - the focus here is routine!
A morning routine is nothing but a sequence of actions regularly followed. I have tried switching my routine to late night a few times, but found that mornings work better for me. This could be different for you, so stick to the time of day that works for you. Before getting to what my morning routine looks like (which I started just a week back), I will explain how I made the plan for it.
At any point in time, there are a lot of things on my mind - things I keep committing to myself and to others. It is not possible to keep up with everything that I wish to do. So the very first thing to do is to dump everything out onto paper and then decide what needs attention. The Incompletion Trigger List helps get everything out of your mind and onto paper. It’s a good idea to block out some time to perform this exercise and give it all the attention it needs. At times it helps to Slow Down to Go Fast.
Most Important Task (MIT)
If you are following along, hopefully the brain dump helped to flush out all that was in your mind. This exercise needs to be done occasionally (maybe every 2-3 months) to stay clear and on top of things. After the brain dump, I was sure that I could not do everything on that list. Now comes the hard part: choosing what matters to you and what aligns well with your goals. For the morning routine, I stuck to items from the brain dump that fall under the ‘Personal Projects’ category.
These are the items that matter to me and align with my short- and long-term goals.
The key is not to prioritize what’s on your schedule, but to schedule your priorities.
– Stephen Covey
Progressing towards all of the items on the list at the same time is not possible, as the time available each day is limited. I usually get around 2-3 hours a day of ‘me time’, provided I wake up at 4 in the morning (more on this shortly). The number of hours you have might differ, and you can choose as many items as you think you can fit in. But 3 is a good number, as it helps mix in a few different goals and gives the flexibility to shuffle between them on any given day. For me, it also means I get roughly 40-60 minutes daily for each item.
Currently, the ones that I have in my morning routine are:
- Blogging
- Learn Functional Programming
- Open Source Contribution
MITs to Mini Habits
Having high-level short- and long-term goals is good, but does not provide anything actionable on a daily basis. It feels overwhelming to approach them because they give no sense of direction. So it’s important that I have small actionable items that I can work on daily to progress towards the goal.
Break your goal into the smallest possible task that you can think of, so that you don’t feel like skipping it
For me, the mini habits look like this:
- Write at least one sentence for the blog
- Read at least one line about Functional Programming
- Read at least one line of code of an Open Source Project
The idea behind keeping it so small is just to get started. It’s very rare that I have stopped writing after one sentence or stopped reading after one line. The trouble is only with getting started - once past that, you can easily carry on for at least 20-30 minutes. Even if I complete only 2 of the 3 tasks above, I take it as a success, which gives me some flexibility each day.
Waking up Tricks
There are days when Resistance beats me to it, and I don’t get up for the routine. But I feel low on those days for not being able to progress on my goals, so I try hard to have fewer of them.
Alarm Phone Inside Pillow: I use Timely on a spare phone to set alarms. Until a while back, I used to keep the phone at the bedside while sleeping. But I noticed that I often ended up snoozing the alarm, at times without even being fully conscious. So, to make sure I wake up to the alarm, I now keep the phone buried inside my pillow with only vibration on. The vibration forces me to wake up and also removes the need for any alarm sound - my wife and kid don’t get disturbed.
Wear a Sweater: During winter, at times the cold beat me to it. It’s hard to leave the blankets and wake up to the cold. I started sleeping in a sweater, and now I don’t feel that cold when I wake up.
Rationalize Against Resistance: However hard I try not to rationalize when the alarm sounds, looking at the snooze button I end up doing it anyway. Often, when I try to motivate myself with the tasks I could achieve if I woke up, I end up justifying that they can wait until tomorrow - there are no hard deadlines or accountability to anyone; it’s just me! Now I try the opposite: I think about the Resistance that is trying to keep me in bed and reassure myself that I should not give in to it. The ‘me’ that wakes up after sleeping in is not going to like it. So wake up!
My Morning Routine
- Wake at 4 am
- Drink Water
- Review the tasks for the day (Todoist)
- Mini Habits (2 or 3)
- Wake up wife at 5:45 am
- Continue Mini Habits (2 or 3)
- Wake up Gautham at 7 am
Having a morning routine has helped me focus on the things that matter and not wander from one task to another. It has also set a sense of direction for what I do every day, and I spend less time thinking about what to do. My days now start gradually instead of in a rush, which sets the pace for the rest of the day. Hope this helps you too!
It was a busy week with NDC Sydney and a lot of user group events happening at the same time, since all the international speakers were in town. The conference was three days long, with 105 speakers, 37 technologies, and 137 talks. Some of the popular speakers were Scott Hanselman, Jon Skeet, Mark Seemann, Scott Allen, Troy Hunt, and a lot more.
Each talk was one hour long, with seven talks happening at the same time. The talks that I attended were:
- Keynote: “If I knew then what I know now…” – Teaching Tomorrow’s Web to Yesterday’s Programmer
- Stairway to Cloud: Orleans Framework for building Halo-scale systems
- 50 Shades of AppSec
- ASP.NET Core Kestrel: Adventures in building a fast web server
- Domain Architecture Isomorphism and the Inverse Conway Maneuver
- Left early from Building SOLID ASP.NET Core 1.0 Apps to attend Alt.net user group
- Making Hacking Child’s Play
- A cloud architecture – Azure from the bottom up
- Functional Architecture: the Pits of Success
- Moved to Lightning talks after getting bored with Let’s talk auth
- Building Reactive Services using Functional Programming
- Microtesting: How We Set Fire To The Testing Pyramid While Ensuring Confidence
- What does an “Open Source Microsoft Web Framework” look like
- Accessing the Google Cloud Platform with C#
- One kata, three languages
- Head to Head: Scott Allen and Jon Skeet
- Deploying and Scaling Microservices
- The Experimentation Mindset
- C# 7
All sessions are recorded and are available here. I hope the NDC Sydney ones will also be up in some time.
Events like this are a great place to network with other people in the industry, which was one of the reasons I wanted to attend NDC. I am a regular reader of Mark Seemann’s (@ploeh) blog, and his ideas resonate with me a lot. I also find his Pluralsight videos and his book, Dependency Injection in .NET, helpful. It was great to meet him in person, and I enjoyed both of his talks on F#.
Most of the event sponsors had their stalls at the conference, spreading their brand (with goodies and t-shirts) and showcasing the work they do (a good way to attract talent to the company). There were also raffles for some big prizes like Bose headphones, Das keyboards, drones, coffee machines, Raspberry Pis, etc. I was lucky enough to win a Raspberry Pi 3 from @ravendb.
It’s confirmed that NDC Sydney is coming back next year. If you are in town during that time, make sure you reserve a seat. Look out for the early bird tickets - those are cheaper, and the conference is worth it. Thanks to Readify for sponsoring my tickets; it’s one of the good things about working with Readify.
See you at NDC Sydney next year!
Recently I have been trying to contribute to open source projects, to build the habit of reading others’ code. I chose to start with projects that I use regularly. AsmSpy is one such project.
AsmSpy is a command-line tool to view assembly references. It outputs a list of all conflicting assembly references - that is, cases where different assemblies in your bin folder reference different versions of the same assembly.
I started with an easy issue, to get familiar with the code and to confirm that the project owner, Mike Hadlow, accepts Pull Requests (PRs). Mike was fast to approve and merge the changes. There was a feature request to make AsmSpy available as a Chocolatey package. Chocolatey is a package manager for Windows that automates software management. Since AsmSpy is a tool that’s not project specific, it makes sense to deliver it via Chocolatey and make installation easier. Mike added me as a project collaborator, which gave me better control over the repository.
Manually Releasing the Chocolatey Package
AsmSpy is currently distributed as a zip package. Chocolatey supports packaging from a URL with a PowerShell script Install-ChocolateyZipPackage. For the first release I used this helper script to create the Chocolatey package and uploaded it to my account. After fixing a few review comments the package got published.
Automating Chocolatey Releases
Now that I have to manage the AsmSpy Chocolatey package, I decided to automate the process of package creation and upload. Since I had used AppVeyor before, for automating the ClickOnce deployment of CLAL, I decided to use AppVeyor for this too.
I wanted to automatically deploy any new version of the package to Chocolatey. Any time a tagged commit is made in the main branch (master) it should trigger a deployment and push the new package to Chocolatey. This will give us the flexibility to control version numbers and decide when we actually want to make a release.
Setting up the AppVeyor Project
Since I am now a collaborator on the project, AppVeyor shows the AsmSpy GitHub repository in my AppVeyor account too. Setting up a project is really quick in AppVeyor, and most of it is automatic. Any commit to the repository now triggers an automated build.
After playing around with different AppVeyor project settings and build scripts, I noticed that AppVeyor was no longer triggering builds when commits were pushed to the repository. I tried deleting and re-adding the AppVeyor project, but with no luck.
The AppVeyor team was quick to respond and suggested a possible problem: the webhook URL was not configured under the GitHub repository. The webhook URL for AppVeyor is available under the project’s settings. Since I did not have access to the Settings page of the GitHub repository, I reached out to Mike, who promptly updated the webhook URL under the GitHub project settings. This fixed the issue of builds not triggering automatically when commits were pushed.
Creating Chocolatey Package
AppVeyor supports Chocolatey commands out of the box, which makes it easy to create packages on a successful build. I added the nuspec file that defines the Chocolatey package and an after-build script to generate it. AppVeyor exposes environment variables that are set for every build. In the ‘after_build’ script, I trigger Chocolatey packaging only if the build was triggered by a tagged commit (APPVEYOR_REPO_TAG_NAME). Every build generates the zip package, which can be used to test the current build.
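The relevant appveyor.yml pieces might look roughly like this (a sketch - the file names and layout are assumptions, and the repository’s actual build script may differ):

```yaml
# Hypothetical sketch of the after-build packaging step.
after_build:
  # Only pack a Chocolatey package for tagged builds; the tag name
  # doubles as the package version.
  - ps: >-
      if ($env:APPVEYOR_REPO_TAG -eq "true") {
        choco pack AsmSpy.nuspec --version $env:APPVEYOR_REPO_TAG_NAME
      }

artifacts:
  # Publish any generated .nupkg as a build artifact for deployment.
  - path: '*.nupkg'
    name: ChocolateyPackage
```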
Setting up Chocolatey Environment
Since Chocolatey is built on top of the NuGet infrastructure, it supports deployment just like a NuGet package. The NuGet deployment provider publishes packages to a NuGet feed; all you need to provide is the feed URL, the API key, and the package to deploy. I created a NuGet deployment environment with the Chocolatey feed URL, my account API key, and the artifact to deploy.
The project’s build settings are configured to deploy to the environment created above whenever a build is triggered by a tagged commit.
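An equivalent appveyor.yml deployment section could be sketched like this (the encrypted API key is a placeholder, and the exact settings are assumptions):

```yaml
# Hypothetical sketch: deploy the packaged artifact to the Chocolatey
# feed only for tagged builds.
deploy:
  - provider: NuGet
    server: https://push.chocolatey.org/
    api_key:
      secure: <encrypted-api-key>
    artifact: /.*\.nupkg/
    on:
      appveyor_repo_tag: true
```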
From now on, any tagged commit pushed to the master branch of the repository will trigger a release to Chocolatey. I have not tested this yet, as there have been no updates to the tool; I might trigger a test release sometime soon to see if it all works end to end. With this automated deployment, we no longer use the zip URL to download the package in Chocolatey - the exe gets bundled along with the package. There might be some extra build scripts required to support the upgrade scenario for Chocolatey. I will update the post after the first deployment using this new pipeline!
If you are not a developer this post might not be fully applicable to you. However the things that helped me might help you too.
I was not keen on moving abroad with an on-site opportunity from India, given that the overhead of onsite-offshore coordination is a pain. Getting a resident visa for countries like Australia, Canada, etc. and finding a job after moving was another option. I didn’t prefer that either, as ending up in a foreign country without a job didn’t look great to me, especially with Gautham. So the only option left was to find an employer who recruits internationally and move with a sponsored work visa. There are a lot of companies looking for people across the world that are ready to sponsor a visa. Since there is more money and effort involved in recruiting internationally, I have felt that companies look for something more than just passing an interview. Here are a few things that helped me find such an employer, and things that I could have done better.
The Third Place
There needs to be a place other than Home and Work where you spend time - this is often referred to as a Third Place. It could be your blog, Stack Overflow, MSDN forums, GitHub, a podcast, a YouTube channel, social media pages, etc. For me, it is primarily this blog, then GitHub, and a bit of the MSDN forums. Having a Third Place increases your chances of landing a job and also acts as a good ‘resume’. I feel a resume is not worth much these days, as you can put whatever you like in it, and it needs to be validated through an interview anyway. A blog, a forum profile, etc. cannot be faked and always speak to your experience.
There is no reason to believe a resume; it can always be faked - but a history of events, posts, or articles is hard to fake. A resume should self-validate.
A resume, if it needs to exist at all, should be just a highlight of your experiences, with relevant links to your ‘Third Place’. This also helps keep the resume short and clear.
Using Social Media to Your Advantage
Social media is a really powerful way to connect with different people, especially Twitter and LinkedIn. These are good channels for establishing relationships with people from different geographies. It’s good to follow and start general conversations with employees of companies that you wish to join. Try to get involved in any tweets, messages, or open source projects that they are involved in. With time, once you become known to them, you can either reach out for an opportunity to work together or they might offer you one themselves. Don’t try to fake it or overdo this, as it can affect you adversely. Do this only if you are genuinely interested in what they do.
For me, I landed an interview with eVision through one of my Twitter contacts, Damian Hickey. It just happened that we followed each other, he worked for eVision, and I found the company interesting. Just a message to him, and two months later I was in Amsterdam attending an interview with them, with an offer of employment a week later. But I ended up not joining them, because the visa got delayed for a long period as I was not fully ‘travel ready’ (more on this below). Still, I enjoyed every bit of the time I spent with ‘Team Tigers’ at eVision.
Reaching out to Companies Directly
Readify was another company that interested me. The company takes pride in its employees and invests a lot in their professional development. The different people that were part of the company were another reason I liked Readify - MVPs, book authors, Pluralsight authors, bloggers, musicians, photographers - name it and there was a person with that interest. It is also one of the best consulting companies in Australia. The recruitment process was straightforward; it all started with the knock-knock challenge, followed by a series of interviews. Everything fell into place on time, and I ended up joining them and moving to Sydney, Australia on a work visa. Readify is still hiring, and if you are interested and find yourself a match, send me your profile (for no reason but to earn me the referral bonus) - if not, head off to the knock-knock challenge.
Good developers are in great demand, and ‘good’ is relative - you have a place out there; all you need to do is reach out!
Look out for companies on LinkedIn or Stack Overflow Careers that sponsor visas, and follow their recruitment process.
Being Travel Ready
As I mentioned above, one of the reasons for not taking up the offer at eVision was the visa getting delayed for a long period. Since all my documents are from India, I had to get them attested and apostilled, which takes around a month. Added to that, neither I nor my wife had a birth certificate (it was not such a common thing in my place when I was born). So I had to first get all the documents and then get them apostilled, which would take at least 2-3 months. Even if you have no interest in moving abroad, get all your travel documents ready. The most commonly asked-for documents are:
- Birth certificate
- Education certificates
- Work Experience certificates
- Marriage certificate (if you are married)
Making the Move
One of the biggest challenges I faced while making the move was getting around the currency conversion and ensuring that the same standard of living could be maintained after moving. Sites like Expatistan, Numbeo, etc. help give an idea of approximate costs. But what worked better for me was to not use the actual currency exchange rate, but to find a ‘Personal Exchange Rate (PER)’ and use that to compare.
Personal Exchange Rate is your Disposable Income in the Home Country divided by your Disposable Income in the Destination Country. Multiply the cost of an item in the destination country by the PER to see how it compares to prices in your country.
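As a worked example with made-up numbers (all figures below are assumptions for illustration, not my actual income):

```csharp
// PER = disposable income at home / disposable income in the destination.
double homeDisposable = 60000;        // e.g. INR per month (assumed)
double destinationDisposable = 3000;  // e.g. AUD per month (assumed)
double per = homeDisposable / destinationDisposable; // 20: 1 AUD 'feels like' 20 INR

// A 4.50 AUD coffee then 'feels like' 90 INR to you,
// regardless of what the market exchange rate says.
double coffeeFeelsLike = 4.5 * per;
```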
I moved to Sydney last year around the same time. It was difficult to adjust during the initial days, but soon it all started to fall into a rhythm. It’s been a year with Readify and at my client, and I am enjoying the new experiences.
Often we write, or come across, code that has the business language and the programming language semantics mixed together. This makes it very hard to reason about the code, and also to fix any issues. It’s easier to read code that is composed of smaller individual functions, each doing a single thing.
If you follow the One Level of Abstraction per Function Rule or the Stepdown Rule as mentioned in the book Clean Code (I recommend reading it if you have not already), it is easier to keep the business and programming language semantics separate.
We want the code to read like a top-down narrative. We want every function to be followed by those at the next level of abstraction so that we can read the program, descending one level of abstraction at a time as we read down the list of functions. Making the code read like a top-down set of TO paragraphs is an effective technique for keeping the abstraction level consistent.
Recently, while fixing a bug in one of the applications I am currently working on, I came across code with the business and programming language semantics mixed together. This made it really hard to understand the code and fix it. So I decided to refactor it a bit before fixing the bug.
The application is a subscription-based service for renting books, videos, games, etc., and enables customers to have different subscription plans and terms. Currently, we are migrating away from the custom-built billing module that the application uses to a SaaS-based billing provider, to make invoicing and billing easy and manageable. In code, a Subscription holds a list of SubscriptionTerm items that specify the different terms a customer has for the specific subscription. A term typically has a start date, an optional end date, and a price for that specific term. A null end date indicates that the subscription term is valid throughout the customer’s lifetime in the system.
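A rough sketch of the model described above might look like this (an assumption for illustration; the actual classes in the codebase differ):

```csharp
using System;
using System.Collections.Generic;

// A term has a start date, an optional end date, and a price.
// A null EndDate means the term lasts for the customer's lifetime.
public class SubscriptionTerm
{
    public DateTime StartDate { get; set; }
    public DateTime? EndDate { get; set; }
    public decimal Price { get; set; }
}

public class Subscription
{
    public List<SubscriptionTerm> Terms { get; set; }
}
```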
But the new system to which we are migrating does not support subscription terms that overlap each other with different prices. This had to be data-fixed manually in the source system, so we decided to perform a validation step before the actual migration. The validator code did exactly that and was working fine, until we noticed there were no validation errors shown for cases with more than one SubscriptionTerm without an end date, and for cases where the end date of one term was the start date of another.
The original code was not that readable and was difficult to understand, which increased the chances of me breaking something else while trying to fix it. There were no tests covering this validator, which made it even harder to change. While the algorithm itself for finding overlaps can be improved (maybe a topic for another blog post), we will look into how to refactor the existing code to improve its readability.
Code is read more than written, so it’s much better to have code optimized for reading
Creating the Safety Net
The first logical thing to do in this case is to protect ourselves with test cases, so that any changes made do not break existing functionality. I came up with a set of test cases (the test data did not cover every case) for the different possible inputs this method can take.
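The shape of those safety-net tests might look something like this (a sketch - the test framework, names, and validator API are all assumptions, not the actual tests):

```csharp
// Hypothetical xUnit-style data-driven test over the validator.
[Theory]
[MemberData(nameof(SubscriptionTermCases))]
public void ValidateFlagsOverlappingTermsWithConflictingPrices(
    SubscriptionTerm[] terms, bool expectedValid)
{
    var subscription = new Subscription { Terms = terms.ToList() };

    var result = validator.Validate(subscription);

    // Cases include: disjoint terms, overlapping terms with different
    // prices, multiple open-ended (null EndDate) terms, and terms where
    // one term's end date equals another's start date.
    Assert.Equal(expectedValid, result.IsValid);
}
```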
All tests passed, except for the failing scenarios described above - the ones I was about to fix.
Refactoring for Readability
Now that I have some tests to back up the changes I am about to make, I feel more confident doing the refactoring. Looking at the original validator code, all I see is DATETIME - there is a lot of date manipulation happening, which strongly indicates there is some abstraction waiting to be pulled out. We saw in Thinking Beyond Primitive Values: Value Objects that any time we use a primitive type, we should think more about the choice of type, and that properties that co-exist (like a DateRange) should be pulled out as Value Objects. The StartDate and EndDate in SubscriptionTerm fall exactly into that category.
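A minimal sketch of such a value object might look like this (illustrative; the actual class has more to it):

```csharp
using System;

// Value object bundling the co-existing start/end dates.
// A null EndDate represents an open-ended range.
public class DateRange
{
    public DateTime StartDate { get; }
    public DateTime? EndDate { get; }

    public DateRange(DateTime startDate, DateTime? endDate)
    {
        StartDate = startDate;
        EndDate = endDate;
    }
}
```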
Since these properties are used in a lot of other places, I did not want to make a breaking change by deleting the existing properties and adding a new DateRange class. So I chose to add a new read-only property, TermPeriod, to SubscriptionTerm, which returns a DateRange constructed from its start and end dates.
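That read-only property could be as small as this (a sketch of the idea, assuming the DateRange value object described above):

```csharp
// Non-breaking addition on SubscriptionTerm: the existing StartDate and
// EndDate properties stay, and TermPeriod exposes them as a DateRange.
public DateRange TermPeriod => new DateRange(StartDate, EndDate);
```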
From the existing validator code, what we are essentially trying to check is whether any SubscriptionTerms for a subscription overlap, i.e. whether one TermPeriod falls in the range of another. Introducing a method, IsOverlapping, on DateRange to check if it overlaps with another DateRange seems logical at this stage. I added a few test cases to protect myself and then implemented the IsOverlapping method in the DateRange class, including tests to cover the failure scenarios seen before.
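One way to implement that check, covering the failure cases described earlier, is sketched below (an assumption, not the exact implementation; in particular, the boundary rule here treats ranges that merely touch as overlapping, since a shared start/end date was one of the missed cases):

```csharp
// Sketch of IsOverlapping on DateRange: a null EndDate is treated as
// open-ended, i.e. extending indefinitely.
public bool IsOverlapping(DateRange other)
{
    var thisEnd = EndDate ?? DateTime.MaxValue;
    var otherEnd = other.EndDate ?? DateTime.MaxValue;

    // Two ranges overlap when each starts on or before the other ends.
    return StartDate <= otherEnd && other.StartDate <= thisEnd;
}
```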
Given two DateRanges, I can now tell whether they overlap, which in turn can be used to check whether two SubscriptionTerms overlap - I just need to check if their TermPeriods overlap. The validator code is now much easier to understand.
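The refactored check might then read roughly like this (names and structure are assumptions; the point is that the date logic now lives inside DateRange):

```csharp
// Sketch: a subscription is invalid when two of its terms overlap in
// time but disagree on price.
bool HasOverlappingTermsWithConflictingPrice(Subscription subscription)
{
    var terms = subscription.Terms;
    for (var i = 0; i < terms.Count; i++)
        for (var j = i + 1; j < terms.Count; j++)
            if (terms[i].TermPeriod.IsOverlapping(terms[j].TermPeriod)
                && terms[i].Price != terms[j].Price)
                return true;
    return false;
}
```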
The code now reads as a set of TO Paragraphs as mentioned in the book Clean Code.
To check if a subscription is valid, check if the subscription has overlapping SubscriptionTerms with a conflicting price. To check if two subscription terms are overlapping, check if their term periods overlap each other. To check if two term periods overlap, check if the start date of one is before the end date of the other.
Readability of code is an important aspect and should be something we strive towards. The above just illustrates why readability is important and how it helps us in the long run: it makes maintaining code really easy. Following some basic guidelines, like One Level of Abstraction per Function, allows us to write more readable code. Separating code into small readable functions covers just one aspect of readability; there are a lot of other practices mentioned in the book The Art of Readable Code. The sample code with all the tests and the validator is available here.
In computing, a newline, also known as a line ending, end of line (EOL), or line break, is a special character or sequence of characters signifying the end of a line of text and the start of a new line. The actual codes representing a newline vary across operating systems, which can be a problem when exchanging text files between systems with different newline representations.
I was using a Resource (resx) file to store a large block of comma-separated values (CSV). This key-value mapping represented the mapping of product codes between an old and a new system. In code, I split the whole text using Environment.NewLine and then by comma to generate the map.
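The mapping code looked roughly like this (a hypothetical reconstruction based on the description above; the actual code differs):

```csharp
using System;
using System.Linq;

// Split the resource text into lines on Environment.NewLine, then split
// each line on comma into an old-code -> new-code pair.
var map = mappingText
    .Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
    .Select(line => line.Split(','))
    .ToDictionary(parts => parts[0], parts => parts[1]);
```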
It all worked fine on my machine and even on other team members’ machines. There was no reason to doubt this piece of code, until we noticed on the development environment that the mapped value in the destination system was always null.
Analyzing the Issue
Since all the other values in the destination system were getting populated as expected, it was easy to narrow the problem down to the class that returned the mapping value. Initially, I thought this was an issue with the resource file not getting bundled properly. I used dotPeek to decompile the application and verified that the resource file was bundled properly and had exactly the same text (visually) as expected.
I copied the resource file text from the disassembled code in dotPeek into Notepad2 (configured to show line endings), and everything started falling into place. The resource text from the build-generated code ended with LF (\n), while the one on our development machines had CRLF (\r\n). All machines, including the build machines, run Windows, and the expected value of Environment.NewLine is CRLF - a string containing "\r\n" for non-Unix platforms, or a string containing "\n" for Unix platforms.
Finding the Root Cause
We use git for source control, configured to use 'auto' line endings at the repository level. This ensures that the source code, when checked out, matches the line-ending format of the machine. We use Bamboo on our build servers, which run Windows. The checked-out files on the build server had LF line endings, which in turn got compiled into the assembly.
The checkout step in Bamboo used the built-in git plugin (JGit), which has certain limitations; it's recommended to use native git to get the full set of git features. JGit also has a known issue with line endings on Windows and checks out files with LF endings. So whenever the source code was checked out on the build server, all line endings in the file were LF before compilation. The resource file thus ended up with LF line endings in the assembly, and the code could no longer find Environment.NewLine (\r\n) to split on.
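For reference, the repository-level 'auto' setting described above is typically expressed in a .gitattributes file; this is a sketch, and the exact patterns in our repository are not shown in the post:

```
# Normalize line endings: store LF in the repository,
# check out using the platform's native line endings.
* text=auto
```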
There are two possible ways to fix this issue:
- Switch to using native git in the Bamboo build process
- Use LF to split the text and trim any excess characters. This makes the parsing independent of line-ending variations and settings across machines.
I chose to use LF to split the text and trim any additional characters, while also updating Bamboo to use native git for checkout.
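The fixed parsing looks roughly like this; again a sketch with illustrative names, not the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineEndingSafeMapper
{
    // Robust version: split on LF and trim any trailing CR, so the parse
    // works whether the resource text carries CRLF or LF line endings.
    public static Dictionary<string, string> BuildMap(string resourceText) =>
        resourceText
            .Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim('\r', ' '))
            .Where(line => line.Length > 0)
            .Select(line => line.Split(','))
            .ToDictionary(parts => parts[0].Trim(), parts => parts[1].Trim());
}
```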
Protecting Against Line Endings
The easiest and fastest way this could have come to my notice was to have a unit test in place, ensuring the test fails on the build machine. Such a test would pass on my local machine but not on the build machine, as UsageMap would not return any value for the destination system.
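The original test (against the real UsageMap and resource file) is not shown here; the sketch below captures the same idea in a self-contained form, with an inlined parser standing in for the real mapping code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProductCodeMapGuardTest
{
    // Inlined line-ending-agnostic parser, standing in for the real map code.
    static Dictionary<string, string> Parse(string text) =>
        text.Split('\n')
            .Select(l => l.Trim('\r'))
            .Where(l => l.Length > 0)
            .Select(l => l.Split(','))
            .ToDictionary(p => p[0], p => p[1]);

    // The guard: feed the parser both CRLF and LF input and assert a known
    // key maps to the expected value in both cases. On a build agent where
    // the parsing breaks for one of the endings, this fails fast.
    public static void MappingSurvivesEitherLineEnding()
    {
        var fromCrlf = Parse("OLD-A,NEW-A\r\nOLD-B,NEW-B");
        var fromLf = Parse("OLD-A,NEW-A\nOLD-B,NEW-B");
        if (fromCrlf["OLD-B"] != "NEW-B") throw new Exception("CRLF input not mapped");
        if (fromLf["OLD-B"] != "NEW-B") throw new Exception("LF input not mapped");
    }
}
```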
Since there are different systems with different line endings, and applications with line-ending settings and issues of their own, there does not seem to be one fix for all cases. The best I can think of is to protect ourselves with such unit tests. They fail fast and bring the issue immediately to our notice. Have you ever had to deal with a line-endings issue and found better ways to handle it?
We were facing a strange 'could not load DLL' issue when building and running multiple host projects side by side in Visual Studio (VS 2015). We had two host projects - an NServiceBus worker role project (a console application) and a Web application - plus a few other projects, a couple of which were shared between both host projects. It often happened in our team that, when running the IIS-hosted Web application, it threw the error:
Could not load file or assembly 'Newtonsoft.Json' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference.
The bin folder of the Web application did have a Newtonsoft.Json DLL, but of a different version than what was specified in the packages.config/csproj file. On a rebuild, the correct DLL version got placed into the bin folder and everything worked fine. Though the exception was observed by most of the team members, it did not always happen, which was surprising.
Knowing what exactly caused the issue, I created a sample project to demonstrate it for this blog post. All screenshots and code samples are of the sample application.
Using AsmSpy to find conflicting assemblies
AsmSpy is a command-line tool to view conflicting assembly references in a given folder. It is helpful for finding the different assemblies that refer to different versions of the same assembly. Running AsmSpy on the bin folder of the web application showed the conflicting Newtonsoft.Json references from the different projects in the solution. There were three different versions of the Newtonsoft NuGet package referenced across the whole solution. The web project referred to an older version than the shared project and the worker project.
The assembly binding redirects for both host projects were correct and matched the package version referred to in the packages.config and project (csproj) files.
Using MsBuild Structured Log to find conflicting writes
Using the MSBuild Structured Log Viewer to analyze what was happening with the build, I noticed 'DoubleWrites' happening with the Newtonsoft DLL. The double-writes list shows all the folders from which the DLL was written into the bin folder of the project being built. In the MSBuild Structured Log Viewer, a DLL shows up here only when it is written from more than one place, hence the name 'double writes'. This is a problem, as one write may override another depending on the order of writes, causing DLL version conflicts (which is exactly what was happening here).
But in this specific case, the captured log does not show the full problem; it only hints at a potential one. The capture from building the whole solution (sln) shows two writes happening from two different Newtonsoft package folders, which indicates a potential conflict. This does not explain the specific error we were facing with the Web application. Running the tool on just the Web application project (csproj) showed no DoubleWrites at all.
This confirms that something happens to the Web application's bin outputs when we build the worker/shared dependency project.
Building Web application in Visual Studio
When building a solution with a Web application project in Visual Studio (VS), I noticed that VS copies all the files from the bin folders of referenced projects into the bin of the Web application project. This happens even if you build the shared project alone, as VS notices a change in the files of a dependent project and copies them over. So in this particular case, every time we built the shared project, or the worker project (which in turn triggers a build of the shared project), the files in the shared project's bin folder changed, triggering VS to copy them over to the Web application's bin folder. This auto-copy happens only for the Web application project and not for the console/WPF projects. (I am yet to find what causes this auto-copy on a VS build.)
Since CopyLocal was true by default for the shared project, the Newtonsoft DLL was also getting copied into the shared project's bin, and in turn into the Web application's bin (by VS). Since the Web application itself did not build during that rebuild, it now had a conflicting version of the Newtonsoft DLL in its bin folder that did not match the assembly version it depended on - hence the exception the next time the Web application was loaded from IIS.
I confirmed the repro steps for this issue with the other team members:
- Get the latest code and do a full rebuild from VS
- Launch the Web app - it works fine
- Rebuild just one of the dependent projects that has the Newtonsoft DLL dependency (with CopyLocal set to true)
- Launch the Web app - it throws the error!
It was a consistent repro with the above steps.
To fix the issue, I could either update the Newtonsoft package version across all the projects in the solution, or set CopyLocal to false to prevent the DLL from being copied into the shared project's bin folder and ending up in the Web application's bin. I chose to set CopyLocal to false in this specific case.
The Sample Application
Now that we know what exactly causes the issue, it is easy to create a sample application to reproduce this issue.
- Create a Web application project and add a NuGet package reference to an older version of Newtonsoft.Json
- Create a console application/WPF application with a newer version of Newtonsoft Package.
- Create a shared library project with a newer version of the Newtonsoft NuGet package. Add this shared project as a project reference to both the Web application and the console/WPF application.
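The conflicting references from the steps above can be sketched as packages.config entries; the version numbers here are illustrative, not the exact ones from the sample:

```xml
<!-- WebApplication/packages.config: older Newtonsoft.Json -->
<packages>
  <package id="Newtonsoft.Json" version="6.0.4" targetFramework="net45" />
</packages>

<!-- SharedLibrary and ConsoleApp packages.config: newer version -->
<packages>
  <package id="Newtonsoft.Json" version="9.0.1" targetFramework="net45" />
</packages>
```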
Follow the repro steps above to reproduce the error. Change CopyLocal or update the NuGet references and see the issue get resolved.
Hope this helps in case you come across a similar issue!
Over the past couple of years, I have been successful in motivating a couple of my friends to start a blog, and I am happy about that. Every time I discuss starting a blog, the most common things that come up are: what platform to choose, where to host, where to buy the domain, blogging frequency, will anyone even read my posts, what to blog about, etc. These were the same questions I had when I wanted to start a blog, and good reasons to procrastinate for a long time. Having changed blog platforms a couple of times over the years, along with domain providers, hosting platforms and blogging tools, here is what I have learned.
The Golden Rule of Blogging!
You should have already guessed it from the post title: the URL you blog under - make sure you own it. Buy one of your choice from any of the domain providers (more on this later) and blog under that. While it's easy to start blogging under popular services like WordPress.com or Blogspot.com, you don't actually own your domain there.
I have committed this mistake of blogging under a URL not owned by me, not once but twice, and ended up having to edit the posts to redirect here. (I did not want to pay the providers a monthly fee just for the redirect, and didn't have many readers.) It's not that these external platforms are bad, but you are throwing away the flexibility to change platforms (without any extra charge) when you want to. Owning your URLs allows you to change platforms, hosts, URL formats, redirects - anything you want - and I like having that flexibility.
The less important things
Other than the URL that you blog on, the other things are not that important, and here's why. If you are completely new to terms like domain, website and hosting, check out this article on the differences between them.
Where to buy the domain?
Choose one and move on; even if you do not like it, you can change your domain provider anytime. You can transfer from one provider to another very easily, and I have done that. I started off with GoDaddy for the first year, as they seemed the cheapest when I started. But at the time of renewal, I learned that it was cheap just for the first year; renewals were costlier. So I moved on to Namecheap (affiliate) and have been with them ever since. Nothing against GoDaddy though - if it works for you, get it from there or anywhere else. Get one right now if you don't own a domain!
Where to host?
For a website to be accessible to all, it needs to be running somewhere on the internet, and this is what a hosting service typically provides. The hosting and domain provider need not be the same entity, so you can get your hosting space anywhere and link it with your domain; your domain provider will give you a console/website where you can configure this. There are also free hosts like GitHub, Tumblr, Blogger etc., which allow custom domain mapping for free, unlike wordpress.com. GoDaddy, Namecheap, Azure, AWS etc. are a few popular options for hosting; if not, Google to find what matches your needs. Switching from one web host to another is even easier than transferring domains, so do not spend much time deciding where to host.
Which blogging platform to choose?
When starting to blog, choose a platform that makes writing easy, and don't let the overhead of using the platform stop you from writing.
Who will read my blog?
This one was the biggest stopper for me - who cares about what I write? Maybe no one does, but now I write for myself: it makes me happy, as I like to share information with others. Blogging helps me understand and explore topics more deeply. Finally, it's a reference I can always go back to when I face something similar. If you have faced a particular scenario (which might be totally weird), it is very likely that somebody else will experience it too. And Google makes finding things easy - it will be found!
Frequency and commitment
I have been irregular with my blogging schedule (until lately) and used to blog just when motivation struck. But since the start of this year, I have been trying to blog on a schedule - just experimenting with it and seeing how it goes, and so far I am really liking it. It has given me a different outlook on work and life in general, as it makes me look more closely for opportunities that could turn into a blog post. But it's fine to blog irregularly without a commitment, unless you have expectations of your reader base or generate revenue from the blog. Just have a place to go and scribble whenever you feel like it, and own that place!
It's a good idea to have a schedule to blog, so that you don't just end up writing one post and forgetting about your blog forever. Pick a schedule that works for you and try to stick with it.
No Original Content
It's fine to blog about things that you have not created yourself - the way you learned things, or anything you feel like writing about, not necessarily even current at the time of writing. For example, you might be working on a mainframe system, and mainframes are not current and cutting edge now. But that does not mean you should not blog about it. Just like you, there are a lot of mainframe developers, and it might just help one of them. So don't bother much about the originality of content; it's your experiences and the way you see it - that's always going to be unique.
“A blog is neither a diary nor a journal. Many people think of blogging in relation to those two things, confessional or practical. It is neither but includes elements of both.” - Lemn Sissay
Every 'professional' should have a blog, and if you don't have one yet, now is a good time to start. Sound off in the comments about how you feel about blogging in general, especially if you bought a domain and set up your blog after reading this. It will prove the point of this post!
While migrating a few Azure Cloud Services to Web Jobs, we started facing the error 'Could not load assembly … /msshrtmi.dll' for just one of the projects. The error provides the exact path from which it tries to load the DLL, which is the same path the process runs from. But that location does have msshrtmi.dll, which for some reason the process is not able to load.
To our surprise, this was happening only with one specific worker, while all the others (around 8) were working fine. All of the workers are generated by the same build process on a server. For some reason (I am still investigating this), msshrtmi.dll is added as an external reference and referred to from there in all the project files. This was done mainly because we had a few external dependencies that depended on a specific Azure SDK version (2.2). But this explicit reference should not have caused any issues at all, as the other processes were working fine and only this specific one was failing.
One useful tool to help diagnose why the .NET Framework cannot locate assemblies is the Assembly Binding Log Viewer (Fuslogvw.exe). The viewer displays an entry for each failed assembly bind. For each failure, the viewer describes the application that initiated the bind; the assembly the bind is for, including name, version, culture and public key; and the date and time of the failure.
Fuslogvw.exe is automatically installed with Visual Studio. To run the tool, use the Developer Command Prompt with administrator credentials.
Running Fuslogvw.exe against the application shows the assembly binding error; double-clicking an entry gives detailed error information. That error message tells us that the assembly platform or ContentType is invalid.
In the Task Manager, the worker with the assembly loading error showed as a 64-bit process, while the others showed as 32-bit. Since the referenced msshrtmi DLL is 32-bit, this explains why the process was unable to find a platform-matching msshrtmi assembly.
CorFlags.exe can be used to determine whether an .exe or .dll file is meant to run only on a specific platform or under WOW64. Running corflags on all the workers produced the below two results:
The failing worker:

Version : v4.0.30319
CLR Header: 2.5
PE : PE32
CorFlags : 0x1
ILONLY : 1
32BITREQ : 0
32BITPREF : 0
Signed : 0

All the other workers:

Version : v4.0.30319
CLR Header: 2.5
PE : PE32
CorFlags : 0x20003
ILONLY : 1
32BITREQ : 0
32BITPREF : 1
Signed : 0
The 32BITPREF flag is 0 for the worker that shows the error, whereas for the rest it is 1. The 32BITPREF flag indicates that the application runs as a 32-bit process even on 64-bit platforms. This explains why the problematic worker was running as a 64-bit process: the flag is turned off.
From .NET 4.5 and Visual Studio 11, the default for most .NET projects is again AnyCPU, but there is more than one meaning to AnyCPU now. There is an additional sub-type of AnyCPU, “Any CPU 32-bit preferred”, which is the new default (overall, there are now five options for the /platform C# compiler switch: x86, Itanium, x64, anycpu, and anycpu32bitpreferred). When using that flavor of AnyCPU, the semantics are the following:
- If the process runs on a 32-bit Windows system, it runs as a 32-bit process. IL is compiled to x86 machine code.
- If the process runs on a 64-bit Windows system, it runs as a 32-bit process. IL is compiled to x86 machine code.
- If the process runs on an ARM Windows system, it runs as a 32-bit process. IL is compiled to ARM machine code.
All the projects are built using the same build scripts, and we are not explicitly turning this compiler option on or off. So the next possible place where the flag could be specified is the csproj file. In the properties of the failing worker project, I see that the 'Prefer 32-bit' option is not checked, and the csproj file has it explicitly set to false. For the other projects, this option is checked in Visual Studio and has no entry in the csproj file, which means the flag defaults to true.
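An illustrative csproj fragment of the problematic setting (the surrounding property values are assumptions, not copied from the actual project file):

```xml
<PropertyGroup>
  <PlatformTarget>AnyCPU</PlatformTarget>
  <!-- Explicitly false: on x64 Windows the process runs as 64-bit,
       so it cannot load the 32-bit msshrtmi.dll. Deleting this element
       restores the "Any CPU 32-bit preferred" default. -->
  <Prefer32Bit>false</Prefer32Bit>
</PropertyGroup>
```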
Deleting the Prefer32Bit element from the csproj and rebuilding fixed the msshrtmi assembly loading issue!
Though this ended up being a minor fix (in terms of code change), I learned about a number of tools that can be used to debug assembly loading issues. Using the right tools helped me identify the extra element in the csproj file and solve the issue. So the next time you see such an error, either with msshrtmi or another DLL, I hope this helps you find your way through!