
Feb 19, 2014

TFS Internals - How does TFS store Git repositories?

Git in TFS is just Git. Plain old vanilla git. Nothing fancy about it at all.

Well, almost. On the server, there is one significant change to be aware of. Files aren’t stored on the file system like they would be when git is running on your local machine. Instead they’re stored in the TFS SQL Server database. Apart from that, it's the same as any other git server out there.

For a little fun I decided to dig into the TFS 2013 database to see just how the git files are stored.

NOTE: Don’t ever go hacking on your TFS databases. You’ll put your system into an unsupported state. All of the information that follows is a result of select statements only.

CAVEAT: The information here is based on me digging through the tables in the database. I’ve likely missed some items and may have made some bad assumptions. Feel free to leave a comment if you spot an error so I can correct the post.

Firstly, looking through the tables in the TFS database we find a number of them are named tbl_Git*. This looks like a good place to start. Let’s see what they’re called and then what’s in them.
image
You’ll note there’s no ‘objects’ or ‘logs’ tables that might mirror the way a normal .git folder would look, though there is tbl_GitRef that might mirror the refs folder and that tbl_GitCommit table looks pretty interesting.

To figure out what ends up where, I made a new local repository, added a single commit to it, then pushed it to the git repository on TFS.

Here’s what I found in each table:

tbl_GitRepository

As you’d expect on a server where you can have multiple repositories, this table just has a list of the repos that have been created. Repositories have partition ids and a Guid for the repository ID, but they also have an ‘InternalRepositoryId’.

The InternalRepositoryId is used in other tables as part of their clustered indexes, avoiding the problems of using guids in indexes.
image

tbl_GitRef

This mimics the refs folder in a normal .git repo. You can see that the refs folder structure is tracked in the ref name and that the ObjectId matches my local repo.
image

image

tbl_GitRefLog

This is, as you might expect, a log of changes for the various refs. The thing to note here is that there is a pushId maintained in the table as well.
image

tbl_GitPush

Talking of push operations, tbl_GitPush is used purely to track the time of a push and who the person was who did it (via a Guid). Nothing much to see here, so let’s move on.

tbl_GitPluginProcessedCommit

This looks to be a simple log of what jobs were executed against which commits.

tbl_GitCommit

OK, so this one, as it turns out, is pretty straightforward. It’s a table of git commit SHA1s mapped to internal commit ids.

My local git commit was as follows:
image
On the server we can see the commit’s SHA1 in this table with an InternalCommitId (unique across repositories by the looks of it) and what push it was related to.
image
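
For context on why the SHA1 on the server can be matched against the local repo: Git derives an object's id purely from its type, size and content, so the same object hashes to the same value everywhere. The sketch below shows that calculation in C#. The class and method names are made up for illustration and have nothing to do with TFS's own code.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Illustrative helper (hypothetical name): computes a Git object id the way
// Git does, as SHA-1 over "<type> <size>\0" followed by the raw content.
public static class GitObjectId
{
    public static string Compute(string type, byte[] content)
    {
        // Header is ASCII: object type, a space, the content length, then a NUL byte.
        byte[] header = Encoding.ASCII.GetBytes(type + " " + content.Length + "\0");

        byte[] buffer = new byte[header.Length + content.Length];
        Buffer.BlockCopy(header, 0, buffer, 0, header.Length);
        Buffer.BlockCopy(content, 0, buffer, header.Length, content.Length);

        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(buffer);
            // Git prints ids as 40 lowercase hex characters.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Calling GitObjectId.Compute("blob", new byte[0]) gives e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the well-known id of the empty blob. A commit id is computed the same way with the type "commit" and the commit text as content.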

tbl_GitCommitMetadata

This table is for the commit comments and foreign key values for the committer and author.
image

tbl_GitCommitParent

Since git tracks the parents of each commit, this table is used for that information. My example commit here has no parents so there’s nothing to show. The table itself only has three columns: the partition id, the InternalCommitId and a ParentInternalCommitId.

When you have merge commits, for example, you will end up with multiple rows, as shown in this example. Note again the use of the internal ids instead of SHA1s to allow for clustered indexes on the tables.
image
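
To make the "multiple rows per merge commit" idea concrete, here is a small in-memory sketch in C#. It is a toy model only: the property names follow the three columns described above, but the types, the class names and the sample data are my assumptions, not the real TFS schema or code.

```csharp
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for one row of tbl_GitCommitParent. Property names mirror the
// three columns described in the post; the int types are an assumption.
public sealed class CommitParentRow
{
    public int PartitionId { get; set; }
    public int InternalCommitId { get; set; }
    public int ParentInternalCommitId { get; set; }
}

public static class CommitGraph
{
    // A commit that appears in more than one row has more than one parent,
    // which makes it a merge commit.
    public static List<int> MergeCommits(IEnumerable<CommitParentRow> rows)
    {
        return rows.GroupBy(r => r.InternalCommitId)
                   .Where(g => g.Count() > 1)
                   .Select(g => g.Key)
                   .OrderBy(id => id)
                   .ToList();
    }

    // All parents recorded for one commit (empty for a root commit,
    // which has no rows at all).
    public static List<int> ParentsOf(IEnumerable<CommitParentRow> rows, int internalCommitId)
    {
        return rows.Where(r => r.InternalCommitId == internalCommitId)
                   .Select(r => r.ParentInternalCommitId)
                   .OrderBy(id => id)
                   .ToList();
    }
}
```

With rows for commits 2 and 3 each pointing at parent 1, and two rows for commit 4 pointing at 2 and 3, MergeCommits returns just commit 4 and ParentsOf(rows, 1) returns an empty list, matching the "no parents, nothing to show" case above.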

tbl_GitCommitUser

As alluded to before, this is simply a reference table of user InternalId values to names. Here’s my record, for example:
image

tbl_GitCommitChanges

So we’re on the last table with a git name. Looking at this for our commit we see the following:
image
It’s indicating there was a file changed in the root of the git repo named aa.txt. Great. But where’s the content?

It turns out that TFS doesn’t hold the git content in a Git prefixed table, but somewhere else. There’s actually a few more tables in use, so let’s keep digging.

tbl_Container

Remember that GUID for the git repository we saw, right back at the start? Well, if we look at the container table we see that GUID referenced in an artifact URI on this table.
image
This ‘container’ URI is actually a reference to a location on the file system. Where? you might ask. Where indeed. If you look under the TFS Application Tier folder you will find a _tfs_data folder. Drill down past that to the Proxy folder and under that you should find a folder with the same name as our repository’s GUID.

Look in there and you’ll find some interesting items, like those PACK files I’ve highlighted. These are your standard git PACK files. TFS isn’t storing each file’s content individually, but rather in the standard git pack format for efficiency. Interestingly, the idx and pack files don’t share the same name as they would on a normal file system based git repo. I’m not sure why.
image
It’s not entirely true to say that TFS stores the files on the file system, though. The copies on disk are probably there for performance reasons: if the source were stored only on the file system, the git content wouldn’t be included when SQL was backed up, and that wouldn’t make people happy when they needed to restore a backup. So let’s see what else we can find.

tbl_ContainerItem

If we select all items in the container item table we see the following. We can see the pack and index files we saw on the file system, but we also see that each has a file id and a file length.
image

tbl_File

These file ids are part of the tbl_File table, which is effectively a mapping of a file id to a resource id.
image
OK. One last stop – where are these resources?

tbl_Content

Finally, we arrive at our destination. The Content column is a varbinary(max) column (i.e. blob storage) and contains our encoded content. Lovely!
image
As for where the aa.txt file lives, well, that’s going to be determined by git, not TFS. Git will look at the index file and use that to decide where in the related pack file it should extract the content from. You’ll want to look into git’s internals if you want to understand this process. See http://git-scm.com/book/en/Git-Internals-Packfiles for a good run through on this if you’re interested.
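
As a rough illustration of that index lookup, here is a minimal C# sketch that finds an object's offset inside a version 2 pack index (.idx) held in memory. It follows the layout in Git's pack format documentation (magic and version, a 256 entry fan-out table, sorted 20 byte object ids, CRC32s, 4 byte offsets, then optional 8 byte offsets). The class name is hypothetical, it says nothing about how TFS reads these files, and it skips the trailing checksums and any bounds validation.

```csharp
using System.IO;

// Hypothetical helper: looks up an object's offset in a Git v2 pack index.
public static class PackIndexV2
{
    private const int FanoutStart = 8;                    // after 4-byte magic and 4-byte version
    private const int NamesStart = FanoutStart + 256 * 4; // sorted 20-byte object ids start here

    // Returns the offset of the object within the .pack file, or -1 if it is not in this index.
    public static long FindOffset(byte[] idx, byte[] objectId)
    {
        if (idx[0] != 0xFF || idx[1] != 0x74 || idx[2] != 0x4F || idx[3] != 0x63 || ReadUInt32(idx, 4) != 2)
            throw new InvalidDataException("Not a version 2 pack index.");

        // fanout[b] holds the number of objects whose first byte is <= b,
        // so it narrows the binary search to ids sharing our first byte.
        int total = (int)ReadUInt32(idx, FanoutStart + 255 * 4);
        int first = objectId[0];
        int lo = first == 0 ? 0 : (int)ReadUInt32(idx, FanoutStart + (first - 1) * 4);
        int hi = (int)ReadUInt32(idx, FanoutStart + first * 4);

        int offsetsStart = NamesStart + total * 20 + total * 4; // skip the ids and the CRC32 table
        int largeOffsetsStart = offsetsStart + total * 4;

        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            int cmp = Compare(idx, NamesStart + mid * 20, objectId);
            if (cmp == 0)
            {
                uint raw = ReadUInt32(idx, offsetsStart + mid * 4);
                if ((raw & 0x80000000u) == 0) return raw;
                // High bit set: the low 31 bits index into the 8-byte offset table.
                int largeIndex = (int)(raw & 0x7FFFFFFFu);
                return (long)ReadUInt64(idx, largeOffsetsStart + largeIndex * 8);
            }
            if (cmp < 0) lo = mid + 1; else hi = mid;
        }
        return -1;
    }

    // Byte-wise comparison of the stored id at 'position' against the wanted id.
    private static int Compare(byte[] idx, int position, byte[] objectId)
    {
        for (int i = 0; i < 20; i++)
        {
            int diff = idx[position + i] - objectId[i];
            if (diff != 0) return diff;
        }
        return 0;
    }

    // All integers in the index are stored big-endian.
    private static uint ReadUInt32(byte[] data, int position)
    {
        return ((uint)data[position] << 24) | ((uint)data[position + 1] << 16)
             | ((uint)data[position + 2] << 8) | data[position + 3];
    }

    private static ulong ReadUInt64(byte[] data, int position)
    {
        return ((ulong)ReadUInt32(data, position) << 32) | ReadUInt32(data, position + 4);
    }
}
```

The offset only tells you where the object's entry begins in the pack file. Reading the entry itself (type and size header, zlib inflate, delta resolution) is a separate step covered by the Pro Git chapter linked above.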


That’s about it for now. I think it’s pretty interesting to see how it all works under the hood and I hope you enjoyed the walkthrough.

Jan 20, 2014

A little nostalgia with the Wang VS and RPG-II

So this happened on Twitter

image

Which of course led me to write this post showing off a little of the language (just a little).

But before I show you the awesomeness that is RPG-II, let me talk a little about the background to this story. I was going through uni to get my computer science degree and my Dad managed to land me a holiday job where I could write some ‘real world’ code and get some experience. The application I needed to write was a cyclic stocktake for a warehouse management system. It would select a user-defined number of random locations in the warehouse that hadn’t been counted recently, print out a count sheet for the warehouse staff to use, provide a screen for entering those counts and produce another report to show the variances.

I had to write this on a Wang VS midframe. Of course, the machine needed to be in its own air conditioned room, on a raised floor, running on 3-phase power. A bit like this one, though a different model:

image

(image via http://community.fortunecity.ws/roswell/goldendawn/232/WangComputers1.htm)

It had a reel-to-reel tape drive that took 1200’ tapes and needed the lid closed before it would spin up,

image

(Image via http://www.ricomputermuseum.org/Home/equipment/wang-peripherals/wang-2248v-1-9-track-tape-drive)

and had removable drives that took 9” “data cartridges” like this one

image

(image via http://ripsaw.cac.psu.edu/~mloewen/Oldtech/Media/RigidDisk.html)

To code on this beastie I used an 80 column green screen Wang terminal:

image

(image via http://www.ricomputermuseum.org/Home/equipment/wang-vs45)

and, of course, I had to code in RPG-II. The fact that at University I was coding in Pascal and C++ using a mix of Macs and Solaris made this feel like a step back in time.

What’s different about RPG-II, you ask? Well, it’s a column formatted language for one. If you think whitespace-significant languages like Haskell and Python can be confusing, try having each column carry a different meaning, and woe betide you if you get your columns wrong and things don’t compile. That sounds like fun, right?

This is structured programming that takes the word ‘structured’ to the extreme. So much so that the programming manual (http://www.scribd.com/doc/75247909/Rpgii-and-Sys36) included grid paper style code samples so you could understand the syntax better:

image

In the end you end up with code that looks something like this (and yes, you needed to use GOTO statements):

image

Pretty awesome, right? It’s only a slight step above assembly programming! If you have a few minutes, go and read the Wikipedia article for a quick run down on the main parts of the language. It’s an interesting approach to language design, shaped by the limitations of the hardware and terminals they were working with.

As it turns out, my first full time job ended up being with this company and the Wang VS was being replaced with the thoroughly modern and completely brilliant Digital VAX/VMS when I joined. It meant I ended up writing code in VAX BASIC and not once missing the RPG-II language. And those Wang terminals? We replaced those with VT420s and Wyse WY50 terminals.  Now we could have our green screens in orange as well if we wanted! Superb!

Side note: RPG-II isn’t the most esoteric language I ever coded in. For one job I wrote in a thing called ‘English Application Language’ (EAL), a variant of COBOL developed by Cyborg Systems for payroll applications. But that’s a different story for another time.

Leave a comment with the oldest language you’ve ever coded in!

Jan 10, 2014

How to configure IISExpress for localtest.me

This is another case of me blogging something simply because I keep forgetting it.

Quick scenario: I want to use the localtest.me domain for a site on my local dev machine running in IISExpress (handy for a number of reasons). What do I need to do?

1. In ApplicationHost.config locate the site I’m interested in and add a new binding, changing myprefix and the port number as appropriate:

<binding protocol="http" bindingInformation="*:2744:myprefix.localtest.me" />

*Note: the config file is in c:\Users\you\Documents\IISExpress\config\ if you’re having trouble locating it.

2. From an elevated command prompt:

netsh http add urlacl url=http://myprefix.localtest.me:2744/ user=everyone

3. Restart IISExpress and browse to http://myprefix.localtest.me:2744.

4. Slap self on forehead for forgetting yet again.

5. Job done. Get back to producing awesomeness!

May 23, 2013

Hey “Programmers”, It’s Time to Grow Up and Be Professional

When the Programming, Mother**** site first showed up some time ago I thought it was kind of amusing. At its heart it’s an over the top response to people who claim to be doing agile but are instead forgetting everything agility is about, killing team productivity, wasting time, screwing over their customers and doing what they’ve always done but under a different flag. I also, mistakenly, thought that it would die down like so many other rant sites and then disappear. Instead it managed to hit a sweet spot for many people; people who missed the irony and who instead love to trot it out as an excuse for doing whatever they damn well please and ignoring what it means to be a professional software developer.

If you are one of those people who brings up the Programming, M**** site, even in jest then please, stop!

You only help to reinforce the perception that programmers are a rabble of childish, arrogant, self-centred malcontents who think they’re smarter than everyone else, who think code is the only thing that matters, and who react with hostility when challenged about why they are doing something or when asked to do something they don’t feel like doing. As long as you write great code, who cares if it actually does what your customers or users need it to? Who cares if that feature you just took a week to perfectly code will never be used? At least you had fun writing it. Who cares if the usability of what you built is so horrible people would rather stab their eyes out with frozen prawns than use it? If you’re a Programmer, then you write code and that’s all that matters!

Oh, what’s that? You don’t care if anyone thinks you’re a professional? You think you already act professionally; it’s not your problem if the requirements are daft and the users are stupid? You just want to be left alone to write great code since other people just get in the way of doing the best you can? Being professional would stifle your creativity? Well, that’s your right, I suppose but I beg to differ.

Just pause for a moment and have a look at some of the epic fails across our industry and then ask yourself again whether we shouldn’t think a little more beyond the code we write. How about the 21 dead from a radiation overdose, or the woman killed when an ambulance respirator crashed, or insulin pumps that are susceptible to remote attacks? And it’s not just software that can directly impact human life either. Think about the regular disclosure of personal information through poor security (too many to cite), regular air travel glitches, entire regions of Canada losing communications, people unsuspectingly racking up massive mobile data charges and a whole lot more. Software runs pretty much everything these days and while some blame lies in the way projects are run, much of what we’re seeing in the world these days can be laid at the feet of programmers who acted unprofessionally and forgot their prime directive: helping their customers.

Everyone has the right to act as stupidly and childishly as they want on their own time, but when it comes to the workplace and being paid to write software other people will use then it’s high time to get over any sense of entitlement one may have and grow up. If you believe that knowing how to write code or that an understanding of how computers work somehow makes you special and that you should be treated differently, then you’re dead wrong.

The only thing that will make you special is how much of an impact the software you build has on those who rely on it, and for that you’ll need to do a whole lot more than just writing awesome code and Programming, M****.

Apr 17, 2013

What’s Visual Studio’s Code Map All About?

In Visual Studio 2012 Update 1 (VS 2012.1) Microsoft delivered a feature for the Ultimate edition called Code Map with the goal of visualising relationships in code.

In Visual Studio 2012 Update 2 (VS 2012.2) they’ve extended the Code Map experience to include debugging support and the ability to generate code maps on the fly as a debugging session is conducted.

If you saw the announcement then you may have had one of the following reactions. Reaction 1 would be something like “Oh. It’s in Ultimate. I don’t have it. I’ll ignore it”. The second might be similar to mine: “Is this really going to be that useful?”. After ignoring it for a bit I decided to answer that question for myself.

For the purposes of this post, I used the excellent RestSharp project. You can grab it from https://github.com/restsharp/RestSharp. Oh, if you’re not aware of what it does, it’s a library that helps you build REST-based client applications by taking away a lot of the plumbing work you would otherwise have to do.

Getting Started

To start, I began by asking a simple question: how does RestSharp make a client request? I’d never looked at the internals of RestSharp before so I wasn’t sure where to look.

Fortunately RestSharp comes with a set of unit and integration tests that I could use to explore the code with. I started with the integration tests and tried to find a suitable test in there to use and came across the Handles_Non_Existent_Domain() test. That seems a reasonable place to start. Note that if you want to follow along at home you’ll need the XUnit.Net Test Runner for Visual Studio 2012 extension installed since the tests are written for xUnit.

First thing to do? Set a breakpoint on the first line of the Handles_Non_Existent_Domain() test and start debugging the test. When the breakpoint is hit open a Code Map for the debugging session by either clicking the button in the toolbar (highlighted in the screenshot), choosing Show Call Stack on Code Map from the Debug menu, or pressing Ctrl+Shift+` if you’re using the default keyboard mappings.

image

Visual Studio will now split the document window and display a second document pane with the code map in it. Initially the code map will look like this:

image

Not very interesting, right? Don’t worry, it gets better. Let’s start stepping into various methods to see how this test works.

Stepping Deeper

Press F11 to step into the RestClient constructor. As you do, you will notice the code map updating. Continue to step through the code until the first call to AddHandler() has completed. Your Code Map should now show something like the following:

image

Note that as we step back up the call stack, the colours of nodes in the Code Map change to show methods we have called but that are no longer in the call stack. It helps us keep track of where we’ve been, not just where we are now. This is kinda cool. I no longer have to keep this tree in my head, and can focus more on what the code is doing rather than where I’ve been and which methods might be relevant later on.

Depending on your resolution, you may also start noticing the zoom level on your Code Map keeps resetting. If you want it to use a specific zoom value then go to the layout options and turn off “Automatically Layout when Debugging”

image

What Happens When You Skip Over Methods?

The code map only updates when execution pauses. It means that if you step over methods the Code Map will not show you what happened in the code you stepped over. It also means that if you use multiple breakpoints and after hitting the first breakpoint continue execution until you hit the next one, then the Code Map will only reflect the call stack from when the breakpoints were hit. You won’t see anything related to what happened in between breakpoints.

It helps keep the noise in the map down, and presumes that if you’ve skipped over code you’re also not interested in seeing it in the map. That seems to be a reasonable assumption to me.

Anyway, let’s get back to the test we were stepping through and continue execution until we get to the line that calls the Client.Execute() method.

As you step into the method the Code Map will update. Look at the tooltip for the newly added node and you will see details of the method itself based on XML Doc comments, as shown:

image

This can be handy when you’re deep within the bowels of something and can’t remember exactly how you got to where you are or which specific overload of a method was called.

If we then step into the switch statement, we see that we’re passing in a value of Method.GET. It might be handy to remember that; especially as both parts of the switch statement call Execute(). We can make a note of that on the Code Map by right clicking the node and selecting Add Comment (or hitting Ctrl+Shift+K)

image

At various points you’ll notice that some calls go through external code first. The code map will reflect this when it occurs by marking the fact on the call stack arrow, as shown.

image

Way Down Deep

If you continue to step through the code you will eventually get to the RestSharp.Http.GetRawResponse() method and your Code Map will start to look somewhat busy, as follows:

image

A lot of the nodes in the Code Map are not that interesting now that we’ve stepped through things and are improving our understanding of how it fits together. It’s fast becoming noise. To organise the map better we could simply select nodes we’re not interested in anymore and delete them, or we can choose to group them by selecting a few, opening the right-click context menu and choosing Add Parent Group, to get something like the following:

image

We can then simply collapse groups to hide items we’re not really interested in at the moment.

With a little bit of judicious grouping I produced a Code Map as follows and now I have something that helps me better navigate the internals of RestSharp and get familiar with it.

image

Wrapping it Up

I can add to this map over time as I explore the code further and step through other tests to see how it works. I can also get a feel for which parts of the code are called often and what happens in those methods.

If I wished I could also save the code map and share it with the rest of team, or grab an image of it and post it in the team wiki for documentation purposes or put it on the wall as a reference.

While Code Maps might seem a little gimmicky at first, I’m finding them fast becoming a useful tool in helping me better understand code I’m wading through and in improving my knowledge of what exactly is happening when I’m using an application.

Give Code Maps a try (assuming you have a copy of Ultimate) and see what you think.

Sep 13, 2012

HATEOAS? Surely we can come up with a better acronym!

Today on Twitter I mentioned that a class I was teaching REST to was having trouble with the HATEOAS acronym and what it was all about.

For reference, HATEOAS stands for Hypermedia As The Engine Of Application State. The concept is that an application’s state lives in the hypermedia sent between client and server, so the server doesn’t need to hold per-client session state.
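
If it helps to see it in code, here is a tiny C# sketch of the idea. The order resource, its statuses, the link names and the URLs are all invented for illustration. The point is only that the server's response lists the transitions that are valid right now, and the client follows those links rather than hard-coding a workflow.

```csharp
using System.Collections.Generic;

// Hypothetical order resource: the links included in a response depend on the
// order's current status, so the response itself tells the client what it can do next.
public static class OrderResource
{
    public static Dictionary<string, string> LinksFor(int orderId, string status)
    {
        // Every representation links to itself.
        var links = new Dictionary<string, string>
        {
            ["self"] = $"/orders/{orderId}"
        };

        if (status == "pending")
        {
            // A pending order can still be paid for or cancelled.
            links["payment"] = $"/orders/{orderId}/payment";
            links["cancel"] = $"/orders/{orderId}/cancellation";
        }
        else if (status == "paid")
        {
            // Once paid, those transitions disappear and a receipt becomes available.
            links["receipt"] = $"/orders/{orderId}/receipt";
        }

        return links;
    }
}
```

A client that only ever follows the links it is given will never try to pay for an order twice, because the "payment" link simply isn't there once the order is paid.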

The awesome Paul Batum responded with this:

image

I completely agree with his sentiments and for that reason I’m proposing a new acronym. This one is a TLA, and one you can say! What more could you want? So here it is:

ASH: Application State in Hypermedia

What do you think? Can you come up with something better? What would you propose? Drop a comment on the post and let’s see what you think.

Mar 4, 2012

How To Unit Test Async Methods with MSTest, XUnit and VS11 Beta

MSTest finally got some love with the Visual Studio 11 Beta and one of those changes was to enable tests to run asynchronously using the async and await keywords.

This is required if you want to write tests against any async methods (especially with WinRT!) but can also be used anywhere else you need to perform asynchronous operations.

Here’s a silly sample test to show you how it’s done:

[TestMethod]
public async Task LoadGoogleHomePageAsync()
{
    var client = new System.Net.Http.HttpClient();
    // GetStringAsync needs an absolute URI, so the scheme must be included.
    var page = await client.GetStringAsync("http://www.google.com");
    Microsoft.VisualStudio.TestTools.UnitTesting.StringAssert.Contains(page, "google");
}

XUnit also supports this option as shown here:

[Xunit.Fact]
public async Task XUnitAsyncTestMethod()
{
    var c = new System.Net.Http.HttpClient();
    var result = await c.GetStringAsync("http://www.google.com");
    Xunit.Assert.Contains("google", result);
}

Be aware that async tests will not work if you have a testsettings file specified in the Unit Test Explorer. This applies to the beta only. Apart from that everything works as expected.

Mar 3, 2012

Improved Unit Testing with Visual Studio 11 Beta

image
There’s just so much new stuff in the Visual Studio 11 Beta! In fact, someone should write a book about it… Oh wait, I am! (more on that when the time is right). For now, let’s have a look at one feature that makes me so very happy: Visual Studio’s new and improved unit testing capabilities.
It’s widely recognised by those with a desire to do unit testing that there are better unit test frameworks out there than MSTest, but because Visual Studio has always been so tightly coupled with MSTest, it’s always been more difficult than it should be to get other test frameworks working well in Visual Studio. The TestDriven.NET and ReSharper test runners have helped, but the integration back to Visual Studio was always lacking.
That all changes with Visual Studio 11. Now it’s a case of “Use the framework that makes you happy. We don’t mind”.
You want to use XUnit? No problem!
NUnit? Easy!
MSTest (without the baggage it usually has)? Sure thing!
QUnit for JavaScript? Bring it on!
Something random that we’ve never heard of? Write an adapter and it’ll work just fine!

Dogs and Cats Living Together!

Visual Studio 11 introduced a test adapter model so that any test framework can run inside Visual Studio if there is an adapter for it.  The adapter model also means that you can not only run XUnit, NUnit, or any other type of test but you can run them in the same test assembly if you really wanted to! Why? I don’t know! But you can :-)
Maybe you have a suite of MSTest tests but you also want to use XUnit’s data driven test features because MSTest sucks for unit testing. You can do just that. It’s really nice.
Note: In MSTest’s favour, in this release when MSTest is used in a plain old class library for unit testing the MSTest test adapter uses a cut down, light weight version of MSTest with just the features needed for unit testing and none of the baggage that it normally comes with, making it quite usable for most unit testing needs. For most developers with an existing investment in MSTest tests they will see an improvement in performance as a result.

Pre-Requisites

Get yourself started by loading the appropriate adapter for your unit test framework from the Visual Studio Extension Manager.  MSTest is already in the box so you don’t have to worry about that one.  In this screen shot I’ve loaded up the XUnit, NUnit and Chutzpah test adapters.
SNAGHTML5b8384c

Running Unit Tests

Create a new C# Class Library project (NOT a test project) and add the XUnit and NUnit test frameworks to your project using NuGet.
SNAGHTML5c26ddd
You can also add a reference to Microsoft.VisualStudio.QualityTools.UnitTestFramework so that you can do MSTest based unit tests.
Here, I’ve started by adding a simple XUnit test, then building the code and running the unit test as shown:
SNAGHTML5d08bb9
As you can see in the output window there is now a “Discover test started” phase where Visual Studio looks at the assemblies and determines what tests are in the system so that it can spin up the right framework and execute the tests.
The Unit Test explorer on the left shows the tests that were run, the time it took and any error information for failed tests.
In the same code I have then added MSTest and NUnit tests as shown, however at this stage I have not yet built the project – take note of the unsaved changes icon in the document tab, indicating that the project is neither saved nor built as yet:
image

Run Tests After Build (Almost Continuous Testing)

Continuous Testing is the idea that as you do your work all the unit tests are constantly running in the background and giving you live feedback when there are problems in your code by highlighting where tests have failed and where your code is broken. The immediate feedback cycle makes test driven development an even faster development process since there’s no waiting around for all the tests to run.
Visual Studio has taken a step towards this ideal with the “Run Tests After Build” option as shown in the image below. Turn that setting on and as soon as you compile your code Visual Studio will run the tests automatically on a background thread so that you don’t end up with a blocked UI and can get on with coding the next thing on your list.
image
If a test failed the unit test explorer goes red and it’s obvious that there’s a problem.  The interesting thing is that in smaller projects the tests often run so fast that you don’t even notice them happening!
SNAGHTML5de845a
As a tip, once you’ve been using the test after build feature for a while you will probably want to stop the Output window from popping up every time you build so that you don’t have to keep closing it.  You can do this in the Visual Studio Options as shown
SNAGHTML5e0d612

Don’t Forget JavaScript Unit Tests!

OK, I won’t.  Make sure you have the Chutzpah test adapter Visual Studio extension installed (see above).
In a standard web project include QUnit or Jasmine in your project and then create a JavaScript file for your tests.  Once you have your tests written run them as you normally would and Chutzpah will do the tricky work of finding the tests and running them.  Here’s a screen shot of a web project with a QUnit test in it
SNAGHTML6029dd3

Conclusion

So there you have it! A brief overview of the new unit testing features in Visual Studio 11 Beta. Go and get it now and start playing with it.
Having Visual Studio automatically running your tests each time you do a build will change your development workflow for the better and keep you more focused on coding and help you stay in the mythical zone if you ever get there.
Remember that because all tests run in the background, you no longer spend time waiting for them to finish. You can have tens of thousands of tests taking minutes to run each time and you won’t even feel a delay in your development activities. Fantastic!

Nov 24, 2011

Parameters: You’re Doing It Wrong!

Having parameters for a method is perfectly fine; however, like anything, they can be used for evil. So let me give you a tip: If your code looks anything like this method signature (and I kid you not, this is a real method) then YOU’RE DOING IT WRONG!

SaveContentSetItem(ContentSetItem,String,String,Int32,Int32,Int32,Int32,DateTime,DateTime,DateTime,DateTime,
    DateTime,DateTime,DateTime,DateTime,Boolean,Boolean,Boolean,Int32,Int32,Int32,Int32,Int32,Int32,Boolean,
    Boolean,Boolean,Boolean,Boolean,Single,Boolean,Boolean,Boolean,Boolean,Boolean,Boolean,Boolean,Boolean,
    Boolean,Boolean,Boolean,Boolean,Boolean,Boolean,Boolean,Boolean,Boolean,Boolean,
    FileLocation,String,Stream,String,FileDisplayFormat,Boolean,Stream)

Please, for the love of all things good, turn off your computer right now. Pack it in a box.  Put the box in a locked safe.  Put the safe in a bunker under a mountain. Seal the bunker using 40 foot thick concrete and collapse the entrance.  Place a minefield and barbed wire around the bunker, and never EVER WRITE A LINE OF CODE AGAIN!

Sep 2, 2011

Using the Task Parallel Library to Unit Test Threading Issues

I was doing some work recently on a demo application where data was being pulled in from multiple locations and being added to a collection that was also being iterated over in the same method. Because this data was arriving on multiple threads (e.g. async network callbacks) I’d occasionally see the usual “collection was modified” error messages indicating that another thread had altered the collection while the first was iterating over it. Obvious threading bug, #FacePalm applied.

While it can be complex at times to find these kinds of errors, in this case it was fairly easy to diagnose and fix, so following good bug fix practices I took the standard approach of writing a test to prove the bug exists, fixing the code and then running the test again to prove it’s fixed.

Now, it should be noted that testing threading issues in a deterministic way is nigh on impossible, and there is no guarantee that a unit test for threading issues will genuinely prove the code bug free. However, the approach taken here was good enough to throw the threading exception each and every time I ran the test, and also to throw the exception on the build server.

Here’s the code:

[TestMethod]
public void ThreadingFun()
{
    InitializeControllerAndGroup();

    Task[] tasks = new Task[10]
                            {
                                Task.Factory.StartNew(() => MakeMove(1)),
                                Task.Factory.StartNew(() => MakeMove(2)),
                                Task.Factory.StartNew(() => MakeMove(1)),
                                Task.Factory.StartNew(() => MakeMove(2)),
                                Task.Factory.StartNew(() => MakeMove(1)),
                                Task.Factory.StartNew(() => MakeMove(2)),
                                Task.Factory.StartNew(() => MakeMove(1)),
                                Task.Factory.StartNew(() => MakeMove(2)),
                                Task.Factory.StartNew(() => MakeMove(1)),
                                Task.Factory.StartNew(() => MakeMove(2)),
                            };
    Task.WaitAll(tasks);
}
Ignore the first line, that’s just where the collection is being initialised.  Also ignore the fact that there are no Assert statements in this code.  The test passes if no threading exceptions are thrown and fails if one is (Task.WaitAll rethrows any exception from a task wrapped in an AggregateException).

The important thing here is to see how easy it is to fire off a lot of threads in a single, easy to read unit test without all the usual threading plumbing code that would litter something like this.

The way it works is that we define a set of tasks via the Task Parallel Library (part of .NET 4.0), each of which calls the code where we have our threading problem.  When Task.Factory.StartNew() is called the Task Parallel Library (TPL) queues the work to run on a thread pool thread (it doesn’t necessarily create a brand new thread) and returns control to our code straight away, along with a Task object so we can check the state of the task or wait for it to finish.  In this case we don’t care and immediately start the next task as soon as possible.

We then use the Task.WaitAll statement to wait until all the Tasks we defined are completed so that the test doesn’t complete prematurely.  Too easy.
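Since MakeMove and InitializeControllerAndGroup aren’t shown here, this is a minimal self-contained sketch of the same pattern.  MoveLog, ThreadingSketch and RunConcurrentMoves are hypothetical names made up for illustration, and the lock is just one possible fix, not necessarily the one used in the real code.  Removing the lock should let the “collection was modified” exception come back, though as noted above there is no guarantee it will show up on every run.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the code under test: a collection that is
// appended to and iterated over from several threads at once.
public class MoveLog
{
    private readonly List<int> _moves = new List<int>();
    private readonly object _gate = new object();

    // Adds a move and then iterates the collection, mirroring the
    // "add while iterating" pattern behind the original bug.
    public int MakeMove(int player)
    {
        lock (_gate) // assumption: a simple lock is the fix; remove it to let the race reappear
        {
            _moves.Add(player);
            int total = 0;
            foreach (int move in _moves)
            {
                total += move;
            }
            return total;
        }
    }

    public int Count
    {
        get { lock (_gate) { return _moves.Count; } }
    }
}

public static class ThreadingSketch
{
    // Queues taskCount concurrent workers and waits for all of them.
    // Task.WaitAll rethrows any task failure wrapped in an AggregateException.
    public static int RunConcurrentMoves(int taskCount, int movesPerTask)
    {
        var log = new MoveLog();
        var tasks = new Task[taskCount];
        for (int i = 0; i < taskCount; i++)
        {
            int player = (i % 2) + 1; // alternate between player 1 and 2
            tasks[i] = Task.Factory.StartNew(() =>
            {
                for (int m = 0; m < movesPerTask; m++)
                {
                    log.MakeMove(player);
                }
            });
        }
        Task.WaitAll(tasks);
        return log.Count;
    }
}
```

Calling ThreadingSketch.RunConcurrentMoves(10, 200) from a test method should return 2000 with the lock in place, because every move is recorded exactly once.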

Note that we could also just as easily have used Parallel.Invoke for this.  The same test using Parallel.Invoke would be something like this:

[TestMethod]
public void ParallelInvoke()
{
    InitializeControllerAndGroup();

    Parallel.Invoke(
        () => MakeMove(1),
        () => MakeMove(2),
        () => MakeMove(1),
        () => MakeMove(2),
        () => MakeMove(1),
        () => MakeMove(2),
        () => MakeMove(1),
        () => MakeMove(2),
        () => MakeMove(1),
        () => MakeMove(2)
    );
}
I personally prefer the first approach because I like the more explicit control over the task creation, though it’s obviously noisier than the Parallel.Invoke version.  Note that with Parallel.Invoke you hand over control to the TPL and it figures out how many threads it will use to run the actions you define, based on things like the number of cores available on the machine.
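If you want some say over how many threads Parallel.Invoke uses, there is an overload that takes a ParallelOptions object.  The sketch below is illustrative only: InvokeSketch and Run are made-up names, and the Thread.Sleep is a stand-in for a real call such as MakeMove.  It caps the concurrency and records the highest number of actions seen running at once.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InvokeSketch
{
    // Runs actionCount copies of a small action through Parallel.Invoke,
    // capped at maxThreads concurrent actions. Returns (completed, peak),
    // where peak is the highest number of actions observed running together.
    public static Tuple<int, int> Run(int actionCount, int maxThreads)
    {
        int completed = 0, running = 0, peak = 0;

        Action work = () =>
        {
            int now = Interlocked.Increment(ref running);

            // Record the peak concurrency with a compare-and-swap loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)))
            {
                Interlocked.CompareExchange(ref peak, now, seen);
            }

            Thread.Sleep(10); // stand-in for the real work
            Interlocked.Decrement(ref running);
            Interlocked.Increment(ref completed);
        };

        var actions = new Action[actionCount];
        for (int i = 0; i < actionCount; i++)
        {
            actions[i] = work;
        }

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };
        Parallel.Invoke(options, actions);

        return Tuple.Create(completed, peak);
    }
}
```

For a test whose whole point is to provoke a race you would normally leave the cap off, but it can be handy for checking how the code behaves on a build server with fewer cores.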

Regardless of the method you choose you can take advantage of the TPL to help you unit test your multithreaded code and make your application more resilient.

Jun 6, 2011

Anaesthetic for your #region-itis

If your code (or the rest of your team) suffers from a bad case of #region-itis then help is at hand.

There’s a lovely little Visual Studio extension over at http://teamsearchapp.com/region-tool that will automatically expand #regions, and also make those #region lines much smaller and harder to read than your normal code.  No more pressing shortcuts to expand all those blocks, less visual clutter and optionally, the ability to prevent collapsing of #regions.  What’s not to like?

Borrowing a few pictures from their site, it takes the #region afflicted code (on the left) and reveals its ugly inner truth (on the right):

Regions1 Regions2 

Ahh! That’s much better.

May 1, 2011

Oh Dear

Found via Rob Conery’s blog: A Methodology for Better Website Development

My eyes! They burn!! Make it stop!!!

It’s waterfall being touted as the best way to do web development! Seriously!? Who wrote this article? Wait a second while I look….  Oh, it’s the CEO of Ektron, the makers of Ektron CMS. A product so good that the only times I’ve heard of it have been from companies that bought it, got screwed by poor service, bad documentation, horrible support and the vendor’s inability to supply updates in a timely manner, wrote it off as a mistake, and then kicked it out to be replaced by other commercial products or solutions from that most waterfall-ish of development environments, the open source world!  This explains so much!

It seriously annoys me that people write this sort of article as if it’s the one true way and even more when a publisher like ZDNet doesn’t include a public health warning with it.

This isn’t to say agile is the only way to do web development either, not by a long shot.  Agile is a mechanism that has been shown to work well in web development and we’ve seen time and again that short development cycles and rapid, continuous releases are a path to success.  The real secret to success though? Find the best people you can, educate them in how to best communicate with you and each other, then tell them what you want and then get the hell out of their way!

Individuals and interactions over process and tools.

I think I need to take an aspirin and go have a lie down. Reading that article gave me a small aneurism.

Mar 3, 2011

Git-TFS Recent Improvements

If you like using Git for local development work but work in a team environment where TFS is in use then you’ll be glad to know that the git-tfs project has been progressing well since I last posted about it.

The best new feature is that now you can do a checkin direct from git-tfs instead of needing to shelve, and then check in via team explorer, making the whole process much, much smoother.

Here’s an example of how things now work (assuming you have already cloned the TFS repository)

image

When you run the git tfs ct (aka checkintool) command, you will see the check in dialog so you can commit the changes.  Note that this is only supported with TFS2010.

image

I’ve got a small change in my fork which will hopefully be pulled in to the main project shortly that pulls the commit messages from git and populates the comment field of the check in. UPDATE: This is now included in the main project so use spraint's version.

Once the check in completes, git-tfs automatically pulls the changeset from the remote TFS server and merges the change locally to save you having to remember to do that yourself.

image

image

Finished!  Now we can get back to our normal development flow. This is so much easier.

The advantage of having the checkin tool is that we can also associate our commit with work items in TFS as well as dealing with check in policies.  This is excellent!

If you don’t want to use the check in tool UI, then you can use the git tfs checkin command and supply the -w option to associate to a work item.  Policy failures should result in the checkin failing.

Note that for now, you will have to build git-tfs from source to get this functionality, but that shouldn’t be a problem for anyone wanting to use this tool :-)

Feb 22, 2011

Running Neo4j on Azure

For those of you who don’t know what I’m talking about, Neo4j is a graph DB, which is great for dealing with certain types of data such as relationships between people (I find it ironic that relational databases aren’t great for this) and it’s perfect for the project I’m working on at the moment.

However this project also needs to be deployable on Azure, which means if I want a graph DB for storing data, I need something running on Azure.  I’d originally looked at Sones for doing this since it was the only one around that I knew could run on Azure, but my preference was to run Neo4j instead, because I know it a little better, and because I know that the community around it is quite helpful.

The great news is that as of just recently (a week ago at time of writing) Neo4j can now run on Azure! Happy days!! and “Perfect Timing”™ for my project.  I just had to try it out.  P.S. Feel free to go ahead and read the Neo4j blog post because it has a basic explanation of how to set it up, has some interesting notes on the implementation and also talks about the roadmap ahead.

Here’s a simple step-by-step on how to get Neo4j running on your local Azure development environment:

1. Download the Neo4j software from http://neo4j.org/download/.  I grabbed the 1.3.M02 (windows) version.

2. You’ll also need to download the Neo4j Azure project, which is linked to from the Neo4j blog post.

3. I’m assuming you have an up to date Java runtime locally, so go to your program files folder and zip up the jre6 folder.  Call it something creative like jre6.zip :-)

4. If you haven’t already done so I’d also suggest you get yourself a utility for managing Azure blob storage.  I’m using the Azure Blob Studio 2011 Visual Studio Extension.

5. Spin up Visual Studio and load the Neo4j Azure solution.  Make sure the Azure project is the startup project

image

6. Next, upload both the Neo4j and JRE zip files you have to Azure blob storage.  Note that by default the Azure project looks for a neo4j container, so it’s best to create that container first:

image

7. Now it’s time to spin this thing up!  Press the infamous F5 and wait for a while as Azure gets its act in order, deploys the app and starts everything up for you.  You’ll know things are up and running when a neo4j console app appears on your desktop.  You may also get asked about allowing a Windows firewall exception, depending on your machine configuration.

Behind the scenes, the Neo4j Azure app will pull both the java runtime and neo4j zip files down from blob storage, unzip them and then call java to start things up.  As you can imagine, this can take some time.

If you have the local Compute Emulator running, then you should be able to see logging information indicating what’s happening, for example:

image

8. Once everything is up and running you should be able to start a browser up and point it to the administration interface.  On my machine that was http://localhost:5100.  You can check the port number to use by having a look at the top of the log for this:

image

You should then see something like this:

image

That chart is showing data being added to the database.  Just what we want to see!

So that’s it.  The server is up and running and it wasn’t too hard at all!  The Neo team is looking to make the whole process much simpler, but I think this is a great start and it is very welcome indeed!

If you do have problems, check the locations in the settings of the Neo4jServerHost in the Azure project to make sure they match your local values.  You might just need to adjust them to get it working.

P.S. If you have questions, then ask in the Neo4j forums.  I’m not an expert in any of this ;-)  I’m just really pleased to be able to see it running on Azure.