Monday, July 21, 2014

OMG! It's OGF! How to Gauge Code Quality

A little while ago we were having trouble figuring out a way of determining code quality. Sure, you could use metrics produced by tools like Checkstyle or unit test code coverage tools, but I've never found these metrics to tell the whole story. Technical debt and code quality are multifaceted problems that require the skills and experience of senior engineers. It's unlikely that any computer program will ever be devised that can give an accurate picture of code quality.

It's very easy to create a computer program that can find monstrously awful code. Some might call it the compiler. It's more difficult to create a computer program that can find merely mediocre code. If you're at an organization that's worried about code quality, odds are you don't have monstrously awful code to worry about.

In QA they use a metric called overall good feeling (or OGF for the acronym obsessed). The concept behind this is very simple: you just give your overall feeling as to the quality of the product as a number from 1 to 5. Five is very high confidence and one is no confidence. The reason we use this system is that we had trouble determining the quality of our products using metrics alone. You could use bug counts, regression counts and similar things to try to create an objective measure of the quality of a product, but this will never give you the full picture. Why not just ask? OGF is a great way of polling the intuition of the people who are responsible for testing the product. Why not use the same technique for measuring code quality?

Let's say we wanted to figure out the quality of the code that makes up some module; let's call it module A. First we gather the relevant developers together in a room. Next we ask them all to come up with a number between one and five (where five is excellent code quality and one is terrible code quality) that best encapsulates the code quality of the overall module. All the developers then produce their number at the same time. The best way to do this is to use a system similar to planning poker, where each developer has five cards numbered from one to five. Why not use actual playing cards? The developers first select the card that corresponds to their number and put it face down on the table. When everyone has chosen, the cards are turned over at the same time. The point of doing it this way is that you want all developers to poll their intuitions without being affected (infected?) by the views of their peers.

Of course, this number doesn't tell the whole story. It's also important to know how familiar each developer is with the piece of code. This familiarity quotient can give us insight into the developer's choice. After giving their code quality number (aka CQN), the developers rate their own familiarity with the code, again from one to five (cards aren't needed for this step).


Let's assume that we have a group of developers who have given code quality and familiarity numbers for our module A. We now plot each developer's point on a graph like the one below:



If a developer is very familiar with the code and rates the code quality very highly we would get a point on the graph like this:



If the developer is not very familiar with the code and thinks the code quality is terrible we would get a point on the graph like this:



Once the points of all the developers are graphed we can see patterns very easily. For instance, this is good code:



This, on the other hand, is bad code:



However, I expect other patterns as well. These "other" patterns indicate a lack of convergence. But why?

Code with a steep learning curve might look like this:



Graphs like the above might also indicate that the cluster of developers who wrote the code like it but no one else can make heads or tails of it. It could be that the developers have written it in an idiosyncratic style (I know! I'll parse this using Perl and a banana!) or it might simply be intrinsically complex. Either way it's going to cause problems because new developers will have a tremendously difficult time learning how to interact with the code (What? I need a banana?).

If a module is simple and easy to understand but doesn't address edge cases in the design space we might get a graph that looks like this:



This code is more dangerous because developers just coming to the code feel that it should be easy to change and modify. However, anyone who's spent time with the code will realize that it's a pain to make any change work.


Controversial code would look like this:



There are many potential reasons why this could happen. All of them are worth investigating.

The following is code that everyone is afraid to touch. I call it "Haunted House Code":



...because no one goes in there. Most likely all the developers who wrote this code have moved on. Graphs like this imply estimates will be random numbers and that much work will be done before the real difficulty of the task emerges.
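If you'd rather not eyeball the graphs, the patterns above could be roughed out in code. What follows is only a sketch under my own assumptions: the class and method names are made up, "familiar" is taken to mean a familiarity of 4 or 5, and the 3.5 and 2.5 thresholds are illustrative guesses, not anything calibrated.

```java
import java.util.List;

class OgfSummary {
    /** One developer's answers, both on the 1-to-5 scale. */
    static final class Vote {
        final int quality;
        final int familiarity;

        Vote(int quality, int familiarity) {
            this.quality = quality;
            this.familiarity = familiarity;
        }
    }

    // Hypothetical cut-off: a familiarity of 4 or 5 counts as "familiar".
    static final int FAMILIAR_AT_LEAST = 4;

    /** Mean quality among familiar (or unfamiliar) developers; NaN if that group is empty. */
    static double meanQuality(List<Vote> votes, boolean familiar) {
        return votes.stream()
                .filter(v -> (v.familiarity >= FAMILIAR_AT_LEAST) == familiar)
                .mapToInt(v -> v.quality)
                .average()
                .orElse(Double.NaN);
    }

    /** Rough label for one module's votes. The 3.5 and 2.5 thresholds are illustrative guesses. */
    static String describe(List<Vote> votes) {
        double familiar = meanQuality(votes, true);
        double unfamiliar = meanQuality(votes, false);
        if (Double.isNaN(familiar)) return "haunted house";
        if (Double.isNaN(unfamiliar)) unfamiliar = familiar; // everyone is familiar
        if (familiar >= 3.5 && unfamiliar >= 3.5) return "good";
        if (familiar <= 2.5 && unfamiliar <= 2.5) return "bad";
        if (familiar >= 3.5 && unfamiliar <= 2.5) return "steep learning curve";
        if (familiar <= 2.5 && unfamiliar >= 3.5) return "deceptively simple";
        return "mixed: worth investigating";
    }

    public static void main(String[] args) {
        List<Vote> moduleA = List.of(new Vote(5, 5), new Vote(4, 4), new Vote(1, 1), new Vote(2, 2));
        System.out.println(describe(moduleA));
    }
}
```

Averaging hides disagreement inside a group, so controversial code would land in the "mixed" bucket here. The graph is still the better tool; this just flags which modules deserve a look.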

So, in conclusion, while code metrics are very useful, I don't believe they can give a completely accurate story. I think that simply asking the developers what they think of the code quality is a valid metric. They are using the code every day, after all. They are the most qualified people to give an assessment. It's important to know where the crap is buried because it is these pieces of code that will give you problems when you try to add new features. Software development is a minefield. Think of these graphs as a mine detector.

Friday, July 18, 2014

Dependency Stack

I recently finished reading a book called Designing Games by Tynan Sylvester. It's a really great treatment of the basics of designing fun games. Near the end of the book he goes into the process of programming a game and the various organizational, team dynamic and communication issues that can arise.

http://tynansylvester.com/book/


One of the things he points out is that there's a relationship between game mechanics, level design and artwork. Essentially, a tiny tweak in a game mechanic might cause significant changes to all levels in the game, some of which are hidden. These little changes can in turn affect the artwork in a game to the point where it might need to be redone.

He says that iteration is necessary because trying to plan the game out in its entirety at the start is doomed to fail. In order to reduce the burden of iteration he suggests mapping the dependencies between the different aspects of the game. He calls this a dependency stack. It's used to figure out which things to work on now and which things to work on later.

To build a dependency stack you draw all the different aspects of the game such that a concept with a dependency on a more fundamental concept has a line going from it to that fundamental concept. The more fundamental a concept, the lower it sits in the diagram. Using this diagram it's easy to see which concepts are the most fundamental and how changes to them will cascade through the whole system. The idea is that you build the most core elements of the game first, until you have a basic game consisting of just those items. This allows you to reduce uncertainty in the interactions of those elements first before tackling other things.
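To make the idea concrete, here is a minimal sketch of a dependency stack as a data structure. This is my own illustration, not something from the book: the class name and methods are invented, and it assumes the dependencies contain no cycles.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DependencyStack {
    // concept -> the more fundamental concepts it depends on
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    void add(String concept, String... fundamentals) {
        dependsOn.put(concept, List.of(fundamentals));
        for (String f : fundamentals) dependsOn.putIfAbsent(f, List.of());
    }

    /** Height in the stack: 0 for concepts with no dependencies. Assumes no cycles. */
    int level(String concept) {
        int level = 0;
        for (String f : dependsOn.getOrDefault(concept, List.of())) {
            level = Math.max(level, level(f) + 1);
        }
        return level;
    }

    /** Concepts ordered bottom-up, the most fundamental first. */
    List<String> buildOrder() {
        List<String> order = new ArrayList<>(dependsOn.keySet());
        order.sort(Comparator.comparingInt(this::level));
        return order;
    }

    /** Everything that directly or indirectly depends on the changed concept. */
    Set<String> affectedBy(String changed) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(List.of(changed));
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
                if (e.getValue().contains(current) && affected.add(e.getKey())) {
                    pending.push(e.getKey());
                }
            }
        }
        return affected;
    }

    public static void main(String[] args) {
        DependencyStack stack = new DependencyStack();
        stack.add("artwork", "level design");
        stack.add("level design", "game mechanics");
        System.out.println(stack.buildOrder());
        System.out.println(stack.affectedBy("game mechanics"));
    }
}
```

`buildOrder` answers "what should we work on first?" and `affectedBy` answers "what might need redoing if this changes?", which are the two questions the diagram is meant to answer.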

I like this idea of a dependency stack. At Intelerad we've used similar techniques to map the relationships between various projects we have in the pipeline. Many of these projects have common core components. While every developer may carry around one of these diagrams in their head, it's very useful to put it down on paper because then it's easier to communicate to the people who are making decisions about which project to green-light next.

Intelerad has been around a while and has a large amount of infrastructure in its software. Adding a new feature can sometimes cause a cascading series of changes that go all the way back to a central piece of infrastructure. Changes to that central piece of infrastructure can in turn cause a cascade of changes up to the various pieces that depend on it. I suspect it would be possible to build a dependency stack that explains the dependency relationships between the different pieces of our software. Using this dependency stack we could then demonstrate the risk that a change to one piece causes a cascading series of changes in the pieces that depend on it, not to mention which other teams are likely to be brought in. When combined with a list of design deficiencies (design level technical debts) it could prove even more illuminating.

Thursday, July 17, 2014

Being a Maven maven

I spent the morning trying to use the Quasar library, but it mostly came down to reading a lot about Maven in order to get all the library dependencies required by Quasar.

Maven is a Java build tool similar to Ant, Make or SCons. Unlike most build tools, Maven does dependency resolution of third-party libraries. If your project depends on other libraries, Maven will resolve and download them with logic similar to tools like Yum, meaning you don't need to go and find them yourself (or check your library dependencies into version control). Maven will read the list of dependencies for your project and recursively download exactly the right libraries for you from its fancy, federated package management repository system.
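The "recursive" part is the important bit. The sketch below is a toy model of it and does not use Maven's real API: the in-memory index stands in for a repository, the artifact names are made up, and it ignores versions, scopes and conflict handling, all of which a real resolver has to deal with.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ToyResolver {
    /** Hypothetical stand-in for a repository index: artifact -> its direct dependencies. */
    private final Map<String, List<String>> index;

    ToyResolver(Map<String, List<String>> index) {
        this.index = index;
    }

    /** Every artifact needed by root, direct or transitive, each listed once. */
    Set<String> resolve(String root) {
        Set<String> resolved = new LinkedHashSet<>();
        collect(root, resolved);
        resolved.remove(root);
        return resolved;
    }

    private void collect(String artifact, Set<String> seen) {
        if (!seen.add(artifact)) return; // already visited; also guards against cycles
        for (String dep : index.getOrDefault(artifact, List.of())) {
            collect(dep, seen);
        }
    }

    public static void main(String[] args) {
        ToyResolver resolver = new ToyResolver(Map.of(
                "my-app", List.of("lib-a", "lib-b"),
                "lib-a", List.of("lib-c"),
                "lib-b", List.of("lib-c")));
        System.out.println(resolver.resolve("my-app"));
    }
}
```

Note that `my-app` only declares two dependencies but ends up needing three libraries. That gap between what you declare and what you actually pull in is exactly what makes resolving things by hand painful.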

This is both wonderful and frustrating. It's wonderful because it's pretty much how things should work. It's frustrating because third-party libraries that use Maven for *their* dependent libraries force you to either resolve all the dependencies yourself or use Maven.

Resolving all the dependencies yourself is a huge pain. Maven makes resolving dependencies easier, so it encourages library authors to add dependencies on third-party libraries. This makes it more likely that there are a greater number of dependencies to resolve. It also makes tracking down each dependency's project page difficult, because no one bothered to document where it is. It's all in Maven, so why not just use Maven?

Switching your project over to Maven can be difficult too. Not only do you have to learn a new tool, but Maven is very anal about exactly how it expects your project to be set up and behave. Project configuration for IDEs should be generated by Maven. Code files should be in such and such a place. Unit test files should be in this other such and such a place. Resources should be in this other other such and such a place. And so on. Dependencies should be in Maven, version controlled by Maven, installable by Maven. Maven is in line with best practices and makes a lot of sense. The problem is the universe does not revolve around Maven. Legacy projects cannot always be restructured to fit the Maven world order just because of Maven!

Maven maven maven? Maven maven! Maven maven maven maven...!?





With my own projects I've historically included the dependencies in the source repository. Dependency resolution is a finicky thing and a source of breakage. I don't want whoever is building or using my project to have to worry about it. Hard drive concerns be damned. Programmer time is more precious than hard disk space so just give me everything I need in the project checkout.





A few people online have been suggesting using Maven as a dependency resolution tool and just ignoring the build tool aspect. I can see how this makes sense, but convincing an organization to use Maven for dependencies and a different build tool for building, when Maven does both, is going to be hard. There are simply so many opportunities for confusion about which tool to use when and why. Oh woe is me..

And now back to working with Quasar..

Sunday, November 11, 2012

Releasing on a Fixed Schedule

This is a continuation of Firefox and version Numbering which I wrote a year ago and promptly forgot about. This draft was dated 2/20/11. I've been making edits to it the last two days. Hopefully it still makes some sort of sense.

Second system syndrome is a software development pathology that often happens when a piece of software is developed in two phases. The first release of the software includes everything that's relatively easy and quick to implement. The second version includes the rest. Unfortunately some of the features in "the rest" turn out to be unfeasible. This can lead to the second version never being completed. Time goes by, the seasons change, but the software release is always a year away.

The best way to develop software is to admit from the start that there will be many iterations and that not all features will make any particular release. First you make a priority list consisting of the features you want in order of importance. You then implement this list, making sure to revise it as new information becomes available. You then release your software on a fixed schedule.

The fixed schedule is useful for making sure you ship something. It changes the question from "What are we going to do?" to "What can we do by the next release?". This is subtly important because in software there are things that will take an extremely long time to do but nothing is really impossible. Estimating how long something will take becomes harder the longer the time span and the more complex the problem. If a new feature is really complex then you can find yourself on a project that will take years to complete. What is more, complexity feeds on itself, causing schedule slips and increasing the amount of confusion about how to fix problems. Software can become a hellish tar-pit of intense pressure and slipping schedules.

Making the commitment to release software on a fixed schedule turns the problem on its head. Instead of shipping a product late because one single feature is horribly behind schedule you instead focus on trying to accomplish the most important things you can before the next release. This essentially guarantees that you'll always have the most critical features in any given amount of development time. If a feature is behind schedule it will miss the release date and get automatically pushed back to the next one.
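The rule is simple enough to write down. Here's a small sketch of the release cut, with made-up names: every feature carries a priority and a flag saying whether it was ready by the cutoff, and the release is simply whatever is ready.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ReleaseCut {
    static final class Feature {
        final String name;
        final int priority;   // 1 is most important
        final boolean ready;  // finished and tested by the cutoff date

        Feature(String name, int priority, boolean ready) {
            this.name = name;
            this.priority = priority;
            this.ready = ready;
        }
    }

    /** Ready features, most important first: this is what ships. */
    static List<String> shipping(List<Feature> backlog) {
        return names(backlog, true);
    }

    /** Features that missed the cutoff and move to the next release. */
    static List<String> carriedOver(List<Feature> backlog) {
        return names(backlog, false);
    }

    private static List<String> names(List<Feature> backlog, boolean ready) {
        List<Feature> picked = new ArrayList<>();
        for (Feature f : backlog) {
            if (f.ready == ready) picked.add(f);
        }
        picked.sort(Comparator.comparingInt(f -> f.priority));
        List<String> result = new ArrayList<>();
        for (Feature f : picked) result.add(f.name);
        return result;
    }

    public static void main(String[] args) {
        List<Feature> backlog = List.of(
                new Feature("export", 2, true),
                new Feature("sync", 1, false),
                new Feature("search", 3, true));
        System.out.println("Shipping: " + shipping(backlog));
        System.out.println("Next release: " + carriedOver(backlog));
    }
}
```

Notice there is no branch for "important but late". The top-priority feature in the example still waits for the next release, and the date never moves.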



If you release on a fixed schedule, make sure that you're aggressive about taking features that aren't ready out of the release. There needs to be a clear and enforced policy that if a feature isn't ready on time then it's not going to be in the release. This means that developers have to ensure the feature they are working on doesn't break the codebase. Invasive new features need to be written in such a way that they can be turned off or otherwise disabled if need be. Frankly, clearly separating new development is a good practice in all cases since it helps QA test new features or bug fixes without getting tangled up in some other new feature. Modern revision control tools like Git and Mercurial can be a help here with their branching features.
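One common way to make a feature "turn-off-able" is a feature flag. The sketch below is a bare-bones illustration with hypothetical names (`FeatureFlags`, `new-search`); a real project would more likely load the flags from configuration than set them in code.

```java
import java.util.HashSet;
import java.util.Set;

class FeatureFlags {
    private final Set<String> enabled = new HashSet<>();

    void enable(String feature) {
        enabled.add(feature);
    }

    void disable(String feature) {
        enabled.remove(feature);
    }

    boolean isEnabled(String feature) {
        return enabled.contains(feature);
    }

    /** Hypothetical call site: the new code path only runs when its flag is on. */
    static String search(FeatureFlags flags, String query) {
        if (flags.isEnabled("new-search")) {
            return "new search for " + query;
        }
        return "old search for " + query;
    }

    public static void main(String[] args) {
        FeatureFlags flags = new FeatureFlags();
        System.out.println(search(flags, "cats")); // flag is off, so the old path runs
        flags.enable("new-search");
        System.out.println(search(flags, "cats")); // flag is on, so the new path runs
    }
}
```

If the new search isn't ready by the cutoff, the release ships with the flag off and nobody has to rip code out at the last minute.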

The most common complaint when trying to implement rapid releases is that some feature is vitally important and it's only a little behind and so the release should be delayed just a little bit. This is not only the thin end of the wedge but things are rarely only "a little delayed". What I've seen is that the release gets delayed to include this "vital" feature but several others are added in since there's now time to do them. Next *those* features get delayed slightly so more features are added to fill in the gap. Then those get delayed.. It can turn into Zeno's project management. The release is always just a little bit in the future and the goal of a release every month turns into a single release in one year. What you're actually doing is you're delaying all features for a single feature. It's not a nice thing to do to all your customers whose features are actually shippable.

Being strict about the cutoff date stops feature creep, it allows customers to get features sooner and it increases software quality.

My favourite benefit is that development teams no longer rush to meet unrealistic deadlines by skimping on testing. If development is running late they can take their time and do it properly because their feature can hop on the next release. It also removes the temptation to commit the sins of self-deception: passing off unfinished software as merely buggy, for example. The effect of this one is an endless QA cycle as developers use QA as a sort of to-do list generator. "I'm done! And on time too! Oh, there's a bug? Ok I'll fix it. At least I was officially 'done' on time." Yeah right.

If you release often enough, say every four months, you don't need to create maintenance releases for existing branches. Any bugs can be fixed on the trunk because customers will have that code in good time. Additionally, it's less likely that you will introduce a regression because less has changed since the last release. If there's a really critical problem you can still ship a maintenance release but it's rare you'll need to do this in practice.

When it comes to testing, adding automated unit tests and regression tests becomes more important in rapid release software. Since the codebase is always changing, it's important not to break things every release. Automated unit tests and regression tests are a best practice for avoiding unmaintainable software. Rapid releases just make the consequences of unmaintainability more dire.

There are overheads associated with creating a new release of the software. Most of these should be automated anyway. Those that can't be, like documentation updates and manual QA feature testing, should be easier with short release cycles since less has changed since the last version of the software. This means less to update and test.

https://officeimg.vo.msecnd.net/en-us/images/MH900443454.jpg 
Version 4 is out? Yeah, whatever, everyone knows that version 3.4 is the best. Nice laptop by the way.


Another potentially annoying aspect of releasing on a schedule is that it's hard to make a fuss about the new version of the software because major features are done incrementally. When I used a rapid release cycle on the Myster project this turned out not to be a problem. What happened was that we were releasing so often that people would visit the website regularly to see if there had been a new release. I released on a monthly basis and the public realized that Myster was constantly being updated and improved. A majority of our user base upgraded every time we released a new version. And we didn't have an auto-update system! Having a rapid release cycle communicates to customers that you care about issues and new features and can address them quickly. It also creates a constant background buzz. We found ourselves on the front pages of many a news web site every time we released, which was once a month.

Releasing often and on a fixed schedule does mean that your marketing team has to think of the product more as a continuous stream rather than a single specific version. It doesn't stop you from selling the features of the new version but it does mean you should direct people to the latest version and not develop a brand around a specific release. If anything, I'd consider creating a brand around a specific version of a piece of software a marketing anti-pattern. It means you have to compete against your own software's older version every time you release. How silly is that?

Remember, only you can help prevent second system syndrome.

Saturday, November 10, 2012

Balancing Starcraft II - Making an E-Sport

It's no big secret that I'm a big Starcraft II fan. Apart from Portal and the odd session of Angry Birds it's the only video game I play. For those not in the know, Starcraft II is a real time strategy game. Think of it as competitive SimCity building but with marines. The best way to play Starcraft II is over the network with friends. However, it's really hard to design a real time strategy game that's balanced and fun. In fact it's taken Blizzard 5 tries to get up to this point.


Blizzard's first attempt was the original Warcraft way back in 1994. It featured two races: orcs and humans. You could play either side and each side had different units and abilities. Well, by "different" I mean mostly different graphics. The actual abilities of the two sides were really quite similar. The game units also varied vastly in power, meaning that games always degenerated into a rush to some big unit and then producing as many of it as possible. It was a fun game but a bit simplistic.


Blizzard's next attempt was Warcraft II: Tides of Darkness. This game was also a huge amount of fun. In all the initial head-to-head games I played it felt really balanced. Unfortunately there was this unbalancing orc spell called blood-lust that would allow you to obliterate your human opponent. In the end, my friends and I reverted to simply playing against the computer on custom maps, which was also an insane amount of fun.


The original Starcraft was a big win for head to head play. The expansion pack called "Brood War" was even better. This was the game that created the e-sport phenomenon in South Korea. Not only were all 3 sides (!) balanced but the game had depth to it. You could play forever and keep getting better. There were two big problems though. Finding a person to play against online was hit and miss. Because there was so much depth you'd either play against someone who was clearly better or against someone who was clearly worse. There was also the problem that the super balanced head to head play meant that playing against a real person was very, very intense. So intense, in fact, that we often just played against the computer. That could be intense but not to the point where you had to take a break after each game :-O.


Warcraft III came next and it was a serious attempt to create a good head to head playing experience. The biggest improvement over Starcraft was the opponent matching system. The system would keep track of who won against whom and try to automatically match you with someone of your skill level.

Warcraft III was also less intense than Starcraft. Warcraft III was built so you focused more on managing your troops and less on building and maintaining your bases. It focused more on the generaling and less on the SimCity building aspect. Blizzard's idea was that the troop control was the fun part and base building a distraction. It isn't. The real fun in Starcraft, and the earlier games, was managing both the SimCity aspect and the troops at the same time! To be honest it's actually more multidimensional than that. You have to balance your technology and upgrades with the quantity and composition of your army, while balancing troop production with how quickly your mineral production expands, and then balancing that with how many troop production buildings, and of what type, you want to build. Oh, yes, and on top of that you have to be the general in the field and tell your troops what to do.


Thanks to the enormous success of Starcraft and its use in e-sports, Blizzard, for the first time, made a game that was focused primarily on making that experience awesome. They also took the opponent-matching ladder system from Warcraft III, made a huge number of improvements and stuck it into Starcraft II. The whole package is a work of art.

So that brings me to the point of this post. I have recently come across a talk by one of the people involved with designing the Starcraft II online gaming experience. In this talk he relates how difficult it was to build a game that will work as an eSport while looking good and being fun to play for multiple levels of players. I found it fascinating.

Thursday, November 8, 2012

Light vs heavy mutex

I discovered a nice post at Preshing on Programming that discusses light vs heavy mutexes. Mutexes are what allow you to create critical sections which, in turn, let you write programs that run correctly on multiple processors. That got me thinking about how Java's "synchronized" keyword is implemented. Using synchronized is the default way of creating critical sections in Java. It used to be really slow but has gotten much, much faster recently. Apparently, synchronized is implemented using a combination of light and heavy locks as well as other techniques that make it even lighter than a light mutex.



Even though so much work has been done on it, I've still had performance issues with it. Using things like AtomicInteger with its compare-and-set is still much faster, assuming that you can use it (it's not always possible).
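For anyone who hasn't seen the two styles side by side, here's a minimal sketch of the same counter written both ways. The class names are mine. The second version spells out the retry loop around `AtomicInteger.compareAndSet` for illustration; in real code `incrementAndGet()` does the same job. This shows the shape of the code only and makes no claim about which is faster on your machine.

```java
import java.util.concurrent.atomic.AtomicInteger;

class Counters {
    /** Critical section guarded by the object's monitor. */
    static final class LockedCounter {
        private int value;

        synchronized void increment() {
            value++;
        }

        synchronized int get() {
            return value;
        }
    }

    /** Lock-free version: retry until no other thread changed the value in between. */
    static final class CasCounter {
        private final AtomicInteger value = new AtomicInteger();

        void increment() {
            int current;
            do {
                current = value.get();
            } while (!value.compareAndSet(current, current + 1));
        }

        int get() {
            return value.get();
        }
    }

    /** Runs the task timesPerThread times on each of the given number of threads, then waits. */
    static void runConcurrently(Runnable task, int threads, int timesPerThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < timesPerThread; n++) task.run();
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        LockedCounter locked = new LockedCounter();
        CasCounter cas = new CasCounter();
        runConcurrently(locked::increment, 4, 100_000);
        runConcurrently(cas::increment, 4, 100_000);
        System.out.println(locked.get() + " " + cas.get());
    }
}
```

The "it's not always possible" caveat shows up as soon as the critical section touches more than one variable: compare-and-set protects a single value, while a synchronized block can protect any amount of state.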

Thursday, June 7, 2012

2012 Student Protests Tuition Fee Graphs

It's summer in Montreal and I'm surrounded by student protests. It's impressive how the issue of university tuition costs has divided the city.

The students are protesting the government's plan to increase tuition by $325 a year until 2017. Here's a graph of the historical tuition costs in Quebec. Included are the inflation adjusted tuition costs in 2012 dollars. You can see that if the government plan goes through, Quebec will see the highest tuition costs in over 40 years. That's probably what's got the students upset.



Inflation adjusted values calculated using the Bank of Canada's inflation calculator:
http://www.bankofcanada.ca/rates/related/inflation-calculator/

Projected inflation is 2% per year from 2013 - 2017.
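For the projected years, the arithmetic looks roughly like the sketch below. Two things are my assumptions rather than facts from the sources: that the first $325 increase lands in 2013 on top of the 2012 figure, and that the flat 2% is compounded yearly from 2012. The class name and the starting tuition used in `main` are placeholders, not the real figure.

```java
class TuitionProjection {
    static final double YEARLY_INCREASE = 325.0;     // dollars per year, from the government plan
    static final double PROJECTED_INFLATION = 0.02;  // 2% per year for 2013 to 2017
    static final int BASE_YEAR = 2012;

    /** Nominal tuition in the given year, assuming the first increase applies in 2013. */
    static double nominal(double tuitionIn2012, int year) {
        return tuitionIn2012 + YEARLY_INCREASE * (year - BASE_YEAR);
    }

    /** Converts a future nominal amount into 2012 dollars by compounding the projected inflation. */
    static double in2012Dollars(double nominalAmount, int year) {
        return nominalAmount / Math.pow(1.0 + PROJECTED_INFLATION, year - BASE_YEAR);
    }

    public static void main(String[] args) {
        double placeholderTuition = 2000.0; // hypothetical starting value, not the actual 2012 tuition
        for (int year = 2013; year <= 2017; year++) {
            double nominal = nominal(placeholderTuition, year);
            System.out.printf("%d: %.2f nominal, %.2f in 2012 dollars%n",
                    year, nominal, in2012Dollars(nominal, year));
        }
    }
}
```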

Historical tuition fees taken from here:
http://www.globalnews.ca/quebec+tuition+fee+timeline/6442619736/story.html