Sunday, December 30, 2007

Scalability of software product teams

I'm curious whether, with all the software development practices being flung around these days, any thought has been given to how to scale up software development.

I've often heard it said that small teams are better than large teams, but what happens when you have a large application, or perhaps a large suite of applications that are all supposed to work together? How do you scale up development in such a way that you don't end up wasting programming resources?

Fred Brooks says that a large programming team won't work as well as a small programming team. The thing is, in some sense, every day programmers work on huge projects even if all they do is write a small, 100 line Python program. That Python script will use Python (a large project), which relies on an operating system (a large project) made up of large components, etc... We are always leveraging someone else's code on someone else's project.

Writing library or framework code is much harder than writing application code. The trick is: how much more effort does it take? Especially if the library is only being used in house? How big should development teams get before the project is split into two groups, or three? If we say that teams should be of about 5 people, how many teams should we have before we create a team responsible for tracking down and eliminating duplicate code - by turning it into an inter-team library?

I haven't seen anything that attempts to tackle this.

There's another question.

Java's API and Python's API both have really nice documentation. I would expect that any teams that try to work together would need at least this amount of documentation. Do they need anything else? I mean, Python's and Java's API documentation is all I use.

If joint design is being done, that is, an interface is being negotiated, what is the best way of doing this? Sure, if you only have 2 teams negotiating a handful of interfaces this probably isn't a problem, but what if you have 3? 4? 5? 10? 50? If a project is made up of many teams, is it a good idea to have a meta-team responsible for maintaining the large project's design integrity?

I don't know. I haven't seen anything written about this. It's important, though. Many software projects are huge, and right now most best practices are optimized for smaller teams and small software projects. Imagine trying to do naive agile programming with 50 teams of 5. Your project would be a mess.

..now what?

Friday, December 21, 2007

Sleeeppppp zzzz

I've been reading a book about people who started up companies. These guys are crazy. The hours they work are insane.

There's a story about someone who would sleep for 4 hours every 2 days. Another guy would work for 4 days solid until he just fell asleep. That's pretty gosh darn bonkers.

I am unusual in that I appear to need a couple more hours of sleep than the average person. I usually like between 9 and 10 hours of sleep. Usually it's 10 hours and it can be more if I'm learning something (taking a course, going to a conference for example).

The "average" amount of sleep per night is 8 hours. There are some people who like more and some people who like less. Apparently the typical range is between 6 and 10 hours.

I'm not sure how that works. I start to go a little crazy if I have only 8 hours of sleep.

I've heard that attention deficit disorder might have a sleep deprivation angle to it. Given that I doubt many people are getting 8 hours of sleep, many people, if not just about everyone, may be walking around having slept too little. That's not good.

Humm.. better not become a member of that group. I'm going to sleep now.

Thursday, December 20, 2007

Programmer's theater group

I belong to an amateur theater group. Every year we put on two large shows. These shows take months of preparation. Not only do the actors need to learn their lines and go on stage, but there's a huge amount of behind the scenes work to do as well.

There's things like:
- Choosing the script
- Directing
- Renting the hall for the play
- Renting the space for the rehearsals
- Ticket selling
- Costumes
- Designing and building the set
- Props
- The staff running front-of-house during the productions
- Advertising

All in all even with a cast of about 15 people, we end up using more than 30. All of this is done on a volunteer basis. All done in people's spare time.

It takes a lot of time. As an actor, I spend 5 hours a week for the first two months at group rehearsals, then 9 hours a week for the last month, culminating in about 20 hours during the week of the production. Generally, people also tend to spend hours on their own learning their lines as well.

The more I think about it the more I'm amazed at how all this works. I'm curious if anyone is trying to do this sort of thing with a software project.

The idea would be to get about 5 or so programmers who are interested in doing a project and aren't particularly picky about what the project is. Then brainstorm on ideas until there's one that stands out, then code it, setting people up as project leads, designers, etc... as well as grabbing others for things like building the website, registering the finished project with websites and sending out press releases.

With 5 people spending about 5 hours a week or more on the project I don't see why we can't have something interesting in 6 months or so.

The idea would be to set it up as an agile style process. Meet once a week for a SCRUM type status. Have a "director" or lead to decide the high level direction and focus of the project.. and guarantee some hours of availability for code pairing.

The goal would be to set some fixed time span (this is important) and a goal, and try to ship a workable solution to the goal, preferably as an open source project. The timespan would probably be about 5 to 6 months in total with about 4 months of coding/design time.. the other 2 months would be just deciding which problem to tackle by way of looking at problems/possible programs/possible new features that can be written in such a short time. The idea here being that if you're going to spend the next few months tackling a problem you might as well think hard about which problem to tackle.

Given what I've seen with the theater group, some hours of inter-team interaction together would be needed for social reasons and as a good motivator. This is why I would say that it's important for team members to set aside some time at which all team members would be working on the codebase at the same time. If everyone has a laptop it could all be at the same location too. It's always great coding in an environment where you can bounce ideas off each other... not to mention things like peer review too.

The project would proceed (Look Phil! Two "e"s!) in phases:

- Meet once a week for a while and each week present a possible project as a problem or need to fulfill and a goal for the project. After some time recap all the projects and vote on which one is the best. The project must have a team lead/"director" and must have enough programmers who want to work on it. New features to existing projects are allowed.

- Build a high level design and mock up for how the program should be built. This is mostly up to the director to organize. They should pick someone to help design the project. During this phase, any part of the project that may not be feasible should be investigated up until the point where everyone is convinced it will work. Brainstorming sessions should be held twice a week on invitation of the project lead.

- The director chooses who should work on which section of the problem and works with those people on explaining what the behavior should be. Teams are made. High level design of these modules is done and code is started. Two sessions per week, ~2.5 hours a week.

- After two months or so we start to attempt to join all the pieces together and run the project, looking for bugs and behaviors that are not desirable. The schedule adds a 4 hour session to the existing 2.5 hour sessions.

- During the last week, the schedule accelerates to bug fixing every day of the week and work is wrapped up. Over the week-end the program is compiled and uploaded, or if the project is a website it goes live and links are submitted to search engines, etc..


So currently I'm curious to try this out. I've currently got two potential prospects. We've been brainstorming potential ideas. I'm curious to know if we can get something out of it.

Anyhow.. food for thought. You want to audition for the next project? You know how to reach me. :-)

Wednesday, December 19, 2007

The dangers of reducing coupling.

This is a continuation of part 1

Once programmers discover the joys of sectioning off code and reduced coupling there's a tendency to go a bit too far. (This is all from a Java programmer's perspective.)

One of the common anti-patterns people fall into is to try and manage their dependencies by tracking which classes/packages know about which other classes/packages. The idea is that a class should only know about certain other classes. As a result it should be possible to group related classes into a package and then use a tool to automatically generate a map of dependencies between packages. This map can then be used to figure out where classes are making dependencies and destroy stupid ones.

There are a few problems with this:

1) It gives a false sense of security because two classes can depend on one another without having a compile time dependency. The most common problem I've seen is some sort of complex event-driven system that ends up getting tied in knots because the code was written using this event-driven system partly as an attempt to avoid compile time dependencies, with the goal of retaining the positive aspects of having no compile time dependencies.
2) Compile time dependencies are not run time dependencies. Even if you consciously try and avoid falling into the trap of turning your compile time dependencies into run time dependencies, you will still fail because there's often no way of expressing the run time behavior of a system with a static, compile time dependency map. In the worst case, the attempt to do so will put limits on what sorts of patterns you can use while coding in order to try and keep the static dependency graph matching the actual real graph.

Essentially these two reasons boil down to 1) it won't work in practice and 2) even if you could make it work in practice it still wouldn't be a good idea.

Keeping your inter-package dependencies clear and clean is absolutely a good idea. It is not, however, a panacea. It can't solve world hunger and it won't bring anyone back from the dead.

The thing with managing compile time dependencies is that it works up to a point. That point is the limit of what the compiler understands about how your program is put together.

What you should be trying to do is manage the coupling of your code. A dependency is a hint that there's some sort of coupling. It could be high or low but the hint is there's a coupling. If the compiler doesn't show a link that doesn't mean there's no coupling, it just means there's no compile time coupling. To be more accurate, it means there's no compile time dependency. A compile time dependency can be thought of as a form of coupling.
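To make that concrete, here is a minimal sketch of two classes that never mention each other at compile time yet are clearly coupled. Everything in it (`EventBus`, `Importer`, `Indexer`, the `"file.imported"` topic) is a made-up name for illustration, not code from any real project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical, minimal event bus: listeners are looked up by a plain string.
class EventBus {
    private final Map<String, List<Consumer<Object>>> listeners = new HashMap<>();

    void subscribe(String topic, Consumer<Object> listener) {
        listeners.computeIfAbsent(topic, k -> new ArrayList<>()).add(listener);
    }

    // Delivers the payload and returns how many listeners received it.
    int publish(String topic, Object payload) {
        List<Consumer<Object>> targets = listeners.getOrDefault(topic, new ArrayList<>());
        for (Consumer<Object> target : targets) {
            target.accept(payload);
        }
        return targets.size();
    }
}

// Importer only knows about EventBus, never about Indexer.
class Importer {
    private final EventBus bus;

    Importer(EventBus bus) {
        this.bus = bus;
    }

    int importFile(String name) {
        return bus.publish("file.imported", name);
    }
}

// Indexer only knows about EventBus, never about Importer.
class Indexer {
    final List<String> indexed = new ArrayList<>();

    Indexer(EventBus bus) {
        // The topic string and the payload type form an unwritten contract
        // with Importer that a compile time dependency map cannot show.
        bus.subscribe("file.imported", payload -> indexed.add((String) payload));
    }
}

class HiddenCouplingDemo {
    public static void main(String[] args) {
        EventBus bus = new EventBus();
        Indexer indexer = new Indexer(bus);
        Importer importer = new Importer(bus);
        importer.importFile("report.txt");
        System.out.println(indexer.indexed);
    }
}
```

Rename the topic in only one of the two classes, or publish something other than a String, and everything still compiles. The breakage only shows up when the program runs.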

Ok, so let's say we're a developer and we've seen the light and now we know that coupling is bad. Are there any other ways to screw this up? Yep. Trying to reduce coupling to zero.

Reducing coupling between components to 0 can't work. I've seen people try to do this by removing as many constraints as possible.. for example, removing compile time checks by using something like a Map or events or something. Don't do this. There are languages out there that don't have compilers. Ask programmers in those languages if they experience problems with things being too coupled.
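Here is a small sketch of what trading compile time checks for a Map tends to look like. The class names and the "width"/"height" keys are hypothetical, chosen only to show the idea.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: the same calculation exposed two ways.
class TypedArea {
    // The compiler checks that callers pass exactly two ints.
    static int area(int width, int height) {
        return width * height;
    }
}

class MapArea {
    // The coupling is still here: callers must know the key names and the
    // value types. It has only been hidden from the compiler.
    static int area(Map<String, Object> args) {
        int width = (Integer) args.get("width");
        int height = (Integer) args.get("height");
        return width * height;
    }
}

class MapCouplingDemo {
    public static void main(String[] args) {
        Map<String, Object> request = new HashMap<>();
        request.put("width", 4);
        request.put("hieght", 3); // misspelled key, compiles without complaint

        System.out.println(TypedArea.area(4, 3));
        try {
            MapArea.area(request);
        } catch (NullPointerException e) {
            // Unboxing the missing "height" value fails only at run time.
            System.out.println("failed at run time: missing key");
        }
    }
}
```

The Map version has no fewer constraints than the typed one. It just finds out about violations later.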

Another favorite is to keep chopping up code way past the point of sanity. The way to replicate this at home is to take a reasonably well written program and try to make every line of it into a framework. After a few hours you'll wind up with a dense soup of spaghetti.

This happens because you can't remove coupling. If you try to remove coupling by splitting things into incredibly tiny pieces you end up with a program with more tightly coupled components and more coupling related issues and more complexity than ever.

Before I continue I'd like to introduce the idea of cohesion. Cohesion represents the idea that some things in life are inherently* coupled together tightly. A real world example of this is a table leg. Every molecule in that table leg can be viewed as a separate entity. However, it makes sense to manipulate the table leg as a whole, so we don't consider the fact it's made up of molecules which are made up of atoms. Sure it's a lie, but it's a convenient lie that makes the world easier to understand.

* however the heck you spell that.

The good programmers recognize a cohesive object when they see one.

Using this approach of building sections of code with high internal cohesion and low external coupling you can build some fairly amazing things.

It works in ANY language.

Oh, you can even use the trick recursively. Build an object out of some cohesive properties, then use a collection of relatively coupled objects to build a meta-object, and so on. As such you can build mind-bendingly complex things.
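Borrowing the table leg from earlier, a rough sketch of "high internal cohesion, low external coupling, applied recursively" might look like this. The `Leg` and `Table` classes, and the assumption that a load is shared evenly between the legs, are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Cohesive unit: everything about a leg's internals stays inside Leg.
class Leg {
    private final double lengthCm;
    private final double maxLoadKg;

    Leg(double lengthCm, double maxLoadKg) {
        this.lengthCm = lengthCm;
        this.maxLoadKg = maxLoadKg;
    }

    double lengthCm() {
        return lengthCm;
    }

    boolean canCarry(double loadKg) {
        return loadKg <= maxLoadKg;
    }
}

// Meta-object: built from legs, but only through their small public surface.
class Table {
    private final List<Leg> legs = new ArrayList<>();

    Table addLeg(Leg leg) {
        legs.add(leg);
        return this;
    }

    // A table is level when every leg has the same length.
    boolean isLevel() {
        if (legs.isEmpty()) {
            return false;
        }
        double first = legs.get(0).lengthCm();
        for (Leg leg : legs) {
            if (Math.abs(leg.lengthCm() - first) > 1e-9) {
                return false;
            }
        }
        return true;
    }

    // Assumption for the sketch: the load is shared evenly between the legs.
    boolean canCarry(double loadKg) {
        if (legs.isEmpty()) {
            return false;
        }
        double perLeg = loadKg / legs.size();
        for (Leg leg : legs) {
            if (!leg.canCarry(perLeg)) {
                return false;
            }
        }
        return true;
    }
}

class CohesionDemo {
    public static void main(String[] args) {
        Table table = new Table();
        for (int i = 0; i < 4; i++) {
            table.addLeg(new Leg(70.0, 30.0));
        }
        System.out.println(table.isLevel());
        System.out.println(table.canCarry(100.0));
    }
}
```

Code that uses `Table` never needs to know a `Leg` exists, in the same way that code using `Leg` never needs to know what a leg is made of.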

Every time you use Swing or SOAP or RSS to write an application you're building a sort of meta-object that includes libraries with high internal cohesion. TCP/IP, XML, Swing objects like JTable (shudder)..

Ok, so how do we do this? Well, the easy way is to do test driven development. I'm not sure why but test driven development seems to help programmers make programs that have objects with higher internal cohesion. I have my theories:

1) It actually forces programmers to think first - to design*.
2) Writing unit tests is easier if you have objects with simple interfaces that aren't dependent on a myriad of things.
3) Writing objects with simple interfaces also means you want to clump related functionality into one object so you don't drive yourself insane writing tests for millions of little, tiny objects.

* Design is the "D" word. Don't say it at any agile software conferences or you'll spend the next half an hour explaining yourself.
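As a rough illustration of theory 2, here is a sketch of an object whose only dependency is one tiny interface, which is what makes it trivial to test. `Clock` and `Greeter` are hypothetical names, and the hand-rolled `check` method stands in for a real test framework such as JUnit.

```java
// Hypothetical seam: the only thing Greeter needs from the outside world.
interface Clock {
    int hourOfDay();
}

class Greeter {
    private final Clock clock;

    Greeter(Clock clock) {
        this.clock = clock;
    }

    String greet(String name) {
        String prefix = clock.hourOfDay() < 12 ? "Good morning, " : "Good afternoon, ";
        return prefix + name;
    }
}

class GreeterTest {
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // Each test supplies its own fixed clock instead of the system time.
        Greeter morning = new Greeter(() -> 9);
        Greeter afternoon = new Greeter(() -> 15);

        check(morning.greet("Ada").equals("Good morning, Ada"), "morning greeting");
        check(afternoon.greet("Ada").equals("Good afternoon, Ada"), "afternoon greeting");
        System.out.println("all checks passed");
    }
}
```

If `Greeter` read the system time directly, the test would need to control the machine's clock. Writing the test first pushes you toward the narrow interface.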

I suspect that taken together these things can be responsible for the worst hyperbole I heard while at the SD2007 best practices conference. I quote (more or less):

"Test driven development is the silver bullet Fred Brooks said didn't exist."

sigh...

Low coupling, high cohesion. It's the mantra of good software.

Further reading:

http://en.wikipedia.org/wiki/Test-driven_development
http://en.wikipedia.org/wiki/Coupling_%28computer_science%29
http://en.wikipedia.org/wiki/Fred_Brooks

Tuesday, December 18, 2007

Why coupling is bad.

There's a large difference between a 1000 line program and a 100000 line program. Most of it has to do with making an application's architecture scalable.

Uh-oh, someone asked me to introduce a new feature into my established codebase!

When programming there are two sorts of new features. There's the kind of feature I like to call a "vertical" new feature and there's a "horizontal" new feature.

A vertical feature is one that doesn't really interact much with other code. If you picture the code for the feature as being coded as a stack of layers that only interact with one another, you end up with a vertical stack. When this stack is added to the established codebase it doesn't make it much more complicated. I mean, there's the complexity in the feature itself, but that complexity is nicely contained in its own stack. It's almost as if it's another program that just happens to be compiled with the established codebase. If you were to make a change somewhere in the code of the existing codebase you wouldn't need to worry about breaking the new feature, because there's practically no chance that what you're doing will affect that feature. Vertical features are self contained and are therefore lightly coupled with the established code.

Horizontal features are those that affect a large cross-section of the application. The best example of a horizontal feature in InteleViewer would be key images.

InteleViewer is an application that shows medical images like CTs or magnetic resonance imaging or X-rays, etc.. The thing is, key images are images that aren't really images. They are references to images. From the Viewer's perspective, it gets a command that says download image X. The Viewer goes "OK", downloads it and then is surprised to find out that image X is actually a reference to three other images Y, Z and Q. The Viewer then has to go out and transfer them.

Now this all sounds perfectly simple. In your head you can imagine that all you'd need to do is have the KeyImage code transparently get the image it's referring to and download it. Well, yeah.. except for the fact that KeyImages came relatively late in the history of the Viewer and the Viewer was making lots of assumptions about the nature of images to implement other existing features. Here are a few issues:

- The KeyImage might refer to an image that's already loaded and these things are huge so we don't want to load them twice. We have to add some code to make sure we're not loading the same thing twice in the loading code itself instead of in the code calling the loader.
- We have multiple different protocols which we can use to download images. Some of them have their own constraints as to what's possible to do vis-a-vis downloading images. We have to be aware of this and deal with each source individually or try and build the key images code out of abstracted operations that already exist.
- If we had any protocols that we wrote that assumed we were only sending images we need to re-write them a bit.
- Since key images, on their own, are a file, but don't contain any image data, you can't blindly send the files themselves to image manipulation routines.
- We cache all images on disk but key images don't have any image data so we need to be aware of this in the cache code too.
- Key images have filtering operations that apply to the underlying images, so these filtering operations have to be combinable.

..the list goes on. Key images have introduced constraints all over the code, and the more constraints you have the more likely the next feature you add will need to know about key images. Key images add constraints across the application's loading and caching systems and therefore make any code in those systems more complex and subtle than before.
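To give a feel for just the first issue in that list, here is a toy sketch of a loader that expands a key image into the images it refers to and never downloads the same image twice. This is not InteleViewer's actual code: `FakeArchive`, `ImageLoader` and every method name are hypothetical, and the sketch assumes key images never refer to each other in a cycle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a remote image archive.
class FakeArchive {
    // Key image id -> ids of the images it refers to.
    private final Map<String, List<String>> keyImages = new HashMap<>();
    int downloads = 0;

    void addKeyImage(String id, String... targets) {
        keyImages.put(id, Arrays.asList(targets));
    }

    boolean isKeyImage(String id) {
        return keyImages.containsKey(id);
    }

    List<String> referencesOf(String id) {
        return keyImages.get(id);
    }

    // Pretends to transfer pixel data and counts each transfer.
    String downloadPixels(String id) {
        downloads++;
        return "pixels:" + id;
    }
}

class ImageLoader {
    private final FakeArchive archive;
    private final Map<String, String> loaded = new HashMap<>();

    ImageLoader(FakeArchive archive) {
        this.archive = archive;
    }

    // Returns pixel data for the id. A key image expands to its targets.
    List<String> load(String id) {
        List<String> result = new ArrayList<>();
        if (archive.isKeyImage(id)) {
            for (String target : archive.referencesOf(id)) {
                result.addAll(load(target));
            }
        } else {
            // The duplicate check lives inside the loader, not in its callers.
            result.add(loaded.computeIfAbsent(id, archive::downloadPixels));
        }
        return result;
    }
}

class KeyImageDemo {
    public static void main(String[] args) {
        FakeArchive archive = new FakeArchive();
        archive.addKeyImage("X", "Y", "Z", "Q");
        ImageLoader loader = new ImageLoader(archive);

        loader.load("Y"); // a plain image, loaded directly first
        loader.load("X"); // key image: Y is reused, only Z and Q are fetched
        System.out.println("downloads: " + archive.downloads); // prints "downloads: 3"
    }
}
```

Even in this toy version the loader's return type had to change from one image to a list of images, which is exactly the kind of ripple a horizontal feature causes.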

Here's a question for you: can we abstract away the annoyances of key images by clever use of layers and factories and such?

Well, every programmer should try, and some of the things I mentioned can be hidden by interfaces and abstractions. The simple fact that adding KeyImages was possible at all was because we'd worked hard trying to hide the complexity of the loading system. The thing is, there's a fundamental limit to what you can do with abstractions.

Consider this: Your abstractions are shaped by what is possible to do with the thing(s) you're abstracting. It's fairly easy to come up with a case that makes abstraction impossible. Here's an example:

You want to create a program that downloads a movie file from an existing website and then plays it. The system must allow for the movie to start playing as it is coming in. Unfortunately, one of these movie formats puts some important information at the end of the file. It's not possible to actually play the movie until it has all arrived, and the server you're talking to doesn't allow you to ask for specific bytes before others. Net result: you're doomed. No abstraction can save you because it's not possible to provide an implementation that will do what you want.
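The movie example can be boiled down to a few lines. This is only a sketch under stated assumptions: bytes arrive strictly in order, playback needs a single index block, and `IndexPosition`, `ProgressivePlayer` and `canStart` are names invented for illustration.

```java
// Hypothetical container layouts: where does the index needed for playback live?
enum IndexPosition { START, END }

class ProgressivePlayer {
    // Returns true if playback can begin once `received` of `total` bytes have
    // arrived in order, given that the index occupies `indexSize` bytes.
    static boolean canStart(IndexPosition position, long received, long total, long indexSize) {
        if (position == IndexPosition.START) {
            // The index is the first thing to arrive.
            return received >= indexSize;
        }
        // Index at the end: its last byte is the last byte of the file, so
        // nothing short of the whole file is enough.
        return received >= total;
    }
}

class StreamingDemo {
    public static void main(String[] args) {
        long total = 1_000_000;
        long indexSize = 4_096;
        System.out.println(ProgressivePlayer.canStart(IndexPosition.START, 4_096, total, indexSize));
        System.out.println(ProgressivePlayer.canStart(IndexPosition.END, 999_999, total, indexSize));
    }
}
```

However many layers are wrapped around `canStart`, the second case cannot return true early. The constraint comes from the file format and the server, not from the code.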

This exact same thing can happen in more subtle ways with horizontal features. If you have a feature that adds on a requirement that's a contradiction of an existing requirement there's no abstraction you can do that will fix it.

The first rule here is to make horizontal features into vertical features whenever humanly possible. If you don't, your application will get old before its time. I would go further and say "no" to horizontal features, or even look for little used horizontal features to remove from your app. Functionality not being used? Every piece of functionality has some horizontal component. If it's not used it should be removed.

What we're doing here is actually reducing the feature's coupling with other components. Being paranoid about coupling is a very powerful idea. When programmers discover it they jump for joy and then go on to make applications that are much larger and more complicated than ever before.



Then they run into the next wall.. More about that later.

Part 2 - When reducing coupling goes bad.

Monday, December 17, 2007

CSS etc..

So I've updated the look of the blog recently. I'm having trouble getting it to look decent.

Basically this whole re-design was prompted by the fact that the previous template didn't grow horizontally with the size of the web browser. On my large monitors with large text it looked completely silly; there was this skinny column of text centered right in the middle of the page. sigh..

Well, what I did was I took a blogger template that actually did change its size depending on the horizontal width of the web browser window and tweaked the heck out of it.

Here's the original:



Yeah.. I changed it a bit. I consider it an improvement based just on the fact that there's no freaking orange in it. :-) Orange makes me a sad panda. If I remember correctly the tangerine iMac was the least popular, so it looks like I'm not alone.

One of the things I was unhappy with is there's no way of dividing up the space between two widgets in such a way that one widget takes up a fixed amount of space in pixels and another takes up whatever is left over. Not sure why that is since I seem to remember doing something like that in the old do-everything-with-tables days.. Although I might be confusing HTML table layout with GridBagLayout.

When doing this site redesign I ran into the links-that-don't-look-like-links problem. Essentially the only way you can tell what is a link on a web page and what isn't is by either mousing over it and noticing your cursor changes or by looking at it and seeing it's a link because it's a different color or because it's underlined. The thing is, the default blogger templates don't do this consistently. Links come in at least 3 different colors. Some are underlined and some aren't. In ye olden days this wasn't a problem. Sites couldn't override the link color. But now they can, and what's worse is CSS actually allows you to have a different linking style for every piece of text and widget on the screen. It's often happened to me that I'll be mousing around a webpage and suddenly notice that a piece of text is actually a link. This happened to me on coding horror. If you mouse over a blog posting title it's actually a LINK! It's not underlined and it's not the same color as any other links on the page.

So when I was doing my blog I wanted all the links to be underlined and the same color.. but it looked like crap so I thought OK, I'll at least have them all underlined.. I still haven't dug deep enough to figure out how to convert all the sidebar links to be underlined but I'm getting there.

At any rate I still feel fairly bad about having my blog post titles as links but having them white instead of the blue.

I like blue links. It was the default color for all hyperlinks for years.. and red was the default color for previously visited links.. argh.. that reminds me! I have to figure out why previously visited links don't change color.. they should be a shade of red but it's not working.. grumble grumble.... Ok, time to do that.