Bitten by type conversion

Created 15th August, 2007 08:02 (UTC), last edited 28th August, 2007 08:16 (UTC)

The URL class that we have in FOST.3™ does a few different tasks for us, but the main one is also the obvious one. Internally it's made up of a few classes for handling things like the scheme, host and path specification.

When we drop a URL into a page though it's convenient if we can write a shorter URL that skips things like the scheme and the host when the link is for the same site. So, instead of having every link fully qualified like http://www.example.com/Something we can shorten them on pages at http://www.example.com/ to /Something.

The code we had was pretty much this:

FSLib::wstring Url::asString( const Url &relative_from ) const {
	if ( ( scheme().monicker() == L"http" || scheme().monicker() == L"https" ) &&
			relative_from.scheme() == scheme() &&
			relative_from.host() == host() && 
			relative_from.port() == port() )
		return concat( concat( m_pathspec, L"?", query().asString() ), L"#", anchor() ).value();
	else
		return asString();
}

The idea being that for HTTP and HTTPS we can shorten the URL so long as the host and port number match (we don't worry about trying to work out a relative path specification — it doesn't seem worth it for the extra complexity).

The problem was that given a URL like http://www.example.com/Something, asking for the URL relative from http://www.example.com/ would always give http://www.example.com/Something rather than /Something.

The if clause isn't all that complex and no matter how many times you examine it the error won't be found in there.

The real problem was with the class design for the Host class. This class needs to be able to do IP number look ups using DNS in order for us to be able to implement a number of network protocols. The class definition was this:

class Host {
public:
	Host();
	Host( const FSLib::wstring & );
	Host( address_type );
	Host( unsigned char, unsigned char, unsigned char, unsigned char );

	address_type address() const;
	FSLib::wstring hostName() const;

	operator const void *() const;
};

The first clue that something is up is that the class doesn't declare any operator == and the compiler won't provide one for us. There's also a very odd looking operator const void *() const in there.

The C++ compiler is allowed to try a single type conversion when it comes across instances of our classes used in places that they're not explicitly catered for (a common source of confusion and bugs, like this one). The funny looking operator is a type conversion operator and allows the compiler to convert any Host instance to a const void * pointer and then compare those in the if statement.

Because the URLs had separate instances of the host member this would always come up false even when the content of the two hosts was identical. The fix was to remove the conversion and add an equality operator.

So, what was that operator doing there in the first place? It's a common way to check the validity of some object. For example, the same technique is used in std::ofstream and it allows you to write:

std::ofstream file( filename );
while ( file ) {
	// Write to the file
}

Instead of:

std::ofstream file( filename );
while ( !file.failed() ) {
	// Write to the file
}

On our Host class it was used to determine if the IP number was available. The irony of the situation was that there was no longer any code that used this conversion to check if the IP number was available.

The moral is that if you want to check the validity of an object it's much better to explicitly state the check that you are expecting it to pass.

As a final note, there is a reason for using void * rather than bool, but I don't remember what it is (this isn't a technique I ever use). Whatever it might be, this is not a recommended practice.

Update 1: Thanks to gb for pointing out my poor spelling.

Update 2: Bjorn Karlsson writes about bool conversion at Artima Developer and provides a pretty safe implementation that doesn't cause the problems I was having here. He also cautions that a bool conversion is often not a good idea.

Discussion for this page