Wrong(?) straightforward solution JordiB 2nd December, 2008 07:36 (UTC)

I'm not entirely sure, but couldn't you just solve this equation: x^8 / 8 = y

Where x is the number of location types and y is the target number of different locations you want to be able to describe. We take x to the 8th power, because that's the number of possible combinations to concatenate x numbers 8 times and we divide by 8, because we established that every location has up to 8 different representations. Now, it might not be eight for every location (e.g. you could be surrounded by houses), but it is probably a safe estimate. Rewriting gives: x = (8y)^(-1/8)

Filling in 100,000 for y gives the value of 5.469, which means that you would need six building types (which actually allows for 209,952 locations).

However, a lot of locations might have the same representation. In order to make this chance smaller, you could increase the number of building types, but you can never make it zero. It just depends on the town.

This is very simplistic reasoning, so it is probably wrong. If it is, can someone explain why?

Wrong(?) straightforward solution Mike 2nd December, 2008 08:05 (UTC)
JordiB said

I'm not entirely sure, but couldn't you just solve this equation: x^8 / 8 = y

Where x is the number of location types and y is the target number of different locations you want to be able to describe. We take x to the 8th power, because that's the number of possible combinations to concatenate x numbers 8 times and we divide by 8, because we established that every location has up to 8 different representations. Now, it might not be eight for every location (e.g. you could be surrounded by houses), but it is probably a safe estimate. Rewriting gives: x = (8y)^(-1/8)

Filling in 100,000 for y gives the value of 5.469, which means that you would need six building types (which actually allows for 209,952 locations).

However, a lot of locations might have the same representation. In order to make this chance smaller, you could increase the number of building types, but you can never make it zero. It just depends on the town.

This is very simplistic reasoning, so it is probably wrong. If it is, can someone explain why?

Quick off the block! Well done.

Kirit suggested something like this when we discussed it on the way to the last Beercamp. The tricky bit is the places that are the same as other places when mirrored or rotated. 'Up to 8' is an estimate as you rightly point out. The interesting bit is finding the exact answers. The problem uses reflection as well as rotation to increase the number of equivalent codes. So, until we can calculate exactly how many of these there are, we can't say if this ball park estimate is correct.

Actually it is not so tricky to brute force it, any programmer should be able to tackle this problem, but I am hoping someone will come up with an insightful way of dealing with the number of equivalent codes.

Wrong(?) straightforward solution Kirit Sælensminde 2nd December, 2008 08:29 (UTC)
Mike said

'Up to 8' is an estimate as you rightly point out.

It seems to me that JordiB's calculation puts an upper bound on the number of locations. We know that five won't do for 100,000 locations because there just isn't enough information even before you take the duplications into account.

Whether six is enough or not depends on working out how many duplicates there are. You might still need 7 or maybe even more. I've not worked this one through yet, but have an idea of how I might brute force it in O(n^8) operations with O(1) memory overhead — I like brute forcing apparently :)


To join in the discussion you should register or log in
Wrong(?) straightforward solution JordiB 4th December, 2008 02:09 (UTC)
Kirit Sælensminde said
Mike said

'Up to 8' is an estimate as you rightly point out.

It seems to me that JordiB's calculation puts an upper bound on the number of locations. We know that five won't do for 100,000 locations because there just isn't enough information even before you take the duplications into account.

Whether six is enough or not depends on working out how many duplicates there are. You might still need 7 or maybe even more. I've not worked this one through yet, but have an idea of how I might brute force it in O(n^8) operations with O(1) memory overhead — I like brute forcing apparently :)

Actually, it's more of an upper bound on the number of required building types. Suppose you would generate a set of all x^8 permutations of length 8 of the x location types. Then, for each permutation you reorder it through rotation and mirroring so that you get the lowest possible lexical value. This will cause some original permutations to be reordered into the same value. If you now remove the duplicates from the set you are left with (unique) representations of all the locations that can be described (so the size of the set is the number of locations you can describe with x building types). The number of original permutations that can be reordered into the same value can never be higher than 8 because we use rotation and mirroring. In other words, each permutation in the reordered set, may have up to 8 different 'sources', but never more. If every permutation would have exactly 8 sources, we could obtain the number of describable locations by just dividing the original (not-reordered) permutation set and dividing by 8. This saves the trouble of reordering them through numerous rotation and mirroring actions and removing the duplicates. Actually, the permutations don't even need to be generated at all in this case, because you can simply calculate how many there are. This is exactly what my formula does.

However, the reordered permutation r-r-r-r-r-r-r-r (you're surrounded by road) can have only one source, namely r-r-r-r-r-r-r-r. And many other reordered permutations also have fewer than eight different sources. So actually, we shouldn't divide the total number of possible permutations by 8, but by some smaller number that is the average of the number of different sources each reordered permutation has. I don't know of an elegant mathematical way of obtaining that average, but seeing that it will be composed by numbers between 1 and 8, it will obviously be smaller than 8. So, the formula becomes: x^8 / c = y where c is smaller than 8.

The formula for calculating the needed number of building types becomes x = (cy)^(1/8). Using the lowest possible (and also incorrect) value of 1 for c, and 100,000 for y again, we see that only 4.217 building types are needed (so actually 5). We can also calculate that 5 building types will definately do the job if c <= 3.9 (which we don't know yet, but seems unlikely) and 6 buildings will be enough if c < 16.8 (which we know it is, because it has to be smaller than 8).

So far, we have only been considering the number of different representations one location can have, but not how many locations a representation can represent. I don't think it's really possible to calculate this number, but it might be possible to make the number of building types so big that having two locations with the exact same representation is very unlikely.

Wrong(?) straightforward solution till 4th December, 2008 06:31 (UTC)

I came up with a formula, if you guys could test it against brute force, that would be great:

def poss(i):
    return (i**8+5*i**4+4*i**2-7*i)/8

poss(4)
8356
poss(5)
49226
poss(6)
210774

It also does large numbers quickly:

poss(1000)
125000000000625000499125L
Wrong(?) straightforward solution till 4th December, 2008 06:39 (UTC)

whoops, typo in the formula this should be correct:

def poss(i):
    return (i**8+5*i**4+3*i**2-7*i)/8

poss(4)
8354
poss(5)
49223
poss(6)
210770
Wrong(?) straightforward solution JordiB 5th December, 2008 00:46 (UTC)
till said

I came up with a formula, if you guys could test it against brute force, that would be great:

def poss(i):
    return (i**8+5*i**4+4*i**2-7*i)/8

poss(4)
8356
poss(5)
49226
poss(6)
210774

It also does large numbers quickly:

poss(1000)
125000000000625000499125L

Could you perhaps comment on how you arrived at this formula?

Wrong(?) straightforward solution Kirit Sælensminde 5th December, 2008 05:18 (UTC)
JordiB said

Could you perhaps comment on how you arrived at this formula?

Having listened to Till explain it last night I'm not sure you want to hear :)

Mike and I now both have code (his is Python, mine is Haskell) that come up with the same values for between 2 and 6 landmark types (not run on larger n, the Haskell code isn't very clean and is quite slow). We're using the same memory optimisation though so it would be good to get an independent set of figures.

Anybody else up for writing some code to brute force it to check the equations against? If Mike and I are correct then the equations proposed so far undercount the actual number of locations (but not by much).


To join in the discussion you should register or log in
Wrong(?) straightforward solution till 8th December, 2008 11:18 (UTC)

Mike and I now both have code (his is Python, mine is Haskell) that come up with the same values for between 2 and 6 landmark types (not run on larger n, the Haskell code isn't very clean and is quite slow). We're using the same memory optimisation though so it would be good to get an independent set of figures. Anybody else up for writing some code to brute force it to check the equations against? If Mike and I are correct then the equations proposed so far undercount the actual number of locations (but not by much).

In a final effort, i revisited my formula and I have this now:

def loc(x):

return (x**8 + x**4 + 4*x**5 - 2*x**3 + 4*x**2)/8

Output:

2 : 50
3 : 949
4 : 8728

Mike just told me that my numbers are very close, but I just cannot get it right. The next step would be to actually generate the values and compare with the brute forced ones.

JordiB said

Could you perhaps comment on how you arrived at this formula?

I will try to explain my model:

The basic formula is

L = L1 + L2 + L4a + L4s + L8a + L8s

where

L: Total number of locations

L1: Locations that can be expressed with one repeated landmark (r-r-r-r-r-r-r-r)

L2: … two repeated landmarks (r-h-r-h-r-h-r-h) and not L1

L4s: … a repeated string of four landmarks with symmetry (r-h-o-h-r-h-o-h) and none of the above

L4a: … without symmetry (r-h-o-o-r-h-o-o) and none of the above

L8s: … eight landmarks with symmetry (r-r-h-h-o-h-h-r) and none of the above

L8a: … without symmetry (r-h-h-h-h-o-o-o) and none of the above

Keeping these groups mutually exclusive is the tricky part, but here's my shot at it (x is number of distinct landmarks):

L1 = x

No problem here

L2 = x^2 - x

We have to subtract x so we don't count L1 twice

L4s = (x^3 - x^2) / 2

In a repeating 4-tuple with symmetry, the second and fourth (and sixth and eighth) landmark will always be the same, hence there are x^3 representations to fit this characteristic. After we subtract L1 and L2, every location in this set will have exactly 2 representations, therefore we divide by 2.

L4a = (x^4 - x^3) / 4

Basically I'm taking all possible 4-tuples and subtract the ones where the second and fourth position are the same. Note that those already include all of the above sets. The remaining set of locations is represented 4 times each (rotated and mirrored), so I divide by 4. Quick example:

r-o-o-r-r-o-o-r
o-r-r-o-o-r-r-o
r-r-o-o-r-r-o-o
o-o-r-r-o-o-r-r

L8s = (x^5 - x^3)

An 8-tuple that is symmetric will have spots 2, 3 and 4 mirrored in 8, 7 and 6. Therefore there is a total of x^5 combination with this characteristic. Subtract any representations for locations in L4s (x^3 - x^2), L2 (x^2 - x) and L1 (x), and we should be left with a nice exclusive group of locations that have one representation each, so we don't divide by anything.

L8a = ((x^8 - x^4) - 4 * (x^5 - x^3)) / 8

This is where keeping the set exclusive gets a real pain in the ass, but luckily it's the last step. (x^8 - x^4) are all representations that are not repeated 4-tuples. These also contain 4 (rotated) representations of every element of L8s, so we subtract 4 * (x^5 - x^3). The remaining set of locations is represented 8 times (mirrored and rotated) so we divide by eight.

Putting it all together and contracting the formula, we get the values above.

Wrong(?) straightforward solution till 8th December, 2008 17:38 (UTC)

I just could not let it go, I started writing out all the permutations on a sheet of paper and realized another mistake: L4a included some locations that should have been in L4s (where the first and the third landmark are the same).

Anyway, I think I got it now. Too tired to make a nice formula, but here is a quick test for 2 as x:

2**2

4

2**3-2**2

4

2**4-2**3*2+2**2

4

2**5-2**3

24

2**8-2**4-4*2**5+4*2**3

144

144/8

18

18+24+1+4+4

51

If I remember correctly, 51 is what you guys got? Please, please confirm this so I can finally let it go.

I will make a nice formula tomorrow and test for higher numbers, meanwhile I would really appreciate it if you guys could post some confirmed brute forced results.

Wrong(?) straightforward solution Mike 8th December, 2008 17:56 (UTC)
Where does the 1 come from?
Wrong(?) straightforward solution Kirit Sælensminde 8th December, 2008 18:06 (UTC)
till said

If I remember correctly, 51 is what you guys got? Please, please confirm this so I can finally let it go.

Ignore Mike, he's just being cruel. 51 is indeed the number for two's world. I would also like to know where the 1 comes from though.


To join in the discussion you should register or log in
Wrong(?) straightforward solution till 9th December, 2008 01:53 (UTC)
Mike said

Where does the 1 come from?

The third number (L4a) should always be divided by four, just like the last step (L8a) should be divided by eight. I'm sorry I didn't make that clear, the numbers I posted were just a quick test I did for myself.

Okay, now for the formula. Let me explain what changed and why:

L1 = x

L2 = x^2 - x

These are still the same.

L4s = x^3 - x^2

This is now no longer divided by two, because there are two ways a 4-tuple can be symmetric: The second and fourth position are identical (mirroring does nothing) and - this I found out by writing out the permutations one by one - the first and third position are identical (mirroring and rotating gives the same new representation).

L4a = (x^4 - 2*x^3 + x^2) / 4

I adjusted the number of representations for this set for the error above by subtracting representations now in L4s and verified the new formula independently by calculating the number of possible 4-tuples where position 1 != 3 and 2 != 4. It turns out to be (x^4 - 2*x^3 + x^2) as well. (If it hadn't I probably would have given up at that point.)

L8s = (x^5 - x^3)

L8a = ((x^8 - x^4) - 4 * (x^5 - x^3)) / 8

These are still the same.

When I contract the formula, this is what I get:

L(x) = ( x^8 + 4*x^5 + x^4 + 2*x^2 ) / 8

Here are my results for a couple of x's:

0 : 0

1 : 1

2 : 51

3 : 954

4 : 8740

5 : 50475

6 : 214011

7 : 729316

8 : 2114064

9 : 5411205

10 : 12551275

11 : 26877246

12 : 53874756

13 : 102155599

14 : 184747395

15 : 320747400

16 : 537403456

17 : 872690121

18 : 1378453059

19 : 2124199810

How long would it take your brute force scripts to verify some of these?

Wrong(?) straightforward solution till 9th December, 2008 01:59 (UTC)
Never mind the values for x > 14, they're shortened in the console.
Wrong(?) straightforward solution till 9th December, 2008 05:57 (UTC)

Unbelievably, Mike confirmed my numbers and I can finally sleep at night.

I guess we proved once again that brute force is always the quicker solution. On the other hand, check this out:

$ ./locations.py 1000000
125000000000000000500000125000000000250000000000

This happened when I tested large numbers. For powers of 10 you can actually see the single elements of the formula at work. Anybody wanna try to brute force it? ;-)

100 maybe?

$ ./locations.py 100
1250005012502500