Tutorial for Kayaks |
The problem Kayaks from the October contest was a nice problem - a simple and easily understandable problem statement, and was easy to play around with. Coming up with the correct solution, on the other hand, required a few key insights. Rather than just directly demonstrating the solution, the following is a discussion of both the wrong and right approaches I took to solving the problem, with all of them helping to get to a final solution.
Make sure we understand the problem.
The first time through I skimmed over the problem statement and missed a key statement - the weight of the kayak comes into the equation. We are given up to 100000 pairs of integers, wi, the weight of the ith person, and pi, the strength of the ith person. We want to pair people together to maximise the smallest value of:
si + sj
------------
wi + wj + 20
for each pair of people (i,j).
Key idea #1: We can get rid of the 20.
The 20 in the equation is quite annoying - it feels like things are rather unbalanced. But it shouldn't be hard to see we can eliminate it - if we increase all of the weights by 10, then the extra 20kg is accounted for when we add a pair of weights. From this point on we can forget about the 20 entirely, and just use the sum of the weights instead.
Failed attempt #1: A greedy approach
If you have come across the Farey sequence before, you will know that if we take two fractions and add their numerators + denominators like we are doing here, we get a result called the mediant - and it lies in between the two original fractions. So if we consider each person as a fraction (their strength divided by their adjusted weight), it seems logical that we want to pair up 'big' fractions with 'small' ones to get a result somewhere in between.
Right now would be a good time to try a greedy approach - let's sort the fractions from highest to lowest, pair up the highest with the lowest, the second highest with the second lowest, and so on. Sadly, this approach will lead to the wrong answer - you could either come up with a test case yourself, or simply try submitting this approach like I did.
Key idea #2: Binary search
Let's take a step back. We are asked to maximise something. If we knew the answer, would that help? In other words, given a speed x, can we determine whether we can pair people up so all the resulting speeds are at least x?
This is exactly what binary search does. The smallest possible speed is (50+50)/(110+110) = 0.454545, and the largest possible speed is (100+100)/(60+60) = 1.666667. If we knew how to test whether a speed was attainable or not, then binary search would quickly lead to a solution. This then leads to:
Failed attempt #2: Maximum matching
Suppose we want to test whether a speed of s is attainable. Then for each person, we could loop through the other people and test whether their combined speed was at least s. We now have a list of viable pairs of people, and want to see whether a perfect matching exists. The most famous matching algorithm requires a bipartite graph - and it appears as if this isn't the case here. Or is it? As we mentioned before with the mediant, the result of pairing two fractions results in a fraction in between - so there is no point in pairing two fractions less than s together. We could consider those fractions greater than s, and those fractions less than s.
But let's not get ahead of ourselves. We should have realised a long time ago this approach would never work with an input of size 100000 - even storing the pairs requires 100000^2 memory let alone a large running time, which will never work. In fact..
Key idea #3: Think about the constraints
As reminded above, any O(n^2) algorithm is too slow. An input of size 100000 tends to imply an algorithm around O(n log n) or so - this could be from divide and conquer, or some sort of tree structure, or sorting. This isn't any point in thinking of anything much more complex than this. Perhaps some simple sorting approach like we first tried may hold the key after all?
But we seem to be stuck. Now is a good time to back away from the problem and hope to come back with a fresh insight later. And then it may hit you..
Key idea #4: Think about the problem in another way
We've been thinking of fractions the whole time. Each person is really a pair of numbers - what happens if we plot these people on a graph?
How do we represent the critical speed we are trying to test? This is simply a line with gradient s - anything above this line has a higher speed, and anything below the line has a lower speed.
How does pairing up points work? We add the x coordinates and add the y coordinates. Isn't that sort of like taking a midpoint? The only difference is we don't divide each sum by 2 - but wait! Dividing the top and bottom of a fraction by two doesn't change the result - so two points can be paired up if their midpoint is above the critical line.
Now we seem to be in the same position as we were before with the matching idea - there are just too many pairs to try out. But this is where the graphical approach helps. Let's look at this on an angle..
Can you see what has happened? Since the midpoint is halfway between the two points, all that matters is how far the points are from the line. We can pair up a point on the left with a point on the right as long as the left point is further from the line than the right point.
The final solution
Now the greedy approach we first tried will work - but we sort by the distance from the line. The leftmost point may as well be paired up with the rightmost point, and so on. If this process leads to all pairs being compatible, that speed is attainable, otherwise it isn't. If you don't know how to calculate the distance from a point to a line, this is a good time to learn.
Summarising the solution:
- Increase all weights by 10 to eliminate the extra value of 20.
- Run a binary search on the optimal speed, s.
- Basic Co-ordinate Geometry of a line: Calculate the distance of each point from a line with equation y = sx (treating those above/below the line separately)
- Sort these distances, and pair the leftmost point with the rightmost point, and so on.
- If all pairs result in a speed >= s, this speed is attainable, otherwise it isn't.
If in our calculations we give points to the left of the line a 'negative' distance, then we can sort everything together in one array, and then just test whether the sum of each pair of values gives a positive or negative value. This is what I did in my solution (though I might have mixed up negatives and positives :))