Abrar Hussain

Problem: Closest Pair

26 Jun 2016

Given an input string vector, find the distance between the closest pair of equal value entries.

Constraints

We can start off with determining the constraints of our problem.

Now that we have a better understanding of the problem constraints, we can move on to generating ideas for the solution.

Ideas

Native Solution: For each word, iterate through and find the other places that that word occurs. Calculate the distance at each point and return the global minimum distance.

O(n2) time and O(1) space solution, where n is the length of the input array.

Optimized Solution: In order to calculate a distance for a specific entry, all we need to know is the previous location that the entry was found. All other positions for the entry that came before have greater distance. Based on that piece of information, a hash table that uses a entry as an key and the entry’s previous location as a value would be a good choice. As we iterate through the input, we can do lookups on the hash table in order to calculate the distance for the current entry. Then, we can update the minimum distance as min(global_min, curr_distance). After we iterate through the entire input vector, we should have the shortest distance computed.

This solution approach uses O(n) time and O(c) space, where n is the length of the input array, and c is the unique set of entries from the input array.

Code

We can code out the optimized solution detailed in the previous section.


int find_closest_pair(const vector<string>& paragraph) {

    unordered_map<string, int> positions;
    int min_distance = numeric_limits<int>::max();

    for (int i = 0; i < paragraph.size(); i++) {
        string word = paragraph[i];
        if (positions.find(word) != positions.end()) {
            int distance = i - positions[word];
            if (distance < min_distance ) {
                min_distance = distance;
            }
        }
        positions[word] = i;
    }
    
    return min_distance;
}

Tests

Google Test can be used for performing tests against our solution.

TEST(find_closest, simple) {
    int highest = numeric_limits<int>::max();
    vector<string> empty;
    vector<string> normal = {"he", "said", "things",           "he", "did"};
    vector<string> multiple = {"he", "her", "them", "he", "her"};
    ASSERT_EQ(highest,  find_closest_pair(empty));
    ASSERT_EQ(2, find_closest_pair(normal));
    ASSERT_EQ(3, find_closest_pair(multiple));
}

Tweet me @abrarisme if you like this post.