Problem Statement in English

You’re given two string arrays wordsContainer and wordsQuery.

For each query in wordsQuery, you need to find the index of the word in wordsContainer that has the longest common suffix with the query.

If there are multiple such words, return the shorter one . If there are multiple words with the same length, return the one with the smallest index.


Approach

In order to efficiently store and query the longest common suffixes, we can use a Trie data structure.

Since the question cares about suffixes, we will insert the words into the Trie in reverse order. This way, when we query for a suffix, we can traverse the Trie from the end of the query string and find the longest matching path.

In order to factor in all the conditions to determine the best matching word, we will store at each Trie node the index of the best matching word that passes through or ends at that node. This way, when we traverse the Trie for a query, in constant time we can retrieve the best matching index at the end of our traversal.

The reason we calculate the global best index for the empty suffix "" is to handle the case where a query has no common suffix with any word in wordsContainer. In that case, we should return the index of the shortest word in wordsContainer, which is what we store at the root of the Trie.

And we’re done!


Solution in Python


class TrieNode:
    def __init__(self):
        self.children = {}
        # Stores the index of the best matching word passing through or ending at this node
        self.best_idx = -1 

class Solution:
    def stringIndices(self, wordsContainer: list[str], wordsQuery: list[str]) -> list[int]:
        root = TrieNode()
        
        # 1. Find the global default best index (for empty suffix "")
        global_best_idx = 0
        for i in range(1, len(wordsContainer)):
            if len(wordsContainer[i]) < len(wordsContainer[global_best_idx]):
                global_best_idx = i
        
        root.best_idx = global_best_idx
        
        # 2. Build the Trie
        for i, word in enumerate(wordsContainer):
            curr = root
            # Insert characters backwards
            for char in reversed(word):
                if char not in curr.children:
                    curr.children[char] = TrieNode()
                curr = curr.children[char]
                
                # Update the node's best index if this word is a better fit
                if curr.best_idx == -1:
                    curr.best_idx = i
                else:
                    old_idx = curr.best_idx
                    if len(word) < len(wordsContainer[old_idx]):
                        curr.best_idx = i
                    # Tie-breaker (earlier index) is naturally handled because 
                    # we iterate 'i' from 0 upwards, so we only replace if strictly shorter.

        # 3. Process Queries
        ans = []
        for query in wordsQuery:
            curr = root
            for char in reversed(query):
                if char in curr.children:
                    curr = curr.children[char]
                else:
                    break # Can't match any further, stop here
            ans.append(curr.best_idx)
            
        return ans

Complexity

  • Time: $O(M + N)$
    where $M$ is the total number of characters in all words in wordsContainer and $N$ is the total number of characters in all queries in wordsQuery.

  • Space: $O(M)$
    for storing the Trie structure.


Mistakes I Made

I was going to maintain the precedence in a separate hashmap.


And we are done.