Hide sidebar

Substring with Concatenation of All Words

Substring with Concatenation of All Words

Sliding WindowHash Map
HardLeetCode #30
45 min

Problem Statement

You are given a string s and an array of strings words of the same length. Return all starting indices of substring(s) in s that is a concatenation of each word in words exactly once, in any order, and without any intervening characters.

Example

Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"]

s: "barfoothefoobarman", words: [foo, bar]

Output: [0, 9]

Output: [0,9]

The substrings starting at index 0 and 9 are "barfoo" and "foobar" respectively.

Solution

This problem can be solved by using a sliding window. We iterate through the string, maintaining a window of size `len(words) * len(words[0])`. For each window, we check if it is a valid concatenation of all words.

Algorithm Steps

  • Create a frequency map of the words in `words`.
  • Iterate through the string `s` with a starting index `i` from 0 to `word_len - 1`.
  • For each `i`, use a sliding window. Maintain a count of words seen in the current window.
  • If a word is not in the frequency map, reset the window.
  • If a word is seen more times than in the frequency map, shrink the window from the left.
  • If the number of words seen equals the total number of words, we have found a valid substring. Add the left pointer to the result.

Substring with Concatenation of All Words

Sliding Window with Hash Map

Progress1 / 1

Ready to start the visualization

s

b
a
r
f
o
o
t
h
e
f
o
o
b
a
r
m
a
n

Words Seen

Result

Substring with Concatenation of All Words Solution

from collections import Counter

class Solution:
    def findSubstring(self, s: str, words: list[str]) -> list[int]:
        if not s or not words:
            return []
        
        word_len = len(words[0])
        word_count = len(words)
        total_len = word_len * word_count
        word_map = Counter(words)
        res = []
        
        for i in range(word_len):
            l = i
            words_seen = Counter()
            for j in range(i, len(s) - word_len + 1, word_len):
                word = s[j:j+word_len]
                if word in word_map:
                    words_seen[word] += 1
                    while words_seen[word] > word_map[word]:
                        left_word = s[l:l+word_len]
                        words_seen[left_word] -= 1
                        l += word_len
                    
                    if j - l + word_len == total_len:
                        res.append(l)
                else:
                    words_seen.clear()
                    l = j + word_len
        return res