deepSWE.io

Find Duplicate Subtrees

TreesDFSHashing

Problem Statement

Given the `root` of a binary tree, return all **duplicate subtrees**. For each kind of duplicate subtrees, you only need to return the root node of any one of them. Two trees are duplicate if they have the **same structure** with the **same node values**.

Example

Output: [[2,4],[4]]

Algorithm Explanation

To find duplicate subtrees, we need a way to represent the structure and values of each subtree in a way that can be easily compared. A common approach is to serialize each subtree into a string.

Algorithm Steps

DFS and Serialization: We'll use a post-order DFS traversal. For each node, we'll create a string representation (serialization) of the subtree rooted at that node. The serialization should include the node's value and the serializations of its left and right children.
Hash Map: We'll use a hash map to store the serializations and their frequencies.
Check for Duplicates: As we generate the serialization for each subtree, we check if it's already in our hash map. If it is, and its count is 1, it means we've found a duplicate, so we add the node to our result list. We then increment the count in the map.

Start

Starting DFS from the root.

Serialization Map

Duplicate Subtrees

Find Duplicate Subtrees Solution


class Solution:
    def findDuplicateSubtrees(self, root: Optional[TreeNode]) -> List[Optional[TreeNode]]:
        result = []
        counts = {}
        
        def serialize(node):
            if not node:
                return "#"
            
            s = f"{node.val},{serialize(node.left)},{serialize(node.right)}"
            
            if counts.get(s, 0) == 1:
                result.append(node)
            
            counts[s] = counts.get(s, 0) + 1
            return s

        serialize(root)
        return result