207. Course Schedule 🔥

Medium

Problem:

There are n courses represented by pairs like [0, 1], indicating that to complete course 0, one must finish course 1 first. Given the number of courses, n, and these pairs, determine if it's possible to complete all courses.

Input: numCourses = 2, prerequisites = [[1,0]]
Output: true

Explanation: There are a total of 2 courses to take. 
To take course 1 you should have finished course 0. So it is possible.
Input: numCourses = 2, prerequisites = [[1,0],[0,1]]
Output: false

Explanation: There are a total of 2 courses to take. 
To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible.

https://leetcode.com/problems/course-schedule/

What to learn:

RuntimeError: dictionary changed size during iteration

When iterating over defaultdict, you will get this error. By using the defaultdict from the collections module, you can handle dictionaries without keys to prevent NULL errors when accessing empty values. This means it's structured to always create a default value to avoid errors. Therefore, you need to wrap it with list() to resolve this issue. In other words, you need to create a new copy.

Solution:

Time Limit Exceeded: DFS

This problem can be solved by determining if the graph has a cyclic structure.

  1. To identify a cyclic structure, nodes that have been previously visited are stored in the traced variable.

  2. In the dfs() function, if the current node is found in the traced set of previously visited nodes, it indicates a cyclic structure and hence returns False. This False value will be propagated up and will eventually be the final result.

  3. Once all explorations using the current node are done, it's imperative to remove this node from traced using traced.remove(i). Failing to do so might retain the node, leading to it being explored again as a previously visited path. This can mistakenly be interpreted as a cycle.

To better understand, refer to this drawings below:

Having empty list means no prerequisites, therefore, it can be compeleted. Before marking the course node as completed, we should get rid of the prerequisite.

To detect cycle in the graph, we need one more variable called visitSet.

class Solution:
    def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
        preMap = collections.defaultdict(list)
        for course, pre in prerequisites:
            preMap[course].append(pre)
        
        # To keep track of the nodes currently being explored
        visitSet = set()

        def dfs(course):
            # If cycle exists, return False
            if course in visitSet:
                return False

            visitSet.add(course)
            for pre in preMap[course]:
                if not dfs(pre):
                    return False
            visitSet.remove(course)

            return True
        
        # If the graph is not all connected
        # 1 -> 2, 3 -> 4
        for course in list(preMap):
            if not dfs(course):
                return False
        
        return True

Note: the graph is constructed "backwards" from what one might expect. Typically, one might expect the key to be the course and the value to be the courses that follow it. However, this representation is chosen because the DFS traversal later in the code is checking for prerequisites, so it's more convenient to have the graph structured this way.

However, the problem could also be solved with the opposite representation (where the key is the prerequisite and the value is the courses that require it), but the DFS logic would need to be adjusted accordingly.

Optimization: Pruning

The previous DFS solution explores all child nodes until a cycle is found. Therefore, by improving the structure to return True unconditionally without visiting a graph node more than once, the search time can be reduced.

class Solution:
    def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
        graph = collections.defaultdict(list)
        for x, y in prerequisites:
            graph[x].append(y)

        traced = set()
        visited = set()

        def dfs(i):
            # If cycle exists, return False
            if i in traced:
                return False
            
            # If already visited, return True
            if i in visited:
                return True
            
            traced.add(i)
            
            for y in graph[i]:
                if not dfs(y):
                    return False
                    
            traced.remove(i)
            visited.add(i)

            return True
        
        # If the graph is not all connected
        # 1 -> 2, 3 -> 4
        for x in list(graph):
            if not dfs(x):
                return False
        
        return True

Summary

The method used in the provided codes above are not directly performing a topological sort. Instead, it's using Depth First Search (DFS) to detect cycles in the graph. The goal of the function is not to produce a sorted order of courses but to determine if it's possible to take all courses (i.e., if there's no cycle in the course dependency graph).

For cycle detection using DFS, the "normal" representation (where each course maps to its prerequisites) is more straightforward. Starting from a course, the DFS explores its prerequisites, and the prerequisites of its prerequisites, and so on. If at any point during this exploration you encounter a course you're currently exploring (i.e., a course in the traced set), then there's a cycle.

In summary:

  • If you were doing a topological sort, you'd indeed process the nodes such that B comes before A and C comes before B.

  • But for cycle detection using DFS, the "normal" representation where A maps to B and B maps to C is more direct and intuitive.

Therefore, Why is the "normal" representation suitable for this? You start from a course, explore its prerequisites, and if you ever come back to the starting course (or any course in the current DFS path), you've detected a cycle.

A simple example to illustrate how DFS works for cycle detection

Example:

Consider the following prerequisites:

A -> B (To take course A, you need to take course B first)
B -> C (To take course B, you need to take course C first)
C -> A (To take course C, you need to take course A first)

This creates a cycle: A -> B -> C -> A.

Graph Representation:

Using the "normal" representation, the graph would be:

graph = {
    'A': ['B'],
    'B': ['C'],
    'C': ['A']
}

DFS Exploration:

  1. Starting from A:

    • Mark A as being visited (add to traced set).

    • Explore its prerequisites. The prerequisite for A is B.

  2. Exploring B:

    • Mark B as being visited.

    • Explore its prerequisites. The prerequisite for B is C.

  3. Exploring C:

    • Mark C as being visited.

    • Explore its prerequisites. The prerequisite for C is A.

  4. Revisiting A:

    • At this point, while exploring C, we see that its prerequisite is A. However, A is already marked as being visited and is in our current DFS path. This means we've looped back to a course we started from, indicating a cycle.

The DFS exploration path was A -> B -> C, and from C we found a direct link back to A, which is our cycle.

This is the essence of cycle detection using DFS in the "normal" representation. By marking nodes as "being visited" and checking if we ever revisit them while they're still marked, we can detect cycles.

Last updated