Spreading Message To Everyone In A Social Network
Application of Topological Sort, Strongly Connected Components and Tarjan's Algorithm

Customer Support: admin@thealgorists.com

Feedback:
We are listening to your every feedback,
and taking action to constantly improve your learning experience.
If you have any feedback, please use this form:
https://thealgorists.com/Feedback.
 If you are a student, email to admin@thealgorists.com to get special student discount.
Problem Statement: In a certain social network, when someone gets a message he/she sends that message to all of his/her followers.
Now, I am considering to spread a promotion message across all people in that social network.
Your task is to find the minimum number of people to reach out (for example, who doesn't follow anyone etc) so that my promotion message spreads out across entire social network.
We need to consider loops like, if A follows B, B follows C, C follows D, D follows A (A > B > C > D > A) then reaching only one of them is sufficient to spread your message.
Input: List of followers for every user in the social network:
1: 3, 4 2: 3: 1 4: 3
Here, user 1 has user 3 and 4 as followers. User 2 has no followers. User 3 has user 1 s follower. User has user 3 as follower.
Output: Minimal list of people to be reached to spread out message across everyone in the network.
Tag(s): #facebook_interview
Solution:
It's quite evident from the problem statement that we would need to construct a directed graph to represent the entire social network where the edges represent the followee  follower relationship. Edge A > B represents B follows A.Let's look at 3 different social networks below:
The users that need to be informed at the minimum in order to let the whole social network get the message are marked in RED.
What's the common characteristics you notice for all the vertices marked in red ? They all have no inbound edges. Basically if you do a Topological Sort of all the users in the given social network, you need to let only those users know who have indegree = 0 and appears all the way to the left in the topological sort (i.e, towards the beginning of the topologically sorted list). These users have indegree = 0 because they do not follow any other users. All other users are reachable from one or more of these users (with zero indegree). Recall the concept of Topological Sort. For those users who have no inbound edges, there is no way they would come to know about the message if we do not let them know implicitly.
Presence of Strongly Connected Component(s):
What's missing from the directed graphs in the above diagram is Strongly Connected Components. Please read the chapter Strongly Connected Components (SCC) to know more about it.
A very important observation here is that all the users belonging to a Strongly Connected Component has nonzero indegree since all of them have at least one inbound edge.
These Strongly Connected Components (SCC) can be of two types:

Isolated SCC, i.e, SCC with no incoming crossedge or a forward edge from a vertex that is not part of the same SCC:
This kind of SCCs are totally disconnected from the rest of the graph (i.e, social network), and cannot be reached from a node outside of the SCC or from other SCC. In this case we just need to make the indegree for the root of the Strongly Connected Component zero and include that in our final list of minimum number of users need to be informed. The SCC below is an isolated SCC.

SCC reachable from some other vertex which is not part of the same SCC:
This kind of Strongly Connected Component is connected to the rest of the social network by one or more vertices outside of the SCC (through forward edge) or from any other SCC (through crossedge).
In this case there is at least one crossedge or forward edge from a node that is not part of this SCC. In this case we do not need to let any of the users in the SCC explicitly know (by that I meant, none of the users in the SCC need to be part of the minimum number of people to reach out), since they will come to know from the rest of the social network since the root of the SCC is already connected (in addition to the root, even a nonroot vertex can be connected to the rest of the directed graph through an incoming crossedge). In the below diagram SCC 567 is connected to the SCC 234 through the crossedge 4>6. Also both SCC 234 and 567 are connected to the rest of the graph by vertex A. Both of these two SCCs have incoming forward edges from vertex A. So if user A gets the message all the users in SCC 234 and 567 will get the message too.
In order to determine if an SCC is reachable from any vertex outside of the SCC, we need to remove all the edges in the SCC and see if one or more nodes in the SCC is still connected to at least one node that is not part of the same SCC. If it is, then there is no need to include any user from this SCC to our minimum list, since the message would get propagated to everyone in the SCC from the outsidernode to which this SCC is connected through a forward edge or cross edge.
But if the above is not true, then we would need to include any one node from the SCC.
How can we determine if an SCC is connected to at least one node outside of the SCC ?
We find all the SCCs in the given directed graph. For each SCC: for each of the nodes present in the SCC we decrement indegree of all its adjacent nodes that are part of this SCC (to avoid including crossedges) by 1. After this operation, if the indegree of all the nodes in the current SCC are zero, then it is an isolated SCC and one user from this SCC needs to be included in our minimal list. If not, then no node from this SCC needs to be included. Try to run this logic on the above two diagrams and it will become crystal clear.
Algorithm:
From the above observations we come to the below simple yet robust algorithm to solve the problem:
Create a directed graph which captures followee > follower relationship.

Compute
indegree
for all the vertices.

Compute all the SCCs in the graph.
 For each SCC: for each of the nodes present in the SCC we decrement indegree of all its adjacent nodes that are part of this SCC (to avoid including crossedges) by 1.
 Return all the vertices with indegree = 0.
Concept of indegree is discussed in Topological Sort by BFS chapter.
Working Solution:
This is a Premium content. Please subscribe to access the code.
After subscribing please come back and refresh this page.
Don't forget to take a look at other interesting Graph Problems:
The above content is written by:
Abhishek Dey
A Visionary Software Engineer With A Mission To Empower Every Person & Every Organization On The Planet To Achieve More
Microsoft  University of Florida
If you have any feedback, please use this form: https://thealgorists.com/Feedback.
Follow Us On LinkedIn