It is always a good idea to test your code on a dataset that is small enough for you to compute the results by hand. You will create two such datasets for testing.
Above: a small graph containing 6 nodes, labeled A, B, C, D, E, and F
Create the above graph in
get_practice_graph(). Note that there are letters inside each node (these may not show up in a printout of the assignment.) Use the Graph class as demonstrated in lecture (not DiGraph, MultiGraph, or MultiDiGraph).
To help you verify that your graph is correct, the provided code draws the graph to a window. The graph may be hidden behind other windows, so look for it! Compare your graph to the figure above.
The nodes may appear in different locations; that’s fine as long as the same nodes exist and they are connected in the same way by the edges. If you redraw the graph, the nodes may be plotted using a different layout each time - this is also fine!
Your program will pause until you close the window that contains the graph drawing.
When you are happy with your graph, comment out the call to
draw_practice_graph() in the
main() function at the bottom of the file (so you don’t have to close the window every time you run your program). Now look at the other
.py file provided in the starter code:
social_network_tests.py. This file has been set up for you to run when you are ready to test parts of your code. Run this test file with:
If you have drawn the practice graph correctly, you should see
practice graph tests passed printed in the terminal. If you do not see this, consider looking at the error printed as well as the test function (
test_practice_graph()) to figure out why your graph was incorrect. (You might also try writing some of your own tests to check your work!)
Once you have verified that your code passes the test, you should comment out the line in
social_network.py where the
practice_graph variable is defined, since you won’t need it for other homework problems.
You are likely to see several other assertions in the test program failing after
practice graph tests passed. This is to be expected since you have not yet implemented those parts of the program! In fact, this will continue to happen as you progress through the problems. This should not discourage you from regularly running
social_network_tests.py to check your code. Success usually means you are no longer failing the same assertion, but that instead a different assertion is failing further down.
Create a graph in the
get_romeo_and_juliet_graph() function corresponding to the Romeo and Juliet graph below (ignoring the shaded family/house information).
To help you verify that your graph is correct, the provided code draws the graph to a window and to a file named
romeo-and-juliet.pdf. Compare your graph to the Romeo and Juliet graph above.
The nodes may appear in different locations; that’s fine as long as the same nodes exist and they are connected in the same way by the edges.
social_network_tests.py will check this for you.
When you are happy with your graph and the
romeo-and-juliet.pdf file, comment out the call to
draw_rj() in the
main() function at the bottom of the file, so you don’t have to close the graph window every time you run your program.
If a non-friend “Y” is your friend’s friend, then maybe Y should be your friend too. If person Y is the friend of many of your friends, then Y is an even better recommendation. The best friend recommendation is the person with whom you have the largest number of mutual friends. In this problem, you will implement this heuristic.
As a concrete example, consider “A” in
practice_graph (pictured again below). Suppose we are interested in giving person “A” recommendations of people that they might want to be friends with. By this algorithm, the number of friends you have in common with someone is a measure of how likely it is that they would be a good friend recommendation for you. Therefore, the more friends someone has in common with you, the better their “friendship score” is. We will use the number of friends in common as the friendship score itself.
In this case, we want to find out who has friends in common with “A”.
Because D has two mutual friends with A, D has a friendship score of 2 making D the top recommendation for A. Using the same logic, F is the second best recommendation. We would not recommend B or C because A is already friends with them. We also would not recommend E because A has no friends in common with E.
Similarly, consider Mercutio in the Romeo and Juliet graph.
Capulet and Montague are the best friend recommendations for Mercutio. Our algorithm should return this list ordered from best recommendation to worst. In the case of ties in friendship scores (such as between Capulet and Montague, or between Benvolio, Friar Laurence, Juliet), we list people alphabetically. Therefore, in this example we would return:
['Capulet', 'Montague', 'Benvolio', 'Friar Laurence', 'Juliet']
We are not listing Paris, Escalus or Romeo because they are already friends of Mercutio.
Now, let us think about how we might create this list. We will need to calculate these “friendship scores” for some set of people in the graph. Because we are using “number of common friends” as our metric, we only care about calculating scores for people who are friends of X’s current friends and not currently a friend of X.
There could be many people in a large graph, so we do not want to calculate friendship scores for every person in the graph. Many of those scores will likely be zero because they do not share any friends with X.
Because of this, we will need to calculate the set of “friends-of-friends” for user X. For each of those friends-of-friends, we will calculate the set of friends that they have in common with X. If we want to give user X a ranked list of recommendations from best to worst, then it would be useful to have a data structure to keep track of the mapping of “friend of friend” to friendship score. Finally, given this mapping of people to friendship scores, we will want to sort the potential friends from best to worst before presenting it to the user.
Remember that a dictionary is often called a “map”.
For this problem, you need to write the following five functions, whose documentation strings appear in the template file
common_friends(graph, user1, user2)
Be sure to NOT modify any of the provided function definitions and arguments, as they will causes our autograder tests to fail
The template file defines a helper function,
friends(graph, user), that you may find useful.
In the code we have provided, the bodies of all functions are empty except for a single statement:
pass. This is a Python statement that does nothing, but can be used in situations where you need an instruction to be there but you don’t want it to do anything.
In previous homeworks we instead wrote something like:
# REMOVE THIS COMMENT AND REPLACE IT WITH YOUR CODE .... It is more common to see something like a single
pass statement as a placeholder. Remove the pass statement when you have finished implementing the function.
social_network_tests.py file contains several
assert_equals statements and functions to help you test your code. We strongly encourage you to write additional assertions in these functions as well, in order to verify that your code is correct. You may also find the assertions useful for clarifying what the function is supposed to return. The function
assert_equals(expected, received) takes in the first argument as the expected value, and the second argument as the actual value returned from the function. Like
assert, this will stop your program if the two arguments aren’t equal.
For all of these functions, we strongly suggest that you start by outlining them in pseudocode.
Testing Dictionary Equality
Writing Additional Functions
You are NOT allowed to use
lambda. (If you don’t know what this means, that’s okay: you’re not allowed to use it anyways!)
In this problem, we will implement a different algorithm for computing a friendship score.
Consider the following purely hypothetical situation:
Since Anita is highly selective in terms of friendship, and is a friend of yours, you are likely to have a lot in common with Anita’s other friend. On the other hand, Margaret has so many friends that there is little reason to believe that you should be friendly with any particular one of Margaret’s other friends.
Incorporate the above idea into your friend recommendation algorithm. We call this technique influence scoring. The drawing in this PDF depicts influence scoring graphically.
user2 have three friends in common:
f3. In Problem 2, the score for
user2 as a friend of
user1 is 1+1+11+1+1. Each common friend contributes 1 to the score.
With influence scoring, the score for
user2 as a friend of
user1 now becomes:
In the example above, Anita’s one other friend would have a score of ½, and each of Margaret Hamilton’s friends would have a score of 1/7000000000.
We recommend that you calculate (by hand) the updated friendship scores for each of the friend recommendations that would be returned to Mercutio using this same metric. To see what the right answer should be, take a look at the
assert_equals statement in the
test_influence_map function in the file
For Problem 3, implement two functions:
social_network.py file provides starter definitions for these functions, including docstrings. You may find that their implementations are quite similar to code that you have already written in Problem 2. That is OK. The
social_network_tests.py file also tries one test case for each of the two functions. Do not change the code that you wrote for Problem 2. However, you can call many of the functions you wrote for Problem 2 to help with this problem.
You can solve the problem with just the two new functions (
recommend_by_influence), plus re-using some unchanged functions from Problem 2.
Does the change of recommendation algorithm make any difference? Maybe not: you can see by looking at the
assert_equals statements in
social_network_tests.py that Mercutio should get the same friend recommendations with both recommendation approaches (see the functions
test_recommend_by_influence). Does that mean everyone gets identical results with the two recommendation approaches?
Using the Romeo and Juliet graph, write code to print a list of people for whom the two approaches make the same recommendations, then print a list of people for whom the two approaches make different recommendations. Each list should be sorted in alphabetical order. You do not need to write a separate function that does this, writing the code in the
main() function is fine.
The format of this problem’s printed output is given in the Output Format section.
In the Romeo and Juliet graph, there are 5 people for whom the recommendations are the same, and 6 people for whom the recommendations are different.
You are almost done with Part 1!
social_network.py via Gradescope. You do not need to submit
social_network_tests.py, even if you edited it.
Now you are done with Part 1! On to Part 2!