Shortest path in graph
Description
The program is given as input a file containing connected nodes in a graph and a weight assigned to the edge between the nodes. The program shall answer the questions: Is there a path between two given nodes in the graph? If so, what is the shortest path ?
This is useful in a number of situations: Protein interaction, which proteins interact together, thereby discovering f.ex. new pathways. Social networks, who knows who, proving the "six degrees" theory.
Input/output
The program is given a graph file as input on the command line. A graph file is a tab delimited file, that looks like this (social network example):
Chris Workman Peter Wad Sackett 1 Chris Workman Mette Voldbye 2 Carsten Friis Sune Frankild 4
Each line contains an edge in the graph, given by the name of the two connected nodes, and a weight. All edges are two-way, i.e. if Chris knows Peter, then Peter knows Chris.
Having read the file and constructed the internal data structures, that represents the graph the program should ask for two nodes (names in this example) and find the shortest path between them, if any, and print out the result.
Example output
No node is named Anders Fjog
There is no path between Sune Frankild and Mette Voldbye
Shortest path: Chris Workman -> Peter Wad Sackett, weight 1
Shortest path: Mette Voldbye -> Chris Workman -> Peter Wad Sackett, weight 3
Examples of program execution
shortest_path.py graphfile.txt
Details
If there is more than one path between two nodes, then the shortest path should be selected. The length of the path is determined by the weight of the edge between two nodes. The weights are simply added together, smallest number wins. The program is a general purpose tool.
The nature of the graphs and thereby the problems that can be solved, are changed by either giving all edges the same weight, or different weights. Even negative weights are possible, making certain edges hugely preferred.
Good resource on the problem: https://en.wikipedia.org/wiki/Shortest_path_problem
It will be good for inspiration. I recommend implementing Dijkstra's algorithm or Bellman–Ford algorithm.
Here is a list of human protein-protein interactions.
Your program should detect that COL4A2 <=> Nid2 <=> ITGA5 <=> Vpr <=> IL12A are interacting like this and TPD52 and RasGRP3 do not interact. Note that COL4A2 and IL12A can interact in other equally short ways.