<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk/22140/index.php?action=history&amp;feed=atom&amp;title=ExYeastSysBio_R_answers</id>
	<title>ExYeastSysBio R answers - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk/22140/index.php?action=history&amp;feed=atom&amp;title=ExYeastSysBio_R_answers"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22140/index.php?title=ExYeastSysBio_R_answers&amp;action=history"/>
	<updated>2026-05-05T19:39:51Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22140/index.php?title=ExYeastSysBio_R_answers&amp;diff=50&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;= Answers to the first yeast systems biology exercise = &#039;&#039;&#039;Answers by:&#039;&#039;&#039; Lars Rønn Olsen and Rasmus Wernersson  == Report questions #1 ==  &lt;pre style=&quot;overflow:auto;&quot;&gt; library(igraph) library(ggraph)  load(&quot;/home/projects/22140/exercise4.Rdata&quot;)  g &lt;- graph_from_data_frame(interactions, directed = FALSE, vertices = node_attributes)  ggraph(g) +   geom_edge_link(aes(color = score)) +    scale_edge_color_continuous(limits = c(0,1), low = &quot;red&quot;, high = &quot;black&quot;) +   geom_n...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22140/index.php?title=ExYeastSysBio_R_answers&amp;diff=50&amp;oldid=prev"/>
		<updated>2024-03-05T14:27:40Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;= Answers to the first yeast systems biology exercise = &amp;#039;&amp;#039;&amp;#039;Answers by:&amp;#039;&amp;#039;&amp;#039; Lars Rønn Olsen and Rasmus Wernersson  == Report questions #1 ==  &amp;lt;pre style=&amp;quot;overflow:auto;&amp;quot;&amp;gt; library(igraph) library(ggraph)  load(&amp;quot;/home/projects/22140/exercise4.Rdata&amp;quot;)  g &amp;lt;- graph_from_data_frame(interactions, directed = FALSE, vertices = node_attributes)  ggraph(g) +   geom_edge_link(aes(color = score)) +    scale_edge_color_continuous(limits = c(0,1), low = &amp;quot;red&amp;quot;, high = &amp;quot;black&amp;quot;) +   geom_n...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;= Answers to the first yeast systems biology exercise =&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Answers by:&amp;#039;&amp;#039;&amp;#039; Lars Rønn Olsen and Rasmus Wernersson&lt;br /&gt;
&lt;br /&gt;
== Report question #1 ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style=&amp;quot;overflow:auto;&amp;quot;&amp;gt;&lt;br /&gt;
library(igraph)&lt;br /&gt;
library(ggraph)&lt;br /&gt;
&lt;br /&gt;
load(&amp;quot;/home/projects/22140/exercise4.Rdata&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
g &amp;lt;- graph_from_data_frame(interactions, directed = FALSE, vertices = node_attributes)&lt;br /&gt;
&lt;br /&gt;
ggraph(g) +&lt;br /&gt;
  geom_edge_link(aes(color = score)) + &lt;br /&gt;
  scale_edge_color_continuous(limits = c(0,1), low = &amp;quot;red&amp;quot;, high = &amp;quot;black&amp;quot;) +&lt;br /&gt;
  geom_node_point()&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Report question #2 ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
node_attributes[node_attributes$cluster %in% &amp;quot;cluster1&amp;quot;,]$description&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Repeat for each cluster)&lt;br /&gt;
&lt;br /&gt;
The idea is simply to quickly look through the gene/protein descriptions in order to get an overall idea of what types of proteins are present in each cluster.&lt;br /&gt;
&lt;br /&gt;
# Function: DNA-replication&lt;br /&gt;
# Function: Origin of replication recognition/Cell division control&lt;br /&gt;
# Function: Mixed - trehalose synthesis&lt;br /&gt;
# Function: Cyclins, CDC28&lt;br /&gt;
# Function: Anaphase-promoting complex&lt;br /&gt;
# Function: DNA damage repair&lt;br /&gt;
# Function: Cell division control&lt;br /&gt;
# Function: Unknown&lt;br /&gt;
&lt;br /&gt;
== Report question #3 ==&lt;br /&gt;
Yes - based on what we have learned about cell cycle phases and cell cycle regulation, the following clusters stand out:&lt;br /&gt;
&lt;br /&gt;
# Function: DNA-replication &amp;#039;&amp;#039;&amp;#039;YES&amp;#039;&amp;#039;&amp;#039; (S-phase)&lt;br /&gt;
# Function: Origin of replication recognition/Cell division control &amp;#039;&amp;#039;&amp;#039;YES&amp;#039;&amp;#039;&amp;#039; (S-phase)&lt;br /&gt;
# Function: Mixed - trehalose synthesis&lt;br /&gt;
# Function: Cyclins, CDC28 &amp;#039;&amp;#039;&amp;#039;YES&amp;#039;&amp;#039;&amp;#039; (cell cycle regulation)&lt;br /&gt;
# Function: Anaphase-promoting complex &amp;#039;&amp;#039;&amp;#039;YES&amp;#039;&amp;#039;&amp;#039; (M-phase)&lt;br /&gt;
# Function: DNA damage repair &amp;#039;&amp;#039;&amp;#039;YES&amp;#039;&amp;#039;&amp;#039; (S-phase)&lt;br /&gt;
# Function: Cell division control &amp;#039;&amp;#039;&amp;#039;YES&amp;#039;&amp;#039;&amp;#039; (cell cycle regulation)&lt;br /&gt;
# Function: Unknown&lt;br /&gt;
&lt;br /&gt;
== Report question #4 ==&lt;br /&gt;
&lt;br /&gt;
Below are solutions for the following:&lt;br /&gt;
&lt;br /&gt;
TASK: make a subgraph of the &amp;quot;big cluster&amp;quot;:&lt;br /&gt;
* Use the igraph function &amp;quot;decompose&amp;quot; to make a list of connected graphs.&lt;br /&gt;
* Calculate the number of nodes in each subgraph in the list using vcount. This can be quickly done using the lapply function.&lt;br /&gt;
* Visualize the &amp;quot;big cluster&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Investigate the inter-connectivity: Visually there appears to be a pattern to the way the nodes are connected - this could indicate that this sub-network is not evenly connected.&lt;br /&gt;
* Investigate this by visualizing the &amp;quot;big cluster&amp;quot; network with the node size based on node degree.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make a list of all connected subgraphs&lt;br /&gt;
&lt;br /&gt;
connected_graphs &amp;lt;- decompose(g)&lt;br /&gt;
&lt;br /&gt;
# Extract the graph with the most vertices&lt;br /&gt;
&lt;br /&gt;
big_cluster &amp;lt;- connected_graphs[[which.max(unlist(lapply(connected_graphs, vcount)))]]&lt;br /&gt;
&lt;br /&gt;
# Visualize, and set the size of the nodes according to node degree&lt;br /&gt;
&lt;br /&gt;
ggraph(big_cluster, layout = &amp;quot;kk&amp;quot;) +&lt;br /&gt;
  geom_edge_link(aes(color = score)) + &lt;br /&gt;
  scale_edge_color_continuous(limits = c(0,1), low = &amp;quot;red&amp;quot;, high = &amp;quot;black&amp;quot;) +&lt;br /&gt;
  geom_node_point(aes(color = cluster, size = degree(big_cluster)))&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now, for the following tasks:&lt;br /&gt;
&lt;br /&gt;
TASK: explore the interaction partners&lt;br /&gt;
* Randomly select a single protein from the global graph, extract a subgraph with the first order interaction partners using the &amp;quot;neighborhood&amp;quot; function and look at the descriptions of this sub-set.&lt;br /&gt;
* Do this for 5-10 randomly chosen proteins - perhaps with small, medium, and high node degree - and note down if any obvious patterns start to emerge.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
random_subgraph_list &amp;lt;- neighborhood(graph = g, order = 1, nodes = sample(names(V(g)), 1))&lt;br /&gt;
&lt;br /&gt;
random_subgraph &amp;lt;- induced_subgraph(g, unlist(random_subgraph_list))&lt;br /&gt;
&lt;br /&gt;
ggraph(random_subgraph, layout = &amp;quot;kk&amp;quot;) +&lt;br /&gt;
  geom_edge_link() + &lt;br /&gt;
  geom_node_point()&lt;br /&gt;
&lt;br /&gt;
# Repeat the above 5-10 times&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Below is the code for the following questions:&lt;br /&gt;
* Start, once again, with a single random protein and select its interaction partners in the &amp;quot;big cluster&amp;quot;&lt;br /&gt;
* Then extend this selection with the interaction partners of those as well (using the &amp;quot;neighborhood&amp;quot; function with both your selected proteins).&lt;br /&gt;
* Repeat this until the entire &amp;quot;big cluster&amp;quot; is selected:&lt;br /&gt;
* How many steps do you need?&lt;br /&gt;
* Try to find one of the proteins most distantly connected - how many steps do you need here?&lt;br /&gt;
* Which network topology measurement is at play here?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Iteratively expanding a network with first order interactants from a random vertex in the big_cluster graph&lt;br /&gt;
&lt;br /&gt;
random_subgraph_list &amp;lt;- neighborhood(graph = big_cluster, order = 1, nodes = sample(names(V(big_cluster)), 1))&lt;br /&gt;
random_subgraph &amp;lt;- induced_subgraph(big_cluster, unlist(random_subgraph_list))&lt;br /&gt;
ggraph(random_subgraph, layout = &amp;quot;kk&amp;quot;) +&lt;br /&gt;
  geom_edge_link() + &lt;br /&gt;
  geom_node_point()&lt;br /&gt;
&lt;br /&gt;
vcount(random_subgraph) == vcount(big_cluster)&lt;br /&gt;
&lt;br /&gt;
random_subgraph_list2 &amp;lt;- neighborhood(graph = big_cluster, order = 1, nodes = names(V(random_subgraph)))&lt;br /&gt;
random_subgraph2 &amp;lt;- induced_subgraph(big_cluster, unlist(random_subgraph_list2))&lt;br /&gt;
ggraph(random_subgraph2, layout = &amp;quot;kk&amp;quot;) +&lt;br /&gt;
  geom_edge_link() + &lt;br /&gt;
  geom_node_point()&lt;br /&gt;
&lt;br /&gt;
vcount(random_subgraph2) == vcount(big_cluster)&lt;br /&gt;
&lt;br /&gt;
random_subgraph_list3 &amp;lt;- neighborhood(graph = big_cluster, order = 1, nodes = names(V(random_subgraph2)))&lt;br /&gt;
random_subgraph3 &amp;lt;- induced_subgraph(big_cluster, unlist(random_subgraph_list3))&lt;br /&gt;
ggraph(random_subgraph3, layout = &amp;quot;kk&amp;quot;) +&lt;br /&gt;
  geom_edge_link() + &lt;br /&gt;
  geom_node_point()&lt;br /&gt;
&lt;br /&gt;
vcount(random_subgraph3) == vcount(big_cluster)&lt;br /&gt;
&lt;br /&gt;
# And so on, until the expanded subgraph has the same number of vertices as the &amp;quot;big cluster&amp;quot;. Usually around 4-5 steps are needed, depending of course on your randomly selected first node.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;Try to find one of the proteins most distantly connected - how many steps do you need here?&amp;#039;&amp;#039; - &amp;#039;&amp;#039;&amp;#039;A: 13 steps&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* &amp;#039;&amp;#039;Which network topology measurement is at play here?&amp;#039;&amp;#039; - &amp;#039;&amp;#039;&amp;#039;A: the &amp;quot;longest shortest path&amp;quot; / network diameter&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
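&lt;br /&gt;
The step counts above can be cross-checked with igraph directly. The following is a minimal sketch on a toy path graph (the names toy and start are hypothetical and not part of the exercise data); to apply it to the exercise, replace toy with big_cluster. The eccentricity of a vertex is the number of expansion steps needed to reach every other vertex from it, and the diameter is the largest eccentricity in the graph.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
library(igraph)&lt;br /&gt;
&lt;br /&gt;
# Toy stand-in for big_cluster: an undirected path A-B-C-D-E (hypothetical data)&lt;br /&gt;
toy &amp;lt;- graph_from_literal(A-B-C-D-E)&lt;br /&gt;
&lt;br /&gt;
# Steps needed to reach every vertex when starting from the middle vertex C&lt;br /&gt;
start &amp;lt;- &amp;quot;C&amp;quot;&lt;br /&gt;
eccentricity(toy, vids = start)&lt;br /&gt;
&lt;br /&gt;
# Longest shortest path (network diameter), counting edges and ignoring any edge weights&lt;br /&gt;
diameter(toy, directed = FALSE, weights = NA)&lt;br /&gt;
&lt;br /&gt;
# One pair of most distantly connected vertices and their distance&lt;br /&gt;
farthest_vertices(toy, directed = FALSE, weights = NA)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;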
&lt;br /&gt;
== Report question #5 ==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HTB1_subgraph_list &amp;lt;- neighborhood(graph = g, order = 1, nodes = &amp;quot;HTB1&amp;quot;)&lt;br /&gt;
HTB1_subgraph &amp;lt;- induced_subgraph(g, unlist(HTB1_subgraph_list))&lt;br /&gt;
ggraph(HTB1_subgraph, layout = &amp;quot;kk&amp;quot;) +&lt;br /&gt;
  geom_edge_link() + &lt;br /&gt;
  geom_node_point() +&lt;br /&gt;
  geom_node_text(aes(label = name), repel = TRUE)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Report question #6 ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
spb_vertices &amp;lt;- node_attributes[grepl(x = node_attributes$description, ignore.case = TRUE, pattern = &amp;quot;Spindle Pole Body&amp;quot;), ]$name&lt;br /&gt;
spb_subgraph &amp;lt;- delete_vertices(g, !names(V(g)) %in% spb_vertices)&lt;br /&gt;
ggraph(spb_subgraph, layout = &amp;quot;kk&amp;quot;) +&lt;br /&gt;
  geom_edge_link() + &lt;br /&gt;
  geom_node_point() +&lt;br /&gt;
  geom_node_text(aes(label = name), repel = TRUE)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Report question #7 ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cluster  cell_cycle_role phase&lt;br /&gt;
cluster1 DNA replication     S&lt;br /&gt;
cluster2 DNA replication     S&lt;br /&gt;
cluster3            &amp;lt;NA&amp;gt;  &amp;lt;NA&amp;gt;&lt;br /&gt;
cluster4      Regulation  &amp;lt;NA&amp;gt;&lt;br /&gt;
cluster5      Regulation     M&lt;br /&gt;
cluster6      DNA repair     S&lt;br /&gt;
cluster7      Regulation  &amp;lt;NA&amp;gt;&lt;br /&gt;
cluster8            &amp;lt;NA&amp;gt;  &amp;lt;NA&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
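&lt;br /&gt;
One way to attach this table to the node attributes before plotting is sketched below. This is an illustration under assumptions, not necessarily how the exercise expects it to be done: toy_attributes is a hypothetical stand-in for node_attributes, and cluster_annotation and idx are names introduced here. Matching on the cluster column keeps the original row and column order, so the first column can still serve as vertex names in graph_from_data_frame.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Annotation table with the values listed above&lt;br /&gt;
cluster_annotation &amp;lt;- data.frame(&lt;br /&gt;
  cluster = paste0(&amp;quot;cluster&amp;quot;, 1:8),&lt;br /&gt;
  cell_cycle_role = c(&amp;quot;DNA replication&amp;quot;, &amp;quot;DNA replication&amp;quot;, NA, &amp;quot;Regulation&amp;quot;,&lt;br /&gt;
                      &amp;quot;Regulation&amp;quot;, &amp;quot;DNA repair&amp;quot;, &amp;quot;Regulation&amp;quot;, NA),&lt;br /&gt;
  phase = c(&amp;quot;S&amp;quot;, &amp;quot;S&amp;quot;, NA, NA, &amp;quot;M&amp;quot;, &amp;quot;S&amp;quot;, NA, NA)&lt;br /&gt;
)&lt;br /&gt;
&lt;br /&gt;
# Hypothetical stand-in; in the exercise use node_attributes instead&lt;br /&gt;
toy_attributes &amp;lt;- data.frame(name = c(&amp;quot;geneA&amp;quot;, &amp;quot;geneB&amp;quot;, &amp;quot;geneC&amp;quot;),&lt;br /&gt;
                             cluster = c(&amp;quot;cluster1&amp;quot;, &amp;quot;cluster5&amp;quot;, &amp;quot;cluster3&amp;quot;))&lt;br /&gt;
&lt;br /&gt;
# Look up each node&amp;#039;s cluster and copy the two new columns over&lt;br /&gt;
idx &amp;lt;- match(toy_attributes$cluster, cluster_annotation$cluster)&lt;br /&gt;
toy_attributes$cell_cycle_role &amp;lt;- cluster_annotation$cell_cycle_role[idx]&lt;br /&gt;
toy_attributes$phase &amp;lt;- cluster_annotation$phase[idx]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;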
&lt;br /&gt;
== Report question #8 ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
g &amp;lt;- graph_from_data_frame(interactions, directed = FALSE, vertices = node_attributes)&lt;br /&gt;
&lt;br /&gt;
ggraph(g) +&lt;br /&gt;
  geom_edge_link() + &lt;br /&gt;
  geom_node_point(aes(color = cluster, shape = cell_cycle_role))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!---&lt;br /&gt;
&lt;br /&gt;
== Report question #7 ==&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Function / phase summary for each cluster&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
{| {{table}}&lt;br /&gt;
| align=&amp;quot;center&amp;quot; style=&amp;quot;background:#f0f0f0;&amp;quot;|&amp;#039;&amp;#039;&amp;#039;Cluster #&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| align=&amp;quot;center&amp;quot; style=&amp;quot;background:#f0f0f0;&amp;quot;|&amp;#039;&amp;#039;&amp;#039;Cell cycle role&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| align=&amp;quot;center&amp;quot; style=&amp;quot;background:#f0f0f0;&amp;quot;|&amp;#039;&amp;#039;&amp;#039;Phase&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
| 1||DNA replication||S&lt;br /&gt;
|-&lt;br /&gt;
| 2||DNA replication||S&lt;br /&gt;
|-&lt;br /&gt;
| 3||-||-&lt;br /&gt;
|-&lt;br /&gt;
| 4||Regulation||-&lt;br /&gt;
|-&lt;br /&gt;
| 5||Regulation||M&lt;br /&gt;
|-&lt;br /&gt;
| 6||DNA repair||S&lt;br /&gt;
|-&lt;br /&gt;
| 7||Regulation||-&lt;br /&gt;
|-&lt;br /&gt;
| 8||-||-&lt;br /&gt;
|-&lt;br /&gt;
| 9||DNA replication||S&lt;br /&gt;
|-&lt;br /&gt;
| 10||Chromosome segregation||M&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Report question #8 ==&lt;br /&gt;
Q: &amp;#039;&amp;#039; Make a screen-shot of the clusters you have concluded to be relevant to cell cycle - make a brief comment on why you reason they are important for cell cycle.&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
=== Cell cycle phases ===&lt;br /&gt;
Clusters that contain proteins linked to a specific phase of the cell cycle should be active at those specific points in time, and therefore need to be cell cycle regulated.&lt;br /&gt;
&lt;br /&gt;
[[image:ExYeast1_phases.png|thumb|800px|center|Green = M-phase, Yellow = S-phase]]&lt;br /&gt;
&lt;br /&gt;
=== Cell cycle role ===&lt;br /&gt;
Here we have chosen to select clusters which each have a function that can be directly linked to either phases in the cell cycle or regulation. Chromosome segregation will be needed during the M-phase, DNA replication and repair will be needed during the S-phase, and regulation will be needed to coordinate it all.&lt;br /&gt;
&lt;br /&gt;
[[image:ExYeast1_cc_roles.png|thumb|800px|center|Orange = Chromosome segregation, Purple = Regulation, Green = DNA repair, Cyan = DNA replication]]&lt;br /&gt;
---&amp;gt;&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>