Networks - Gene Disruption Networks
Introduction

We built and analysed a network of mutation effects in budding yeast, Saccharomyces cerevisiae. The study was presented at the ECCB 2002 conference in Saarbrücken. It is based on a genome-wide microarray dataset of systematic gene deletions in yeast, published by Hughes et al., Cell 102, 109-126 (2000). The data are normalised and discretised, and for each mutation the set of genes whose expression levels change as an effect of that mutation is recorded. The information for 248 single gene deletions is merged, and the resulting sets are represented as a network of mutation effects using graphs. In the resulting network, at most 248 genes have outgoing connections, and any yeast gene (there are 6316) can have incoming connections. The number of connections changes depending on the discretisation threshold, but the main structure stays the same, as demonstrated in the paper.

We compared the disruption networks built from data with literature data as found in YPD, the Yeast Proteome Database. We found a significant overlap between these networks for a wide range of discretisation thresholds, compared to the overlap between randomised versions of the disruption networks and the YPD network.
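The construction described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the function names, the toy gene names and values, the choice to ignore the deleted gene's own expression change, and the particular randomisation (keeping each deleted gene's number of targets while drawing the targets at random) are all assumptions made for the example.

```python
import random

def build_disruption_network(log_ratios, threshold):
    """Return directed edges (deleted_gene, affected_gene).

    log_ratios maps each deleted gene to {gene: normalised log ratio}.
    An edge is drawn when the absolute change reaches the threshold.
    Assumption: the deleted gene's own change is ignored.
    """
    edges = set()
    for deleted, profile in log_ratios.items():
        for gene, value in profile.items():
            if gene != deleted and abs(value) >= threshold:
                edges.add((deleted, gene))
    return edges

def overlap(edges, reference):
    """Number of edges shared with a reference (e.g. literature) network."""
    return len(edges & reference)

def randomise_targets(edges, all_genes, rng):
    """Assumed null model: keep each source's out-degree, redraw its targets."""
    by_source = {}
    for src, dst in edges:
        by_source.setdefault(src, []).append(dst)
    shuffled = set()
    for src in sorted(by_source):
        candidates = [g for g in all_genes if g != src]
        for dst in rng.sample(candidates, len(by_source[src])):
            shuffled.add((src, dst))
    return shuffled

# Hypothetical toy data: two deletion experiments, four genes.
log_ratios = {
    "geneA": {"geneA": -5.0, "geneB": 1.2, "geneC": 0.1, "geneD": -0.9},
    "geneB": {"geneA": 0.0, "geneB": -5.0, "geneC": 0.8, "geneD": 0.2},
}
reference = {("geneA", "geneB"), ("geneB", "geneC")}
genes = ["geneA", "geneB", "geneC", "geneD"]

for threshold in (0.5, 1.0):
    net = build_disruption_network(log_ratios, threshold)
    rnd = randomise_targets(net, genes, random.Random(0))
    print(threshold, len(net), overlap(net, reference), overlap(rnd, reference))
```

Repeating the randomisation many times and comparing the observed overlap with the resulting distribution would give a rough sense of significance at each threshold.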
Fig. 1: Overlap between the disruption network and the YPD network as a function of the discretisation threshold.

Topology

We investigated the distribution of degrees. The indegree of a gene is the number of incoming connections, and the outdegree is the number of outgoing connections. Because of the limited number of experiments, the indegree can be at most 248 for our networks, whereas the outdegree can be as high as 6316. The total degree is defined as the number of connections of a gene, regardless of direction.

Fig. 2: Degree distributions. The left graph shows a histogram of the distribution of indegrees, the middle graph the same view for the distribution of outdegrees. The right graph shows a log-log plot of the distribution of total degree, with the probability of a degree as a function of that degree.
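As an illustration of the quantities plotted in Fig. 2, the following sketch computes indegree, outdegree and total degree from a list of directed edges, and the fraction of nodes having each degree. The edge list and function names are hypothetical; nodes without any connection do not appear in the tables.

```python
from collections import Counter

def degree_tables(edges):
    """Return (indegree, outdegree, total degree) counters for directed edges."""
    indeg, outdeg = Counter(), Counter()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    total = indeg + outdeg  # Counter addition sums the counts per node
    return indeg, outdeg, total

def degree_distribution(degrees):
    """Map each observed degree k to the fraction of nodes with that degree."""
    counts = Counter(degrees.values())
    n = sum(counts.values())
    return {k: c / n for k, c in sorted(counts.items())}

# Hypothetical toy network: (deleted gene, affected gene).
edges = {("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")}
indeg, outdeg, total = degree_tables(edges)
print(degree_distribution(total))
```

Plotting the resulting probabilities against the degree on logarithmic axes gives the kind of view shown in the right panel; an approximately straight line there is what suggests a power law.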
The distribution of degrees seems to follow a power law, similarly to many other natural and man-made networks where stability against random perturbations is important. There are several theories about which mechanisms lie behind the evolution of such networks, including mechanisms that optimise the robustness against random perturbations, and mechanisms where new nodes attach to old ones with a preference that depends on the connectivity properties of the potential binding node.

We can see that only very few nodes are affected by many genes, and only very few affect many other genes. Which genes are these? A guess is that the genes that are affected by many others are highly regulated and respond to a wide range of mutations, whereas the ones that affect many are central regulators of the network. To test this, we analysed the lists of highly connected genes for their functional annotation in YPD. We found that the genes with the most incoming connections are almost always involved in metabolism, whereas the genes with the most outgoing connections are often involved in transcription regulation, RNA turnover and stress response. Another observation was that the genes have either a high indegree or a high outdegree. This network topology is associated with so-called scale-free networks. Other examples of such networks are metabolic networks and the internet.

To investigate the modularity and robustness against
random attacks, we observed the number and size of connected components in the network. A connected component is a subnetwork in which it is possible to find an (undirected) path between any two nodes. If a network has two or more connected components, a node in one component has no path to any node in any other component.

Biological meaning of neighbourhoods

It is interesting to see whether genes related by cellular role are located close to each other in the network. We investigated this by selecting groups of genes classified with the same cellular role in YPD and filtering the networks for these genes and their immediate neighbours. It turns out that the initially selected genes lie close to each other in the network, and that many of their neighbours are genes participating in the same biological processes or otherwise related. The results vary between groups, but still look promising.

Fig. 3: Neighbourhood of mating response genes. 20 genes in mating response (in red) were selected, and their immediate neighbours in the network were analysed. Many neighbouring genes are related.

Comparison with Featherstone & Broadie / Wagner

Two other groups also studied the Rosetta Inpharmatics data (Hughes et al.) to address the question of what can be learned about gene regulatory networks from microarray data. The approach of Featherstone & Broadie is similar to our study, and they come to similar conclusions in terms of the modularity of the networks (one giant module). However, they focused their study on one particular threshold only.

The approach by Wagner is different and leads to the expectation of many modules: he uses the degree of the nodes in the disruption network as a criterion to select networks from a pool of randomly generated networks with power-law distribution. The advantage of working with generated networks is that he can tackle the problem of indirect effects, i.e. changes in gene expression which are not directly due to the deletion of a gene but rather due to other regulatory processes in response to these changes. However, the problem is then to find the right algorithm to generate networks which are highly similar to the ones in question.

For a more detailed discussion of the differences and
similarities between the three studies, see Schlitt and Brazma, Comp Funct Genom 2002(3):499-503.

Literature