The computer models that scientists build of the inner workings of cells are now quite good at predicting whether suppressing a gene will have negative effects, but they do much less well at figuring out whether overexpressing a gene (say, having two or more active versions of it in a cell) will lead to problems. Now scientists have developed a computer algorithm that can do so, according to findings detailed in the Proceedings of the National Academy of Sciences.
Computational biologists Allon Wagner and Eytan Ruppin and evolutionary microbiologist Uri Gophna at Tel Aviv University in Israel and their colleagues build software models of cells that simulate the complex interplay of genes inside them at the genome level.
“The metabolic network is so complex that no human can trace it, and this is why computer simulations become so handy. They help us to understand how the cells work, but even more important, they allow us to try and predict what would happen if we tweak the cell in a certain way,” Wagner says.
The researchers developed an algorithm called EDGE (which stands for expression-dependent gene effects) to see what might happen to a cell if it activates or expresses one of its genes at unnaturally high levels.
“This question arises mainly in biotechnology,” Wagner says. “Metabolic engineers often manipulate microorganisms to express particular genes very highly–sometimes these are even foreign genes that come from a whole different species. This is one of the ways that microorganisms are manipulated to produce desirable chemicals, such as drug precursors.”
The algorithm assumes that each simulated cell strives toward a certain objective. In their simulations, the scientists always took this objective to be maximizing production of biomass. The algorithm then systematically ranks genes as beneficial, detrimental or neutral depending on how overexpressing them affects the simulated cells' ability to reach this objective.
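The ranking idea can be illustrated with a toy sketch in Python. This is not the EDGE implementation, whose internals the article does not describe; the objective function `toy_biomass`, the gene names, the threefold overexpression factor and the tolerance are all hypothetical, chosen only to show how genes might be sorted into the three classes by their effect on a biomass objective.

```python
def toy_biomass(expression):
    """Hypothetical objective: biomass made from imported nutrient.

    'uptake' brings nutrient in, 'byproduct' diverts some of it away,
    and 'reporter' does not touch the nutrient pool at all.
    """
    nutrient_in = 10.0 * expression["uptake"]
    nutrient_lost = 2.0 * expression["byproduct"]
    return max(0.0, nutrient_in - nutrient_lost)


def rank_genes(objective, baseline, fold=3.0, tol=1e-9):
    """Label each gene by how overexpressing it alone changes the objective."""
    reference = objective(baseline)
    labels = {}
    for gene in baseline:
        # Overexpress one gene at a time, leaving the others at baseline.
        perturbed = dict(baseline)
        perturbed[gene] = baseline[gene] * fold
        delta = objective(perturbed) - reference
        if delta > tol:
            labels[gene] = "beneficial"
        elif delta < -tol:
            labels[gene] = "detrimental"
        else:
            labels[gene] = "neutral"
    return labels


# Assumed baseline expression levels (arbitrary units).
baseline = {"uptake": 1.0, "byproduct": 1.0, "reporter": 1.0}
labels = rank_genes(toy_biomass, baseline)
print(labels)
```

In a genome-scale model the objective would come from a simulation of the whole metabolic network rather than a two-term formula, but the comparison against an unperturbed baseline would follow the same pattern.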
First, the scientists used EDGE to simulate E. coli to predict which genes might prove toxic in the bacterium when overexpressed. They next picked 26 native genes for subsequent experiments with real E. coli, 12 that were confidently predicted to be toxic and 14 that were confidently predicted not to be toxic. After genetically modifying E. coli to overexpress these genes, the investigators validated EDGE’s predictions. Similar results were seen when analyzing large-scale past experiments that transferred foreign genes into E. coli.
“If we can warn metabolic engineers that some genes are going to be lethal if overexpressed in a given context–for example, specific growth environments–then we can save everyone a lot of time and trouble,” Wagner says.
Second, the research team compared EDGE predictions with extensive microarray data for E. coli and the yeast S. cerevisiae taken across multiple growth conditions, and with transcriptomic data for 79 different human tissues and for a wide variety of organs, developmental phases and other biological contexts in the plant Arabidopsis thaliana. In all cases, the genes that EDGE predicted to be detrimental were indeed expressed at significantly low levels in the earlier data. This may reflect a universal phenomenon in which cells keep potentially deleterious genes in check by reducing their expression.
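One way such a comparison could be checked is with a one-sided permutation test, sketched below in Python. The article does not say which statistical test the researchers used, so this is an assumed stand-in, and the expression values are invented purely for illustration; they are not data from the study.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(low_group, other_group):
    """One-sided exact permutation test on the difference of means.

    Returns the fraction of all relabelings in which the 'low' group
    sits at least as far below the rest as it does in the real labels.
    """
    pooled = list(low_group) + list(other_group)
    k = len(low_group)
    observed = mean(other_group) - mean(low_group)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), k):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small slack guards against floating-point rounding on ties.
        if mean(b) - mean(a) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical expression levels (arbitrary units), not real measurements.
predicted_detrimental = [1.0, 1.2, 0.8, 1.1, 0.9]
other_genes = [5.0, 4.8, 5.5, 6.1, 4.9, 5.2]

p_value = exact_permutation_p(predicted_detrimental, other_genes)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value here would mean that the low expression of the predicted-detrimental group is unlikely to arise from a random labeling of genes. Enumerating every relabeling is only practical for small groups; genome-scale comparisons would need random sampling of relabelings or a rank-based test.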
Third, Wagner, Ruppin, Gophna and their colleagues investigated how applicable EDGE was to human disease by analyzing the system's ability to predict genes whose activation would suppress proliferation of cancer cells. By re-examining data from past studies, they found that genes predicted by EDGE to impede proliferation of human cells were indeed inhibited in cancerous tissues compared with their healthy counterparts.
“Although its present applications are biotechnological, in the future, one could use EDGE in an attempt to find potential drug targets that would be lethal to cancer cells when overexpressed,” Wagner says.