Welcome to the ISCB Community News Blog

This blog collects news, announcements and other information that could be of interest to ISCB members. We are a group of ISCB members who volunteer to populate this blog on a regular basis. If you want to become an "ISCB-News Reporter" yourself, let us know: contact ISCB
Don't repost copyrighted content! The guidelines are:
- Include a link to the source page
- Include a short summary about the article. You can quote up to ONE paragraph from the original story, but not more
- Don't repost an entire article originating from another source
- Never post content without attribution — always include the source

To post a news item, please use this form.

Thursday, December 15, 2011

Protein 3D Structure from High-Throughput Sequencing  

Fifty years after the pioneering discovery that a protein's three-dimensional structure is determined solely by the sequence of its amino acids, an international team of researchers has taken a major step toward fulfilling the tantalizing promise: predicting the structure of a protein from its DNA alone.

The team at Harvard Medical School (HMS), Politecnico di Torino / Human Genetics Foundation Torino (HuGeF) and Memorial Sloan-Kettering Cancer Center in New York (MSKCC) has reported substantial progress toward solving a classical problem of molecular biology: the computational protein folding problem.

In molecular biology and biomedical engineering, knowing the shape of protein molecules is key to understanding how they perform the work of life, the mechanisms of disease and drug design. Normally the shape of protein molecules is determined by expensive and complicated experiments, and for most proteins these experiments have not yet been done. Computing the shape from genetic information alone is possible in principle. But despite limited success for some smaller proteins, this challenge has remained essentially unsolved. The difficulty lies in the enormous complexity of the search space, an astronomically large number of possible shapes. Without any shortcuts, it would take a supercomputer many years to explore all possible shapes of even a small protein.

The team tested the hypothesis that evolution can provide a roadmap to how a protein folds. Their approach combined three key elements: evolutionary information accumulated over many millions of years; data from high-throughput genetic sequencing; and a key method from statistical physics.

Using the accumulated evolutionary information in the form of the sequences of thousands of proteins, grouped in protein families that are likely to have similar shapes, the team found a way to solve the problem: an algorithm to infer which parts of a protein interact to determine its shape. They used a principle from statistical physics called "maximum entropy" in a method that extracts information about microscopic interactions from measurement of system properties.
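The idea of reading interactions out of a protein family can be illustrated with a toy example. The sketch below is not the team's algorithm: it uses plain mutual information between alignment columns as a simpler stand-in. Mutual information picks up positions that change together across a family, but unlike the maximum-entropy approach described above it cannot tell direct interactions from indirect correlations. The four-sequence alignment and the function names are made up for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    freq_i = Counter(seq[i] for seq in alignment)
    freq_j = Counter(seq[j] for seq in alignment)
    freq_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


def rank_column_pairs(alignment):
    """Return ((i, j), score) tuples, strongest covariation first."""
    length = len(alignment[0])
    scores = {
        (i, j): column_mutual_information(alignment, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy family: columns 0 and 2 always change together
# (A with D, E with K), column 1 varies independently, column 3 is constant.
TOY_ALIGNMENT = ["AGDL", "AVDL", "EGKL", "EVKL"]

if __name__ == "__main__":
    for pair, score in rank_column_pairs(TOY_ALIGNMENT):
        print(pair, round(score, 3))
```

In this toy family, the pair of columns 0 and 2 ranks first, which is the kind of signal that would suggest the two positions are in contact. Real protein families contain thousands of sequences, and separating true contacts from chains of indirect correlations is what the statistical physics method is needed for.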

With these internal protein interactions in hand, widely used molecular simulation software generated the atomic details of the protein shape. For the first time, the team was able to compute protein shapes from sequence information alone with unprecedented accuracy for a test set of 15 diverse proteins, with no protein size limit in sight.

To test their method, the researchers initially focused on the Ras family of signaling proteins, which has been extensively studied because of its known link to cancer. The structures of several Ras-type proteins have already been solved experimentally, but the proteins in the family, with about 160 amino acid residues, are larger than any proteins previously modeled computationally from sequence alone. And there is no reason that the method couldn't work with even larger proteins.

The researchers caution that there are other limits, however: experimental structures, when available, are generally more accurate in atomic detail. And the method works only when researchers have genetic data for large protein families. But advances in DNA sequencing have yielded a torrent of such data that is forecast to continue growing exponentially in the foreseeable future.

The next steps, the researchers say, are to predict the structures of unsolved proteins currently being investigated by structural biologists, before exploring the large uncharted territory of currently unknown protein structures, and to collaborate with computational and experimental structural biologists to further refine the "evfolded" proteins and make structure determination more efficient.

Adapted from text by R Alan Leo, Harvard Medical School.

A full news report: http://www.sciencedaily.com/releases/2011/12/111207175635.htm

_________________________ details:

Supplement, images, 3D structures, restraint lists and more: http://evfold.org
URL: http://bit.ly/tob48p

Contact Person: EVfold team (foldingproteins@cbio.mskcc.org)