Welcome to the ISCB Community News Blog

This blog collects news, announcements, and other information that may be of interest to ISCB members. We are a group of ISCB members who volunteer to populate this blog on a regular basis. If you want to become an "ISCB-News Reporter" yourself, let us know: contact ISCB
Don't repost copyrighted content! The guidelines are:
- Include a link to the source page
- Include a short summary about the article. You can quote up to ONE paragraph from the original story, but not more
- Don't repost entire articles originating from another source
- Never post content without attribution — always include the source

To post a news item, please use this form.

Sunday, May 30, 2010

U of T researchers crack the splicing code

The coding capacity of the vertebrate genome is greatly expanded by alternative splicing, which enables a single gene to produce more than one distinct protein. Alternative splicing shapes how genetic information controls cellular processes, and many human disease mutations affect splicing. The ability to predict expression of different alternatively spliced messenger RNAs from genomic sequence data is a long-sought goal in the field of gene expression. The Frey and Blencowe labs at the University of Toronto have combined forces to develop a 'splicing code' that accurately predicts how hundreds of RNA features work together to regulate tissue-dependent alternative splicing for thousands of exons. It has been used to predict how alternative splicing may play important roles in development and neurological processes, and has provided insights into mechanisms of splicing regulation. The code has also been incorporated into a web tool that allows researchers to scan uncharacterized exon and intron sequences to predict tissue-dependent splicing patterns.

Source: Nature Editor's Summary

Wednesday, May 19, 2010

LifeDB Autonomous Data Integration System Trial Release

It is with great excitement and anticipation that we
announce the release of the LifeDB Autonomous Data Integration
System [1] for Life Sciences data management and workflow
querying. LifeDB has its own SQL-like query language,
BioFlow [2, 3], with which most arbitrary online deep web
resources and tools can be used to develop applications
completely dynamically and on an ad hoc basis.
The goal of LifeDB is to allow application development by
end users using an SQL-like declarative query language without
having to worry about schema heterogeneity and geographical
distribution of resources. LifeDB relies on two basic fully
autonomous sub-systems: FastWrap [4, 5] for wrapper
generation and table annotation, and OntoMatch [6] for
schema mapping. BioFlow supports horizontal and vertical
integration and autonomous record linkage through entity
identification and resolution. It is based on an extended
parameterized relational algebra called Integra [7] that is
capable of blurring the distinctions between web documents
and traditional SQL tables by uniformly treating both as

To support end user application development, we have also
developed a visual application development system called
VizBuilder [8] that allows writing BioFlow applications
visually without any knowledge of BioFlow. In our current
LifeDB release, VizBuilder is included as an alternate

Currently, LifeDB is in a trial phase while it undergoes
internal validation. We believe LifeDB is performing within
its design parameters. We invite interested researchers in
Databases and Life Sciences to register for LifeDB as end
users. Registered users will be able to either download
the LifeDB binary for local use, or use LifeDB on our server
with limited user data space. Our current goal is to
compile bug reports, if any, and fix any remaining
bugs before the final release. Full information on LifeDB
can be found at http://integra.cs.wayne.edu:8080/lifedb.

Our laboratory is fully committed to supporting registered
users and their applications on a long-term basis on our
Wayne State server. Registered users will have their own
data space on our server to store their data, applications
and other resources. For high-end users, we will also
consider offering dedicated processors to support
computation-intensive applications, resources permitting. We
are considering a novel resource sharing model, which will be
announced when we finalize our plan and acquire the needed
resources. LifeDB will be an open source system for the
community following the release of its final version. We
are now compiling user requests for data space over the
next 3-6 months, based on which we plan to acquire the needed
hardware to support our users. A request for needed space
(and a dedicated processor, subject to agreement) may be made
by e-mail to the PI at jamil@cs.wayne.edu. Please use
the subject heading "LifeDB Resource Request."

Publications related to LifeDB and other projects can be found
at our Integration Informatics Lab home page at
http://integra.cs.wayne.edu. A comprehensive LifeDB user manual
is also available for end users. The user manual will include
test examples and the design parameters under which LifeDB is
expected to function well.

LifeDB has been funded in part by National Science Foundation
grants SEIII IIS 0612203 and MRI CNS 0521454.


[1] Anupam Bhattacharjee, Aminul Islam, Mohammad Shafkat Amin,
Shahriyar Hossain, Shazzad Hosain, Hasan Jamil and Leonard Lipovich,
"On-the-fly Integration and ad hoc Querying of Life Sciences
Databases using LifeDB", 20th Database and Expert Systems
Applications, Linz, Austria, August 2009.

[2] Hasan Jamil, Bilal El-Hajj-Diab, "BioFlow: A Web-based
Declarative Workflow Language for Life Sciences", IEEE International
Workshop on Scientific Workflows, Hawaii, United States, July 2008.

[3] Hasan Jamil, Aminul Islam, "The Power of Declarative
Languages: A Comparative Exposition of Scientific Workflow Design
using BioFlow and Taverna", IEEE SCC International Workshop on
Scientific Workflows, Los Angeles, California, July 2009.

[4] Mohammad Shafkat Amin, Hasan Jamil, "FastWrap: An Efficient
Wrapper for Tabular Data Extraction from the Web", IEEE
International Conference on Information Reuse and Integration,
Las Vegas, Nevada, United States, August 2009.

[5] Mohammad Shafkat Amin, Anupam Bhattacharjee, Hasan Jamil,
"Wikipedia Driven Autonomous Label Assignment in Wrapper Induced
Tables with Missing Column Names", ACM International Symposium on
Applied Computing, Sierre, Switzerland, March 2010.

[6] Anupam Bhattacharjee, Hasan Jamil, "OntoMatch: A Monotonically
Improving Schema Matching System for Autonomous Data Integration",
IEEE International Conference on Information Reuse and Integration,
Las Vegas, Nevada, United States, August 2009.

[7] Shazzad Hosain, Hasan Jamil, "An Algebraic Language for Semantic
Data Integration on the Hidden Web", IEEE International Conference
on Semantic Computing, Berkeley, California, United States,
September 2009.

[8] Shahriyar Hossain, Hasan Jamil, "A Visual Interface for
on-the-fly Biological Database Integration and Workflow Design
using VizBuilder", International Workshop on Data Integration in
Life Sciences (DILS), Manchester, United Kingdom, July 2009.

URL: http://integra.cs.wayne.edu

Contact Person: Hasan Jamil (jamil@cs.wayne.edu)

Monday, May 10, 2010

ISCB Member Feedback Sought on (Draft) Literature Open Public Access Policy Statement

International Society for Computational Biology
ISCB (Draft) Literature Open Public Access Policy Statement
Draft approved by the ISCB Board of Directors on April 8, 2010

Open for comment from the ISCB membership
and bioinformatics community

Comment period closes June 11, 2010

The International Society for Computational Biology (ISCB) is dedicated to advancing human knowledge at the intersection of computation and life sciences. This (draft) ISCB policy statement is intended to express strong support for open public access to the archival scientific and technical literature and to elucidate in more nuanced detail the position of ISCB on this important issue in scientific publication.

An official ISCB policy statement on the closely related topic of sharing software provides very clear support for Open Source/Open Access (http://www.iscb.org/iscb-policy-statements-/187). ISCB supports many of the principles of the “Budapest Open Access Initiative,” the “Bethesda Declaration on Open Access Publishing,” the U.S. National Academies of Sciences report on "Sharing Publication-Related Data and Materials: Responsibilities of Authorship in the Life Sciences," and the European “Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities.”

Knowledge is the fruit of the scientific research endeavor, and the archival scientific literature is its practical expression and means of communication. Shared knowledge multiplies its utility because every new scientific discovery is built upon previous scientific knowledge. Access to knowledge is the power to solve new problems and make informed decisions. More open public access to archival scientific and technical knowledge will empower more citizens and more scientists to solve more problems and make more informed decisions.

    1. There should be free, open, online, public access to research results in the archival scientific and technical literature, with all their existing content including supplementary material and data. This access may be at an interval following publication, which interval should not exceed one year.

    2. Existing models show high impact, scientific benefit, feasibility, and acceptability:

      a. The public benefit from open access to the world’s online information via the publicly-funded Internet provides a good model of expected impact.

      b. The scientific fertilization from open access to genomic information via the publicly-funded Human Genome Project provides a good model of expected scientific benefit.

      c. Open access policies by the U.S. National Institutes of Health, the Howard Hughes Medical Institute, and the Wellcome Trust provide good models of feasibility, acceptability, and implementation.

    3. Open literature access will enable a whole new generation of innovative tools and mechanisms that will endow the literature with enriched commentary and usability and connect the literature and databases via the proper pointers. These tools are already being built by publishers, researchers, and others. The creation of the web of knowledge around publications is an important consequence of semantic enrichment of the literature.

    4. Whether publications are immediately available or delayed, the underlying data and methods must be publicly available in sufficient detail to allow replication of the results and application of other computational methods to the data.

    5. Policy details — which version, where stored, how annotated and organized, what incentives — must be considered carefully. However, it has become essential to put forward a broad policy mandate for eventual public access to research knowledge.

    6. Publishing high-quality peer-reviewed scientific literature incurs costs. We recognize that cost recovery is a serious issue that must be addressed carefully if open access is to be a mandated policy.

    7. The funding policy must:

      a. Fund activities of peer review, copy editing, and publishing.

      b. Provide fair compensation, if and where needed, to facilitate transitions and adaptations to new models for publishing and sustaining essential revenue.

      c. Be consistent with government laws, other existing regulations, and research dissemination through viable commercial mechanisms.

    8. It is undesirable to take funding from current research and thereby risk underfunding basic science, so new funding should be made available for policy implementation. However, the expected total cost for complete open literature access is only a very small percentage of the total cost for the entire international research endeavor.

Scientific literature represents a substantial investment by governments, foundations, and others. One of our primary missions is the assembly of individual pieces of knowledge from this literature in ways that provide powerful new insights and ideas for next-stage research by the entire scientific community and society in general. We in the ISCB are committed to the continuous enhancement and leveraging of mankind’s knowledge resources. To achieve this goal, investment in open public access to the research literature must be made.

ISCB (Draft) Literature Open Public Access Policy Statement: Appendix

A. Documents mentioned in the statement text.
1. Text of the “Budapest Open Access Initiative.”

2. Text of the “Bethesda Declaration on Open Access Publishing.”

3. Text of the U.S. National Academies of Sciences report on "Sharing Publication-Related Data and Materials: Responsibilities of Authorship in the Life Sciences."

4. Text of the “Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities.”

5. Text of Open Access Policy from the U.S. National Institutes of Health.

6. Text of Open Access Policy from the Howard Hughes Medical Institute.

7. Text of the Wellcome Trust's "Position Statement in Support of Open and Unrestricted Access to Published Research."

B. General Background Material.
1. Academic publishing - Wikipedia, the free encyclopedia.

2. Open access (publishing) - Wikipedia, the free encyclopedia.

3. ROARMAP (Registry of Open Access Repository Material Archiving Policies) as recommended by the Berlin Declaration.

C. Other Statements and Materials.
1. Text of Public Library of Science "Open Letter to Scientific Publishers" (signed by ~34,000 scholars worldwide).

2. Text of Research Councils of the UK "Access to Research Outputs."

3. Text of European Research Advisory Board Final Report "Scientific Publication: Policy On Open Access."

4. Text of Bulletin of the World Health Organization "Equitable access to scientific and technical information for health."

5. UNESCO EBSCO Open Science Directory.

6. Peter Suber's "Open Access Overview."

Sunday, May 2, 2010

Clinical assessment incorporating a personal genome

In a Lancet paper, Ashley et al. undertook "an integrated analysis of a complete human genome in a clinical context." Their interpretation of the results: "Although challenges remain, our results suggest that whole-genome sequencing can yield useful and clinically relevant information for individual patients."