LinuxHPC.org/Cluster Builder 1.3

Bioinformatics

Bioinformatics and computational biology involve the use of techniques from applied mathematics, informatics, statistics, and computer science to solve biological problems. Research in computational biology often overlaps with systems biology. Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling of evolution.


Bioinformatics vs. Computational Biology

The terms bioinformatics and computational biology are often used interchangeably, although the former typically focuses on algorithm development and specific computational methods, while the latter focuses more on hypothesis testing and discovery in the biological domain. Although the National Institutes of Health uses this distinction in its working definitions of bioinformatics and computational biology, there is clearly a tight coupling of developments and knowledge between the more hypothesis-driven research in computational biology and the technique-driven research in bioinformatics. Computational biology also includes lesser-known but equally important subdisciplines such as computational biochemistry and computational biophysics.

A common thread in bioinformatics and computational biology projects is the use of mathematical tools to extract useful information from noisy data produced by high-throughput biological techniques, such as those used in genomics. (The field of data mining overlaps with computational biology in this regard.) A representative problem in bioinformatics is the assembly of high-quality DNA sequences from fragmentary "shotgun" DNA sequencing reads, while in computational biology a representative problem might be the statistical testing of a hypothesis of common gene regulation using data from mRNA microarrays or mass spectrometry.
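To make the shotgun assembly problem concrete, here is a minimal Python sketch of a greedy overlap-merge strategy. It is a toy illustration only: the function names (overlap, greedy_assemble) and the min_len parameter are hypothetical, the reads are assumed to be error-free and from a single strand, and production assemblers use far more sophisticated methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, or 0 if no such overlap is at least `min_len` long."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap.

    Assumption: reads are error-free and all come from the same strand.
    Stops when no remaining pair overlaps by at least `min_len` bases.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCGGA"]))
```

The greedy choice keeps the sketch short, but it can pick a wrong merge when the underlying sequence contains repeats, which is one reason real assembly is a hard problem.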



Software Tools


The computational biology tool best known among biologists is probably BLAST, an algorithm for searching large databases of protein or DNA sequences. NCBI provides a popular implementation that searches its massive sequence databases. Bioinformatic meta search engines (Entrez, Bioinformatic Harvester) help find relevant information across several databases. There is also free Web-based software designed for structural bioinformatics, such as STING.
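The idea of searching a sequence database can be illustrated with a small Python sketch. It is not BLAST: it only indexes short words (k-mers) and ranks database sequences by how many words they share with the query, and it leaves out the hit extension and statistical scoring that a real search tool performs. The function names, the word length k, and the example sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def build_kmer_index(database: dict[str, str], k: int = 4) -> dict[str, set[str]]:
    """Map every length-k word to the names of the sequences containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def seed_search(query: str, index: dict[str, set[str]], k: int = 4) -> list[tuple[str, int]]:
    """Rank database sequences by the number of distinct query words they share.

    `k` must match the word length used to build the index.
    """
    counts: dict[str, int] = defaultdict(int)
    query_kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    for kmer in query_kmers:
        for name in index.get(kmer, ()):
            counts[name] += 1
    # Highest count first; ties broken alphabetically by sequence name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical toy database, not real biological data.
    database = {
        "seqA": "ATGCGTACCGGA",
        "seqB": "TTTTGGGGCCCC",
        "seqC": "GGGATGCGTTTT",
    }
    index = build_kmer_index(database)
    print(seed_search("ATGCGTAC", index))
```

Building the index once and reusing it for many queries is what makes this kind of lookup practical on large databases.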

Computer scripting languages such as Perl and Python are often used to interface with biological databases and parse output from bioinformatics programs. Communities of bioinformatics programmers have set up free/open source projects such as EMBOSS, Bioconductor, BioPerl, BioLinux, BioPython, BioRuby, and BioJava which develop and distribute shared programming tools and objects (as program modules) that make bioinformatics easier.
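As an illustration of this kind of scripting, the following Python sketch parses sequences in the widely used FASTA text format and computes a simple statistic for each record. It uses only the standard library; the function names parse_fasta and gc_content and the example records are hypothetical. In practice a project such as BioPython provides tested parsers that would normally be used instead.

```python
from typing import Iterable, Iterator


def parse_fasta(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    A record starts with a '>' header line; the following lines, up to the
    next header, are joined into one sequence. Blank lines are ignored.
    """
    header = None
    chunks: list[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Hypothetical example input, not real data.
    example = ">seq1 example\nATGC\nGGCC\n>seq2\nAATT\n"
    for name, sequence in parse_fasta(example.splitlines()):
        print(name, sequence, gc_content(sequence))
```

Writing the parser as a generator means large files can be processed one record at a time, for example by passing an open file object directly to parse_fasta.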


Program Modules
  • VigyaanCD is an integrated software workbench that bundles many of the free/open source tools described above, along with many others.
  • Taverna is an open-source bioinformatics workbench that uses a workflow model of experimental design. It is included in the myGRID package of e-science software.
  • Quantum 3.1 is an example of bioinformatics post-QSAR technology, which applies quantum and molecular physics instead of statistical methods.
  • Genevestigator is an example of how large-scale gene expression microarray data can be used to predict gene function based on contextual information.

More recently, SOAP-based interfaces have been developed for a wide variety of bioinformatics applications, including BLAST, FASTA, EMBOSS, ClustalW, T-Coffee, MUSCLE, and many others. These are available from the EBI at EBI Web Services.

All text used in this article is available under the GNU Free Documentation License. It uses material from the Wikipedia article "Bioinformatics".