Computational Text Analysis: for functional genomics and bioinformatics

Soumya Raychaudhuri

Print publication date: 2006

Print ISBN-13: 9780198567400

Published to Oxford Scholarship Online: November 2020

DOI: 10.1093/oso/9780198567400.001.0001



1 An Introduction to Text Analysis in Genomics
Oxford University Press

The February 16th, 2001 issue of Science magazine announced the completion of the human genome project, making the entire nucleotide sequence of the genome available (Venter, Adams et al. 2001). For the first time, a comprehensive data set was available with nucleotide sequences for every gene. This marked the beginning of a new era, the “genomics” era, in which molecular biological science began to shift from the investigation of single genes towards the investigation of all genes in an organism simultaneously.

Alongside the completion of the genome project came the introduction of new high-throughput experimental approaches such as gene expression microarrays, rapid single nucleotide polymorphism detection, and proteomics methods such as yeast two-hybrid screens (Brown and Botstein 1999; Kwok and Chen 2003; Sharff and Jhoti 2003; Zhu, Bilgin et al. 2003). These methods permitted the investigation of hundreds if not thousands of genes simultaneously. With these high-throughput methods, the limiting step in the study of biology began shifting from data collection to data interpretation.

To interpret traditional experimental results that addressed the function of only a single gene or a handful of genes, investigators needed to understand in detail only the few genes addressed in the study and perhaps a handful of other related genes. These investigators needed to be familiar with a comparatively small collection of peer-reviewed publications and prior results. Today, new genomics experimental assays, such as gene expression microarrays, are generating data for thousands of genes simultaneously. The increasing complexity and sophistication of these methods make the resulting data extremely unwieldy for manual analysis, since the number and diversity of genes involved exceed the expertise of any single investigator. The only practical way to analyze these types of data sets is to use computational methods that are unhindered by the volume of modern data.

Bioinformatics is a new field that emphasizes computational methods to analyze such data sets (Lesk 2002). Bioinformatics combines the algorithms and approaches employed in computer science and statistics to analyze, understand, and hypothesize about the large repositories of collected biological data and knowledge.

Keywords:   biological function databases, electronic text resources, gene expression analysis, online journals, potential uses candidate gene identification, sequencing, text resources electronic text, vascular endothelial growth factor, whole-text mining
