Friday, January 28, 2005

Emerging Technologies in the 2005 R&D Funding Report

Battelle recently released its 2005 R&D Funding Report, which was published in the January edition of R&D Magazine. You can see the full report at this site. The report indicates that while industry drove R&D spending in the 1990s, government spending drives investment in new technologies today.

In reading the report, I found the key technologies receiving investment to be interesting. In summary form, they are:
--Materials technologies – for medical device implants and harsh environments.
--Medical diagnostic imaging – for noninvasive diagnostics and better interpretation of images.
--Information mining and assessment – ability to rapidly analyze content from a wide range of topics.
--Environment – managing the environment.
--Energy production and distribution – renewable energies including fuel cells, bioenergy, and hydrogen.
--Medical technology – development of methods for diagnostics including devices, feedback systems, and early-warning systems.
--Anti-terrorism technologies – including bomb detection/deactivation.

In looking at the industry sectors, I saw that total spending in the Bio/Pharmaceutical industry now tops the list, exceeding that of the automotive industry. The list is as follows:

BioPharm $30B
Automotive $27B
Software $24B
Telecom $22B
Semicon $20B
IT $15B
Chemical $13B
Aerospace $11B

I believe this is one indicator showing how we have moved from the “Physics Century” into what is now called the “Bio Century”. Investment, of course, drives technology development – especially emerging technologies. I recommend you read the report and email me with your thoughts. You can reach me at hall.martin@ni.com

Best regards,
Hall T. Martin

Friday, January 21, 2005

Bioinformatics 101

Increasingly, the term Bioinformatics comes up in applications I come across. I began to look into it more so that I could understand the issues scientists face in using tools and databases to accomplish their tasks. Bioinformatics is defined as “the task of organizing and analyzing increasingly complex data resulting from modern molecular and biochemical techniques.”

I spoke with Mia Markey at the University of Texas who gave me some pointers for researching this area. The first area to research is the existence and use of databases. A wealth of information exists in these databases and most are open to researchers around the world at no cost. By using a web link one can search through these databases. The key databases are:

NCBI – sequence data

Stanford Microarray Database – gene expression data

Swiss-Prot – protein sequence database

EMBL-EBI – European Nucleotide Sequence DB

PubMed – scientific papers database

In addition to databases, a number of tools are used such as the following:
BLAST – sequence matcher for alignment applications.
SAGE (Serial Analysis of Gene Expression) – provides absolute results, unlike microarrays, which provide relative results.
FASTA – alignment tool for protein and DNA sequences.
Perl – useful for text string searches. See BioPerl, http://bioperl.org/
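To give a flavor of the kind of text string search mentioned above, here is a minimal sketch written in Python rather than Perl. It simply finds every position where a short motif occurs in a sequence. The function name find_motif and the toy sequence are my own illustrations and are not taken from any database or from BioPerl; real tools such as BLAST do far more than exact matching.

```python
from typing import List


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return the 0-based start positions of every exact match of motif.

    Overlapping matches are included. Matching is case-insensitive.
    """
    if not motif:
        return []
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one character so overlapping matches are also found.
        start = sequence.find(motif, start + 1)
    return positions


# Made-up example sequence, for illustration only.
toy_sequence = "ATGCGATATATGCCATG"
print(find_motif(toy_sequence, "ATG"))
print(find_motif(toy_sequence, "ATA"))
```

An exact string search like this is only the simplest case. Alignment tools also have to cope with mismatches and gaps, which is where the dedicated programs listed above come in.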

According to Markey, the number one problem scientific researchers face is that results from one test do not hold up under repeated testing. The number two problem is the need for better visualization tools for all the voluminous data available.

I was interested in what experience a scientist has in using the above-mentioned databases, so I found an example guide that walked me through the process and let me see what they see. If you are interested in seeing it for yourself, try these steps:

1. Go to GeneCards at nciarray.nci.nih.gov/cards/index.html
2. Type in the name of your disease in the search window.
3. Make note of the gene name and chromosomal location for each gene.
4. Go to the MapViewer at the NCBI Bioinformatics website at www.ncbi.nlm.nih.gov, to visually identify the location of each gene.
5. Choose a gene in which a protein product has been identified. You can check the box titled “proteins” for this information.
6. Click on the Unigene Cluster # or RefSeq# under sequence. The number starts with “NM”. Make note of the gene name and its number.
7. Find the function of the protein in the SwissProt database, http://us.expasy.org/sprot/
8. Find the amino acid sequence for the protein through LocusLink. Go to the LocusLink page on the NCBI web site (see step 4) and type the gene name into the search box.
9. Finally, use PubMed, http://www.pubmedcentral.nih.gov/, to look up any published papers on the topic.

As you can see, the information is spread among several databases, and even a casual search starts to generate tremendous amounts of information that need integration and analysis to make sense.
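One simple way to keep track of what the walkthrough asks you to note down is a small record per gene. The sketch below is only an illustration in Python: the class GeneNote, its field names, and the placeholder values are my own assumptions and do not come from any of the databases. The only rule it borrows from the steps is that the RefSeq number starts with “NM”.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneNote:
    """Notes collected for one gene while working through the steps."""
    gene_name: str                     # step 3
    chromosomal_location: str          # step 3
    refseq_id: str = ""                # step 6, expected to start with "NM"
    protein_function: str = ""         # step 7
    amino_acid_sequence: str = ""      # step 8
    papers: List[str] = field(default_factory=list)  # step 9

    def has_nm_number(self) -> bool:
        """True if the recorded RefSeq number starts with "NM"."""
        return self.refseq_id.startswith("NM")

    def missing_fields(self) -> List[str]:
        """List the pieces of information that still need to be looked up."""
        missing = []
        if not self.has_nm_number():
            missing.append("refseq_id")
        if not self.protein_function:
            missing.append("protein_function")
        if not self.amino_acid_sequence:
            missing.append("amino_acid_sequence")
        if not self.papers:
            missing.append("papers")
        return missing


# Placeholder values only; these are not real database entries.
note = GeneNote(gene_name="GENE1", chromosomal_location="placeholder location")
note.refseq_id = "NM_000000"
print(note.missing_fields())
```

Even a record this small shows the integration problem: each field is filled in from a different site, and nothing ties them together except the gene name you carry from one page to the next.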

This blog is not meant to be a complete tutorial on Bioinformatics, but I found it informative to walk through the above steps to get a flavor for the type of data and analysis that is going on here.

If you have experience with Bioinformatics or an interest in this area, please email me at hall.martin@ni.com.

Best regards,
Hall T. Martin