Function logos of structurally aligned tRNA data as calculated by LOGOFUN [36] for two groups of Alphaproteobacteria and overview of tRNA-CIF-based binary phyloclassification.
Function logos generalize sequence logos. They are the sole means by which we predict tRNA Class-Informative Features (CIFs), which form the basis of the scoring schemes of the classifiers reported in this work. A full derivation of the mathematics of function logos is provided in [36]. The tRNA-CIF-based phyloclassifier shown in Figure 3A sums differences in heights of features between two function logos for a set of genomically derived tRNAs. Complete source code and data to reproduce the function logos in this figure are in Dataset S1.
doi:10.1371/journal.pcbi.1003454.g002
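As a rough illustration of the scoring idea in the caption above (summing differences in feature heights between two function logos over a genome's tRNAs), here is a minimal sketch. It is not the published implementation. The feature representation (functional class, alignment position, nucleotide), the names `score_genome` and `classify`, the labels "A" and "B", and the zero threshold are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical feature key: (functional class, 1-based alignment position, nucleotide).
Feature = Tuple[str, int, str]
# Hypothetical logo representation: feature -> height in the function logo.
Logo = Dict[Feature, float]


def trna_features(functional_class: str, aligned_seq: str) -> Iterable[Feature]:
    """Yield the features carried by one structurally aligned tRNA (gaps skipped)."""
    for pos, base in enumerate(aligned_seq, start=1):
        if base != "-":
            yield (functional_class, pos, base)


def score_genome(trnas: List[Tuple[str, str]], logo_a: Logo, logo_b: Logo) -> float:
    """Sum, over all tRNAs and their features, height in logo A minus height in logo B.

    Features absent from a logo are treated as having height 0 (an assumption).
    """
    total = 0.0
    for functional_class, aligned_seq in trnas:
        for feat in trna_features(functional_class, aligned_seq):
            total += logo_a.get(feat, 0.0) - logo_b.get(feat, 0.0)
    return total


def classify(trnas: List[Tuple[str, str]], logo_a: Logo, logo_b: Logo) -> str:
    """Binary call: positive score favors group A, otherwise group B (ties go to B)."""
    return "A" if score_genome(trnas, logo_a, logo_b) > 0 else "B"


# Toy logos with made-up heights, for demonstration only.
toy_logo_a: Logo = {("Ala", 1, "G"): 1.5, ("Ala", 2, "C"): 0.5}
toy_logo_b: Logo = {("Ala", 1, "G"): 0.5, ("Ala", 2, "U"): 2.0}
```

With these toy logos, `classify([("Ala", "GC")], toy_logo_a, toy_logo_b)` favors group A, because both features of that tRNA are taller in logo A than in logo B.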
About Me
I recently participated in a seven-week competitive internship (Insight Data Science) designed to foster the transition of PhD scientists into Data Science positions in industry. Please check out my three-week prototype, kittytwin, an application that uses Principal Component Analysis to match human faces to cats currently up for adoption in select cities. I have just taken a position to develop data products and lead the Remote program for Insight Data Science, which fosters the transition of quantitative PhDs to industry.
Research
My research interests have always centered on the interface of math and biology, and on how the two can be combined in unique ways to exploit the central dogma of molecular biology and glean new knowledge. I have recently taken courses in data science, and am enjoying learning about new tools used to creatively process and interpret data. To view works-in-progress on my GitHub page, click here.
As a computational biologist in the Department of Viticulture & Enology at UC Davis, I explored grapevine genomes and the genomes of their fungal nemeses by processing and analyzing next-generation sequencing data. Along with others in my lab, I investigated how grapevines respond to pathogens in an attempt to characterize effective methods for disease management.
The primary project of my PhD studies implemented a machine learning classifier that uses transfer RNAs and a unique Information Theory-based scoring system to categorize bacteria whose genomes contain GC bias, a type of bias that violates assumptions in classic phylogenetic methods. This work was published in PLoS Computational Biology (see figure).
Program Director, Data Scientist
Insight Data Science
E-mail: [email protected]