Metagenomic Analysis Script
We have written a script to automate the analysis of metagenomic sequencing data. The script is a wrapper that uses the analysis functions of mothur (https://www.mothur.org/).
E4D_RT
Electrical resistivity tomography (ERT) is a well-established method of imaging the electrical conductivity structure of the subsurface. Electrical conductivity is a useful metric for understanding the subsurface because it is governed by geology, mineralogy, and geochemistry. ERT has been used to characterize subsurface geologic structure and to monitor the evolution of subsurface processes ranging from microbially induced mineral transformations to steam migration through unsaturated sediments. Such time-lapse imaging typically involves collecting data during the process of interest and then processing the data to produce images days, weeks, or months after the process has occurred.
Smart Monitoring and Diagnostic System (SMDS) (Methodology for Automated Detection of Degradation and Faults in Packaged Air Conditioners and Heat Pumps Using Only Two Sensors)
The invention was created in the process of developing a system known as the Smart Monitoring and Diagnostic System (SMDS) for packaged air conditioners and heat pumps used on commercial buildings (known as RTUs). The SMDS provides automated remote monitoring and detection of performance degradation and faults in these RTUs and could increase the awareness by building owners and maintenance providers of the condition of the equipment, the cost of operating it in degraded condition, and the quality of maintenance and repair service when it is performed. The SMDS provides these capabilities and would enable condition-based maintenance rather than the reactive and schedule-based preventive maintenance commonly used today, when maintenance of RTUs is done at all. Improved maintenance would help ensure persistent peak operating efficiencies, reducing energy consumption by an estimated 10% to 30%. The SMDS detects RTU performance degradation and select operational faults and provides weekly estimates of the accrued energy and cost impacts after a performance issue is detected. It uses only two sensors per RTU, outdoor-air temperature and total power demand, both collected at one-minute intervals. Software code based on the methodology and algorithms developed by PNNL in a collaborative development effort with NorthWrite Inc. and Universal Devices implements the detection processes and the estimation of energy and cost impacts. The methodology, algorithms, and original prototype software were developed exclusively by PNNL. Two versions of the SMDS, the Hardware SMDS and the Cloud SMDS, have been prototyped and tested on buildings in the field and/or on data from buildings. Both versions use the same methodologies for detection of performance degradation and for detection of specific faults.
The inventive features are the methodologies developed for 1) detecting performance degradation in cooling by packaged air conditioners and heat pumps, 2) detecting operational faults in these units, and 3) inferring building operating schedules, all using only sensed measurements of the local (e.g., on the rooftop of the buildings where the RTUs are installed) outdoor-air temperature and the total power consumption of the RTU.
Simulations of State for Biology
The invention is the development of a simulation capability that allows the prediction of concentrations of chemical species, rates, and energy requirements of systems of coupled reactions without requiring the use of rate constants. The technology, often referred to as simulations of state or state simulations, is based on the concept of simulating states (outcomes of reactions) rather than the reactions themselves. This approach has been widely used to model equilibrium states. We are adopting it to model non-equilibrium states. This technology is particularly attractive for the domain of biology because the determination of rate parameters for a simulation (rate constants) is incredibly difficult. Other approaches, specifically flux-based approaches, have limited predictive power and are more appropriately described as high-end data analysis methods.
Lipid Mini-On (NIH GRANT NO. HL122703, iEdison No. 0685901-18-0005)
The Lipid Mini-On (Lipid Mining and Ontology) is an R-based script. Users will mainly access it through a graphical user interface developed as a Shiny app (https://shiny.rstudio.com/). The software allows users to mine lipids and perform enrichment analysis on them based on their molecular characteristics and their classification at multiple levels.
PerSeq: A workflow for functional and taxonomic classification of sequences
Annotating sequence function and linking those functions to taxonomic groups is challenging because many established tools perform one task but not the other, which produces disparate downstream data. PerSeq aims to address this challenge by performing local alignments against a reference database annotated for both function and taxonomy, so that each sequence is annotated individually. Downstream results are merged into count tables, which facilitates complex analyses based on observed function or functional potential and allows function to be separated by taxonomic level to determine the metabolic contributions of the organisms present.
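The merge into count tables can be pictured with a minimal Python sketch. It is an illustration only and not PerSeq code: the record layout, the function identifiers, the semicolon-delimited lineage strings, and the names `annotations` and `count_tables` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical per-sequence annotations: (sequence id, function label, lineage).
# These records are invented for illustration and are not PerSeq output.
annotations = [
    ("read1", "K00001", "Bacteria;Proteobacteria"),
    ("read2", "K00001", "Bacteria;Firmicutes"),
    ("read3", "K00002", "Bacteria;Proteobacteria"),
    ("read4", "K00001", "Bacteria;Proteobacteria"),
]


def count_tables(records, tax_level):
    """Return (function counts, function-by-taxon counts) for one sample."""
    by_function = Counter()
    by_function_taxon = Counter()
    for _seq_id, function, lineage in records:
        # Truncate the lineage to the requested depth (1 = first rank only).
        taxon = ";".join(lineage.split(";")[:tax_level])
        by_function[function] += 1
        by_function_taxon[(function, taxon)] += 1
    return by_function, by_function_taxon


functions, split_by_taxon = count_tables(annotations, tax_level=2)
```

Because every sequence carries both labels, the same records yield a function-only table and a table split at any taxonomic depth, simply by changing `tax_level`.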
Unique Building Identifier Generator (demo site and source code)
Physical addresses, which have been used by humans for centuries to locate places in the physical world, are not sufficient to identify buildings or places in a digital world. Difficulty in joining data from disparate sources in a single location, due to the different ways of identifying buildings using addresses or local numbering systems, becomes a major obstacle to information and knowledge exchange. We developed a grid reference system-based Unique Building Identifier (UBID). The method converts a building footprint on a map to a unique code. Unlike geographic coordinates (latitude and longitude), which require high precision and can only locate points on the earth, the UBID code records two-dimensional information about a building footprint in a concise and practical manner. The UBID itself is neither a database nor a data schema. Rather, it acts as an external common key between databases to facilitate data mapping. The developed ruleset ensures that individual implementers can reach the same conclusion using publicly available digital maps. It can also automatically resolve discrepancies caused by changes in physical space or structure over time during data exchange. The method is also flexible and scalable enough to identify a portion of a building or a group of buildings.
Mercat: a versatile counter and diversity estimator for database-independent property analysis obtained from whole community sequencing data
Mercat is a highly scalable software package for robust analysis of features in next-generation sequencing data and of observed unique peptides from metaproteomic data. Mercat is offered in Python 3.5/anaconda3, runs in parallel, and is easily installed using bioconda/conda recipes. Mercat inputs include assembled contigs, raw sequence reads from any platform, and unique peptide files obtained from proteomics with feature abundance count tables. Mercat is the only software available that allows direct analysis of data properties without database-dependent search tools such as BLAST or DIAMOND for compositional analysis of whole community shotgun sequencing (e.g., metagenomes and/or metatranscriptomes) or metaproteomic data.
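As a rough illustration of counting features and estimating diversity without a reference database, the Python sketch below counts fixed-length subsequences (k-mers) in a few reads and computes the Shannon index of the resulting counts. Treating the counted features as k-mers is an assumption of this example, since the description above does not name the feature type, and `count_kmers` and `shannon_diversity` are hypothetical names rather than Mercat functions.

```python
import math
from collections import Counter


def count_kmers(sequences, k):
    """Count every overlapping length-k substring across the input sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over the observed feature frequencies."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Toy reads invented for illustration.
reads = ["ACGTAC", "ACGT"]
kmer_counts = count_kmers(reads, k=3)
diversity = shannon_diversity(kmer_counts)
```

Nothing in this calculation consults a reference database; the counts and the diversity estimate depend only on the input sequences.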
eSTOMP Open Source
The eSTOMP-WR mode is designed to efficiently simulate isothermal variably saturated flow (Richards Equation) and multicomponent reactions in porous media on the most powerful computers available.
GIBS: A Grand Canonical Monte Carlo simulation program for computing ion distributions around biomolecules in hard sphere solvents.
The GIBS software program is a Grand Canonical Monte Carlo (GCMC) simulation program (written in C++) that can be used for 1) computing the excess chemical potential of ions and the mean activity coefficients of salts in homogeneous electrolyte solutions, and 2) computing the distribution of ions around fixed macromolecules such as nucleic acids and proteins. The solvent can be represented as neutral hard spheres or as a dielectric continuum. The ions are represented as charged hard spheres that can interact via Coulomb, hard-sphere, or Lennard-Jones potentials. In addition to hard-sphere repulsions, the ions can also be made to interact with the solvent hard spheres via short-ranged attractive square-well potentials. In GIBS, the excess chemical potential of ions is computed using the adaptive iterative GCMC algorithm developed by Malasics and Boda (Journal of Chemical Physics, 132, 2010). The standard Metropolis algorithm is used to sample the distribution of ions, which determines the acceptance rates for inserting, deleting, and displacing an ion at each simulation step. The site for inserting an ion is randomly selected based on a cavity-biased, grid-insertion algorithm developed by Woo et al. (Journal of Chemical Physics, 121, 2004). GIBS can handle systems of different ion sizes and implements an efficient algorithm to track the list of cavities available for each particle type (ion and solvent hard spheres) after every single-particle insertion/deletion/displacement, and to quickly sample this list and select the site for inserting a particle. The GIBS program was written by Dr. Dennis G. Thomas in collaboration with Dr. Nathan A. Baker at Pacific Northwest National Laboratory. The program was developed as part of projects funded by the National Institutes of Health through R01 Grant Nos. GM076121-04S1 and GM099450.
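For readers unfamiliar with the Metropolis rule in a grand canonical setting, the Python sketch below gives the textbook acceptance probabilities for displacing, inserting, and deleting a single particle. It is not the GIBS implementation: it uses unbiased insertion rather than the cavity-biased grid insertion described above, it is written in Python rather than C++, and the function names and the parameters `beta`, `mu`, `volume`, and `lambda3` (the cubed thermal wavelength) are assumptions made for this example.

```python
import math
import random


def _min_one_exp(log_ratio):
    """Return min(1, exp(log_ratio)) without overflowing for large arguments."""
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def p_displace(delta_u, beta):
    """Moving one particle: accept with probability min(1, exp(-beta * dU))."""
    return _min_one_exp(-beta * delta_u)


def p_insert(delta_u, beta, mu, n, volume, lambda3=1.0):
    """Unbiased insertion going from n to n + 1 particles."""
    prefactor = math.log(volume / (lambda3 * (n + 1)))
    return _min_one_exp(prefactor + beta * (mu - delta_u))


def p_delete(delta_u, beta, mu, n, volume, lambda3=1.0):
    """Unbiased deletion going from n to n - 1 particles."""
    if n == 0:
        return 0.0  # nothing to delete
    prefactor = math.log(lambda3 * n / volume)
    return _min_one_exp(prefactor - beta * (mu + delta_u))


def accept(probability, rng=random):
    """Draw a uniform random number and accept the move with this probability."""
    return rng.random() < probability
```

A hard-sphere overlap can be passed as `delta_u = float("inf")`, which gives an acceptance probability of zero for displacement and insertion. A cavity-biased scheme changes the insertion and deletion prefactors, so these formulas would need the corresponding bias correction before being compared with GIBS output.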