Lacey Sanderson (Usask)
Stephen Ficklin, Chun-Huai Cheng (WSU)
Meg Staton, Nathan Henry, Ming Chen (UTK)
Nathan Weeks (USDA-ARS/Ames)
Vijaya, Chris Childers (USDA/NAL)
Andrew, Pooja (NCGR)
Gerard Lazo (USDA-ARS/Albany)
Non-Chado Data?
-
-
Andrew Farmer: did a bit of prototyping to associate URLs of BAM files with samples via the biomaterial table. Experiment metadata for RNA-Seq data would be stored there. Focusing on importing data from NCBI BioSample/BioProject.
-
Stephen Ficklin: a file module.
-
Nathan Weeks: Brilliant Gallery for showing image data. Also using a customized version of the FileBrowser Drupal module for browser-based access to Excel spreadsheets with metadata.
-
Chris Childers: documents (tutorials/guides/etc.) including flat files, PDF documents and other documents.
-
Gerard Lazo: Drupal wrapper around GrainGenes. Looking to migrate.
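The BioSample import that Andrew described above could be sketched roughly as follows. This is only an illustration, not Andrew's code: the XML is a made-up record shaped like NCBI BioSample output, the function name `parse_biosamples` is hypothetical, and storing the BAM URL as a property named `bam_url` is an assumption about how the biomaterial table might be used.

```python
import xml.etree.ElementTree as ET

# Made-up record shaped like NCBI BioSample XML; accession and values are illustrative only.
SAMPLE_XML = """<BioSampleSet>
  <BioSample accession="SAMN00000001">
    <Description>
      <Title>Leaf tissue, drought treatment</Title>
    </Description>
    <Attributes>
      <Attribute attribute_name="tissue">leaf</Attribute>
      <Attribute attribute_name="treatment">drought</Attribute>
    </Attributes>
  </BioSample>
</BioSampleSet>"""


def parse_biosamples(xml_text, bam_urls=None):
    """Return dicts shaped like biomaterial rows with biomaterialprop-style properties."""
    bam_urls = bam_urls or {}
    records = []
    root = ET.fromstring(xml_text)
    for sample in root.iter("BioSample"):
        accession = sample.get("accession")
        title = sample.findtext("Description/Title", default="")
        # Each BioSample attribute becomes one property (type, value).
        props = [
            {"type": attr.get("attribute_name"), "value": (attr.text or "").strip()}
            for attr in sample.iter("Attribute")
        ]
        # Assumption: the BAM file URL is kept as one more property of the sample.
        if accession in bam_urls:
            props.append({"type": "bam_url", "value": bam_urls[accession]})
        records.append({"name": accession, "description": title, "props": props})
    return records


records = parse_biosamples(
    SAMPLE_XML, {"SAMN00000001": "https://example.org/data/SAMN00000001.bam"}
)
for rec in records:
    print(rec["name"], rec["description"], [(p["type"], p["value"]) for p in rec["props"]])
```

The resulting dictionaries would still need to be mapped to controlled vocabulary terms before loading into Chado; that step is left out here.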
-
-
RNA-Seq Module (Meg Staton)
-
Adding RNA-Seq expression data to Chado is not intuitive. Here is a Google Doc describing some potential ways forward.
-
Andrew Farmer: was importing XML from BioProject/BioSample, and the metadata fit nicely into existing Chado tables. It did not represent experimental design. Interested in pursuing work on RNA-Seq.
-
Should we request table changes or a new module for RNA-Seq data?
-
Perhaps a link between biomaterial + library and biomaterial + stock.
-
Sol Genomics Network is developing an RNA-Seq module. It is perhaps not in Chado, but they are willing to talk about Chado.
-
Where should quantitative information go? Features need to be linked to samples, and expression quantities need to be linked to both.
-
Meg can make the code available for others to see.
-
Andrew’s group can share the code for importing from NCBI BioSample/BioProject.
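To make the linking questions above concrete, here is a minimal sketch of what linker tables might look like. It is not a proposal that was agreed on: the tables `library_biomaterial` and `feature_expression` are hypothetical names, the `feature`, `biomaterial` and `library` tables are cut down to a couple of columns, and SQLite stands in for PostgreSQL so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Heavily simplified stand-ins for existing Chado tables.
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, uniquename TEXT UNIQUE);
CREATE TABLE biomaterial (biomaterial_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE library (library_id INTEGER PRIMARY KEY, name TEXT UNIQUE);

-- Hypothetical linker: which sample a sequencing library was made from.
CREATE TABLE library_biomaterial (
    library_id INTEGER NOT NULL REFERENCES library (library_id),
    biomaterial_id INTEGER NOT NULL REFERENCES biomaterial (biomaterial_id),
    UNIQUE (library_id, biomaterial_id)
);

-- Hypothetical table: one expression quantity per feature, sample and unit.
CREATE TABLE feature_expression (
    feature_id INTEGER NOT NULL REFERENCES feature (feature_id),
    biomaterial_id INTEGER NOT NULL REFERENCES biomaterial (biomaterial_id),
    unit TEXT NOT NULL,
    value REAL NOT NULL,
    UNIQUE (feature_id, biomaterial_id, unit)
);
""")

conn.executemany("INSERT INTO feature (uniquename) VALUES (?)",
                 [("gene0001",), ("gene0002",)])
conn.executemany("INSERT INTO biomaterial (name) VALUES (?)",
                 [("leaf_rep1",), ("root_rep1",)])
conn.execute("INSERT INTO library (name) VALUES (?)", ("lib_leaf",))
conn.execute("INSERT INTO library_biomaterial VALUES (1, 1)")
conn.executemany("INSERT INTO feature_expression VALUES (?, ?, ?, ?)",
                 [(1, 1, "FPKM", 12.5), (1, 2, "FPKM", 0.0), (2, 1, "FPKM", 3.25)])

# Expression of one gene across all samples.
rows = conn.execute("""
    SELECT f.uniquename, b.name, e.value
    FROM feature_expression e
    JOIN feature f USING (feature_id)
    JOIN biomaterial b USING (biomaterial_id)
    WHERE f.uniquename = ?
    ORDER BY b.name
""", ("gene0001",)).fetchall()
print(rows)
```

Keeping the quantity in its own table, keyed by feature and sample, is just one option; the same idea could be expressed through existing Chado tables if they turn out to fit.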
-
-
Page redesign & gene pages.
-
Would folks be interested in i5K templates for gene pages? They can share.
-
-
Site-wide indexing
-
Summary:
-
Goal: provide a recommended method, perhaps a module, and a set of documentation to help others set up site-wide indexing on their site.
-
Discussed different implementations for a site-wide searching method: Drupal default search, Apache Solr, and Elasticsearch.
-
Needs:
-
Only data made available to the end user should be indexed, not necessarily everything in Chado.
-
Search results should highlight matched words on page.
-
Would like to integrate with a “basket” system for later transformation of data.
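The first two needs above can be illustrated with a toy example. This is only a sketch in plain Python, not how Solr, Elasticsearch or Drupal search work internally; the page records, the `published` flag and the function names are all made up for illustration.

```python
import re

# Hypothetical site pages; only published ones should be searchable.
PAGES = [
    {"title": "Gene ABC1",
     "body": "ABC1 encodes a transporter expressed in leaf tissue.",
     "published": True},
    {"title": "Draft annotation",
     "body": "Unreleased transporter models.",
     "published": False},
    {"title": "Leaf RNA-Seq library",
     "body": "Library prepared from leaf samples.",
     "published": True},
]


def build_index(pages):
    """Map each lowercased word to the ids of the published pages containing it."""
    index = {}
    for page_id, page in enumerate(pages):
        if not page["published"]:
            continue  # unpublished content never enters the index
        text = (page["title"] + " " + page["body"]).lower()
        for word in set(re.findall(r"\w+", text)):
            index.setdefault(word, set()).add(page_id)
    return index


def search(index, query):
    """Return ids of pages that contain every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


def highlight(text, query):
    """Wrap each whole-word match of a query term in <strong> tags."""
    words = re.findall(r"\w+", query)
    if not words:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(r"<strong>\1</strong>", text)


index = build_index(PAGES)
for page_id in search(index, "transporter"):
    print(PAGES[page_id]["title"], "|", highlight(PAGES[page_id]["body"], "transporter"))
```

Filtering at indexing time, rather than at query time, means unpublished records can never leak into results, which is the point of the first need.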
-
-
Vijaya Tsavatapalli of USDA/NAL is working on an Apache Solr module.
-
Valentin Guignon (Bioversity International, CGIAR) is working on an Elasticsearch module.
-
-
Action Items
-
Vijaya will work on the administrative section of her Apache Solr module to provide more customization.
-
Documentation for the module on tripal.info
-
More input from Valentin.
-
-
Chado/Tripal Sustainability (Stress Testing)
-
Summary:
-
Goal: plan for, test and attempt to resolve issues related to poor performance:
-
Syncing takes too long
-
Setting URL alias takes too long
-
Loading GFF?
-
Missing indexes in Chado
-
Bulk loader can be slow for large data sets.
-
ND tables: a single genotype/phenotype takes a large number of records.
-
MViews bloat the database.
-
-
Ideas:
-
Stress test by creating dummy data to explore limits of Chado.
-
Improved interface for MViews: better way to share, track FK constraints between them.
-
Improve accessibility of documentation for solutions to common problems.
-
Maybe set up automated testing with something like Travis CI or Jenkins (https://travis-ci.org/, https://jenkins-ci.org/). It looks like an example of testing a Drupal module this way is also available: https://github.com/sonnym/travis-ci-drupal-module-example
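As a rough illustration of the dummy-data idea above, the following sketch generates a random sequence in FASTA format plus a matching GFF3 file with an arbitrary number of genes. It is not Chris's script; the function name `make_dummy_genome`, the `chr_dummy` sequence ID and the fixed gene spacing are all assumptions chosen to keep the example short.

```python
import random


def make_dummy_genome(n_genes, gene_len=300, spacer=100, seed=0):
    """Return (fasta_text, gff3_text) for one random sequence with evenly spaced genes."""
    rng = random.Random(seed)  # seeded so test data sets are reproducible
    seq_len = n_genes * (gene_len + spacer) + spacer
    sequence = "".join(rng.choice("ACGT") for _ in range(seq_len))
    fasta_lines = [">chr_dummy"]
    fasta_lines += [sequence[i:i + 60] for i in range(0, seq_len, 60)]

    gff_lines = ["##gff-version 3", f"##sequence-region chr_dummy 1 {seq_len}"]
    for i in range(n_genes):
        start = spacer + i * (gene_len + spacer) + 1  # GFF3 coordinates are 1-based
        end = start + gene_len - 1
        strand = rng.choice("+-")
        gene_id = f"dummy_gene{i + 1:06d}"
        for ftype, attrs in (
            ("gene", f"ID={gene_id};Name={gene_id}"),
            ("mRNA", f"ID={gene_id}.1;Parent={gene_id}"),
        ):
            gff_lines.append("\t".join(
                ["chr_dummy", "dummy", ftype, str(start), str(end), ".", strand, ".", attrs]
            ))
    return "\n".join(fasta_lines) + "\n", "\n".join(gff_lines) + "\n"


fasta_text, gff_text = make_dummy_genome(n_genes=1000)
print(gff_text.splitlines()[2])
print(len(gff_text.splitlines()), "GFF3 lines")
```

Raising `n_genes` by orders of magnitude and timing each load would give a crude picture of where loading and syncing start to slow down.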
-
-
Action Items
-
Chris created a Python script that takes a GFF file and a set of FASTA files (assembly/CDS/cDNA/peptide) and generates a new set of data.
-
Andrew is working on a test server at iPlant.
-
Lacey: it will likely be a month before I get some time to work on this :(
-
Stephen: will see if there’s another solution. Can we get AWS credits for it?
-
Stephen & Chun-Huai to discuss creating test data from data at Mainlab.
-
Some needed improvements
-
Stephen will try to fit in improvements to the Materialized Views + documentation in the coming months.
-
Restrict setting URL aliases to a limited set of items. It is currently crashing because of memory issues.
-
Stephen to work the syncing patch that Nathan created into the Tripal package so folks can at least apply it.
-
-