Tripal Developer's Meeting 2015-08-04

Attendees

Lacey Sanderson (Usask)

Stephen Ficklin, Chun-Huai Cheng (WSU)

Meg Staton, Nathan Henry, Ming Chen (UTK)

Nathan Weeks (USDA-ARS/Ames)

Vijaya Tsavatapalli, Chris Childers (USDA/NAL)

Andrew Farmer, Pooja (NCGR)
Gerard Lazo (USDA-ARS/Albany)

  • Non Chado Data?
    • Andrew Farmer:  doing a bit of prototyping to associate URLs with BAM files via the biomaterial table.  Experiment metadata for RNA-seq data would be stored there.  Focusing on importing data from NCBI BioSample/BioProject.

    • Stephen Ficklin: a file module.

    • Nathan Weeks:  Brilliant Gallery for showing image data.  Also using a customized version of the FileBrowser Drupal module for browser-based access to Excel spreadsheets with metadata.

    • Chris Childers: documents (tutorials/guides/etc.) including flat files, PDF documents and other documents.

    • Gerard Lazo:  Drupal wrapper around GrainGenes.  Looking to migrate.

  • RNA-Seq Module (Meg Staton)

    • Adding RNA-Seq expression data to Chado is not intuitive. Here is a Google Doc describing some potential ways forward.

    • Andrew Farmer:  was importing XML from BioProject/BioSample, and the metadata fit nicely into existing Chado tables.  Did not represent experimental design.  Interested in pursuing work on RNA-seq.

    • Should we request table changes or create a new module for RNA-Seq data?

    • Perhaps a link between biomaterial + library and biomaterial + stock.

    • SolGenomics Network is developing an RNA-Seq module. Perhaps not in Chado, but they are willing to talk Chado.

    • Where to put quantitative information?  Need to link features to samples and link expression quantities.

    • Meg can make the code available for others to see.

    • Andrew’s group can share the code for importing from NCBI BioSample/BioProject.

  • Page redesign & gene pages.

    • Would folks be interested in i5K templates for gene pages?  They can share.

  • Site-wide indexing

    • Meeting notes from 2015/7/24

    • Meeting notes from 2015/7/16

    • Summary:

      • Goal: provide a recommended method, perhaps a module, and a set of documentation to help others set up site-wide indexing on their site.

      • Discussed different implementations for a site-wide searching method:  Drupal default search, Apache Solr, Elasticsearch

      • Needs:

        • Only data made available to end users should be indexed, not necessarily everything in Chado.

        • Search results should highlight matched words on page.

        • Would like to integrate with a “basket” system for later transformation of data.

      • Vijaya Tsavatapalli of USDA/NAL is working on an Apache Solr module.

      • Valentin Guignon (Bioversity International, CGIAR) is working on an Elasticsearch module.

    • Action Items

      • Vijaya will work on the administrative section of her Apache Solr module to provide more customization.

      • Documentation for the module on tripal.info

      • More input from Valentin.

 

  • Chado/Tripal Sustainability (Stress Testing)

    • Meeting notes from 2015/7/22

    • Summary:

      • Goal:  plan for, test and attempt to resolve issues related to poor performance:

        • Syncing takes too long

        • Setting URL alias takes too long

        • Loading GFF?

        • Missing indexes in Chado

        • Bulk loader can be slow for large data sets.

        • ND tables:  a single genotype/phenotype takes a large number of records.

        • MViews bloat the database.

      • Ideas:

      • Action Items

        • Chris created a Python script that takes a GFF file and a set of FASTA files (assembly/CDS/cDNA/peptide) and generates a set of new data.

        • Andrew is working on a test server at iPlant.

        • Lacey: it will likely be a month before I get some time to work on this :(

        • Stephen: will see if there’s another solution.  Can we get AWS credits for it?

        • Stephen & Chun-Huai to discuss creating data for data at Mainlab.

        • Some needed improvements

          • Stephen will try to fit in improvements to the Materialized Views + documentation in the coming months.

          • Restrict setting URL aliases to a limited set of items.  Currently crashing because of memory issues.

        • Stephen to work the syncing patch that Nathan created into the Tripal package so folks can at least apply it.