Tripal Developer's Meeting 2015-09-01

Meeting Date: 2015-09-01
Attendees

Chris Childers, Vijaya Tsavatapalli (USDA-ARS-NAL)

Valentin Guignon (Bioversity International, CGIAR)

Ethy Cannon (Iowa State University)

Meg Staton and Ming Chen (University of Tennessee Knoxville)

Stephen Ficklin, Chun-Huai Cheng, Taein Lee (WSU)

Lacey Sanderson (Usask)

Andrew Farmer, Alex Rice, Sudhansu Dash (NCGR)

Eric Rasche (TAMU)

Jeremy DeBarry (iPlant)

Nathan Weeks (USDA-ARS Ames)

 
  • Chado/Drupal Stress Testing

    • Chris Childers provided an example script for generating sample genomic data: https://github.com/NAL-i5K/test_data_gen

      • Put scripts for generating test data together in a GitHub account.

      • Is there a place on the database where we could store database dumps?  A repository to see how folks store data in Chado?

        • The Galaxy folks solve this by having individual instance admins give a 5-10 minute talk about “how we’ve set up our instance” at our monthly admin meetings.

    • Andrew Farmer exploring a test server for testing Chado/Tripal at iPlant?  Jeremy DeBarry here to discuss options.

      • Potentially useful: iPlant Atmosphere.  Could set up an Atmosphere instance and experiment with it to see if it would work for our stress testing trials.

      • Can tie into iPlant data store.

      • Potentially a place for a Chado dump.

      • TODO: Stephen will explore iPlant Atmosphere.

    • Can a new community database using Tripal use iPlant for hosting the site?

      • Long-term cost of ownership for maintenance of web servers is an issue.  iPlant could be used as a short-term option…

      • iPlant can help with data and metadata management.

        • Allocation for a user is 100GB by default, but the allocation can be upgraded upon request.

        • Users can add “tuples” to annotate data and can add “template”-based metadata, with custom templates designed.

          • template based metadata will use ontologies.  

        • Will be bringing online bulk metadata association using a CSV file.

        • iPlant Data Commons will have minimal metadata standards to promote re-use and discoverability of the data.

  • CV module & ontology search

    • SO loading link is old, should be https://github.com/The-Sequence-Ontology/SO-Ontologies

      • TODO: need to update the Sequence Ontology link to the new location.

    • No PURL prefixes in db records.

    • The Relationship Ontology (RO) has changed such that it no longer loads: it is missing a namespace, has fields not defined in the OBO v1.2 format, and includes terms from other ontologies.

      • TODO: Stephen will follow up with RO about these issues.

    • ENVO ontology has terms with no names that break the Tripal loader; have not heard back from the group about the meaning and purpose of these, don’t know if it’s proper OBO.

      • TODO: Ethy will follow up with ENVO.

    • Ideas for a Tripal ontology search module.

    • Will it be ontology hierarchy aware?

      • Yes, that is the fundamental purpose.

    • Latest OBO specification is 1.4. This spec does allow terms without names whereas 1.2 did not. So the Tripal loader should be updated accordingly.
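The loader change discussed above can be illustrated with a minimal sketch. This is not the Tripal loader (which is written in PHP); it is a hypothetical Python reader that accepts [Term] stanzas without a name tag instead of failing on them. The function names, the fallback to the term ID for display, and the EX: identifiers in the sample are all assumptions made for illustration.

```python
def parse_obo_terms(text):
    """Return one dict per [Term] stanza; 'name' is None when the tag is absent."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected here.
            current = {"id": None, "name": None} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
            continue
        if current is None or not line or line.startswith("!"):
            continue  # header lines, other stanza types, blanks, comments
        tag, sep, value = line.partition(":")
        if sep and tag in ("id", "name") and current[tag] is None:
            current[tag] = value.strip()
    return terms


def display_name(term):
    """Assumed fallback: show the term ID when no name was supplied."""
    return term["name"] or term["id"]


# Hypothetical sample data; the EX: identifiers are made up.
sample = """format-version: 1.4

[Term]
id: EX:0000001
name: example term

[Term]
id: EX:0000002

[Typedef]
id: part_of
name: part of
"""

terms = parse_obo_terms(sample)
for term in terms:
    print(term["id"], "->", display_name(term))
```

Treating the name as optional at parse time and deciding on a fallback only at display time keeps the stored record faithful to the source file, which is one possible way to support both the 1.2 and 1.4 behaviours.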

  • Developer Best Practices & Tips

    • Tips for debugging Drupal and Tripal modules

    • Review techniques for customization (to keep customizations out of core Tripal code)

    • Handling (or avoiding) inter-module dependencies

      • TODO: talk on this one next month.

  • Site-wide Indexing.

    • i5K group is working on an Apache Solr module, with a customizable admin interface, that can be shared with others.

    • Will refactor to make more generalizable to share with others.

    • Valentin has an ElasticSearch module, but it’s not yet available for others to use.  

    • Valentin: could the key/value pairs used for indexing be separated from the underlying infrastructure, so that any indexing backend consumes the same key/value pairs regardless of the indexing method?

    • TODO: Valentin will send an email with suggestions.
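Valentin's suggestion of separating the key/value pairs from the indexing infrastructure could look roughly like the sketch below. Nothing here comes from the i5K Solr module or Valentin's ElasticSearch module; the class and function names are hypothetical, and the in-memory backend is only a stand-in for a real Solr or ElasticSearch adapter.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class IndexBackend(ABC):
    """Hypothetical interface that each search backend adapter would implement."""

    @abstractmethod
    def index(self, doc_id: str, fields: Dict[str, str]) -> None:
        ...

    @abstractmethod
    def search(self, key: str, value: str) -> List[str]:
        ...


class InMemoryBackend(IndexBackend):
    """Stand-in backend for illustration; a real adapter would call Solr or ElasticSearch."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, str]] = {}

    def index(self, doc_id: str, fields: Dict[str, str]) -> None:
        self._docs[doc_id] = dict(fields)

    def search(self, key: str, value: str) -> List[str]:
        return sorted(d for d, f in self._docs.items() if f.get(key) == value)


def extract_fields(record: Dict[str, Optional[object]]) -> Dict[str, str]:
    """Flatten a record into key/value pairs without knowing which backend is used."""
    return {k: str(v) for k, v in record.items() if v is not None}


def index_records(backend: IndexBackend, records: Dict[str, dict]) -> None:
    """Feed the same key/value pairs to whichever backend was passed in."""
    for doc_id, record in records.items():
        backend.index(doc_id, extract_fields(record))


# Hypothetical records; the IDs and field names are made up.
backend = InMemoryBackend()
index_records(backend, {
    "gene-1": {"type": "gene", "organism": "Example species", "note": None},
    "mrna-1": {"type": "mRNA", "organism": "Example species"},
})
print(backend.search("type", "gene"))
```

With this split, the code that decides what gets indexed (extract_fields) would be shared between sites, and only the adapter class would differ between indexing methods.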

  • Tripal v3.0 (Drupal Entities, Web Services, Semantic Web, Controlled Vocabulary centric)

    • Currently under design.  

    • Will support Drupal Entities

    • Centered on controlled vocabularies rather than Chado.

      • Controlled vocabularies are more intuitive than Chado-centric organization

      • Controlled vocabularies lend themselves to bringing Tripal into the realm of the semantic web, thus allowing web services that afford data exchange between sites.

    • Will necessitate page redesign, where we will try to account for recent input on pages (e.g. gene pages).

    • Benefits

      • Continues to use Chado as the primary database back-end storage.

      • But also supports integration of non-Chado data stores (e.g. NoSQL options).

      • Presentation of data is more intuitive… not Chado-centric.

      • Simplified admin interface… not organized according to Chado tables.

      • Drupal entities are manageable using the Drupal web interface, therefore the need for template customizations is greatly reduced.

      • Web services and page displays contain the same information.

      • We have greater control over “syncing” and setting of URLs so we can better control performance.

      • Potentially smaller code base for Tripal, hopefully supporting faster release cycles.

    • Migration Plans

      • This represents a major change in how data is presented. We will do everything in our power to make the transition from Tripal v2.0 to v3.0 as seamless as possible.

      • We will provide a “beta” test period for folks to help us resolve issues related to conversion.

    • Goal is to have an alpha release by the end of the year.

    • TODO: Stephen to share the design spec for Tripal v3.0.

  • Tripal Codefest at PAG (Meg)

    • How much interest?  Stephen, Lacey, Meg, Valentin, Ralph Mosely.  Some folks interested but too busy.

    • Need a room at the PAG conference.

    • TODO: Meg will send a Doodle poll.

  • Migration of Tripal code to GitHub (but continued syncing with Drupal git).

    • TODO: migration of Tripal to GitHub.

  • Extension module paper.

    • TODO: item to discuss next month.