Tripal User's Meeting 2018/04/06

Attendees

Stephen Ficklin, Sook Jung, WSU
Shawna Spoor, WSU
Ethy Cannon, ISU
Lacey-Anne Sanderson, USask
Andrew Farmer, Sudhansu Dash, NCGR
Abdullah Almsaeed, UTK
Nathan Weeks, USDA-ARS
Malachy O’Connell, ISU
Bradford Condon, UTK
Ming Chen, UTK

Reminders

  • Consider adding your modules into the Tripal GitHub organization.

    • You can make your own team within the GitHub organization. Ask Stephen, Lacey, or Meg for permission to join the Tripal organization.

    • If not, please tag the repo as Tripal.  

  • Remember to update the status of your modules on the Tripal.info site.  

    • If you need a login let us know. You have to email one of the admins.

  • If you have a new site or major updates to your existing site let us know and we can publish those on the Tripal news feed and twitter accounts.

    • http://tripal.info/sites_using_tripal

    • We need to reach out to new Tripal sites to encourage them to get their site listed on tripal.info, etc.

    • Shawna, please talk to Stephen about who on the outreach team is responsible for reaching out. Ethy will try to reach out to those present at PAG.

    • Additional tripal.info needs?  We will add training videos.

    • API in March

  • Send us your logo if you have shared a module with the community and would like it showcased on the Tripal.info site.

  • If you publish on your site please remember to cite Tripal (Sanderson et al. 2013). A Tripal v3 paper will be submitted this year. (We did not have a Tripal v2 paper.)

Agenda

  • Tripal v3 stable release update

    • Planned release for summer

  • Documentation, especially for installation, to help newbies, and for people investigating Tripal for the first time.

    • Might be helpful to have a statement indicating the overall philosophy of Tripal and the problems it is solving.

      • ACTION ITEM: What problems is Tripal solving → put this on the About page of Tripal

        • “Who is doing what with Tripal” ← When I (BC) solve a problem with Tripal, or I make a little module, I try to write a mini blog post about it. We could encourage individual sites to submit blog posts about how they use Tripal to solve a specific problem.

    • Could be interlaced in install process ("now you need to...")

      • ACTION ITEM:  Add an issue to issue queue for this.

    • Detect missing parts (e.g. "you have not yet created the necessary files in ...")

      • ACTION ITEM: combine with step above.

    • A list of FAQs would be helpful.

      • ACTION ITEM:  Add an FAQ to the Tripal website and an issue for this.

  • Related: could we revisit Tripal discussions? The GitHub approach doesn't seem to encourage conversations around questions or ideas the way the e-mail list did.

    • Searchable archives of past discussions: information is now in multiple places (documentation, old e-mail list archives, GitHub, meeting notes, Twitter, ...)

      • TODO:  investigate if this is possible.

      • What are the best tools to get productive in the Tripal environment?

    • Encouraging conversations.

      • Stephen suggests Gitter

        • Used by Galaxy and Apollo

      • Stephen also mentioned the use of Slack

        • Not allowed by the USDA

        • Discouraged but could be used through the web?

      • Lacey suggested adding to the template for issues to make them more welcoming to discussion (https://github.com/tripal/tripal/issues/251)

        • The concern seems to be that there are not enough people listening on GitHub. You can’t do a shout-out to all Tripal users.

        • We can suggest that people subscribe to the issue queue, but then they get all the issue updates as well, which could get annoying.

          • Can you subscribe to issues tagged with “discussion” only?

      • ACTION ITEM:  Find a tool to encourage more discussion.

  • Someone from the LIS group could describe how gene families are saved in Chado, vocabulary used, et cetera.

    • Relates to the phylotree module: what version do we use?

      • Core includes only the loader and tree viewer

    • Gene families started with the idea of displaying trees so they started with the phylogeny module

      • Create gene family definitions based on an HMM, which is the basis for multiple sequence alignments

      • Try to follow the Chado conventions for multiple sequence alignments

      • HMM defines the consensus sequence then everything else is aligned to that

      • Use featureprops to say that a given gene is part of the same family as everyone else

      • All members will be represented in the featureprop table to ensure that there is one consistent entry point for what genes are part of this family

      • There is a cvterm that indicates the featureprop is a gene family, and the value is the name/ID of the tree (gene family)

      • Use materialized views for the gene search. These are part of the module definition and can be used to see how the data is stored
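The featureprop approach described above can be sketched roughly as follows. This is only an illustration, not the LIS implementation: it uses an in-memory SQLite database with heavily simplified stand-ins for the Chado cvterm, feature, and featureprop tables (real Chado runs on PostgreSQL and has more columns and constraints), and the gene names, family names, and the "gene family" term name are made up for the example.

```python
import sqlite3

# Simplified stand-ins for three Chado tables (assumption: only the columns
# needed for this illustration are kept).
SCHEMA = """
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    uniquename TEXT NOT NULL,
    type_id INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
CREATE TABLE featureprop (
    featureprop_id INTEGER PRIMARY KEY,
    feature_id INTEGER NOT NULL REFERENCES feature(feature_id),
    type_id INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
    value TEXT,
    rank INTEGER NOT NULL DEFAULT 0
);
"""


def build_demo_db():
    """Create an in-memory database with three genes in two made-up families."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # cvterm 1 types the features, cvterm 2 marks a featureprop as family membership.
    conn.executemany(
        "INSERT INTO cvterm (cvterm_id, name) VALUES (?, ?)",
        [(1, "gene"), (2, "gene family")],
    )
    conn.executemany(
        "INSERT INTO feature (feature_id, uniquename, type_id) VALUES (?, ?, 1)",
        [(1, "geneA"), (2, "geneB"), (3, "geneC")],
    )
    # One featureprop per member; the value holds the family (tree) name.
    conn.executemany(
        "INSERT INTO featureprop (feature_id, type_id, value) VALUES (?, 2, ?)",
        [(1, "family_0001"), (2, "family_0001"), (3, "family_0002")],
    )
    conn.commit()
    return conn


def family_members(conn, family_name):
    """Return the uniquenames of all genes whose featureprop names this family."""
    rows = conn.execute(
        """
        SELECT f.uniquename
        FROM feature f
        JOIN featureprop fp ON fp.feature_id = f.feature_id
        JOIN cvterm t ON t.cvterm_id = fp.type_id
        WHERE t.name = 'gene family' AND fp.value = ?
        ORDER BY f.uniquename
        """,
        (family_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `family_members(build_demo_db(), "family_0001")` lists the genes in that family. Note that membership here is linked only by a text value, which is the point Sook raises later in these notes.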

    • Tripal issue on topic: https://github.com/tripal/tripal/issues/308

    • GitHub Repo: https://github.com/legumeinfo/tripal_phylotree

    • Bradford: can use feature_relationship but then you have to create a feature for the gene family which feels like shoe-horning data into Chado

      • Andrew: If you think of a gene family as an ancestral gene that everything is derived from then it’s a feature like everything else

      • Bradford: the feature table has a non-nullable organism_id, so what is the organism for the group? Do you define the common ancestor in the organism table?

      • Sook: as long as we agree then it’s ok. We store populations in the stock table…

      • Andrew: problem with storing the common ancestor is the genus and species fields…
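A rough sketch of the feature_relationship alternative and the organism_id question discussed above. It is an illustration under stated assumptions, not an agreed design: the tables are simplified SQLite stand-ins for Chado (types are plain text here rather than cvterm references), and the placeholder organism and the "member_of" relationship type are hypothetical.

```python
import sqlite3

# Simplified stand-ins for Chado tables. As in Chado, feature.organism_id and
# organism.genus/species are NOT NULL, which is the constraint under discussion.
SCHEMA = """
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    genus TEXT NOT NULL,
    species TEXT NOT NULL
);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    organism_id INTEGER NOT NULL REFERENCES organism(organism_id),
    uniquename TEXT NOT NULL,
    type TEXT NOT NULL
);
CREATE TABLE feature_relationship (
    feature_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES feature(feature_id),
    object_id INTEGER NOT NULL REFERENCES feature(feature_id),
    type TEXT NOT NULL
);
"""


def try_insert_family(conn, organism_id):
    """Return True if the family feature was stored, False if NOT NULL rejected it."""
    try:
        conn.execute(
            "INSERT INTO feature (organism_id, uniquename, type) VALUES (?, ?, ?)",
            (organism_id, "family_0001", "gene_family"),
        )
        return True
    except sqlite3.IntegrityError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO organism VALUES (1, 'Glycine', 'max')")
    # Hypothetical placeholder organism standing in for the common ancestor.
    conn.execute("INSERT INTO organism VALUES (2, 'ancestral', 'placeholder')")
    conn.execute("INSERT INTO feature VALUES (10, 1, 'geneA', 'gene')")

    without_organism = try_insert_family(conn, None)  # rejected by NOT NULL
    with_placeholder = try_insert_family(conn, 2)     # accepted

    family_id = conn.execute(
        "SELECT feature_id FROM feature WHERE type = 'gene_family'"
    ).fetchone()[0]
    # The member gene is the subject, the family feature is the object.
    conn.execute(
        "INSERT INTO feature_relationship (subject_id, object_id, type) "
        "VALUES (10, ?, 'member_of')",
        (family_id,),
    )
    members = [
        row[0]
        for row in conn.execute(
            "SELECT f.uniquename FROM feature f "
            "JOIN feature_relationship r ON r.subject_id = f.feature_id "
            "WHERE r.object_id = ? AND r.type = 'member_of'",
            (family_id,),
        )
    ]
    return without_organism, with_placeholder, members
```

The sketch shows why the group needs some organism record: the family feature cannot be stored without one, and whatever placeholder is chosen still has to fill the genus and species fields.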

    • Stephen: Core tree loader downloads the full lineage from NCBI for the organisms in the organism table. The higher level nodes are only represented in the tree, not in the organism table.

      • Andrew: might this conflict with our idea of saving the common ancestor in the organism table?

    • Bradford: feels the phylotree is a pretty good model for a generic group model although it’s currently for trees

    • Stephen: if we can agree on an approach then it will be a lot easier to share data

    • We should document Andrew's method and propose it as a recommendation

    • Bradford: if your gene models are not external, do you have to create and maintain a db+dbxref collection for de novo gene models?

    • Sook: Chado is also used by SGN (Naama) and the Venter Institute?

    • As far as suggestions go, we can form our own Tripal group to discuss and then make suggestions to the wider Chado community

    • Sook: comment on storing the gene families in the featureprop table: we shouldn't use the value to link. Perhaps we need a linker table to phylotree for features

    • Andrew: what about people who want gene families but not trees?

    • Stephen: we should organize a meeting with the Chado community about gene families

    • ACTION ITEM: Bradford to organize a time to meet and follow up on his original email

    • ACTION ITEM: Andrew Farmer to write up a more formal document describing how LIS stores gene models

    • ACTION ITEM: Bradford: perhaps we need a blog section of Tripal.info for things like this.

      • This is done by Galaxy. Could be helped by us asking specific people to create blog posts ;-)

      • This could also help with creating a forum for discussion.

      • We do have to be mindful of the searching multiple sources problem

      • If you want to see my tripal posts, they are here: https://www.bradfordcondon.com/tag/tripal/

 

  • Testing

    • Test and Continuous Integration.

      • Runs all the tests and automatically builds a test installation to ensure they all pass. The status is then shown on the GitHub page.

    • In tripal core

    • In your custom modules

      • Use https://github.com/statonlab/TripalTestSuite

        • Composer tool which just downloads the needed files; it doesn’t affect your system at all

      • Creates the testing environment

      • Really easy to install!

      • Provides a special Tripal test class

        • Transaction wrapping of tests

      • Data Seeders :-) to help with ensuring tests have the data they need

        • The test suite will provide a few general ones (like FASTA) but the hope is the community will contribute more

      • Factories allow you to generate 100’s of records in one line :-)

      • Travis-ci.org

        • Can be used to see the tests being run automatically

        • Also makes sure the module installs and builds correctly

        • Requires your module to be on GitHub

        • NOTE: link your GitHub account and it will detect new modules. Once you say “test this module” it will detect code changes and re-run the tests each time :-)
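To make the ideas of transaction-wrapped tests and factories above more concrete, here is a minimal sketch of the general technique. It is not the TripalTestSuite API (which is PHP); it is a Python analogue using the standard `unittest` and `sqlite3` modules, and `feature_factory`, `TransactionTestCase`, and the one-table schema are hypothetical names invented for the illustration.

```python
import sqlite3
import unittest


def make_db():
    """Create an in-memory database; isolation_level=None lets us manage transactions."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, uniquename TEXT NOT NULL)"
    )
    return conn


def feature_factory(conn, count):
    """Hypothetical factory: insert `count` placeholder feature rows in one call."""
    conn.executemany(
        "INSERT INTO feature (uniquename) VALUES (?)",
        [(f"feature_{i}",) for i in range(count)],
    )


class TransactionTestCase(unittest.TestCase):
    """Base class that wraps every test in a transaction and rolls it back."""

    conn = make_db()  # one database shared by all tests

    def setUp(self):
        self.conn.execute("BEGIN")

    def tearDown(self):
        # Undo whatever the test wrote so the next test starts clean.
        self.conn.execute("ROLLBACK")

    def count_features(self):
        return self.conn.execute("SELECT COUNT(*) FROM feature").fetchone()[0]


class FeatureTests(TransactionTestCase):
    def test_factory_creates_rows(self):
        feature_factory(self.conn, 100)
        self.assertEqual(self.count_features(), 100)

    def test_table_starts_empty(self):
        # Runs after the test above (alphabetical order); the rollback removed its rows.
        self.assertEqual(self.count_features(), 0)


def run_suite():
    """Run the example tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FeatureTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The design point is that tests can write freely to a shared database because nothing is ever committed, so no cleanup code is needed in the individual tests.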

    • Lots of documentation on the GitHub repo

      • Please submit any issues, suggestions, etc!

    • Meg: This would be a great training topic

      • Ethy: a help desk call with a slower walk through of setting this up

      • Next help desk call on April 20th

        • Divide up into groups and work with you one-on-one

      • Definitely have a break-out for testing if there’s interest

    • Allows us to all use the same testing! You can follow core’s lead easily using the Tripal Test Suite

    • Is the Docker setup used for this available?

    • Please check out our developer Docker manager tool if you are interested in a tool to manage your Docker setup (skip the learning curve):