Lacey-Anne Sanderson (both days)
Katheryn Buble (both days)
Valentin Guignon (both days)
Nic Herndon (both days)
Emily Grau (both days)
Eric Wafula (both days, beginner)
Prakash Timilsena (both days, beginner)
Meg Staton (meeting only)
Abdullah Almsaeed (both days)
Ming Chen (both days)
Steven Cannon (day 1 and first half of day 2)
Eliot Cline (both days)
Sudhansu Dash (day 1 and first half of day 2; beginner)
Chris Childers (both days)
Ethy Cannon (1 ½ days)
Qiaoshan Lin (both days)
Sofia Robb (both days)
Taein Lee (both days)
Victor Unda (both days)
Friday, January 13
8am - 12pm Round-table Discussion
12pm - 1pm LUNCH
1pm - 5pm Free Collaboration, continued Round-table Discussion
- Tripal 3 update
- Full stable release: May 2017
- Main core functionality is just about finished
- Uses the semantic web and controlled vocabularies to exchange data between Tripal sites as well as to expose it to the outside world
- Uses controlled vocabularies for everything: column names are given cvterms as well as relationships, etc.
- Content created by type (e.g. gene) rather than by chado table (e.g. feature)
- All chado content is now fields, which allows you to use the web interface to reorder and change display rather than needing a PHP template
- What you see on the page is what is available on your web services
- Thus your web services are tailored to your site
- Beautiful upgrade process that allows you to upgrade immediately while keeping your nodes. Then you can upgrade on a per content type basis
- You can also use your old templates on entities
- Tripal v2.3
- Victor has started a unit testing module, which will allow much quicker releases
- Data type evolution: dealing with NGS type of data with Tripal and what about Chado? (Valentin)
- Includes: Genotype analysis and viewing tools: use cases and survey of tools now under development (Ethy)
- How to pick sets of germplasm efficiently
- Core data storage and extraction problem
- How to display the data to the user: zoomed in, haplotype viewer, etc.
- Speed, avoiding server overload (can some analysis be done on client side?)
- Valentin’s team stores its NGS data in MongoDB
- Uses tomcat on top of MongoDB
- Genotypic data associated with markers
- Lacey uses PostgreSQL (tested 5 billion: 5 million SNPs x 1000 germplasm)
- Ethy: MaizeGDB
- Mini, incomplete slides on other tools here: https://docs.google.com/presentation/d/1TVt74oQi3EON6DtQlcjZDirJpw5R53_J...
- HDF5 in the backend (TASSEL)
- MaizeGDB SNPversity is still having troubles with some of the larger jobs completing
- Flapjack is great but is a standalone java application
- Support for VCF is good (easier for researchers)
- VCF export? So far, not wanted, prefer matrix formats now, but need is likely
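For a sense of scale, the 5 billion figure mentioned above is simply markers multiplied by germplasm. The sketch below works through that arithmetic. The function names and the one-byte-per-call encoding are assumptions made for illustration only; they were not discussed at the meeting and do not describe how any of the sites actually store genotypes.

```python
def genotype_calls(n_snps: int, n_germplasm: int) -> int:
    """Number of genotype calls in a full SNP x germplasm matrix."""
    return n_snps * n_germplasm


def raw_size_gib(n_calls: int, bytes_per_call: int = 1) -> float:
    """Uncompressed size in GiB, assuming a fixed number of bytes per call.

    bytes_per_call=1 is a hypothetical encoding chosen for illustration.
    """
    return n_calls * bytes_per_call / 2**30


calls = genotype_calls(5_000_000, 1_000)
print(calls)                # 5000000000
print(raw_size_gib(calls))  # a little under 5 GiB at one byte per call
```

Even under this optimistic encoding the raw matrix is several GiB, which is part of why storage format and client-side analysis came up in the discussion.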
- Tripal Job launcher (Ethy)
- Can’t limit the number of parallel jobs
- This one shouldn’t be too difficult
- Drupal module cron queue
- Would be nice if job launcher could send jobs to a separate system
- Would be nice for the job launcher to respect priority
- Need a lot more fine-grained control over which jobs can be run in parallel and which cannot
- Cannot cancel jobs through the job interface
- Job log is not very helpful: needs a date/time for when messages are logged
- About 5 are using the drush daemon
- Tripal Elasticsearch launches multiple queues with
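The wish list above (a cap on parallel jobs, respect for priority, cancelling jobs) can be sketched in a few lines. This is a minimal illustration in Python, not Tripal code: Tripal's job system is written in PHP, and the JobQueue class, its method names, and the priority convention (lower number runs first) are all hypothetical.

```python
import heapq
import itertools
from concurrent.futures import ThreadPoolExecutor


class JobQueue:
    """Hypothetical job queue: priority ordering, a parallel limit, cancellation."""

    def __init__(self, max_parallel: int):
        self.max_parallel = max_parallel
        self._heap = []                   # entries are (priority, job_id, func)
        self._ids = itertools.count()     # unique ids also break priority ties
        self._cancelled = set()

    def submit(self, func, priority: int = 10) -> int:
        """Queue a job; a lower priority number is started earlier."""
        job_id = next(self._ids)
        heapq.heappush(self._heap, (priority, job_id, func))
        return job_id

    def cancel(self, job_id: int) -> None:
        """Cancel a job that has not been started yet."""
        self._cancelled.add(job_id)

    def run_all(self) -> dict:
        """Run queued jobs in priority order, at most max_parallel at a time."""
        futures = {}
        with ThreadPoolExecutor(max_workers=self.max_parallel) as pool:
            while self._heap:
                _priority, job_id, func = heapq.heappop(self._heap)
                if job_id in self._cancelled:
                    continue              # skip jobs cancelled while waiting
                futures[job_id] = pool.submit(func)
        # Leaving the with-block waits for every started job to finish.
        return {job_id: fut.result() for job_id, fut in futures.items()}


# Usage: with a limit of one parallel job, jobs run strictly by priority.
order = []
queue = JobQueue(max_parallel=1)
queue.submit(lambda: order.append("low"), priority=20)
queue.submit(lambda: order.append("high"), priority=1)
skipped = queue.submit(lambda: order.append("never"), priority=5)
queue.cancel(skipped)
queue.run_all()
print(order)  # ['high', 'low']
```

Cancelling here only affects jobs that are still waiting; stopping a job that is already running, or sending jobs to a separate system, would need more machinery than this sketch shows.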
- Import/export module: extend Tripal Download API? (Valentin)
- Sofia uses perl to write SQL
- Ethy uses perl
- MainGroup lab has a PHP chado loader https://www.rosaceae.org/mcl
- RNAseq has its own internal loader in PHP
- Data checking as you go, perl interactive loaders
- Also need the ability to share these loaders with other groups
- Perhaps it would be useful for the API to be a PHP library/class-based so it would not need drupal
- Mention this on the mailing list
- Some people have a Tripal site solely for loading data, not for content display
- Tripal cv loader fails for GO.
- Probably due to a chado stored procedure
- Disable stored procedure then run chado’s perl set-cvtermpath script by hand? ISU will test.
- Tree Visualization of multiple species relationships using the organism and phylo tables (Chris)
- There is a newick file loader in Tripal 2.1 that populates the phylo* tables
- This one will pull lineage from NCBI taxonomy; documentation embedded in bulk loader documentation. Taxonomy/Organism linker (http://tripal.info/node/109)
- Can be used to store taxonomy tree
- Has a visualization of a tree
- LegumeInfo phylotree module here.
- This one more based on gene families
- How indices of phylonodes are computed: http://archive.oreilly.com/pub/a/network/2002/11/27/bioconf.html
- And have a look at the picture there: http://archive.oreilly.com/pub/a/network/2002/11/27/bioconf.html?page=2
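As a rough illustration of how left/right indices for tree nodes can be computed, here is a sketch of the nested-set numbering that, as far as we understand it, the linked article describes: each node gets a left index on the way down a depth-first walk and a right index on the way back up. The tuple-based tree layout and the function names are assumptions made for this example; this is not the Tripal loader's code.

```python
def nested_set_indices(tree):
    """Assign (left, right) indices by depth-first traversal.

    `tree` is a (name, [children]) tuple; this layout is a hypothetical
    stand-in for a parsed Newick tree.
    """
    indices = {}

    def visit(node, next_idx):
        name, children = node
        left = next_idx                   # index taken on the way down
        next_idx += 1
        for child in children:
            next_idx = visit(child, next_idx)
        indices[name] = (left, next_idx)  # right index taken on the way up
        return next_idx + 1

    visit(tree, 1)
    return indices


def is_descendant(indices, node, ancestor):
    """A node lies below an ancestor if its interval nests inside the ancestor's."""
    left, right = indices[node]
    anc_left, anc_right = indices[ancestor]
    return anc_left < left and right < anc_right


tree = ("root", [("A", []), ("BC", [("B", []), ("C", [])])])
idx = nested_set_indices(tree)
print(idx["root"], idx["A"], idx["BC"])  # (1, 10) (2, 3) (4, 9)
print(is_descendant(idx, "C", "BC"))     # True
```

The appeal of this numbering is that "all nodes under X" becomes a simple range comparison on the two index columns, with no recursive query needed.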
- Tripal 3 and entity permissions based on drupal users and roles (Sofia)
- Will be very similar but will be on an entity basis
- Would like the permissions moved as well **
- Will it handle permissions on a per-node basis?
- Need documentation on how to do this in Tripal 3
- Some people are using Organic groups
- Galaxy module and Data Exchange
- Trying to integrate Tripal with Galaxy so that users can see a Tripal interface but run galaxy workflows
- Uses webforms to create the interface for workflows
- Module queries galaxy and creates the webform for it
- Then admin can go in and tweak various things like defaults and help text
- Parameters can be re-arranged and grouped, etc.
- BLAST Module https://github.com/tripal/tripal_blast
- How many people are using it?
- Is it meeting your needs?
- BLAST module new feature: filter database list by organism. (Sofia)
- Overview of BLAST at PeanutBase and LegumeInfo here.
- Slides also show CViTjs
- Shows features of a gff3 file in the context of the whole genome
- blast https://docs.google.com/presentation/d/1iKnHgVyeGWe2pE2OFrjTD_FHb7yTIGslWGiPqQc5qX4/edit?usp=sharing
- BLAST at KnowPulse (current core module)
- Consumption of Web services planned, especially of CoGe, which has a nice REST API and a way of grouping target databases that are of interest to your users.
- Question from LegFed meeting: if target is a set of genomes, could multiple genomes be displayed on multiple instances of CViTjs?
- Meta data submission system (Chris)
- Drupal forms to control submission of metadata
- Data is not in chado
- Chado Multi-chado
- Needs reviews on Drupal.org --Please help by reviewing this module!
- Supports multiple chado databases attached at the same time
- Each user/session can only access one chado at a time
- Elastic search
- Site-wide search -much faster than Drupal views
- Dorrie's group wants to be able to use it with multiple sites on the same server
- Now available
- Expression module
- Creates biomaterials (reflects NCBI's concept)
- Separately you can load in your gene expression data: beautiful heatmap visualization
- Has an independent page where you enter a number of genes
- Builds a two-dimensional heatmap
- basket/cart functionality would be helpful
- Does anyone want biomaterials separate from expression?
- Search by normalized values above or below a threshold
- Might want to store p-values
- Sofia volunteered to be a tester for search functionality
- Hackathon produced an implementation of the Tripal Download API to download the expression values for a given node
- New functionality for blast analysis and interpro and upgraded the go module
- Basket/Cart functionality
- What are examples of existing carts on biological (or not) sites that people like?
- Valentin - how is flag module working out? Should we continue to leverage that?
- Flag module - working well with Musa to hold stock and 3 other entities, can apply actions on cart
- Long term goal: workspace with multiple user-provided and db-source datasets
- Think about a cart focused on entities rather than nodes so that it works with Tripal 3
- Do we want to ensure a cart can only contain one subtype (e.g. only features (Tripal2) or only genes (Tripal3))?
- What do you love/hate about Tripal?
- Difficult to become proficient
- Drupal can be challenging to learn.
- Very Large (improved in Drupal8/Tripal 3?)
- Reusability of code and functionality
- Need a sustainability plan
- NSF research coordination network grant
- Stephen has ideas from DIBBS meeting about long term funding goals for NSF
- Chado (several versions) instantiation/update made easy
- Online documentation
- perhaps develop documentation standards for Tripal modules.
- currently have tripal.info - could pull in extension docs as well
- Is there a way to leverage read the docs functionality to combine all the documentation together?
- Mailing lists
- Idea: remove all mailing lists except announce. All bugs/requests routed through issue queue on github
- Seems ok with everyone
- Can become part of organization and auto-watch all repos without adding their own repos
- Interproscan html output: Can we add functionality to incorporate this into the interpro scan feature node/entity tab? (Sofia)
- This now has visualizations thanks to A. Bretaudau
- Feature Request: Interpro scan module -being able to add terms outside GO terms, etc. For example Reactome, KEGG. (Sofia)
- These are native options to IPS, need to add code to deal with these new options.
- KAAS/KEGG loader required output that is no longer available from KEGG. I have hacked a method for downloading the KEGG output with a shell script using curl, and I have modified the KEGG loader to work with this newly downloaded data. But I lose all the links (the newly downloaded data does not have any). Would be nice to get links back in. (Sofia)
- I have an undergraduate student working on this, hope to have something to release within a month or two (Meg)
- Possible Problems/Bugs:
- Issues with updating custom cv terms. Might be my fault. Do I/can I reload my obo files to update terms? I get errors when trying this. (Sofia)
- Can the feature names in Tripal/Chado contain more than one word? Some NCBI GenBank names have multi-word names/definitions and do not have a gene symbol. (Sofia)
- Yes they can. Many of mine do :-) Assuming you mean can they contain spaces?
- Thank you!
- Can someone talk about the JBrowse Tripal module? (Sofia)
- Example gene page with JBrowse iframe: https://i5k.nal.usda.gov/OFAS025035
- NOTES: https://docs.google.com/document/d/12EYDxl9gC7nHHJsXkaRAPrpRnFvX1nmyjEE8bzQHDtY/edit?usp=sharing