alignments with the mouse genome, integrated with the UCSC browser, and allowed building up set queries using the results of previous queries (the birth of the History system)
scratch in Python At this point we made several key design decisions that (in hindsight) determined whether we would succeed or fail (We got very lucky)
flat files in various common formats This meant existing tools could be integrated easily because they did not need to change the data formats they work with or interact with a database It also meant that when high-throughput sequence data suddenly came along (2005), we were prepared to deal with data at that scale easily
build an abstract configuration driven interface to command line tools We did this to make our lives easier, we had many analysis tools lying around that we didn’t want to rewrite for Galaxy But this was equally appealing to other developers who could now easily make their tools available to Biologists
to be setup on most systems in minutes We primarily did this to engage tool developers, making it as easy as possible to develop new tool wrappers for contribution We envisioned those tools would all be made available through the main Galaxy service But it also provided a scaling strategy, making it easy for sites to run their own Galaxy
everything we do under a liberal open-source license (no copyleft), and only support open-source tools on the main instance Our primary development repository is exposed to the public, initially hosted by us but later moved out third parties (bitbucket.org, and then github.com) The software is distributed only through version control, with a rapid release cycle (at least monthly)
Visualizations beyond trackster - Expansion beyond genomics - Massive tool suite contributions and updates - Interactive environments - Training & Tours - … uhhh … so much more
manage datasets and the 100k scale. At some point users no longer care about seeing the individual history, workflow, just specific results. New: many workflow view, for monitoring the execution of many workflows in parallel New: reports — generate summaries of executing workflows, multiple workflows, from user templates with continuous updates
we do now Dataset complexity, heterogeneity, dimensionality and all only increasing The analysis decision process requires more support for data exploration, both visual and interactive data manipulation
The future Galaxy embraces real time and continuous communication. From exploratory analysis to batch job tracking to automatic reports, Galaxy needs to be responsive and informative. The future Galaxy is increasingly interactive The future Galaxy better supports transitions between analysis modes.