Areas of development suggested by the IMPACT/myGrid team included:


For Taverna:

Thread A: Workflow design techniques

Potential tasks

I) Tool integration, continued

1. Create a workflow that includes OCR (you can use the WSDL for Tesseract or the workflow from myExperiment)
2. Add an evaluation of the OCR result against the ground truth
3. Add binarisation and evaluate the OCR results with and without it
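
One common way to score an OCR result against ground truth is the character error rate (CER): the edit distance between the two texts divided by the length of the ground truth. The sketch below is only an illustration of that metric, not the evaluation tool used by IMPACT; the function names are hypothetical, and a real workflow would more likely call an existing evaluation service.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute all cost 1)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # drop a character of a
                               current[j - 1] + 1,       # add a character of b
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def character_error_rate(ocr_text: str, ground_truth: str) -> float:
    """Edit distance normalised by ground-truth length (hypothetical helper)."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return levenshtein(ocr_text, ground_truth) / len(ground_truth)
```

Calling `character_error_rate` once on the OCR output produced with binarisation and once on the output produced without it gives two numbers to compare; the lower value indicates the better result.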

Thread B: The coding hackathon

Potential tasks

II) Enhance Taverna's XML splitters to take into account documentation in the WSDL
Web services created using the Toolwrapper require the use of XML input/output splitters in Taverna.
However, the XML splitter does not expose any documentation or default values that are in the WSDL.
You can read more about XML splitters in Taverna here:
Enhance the XML splitter to expose the WSDL documentation within Taverna.
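
To show where this information lives, the sketch below reads the standard WSDL 1.1 `documentation` element for each operation of a port type. It is not Taverna code (the actual enhancement would be made in Taverna's Java sources); the sample WSDL, the service name and the function name are all made up for illustration, and a Toolwrapper-generated WSDL may place its documentation elsewhere, for example in schema annotations.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical, heavily trimmed WSDL used only for this example.
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="HypotheticalOcrService">
  <portType name="OcrPortType">
    <operation name="recognise">
      <documentation>Runs OCR on the image at the given URL.</documentation>
    </operation>
    <operation name="ping"/>
  </portType>
</definitions>
"""


def operation_docs(wsdl_text: str) -> dict:
    """Map each portType operation name to its documentation text ('' if absent)."""
    root = ET.fromstring(wsdl_text)
    docs = {}
    path = f"{{{WSDL_NS}}}portType/{{{WSDL_NS}}}operation"
    for op in root.findall(path):
        doc = op.find(f"{{{WSDL_NS}}}documentation")
        text = (doc.text or "").strip() if doc is not None else ""
        docs[op.get("name")] = text
    return docs
```

A splitter that surfaced this text next to each port would give workflow designers the same hints that are already present in the service description.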

III) Parallelization
Taverna already has support for multiple parallel service invocations:
However, you will also require some scheduling mechanism over available endpoints.
Here you can find a Java library that has load balancing support for web services:
Write a simple web service that distributes calls to several endpoints
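
The simplest scheduling mechanism over a set of endpoints is round-robin: each incoming call goes to the next endpoint in a fixed rotation. The sketch below shows only that dispatch logic, not a full web service and not the Java library mentioned above; the class name, the endpoint URLs and the `fake_call` transport are hypothetical stand-ins for a real SOAP or HTTP client.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class RoundRobinDispatcher:
    """Hands each call to the next endpoint in rotation (thread-safe)."""

    def __init__(self, endpoints, call_endpoint):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)
        self._lock = threading.Lock()
        # call_endpoint(endpoint, payload) performs the actual invocation.
        self._call_endpoint = call_endpoint

    def dispatch(self, payload):
        # Only the endpoint selection is serialised; the calls themselves
        # run concurrently when dispatch() is used from several threads.
        with self._lock:
            endpoint = next(self._cycle)
        return self._call_endpoint(endpoint, payload)


# Hypothetical endpoints and a fake transport so the sketch runs offline.
ENDPOINTS = ["http://node1.example.org/ocr", "http://node2.example.org/ocr"]


def fake_call(endpoint, payload):
    return f"{endpoint} processed {payload}"


def run_batch(payloads, workers=4):
    """Dispatch a batch of payloads in parallel and return results in order."""
    dispatcher = RoundRobinDispatcher(ENDPOINTS, fake_call)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(dispatcher.dispatch, payloads))
```

Round-robin ignores how busy each endpoint is; a production scheduler would typically also track failures or outstanding requests per endpoint.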

Areas of development of interest to attending developers included:
  • Learning to use the Toolwrapper
  • Gaining confidence in applying Taverna in digitisation-related contexts
  • Insight into the workflow development process
  • Better understanding of IMPACT & Taverna outcomes
  • Crispy chunks of system architecture and implementation
  • Mass-processing with Taverna
  • Workflow language
  • Running Taverna workflows in headless/server mode
  • Plugin development
  • ...