However, this view presupposes that the process being implemented with a WFMS has been modeled in advance, and that, at enactment time, the WFMS only follows what the process model dictates. This is where the rosier view of WFMS breaks down. In real life, both in office and in scientific lab environments, the enactment of a workcase may deviate significantly from what was planned or modeled. In extreme cases, the execution of a workcase may be totally ad hoc. This has given rise to a growing research area concerned with enabling a WFMS to help its users deal with such deviations from the model. Doing so requires a better understanding of what is important in office work and in scientific work, so that the WFMS can provide the right functionality.
In a recent paper, the WASA architecture has been proposed in the context of scientific workflows [MedVW95] (WASA stands for "Workflow-based Architecture to support Scientific Applications"). It discusses properties of scientific work and functionalities that an environment to support scientific work must have in order to be useful. The WASA architecture is currently being used in the context of a scientific workflow in molecular biology, namely in DNA Fragment Assembly. This workflow, the resulting FAT-WASA architecture, and a prototypical implementation of the FAT-WASA system using a business workflow tool are discussed in [MeidVW95].
While [MedVW95] discusses properties of scientific work, and architectures and functionalities for systems that support it, this paper concentrates on scientific workflow management and its relations to business workflows and ad hoc workflows. It is organized as follows. Section 2 discusses basic properties of workflows in an office environment. Section 3 reviews the basic properties of scientific workflows and relates them to properties of business workflows. Another kind of workflow discussed in the literature is the ad hoc workflow; its relationship to scientific workflows is discussed in Section 4. Concluding remarks complete the paper.
These examples show that in office work situations what is important is not to follow the rules but "to get the job done" or, even better, to achieve what doing the job would achieve, possibly in a more efficient way than planned. In the business workflow literature this is called "exception handling" or "situated planning".
The degree of flexibility that scientists have in their work is usually much higher than in the business domain, where business processes are usually predefined and executed in a routine fashion. Assume a scientist decides to filter a data set coming from a measuring device; even if such filtering was not planned for, it is perfectly acceptable, provided the resulting data is tagged as being the output of the filtering activity.
This example shows what we believe is the most important characteristic of a scientific workflow: it serves as a way of identifying data sets. The details and parameters of a workflow should be added to the data set in order to identify the data. Thus, by accessing this identification tag on a data set, the scientist would know how the data was generated (devices, algorithms, time, place) and which data manipulation activities were performed.
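The tagging idea can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names `ProvenanceStep`, `TaggedDataSet`, `apply_activity`, and `clip_above`, as well as the device name, are hypothetical and do not come from WASA or any existing WFMS.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class ProvenanceStep:
    """One activity that contributed to a data set (hypothetical record layout)."""
    activity: str      # name of the activity, e.g. "acquire" or "clip_above"
    parameters: dict   # parameters the activity was run with


@dataclass
class TaggedDataSet:
    """Data values plus the identification tag describing how they were produced."""
    values: Tuple[float, ...]
    tag: Tuple[ProvenanceStep, ...] = ()


def apply_activity(data: TaggedDataSet, name: str, func: Callable, **params) -> TaggedDataSet:
    """Run an activity and return a NEW data set whose tag records the activity."""
    new_values = tuple(func(data.values, **params))
    return TaggedDataSet(new_values, data.tag + (ProvenanceStep(name, dict(params)),))


def clip_above(values, limit):
    """Unplanned filtering step: drop readings above `limit`."""
    return [v for v in values if v <= limit]


# Raw data as delivered by a (hypothetical) measuring device.
raw = TaggedDataSet(
    (1.0, 5.0, 2.0, 9.0),
    (ProvenanceStep("acquire", {"device": "hypothetical-sensor"}),),
)

# The scientist decides on the spot to filter; the tag keeps the decision visible.
filtered = apply_activity(raw, "clip_above", clip_above, limit=5.0)

for step in filtered.tag:
    print(step.activity, step.parameters)
```

The raw data set is left untouched, so both the original and the filtered data remain identifiable by their tags.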
While flexibility is a major property of scientific work, there are numerous standard procedures that can be assembled to perform complex scientific experiments. An important difference from business workflows is that a scientific workflow is often not completely defined before it starts. The scientist performs some tasks and decides on the further steps only after evaluating the results of the previous ones. These sequences of steps that make up part of a scientific experiment are known as partial workflows. Partial workflows may be re-used in later experiments. Managing partial workflows is therefore an essential goal in scientific workflow management.
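A minimal sketch of partial workflows, assuming they can be represented as named sequences of functions over a list of numbers. The classes `PartialWorkflow` and `Experiment` and the three step functions are hypothetical and serve only to illustrate incremental definition and re-use.

```python
from typing import Callable, List, Sequence

Step = Callable[[List[float]], List[float]]


class PartialWorkflow:
    """A named, reusable sequence of steps (hypothetical class, for illustration)."""

    def __init__(self, name: str, steps: Sequence[Step]) -> None:
        self.name = name
        self.steps = list(steps)

    def run(self, data: List[float]) -> List[float]:
        for step in self.steps:
            data = step(data)
        return data


class Experiment:
    """A workflow that grows while it runs: partial workflows are appended one by one."""

    def __init__(self, data: List[float]) -> None:
        self.data = list(data)
        self.history: List[str] = []  # names of the partial workflows applied so far

    def extend(self, partial: PartialWorkflow) -> None:
        self.data = partial.run(self.data)
        self.history.append(partial.name)


def drop_negatives(data: List[float]) -> List[float]:
    return [x for x in data if x >= 0]


def scale_to_peak(data: List[float]) -> List[float]:
    peak = max(data)  # assumes a non-empty list with a positive maximum
    return [x / peak for x in data]


def smooth(data: List[float]) -> List[float]:
    # Average each pair of neighbouring values.
    return [(a + b) / 2 for a, b in zip(data, data[1:])]


cleaning = PartialWorkflow("cleaning", [drop_negatives, scale_to_peak])
smoothing = PartialWorkflow("smoothing", [smooth])

first = Experiment([4.0, -1.0, 2.0, 8.0])
first.extend(cleaning)
# The next step is chosen only after looking at the intermediate result.
if max(first.data) - min(first.data) > 0.5:
    first.extend(smoothing)

# A later experiment re-uses the stored "cleaning" partial workflow unchanged.
second = Experiment([3.0, 6.0])
second.extend(cleaning)

print(first.history, first.data)
print(second.history, second.data)
```

The experiment is never defined as a whole; its history records which partial workflows were actually applied, in order.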
The above illustrates that workflow systems can prove invaluable in supporting activity tracking, data tagging and documentation, even for experiments performed by a single scientist. This is particularly true for scientists working on computational models; they generate large amounts of data, each data set produced by changing different parameters in the computer models, and all of it must be properly identified.
There is one other aspect that distinguishes office from scientific workflows: an office workflow must be brought to a "satisfying" end. If a customer cancels his order, that purchase case must be further processed to be brought to an acceptable end state: the production may be rescheduled, the organization may sue the customer for expenses or for breaking the contract. In a scientific workflow, cases may be abandoned at any moment and at any step. If the scientist thinks that there was some contamination of the data, an experiment may be just stopped.
There are different forms of ad hoc flow discussed in the recent literature: ad hoc planning is the case where a particular actor in the workflow may alter the plan of activities of the workcase [BK95][BN95][SMM+94]. This re-planning can be restricted to a particular domain in the organization: the credit checking department proposes a different plan for this workcase because it is in some way a special case, but outside the credit checking department the case will proceed as planned. Or the re-planning can affect the whole plan for the workcase.
A different form of ad hoc flow can be described as "pass the buck." In this case, the flow of the workcase is not planned; at the end of each activity the actor decides to whom the workcase should be sent next. Ad hoc planning has been discussed in the literature as an important way of dealing with specificities of a workcase. The pass-the-buck mode is discussed by [WB96].
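The pass-the-buck mode can be sketched as a loop in which no route exists in advance and each actor returns the name of the next one. The function `run_pass_the_buck`, the actor names, and the `max_hops` safety limit are our own assumptions for illustration; they are not taken from [WB96].

```python
from typing import Callable, Dict, List, Optional

# An actor performs its activity on the workcase and names the next actor,
# or returns None when it considers the workcase finished.
Actor = Callable[[dict], Optional[str]]


def run_pass_the_buck(workcase: dict, first_actor: str,
                      actors: Dict[str, Actor], max_hops: int = 20) -> List[str]:
    """Route a workcase with no predefined plan; return the route actually taken."""
    route: List[str] = []
    current: Optional[str] = first_actor
    while current is not None:
        if len(route) >= max_hops:
            # Assumed safeguard against workcases that are passed around forever.
            raise RuntimeError("workcase exceeded max_hops")
        route.append(current)
        current = actors[current](workcase)
    return route


def clerk(workcase: dict) -> Optional[str]:
    # The clerk alone decides whether anyone else needs to see the case.
    return "manager" if workcase["amount"] > 1000 else None


def manager(workcase: dict) -> Optional[str]:
    workcase["approved"] = True
    return "accounting"


def accounting(workcase: dict) -> Optional[str]:
    workcase["booked"] = True
    return None


actors = {"clerk": clerk, "manager": manager, "accounting": accounting}

large_case = {"amount": 2500}
small_case = {"amount": 100}
large_route = run_pass_the_buck(large_case, "clerk", actors)
small_route = run_pass_the_buck(small_case, "clerk", actors)
print(large_route)
print(small_route)
```

The route is only known after the fact, which is why recording it (as the returned list does here) matters for later inspection of the workcase.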
In scientific work, both forms of ad hoc flow seem very important: a group may replan a certain sequence of activities because of characteristics of the data, or because the scientists want to try a new data analysis procedure. Or a solitary scientist may not even replan in advance, but, given the results of an activity, decide what to do next. Because the WFMS manages the data, the scientist may find it more attractive to describe to the WFMS what activity should be performed next instead of just doing it, since in the latter case she would have to manually attach to the resulting data the information on what activity and parameters were used to generate it.
Scientific workflows should also provide another functionality for ad hoc planning, which we call rewind. A scientist may decide after performing a sequence of data analysis activities (say high frequency filtering and outlier removal) that a different form of data analysis should have been performed (say principal component analysis). The scientist should be able to rewind the workcase to a step previous to this data analysis sequence and from there perform the alternative data analysis procedure.
The rewind concept should not be confused with the redo concept, which is common in office and software engineering workflows. In a redo, the flow of the workcase is redirected so that a particular activity is executed again. The difference is that in a redo all data additions performed by the subsequent activities are available to the redone activity. For example, in software production workflows it is common to have loops of code/test where the code activity is redone if the test activity detects errors. The code activity has access to all the test results, and in fact depends on them. If it were rewound, the code activity would start again, from the specifications, with no data from later activities available. The rewind functionality is of course based on versioning the data produced by the activities. One would like to be able to restore the full context after (or before) some activity was performed and proceed with another course of action.
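The contrast between redo and rewind can be sketched with a workcase that keeps one snapshot of its data per activity. The `Workcase` class and its method names are hypothetical, and the versioning here is the simplest one we could think of (a list of snapshots); an actual system would need a proper versioning mechanism.

```python
from typing import Callable, List, Tuple


class Workcase:
    """Workcase data with one snapshot per performed activity (hypothetical sketch)."""

    def __init__(self, initial: dict) -> None:
        # Each entry is (activity name, data as it was AFTER that activity).
        self._versions: List[Tuple[str, dict]] = [("start", dict(initial))]
        self.abandoned: List[Tuple[str, dict]] = []  # snapshots discarded by rewind

    @property
    def data(self) -> dict:
        return self._versions[-1][1]

    def perform(self, name: str, activity: Callable[[dict], dict]) -> None:
        new_data = dict(self.data)           # shallow copy; fine for immutable values
        new_data.update(activity(self.data))
        self._versions.append((name, new_data))

    def redo(self, name: str, activity: Callable[[dict], dict]) -> None:
        """Execute an activity again; it sees everything later activities added."""
        self.perform(name, activity)

    def rewind(self, to_activity: str) -> None:
        """Restore the context as it was right after the latest `to_activity`."""
        for i in range(len(self._versions) - 1, -1, -1):
            if self._versions[i][0] == to_activity:
                self.abandoned.extend(self._versions[i + 1:])
                del self._versions[i + 1:]
                return
        raise KeyError(to_activity)


def write_code(data: dict) -> dict:
    # The code activity uses the test report only if one is available.
    report = data.get("test_report")
    return {"code": "v1" if report is None else "v2, fixes: " + report}


def run_tests(data: dict) -> dict:
    return {"test_report": "fails on empty list"}


# Redo: the code/test loop. The redone code activity depends on the test results.
redo_case = Workcase({"spec": "sort a list"})
redo_case.perform("code", write_code)
redo_case.perform("test", run_tests)
redo_case.redo("code", write_code)

# Rewind: back to the specifications; data from later activities is not visible.
rewind_case = Workcase({"spec": "sort a list"})
rewind_case.perform("code", write_code)
rewind_case.perform("test", run_tests)
rewind_case.rewind("start")
rewind_case.perform("code", write_code)

print(redo_case.data["code"])
print(rewind_case.data)
```

Keeping the discarded snapshots in `abandoned` is a design choice of this sketch: the abandoned course of action stays available for documentation even though the workcase continues from the earlier context.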
[BN95] R. Blumenthal and G.J. Nutt, Supporting Unstructured Workflow Activities in the Bramble ICN System, in Proceedings of the 1995 ACM Conference on Organizational Computing Systems (COOCS'95), N. Comstock and C.A. Ellis (eds.), pp 130-137, Milpitas, California, 1995.
[BW95] P. Barthelmess and J. Wainer, Workflow Systems: a few definitions and a few suggestions, in Proceedings of the 1995 ACM Conference on Organizational Computing Systems (COOCS'95), N. Comstock and C.A. Ellis (eds.), pp 138-147, Milpitas, California, 1995.
[MedVW95] C. B. Medeiros, G. Vossen, M. Weske: WASA: A Workflow-Based Architecture to Support Scientific Database Applications (Extended Abstract). Proceedings of the 6th DEXA Conference (eds.: N. Revell, A. M. Tjoa), Springer LNCS 978, pp 574-583, London 1995.
[MeidVW95] J. Meidanis, G. Vossen, M. Weske: Using Workflow Management in DNA Sequencing. Fachbericht Angewandte Mathematik und Informatik 23/95-I, Universität Münster, 1995.
[SMM+94] K.D. Swenson, R.J. Maxwell, T. Matsumoto, B. Saghari, and K. Irwin, A Business Process Environment Supporting Collaborative Planning, in CSCW'94, ACM, 1994.
[WB96] J. Wainer and P. Barthelmess, Workcase-centric Workflow Model, submitted to NSF Workshop on Workflow and Process Automation, 1996.