Original text by Lars Pind and Khy Huang.
This document describes the software design decisions behind the ]project-open[ workflow.
Related documents include:
It's not going to be possible to build all of these applications just by doing some UI tweaking on top of the workflow package alone. Rather, these packages will be made much skinnier, and their development much faster and less risky, by outsourcing some of the more intricate parts to the workflow package.
There's one part of the workflow package for each of the goals above. The engine is implemented as a data model and a PL/SQL API, with a thin wrapper for the API in Tcl. The user-interfaces are implemented as AOLserver Tcl pages.
The scope of the workflow package is currently under discussion. We currently fulfill all the requirements of the Content Management System, which only uses the engine data model and API. The user-interface part is not yet done, and we need to polish the requirements and design for that.
Historically, every module has implemented its own mini-version of workflow enactment code. Ticket-tracker is a prime example, but other examples include order fulfillment and customer service in the ecommerce module, the recruiting pipeline in the intranet module, and user approval in the OpenACS core.
This has the benefit of making it possible to completely tailor the workflow to the needs of the application. But there are many disadvantages: the user has to visit several different URLs to figure out what's on his plate; the user experience will be different for each module, since each module developer will have done things slightly differently; you can't alter the process without hacking the module code directly; and the module developer needs to write custom code to provide administrative features such as checking on the general performance of a workflow process. More generally, with a shared workflow package we can spend the time saved by speeding up development of workflow-centered applications on providing even better end-user and administrative tools that will benefit users of any workflow-based application.
We've also looked at, and strive to adhere to, the standards set forth by the Workflow Management Coalition.
There's also a host of commercial workflow software implementations. We haven't looked at them, but we'd very much like to. I'm not sure how easy it'd be, given that they're commercial, but at least we should have a chance to look at Oracle Workflow.
For the underlying conceptual model we considered one based on finite state machines (FSM) and one based on Petri Nets. We decided against finite state machines because FSMs don't support the routing constructs that we needed (parallel, implicit and explicit conditional routing). We also considered some extension of FSMs, but couldn't find any well-defined ones, and we didn't want to invent some model where the semantics might not have been well-considered enough. A workflow is really a distributed protocol, and as any computer scientist knows, these can be really hard to formalize in a way that ensures there are no deadlocks, dead states, etc. Petri Nets provide a formal mathematical model with an abundance of proven analysis techniques that can be used to verify both static and dynamic properties of the workflow. Besides, W.M.P. van der Aalst made such a good argument in The Application of Petri Nets to Workflow Management that we felt confident it was the right decision. Part of this argument is also that there are standard extensions to Petri Nets that are very useful for workflows, such as color, time and inheritance.
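To make the parallel routing argument concrete, here is a minimal Python sketch of a place/transition net with an and-split and an and-join. It is not the package's implementation (which lives in the data model and PL/SQL); the class, the place names and the review process are all assumptions made up for illustration, and every arc is assumed to carry exactly one token.

```python
from collections import Counter


class PetriNet:
    """A minimal place/transition net: a marking plus input/output arcs."""

    def __init__(self, initial_marking):
        # marking maps place name -> number of tokens currently in that place
        self.marking = Counter(initial_marking)
        self.inputs = {}   # transition name -> list of input places
        self.outputs = {}  # transition name -> list of output places

    def add_transition(self, name, inputs, outputs):
        self.inputs[name] = list(inputs)
        self.outputs[name] = list(outputs)

    def enabled(self, name):
        # A transition is enabled when every input place holds a token.
        return all(self.marking[place] > 0 for place in self.inputs[name])

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        for place in self.inputs[name]:
            self.marking[place] -= 1
        for place in self.outputs[name]:
            self.marking[place] += 1

    def enabled_transitions(self):
        return sorted(t for t in self.inputs if self.enabled(t))


def build_review_net():
    """Hypothetical process: two reviews run in parallel, then publish."""
    net = PetriNet({"start": 1})
    # and-split: one token in, two tokens out
    net.add_transition("submit", ["start"], ["to_legal", "to_finance"])
    net.add_transition("legal_review", ["to_legal"], ["legal_done"])
    net.add_transition("finance_review", ["to_finance"], ["finance_done"])
    # and-join: needs a token in both input places
    net.add_transition("publish", ["legal_done", "finance_done"], ["end"])
    return net
```

After "submit" fires, both reviews are enabled at the same time, and "publish" stays disabled until both have finished. A plain FSM has a single current state and cannot express that without enumerating every interleaving.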
One downside of the Petri Net approach is that the model is more difficult to grasp than the finite state machine. We've tried to accommodate that problem by differentiating between a "simple" and a "non-simple" (or complex) workflow process. "Simple" basically means we can provide a very intuitive UI for defining workflows, one that is easy to use and understand and doesn't require the user to invest any time in learning how the workflow package operates. The idea is that if you only want to do simple, straightforward things, we can abstract away from the complexity. However, if you want to do more complex things, then there is no way to avoid knowing what you're doing. Not because our model is hard, but because the problem of formalizing a workflow so that it accurately models what you want, without introducing problems such as deadlocks or dead states, is inherently hard.
Since the engine will need to be tightly integrated with and customized to suit a specific package, we need some callback mechanism that'll allow custom code to get executed by the engine. The code had to be executed in the database, since we wanted to allow for non-AOLserver/Tcl clients to use it. We considered three general approaches to that: (1) take an arbitrary chunk of PL/SQL code as a varchar and pass it to execute immediate, (2) take the name of a PL/SQL procedure/function with a fixed signature and call that with arguments depending on the situation, or (3) take the name of a PL/SQL procedure/function, and have a dependent table hold the arguments it takes and the values to supply. We chose (2) because it is simpler than (3) and provides more structure and static checkability than (1).
We have the concept of workflow attributes, which direct the execution of the workflow. For example, if one of the tasks is to approve or reject something, we'll typically have an approved_p attribute that takes a boolean value, and can be branched on in an or-split. We considered (1) having a skinny table as part of the workflow package to hold the values, (2) using the OpenACS kernel metadata/attribute system, or (3) having a clob column in the case table that holds an XML document with the values. We originally did (1), then migrated to (2) when the kernel was somewhat ready for it, so we don't reproduce functionality. Plus, doing it with the kernel offers a way to abstract away from whether it's implemented using a skinny table or a fat table. The disadvantage is that there are severe limits as to what values can be represented, e.g. you can only represent atomic SQL datatypes. In particular, you can't (nicely) store compound datatypes, such as lists or arrays. Lists are especially helpful for storing assignments, i.e. a list of party_ids. The disadvantage of (3) is that you get no help from the database in type checking, and that queries and updates against the data are bound to be much slower than with a table.
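The branching on approved_p mentioned above can be pictured with a small Python sketch of an explicit or-split. In the real engine guards are PL/SQL callbacks on arcs; the function name, the list-of-pairs representation and the place names below are assumptions for illustration only.

```python
def choose_branch(attributes, guarded_arcs):
    """Return the destination place of the first arc whose guard passes.

    guarded_arcs is a list of (guard, place) pairs. A guard of None marks
    the default ("otherwise") arc and always matches.
    """
    for guard, place in guarded_arcs:
        if guard is None or guard(attributes):
            return place
    raise LookupError("no arc matched and no default arc was defined")


# Hypothetical or-split after an "approve" task: the token goes to exactly
# one of the two places, depending on the approved_p workflow attribute.
approval_arcs = [
    (lambda attrs: attrs["approved_p"], "approved"),
    (None, "rejected"),
]
```

Because the guard only reads a single atomic attribute, this kind of branching fits comfortably within the limits of option (2).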
At some point, we implemented manual, per-case assignments by setting and querying a workflow attribute. Conceptually, this is very nice, but with the current implementation of workflow attributes, it doesn't allow multiple assignments to the same task, so we had to scrap that. We have since introduced the notion of roles to provide more flexibility in the assignment of users to tasks. Roles are assigned to parties, and they are associated with a case and/or task. This adds the ability to group a set of tasks together for assignment (i.e. assign a role to a set of tasks).
We've decided to make the workflow engine have a PL/SQL-only API, with the Tcl API being simple, small wrappers around the PL/SQL API. This has the benefit of being easy to port to an ACS/Java version. The downside is that it's going to be harder to port to PostgreSQL. On the other hand, there is one Tcl-callback, but this is one that's used purely for UI purposes, so that seemed acceptable.
We've decided to keep all historical information about the execution of a workflow, i.e. enabled transitions, tokens, workflow attributes, and tie all that into the journal. This obviously consumes more space and takes more querying time when there's non-current information in the database, but it adds tremendously to our ability to troubleshoot, backtrack and analyze the performance of the workflow.
Some things, such as assignments, may be defined cascadingly at several levels, such as statically for all workflows, manually for each individual case, or programmatically, by executing a callback that determines who to assign based on roles. For example, a task such as "send out check" would always be assigned to the billing department. A task such as "author article" would be assigned manually by the editor based on his knowledge of the skills of the individual authors in the "author" role. In the ticket-tracker, the task of fixing a problem would typically be assigned to a party based on who owns the corresponding project or feature area, and would be implemented by a callback from the workflow package into the ticket-tracker package. The callback would include the "module owner" role as its parameter.
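As a rough illustration of the cascading lookup, the following Python sketch resolves the assignees for a role by consulting the three levels in turn. The precedence shown (manual per-case first, then the callback, then the static setting) is an assumption made for the sketch, as are the function name and the dictionary-based storage; the text above does not fix the order.

```python
def resolve_assignees(case_assignments, callback, static_assignments, case_id, role):
    """Return the party ids for a role, trying the most specific source first.

    case_assignments:   {(case_id, role): [party_id, ...]}  manual, per case
    callback:           function (case_id, role) -> [party_id, ...], or None
    static_assignments: {role: [party_id, ...]}  defined once per workflow
    """
    manual = case_assignments.get((case_id, role))
    if manual:
        return list(manual)
    if callback is not None:
        computed = callback(case_id, role)
        if computed:
            return list(computed)
    return list(static_assignments.get(role, []))
```

With this shape, "send out check" needs only a static entry for the billing role, "author article" gets a manual per-case entry from the editor, and the ticket-tracker supplies a callback that looks up the module owner.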
The API comes in two parts: one for manipulating workflows (the workflow PL/SQL package) and one for manipulating workflow cases (the workflow_case PL/SQL package). You could also say that the first concerns the static properties, while the other concerns the dynamic properties.
The workflow package currently implements only the most basic operations: create/drop a workflow, create/drop an attribute of a workflow, determine whether a workflow is simple or not, and delete all cases of a workflow (useful for dropping workflows).
To modify other static properties of workflows, such as the places, transitions and arcs, etc., you'll need to access the tables directly. There might not be much gained by providing a procedural API for this, since the basic design is considered very stable.
Much more thorough is the API for dealing with workflow cases. This provides the handles to interact with the execution of a workflow, which consists of: altering its top-level state, i.e. starting, suspending, resuming or canceling a whole workflow case at a time; interacting with tasks, i.e. starting, canceling, finishing or commenting on a task; and finally, informing the workflow package that some external action that it is waiting for has occurred.
Some notes on the internals are in order. The key word is encapsulation of logic.
All execution of callbacks, which involves the use of execute immediate, has been encapsulated in PL/SQL procedures/functions that do the work. They're usually pretty simple, but in some cases they incorporate some extra logic, such as what result to return if there's no callback defined.
For the cascading features, such as task assignments and task deadlines (static per-context setting, manual per-case setting, programmatic setting), the logic is encapsulated in their own PL/SQL functions.
The token manipulation procedures (produce, lock, release, consume tokens) encapsulate the actions on tokens and ensure the information we keep for the history of a token is consistent.
sweep_automatic_transitions is responsible for turning the wheels of a workflow execution. It is to be called whenever the state changes due to some action by the user, by time (timed transitions and user-task-hold timeouts), or by external systems (in the form of message transitions). It checks whether the case is finished, loops over all the enabled automatic transitions and fires them, and updates the wf_tasks table to make sure it accurately reflects the true state of the workflow. It uses two helper procedures for this, one of which is enable_transitions. That procedure is also responsible for sending out email notifications as new user tasks are enabled.
fire_transition_internal is responsible for firing any transition. It consumes the appropriate tokens, produces new tokens in the relevant output places, and fires any fire callback on the transition.
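The interplay between token handling, firing and sweeping described above can be sketched in Python as follows. This is only a model of the idea: the real logic is PL/SQL operating on tables, and the Token and Case classes, the "free"/"consumed" state names and the trigger labels are assumptions for the sketch. Locking of tokens, callbacks and notifications are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    place: str
    state: str = "free"  # becomes "consumed"; tokens are never deleted


@dataclass
class Case:
    # transition name -> (input places, output places, trigger type)
    transitions: dict
    final_place: str
    tokens: list = field(default_factory=list)
    journal: list = field(default_factory=list)

    def free_tokens(self, place):
        return [t for t in self.tokens if t.place == place and t.state == "free"]

    def enabled(self, name):
        inputs, _, _ = self.transitions[name]
        return all(self.free_tokens(place) for place in inputs)

    def fire(self, name):
        inputs, outputs, _ = self.transitions[name]
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        for place in inputs:
            # Mark the token consumed instead of removing it, so the
            # history of the case stays available for troubleshooting.
            self.free_tokens(place)[0].state = "consumed"
        for place in outputs:
            self.tokens.append(Token(place))
        self.journal.append(name)

    def finished(self):
        return bool(self.free_tokens(self.final_place))

    def sweep_automatic_transitions(self):
        """Fire enabled automatic transitions until none remain or the case ends."""
        progressed = True
        while progressed and not self.finished():
            progressed = False
            for name, (_, _, trigger) in self.transitions.items():
                if trigger == "automatic" and self.enabled(name):
                    self.fire(name)
                    progressed = True
                    break  # the state changed, so re-evaluate from the top
```

Calling the sweep after every state change means a user finishing one task can cascade through any number of automatic transitions before the next user task is enabled. Note that this sketch would loop forever on a cycle made up solely of automatic transitions.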
For workflow to be really functional, we rely on the OpenACS kernel at some point implementing a system for automatically generating forms based on metadata. This is used for setting workflow attributes as part of performing a task. Currently, we do it ourselves in a very ad-hoc fashion. Speaking of attributes, we store a wf_datatype about attributes, in addition to the information stored by the OpenACS kernel. This is historical. We needed it because the OpenACS kernel currently doesn't let us say that the datatype of some attribute is the primary key of some object of a specific type, e.g. that the value of an attribute should be a party. In this case, the OpenACS kernel datatype would be number, and the wf_datatype would be party. We needed that for the manual assignments when we did those with attributes, but now that we don't use attributes to store those anymore (because attributes can only hold one value each), we don't actually need it. We've left it there, though, in case we need it for something else. Once we've definitively established that we don't need it anymore, it'll go away.
wf_attribute_info table that holds the wf_datatype discussed above under design tradeoffs.
The context level is less obvious. We've decided to store all the callback information at the context level, assuming that they might need to change depending on the context. This is not necessarily the right decision, so we might have to move them to the workflow level later. The code review also raised the issue of whether these should be normalized into a generic wf_callbacks table, as the current wf_context_transition_info table has a large number of columns, many of which share a common pattern.
The operational level datamodel also follows the Petri Net pretty closely. There's a table, wf_tasks, holding dynamic information corresponding to transitions, and one, wf_tokens, holding information about the tokens, which corresponds to dynamic information on places. There is a wf_roles table that stores all roles for the workflow. Besides that, there are two tables, one of them wf_case_deadlines, holding case-level manual assignments and deadlines. The assignments table is necessary because attributes can't hold multiple values. The deadlines table isn't strictly necessary, but it is an option in line with setting the deadline in the form of an attribute, provided particularly for the Content Management System. There's also a wf_attribute_value_audit table, which holds all the historical values of workflow attributes, a service that is conceptually better provided by the OpenACS kernel, but since the kernel currently doesn't do that, we have it here.
Finally, there are a number of views: some intended to abstract away from whether something is defined at the workflow or context level, some to query the actual workflow state (used by workflow_case.enable_transitions), and one to abstract away from traversing the OpenACS kernel party hierarchy.
There's a specific process for defining workflows, i.e. the order in which inserts should happen. This is described in the "Defining Your Workflow" section of the Developer's Guide.
wf_tasks ACS Objects (wf_cases is already). The system-level permissions could belong on the workflow package instance. Then, to make the hierarchical permissioning scheme work, the context_id of a task will point to the case, the context_id of a case will point to the workflow, and the context_id of a workflow will point to the workflow package instance in which it was created. That way, you can also create multiple workflow package instances and have each of these see only its own set of processes.
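The context_id chain described above (task, case, workflow, package instance) lends itself to a simple upward walk when checking a permission. The Python sketch below illustrates that idea only; the function name, the dictionary of context ids and the set of grants are assumptions, and the actual check would be done by the kernel's permission system rather than by code like this.

```python
def has_permission(context_ids, grants, party_id, object_id, privilege):
    """Walk the context_id chain upward until a matching grant is found.

    context_ids: {object_id: context_id or None}
    grants:      set of (object_id, party_id, privilege) tuples
    """
    seen = set()
    current = object_id
    while current is not None and current not in seen:
        if (current, party_id, privilege) in grants:
            return True
        seen.add(current)  # guards against accidental cycles in the data
        current = context_ids[current]
    return False


# Hypothetical objects: a task inside a case inside a workflow inside a
# workflow package instance.
example_context_ids = {
    "package_1": None,
    "expenses_wf": "package_1",
    "case_42": "expenses_wf",
    "task_7": "case_42",
}
```

Granting a privilege on the workflow therefore covers every case and task beneath it, while a grant on one package instance says nothing about workflows created in another instance.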
If we also make transitions, places and arcs ACS Objects, we'll have a more consistent data model, and the callback signatures will be easier and more streamlined.
In order to store the attributes and take advantage of the kernel services (generic/specific storage handling, auditing, automatic form-generation, etc.), and since the attributes differ from process to process, we'll have to make the process an object type, in addition to being an object. Should we make every version of a process its own object type, or should we have only one object type per process? This is a trade-off. If we want to be able to alter/delete attributes, we'll have to make each version its own object type. If we're more concerned with quickly scraping up all the cases of a particular process, regardless of version, one object type per process is more relevant. But that can be solved with a foreign key. In any event, we'll have to be clever to come up with a good UI and code for summarizing all cases of a process, regardless of version. Finally, there's the ugliness of creating all those object types. I guess the conclusion is: we'll make each version an object type, since it's cleaner.
General class hierarchy: Workflow is the abstract ancestor object type of all workflows. Each process, e.g. "expenses_wf", is an object type, subclassing the workflow object type. Then each process version is an object type, e.g. "expenses_wf_123", subclassing the process object type.
In other words, we have this class hierarchy:
workflow => expenses_wf => expenses_wf_123
Why? We need the version to be an object type if we want to use the kernel's generic attribute storage, or even the automatic form-generation facilities to come. We need the process itself to be an object type if we want a quick way to get all the cases of a specific process, across all versions. Finally, the master generic object type is a good match for the top-level
Contexts are replaced with package instances. Currently, a context is nothing but a token that allows us to exchange some parts of the workflow definition, such as assignments, and make them different for different contexts. The idea is that the same process can be used with different static assignments in two different departments of the same company, or within different companies on the same website. The way this would work is that you install multiple instances of the target application (e.g. ticket-tracker), one per department or company (or, in general, per subsite).
In addition, we'll allow you to install multiple instances of the workflow package. Each instance will only be able to see the workflows created in that specific instance. It would then make sense to split up the workflow package into a worklist package and a workflow package: the basic idea of the worklist is that it always shows all the items on your personal worklist, regardless of which process they belong to. This would make even more sense because there's no reason that there has to be an associated workflow process just to add a task to your worklist. So the worklist package would be more generic. That split takes a bit more thought, though, as we'd have to develop the worklist as a standalone application and figure out the exact interaction between that and the workflow package.
We want to clean up the current callback mechanism in a few ways.
wf_transitions, although, for the following two reasons, that's not what we're actually going to do). We might later add the ability to add another callback which is specific to the context/package, but we'll postpone that for now.
wf_context_transition_info, what with all the nullable columns pertaining to callbacks.
Here are the different types of callbacks (i.e., different signatures), and the concrete callbacks of these types.
(case_id, transition_id) return date
(case_id, arc_id) return char
/* This is the callback repository */
create table wf_callbacks (
    callback_id     integer,
    pretty_name     varchar,
    description     varchar,
    callback_proc   varchar,
    type            enum(task_date, task_notify, task_assign, guard)
);

/* This is the list of custom arguments for these callbacks
   i.e., arguments not required by the signature. */
create table wf_callback_args (
    arg_id          integer,
    callback_id     integer,
    pretty_name     varchar,
    description     varchar,
    position        integer,
    type            enum(integer, number, string, attribute, transition)
);

/* The argument type is mainly a UI thing. If you say 'integer', 'number'
   or 'string', we ask the user for such a value, and validate that it is
   in fact a valid integer/number/string. If you say attribute, we present
   the user with a box where he can select one of the existing attributes
   of this workflow. If you say transition, we ask the user to select one
   of the existing transitions, etc. We can easily add more types, e.g.
   places, arcs, maybe an enumeration (a list of options), etc. This is
   all just so that we can create a nice UI for the callbacks. */

/* This is a generic invocation of a callback, which basically just means
   that we've filled in the custom arguments with static values */
create table wf_callback_invocation (
    invocation_id   integer,
    callback_id     integer,
    compiled_call   varchar
);

/* Compiled call is the string to be passed to execute immediate in Oracle.
   It'll contain bind variables for the arguments covered by the signature,
   and the static values for the custom arguments will be put directly into
   the string (don't forget to quote and escape properly). */

/* This table holds the static values to the custom arguments. We don't
   need to query this, once the 'compiled_call' column has been set. */
create table wf_callback_invocation_args (
    invocation_id   integer,
    arg_id          integer,
    value           varchar
);

/* This is how we apply the callback to a transition */
create table wf_transition_invocation_map (
    transition_id   integer,
    type            enum(enable, fire, assignment, time, deadline, hold_timeout),
    sort_order      integer,  /* the order, if there are more than one per transition */
    invocation_id   integer
);

/* Here's the guard */
create table wf_arcs (
    ...
    guard_id        integer   /* points to a row in wf_callback_invocation */
);
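To show what filling in compiled_call might look like, here is a small Python sketch that builds the string from a callback procedure name, the signature arguments (kept as bind variables) and the static custom values (quoted and inlined). The exact shape of the anonymous block, the function names and the example procedure ticket_tracker.module_owner are all assumptions for illustration; only the idea of mixing bind variables with escaped static values comes from the comments above.

```python
import re

# Conservative pattern for a possibly package-qualified PL/SQL identifier.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*(\.[A-Za-z][A-Za-z0-9_$#]*)*")


def quote_literal(value):
    """Render a static argument as a SQL literal, doubling embedded quotes."""
    if isinstance(value, bool):
        raise TypeError("pass booleans as 't' or 'f' strings")
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def compile_call(callback_proc, signature_args, custom_values):
    """Build the string that would be stored in compiled_call.

    signature_args stay as bind variables and are supplied at run time;
    custom_values are static, so they are inlined as escaped literals.
    """
    if not _IDENTIFIER.fullmatch(callback_proc):
        raise ValueError(f"not a valid procedure name: {callback_proc!r}")
    binds = [f":{name}" for name in signature_args]
    literals = [quote_literal(value) for value in custom_values]
    arguments = ", ".join(binds + literals)
    return f"begin :result := {callback_proc}({arguments}); end;"
```

Compiling once at definition time is what lets the engine skip wf_callback_invocation_args at run time, at the cost of having to recompile whenever a static value changes.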
Finally, the table wf_attribute_info should go away. We don't need it any more, since it was only used for assignment attributes, something that's gone now.
For more far-fetched future plans, refer to Future Plans.