TaskVine

An execution system for large-scale, data-intensive dynamic workflows.

Last updated: 25 Jun 2025     |     Release: devel

TaskVine is a system for building large-scale, data-intensive dynamic workflows that run on HPC clusters, GPU clusters, and commercial clouds. As tasks access external data sources and produce their own outputs, more and more data is pulled into local storage on workers. This data is used to accelerate future tasks and avoid recomputing existing results. Data gradually grows 'like a vine' through the cluster. TaskVine can also serve as an execution engine for other workflow systems such as Parsl and Dask.
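
The caching idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not TaskVine's API or its actual scheduler: the `Worker` class, the `run_tasks` function, and the file names are all hypothetical, and the placement rule (pick the worker that already holds the most inputs) is a deliberately simplified stand-in that ignores load balancing.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """A hypothetical worker with local storage for cached files."""
    name: str
    cache: set = field(default_factory=set)


def run_tasks(tasks, workers):
    """Place each task on the worker that already caches most of its inputs.

    `tasks` is a list of (task_name, input_files, output_file) tuples.
    Returns the number of input files that had to be fetched because
    they were not already in the chosen worker's cache.
    """
    transfers = 0
    for _name, inputs, output in tasks:
        needed = set(inputs)
        # Prefer the worker holding the largest share of the task's inputs.
        best = max(workers, key=lambda w: len(w.cache & needed))
        missing = needed - best.cache
        transfers += len(missing)   # only missing inputs are fetched
        best.cache |= missing       # fetched inputs stay on the worker
        best.cache.add(output)      # outputs stay on the worker as well
    return transfers


# Hypothetical workload: two tasks share an input, a third consumes their outputs.
workers = [Worker("w1"), Worker("w2")]
tasks = [
    ("t1", ["ref.db"], "a.out"),
    ("t2", ["ref.db"], "b.out"),
    ("t3", ["a.out", "b.out"], "c.out"),
]
total_transfers = run_tasks(tasks, workers)
print(total_transfers)
```

In this toy run only the first task fetches `ref.db`; the later tasks find every input already on the worker. With no worker-side cache, the same three tasks would need four fetches.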

Terminology
Terminology below follows the definitions established by the Workflows Community Terminology.

Characteristics
    Flow: Data
    Granularity: Sub-workflows
    Coupling: Loose
    Domain: Agnostic
Composition
    Description: Standard (Make)
    Abstraction: Abstract
    Modularity: Hierarchical
Orchestration
    Planning: Static
    Execution: Runner
Data Management
    Transport: File-based
    Storage: Shared, Replicated
Metadata Capture
    Anomaly Detection
    Monitoring
    Provenance
Extensions
    Distributed Storage
    HPC Execution

Terminology
Terminology below follows the definitions established by the Workflows Community Terminology.

Characteristics
    Flow: Task, Iterative
    Granularity: Functions, Executables
    Coupling: Loose
    Domain: Agnostic
Composition
    Description: API
    Abstraction: Intermediate
    Modularity: Flat
Orchestration
    Planning: Dynamic
    Execution: Resource Manager
Data Management
    Transport: File-based
    Storage: Shared, Distributed, Replicated
Metadata Capture
    Anomaly Detection
    Monitoring
    Provenance
Extensions
    Serverless
    Autoscaling
    HPC Execution
    Recoverable Storage