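Versions

Like other programming languages, WDL has syntax differences between versions, so when writing a workflow in WDL you need to be explicit about which version of the specification you are following:
draft-3
v1.0
v1.1 (as of January 2023, Cromwell does not yet support this version)
Unless otherwise noted, the syntax described below is based on WDL v1.0.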
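
As a minimal sketch of how the version is pinned in practice: a WDL 1.0 document starts with a `version` statement on its first non-comment line. The workflow name `hello` and its output are placeholders invented for this illustration.

```wdl
version 1.0

# Hypothetical minimal workflow; the version statement above is what
# tells the engine which specification to parse this file against.
workflow hello {
  output {
    String message = "hello, world"
  }
}
```

Files without a `version` statement are generally treated as older draft syntax, so it is worth adding the line even to small test workflows.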

Syntax rules

Variables

The main variable types are as follows:

Primitive types

Int i = 0                  # An integer value
Float f = 27.3             # A floating point number
Boolean b = true           # A boolean true/false
String s = "hello, world"  # A string value
File f = "path/to/file"    # A file

Compound types

# In the examples below P represents any of the primitive types above, and X and Y represent any valid type (even nested compound types)
Array[X] xs = [x1, x2, x3]                    # An array of Xs
Map[P,Y] p_to_y = { p1: y1, p2: y2, p3: y3 }  # A map from Ps to Ys
Pair[X,Y] x_and_y = (x, y)                    # A pair of one X and one Y
Object o = { "field1": f1, "field2": f2 }     # Object keys are always `String`s

Array Literals

Array values can be specified using Python-like syntax, as follows:

Array[String] a = ["a", "b", "c"]
Array[Int] b = [0,1,2]

Map Literals

Map values can be specified using a similar Python-like syntax:

Map[Int, Int] int_map = {1: 10, 2: 11}
Map[String, Int] string_map = {"a": 1, "b": 2}

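Struct Definition

A struct is a C-like construct that lets users create new compound types composed of previously existing types. A struct can then be used in a declaration inside a Task or Workflow definition in place of any other regular type. In many cases structs replace the Object type and allow their members to be properly typed.

struct SampleData{}

Compound types can also be used within a struct, making it easy to encapsulate them in a single object.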
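
To make the struct description concrete, here is a small sketch of a struct with both primitive and compound members. The member names (`sample_id`, `fastq_r1`, `tags`, `metrics`) and the workflow `struct_demo` are assumptions made up for this example, not fields from any real pipeline.

```wdl
version 1.0

# Hypothetical struct: every member has an explicit type,
# which is the main advantage over an untyped Object.
struct SampleData {
  String sample_id
  File fastq_r1
  File fastq_r2
  Array[String] tags            # compound member
  Map[String, Float] metrics    # compound member
}

workflow struct_demo {
  input {
    SampleData sample
  }
  output {
    # Members are read with the usual x.y syntax.
    String id = sample.sample_id
    Int tag_count = length(sample.tags)
  }
}
```

In the inputs JSON, a struct value is supplied as a JSON object whose keys match the member names.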

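Using variables

Accessing an Object

WDL itself provides several functions for reading a file into a structured value (read_tsv, read_json, read_map, and so on).
Some examples follow.
Config_File_software: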

pipeline_root	/home/liubo4/Project/aio.wdl/Gdc/aio.NewAnnotation/
java	bin/java_v1.8
tabix	bin/tabix

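To extract the information in the file above from WDL:

workflow wf_echo {
  input {
    String Config_File_software
  }
  Object software = read_map(Config_File_software)

  # Method 1: use a value from the first column of Config_File_software directly as the attribute name.
  String pipeline_root = software.pipeline_root

  # Method 2: look the key up through an intermediate variable; this handles per-configuration differences efficiently.
  String tmp = "pipeline_root"
  String pipeline_root_by_key = software[tmp]
}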
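
In the specification, read_map returns a Map[String, String], so a variant that stays with the Map type may be more portable across engines. The following is a sketch under that assumption; the names `wf_read_config`, `config_file`, and `key` are invented for the example, and the file is expected to be the two-column, tab-separated format shown above.

```wdl
version 1.0

workflow wf_read_config {
  input {
    # Two-column, tab-separated file: key <TAB> value
    File config_file
  }

  # Each line of the file becomes one key/value entry.
  Map[String, String] software = read_map(config_file)

  # A key held in a variable works the same way as a literal key.
  String key = "pipeline_root"

  output {
    String pipeline_root = software[key]
    String java = software["java"]
  }
}
```

Looking up a key that is not present in the map fails at runtime, so configuration files should be validated before a long workflow is launched.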

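For more complex use cases, JSON can be used to provide richer data structures.

Iterating over a map

Because a map consists of key-value pairs, scattering over a map yields values of type Pair, from which the key and the value can be retrieved. Reference example: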

workflow foo {
  input {
    Map[String, Int] map
  }
  scatter (pair in map) {
    String key = pair.left
    Int value = pair.right
  }

  output {
    # Automatically gathered from inside the scatter:
    Array[String] keys = key
    Array[Int] values = value
  }
}
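
The WDL 1.0 specification describes scatter over an Array, so whether a Map can be scattered directly may depend on the engine. As a hedged alternative, the same key/value iteration can be written over an Array[Pair[String, Int]]. The workflow name `scatter_pairs` and the default values are assumptions for illustration.

```wdl
version 1.0

workflow scatter_pairs {
  input {
    # Explicit list of pairs instead of a Map.
    Array[Pair[String, Int]] items = [("a", 1), ("b", 2)]
  }

  scatter (item in items) {
    String key = item.left
    Int value = item.right
  }

  output {
    # Declarations inside a scatter are gathered into arrays outside it.
    Array[String] all_keys = key
    Array[Int] all_values = value
  }
}
```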

Optional Parameters & Type Constraints

Types can be optionally suffixed with a ? or + in certain cases.
? means that the parameter is optional. A user does not need to specify a value for the parameter in order to satisfy all the inputs to the workflow.
+ applies only to Array types and it represents a constraint that the Array value must contain one or more elements.
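
A short sketch of both suffixes in one task. The task name `count_lines` and its inputs are hypothetical; `select_first` is used here to supply a fallback when the optional input is omitted.

```wdl
version 1.0

task count_lines {
  input {
    Array[File]+ files   # non-empty constraint: at least one file is required
    String? label        # optional: the caller may leave this out
  }

  # Fall back to a fixed label when none was provided.
  String effective_label = select_first([label, "unlabeled"])

  command <<<
    echo "~{effective_label}"
    cat ~{sep=" " files} | wc -l
  >>>

  output {
    Array[String] lines = read_lines(stdout())
  }
}
```

An optional value generally has to be unwrapped (for example with `select_first`) before it is used where a non-optional type is expected.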

For how to use each type of variable, and for more WDL usage tips, see the official specification.

Expressions

LHS Type	Operators	RHS Type	Result	Semantics
`Boolean`	`==`	`Boolean`	`Boolean`
`Boolean`	`!=`	`Boolean`	`Boolean`
`Boolean`	`>`	`Boolean`	`Boolean`
`Boolean`	`>=`	`Boolean`	`Boolean`
`Boolean`	`<`	`Boolean`	`Boolean`
`Boolean`	`<=`	`Boolean`	`Boolean`
`Boolean`	`||`	`Boolean`	`Boolean`
`Boolean`	`&&`	`Boolean`	`Boolean`
`File`	`+`	`File`	`File`	Append file paths
`File`	`==`	`File`	`Boolean`
`File`	`!=`	`File`	`Boolean`
`File`	`+`	`String`	`File`
`File`	`==`	`String`	`Boolean`
`File`	`!=`	`String`	`Boolean`
`Float`	`+`	`Float`	`Float`
`Float`	`-`	`Float`	`Float`
`Float`	`*`	`Float`	`Float`
`Float`	`/`	`Float`	`Float`
`Float`	`%`	`Float`	`Float`
`Float`	`==`	`Float`	`Boolean`
`Float`	`!=`	`Float`	`Boolean`
`Float`	`>`	`Float`	`Boolean`
`Float`	`>=`	`Float`	`Boolean`
`Float`	`<`	`Float`	`Boolean`
`Float`	`<=`	`Float`	`Boolean`
`Float`	`+`	`Int`	`Float`
`Float`	`-`	`Int`	`Float`
`Float`	`*`	`Int`	`Float`
`Float`	`/`	`Int`	`Float`
`Float`	`%`	`Int`	`Float`
`Float`	`==`	`Int`	`Boolean`
`Float`	`!=`	`Int`	`Boolean`
`Float`	`>`	`Int`	`Boolean`
`Float`	`>=`	`Int`	`Boolean`
`Float`	`<`	`Int`	`Boolean`
`Float`	`<=`	`Int`	`Boolean`
`Float`	`+`	`String`	`String`
`Int`	`+`	`Float`	`Float`
`Int`	`-`	`Float`	`Float`
`Int`	`*`	`Float`	`Float`
`Int`	`/`	`Float`	`Float`
`Int`	`%`	`Float`	`Float`
`Int`	`==`	`Float`	`Boolean`
`Int`	`!=`	`Float`	`Boolean`
`Int`	`>`	`Float`	`Boolean`
`Int`	`>=`	`Float`	`Boolean`
`Int`	`<`	`Float`	`Boolean`
`Int`	`<=`	`Float`	`Boolean`
`Int`	`+`	`Int`	`Int`
`Int`	`-`	`Int`	`Int`
`Int`	`*`	`Int`	`Int`
`Int`	`/`	`Int`	`Int`	Integer division
`Int`	`%`	`Int`	`Int`	Integer division, return remainder
`Int`	`==`	`Int`	`Boolean`
`Int`	`!=`	`Int`	`Boolean`
`Int`	`>`	`Int`	`Boolean`
`Int`	`>=`	`Int`	`Boolean`
`Int`	`<`	`Int`	`Boolean`
`Int`	`<=`	`Int`	`Boolean`
`Int`	`+`	`String`	`String`
`String`	`+`	`Float`	`String`
`String`	`+`	`Int`	`String`
`String`	`+`	`String`	`String`
`String`	`==`	`String`	`Boolean`
`String`	`!=`	`String`	`Boolean`
`String`	`>`	`String`	`Boolean`
`String`	`>=`	`String`	`Boolean`
`String`	`<`	`String`	`Boolean`
`String`	`<=`	`String`	`Boolean`
	`-`	`Float`	`Float`
	`+`	`Float`	`Float`
	`-`	`Int`	`Int`
	`+`	`Int`	`Int`
	`!`	`Boolean`	`Boolean`
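
A few rows of the table above, written out as declarations. This is only a sketch: the workflow name `expr_demo` and the input defaults are made up for the example.

```wdl
version 1.0

workflow expr_demo {
  input {
    Int n = 7
    Float x = 2.5
    String stem = "sample_"
  }

  output {
    Int quotient = n / 2                # Int / Int: integer division
    Int remainder = n % 2               # Int % Int: remainder
    Float scaled = n * x                # Int * Float: result is Float
    String name = stem + n              # String + Int: result is String
    Boolean ok = n > 5 && x <= 3.0      # comparisons combined with &&
  }
}
```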

Reference

If then else

Int array_length = length(array)
runtime {
  memory: if array_length > 100 then "16GB" else "8GB"
}

Following this pattern, memory can be allocated to each task dynamically according to the amount of data it handles.
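
For context, here is how the fragment above might sit inside a complete task. This is a sketch only: the task name `concat`, the input `parts`, and the threshold of 100 are assumptions carried over for illustration.

```wdl
version 1.0

task concat {
  input {
    Array[File] parts
  }

  # Evaluated before the task runs, so it can drive the runtime section.
  Int n_parts = length(parts)

  command <<<
    cat ~{sep=" " parts} > merged.txt
  >>>

  output {
    File merged = "merged.txt"
  }

  runtime {
    # Request more memory only when there are many inputs.
    memory: if n_parts > 100 then "16GB" else "8GB"
  }
}
```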

Member Access

The syntax x.y accesses a member of x, where x must be an object or a call to a task in a workflow. A task call can be treated as an object whose attributes are the outputs of that task.

workflow wf {
  input {
    Object obj
    Object foo
  }
  # This would cause a syntax error,
  # because foo is defined twice in the same namespace.
  call foo {
    input: var=obj.attr # Object attribute
  }

  call foo as foo2 {
    input: var=foo.out # Task output
  }
}
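
For contrast, a version without the name clash might look like the sketch below, where the second call reads an output of the first through member access. The task `say_hello` and all names in it are hypothetical.

```wdl
version 1.0

task say_hello {
  input {
    String name
  }
  command <<<
    echo "hello, ~{name}"
  >>>
  output {
    String greeting = read_string(stdout())
  }
}

workflow member_access_demo {
  input {
    String who = "world"
  }

  call say_hello { input: name = who }

  # The alias avoids a second `say_hello` in the same namespace;
  # say_hello.greeting is member access on a task call.
  call say_hello as say_again { input: name = say_hello.greeting }

  output {
    String final_greeting = say_again.greeting
  }
}
```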

Map and Array Indexing

x[y] indexes into maps and arrays. For a map, y must be one of the keys of x; for an array, y must be an integer.

Pair Indexing

If x is a Pair, its left and right elements can be accessed with x.left and x.right.
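
The three access forms side by side, as a small sketch. The workflow name `indexing_demo` and the default values are invented; array indices in WDL start at 0.

```wdl
version 1.0

workflow indexing_demo {
  input {
    Array[String] names = ["a", "b", "c"]
    Map[String, Int] depth = {"a": 10, "b": 20}
    Pair[String, Int] best = ("b", 20)
  }

  output {
    String first_name = names[0]   # array index (0-based)
    Int depth_of_a = depth["a"]    # map lookup by key
    String best_name = best.left   # left element of the pair
    Int best_depth = best.right    # right element of the pair
  }
}
```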

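Built-in functions

Input and output: stdout, stderr, read_tsv

The stdout() function captures the standard output produced by the commands in the command section.
The stderr() function captures the standard error produced by the commands in the command section.

stderr is used more often than stdout, mostly to capture warning messages.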
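
A minimal sketch of capturing both streams in a task's output section. The task name `capture_streams` and the echoed messages are placeholders for this example.

```wdl
version 1.0

task capture_streams {
  command <<<
    echo "result line"
    echo "warning: something looks off" >&2
  >>>

  output {
    File out_log = stdout()                   # file holding standard output
    File err_log = stderr()                   # file holding standard error
    String first_line = read_string(stdout()) # contents of stdout as a String
  }
}
```

Both functions return a File, so the contents have to be read with a function such as read_string or read_lines when a value rather than a file is needed.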

Information retrieval: defined, glob, basename, select_first

File reading: read_tsv, read_json, read_lines

File writing: write_tsv, write_lines

# json file: person.json
{
    "name":"John",
    "age":42
}

# WDL read
workflow demo{
    File json_file = "person.json"
    Object p = read_json(json_file)
    ...
    call record{
        input:
            name = p.name,
            age = p.age
    }
}

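glob: collects the files matching a pattern and returns them as an array of files
defined: checks whether a variable has been given a value and returns a Boolean (true/false)
select_first: takes an array and returns its first non-null element. A very important function!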
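
The sketch below combines these functions with basename in one task. The task name `split_file`, the default chunk size of 1000, and the `.part_` naming are all assumptions for illustration; the `split` command is the standard Unix utility.

```wdl
version 1.0

task split_file {
  input {
    File source
    Int? lines_per_chunk   # optional override
  }

  # select_first picks the first non-null element of the array.
  Int chunk_size = select_first([lines_per_chunk, 1000])

  # basename strips the directory and, optionally, a suffix.
  String stem = basename(source, ".txt")

  command <<<
    split -l ~{chunk_size} ~{source} ~{stem}.part_
  >>>

  output {
    # glob gathers every file in the working directory matching the pattern.
    Array[File] chunks = glob(stem + ".part_*")
    # defined reports whether the optional input was supplied.
    Boolean custom_size = defined(lines_per_chunk)
  }
}
```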

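Variable manipulation functions

prefix: adds a prefix to every element of an array. Very important when there are multiple input files of the same kind!
sub: provides regular-expression substitution (not recommended in WDL)

Variable manipulation: prefix, sub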
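
As a sketch of the typical use: prefix turns a list of files into repeated command-line flags, and sub rewrites a file name. The task `merge_vcfs`, the `-I`/`-O` flags, and the file names are hypothetical, and the command only echoes the assembled line rather than running a real tool.

```wdl
version 1.0

task merge_vcfs {
  input {
    Array[File] vcfs
    String sample = "sample.raw.vcf"
  }

  # ["a.vcf", "b.vcf"] becomes ["-I a.vcf", "-I b.vcf"].
  Array[String] input_args = prefix("-I ", vcfs)

  # Replace the ".raw.vcf" suffix using a regular expression.
  String out_name = sub(sample, "\\.raw\\.vcf$", ".merged.vcf")

  command <<<
    echo merge ~{sep=" " input_args} -O ~{out_name}
  >>>

  output {
    String command_line = read_string(stdout())
  }
}
```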

Workflow

Task

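BatchCompute (runtime)

The runtime section configures the parameters a task runs with.
When BatchCompute is used as the backend, the main runtime parameters are:

cluster:
    Compute cluster environment
    Supports serverless mode and fixed-cluster mode
mounts:
    Mount settings
    Supports OSS and NAS
docker:
    Container image address
    Supports the container registry service
simg:
    Container image file
    Supports Singularity images
systemDisk:
    System disk settings
    Includes disk type and disk size
dataDisk:
    Data disk settings
    Includes disk type, disk size, and mount point
memory:
    Memory required by the task
cpu:
    Number of CPU cores required
timeout:
    Job timeout
maxRetries:
    Maximum number of times a job may be resubmitted after a failure.

For a detailed explanation of each parameter and how to set it, see the official Cromwell documentation.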
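
A hedged sketch of a runtime section that uses only the widely supported keys (docker, cpu, memory, maxRetries). The backend-specific keys listed above (cluster, mounts, systemDisk, dataDisk, and so on) are deliberately left out because their value formats are defined by the backend and should be taken from its documentation. The task `line_count`, the image name, and the values are assumptions for illustration.

```wdl
version 1.0

task line_count {
  input {
    File reads
    Int threads = 4
  }

  command <<<
    wc -l ~{reads}
  >>>

  output {
    String summary = read_string(stdout())
  }

  runtime {
    docker: "ubuntu:20.04"   # container image (placeholder)
    cpu: threads             # runtime values may be expressions
    memory: "8 GB"
    maxRetries: 2            # resubmit at most twice after a failure
  }
}
```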

Beyond these, there are a few other concepts:

runtime
parameter_meta
meta
Starting with official release 45, Cromwell with BatchCompute as the backend supports two advanced features: glob and call caching.
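
For the meta and parameter_meta sections listed above, a minimal sketch might look like this. Both sections hold descriptive metadata and do not change how the task executes; the task `sort_file` and the metadata values are invented for the example.

```wdl
version 1.0

task sort_file {
  input {
    File unsorted
  }

  command <<<
    sort ~{unsorted}
  >>>

  output {
    File sorted = stdout()
  }

  # Free-form information about the task itself.
  meta {
    author: "example author"
    description: "Sorts the lines of a text file"
  }

  # Descriptions keyed by input name.
  parameter_meta {
    unsorted: "Text file whose lines should be sorted"
  }
}
```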