


  1. Subtract the germline calls from the somatic (tumor) calling results
  2. Somatic variant callers that use Bayesian methods
  3. Fisher's exact test

Fisher's exact test (currently used in our products)


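Taking VarScan as an example: depending on whether Fisher's exact test is significant (p-value < 0.1) and on the variant status in the two samples (tumor and control), each site is assigned to one of three classes (LOH, Germline, Somatic).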
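The classification rule above can be sketched as follows. This is a simplified illustration, not VarScan's actual implementation: the genotype encoding (`ref`, `het`, `hom`), the function name `classify_variant`, and the extra `Reference` and `Unknown` fallbacks are assumptions made for the example. Only the 0.1 cutoff and the three class names come from the text.

```python
def classify_variant(normal_gt: str, tumor_gt: str, p_value: float,
                     alpha: float = 0.1) -> str:
    """Assign a site to a class from the two genotypes and a Fisher p-value.

    Genotypes use a hypothetical encoding: 'ref', 'het' or 'hom'.
    """
    valid = {"ref", "het", "hom"}
    if normal_gt not in valid or tumor_gt not in valid:
        raise ValueError("genotypes must be one of 'ref', 'het', 'hom'")

    # Same genotype in both samples: a shared variant is treated as germline.
    if normal_gt == tumor_gt:
        return "Germline" if normal_gt != "ref" else "Reference"

    # Genotypes differ: the difference must be significant to be classified.
    if p_value < alpha:
        if normal_gt == "ref":
            return "Somatic"   # variant present only in the tumor
        if normal_gt == "het":
            return "LOH"       # heterozygous control, homozygous tumor
    # Assumption of this sketch: everything else is left unclassified.
    return "Unknown"
```

Note that a tumor-only variant whose p-value misses the cutoff is not reported as Somatic here, which is why the control depth matters.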


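Tissue depth 1200x; the control is a pure negative. (Control depth threshold: 300x)
For a mutation at the 1% limit of detection (tissue/plasma sequencing depth 1200x), significance is possible only when the control depth is at least 389x.
For a 3% mutation (tissue/plasma sequencing depth 1200x), significance is possible only when the control depth is at least 127x.
For a 0.5% mutation (tissue/plasma sequencing depth 1200x), significance is possible only when the control depth is at least 843x.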

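WES product at 500x; the control is a pure negative. (Control depth threshold: 200x)
For a mutation at the 3% limit of detection, the pure-negative control must reach 133x for significance to be possible.
For a 1% mutation, the pure-negative control must reach 506x for significance to be possible.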
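The dependence of significance on control depth can be explored with a small calculation. The sketch below computes a one-sided Fisher's exact p-value for a tumor/control read-count table and searches for the smallest control depth that falls under a cutoff. The function names, the one-sided formulation, and the rounding of alt-read counts are assumptions made for the example. The resulting thresholds depend on those choices and on the cutoff, so the sketch is not guaranteed to reproduce the depths quoted above.

```python
from fractions import Fraction
from math import comb


def fisher_one_sided_p(tumor_depth, tumor_alt, normal_depth, normal_alt=0):
    """One-sided Fisher's exact p-value: alt reads enriched in the tumor.

    Hypergeometric tail P(X >= tumor_alt), where X is the number of alt
    reads that fall in the tumor sample given the table margins.
    """
    total_alt = tumor_alt + normal_alt
    total_reads = tumor_depth + normal_depth
    denom = comb(total_reads, tumor_depth)
    numer = 0
    for x in range(tumor_alt, min(total_alt, tumor_depth) + 1):
        numer += comb(total_alt, x) * comb(total_reads - total_alt,
                                           tumor_depth - x)
    return float(Fraction(numer, denom))


def min_normal_depth(tumor_depth, vaf, alpha=0.1, max_depth=100_000):
    """Smallest pure-negative control depth giving p < alpha, or None."""
    tumor_alt = round(tumor_depth * vaf)  # assumed rounding of alt reads
    if tumor_alt == 0:
        return None  # no alt reads, so the test can never be significant
    for normal_depth in range(1, max_depth + 1):
        if fisher_one_sided_p(tumor_depth, tumor_alt, normal_depth) < alpha:
            return normal_depth
    return None


if __name__ == "__main__":
    for depth, vaf in [(1200, 0.01), (1200, 0.03), (1200, 0.005),
                       (500, 0.03), (500, 0.01)]:
        print(depth, vaf, min_normal_depth(depth, vaf))
```

Changing `alpha`, or switching to a two-sided test, shifts the required control depth, so any comparison against a fixed control threshold should use the same settings as the production caller.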



A variant allele in the case sample is not called if the site is variant in controls.
We explain an exception for GATK4 Mutect2 in a bit.
Historically, somatic callers have called somatic variants at the site level. That is, if a variant site in the case is also variant in the matched control or in a population resource, e.g. dbSNP, it is discounted from the somatic callset even if the variant allele differs from that in the control or resource. This practice stems in part from cancer study designs where the control normal sample is sequenced at much lower depth than the case tumor sample.
Because mutations are assumed to strike randomly, cancer geneticists view mutations at sites of common germline variation with skepticism. Remember that for humans, common germline variant sites occur on average at roughly one in a thousand reference bases. So if a commonly variant site accrues additional mutations, we must weigh the chance that it arose from a true somatic event against the chance that it is something else that will likely not add value to downstream analyses. For most sites and typical analyses, the latter is the case: the variant is unlikely to have arisen from a somatic event and is more likely to be a germline variant or an artifact, e.g. from mapping or cross-sample contamination.
GATK4 Mutect2 still applies this practice in part. The tool discounts variant sites shared with the panel of normals or with a matched normal control’s unambiguously variant site. If the matched normal’s variant allele is supported by few reads, at low allele fraction, then the tool accounts for the possibility of the site not being a germline variant.
When it comes to the population germline resource, GATK4 Mutect2 distinguishes between the variant alleles in the germline resource and the case sample. That is, Mutect2 will call a variant site somatic if the allele differs from that in the germline resource.
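The difference between site-level and allele-level matching against a resource can be illustrated with a toy filter. This is only a sketch of the distinction described above, not how Mutect2 is implemented: the `Variant` record and both function names are hypothetical, and a real germline resource would be read from a VCF rather than a Python list.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def passes_site_level(candidate: Variant, known: Iterable[Variant]) -> bool:
    """Site-level: reject if ANY known variant shares the position."""
    known_sites = {(v.chrom, v.pos) for v in known}
    return (candidate.chrom, candidate.pos) not in known_sites


def passes_allele_level(candidate: Variant, known: Iterable[Variant]) -> bool:
    """Allele-level: reject only if position AND alleles both match."""
    return candidate not in set(known)


if __name__ == "__main__":
    resource = [Variant("chr1", 100, "A", "G")]
    candidate = Variant("chr1", 100, "A", "T")  # same site, different allele
    print(passes_site_level(candidate, resource))    # False
    print(passes_allele_level(candidate, resource))  # True
```

In this toy case the candidate is discarded under site-level matching but kept under allele-level matching, which mirrors the behavior described for the population germline resource.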

