This page shows the differences between the revision you selected and the current version.
| stat:rnaseq [2023/06/26 00:14] – [Workflow] inkit | stat:rnaseq [2024/10/21 13:14] (current version) – [rMATS] inkit | ||
|---|---|---|---|
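| Line 1: | Line 1: | ||
| ====== RNA-Seq ====== | ====== RNA-Seq ====== | ||
| ====Tutorials==== | ====Tutorials==== | ||
| + | * Submitting data to SRA [[stat: | ||
| * Zhihu search https:// | * Zhihu search https:// | ||
| * RNA-Seq data normalization methods https:// | * RNA-Seq data normalization methods https:// | ||
| Line 6: | Line 7: | ||
| * RNA-seq: transcriptome data analysis and processing (Part 1) https:// | * RNA-seq: transcriptome data analysis and processing (Part 1) https:// | ||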
| * Common File Formats Used by the ENCODE Consortium [[https:// | * Common File Formats Used by the ENCODE Consortium [[https:// | ||
| + | * FAA RNA Seq Compare [[https:// | ||
| - | ====Workflow==== | + | =====Workflow===== |
| + | [[https:// | ||
| Raw Data: Fastq (.gz) -> | Raw Data: Fastq (.gz) -> | ||
| - QC: FastQC | - QC: FastQC | ||
| Line 17: | Line 20: | ||
| - Differential | - Differential | ||
| + | ====1 QC ==== | ||
| ===FastQC=== | ===FastQC=== | ||
| < | < | ||
| Line 23: | Line 27: | ||
| fastqc --noextract RawData/ | fastqc --noextract RawData/ | ||
| </ | </ | ||
| + | |||
| + | |||
| + | ====2 Alignment==== | ||
| ===Genome annotation data=== | ===Genome annotation data=== | ||
| Line 31: | Line 38: | ||
| >>GTF [[https:// | >>GTF [[https:// | ||
| >>GFF [[https:// | >>GFF [[https:// | ||
| + | |||
| + | ===Alignment Indexing=== | ||
| + | [[https:// | ||
| Line 36: | Line 46: | ||
| * [[https:// | * [[https:// | ||
| * STAR | * STAR | ||
| + | * HISAT2 | ||
| + | * Salmon [[https:// | ||
| + | |||
| + | ===HISAT2=== | ||
| + | [[https:// | ||
| + | | ||
| + | |||
| + | < | ||
| + | hisat2-build -p 32 fasta/ | ||
| + | hisat2 -x hisat_index/ | ||
| + | |||
| + | </ | ||
| ===STAR=== | ===STAR=== | ||
| Line 51: | Line 73: | ||
| STAR --genomeDir star_index --readFilesIn rna4/ | STAR --genomeDir star_index --readFilesIn rna4/ | ||
| + | </ | ||
| + | |||
| + | ===kallisto=== | ||
| + | < | ||
| + | # bat | ||
| + | kallisto bus [arguments] FASTQ-files | ||
| + | |||
| + | kallisto quant -i rattus_index_ki/ | ||
| + | |||
| + | </ | ||
| + | ====3 Count ==== | ||
| + | > | ||
| + | >>1 [[https:// | ||
| + | >>> | ||
| + | < | ||
| + | featureCounts -p -M -O -T 32 -a gtf/ | ||
| + | </ | ||
| + | |||
| + | |||
| + | |||
| + | ===== Splicing ===== | ||
| + | ====rMATS==== | ||
| + | ===Splicing events=== | ||
| + | * SE (Skipped Exon) | ||
| + | * MXE (Mutually Exclusive Exons) | ||
| + | * A5SS (Alternative 5' Splice Site) | ||
| + | * A3SS (Alternative 3' Splice Site) | ||
| + | * RI (Retained Intron) | ||
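| + | ===Counting methods=== | ||
| + | * JC (Junction Counts): quantifies using only reads that span splice junctions. This gives more precise splicing quantification but may miss some low-coverage splicing events. | ||
| + | * JCEC (Junction Counts and Exon Counts): quantifies using reads that span splice junctions together with reads that cover the exon itself. This detects more events but can sometimes introduce extra noise. | ||
| + | Use JC when more precise identification of splicing events matters. Consider JCEC when the goal is to detect as many splicing events as possible. | ||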
| + | ===txt=== | ||
| + | < | ||
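| + | ID: unique identifier of the splicing event. | ||
| + | GeneID: identifier of the gene in which the event occurs (gene name or gene ID). | ||
| + | chr: chromosome on which the event occurs. | ||
| + | strand: strand of the gene (+ or -). | ||
| + | longExonStart_0base and longExonEnd: start and end positions of the long exon form. | ||
| + | shortES and shortEE: start and end positions of the short exon form. | ||
| + | flankingES and flankingEE: positions of the flanking exon. | ||
| + | ID: ID number of the splicing event (this column appears a second time in the file). | ||
| + | IncFormLen and SkipFormLen: lengths of the inclusion form and the skipping form of the transcript. | ||
| + | IJC and SJC (Inclusion Junction Counts / Skipping Junction Counts): numbers of reads supporting inclusion and skipping of the event. | ||
| + | IncLevel1 / IncLevel2: inclusion level (Ψ) of the event in each of the two sample groups. | ||
| + | IncLevelDifference: difference in inclusion level between the two groups (ΔΨ). | ||
| + | PValue and FDR: P value and false discovery rate (FDR) used to judge whether the event differs significantly. | ||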
| </ | </ | ||
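
To make the column notes above concrete, here is a minimal Python sketch of how an inclusion level (Ψ) can be derived from the inclusion and skipping junction counts and the two form lengths, and how events might then be filtered. It assumes the usual length-normalized definition Ψ = (I/IncFormLen) / (I/IncFormLen + S/SkipFormLen). The function names and the cutoffs (FDR ≤ 0.05, |ΔΨ| ≥ 0.1) are hypothetical choices for illustration, not values taken from this page or from rMATS itself.

```python
def inclusion_level(ijc, sjc, inc_form_len, skip_form_len):
    """Length-normalized inclusion level (psi) for one sample.

    ijc / sjc are the inclusion / skipping read counts; the two lengths
    correspond to the IncFormLen / SkipFormLen columns.
    """
    inc = ijc / inc_form_len
    skip = sjc / skip_form_len
    total = inc + skip
    if total == 0:
        return None  # no supporting reads, so psi is undefined
    return inc / total


def mean_psi(ijc_list, sjc_list, inc_form_len, skip_form_len):
    """Average psi over the replicates of one group, skipping undefined values."""
    values = [
        inclusion_level(i, s, inc_form_len, skip_form_len)
        for i, s in zip(ijc_list, sjc_list)
    ]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


def is_significant(fdr, delta_psi, max_fdr=0.05, min_abs_delta=0.1):
    """Hypothetical filter: keep events with low FDR and a large |delta psi|."""
    return fdr <= max_fdr and abs(delta_psi) >= min_abs_delta


if __name__ == "__main__":
    # Made-up counts for two groups with two replicates each.
    psi1 = mean_psi([30, 20], [10, 20], 200, 100)
    psi2 = mean_psi([5, 8], [40, 35], 200, 100)
    delta = psi1 - psi2
    print(psi1, psi2, delta, is_significant(0.01, delta))
```

The sign convention here (group 1 minus group 2) is an assumption of the sketch. Check it against the actual output file before interpreting the direction of a change.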