Screening for susceptibility genes of essential hypertension in the Tongdao Dong population using whole-genome resequencing

  • Abstract: Objective To preliminarily screen, at the whole-genome level, for susceptibility genes for essential hypertension (EH) in the Tongdao Dong population, using Illumina next-generation sequencing combined with a DNA pooling strategy and bioinformatics analysis. Methods The study included 100 Tongdao Dong patients with EH and 100 healthy controls. Peripheral blood DNA was extracted and one DNA pool was prepared for each group; a library was constructed from each pool and sequenced with next-generation sequencing technology. Sequencing reads were aligned to the reference genome with BWA, and the resulting BAM files were used to detect single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) with GATK, copy number variations (CNVs) with CNVnator, and structural variations (SVs) with Breakdancer. Variants were annotated and their effects predicted with ANNOVAR and the in-house software AnnoDB. Coding-region SNPs shared by the EH and control groups were compared with Fisher's exact test, and SNPs with P<0.05 were retained. Gene Ontology (GO) functional annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the resulting differential genes were performed with the online tools DAVID and KOBAS. A protein-protein interaction network was built with the STRING database, and Cytoscape was used to select key genes and visualize the network. Results In the control and EH groups, respectively, 4 311 896 and 4 554 994 SNPs, 772 269 and 870 230 InDels, 8 027 and 7 052 CNVs, and 5 534 and 8 072 SVs were detected. The two groups shared 21 861 SNPs in coding regions. Network topology analysis based on betweenness centrality (BC), performed with the CytoNCA plugin in Cytoscape to select key genes, showed that the coding-region differential genes AKT serine/threonine kinase 1 (AKT1), epidermal growth factor receptor (EGFR), angiotensin-converting enzyme (ACE), integrin subunit beta 1 (ITGB1), matrix metalloproteinase 9 (MMP9), and low density lipoprotein receptor (LDLR), among others, had higher BC values: they were more densely connected to other genes in the network and carried more of its information flow. Conclusions Combining DNA pooling with next-generation sequencing identified a set of susceptibility genes that may be related to the occurrence and development of EH in the Tongdao Dong population. The pathogenesis of EH in this population may mainly involve signaling pathways mediated by AKT1, EGFR, and ACE.
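The SNP screening step in the Methods (Fisher's exact test, retaining P<0.05) can be illustrated with a minimal sketch. The abstract does not say which counts entered each 2×2 table, so the sketch assumes alternate and reference allele read counts from the two DNA pools. The function names, the SNP identifiers, and all counts below are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def screen_snps(counts, alpha=0.05):
    """Return the SNP ids whose two-sided p-value is below alpha.

    counts maps a SNP id to (case_alt, case_ref, control_alt, control_ref).
    Using pooled allele read counts here is an assumption of this sketch.
    """
    return [snp for snp, (ca, cr, na, nr) in counts.items()
            if fisher_exact_two_sided(ca, cr, na, nr) < alpha]


# Hypothetical read counts for two made-up SNP ids.
example_counts = {
    "snp_A": (40, 60, 10, 90),   # strong imbalance between pools
    "snp_B": (25, 75, 24, 76),   # nearly identical allele frequencies
}
selected = screen_snps(example_counts)
```

With pooled sequencing, read counts only approximate allele frequencies, and no correction for multiple testing is applied here, so a filter like this is best read as a first-pass screen, in line with the "preliminary" framing of the Objective.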

     
