Course - details

LGN5835 - Applied Bioinformatics


Workload

Theory (per week): 2
Practice (per week): 2
Credits: 8
Duration: 15 weeks
Total: 120 hours

Responsible faculty
Gabriel Rodrigues Alves Margarido

Objective
The course gives students a combined theoretical and practical overview of the technologies available
for large-scale genome sequencing, with emphasis on the tools available for analyzing the resulting
data. It also aims to equip students to apply genotyping and transcriptomics strategies to genetic
breeding.

Content
Genomics and bioinformatics. Modern DNA sequencing technologies. Preprocessing of nucleotide
sequencing data. Alignment of biological sequences. Sequencing and de novo assembly of complete
genomes. Polymorphism discovery and genotyping: genome resequencing; sequencing of
reduced-representation libraries (GBS and RAD). Functional genomics and transcriptomics. De novo
transcriptome assembly. Differential gene expression. Functional enrichment studies. The main
programs and platforms used include: Bowtie, HISAT, BWA-MEM, IGV, TASSEL-GBS, GATK, FreeBayes,
Trinity, R, R/Bioconductor, edgeR, goseq.

Bibliography
Altschul, S.F.; Gish, W.; Miller, W.; Myers, E.W.; Lipman, D.J. Basic Local Alignment Search Tool.
Journal of Molecular Biology, v. 215, p. 403-410, 1990.
Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. Gapped
BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research,
v. 25, p. 3389-3402, 1997.
Anders, S.; Huber, W. Differential expression analysis for sequence count data. Genome Biology, v. 11,
R106, 2010.
Anders, S.; Pyl, P.T.; Huber, W. HTSeq—a Python framework to work with high-throughput sequencing
data. Bioinformatics, v. 31, p. 166-169, 2015.
Baker, M. De novo genome assembly: what every biologist should know. Nature Methods, v. 9, p. 333-
337, 2012.
Catchen, J.M.; Amores, A.; Hohenlohe, P.; Cresko, W.; Postlethwait, J.H. Stacks: Building and
Genotyping Loci De Novo From Short-Read Sequences. G3, v. 1, p. 171-182, 2011.
Davey, J.W.; Blaxter, M.L. RADSeq: next-generation population genetics. Briefings in Functional
Genomics, v. 9, p. 416-423, 2011.
Eaton, D.A.R. PyRAD: assembly of de novo RADseq loci for phylogenetic analyses. Bioinformatics, v. 30,
p. 1844-1849, 2014.
Elshire, R.J.; Glaubitz, J.C.; Sun, Q.; Poland, J.A.; Kawamoto, K.; Buckler, E.S.; Mitchell, S.E. A Robust,
Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species. PLoS ONE, v. 6, e19379,
2011.
Flicek, P.; Birney, E. Sense from sequence reads: methods for alignment and assembly. Nature Methods,
v. 6, S6-S12, 2009.
Garber, M. et al. Computational methods for transcriptome annotation and quantification using RNA-seq.
Nature Methods, v. 8, p. 469-477, 2011.
Glaubitz, J.C.; Casstevens, T.M.; Lu, F.; Harriman, J.; Elshire, R.J.; Sun, Q.; Buckler, E.S. TASSEL-GBS:
A High Capacity Genotyping by Sequencing Analysis Pipeline. PLoS ONE, v. 9, e90346, 2014.
Gotoh, O. An Improved Algorithm for Matching Biological Sequences. Journal of Molecular Biology, v.
162, p. 705-708, 1982.
Grabherr, M.G. et al. Full-length transcriptome assembly from RNA-seq data without a reference
genome. Nature Biotechnology, v. 29, p. 644-652, 2011.
Green, E. Strategies for the systematic sequencing of complex genomes. Nature Reviews Genetics, v. 2,
p. 573-583, 2001.
Haas, B.J.; Zody, M. Advancing RNA-Seq analysis. Nature Biotechnology, v. 28, p. 421-423, 2010.
Haas, B.J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for
reference generation and analysis. Nature Protocols, v. 8, p. 1494-1512, 2013.
Langmead, B.; Trapnell, C.; Pop, M.; Salzberg, S.L. Ultrafast and memory-efficient alignment of short
DNA sequences to the human genome. Genome Biology, v. 10, R25, 2009.
Li, H.; Homer, N. A survey of sequence alignment algorithms for next-generation sequencing. Briefings
in Bioinformatics, v. 11, p. 473-483, 2010.
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv,
1303.3997, 2013.
Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data
with DESeq2. Genome Biology, v. 15, 550, 2014.
Needleman, S.B.; Wunsch, C.D. A General Method Applicable to the Search for Similarities in the Amino
Acid Sequence of Two Proteins. Journal of Molecular Biology, v. 48, p. 443-453, 1970.
Oshlack, A.; Robinson, M.; Young, M. From RNA-seq reads to differential expression results. Genome
Biology, v. 11, p. 220, 2010.
Peterson, B.K.; Weber, J.N.; Kay, E.H.; Fisher, H.S.; Hoekstra, H.E. Double Digest RADseq: An
Inexpensive Method for De Novo SNP Discovery and Genotyping in Model and Non-Model Species. PLoS
ONE, v. 7, e37135, 2012.
Pevsner, J. Bioinformatics and Functional Genomics. 2nd ed. Wiley-Blackwell, 992 p., 2009.
Poland, J.A.; Rife, T.W. Genotyping-by-Sequencing for Plant Breeding and Genetics. The Plant Genome,
v. 5, p. 92-102, 2012.
Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. edgeR: a Bioconductor package for differential expression
analysis of digital gene expression data. Bioinformatics, v. 26, p. 139-140, 2009.
Smith, T.F.; Waterman, M.S. Identification of Common Molecular Subsequences. Journal of Molecular
Biology, v. 147, p. 195-197, 1981.
Trapnell, C.; Pachter, L.; Salzberg, S.L. TopHat: discovering splice junctions with RNA-Seq.
Bioinformatics, v. 25, p. 1105-1111, 2009.
Trapnell, C.; Salzberg, S.L. How to map billions of short reads onto genomes. Nature Biotechnology, v.
27, p. 455-457, 2009.
Wang, Z.; Gerstein, M.; Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews
Genetics, v. 10, p. 57-63, 2009.
Wang, L.; Feng, Z.; Wang, X.; Wang, X.; Zhang, X. DEGseq: an R package for identifying differentially
expressed genes from RNA-seq data. Bioinformatics, v. 26, p. 136-138, 2010.