{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# [5.2]() Glossary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Table of Contents**\n", "0. [Pairwise alignment (noun)](#1)\n", "0. [kmer (noun)](#2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## [5.2.1](#1) Pairwise alignment (noun)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "A hypothesis about which bases or amino acids in two biological sequences are derived from a common ancestral base or amino acid. By definition, the *aligned sequences* will be of equal length, with gaps (usually denoted with ``-``, or ``.`` for terminal gaps) indicating hypothesized insertion or deletion events. A pairwise alignment may be represented as follows:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```\n", "ACC---GTAC\n", "CCCATCGTAG\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [5.2.2](#2) kmer (noun)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "A kmer is a word (a run of adjacent characters) of length k within a sequence. For example, the overlapping kmers of length 5 in the sequence ``ACCGTGACCAGTTACCAGTTTGACCAA``, together with the number of times each occurs, are as follows:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import skbio\n", "skbio.DNA('ACCGTGACCAGTTACCAGTTTGACCAA').kmer_frequencies(k=5, overlap=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is common for bioinformaticians to substitute the value of `k` for the letter _k_ in the word _kmer_. For example, you might hear someone say \"we identified all seven-mers in our sequence\" to mean that they identified all kmers of length seven." ] } ], "metadata": {}, "nbformat": 4, "nbformat_minor": 4 }