{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# [5.2]() Glossary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Table of Contents**\n",
"0. [Pairwise alignment (noun)](#1)\n",
"0. [kmer (noun)](#2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## [5.2.1](#1) Pairwise alignment (noun)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"A hypothesis about which bases or amino acids in two biological sequences are derived from a common ancestral base or amino acid. By definition, the *aligned sequences* will be of equal length, with gaps (usually denoted with ``-``, or ``.`` for terminal gaps) indicating hypothesized insertion or deletion events. A pairwise alignment may be represented as follows:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"ACC---GTAC\n",
"CCCATCGTAG\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## [5.2.2](#2) kmer (noun)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"A kmer is a word (a run of adjacent characters) of length _k_ in a sequence. For example, the overlapping kmers of length 5 in the sequence ``ACCGTGACCAGTTACCAGTTTGACCAA``, along with the number of times each occurs, are as follows:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import skbio\n",
"\n",
"# Count every overlapping kmer of length 5 (k=5) in the sequence\n",
"skbio.DNA('ACCGTGACCAGTTACCAGTTTGACCAA').kmer_frequencies(k=5, overlap=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is common for bioinformaticians to substitute the value of `k` for the letter _k_ in the word _kmer_. For example, you might hear someone say \"we identified all seven-mers in our sequence\", meaning that they identified all kmers of length seven."
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 4
}