{ "cells": [ { "cell_type": "markdown", "id": "d2e81d31", "metadata": {}, "source": [ "书名:《大话数据结构》 \n", "作者:yzs \n", "时间:2021.9.21" ] }, { "cell_type": "markdown", "id": "e866858a", "metadata": {}, "source": [ "# 数据结构绪论 " ] }, { "cell_type": "markdown", "id": "dd480544", "metadata": {}, "source": [ "## 基本概念和术语 " ] }, { "cell_type": "markdown", "id": "e53c871f", "metadata": {}, "source": [ "**数据**:是描述客观事物的符号,是计算机中可以操作的对象,是能被计算机识别,并输入给计算机处理的符号集合。***(可以输入到计算机中、能被计算机程序处理)*** 数据不仅仅包括整型、实型等数值类型,还包括字符及声音、图像、视频等非数值类型。 \n", "**数据元素**:是组成数据的、有一定意义的基本单位,在计算机中通常作为整体处理。也被称为记录。 \n", "**数据项**:一个数据元素可以由多个数据项组成。(数据项是数据***不可分割的最小单位***。) \n", "**数据对象**:是性质相同的数据元素的集合,是数据的子集。 \n", "**数据结构**:是相互之间存在一种或多种特定关系的数据元素的集合 。" ] }, { "cell_type": "markdown", "id": "15284dbe", "metadata": {}, "source": [ "## 逻辑结构与物理结构 " ] }, { "cell_type": "markdown", "id": "0fa40683", "metadata": {}, "source": [ "### 逻辑结构 " ] }, { "cell_type": "markdown", "id": "013f985a", "metadata": {}, "source": [ "逻辑结构:是指数据对象中数据元素之间的相互关系。分为:\n", "1. 集合结构\n", "2. 线性结构(一对一)\n", "3. 树形结构(一对多)\n", "4. 图形结构(多对多)" ] }, { "cell_type": "markdown", "id": "1b727f09", "metadata": {}, "source": [ "### 物理结构 " ] }, { "cell_type": "markdown", "id": "17050c68", "metadata": {}, "source": [ "物理结构(存储结构):是指数据的逻辑结构在计算机中的存储形式。\n", "分为两种:\n", "1. 顺序存储结构:是把数据元素存放在地址连续的存储单元里,其数据间的逻辑关系和物理关系是一致的\n", "2. 链式存储结构:是把数据元素存放在任意的存储单元里,这组存储单元可以是连续的,也可以是不连续的" ] }, { "cell_type": "markdown", "id": "ac65f812", "metadata": {}, "source": [ "## 抽象数据类型 " ] }, { "cell_type": "markdown", "id": "eed6e8da", "metadata": {}, "source": [ "### 数据类型 " ] }, { "cell_type": "markdown", "id": "251fa9f7", "metadata": {}, "source": [ "数据类型:是指一组性质相同的值的集合及定义在此集合上的一些操作的总称。在C语言中,按照取值的不同,可以分为两类:\n", "1. 原子类型:是不可以再分解的基本类型,包括整型、实型、字符型等。\n", "2. 结构类型:自若干个类型组合而戚,是可以再分解的。例如,整型数组是由若干整型数据组成的。" ] }, { "cell_type": "markdown", "id": "7c80b948", "metadata": {}, "source": [ "### 抽象数据类型 (ADT)" ] }, { "cell_type": "markdown", "id": "36aa3d57", "metadata": {}, "source": [ "```c\n", "ADT 抽象数据类型名\n", "DATA\n", " 数据关系之间逻辑关系的定义(Structure)\n", "Operation\n", " 操作1\n", " 初始条件\n", " 操作结果描述\n", " 操作2\n", " ...\n", "endADT\n", "```" ] }, { "cell_type": "markdown", "id": "5902e0f3", "metadata": {}, "source": [ "# 算法" ] }, { "cell_type": "markdown", "id": "1f567441", "metadata": {}, "source": [ "算法是解决特定问题求解步骤的描述,在计算机中表现为指令的有限序列,并且每条指令表示一个或多个操作 。" ] }, { "cell_type": "markdown", "id": "af1feeac", "metadata": {}, "source": [ "## 算法的特性" ] }, { "cell_type": "markdown", "id": "98f9549b", "metadata": {}, "source": [ "输入、输出、有穷性、确定性、可行性" ] }, { "cell_type": "markdown", "id": "acb2368d", "metadata": {}, "source": [ "## 算法设计的要求 " ] }, { "cell_type": "markdown", "id": "ee00d1ef", "metadata": {}, "source": [ "1. 正确性:\n", " 1. 算法程序没有语法错误\n", " 2. 算法程序对于合法的输入数据能够产生满足要求的输出结果。\n", " 3. 算法程序对于非法的输入数据能够得出满足规格说明的结果。\n", " 4. 算法程序对于精心选择的,甚至刁难的测试数据都有满足要求的输出结果。\n", "2. 可读性\n", "3. 健壮性:当输入数据不合法时,算法也能做出相关处理。\n", "4. 时间效率高和存储量低" ] }, { "cell_type": "markdown", "id": "8c729411", "metadata": {}, "source": [ "## 算法效率的度量方法 " ] }, { "cell_type": "markdown", "id": "fd43a056", "metadata": {}, "source": [ "**事前分析估算方法**:在计算机程序编制前,依据统计方法对算法进行估算 \n", "\n", "测定运行时间最可靠的方法:就是计算对运行时间有消耗的基本操作的执行次数,运行时间与这个计数成正比" ] }, { "cell_type": "markdown", "id": "458b8f1e", "metadata": {}, "source": [ "### 函数的渐近增长 " ] }, { "cell_type": "markdown", "id": "08ff3d1b", "metadata": {}, "source": [ "函数的渐近增长:给定两个函数f(n)和g(n),如果存在一个整数N,使得对于所有的n>N,f(n)总是比g(n)大,那么,我们说f(n)的增长渐近快于g(n)" ] }, { "cell_type": "markdown", "id": "560bf958", "metadata": {}, "source": [ "判断一个算法的效率时,函数中的常数和其他次要项常常可以忽略,而更应该关注**主项(最高阶项)的阶数。**" ] }, { "cell_type": "markdown", "id": "29a23db8", "metadata": {}, "source": [ "### 算法时间复杂度 " ] }, { "cell_type": "markdown", "id": "ac811d16", "metadata": {}, "source": [ "在进行算法分析时,语句总的执行次数T(n)是关于问题规模n的函数,进而分析T(n)随n的变化情况并确定T(n)的数量级。算法的时间复杂度,也就是算法的时间量度,记作:**T(n)=O(f(n))**。它表示随问题规模n的增大,算法执行时间的增长率和f(n)的增长率相同,称作算法的渐近时间复杂度,简称为时间复杂度。其中f(n)是问题规模n的某个函数。" ] }, { "cell_type": "markdown", "id": "020317ba", "metadata": {}, "source": [ "**推导大O阶**: \n", "1. 用常数1取代运行时间中的所有加法常数。\n", "2. 在修改后的运行次数函数中,只保留最高阶项。\n", "3. 如果最高阶项存在且不是1,则去除与这个项相乘的常数。\n", "得到的结果就是大O阶。" ] }, { "cell_type": "markdown", "id": "d0e326bb", "metadata": {}, "source": [ "#### 常数阶 " ] }, { "cell_type": "markdown", "id": "4994d8de", "metadata": {}, "source": [ "这种与问题的大小无关(n的多少),执行时间恒定的算法,我们称之为具有O(1)的时间复杂度,又叫常数阶。 \n", "注意:不管这个常数是多少,我们都记作O(1),而不能是O(3)、O(12)等其他任何数字。 \n", "对于分支结构而言,无论是真,还是假,执行的次数都是恒定的,不会随着n的变大而发生变化,所以单纯的分支结构(不包含在循环结构中),其时间复杂度也是O(1)。" ] }, { "cell_type": "markdown", "id": "92d6f17e", "metadata": {}, "source": [ "#### 线性阶" ] }, { "cell_type": "markdown", "id": "a726ed26", "metadata": {}, "source": [ "#### 对数阶 " ] }, { "cell_type": "markdown", "id": "011fb18e", "metadata": {}, "source": [ "```c\n", "int count=1;\n", "while (count" ] }, { "cell_type": "markdown", "id": "6c4772fc", "metadata": {}, "source": [ "### 算法空间复杂度 " ] }, { "cell_type": "markdown", "id": "c1216b10", "metadata": {}, "source": [ "算法的空间复杂度通过计算算法所需的存储空间实现,算法空间复杂度的计算公式记作:$S(n)=O(f(n))$,其中,n为问题的规模,f(n)为语句关于n所占存储空间的函数" ] }, { "cell_type": "markdown", "id": "45c7a68c", "metadata": {}, "source": [ "# 线性表" ] }, { "cell_type": "markdown", "id": "f5535ad1", "metadata": {}, "source": [ "## 线性表的顺序存储结构 " ] }, { "cell_type": "code", "execution_count": 26, "id": "8c3d1482", "metadata": { "ExecuteTime": { "end_time": "2021-09-26T02:24:18.997453Z", "start_time": "2021-09-26T02:24:18.951173Z" } }, "outputs": [], "source": [ "#include \n", "#include \n", "#define MAXLENGTH 20\n", "\n", "typedef struct\n", "{\n", " int *data;\n", " int length;\n", "}Sqlist;\n", "\n", "/*不准许在结构内赋初值,因为它本身是一个结构,而不是一个数据.\n", "**如果被赋初值表示它是一个数值,所以要是想赋值需要在结构外赋值.\n", "**因为申请内存空间的时候它不具备初始化的能力,只是将某部分的内存交给一个地址,所以不可以在结构内赋初值\n", "*/\n", "\n", "int main()\n", "{\n", " void sqlist_print(Sqlist *sq);\n", " void init_list(Sqlist *sq);\n", " Sqlist *sq;\n", " \n", " sq=(Sqlist*)malloc(sizeof(Sqlist));//要先为该结构体指针分配内存,否则即为空指针\n", " sq->data=(int*)malloc(sizeof(int)*MAXLENGTH);\n", " init_list(sq);\n", " sqlist_print(sq);\n", "}\n", "\n", "void sqlist_print(Sqlist *sq)//输出\n", "{\n", " for (int i=0;ilength;i++)\n", " {\n", " printf(\"%d \",*((sq->data)+i));\n", " }\n", "}\n", "\n", "void init_list(Sqlist *sq)\n", "{\n", " int i=0;\n", " while (scanf(\"%d\",sq->data+i)!=0)\n", " {\n", " i+=1;\n", " sq->length+=1;\n", " if (sq->length >=MAXLENGTH-1)\n", " {\n", " break;\n", " }\n", " }\n", "}" ] }, { "cell_type": "markdown", "id": "e3c8b555", "metadata": {}, "source": [ "### 线性表顺序存储结构的优缺点 " ] }, { "cell_type": "markdown", "id": "45c0046f", "metadata": {}, "source": [ "优点:\n", "1. 无须为表中元素之间的逻辑关系而增加额外的存储空间\n", "2. 可以快速的存取表中任意位置的元素\n", "\n", "缺点:\n", "1. 插入和删除操作时间复杂度高\n", "2. 当线性表长度变化较大时,难以确定存储空间的容量\n", "3. 可能会造成存储空间的碎片\n", "\n", "内存碎片通常分为内部碎片和外部碎片:\n", "\n", "1. 内部碎片是由于采用固定大小的内存分区,当一个进程不能完全使用分给它的固定内存区域时就产生了内部碎片,通常内部碎片难以完全避免;\n", "2. 外部碎片是由于某些未分配的连续内存区域太小,以至于不能满足任意进程的内存分配请求,从而不能被进程利用的内存区域。\n", "\n", " 现在普遍采用的段页式内存分配方式就是将进程的内存区域分为不同的段,然后将每一段由多个固定大小的页组成。通过页表机制,使段内的页可以不必连续处于同一内存区域,从而减少了外部碎片,然而同一页内仍然可能存在少量的内部碎片,只是一页的内存空间本就较小,从而使可能存在的内部碎片也较少。https://blog.csdn.net/csdn_kou/article/details/82891141" ] }, { "cell_type": "markdown", "id": "23207a12", "metadata": {}, "source": [ "## 线性表的链式存储结构 " ] }, { "cell_type": "markdown", "id": "ec631fe2", "metadata": {}, "source": [ "### 单链表 " ] }, { "cell_type": "markdown", "id": "99bc1f50", "metadata": {}, "source": [ "```c\n", "typedef struct node\n", "{\n", " struct node* next;\n", " int value;\n", "}Linklist;\n", "Linklist *p=(Linklist*)malloc(sizeof(Linklist));\n", "...\n", "```" ] }, { "cell_type": "markdown", "id": "2d38ac96", "metadata": {}, "source": [ "单链表结构与顺序存储结构的优缺点比较" ] }, { "cell_type": "markdown", "id": "4c86544d", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "bb26a0c4", "metadata": {}, "source": [ "### 静态链表 " ] }, { "cell_type": "markdown", "id": "0e0b5d70", "metadata": {}, "source": [ "```c\n", "typedef struct\n", "{\n", " Elemtype data;\n", " int cur;\n", "}StaticLinklist[MAXSIZE];\n", "```" ] }, { "cell_type": "markdown", "id": "1242a651", "metadata": {}, "source": [ "优势:在插入和删除操作时,只需要修改游标,不需要移动元素,从而改进了在顺序存储结构中的插入和删除操作需要移动大量元素的缺点\n", "\n", "缺陷:1.没有解决连续存储分配带来的表长难以确定的问题 2.失去了顺序存储结构随机存取的特性" ] }, { "cell_type": "markdown", "id": "3ce696c6", "metadata": {}, "source": [ "### 循环链表 " ] }, { "cell_type": "markdown", "id": "cb8695b0", "metadata": {}, "source": [ "将单链表中终端结点的指针端由空指针改为指向头结点,就使整个单链表形成个环,这种头尾相接的单链表称为单循环链表,简称**循环链表**(circular linked)" ] }, { "cell_type": "markdown", "id": "8842c8c6", "metadata": {}, "source": [ "### 双向链表 " ] }, { "cell_type": "markdown", "id": "dd9f4ed1", "metadata": {}, "source": [ "为了克服单向性这一缺点,我们的老科学家们,设计出了双向链表。**双向链表(double linked list)**是在单链表的每个结点中,再设置一个指向其前驱结点的指针域。所以在双向链表中的结点都有两个指针域,一个指向直接后继,另一个指向直接前驱。" ] }, { "cell_type": "markdown", "id": "b13eac22", "metadata": {}, "source": [ "# 栈与队列" ] }, { "cell_type": "markdown", "id": "14ecb71d", "metadata": {}, "source": [ "## 栈 " ] }, { "cell_type": "markdown", "id": "bc5a8b8f", "metadata": {}, "source": [ "**栈(stack)** 是限定仅在表尾进行插入和删除操作的线性表。 \n", "我们把允许插入和删除的一端称为**栈顶(top)**,另一端称为**栈底( bottom)**,不含任何数据元素的栈称为空栈。栈又称为**后进先出**(Last In First out)的**线性表**,简称LIFO结构。 \n", "栈的插入操作叫做**进栈(压栈、入栈)**;栈的删除操作叫做**出栈(弹栈)** \n", "栈的出栈顺序规律:后面比他小的数一定是递减的" ] }, { "cell_type": "markdown", "id": "396fbd37", "metadata": {}, "source": [ "### 顺序栈 " ] }, { "cell_type": "markdown", "id": "e874c856", "metadata": {}, "source": [ "### 链栈 " ] }, { "cell_type": "markdown", "id": "612bf7f6", "metadata": {}, "source": [ "## 队列 " ] }, { "cell_type": "markdown", "id": "c45aee4b", "metadata": {}, "source": [ "**队列(queue)**是只允许在一端进行插入操作,而在另一端进行删除操作的线性表。" ] }, { "cell_type": "markdown", "id": "b4c0ab9c", "metadata": {}, "source": [ "### 循环队列 " ] }, { "cell_type": "markdown", "id": "94177d74", "metadata": {}, "source": [ "### 链队列 " ] }, { "cell_type": "markdown", "id": "f911a211", "metadata": {}, "source": [ "# 串 " ] }, { "cell_type": "markdown", "id": "52a7f3b9", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "d25752c6", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "83484836", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "C", "language": "c", "name": "c" }, "language_info": { "file_extension": ".c", "mimetype": "text/plain", "name": "c" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "165px" }, "toc_section_display": true, "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }