二叉树——堆详解

前言：

一、堆的结构

二、向上调整和向下调整

2.1 向上调整

2.2 向下调整

2.3 向上调整和向下调整时间复杂度比较

三、堆的实现

3.1 堆的初始化

3.2 堆的销毁

3.3 堆的插入

3.4堆的删除

3.5 取堆顶元素

3.6 对堆判空

四、堆排序

五、TOP-K 问题

六、代码总览

heap.h

heap.c

test.c(推排序和TOP-K问题）

最后

前言：

之前我们已经学习过了二叉树的基本知识，接下来我们就要上些“硬菜”了，话不多说，开始我们今天的学习吧！

一、堆的结构

前文我们已经交待了什么为堆？以及堆的分类。我们既然已经对上文的基础知识已经明白，那么我们现在就来探讨以下如何实现一个堆。

我们想，数据在内存中的存放是不是连续的，不以人的意志为改变，对吧。堆，是一个完全二叉树，在内存的存放不存在中间为空的情况。所以，咱们可以以数组的方式来存储堆。以下，是堆的结构：

struct Heap
{
	HPDataType* a;
	int size;
	int capacity;
}；

看起来和我们之前的顺序表结构有些类似。虽然，我们知道了其结构，但是我们不知道如何存储呀！要是随便存储，那和普通数组，普通二叉树有何区别。为了能够实现堆，我们将采取以下的方法。

二、向上调整和向下调整

2.1 向上调整

什么是向上调整呢？对于一个二叉树，我们知道了孩子节点，我们是不是可以反推出父节点。经行比较，小的成为父亲（建小堆），大的成为孩子，这样进行一一比较，小堆是不是就建好了。

向上调整就是：谁小谁当爹，谁大谁儿子，就这么简单。代码实现如下：

void AdjustUp(HPDataType* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0)
	{
		if (a[parent] > a[child])
		{
			sweap(&a[parent], &a[child]);//注意：这里一定要传地址，&不要忘记
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}

相信大家的sweap函数都能够独立实现，这里就不在展示，要是无法实现可参考文末处代码。

2.2 向下调整

向下调整是基于一个建好的堆来经行实现的，为啥？向上调整从子节点调，那向下调整是不是应该从父节点开始调。要是这个堆一会是大堆，一会是小堆，那还能调吗？肯定是不行了呀。所以：向下调整健堆的前提是：左右子树必须是一个堆，才能调整。

其实现思路和向上调整类似，这里直接来看代码：

void AdjustDown(HPDataType* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (a[child] > a[child+1])//此步的目的是找到最小的
		{
			child++;
		}
		if (a[parent] > a[child])
		{
			sweap(&a[parent], &a[child]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

有了向上调整，对于向下调整大家应该可以理解。

2.3 向上调整和向下调整时间复杂度比较

我们说了这么多肯定要考虑其效率问题，我们对于时间复杂度的关注度是一如既往的深。大家会有这样的感觉：这两个时间复杂度应该一样吧，都是O（N*logN）。大家先别急，我给大家推一遍就明白了：

它们的时间复杂度为什么会有差别呢？很简单，向下调整建堆是用小成大，大乘小。向上调整建堆是用小乘小，大乘大。所以会有这样的差距。从效率上看，向下调整建堆的效率大于向上调整建堆。在后文推排序中所用的方法为：向下调整。

三、堆的实现

既然我们的方法都已具备，那我们现在就拿下堆吧！

3.1 堆的初始化

void HeapInit(Heap* php)
{
	assert(php);
	php->a = NULL;
	php->capacity = php->size = 0;
}

3.2 堆的销毁

void HeapDestory(Heap* hp)
{
	assert(hp);
	free(hp->a);
	hp->a = NULL;
	hp->capacity = hp->size = 0;
}

3.3 堆的插入

void sweap(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}
void AdjustUp(HPDataType* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0)
	{
		if (a[parent] > a[child])
		{
			sweap(&a[parent], &a[child]);//注意：这里一定要传地址，&不要忘记
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}
void HeapPush(Heap* hp, HPDataType x)
{
	assert(hp);
	if (hp->capacity == hp->size)
	{
		int newcapacity = hp->capacity == 0 ? 4 : hp->capacity * 2;
		HPDataType* newnode = (HPDataType*)realloc(hp->a,newcapacity * sizeof(HPDataType));
		if (hp->a == NULL)
		{
			perror("realloc fail");
			return;
		}
		hp->a = newnode;
		hp->capacity = newcapacity;
	}
	hp->a[hp->size] = x;
	hp->size++;
	//也可写成：hp->a[hp->size++] = x;
	AdjustUp(hp->a, hp->size - 1);//向上调整
}

3.4堆的删除

void AdjustDown(HPDataType* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (a[child] > a[child+1])//此步的目的是找到最小的
		{
			child++;
		}
		if (a[parent] > a[child])
		{
			sweap(&a[parent], &a[child]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}
void HeapPop(Heap* hp)
{
	assert(hp);
	assert(hp->size > 0);
	sweap(&hp->a[0], &hp->a[hp->size - 1]);//删除顶元素
	hp->size--;
	AdjustDown(hp->a, hp->size, 0);//向下调整
}

堆的删除不是删除最后一个元素，而是删除首元素！！！

3.5 取堆顶元素

HPDataType HeapTop(Heap* hp)
{
	assert(hp);
	assert(hp->size > 0);
	return hp->a[0];
}

3.6 对堆判空

bool HeapEmpty(Heap* hp)
{
	assert(hp);
	return hp->size == 0;
}

好了，以上便是对堆的全部实现，大家稍微休息片刻，接下来咱们进入堆排序。

四、堆排序

接下来，咱们来搞堆排序。对于堆排序，大家有没有什么想法：升序是建大堆还是小堆，降序是建小堆还是大堆。对于这个问题，有人肯定会有这样的想法：降序，那不是非大堆莫属，升序，那不是非小堆莫属。非也。

那么，有什么方法呢？答案为：降序：小堆；升序：大堆。

就以降序为例：排降序咱们首先先把根（也就是祖先）扔到最后，之后不在理会它，推选出下一个祖先，重复即可。

以上，便为实现思路。

这只是一个大致方向，我们拿这个大致方向去实现肯定会困难重重。那我们面临最大的困难是什么呢？答案为：向下调整的使用条件。向下调整使用的前提是什么？左右子树必须是一个堆，才能调整。这个问题说起来不容易，实际做起来也不容易。那么，我们该如何完成呢？如下：

1.我们想找到最后一个节点的父节点。

2.将这个小堆进行向下调整。

3.然后，对其它节点也使用其类似方法。

这个方法是不是特别巧妙，以“农村包围城市”的思想，悄无声息就完成了这件大事。妙哉！

接下来便是推排序代码实现：

void HeapSort(int* a, int n)
{
	for (int i = (n - 1 - 1) / 2; i >= 0; i--)
	{
		AdjustDown(a, n, i);//见小堆
	}
	int end = n - 1;
	while (end)
	{
		sweap(&a[0], &a[end]);
		AdjustDown(a, end, 0);//经行排序
		end--;
	}
}

五、TOP-K 问题

什么为TOP-K问题呢？以下为大家构建一个场景：

你在大学里苦学两年，计划在大二的暑假的暑假找一份实习。和你预期一样，一家公司向你发起了面试，对于前面的计算机知识你对答入流，当你以为你能够“黄袍加身”之时，面试官问出了最后一题，我给你一个文件，里面有十万个数据，给我选出最大的十个出来。你当时在心里想：就这。你立马动态开辟了这么大的空间，运用堆排序，立马拿到了结果。面试官喝了口茶，说了句：“太大了。”你以为说的是茶太烫了，想着要不要做点什么的时候，然后面试官缓缓地说：“如果我给你1MB的空间，能不能搞出来，1KB的空间，行不行？”你不由得在心中想：你就是想为难我，当没想到，我重生了（一键三连即可听故事），这个问题早已被我攻克，我可是要站在顶尖的人。于是，你大笔一挥写出的代码震惊到面试官，于是如愿成为实习生。

本问题已经明确，在很大的数据面前，用极小的内存得到想要的结果。

具体实现思路如下：在十万数据中得到最大的十个数。我们可先取出十个数，建小堆，和其他数字比，如果有数字比顶元素大，便替换掉顶元素，进入堆，在向下调整，直到对比完。

代码实现如下：

void CreateNDate()
{
	// 造数据
	int n = 10000;
	srand(time(0));
	const char* file = "data.txt";
	FILE* fin = fopen(file, "w");
	if (fin == NULL)
	{
		perror("fopen error");
		return;
	}

	for (size_t i = 0; i < n; ++i)
	{
		int x = rand() % 1000000;
		fprintf(fin, "%d\n", x);
	}

	fclose(fin);
}

void PrintTopK(int k)
{
	int k;
	printf("请输入k>:");
	scanf("%d", &k);
	int* kminheap = (int*)malloc(sizeof(int) * k);
	if (kminheap == NULL)
	{
		perror("malloc fail");
		return;
	}
	const char* file = "data.txt";
	FILE* fout = fopen(file, "r");
	if (fout == NULL)
	{
		perror("fopen error");
		return;
	}

	// 读取文件中前k个数
	for (int i = 0; i < k; i++)
	{
		fscanf(fout, "%d", &kminheap[i]);
	}
	// 建K个数的小堆
	for (int i = (k - 1 - 1) / 2; i >= 0; i--)
	{
		AdjustDown(kminheap, k, i);
	}

	// 读取剩下的N-K个数
	int x = 0;
	while (fscanf(fout, "%d", &x) > 0)
	{
		if (x > kminheap[0])
		{
			kminheap[0] = x;
			AdjustDown(kminheap, k, 0);
		}
	}

	printf("最大前%d个数：", k);
	for (int i = 0; i < k; i++)
	{
		printf("%d ", kminheap[i]);
	}
	printf("\n");
}

六、代码总览

heap.h

#pragma once
#include<stdio.h>
#include<stdlib.h>
#include<stdbool.h>
#include<assert.h>

typedef int HPDataType;
typedef struct Heap
{
	HPDataType* a;
	int size;
	int capacity;
}Heap;

void HeapInit(Heap* php);
// 堆的销毁
void HeapDestory(Heap* hp);
// 堆的插入
void HeapPush(Heap* hp, HPDataType x);
// 堆的删除
void HeapPop(Heap* hp);
// 取堆顶的数据
HPDataType HeapTop(Heap* hp);
// 堆的数据个数
int HeapSize(Heap* hp);
// 堆的判空
bool HeapEmpty(Heap* hp);
//向下调整建堆
void AdjustDown(HPDataType* a, int n, int parent);
void sweap(int* p1, int* p2);

heap.c

#include"heap.h"

void HeapInit(Heap* php)
{
	assert(php);
	php->a = NULL;
	php->capacity = php->size = 0;
}
// 堆的销毁
void HeapDestory(Heap* hp)
{
	assert(hp);
	free(hp->a);
	hp->a = NULL;
	hp->capacity = hp->size = 0;
}
// 堆的插入
void sweap(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}
void AdjustUp(HPDataType* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0)
	{
		if (a[parent] > a[child])
		{
			sweap(&a[parent], &a[child]);//注意：这里一定要传地址，&不要忘记
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}
void HeapPush(Heap* hp, HPDataType x)
{
	assert(hp);
	if (hp->capacity == hp->size)
	{
		int newcapacity = hp->capacity == 0 ? 4 : hp->capacity * 2;
		HPDataType* newnode = (HPDataType*)realloc(hp->a,newcapacity * sizeof(HPDataType));
		if (hp->a == NULL)
		{
			perror("realloc fail");
			return;
		}
		hp->a = newnode;
		hp->capacity = newcapacity;
	}
	hp->a[hp->size] = x;
	hp->size++;
	//也可写成：hp->a[hp->size++] = x;
	AdjustUp(hp->a, hp->size - 1);//向上调整
}
// 堆的删除
void AdjustDown(HPDataType* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (a[child] > a[child+1])//此步的目的是找到最小的
		{
			child++;
		}
		if (a[parent] > a[child])
		{
			sweap(&a[parent], &a[child]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}
void HeapPop(Heap* hp)
{
	assert(hp);
	assert(hp->size > 0);
	sweap(&hp->a[0], &hp->a[hp->size - 1]);//删除顶元素
	hp->size--;
	AdjustDown(hp->a, hp->size, 0);//向下调整
}
// 取堆顶的数据
HPDataType HeapTop(Heap* hp)
{
	assert(hp);
	assert(hp->size > 0);
	return hp->a[0];
}
// 堆的数据个数
int HeapSize(Heap* hp)
{
	assert(hp);
	return hp->size;
}
// 堆的判空
bool HeapEmpty(Heap* hp)
{
	assert(hp);
	return hp->size == 0;
}

test.c(推排序和TOP-K问题）

#include"heap.h"


void HeapSort(int* a, int n)
{
	for (int i = (n - 1 - 1) / 2; i >= 0; i--)
	{
		AdjustDown(a, n, i);
	}
	int end = n - 1;
	while (end)
	{
		sweap(&a[0], &a[end]);
		AdjustDown(a, end, 0);
		end--;
	}
}
void CreateNDate()
{
	// 造数据
	int n = 10000;
	srand(time(0));
	const char* file = "data.txt";
	FILE* fin = fopen(file, "w");
	if (fin == NULL)
	{
		perror("fopen error");
		return;
	}

	for (size_t i = 0; i < n; ++i)
	{
		int x = rand() % 1000000;
		fprintf(fin, "%d\n", x);
	}

	fclose(fin);
}

void PrintTopK(int k)
{
	int k;
	printf("请输入k>:");
	scanf("%d", &k);
	int* kminheap = (int*)malloc(sizeof(int) * k);
	if (kminheap == NULL)
	{
		perror("malloc fail");
		return;
	}
	const char* file = "data.txt";
	FILE* fout = fopen(file, "r");
	if (fout == NULL)
	{
		perror("fopen error");
		return;
	}

	// 读取文件中前k个数
	for (int i = 0; i < k; i++)
	{
		fscanf(fout, "%d", &kminheap[i]);
	}
	// 建K个数的小堆
	for (int i = (k - 1 - 1) / 2; i >= 0; i--)
	{
		AdjustDown(kminheap, k, i);
	}

	// 读取剩下的N-K个数
	int x = 0;
	while (fscanf(fout, "%d", &x) > 0)
	{
		if (x > kminheap[0])
		{
			kminheap[0] = x;
			AdjustDown(kminheap, k, 0);
		}
	}

	printf("最大前%d个数：", k);
	for (int i = 0; i < k; i++)
	{
		printf("%d ", kminheap[i]);
	}
	printf("\n");
}