【优选算法】滑动窗口——leetcode——串联所有单词的⼦串（hard）

1.题目

2，算法原理

编辑 1.哈希表

2.left和right指针的移动

3.滑动窗口的执行次数

3.代码实现

1.C++代码

2.C语言代码

4.C++知识点

1. std::vector

2. std::string

3. std::unordered_map

4. 迭代器

5. 范围循环 (range-based for loop)

6. 动态内存管理

7. 面向对象编程（OOP）

总结

专栏：优选算法

1.题目

30. 串联所有单词的子串

给定一个字符串 s 和一个字符串数组 words。 words 中所有字符串 长度相同。

s 中的 串联子串 是指一个包含 words 中所有字符串以任意顺序排列连接起来的子串。

例如，如果 words = ["ab","cd","ef"]，那么 "abcdef"， "abefcd"，"cdabef"， "cdefab"，"efabcd"，和 "efcdab" 都是串联子串。 "acdbef" 不是串联子串，因为他不是任何 words 排列的连接。

返回所有串联子串在 s 中的开始索引。你可以以 任意顺序 返回答案。

示例 1：

输入：s = "barfoothefoobarman", words = ["foo","bar"]
输出：[0,9]
解释：因为 words.length == 2 同时 words[i].length == 3，连接的子字符串的长度必须为 6。
子串 "barfoo" 开始位置是 0。它是 words 中以 ["bar","foo"] 顺序排列的连接。
子串 "foobar" 开始位置是 9。它是 words 中以 ["foo","bar"] 顺序排列的连接。
输出顺序无关紧要。返回 [9,0] 也是可以的。

示例 2：

输入：s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
输出：[]
解释：因为 words.length == 4 并且 words[i].length == 4，所以串联子串的长度必须为 16。
s 中没有子串长度为 16 并且等于 words 的任何顺序排列的连接。
所以我们返回一个空数组。

示例 3：

输入：s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
输出：[6,9,12]
解释：因为 words.length == 3 并且 words[i].length == 3，所以串联子串的长度必须为 9。
子串 "foobarthe" 开始位置是 6。它是 words 中以 ["foo","bar","the"] 顺序排列的连接。
子串 "barthefoo" 开始位置是 9。它是 words 中以 ["bar","the","foo"] 顺序排列的连接。
子串 "thefoobar" 开始位置是 12。它是 words 中以 ["the","foo","bar"] 顺序排列的连接。

提示：

1 <= s.length <= 104
1 <= words.length <= 5000
1 <= words[i].length <= 30
words[i] 和 s 由小写英文字母组成

2，算法原理

类比438.找到字符串中所有字母异位词

如果我们把每⼀个单词看成⼀个⼀个字⺟，问题就变成了找到「字符串中所有的字⺟异位词」。⽆
⾮就是之前处理的对象是⼀个⼀个的字符，我们这⾥处理的对象是⼀个⼀个的单词。

具体思路见：【优选算法】滑动窗口——leetcode——438.找到字符串中所有字母异位词-CSDN博客

1.哈希表

hash<string,int>

string——>一个一个的字符

int——>这个字符串出现的次数

2.left和right指针的移动

移动的步长是每个单词的长度——len

3.滑动窗口的执行次数

len次

3.代码实现

1.C++代码

时间复杂度 O(n)

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        vector<int> ret;
        unordered_map<string, int> hash1; // 保存 words ⾥⾯所有单词的频次
        for (auto& s : words)
            hash1[s]++;
        int len = words[0].size(), m = words.size();
        for (int i = 0; i < len; i++) // 执⾏ len 次
        {
            unordered_map<string, int> hash2; // 维护窗⼝内单词的频次
            for (int left = i, right = i, count = 0; right + len <= s.size();
                 right += len) {
                // 进窗⼝ + 维护 count
                string in = s.substr(right, len);
                hash2[in]++;
                if (hash1.count(in) && hash2[in] <= hash1[in])
                    count++;
                // 判断
                if (right - left + 1 > len * m) {
                    // 出窗⼝ + 维护 count
                    string out = s.substr(left, len);
                    if (hash1.count(out) && hash2[out] <= hash1[out])
                        count--;
                    hash2[out]--;
                    left += len;
                }
                // 更新结果
                if (count == m)
                    ret.push_back(left);
            }
        }
        return ret;
    }
};

        int len = words[0].size(), m = words.size();

len 是 words 中单词的长度，假设所有单词长度相同。m 是 words 中单词的数量。
words[0].size() 取得第一个单词的长度，words.size() 取得单词的数量。

            unordered_map<string, int> hash2; // 维护窗口内单词的频次

这里又定义了一个 unordered_map<string, int> hash2，用于记录当前窗口内的单词频次。

                string in = s.substr(right, len);

s.substr(right, len) 提取从 right 开始的 len 长度的子串，存储在 in 中。这是当前窗口中新进入的单词。

                if (right - left + 1 > len * m) {
                    string out = s.substr(left, len);
                    if (hash1.count(out) && hash2[out] <= hash1[out])
                        count--;
                    hash2[out]--;
                    left += len;
                }

right - left + 1 > len * m 判断窗口大小是否超过 words 中所有单词长度的总和。如果超过，需要缩小窗口。
s.substr(left, len) 提取窗口左端的单词，存储在 out 中。
如果 out 在 words 中且 hash2[out] <= hash1[out]，说明 count 中的一个有效匹配即将被移除，所以 count--。
hash2[out]-- 从 hash2 中移除 out，减少其频次。
left += len 将窗口左边界右移一个单词的长度。

2.C语言代码

typedef struct {
    char* word;
    int count;
} WordCount;

int* findSubstring(char* s, char** words, int wordsSize, int* returnSize) {
    int sLen = strlen(s);
    int wordLen = strlen(words[0]);
    int windowLen = wordLen * wordsSize;

    WordCount* hash1 = (WordCount*)malloc(wordsSize * sizeof(WordCount));
    for (int i = 0; i < wordsSize; i++) {
        hash1[i].word = words[i];
        hash1[i].count = 0;
    }

    // 计算words数组中每个单词的频次
    for (int i = 0; i < wordsSize; i++) {
        int found = 0;
        for (int j = 0; j < i; j++) {
            if (strcmp(words[i], hash1[j].word) == 0) {
                hash1[j].count++;
                found = 1;
                break;
            }
        }
        if (!found) {
            hash1[i].count = 1;
        }
    }

    int* ret = (int*)malloc(sLen * sizeof(int));
    *returnSize = 0;

    for (int i = 0; i < wordLen; i++) {
        WordCount* hash2 = (WordCount*)calloc(wordsSize, sizeof(WordCount));
        for (int j = 0; j < wordsSize; j++) {
            hash2[j].word = hash1[j].word;
        }

        int left = i, right = i, count = 0;
        while (right + wordLen <= sLen) {
            char* in = (char*)malloc((wordLen + 1) * sizeof(char));
            strncpy(in, s + right, wordLen);
            in[wordLen] = '\0';

            int foundInHash1 = 0;
            for (int j = 0; j < wordsSize; j++) {
                if (strcmp(in, hash2[j].word) == 0) {
                    hash2[j].count++;
                    foundInHash1 = 1;
                    if (hash2[j].count <= hash1[j].count) {
                        count++;
                    }
                    break;
                }
            }

            while (right - left + 1 > windowLen) {
                char* out = (char*)malloc((wordLen + 1) * sizeof(char));
                strncpy(out, s + left, wordLen);
                out[wordLen] = '\0';

                for (int j = 0; j < wordsSize; j++) {
                    if (strcmp(out, hash2[j].word) == 0) {
                        if (hash2[j].count <= hash1[j].count) {
                            count--;
                        }
                        hash2[j].count--;
                        break;
                    }
                }

                free(out);
                left += wordLen;
            }

            if (count == wordsSize) {
                ret[*returnSize] = left;
                (*returnSize)++;
            }

            free(in);
            right += wordLen;
        }
        free(hash2);
    }

    free(hash1);
    ret = realloc(ret, (*returnSize) * sizeof(int)); // 重新调整大小
    return ret;
}

4.C++知识点

1. `std::vector`

定义：std::vector是C++标准模板库（STL）中的动态数组容器，提供了动态调整大小的功能。
特点：
- 动态大小：可以根据需求自动调整大小。
- 随机访问：支持高效的随机访问，可以通过索引直接访问任意元素。
- 自动内存管理：自动管理内存的分配和释放。
常用函数：
- push_back(value): 在末尾添加一个元素。
- pop_back(): 删除末尾的元素。
- size(): 返回当前元素的个数。
- operator[]: 通过索引访问元素。

std::vector 是一个动态数组，提供了可以动态调整大小的数组实现。以下是如何声明、初始化和操作std::vector的示例：

#include <vector>
#include <iostream>

int main() {
    // 创建一个空的int类型vector
    std::vector<int> numbers;

    // 添加元素
    numbers.push_back(1);
    numbers.push_back(2);
    numbers.push_back(3);

    // 访问元素
    std::cout << "First element: " << numbers[0] << std::endl;

    // 迭代访问元素
    std::cout << "All elements:";
    for (size_t i = 0; i < numbers.size(); ++i) {
        std::cout << " " << numbers[i];
    }
    std::cout << std::endl;

    return 0;
}

用法解释：

push_back: 在vector的末尾添加元素。
size(): 返回vector中元素的数量。
operator[]: 通过索引访问vector中的元素。

2. `std::string`

定义：std::string是C++标准库中的字符串类，用于处理字符序列。
特点：
- 动态大小：可以根据需求自动调整大小。
- 丰富的成员函数：提供了如查找、替换、获取子串等多种操作。
- 安全性：相比C风格字符串，std::string更安全，避免了缓冲区溢出问题。
常用函数：
- size(): 返回字符串的长度。
- substr(pos, len): 提取子字符串。
- find(sub): 查找子字符串的位置。

std::string 提供了一种处理字符串的方式，以下是std::string的基本用法：

#include <string>
#include <iostream>

int main() {
    // 创建字符串
    std::string greeting = "Hello, world!";

    // 获取字符串长度
    std::cout << "Length: " << greeting.size() << std::endl;

    // 获取子串
    std::string sub = greeting.substr(7, 5);
    std::cout << "Substring: " << sub << std::endl;

    // 拼接字符串
    std::string newGreeting = greeting + " Welcome!";
    std::cout << "New greeting: " << newGreeting << std::endl;

    return 0;
}

用法解释：

size(): 返回字符串的长度。
substr(pos, len): 返回从pos位置开始、长度为len的子串。
operator+: 拼接两个字符串。

3. `std::unordered_map`

定义：std::unordered_map是C++11标准引入的哈希表容器，用于存储键值对，支持快速查找。
特点：
- 无序存储：元素没有特定的顺序。
- 哈希表实现：利用哈希函数实现快速的插入、删除和查找操作。
- 复杂度：平均情况下，查找、插入、删除操作的时间复杂度为$O(1)$。
常用函数：
- operator[]: 通过键访问或插入元素。
- find(key): 查找键是否存在。
- count(key): 返回键的出现次数（0或1）。

std::unordered_map 提供了键值对的存储，并支持快速查找：

#include <unordered_map>
#include <iostream>

int main() {
    // 创建一个unorderd_map，键是string，值是int
    std::unordered_map<std::string, int> fruitCount;

    // 插入元素
    fruitCount["apple"] = 3;
    fruitCount["banana"] = 5;
    fruitCount["orange"] = 2;

    // 查找元素
    std::cout << "Number of apples: " << fruitCount["apple"] << std::endl;

    // 迭代访问元素
    for (const auto& pair : fruitCount) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }

    return 0;
}

用法解释：

operator[]: 通过键访问或插入元素。
迭代器：使用范围循环遍历unordered_map中的键值对。

4. 迭代器

定义：迭代器是一种对象，提供对容器元素的遍历功能。几乎所有STL容器都提供迭代器支持。
特点：
- 统一接口：提供统一的遍历容器元素的方式，无需关注容器的内部实现。
- 类型：包括输入迭代器、输出迭代器、前向迭代器、双向迭代器和随机访问迭代器。
常用函数：
- begin(): 返回指向容器第一个元素的迭代器。
- end(): 返回指向容器末尾后一个位置的迭代器。

迭代器用于遍历容器中的元素。以下是使用迭代器遍历std::vector的示例：

#include <vector>
#include <iostream>

int main() {
    std::vector<int> numbers = {1, 2, 3, 4, 5};

    // 使用迭代器遍历vector
    for (std::vector<int>::iterator it = numbers.begin(); it != numbers.end(); ++it) {
        std::cout << *it << " ";
    }
    std::cout << std::endl;

    return 0;
}

用法解释：

begin(): 返回指向容器第一个元素的迭代器。
end(): 返回指向容器末尾后一个位置的迭代器。
*it: 解引用迭代器，访问当前元素。

5. 范围循环 (`range-based for loop`)

定义：C++11引入的语法糖，简化了对容器的遍历。
特点：
- 简洁：无需显式管理迭代器。
- 类型自动推导：使用auto关键字自动推导元素类型。

范围循环提供了一种简洁的方式遍历容器中的元素：

#include <vector>
#include <iostream>

int main() {
    std::vector<int> numbers = {1, 2, 3, 4, 5};

    // 使用范围循环遍历vector
    for (int num : numbers) {
        std::cout << num << " ";
    }
    std::cout << std::endl;

    return 0;
}

用法解释：

for (int num : numbers): 直接访问numbers中的每个元素，num是元素的副本。

6. 动态内存管理

定义：C++允许程序在运行时动态分配和释放内存。
特点：
- 手动管理：需要手动分配和释放内存，避免内存泄漏。
- 相关操作：
  - new：分配内存。
  - delete：释放内存。
智能指针：C++11引入了智能指针如std::unique_ptr和std::shared_ptr，帮助自动管理内存。

C++允许使用new和delete进行动态内存管理，以下是一个基本示例：

#include <iostream>

int main() {
    // 动态分配一个int类型的内存空间
    int* p = new int;
    *p = 5;
    std::cout << "Value: " << *p << std::endl;

    // 释放内存
    delete p;

    // 动态分配一个int数组
    int* arr = new int[3];
    arr[0] = 1;
    arr[1] = 2;
    arr[2] = 3;

    // 打印数组内容
    for (int i = 0; i < 3; ++i) {
        std::cout << arr[i] << " ";
    }
    std::cout << std::endl;

    // 释放数组内存
    delete[] arr;

    return 0;
}

用法解释：

new: 动态分配内存。
delete: 释放内存。
delete[]: 释放动态分配的数组内存。

7. 面向对象编程（OOP）

定义：面向对象编程是一种编程范式，使用类和对象进行抽象和封装。
类：类是对对象的抽象描述，封装了数据和行为。
对象：对象是类的实例，通过类定义的结构创建。
访问修饰符：
- public: 公有成员，可以从类的外部访问。
- private: 私有成员，不能从类的外部访问。
- protected: 受保护成员，只能在类内部和派生类中访问。

C++支持面向对象编程，以下是一个简单的类定义和使用的示例：

#include <iostream>
#include <string>

class Dog {
private:
    std::string name;
    int age;

public:
    // 构造函数
    Dog(std::string n, int a) : name(n), age(a) {}

    // 成员函数
    void bark() {
        std::cout << "Woof! My name is " << name << " and I am " << age << " years old." << std::endl;
    }

    // 访问器函数
    std::string getName() const {
        return name;
    }

    int getAge() const {
        return age;
    }

    // 修改器函数
    void setName(std::string n) {
        name = n;
    }

    void setAge(int a) {
        age = a;
    }
};

int main() {
    // 创建对象
    Dog myDog("Buddy", 5);

    // 调用成员函数
    myDog.bark();

    // 修改属性
    myDog.setName("Charlie");
    myDog.setAge(6);
    myDog.bark();

    return 0;
}