Java -> Map and Set

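I. Search Trees

1. Concept

2. Search

3. Insertion

4. Deletion

II. Searching

1. Using Map

1.1 About Map

1.2 Common Map methods

2. Using Set

2.1 About Set

2.2 Common Set methods

III. Hash Tables

1. Concept

2. Hash collisions

3. Avoiding collisions

3.1 Hash function design

3.2 Adjusting the load factor

4. Resolving collisions

4.1 Closed hashing

4.2 Open hashing (hash buckets)

5. Implementing a hash bucket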

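I. Search Trees

1. Concept

A binary search tree, also called a binary sorted tree, is either an empty tree or a binary tree with the following properties:

1. If its left subtree is not empty, every node in the left subtree has a value less than the root's value.
2. If its right subtree is not empty, every node in the right subtree has a value greater than the root's value.
3. Its left and right subtrees are themselves binary search trees.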
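The three properties can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `BstCheck` class with its own small `Node` type (neither name comes from the original code). It verifies that every node lies strictly between the bounds inherited from its ancestors.

```java
class BstCheck {
    static class Node {
        int val;
        Node left, right;

        Node(int val, Node left, Node right) {
            this.val = val;
            this.left = left;
            this.right = right;
        }
    }

    // A node is valid if low < val < high and both subtrees are valid
    // with the bounds tightened by the node's own value.
    static boolean isBst(Node node, long low, long high) {
        if (node == null) {
            return true; // an empty tree is a binary search tree
        }
        if (node.val <= low || node.val >= high) {
            return false;
        }
        return isBst(node.left, low, node.val) && isBst(node.right, node.val, high);
    }

    static boolean isBst(Node root) {
        return isBst(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        Node valid = new Node(5, new Node(3, null, null), new Node(8, null, null));
        // 6 sits in the left subtree of 5, which violates property 1
        Node invalid = new Node(5, new Node(3, null, new Node(6, null, null)), new Node(8, null, null));
        System.out.println(isBst(valid));   // true
        System.out.println(isBst(invalid)); // false
    }
}
```

Note that comparing each node only with its direct children would wrongly accept the second tree, which is why the bounds are passed down.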

2. Search

public class BinarySearchTree {
    static class TreeNode {
        public int val;
        public TreeNode left;
        public TreeNode right;

        public TreeNode(int val) {
            this.val = val;
        }
    }
    public TreeNode root;
    
    // Average O(logN); worst case O(N) when the tree degenerates into a single branch
    public boolean search(int val) {
        TreeNode cur = root;
        while(cur!=null) {
            if(cur.val < val) {
                cur = cur.right;
            }else if(cur.val > val) {
                cur = cur.left;
            }else {
                return true;
            }
        }
        return false;
    }
}

3. Insertion

    // Average O(logN); worst case O(N) for a single-branch tree
    public void insert(int val) {
        TreeNode node = new TreeNode(val);
        if(root == null) {
            root = node;
            return;
        }
        TreeNode parent = null;
        TreeNode cur = root;
        while(cur!=null) {
            parent = cur;
            if(val < cur.val) {
                cur = cur.left;
            }else if(val > cur.val) {
                cur = cur.right;
            }else {
                return;
            }
        }
        if(val < parent.val) {
            parent.left = node;
        }else {
            parent.right = node;
        }
    }

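Insertion and search in a binary search tree take O(logN) in the best case (a balanced tree), but degrade to O(N) if the tree is a single branch.

AVL trees and red-black trees keep themselves balanced (AVL trees strictly by height, red-black trees approximately), so they guarantee O(logN).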
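The effect of insertion order on the height of a plain binary search tree can be seen with a small experiment. This is a sketch with a hypothetical `BstHeight` class (a recursive re-implementation, not the `BinarySearchTree` class above): the same seven keys produce a height of 7 when inserted in ascending order and a height of 3 when inserted in a balanced order.

```java
class BstHeight {
    static class Node {
        int val;
        Node left, right;

        Node(int val) {
            this.val = val;
        }
    }

    // Plain (unbalanced) BST insertion; returns the root of the updated subtree.
    static Node insert(Node root, int val) {
        if (root == null) {
            return new Node(val);
        }
        if (val < root.val) {
            root.left = insert(root.left, val);
        } else if (val > root.val) {
            root.right = insert(root.right, val);
        }
        return root; // duplicates are ignored
    }

    // Height counted in nodes: an empty tree has height 0.
    static int height(Node root) {
        return root == null ? 0 : 1 + Math.max(height(root.left), height(root.right));
    }

    static int heightAfterInserting(int... keys) {
        Node root = null;
        for (int key : keys) {
            root = insert(root, key);
        }
        return height(root);
    }

    public static void main(String[] args) {
        System.out.println(heightAfterInserting(1, 2, 3, 4, 5, 6, 7)); // 7 (single branch)
        System.out.println(heightAfterInserting(4, 2, 6, 1, 3, 5, 7)); // 3 (balanced)
    }
}
```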

4. Deletion

    public void remove(int val) {
        TreeNode cur = root;
        TreeNode parent = null;
        while(cur!=null) {
            if(cur.val < val) {
                parent = cur;
                cur = cur.right;
            }else if(cur.val > val) {
                parent = cur;
                cur = cur.left;
            }else {
                removeNode(parent,cur);
                return;
            }
        }
    }

    private void removeNode(TreeNode parent, TreeNode cur) {
        if(cur.left == null) {
            // Case 1: no left child, so replace cur with its right subtree
            if(cur == root) {
                root = cur.right;
            }else if(parent.left == cur) {
                parent.left = cur.right;
            }else {
                parent.right = cur.right;
            }
        }else if(cur.right == null) {
            // Case 2: no right child, so replace cur with its left subtree
            if(cur == root) {
                root = cur.left;
            }else if(parent.left == cur) {
                parent.left = cur.left;
            }else {
                parent.right = cur.left;
            }
        }else {
            // Case 3: two children. Overwrite cur with the maximum of its left subtree
            // (the minimum of its right subtree would work equally well), then unlink that node.
            TreeNode target = cur.left;
            TreeNode targetParent = cur; // start at cur so it is never null
            while(target.right != null) {
                targetParent = target;
                target = target.right;
            }
            cur.val = target.val;
            if(targetParent.right == target) {
                targetParent.right = target.left;
            }else {
                targetParent.left = target.left;
            }
        }
    }

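II. Searching

Map and Set are containers (data structures) designed specifically for searching; their search efficiency depends on the concrete implementing class that is instantiated.

A Map stores key-value pairs.

A Set stores keys only.

1. Using Map

1.1 About Map

Map is an interface that does not extend Collection. It stores key-value pairs of the form <K,V>, and each key K must be unique.

1.2 Common Map methods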

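Method	Description
V get(Object key)	Returns the value mapped to key, or null if key is absent
V getOrDefault(Object key, V defaultValue)	Returns the value mapped to key, or defaultValue if key is absent
V put(K key, V value)	Sets the value mapped to key
V remove(Object key)	Removes the mapping for key
Set<K> keySet()	Returns the set of all keys (no duplicates)
Collection<V> values()	Returns a collection of all values (duplicates possible)
Set<Map.Entry<K, V>> entrySet()	Returns all key-value mappings
boolean containsKey(Object key)	Checks whether key is present
boolean containsValue(Object value)	Checks whether value is present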
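A short sketch of several of these methods in use. The class name `MapMethodsDemo` and the helper `sample()` are made up for illustration; the expected results in the comments follow from the `java.util.Map` contract.

```java
import java.util.HashMap;
import java.util.Map;

class MapMethodsDemo {
    // Builds a small map used by the demonstration below.
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("abc", 2);
        map.put("hello", 13);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(map.get("abc"));                 // 2
        System.out.println(map.get("missing"));             // null
        System.out.println(map.getOrDefault("missing", 0)); // 0
        System.out.println(map.put("abc", 7));              // 2 (put returns the previous value)
        System.out.println(map.containsKey("hello"));       // true
        System.out.println(map.containsValue(13));          // true
        System.out.println(map.remove("hello"));            // 13 (remove returns the removed value)
        System.out.println(map.containsKey("hello"));       // false
    }
}
```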

import java.util.Map;
import java.util.TreeMap;

public class Text {
    public static void main(String[] args) {
        // TreeMap keeps its entries sorted by key
        Map<String,Integer> map = new TreeMap<>();
        map.put("hello",5);
        map.put("abc",2);
        map.put("world",5);
        map.put("hello",13); // same key: the old value 5 is overwritten
        System.out.println(map);
    }
}
//{abc=2, hello=13, world=5}

public class Text {
    public static void main(String[] args) {
        // TreeMap keeps its entries sorted by key
        Map<String,Integer> map = new TreeMap<>();
        map.put("abc",2);
        map.put("world",5);
        map.put("hello",13);
        System.out.println(map);
    }
}
//{abc=2, hello=13, world=5}

import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class Text {
    public static void main(String[] args) {
        // TreeMap keeps its entries sorted by key
        Map<String,Integer> map = new TreeMap<>();
        map.put("hello",13);
        map.put("abc",2);
        map.put("world",5);
//        System.out.println(map.get("abc"));

        Set<String> strings = map.keySet();
        System.out.println(strings);

        Collection<Integer> values = map.values();
        System.out.println(values);

        // entrySet() returns a Set view of the map's key-value entries
        Set<Map.Entry<String, Integer>> set = map.entrySet();

        for(Map.Entry<String,Integer> entry : set) {
            System.out.println("Key : " + entry.getKey() + "  Value : " + entry.getValue());
        }

    }
}
//[abc, hello, world]
//[2, 13, 5]
//Key : abc  Value : 2
//Key : hello  Value : 13
//Key : world  Value : 5

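Notes:
1. Map is an interface and cannot be instantiated directly; instantiate an implementing class such as TreeMap or HashMap.
2. Keys in a Map are unique; values may repeat.
3. When inserting into a TreeMap, the key must not be null or a NullPointerException is thrown, while the value may be null. In a HashMap both the key and the value may be null.
4. All keys of a Map can be extracted into a Set (because keys cannot repeat).
5. All values of a Map can be extracted into any Collection subtype (because values may repeat).
6. A key in a Map cannot be modified in place, but its value can. To change a key, remove the old mapping and insert a new one.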
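Notes 3 and 6 can be illustrated with a small sketch. The class `MapNotesDemo` and its helpers `acceptsNullKey` and `renameKey` are hypothetical names introduced here, and the TreeMap is assumed to use natural key ordering (no custom Comparator).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class MapNotesDemo {
    // Returns true if the given map accepts a null key.
    static boolean acceptsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 1);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // "Changes" a key by removing the old mapping and inserting a new one.
    static void renameKey(Map<String, Integer> map, String oldKey, String newKey) {
        if (map.containsKey(oldKey)) {
            Integer value = map.remove(oldKey);
            map.put(newKey, value);
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new TreeMap<>())); // false
        System.out.println(acceptsNullKey(new HashMap<>())); // true

        Map<String, Integer> map = new TreeMap<>();
        map.put("old", 5);
        renameKey(map, "old", "new");
        System.out.println(map); // {new=5}
    }
}
```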

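Map implementation	TreeMap	HashMap
Underlying structure	Red-black tree	Hash buckets
Insert/delete/search time complexity	O(logN)	O(1) on average
Ordered?	Ordered by key	Unordered
Thread safety	Not thread-safe	Not thread-safe
How insert/delete/search works	Compares elements	Computes a hash address with the hash function
Comparison and overriding	Keys must be comparable, otherwise a ClassCastException is thrown	Custom key types must override equals and hashCode
Typical use	When keys must be kept in order	When key order does not matter and higher time performance is needed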
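The "keys must be comparable" row can be demonstrated as follows. This is a sketch with a hypothetical key type `Point` that deliberately does not implement Comparable; a TreeMap without a Comparator rejects it, while a TreeMap built with a Comparator accepts it.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

class TreeMapKeyDemo {
    // A key type that deliberately does not implement Comparable.
    static class Point {
        final int x;

        Point(int x) {
            this.x = x;
        }
    }

    // Returns true if inserting a Point into the given map throws ClassCastException.
    static boolean putFails(Map<Point, String> map) {
        try {
            map.put(new Point(1), "a");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // No Comparator and Point is not Comparable: TreeMap cannot order the key.
        System.out.println(putFails(new TreeMap<>())); // true
        // Supplying a Comparator tells TreeMap how to order the keys.
        System.out.println(putFails(new TreeMap<>(Comparator.comparingInt((Point p) -> p.x)))); // false
    }
}
```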

2. Using Set

2.1 About Set

Set differs from Map in two main ways: Set is an interface that extends Collection, and a Set stores only keys.

2.2 Common Set methods

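Method	Description
boolean add(E e)	Adds an element; a duplicate element is not added and false is returned
void clear()	Empties the set
boolean contains(Object o)	Checks whether o is in the set
Iterator<E> iterator()	Returns an iterator
boolean remove(Object o)	Removes o from the set
int size()	Returns the number of elements in the set
boolean isEmpty()	Returns true if the set is empty, false otherwise
Object[] toArray()	Returns the elements of the set as an array
boolean containsAll(Collection<?> c)	Returns true if every element of c is in the set, false otherwise
boolean addAll(Collection<? extends E> c)	Adds the elements of c to the set, which has the effect of removing duplicates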
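A brief sketch of a few of these methods, using a hypothetical `SetMethodsDemo` class with a `dedupe` helper that relies on `addAll` to drop duplicates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetMethodsDemo {
    // Copies the items into a TreeSet; duplicates are dropped by addAll.
    static Set<Integer> dedupe(List<Integer> items) {
        Set<Integer> set = new TreeSet<>();
        set.addAll(items);
        return set;
    }

    public static void main(String[] args) {
        Set<Integer> set = dedupe(Arrays.asList(3, 1, 3, 2, 1));
        System.out.println(set);                                   // [1, 2, 3]
        System.out.println(set.add(2));                            // false (already present)
        System.out.println(set.containsAll(Arrays.asList(1, 2)));  // true
        System.out.println(set.remove(3));                         // true
        System.out.println(set.size());                            // 2
        System.out.println(set.isEmpty());                         // false
    }
}
```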

import java.util.Set;
import java.util.TreeSet;

public class Text {
    public static void main(String[] args) {
        Set<String> set = new TreeSet<>();
        set.add("hello");
        set.add("abc");
        set.add("world");
        System.out.println(set);
    }
}
//[abc, hello, world]

public class Text {
    public static void main(String[] args) {
        Set<String> set = new TreeSet<>();
        set.add("hello");
        set.add("abc");
        set.add("world");
        set.add("world");
        System.out.println(set);
    }
}
//[abc, hello, world]

public class Text {
    public static void main(String[] args) {
        Set<String> set = new TreeSet<>();
        set.add("hello");
        set.add("abc");
        set.add("world");
        System.out.println(set);
        for(String s : set) {
            System.out.println(s);
        }
    }
}
//[abc, hello, world]
//abc
//hello
//world

import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

public class Text {
    public static void main(String[] args) {
        Set<String> set = new TreeSet<>();
        set.add("hello");
        set.add("abc");
        set.add("world");
        System.out.println(set);
//        for(String s : set) {
//            System.out.println(s);
//        }

        Iterator<String> iterator = set.iterator();
        while(iterator.hasNext()) {
            System.out.println(iterator.next());
        }
    }
}
//[abc, hello, world]
//abc
//hello
//world

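Notes:
1. Set is an interface that extends Collection.
2. A Set stores only keys, and every key must be unique.
3. TreeSet is implemented on top of a Map: each key is inserted into the Map paired with a shared default Object instance as its value.
4. The main use of a Set is removing duplicates from a collection.
5. Commonly used classes that implement Set are TreeSet and HashSet. There is also LinkedHashSet, which extends HashSet with a doubly linked list that records insertion order.
6. A key in a Set cannot be modified; to change it, remove the old key and insert the new one.
7. A null key cannot be inserted into a TreeSet, but it can be inserted into a HashSet.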
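The ordering difference from note 5 and the null behaviour from note 7 can be seen in a small sketch. `SetOrderDemo` and its `order` helper are hypothetical names; the TreeSet is assumed to use natural ordering.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Adds the same elements to any Set and returns them in iteration order.
    static List<String> order(Set<String> set) {
        set.add("world");
        set.add("abc");
        set.add("hello");
        set.add("abc"); // duplicate, ignored
        return new ArrayList<>(set);
    }

    // Returns true if the set accepts null as an element.
    static boolean acceptsNull(Set<String> set) {
        try {
            set.add(null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(order(new LinkedHashSet<>())); // [world, abc, hello] (insertion order)
        System.out.println(order(new TreeSet<>()));       // [abc, hello, world] (sorted order)
        System.out.println(acceptsNull(new HashSet<>())); // true
        System.out.println(acceptsNull(new TreeSet<>())); // false
    }
}
```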

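III. Hash Tables

1. Concept

A hash table is a storage structure in which a function (hashFunc) maps each element's key to its storage position, so that a lookup can go straight to the element by evaluating the function.

Inserting an element: compute the storage position from the element's key with this function and store the element there.

Searching for an element: apply the same computation to the key, treat the result as the storage position, and compare the element found there; if the keys are equal, the search succeeds.

This approach is called hashing; the conversion function is called the hash function, and the resulting structure is called a hash table (HashTable).

Example: the data set {1, 7, 6, 4, 5, 9};
the hash function is hash(key) = key % capacity, where capacity is the total size of the underlying storage.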
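The example can be worked through in code. This is a minimal sketch that assumes a capacity of 10 (the source does not state one) and uses -1 to mark an empty slot; `SimpleHashDemo` is a hypothetical class, and collisions are not handled because this data set has none at that capacity.

```java
import java.util.Arrays;

class SimpleHashDemo {
    static final int EMPTY = -1; // marker for an unused slot (assumes keys are non-negative)

    static int hash(int key, int capacity) {
        return key % capacity;
    }

    // Places each key at the slot given by the hash function; no collision handling.
    static int[] place(int[] keys, int capacity) {
        int[] table = new int[capacity];
        Arrays.fill(table, EMPTY);
        for (int key : keys) {
            table[hash(key, capacity)] = key;
        }
        return table;
    }

    public static void main(String[] args) {
        int[] table = place(new int[]{1, 7, 6, 4, 5, 9}, 10);
        System.out.println(Arrays.toString(table));
        // [-1, 1, -1, -1, 4, 5, 6, 7, -1, 9]
    }
}
```

Inserting 44 into this table would land on slot 4, which is already taken by 4; that situation is a collision.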

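2. Hash collisions

For two keys i and j with i != j, it may happen that Hash(i) == Hash(j); that is, the same hash function maps different keys to the same hash address. This is called a hash collision.

Elements with different keys but the same hash address are called "synonyms".

3. Avoiding collisions

The capacity of the table's underlying array is usually smaller than the number of keys that may need to be stored, so collisions are inevitable; what we can do is keep the collision rate as low as possible.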

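3.1 Hash function design

1. The domain of the hash function must include every key that needs to be stored, and if the table has m addresses, its range must be 0 to m-1.

2. The addresses it produces should be distributed evenly over the whole space.

3. The hash function should be fairly simple.

Common hash functions:

1. Direct addressing
Take a linear function of the key as the hash address: Hash(Key) = A*Key + B.
Advantages: simple and uniform.
Disadvantage: the distribution of keys must be known in advance.

Use case: a small, contiguous range of keys.

Interview question: find the first character that appears only once in a string.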
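One common way to approach this interview question is direct addressing with Hash(c) = c - 'a', so each letter owns one slot of a 26-element counting array. The sketch below assumes the string contains only lowercase letters 'a' to 'z' and returns the index of the character (or -1 if there is none); the class and method names are illustrative.

```java
class FirstUniqueChar {
    // Returns the index of the first character that appears exactly once, or -1.
    // Assumes s contains only lowercase letters 'a'..'z'.
    static int firstUniqChar(String s) {
        int[] count = new int[26];
        // First pass: the hash address of a character is simply (c - 'a').
        for (int i = 0; i < s.length(); i++) {
            count[s.charAt(i) - 'a']++;
        }
        // Second pass: the first character whose count is 1 is the answer.
        for (int i = 0; i < s.length(); i++) {
            if (count[s.charAt(i) - 'a'] == 1) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstUniqChar("leetcode"));     // 0
        System.out.println(firstUniqChar("loveleetcode")); // 2
        System.out.println(firstUniqChar("aabb"));         // -1
    }
}
```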

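2. Division (remainder) method
Let m be the number of addresses in the table. Choose the prime p closest to m with p <= m as the divisor, and convert keys to hash addresses with Hash(key) = key % p (p <= m).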
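The division method can be sketched as follows, assuming non-negative keys and m >= 2. `DivisionHash`, `isPrime`, and `largestPrimeAtMost` are hypothetical helpers written for this illustration; a real table would compute p once rather than on every call.

```java
class DivisionHash {
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Largest prime p with p <= m; assumes m >= 2.
    static int largestPrimeAtMost(int m) {
        for (int p = m; p >= 2; p--) {
            if (isPrime(p)) {
                return p;
            }
        }
        throw new IllegalArgumentException("m must be at least 2");
    }

    // Hash(key) = key % p, where p is the largest prime not exceeding m.
    static int hash(int key, int m) {
        return key % largestPrimeAtMost(m);
    }

    public static void main(String[] args) {
        System.out.println(largestPrimeAtMost(10)); // 7
        System.out.println(hash(15, 10));           // 1
        System.out.println(hash(22, 10));           // 1 (collides with 15)
    }
}
```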

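3.2 Adjusting the load factor

The load factor is the number of elements stored in the table divided by the length of the table; the higher the load factor, the higher the collision rate.

The number of keys already in the table cannot be changed, so the only thing we can adjust is the size of the table's array.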

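4. Resolving collisions

4.1 Closed hashing

Closed hashing, also called open addressing: when a collision occurs and the table is not full, there must still be an empty slot somewhere, so the key can be stored in the "next" empty slot after the colliding position.

1. Linear probing: starting from the colliding position, probe the following slots one by one and place the element in the first empty one.
2. Quadratic probing: the i-th probe checks slot H_i = (H_0 + i^2) % m, where H_0 is the position given by the hash function, m is the table size, and i = 1, 2, 3, ... counts the collisions so far.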
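Both probing strategies can be sketched in a few lines. The code below is an illustration under stated assumptions, not a full table implementation: keys are non-negative, -1 marks an empty slot, the home position is key % m, and quadratic probing uses the common form (H_0 + i^2) % m. `ProbingDemo` and its methods are hypothetical names.

```java
import java.util.Arrays;

class ProbingDemo {
    static final int EMPTY = -1; // marker for an unused slot

    static int[] newTable(int capacity) {
        int[] table = new int[capacity];
        Arrays.fill(table, EMPTY);
        return table;
    }

    // Stores key and returns the slot used, or -1 if no free slot was found
    // within table.length probes.
    static int insert(int[] table, int key, boolean quadratic) {
        int m = table.length;
        int home = key % m; // H_0
        for (int i = 0; i < m; i++) {
            int step = quadratic ? i * i : i; // i^2 for quadratic, i for linear
            int slot = (home + step) % m;
            if (table[slot] == EMPTY) {
                table[slot] = key;
                return slot;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] linear = newTable(10);
        System.out.println(insert(linear, 4, false));  // 4
        System.out.println(insert(linear, 14, false)); // 5
        System.out.println(insert(linear, 24, false)); // 6

        int[] quad = newTable(10);
        System.out.println(insert(quad, 4, true));     // 4
        System.out.println(insert(quad, 14, true));    // 5
        System.out.println(insert(quad, 24, true));    // 8
    }
}
```

With linear probing the colliding keys pile up in adjacent slots, while quadratic probing spreads them further apart.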

The biggest drawback of closed hashing is its low space utilization, which is also a drawback of hashing in general.

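4.2 Open hashing (hash buckets)

Open hashing, also called separate chaining: first compute the hash address of every key with the hash function. Keys with the same address belong to the same subset, and each subset is called a bucket. The elements in a bucket are linked into a singly linked list, and the head node of each list is stored in the hash table.

It is implemented with an array plus linked lists.

The array stores references to the list heads.

All colliding elements go into the same list.

When collisions are severe and a bucket grows long, the search inside the bucket can be converted further:

1. Back each bucket with another hash table
2. Back each bucket with a search tree

5. Implementing a hash bucket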

public class HashBuck {
    public static class Node {
        public int key;
        public int val;
        public Node next;

        public Node(int key, int val) {
            this.key = key;
            this.val = val;
        }
    }

    public Node[] array = new Node[10];
    public int useSize = 0; // instance field: each table counts its own elements
    public static final float LOAD_FACTOR = 0.75f;

    public void put(int key,int val) {
        // 1. Use the hash function to find the bucket index (assumes key >= 0)
        int index = key % array.length;

        // 2. Walk the bucket's list; if the key already exists, update its val
        Node cur = array[index];
        while(cur!=null) {
            if(cur.key == key) {
                cur.val = val;
                return;
            }
            cur = cur.next;
        }

        // 3. Otherwise append a new node at the tail
        Node node = new Node(key,val);
        cur = array[index];
        if(array[index] == null) {
            array[index] = node;
        }else {
            while (cur.next != null) {
                cur = cur.next;
            }
            cur.next = node;
        }
        useSize++;

        // 4. Compute the load factor; once it reaches 0.75, grow the table
        if(doLoadFactor() >= LOAD_FACTOR) {
            // Arrays.copyOf(array, array.length*2) would be wrong here:
            // every node has to be rehashed against the new length
            resize();
        }
    }

    private float doLoadFactor() {
        return 1.0f * useSize / array.length;
    }

    private void resize() {
        Node[] newArray = new Node[array.length*2];
        for (int i = 0; i < array.length; i++) {
            Node cur = array[i];
            while(cur!=null) {
                int newIndex = cur.key % newArray.length;
                Node curN = cur.next;
                if(newArray[newIndex] == null) {
                    newArray[newIndex] = cur;
                    cur.next = null;
                }else {
                    Node tmp = newArray[newIndex];
                     while(tmp.next!=null) {
                         tmp = tmp.next;
                     }
                     tmp.next = cur;
                     cur.next = null;
                }
                cur = curN;
            }
        }
        array = newArray;
    }

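    public int get(int key) {
        // 1. Use the hash function to find the bucket index (assumes key >= 0)
        int index = key % array.length;

        // 2. Walk the bucket's list; if the key is found, return its val
        Node cur = array[index];
        while(cur!=null) {
            if(cur.key == key) {
                return cur.val;
            }
            cur = cur.next;
        }
        return -1; // not found (indistinguishable from a stored val of -1)
    }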
}

For reference types, a generic version:

public class HashBuck2<K,V> {
    public static class Node<K,V> {
        public K key;
        public V val;
        public Node<K,V> next;

        public Node(K key, V val) {
            this.key = key;
            this.val = val;
        }
    }

    public Node<K,V>[] array =(Node<K, V>[]) new Node[10];
    public int useSize;

    public void put(K key,V val) {
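        // 1. Use the hash function to find the bucket index
        int hashCode = key.hashCode();
        // Use % (not /) to stay inside the array; masking the sign bit keeps the index non-negative
        int index = (hashCode & 0x7fffffff) % array.length;

        // 2. Walk the bucket's list; if the key already exists, update its val
        Node<K,V> cur = array[index];
        while(cur!=null) {
            if(cur.key.equals(key)) {
                cur.val = val;
                return;
            }
            cur = cur.next;
        }

        // 3. Otherwise append a new node at the tail
        //…………

        // 4. Compute the load factor; once it reaches 0.75, grow the table
        //…………
    }

    public V get(K key) {
        // 1. Use the hash function to find the bucket index (same computation as in put)
        int hashCode = key.hashCode();
        int index = (hashCode & 0x7fffffff) % array.length;

        // 2. Walk the bucket's list; if the key is found, return its val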
        Node<K,V> cur = array[index];
        while(cur!=null) {
            if(cur.key.equals(key)) {
                return cur.val;
            }
            cur = cur.next;
        }
        return null;
    }
}
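For readers who want a complete generic table to experiment with, here is one possible sketch. It is not the author's elided code: the class name `GenericHashBucket`, the use of `Math.floorMod` for the index, and head insertion (instead of tail insertion) are choices made for this illustration, and keys are assumed to be non-null.

```java
class GenericHashBucket<K, V> {
    private static class Node<K, V> {
        final K key;
        V val;
        Node<K, V> next;

        Node(K key, V val) {
            this.key = key;
            this.val = val;
        }
    }

    private static final float LOAD_FACTOR = 0.75f;

    @SuppressWarnings("unchecked")
    private Node<K, V>[] array = (Node<K, V>[]) new Node[10];
    private int useSize;

    // floorMod keeps the index in [0, length) even for negative hash codes.
    private int indexFor(K key, int length) {
        return Math.floorMod(key.hashCode(), length);
    }

    public void put(K key, V val) {
        int index = indexFor(key, array.length);
        // Update in place if the key is already in this bucket.
        for (Node<K, V> cur = array[index]; cur != null; cur = cur.next) {
            if (cur.key.equals(key)) {
                cur.val = val;
                return;
            }
        }
        // Otherwise insert a new node at the head of the bucket.
        Node<K, V> node = new Node<>(key, val);
        node.next = array[index];
        array[index] = node;
        useSize++;
        if (1.0f * useSize / array.length >= LOAD_FACTOR) {
            resize();
        }
    }

    // Doubles the array and rehashes every node against the new length.
    @SuppressWarnings("unchecked")
    private void resize() {
        Node<K, V>[] newArray = (Node<K, V>[]) new Node[array.length * 2];
        for (Node<K, V> head : array) {
            Node<K, V> cur = head;
            while (cur != null) {
                Node<K, V> next = cur.next;
                int newIndex = indexFor(cur.key, newArray.length);
                cur.next = newArray[newIndex];
                newArray[newIndex] = cur;
                cur = next;
            }
        }
        array = newArray;
    }

    public V get(K key) {
        int index = indexFor(key, array.length);
        for (Node<K, V> cur = array[index]; cur != null; cur = cur.next) {
            if (cur.key.equals(key)) {
                return cur.val;
            }
        }
        return null; // key not present
    }

    public int size() {
        return useSize;
    }
}
```

Head insertion avoids walking the list a second time; the order of nodes inside a bucket does not affect correctness.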


import java.util.Objects;

public class Person {
    public String id;

    public Person(String id) {
        this.id = id;
    }

    @Override
    public String toString() {
        return "Person{" +
                "id='" + id + '\'' +
                '}';
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Person person = (Person) o;
        return Objects.equals(id, person.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);
    }
}

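Insertion, deletion, and search in a hash table take O(1) time on average.

1. HashMap and HashSet are the Map and Set that Java implements with a hash table.
2. Java resolves collisions with hash buckets.
3. Once a collision list grows longer than a certain threshold, Java converts the list into a search tree (a red-black tree).
4. Java computes the hash value by calling the key class's hashCode method and tests key equality by calling the key's equals method. So to use a custom class as a HashMap key or a HashSet element, you must override both hashCode and equals, and objects that are equal under equals must have the same hashCode.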
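Point 4 can be observed directly. In this sketch, `PlainId` and `ValueId` are hypothetical key classes: the first inherits the identity-based equals and hashCode from Object, the second overrides both in the same way as the Person class above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class HashKeyDemo {
    // Inherits identity-based equals/hashCode from Object.
    static class PlainId {
        final String id;

        PlainId(String id) {
            this.id = id;
        }
    }

    // Overrides equals and hashCode so that equal ids mean equal keys.
    static class ValueId {
        final String id;

        ValueId(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Objects.equals(id, ((ValueId) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    // Stores under one key object and looks up with a second, equal-looking one.
    static String lookupPlain(String id) {
        Map<PlainId, String> map = new HashMap<>();
        map.put(new PlainId(id), "found");
        return map.get(new PlainId(id));
    }

    static String lookupValue(String id) {
        Map<ValueId, String> map = new HashMap<>();
        map.put(new ValueId(id), "found");
        return map.get(new ValueId(id));
    }

    public static void main(String[] args) {
        System.out.println(lookupPlain("1")); // null: the two PlainId objects are different keys
        System.out.println(lookupValue("1")); // found
    }
}
```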

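import java.util.HashSet;
import java.util.Set;

public class Text {
    public static void main(String[] args) {
        // Find the first duplicated value among 100,000 items (a small sample array is used here)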
        int[] array = {1,2,3,4,5,6,3,2,1};
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < array.length; i++) {
            if(!set.contains(array[i])) {
                set.add(array[i]);
            }else {
                System.out.println(array[i]);
                return;
            }
        }
    }
}
//3

public class Text {
    public static void main(String[] args) {
        // Remove duplicates from 100,000 items (a small sample array is used here)
        int[] array = {1,2,3,4,5,6,3,2,1};
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < array.length; i++) {
            set.add(array[i]);
        }
        System.out.println(set);
    }
}
//[1, 2, 3, 4, 5, 6]

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class Text {
    public static void calaWordNum(String[] words) {
        Map<String,Integer> map = new HashMap<>();
        for (String word : words) {
            if(map.get(word) != null) {
                int oldNum = map.get(word);
                map.put(word,oldNum+1);
            }else {
                map.put(word,1);
            }
        }
        for (Map.Entry<String,Integer> entry : map.entrySet()) {
            System.out.println("Word " + entry.getKey() + " appeared " + entry.getValue() + " time(s)!");
        }
    }
    public static void main(String[] args) {
        // Count how many times each word appears
        String[] words = {"this","is","good","man","this","good","good"};
        calaWordNum(words);
    }
}
//Word this appeared 2 time(s)!
//Word is appeared 1 time(s)!
//Word man appeared 1 time(s)!
//Word good appeared 3 time(s)!