RAG in Practice with LLMs (30) | Exploring Semantic Chunking Strategies for RAG

news2025/1/1 23:17:46

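After parsing documents as described in "RAG in Practice with LLMs (29) | Exploring PDF Parsing for RAG", we obtain structured or semi-structured data. The main task now is to split that data into smaller chunks, extract detailed features from them, and embed those features to represent their semantics. Figure 1 shows where this step sits in the RAG pipeline:

The most common chunking methods are rule-based, relying on techniques such as a fixed chunk size or an overlap between adjacent chunks. For multi-level documents, we can use the RecursiveCharacterTextSplitter[1] provided by Langchain, which lets us define multi-level separators.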
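To make the rule-based idea concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not Langchain's implementation; the function name `chunk_fixed` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import List


def chunk_fixed(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that overlap their neighbours."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


sample = "BERT is bidirectional. GPT is left-to-right."
for chunk in chunk_fixed(sample, chunk_size=20, overlap=5):
    print(repr(chunk))
```

The boundaries fall wherever the character count dictates, regardless of where sentences end.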

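In practice, however, the rigid predefined rules (chunk size or overlap size) mean that rule-based chunking can easily produce incomplete retrieval context, or oversized chunks that contain noise.

For chunking, then, the most elegant approach is clearly semantic chunking. Semantic chunking aims to ensure that each chunk contains as much semantically independent information as possible.

This article explores semantic chunking methods and explains their principles and applications. We will cover three types of methods: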

  • Embedding-based
  • Model-based
  • LLM-based

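1. Embedding-based methods

Both LlamaIndex and Langchain provide an embedding-based semantic chunker. The two implementations follow essentially the same idea, so we use LlamaIndex as the example.

Note that to access the semantic chunker in LlamaIndex you need a recent version. The version I had installed previously, 0.9.45, did not include this algorithm, so I created a new conda environment and installed the newer version 0.10.12: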

pip install llama-index-core
pip install llama-index-readers-file
pip install llama-index-embeddings-openai

It is worth noting that LlamaIndex 0.10.12 can be installed modularly, so only a few key components are installed here. The installed versions are:

(llamaindex_010) Florian:~ Florian$ pip list | grep llama
llama-index-core              0.10.12
llama-index-embeddings-openai 0.1.6
llama-index-readers-file      0.1.5
llamaindex-py-client          0.1.13

The test code is as follows:

from llama_index.core.node_parser import (
    SentenceSplitter,
    SemanticSplitterNodeParser,
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import SimpleDirectoryReader

import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPEN_AI_KEY"

# load documents
dir_path = "YOUR_DIR_PATH"
documents = SimpleDirectoryReader(dir_path).load_data()

embed_model = OpenAIEmbedding()
splitter = SemanticSplitterNodeParser(
    buffer_size=1, breakpoint_percentile_threshold=95, embed_model=embed_model
)

nodes = splitter.get_nodes_from_documents(documents)
for node in nodes:
    print('-' * 100)
    print(node.get_content())

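The main flow of splitter.get_nodes_from_documents is shown in Figure 2:

The "sentences" mentioned in Figure 2 is a Python list in which each element is a dictionary with four key-value pairs. The keys have the following meanings:

  • sentence: the current sentence;
  • index: the position of the current sentence;
  • combined_sentence: a sliding window covering the sentences at [index - self.buffer_size, index, index + self.buffer_size], which is three sentences by default (self.buffer_size = 1). It is the unit used to compute the semantic relatedness between sentences. Combining the preceding and following sentences reduces noise and better captures the relationship between consecutive sentences;
  • combined_sentence_embedding: the embedding of combined_sentence.

From this analysis it is clear that embedding-based semantic chunking essentially computes similarity over a sliding window (combined_sentence). Adjacent sentences that satisfy the threshold are grouped into the same chunk.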
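The procedure described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not LlamaIndex's code: `toy_embed` and `VOCAB` are hypothetical stand-ins for a real embedding model, and the split rule (cut where the distance between neighbouring windows exceeds a percentile of all distances) is my reading of the `breakpoint_percentile_threshold` parameter.

```python
import math
from typing import Callable, Dict, List, Sequence


def build_windows(sentences: Sequence[str], buffer_size: int = 1) -> List[Dict]:
    """Attach to each sentence a window made of itself and its neighbours."""
    windows = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - buffer_size)
        end = min(len(sentences), i + buffer_size + 1)
        windows.append({
            "sentence": sentence,
            "index": i,
            "combined_sentence": " ".join(sentences[start:end]),
        })
    return windows


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile with linear interpolation between the closest ranks."""
    ordered = sorted(values)
    rank = (pct / 100) * (len(ordered) - 1)
    lower, upper = math.floor(rank), math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def semantic_chunks(
    sentences: Sequence[str],
    embed: Callable[[str], Sequence[float]],
    buffer_size: int = 1,
    breakpoint_percentile_threshold: float = 95.0,
) -> List[str]:
    if len(sentences) < 2:
        return [" ".join(sentences)] if sentences else []

    windows = build_windows(sentences, buffer_size)
    for window in windows:
        window["combined_sentence_embedding"] = embed(window["combined_sentence"])

    # Distance between each window and the next one.
    distances = [
        cosine_distance(
            windows[i]["combined_sentence_embedding"],
            windows[i + 1]["combined_sentence_embedding"],
        )
        for i in range(len(windows) - 1)
    ]
    threshold = percentile(distances, breakpoint_percentile_threshold)

    chunks, current = [], [sentences[0]]
    for i, distance in enumerate(distances):
        if distance > threshold:  # weak link: close the current chunk
            chunks.append(" ".join(current))
            current = []
        current.append(sentences[i + 1])
    chunks.append(" ".join(current))
    return chunks


# Hypothetical toy embedding: word counts over a tiny fixed vocabulary.
VOCAB = ["cats", "purr", "sleep", "stocks", "fell", "rose"]


def toy_embed(text: str) -> List[float]:
    words = text.lower().split()
    return [float(words.count(word)) for word in VOCAB]


demo = ["cats purr", "cats sleep", "stocks fell", "stocks rose"]
print(semantic_chunks(demo, toy_embed))
```

Because the threshold is a percentile of the observed distances, only the few largest jumps become split points, which is consistent with the relatively coarse chunks seen in the results.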

Next, we point the directory path at a folder containing the BERT paper[2]. Here are some of the results:

(llamaindex_010) Florian:~ Florian$ python /Users/Florian/Documents/june_pdf_loader/test_semantic_chunk.py
......
----------------------------------------------------------------------------------------------------
We argue that current techniques restrict the
power of the pre-trained representations, espe-
cially for the fine-tuning approaches. The ma-
jor limitation is that standard language models are
unidirectional, and this limits the choice of archi-
tectures that can be used during pre-training. For
example, in OpenAI GPT, the authors use a left-to-
right architecture, where every token can only at-
tend to previous tokens in the self-attention layers
of the Transformer (Vaswani et al., 2017). Such re-
strictions are sub-optimal for sentence-level tasks,
and could be very harmful when applying fine-
tuning based approaches to token-level tasks such
as question answering, where it is crucial to incor-
porate context from both directions.
In this paper, we improve the fine-tuning based
approaches by proposing BERT: Bidirectional
Encoder Representations from Transformers.
BERT alleviates the previously mentioned unidi-
rectionality constraint by using a “masked lan-
guage model” (MLM) pre-training objective, in-
spired by the Cloze task (Taylor, 1953). The
masked language model randomly masks some of
the tokens from the input, and the objective is to
predict the original vocabulary id of the masked
arXiv:1810.04805v2  [cs.CL]  24 May 2019
----------------------------------------------------------------------------------------------------
word based only on its context. Unlike left-to-
right language model pre-training, the MLM ob-
jective enables the representation to fuse the left
and the right context, which allows us to pre-
train a deep bidirectional Transformer. In addi-
tion to the masked language model, we also use
a “next sentence prediction” task that jointly pre-
trains text-pair representations. The contributions
of our paper are as follows:
• We demonstrate the importance of bidirectional
pre-training for language representations. Un-
like Radford et al. (2018), which uses unidirec-
tional language models for pre-training, BERT
uses masked language models to enable pre-
trained deep bidirectional representations. This
is also in contrast to Peters et al. 
----------------------------------------------------------------------------------------------------
......

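Embedding-based methods: summary

  • The test results show that the chunks are relatively coarse-grained.
  • Figure 2 also shows that this method works page by page and does not directly address chunks that span multiple pages.
  • In general, the performance of embedding-based methods depends heavily on the embedding model. The actual effect needs further evaluation.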

2. Model-based methods

2.1 Naive BERT

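Recall BERT's pre-training process, which includes a binary classification task, next sentence prediction (NSP), that teaches the model the relationship between two sentences. Two sentences are fed into BERT together, and the model predicts whether the second sentence follows the first.

We can apply this principle to design a simple chunking method. First split the document into sentences. Then use a sliding window to feed each pair of adjacent sentences into the BERT model for an NSP judgment, as shown in Figure 3:

If the predicted score falls below a preset threshold, the semantic relationship between the two sentences is weak. That position can serve as a split point, as between sentence 2 and sentence 3 in Figure 3.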
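The splitting rule above can be sketched as follows. To keep the example runnable without downloading a model, the pair scorer is injected: `toy_pair_score` is a hypothetical word-overlap stand-in, and in a real setup it would be replaced by a function that returns the "is next sentence" probability from a BERT NSP head. The function names and the threshold value are assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def split_by_pair_score(
    sentences: Sequence[str],
    score_pair: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Start a new chunk wherever two adjacent sentences score below the threshold."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for previous, current in zip(sentences, sentences[1:]):
        if score_pair(previous, current) < threshold:
            chunks.append([current])    # weak link: split here
        else:
            chunks[-1].append(current)  # strong link: keep in the same chunk
    return chunks


def toy_pair_score(first: str, second: str) -> float:
    """Hypothetical stand-in for an NSP probability: Jaccard word overlap."""
    a, b = set(first.lower().split()), set(second.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


sentences = [
    "BERT uses masked language models",
    "masked language models hide tokens",
    "Today is a good day",
]
print(split_by_pair_score(sentences, toy_pair_score, threshold=0.2))
```

Each decision looks at exactly one pair of neighbouring sentences, so the scorer is called once per adjacent pair.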

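The advantage of this method is that it works out of the box, with no training or fine-tuning required.

However, when deciding on split points it only considers the preceding and following sentences and ignores information from other segments. Its prediction efficiency is also relatively low.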

2.2 Cross Segment Attention

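The paper "Text Segmentation by Cross Segment Attention"[3] proposes three cross-segment attention models, as shown in Figure 4:

Figure 4(a) shows the cross-segment BERT model, which frames text segmentation as a sentence-by-sentence classification task. The context around a candidate break (k tokens on each side) is fed into the model. The hidden state corresponding to [CLS] is passed to a softmax classifier, which decides whether to split at the candidate break.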
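As a rough illustration of the input described above, the sketch below builds the token sequence for one candidate break. The exact layout, [CLS] followed by the left context, a [SEP], the right context, and a final [SEP], is an assumption based on the usual BERT sentence-pair format; the function name and the whitespace tokens are hypothetical.

```python
from typing import List, Sequence


def cross_segment_input(tokens: Sequence[str], break_index: int, k: int) -> List[str]:
    """Build the model input for a candidate break.

    break_index is the position of the first token to the right of the break.
    """
    left = list(tokens[max(0, break_index - k):break_index])
    right = list(tokens[break_index:break_index + k])
    return ["[CLS]"] + left + ["[SEP]"] + right + ["[SEP]"]


tokens = "a b c d e f".split()
print(cross_segment_input(tokens, break_index=3, k=2))
```

A classifier would then read the [CLS] position of the encoded sequence to decide whether the break is a segment boundary.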

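The paper also proposes two other models. Both first use a BERT model to obtain a vector representation of each sentence. The vector representations of several consecutive sentences are then fed into a Bi-LSTM (Figure 4(b)) or another BERT (Figure 4(c)) to predict whether each sentence is a segmentation boundary.

At the time, these three models achieved state-of-the-art results, as shown in Figure 5:

So far, however, only training code for this paper[4] has been found; no inference model has been located.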

2.3 SeqModel

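The cross-segment model vectorizes each sentence independently, without considering any broader context. SeqModel offers a further enhancement, described in the paper "Sequence Model with Self-Adaptive Sliding Window for Efficient Spoken Document Segmentation"[5].

SeqModel[6] uses BERT to encode multiple sentences simultaneously, modeling dependencies across a longer context before computing sentence vectors. It then predicts whether a segment boundary occurs after each sentence. In addition, the model uses a self-adaptive sliding window to speed up inference without sacrificing accuracy. A schematic of SeqModel is shown in Figure 6.

SeqModel can be used through the ModelScope[7] framework. The code is as follows: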

from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

p = pipeline(
    task=Tasks.document_segmentation,
    model='damo/nlp_bert_document-segmentation_english-base'
)

print('-' * 100)

result = p(documents='We demonstrate the importance of bidirectional pre-training for language representations. Unlike Radford et al. (2018), which uses unidirectional language models for pre-training, BERT uses masked language models to enable pretrained deep bidirectional representations. This is also in contrast to Peters et al. (2018a), which uses a shallow concatenation of independently trained left-to-right and right-to-left LMs. • We show that pre-trained representations reduce the need for many heavily-engineered taskspecific architectures. BERT is the first finetuning based representation model that achieves state-of-the-art performance on a large suite of sentence-level and token-level tasks, outperforming many task-specific architectures. Today is a good day')

print(result[OutputKeys.TEXT])

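An unrelated sentence, "Today is a good day", was appended to the end of the test data, but the result does not separate "Today is a good day" from the preceding text.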

(modelscope) Florian:~ Florian$ python /Users/Florian/Documents/june_pdf_loader/test_seqmodel.py
2024-02-24 17:09:36,288 - modelscope - INFO - PyTorch version 2.2.1 Found.
2024-02-24 17:09:36,288 - modelscope - INFO - Loading ast index from /Users/Florian/.cache/modelscope/ast_indexer
......
----------------------------------------------------------------------------------------------------
......
We demonstrate the importance of bidirectional pre-training for language representations.Unlike Radford et al.(2018), which uses unidirectional language models for pre-training, BERT uses masked language models to enable pretrained deep bidirectional representations.This is also in contrast to Peters et al.(2018a), which uses a shallow concatenation of independently trained left-to-right and right-to-left LMs.• We show that pre-trained representations reduce the need for many heavily-engineered taskspecific architectures.BERT is the first finetuning based representation model that achieves state-of-the-art performance on a large suite of sentence-level and token-level tasks, outperforming many task-specific architectures.Today is a good day

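3. LLM-based methods

The paper "Dense X Retrieval: What Retrieval Granularity Should We Use?"[8] introduces a new retrieval unit called the proposition. A proposition is defined as an atomic expression within the text: each proposition encapsulates a distinct fact and is presented in a concise, self-contained natural language format.

So how do we obtain these propositions? In the paper, this is done by constructing a prompt and interacting with an LLM.

Both LlamaIndex and Langchain implement the relevant algorithm. The demonstration below uses LlamaIndex.

LlamaIndex's implementation generates propositions with the prompt provided in the paper: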

PROPOSITIONS_PROMPT = PromptTemplate(
    """Decompose the "Content" into clear and simple propositions, ensuring they are interpretable out of
context.
1. Split compound sentence into simple sentences. Maintain the original phrasing from the input
whenever possible.
2. For any named entity that is accompanied by additional descriptive information, separate this
information into its own distinct proposition.
3. Decontextualize the proposition by adding necessary modifier to nouns or entire sentences
and replacing pronouns (e.g., "it", "he", "she", "they", "this", "that") with the full name of the
entities they refer to.
4. Present the results as a list of strings, formatted in JSON.
Input: Title: ¯Eostre. Section: Theories and interpretations, Connection to Easter Hares. Content:
The earliest evidence for the Easter Hare (Osterhase) was recorded in south-west Germany in
1678 by the professor of medicine Georg Franck von Franckenau, but it remained unknown in
other parts of Germany until the 18th century. Scholar Richard Sermon writes that "hares were
frequently seen in gardens in spring, and thus may have served as a convenient explanation for the
origin of the colored eggs hidden there for children. Alternatively, there is a European tradition
that hares laid eggs, since a hare’s scratch or form and a lapwing’s nest look very similar, and
both occur on grassland and are first seen in the spring. In the nineteenth century the influence
of Easter cards, toys, and books was to make the Easter Hare/Rabbit popular throughout Europe.
German immigrants then exported the custom to Britain and America where it evolved into the
Easter Bunny."
Output: [ "The earliest evidence for the Easter Hare was recorded in south-west Germany in
1678 by Georg Franck von Franckenau.", "Georg Franck von Franckenau was a professor of
medicine.", "The evidence for the Easter Hare remained unknown in other parts of Germany until
the 18th century.", "Richard Sermon was a scholar.", "Richard Sermon writes a hypothesis about
the possible explanation for the connection between hares and the tradition during Easter", "Hares
were frequently seen in gardens in spring.", "Hares may have served as a convenient explanation
for the origin of the colored eggs hidden in gardens for children.", "There is a European tradition
that hares laid eggs.", "A hare’s scratch or form and a lapwing’s nest look very similar.", "Both
hares and lapwing’s nests occur on grassland and are first seen in the spring.", "In the nineteenth
century the influence of Easter cards, toys, and books was to make the Easter Hare/Rabbit popular
throughout Europe.", "German immigrants exported the custom of the Easter Hare/Rabbit to
Britain and America.", "The custom of the Easter Hare/Rabbit evolved into the Easter Bunny in
Britain and America." ]
Input: {node_text}
Output:"""
)

In the embedding-based section above, we installed the key components of LlamaIndex 0.10.12. To use DenseXRetrievalPack, we also need to run pip install llama-index-llms-openai. After installation, the LlamaIndex-related components are:

(llamaindex_010) Florian:~ Florian$ pip list | grep llama
llama-index-core                    0.10.12
llama-index-embeddings-openai       0.1.6
llama-index-llms-openai             0.1.6
llama-index-readers-file            0.1.5
llamaindex-py-client                0.1.13

The test code is as follows:

from llama_index.core.readers import SimpleDirectoryReader
from llama_index.core.llama_pack import download_llama_pack

import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

# Download and install dependencies
DenseXRetrievalPack = download_llama_pack(
    "DenseXRetrievalPack", "./dense_pack"
)

# If you have already downloaded DenseXRetrievalPack, you can import it directly.
# from llama_index.packs.dense_x_retrieval import DenseXRetrievalPack

# Load documents
dir_path = "YOUR_DIR_PATH"
documents = SimpleDirectoryReader(dir_path).load_data()

# Use LLM to extract propositions from every document/node
dense_pack = DenseXRetrievalPack(documents)

response = dense_pack.run("YOUR_QUERY")

The test code above shows the basic usage of the DenseXRetrievalPack class. Next, let's analyze its source code:

class DenseXRetrievalPack(BaseLlamaPack):
    def __init__(
        self,
        documents: List[Document],
        proposition_llm: Optional[LLM] = None,
        query_llm: Optional[LLM] = None,
        embed_model: Optional[BaseEmbedding] = None,
        text_splitter: TextSplitter = SentenceSplitter(),
        similarity_top_k: int = 4,
    ) -> None:
        """Init params."""
        self._proposition_llm = proposition_llm or OpenAI(
            model="gpt-3.5-turbo",
            temperature=0.1,
            max_tokens=750,
        )

        embed_model = embed_model or OpenAIEmbedding(embed_batch_size=128)

        nodes = text_splitter.get_nodes_from_documents(documents)
        sub_nodes = self._gen_propositions(nodes)

        all_nodes = nodes + sub_nodes
        all_nodes_dict = {n.node_id: n for n in all_nodes}

        service_context = ServiceContext.from_defaults(
            llm=query_llm or OpenAI(),
            embed_model=embed_model,
            num_output=self._proposition_llm.metadata.num_output,
        )

        self.vector_index = VectorStoreIndex(
            all_nodes, service_context=service_context, show_progress=True
        )

        self.retriever = RecursiveRetriever(
            "vector",
            retriever_dict={
                "vector": self.vector_index.as_retriever(
                    similarity_top_k=similarity_top_k
                )
            },
            node_dict=all_nodes_dict,
        )

        self.query_engine = RetrieverQueryEngine.from_args(
            self.retriever, service_context=service_context
        )

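As the code shows, it first uses text_splitter to split the documents into the original nodes, then calls self._gen_propositions to obtain the corresponding sub_nodes. It then builds a VectorStoreIndex from nodes + sub_nodes, which can be queried through a RecursiveRetriever. The recursive retriever can search over the small chunks while passing the associated larger chunks to the generation stage.

The test document is still the BERT paper. Through debugging, we find that sub_nodes[].text is not the original text; it has been rewritten: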

> /Users/Florian/anaconda3/envs/llamaindex_010/lib/python3.11/site-packages/llama_index/packs/dense_x_retrieval/base.py(91)__init__()
     90
---> 91         all_nodes = nodes + sub_nodes
     92         all_nodes_dict = {n.node_id: n for n in all_nodes}

ipdb> sub_nodes[20]
IndexNode(id_='ecf310c7-76c8-487a-99f3-f78b273e00d9', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Our paper demonstrates the importance of bidirectional pre-training for language representations.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n', index_id='8deca706-fe97-412c-a13f-950a19a594d1', obj=None)
ipdb> sub_nodes[21]
IndexNode(id_='4911332e-8e30-47d8-a5bc-ed7cbaa8e042', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Radford et al. (2018) uses unidirectional language models for pre-training.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n', index_id='8deca706-fe97-412c-a13f-950a19a594d1', obj=None)
ipdb> sub_nodes[22]
IndexNode(id_='83aa82f8-384a-4b06-92c8-d6277c4162bf', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='BERT uses masked language models to enable pre-trained deep bidirectional representations.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n', index_id='8deca706-fe97-412c-a13f-950a19a594d1', obj=None)
ipdb> sub_nodes[23]
IndexNode(id_='2ac635c2-ccb0-4e62-88c7-bcbaef3ef38a', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Peters et al. (2018a) uses a shallow concatenation of independently trained left-to-right and right-to-left LMs.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n', index_id='8deca706-fe97-412c-a13f-950a19a594d1', obj=None)
ipdb> sub_nodes[24]
IndexNode(id_='e37b17cf-30dd-4114-a3c5-9921b8cf0a77', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Pre-trained representations reduce the need for many heavily-engineered task-specific architectures.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n', index_id='8deca706-fe97-412c-a13f-950a19a594d1', obj=None)

The relationship between sub_nodes and nodes is shown in Figure 7; it is a small-to-big index structure.
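The small-to-big idea can be illustrated with a minimal sketch: search over the small propositions, then return the larger parent chunks they point to. This is not how RecursiveRetriever is implemented; the `Chunk` and `Proposition` classes, the `word_overlap` scorer (a stand-in for embedding similarity), and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chunk:
    node_id: str
    text: str


@dataclass
class Proposition:
    text: str
    parent_id: str  # node_id of the chunk this proposition was derived from


def word_overlap(query: str, text: str) -> float:
    """Hypothetical stand-in for embedding similarity: Jaccard word overlap."""
    a, b = set(query.lower().split()), set(text.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def retrieve_parents(
    query: str,
    propositions: List[Proposition],
    chunks_by_id: Dict[str, Chunk],
    top_k: int = 2,
) -> List[Chunk]:
    """Rank the small propositions, then return their de-duplicated parent chunks."""
    ranked = sorted(propositions, key=lambda p: word_overlap(query, p.text), reverse=True)
    parents, seen = [], set()
    for proposition in ranked[:top_k]:
        if proposition.parent_id not in seen:
            seen.add(proposition.parent_id)
            parents.append(chunks_by_id[proposition.parent_id])
    return parents


chunks = {
    "c1": Chunk("c1", "BERT uses masked language models. It was released in 2018."),
    "c2": Chunk("c2", "Hares were frequently seen in gardens in spring."),
}
props = [
    Proposition("BERT uses masked language models", "c1"),
    Proposition("BERT was released in 2018", "c1"),
    Proposition("Hares were frequently seen in gardens", "c2"),
]
hits = retrieve_parents("which model uses masked language models", props, chunks)
print([chunk.node_id for chunk in hits])
```

Matching happens on the fine-grained propositions, while the generation stage receives the fuller parent chunk as context.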

The small-to-big index structure is built by self._gen_propositions. The code is as follows:

    async def _aget_proposition(self, node: TextNode) -> List[TextNode]:
        """Get proposition."""
        inital_output = await self._proposition_llm.apredict(
            PROPOSITIONS_PROMPT, node_text=node.text
        )
        outputs = inital_output.split("\n")

        all_propositions = []

        for output in outputs:
            if not output.strip():
                continue
            if not output.strip().endswith("]"):
                if not output.strip().endswith('"') and not output.strip().endswith(
                    ","
                ):
                    output = output + '"'
                output = output + " ]"
            if not output.strip().startswith("["):
                if not output.strip().startswith('"'):
                    output = '"' + output
                output = "[ " + output

            try:
                propositions = json.loads(output)
            except Exception:
                # fallback to yaml
                try:
                    propositions = yaml.safe_load(output)
                except Exception:
                    # fallback to next output
                    continue

            if not isinstance(propositions, list):
                continue

            all_propositions.extend(propositions)

        assert isinstance(all_propositions, list)
        nodes = [TextNode(text=prop) for prop in all_propositions if prop]

        return [IndexNode.from_text_node(n, node.node_id) for n in nodes]

    def _gen_propositions(self, nodes: List[TextNode]) -> List[TextNode]:
        """Get propositions."""
        sub_nodes = asyncio.run(
            run_jobs(
                [self._aget_proposition(node) for node in nodes],
                show_progress=True,
                workers=8,
            )
        )

        # Flatten list
        return [node for sub_node in sub_nodes for node in sub_node]

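For each original node, self._aget_proposition is called asynchronously. It sends PROPOSITIONS_PROMPT to the LLM and receives inital_output, then parses the propositions out of inital_output and builds TextNodes from them. Finally, these TextNodes are linked to the original node via [IndexNode.from_text_node(n, node.node_id) for n in nodes].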
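The parsing step can be tried in isolation with the simplified, standard-library-only sketch below. It is a stand-in rather than the pack's code: `parse_propositions` is a hypothetical name, and instead of falling back to YAML it strips a trailing comma before closing the list so that `json.loads` accepts a line-by-line fragment.

```python
import json
from typing import List


def parse_propositions(llm_output: str) -> List[str]:
    """Parse an LLM reply that should contain a JSON list of strings."""
    propositions: List[str] = []
    for line in llm_output.split("\n"):
        line = line.strip()
        if not line:
            continue
        # Repair a line that holds only a fragment of the JSON list.
        if not line.endswith("]"):
            line = line.rstrip(",")
            if not line.endswith('"'):
                line += '"'
            line += " ]"
        if not line.startswith("["):
            if not line.startswith('"'):
                line = '"' + line
            line = "[ " + line
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that cannot be repaired
        if isinstance(parsed, list):
            propositions.extend(str(item) for item in parsed if item)
    return propositions


reply = '[ "BERT uses masked language models.",\n"Radford et al. (2018) uses unidirectional language models." ]'
print(parse_propositions(reply))
```

Repairing each line separately means that one malformed line does not discard the propositions found on the other lines.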

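It is worth mentioning that the original paper uses the LLM-generated propositions as training data to further fine-tune a text generation model. That text generation model[9] is now publicly available, and interested readers can try it.

LLM-based methods: summary

In general, this chunking method, which uses an LLM to construct propositions, produces finer-grained chunks. Together with the original nodes it forms a small-to-big index structure, which offers a new idea for semantic chunking. However, the method depends on an LLM, which is relatively expensive.

4. Conclusion

This article explored the principles and implementations of three types of semantic chunking methods and offered a short summary of each.

In general, semantic chunking is a more elegant approach and a key to optimizing RAG.

References: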

[1] https://github.com/langchain-ai/langchain/blob/v0.1.9/libs/langchain/langchain/text_splitter.py#L851C1-L851C6

[2] https://arxiv.org/pdf/1810.04805.pdf

[3] https://arxiv.org/abs/2004.14535

[4] https://github.com/aakash222/text-segmentation-NLP/

[5] https://arxiv.org/pdf/2107.09278.pdf

[6] https://github.com/alibaba-damo-academy/SpokenNLP

[7] https://github.com/modelscope/modelscope/

[8] https://arxiv.org/pdf/2312.06648.pdf

[9] https://github.com/chentong0/factoid-wiki
