1、Introduction to Large Language Models (LLMs)
1.1、Definition of LLMs
- Large: trained on vast amounts of data using substantial compute resources.
- Language: processes and generates human-like text.
- Models: learn complex patterns from text data.
LLMs are considered a defining moment in the history of AI.
Some applications:
- Sentiment analysis
- Identifying themes
- Translating text or speech
- Generating code
- Next-word prediction
1.2、Real-world applications
- Transforming finance industry:
  Inputs: [Investment outlook] [Annual reports] [News articles] [Social media posts]
    --> LLM -->
  Outputs: [Market analysis] [Portfolio management] [Investment opportunities]
- Revolutionizing healthcare sector:
  - Analyze patient data to offer personalized recommendations.
  - Must adhere to privacy laws.
- Education:
  - Personalized coaching and feedback.
  - Interactive learning experience.
  - AI-powered tutor: ask questions, receive guidance, discuss ideas.
- Visual question answering:
  Defining multimodal:
  - Multimodal: many types of processing or generation (e.g., text and images).
  - Non-multimodal: one type of processing or generation.
  Visual question answering:
  - Answers questions about visual content.
  - Object identification & relationships.
  - Scene description.
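As an illustration, here is a hedged visual question answering sketch using the Hugging Face transformers pipeline; the model checkpoint and image path are assumptions for illustration, not part of these notes:

```python
# Visual question answering sketch (assumes transformers and Pillow are installed).
# The checkpoint and the image path are illustrative assumptions.
from transformers import pipeline

vqa = pipeline("visual-question-answering", model="dandelin/vilt-b32-finetuned-vqa")

# "street_scene.jpg" is a hypothetical local image file.
answers = vqa(image="street_scene.jpg", question="How many cars are in the picture?")
print(answers[0])  # e.g., a dict with the top answer and its score
```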
1.3、Challenges of language modeling
- Sequence matters
- Context modeling
- Long-range dependency
- Single-task learning
2、Building Blocks of LLMs
2.1、Novelty of LLMs
- Overcome data's unstructured nature
- Outperform traditional models
- Understand linguistic subtleties
The building blocks are described below:
2.2、Generalized overview of NLP
2.2.1、Text Pre-processing
These steps are independent of each other and can be applied in a different order.
- Tokenization: splits text into individual words, or tokens.
- Stop word removal: removes stop words, which add little meaning.
- Lemmatization: reduces slightly different words with a similar meaning to their base form, e.g., mapping "talking" and "talked" to the root word "talk".
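As a concrete illustration, here is a minimal preprocessing sketch using NLTK (one common choice of library, not one prescribed by these notes):

```python
# Minimal text preprocessing sketch using NLTK (an illustrative choice of library).
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# Download required resources once (names vary slightly across NLTK versions).
nltk.download("punkt")
nltk.download("punkt_tab")
nltk.download("stopwords")
nltk.download("wordnet")

text = "The cats were sitting on the mats"

# Tokenization: split the text into individual tokens.
tokens = nltk.word_tokenize(text.lower())

# Stop word removal: drop words that carry little meaning.
stop_words = set(stopwords.words("english"))
tokens = [t for t in tokens if t not in stop_words]

# Lemmatization: reduce words to their base form.
lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(t) for t in tokens]

print(lemmas)  # ['cat', 'sitting', 'mat']
```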
2.2.2、Text Representation
- Converts text data into numerical form.
- Bag-of-words:
  Counts how often each word appears in a document.
  Limitations:
  - Does not capture the order or context of the words.
  - Does not capture the semantics between the words.
- Word embeddings: dense vectors that capture the meaning of words and the relationships between them.
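A minimal bag-of-words sketch using scikit-learn's CountVectorizer (an illustrative choice); note how word order is lost:

```python
# Bag-of-words sketch using scikit-learn (an illustrative choice of library).
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

vectorizer = CountVectorizer()
bow = vectorizer.fit_transform(corpus)

print(vectorizer.get_feature_names_out())  # vocabulary learned from the corpus
print(bow.toarray())                       # word counts per document; order and context are lost
```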
2.3、Fine-tuning
Fine-tuning:
- Addresses some of these challenges.
- Adapts a pre-trained model.
Pre-trained model:
- Learned from general-purpose datasets.
- Not optimized for specific tasks.
- Can be fine-tuned for a specific problem.
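A hedged sketch of fine-tuning a pre-trained model for a specific task (binary sentiment classification) with Hugging Face transformers and PyTorch; the checkpoint, example data, and hyperparameters are illustrative assumptions:

```python
# Fine-tuning sketch: adapt a pre-trained model to a specific task.
# Checkpoint, data, and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased"  # general-purpose pre-trained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Tiny labeled dataset for the specific task (0 = negative, 1 = positive).
texts = ["I loved this movie", "This film was terrible"]
labels = torch.tensor([1, 0])
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for _ in range(3):  # a few gradient steps, just to show the training loop
    outputs = model(**inputs, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

print(f"final training loss: {outputs.loss.item():.4f}")
```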
2.4、Learning techniques
N-shot learning: zero-shot, few-shot, and multi-shot.
2.4.1、Zero-shot learning
- No explicit training.
- Uses language understanding and context.
- Generalizes without any prior examples.
2.4.2、Few-shot learning
- Learn a new task with a few examples.
2.4.3、Multi-shot learning
- Requires more examples than few-shot.
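To make the difference concrete, here is a sketch of zero-shot vs. few-shot prompts for a sentiment task; the wording is illustrative, not from the course:

```python
# Illustrative zero-shot vs. few-shot prompts for a sentiment task.

zero_shot_prompt = (
    "Classify the sentiment of the following review as positive or negative.\n"
    "Review: The plot was dull and the acting was worse.\n"
    "Sentiment:"
)

few_shot_prompt = (
    "Classify the sentiment of each review as positive or negative.\n"
    "Review: A delightful, heartwarming film. Sentiment: positive\n"
    "Review: I walked out after twenty minutes. Sentiment: negative\n"
    "Review: The plot was dull and the acting was worse.\n"
    "Sentiment:"
)

# Either prompt would be sent to an LLM; the few-shot version includes
# a few labeled examples, the zero-shot version includes none.
print(zero_shot_prompt)
print(few_shot_prompt)
```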
3、Training Methodology and Techniques
3.1、Building blocks to train LLMs
3.1.1、Generative pre-training
LLMs are trained using generative pre-training:
- Input data of text tokens.
- Trained to predict the tokens within the dataset.
Types:
- Next word prediction.
- Masked language modeling.
3.1.2、Next word prediction
- Supervised learning technique.
- Predicts next word and generates coherent text.
- Captures the dependencies between words.
- Training data consists of pairs of input and output examples.
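A small sketch of how (input, next word) training pairs can be built from raw text; the sentence is just an example:

```python
# Build (input sequence, next word) training pairs from a sentence.
sentence = "the quick brown fox jumps over the lazy dog"
tokens = sentence.split()

pairs = []
for i in range(1, len(tokens)):
    context = tokens[:i]   # everything seen so far
    target = tokens[i]     # the word the model must predict
    pairs.append((context, target))

for context, target in pairs[:3]:
    print(context, "->", target)
# ['the'] -> quick
# ['the', 'quick'] -> brown
# ['the', 'quick', 'brown'] -> fox
```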
3.1.3、Masked language modeling
- Hides a selected word in the input.
- Trained model predicts the masked word.
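A minimal masked-word prediction sketch using a pre-trained fill-mask pipeline from Hugging Face transformers; the checkpoint is an illustrative assumption:

```python
# Masked language modeling sketch using a pre-trained fill-mask pipeline.
# The checkpoint is an illustrative assumption.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="distilroberta-base")

# The model predicts the hidden (masked) word from its context.
predictions = fill_mask("The capital of France is <mask>.")
for p in predictions[:3]:
    print(p["token_str"], round(p["score"], 3))
```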
3.2、Introducing the transformer
3.2.1、Transformer architecture
- Relationship between words.
- Components: Pre-processing, Positional Encoding, Encoders, and Decoders.
3.2.2、Inside the transformer
(1) Text pre-processing and representation:
- Text preprocessing: tokenization, stop word removal, lemmatization.
- Text representation: word embedding.
(2) Positional encoding:
- Information on the position of each word.
- Helps the model relate words that are far apart in the sequence.
(3) Encoders:
- Attention mechanism: directs attention to specific words and relationships.
- Neural network: processes specific features.
(4) Decoders:
- Includes attention and neural networks.
- Generates the output.
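To make these components concrete, here is a hedged PyTorch sketch of an encoder with positional encoding; the dimensions and the sinusoidal scheme are common defaults, not values from these notes:

```python
# Minimal transformer encoder sketch in PyTorch. Dimensions and the
# sinusoidal positional encoding are common defaults, used here only
# to illustrate the components listed above.
import math
import torch
import torch.nn as nn

d_model, num_heads, seq_len, vocab_size = 64, 4, 10, 1000

# (1) Text representation: token IDs -> embeddings.
embedding = nn.Embedding(vocab_size, d_model)

# (2) Positional encoding: information about each token's position.
position = torch.arange(seq_len).unsqueeze(1)
div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
pos_enc = torch.zeros(seq_len, d_model)
pos_enc[:, 0::2] = torch.sin(position * div_term)
pos_enc[:, 1::2] = torch.cos(position * div_term)

# (3) Encoder: attention mechanism + feed-forward neural network.
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=num_heads, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

token_ids = torch.randint(0, vocab_size, (1, seq_len))  # a dummy tokenized sentence
x = embedding(token_ids) + pos_enc                       # embeddings + positions
encoded = encoder(x)                                     # contextualized representations

print(encoded.shape)  # torch.Size([1, 10, 64])
```

A decoder (step 4) would take these encoded representations and generate the output tokens.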
3.2.3、Transformers and long-range dependencies
- Initial challenge: long-range dependency.
- Attention: focus on different parts of the input.
3.2.4、Processes multiple parts simultaneously
- Limitation of traditional language models: Sequential - one word at a time.
- Transformers: Process multiple parts simultaneously (Faster processing).
3.3、Attention mechanisms
3.3.1、Attention mechanisms
- Understand complex structures.
- Focus on important words.
3.3.2、Two primary types: Self-attention and multi-head attention
For example, in the sentence "The animal didn't cross the street because it was too tired", attention helps the model link "it" to "the animal" rather than to "the street".
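A hedged sketch of scaled dot-product self-attention in PyTorch, the core computation behind both variants; the tensor sizes are arbitrary:

```python
# Scaled dot-product self-attention sketch (tensor sizes are arbitrary).
import math
import torch

seq_len, d_model = 5, 16
x = torch.randn(seq_len, d_model)        # one sequence of token representations

# Learnable projections for queries, keys, and values.
W_q = torch.nn.Linear(d_model, d_model)
W_k = torch.nn.Linear(d_model, d_model)
W_v = torch.nn.Linear(d_model, d_model)

Q, K, V = W_q(x), W_k(x), W_v(x)

# Each word scores every other word: higher score = more attention.
scores = Q @ K.T / math.sqrt(d_model)
weights = torch.softmax(scores, dim=-1)  # attention weights sum to 1 per word

# Output: a weighted mix of the values, focused on the important words.
attended = weights @ V
print(attended.shape)  # torch.Size([5, 16])
```

Multi-head attention runs several such attention computations in parallel and combines their results, letting the model focus on different kinds of relationships at once.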
3.4、Advanced fine-tuning
3.4.1、LLM training in three steps:
- Pre-training: learn general language patterns from large, general-purpose datasets.
- Fine-tuning: adapt the pre-trained model to a specific task.
- RLHF (Reinforcement Learning from Human Feedback): refine the model using human feedback.
(1) Why RLHF? Fine-tuning alone does not guarantee responses aligned with human preferences.
(2) RLHF starts with the need to fine-tune the model further, this time using human feedback.
3.4.2、Simplifying RLHF
- Model output is reviewed by humans.
- The model is updated based on the feedback.
Step 1:
- The model receives a prompt.
- Generates multiple responses.
Step 2:
- A human expert reviews these responses.
- Ranks the responses based on quality: accuracy, relevance, and coherence.
Step 3:
- The model learns from the expert's ranking.
- Aligns its future responses with those preferences.
And it goes on:
- Continues to generate responses.
- Receives expert's rankings.
- Adjusts the learning.
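The feedback loop can be sketched as a toy simulation; the "model", scoring rule, and "expert ranking" below are stand-ins for a real LLM, reward model, and human reviewers:

```python
# Toy sketch of the RLHF feedback loop. The "model", scoring rule, and
# "expert ranking" are stand-ins for a real LLM, reward model, and human
# reviewers; they only illustrate the flow of the loop.
import random

candidate_styles = ["terse", "balanced", "rambling"]
preference = {"terse": 0.2, "balanced": 0.9, "rambling": 0.1}  # hidden human taste
style_scores = {s: 0.0 for s in candidate_styles}              # what the "model" learns

for step in range(5):
    # Step 1: the model generates multiple responses to a prompt.
    responses = random.sample(candidate_styles, k=2)

    # Step 2: the "expert" ranks them by quality (accuracy, relevance, coherence).
    ranked = sorted(responses, key=lambda s: preference[s], reverse=True)
    best, worst = ranked[0], ranked[-1]

    # Step 3: the model updates toward the preferred response.
    style_scores[best] += 1.0
    style_scores[worst] -= 1.0

print(style_scores)  # the preferred "balanced" style typically ends up highest
```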
3.4.3、Recap
4、Concerns and Considerations
4.1、Data concerns and considerations
- Data volume and compute power.
- Data quality.
- Labeling.
- Bias.
- Privacy.
4.1.1、Data volume and compute power
- LLMs need a lot of data.
- Extensive computing power.
- Can cost millions of dollars.
4.1.2、Data quality
- Quality data is essential.
4.1.3、Labeled data
- Data must be labeled correctly.
- Labor-intensive.
- Incorrect labels impact model performance.
- Address errors: identify >>> analyze >>> iterate.
4.1.4、Data bias
- Influenced by societal stereotypes.
- Lack of diversity in training data.
- Discrimination and unfair outcomes.
Spot and deal with the biased data:
- Evaluate data imbalances.
- Promote diversity.
- Bias mitigation techniques: more diverse examples.
4.1.5、Data privacy
- Compliance with data protection and privacy regulations.
- Sensitive or personally identifiable information (PII).
- Privacy is a concern.
- Get permission.
4.2、Ethical and environmental concerns
4.2.1、Ethical concerns
- Transparency risk - challenging to understand how the model arrived at its output.
- Accountability risk - who is responsible for the LLM's actions.
- Information hazards - disseminating harmful information.
4.2.2、Environmental concerns
- Ecological footprint of LLMs.
- Substantial energy resources to train.
- Impact through carbon emissions.
4.3、Where are LLMs heading?
- Model explainability.
- Efficiency.
- Unsupervised bias handling.
- Enhanced creativity.