
news2024/11/16 1:20:52



After supervised learning, the most widely used form of machine learning is unsupervised learning. Let’s take a look at what that means. We’ve talked about supervised learning, and this video is about unsupervised learning.

When we looked at supervised learning in the last video, recall that it looked something like this in the case of a classification problem: each example was associated with an output label y, such as benign or malignant, designated by the circles and crosses. In unsupervised learning, we’re given data that isn’t associated with any output labels y. Say you’re given data on patients, their tumor size, and their age, but not whether the tumor was benign or malignant, so the dataset looks like this on the right.

We’re not asked to diagnose whether the tumor is benign or malignant, because we’re not given any labels y in the dataset. (In other words, we don’t know whether a given point in the plot stands for a benign or a malignant tumor, and the learning system was never shown examples that pair a patient’s age and tumor size with a diagnosis.) Instead, our job is to find some structure or some pattern, or just find something interesting in the data, that is, to uncover the relationships hidden in it.
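
The difference between the two kinds of dataset can be shown in a few lines. This is only an illustrative sketch: the patient records and the names `labeled` and `unlabeled` are invented here and do not come from the video.

```python
# Hypothetical patient records, invented purely for illustration.
# Supervised learning: every input x = (age, tumor size in cm) comes with a label y.
labeled = [
    ((25, 1.0), "benign"),
    ((62, 4.2), "malignant"),
    ((31, 1.4), "benign"),
]

# Unsupervised learning: the same kind of inputs, but the label y is simply absent.
unlabeled = [x for x, _y in labeled]

print(unlabeled)  # [(25, 1.0), (62, 4.2), (31, 1.4)]
```

An unsupervised algorithm only ever sees something shaped like `unlabeled`, so it has nothing to predict and can only look for structure.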

This is unsupervised learning. We call it unsupervised because we’re not trying to supervise the algorithm to give some, quote, right answer for every input. Instead, we ask the algorithm to figure out all by itself what’s interesting, or what patterns or structures might be in this data.

With this particular dataset, an unsupervised learning algorithm might decide that the data can be assigned to two different groups or two different clusters. So it might decide that there’s one cluster or group over here, and another cluster or group over here. This is a particular type of unsupervised learning called a clustering algorithm, because it places the unlabeled data into different clusters, and this turns out to be used in many applications.
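
As a rough sketch of how an algorithm could arrive at two groups, here is a small k-means implementation in plain Python. The video does not name a specific clustering algorithm, so k-means is an assumption made for illustration, and the patient data and the function name `kmeans` are invented.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means. Returns (centroids, labels); no labels are used as input."""
    # Deterministic farthest-point initialisation: start from the first point,
    # then repeatedly add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical (age in years, tumor size in cm) pairs, invented for illustration.
patients = [(25, 1.0), (30, 1.2), (28, 0.8), (33, 1.5),
            (60, 4.0), (65, 4.5), (58, 3.8), (70, 5.0)]

centroids, labels = kmeans(patients, k=2)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

The cluster numbers 0 and 1 carry no meaning such as benign or malignant; they only say which points ended up together. In this sketch the features are left unscaled, so age dominates the distance; real use would normally rescale the features first.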

For example, clustering is used in Google News. What Google News does is, every day, it goes and looks at hundreds of thousands of news articles on the internet and groups related stories together.

For example, here is a sample from Google News, where the headline of the top article is “Giant panda gives birth to rare twin cubs at Japan’s oldest zoo.” This article actually caught my eye because my daughter loves pandas, so there are a lot of stuffed panda toys and a lot of watching panda videos in my house. Looking at this, you might notice that below it are other related articles.

Maybe from the headlines alone, you can start to guess what clustering might be doing. Notice that the word panda appears here, here, here, here, and here, and notice that the word twin also appears in all five articles. The word zoo also appears in all of these articles. So the clustering algorithm is going through all the hundreds of thousands of news articles on the internet that day, finding the articles that mention similar words, and grouping them into clusters.
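
To give a feel for grouping by shared words, here is a toy sketch that compares headlines by word overlap (Jaccard similarity) and greedily puts similar ones together. This is not how Google News actually works, which the video does not describe; the headlines other than the panda one, the stop-word list, the threshold, and the function names are all invented for illustration.

```python
def words(headline):
    """Lower-cased set of words in a headline, minus a few common filler words."""
    stop = {"a", "the", "to", "at", "in", "of", "and", "on", "for"}
    return {w.strip(".,'\"").lower() for w in headline.split()} - stop

def jaccard(a, b):
    """Shared words divided by total distinct words (0.0 if both sets are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def group_headlines(headlines, threshold=0.15):
    """Greedy grouping: a headline joins the first group containing a similar one."""
    bags = [words(h) for h in headlines]
    groups = []  # each group is a list of headline indices
    for i, bag in enumerate(bags):
        for group in groups:
            if any(jaccard(bag, bags[j]) >= threshold for j in group):
                group.append(i)
                break
        else:
            groups.append([i])  # nothing similar so far: start a new group
    return groups

headlines = [
    "Giant panda gives birth to rare twin cubs at Japan's oldest zoo",
    "Twin panda cubs born at Tokyo zoo",
    "Zoo celebrates arrival of panda twin cubs",
    "Central bank raises interest rates again",
    "Markets fall after interest rates decision",
]

groups = group_headlines(headlines)
print(groups)  # [[0, 1, 2], [3, 4]]
```

Nobody told the code that panda, twin, and zoo matter; the panda stories end up together only because they happen to share those words.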

Now, what’s cool is that this clustering algorithm figures out on its own which words suggest that certain articles are in the same group. What I mean is, there isn’t an employee at Google News telling the algorithm to find articles that contain the words panda, twins, and zoo and put them into the same cluster. The news topics change every day, and there are so many news stories that it just isn’t feasible to have people doing this every single day for all the topics that the news covers.

Instead, the algorithm has to figure out on its own, without supervision, what the clusters of news articles are today. That’s why this clustering algorithm is a type of unsupervised learning algorithm. Let’s look at the second example of unsupervised learning, applied to clustering genetic or DNA data.

This image shows a picture of DNA microarray data. These look like tiny grids of a spreadsheet, and each tiny column represents the genetic or DNA activity of one person. So, for example, this entire column here is from one person’s DNA, and this other column is from another person. Each row represents a particular gene.

So, just as an example, perhaps this row here might represent a gene that affects eye color, or this row here is a gene that affects how tall someone is. Researchers have even found a genetic link to whether someone dislikes certain vegetables, such as broccoli, brussels sprouts, or asparagus. So next time someone asks you why you didn’t finish your salad, you can tell them, maybe it’s genetic.

For DNA microarrays, the idea is to measure how much certain genes are expressed for each individual person. So these colors, red, green, gray, and so on, show the degree to which different individuals do or do not have a specific gene active. What you can do is then run a clustering algorithm to group individuals into different categories or different types of people. Maybe these individuals are grouped together, and let’s just call this type one. These people are grouped into type two, and these people are grouped as type three.
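
The microarray layout (rows are genes, columns are people) can be sketched as follows. The expression values are invented, and the simple centroid-based agglomerative clustering is only one possible choice, since the video does not say which algorithm is used; `agglomerate` is a hypothetical helper name.

```python
import math

# Hypothetical expression matrix: each row is a gene, each column is a person.
expression = [
    # p0   p1   p2   p3   p4   p5
    [0.9, 0.8, 0.1, 0.2, 0.5, 0.4],  # gene A
    [0.8, 0.9, 0.2, 0.1, 0.5, 0.6],  # gene B
    [0.1, 0.2, 0.9, 0.8, 0.5, 0.5],  # gene C
]

# Transpose so that each person becomes one vector of gene activity.
people = list(zip(*expression))

def agglomerate(vectors, k):
    """Repeatedly merge the two clusters with the closest centroids until k remain."""
    clusters = [[i] for i in range(len(vectors))]

    def centroid(cluster):
        dims = len(vectors[0])
        return [sum(vectors[i][d] for i in cluster) / len(cluster) for d in range(dims)]

    while len(clusters) > k:
        pairs = [(a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))]
        a, b = min(
            pairs,
            key=lambda ab: math.dist(centroid(clusters[ab[0]]), centroid(clusters[ab[1]])),
        )
        merged = clusters.pop(b)  # b > a, so index a is still valid after the pop
        clusters[a].extend(merged)
    return clusters

types = agglomerate(people, k=3)
print(types)  # [[0, 1], [2, 3], [4, 5]]
```

Each inner list holds the column indices of people grouped into one type. The algorithm is told only how many types to look for, not what characterizes each of them.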

This is unsupervised learning because we’re not telling the algorithm in advance that there is a type one person with certain characteristics, or a type two person with certain characteristics. Instead, what we’re saying is: here’s a bunch of data. I don’t know what the different types of people are, but can you automatically find structure in the data and automatically figure out what the major types of individuals are? Since we’re not giving the algorithm the right answer for the examples in advance, this is unsupervised learning.

Here’s the third example. Many companies have huge databases of customer information. Given this data, can you automatically group your customers into different market segments, so that you can serve them more efficiently? (A market segment is a group of consumers with similar needs and characteristics; segmentation divides all consumers into several such groups.)

Concretely, the DeepLearning.AI team did some research to better understand the DeepLearning.AI community and why different individuals take these classes, subscribe to The Batch weekly newsletter, or attend our AI events. Let’s visualize the DeepLearning.AI community as this collection of people.

Running clustering, that is, market segmentation, found a few distinct groups of individuals. One group’s primary motivation is seeking knowledge to grow their skills. Perhaps this is you, and if so, that’s great. A second group’s primary motivation is looking for a way to develop their career. Maybe you want to get a promotion or a new job, or make some career progression. If this describes you, that’s great too. And yet another group wants to stay updated on how AI impacts their field of work. Perhaps this is you, and that’s great too. This is a clustering that our team used to try to better serve our community as we tried to figure out what the major categories of learners in the DeepLearning.AI community are. So if any of these is your top motivation for learning, that’s great, and I hope I’ll be able to help you on your journey. Or in case you want something totally different from these three categories, that’s fine too, and I want you to know, I love you all the same.

So, to summarize, a clustering algorithm, which is a type of unsupervised learning algorithm, takes data without labels and tries to automatically group them into clusters. So maybe the next time you see or think of a panda, you’ll think of clustering as well. Besides clustering, there are other types of unsupervised learning too. Let’s go on to the next video to take a look at some other types of unsupervised learning algorithms.



