COMP9315-week2-lecture1

news2025/1/12 10:03:57

COMP9315 19T2 Week 2 Lecture 1_哔哩哔哩_bilibili

C:\python\COMP9315-master\19T2\Lectures\weel02.pdf

COMP9315 24T1 - Course Notes (unsw.edu.au)

The first third of the lecture goes over the Week 1 exercises, which involve a stored procedure.

COMP9315-master\19T2\Lecture Exercises\week01\ex05\schema.sql is that stored procedure.

Week 02 Lectures

Storage Manager
Storage Management 2/68
Levels of DBMS related to storage management

... Storage Management 3/68
Aims of storage management in DBMS:

  • provide view of data as collection of pages/tuples
  • map from database objects (e.g. tables) to disk files
  • manage transfer of data to/from disk storage
  • use buffers to minimise disk/memory transfers
  • interpret loaded data as tuples/records
  • basis for file structures used by access methods

Views of Data in Query Evaluation

... Views of Data in Query Evaluation 5/68
Representing database objects during query execution:

  • DB (handle on an authorised/opened database)
  • Rel (handle on an opened relation)
  • Page (memory buffer to hold contents of disk block)
  • Tuple (memory holding data values from one tuple)

Addressing in DBMSs:

  • PageID = FileID+Offset ... identifies a block of data
    • where Offset gives location of block within file
  • TupleID = PageID+Index ... identifies a single tuple
    • where Index gives location of tuple within page
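
The addressing scheme above can be sketched as two small C structs. The field names and widths below are assumptions made for illustration; the lecture only fixes the composition PageID = FileID+Offset and TupleID = PageID+Index.

```c
#include <stdint.h>

// Hypothetical field widths, chosen only for this sketch.
typedef struct {
    uint32_t fileId;   // which file holds the block
    uint32_t offset;   // block number within that file
} PageID;

typedef struct {
    PageID   page;     // page that holds the tuple
    uint16_t index;    // position of the tuple within the page
} TupleID;

// Byte position of a page within its file, given a fixed page size.
static inline uint64_t pageByteOffset(PageID pid, uint32_t pageSize) {
    return (uint64_t)pid.offset * pageSize;
}

// Build a TupleID from its three parts.
static inline TupleID makeTupleID(uint32_t fileId, uint32_t offset,
                                  uint16_t index) {
    TupleID tid = { { fileId, offset }, index };
    return tid;
}
```

For example, makeTupleID(3, 5, 7) names slot 7 of block 5 in file 3, and with 4096-byte pages that block starts at byte 20480 of the file.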

Storage Management 6/68
Topics in storage management ...

  • Disks and Files
    • performance issues and organisation of disk files
  • Buffer Management
    • using caching to improve DBMS system throughput
  • Tuple/Page Management
    • how tuples are represented within disk pages
  • DB Object Management (Catalog)
    • how tables/views/functions/types, etc. are represented

Storage Technology
Storage Technology 8/68
Persistent storage is

  • large, cheap, relatively slow, accessed in blocks
  • used for long-term storage of data

Computational storage is

  • small, expensive, fast, accessed by byte/word
  • used for all analysis of data

Access cost HDD:RAM ≅ 100000:1, e.g.

  • 10ms to read block containing two tuples
  • 1μs to compare fields in two tuples

... Storage Technology 9/68
Hard disks are well-established, cheap, high-volume, ...
Alternative bulk storage: SSD

  • faster than HDDs, no seek or rotational latency
  • can read single items
  • update requires block erase then write
  • over time, writes "wear out" blocks
  • require controllers that spread write load

Feasible for long-term, high-update environments?

... Storage Technology 10/68
Comparison of HDD and SSD properties:
                  HDD                  SSD
Cost/byte         ~ 4c / GB            ~ 13c / GB
Read latency      ~ 10ms               ~ 50μs
Write latency     ~ 10ms               ~ 900μs
Read unit         block (e.g. 1KB)     byte
Writing           write a block        write on empty block
Will SSDs ever replace HDDs?

Cost Models 11/68
Throughout this course, we compare costs of DB operations
Important aspects in determining cost:

  • data is always transferred to/from disk as whole blocks (pages)
  • cost of manipulating tuples in memory is negligible
  • overall cost determined primarily by #data-blocks read/written

Complicating factors in determining costs:

  • not all page accesses require disk access (buffer pool)
  • tuples typically have variable size (tuples/page ?)

More details later ...

File Management 12/68
Aims of file management subsystem:

  • organise layout of data within the filesystem
  • handle mapping from database ID to file address
  • transfer blocks of data between buffer pool and filesystem
  • also attempts to handle file access error problems (retry)

Builds higher-level operations on top of OS file operations

... File Management 13/68
Typical file operations provided by the operating system:
fd = open(fileName,mode)
// open a named file for reading/writing/appending
close(fd)
// close an open file, via its descriptor
nread = read(fd, buf, nbytes)
// attempt to read data from file into buffer
nwritten = write(fd, buf, nbytes)
// attempt to write data from buffer to file
lseek(fd, offset, seek_type)
// move file pointer to relative/absolute file offset
fsync(fd)
// flush contents of file buffers to disk
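
A minimal sketch of how these calls combine to write one block and read it back. The function name, the 1024-byte BLOCKSIZE and the return convention are assumptions for the example, not part of the lecture.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLOCKSIZE 1024  // assumed block size for this sketch

// Write one block at block index blk, flush it to disk, then read it back.
// Returns 0 on success, -1 on any failure.
int writeThenReadBlock(const char *fileName, long blk,
                       const char *in, char *out) {
    assert(in != NULL && out != NULL && blk >= 0);
    int fd = open(fileName, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return -1;

    memset(out, 0, BLOCKSIZE);              // start from a known buffer state
    off_t pos = (off_t)blk * BLOCKSIZE;     // absolute byte offset of the block
    int ok = lseek(fd, pos, SEEK_SET) == pos
          && write(fd, in, BLOCKSIZE) == BLOCKSIZE
          && fsync(fd) == 0
          && lseek(fd, pos, SEEK_SET) == pos
          && read(fd, out, BLOCKSIZE) == BLOCKSIZE;

    close(fd);
    return ok ? 0 : -1;
}
```

The second lseek is needed because write advances the file pointer past the block that was just written.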

DBMS File Organisation 14/68
How is data for DB objects arranged in the file system?
Different DBMSs make different choices, e.g.

  • by-pass the file system and use a raw disk partition
  • have a single very large file containing all DB data
  • have several large files, with tables spread across them
  • have multiple data files, one for each table
  • have multiple files for each table

etc.

Single-file DBMS 15/68
Consider a single file for the entire database (e.g. SQLite)
Objects are allocated to regions (segments) of the file.

If an object grows too large for allocated segment, allocate an extension.
What happens to allocated space when objects are removed?

... Single-file DBMS 16/68
Allocating space in Unix files is easy:

  • simply seek to the place you want and write the data
  • if nothing there already, data is appended to the file
  • if something there already, it gets overwritten

If the seek goes way beyond the end of the file:

  • Unix does not (yet) allocate disk space for the "hole"
  • allocates disk storage only when data is written there

With the above, a disk/file manager is easy to implement
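
The seek-past-the-end behaviour can be observed with a short sketch like the following. The function name and file handling are assumptions for the example; whether the skipped region really occupies no disk blocks depends on the filesystem.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Seek gap bytes into a fresh file, write len bytes there, and return
// the resulting logical file size (or -1 on failure).
// The skipped region is a "hole" in the file.
long writePastEnd(const char *fileName, long gap,
                  const char *data, long len) {
    assert(gap >= 0 && len > 0 && data != NULL);
    int fd = open(fileName, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;

    struct stat st;
    long size = -1;
    if (lseek(fd, (off_t)gap, SEEK_SET) == (off_t)gap
        && write(fd, data, (size_t)len) == (ssize_t)len
        && fstat(fd, &st) == 0) {
        size = (long)st.st_size;   // logical size includes the hole
    }
    close(fd);
    return size;
}
```

Writing 3 bytes after a 1,000,000-byte gap gives a file whose reported size is 1,000,003 bytes. Comparing st_size with st_blocks from the same stat structure shows how much of that is actually allocated on a given filesystem.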

Single-file Storage Manager 17/68
Consider the following simple single-file DBMS layout:

E.g.
SpaceMap = [ (0,10,U), (10,10,U), (20,600,U), (620,100,U), (720,20,F) ]
TableMap = [ ("employee",20,500), ("project",620,40) ]
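
One possible C representation of these two maps is sketched below. It assumes each SpaceMap entry is (start, length, status) measured in pages with U = used and F = free, and each TableMap entry is (name, start page, number of pages); the type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef struct { int start; int length; char status; } SpaceEntry; // 'U' or 'F'
typedef struct { const char *name; int start; int npages; } TableEntry;

// The example maps from the text above.
static const SpaceEntry spaceMap[] = {
    {0, 10, 'U'}, {10, 10, 'U'}, {20, 600, 'U'}, {620, 100, 'U'}, {720, 20, 'F'}
};
static const TableEntry tableMap[] = {
    {"employee", 20, 500}, {"project", 620, 40}
};

// Look up a table by name; returns NULL if it is not in the map.
const TableEntry *findTable(const char *name) {
    size_t n = sizeof tableMap / sizeof tableMap[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(tableMap[i].name, name) == 0) return &tableMap[i];
    return NULL;
}

// Total number of pages currently sitting in free chunks.
int freePages(void) {
    int total = 0;
    size_t n = sizeof spaceMap / sizeof spaceMap[0];
    for (size_t i = 0; i < n; i++)
        if (spaceMap[i].status == 'F') total += spaceMap[i].length;
    return total;
}
```

Under this reading, "employee" uses 500 of the 600 pages in the chunk starting at page 20, so the table has room to grow before a new chunk is needed.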

... Single-file Storage Manager 18/68
Each file segment consists of a number of fixed-size blocks
The following data/constant definitions are useful
#define PAGESIZE 2048   // bytes per page
typedef long PageId;    // PageId is block index
                        // pageOffset = PageId*PAGESIZE
typedef char *Page;     // pointer to page/block buffer
Typical PAGESIZE values: 1024, 2048, 4096, 8192

... Single-file Storage Manager 19/68
Storage Manager data structures for opened DBs & Tables
typedef struct DBrec {
  char *dbname; // copy of database name
  int fd; // the database file
  SpaceMap map; // map of free/used areas
  NameTable names; // map names to areas + sizes
} *DB;
typedef struct Relrec {
  char *relname; // copy of table name
  int start; // page index of start of table data
  int npages; // number of pages of table data
...
} *Rel;

Example: Scanning a Relation

With the above disk manager, our example:
select name from Employee
might be implemented as something like
DB db = openDatabase("myDB");
Rel r = openRelation(db,"Employee");
Page buffer = malloc(PAGESIZE*sizeof(char));
for (int i = 0; i < r->npages; i++) {
  PageId pid = r->start+i;
  get_page(db, pid, buffer);
  for each tuple in buffer {
    get tuple data and extract name
    add (name) to result tuples
  }
}

Single-File Storage Manager 21/68
// start using DB, buffer meta-data
DB openDatabase(char *name) {
  DB db = malloc(sizeof(struct DBrec));
  db->dbname = strdup(name);
  db->fd = open(name,O_RDWR);
  db->map = readSpaceTable(db->fd);
  db->names = readNameTable(db->fd);
  return db;
}
// stop using DB and update all meta-data
void closeDatabase(DB db) {
  writeSpaceTable(db->fd,db->map);
  writeNameTable(db->fd,db->names);  // write the name table, not the space map
  fsync(db->fd);
  close(db->fd);
  free(db->dbname);
  free(db);
}
... Single-File Storage Manager 22/68
// set up struct describing relation
Rel openRelation(DB db, char *rname) {
  Rel r = malloc(sizeof(struct Relrec));
  r->relname = strdup(rname);
  // get relation data from map tables
  r->start = ...;
  r->npages = ...;
  return r;
}
// stop using a relation
void closeRelation(Rel r) {
  free(r->relname);
  free(r);
}
... Single-File Storage Manager 23/68
// assume that Page = byte[PageSize]
// assume that PageId = block number in file
// read page from file into memory buffer
void get_page(DB db, PageId p, Page buf) {
  lseek(db->fd, p*PAGESIZE, SEEK_SET);
  read(db->fd, buf, PAGESIZE);
}
// write page from memory buffer to file
void put_page(DB db, PageId p, Page buf) {
  lseek(db->fd, p*PAGESIZE, SEEK_SET);
  write(db->fd, buf, PAGESIZE);
}
... Single-File Storage Manager 24/68
Managing contents of space mapping table can be complex:
// assume an array of (offset,length,status) records
// allocate n new pages
PageId allocate_pages(int n) {
  if (no existing free chunks are large enough) {
    int endfile = lseek(db->fd, 0, SEEK_END);
    addNewEntry(db->map, endfile, n);
  } else {
    grab "worst fit" chunk
    split off unused section as new chunk
  }
  // note that file itself is not changed
}
... Single-File Storage Manager 25/68
Similar complexity for freeing chunks
// drop n pages starting from p
void deallocate_pages(PageId p, int n) {
  if (no adjacent free chunks) {
    markUnused(db->map, p, n);
  } else {
    merge adjacent free chunks
    compress mapping table
  }
  // note that file itself is not changed
}
Changes take effect when closeDatabase() executed.

Exercise 1: Relation Scan Cost 26/68
Consider a table R(x,y,z), implemented as
number of tuples                 r = 10,000
average size of tuples           R = 200 bytes
size of data pages               B = 4096 bytes
time to read one data page       Tr = 10msec
time to check one tuple          1 μsec
time to form one result tuple    1 μsec
time to write one result page    Tw = 10msec
Calculate the total time-cost for answering the query:
insert into S select * from R where x > 10;
if 50% of the tuples satisfy the condition.
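
One way to work through the exercise is sketched below. It rests on several assumptions that the exercise does not state: tuples do not span pages, page header overhead is ignored, result tuples have the same size as input tuples, output is written in whole pages, and no page is already in the buffer pool. The struct and function names are hypothetical, and r = 10,000 is taken from the parameter list.

```c
#include <assert.h>

// Cost parameters for the exercise; all times are in milliseconds.
typedef struct {
    long   r;       // number of tuples
    long   R;       // average tuple size in bytes
    long   B;       // page size in bytes
    double Tr;      // time to read one page
    double Tw;      // time to write one page
    double Tcheck;  // time to check one tuple
    double Tform;   // time to form one result tuple
} ScanParams;

static long ceilDiv(long a, long b) { return (a + b - 1) / b; }

// Estimated time (ms) for: insert into S select * from R where <cond>
// selPercent is the percentage of tuples that satisfy the condition.
double scanInsertCostMs(ScanParams p, long selPercent) {
    assert(p.R > 0 && p.R <= p.B);
    assert(selPercent >= 0 && selPercent <= 100);
    long c    = p.B / p.R;                 // tuples per page
    long b    = ceilDiv(p.r, c);           // pages to read
    long rOut = p.r * selPercent / 100;    // result tuples
    long bOut = ceilDiv(rOut, c);          // pages to write
    return b * p.Tr + p.r * p.Tcheck + rOut * p.Tform + bOut * p.Tw;
}
```

With the given values this is c = 20 tuples per page, 500 pages read, 5,000 result tuples and 250 pages written, so 500×10 + 10,000×0.001 + 5,000×0.001 + 250×10 = 7515 ms, or about 7.5 seconds. Nearly all of that is page I/O, which is why the cost model counts pages read and written.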
