《计算机英语》 Unit 4 Information Management 信息管理

Section A Information Storage 信息存储

1. The Importance of Information 信息的重要性


reside         vi属于,驻留

tablet        n平板电脑

laptop        n笔记本电脑

repository        n仓库

claim        n索赔

regulatory        a法规的,监管的

contractual        a合同的

obligation        n责任,义务

Information is increasingly important in our daily lives. We have become information-dependent in the 21st century, living in an on-command, on-demand world, which means that we need information when and where it is required.

We access the Internet every day to perform searches, participate in social networking, send and receive e-mails, share pictures and videos, and use scores of other applications.

Equipped with a growing number of content-generating devices, individuals now create more information than organizations do. Information created by individuals gains value when shared with others.

2. What is data 什么是数据


binary         n二进制

digital        a数字的

DBMS(Database Management System)        数据库管理系统

query        v查询

retrieve        v检索

Data is a collection of raw facts from which conclusions may be drawn.

Before the advent of computers, the methods adopted for data creation and sharing were limited to fewer forms, such as paper and film.

Today, the same data can be converted into more convenient forms, such as an e-mail message, an e-book, a digital image, or a digital movie.

This data can be generated using a computer and stored as strings of binary numbers (0s and 1s).

Data in this form is called digital data and is accessible by the user only after a computer processes it.
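As a rough illustration of "strings of binary numbers", the sketch below turns a short piece of text into 0s and 1s and back again. It assumes UTF-8 encoding, and the helper names `to_bits` and `from_bits` are made up for this example; real storage systems may encode data differently.

```python
def to_bits(text: str) -> str:
    """Encode text as UTF-8 and render each byte as eight binary digits."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Reverse of to_bits: parse each 8-bit group into a byte, then decode."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


bits = to_bits("Hi")
print(bits)             # 01001000 01101001
print(from_bits(bits))  # Hi
```

The bit string is unreadable to a person until a program decodes it, which is the sense in which digital data is accessible only after a computer processes it.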

Data can be classified as structured or unstructured based on how it is stored and managed.

Structured data is organized in rows and columns in a rigidly defined format so that applications can retrieve and process it efficiently.

Structured data is typically stored using a database management system (DBMS).

Data is unstructured if its elements cannot be stored in rows and columns, which makes it difficult for applications to query and retrieve.

For example, customer contacts may be stored in various forms, such as sticky notes, e-mail messages, business cards, or even digital format files such as .doc, .txt, and .pdf.

Due to its unstructured nature, it is difficult to retrieve this data using a traditional customer relationship management application.
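The contrast between structured and unstructured data can be sketched with a small example. The version below uses Python's built-in `sqlite3` module as a stand-in for a DBMS; the `contacts` table, its columns, and the sample notes are all invented for illustration.

```python
import sqlite3

# Structured: fixed columns, so a query can target one field directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, city TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [("Alice", "Boston", "555-0100"), ("Bob", "Denver", "555-0101")],
)
structured_hits = [
    row[0]
    for row in conn.execute("SELECT name FROM contacts WHERE city = ?", ("Boston",))
]

# Unstructured: free text with no fields, so the program can only scan
# every note for a substring and cannot tell a city from any other word.
notes = [
    "Met Alice at the Boston expo, call her at 555-0100",
    "Bob (Denver office?) left a business card",
]
unstructured_hits = [note for note in notes if "Boston" in note]

print(structured_hits)
print(unstructured_hits)
```

The structured query returns exactly the name field it asked for, while the text scan returns whole notes that a person or a further program still has to interpret.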

A vast majority of new data being created today is unstructured.

The industry is challenged to develop new architectures, technologies, techniques, and skills to store, manage, analyze, and derive value from unstructured data from numerous sources.

3. Evolution of Storage Architecture 存储架构的演变


storage         n存储

term        vt把……称为

mainframe        n主机,大型机

tape reel        磁带卷

disk pack        磁盘组

affordability        n可购性,成本合理性

deployment        n部署

maintenance        n维护,维修

consolidate        v整合

leverage        n杠杆;v利用

Storage devices 存储设备

  • a media card in a cell phone or digital camera
  • DVDs, CD-ROMs, and disk drives in personal computers
      DVD        abbr.(=digital video disc, 或 digital versatile disc)
      CD-ROM        abbr.(=compact disc read-only memory)
  • internal hard disks, external disk arrays, and tapes

Storage architecture 存储架构

  • server-centric storage architecture  以服务器为中心的存储架构

In earlier implementations of open systems, the storage was typically internal to the server. These storage devices could not be shared with any other servers.

In this architecture, each server has a limited number of storage devices, and any administrative tasks, such as maintenance of the server or increasing storage capacity, might result in unavailability of information.

  • information-centric architecture 以信息为中心的架构

In this architecture, storage devices are managed centrally and independently of servers. These centrally managed storage devices are shared with multiple servers.

When a new server is deployed in the environment, storage is assigned to that server from the same shared storage devices.

The capacity of shared storage can be increased dynamically by adding more storage devices without impacting information availability.
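The idea of an information-centric architecture can be sketched as a shared pool that servers draw capacity from. The class below is a hypothetical toy model, not a real storage API: `SharedStoragePool`, its methods, and the sizes in gigabytes are all assumptions made for illustration.

```python
class SharedStoragePool:
    """Hypothetical model of centrally managed storage shared by many servers."""

    def __init__(self):
        self.capacity_gb = 0
        self.allocations = {}  # server name -> gigabytes assigned

    def add_device(self, size_gb):
        # Growing the pool does not touch any existing allocation.
        self.capacity_gb += size_gb

    def free_gb(self):
        return self.capacity_gb - sum(self.allocations.values())

    def assign(self, server, size_gb):
        if size_gb > self.free_gb():
            raise ValueError("not enough free capacity in the pool")
        self.allocations[server] = self.allocations.get(server, 0) + size_gb


pool = SharedStoragePool()
pool.add_device(500)
pool.assign("server-a", 300)
pool.add_device(500)          # capacity grows while server-a keeps its storage
pool.assign("server-b", 400)  # a newly deployed server uses the same pool
print(pool.free_gb())         # 300
```

In a server-centric design each server would own its own devices, so the second `add_device` call would help only one server; here any server can be given the new capacity.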

4. Storage networking technologies 存储网络技术


SAN(Storage Area Network) 存储区域网络

fibre        n光纤

Gb        千兆位,吉比特(gigabit的缩写;千兆字节 gigabyte 的缩写为 GB)

scalable        a可扩展的

robust        a强健的,鲁棒的

NAS(Network Attached Storage)网络连接存储

seamlessly        adv无缝地

Object-based Storage         基于对象的存储

flat        a扁平的,单一的

  • SAN(Storage Area Network) 存储区域网络
  • NAS(Network Attached Storage)网络连接存储
  • Object-based Storage 基于对象的存储

More than 90% of the data being generated is unstructured. Traditional solutions are inefficient at handling this growth.

These challenges demanded a smarter approach to managing unstructured data based on its content.

Object-based storage is a way to store file data in the form of objects in a flat address space, based on its content and attributes rather than its name and location.

Figure 4A-4 displays the key components of an object-based storage device.
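A minimal sketch of the object-based storage idea follows. It assumes that the object ID is a SHA-256 hash of the content, which is one common way to derive an ID from content but is not stated in the text; the `ObjectStore` class and its methods are hypothetical.

```python
import hashlib


class ObjectStore:
    """Toy flat-address-space store: objects are found by a content-derived ID."""

    def __init__(self):
        # One flat mapping of object ID -> (data, metadata); no folders or paths.
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        object_id = hashlib.sha256(data).hexdigest()  # assumed ID scheme
        self._objects[object_id] = (data, metadata)
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id][0]

    def attributes(self, object_id: str) -> dict:
        return self._objects[object_id][1]


store = ObjectStore()
oid = store.put(b"quarterly report", owner="alice", content_type="text/plain")
same = store.put(b"quarterly report", owner="alice", content_type="text/plain")
print(oid == same)     # identical content yields the same object ID
print(store.get(oid))
```

Note that the caller never supplies a file name or directory: the object is retrieved by its ID, and descriptive information travels with it as attributes.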

5. Challenges of Storage 存储的挑战


data science         数据科学

simultaneously        adv同时地

provision        v为...提供物品

  • data science 数据科学
  • data center 数据中心
  • virtualization and cloud computing 虚拟化和云计算


  1. Although the majority of information is created by individuals, it is stored and managed by a relatively small number of organizations.

  2. Data is a collection of raw facts from which conclusions may be drawn.

  3. This data can be generated using a computer and stored as strings of binary numbers (0s and 1s).

  4. Data can be classified as structured or unstructured based on how it is stored and managed. 数据可以根据其存储和管理方式被分类为结构化或非结构化。

  5. Structured data is typically stored using a database management system (DBMS). 结构化数据通常使用数据库管理系统(DBMS)来存储。

  6. In information-centric architecture, storage devices are managed centrally and independently of servers.

  7. A data center is a facility that contains storage, compute, network, and other IT resources to provide centralized data-processing capabilities.

  8. Cloud infrastructure is usually built upon virtualized data centers, which provide resource pooling and rapid provisioning of resources.


  • data center - 数据中心
  • binary - 二进制
  • digital - 数字的
  • data science - 数据科学
  • DBMS (Database Management System) - 数据库管理系统
  • mainframe - 主机(大型计算机)
  • tape reel - 磁带卷
  • disk pack - 磁盘组
  • SAN (Storage Area Network) - 存储区域网络
  • Fibre Channel (FC) - 光纤通道
  • scalable - 可扩展的
  • robust - 健壮的、可靠的
  • NAS (Network Attached Storage) - 网络连接存储
  • object-based storage - 基于对象的存储

Section B Data Mining 数据挖掘


interrogation         n询问

data warehouse        数据仓库

lieu        n代替,场所

statistics        n统计学

machine learning        机器学习

neural network        神经网络

cluster analysis         聚类分析

association analysis        关联分析

outlier analysis        孤立点分析

deviation        n偏离

sequential pattern analysis        序列模式分析

empirical        a经验的

bioinformatics        n生物信息学

genomics        n基因组学

biometrics        n生物统计学,生物识别技术

coincidence        n巧合,一致

ethical        a道德的,伦理的


  1. A rapidly expanding subject that is closely associated with database technology is data mining, which consists of techniques for discovering patterns in collections of data.

  2. Data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.

  3. Several types of analytical software in data mining are available: statistical, machine learning, and neural networks.
    数据挖掘中有几种类型的分析软件: 统计学、机器学习和神经网络

  4. Association analysis involves looking for links between data groups.

  5. Outlier analysis tries to identify data entries that do not comply with the norm.

  6. Data mining encompasses a vast number of ethical issues involving the rights of individuals represented in the data warehouse.
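Outlier analysis, as described in the sentences above, can be sketched with a simple statistical rule. The example below flags values that lie more than two standard deviations from the mean; this z-score rule, the threshold of 2.0, and the sample data are assumptions chosen for illustration, and practical data mining tools use a variety of other methods.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds
    `threshold` population standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:  # all values identical, so nothing deviates from the norm
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


daily_orders = [10, 11, 9, 10, 12, 11, 10, 95]
print(find_outliers(daily_orders))  # [95]
```

Whether a flagged entry is an error or a genuinely unusual event is a question the analysis itself cannot answer; it only identifies entries that do not comply with the norm.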


  • data mining - 数据挖掘
  • knowledge discovery in data (KDD) - 数据中的知识发现
  • data warehouses - 数据仓库
  • machine learning - 机器学习
  • neural networks - 神经网络
  • cluster analysis - 聚类分析:一种将数据集中的对象分组的统计方法,使得同一组内的对象比其他组的对象更相似。

  • association analysis - 关联分析:一种用于发现大数据集中变量之间有趣关系的方法,常见的应用包括市场篮子分析。

  • outlier analysis - 孤立点分析:一种用于识别数据集中异常值或离群点的分析方法,这些点可能代表了测量误差、数据录入错误或真实的变异。

  • sequential pattern analysis - 序列模式分析:一种分析数据集中的序列信息,以发现项目之间有意义的时序关联模式的方法。
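As a small illustration of association analysis in the market-basket setting mentioned above, the sketch below computes two standard measures, support and confidence, for a rule such as "bread implies milk". The function names and the sample baskets are invented for this example.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent): support of both item sets
    together divided by support of the antecedent alone.
    Assumes the antecedent appears in at least one transaction."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # 0.5
print(confidence(baskets, {"bread"}, {"milk"}))  # about 0.667
```

Here bread and milk appear together in two of four baskets, and two of the three baskets containing bread also contain milk, so the rule holds in roughly two thirds of the cases where it applies.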





