title: 双向链表与DFS的Unix文件储存程序 date: 2021-11-28 12:39:26 tags:
- [链表]
- [DFS] categories:
- [码农日常]
写在前面
这是一篇关于python编写的小型文件储存程序,旨在于模拟Unix下
ls
、cd
等的命令,在整一个中文互联网世界中,很难找到这样的实现方式,但是从数据结构和算法实现的角度来说存在许多价值。
题目
开篇一个题,不得不佩服英文的作业阅读,能够很细节引导学生去实现某一样计算机程序,有一个循序渐进的过程,能够相当于指引一样,并且十分易懂。
This exercise is about object oriented programming in Python. Your task is to use objects to implement a representation of a simple file system. In computer science, a file system is a data structure that the Operating System uses to control how data is stored and retrieved from a hard (or solid state) drive. A file system could be represented as a tree, in which files and directories may be organised as a hierarchy, so that, directories may contain subdirectories. For example, a file gatos.jpg may be under Isaac directory, but that is a subdirectory of the home folder. In file systems, a directory is just a file, but it is a special kind of file that contains a list of files. Here is an example:
Figure 1: An example of a file system as a tree. Directories are represented in circles, plain files in squared boxes
You can test your code with the test in test-fs.py .
1.Implement the following classes to represent a simple file system:
File: everything is a file
PlainFile: a plain file has a name (we are ignoring its contents)
Directory: has a name and a list of files
with constructors (i.e. init methods), so that, we can define the following directory tree (i.e. file system) based on the image above:
>> root = Directory("root",
[PlainFile("boot.exe"), Directory("home",[
Directory("thor", [PlainFile("hunde.jpg"), PlainFile("quatsch.txt")]),
Directory("isaac", [PlainFile("gatos.jpg")])])])
2.Add methods that print (recursively) the entire file system tree (i.e. define appropriate str methods). You don’t need to produce a pretty output
- it is ok if it looks like this:
Directory(root,[PlainFile(boot.exe),Directory(home, [Directory(thor,[PlainFile(hunde.jpg),PlainFile( quatsch.txt)],Directory(isaac,[PlainFile(gatos.jpg)]]]
3.A File may have other attributes such as owner. If not indicated, the owner will be set to “default”. Implement a method chown(new_owner) that will allow you to modify the owner of a file or directory.
>> file = PlainFile("boot.exe")
>> folder = Directory("Downloads",[])
>> print(f'file.owner: {file.owner}; folder: {folder.owner}') file.owner: default; folder: default
>> file.chown("root")
>> folder.chown("isaac")
>> print(f'file.owner: {file.owner}; folder: {folder.owner}') file.owner: root; folder: isaac
4.Implement a method ls() that recursively prints the content of the direc- tory and all the subdirectories, using indentation to represent how deep in the tree structure the file/directory is.
# if we run ls() on the previous object root:
>> root.ls() root
boot.exe home
thor
hunde.jpg quatsch.txt
isaac
gatos.jpg
5.Implement a new class, FileSystem , which will allow us to navigate and manipulate a file system as if it was a UNIX file system. In particular, this class will allow us to keep track of the working directory. It should be initialised as follows:
>> fs = FileSystem(root)
(a)Implement a method pwd() that tells you the current working direc- tory. This might be useful later when moving across directories. It should work like this:
>> fs.pwd() 'root'
(b)Implement ls() for the FileSystem class, so that, fs.ls() would work as question 4, but only printing from the current directory.
(c)Implement a method cd() that will allow you to move to a different directory (changing the working directory). It should work as follows:
# if you try to move to a non existing directory or to a file, # the method should complain:
>> fs.cd("casa")
The directory does not exist!
# But you can move to an existing directory in the working directory.
>> fs.cd("home")
# if we now do ls(), you should only see the content in home:
>> fs.ls() home
thor
hunde.jpg quatsch.txt
isaac
gatos.jpg
Note that our filesystem is case sensitive, if the user searches for “Home”, the method won’t find the folder, because “home” is a dif- ferent folder.
(d)Implement methods to create files create_file(name) and directo- ries mkdir(name) within the working directory. Both methods should make sure that the file or directory to be created doesn’t already ex- ist within the working directory. Directories must be empty when creating them with mkdir(name). The method mkdir may allow you to indicate the owner when creating it, but files will share the owner of the working directory.
(e)Modify the method cd() to allow us to go back to a parent directory. We will indicate the parent directory as “..”. For example:
>> fs.cd("home")
>> fs.pwd() 'home'
>> fs.cd("..")
>> fs.pwd() 'root'
Note that applying fs.cd("..") in a root node with no parent di- rectory will have no effect, but won’t return an error.
(f)Implement a method rm(name) that will allow you to remove a file from the current working directory. A directory can only be deleted if it is empty.
>> fs.rm("home")
Sorry, the directory is not empty
(g)Implement a method find(name) which tries to find a file name in a file system and returns the path to the first occurrence of the file if it finds it but False otherwise. For example:
>> fs.find("gatos.jpg") 'root/home/isaac/gatos.jpg'
>> fs.find("thor") 'root/home/thor'
>> fs.find("unix") False
Note that if you moved deeper in the directory tree using cd(), find(name) should only look from the current working directory.
(h)The UNIX file system has many other operations that you could implement here. Here are some ideas, but feel free to implement any functionality you find useful. Discuss ideas with us, but please add enough comments to understand what you are aiming to achieve.
•chown -R: we have implemented chown() to change the owner of a single file or directory. Add an efficient way to apply this function recursively to all files and sub directories of a folder.
•Files and directories usually have permissions (read, write, and execution). Can you add that and functions to manipulate the permissions? (e.g. chmod).
•Improve the ls() to show the owner, permissions (similar to what ls -l would do in UNIX.
•In UNIX we can also move files from one directory to another using the command mv, indicating the destination PATH.
通过方法
想通过测试,需要走下面的代码并且全部pass才能完成通过程序
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
"""
from file_system import Directory, PlainFile, FileSystem
print("Testing question 1")
# question 1 should allow to create simple files and folders:
file = PlainFile("boot.exe")
folder = Directory("Downloads",[])
root = Directory("root",[PlainFile("boot.exe"),
Directory("home",[
Directory("thor",
[PlainFile("hunde.jpg"),
PlainFile("quatsch.txt")]),
Directory("isaac",
[PlainFile("gatos.jpg")])])])
print("Testing question 2")
# question 2: implement the str
print(root)
"""
Directory(root,[PlainFile(boot.exe),Directory(home,[Directory(thor,[PlainFile(hunde.jpg),PlainFile(quatsch.txt)],Directory(isaac,[PlainFile(gatos.jpg)]]]
"""
print("Testing question 3")
# question 3: test chown()
file = PlainFile("boot.exe")
folder = Directory("Downloads",[])
print(f'file.owner: {file.owner}; folder: {folder.owner}')
file.chown("root")
folder.chown("isaac")
print(f'file.owner: {file.owner}; folder: {folder.owner}')
print("Testing question 4")
#question 4: ls() doesn't return anything but prints.
root.ls()
"""
root
boot.exe
home
thor
hunde.jpg
quatsch.txt
isaac
gatos.jpg
"""
# question 5: create FileSystem
print("Testing question 5a: basic filesystem and pwd")
fs = FileSystem(root)
# 5a:
print(fs.pwd())
print("Testing question 5b: ls in working directory")
# 5b:
fs.ls()
# 5c:
print("Testing question 5c: cd")
# if you try to move to a non existing directory or to a file,
# the method should complain:
fs.cd("casa")
# But you can move to an existing directory in the working directory.
fs.cd("home")
# if we now do ls(), you should only see the content in home:
fs.ls()
# you can't go backwards yet...
print("Testing question 5d: mkdir and create file")
fs = FileSystem(root) # re-initialise fs
fs.mkdir("test") # the owner of the directory should be 'default' as not indicated. fs.mkdir("test","isaac") would set the owner to isaac
fs.cd("test")
fs.create_file("test.txt")
fs.ls()
print("Testing question 5e: dot dot"
# to test this properly, let's create the entire file system using our previous functions!
root = Directory("root",[],owner="root")
fs = FileSystem(root)
fs.create_file("boot.exe") # when creating a file we do not need to indicate owner, it will be the same as the working directory
fs.mkdir("test")
fs.mkdir("home",owner="root")
fs.cd("home")
fs.mkdir("thor",owner="thor")
fs.mkdir("isaac",owner="isaac")
fs.cd("thor")
fs.create_file("hunde.jpg")
fs.create_file("quatsch.txt")
fs.cd("..")
fs.cd("isaac")
fs.create_file("gatos.jpg")
fs.cd("..")
fs.cd("..")
fs.ls()
print("Testing question 5f: rm")
fs.rm("test") # shouldn't work!
fs.cd("test")
fs.rm("test.txt")
fs.cd("..")
fs.rm("test")
fs.ls()
print("Testing question 5e: find")
print(fs.find("gatos.jpg"))
fs.cd("home")
print(fs.find("boot.exe")) # shouldn't find it!
程序编写
初始化PlainFile和Directory类
初始化两个类利用了基础的__init__
方法进行
class File(object):
"""
Everything is a file
"""
def __init__(self, name, owner) -> None:
super().__init__()
self.name = name
self.owner = "default"
class PlainFile(File):
"""
A plain file has a name ignoring its contents
"""
def __init__(self, name, owner="default") -> None:
super().__init__(name, owner)
pass
def chown(self, new_owner) -> None:
"""
Allow to modify the owner of a file or directory.
"""
self.owner = new_owner
class Directory(object):
"""
Includes a name and a list of files, recursively store the folder
"""
def __init__(self, root, args, owner="default") -> None:
super().__init__()
self.root = root
self.sub = args
self.owner = owner
显然这些初始化的过程并不困难,而是在后面的一些查找、遍历、递归当中。
列出文件
def _ls(self, subs, level) -> None:
"""
:params subs, the root Directory.
:return None, print (recursively) the entire file system tree.
"""
iter_text = "".join([" " for _ in range(level)])
_type = type(subs)
if _type == Directory:
print("{}{}".format(iter_text, subs.root))
level += 1
self._ls(subs.sub, level)
elif _type == PlainFile:
print("{}{}".format(iter_text, subs.name))
elif _type == list:
for i in subs:
self._ls(i, level)
# for sub in subs:
# if type(sub) == list:
# self.getrecursive_tree(sub)
文件列出需要定义level
层数,表示递归的层级,并且根据不同数据类型的字典子集,输出不同的数据结果。
文件查找
在文件查找当中,使用到DFS深度优先搜索的方式进行,以递归的方式进行实现。
def _find(self, name, workspace):
"""
Using DFS to find the File or Directory
If found ,return path otherwise return False
"""
subs = workspace.sub
namelist = []
for file in subs:
if type(file) == PlainFile:
namelist.append(file.name)
elif type(file) == Directory:
namelist.append(file.root)
for file in subs:
if name in namelist:
return workspace.root + "/" + name
else:
if type(file) == Directory:
if self._find(name, file):
return workspace.root + "/" + self._find(name, file)
FileSystem文件系统类
整片代码的精华就在于此文件系统类,一开始是根据指引进行编写,根据引导来构成代码程序,后面逐渐弄清楚的文件系统与其内部成员之间的关系,相当于一个双向链表(也类似于一个树形结构)
def __init__(self, workspace:Directory, parentstate = None) -> None:
super().__init__()
self._pwd = workspace.root
self._workspace = workspace
self._parentstate = parentstate
#Use parents state to store "father" status
可见,其内部数据结构由三个成员组成:_pwd
当前目录名称,_wordspace
当前工作目录,_parentstate
父目录,其中当前工作目录的数据类型为字典,用于储存当前环境下的工作路径,而_parentstate
数据类型为FileSystem
,那么文件系统直接的目录,怎么控制cd
进入下级目录和进入上层目录呢?
由于是第一次在类内写链表,也折腾了很久,而且这个并不像是一般类型的链表,比如说Directory
下的数组就不一定都是一个数据类型,其内部还有PlainFile
的数据类型。
其cd
用于控制当前目录的代码为
def cd(self, filename) -> None:
"""
Change the workspace.
"""
if filename == ".." and self._parentstate:
self._workspace = self._parentstate._workspace
self._pwd = self._parentstate._pwd
self._parentstate = self._parentstate._parentstate
else:
subs = self._workspace.sub
namelist = []
for i in subs:
if type(i) == PlainFile:
namelist.append(None)
elif type(i) == Directory:
namelist.append(i.root)
if filename in namelist:
for index, name in enumerate(namelist):
if filename == name:
# self._parentstate = FileSystem()
new_filesys = FileSystem(self._workspace, self._parentstate)
self._parentstate = new_filesys
self._workspace = self._workspace.sub[index]
self._pwd = self._workspace.root
else:
print("The directory does not exist!")
cd
..则代表进入上层目录,而cd加文件夹则代表进入下级目录,那么最关键的问题在于:
进入下级目录之后在目录当中新建文件,再返回上级目录,上级目录是否会包含下级目录的信息呢?还是单单一个新建的对象(地址不同)直接没有联系。
因此需要处理好状态控制,处理好对象与对象之间互相的联系
仔细阅读代码,在查找到下级目录的时候,先新建一个FileSystem
对象,此对象的_parentstate
就是当前目录下的父目录,然后再更新_parentstate
为当前的父目录。
因为FileSystem
的_workspace
当中,其指针地址一致,因此不会出现刚刚那样的问题,而如果要作出简短的总结,则是
FileSystem是用于控制状态的类, 里面含有指向固定地址的指针。
其他示例
#模拟案例
class Content(object):
def __init__(self, data) -> None:
super().__init__()
self.data = data
class Link(object):
def __init__(self,data , pre = None, next = None) -> None:
super().__init__()
self.pre = pre
self.next = next
self.data = data
def ls(self):
self.data = self.pre.data
self.pre = self.pre.pre
def cd(self):
new = Link(self.data, self.pre)
self.pre = new
self.data = self.data.data[2]
link = Link(Content([1,2, Content([3, 4])]))
print(link.data.data)
link.cd()
link.data.data.append(8)
link.ls()
print(link.data.data[2].data)
写在后面
虽然这只是使用python语言实现的一个很简单的文件系统,但是里面包含了如DFS
、双向链表
以及含有使用__str__
重写函数的点,
参考
《回溯算法实现在特定路径下查找文件》
https://blog.csdn.net/leviopku/article/details/121499215
《Python-深入理解递归函数中的return返回值》
https://blog.csdn.net/weixin_40476348/article/details/98602498#:~:text=python%20