Commit 835173c

Merge pull request #363 from sspencerwire/rsync_book
Rsync book
2 parents db36ac2 + b926186 commit 835173c

14 files changed: +1499 -0 lines changed
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
---
title: rsync brief description
author: tianci li
contributors: Steven Spencer
update: 2021-11-04
---

# Backup Brief

What is a backup?

A backup is a copy of the data in a file system or database. If an error or disaster occurs, the system's valid data can be restored promptly so that normal operation can resume.

What are the backup methods?

* Full backup: a one-time copy of all files, folders, or data on a hard disk or in a database. (Advantages: the most complete, and data can be recovered faster. Disadvantages: takes up more disk space.)
* Incremental backup: a backup of the data that has changed since the last full or incremental backup. For example: a full backup on the first day; on the second day, a backup of the data added since the full backup; on the third day, a backup of the data added since the second day's backup; and so on.
* Differential backup: a backup of the files that have changed since the last full backup. For example: a full backup on the first day; on the second day, a backup of the data added that day; on the third day, a backup of the data added on the second and third days; on the fourth day, a backup of all the data added from the second day through the fourth day; and so on.
* Selective backup: a backup of only part of the system.
* Cold backup: a backup taken while the system is shut down or in maintenance. The backed-up data is exactly the same as the data in the system at that time.
* Hot backup: a backup taken while the system is running normally. Because the data in the system may be updated at any time, the backup lags somewhat behind the system's live data.
* Remote backup: a backup stored in a different geographic location, to avoid data loss and service interruption caused by fire, natural disaster, theft, and so on.
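
The difference between incremental and differential backups can be illustrated with timestamp marker files and `find -newer`. This is only a sketch: the temporary directory, file names, and marker names are made up for the illustration, and GNU `touch` and `find` are assumed.

```bash
#!/usr/bin/env bash
# Sketch: which files would an incremental vs. a differential backup pick up?
# All paths and marker names are hypothetical and live in a throwaway temp dir.

work=$(mktemp -d)
mkdir "$work/data"

# Day 1: existing data, then a full backup recorded by a marker file.
touch -d '2021-11-01 00:00' "$work/data/day1.txt"
touch -d '2021-11-01 12:00' "$work/full.marker"

# Day 2: new data, then an incremental backup recorded by a second marker.
touch -d '2021-11-02 00:00' "$work/data/day2.txt"
touch -d '2021-11-02 12:00' "$work/last.marker"

# Day 3: more new data.
touch -d '2021-11-03 00:00' "$work/data/day3.txt"

# Incremental: everything changed since the LAST backup of any kind.
incremental=$(find "$work/data" -type f -newer "$work/last.marker" -printf '%f\n' | sort)

# Differential: everything changed since the last FULL backup.
differential=$(find "$work/data" -type f -newer "$work/full.marker" -printf '%f\n' | sort)

echo "incremental on day 3:  ${incremental//$'\n'/ }"
echo "differential on day 3: ${differential//$'\n'/ }"

rm -rf "$work"
```

On day 3 the incremental set contains only the newest file, while the differential set contains everything added since the full backup.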

## rsync in brief

Backing up the first partition of a server to its second partition is commonly known as a "local backup". Tools such as `tar`, `dd`, `dump`, and `cp` can all do this. But it still amounts to putting all your eggs in one basket: if the hardware fails and the machine can no longer boot, the data still cannot be retrieved. To solve this problem with local backups, another kind of backup is introduced: the "remote backup".

Some people will say: can't I just use `tar` or `cp` on the first server and then transfer the result to the second server with `scp` or `sftp`?

In a production environment, the amount of data is relatively large. First, `tar` or `cp` takes a long time and consumes system resources, and transferring via `scp` or `sftp` also uses a lot of network bandwidth, which is not acceptable in a real production environment. Second, these commands have to be entered manually by the administrator, so they need to be combined with scheduled tasks in crontab. However, a suitable crontab interval is hard to choose. If it is too short, for example once every minute, the second run may start before the first has finished. If it is too long, for example once every 5 hours, data may be lost because it was not backed up in time.
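
The overlapping-run problem with short crontab intervals can be worked around with a lock. The following is a minimal sketch, assuming `flock` from util-linux is available; the lock file path and the `backup_job` function are hypothetical placeholders, not part of the original article.

```bash
#!/usr/bin/env bash
# Sketch: keep a cron-driven backup from overlapping with its previous run.
# The lock path and the backup_job function are hypothetical placeholders.

lock_file="${TMPDIR:-/tmp}/backup-demo.lock"

backup_job() {
    # Stand-in for the real tar/cp/scp (or rsync) command.
    echo "backup running"
}

run_exclusively() {
    # Open the lock file on file descriptor 9 for the lifetime of the subshell,
    # and give up immediately (-n) if another run still holds the lock.
    (
        flock -n 9 || { echo "previous run still active, skipping"; exit 1; }
        "$@"
    ) 9>"$lock_file"
}

run_exclusively backup_job
```

A script like this could then be scheduled every minute without two runs ever executing at the same time. It does not address the opposite problem of intervals that are too long.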

Therefore, a production environment needs a backup method that meets the following requirements:

1. Backups transmitted over the network
2. Real-time synchronization of data files
3. Low consumption of system resources and high efficiency

`rsync` appeared to meet these needs. It is a fast incremental backup tool released under the GNU General Public License. At the time of writing, the latest version is 3.2.3 (2020-08-06). You can visit the [official website](https://rsync.samba.org/) for more information.

In terms of platform support, most Unix-like systems are supported, including GNU/Linux and the BSDs. There are also `rsync` ports for Windows, such as cwRsync.

`rsync` was originally maintained by the Australian programmer <font color=red>Andrew Tridgell</font> (shown in Figure 1 below) and is now maintained by <font color=red>Wayne Davison</font> (shown in Figure 2 below). You can find more information at the [GitHub project page](https://github.com/WayneD/rsync).

![Andrew Tridgell](images/Andrew_Tridgell.jpg)
![Wayne Davison](images/Wayne_Davison.jpg)

!!! note "Attention!"
    **rsync itself is only an incremental backup tool and does not provide real-time data synchronization; another program is needed to supplement it. In addition, synchronization is one-way. If you want two-way synchronization, you need another tool.**

### Basic Principles and Features

How does `rsync` achieve efficient one-way data synchronization?

The core of `rsync` is its **checksum algorithm**. If you are interested, see [Rsync Working Principle](https://rsync.samba.org/how-rsync-works.html) and [rsync Algorithm](https://rsync.samba.org/tech_report/). A detailed explanation is beyond what the author can cover here.
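
For readers who want a feel for the idea, the weak "rolling" checksum described in the rsync technical report can be sketched in a few lines of bash. This is an illustration under stated assumptions, not rsync's actual source code: the function names and sample data are made up, input is assumed to be ASCII, and the strong checksum that rsync uses to confirm a match is omitted.

```bash
#!/usr/bin/env bash
# Sketch of the weak rolling checksum from the rsync technical report.
# Function names and the sample data are illustrative, not rsync's source code.

M=65536  # 2^16, the modulus used in the report

# Print the byte value of the character at offset $2 of string $1 (ASCII only).
byte_at() {
    printf '%d' "'${1:$2:1}"
}

# Compute both sums from scratch for the window of length $3 at offset $2.
# Prints "a b", where a is the plain sum and b the position-weighted sum.
weak_sum() {
    local data=$1 start=$2 len=$3 a=0 b=0 i x
    for ((i = 0; i < len; i++)); do
        x=$(byte_at "$data" $((start + i)))
        a=$(( (a + x) % M ))
        b=$(( (b + (len - i) * x) % M ))
    done
    echo "$a $b"
}

# Slide the window one byte to the right using only the old sums,
# the byte leaving the window ($4) and the byte entering it ($5).
roll() {
    local a=$1 b=$2 len=$3 old=$4 new=$5
    a=$(( (a - old + new + M) % M ))
    b=$(( ((b - len * old + a) % M + M) % M ))
    echo "$a $b"
}

data="the quick brown fox"
read -r a b <<<"$(weak_sum "$data" 0 8)"
read -r ra rb <<<"$(roll "$a" "$b" 8 "$(byte_at "$data" 0)" "$(byte_at "$data" 8)")"

echo "rolled:   $ra $rb"
echo "directly: $(weak_sum "$data" 1 8)"
```

Both lines print the same pair of numbers. Being able to move the window by one byte at constant cost is what lets the sender search a file for blocks the receiver already has.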

The features of `rsync` are:

* Entire directories can be updated recursively.
* File attributes such as hard links, symbolic links, owner, group, permissions, and modification time can be preserved selectively; you can keep all of them or only some.
* Two transports are supported: the SSH protocol and the rsync protocol.
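
On the command line, the transport is chosen by how the remote side is written: a single colon after the host selects a remote shell such as SSH, while a double colon or an `rsync://` URL contacts an rsync daemon. The helper below is a simplified sketch of that rule; the function name, host names, and module name are hypothetical, and edge cases (for example, local paths containing a colon) are ignored.

```bash
#!/usr/bin/env bash
# Sketch: classify an rsync source/destination argument by its syntax.
# transport_of and the example hosts/modules are hypothetical.

transport_of() {
    case $1 in
        rsync://*) echo "rsync daemon" ;;        # rsync://host/module/path
        *::*)      echo "rsync daemon" ;;        # host::module/path
        *:*)       echo "remote shell (ssh)" ;;  # user@host:/path
        *)         echo "local" ;;               # plain path
    esac
}

transport_of 'user@backup.example.com:/srv/data'
transport_of 'user@backup.example.com::share/data'
transport_of 'rsync://backup.example.com/share/data'
transport_of '/srv/data'
```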
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
---
title: rsync brief description
author: tianci li
update: 2021-11-04
---
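
# Backup Brief

What is a backup?

A backup is a copy of the data in a file system or database. If an error or disaster occurs, the system's valid data can be restored promptly and conveniently so that normal operation can resume.

What are the backup methods?

* Full backup: a one-time copy of all files, folders, or data on a hard disk or in a database. (Advantages: the most complete, and data can be recovered faster. Disadvantages: takes up more disk space.)
* Incremental backup: a backup of the data that has changed since the last full or incremental backup. For example: a full backup on the first day; on the second day, a backup of the data added since the full backup; on the third day, a backup of the data added since the second day's backup; and so on.
* Differential backup: a backup of the files that have changed since the last full backup. For example: a full backup on the first day; on the second day, a backup of the data added that day; on the third day, a backup of the data added on the second and third days; on the fourth day, a backup of all the data added from the second day through the fourth day; and so on.
* Selective backup: a backup of only part of the system.
* Cold backup: a backup taken while the system is shut down or in maintenance. The backed-up data is exactly the same as the data in the system at that time.
* Hot backup: a backup taken while the system is running normally. Because the data in the system may be updated at any time, the backup lags somewhat behind the system's live data.
* Remote backup: a backup stored in a different geographic location, to avoid data loss and service interruption caused by fire, natural disaster, theft, and so on.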
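
## rsync in brief

Backing up the first partition of a server to its second partition is commonly known as a "local backup". Tools such as `tar`, `dd`, `dump`, and `cp` can all do this. But it still amounts to putting all your eggs in one basket: if the hardware fails and the machine can no longer boot, the data still cannot be retrieved. To solve this problem with local backups, another kind of backup is introduced: the "remote backup".

Some people will say: can't I just use `tar` or `cp` on the first server and then transfer the result to the second server with `scp` or `sftp`?

In a production environment, the amount of data is relatively large. First, `tar` or `cp` takes a long time and consumes system resources, and transferring via `scp` or `sftp` also uses a lot of network bandwidth, which is not acceptable in a real production environment. Second, these commands have to be entered manually by the administrator, so they need to be combined with scheduled tasks in crontab. However, a suitable crontab interval is hard to choose. If it is too short, for example once every minute, the second run may start before the first has finished. If it is too long, for example once every 5 hours, data may be lost because it was not backed up in time.

Therefore, a production environment needs a backup method that meets the following requirements:

1. Backups transmitted over the network
2. Real-time synchronization of data files
3. Low consumption of system resources and high efficiency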
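
`rsync` appeared to meet these needs. It is a fast incremental backup tool released under the GNU General Public License. At the time of writing, the latest version is 3.2.3 (2020-08-06). You can visit the [official website](https://rsync.samba.org/) for more information.

In terms of platform support, most Unix-like systems are supported, including GNU/Linux and the BSDs. There are also `rsync` ports for Windows, such as cwRsync.

`rsync` was originally maintained by the Australian programmer <font color=red>Andrew Tridgell</font> (shown in Figure 1 below) and is now maintained by <font color=red>Wayne Davison</font> (shown in Figure 2 below). You can find more information at the [GitHub project page](https://github.com/WayneD/rsync).

![Andrew Tridgell](images/Andrew_Tridgell.jpg)
![Wayne Davison](images/Wayne_Davison.jpg)

!!! note "Attention!"
    **rsync itself is only an incremental backup tool and does not provide real-time data synchronization; another program is needed to supplement it. In addition, synchronization is one-way. If you want two-way synchronization, you need another tool.**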
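
### Basic Principles and Features

How does `rsync` achieve efficient one-way data synchronization?

The core of `rsync` is its **checksum algorithm**. If you are interested, see [Rsync Working Principle](https://rsync.samba.org/how-rsync-works.html) and [rsync Algorithm](https://rsync.samba.org/tech_report/). A detailed explanation is beyond what the author can cover here.

The features of `rsync` are:

* Entire directories can be updated recursively.
* File attributes such as hard links, symbolic links, owner, group, permissions, and modification time can be preserved selectively; you can keep all of them or only some.
* Two transports are supported: the SSH protocol and the rsync protocol.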
Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
---
title: rsync demo 01
author: tianci li
contributors: Steven Spencer
update: 2021-11-04
---

# Preface

`rsync` must authenticate the user before synchronizing data. **There are two ways to authenticate: over the SSH protocol or over the rsync protocol (the rsync protocol's default port is 873).**

* SSH protocol login: user authentication is based on the SSH protocol (that is, it uses the GNU/Linux system's own users and passwords), after which the data is synchronized.
* rsync protocol login: user authentication uses the rsync protocol (with users that are not GNU/Linux system users, similar to vsftpd virtual users), after which the data is synchronized.

Before the demonstration, make sure the `rsync` command is available. In Rocky Linux 8, the rsync RPM package is installed by default; the version is 3.1.3-12, as shown below:

```bash
[root@Rocky ~]# rpm -qa | grep rsync
rsync-3.1.3-12.el8.x86_64
```

```txt
Basic format: rsync [options] source destination
Commonly used options:
-a: archive mode; recursive and preserves file attributes; equivalent to -rlptgoD (does not include -H, -A, -X)
-v: show detailed information about the synchronization process
-z: compress file data during the transfer
-H: preserve hard links
-A: preserve ACLs
-X: preserve extended attributes (xattrs)
-r: recursive mode; includes all files in the directory and its subdirectories
-l: copy symbolic links as symbolic links
-p: preserve permissions
-t: preserve modification times
-g: preserve group
-o: preserve owner (super-user only)
-D: preserve device files and other special files
```

The author typically uses: `rsync -avz source destination`
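
To see what archive mode preserves, it can be tried locally before involving a second machine. The sketch below copies between two throwaway directories created with `mktemp`; the file names are made up, and GNU `stat` and `touch` are assumed.

```bash
#!/usr/bin/env bash
# Sketch: observe what -a preserves, using two throwaway local directories.
# File names are hypothetical; nothing outside the temp dirs is touched.

src=$(mktemp -d)
dst=$(mktemp -d)

echo "hello" > "$src/file.txt"
chmod 640 "$src/file.txt"
touch -d '2021-11-04 10:00' "$src/file.txt"
ln -s file.txt "$src/link"

# Trailing slash on the source: copy the directory's contents,
# not the directory itself.
rsync -a "$src/" "$dst/"

# Permissions (%a) and modification time (%Y) should match on both sides,
# and the symbolic link should still be a link pointing at file.txt.
stat -c '%a %Y %n' "$src/file.txt" "$dst/file.txt"
readlink "$dst/link"
```

Running the same copy with `-r` instead of `-a` is a quick way to see the difference: the permissions and timestamps of the copy are then no longer kept in sync with the source.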

## Environment Description

|Item|Description|
|---|---|
| Rocky Linux 8 (server) | 192.168.100.4/24 |
| Fedora 34 (client) | 192.168.100.5/24 |

You can use Fedora 34 to upload and download:

```mermaid
graph LR;
RockyLinux8-->|pull/download|Fedora34;
Fedora34-->|push/upload|RockyLinux8;
```

You can also use Rocky Linux 8 to upload and download:

```mermaid
graph LR;
RockyLinux8-->|push/upload|Fedora34;
Fedora34-->|pull/download|RockyLinux8;
```

## Demonstration based on SSH protocol

!!! tip "Attention!"
    Here, both Rocky Linux 8 and Fedora 34 are logged in as the root user. Fedora 34 is the client and Rocky Linux 8 is the server.

### pull/download

Since the transfer is based on the SSH protocol, first create a user on the server:

```bash
[root@Rocky ~]# useradd testrsync
[root@Rocky ~]# passwd testrsync
```

On the client, pull (download) the file. The file on the server is /rsync/aabbcc:

```bash
[root@fedora ~]# rsync -avz testrsync@192.168.100.4:/rsync/aabbcc /root
testrsync@192.168.100.4's password:
receiving incremental file list
aabbcc
sent 43 bytes received 85 bytes 51.20 bytes/sec
total size is 0 speedup is 0.00
[root@fedora ~]# cd
[root@fedora ~]# ls
aabbcc
```

The transfer was successful.

!!! tip "Attention"
    If the server's SSH port is not the default 22, you can specify the port like this: `rsync -avz -e 'ssh -p [port]'`.
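
As a concrete illustration of the port option, the sketch below builds the full pull command for this demo's user, server, and file. Port 2222 is an invented example value, not something configured in the article; the command is only printed, not executed.

```bash
#!/usr/bin/env bash
# Sketch: pull over SSH when sshd listens on a non-default port.
# Port 2222 is a made-up example; user, host and paths come from this demo.

ssh_port=2222
cmd=(rsync -avz -e "ssh -p ${ssh_port}" testrsync@192.168.100.4:/rsync/aabbcc /root)

# Print the command with shell quoting so it can be reviewed first.
printf '%q ' "${cmd[@]}"
echo

# To actually run it: "${cmd[@]}"
```

Keeping the command in an array preserves the quoting of the `-e` argument, which must reach `rsync` as a single word.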

### push/upload

```bash
[root@fedora ~]# touch fedora
[root@fedora ~]# rsync -avz /root/* testrsync@192.168.100.4:/rsync/
testrsync@192.168.100.4's password:
sending incremental file list
anaconda-ks.cfg
fedora
rsync: mkstemp "/rsync/.anaconda-ks.cfg.KWf7JF" failed: Permission denied (13)
rsync: mkstemp "/rsync/.fedora.fL3zPC" failed: Permission denied (13)
sent 760 bytes received 211 bytes 277.43 bytes/sec
total size is 883 speedup is 0.91
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1330) [sender=3.2.3]
```

**Permission denied: how do we deal with it?**

First check the permissions of the /rsync/ directory. The testrsync user clearly has no write permission there, so we can use `setfacl` to grant it:

```bash
[root@Rocky ~]# ls -ld /rsync/
drwxr-xr-x 2 root root 4096 November 2 15:05 /rsync/
```

```bash
[root@Rocky ~]# setfacl -m u:testrsync:rwx /rsync/
[root@Rocky ~]# getfacl /rsync/
getfacl: Removing leading '/' from absolute path names
# file: rsync/
# owner: root
# group: root
user::rwx
user:testrsync:rwx
group::r-x
mask::rwx
other::r-x
```

Try again. This time it succeeds:

```bash
[root@fedora ~]# rsync -avz /root/* testrsync@192.168.100.4:/rsync/
testrsync@192.168.100.4's password:
sending incremental file list
anaconda-ks.cfg
fedora
sent 760 bytes received 54 bytes 180.89 bytes/sec
total size is 883 speedup is 1.08
```
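
The failed push earlier reported `code 23`, which is also the exit status `rsync` returns to the shell. In a script, that status can be checked instead of reading the output by eye. The helper below is a small sketch; the function name is hypothetical, and only a few of the exit values listed in the rsync man page are mapped.

```bash
#!/usr/bin/env bash
# Sketch: turn a few documented rsync exit codes into readable messages.
# describe_rsync_exit is a hypothetical helper name.

describe_rsync_exit() {
    case $1 in
        0)  echo "success" ;;
        23) echo "partial transfer due to error" ;;
        24) echo "partial transfer due to vanished source files" ;;
        *)  echo "rsync failed with exit code $1 (see EXIT VALUES in rsync(1))" ;;
    esac
}

# Typical use right after a transfer (not executed here):
#   rsync -avz /root/* testrsync@192.168.100.4:/rsync/
#   describe_rsync_exit $?

describe_rsync_exit 23
```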
