# How to Remove, Replace, and Resync a Disk in a Degraded RAID Array

Cherry Servers data centers continuously monitor RAID arrays on servers with pre-installed operating systems. If we detect a degraded or failed RAID array, we will promptly notify the customer and recommend scheduling maintenance to replace the failed disk.

The replacement usually takes around 15-20 minutes and should not lead to data loss. As long as the remaining disks keep working, the array rebuilds the missing data onto the new disk and continues to function.

This guide covers software RAID arrays and applies to various RAID configurations, including RAID 1, RAID 5, and RAID 10. Before you begin, please make sure that “smartmontools” is installed (Step 3 shows how), as it is required to retrieve the serial number of the failed drive.

Furthermore, please contact our dedicated support team and inform them that you will be replacing a degraded RAID disk. As part of the process, they will perform the replacement once the disk has been detached.

## Instructions to Remove, Replace, and Resync a Disk in a Degraded RAID Array
### Step 1: Check the Current RAID Array Status 

Open the terminal and connect to your server.
Check the current RAID array status of your disks, using:
```bash command
cat /proc/mdstat
```
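This command displays the status of your RAID arrays. A faulty drive is marked with (F), or it may have disappeared from the array entirely. In the status map, an underscore marks a missing member, so [2/1] [_U] means that only one of two members is active. You’ll see something like this: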
```bash output
root@ijsvkuoqaf-rufkgeaosw:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb2[2] sda2[0]
244059136 blocks super 1.2 [2/1] [_U]
bitmap: 2/2 pages [8KB], 65536KB chunk
```
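
On a server with several arrays, a degraded one is easy to miss when scanning the output by eye. As a minimal sketch, the helper below prints every array whose status map contains an underscore. The function name `degraded_arrays` is hypothetical (it is not part of mdadm), and the parsing assumes the usual `/proc/mdstat` layout shown above.

```bash
# Print the name of every md array whose status map (e.g. [_U]) has a missing member.
# Reads the file given as $1 and falls back to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1; next }          # remember which array is being described
        name != "" && $NF ~ /^\[[U_]+\]$/ {       # the status map is the last field of the next line
            if ($NF ~ /_/) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling `degraded_arrays` with no argument reads the live `/proc/mdstat`. Empty output means that no array reports a missing member.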

### Step 2: Mark the Faulty Disk as Failed
1. As shown in Step 1, identify the faulty disk by running the following command:
	```bash command
	cat /proc/mdstat
	```
	If you notice that a drive is missing, use the following command to identify the missing disk:
	```bash command
	lsblk
	```
2. Mark the disk as failed with this command. Replace "*/dev/md0*" with your RAID array device and "*/dev/sdb2*" with the faulty disk.
	
	```bash command
	mdadm --manage /dev/md0 --fail /dev/sdb2 
	```
	A confirmation will be shown:
	```bash output
	root@ijsvkuoqaf-rufkgeaosw:~# mdadm --manage /dev/md0 --fail /dev/sdb2
	mdadm: set /dev/sdb2 faulty in /dev/md0
	```
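
Marking the wrong partition as failed on an already degraded array can take the array offline, so it may be worth confirming membership first. The sketch below checks whether a partition is listed under a given array. The name `is_member` is hypothetical, and the check relies only on the `/proc/mdstat` format from Step 1.

```bash
# Succeed (exit 0) if partition $2 (e.g. sdb2) is listed as a member of array $1 (e.g. md0).
# An optional third argument names the file to read instead of /proc/mdstat.
is_member() {
    local array=$1 part=$2 file=${3:-/proc/mdstat}
    awk -v a="$array" -v p="$part" '
        $1 == a && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                dev = $i
                sub(/\[.*/, "", dev)      # strip the [n] suffix and any (F) flag
                if (dev == p) found = 1
            }
        }
        END { exit !found }
    ' "$file"
}

# Example guard (commented out so that nothing runs by accident):
# is_member md0 sdb2 && mdadm --manage /dev/md0 --fail /dev/sdb2
```

If the check fails, the partition is not listed under that array in `/proc/mdstat`, which is also what you would see when the drive has already dropped out of the array.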
	
### Step 3: Install “smartmontools” and Retrieve the Serial Number of the Faulty Disk
1. Install “smartmontools”. 
	
    The instructions to do this using common Linux distributions are:
	- For Debian/Ubuntu operating systems:
	```bash command
	sudo apt-get install smartmontools 
	```
	- For CentOS/RHEL operating systems: 
	```bash command
	sudo yum install smartmontools 
	```
	- For Fedora: 
	```bash command
	sudo dnf install smartmontools 
	```
	- For Arch Linux: 
	```bash command
	sudo pacman -S smartmontools
	```
2. Find the serial number of the faulty disk using the following command, replacing “*/dev/sdb2*” with the faulty disk's device name:
	```bash command
	smartctl -a /dev/sdb2 | grep "Serial Number" 
	```
	You should see the serial number displayed like this:
	```bash output
	root@ijsvkuoqaf-rufkgeaosw:~# smartctl -a /dev/sdb2 | grep "Serial Number"
	Serial Number:    S3YJNC1K903109A
	```
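
When you pass the serial number to support, it helps to have only the value without the label. As a hedged sketch, the filter below reads smartctl output on standard input and prints just the serial. The name `extract_serial` is hypothetical. The label is matched case-insensitively on the assumption that some drive types print it as "Serial number".

```bash
# Read `smartctl -i` or `smartctl -a` output on stdin and print only the serial number.
extract_serial() {
    awk -F': *' 'tolower($1) == "serial number" { print $2; exit }'
}

# Usage (assumes the whole-disk device is /dev/sdb):
# smartctl -i /dev/sdb | extract_serial
```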
### Step 4: Remove the Faulty Disk from the Array
1. Use this command to remove the disk from the array:
	```bash command
	mdadm --manage /dev/md0 --remove /dev/sdb2 
	```
	
	Confirmation will appear as:
	```bash output
	root@ijsvkuoqaf-rufkgeaosw:~# mdadm --manage /dev/md0 --remove /dev/sdb2
	mdadm: hot removed /dev/sdb2 from /dev/md0
	```
	
2. You can verify the disk’s removal using the same command as in Step 1:
	```bash command
	cat /proc/mdstat 
	```
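
If the server has more than one array, partitions of the same physical disk may still be members of another array. The sketch below lists every array member that lives on a given disk, so you can confirm that nothing is left before the drive is pulled. The name `members_on_disk` is hypothetical, and the match assumes partition names of the form `sdb2` or `nvme0n1p2`.

```bash
# For disk $1 (e.g. sdb), print "<array> <partition>" for every md member on that disk.
# An optional second argument names the file to read instead of /proc/mdstat.
members_on_disk() {
    local disk=$1 file=${2:-/proc/mdstat}
    awk -v d="$disk" '
        $1 ~ /^md/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                dev = $i
                sub(/\[.*/, "", dev)                  # strip the [n] suffix and any (F) flag
                rest = substr(dev, length(d) + 1)     # what follows the disk name
                if (index(dev, d) == 1 && rest ~ /^p?[0-9]+$/) print $1, dev
            }
        }
    ' "$file"
}
```

Empty output from `members_on_disk sdb` means that no array lists a partition of that disk any longer. Any line that is printed names an array that still needs the fail and remove steps above.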
	
### Step 5: Contact Support and Power off the Server
1. Contact technical support.

	Please inform our dedicated support team that you have completed all the above steps and the server is ready for maintenance. 
2. Power off the server with: 
	```bash command
	sudo shutdown -h now 
	```
3. Our technical support team will replace the faulty drive with a new one and power the server back on.
### Step 6: Add the New Disk
1. Power on the server if it has not already been powered on. 
2. Copy the partition table from a remaining operational disk to the new disk. 
	
    In the following command, replace */dev/sda* with the healthy disk and */dev/sdb* with the new disk. The order matters: the first device is read and the second is overwritten.
	```bash command
	sfdisk -d /dev/sda | sfdisk /dev/sdb 
	```
3. Verify that the partition table on the new disk matches that of the other disks in the array with:
	```bash command
	fdisk -l
	```
	You should see something similar to this:
	```bash output
	root@ijsvkuoqaf-rufkgeaosw:~# fdisk -l
	Disk /dev/sdb: 232.89 GiB, 250059350016 bytes, 488397168 sectors
	Disk model: Samsung SSD 860
	Units: sectors of 1 * 512 = 512 bytes
	Sector size (logical/physical): 512 bytes / 512 bytes
	I/O size (minimum/optimal): 512 bytes / 512 bytes
	Disklabel type: gpt
	Disk identifier: 8D869982-27C0-42BC-A398-BE04CD523E2F
	
	Device       Start       End   Sectors   Size Type
	/dev/sdb1     2048      4095      2048     1M BIOS boot
	/dev/sdb2     4096 488386559 488382464 232.9G Linux filesystem
	
	Disk /dev/sda: 232.89 GiB, 250059350016 bytes, 488397168 sectors
	Disk model: Samsung SSD 860
	Units: sectors of 1 * 512 = 512 bytes
	Sector size (logical/physical): 512 bytes / 512 bytes
	I/O size (minimum/optimal): 512 bytes / 512 bytes
	Disklabel type: gpt
	Disk identifier: F05BFD55-29C9-4075-B716-195CFFB2398F
	
	Device       Start       End   Sectors   Size Type
	/dev/sda1     2048      4095      2048     1M BIOS boot
	/dev/sda2     4096 488386559 488382464 232.9G Linux filesystem
	
	Disk /dev/md0: 232.75 GiB, 249916555264 bytes, 488118272 sectors
	Units: sectors of 1 * 512 = 512 bytes
	Sector size (logical/physical): 512 bytes / 512 bytes
	I/O size (minimum/optimal): 512 bytes / 512 bytes
	
	```
4. Add the new disk to the RAID array with this command, replacing "*/dev/sdb2*" with the new disk's actual partition name:
	```bash command
	mdadm --manage /dev/md0 --add /dev/sdb2 
	```
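
Comparing the two partition tables by eye in the `fdisk -l` output is error-prone on disks with many partitions. As a rough sketch, the filter below reduces an `sfdisk -d` dump to the start, size, and type of each partition, so two disks can be compared as plain text. The name `layout_of` is hypothetical, and the pattern assumes the `start=..., size=..., type=...` line format that `sfdisk -d` typically emits.

```bash
# Reduce an `sfdisk -d` dump (on stdin) to "start size type" per partition,
# dropping device names and identifiers that can legitimately differ between disks.
layout_of() {
    sed -n 's/^[^ ]* : start= *\([0-9]*\), size= *\([0-9]*\), type=\([^,]*\).*/\1 \2 \3/p'
}

# Usage (device names are examples):
# [ "$(sfdisk -d /dev/sda | layout_of)" = "$(sfdisk -d /dev/sdb | layout_of)" ] && echo "layouts match"
```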
### Step 7: Synchronize the RAID Array
The array starts rebuilding automatically once the new disk has been added. You can check the synchronization progress of your RAID array with:
```bash command
cat /proc/mdstat 
```
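This may take some time depending on the size of the disks and the RAID level. The rebuild is complete when the status map shows every member as up (for example, [UU] for a two-disk array).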
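
To follow the rebuild without reading the full status each time, the progress figure can be pulled out of `/proc/mdstat`. The sketch below prints the array name and percentage for every array that is currently recovering or resyncing. The name `resync_progress` is hypothetical, and the parsing assumes a progress line of the form `recovery = 12.6% (...)`.

```bash
# Print "<array> <percent>" for every array that is currently rebuilding.
# Reads the file given as $1 and falls back to /proc/mdstat.
resync_progress() {
    awk '
        /^md[^ ]* :/ { name = $1 }                    # remember which array is being described
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i         # the field ending in % is the progress
        }
    ' "${1:-/proc/mdstat}"
}
```

Empty output means that no rebuild is in progress. For a live view, `watch -n 5 cat /proc/mdstat` refreshes the full status every five seconds, and `mdadm --wait /dev/md0` blocks until the rebuild on that array has finished.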

These steps should keep downtime for your server to a minimum and maintain data integrity. If you encounter any issues during these steps, please don’t hesitate to contact our dedicated technical support team.