Cherry Servers

How to Remove, Replace, and Resync a Disk in a Degraded RAID Array

Cherry Servers data centers continuously monitor RAID arrays for servers with pre-installed operating systems. If we detect that a RAID array has degraded or failed, we promptly notify the customer and recommend scheduling maintenance to replace the failed disk.

The replacement process usually takes around 15-20 minutes and should not lead to data loss. As long as the remaining disks continue to function, the array rebuilds its data onto the new disk and keeps operating.

This guide covers the process for a software-type RAID array, and applies to various RAID configurations, including RAID 1, RAID 5, and RAID 10. Before you begin, please ensure that you have “smartmontools” installed, as it is required to retrieve the serial number of the failed drive.

Furthermore, please contact our dedicated support team and inform them that you will be replacing a degraded RAID disk. As part of the process, they will perform the replacement once the disk has been detached.

#Instructions to Remove, Replace, and Resync a Disk in a Degraded RAID Array

#Step 1: Check the Current RAID Array Status

Open the terminal and connect to your server. Check the current RAID array status of your disks, using:

```bash command
cat /proc/mdstat
```

This command will display the status of your RAID arrays. The faulty drive could be marked as (F) or might have disappeared from the RAID array. You’ll see something like this:

```bash output
root@ijsvkuoqaf-rufkgeaosw:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb2[2] sda2[0]
      244059136 blocks super 1.2 [2/1] [_U]
      bitmap: 2/2 pages [8KB], 65536KB chunk
```
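Reading the status field by eye works for one array, but it can be scripted when several arrays are present. The following is a minimal sketch, not part of the official procedure: `degraded_arrays` is a hypothetical helper name, and it assumes array names of the form `mdN` and a status field such as `[_U]`, where an underscore marks a missing member.

```shell
# Print the name of every md array whose status field (e.g. [_U])
# contains an underscore, meaning at least one member is missing.
# $1: path to an mdstat-format file (defaults to /proc/mdstat)
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        /blocks/ && match($0, /\[[U_]+\]/) {     # status field such as [UU] or [_U]
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling `degraded_arrays` with no argument reads the live `/proc/mdstat`; an empty result means no array reported a missing member.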

#Step 2: Mark the Faulty Disk as Failed

  1. As shown before, you can identify the faulty disk by running the following command:

    Command Line
    cat /proc/mdstat

    If you notice that a drive is missing, use the following command to identify the missing disk:

    Command Line
    lsblk

  2. Mark the disk as failed with this command. Replace "/dev/md0" with your RAID array device and "/dev/sdb2" with the faulty disk.

```bash command
mdadm --manage /dev/md0 --fail /dev/sdb2 
```
A confirmation will be shown:
```bash output
root@ijsvkuoqaf-rufkgeaosw:~# mdadm --manage /dev/md0 --fail /dev/sdb2
mdadm: set /dev/sdb2 faulty in /dev/md0
```
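Before marking a partition as failed, it can help to confirm that it really belongs to the array you are about to change. The sketch below is an illustration only: `array_members` is a hypothetical helper, and it assumes members appear in `/proc/mdstat` as `name[slot]`, optionally followed by a flag such as `(F)`.

```shell
# List the member devices of one md array, one per line.
# $1: array name without /dev/ (e.g. md0)
# $2: path to an mdstat-format file (defaults to /proc/mdstat)
array_members() {
    awk -v arr="$1" '
        $1 == arr && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\[[0-9]+\]/) {     # looks like sdb2[2] or sdb2[2](F)
                    sub(/\[.*/, "", $i)      # drop the [slot] and any flag
                    print $i
                }
            }
        }
    ' "${2:-/proc/mdstat}"
}
```

For example, `array_members md0` should list `sdb2` among its output before you run the `--fail` command against `/dev/sdb2`.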

#Step 3: Install “smartmontools” and Retrieve the Serial Number of the Faulty Disk

  1. Install “smartmontools”.

    The instructions to do this using common Linux distributions are:

    • For Debian/Ubuntu operating systems:
    Command Line
    sudo apt-get install smartmontools 
    
    • For CentOS/RHEL operating systems:
    Command Line
    sudo yum install smartmontools 
    
    • For Fedora:
    Command Line
    sudo dnf install smartmontools 
    
    • For Arch Linux:
    Command Line
    sudo pacman -S smartmontools
    
  2. Find the serial number of the faulty disk using the following command, and replace “/dev/sdb2” with the faulty disk's device name:

    Command Line
    smartctl -a /dev/sdb2 | grep "Serial Number" 
    

    You should see the serial number displayed like this:

    Output
    root@ijsvkuoqaf-rufkgeaosw:~# smartctl -a /dev/sdb2 | grep "Serial Number"
    Serial Number:    S3YJNC1K903109A
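If you want only the serial value, for example to paste it into a support ticket, the label can be stripped with a small filter. This is a sketch under assumptions: `extract_serial` is a hypothetical helper, and it assumes the `Serial Number:` label shown above (some drives print `Serial number:` instead, which the pattern also accepts).

```shell
# Read smartctl output on stdin and print only the serial number value.
extract_serial() {
    awk -F': *' '/^Serial [Nn]umber:/ { print $2; exit }'
}
```

Typical use would be `smartctl -a /dev/sdb2 | extract_serial`, with the device name replaced by your faulty disk.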
    

#Step 4: Remove the Faulty Disk from the Array

  1. Use this command to remove the disk from the array:

    Command Line
    mdadm --manage /dev/md0 --remove /dev/sdb2 
    

    Confirmation will appear as:

    Output
    root@ijsvkuoqaf-rufkgeaosw:~# mdadm --manage /dev/md0 --remove /dev/sdb2
    mdadm: hot removed /dev/sdb2 from /dev/md0
    
  2. You can verify the disk’s removal using the same command we used at the beginning:

    Command Line
    cat /proc/mdstat 
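The `--fail` and `--remove` commands from Steps 2 and 4 always act on the same array and partition, so some administrators wrap them together. The following sketch is one possible way to do that and is not part of the official procedure: `detach_member` and the `DRY_RUN` variable are hypothetical names. With `DRY_RUN` set, the commands are only printed, which lets you review them first.

```shell
# Mark a member as failed and then remove it from the array.
# $1: array device (e.g. /dev/md0); $2: member partition (e.g. /dev/sdb2)
# Set DRY_RUN to any non-empty value to print the commands instead of running them.
detach_member() {
    dm_array=$1
    dm_member=$2
    dm_run=${DRY_RUN:+echo}    # "echo" when DRY_RUN is set, empty otherwise
    $dm_run mdadm --manage "$dm_array" --fail "$dm_member" &&
    $dm_run mdadm --manage "$dm_array" --remove "$dm_member"
}
```

For example, `DRY_RUN=1 detach_member /dev/md0 /dev/sdb2` prints both commands without touching the array. Retrieve the serial number (Step 3) before running it for real, so the information is on hand for support.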
    

#Step 5: Contact Support and Power off the Server

  1. Contact technical support.

    Please inform our dedicated support team that you have completed all the above steps and the server is ready for maintenance.

  2. Power off the server with:

    Command Line
    sudo shutdown -h now 
    
  3. Our technical support team will replace the faulty drive with a new one and power the server back on.

#Step 6: Add the New Disk

  1. Power on the server if it has not already been powered on.

  2. Copy the partition table from a remaining operational disk to the new disk.

    In the following command, /dev/sda is the good disk (the source) and /dev/sdb is the new disk (the target). Replace both with your actual disk names, and double-check the order, because the target disk's partition table is overwritten.

    Command Line
    sfdisk -d /dev/sda | sfdisk /dev/sdb 
    
  3. Verify that the partition table on the new disk matches that of the other disks in the array with:

    Command Line
    fdisk -l 
    

    You should see something similar to this:

    Output
    root@ijsvkuoqaf-rufkgeaosw:~# fdisk -l
    Disk /dev/sdb: 232.89 GiB, 250059350016 bytes, 488397168 sectors
    Disk model: Samsung SSD 860
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: 8D869982-27C0-42BC-A398-BE04CD523E2F
    
    Device       Start       End   Sectors   Size Type
    /dev/sdb1     2048      4095      2048     1M BIOS boot
    /dev/sdb2     4096 488386559 488382464 232.9G Linux filesystem
    
    Disk /dev/sda: 232.89 GiB, 250059350016 bytes, 488397168 sectors
    Disk model: Samsung SSD 860
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: F05BFD55-29C9-4075-B716-195CFFB2398F
    
    Device       Start       End   Sectors   Size Type
    /dev/sda1     2048      4095      2048     1M BIOS boot
    /dev/sda2     4096 488386559 488382464 232.9G Linux filesystem
    
    Disk /dev/md0: 232.75 GiB, 249916555264 bytes, 488118272 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    
    
  4. Add the new disk to the RAID array with this command, replacing "/dev/sdb2" with the new disk's actual partition name:

    Command Line
    mdadm --manage /dev/md0 --add /dev/sdb2 
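As a cross-check on item 3 above, best run before adding the partition in item 4, the two partition tables can be compared by script instead of by eye. This is a minimal sketch: `partition_layout` is a hypothetical helper, and it assumes `sfdisk -d` prints one line per partition in the form `/dev/sda1 : start= 2048, size= 2048, type=...`. Device names and UUIDs are ignored so that only start, size, and type are compared.

```shell
# Read an `sfdisk -d` dump on stdin and print "start size type" per partition,
# dropping device names, UUIDs, and header lines.
partition_layout() {
    sed -n 's/^[^:]* : start= *\([0-9]*\), size= *\([0-9]*\), type=\([^,]*\).*/\1 \2 \3/p'
}
```

A comparison could then look like `[ "$(sfdisk -d /dev/sda | partition_layout)" = "$(sfdisk -d /dev/sdb | partition_layout)" ] && echo "layouts match"`, with the disk names replaced by your own.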
    

#Step 7: Synchronize the RAID Array

You may check the synchronization status of your RAID array using the command:

Command Line
cat /proc/mdstat 

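Synchronization may take some time depending on the size of the disks and the RAID level. While it runs, the output includes a progress line for the array; once it finishes, all members are shown as up (for example, [UU]).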
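To pull just the progress figure out of the status output, a short filter is enough. The sketch below is illustrative only: `rebuild_progress` is a hypothetical helper, and it assumes the kernel reports progress on a line containing `recovery = N%` or `resync = N%` beneath the array name.

```shell
# Print "<array>: recovery = N%" (or resync) for each array that is rebuilding.
# $1: path to an mdstat-format file (defaults to /proc/mdstat)
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                        # remember the current array
        match($0, /(recovery|resync) *= *[0-9.]+%/) {      # progress line for that array
            print name ": " substr($0, RSTART, RLENGTH)
        }
    ' "${1:-/proc/mdstat}"
}
```

Running `rebuild_progress` repeatedly shows the percentage climb; no output means no array is currently rebuilding.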

These steps should keep downtime for your server to a minimum and maintain data integrity. If you encounter any issues during these steps, please don’t hesitate to contact our dedicated technical support team.
