Recovering a RAID array from a Thecus NAS

I had to recover a RAID-5 array from a client’s Thecus NAS which had stopped reading the disks and would no longer even start up properly.

Fortunately this NAS, a Thecus N4100PRO, is basically a Linux server and uses the standard software RAID and LVM formats.

To start with, I pulled the hard drive out of my desktop, plugged in the four drives from the NAS, and booted off a Parted Magic CD.
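
Checking drive health first is quick with smartctl from smartmontools, which Parted Magic includes. A minimal sketch (the device names are just examples and will depend on how the disks enumerate on your machine):

  # Quick SMART health check on each NAS drive; adjust device names
  # to match how the disks show up on your system
  for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
    smartctl -H "$d"
  done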

After confirming that the hard drives were indeed healthy, I started with the following command:

  mdadm --examine --scan --verbose /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2

which gave the following output:

  ARRAY /dev/md1 level=raid5 num-devices=4 UUID=0bb2ce09:500bcf2f:945dc254:7108bfd3
  devices=/dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2

Copy this into /etc/mdadm.conf and save.
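
Rather than copying it by hand, you can redirect the same scan straight into the config file (use >> so you append rather than clobber anything already in there):

  # Append the detected ARRAY definition to mdadm's config file
  mdadm --examine --scan --verbose /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2 >> /etc/mdadm.conf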

Now we try to assemble and start the array…

root@PartedMagic:~# mdadm -Av /dev/md1
mdadm: looking for devices for /dev/md1
mdadm: /dev/sr0 is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sde1 is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sde is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sdd1 is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sdd is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sdc1 is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sdc is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sdb1 is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sdb is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sda1 is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sda is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/loop1 is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/loop0 is not one of /dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2
mdadm: /dev/sdd2 is identified as a member of /dev/md1, slot 3.
mdadm: /dev/sdc2 is identified as a member of /dev/md1, slot 2.
mdadm: /dev/sdb2 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sda2 is identified as a member of /dev/md1, slot 0.
mdadm: added /dev/sdb2 to /dev/md1 as 1
mdadm: added /dev/sdc2 to /dev/md1 as 2
mdadm: added /dev/sdd2 to /dev/md1 as 3
mdadm: added /dev/sda2 to /dev/md1 as 0
mdadm: /dev/md1 has been started with 4 drives.

Great! We have our RAID array assembled and running.

root@PartedMagic:/mnt# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md1 : active raid5 sda2[0] sdd2[3] sdc2[2] sdb2[1]
2924400000 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>

Now we need to mount the file system on the RAID.
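
The mount point needs to exist before anything can be mounted onto it, so create it first if it isn’t already there:

  # Create the mount point for the RAID device
  mkdir -p /media/md1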

  root@PartedMagic:/media/md1# mount /dev/md1 /media/md1
  mount: unknown filesystem type 'LVM2_member'

That didn’t work, because there’s no filesystem sitting directly on the RAID device: the array is an LVM physical volume, and the actual data lives on a logical volume inside it. So the next step is to bring up LVM on top of the md device.
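
If you want to double-check that before reaching for the LVM tools, blkid will tell you what is actually on the assembled array:

  # Confirm what the assembled array contains; for an LVM physical
  # volume this reports TYPE="LVM2_member", matching the error above
  blkid /dev/md1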

Using the LVM tools, we need to find the LV (Logical Volume) and VG (Volume Group) names. First we do a disk scan:

root@PartedMagic:/media/md1# lvmdiskscan
 /dev/ram0    16.00 MiB
 /dev/loop0   35.21 MiB
 /dev/ram1    16.00 MiB
 /dev/loop1   137.79 MiB
 /dev/md1     2.72 TiB LVM physical volume
 /dev/ram2    16.00 MiB
 /dev/ram3    16.00 MiB
 /dev/ram4    16.00 MiB
 /dev/ram5    16.00 MiB
 /dev/ram6    16.00 MiB
 /dev/ram7    16.00 MiB
 /dev/ram8    16.00 MiB
 /dev/ram9    16.00 MiB
 /dev/ram10   16.00 MiB
 /dev/ram11   16.00 MiB
 /dev/ram12   16.00 MiB
 /dev/ram13   16.00 MiB
 /dev/ram14   16.00 MiB
 /dev/ram15   16.00 MiB
 /dev/sde1    232.88 GiB
 0 disks
 19 partitions
 0 LVM physical volume whole disks
 1 LVM physical volume

Now we run lvdisplay:

  root@PartedMagic:/media/md1# lvdisplay
  --- Logical volume ---
  LV Name                /dev/vg0/syslv
  VG Name                vg0
  LV UUID                rGlvdG-JYaD-rjDd-nrOH-zRw1-DIxz-AGKiy8
  LV Write Access        read/write
  LV Status              NOT available
  LV Size                1.00 GiB
  Current LE             512
  Segments               1
  Allocation             inherit
  Read ahead sectors     16384

  --- Logical volume ---
  LV Name                /dev/vg0/lv0
  VG Name                vg0
  LV UUID                MGZbUD-m7vl-g7Dt-JRi5-XPEy-U2AN-f7WoJ5
  LV Write Access        read/write
  LV Status              NOT available
  LV Size                1.29 TiB
  Current LE             676352
  Segments               1
  Allocation             inherit
  Read ahead sectors     16384

And now a vgdisplay just to make sure we’re looking at the right volume group:

  root@PartedMagic:/media/md1# vgdisplay
  --- Volume group ---
  VG Name               vg0
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               2.72 TiB
  PE Size               2.00 MiB
  Total PE              1427929
  Alloc PE / Size       676864 / 1.29 TiB
  Free  PE / Size       751065 / 1.43 TiB
  VG UUID               QfPXRZ-nT9Z-o4rU-OMUS-bvuq-PFbF-H5XMQu

So now I tried to mount the file system using the volume group and logical volume name:

root@PartedMagic:/media/md1# mount /dev/vg0/lv0 /mnt/thecus
mount: special device /dev/vg0/lv0 does not exist

Running an lvscan shows that the volumes are actually still inactive.
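
For reference, the output at that point looks roughly like this (illustrative, reconstructed from the lvdisplay output above rather than captured from the actual session):

  lvscan
    inactive          '/dev/vg0/syslv' [1.00 GiB] inherit
    inactive          '/dev/vg0/lv0' [1.29 TiB] inherit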

The following commands should fix that:

root@PartedMagic:/media/md1# modprobe dm-mod
root@PartedMagic:/media/md1# vgchange -ay
2 logical volume(s) in volume group "vg0" now active

Another run of lvscan shows that the LVs are now active. Now I should be able to mount the file system:

root@Partedmagic:/media/md1# mount /dev/vg0/lv0 /mnt/thecus
root@PartedMagic:/media/md1# cd /mnt/thecus/
root@PartedMagic:/mnt/thecus# ls -al
total 92
drwxr-xr-x 17 root root  4096 Oct 13  2011 ./
drwxr-xr-x  5 root root    60 May 14  2012 ../
drwxrwxrwx  2   99  102  4096 Oct 13  2011 _Module_Folder_/
drwxrwxrwx  2   99  102  4096 Oct 13  2011 _NAS_Module_Source_/
d--x--x--x  2 root root  4096 Oct 13  2011 _NAS_NFS_Exports_/
drwxrwxrwx  3   99  102  4096 Oct 27  2011 _NAS_Picture_/
drwxr-xr-x  4 root root  4096 Aug 27  2009 dlnamedia/
drwxr-xr-x  2 root root  4096 Mar  8 11:34 ftproot/
drwx------  2 root root 16384 Aug 27  2009 lost+found/
drwxr-xr-x  6 root root  4096 Oct 13  2011 module/
drwxrwxrwx  2   99  102  4096 Apr  8  2011 naswebsite/
drwx------  5   99  102  4096 Oct 13  2011 nsync/
drwxrwxrwx  5   99  102  4096 Oct  9  2011 share/
drwxr-xr-x  2 root root  4096 Aug 29  2009 target_usb/
drwxr-xr-x  3 root root  4096 Oct 13  2011 tmp/
drwxrwxrwx  2   99  102  4096 Aug 27  2009 usbcopy/
drwxrwxrwx  3   99  102  4096 Aug 29  2009 usbhdd/

Success! We can see his files and copy them off to another hard drive. There’ll be another happy client tomorrow.
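
For the copy itself, rsync is a good choice since it preserves ownership, permissions and timestamps and can be restarted if it’s interrupted (the destination path here is just an example):

  # Copy the recovered data to another mounted drive; -a preserves
  # permissions/ownership/timestamps, --progress shows status
  rsync -avh --progress /mnt/thecus/ /mnt/backup/thecus-recovery/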

----

Sources:

  • http://serverfault.com/questions/32709/how-do-i-move-a-linux-software-raid-to-a-new-machine
  • http://superuser.com/questions/117824/how-to-get-an-inactive-raid-device-working-again
  • http://pissedoffadmins.com/?p=481
