[Solved] RAID6 state clean, degraded and md127 since the upgrade to 20.04

webangel · 31/01/2021, 00:17

Good evening. I have a NAS with a RAID6 array of seven 3 TB disks. It was running Ubuntu 18.04 (the desktop edition, not the server one) and had worked fine until now.
I gave it a bit of a clean-up: dusted the inside, replaced the dead CPU fan with a new one (Noctua), and then upgraded to Ubuntu 20.04.
When the PC rebooted, the array came up in state clean, degraded, and it had moved from /dev/md0 to /dev/md127...

Here is what I have investigated so far.

sudo mdadm --detail /dev/md0

mdadm: cannot open /dev/md0: No such file or directory

sudo mdadm --detail /dev/md127
/dev/md127:
           Version : 1.2
     Creation Time : Fri Mar 22 14:27:12 2013
        Raid Level : raid6
        Array Size : 14650670080 (13971.97 GiB 15002.29 GB)
     Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
      Raid Devices : 7
     Total Devices : 6
       Persistence : Superblock is persistent

       Update Time : Sat Jan 30 23:01:37 2021
             State : clean, degraded 
    Active Devices : 6
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

              Name : SRVNAS3T:0  (local to host SRVNAS3T)
              UUID : 0e2234e3:8853b712:5eb5db08:2afa36ea
            Events : 8757

    Number   Major   Minor   RaidDevice State
       6       8       17        0      active sync   /dev/sdb1
       7       8        1        1      active sync   /dev/sda1
       -       0        0        2      removed
       3       8       49        3      active sync   /dev/sdd1
       4       8       65        4      active sync   /dev/sde1
       9       8       81        5      active sync   /dev/sdf1
       8       8       97        6      active sync   /dev/sdg1

I checked the physical connection between sdc, the disk shown as removed, and the motherboard: both the SATA cable and the power cable.
Then I ran this:

cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md127 : active raid6 sda1[7] sdf1[9] sdb1[6] sdd1[3] sde1[4] sdg1[8]
      14650670080 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UU_UUUU]
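
The `[7/6] [UU_UUUU]` part is what shows the array is degraded: each `U` is a member that is up and each `_` is a missing one. As a rough sketch (the function name `degraded_arrays` is made up for illustration), that check could be scripted like this:

```shell
# Print the name of every md array whose status map contains a missing
# member ("_"). Reads text in /proc/mdstat format on standard input.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1; next }               # remember the array name
        $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }        # status map with at least one "_"
    '
}

# Typical use (assumption: run on the NAS itself):
#   degraded_arrays < /proc/mdstat
```

With the output above this would print md127; with a healthy `[UUUUUUU]` map it prints nothing.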

Then I looked at the boot logs:

dmesg |grep md
[    0.000000] Linux version 5.4.0-65-generic (buildd@lcy01-amd64-018) (gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)) #73-Ubuntu SMP Mon Jan 18 17:25:17 UTC 2021 (Ubuntu 5.4.0-65.73-generic 5.4.78)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-65-generic root=UUID=9981d402-314f-4d3e-b6e4-554d69d2d188 ro quiet splash nomdmonddf nomdmonisw
[    0.046716] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-65-generic root=UUID=9981d402-314f-4d3e-b6e4-554d69d2d188 ro quiet splash nomdmonddf nomdmonisw
[    1.128769] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[    1.129625] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[    1.129637] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[    1.245380] ata7: PATA max UDMA/100 cmd 0xc800 ctl 0xc400 bmdma 0xb400 irq 17
[    1.245381] ata8: PATA max UDMA/100 cmd 0xc000 ctl 0xb800 bmdma 0xb408 irq 17
[    1.254869] ata9: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xff00 irq 14
[    1.254871] ata10: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xff08 irq 15
[    2.348438] md/raid:md127: device sda1 operational as raid disk 1
[    2.348440] md/raid:md127: device sdf1 operational as raid disk 5
[    2.348441] md/raid:md127: device sdb1 operational as raid disk 0
[    2.348442] md/raid:md127: device sdd1 operational as raid disk 3
[    2.348443] md/raid:md127: device sde1 operational as raid disk 4
[    2.348443] md/raid:md127: device sdg1 operational as raid disk 6
[    2.351734] md/raid:md127: raid level 6 active with 6 out of 7 devices, algorithm 2
[    2.351777] md127: detected capacity change from 0 to 15002286161920
[    4.947417] systemd[1]: Inserted module 'autofs4'
[    5.114384] systemd[1]: systemd 245.4-4ubuntu3.4 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
[    5.114513] systemd[1]: Detected architecture x86-64.
[    5.144731] systemd[1]: Set hostname to <SRVNAS3T>.
[    6.062277] systemd-sysv-generator[306]: stat() failed on /etc/init.d/screen-cleanup, ignoring: No such file or directory
[    7.395401] systemd[1]: /lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
[    7.897994] systemd[1]: /lib/systemd/system/rpc-statd.service:16: PIDFile= references a path below legacy directory /var/run/, updating /var/run/rpc.statd.pid → /run/rpc.statd.pid; please update the unit file accordingly.
[    8.246938] systemd[1]: Created slice system-modprobe.slice.
[    8.247280] systemd[1]: Created slice system-systemd\x2dfsck.slice.
[    8.247512] systemd[1]: Created slice User and Session Slice.
[    8.247578] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[    8.247828] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[    8.247921] systemd[1]: Reached target User and Group Name Lookups.
[    8.247952] systemd[1]: Reached target Slices.
[    8.254821] systemd[1]: Listening on RPCbind Server Activation Socket.
[    8.267185] systemd[1]: Listening on Syslog Socket.
[    8.267356] systemd[1]: Listening on fsck to fsckd communication Socket.
[    8.267430] systemd[1]: Listening on initctl Compatibility Named Pipe.
[    8.267621] systemd[1]: Listening on Journal Audit Socket.
[    8.267699] systemd[1]: Listening on Journal Socket (/dev/log).
[    8.267819] systemd[1]: Listening on Journal Socket.
[    8.267939] systemd[1]: Listening on udev Control Socket.
[    8.267997] systemd[1]: Listening on udev Kernel Socket.
[    8.269348] systemd[1]: Mounting Huge Pages File System...
[    8.271099] systemd[1]: Mounting POSIX Message Queue File System...
[    8.272783] systemd[1]: Mounting NFSD configuration filesystem...
[    8.274360] systemd[1]: Mounting RPC Pipe File System...
[    8.276119] systemd[1]: Mounting Kernel Debug File System...
[    8.278083] systemd[1]: Mounting Kernel Trace File System...
[    8.284812] systemd[1]: Starting Journal Service...
[    8.285017] systemd[1]: Condition check resulted in Kernel Module supporting RPCSEC_GSS being skipped.
[    8.287176] systemd[1]: Starting Set the console keyboard layout...
[    8.288845] systemd[1]: Starting Create list of static device nodes for the current kernel...
[    8.290329] systemd[1]: Starting Load Kernel Module drm...
[    8.348769] systemd[1]: Condition check resulted in Set Up Additional Binary Formats being skipped.
[    8.348820] systemd[1]: Condition check resulted in File System Check on Root Device being skipped.
[    8.490377] systemd[1]: Starting Load Kernel Modules...
[    8.491836] systemd[1]: Starting Remount Root and Kernel File Systems...
[    8.493300] systemd[1]: Starting udev Coldplug all Devices...
[    8.495066] systemd[1]: Starting Uncomplicated firewall...
[    8.497566] systemd[1]: Mounted Huge Pages File System.
[    8.497803] systemd[1]: Mounted POSIX Message Queue File System.
[    8.497942] systemd[1]: Mounted Kernel Debug File System.
[    8.498071] systemd[1]: Mounted Kernel Trace File System.
[    8.498720] systemd[1]: Finished Create list of static device nodes for the current kernel.
[    8.611947] systemd[1]: Finished Remount Root and Kernel File Systems.
[    8.706010] systemd[1]: Condition check resulted in Rebuild Hardware Database being skipped.
[    8.706092] systemd[1]: Condition check resulted in Platform Persistent Storage Archival being skipped.
[    8.707224] systemd[1]: Starting Load/Save Random Seed...
[    8.708627] systemd[1]: Starting Create System Users...
[    8.710948] systemd[1]: Mounted RPC Pipe File System.
[    8.711168] systemd[1]: modprobe@drm.service: Succeeded.
[    8.711525] systemd[1]: Finished Load Kernel Module drm.
[    8.712001] systemd[1]: Finished Uncomplicated firewall.
[    8.713245] systemd[1]: Starting pNFS block layout mapping daemon...
[    8.916102] systemd[1]: nfs-blkmap.service: Can't open PID file /run/blkmapd.pid (yet?) after start: Operation not permitted
[    8.916443] systemd[1]: Started pNFS block layout mapping daemon.
[    8.939302] systemd[1]: Mounted NFSD configuration filesystem.
[    9.071699] systemd[1]: Finished Set the console keyboard layout.
[    9.113813] systemd[1]: Started Journal Service.
[    9.184511] systemd-journald[322]: Received client request to flush runtime journal.
[   15.330408] EDAC amd64: Node 0: DRAM ECC disabled.
[   15.330410] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
[   15.366185] EDAC amd64: Node 0: DRAM ECC disabled.
[   15.366187] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
[   17.198352] EXT4-fs (md127): mounted filesystem with ordered data mode. Opts: (null)

Then a check with fdisk:

sudo fdisk -l
Disque /dev/sde : 2,75 TiB, 3000592982016 octets, 5860533168 secteurs
Disk model: WDC WD30EZRX-00D
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 4096 octets
taille d'E/S (minimale / optimale) : 4096 octets / 4096 octets
Type d'étiquette de disque : gpt
Identifiant de disque : E2950BFE-7DBE-4D9B-A8B5-2F6F7A8D68F3

Périphérique Début        Fin   Secteurs Taille Type
/dev/sde1     2048 5860533134 5860531087   2,7T Système de fichiers Linux


Disque /dev/sda : 2,75 TiB, 3000592982016 octets, 5860533168 secteurs
Disk model: WDC WD30EZRX-00D
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 4096 octets
taille d'E/S (minimale / optimale) : 4096 octets / 4096 octets
Type d'étiquette de disque : gpt
Identifiant de disque : EF597E99-2271-4D1A-AEEB-493268CE5243

Périphérique Début        Fin   Secteurs Taille Type
/dev/sda1     2048 5860533134 5860531087   2,7T Système de fichiers Linux


Disque /dev/sdb : 2,75 TiB, 3000592982016 octets, 5860533168 secteurs
Disk model: WDC WD30EZRX-00D
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 4096 octets
taille d'E/S (minimale / optimale) : 4096 octets / 4096 octets
Type d'étiquette de disque : gpt
Identifiant de disque : 7DD97423-76D4-48B4-A7A8-ED7992C9290F

Périphérique Début        Fin   Secteurs Taille Type
/dev/sdb1     2048 5860533134 5860531087   2,7T Système de fichiers Linux


Disque /dev/sdg : 2,75 TiB, 3000592982016 octets, 5860533168 secteurs
Disk model: ST3000DM001-1ER1
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 4096 octets
taille d'E/S (minimale / optimale) : 4096 octets / 4096 octets
Type d'étiquette de disque : gpt
Identifiant de disque : 30B0320F-DFD1-4DED-8C14-01EEA2B78BED

Périphérique Début        Fin   Secteurs Taille Type
/dev/sdg1     2048 5860533134 5860531087   2,7T Système de fichiers Linux


Disque /dev/sdc : 2,75 TiB, 3000592982016 octets, 5860533168 secteurs
Disk model: WDC WD30EZRX-00D
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 4096 octets
taille d'E/S (minimale / optimale) : 4096 octets / 4096 octets
Type d'étiquette de disque : gpt
Identifiant de disque : E164A1CB-8653-49BE-97B3-FEE25D8C9CE9

Périphérique Début        Fin   Secteurs Taille Type
/dev/sdc1     2048 5860533134 5860531087   2,7T Système de fichiers Linux


Disque /dev/sdd : 2,75 TiB, 3000592982016 octets, 5860533168 secteurs
Disk model: WDC WD30EZRX-00D
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 4096 octets
taille d'E/S (minimale / optimale) : 4096 octets / 4096 octets
Type d'étiquette de disque : gpt
Identifiant de disque : 00E55F67-3BAD-466A-9A29-5D6384B905A8

Périphérique Début        Fin   Secteurs Taille Type
/dev/sdd1     2048 5860533134 5860531087   2,7T Système de fichiers Linux


Disque /dev/sdf : 2,75 TiB, 3000592982016 octets, 5860533168 secteurs
Disk model: WDC WD30EZRX-00D
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 4096 octets
taille d'E/S (minimale / optimale) : 4096 octets / 4096 octets
Type d'étiquette de disque : gpt
Identifiant de disque : 40D67A90-D4C5-4C06-B871-F8922E0E3727

Périphérique Début        Fin   Secteurs Taille Type
/dev/sdf1     2048 5860533134 5860531087   2,7T Système de fichiers Linux


Disque /dev/sdh : 465,78 GiB, 500107862016 octets, 976773168 secteurs
Disk model: SAMSUNG HD501LJ 
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 512 octets
taille d'E/S (minimale / optimale) : 512 octets / 512 octets
Type d'étiquette de disque : dos
Identifiant de disque : 0x3ed43ed3

Périphérique Amorçage     Début       Fin Secteurs Taille Id Type
/dev/sdh1    *               63  97659134 97659072  46,6G 83 Linux
/dev/sdh2              97659135 144536804 46877670  22,4G  5 Étendue
/dev/sdh5              97659198 105466724  7807527   3,7G 82 partition d'échange Linux / Solaris
/dev/sdh6             105466788 144536804 39070017  18,6G 83 Linux


Disque /dev/md127 : 13,66 TiB, 15002286161920 octets, 29301340160 secteurs
Unités : secteur de 1 × 512 = 512 octets
Taille de secteur (logique / physique) : 512 octets / 4096 octets
taille d'E/S (minimale / optimale) : 524288 octets / 2621440 octets

Then a check with parted:

sudo parted -l
[sudo] Mot de passe de jeff : 
Modèle : ATA WDC WD30EZRX-00D (scsi)
Disque /dev/sda : 3001GB
Taille des secteurs (logiques/physiques) : 512B/4096B
Table de partitions : gpt
Drapeaux de disque : 

Numéro  Début   Fin     Taille  Système de fichiers  Nom               Drapeaux
 1      1049kB  3001GB  3001GB                       Linux filesystem


Modèle : ATA WDC WD30EZRX-00D (scsi)
Disque /dev/sdb : 3001GB
Taille des secteurs (logiques/physiques) : 512B/4096B
Table de partitions : gpt
Drapeaux de disque : 

Numéro  Début   Fin     Taille  Système de fichiers  Nom               Drapeaux
 1      1049kB  3001GB  3001GB                       Linux filesystem


Modèle : ATA WDC WD30EZRX-00D (scsi)
Disque /dev/sdc : 3001GB
Taille des secteurs (logiques/physiques) : 512B/4096B
Table de partitions : gpt
Drapeaux de disque : 

Numéro  Début   Fin     Taille  Système de fichiers  Nom               Drapeaux
 1      1049kB  3001GB  3001GB                       Linux filesystem


Modèle : ATA WDC WD30EZRX-00D (scsi)
Disque /dev/sdd : 3001GB
Taille des secteurs (logiques/physiques) : 512B/4096B
Table de partitions : gpt
Drapeaux de disque : 

Numéro  Début   Fin     Taille  Système de fichiers  Nom               Drapeaux
 1      1049kB  3001GB  3001GB                       Linux filesystem


Modèle : ATA WDC WD30EZRX-00D (scsi)
Disque /dev/sde : 3001GB
Taille des secteurs (logiques/physiques) : 512B/4096B
Table de partitions : gpt
Drapeaux de disque : 

Numéro  Début   Fin     Taille  Système de fichiers  Nom               Drapeaux
 1      1049kB  3001GB  3001GB                       Linux filesystem


Modèle : ATA WDC WD30EZRX-00D (scsi)
Disque /dev/sdf : 3001GB
Taille des secteurs (logiques/physiques) : 512B/4096B
Table de partitions : gpt
Drapeaux de disque : 

Numéro  Début   Fin     Taille  Système de fichiers  Nom               Drapeaux
 1      1049kB  3001GB  3001GB                       Linux filesystem


Modèle : Grappe RAID logiciel Linux (md)
Disque /dev/md127 : 15,0TB
Taille des secteurs (logiques/physiques) : 512B/4096B
Table de partitions : loop
Drapeaux de disque : 

Numéro  Début  Fin     Taille  Système de fichiers  Drapeaux
 1      0,00B  15,0TB  15,0TB  ext4


Modèle : ATA ST3000DM001-1ER1 (scsi)
Disque /dev/sdg : 3001GB
Taille des secteurs (logiques/physiques) : 512B/4096B
Table de partitions : gpt
Drapeaux de disque : 

Numéro  Début   Fin     Taille  Système de fichiers  Nom               Drapeaux
 1      1049kB  3001GB  3001GB                       Linux filesystem


Modèle : ATA SAMSUNG HD501LJ (scsi)
Disque /dev/sdh : 500GB
Taille des secteurs (logiques/physiques) : 512B/512B
Table de partitions : msdos
Drapeaux de disque : 

Numéro  Début   Fin     Taille  Type      Système de fichiers  Drapeaux
 1      32,3kB  50,0GB  50,0GB  primary   ext4                 démarrage
 2      50,0GB  74,0GB  24,0GB  extended
 5      50,0GB  54,0GB  3997MB  logical   linux-swap(v1)
 6      54,0GB  74,0GB  20,0GB  logical   ext4

Then:

sudo update-initramfs -v -u

Then:

sudo reboot

Thanks in advance, any help is welcome.

Edit: there is this solution proposed in another post on the forum: Link to a forum post with a fix for md127

So can it be applied in my case? Except that my sdc is in the removed state...
Edit1: I ran a quick smartctl:

sudo smartctl -a /dev/sdc
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T1804212
LU WWN Device Id: 5 0014ee 6adb36684
Firmware Version: 80.00A80
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jan 31 01:14:16 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		(39540) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 397) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   197   180   021    Pre-fail  Always       -       5141
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       123
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   111   001   000    Old_age   Always       -       23761
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67279
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       122
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9809
194 Temperature_Celsius     0x0022   121   098   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       316
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       313
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       318

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      1743         1574048

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I don't know if this means anything to you? I'm getting a bit lost...

Edit2: Apparently no significant errors on this disk, so I am going to add it back into the RAID6 array

sudo mdadm --add /dev/md127 /dev/sd[c]
mdadm: added /dev/sdc

cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md127 : active raid6 sdc[10] sdf1[9] sde1[4] sdd1[3] sdb1[6] sda1[7] sdg1[8]
      14650670080 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UU_UUUU]
      [>....................]  recovery =  3.4% (100648336/2930134016) finish=499.1min speed=94485K/sec

All that is left is to wait roughly eight and a half hours for the rebuild to finish.
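
To turn the `finish=...min` field of the recovery line into hours and minutes, a small helper along these lines could be used. It is only a sketch: `rebuild_eta` is a made-up name, and the estimate printed by the kernel changes as the rebuild speed varies.

```shell
# Read /proc/mdstat-style text on standard input and print the estimated
# time remaining for a running recovery as hours and minutes.
rebuild_eta() {
    awk 'match($0, /finish=[0-9.]+min/) {
        # Drop the leading "finish=" (7 chars) and the trailing "min" (3 chars).
        mins = substr($0, RSTART + 7, RLENGTH - 10)
        printf "%dh%02dmin\n", int(mins / 60), int(mins % 60)
    }'
}

# Typical use (assumption: a rebuild is in progress):
#   rebuild_eta < /proc/mdstat
```

For the `finish=499.1min` shown above this gives 8h19min.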

sudo mdadm --detail /dev/md127
/dev/md127:
           Version : 1.2
     Creation Time : Fri Mar 22 14:27:12 2013
        Raid Level : raid6
        Array Size : 14650670080 (13971.97 GiB 15002.29 GB)
     Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
      Raid Devices : 7
     Total Devices : 7
       Persistence : Superblock is persistent

       Update Time : Sun Jan 31 14:53:59 2021
             State : clean, degraded, recovering 
    Active Devices : 6
   Working Devices : 7
    Failed Devices : 0
     Spare Devices : 1

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

    Rebuild Status : 0% complete

              Name : SRVNAS3T:0  (local to host SRVNAS3T)
              UUID : 0e2234e3:8853b712:5eb5db08:2afa36ea
            Events : 8776

    Number   Major   Minor   RaidDevice State
       6       8       17        0      active sync   /dev/sdb1
       7       8        1        1      active sync   /dev/sda1
      10       8       32        2      spare rebuilding   /dev/sdc
       3       8       49        3      active sync   /dev/sdd1
       4       8       65        4      active sync   /dev/sde1
       9       8       81        5      active sync   /dev/sdf1
       8       8       97        6      active sync   /dev/sdg1

I will look into getting md127 back to md0 afterwards...

Last edited by webangel (06/03/2021, 17:19)

geole · 31/01/2021, 15:29

Hello
There is no need to panic if the RAID now shows up as md127 instead of md0.
The simplest thing is to accept it, adjusting your own scripts if you have any that access the array.
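
For anyone who would still rather get /dev/md0 back, a commonly used approach is to declare the array in /etc/mdadm/mdadm.conf and rebuild the initramfs. The sketch below is an assumption about this setup and not something tested in this thread; `pin_md_name` is a made-up helper that only rewrites the device name in an ARRAY line such as the one printed by `mdadm --detail --scan`.

```shell
# Rewrite the device name of ARRAY lines read on standard input,
# keeping the rest of each line (metadata, name, UUID) untouched.
pin_md_name() {
    sed -E "s#^ARRAY[[:space:]]+[^[:space:]]+#ARRAY $1#"
}

# Possible use (assumption; review the result before touching the config):
#   sudo mdadm --detail --scan | pin_md_name /dev/md0
#   ...add the printed line to /etc/mdadm/mdadm.conf, then:
#   sudo update-initramfs -u
```

The new name would only take effect at the next reboot, and anything that mounts the array by device name has to match it.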

The cause of the disk being ejected is here:

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 316

There are 316 sectors that cannot be read, which is enough to prevent a RAID array from bringing the disk back into service automatically.
I also noted this counter:
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
The disk has never managed to reallocate a single one of them.

More important for choosing how to repair:
9 Power_On_Hours 0x0032 008 008 000 Old_age Always - 67279
This counter says that after 67279 hours of operation the disk is statistically worn to (100-8) = 92%.
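
To put the raw value in perspective, it can be converted to years of continuous operation. A minimal sketch, assuming 8760 hours per year (365 days, leap years ignored); `hours_to_years` is a made-up name:

```shell
# Convert a Power_On_Hours raw value to years, rounded to one decimal.
hours_to_years() {
    awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 8760 }'
}

hours_to_years 67279
```

For 67279 hours this prints 7.7.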

I do not advise reformatting it and putting it back into service; simply replace it with a brand new one.

You could take the opportunity to post the smartctl status of the other disks, because they were often installed at the same time and are the same model, and are therefore at the same level of risk.

Addendum. I see you have decided to re-add it without formatting. When the rebuild has finished, run a new smartctl report: if there are still more than zero pending sectors, the disk will be ejected from the RAID again at the next shutdown.
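
That check after the rebuild can be reduced to reading the raw value of attribute 197. A small sketch, assuming the attribute table layout shown in the reports in this thread, with the raw value in the tenth column; `pending_sectors` is a made-up name:

```shell
# Print the raw value of SMART attribute 197 (Current_Pending_Sector)
# from smartctl attribute output read on standard input.
pending_sectors() {
    awk '$1 == 197 { print $10 }'
}

# Possible use (assumption: the seven array members are sda to sdg):
#   for d in /dev/sd[a-g]; do
#       printf '%s: ' "$d"; sudo smartctl -A "$d" | pending_sectors
#   done
```

Any value above zero on a disk means it still has sectors it could not read.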

Last edited by geole (31/01/2021, 15:36)

webangel · 31/01/2021, 16:19

Hello geole, thanks a lot for your help. Well spotted on the pending errors and the Power_On_Hours of 67279. The NAS has indeed been running since 2012, which makes about 9 years without trouble.
I will have to think about investing in a replacement disk. Here is the smartctl output for the other disks, to see whether any of them need replacing as well.

sudo smartctl -a /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T1829451
LU WWN Device Id: 5 0014ee 6585e5eb7
Firmware Version: 80.00A80
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jan 31 15:49:36 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(39600) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 398) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       54
  3 Spin_Up_Time            0x0027   199   180   021    Pre-fail  Always       -       5016
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       121
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   007   007   000    Old_age   Always       -       68524
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       120
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9813
194 Temperature_Celsius     0x0022   116   101   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       17

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

sudo smartctl -a /dev/sdb
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T1023287
LU WWN Device Id: 5 0014ee 0ae2b5cec
Firmware Version: 80.00A80
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jan 31 16:03:33 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(40380) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 405) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   195   174   021    Pre-fail  Always       -       5233
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       123
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   007   007   000    Old_age   Always       -       68545
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       121
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       67
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9998
194 Temperature_Celsius     0x0022   116   098   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

sudo smartctl -a /dev/sdd
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T1757538
LU WWN Device Id: 5 0014ee 6adb38582
Firmware Version: 80.00A80
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jan 31 16:04:52 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(40980) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 411) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   197   179   021    Pre-fail  Always       -       5116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       122
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   007   007   000    Old_age   Always       -       68524
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       121
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       67
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9943
194 Temperature_Celsius     0x0022   112   094   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

sudo smartctl -a /dev/sde
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T1839677
LU WWN Device Id: 5 0014ee 6585e287c
Firmware Version: 80.00A80
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jan 31 16:06:15 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(39900) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 400) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   194   175   021    Pre-fail  Always       -       5283
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       123
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67349
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       122
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9730
194 Temperature_Celsius     0x0022   109   090   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

sudo smartctl -a /dev/sdf
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T1801757
LU WWN Device Id: 5 0014ee 60308f2af
Firmware Version: 80.00A80
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jan 31 16:07:11 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(41160) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 413) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  3 Spin_Up_Time            0x0027   197   179   021    Pre-fail  Always       -       5116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       122
  5 Reallocated_Sector_Ct   0x0033   194   194   140    Pre-fail  Always       -       197
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67327
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       121
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9869
194 Temperature_Celsius     0x0022   109   088   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   128   128   000    Old_age   Always       -       72
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

sudo smartctl -a /dev/sdg
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    Z500FA5C
LU WWN Device Id: 5 000c50 0796690bf
Firmware Version: CC25
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jan 31 16:09:06 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(   80) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 319) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x1085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       233268584
  3 Spin_Up_Time            0x0003   095   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       99
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   066   060   030    Pre-fail  Always       -       4490593
  9 Power_On_Hours          0x0032   039   039   000    Old_age   Always       -       53881
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       98
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   097   097   000    Old_age   Always       -       3
190 Airflow_Temperature_Cel 0x0022   058   040   045    Old_age   Always   In_the_past 42 (Min/Max 36/42 #1963)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       38
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       316
194 Temperature_Celsius     0x0022   042   060   000    Old_age   Always       -       42 (0 18 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       53883h+18m+50.927s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       6962859470
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       439009965491

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

500 GB disk holding the file system:

sudo smartctl -a /dev/sdh
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint T166
Device Model:     SAMSUNG HD501LJ
Serial Number:    S0MUJ13P797058
LU WWN Device Id: 5 0000f0 01b797058
Firmware Version: CR100-10
User Capacity:    500107862016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 3b
SATA Version is:  SATA 2.5, 3.0 Gb/s
Local Time is:    Sun Jan 31 16:11:34 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 8995) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 153) minutes.
SCT capabilities: 	       (0x003f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   100   100   015    Pre-fail  Always       -       6976
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       892
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   253   253   015    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       28436
 10 Spin_Retry_Count        0x0033   253   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   253   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       471
187 Reported_Uncorrect      0x0032   253   253   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   253   253   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   067   052   000    Old_age   Always       -       33
194 Temperature_Celsius     0x0022   139   094   000    Old_age   Always       -       33
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       9676229
196 Reallocated_Event_Count 0x0032   253   253   000    Old_age   Always       -       0
197 Total_Pending_Sectors   0x0012   253   253   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   253   100   000    Old_age   Always       -       0
202 Data_Address_Mark_Errs  0x0032   100   100   000    Old_age   Always       -       239

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Can you tell me what you think? Thanks, geole.

geole · 31/01/2021, 17:39

I'm picking out excerpts to give a clearer picture.
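For anyone who wants to produce this kind of excerpt without trimming the output by hand, here is a minimal sketch. It assumes the attribute table layout shown in this thread (RAW_VALUE in the tenth column); the function name `smart_key_attrs` and the choice of attribute IDs are my own, not something smartctl provides.

```shell
# smart_key_attrs: read smartctl output on stdin and keep only the
# attributes discussed in this thread, printed as "ID NAME RAW_VALUE".
# Assumption: rows follow the 10-column attribute table layout shown above.
#
# Possible use (needs root and smartmontools):
#   sudo smartctl -A /dev/sdf | smart_key_attrs
smart_key_attrs() {
    awk '
        # NF >= 10 skips the selective self-test rows, which also start
        # with a small number (SPAN 1..5) but have only four columns.
        NF >= 10 && $1 ~ /^[0-9]+$/ &&
        ($1 == 5 || $1 == 9 || $1 == 196 || $1 == 197 || $1 == 198) {
            print $1, $2, $10
        }
    '
}
```

A non-zero raw value for 5, 196 or 197 is what is being looked at in the excerpts; attribute 9 is only there to give the age of the disk.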

webangel wrote:

Hi geole, thanks a lot for your help. Good catch on the pending errors and the Power_On_Hours of 67279: the NAS has indeed been running since 2012, so roughly 9 years without trouble.
I'll have to think about investing in a replacement disk. Here is the smartctl output for the other disks, in case any of those need replacing too.
sudo smartctl -a /dev/sda
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       54
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   007   007   000    Old_age   Always       -       68524
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
SMART Error Log Version: 1
No Errors Logged

Two sectors flagged. It will probably be third on the replacement list.

sudo smartctl -a /dev/sdb
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   007   007   000    Old_age   Always       -       68545
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0

Age alone is no reason to replace a disk when the other counters are good.

sudo smartctl -a /dev/sdd
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   007   007   000    Old_age   Always       -       68524
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0

Age alone is no reason to replace a disk when the other counters are good.

sudo smartctl -a /dev/sde
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67349
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0

Age alone is no reason to replace a disk when the other counters are good.

sudo smartctl -a /dev/sdf
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  5 Reallocated_Sector_Ct   0x0033   194   194   140    Pre-fail  Always       -       197
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67327
196 Reallocated_Event_Count 0x0032   128   128   000    Old_age   Always       -       72
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1

Certainly the second disk on the replacement list. It has only one unreadable sector pending, but it has already rescued 197 sectors over 72 incidents. You won't always be that lucky.

sudo smartctl -a /dev/sdg
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       233268584
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   066   060   030    Pre-fail  Always       -       4490593
  9 Power_On_Hours          0x0032   039   039   000    Old_age   Always       -       53881
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0

A more recent disk, but with poor-quality mechanics that probably slow down the throughput of the whole array. Nothing serious, though. Check how it evolves every six months. It was not a lucky pick.

500 GB disk holding the system file system:

sudo smartctl -a /dev/sdh
=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint T166
Device Model:     SAMSUNG HD501LJ
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   253   253   015    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       28436
197 Total_Pending_Sectors   0x0012   253   253   000    Old_age   Always       -       0

Nothing particular to report, except that attribute 9 (Power_On_Hours) has a normalised value of 100, as if the disk were brand new. This counter is probably not maintained properly for tracking wear.

Can you tell me what you think? Thanks geole.

In summary, you certainly have to replace the ejected disk, and at least one of the other two before the end of the year (once degradation sets in, it is exponential, to use a fashionable word). I think that because you declared the array as RAID6, it can start with two disks that have unreadable sectors. In RAID5 it is one. In RAID1 it is zero.
So at startup it found three of them and ejected one. It chose the worst. In a few hours we will know whether re-adding the disk to the RAID has reset this counter to zero. If so, there is no rush; otherwise you will have to think about replacing one fairly quickly.

I have no idea how full the RAID is. You may decide to replace the disk with one of the same capacity if that is enough for you. But if you expect to run short of disk space eventually, you can buy a much larger one and split it into two partitions: one of 3 TB, and the other with the remainder, which you can use normally for whatever you like (backing up the system?). When you come to replace the second disk, you will then be able to create a new RAID1 from these two sdX2 partitions.
But it is possible that by then the RAID software will have evolved and will make growing an array easy.

webangel · 31/01/2021 at 18:56

Thanks for your analysis. I know what I have to do: order two 3 TB disks right away and another one later.
My finance minister is going to be thrilled.
As for partitioning a disk to use part of it for documents, that's a good idea, but I already have a 4-bay Synology NAS
in RAID 5 that hosts my important files.
Here is the usage of the NAS that is causing trouble:

df -h
Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
udev               837M       0  837M   0% /dev
tmpfs              173M    3,5M  170M   2% /run
/dev/sdh1           46G    9,5G   34G  22% /
tmpfs              865M       0  865M   0% /dev/shm
tmpfs              5,0M    4,0K  5,0M   1% /run/lock
tmpfs              865M       0  865M   0% /sys/fs/cgroup
/dev/md127          14T     12T  1,3T  91% /NAS
/dev/sdh6           19G    132M   18G   1% /home
tmpfs              173M     20K  173M   1% /run/user/1000

Yes, it is indeed packed...

See you later, geole.

I will mark the thread as solved once I have integrated the two new disks into the array.
PS: what would you recommend as a 3 TB disk?

geole · 31/01/2021 at 19:15

I can't give advice on which disk to buy.
I simply wanted to tell you that you could replace a 3 TB disk with a larger one. You could then use the surplus for something else.
For example, by replacing two 3 TB disks with two 5 TB disks (I have not looked at prices or quality), you could additionally build a 2 TB RAID1 and move some files from md127 onto it to give it some breathing room.

webangel · 31/01/2021 at 20:10

Thanks geole, yes you're right, it's not easy to recommend a hard disk; there are so many parameters to consider for reliability.
As for relieving md127 with an additional RAID1,
I won't be able to buy disks larger than 3 TB, as they are a bit too expensive for my budget.
I'm thinking of buying 3 TB WD Red™ - Internal NAS hard drive - 3 TB - 5,400 rpm - 3.5" (WD30EFRX) at €111 each.
We'll see whether I can find something cheaper that can be delivered fairly quickly...
See you later, geole.

geole · 31/01/2021 at 23:05

Remember to post the smartctl status of the rebuilt disk (sdc) again.
If there are no more pending sectors,
mark one of the other two as faulty and have it rebuilt.
If the result is the same, you can move on to the other one.

Last edited by geole (31/01/2021 at 23:05)

webangel · 31/01/2021 at 23:24

OK, sounds good. The rebuild is at 92% and will finish in about 55 minutes...

Edit: Rebuild finished.
A quick smartctl to check that Current_Pending_Sector is indeed at zero (as a reminder, it was 316 before the rebuild)

smartctl -a /dev/sdc
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

Smartctl open device: /dev/sdc failed: Permission denied
jeff@SRVNAS3T:~$ sudo smartctl -a /dev/sdc
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T1804212
LU WWN Device Id: 5 0014ee 6adb36684
Firmware Version: 80.00A80
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Feb  1 00:15:38 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		(39540) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 397) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   197   180   021    Pre-fail  Always       -       5141
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       123
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   118   001   000    Old_age   Always       -       23761
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67302
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       122
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9810
194 Temperature_Celsius     0x0022   117   098   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       313
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       318

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      1743         1574048

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Indeed, Current_Pending_Sector is at zero, all good.

Now I check the state of the RAID6 array:

sudo mdadm --detail /dev/md127
/dev/md127:
           Version : 1.2
     Creation Time : Fri Mar 22 14:27:12 2013
        Raid Level : raid6
        Array Size : 14650670080 (13971.97 GiB 15002.29 GB)
     Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
      Raid Devices : 7
     Total Devices : 7
       Persistence : Superblock is persistent

       Update Time : Mon Feb  1 00:14:34 2021
             State : clean 
    Active Devices : 7
   Working Devices : 7
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

              Name : SRVNAS3T:0  (local to host SRVNAS3T)
              UUID : 0e2234e3:8853b712:5eb5db08:2afa36ea
            Events : 8894

    Number   Major   Minor   RaidDevice State
       6       8       17        0      active sync   /dev/sdb1
       7       8        1        1      active sync   /dev/sda1
      10       8       32        2      active sync   /dev/sdc
       3       8       49        3      active sync   /dev/sdd1
       4       8       65        4      active sync   /dev/sde1
       9       8       81        5      active sync   /dev/sdf1
       8       8       97        6      active sync   /dev/sdg1

The state is clean, all good.
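
As a sanity check on the figures reported by mdadm above, the Array Size can be recomputed from Raid Devices and Used Dev Size: RAID6 spends the equivalent of two members on parity. This is only a small sketch; the function name `raid_usable_kib` is made up for the illustration, and sizes are taken to be in KiB as mdadm prints them.

```shell
#!/bin/sh
# raid_usable_kib DEVICES USED_DEV_SIZE_KIB PARITY_DEVICES
# Hypothetical helper: usable size of a parity array, in KiB.
raid_usable_kib() {
    devices=$1
    dev_kib=$2
    parity=$3
    echo $(( (devices - parity) * dev_kib ))
}

# Values copied from "mdadm --detail /dev/md127": 7 members, RAID6 (2 parity).
raid_usable_kib 7 2930134016 2
```

The result matches the Array Size line of the report (14650670080), which is the 14T shown by df.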

I will now move on to the next disk, partition sdf1, and mark it as faulty.

sudo mdadm --manage /dev/md127 --set-faulty /dev/sdf1
[sudo] Mot de passe de jeff : 
mdadm: set /dev/sdf1 faulty in /dev/md127

Now I remove it from the md127 array:

sudo mdadm --manage /dev/md127 --remove /dev/sdf1
mdadm: hot removed /dev/sdf1 from /dev/md127

sudo mdadm --detail /dev/md127
/dev/md127:
           Version : 1.2
     Creation Time : Fri Mar 22 14:27:12 2013
        Raid Level : raid6
        Array Size : 14650670080 (13971.97 GiB 15002.29 GB)
     Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
      Raid Devices : 7
     Total Devices : 6
       Persistence : Superblock is persistent

       Update Time : Mon Feb  1 00:40:13 2021
             State : clean, degraded 
    Active Devices : 6
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

              Name : SRVNAS3T:0  (local to host SRVNAS3T)
              UUID : 0e2234e3:8853b712:5eb5db08:2afa36ea
            Events : 8897

    Number   Major   Minor   RaidDevice State
       6       8       17        0      active sync   /dev/sdb1
       7       8        1        1      active sync   /dev/sda1
      10       8       32        2      active sync   /dev/sdc
       3       8       49        3      active sync   /dev/sdd1
       4       8       65        4      active sync   /dev/sde1
       -       0        0        5      removed
       8       8       97        6      active sync   /dev/sdg1

Now I add my sdf1 partition back to the md127 array:

sudo mdadm --manage /dev/md127 --add /dev/sdf1
mdadm: added /dev/sdf1

Rebuild in progress...

sudo cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md127 : active raid6 sdf1[9] sdc[10] sde1[4] sdd1[3] sdb1[6] sda1[7] sdg1[8]
      14650670080 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UUUUU_U]
      [>....................]  recovery =  0.4% (12864000/2930134016) finish=595.0min speed=81710K/sec
      
unused devices: <none>

The rebuild will finish in about 10 hours at best; we'll see whether the pending sectors are gone...
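
The finish= value in /proc/mdstat is simply the remaining blocks divided by the current speed, so it swings wildly whenever the speed does. A minimal sketch that recomputes it from a recovery line, assuming the line layout shown above (sizes in KiB, speed in K/sec); `eta_minutes` is a name invented for this example.

```shell
#!/bin/sh
# eta_minutes: read /proc/mdstat-style text on stdin and print the remaining
# rebuild time in minutes, computed as (total - done) / speed / 60.
eta_minutes() {
    awk '/recovery *=|resync *=/ {
        done_kib = 0; total_kib = 0; speed = 0
        for (i = 1; i <= NF; i++) {
            f = $i
            if (f ~ /^\([0-9]+\/[0-9]+\)$/) {
                # field of the form (done/total)
                gsub(/[()]/, "", f)
                split(f, part, "/")
                done_kib = part[1]; total_kib = part[2]
            } else if (f ~ /^speed=[0-9]+K\/sec$/) {
                sub(/^speed=/, "", f)
                sub(/K\/sec$/, "", f)
                speed = f
            }
        }
        if (speed + 0 > 0)
            printf "%.1f\n", (total_kib - done_kib) / speed / 60
    }'
}

# Possible use on a live system: eta_minutes < /proc/mdstat
```

Fed the recovery line above (12864000 of 2930134016 blocks at 81710K/sec), this gives 595.0 minutes, the same as the kernel's own estimate.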

Good night, see you this afternoon for new adventures...

Last edited by webangel (01/02/2021 at 00:54)

geole · 01/02/2021 at 10:33

Hello

Still, my knowledge of RAID is limited. I find it a pity that RAID does not offer a very simple tool for regenerating unreadable sectors, something much simpler than this "bulldozer" technique. I'll ask the question. https://answers.launchpad.net/ubuntu/+question/695297
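
For what it is worth, the Linux md driver does expose a scrub through sysfs: writing `check` to an array's `sync_action` file makes md read every stripe, and a sector that fails to read is normally recomputed from the remaining members and written back. The sketch below only wraps that write; `start_scrub` and its optional second argument (an alternative sysfs root, handy for a dry run) are invented for the example, and it assumes no rebuild is currently running on the array.

```shell
#!/bin/sh
# start_scrub ARRAY [SYSFS_ROOT]
# Hypothetical helper: ask md to read-check every stripe of ARRAY (e.g. md127).
# Needs root when SYSFS_ROOT is the real /sys.
start_scrub() {
    array=$1
    root=${2:-/sys}
    printf 'check\n' > "$root/block/$array/md/sync_action"
}
```

Run as root, `start_scrub md127` would start the check; progress then shows up in /proc/mdstat, and `/sys/block/md127/md/mismatch_cnt` can be read once it has finished.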

Last edited by geole (01/02/2021 at 12:25)

webangel · 01/02/2021 at 11:27

Hello, a small problem with the rebuild: it should be nearing the end,
but only 49.8% is done, 110130 minutes of rebuild remain, and the speed has dropped drastically to 222K/sec.
So is it possible that this disk is dead?

sudo cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md127 : active raid6 sdf1[9] sdc[10] sde1[4] sdd1[3] sdb1[6] sda1[7] sdg1[8]
      14650670080 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UUUUU_U]
      [=========>...........]  recovery = 49.8% (1461297208/2930134016) finish=110130.1min speed=222K/sec
      
unused devices: <none>

A new mdstat, and it is still very slow:

sudo cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md127 : active raid6 sdf1[9] sdc[10] sde1[4] sdd1[3] sdb1[6] sda1[7] sdg1[8]
      14650670080 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UUUUU_U]
      [==========>..........]  recovery = 50.0% (1465951872/2930134016) finish=12224.0min speed=1996K/sec
      
unused devices: <none>

geole · 01/02/2021 at 11:37

Run a smartctl report for that disk and for sdc; this can be done while the disks are in full operation.
In my opinion, it is struggling with sectors that are hard to write and trying to remap them.
From memory, it retries for 30 seconds before declaring a sector dead, hence the slowdown.
Or worse, it can no longer read disk sdc very well.

Last edited by geole (01/02/2021 at 11:40)

webangel · 01/02/2021 at 12:10

OK, thanks geole. Here is the smartctl report for sdf1, the one currently being rebuilt.

sudo smartctl -a /dev/sdf1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T1801757
LU WWN Device Id: 5 0014ee 60308f2af
Firmware Version: 80.00A80
User Capacity:    3000592982016 bytes [3,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Feb  1 12:03:44 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(41160) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 413) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  3 Spin_Up_Time            0x0027   197   179   021    Pre-fail  Always       -       5116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       122
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       218
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67347
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       121
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9874
194 Temperature_Celsius     0x0022   112   088   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   123   123   000    Old_age   Always       -       77
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

geole · 01/02/2021 at 12:31

Previous values

  5 Reallocated_Sector_Ct   0x0033   194   194   140    Pre-fail  Always       -       197
196 Reallocated_Event_Count 0x0032   128   128   000    Old_age   Always       -       72
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1

Current values

  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       218
196 Reallocated_Event_Count 0x0032   123   123   000    Old_age   Always       -       77
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1

From now on you can use this less verbose command, and there is no need to give the partition number: we are addressing the disk.

sudo smartctl -A /dev/sdf
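
The before/after comparison above can also be done mechanically. A rough sketch, assuming two `smartctl -A` reports of the same disk have been saved to files; `smart_delta` is a hypothetical helper, and it relies on the attribute name being in column 2 and the raw value in column 10, as in the tables of this thread.

```shell
#!/bin/sh
# smart_delta BEFORE_FILE AFTER_FILE
# Print every attribute whose raw value differs between two saved reports.
smart_delta() {
    awk '
        # Attribute rows start with a numeric ID; field 2 is the name,
        # field 10 the raw value.
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            if (NR == FNR) {
                # still reading the first (older) report
                before[$2] = $10
            } else if (($2 in before) && ($10 + 0 != before[$2] + 0)) {
                printf "%s: %d -> %d (delta %d)\n", $2, before[$2], $10, $10 - before[$2]
            }
        }
    ' "$1" "$2"
}
```

With the two sets of values quoted above, it reports Reallocated_Sector_Ct going from 197 to 218 and Reallocated_Event_Count from 72 to 77, while Current_Pending_Sector is not listed because it has not changed.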

webangel · 01/02/2021 at 12:56

I ran a new smartctl; you're right, it is much less verbose this way.

sudo smartctl -A /dev/sdf
[sudo] Mot de passe de jeff : 
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  3 Spin_Up_Time            0x0027   197   179   021    Pre-fail  Always       -       5116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       122
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       218
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67348
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       121
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9874
194 Temperature_Celsius     0x0022   111   088   000    Old_age   Always       -       39
196 Reallocated_Event_Count 0x0032   123   123   000    Old_age   Always       -       77
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

Thanks geole for following up.

Edit: Latest mdstat

sudo cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md127 : active raid6 sdf1[9] sdc[10] sde1[4] sdd1[3] sdb1[6] sda1[7] sdg1[8]
      14650670080 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UUUUU_U]
      [==========>..........]  recovery = 50.4% (1478714240/2930134016) finish=13902.0min speed=1739K/sec

Roughly 9.5 days to finish the rebuild...
Should I let it carry on like this for a little while longer?

For your information geole, I have ordered two hard disks from LDLC: Seagate IronWolf 3 TB,
3.5" hard disk, 3 TB, 5900 RPM, 64 MB, Serial ATA 6 Gb/s, for NAS (bulk). I hope they will do the job...
Even if they are not needed right now, I will keep them on hand as spares.

Edit: Here are the latest results, which are not great:

sudo cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md127 : active raid6 sdf1[9] sdc[10] sde1[4] sdd1[3] sdb1[6] sda1[7] sdg1[8]
      14650670080 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UUUUU_U]
      [==========>..........]  recovery = 54.2% (1590076308/2930134016) finish=379667.2min speed=58K/sec
      
unused devices: <none>

I won't insist, so I am going to repartition disk sdf.
I remove it from the RAID:

sudo mdadm --manage /dev/md127 --remove /dev/sdf1
mdadm: hot removed /dev/sdf1 from /dev/md127

I delete its partition, option: d

sudo gdisk /dev/sdf
GPT fdisk (gdisk) version 1.0.5

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.

Command (? for help): 

Command (? for help): gpt
b	back up GPT data to a file
c	change a partition's name
d	delete a partition
i	show detailed information on a partition
l	list known partition types
n	add a new partition
o	create a new empty GUID partition table (GPT)
p	print the partition table
q	quit without saving changes
r	recovery and transformation options (experts only)
s	sort partitions
t	change a partition's type code
v	verify disk
w	write table to disk and exit
x	extra functionality (experts only)
?	print this menu

Command (? for help): d
Using 1

Command (? for help): d
No partitions

Creating the new partition, keeping the default type 8300 (Linux filesystem)

b	back up GPT data to a file
c	change a partition's name
d	delete a partition
i	show detailed information on a partition
l	list known partition types
n	add a new partition
o	create a new empty GUID partition table (GPT)
p	print the partition table
q	quit without saving changes
r	recovery and transformation options (experts only)
s	sort partitions
t	change a partition's type code
v	verify disk
w	write table to disk and exit
x	extra functionality (experts only)
?	print this menu
Command (? for help): n
Partition number (1-128, default 1): 
First sector (34-5860533134, default = 2048) or {+-}size{KMGTP}: 
Last sector (2048-5860533134, default = 5860533134) or {+-}size{KMGTP}: 
Current type is 8300 (Linux filesystem)
Hex code or GUID (L to show codes, Enter = 8300): 
Changed type of partition to 'Linux filesystem'
Command (? for help): ?

Writing the configuration, option: w

b	back up GPT data to a file
c	change a partition's name
d	delete a partition
i	show detailed information on a partition
l	list known partition types
n	add a new partition
o	create a new empty GUID partition table (GPT)
p	print the partition table
q	quit without saving changes
r	recovery and transformation options (experts only)
s	sort partitions
t	change a partition's type code
v	verify disk
w	write table to disk and exit
x	extra functionality (experts only)
?	print this menu

Command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!

Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/sdf.
The operation has completed successfully.

Adding the newly partitioned disk to the RAID volume:

sudo mdadm --manage /dev/md127 --add /dev/sdf1
mdadm: added /dev/sdf1

A quick mdstat:

sudo cat /proc/mdstat
[sudo] Mot de passe de jeff : 
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md127 : active raid6 sdf1[9] sdc[10] sde1[4] sdd1[3] sdb1[6] sda1[7] sdg1[8]
      14650670080 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UUUUU_U]
      [>....................]  recovery =  0.2% (6439164/2930134016) finish=21299.3min speed=2287K/sec
      
unused devices: <none>

It does not look any faster than before the repartitioning...

Last edited by webangel (02/02/2021 at 18:37)

webangel · 02/02/2021 at 20:48

Well, I stopped the rebuild, which was getting nowhere, and removed sdf from the RAID volume, because obviously
it was causing a severe slowdown in access to the volume's data, often even making it inaccessible.
What else can I do for now?

geole · 03/02/2021 at 11:35

Hello
I suggest you redo a smartctl report of your seven RAID disks with the -A option.
Then recreate partition sdf1, formatting it with the ERASE option of the gnome-disk-utility application.
This will take quite a while (17 hours?) and write every sector, but without disturbing the RAID itself.
When that format is finished, post the smartctl status of that disk only, with the -a option to get more detail. We will then know how it survived.
Aside: I have always been satisfied with LDLC's very fast deliveries.
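
Since the same report is requested for all seven disks, a small filter can keep only the attributes discussed in this thread (5, 196, 197 and 198). A sketch only: `smart_key_attrs` is a made-up name, and the loop shown in the comment assumes the array members are still /dev/sda to /dev/sdg.

```shell
#!/bin/sh
# smart_key_attrs: reduce "smartctl -A" output on stdin to the reallocation
# and pending-sector attributes, printed as "ID NAME RAW_VALUE".
smart_key_attrs() {
    awk '$1 == 5 || $1 == 196 || $1 == 197 || $1 == 198 { print $1, $2, $10 }'
}

# Possible use on the seven members (needs root):
# for d in /dev/sd[a-g]; do echo "== $d"; sudo smartctl -A "$d" | smart_key_attrs; done
```

The full `smartctl -A` output remains the reference; the filter just makes seven reports easier to compare at a glance.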

webangel · 03/02/2021 at 12:03

Hello geole, thank you very much for your help. I also find LDLC quick to ship hardware,
and what's more they are very responsive and competent at diagnostics and technical support; I have been ordering from them very often since 2006.

Here is the smartctl -A output for the NAS disks:

sudo smartctl -A /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       160
  3 Spin_Up_Time            0x0027   199   180   021    Pre-fail  Always       -       5016
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       121
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   007   007   000    Old_age   Always       -       68592
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       120
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9849
194 Temperature_Celsius     0x0022   121   101   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       34

sudo smartctl -A /dev/sdb
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   195   174   021    Pre-fail  Always       -       5233
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       123
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   007   007   000    Old_age   Always       -       68612
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       121
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       67
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       10025
194 Temperature_Celsius     0x0022   121   098   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

sudo smartctl -A /dev/sdc
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   199   199   051    Pre-fail  Always       -       42747
  3 Spin_Up_Time            0x0027   197   180   021    Pre-fail  Always       -       5141
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       123
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       222
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67361
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       122
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9810
194 Temperature_Celsius     0x0022   120   098   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   106   106   000    Old_age   Always       -       94
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       157
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       313
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       318

sudo smartctl -A /dev/sde
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   194   175   021    Pre-fail  Always       -       5283
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       123
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67417
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       122
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9756
194 Temperature_Celsius     0x0022   116   090   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

sudo smartctl -A /dev/sdf
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  3 Spin_Up_Time            0x0027   197   179   021    Pre-fail  Always       -       5116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       122
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       218
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   008   008   000    Old_age   Always       -       67395
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       121
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9908
194 Temperature_Celsius     0x0022   114   088   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   123   123   000    Old_age   Always       -       77
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

sudo smartctl -A /dev/sdg
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       151308232
  3 Spin_Up_Time            0x0003   095   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       99
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   066   060   030    Pre-fail  Always       -       4504878
  9 Power_On_Hours          0x0032   039   039   000    Old_age   Always       -       53949
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       98
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   097   097   000    Old_age   Always       -       3
190 Airflow_Temperature_Cel 0x0022   062   040   045    Old_age   Always   In_the_past 38 (Min/Max 35/42 #1963)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       38
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       316
194 Temperature_Celsius     0x0022   038   060   000    Old_age   Always       -       38 (0 18 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       53951h+08m+33.616s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       6962860614
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       447227232123

I also added the smartctl output for the system disk, sdh.

sudo smartctl -A /dev/sdh
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-65-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   100   100   015    Pre-fail  Always       -       6976
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       892
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   253   253   015    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       28504
 10 Spin_Retry_Count        0x0033   253   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   253   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       471
187 Reported_Uncorrect      0x0032   253   253   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   253   253   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   067   052   000    Old_age   Always       -       33
194 Temperature_Celsius     0x0022   139   094   000    Old_age   Always       -       33
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       23805953
196 Reallocated_Event_Count 0x0032   253   253   000    Old_age   Always       -       0
197 Total_Pending_Sectors   0x0012   253   253   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0
202 Data_Address_Mark_Errs  0x0032   100   100   000    Old_age   Always       -       239

I'll format sdf this afternoon. Enjoy your lunch, geole.

Last edited by webangel (03/02/2021, 13:06)

webangel · 03/02/2021, 12:38

I have just launched the format with data erasure; it should finish in about 19 hours.

geole · 03/02/2021, 18:35

Here is a script to copy and paste, to make it easier to track the anomalies.

# For each disk, show the reallocated and pending sector counts and the error-log summary
Lettre=(a b c d e f g h)
for Dsk in "${Lettre[@]}" ; do
   echo "................ état du disque /dev/sd$Dsk"
   sudo smartctl -a "/dev/sd$Dsk" | grep -E " Reallocated_Sector| Current_Pending|Errors Logged"
done
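The filtering above can be taken one step further by extracting only the raw counters, so that two runs are easy to compare. This is a minimal sketch, not part of the original script: the function names `smart_counts` and `smart_is_clean` are hypothetical, and it assumes the usual `smartctl -A` layout in which the raw value is the tenth column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `smartctl -A` output on standard input and
# print "ID NAME RAW_VALUE" for attributes 5, 197 and 198.
# Assumes the raw value is the 10th whitespace-separated column.
smart_counts() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $1, $2, $10 }'
}

# Hypothetical helper: exit status 0 only if every selected raw value is 0.
smart_is_clean() {
    smart_counts | awk '$3 != 0 { bad = 1 } END { exit bad + 0 }'
}

# Usage on a real disk (needs root and smartmontools):
#   sudo smartctl -A /dev/sdf | smart_counts
#   sudo smartctl -A /dev/sdf | smart_is_clean && echo "no bad sectors reported"
```

Keeping the parsing separate from the `smartctl` call means the helpers can be tried on saved output before being pointed at a disk.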

and here is the state as of this morning

sudo smartctl -A /dev/sda
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2

sudo smartctl -A /dev/sdb
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0

sudo smartctl -A /dev/sdc
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       222
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       157

sudo smartctl -A /dev/sde
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0

sudo smartctl -A /dev/sdf
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       218
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1

sudo smartctl -A /dev/sdg
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0

sudo smartctl -A /dev/sdh
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
197 Total_Pending_Sectors   0x0012   253   253   000    Old_age   Always       -       0

Last edited by geole (03/02/2021, 18:47)

webangel · 03/02/2021, 20:52

Your script is great, it works really well. Thanks a lot, geole.

Output of the script:

................ état du disque /dev/sda
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
No Errors Logged
................ état du disque /dev/sdb
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
No Errors Logged
................ état du disque /dev/sdc
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       222
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       157
No Errors Logged
................ état du disque /dev/sdd
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
No Errors Logged
................ état du disque /dev/sde
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
No Errors Logged
................ état du disque /dev/sdf
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       218
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
No Errors Logged
................ état du disque /dev/sdg
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
No Errors Logged
................ état du disque /dev/sdh
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
No Errors Logged

Apparently nothing has changed for now.

webangel · 04/02/2021, 11:30

Hello, the format of sdf ran to completion. Here is a new smartctl report:

./smartctl-hdd-abcdefgh.sh
................ état du disque /dev/sda
[sudo] Mot de passe de jeff : 
Désolé, essayez de nouveau.
[sudo] Mot de passe de jeff : 
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
No Errors Logged
................ état du disque /dev/sdb
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
No Errors Logged
................ état du disque /dev/sdc
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       222
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       157
No Errors Logged
................ état du disque /dev/sdd
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
No Errors Logged
................ état du disque /dev/sde
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
No Errors Logged
................ état du disque /dev/sdf
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       218
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
No Errors Logged
................ état du disque /dev/sdg
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
No Errors Logged
................ état du disque /dev/sdh
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
No Errors Logged

I'm waiting for your advice: should I try to add it back into the RAID volume or not?

geole · 04/02/2021, 12:27

Hello
No, don't add it back, because it still has one bad sector.
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1
We'll try to identify it and repair it.
I collected the disk's characteristics:
Disk /dev/sdf: 2.75 TiB, 3000592982016 bytes, 5860533168 sectors
Disk model: WDC WD30EZRX-00D
Units: sectors of 1 × 512 = 512 bytes
Sector size (logical / physical): 512 bytes / 4096 bytes
I/O size (minimum / optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 40D67A90-D4C5-4C06-B871-F8922E0E3727
So the search command will be

sudo badblocks -b 4096 -n -s -v -o ~/sdf1.badblocks /dev/sdf1

Then post the output of this command:

cat ~/sdf1.badblocks

Alternatively, use smartctl to run a long test, which covers the whole disk.
I don't know which is the better option. I think it is the second, because the partition has just been formatted, so it should be clean. It is therefore possible that the bad sector does not belong to the partition.

Last edited by geole (04/02/2021, 12:31)

webangel · 04/02/2021, 15:09

Thanks geole, I'll follow your advice and use smartctl with the long test option.
I looked at man smartctl, but I get lost among all the available options.
Could you give me the smartctl command line to use in this case?

Last edited by webangel (04/02/2021, 15:16)

geole · 04/02/2021, 15:40

I suggest doing it in three passes of about 1 TB each.
1) Launch the test

sudo smartctl -t select,0-1999999999 /dev/sdf    ### about 1 TB (2,000,000,000 sectors of 512 bytes)

2) To follow its progress (I think the documentation is right)

sudo smartctl -a /dev/sdf | grep left

3) When it reports that it has finished, check the result

 sudo smartctl -q errorsonly -H -l selftest /dev/sdf

If it finished because everything is OK, move on to the next terabyte with this command

     sudo smartctl  -t select,next /dev/sdf
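To see why three passes are enough, here is a rough sketch of the arithmetic. It assumes every pass covers a span of the same size as the first one; the sector count 5860533168 is the one reported for /dev/sdf earlier in the thread, and the function name `print_spans` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the LBA spans needed to cover a disk in
# fixed-size passes.
#   $1 = total number of 512-byte logical sectors on the disk
#   $2 = number of sectors per pass
print_spans() {
    local total=$1 span=$2 start=0 end
    while (( start < total )); do
        end=$(( start + span - 1 ))
        # The last span stops at the last sector of the disk.
        if (( end >= total )); then
            end=$(( total - 1 ))
        fi
        echo "select,${start}-${end}"
        start=$(( end + 1 ))
    done
}

print_spans 5860533168 2000000000
# select,0-1999999999
# select,2000000000-3999999999
# select,4000000000-5860533167
```

Each span of 2,000,000,000 sectors is 1.024 × 10^12 bytes, so the third pass is shorter than the first two.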

If it finished because of an error: in theory there should be only one, but since the physical write unit is 4096 bytes, there should be 8 unreadable 512-byte logical sectors.
a) Note the number of the block in error
b) Check that it is divisible by 8, just in case
c) Still, let's verify that we have the right place
Here is a small script. Change the first line to the correct number

Sect=80   # first of the 8 logical sectors to read; replace with the real number
for (( I=0; I < 8; I++ )); do
    sudo hdparm --yes-i-know-what-i-am-doing --read-sector $((Sect + I)) /dev/sdf
    sleep 5
done

d) If it really is unreadable, we have the right place, so we force a write. Small script; change the first line here as well

Sect=80   # same starting sector as in the read script
# Caution: --write-sector overwrites each sector with zeros
for (( I=0; I < 8; I++ )); do
    sudo hdparm --yes-i-know-what-i-am-doing --write-sector $((Sect + I)) /dev/sdf
    sleep 5
done

e) All that remains is to resume checking the state of the disk

     sudo smartctl  -t select,next /dev/sdf
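For step b), if the reported block number turns out not to be a multiple of 8, the start of the 4096-byte physical block that contains it can be computed as below. This is only a sketch: the function names are hypothetical and the value 83 is an invented example, not a number taken from this disk.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a 512-byte logical sector number, print the
# first logical sector of the 4096-byte physical block that contains it.
phys_block_start() {
    local lba=$1
    local per_block=$(( 4096 / 512 ))   # 8 logical sectors per physical block
    echo $(( lba - lba % per_block ))
}

# Hypothetical helper: list the 8 logical sectors of that physical block.
phys_block_sectors() {
    local start
    start=$(phys_block_start "$1")
    seq "$start" $(( start + 7 ))
}

phys_block_start 83       # prints 80
phys_block_sectors 83     # prints 80 to 87, one per line
```

The value printed by `phys_block_start` is what would go into `Sect=` in the two scripts above.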

Good luck

Last edited by geole (04/02/2021, 16:34)
