Systems

From TipsTrade Wiki

SNMP

Eaton UPS MIBs

Usage

# Ambient temperature
snmpget -v 1 -c public $host -OqvU -mALL XUPS-MIB::xupsEnvAmbientTemp.0
# Power usage
snmpget -v 1 -c public $host -OqvU -mALL UPS-MIB::upsOutputPower.1

HPE ILO4 MIBs

Usage

# 01-Inlet Ambient temperature
snmpget -c public -v2c $host -OqvU -mALL 1.3.6.1.4.1.232.6.2.6.8.1.4.0.1
snmpget -c public -v2c $host -OqvU -mALL CPQHLTH-MIB::cpqHeTemperatureCelsius.0.1
# All temperatures
snmpwalk -c public -v2c $host -mALL 1.3.6.1.4.1.232.6.2.6.8.1.4.0
snmpwalk -c public -v2c $host -mALL CPQHLTH-MIB::cpqHeTemperatureCelsius.0
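When only the hottest sensor matters, the walk above can be piped through a small filter. This is a minimal sketch: `max_temp` is a hypothetical helper name, and it assumes the walk is run with `-OqvU` so that each output line holds a bare integer.

```shell
# Hypothetical helper: reads one integer reading per line on stdin
# and prints the highest one (prints nothing for empty input).
max_temp() {
    awk 'NR == 1 || ($1 + 0) > max { max = $1 + 0 } END { if (NR) print max }'
}

# Usage sketch (assumes $host is set and the CPQHLTH MIB is installed):
# snmpwalk -c public -v2c "$host" -OqvU -mALL CPQHLTH-MIB::cpqHeTemperatureCelsius.0 | max_temp
```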

Fans

OID Name Description
1.3.6.1.4.1.232.6.2.6.7.1.2.0 (Fan Index)
1.3.6.1.4.1.232.6.2.6.7.1.3.0 (Fan Locale (1=other, 2=unknown, 3=system, 4=systemBoard, 5=ioBoard, 6=cpu, 7=memory, 8=storage, 9=removable media, 10=power supply, 11=ambient, 12=chassis, 13=bridge card, 14=management board, 15=backplane, 16=network slot, 17=blade slot, 18=virtual))
1.3.6.1.4.1.232.6.2.6.7.1.4.0 (Fan Present (1=other, 2=absent, 3=present))
1.3.6.1.4.1.232.6.2.6.7.1.5.0 (Fan Type (1=other, 2=tachOutput, 3=spinDetect))
1.3.6.1.4.1.232.6.2.6.7.1.6.0 (Fan Speed (1=other, 2=normal, 3=high))
1.3.6.1.4.1.232.6.2.6.7.1.9.0 (Fan Condition (1=other, 2=ok, 3=degraded, 4=failed))
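
The condition columns return bare integers when queried by numeric OID. A minimal sketch of a decoder for the 1=other, 2=ok, 3=degraded, 4=failed mapping listed above follows; `cpq_condition_name` is a hypothetical helper name, and the fan instance `.0.1` in the usage line is an assumption.

```shell
# Hypothetical helper: maps a condition code from the table above to its name.
cpq_condition_name() {
    case "$1" in
        1) echo other ;;
        2) echo ok ;;
        3) echo degraded ;;
        4) echo failed ;;
        *) echo "unrecognized($1)" ;;   # anything outside the documented range
    esac
}

# Usage sketch (assumes $host is set; .0.1 is an assumed fan instance):
# cpq_condition_name "$(snmpget -c public -v2c "$host" -Oqve 1.3.6.1.4.1.232.6.2.6.7.1.9.0.1)"
```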

Temperature

OID Name Description
1.3.6.1.4.1.232.6.2.6.8.1.2.0 CPQHLTH-MIB::cpqHeTemperatureIndex.0 (Temperature Sensor Index)
1.3.6.1.4.1.232.6.2.6.8.1.3.0 CPQHLTH-MIB::cpqHeTemperatureLocale.0 (Temperature Sensor Locale (1=other, 2=unknown, 3=system, 4=systemBoard, 5=ioBoard, 6=cpu, 7=memory, 8=storage, 9=removable media, 10=power supply, 11=ambient, 12=chassis, 13=bridge card))
1.3.6.1.4.1.232.6.2.6.8.1.7.0 CPQHLTH-MIB::cpqHeTemperatureThresholdType.0 (Threshold Type (1=other, 5=blowout, 9=caution, 15=critical, 16=noreaction))
1.3.6.1.4.1.232.6.2.6.8.1.4.0 CPQHLTH-MIB::cpqHeTemperatureCelsius.0 (Temperature Celsius)
1.3.6.1.4.1.232.6.2.6.8.1.5.0 CPQHLTH-MIB::cpqHeTemperatureThreshold.0 (Temperature Threshold)
1.3.6.1.4.1.232.6.2.6.8.1.6.0 CPQHLTH-MIB::cpqHeTemperatureCondition.0 (Temperature Condition)

CPU

OID Name Description
1.3.6.1.4.1.232.1.2.2.1.1.0 (CPU Index)
1.3.6.1.4.1.232.1.2.2.1.1.0 (CPU Name)
1.3.6.1.4.1.232.1.2.2.1.1.0 (CPU Speed in MHz)
1.3.6.1.4.1.232.1.2.2.1.1.0 (CPU Step)
1.3.6.1.4.1.232.1.2.2.1.1.0 (CPU status (1=unknown, 2=ok, 3=degraded, 4=failed, 5=disabled))
1.3.6.1.4.1.232.1.2.2.1.1.0 (Number of enabled CPU cores)
1.3.6.1.4.1.232.1.2.2.1.1.0 (Number of available CPU threads)
1.3.6.1.4.1.232.1.2.2.1.1.0 (CPU power status (1=unknown, 2=Low Powered, 3=Normal Powered, 4=High Powered))


Logical Drives

OID Name Description
1.3.6.1.4.1.232.3.2.3.1.1.2.0 (Logical Drive Index)
1.3.6.1.4.1.232.3.2.3.1.1.1.0 (Logical Drive Controller)
1.3.6.1.4.1.232.3.2.3.1.1.3.0 (Logical Drive Fault Tolerance (1=other, 2=none, 3=RAID 1/RAID 1+0 (Mirroring), 4=RAID 4 (Data Guard), 5=RAID 5 (Distributed Data Guard), 7=RAID 6 (Advanced Data Guarding), 8=RAID 50, 9=RAID 60, 10=RAID 1 ADM (Advanced Data Mirroring), 11=RAID 10 ADM (Advanced Data Mirroring with Striping)))
1.3.6.1.4.1.232.3.2.3.1.1.9.0 (Logical Drive Size in MB)
1.3.6.1.4.1.232.3.2.3.1.1.4.0 (Logical Drive Status (1=other, 2=ok, 3=Failed, 4=Unconfigured, 5=Recovering, 6=Ready Rebuild, 7=Rebuilding, 8=Wrong Drive, 9=Bad Connect, 10=Overheating, 11=Shutdown, 12=Expanding, 13=Not Available, 14=Queued For Expansion, 15=Multi-path Access Degraded, 16=Erasing, 17=Predictive Spare Rebuild Ready, 18=Rapid Parity Initialization In Progress, 19=Rapid Parity Initialization Pending, 20=No Access – Encrypted with No Controller Key, 21=Unencrypted to Encrypted Transformation in Progress, 22=New Logical Drive Key Rekey in Progress, 23=No Access – Encrypted with Controller Encryption Not Enabled, 24=Unencrypted To Encrypted Transformation Not Started, 25=New Logical Drive Key Rekey Request Received))
1.3.6.1.4.1.232.3.2.3.1.1.11.0 (Logical Drive Condition (1=other, 2=ok, 3=degraded, 4=failed))

Drives

OID Name Description
1.3.6.1.4.1.232.3.2.5.1.1.2.0 (Drive Index)
1.3.6.1.4.1.232.3.2.5.1.1.5.0 (Drive Bay)
1.3.6.1.4.1.232.3.2.5.1.1.64.0 (Drive Location)
1.3.6.1.4.1.232.3.2.5.1.1.3.0 (Drive Vendor)
1.3.6.1.4.1.232.3.2.5.1.1.51.0 (Drive Serial Number)
1.3.6.1.4.1.232.3.2.5.1.1.45.0 (Drive Size in MB)
1.3.6.1.4.1.232.3.2.5.1.1.65.0 (Drive Link Rate (1=other, 2=1.5Gbps, 3=3.0Gbps, 4=6.0Gbps, 5=12.0Gbps))
1.3.6.1.4.1.232.3.2.5.1.1.70.0 (Drive Current Temperature)
1.3.6.1.4.1.232.3.2.5.1.1.71.0 (Drive Temperature Threshold)
1.3.6.1.4.1.232.3.2.5.1.1.72.0 (Drive Maximum Temperature)
1.3.6.1.4.1.232.3.2.5.1.1.6.0 (Drive Status (1=Other, 2=Ok, 3=Failed, 4=Predictive Failure, 5=Erasing, 6=Erase Done, 7=Erase Queued, 8=SSD Wear Out, 9=Not Authenticated))
1.3.6.1.4.1.232.3.2.5.1.1.37.0 (Drive Condition (1=other, 2=ok, 3=degraded, 4=failed))
1.3.6.1.4.1.232.3.2.5.1.1.9.0 (Drive Reference Time in hours)

iLO NIC

OID Name Description
1.3.6.1.4.1.232.9.2.5.2.1.0 (iLO location)
1.3.6.1.4.1.232.9.2.5.1.1.0 (iLO NIC model)
1.3.6.1.4.1.232.9.2.5.1.1.0 (iLO NIC MAC)
1.3.6.1.4.1.232.9.2.5.1.1.0 (iLO NIC IPv4)
1.3.6.1.4.1.232.9.2.5.1.1.0 (iLO NIC speed)
1.3.6.1.4.1.232.9.2.5.1.1.0 (iLO NIC FQDN)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Tx bytes)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Tx packets)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Tx discard packets)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Tx error packets)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Rx bytes)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Rx packets)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Rx discard packets)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Rx error packets)
1.3.6.1.4.1.232.9.2.5.2.1.0 (Rx unknown packets)

Memory

OID Name Description
1.3.6.1.4.1.232.6.2.14.13.1.0 (Memory Index)
1.3.6.1.4.1.232.6.2.14.13.1.0 (Location)
1.3.6.1.4.1.232.6.2.14.13.1.0 (Manufacturer)
1.3.6.1.4.1.232.6.2.14.13.1.0 (Part Number)
1.3.6.1.4.1.232.6.2.14.13.1.0 (Size in Kbytes)
1.3.6.1.4.1.232.6.2.14.13.1.0 (Memory Technology)
1.3.6.1.4.1.232.6.2.14.13.1.0 (Memory Type)
1.3.6.1.4.1.232.6.2.14.13.1.0 (Memory status (1=other, 2=notPresent, 3=present, 4=good, 5=add, 6=upgrade, 7=missing, 8=doesNotMatch, 9=notSupported, 10=badConfig, 11=degraded, 12=spare, 13=partial))
1.3.6.1.4.1.232.6.2.14.13.1.0 (Memory condition (1=other, 2=ok, 3=degraded, 4=degradedModuleIndexUnknown))

SATA/SCSI Hot plugging

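Rescan a SCSI host for hot-plugged devices (<HOST> is the host adapter number; the three dashes are wildcards for channel, target and LUN):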

echo "- - -" > /sys/class/scsi_host/host<HOST>/scan
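
When it is not obvious which host adapter the new disk hangs off, every host can be rescanned in turn. This is a minimal sketch: `rescan_scsi_hosts` is a hypothetical helper name, and its optional directory argument exists only so the loop can be tried against a scratch directory instead of the real sysfs tree.

```shell
# Hypothetical helper: writes the wildcard scan request to every SCSI host.
# $1 (optional) overrides the sysfs directory; the default is the real one.
rescan_scsi_hosts() {
    base=${1:-/sys/class/scsi_host}
    for h in "$base"/host*; do
        [ -d "$h" ] || continue          # skip if the glob matched nothing
        echo "- - -" > "$h/scan"
    done
}

# Usage sketch (needs root when run against the real sysfs tree):
# rescan_scsi_hosts
```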

Detach a device from the kernel before hot plug removal (unmount it first; <DEVICE> is the block device name, e.g. sdb):

echo 1 > /sys/block/<DEVICE>/device/delete

Software RAID (mdadm)

Taken from mdadm cheat sheet[1]

Creates a new array:
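This creates a new level 1 (mirrored) array over two partitions, /dev/sda1 and /dev/sdb2. mdadm needs the number of member devices to be stated with --raid-devices.

mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb2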

Write the configuration:

mdadm --detail --scan >> /etc/mdadm/mdadm.conf

Remove a disk from an array:
The disk needs to be failed first before it can be removed, although it's most likely in a failed state already.

mdadm --fail /dev/md0 /dev/sda1

Then remove it:

mdadm --remove /dev/md0 /dev/sda1
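
The two steps above can be wrapped so that the removal only runs when failing the member succeeded. This is a minimal sketch: `md_remove_member` and the `DRY_RUN` variable are hypothetical names introduced here, not part of mdadm.

```shell
# Hypothetical helper: fail a member, then remove it from the array.
# With DRY_RUN set to any non-empty value the commands are printed, not run.
md_remove_member() {
    array=$1
    member=$2
    run=${DRY_RUN:+echo}     # "echo" when DRY_RUN is set, empty otherwise
    $run mdadm --fail "$array" "$member" &&
        $run mdadm --remove "$array" "$member"
}

# Usage sketch:
# DRY_RUN=1 md_remove_member /dev/md0 /dev/sda1
```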

Add a disk to an existing array:
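A replacement disk usually needs the same partition layout as the remaining member, so copy the partition table from the original disk to the new one first: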

sfdisk -d /dev/<ORIGINAL> | sfdisk /dev/<NEW>

Then add the device to the array:

mdadm --add /dev/md0 /dev/sdb1

Verifying the status of the RAID arrays:

cat /proc/mdstat

or

mdadm --detail /dev/md0
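
For scripted checks, the member-status field in /proc/mdstat (for example [UU] when healthy, [U_] when a member is missing) can be tested directly. This is a minimal sketch: `md_degraded` is a hypothetical helper name, and the optional file argument is only there so the check can be tried on a saved copy of mdstat.

```shell
# Hypothetical helper: succeeds (exit 0) if any array shows a missing
# member, i.e. a status field such as [U_] or [_U] containing "_".
md_degraded() {
    grep -Eq '\[[U_]*_[U_]*\]' "${1:-/proc/mdstat}"
}

# Usage sketch:
# if md_degraded; then echo "at least one md array is degraded"; fi
```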