vioscsi issue in OCI
While trying to make vioscsi work in illumos, I hit a problem with disk geometry: the hypervisor does not return mode pages 3 and 4. One way to fix this is to lie about the disk geometry. This is what Linux currently sees on the same hypervisor:
[opc@blog ~]$ sudo fdisk -l -u=cylinders /dev/sda
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sda: 50.0 GB, 50010783744 bytes, 97677312 sectors
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 1048576 bytes
Disk label type: gpt
Disk identifier: 6F25D687-CAE6-428A-8AA0-E618C576A2EB

#         Start          End    Size  Type            Name
 1         2048       411647    200M  EFI System      EFI System Partition
 2       411648     17188863      8G  Linux swap
 3     17188864     97675263   38.4G  Microsoft basic

[opc@blog ~]$ sudo fdisk -l /dev/sda
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sda: 50.0 GB, 50010783744 bytes, 97677312 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 1048576 bytes
Disk label type: gpt
Disk identifier: 6F25D687-CAE6-428A-8AA0-E618C576A2EB

#         Start          End    Size  Type            Name
 1         2048       411647    200M  EFI System      EFI System Partition
 2       411648     17188863      8G  Linux swap
 3     17188864     97675263   38.4G  Microsoft basic
Linux and OpenBSD also fake these values. In Linux this happens in sd_getgeo(), see https://github.com/torvalds/linux/blob/master/drivers/scsi/sd.c
static int sd_getgeo(struct block_device *bdev, struct hd_geometry *geo)
{
	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
	struct scsi_device *sdp = sdkp->device;
	struct Scsi_Host *host = sdp->host;
	sector_t capacity = logical_to_sectors(sdp, sdkp->capacity);
	int diskinfo[4];

	/* default to most commonly used values */
	diskinfo[0] = 0x40;	/* 1 << 6 */
	diskinfo[1] = 0x20;	/* 1 << 5 */
	diskinfo[2] = capacity >> 11;

	/* override with calculated, extended default, or driver values */
	if (host->hostt->bios_param)
		host->hostt->bios_param(sdp, bdev, capacity, diskinfo);
	else
		scsicam_bios_param(bdev, capacity, diskinfo);

	geo->heads = diskinfo[0];
	geo->sectors = diskinfo[1];
	geo->cylinders = diskinfo[2];
	return 0;
}
So I just hardcoded the values in sd_get_physical_geometry.
That did not work:
DEVELOPER MENU:
        dump_disk    - dump disk entries
        dump_cont    - dump controller entries
        dump_c_chain - dump controller chain entries
        dev_params   - dump device parameters
        !<cmd>       - execute <cmd>, then return
        quit
developer> dump_disk
disk_name c1t0d1
disk_path /dev/rdsk/c1t0d1s2
ctlr_cname = vioscsi
cltr_dname = sd
ctype_name = SCSI
ctype_ctype = 13
devfsname = /pci@0,0/pci108e,8@4/iport@iport0/disk@0,1
developer> dev_params
ncyl = 0
acyl = 0
pcyl = 0
nhead = 0
nsect = 0
developer> dump_c_chain
ctlrp->ctlr_type->ctype_name = ata
ctlrp->ctlr_type->ctype_name = SCSI
ctlrp->ctlr_type->ctype_name = pcmcia
ctlrp->ctlr_type->ctype_name = virtual-dsk
ctlrp->ctlr_type->ctype_name = generic-block-device
developer> dump_cont
ctype_name = SCSI
cname = vioscsi
dname = sd
ctype_ctype = 13
developer>
Clearly something goes wrong when the SCSI layer is asked about its geometry. One place where that information is obtained is sd_get_physical_geometry, so I traced its return value:
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_get_physical_geometry:return { printf("\nRETURN: %s() -> %d \n", probefunc, arg1); ustack(); }' -c format
Here is the stack trace when calling format
[root@ /opt]# /opt/t.dt
Searching for disks...
RETURN: sd_get_physical_geometry() -> 5
              libc.so.1`ioctl+0x7
              format`is_efi_type+0x22
              format`add_device_to_disklist+0x281
              format`do_search+0x303
              format`main+0xb0
              format`_start_crt+0x9a
              format`_start+0x1a

RETURN: sd_get_physical_geometry() -> 5
              libc.so.1`ioctl+0x7
              format`is_efi_type+0x22
              format`add_device_to_disklist+0x281
              format`do_search+0x303
              format`main+0xb0
              format`_start_crt+0x9a
              format`_start+0x1a

RETURN: sd_get_physical_geometry() -> 5
              libc.so.1`ioctl+0x7
              libefi.so.1`efi_alloc_and_read+0x71
              format`read_efi_label+0x26
              format`add_device_to_disklist+0x43e
              format`do_search+0x303
              format`main+0xb0
              format`_start_crt+0x9a
              format`_start+0x1a
done

c1t0d1: configured with capacity of 46.58GB
Return code of 5 means EIO according to /usr/src/include/sys/syserrno.h
	if (ISCD(un) ||
	    un->un_interconnect_type == SD_INTERCONNECT_SATA ||
	    (un->un_ctype == CTYPE_CCS && SD_INQUIRY(un)->inq_ansi >= 5))
		return (ret);
Today I tried fdisk in the OCI compute instance.
In OCI:
[root@ /]# fdisk -d /dev/rdsk/c1t0d1p0
Physical Geometry:
  cylinders[6080] heads[255] sectors[63]
  sector size[512] blocks[97675200] mbytes[-1459]
Virtual (HBA) Geometry:
  cylinders[6080] heads[255] sectors[63]
  sector size[512] blocks[97675200] mbytes[-1459]
Error in ioctl DKIOCGMBOOT: I/O error
fdisk: Error reading partition table from /dev/rdsk/c1t0d1p0.
And on a healthy host:
[root@dev01 ~]# fdisk -d /dev/rdsk/c1d0p0
Physical Geometry:
  cylinders[5874] heads[255] sectors[63]
  sector size[512] blocks[94365810] mbytes[1021]
Virtual (HBA) Geometry:
  cylinders[5874] heads[255] sectors[63]
  sector size[512] blocks[94365810] mbytes[1021]
Partition Table Entry Values:
SYSID ACT BHEAD BSECT BEGCYL EHEAD ESECT ENDCYL RELSECT  NUMSECT
  238   0     0     2      0   255    63   1023       1 94371479
    0   0     0     0      0     0     0      0       0        0
    0   0     0     0      0     0     0      0       0        0
    0   0     0     0      0     0     0      0       0        0
Now, how is mbytes calculated?
	if (io_debug) {
		(void) fprintf(stderr, "Physical Geometry:\n");
		(void) fprintf(stderr,
		    " cylinders[%d] heads[%d] sectors[%d]\n"
		    " sector size[%d] blocks[%d] mbytes[%d]\n",
		    Numcyl,
		    heads,
		    sectors,
		    sectsiz,
		    Numcyl * heads * sectors,
		    (Numcyl * heads * sectors * sectsiz) / 1048576);
		(void) fprintf(stderr, "Virtual (HBA) Geometry:\n");
		(void) fprintf(stderr,
		    " cylinders[%d] heads[%d] sectors[%d]\n"
		    " sector size[%d] blocks[%d] mbytes[%d]\n",
		    hba_Numcyl,
		    hba_heads,
		    hba_sectors,
		    sectsiz,
		    hba_Numcyl * hba_heads * hba_sectors,
		    (hba_Numcyl * hba_heads * hba_sectors * sectsiz) /
		    1048576);
	}
}
[root@ /]# fdisk -G /dev/rdsk/c1t0d1p0
Physical geometry for device /dev/rdsk/c1t0d1p0
* PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  6080 6080 0 0 255
So, with the geometry reported by the sd(4) driver, mbytes is computed as:
(Numcyl * heads * sectors * sectsiz) / 1048576
Going through the code, fdisk calls the DKIOCGMBOOT ioctl, which ultimately is a read of sector 0 of the disk.
-
This ends up calling a read/write operation in the sd driver.
-
Here it checks for a valid block size, in sd_update_block_info:
- https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/io/scsi/targets/sd.c?r=93686a1e#5224
I'm trying these DTrace scripts:
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_READ_CAPACITY:entry { print(*(struct sd_lun *)args[0]->ssc_un); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_READ_CAPACITY:entry { print(*(struct scsi_device *)args[0]->ssc_un); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_READ_CAPACITY:return { printf("%s:%d",probefunc, args[1]);}' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_RDWR:entry { printf("%s:start_block %d bufflen %d\n",probefunc,args[4], args[3]); ustack();exit(0); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_ssc_send:entry { print(*(struct sd_ssc_t *)args[0]); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_ssc_send:return { printf("%s: returns %d\n",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:scsi:scsi_uscsi_copyin:entry { print(*(struct uscsi_cmd *)args[3]);ustack();exit(0); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:scsi:scsi_uscsi_copyin:return { printf("%s:%d",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_MODE_SELECT:entry { printf("%s: returns %d\n",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:scsi:scsi_uscsi_handle_cmd:return { printf("%s: returns %d\n",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:genunix:physio:return { printf("%s: returns %d\n",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:vioscsi:vioscsi_tran_start:entry { print(*(struct scsi_device *)args[0]->a.a_sd); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:vioscsi:vioscsi_tran_getcap:return { printf("%s:%d",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
# this one returns entry/return
dtrace -x switchrate=1000hz -q -n 'fbt:vioscsi:vioscsi_tran_getcap:entry { printf("%s:%s",probefunc, stringof(args[1])); }fbt:vioscsi:vioscsi_tran_getcap:return { printf("returns %s:%d\n",probefunc, args[1]); } ' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_TEST_UNIT_READY:return { printf("returns %s:%d\n",probefunc, args[1]); } ' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:cmlb:cmlb_ioctl:return { printf("returns %s:%d\n",probefunc, args[1]); } ' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:cmlb:cmlb_dkio_get_mboot:entry { print(*(struct cmlb_lun *)args[0]); } fbt:cmlb:cmlb_dkio_get_mboot:return { printf("returns %s:%d\n",probefunc, args[1]); } ' -c 'fdisk /dev/rdsk/c1t0d1p0'
Having traced all the errors back to biowait, where it simply could not read a block, the last thing I did was to trace vioscsi instead of sd. I found that vioscsi fails when asked for the geometry capability (I don't know how bad that is):
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:geometryreturns vioscsi_tran_getcap:-1
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:geometryreturns vioscsi_tran_getcap:-1
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
I added the geometry capability to vioscsi, but got the same error. One more data point: I installed OpenBSD 7.2 and 7.3 and both fail. There the error is also in the vioscsi driver, because when I disabled it at boot the boot process continued. Could the problem be in the attach function?
These are tables of SCSI error codes:
https://www.ibm.com/docs/en/flashsystem-v7000u/1.5.2?topic=problems-smart-ascascq-error-codes-messages
https://www.ibm.com/docs/en/spectrum-protect/8.1.0?topic=messages-standard-asc-ascq-codes-descriptions
beginning format. The current time is Tue Apr 11 13:14:08 2023

Formatting...
Format failed

Retry of formatting operation without any of the standard mode selects and ignoring disk's Grown Defects list. The disk may be able to be reformatted this way if an earlier formatting operation was interrupted by a power failure or SCSI bus reset. The Grown Defects list will be recreated by format verification and surface analysis.

Retry format without mode selects and Grown Defects list? y
Formatting...
Illegal request during format
ASC: 0x20 ASCQ: 0x0
Illegal request during format
ASC: 0x20 ASCQ: 0x0
ASC 0x20 with ASCQ 0x0 means INVALID COMMAND OPERATION CODE: the target rejected the command opcode itself.
Red Hat explanation of SCSI codes
vioscsi_softc_t
dtrace -x switchrate=1000hz -q -n 'fbt:vioscsi:vioscsi_cmd_handler:entry { print(*(struct vioscsi_softc_t *)args[1]); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_enable_descr_sense:entry { print(*(struct sd_ssc_t *)args[1]); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
The situation is the same. Today, at least on the free tier of OCI and according to users, illumos will boot if the image is set up to use the paravirtualized drivers. So I'm stopping my research now.
I just checked again, in a Linux instance, which virtio device it reports:
00:04.0 SCSI storage controller: Red Hat, Inc. Virtio SCSI
	Subsystem: Oracle/SUN Virtio SCSI
	Physical Slot: 4
	Flags: bus master, fast devsel, latency 0, IRQ 11
	I/O ports at c000 [size=64]
	Memory at 81010000 (32-bit, non-prefetchable) [size=4K]
	Memory at 800004000 (64-bit, prefetchable) [size=16K]
	Capabilities: [98] MSI-X: Enable+ Count=4 Masked-
	Capabilities: [84] Vendor Specific Information: VirtIO: <unknown>
	Capabilities: [70] Vendor Specific Information: VirtIO: Notify
	Capabilities: [60] Vendor Specific Information: VirtIO: DeviceCfg
	Capabilities: [50] Vendor Specific Information: VirtIO: ISR
	Capabilities: [40] Vendor Specific Information: VirtIO: CommonCfg
	Kernel driver in use: virtio-pci