vioscsi issue in OCI
While trying to get vioscsi working on illumos, I ran into a problem with disk geometry: the hypervisor does not return mode pages 3 and 4. One way to fix this is to lie about the disk geometry. This is what Linux currently sees on the same hypervisor:
[opc@blog ~]$ sudo fdisk -l -u=cylinders /dev/sda
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sda: 50.0 GB, 50010783744 bytes, 97677312 sectors
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 1048576 bytes
Disk label type: gpt
Disk identifier: 6F25D687-CAE6-428A-8AA0-E618C576A2EB

#         Start          End    Size  Type            Name
 1         2048       411647    200M  EFI System      EFI System Partition
 2       411648     17188863      8G  Linux swap
 3     17188864     97675263   38.4G  Microsoft basic

[opc@blog ~]$ sudo fdisk -l /dev/sda
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sda: 50.0 GB, 50010783744 bytes, 97677312 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 1048576 bytes
Disk label type: gpt
Disk identifier: 6F25D687-CAE6-428A-8AA0-E618C576A2EB

#         Start          End    Size  Type            Name
 1         2048       411647    200M  EFI System      EFI System Partition
 2       411648     17188863      8G  Linux swap
 3     17188864     97675263   38.4G  Microsoft basic
Linux and OpenBSD also fake these values. In Linux this is done in sd_getgeo(): https://github.com/torvalds/linux/blob/master/drivers/scsi/sd.c
static int sd_getgeo(struct block_device *bdev, struct hd_geometry *geo)
{
	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
	struct scsi_device *sdp = sdkp->device;
	struct Scsi_Host *host = sdp->host;
	sector_t capacity = logical_to_sectors(sdp, sdkp->capacity);
	int diskinfo[4];

	/* default to most commonly used values */
	diskinfo[0] = 0x40;	/* 1 << 6 */
	diskinfo[1] = 0x20;	/* 1 << 5 */
	diskinfo[2] = capacity >> 11;

	/* override with calculated, extended default, or driver values */
	if (host->hostt->bios_param)
		host->hostt->bios_param(sdp, bdev, capacity, diskinfo);
	else
		scsicam_bios_param(bdev, capacity, diskinfo);

	geo->heads = diskinfo[0];
	geo->sectors = diskinfo[1];
	geo->cylinders = diskinfo[2];
	return 0;
}
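To make the defaults in that excerpt concrete, here is a minimal sketch of inventing a geometry from the capacity alone. The struct and function names are hypothetical (they are not part of Linux or illumos), and the sketch ignores the bios_param override entirely.

```c
#include <stdint.h>

/* Hypothetical helper type, named here only for illustration. */
struct fake_geometry {
	uint32_t heads;
	uint32_t sectors;	/* sectors per track */
	uint64_t cylinders;
};

/*
 * Invent a geometry from the capacity, using the same defaults as the
 * sd_getgeo() excerpt: 64 heads, 32 sectors per track, and
 * capacity >> 11 cylinders (64 * 32 = 2048 = 1 << 11).
 * capacity is counted in 512-byte sectors.
 */
static struct fake_geometry
fake_geometry_from_capacity(uint64_t capacity)
{
	struct fake_geometry g;

	g.heads = 0x40;			/* 1 << 6 */
	g.sectors = 0x20;		/* 1 << 5 */
	g.cylinders = capacity >> 11;
	return (g);
}
```

For the 97677312-sector disk in the fdisk output this gives 47694 cylinders of 64 * 32 = 2048 sectors each, which is consistent with the "Units = cylinders of 2048 * 512" line.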
So I tried just hardcoding values in sd_get_physical_geometry.
That did not work:
DEVELOPER MENU:
dump_disk - dump disk entries
dump_cont - dump controller entries
dump_c_chain - dump controller chain entries
dev_params - dump device parameters
!<cmd> - execute <cmd>, then return
quit
developer> dump_disk
disk_name c1t0d1 disk_path /dev/rdsk/c1t0d1s2
ctlr_cname = vioscsi cltr_dname = sd ctype_name = SCSI
ctype_ctype = 13
devfsname = /pci@0,0/pci108e,8@4/iport@iport0/disk@0,1
developer> dev_params
ncyl = 0
acyl = 0
pcyl = 0
nhead = 0
nsect = 0
developer> dump_c_chain
ctlrp->ctlr_type->ctype_name = ata
ctlrp->ctlr_type->ctype_name = SCSI
ctlrp->ctlr_type->ctype_name = pcmcia
ctlrp->ctlr_type->ctype_name = virtual-dsk
ctlrp->ctlr_type->ctype_name = generic-block-device
developer> dump_cont
ctype_name = SCSI cname = vioscsi dname = sd ctype_ctype = 13
developer>
Clearly something goes wrong when the SCSI driver is asked about its geometry. One place where that information is obtained is sd_get_physical_geometry:
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_get_physical_geometry:return { printf("\nRETURN: %s() -> %d \n", probefunc, arg1); ustack(); }' -c format
Here are the stack traces when calling format:
[root@ /opt]# /opt/t.dt
Searching for disks...
RETURN: sd_get_physical_geometry() -> 5
libc.so.1`ioctl+0x7
format`is_efi_type+0x22
format`add_device_to_disklist+0x281
format`do_search+0x303
format`main+0xb0
format`_start_crt+0x9a
format`_start+0x1a
RETURN: sd_get_physical_geometry() -> 5
libc.so.1`ioctl+0x7
format`is_efi_type+0x22
format`add_device_to_disklist+0x281
format`do_search+0x303
format`main+0xb0
format`_start_crt+0x9a
format`_start+0x1a
RETURN: sd_get_physical_geometry() -> 5
libc.so.1`ioctl+0x7
libefi.so.1`efi_alloc_and_read+0x71
format`read_efi_label+0x26
format`add_device_to_disklist+0x43e
format`do_search+0x303
format`main+0xb0
format`_start_crt+0x9a
format`_start+0x1a
done
c1t0d1: configured with capacity of 46.58GB
A return code of 5 means EIO according to /usr/src/include/sys/syserrno.h. The check in question:
	if (ISCD(un) ||
	    un->un_interconnect_type == SD_INTERCONNECT_SATA ||
	    (un->un_ctype == CTYPE_CCS && SD_INQUIRY(un)->inq_ansi >= 5))
		return (ret);
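A small model of that check may help when reasoning about which devices take the early return. This is only a sketch: the struct is a hypothetical stand-in for the real unit structure, and it assumes ret still holds EIO when the check runs, which is not confirmed by the excerpt itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the fields the check reads. */
struct unit_model {
	bool is_cd;	/* ISCD(un) */
	bool is_sata;	/* un_interconnect_type == SD_INTERCONNECT_SATA */
	bool is_ccs;	/* un_ctype == CTYPE_CCS */
	int inq_ansi;	/* SD_INQUIRY(un)->inq_ansi */
};

/*
 * Returns EIO when the check fires (the geometry mode pages are never
 * requested), or 0 when the function would carry on past the check.
 */
static int
physical_geometry_precheck(const struct unit_model *un)
{
	int ret = EIO;	/* assumption: ret starts out as EIO */

	if (un->is_cd || un->is_sata ||
	    (un->is_ccs && un->inq_ansi >= 5))
		return (ret);
	return (0);
}
```

Under that assumption, a CCS-type disk that reports inq_ansi of 5 or higher gets EIO without any mode page being requested, which would match the value seen in the DTrace output.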
Today I tried fdisk in the OCI compute instance.
In OCI
[root@ /]# fdisk -d /dev/rdsk/c1t0d1p0
Physical Geometry:
  cylinders[6080] heads[255] sectors[63]
  sector size[512] blocks[97675200] mbytes[-1459]
Virtual (HBA) Geometry:
  cylinders[6080] heads[255] sectors[63]
  sector size[512] blocks[97675200] mbytes[-1459]
Error in ioctl DKIOCGMBOOT: I/O error
fdisk: Error reading partition table from /dev/rdsk/c1t0d1p0.
And on a healthy host:
[root@dev01 ~]# fdisk -d /dev/rdsk/c1d0p0
Physical Geometry:
  cylinders[5874] heads[255] sectors[63]
  sector size[512] blocks[94365810] mbytes[1021]
Virtual (HBA) Geometry:
  cylinders[5874] heads[255] sectors[63]
  sector size[512] blocks[94365810] mbytes[1021]
Partition Table Entry Values:
SYSID ACT BHEAD BSECT BEGCYL EHEAD ESECT ENDCYL RELSECT NUMSECT
238   0   0     2     0      255   63    1023   1       94371479
0     0   0     0     0      0     0     0      0       0
0     0   0     0     0      0     0     0      0       0
0     0   0     0     0      0     0     0      0       0
Now, how is mbytes calculated?
	if (io_debug) {
		(void) fprintf(stderr, "Physical Geometry:\n");
		(void) fprintf(stderr,
		    " cylinders[%d] heads[%d] sectors[%d]\n"
		    " sector size[%d] blocks[%d] mbytes[%d]\n",
		    Numcyl,
		    heads,
		    sectors,
		    sectsiz,
		    Numcyl * heads * sectors,
		    (Numcyl * heads * sectors * sectsiz) / 1048576);
		(void) fprintf(stderr, "Virtual (HBA) Geometry:\n");
		(void) fprintf(stderr,
		    " cylinders[%d] heads[%d] sectors[%d]\n"
		    " sector size[%d] blocks[%d] mbytes[%d]\n",
		    hba_Numcyl,
		    hba_heads,
		    hba_sectors,
		    sectsiz,
		    hba_Numcyl * hba_heads * hba_sectors,
		    (hba_Numcyl * hba_heads * hba_sectors * sectsiz) /
		    1048576);
	}
[root@ /]# fdisk -G /dev/rdsk/c1t0d1p0
Physical geometry for device /dev/rdsk/c1t0d1p0
* PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  6080 6080 0    0    255
So, with the values reported through the sd(4) driver, mbytes is:
(Numcyl * heads * sectors * sectsiz) / 1048576
[root@dev01 ~]# fdisk -d /dev/rdsk/c1d0p0
Physical Geometry:
  cylinders[5874] heads[255] sectors[63]
  sector size[512] blocks[94365810] mbytes[1021]
Virtual (HBA) Geometry:
  cylinders[5874] heads[255] sectors[63]
  sector size[512] blocks[94365810] mbytes[1021]
Partition Table Entry Values:
SYSID ACT BHEAD BSECT BEGCYL EHEAD ESECT ENDCYL RELSECT NUMSECT
238   0   0     2     0      255   63    1023   1       94371479
0     0   0     0     0      0     0     0      0       0
0     0   0     0     0      0     0     0      0       0
0     0   0     0     0      0     0     0      0       0
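The negative mbytes on the OCI disk looks like plain 32-bit overflow in that expression rather than a geometry problem: 6080 * 255 * 63 * 512 is about 50 GB, which does not fit in an int. The sketch below reproduces both printed values. It assumes the four operands are 32-bit ints (the %d format suggests this, but treat it as an assumption), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Reinterpret the low 32 bits of v as a signed two's-complement value. */
static int32_t
wrap_to_int32(uint64_t v)
{
	uint32_t low = (uint32_t)(v & 0xffffffffu);

	if (low >= 0x80000000u)
		return ((int32_t)((int64_t)low - 4294967296LL));
	return ((int32_t)low);
}

/* mbytes as a wrapping 32-bit int expression would produce it. */
static int32_t
mbytes_wrapped(uint32_t ncyl, uint32_t heads, uint32_t sectors,
    uint32_t sectsiz)
{
	uint64_t bytes = (uint64_t)ncyl * heads * sectors * sectsiz;

	return (wrap_to_int32(bytes) / 1048576);
}

/* mbytes computed in 64 bits, with no overflow. */
static uint64_t
mbytes_exact(uint32_t ncyl, uint32_t heads, uint32_t sectors,
    uint32_t sectsiz)
{
	uint64_t bytes = (uint64_t)ncyl * heads * sectors * sectsiz;

	return (bytes / 1048576);
}
```

With the OCI geometry (6080, 255, 63, 512) the wrapped result is -1459 and the exact one is 47692. With the healthy host's geometry (5874, 255, 63, 512) the wrapped result is 1021 and the exact one is 46077. So the healthy host's mbytes[1021] is wrapped too, and the mbytes field by itself does not distinguish the two hosts.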
Going through the code, fdisk calls the DKIOCGMBOOT ioctl, which ultimately is a read of sector 0 of the disk.
That ends up calling a read/write operation on the sd driver.
There it checks for a valid block size in sd_update_block_info:
- https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/io/scsi/targets/sd.c?r=93686a1e#5224
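Since the failing operation boils down to reading sector 0, a userland approximation of that read can be handy for comparing hosts. The sketch below is portable C rather than the DKIOCGMBOOT path itself: it reads the first 512 bytes of a path with stdio and checks the MBR boot signature. The function names are hypothetical, the signature check is an addition for illustration, and stdio may issue a larger read than 512 bytes underneath.

```c
#include <stdint.h>
#include <stdio.h>

#define	SECTOR_SIZE	512

/*
 * Read sector 0 of path into buf (SECTOR_SIZE bytes).
 * Returns 0 on success, -1 if the open or the read fails.
 */
static int
read_sector0(const char *path, uint8_t *buf)
{
	FILE *fp = fopen(path, "rb");
	size_t n;

	if (fp == NULL)
		return (-1);
	n = fread(buf, 1, SECTOR_SIZE, fp);
	(void) fclose(fp);
	return (n == SECTOR_SIZE ? 0 : -1);
}

/* An MBR ends with the bytes 0x55 0xAA at offsets 510 and 511. */
static int
has_mbr_signature(const uint8_t *buf)
{
	return (buf[510] == 0x55 && buf[511] == 0xAA);
}
```

Calling read_sector0() on a raw disk device such as /dev/rdsk/c1t0d1p0 and getting -1 would show the same failure from userland without involving fdisk.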
I'm trying these DTrace scripts
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_READ_CAPACITY:entry { print(*(struct sd_lun *)args[0]->ssc_un); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_READ_CAPACITY:entry { print(*(struct scsi_device *)args[0]->ssc_un); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_READ_CAPACITY:return { printf("%s:%d",probefunc, args[1]);}' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_RDWR:entry { printf("%s:start_block %d bufflen %d\n",probefunc,args[4], args[3]); ustack();exit(0); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_ssc_send:entry { print(*(struct sd_ssc_t *)args[0]); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_ssc_send:return { printf("%s: returns %d\n",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:scsi:scsi_uscsi_copyin:entry { print(*(struct uscsi_cmd *)args[3]);ustack();exit(0); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:scsi:scsi_uscsi_copyin:return { printf("%s:%d",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_MODE_SELECT:entry { printf("%s: returns %d\n",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:scsi:scsi_uscsi_handle_cmd:return { printf("%s: returns %d\n",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:genunix:physio:return { printf("%s: returns %d\n",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:vioscsi:vioscsi_tran_start:entry { print(*(struct scsi_device *)args[0]->a.a_sd); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:vioscsi:vioscsi_tran_getcap:return { printf("%s:%d",probefunc, args[1]);ustack(); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
# this one returns entry/return
dtrace -x switchrate=1000hz -q -n 'fbt:vioscsi:vioscsi_tran_getcap:entry { printf("%s:%s",probefunc, stringof(args[1])); }fbt:vioscsi:vioscsi_tran_getcap:return { printf("returns %s:%d\n",probefunc, args[1]); } ' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_send_scsi_TEST_UNIT_READY:return { printf("returns %s:%d\n",probefunc, args[1]); } ' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:cmlb:cmlb_ioctl:return { printf("returns %s:%d\n",probefunc, args[1]); } ' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:cmlb:cmlb_dkio_get_mboot:entry { print(*(struct cmlb_lun *)args[0]); } fbt:cmlb:cmlb_dkio_get_mboot:return { printf("returns %s:%d\n",probefunc, args[1]); } ' -c 'fdisk /dev/rdsk/c1t0d1p0'
I traced all the errors back to biowait, where it simply could not read a block. The last thing I did was to trace vioscsi instead of sd, and I found that vioscsi fails the geometry capability query (I don't know how bad that is):
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:geometryreturns vioscsi_tran_getcap:-1
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:geometryreturns vioscsi_tran_getcap:-1
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
vioscsi_tran_getcap:max-cdb-lengthreturns vioscsi_tran_getcap:32
I added the geometry capability to vioscsi, but got the same error. I also installed OpenBSD 7.2 and 7.3, and both fail; there the error is in the vioscsi driver, because when I disabled it at boot the boot process continued. Could the problem be in the attach function?
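For reference, this is roughly what answering the geometry capability amounts to. It is a userland model, not the vioscsi code: model_getcap and the helpers are hypothetical names, the 255/63 values are simply the ones fdisk printed earlier, and the packing (heads in the upper 16 bits, sectors per track in the lower 16 bits) is an assumption about how sd decodes the value.

```c
#include <string.h>

/* Pack heads and sectors per track into one capability value. */
static int
pack_geometry(unsigned heads, unsigned sectors)
{
	return ((int)(((heads & 0xffffu) << 16) | (sectors & 0xffffu)));
}

/* Unpack the two halves again, as the consumer of the value would. */
static unsigned
geometry_heads(int value)
{
	return (((unsigned)value >> 16) & 0xffffu);
}

static unsigned
geometry_sectors(int value)
{
	return ((unsigned)value & 0xffffu);
}

/*
 * Model of a getcap handler: known capabilities return a value,
 * anything else returns -1 (the result seen for "geometry" above).
 */
static int
model_getcap(const char *cap)
{
	if (strcmp(cap, "max-cdb-length") == 0)
		return (32);
	if (strcmp(cap, "geometry") == 0)
		return (pack_geometry(255, 63));
	return (-1);
}
```

Since adding the capability did not change the error, the -1 on its own is probably not what makes the sector 0 read fail.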
These tables list the SCSI ASC/ASCQ error codes:
https://www.ibm.com/docs/en/flashsystem-v7000u/1.5.2?topic=problems-smart-ascascq-error-codes-messages
https://www.ibm.com/docs/en/spectrum-protect/8.1.0?topic=messages-standard-asc-ascq-codes-descriptions
beginning format. The current time is Tue Apr 11 13:14:08 2023

Formatting...
Format failed

Retry of formatting operation without any of the standard
mode selects and ignoring disk's Grown Defects list. The
disk may be able to be reformatted this way if an earlier
formatting operation was interrupted by a power failure or
SCSI bus reset. The Grown Defects list will be recreated
by format verification and surface analysis.

Retry format without mode selects and Grown Defects list? y
Formatting...
Illegal request during format
ASC: 0x20 ASCQ: 0x0
Illegal request during format
ASC: 0x20 ASCQ: 0x0
ASC 0x20 with ASCQ 0x0 means INVALID COMMAND OPERATION CODE.
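To avoid looking codes up by hand each time, a tiny decoder for fixed-format sense data can be sketched as follows. The table only holds a handful of well-known ASC/ASCQ pairs, the names are hypothetical, and descriptor-format sense data (response codes 0x72/0x73) is deliberately not handled.

```c
#include <stddef.h>
#include <stdint.h>

struct asc_entry {
	uint8_t asc;
	uint8_t ascq;
	const char *text;
};

/* A few common ASC/ASCQ pairs; far from the complete list. */
static const struct asc_entry asc_table[] = {
	{ 0x04, 0x00, "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE" },
	{ 0x20, 0x00, "INVALID COMMAND OPERATION CODE" },
	{ 0x24, 0x00, "INVALID FIELD IN CDB" },
	{ 0x25, 0x00, "LOGICAL UNIT NOT SUPPORTED" },
	{ 0x3a, 0x00, "MEDIUM NOT PRESENT" },
};

/* Returns the description, or NULL if the pair is not in the table. */
static const char *
asc_lookup(uint8_t asc, uint8_t ascq)
{
	size_t i;

	for (i = 0; i < sizeof (asc_table) / sizeof (asc_table[0]); i++) {
		if (asc_table[i].asc == asc && asc_table[i].ascq == ascq)
			return (asc_table[i].text);
	}
	return (NULL);
}

/*
 * Extract sense key, ASC and ASCQ from fixed-format sense data
 * (response code 0x70 or 0x71). Returns 0 on success, -1 otherwise.
 */
static int
parse_fixed_sense(const uint8_t *sense, size_t len, uint8_t *key,
    uint8_t *asc, uint8_t *ascq)
{
	uint8_t code;

	if (len < 14)
		return (-1);
	code = sense[0] & 0x7f;
	if (code != 0x70 && code != 0x71)
		return (-1);
	*key = sense[2] & 0x0f;	/* 0x5 is ILLEGAL REQUEST */
	*asc = sense[12];
	*ascq = sense[13];
	return (0);
}
```

For the format failure above, the sense key would be 0x5 (ILLEGAL REQUEST) and asc_lookup(0x20, 0x00) gives the INVALID COMMAND OPERATION CODE text.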
Red Hat explanation of SCSI codes
vioscsi_softc_t
dtrace -x switchrate=1000hz -q -n 'fbt:vioscsi:vioscsi_cmd_handler:entry { print(*(struct vioscsi_softc_t *)args[1]); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
dtrace -x switchrate=1000hz -q -n 'fbt:sd:sd_enable_descr_sense:entry { print(*(struct sd_ssc_t *)args[1]); }' -c 'fdisk /dev/rdsk/c1t0d1p0'
The situation is unchanged. Today, at least on the free tier of OCI, users report that illumos will boot if the image is set up to use the paravirtualized drivers. So I'm stopping my research now.
I just checked again, in a Linux instance, which virtio devices are reported:
00:04.0 SCSI storage controller: Red Hat, Inc. Virtio SCSI
	Subsystem: Oracle/SUN Virtio SCSI
	Physical Slot: 4
	Flags: bus master, fast devsel, latency 0, IRQ 11
	I/O ports at c000 [size=64]
	Memory at 81010000 (32-bit, non-prefetchable) [size=4K]
	Memory at 800004000 (64-bit, prefetchable) [size=16K]
	Capabilities: [98] MSI-X: Enable+ Count=4 Masked-
	Capabilities: [84] Vendor Specific Information: VirtIO: <unknown>
	Capabilities: [70] Vendor Specific Information: VirtIO: Notify
	Capabilities: [60] Vendor Specific Information: VirtIO: DeviceCfg
	Capabilities: [50] Vendor Specific Information: VirtIO: ISR
	Capabilities: [40] Vendor Specific Information: VirtIO: CommonCfg
	Kernel driver in use: virtio-pci
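The VirtIO labels in that listing come from the cfg_type field of each vendor-specific PCI capability. As a reading aid, here is a small lookup using the cfg_type numbering from the VirtIO 1.0 specification; the function name is hypothetical, the mapping should be checked against the spec, and which cfg_type the <unknown> entry at [84] carries cannot be told from this output.

```c
#include <stdint.h>

/*
 * Map a virtio_pci_cap cfg_type value to the label lspci prints.
 * Values follow the VirtIO 1.0 spec (VIRTIO_PCI_CAP_*_CFG).
 */
static const char *
virtio_pci_cap_name(uint8_t cfg_type)
{
	switch (cfg_type) {
	case 1:		/* VIRTIO_PCI_CAP_COMMON_CFG */
		return ("CommonCfg");
	case 2:		/* VIRTIO_PCI_CAP_NOTIFY_CFG */
		return ("Notify");
	case 3:		/* VIRTIO_PCI_CAP_ISR_CFG */
		return ("ISR");
	case 4:		/* VIRTIO_PCI_CAP_DEVICE_CFG */
		return ("DeviceCfg");
	default:	/* anything this sketch does not name */
		return ("<unknown>");
	}
}
```

The presence of CommonCfg, Notify, ISR and DeviceCfg capabilities suggests the device exposes the VirtIO 1.0 PCI layout; whether it also supports the legacy I/O port interface is not something this listing settles.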