net/mlx5: Skip clock update work when device is in error state
author Moshe Shemesh <moshe@nvidia.com>
Wed, 19 Jul 2023 08:33:44 +0000 (11:33 +0300)
committer Saeed Mahameed <saeedm@nvidia.com>
Mon, 7 Aug 2023 18:48:40 +0000 (11:48 -0700)
When the device is in error state, marked by the flag
MLX5_DEVICE_STATE_INTERNAL_ERROR, the HW and PCI may not be accessible,
so the clock update work should be skipped. Furthermore, accessing the
device through PCI in error state after mlx5_pci_disable_device() has
been called can result in failure to recover from PCI errors.

Fixes: ef9814deafd0 ("net/mlx5e: Add HW timestamping (TS) support")
Reported-and-tested-by: Ganesh G R <ganeshgr@linux.ibm.com>
Closes: https://lore.kernel.org/netdev/9bdb9b9d-140a-7a28-f0de-2e64e873c068@nvidia.com
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c

index 973babf..377372f 100644
@@ -227,10 +227,15 @@ static void mlx5_timestamp_overflow(struct work_struct *work)
        clock = container_of(timer, struct mlx5_clock, timer);
        mdev = container_of(clock, struct mlx5_core_dev, clock);
 
+       if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+               goto out;
+
        write_seqlock_irqsave(&clock->lock, flags);
        timecounter_read(&timer->tc);
        mlx5_update_clock_info_page(mdev);
        write_sequnlock_irqrestore(&clock->lock, flags);
+
+out:
        schedule_delayed_work(&timer->overflow_work, timer->overflow_period);
 }
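
The control flow introduced by this change can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: struct fake_dev, fake_hw_clock_read() and fake_rearm() are hypothetical stand-ins for struct mlx5_core_dev, the timecounter_read()/mlx5_update_clock_info_page() sequence, and schedule_delayed_work() respectively, and locking is omitted. It only shows that the handler skips the hardware access in error state while still re-arming itself.

```c
/* Hypothetical stand-in for the device state; not the real mlx5 enum. */
enum fake_dev_state {
	FAKE_DEVICE_STATE_UP,
	FAKE_DEVICE_STATE_INTERNAL_ERROR,
};

/* Hypothetical stand-in for struct mlx5_core_dev, with counters so the
 * behavior can be observed. */
struct fake_dev {
	enum fake_dev_state state;
	unsigned int hw_reads; /* times the "hardware" clock was read */
	unsigned int rearms;   /* times the work was rescheduled */
};

/* Stand-in for the hardware clock read and clock info page update. */
static void fake_hw_clock_read(struct fake_dev *dev)
{
	dev->hw_reads++;
}

/* Stand-in for schedule_delayed_work(). */
static void fake_rearm(struct fake_dev *dev)
{
	dev->rearms++;
}

/* Mirrors the shape of the patched handler: bail out before touching
 * the hardware when the device is in error state, but always re-arm. */
static void fake_timestamp_overflow(struct fake_dev *dev)
{
	if (dev->state == FAKE_DEVICE_STATE_INTERNAL_ERROR)
		goto out;

	fake_hw_clock_read(dev);

out:
	fake_rearm(dev);
}
```

Calling fake_timestamp_overflow() on a device in FAKE_DEVICE_STATE_UP increments both counters, while a call in FAKE_DEVICE_STATE_INTERNAL_ERROR increments only rearms. Because the work is still rescheduled in the error path, as in the diff above, the next invocation re-checks the state, so in this sketch the reads start again once the state is no longer the error state.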