attempt to access beyond end of device
Michael E Brown
Michael_E_Brown at dell.com
Tue May 1 14:25:15 CDT 2007
On Tue, May 01, 2007 at 12:08:18PM -0400, David Titzer wrote:
> Seeing the following paired kernel error messages:
>
> "attempt to access beyond end of device"
> "dm-1: rw=0, want=26553873480, limit=3786932224"
Wow, that is a huge discrepancy: the request ends at about seven times
the size of the device. You are definitely in for some problems.
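To put numbers on that discrepancy, here is a rough sketch that parses the second message and converts it to sizes. It assumes `want` and `limit` are counted in 512-byte sectors, which is how 2.6-era kernels appear to report them. The helper name `parse_bad_sector` is made up for illustration.

```python
import re

SECTOR_BYTES = 512  # assumption: want/limit are 512-byte sector counts
TIB = 1024 ** 4

def parse_bad_sector(line):
    """Pull device, want and limit out of a 'dev: rw=, want=, limit=' line."""
    m = re.search(r"(\S+): rw=(\d+), want=(\d+), limit=(\d+)", line)
    if m is None:
        raise ValueError("not a recognised message: %r" % line)
    want = int(m.group(3))
    limit = int(m.group(4))
    return {
        "device": m.group(1),
        "rw": int(m.group(2)),
        "want_bytes": want * SECTOR_BYTES,    # where the request would end
        "limit_bytes": limit * SECTOR_BYTES,  # where the device ends
        "ratio": want / limit,
    }

info = parse_bad_sector("dm-1: rw=0, want=26553873480, limit=3786932224")
print("%s ends at %.2f TiB, request reached %.2f TiB (%.1fx the device size)" % (
    info["device"],
    info["limit_bytes"] / TIB,
    info["want_bytes"] / TIB,
    info["ratio"],
))
```

If the sector assumption holds, dm-1 is a bit under 2 TiB and something asked for a block more than 12 TiB into it.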
>
> The server is a PE2950 running RHEL ES 4.4. The system has one filesystem under LVM, physically living on an AoE storage server. AoE drivers are up-to-date. Server exports the root of that logical filesystem to other servers. The server's kernel is 2.6.9-42-ELsmp.
>
> We are seeing odd small-file corruption. Sometimes, this corruption occurs to static data files after being accessed for read.
That error would definitely explain the corruption.
>
> I've not seen info related to this problem on recent Red Hat releases, or recent kernels for that matter.
>
> I could use a good starting point for troubleshooting this! This isn't
> the only AoE installation I'm working with, but it is the only one
> using PE2950s, and the only one with such errors. Thanks.
You need to double-check the sizes of everything. Things to check
for:
  -- partition sizes match the actual size of the disk
  -- LVM PV sizes match the size of the device each PV is on
  -- LVM VG sizes match the combined size of all PVs in the VG
  -- check that you don't have any devices that may have gone offline
     and shrunk your VG size unexpectedly. Something like a RAID0
     with an offline device would be *bad*.
  -- check that the size of the fs on the device matches the size of
     the device.
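The checklist above boils down to "no layer may be bigger than the layer it sits on". A minimal sketch of that comparison follows, assuming a simple stack with one PV. The layer names and sizes are hypothetical placeholders, not values from this system. The real numbers would have to be gathered by hand, for example with `blockdev --getsize64`, `pvs`/`vgs`/`lvs --units b`, and `dumpe2fs -h` for ext2/ext3, assuming those tools are installed.

```python
GB = 10 ** 9

def find_oversized_layers(layers):
    """layers: list of (name, size_bytes), ordered from the bottom of the
    storage stack to the top. Returns (upper, lower, excess_bytes) for every
    layer that claims to be larger than the layer directly beneath it."""
    problems = []
    for (lower_name, lower_size), (upper_name, upper_size) in zip(layers, layers[1:]):
        if upper_size > lower_size:
            problems.append((upper_name, lower_name, upper_size - lower_size))
    return problems

# Hypothetical sizes for illustration only.
stack = [
    ("AoE device", 50 * GB),
    ("partition", 50 * GB),
    ("LVM PV", 100 * GB),   # PV metadata still describes a larger device
    ("LV", 100 * GB),
    ("filesystem", 100 * GB),
]

problems = find_oversized_layers(stack)
for upper, lower, excess in problems:
    print("%s is %d bytes larger than the %s under it" % (upper, excess, lower))
```

A layer being slightly smaller than the one below is normal (LVM keeps metadata on the PV), so the sketch only flags layers that are larger.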
Things that could have gone wrong:
  -- somebody resized a device underneath you. For example, if you
     had a raw AoE device (not sure what kind you have; do they do RAID
     underneath?) that was, e.g., 100GB, and you did a pvcreate,
     vgextend, etc., and then somebody resized the device to 50GB, you
     might not see problems until it started filling up.
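A small sketch of why a shrunk device can go unnoticed for a while, using the 100GB-to-50GB example. It mimics, in simplified form, the kind of bounds check that produces the "beyond end of device" message. The function name and the request offsets are made up for illustration, and 512-byte sectors are assumed.

```python
GB = 10 ** 9
SECTOR_BYTES = 512  # assumption: 512-byte sectors

def beyond_end(start_sector, nr_sectors, device_sectors):
    """True if a request of nr_sectors starting at start_sector would run
    past the last sector of the device."""
    return start_sector + nr_sectors > device_sectors

# The device after somebody shrank it from 100GB to 50GB.
shrunk_sectors = 50 * GB // SECTOR_BYTES

# A request 10GB into the volume still lands on real storage...
early = beyond_end(10 * GB // SECTOR_BYTES, 8, shrunk_sectors)
# ...but one 60GB in, which LVM still believes is valid, does not.
late = beyond_end(60 * GB // SECTOR_BYTES, 8, shrunk_sectors)

print("10GB offset beyond end:", early)
print("60GB offset beyond end:", late)
```

Until the filesystem starts allocating blocks past the new end of the device, every request passes the check and nothing looks wrong.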
--
Michael
More information about the Linux-PowerEdge mailing list