Thread: mce detected memory error

  1. #1

    mce detected memory error

    Hi Experts,
    I found an interesting issue in /var/log/messages. I installed SLES 12 SP3 for SAP on a Lenovo X3850 X6.

    2019-10-17T18:02:15.441498+07:00 hostname mcelog[5163]: Running trigger `socket-memory-error-trigger'
    2019-10-17T18:02:15.441503+07:00 hostname mcelog[5163]: Hardware event. This is not a software error.
    2019-10-17T18:02:15.441548+07:00 hostname mcelog[5163]: Corrected error
    2019-10-17T18:02:15.441575+07:00 hostname mcelog[5163]: Transaction: Memory read error
    2019-10-17T18:02:15.441579+07:00 hostname mcelog[5163]: MemCtrl: Corrected memory read error


    After searching for a solution, I found this (https://www.suse.com/support/kb/doc/?id=7022118). I added the kernel option it mentions (mce=ignore_ce).
    After that, the following "invalid MCA bank" messages appeared in the log:

    2019-10-29T00:33:32.041734+07:00 hostname kernel: [ 9.938351] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
    2019-10-29T00:33:32.041973+07:00 hostname kernel: [ 15.612800] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 0.
    2019-10-29T00:33:32.041974+07:00 hostname kernel: [ 15.612801] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 1.
    2019-10-29T00:33:32.041975+07:00 hostname kernel: [ 15.612801] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 2.
    2019-10-29T00:33:32.041980+07:00 hostname kernel: [ 15.612802] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 3.
    2019-10-29T00:33:32.041981+07:00 hostname kernel: [ 15.612803] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 4.
    2019-10-29T00:33:32.041982+07:00 hostname kernel: [ 15.612804] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 5.
    2019-10-29T00:33:32.041983+07:00 hostname kernel: [ 15.612805] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 6.
    2019-10-29T00:33:32.041984+07:00 hostname kernel: [ 15.612806] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 7.
    2019-10-29T00:33:32.041985+07:00 hostname kernel: [ 15.612806] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 8.
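
    One way to confirm that the option is actually active on the running kernel is to look at the kernel command line. This is only a minimal sketch: it assumes the command line is readable from /proc/cmdline, and the helper names (has_kernel_option, read_cmdline) are made up for illustration.

```python
def has_kernel_option(cmdline: str, option: str) -> bool:
    """Return True if `option` appears as a whole token on the kernel command line."""
    # Kernel parameters are whitespace-separated, so compare whole tokens
    # rather than substrings (avoids matching e.g. "xmce=ignore_ce").
    return option in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (assumed path)."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read()


if __name__ == "__main__":
    print(has_kernel_option(read_cmdline(), "mce=ignore_ce"))
```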

    How can I solve this?


    Thanks and regards

    Thomson Malau

  2. Re: mce detected memory error

    Hi
    Are you sure it's not a real hardware problem with the RAM? Have you tested the RAM or reseated it?
    Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)
    SUSE SLE, openSUSE Leap/Tumbleweed (x86_64) | GNOME DE
    If you find this post helpful and are logged into the web interface,
    please show your appreciation and click on the star below... Thanks!

  3. #3

    Re: mce detected memory error

    Hi Malcolm,
    Thanks for your reply.

    Quote Originally Posted by malcolmlewis View Post
    Hi
    Are you sure it's not a real hardware problem with the RAM?
    When we checked the hardware log, there were no memory errors or other failures.

    Quote Originally Posted by malcolmlewis View Post
    Have you tested the RAM or reseated it?
    I have already tested the system with a stress-test tool (stress-ng), and it ran well.


    Thomson Malau

  4. Re: mce detected memory error

    Hi
    Then perhaps you can go in and tweak the trigger?

    The configuration files are in /etc/mcelog/; for details, see the man pages (man mcelog.triggers, man mcelog.conf, etc.).
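
    Before changing the trigger, it may help to see how often corrected errors are actually being logged. The following is a rough sketch, assuming the syslog line format shown in the first post and that the log lives in /var/log/messages; the function name corrected_errors_per_day is hypothetical.

```python
import re
from collections import Counter

# Matches mcelog summary lines such as:
#   2019-10-17T18:02:15.441548+07:00 hostname mcelog[5163]: Corrected error
# and captures the date part of the timestamp.
CORRECTED = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2})T\S+\s+\S+\s+mcelog\[\d+\]:\s+Corrected error\s*$"
)


def corrected_errors_per_day(lines):
    """Count 'Corrected error' events reported by mcelog, grouped by date."""
    counts = Counter()
    for line in lines:
        match = CORRECTED.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Assumed log location; reading it usually requires root.
    with open("/var/log/messages", errors="replace") as log:
        for day, count in sorted(corrected_errors_per_day(log).items()):
            print(day, count)
```

    A steady stream of events on the same days would point back toward the hardware, while a handful of isolated events might be a reason to adjust the trigger instead.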
