64-bit linux : Error code 131 (0x83)


Profile [BMF] kdr_98
Joined: Jan 11 07
Posts: 6
Credit: 798,180
RAC: 0
Message 7382 - Posted 8 Dec 2007 22:46:55 UTC

    Last modified: 8 Dec 2007 22:47:29 UTC

    I have several WUs ending with error status 131.
    (This happens about twice a day.)
    Some examples :
    WU 5429534
    WU 5429290
    WU 5384334
    WU 5393589

    ____________

    Member of BOINC.BE

    Profile [BMF] kdr_98
    Joined: Jan 11 07
    Posts: 6
    Credit: 798,180
    RAC: 0
    Message 7388 - Posted 9 Dec 2007 16:37:55 UTC - in response to Message 7382.

      I have several WUs ending with error status 131.
      (This happens about twice a day.)
      Some examples :
      WU 5429534
      WU 5429290
      WU 5384334
      WU 5393589


      I think I have found the problem: one of the cores of my quad was running too hot. There were faults in other projects as well.

      ____________

      Member of BOINC.BE

      Dagorath
      Joined: Jan 7 07
      Posts: 381
      Credit: 3,365,400
      RAC: 0
      Message 7389 - Posted 9 Dec 2007 18:43:13 UTC - in response to Message 7388.


        I think I have found the problem: one of the cores of my quad was running too hot. There were faults in other projects as well.


        One core was too hot... interesting. How can that happen? How does one fix it? I am guessing the cores are arranged in a 2 x 2 matrix and the heat sink is not making good thermal contact in the area directly over the hot core, while it is making good thermal contact in the area over the other 3 cores.

        Profile [BMF] kdr_98
        Joined: Jan 11 07
        Posts: 6
        Credit: 798,180
        RAC: 0
        Message 7396 - Posted 10 Dec 2007 21:08:42 UTC - in response to Message 7389.

          Last modified: 10 Dec 2007 21:10:18 UTC


          I think I have found the problem: one of the cores of my quad was running too hot. There were faults in other projects as well.


          One core was too hot... interesting. How can that happen? How does one fix it? I am guessing the cores are arranged in a 2 x 2 matrix and the heat sink is not making good thermal contact in the area directly over the hot core, while it is making good thermal contact in the area over the other 3 cores.


          The first core was about 2 degrees hotter than the other 3, and when its temperature rose to 72°C it started giving faults. The other cores stayed below that temperature. I removed the heatsink and put it back with some thermal paste, and now it seems to be working better.
          Strangely, you could easily see this in the project under Windows.
          1 WU gave an application fault and restarted,
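
For anyone who wants to watch for the same condition, here is a minimal Python sketch of the check described above: flag any core at or above the 72°C point where faults started appearing on this machine. The function name, the `margin_c` parameter, and the sample readings are hypothetical, and the threshold is only what was observed on this one quad, not a general limit.

```python
# Temperature (degrees Celsius) at which faults were observed in this thread.
# This is an observation from one machine, not a specification.
FAULT_THRESHOLD_C = 72.0


def cores_at_risk(core_temps_c, threshold_c=FAULT_THRESHOLD_C, margin_c=0.0):
    """Return the indices of cores at or above (threshold_c - margin_c)."""
    limit = threshold_c - margin_c
    return [index for index, temp in enumerate(core_temps_c) if temp >= limit]


if __name__ == "__main__":
    # Hypothetical readings: core 0 about 2 degrees hotter than the other three.
    readings = [72.0, 70.0, 70.0, 70.0]
    print("Cores at or above threshold:", cores_at_risk(readings))
```

Passing a non-zero `margin_c` gives an early warning before a core actually reaches the threshold.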
          ____________

          Member of BOINC.BE

          Profile Webmaster Yoda
          Joined: Dec 31 06
          Posts: 81
          Credit: 4,544,249
          RAC: 0
          Message 7421 - Posted 14 Dec 2007 5:39:35 UTC

            Last modified: 14 Dec 2007 5:42:05 UTC

            I don't know the configuration of the dies, other than that they appear to sit beside each other (4 in a row).

            On my quads, temps tend to be highest on cores 0 and 3 - not sure whether that's the inside or outside ones.

            Right now core temps (degrees Celsius) are:

            Q6600 (liquid cooled) at 3.6GHz: 61 - 58 - 54 - 62
            Q6600 (air cooled) at 3.0GHz: 62 - 57 - 60 - 61

            So in both cases, cores 0 and 3 are the hottest (albeit only marginally on the slower one).
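
The comparison above can be reproduced with a short Python sketch that picks the two hottest cores and the hottest-to-coolest spread for each set of readings. The readings are the ones listed in this post; the function names are hypothetical and only for illustration.

```python
def hottest_cores(temps, count=2):
    """Return the indices of the `count` hottest cores, in ascending index order."""
    by_temp = sorted(range(len(temps)), key=lambda i: temps[i], reverse=True)
    return sorted(by_temp[:count])


def spread(temps):
    """Return the difference between the hottest and coolest core."""
    return max(temps) - min(temps)


# Core temperatures in degrees Celsius, cores 0 to 3, as quoted in the post.
machines = {
    "Q6600 (liquid cooled) at 3.6GHz": [61, 58, 54, 62],
    "Q6600 (air cooled) at 3.0GHz": [62, 57, 60, 61],
}

if __name__ == "__main__":
    for name, temps in machines.items():
        print(name, "hottest cores:", hottest_cores(temps), "spread:", spread(temps))
```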

            I don't think I've had many errors (except ABC work units crashing when the network goes down occasionally), though I don't look at each and every work unit.
            ____________


            Join the #1 Aussie Alliance on ABC@Home



            Copyright © 2013 University of Leiden