Computation errors...


jedirock
Joined: Sep 15 07
Posts: 6
Credit: 777,259
RAC: 0
Message 9196 - Posted 1 Nov 2008 18:08:58 UTC

    My machine seems to be having trouble crunching ABC WUs. Of the three that have been returned so far, two had errors and one was validated. These are the machine's tasks: http://abcathome.com/results.php?hostid=64997. It's running Fedora 8 x86_64 on a Core 2 Duo clocked at 2.53GHz with 2GB of DDR3 RAM. Both failed WUs quit with SIGSEGVs. Anyone have an idea what's going on?

    mikey
    Joined: Aug 8 08
    Posts: 1244
    Credit: 4,574,433
    RAC: 1,476
    Message 9199 - Posted 1 Nov 2008 20:54:48 UTC - in response to Message 9196.

      My machine seems to be having trouble crunching ABC WUs. Of the three that have been returned so far, two had errors and one was validated. These are the machine's tasks: http://abcathome.com/results.php?hostid=64997. It's running Fedora 8 x86_64 on a Core 2 Duo clocked at 2.53GHz with 2GB of DDR3 RAM. Both failed WUs quit with SIGSEGVs. Anyone have an idea what's going on?


      The BOINC Wiki says:
      "ERR_FILE_TOO_BIG -131

      One of the output files is bigger than the maximum set by the project for upload.
      BOINC will not try to upload this file.

      Solution: Go to the project's forums and report this behavior."

      You have done the reporting part!

      When was the last time you turned your PC off? I had some of these errors, and turning the PC off for about an hour seemed to fix the problems.

      fractal
      Joined: Aug 16 08
      Posts: 3
      Credit: 1,061,242
      RAC: 0
      Message 9200 - Posted 1 Nov 2008 20:55:23 UTC

        It may be a batch of bad WUs. I have something similar on http://abcathome.com/results.php?hostid=64801 and http://abcathome.com/results.php?hostid=59329

        It may be a problem with multiple projects as I am converting those machines over from a different project and may not have selected "keep applications in memory when suspended" on those machines when the errors occurred.

        jedirock
        Joined: Sep 15 07
        Posts: 6
        Credit: 777,259
        RAC: 0
        Message 9202 - Posted 1 Nov 2008 22:24:43 UTC

          I could believe it was a batch of bad WUs. The machine was on for about an hour before the errors occurred. I also set suspended applications to be kept in memory, as I read that that fixed some problems before. The next two WUs are in the pipeline at 32% and 16%; we'll see how they go.

          S@NL - FilmFreak
          Forum moderator
          Project administrator
          Project developer
          Joined: Mar 14 07
          Posts: 33
          Credit: 318,089
          RAC: 0
          Message 9214 - Posted 4 Nov 2008 8:50:48 UTC - in response to Message 9202.

            Thanks for reporting the errors and please keep an eye on this. If there are any more errors please let me know, I'll look into it.
            ____________
            "Life is short and meaningless, unless you make the best of it."

            jedirock
            Joined: Sep 15 07
            Posts: 6
            Credit: 777,259
            RAC: 0
            Message 9215 - Posted 4 Nov 2008 16:39:35 UTC - in response to Message 9214.

              Last modified: 4 Nov 2008 16:48:22 UTC

              Thanks for reporting the errors and please keep an eye on this. If there are any more errors please let me know, I'll look into it.

              I haven't seen anything more yet, but I enabled the keep applications in memory option, and my Linux 64 box has been booted into XP over the past few days to clear out another cache of work. I'll keep an eye on it.

              EDIT: 5 more WUs just returned, all valid. Whatever it is seems to have either disappeared or was related to memory suspension. BTW, I do not have CPU throttling enabled.

              Len Goddard
              Joined: Nov 19 08
              Posts: 3
              Credit: 1,167
              RAC: 0
              Message 9297 - Posted 19 Nov 2008 23:00:23 UTC

                I subscribed to this project today (at work). So far my system is reporting 13 computation errors out of 13 units.

                The system is a Pentium 4 running Ubuntu Linux (8.04). Please let me know if any diagnostic information is required and I will append it tomorrow when I get back to the office.

                Rebirther
                Joined: Nov 21 06
                Posts: 26
                Credit: 225,395
                RAC: 0
                Message 9299 - Posted 20 Nov 2008 12:11:37 UTC - in response to Message 9297.

                  I subscribed to this project today (at work). So far my system is reporting 13 computation errors out of 13 units.

                  The system is a Pentium 4 running Ubuntu Linux (8.04). Please let me know if any diagnostic information is required and I will append it tomorrow when I get back to the office.


                  Do you have this problem in other projects too?
                  Perhaps it could be an issue with damaged RAM.

                  mikey
                  Joined: Aug 8 08
                  Posts: 1244
                  Credit: 4,574,433
                  RAC: 1,476
                  Message 9301 - Posted 20 Nov 2008 12:53:06 UTC - in response to Message 9297.

                    I subscribed to this project today (at work). So far my system is reporting 13 computation errors out of 13 units.

                    The system is a Pentium 4 running Ubuntu Linux (8.04). Please let me know if any diagnostic information is required and I will append it tomorrow when I get back to the office.


                    Did you tell your IT folks that you were doing this? You may need permissions to let it do what it needs to do. I checked the latest unit and it had lots of errors in it.

                    Len Goddard
                    Joined: Nov 19 08
                    Posts: 3
                    Credit: 1,167
                    RAC: 0
                    Message 9302 - Posted 20 Nov 2008 12:54:35 UTC

                      No problems with Einstein@Home. I found the "retain in memory" option and I'm trying that, although I would prefer not to have to pin memory in that way.

                      zombie67 [MM]
                      Joined: Dec 27 06
                      Posts: 111
                      Credit: 2,074,629
                      RAC: 286
                      Message 9303 - Posted 20 Nov 2008 13:31:49 UTC - in response to Message 9302.

                        No problems with Einstein@Home. I found the "retain in memory" option and I'm trying that, although I would prefer not to have to pin memory in that way.


                        It doesn't actually "pin" memory. If that RAM is needed, the OS will page that data out to the HD.
                        ____________
                        Dublin, CA
                        Team SETI.USA

                        pensierolaterale
                        Joined: Sep 23 08
                        Posts: 3
                        Credit: 169,369
                        RAC: 0
                        Message 9305 - Posted 20 Nov 2008 20:20:19 UTC

                          I would like to report that I am experiencing a lot of these computation errors too.

                          my results:
                          http://abcathome.com/results.php?userid=23669

                          ciao

                          Len Goddard
                          Joined: Nov 19 08
                          Posts: 3
                          Credit: 1,167
                          RAC: 0
                          Message 9308 - Posted 21 Nov 2008 8:21:15 UTC

                            The retain in memory option appears to have solved the problem

                            KSMarksPsych
                            Joined: Nov 21 06
                            Posts: 47
                            Credit: 1,755,780
                            RAC: 0
                            Message 9309 - Posted 21 Nov 2008 8:57:44 UTC - in response to Message 9305.

                              I would like to report that I am experiencing a lot of these computation errors too.

                              my results:
                              http://abcathome.com/results.php?userid=23669

                              ciao


                              Have you tried leaving apps in memory?
                              ____________
                              Kathryn :o)
                              The BOINC FAQ Service
                              The Unofficial BOINC Wiki
                              The Trac System
                              More BOINC information than you can shake a stick of RAM at.

                              mikey
                              Joined: Aug 8 08
                              Posts: 1244
                              Credit: 4,574,433
                              RAC: 1,476
                              Message 9311 - Posted 21 Nov 2008 10:45:39 UTC - in response to Message 9308.

                                The retain in memory option appears to have solved the problem


                                Good. I wish it were the default instead of something people have to check when they subscribe to more than one project per machine. It would solve a lot of problems. The checkpointing seems to need some tweaking; that is what is supposed to pick right up where you left off when you come back to a project after switching.

                                pensierolaterale
                                Joined: Sep 23 08
                                Posts: 3
                                Credit: 169,369
                                RAC: 0
                                Message 9312 - Posted 21 Nov 2008 12:23:42 UTC - in response to Message 9309.

                                  I would like to report that I am experiencing a lot of these computation errors too.

                                  my results:
                                  http://abcathome.com/results.php?userid=23669

                                  ciao


                                  Have you tried leaving apps in memory?


                                  Yes since yesterday, I will check the next results to see if it is working.

                                  ktf
                                  Joined: Feb 4 07
                                  Posts: 14
                                  Credit: 139,932
                                  RAC: 0
                                  Message 9313 - Posted 21 Nov 2008 16:50:15 UTC

                                    I still have these -131 errors... we're wasting power here ^^

                                    Dagorath
                                    Joined: Jan 7 07
                                    Posts: 381
                                    Credit: 3,365,400
                                    RAC: 0
                                    Message 9315 - Posted 21 Nov 2008 22:23:32 UTC - in response to Message 9313.

                                      I still have these -131 errors... we're wasting power here ^^


                                      Have you selected "leave applications in memory when suspended" in your preferences?

                                      Remember there are 2 ways to select it:

                                      1) If you adjust that setting in your website preferences then you need to click "Update preference" on the web page and then select that project in BOINC manager and click "Update". If you do not do all that then the setting will not take effect.

                                      2) If you select it in your preferences in BOINC manager then you also have to click Advanced -> Read Local Prefs File to make the setting take effect.
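For anyone who would rather set this from a script than through the menus, here is a minimal sketch of the second route. It rests on assumptions not confirmed in this thread: that the local prefs file is called `global_prefs_override.xml`, that it lives in the BOINC data directory, and that the option is stored as a `<leave_apps_in_memory>` element. The helper name `write_override` is made up for illustration, and a temporary directory stands in for the real data directory, so check your own client before relying on it.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_override(data_dir: Path, leave_in_memory: bool = True) -> Path:
    """Write a minimal local preferences override file into data_dir."""
    root = ET.Element("global_preferences")
    # Assumed element name for "leave applications in memory when suspended".
    flag = ET.SubElement(root, "leave_apps_in_memory")
    flag.text = "1" if leave_in_memory else "0"
    # Assumed file name for the local prefs file read by the client.
    path = data_dir / "global_prefs_override.xml"
    path.write_text(ET.tostring(root, encoding="unicode"))
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the real BOINC data directory.
    with tempfile.TemporaryDirectory() as tmp:
        out = write_override(Path(tmp))
        print(out.read_text())
```

As with the manager route, the client would still need to be told to re-read the file (Advanced -> Read Local Prefs File) before the setting takes effect.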

                                      ktf
                                      Joined: Feb 4 07
                                      Posts: 14
                                      Credit: 139,932
                                      RAC: 0
                                      Message 9318 - Posted 22 Nov 2008 14:32:47 UTC

                                        Yes, it is turned on, but I actually use this computer and shut it down every night, so it seems to me it doesn't make any sense to turn the option on. Furthermore, I use all of the available memory for foreground programs, so it will be paged out to the HD anyway.

                                        jedirock
                                        Joined: Sep 15 07
                                        Posts: 6
                                        Credit: 777,259
                                        RAC: 0
                                        Message 9319 - Posted 22 Nov 2008 16:05:16 UTC - in response to Message 9318.

                                          Last modified: 22 Nov 2008 16:05:34 UTC

                                          ...but I actually use this computer and shut it down every night...

                                          That'd be why. Shutting it down flushes the memory, while most people on here will leave their computers running 24/7. ABC apps really need some better checkpointing... If you guys need help, I'd be glad to oblige.
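To make the checkpointing point concrete, here is a rough sketch of the general idea. It is not the ABC application's code and does not use the BOINC API: the function names, the JSON state file, and the toy "sum the integers" workload are all made up for illustration. The job saves its progress to disk every few steps, so a shutdown loses at most the work done since the last save.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def load_checkpoint(path: Path) -> dict:
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"next_i": 0, "total": 0}


def save_checkpoint(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so a crash never leaves a half-written file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    os.replace(tmp, path)


def run(path: Path, n: int, stop_at: Optional[int] = None, every: int = 10) -> Optional[int]:
    """Sum 0..n-1, checkpointing every `every` steps; stop_at simulates a shutdown."""
    state = load_checkpoint(path)
    for i in range(state["next_i"], n):
        if stop_at is not None and i == stop_at:
            return None  # simulated power-off: work since the last checkpoint is lost
        state["total"] += i
        state["next_i"] = i + 1
        if state["next_i"] % every == 0:
            save_checkpoint(path, state)
    return state["total"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = Path(tmp) / "state.json"
        run(ckpt, 100, stop_at=57)   # "shut down" partway through
        result = run(ckpt, 100)      # restart resumes from the last checkpoint
        assert result == sum(range(100))
        print(result)
```

With checkpoints like this on disk, an application would not depend on staying in memory to survive a suspend or a nightly shutdown.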

                                          Dagorath
                                          Joined: Jan 7 07
                                          Posts: 381
                                          Credit: 3,365,400
                                          RAC: 0
                                          Message 9320 - Posted 22 Nov 2008 16:59:43 UTC - in response to Message 9318.

                                            Last modified: 22 Nov 2008 17:02:11 UTC

                                            KTF,

                                            If ABC is the only project that gives you problems then detach it for now. The current run is nearly done and they have said there will definitely be a new app for the next run. Perhaps their new app will work better for you.

                                            ktf
                                            Joined: Feb 4 07
                                            Posts: 14
                                            Credit: 139,932
                                            RAC: 0
                                            Message 9322 - Posted 23 Nov 2008 10:20:24 UTC

                                              Last modified: 23 Nov 2008 10:20:43 UTC

                                              Okay, I'll detach for now :) I'll be back soon ^^

                                              pensierolaterale
                                              Joined: Sep 23 08
                                              Posts: 3
                                              Credit: 169,369
                                              RAC: 0
                                              Message 9328 - Posted 25 Nov 2008 20:51:53 UTC - in response to Message 9312.

                                                Last modified: 25 Nov 2008 20:52:41 UTC



                                                Have you tried leaving apps in memory?


                                                Yes since yesterday, I will check the next results to see if it is working.


                                                My last results are ok with the keep in memory option enabled

                                                mikey
                                                Joined: Aug 8 08
                                                Posts: 1244
                                                Credit: 4,574,433
                                                RAC: 1,476
                                                Message 9333 - Posted 26 Nov 2008 10:20:28 UTC - in response to Message 9328.



                                                  Have you tried leaving apps in memory?


                                                  Yes since yesterday, I will check the next results to see if it is working.


                                                  My last results are ok with the keep in memory option enabled


                                                  For some reason that seems to do the trick a lot of the time.

                                                  Copyright © 2013 University of Leiden