New application!


wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10041 - Posted 5 Jun 2009 0:23:22 UTC
Last modified: 5 Jun 2009 0:27:38 UTC

Hi all,

The new and improved abc finder application (previous announcement) is ready for testing! We would really appreciate it if you could help test it. Hopefully the testing will not take long, since the new client is basically the well-tested old abc@home client code with a new algorithm inserted.


Right now we have win32, linux i686 and linux x86_64 binaries available. Mac OS X and win64 binaries will follow as soon as we have build machines for those platforms set up again, hopefully still in time for the tests.


Since our old abcbeta machine has sadly died completely, I have set up a new (very basically configured) project with about 850 randomly selected new workunits and a quorum of two results per workunit. The URL to attach to is:

http://abctest.math.leidenuniv.nl/abcbeta/


Please let me know of any issues you encounter, or things you think could be improved. I'll be online most of tomorrow and the weekend to address any problems that pop up. I added some extra comments and questions below.


Thanks!
-Willem Jan




What may need special attention:

The amount of credits:
Feedback on this is very much welcomed. Do you think the new application gives the right amount of credit?

Stability:
Does the application crash? If you pause/stop it, does it resume properly?

Progress report:
The new application first spends a couple of seconds on determining how many potential triples will be scanned in this workunit. When this scan is complete, it uses this number to estimate how much time remains. This seems to be fairly accurate, since the amount of time per triple doesn't change much inside a workunit, but please let me know if it over/underestimates the required time for you.
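
For anyone who wants to see the idea concretely, here is a minimal sketch of such an estimate. It is not the actual abc@home client code; the function and parameter names are hypothetical, and it only assumes what is described above: the total number of triples is known after the initial scan, and the time per triple is roughly constant.

```python
def estimate_remaining_seconds(triples_done, triples_total, elapsed_seconds):
    """Estimate remaining runtime, assuming a constant cost per triple."""
    if triples_done <= 0:
        return None  # nothing scanned yet, so there is no basis for an estimate
    # remaining = elapsed * (triples left / triples done)
    return elapsed_seconds * (triples_total - triples_done) / triples_done


def fraction_done(triples_done, triples_total):
    """Progress as a fraction in [0, 1], suitable for a progress bar."""
    if triples_total <= 0:
        return 0.0
    return min(triples_done / triples_total, 1.0)
```

With 250 of 1000 triples scanned after one hour, this gives a progress of 25% and three more hours to go. If the time per triple drifts inside a workunit, the estimate drifts with it.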

Total time for a workunit:
We have tried to make all workunits roughly the same length (something in the order of 4 hours, depending a lot on CPU speed of course), and I think it should be more consistent than with the old application. However, there are still some workunits near the edges of the (two dimensional) search area that are longer or shorter than expected, because our time heuristics don't work as well in corner cases. If you run into extreme examples, or think the default length should be shorter or longer, please let me know.




Some more technical notes:

I looked again at the possibility of a CUDA GPU version, but it doesn't seem like a good fit, since our new algorithm requires fast random memory access, and operates on large integers. GPUs seem to be more suited for localized memory access and floating point data, on the other hand. So sadly I do not think GPU support would increase performance.

The code fundamentally uses 64 bit integers since we are looking for ABC triples that don't fit in 32 bit integers. In the new code we included some fast 64 bit arithmetic that is not possible on 32 bit machines, so the speed difference between 64 and 32 bit machines will likely be larger than with the old algorithm.
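
To illustrate why 64 bit arithmetic costs more on a 32 bit machine, here is a sketch of a full 64x64 bit multiplication built only from 32 bit pieces, which is roughly the work a 32 bit CPU has to do in several steps, while a 64 bit x86 CPU can produce the full 128 bit product with a single multiply instruction. This is a generic schoolbook technique and an assumption about the kind of operation involved, not code taken from the abc client.

```python
MASK32 = 0xFFFFFFFF


def mul64_via_32bit_limbs(a, b):
    """Return (high, low) 64-bit halves of the 128-bit product a * b.

    Only 32x32 -> 64 bit partial products are used, as on a 32 bit machine.
    """
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    # Four partial products instead of one multiplication.
    lo_lo = a_lo * b_lo
    lo_hi = a_lo * b_hi
    hi_lo = a_hi * b_lo
    hi_hi = a_hi * b_hi

    # Sum the middle column and keep track of the carry into the high half.
    mid = (lo_lo >> 32) + (lo_hi & MASK32) + (hi_lo & MASK32)
    low = (lo_lo & MASK32) | ((mid & MASK32) << 32)
    high = hi_hi + (lo_hi >> 32) + (hi_lo >> 32) + (mid >> 32)
    return high, low
```

The result can be checked against Python's built-in big integers: `(high << 64) | low` should equal `a * b` for any two values below 2^64.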

There were some unexplained crashes with the old application on certain linux distributions. If my guess on what caused those is correct, they should not occur with the new application.

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10042 - Posted 5 Jun 2009 0:39:33 UTC

I'll start off with the first bug report myself:
It seems the workunit metadata (number of triples found, total number of triples to scan and such) doesn't always get recomputed properly after aborting and resuming, which will either cause a reset of the workunit's computation, or a validation failure at the end.

I'll fix this first thing in the morning tomorrow.

DoctorNow User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 11
ID: 24
Credit: 85,308
RAC: 1
Message 10043 - Posted 5 Jun 2009 5:57:06 UTC

Great news! :-)

Good to hear Beta is back.
Is it possible for us to tell the stats sites to pick up the stats, or do you think it's too early?
____________
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10044 - Posted 5 Jun 2009 9:10:41 UTC

Ready to test... first WUs are already running. :-)
____________
Lovely greetings from Cori

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10046 - Posted 5 Jun 2009 9:50:05 UTC

When shutting down BOINC & restarting it again on my Ubuntu Linux Box, the Progress & To Completion showed nothing but the CPU Time remained the same as it was before shutting down.

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10047 - Posted 5 Jun 2009 10:14:22 UTC - in response to Message ID 10046.

When shutting down BOINC & restarting it again on my Ubuntu Linux Box, the Progress & To Completion showed nothing but the CPU Time remained the same as it was before shutting down.

Almost the same on my Vista x64 host. Only difference:
After restarting BOINC the CPU time picked up where it left off but the progress bar jumped to 100%. ;-)
The WUs are crunching along happily so far though.
____________
Lovely greetings from Cori

Crystal Pellet
private message
Joined: Jan 16, 2007
Posts: 14
ID: 1251
Credit: 689,531
RAC: 0
Message 10048 - Posted 5 Jun 2009 11:54:57 UTC
Last modified: 5 Jun 2009 12:00:34 UTC

I suspended a task after 34 minutes runtime and 13,549% progress.
Resumed after a minute and the cpu was running @ 100%, but the progress stayed on 13,549%. To test whether progress would move again after another 34 minutes, I let it run to 74 minutes runtime; still no progress.
Restarted the Boinc client.
Progress restarted at 0%, but cpu went on with the initial 74 minutes.
Let it run now without interruption. Atm at 35% and 170 minutes (incl. initial 74min)
Linux64 AMD Phenom X4 2,1GHz.

I think this is the failure Willem Jan already mentioned, but here the behaviour is described in a bit more detail.

Sirius B User profile image
Avatar
private message
Joined: Mar 12, 2008
Posts: 13
ID: 16872
Credit: 1,007,534
RAC: 2,724
Message 10049 - Posted 5 Jun 2009 12:09:29 UTC

After attaching to the beta project, had numerous wu's d/l'ed. At this moment, have 4 showing 100% completion, but they are still crunching. Original completion time showed 00:48:22 & currently at 4hrs:30+. Is this to be expected?
____________

zombie67
Avatar
private message
Joined: Dec 27, 2006
Posts: 111
ID: 330
Credit: 1,735,320
RAC: 0
Message 10050 - Posted 5 Jun 2009 13:30:47 UTC

I am also seeing apps that show as 100%, but still running.
____________
Dublin, CA
Team SETI.USA

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10051 - Posted 5 Jun 2009 13:35:15 UTC - in response to Message ID 10050.

I am also seeing apps that show as 100%, but still running.


Same here on a Windows XP Pro 64 Bit Box, have 2 @ 100% but still running. I did have 2 64 Bit Linux finish though in around 4 hours running time.

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10052 - Posted 5 Jun 2009 13:44:56 UTC

On my XP x64 lappy I have one WU running @56.5% after 4 hours...
____________
Lovely greetings from Cori

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10053 - Posted 5 Jun 2009 13:55:59 UTC - in response to Message ID 10052.

On my XP x64 lappy I have one WU running @56.5% after 4 hours...


I'd fire that Lappy ... ;P ... the 2 that finished in 4 hr's were on a Lappy of mine ... :)

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10054 - Posted 5 Jun 2009 13:58:23 UTC

It sounds likely those are symptoms of the resume bug. I found the problem, and it was something that occurred only after it printed the (correct) numbers to stderr.txt, which was why I missed it before. New builds with a fix should be up soon.

The first bunch of workunits is still a bit variable in length I think. The workunits that have been completed so far have taken between 1.9 and 9.9 hours. Hopefully the runtime will stabilize after the first couple of dozen from the test set.

-Willem Jan

zombie67
Avatar
private message
Joined: Dec 27, 2006
Posts: 111
ID: 330
Credit: 1,735,320
RAC: 0
Message 10055 - Posted 5 Jun 2009 14:27:37 UTC

Should we abort the tasks in progress, then?
____________
Dublin, CA
Team SETI.USA

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10056 - Posted 5 Jun 2009 14:28:06 UTC
Last modified: 5 Jun 2009 14:29:22 UTC

New linux i686 and x86_64 clients (version 1.01) are up that fix the resume problem. If you have a running task that has been suspended and resumed with 1.00, that has unfortunately been corrupted, so it is best to abort it.

I'll build a new Win32 binary when I get home in a couple of hours, and will try to do a Win64 at the same time.

-Willem Jan

MarymommyP User profile image
private message
Joined: Oct 6, 2007
Posts: 2
ID: 10560
Credit: 18,680
RAC: 22
Message 10057 - Posted 5 Jun 2009 14:59:09 UTC

getting through the task but there is no end date/time.

Crystal Pellet
private message
Joined: Jan 16, 2007
Posts: 14
ID: 1251
Credit: 689,531
RAC: 0
Message 10058 - Posted 5 Jun 2009 15:37:42 UTC
Last modified: 5 Jun 2009 15:39:45 UTC

The task mentioned in my former post ended successfully as far as I can see after 6.5 hours runtime. This is including the 74 minutes double running.
After 100% normal ending and uploading.

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10059 - Posted 5 Jun 2009 15:49:37 UTC - in response to Message ID 10058.

The task mentioned in my former post ended successfully as far as I can see after 6.5 hours runtime. This is including the 74 minutes double running.
After 100% normal ending and uploading.


Yes, you're right; everything seems ok on that run. In the stderr.txt you can see that after the second resume the code detected the state file was inconsistent (because of the 1.00 bug), and therefore it restarted from scratch.
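
As an illustration of that kind of check, here is a minimal sketch of a checkpoint loader that refuses an inconsistent state file and falls back to starting from scratch. The JSON layout and the field names (`wu_id`, `triples_total`, `triples_done`) are hypothetical; the real abc_sieve_state format is not shown in this thread.

```python
import json
import os
import tempfile


def load_state(path, expected_wu_id, triples_total):
    """Return the saved state if it is consistent with this workunit, else None."""
    try:
        with open(path) as f:
            state = json.load(f)
    except (OSError, ValueError):
        return None  # missing or unreadable file: start from scratch
    if not isinstance(state, dict):
        return None
    done = state.get("triples_done")
    consistent = (
        state.get("wu_id") == expected_wu_id
        and state.get("triples_total") == triples_total
        and isinstance(done, int)
        and 0 <= done <= triples_total
    )
    return state if consistent else None


def save_state(path, state):
    """Write the state to a temporary file, then rename it over the old one,
    so an interrupted write cannot leave a half-written state file behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)
```

A caller would treat a `None` result the same way the log above reports it: start the workunit from scratch rather than continue from numbers that cannot be trusted.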

-Willem Jan

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10060 - Posted 5 Jun 2009 16:08:13 UTC

Hummmmmmm, I think I'll be moving on now if .001 Credits is all you're giving for 5 Hours of Processing. Every WU I have Validated has got this amount so far ...

Task ID 4062
Name abc_sieve_wu_00002609_0
Workunit 1887
Created 4 Jun 2009 22:46:15 UTC
Sent 5 Jun 2009 7:25:54 UTC
Received 5 Jun 2009 14:29:13 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 22
Report deadline 8 Jun 2009 7:25:54 UTC
CPU time 18801.66703
stderr out <core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
[08:35:43 ABC] starting abc client code_version 390
[08:35:43 ABC] happy fishing - the ABC-crew!
[08:35:43 ABC] detected a little endian machine at runtime
[08:35:43 ABC] retrieved parameters: id = 2609, rx in [1,1794), ry in [15835,17223), gcd(x,210) = 1, c in [1,1000000000000000000), bs 3000, sb 200000
[08:35:43 ABC] can't open abc_sieve_state to read header data
[08:35:43 ABC] starting from scratch
[13:49:41 ABC] finished with hits=8513

</stderr_txt>
]]>

Validate state Valid
Claimed credit 174.375278363863
Granted credit 0.001
application version 1.00

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10061 - Posted 5 Jun 2009 16:08:46 UTC - in response to Message ID 10053.
Last modified: 5 Jun 2009 16:10:29 UTC

On my XP x64 lappy I have one WU running @56.5% after 4 hours...


I'd fire that Lappy ... ;P

No, please... not my Dell!


... the 2 that finished in 4 hr's were on a Lappy of mine ... :)

My WU on the lappy is now @89.425% after 6:10 hrs!
PS. Damn, I have forgotten if that WU is a restarted one or if it's really that long.
I think this one has been running through from the beginning... I want at least one 'good' WU.
____________
Lovely greetings from Cori

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10062 - Posted 5 Jun 2009 16:11:47 UTC
Last modified: 5 Jun 2009 16:24:42 UTC

I want at least one 'good' WU


You'll need 100 good ones to get on the Board though with the .001 Credits being Granted ... hahaha ;)

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10063 - Posted 5 Jun 2009 16:24:52 UTC

Oops, 0.001 credits? I'm looking into that now.

-Willem Jan


P.S. Version 1.01 for Win32 is now online, fixing the resume problem.

Sirius B User profile image
Avatar
private message
Joined: Mar 12, 2008
Posts: 13
ID: 16872
Credit: 1,007,534
RAC: 2,724
Message 10064 - Posted 5 Jun 2009 16:27:06 UTC
Last modified: 5 Jun 2009 16:29:52 UTC

The 4 on my workstation are close to 9hrs & still crunching.

3 on my server have completed......OUCH!!!!

2019
2019
2018

Both wu's have the same id but different task id's 4327 & 4326

Edit: - it looks like I was my own wingman :)
____________

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10065 - Posted 5 Jun 2009 16:28:13 UTC - in response to Message ID 10063.

... P.S. Version 1.01 for Win32 is now online, fixing the resume problem.

Can I finish the two old (1.0) WUs I have still crunching, they seem to have run fine so far? ;-)

PS. But lappy has to test the new ones as well... or maybe I'd better wait until the Win 64-bit app is ready. ;-)))
____________
Lovely greetings from Cori

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10066 - Posted 5 Jun 2009 16:32:48 UTC - in response to Message ID 10065.

... P.S. Version 1.01 for Win32 is now online, fixing the resume problem.

Can I finish the two old (1.0) WUs I have still crunching, they seem to have run fine so far? ;-)


Yes, if they never suspended/resumed, everything is fine. The only change between 1.00 and 1.01 was the resume fix.

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10067 - Posted 5 Jun 2009 16:44:08 UTC - in response to Message ID 10066.
Last modified: 5 Jun 2009 17:08:35 UTC

... P.S. Version 1.01 for Win32 is now online, fixing the resume problem.

Can I finish the two old (1.0) WUs I have still crunching, they seem to have run fine so far? ;-)


Yes, if they never suspended/resumed, everything is fine. The only change between 1.00 and 1.01 was the resume fix.


Cool. My lappy will finish the WU in about 15 minutes. After 7 hrs, wow... :-)
The quad has started later and should finish in 35 minutes with a run time of a bit over 4 hrs.

EDIT: Lappy's result :-)
____________
Lovely greetings from Cori

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10069 - Posted 5 Jun 2009 18:10:19 UTC
Last modified: 5 Jun 2009 18:18:10 UTC

1.01 binaries for new platforms are up: Win64, OS X i686, OS X PPC.

The Win64 binary doesn't yet have all 64 bit optimizations as I haven't ported the assembly parts from gcc syntax to MSVC syntax yet. Once that is done, it should be a bit faster than 1.01. Even 1.01 should already give a very decent speed-up over the Win32 version, though.

A colleague is working on a 64 bit Mac OS X 10.5 client, so that should be next.

-Willem Jan

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10070 - Posted 5 Jun 2009 19:00:38 UTC

What about the Credit Issue of only Granting .001 Credits ???

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10072 - Posted 5 Jun 2009 19:50:25 UTC - in response to Message ID 10070.

What about the Credit Issue of only Granting .001 Credits ???


I've made a change to the server just now which may fix it. We'll see if it worked when the next workunit is done.

-Willem Jan

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10073 - Posted 5 Jun 2009 22:07:23 UTC

Thanks for the help so far! On the client side, the results that have been returned all look good, except for the resumed tasks broken by v1.00.

On the server side, there is the 0.001 credit issue which I hope my last server change improved. The amount of granted credits will probably need tweaking, but let's first get it into numbers that aren't clearly wrong.

There's also still a bug in the assimilator as visible in the server status, but I hope to tackle that soon.

I'll be back tomorrow morning.

-Willem Jan



P.S. The 64 bit Mac OS X client is now up too.

zombie67
Avatar
private message
Joined: Dec 27, 2006
Posts: 111
ID: 330
Credit: 1,735,320
RAC: 0
Message 10074 - Posted 6 Jun 2009 5:15:49 UTC

I am testing now with:

64 bit leopard
32 bit leopard
PPC/tiger
____________
Dublin, CA
Team SETI.USA

Sirius B User profile image
Avatar
private message
Joined: Mar 12, 2008
Posts: 13
ID: 16872
Credit: 1,007,534
RAC: 2,724
Message 10075 - Posted 6 Jun 2009 6:09:51 UTC



As you can see from the screenshot, it looks like I'm my own wingman on numerous units. Also, will these complete or timeout?

Should they be aborted?
____________

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10076 - Posted 6 Jun 2009 9:24:22 UTC - in response to Message ID 10075.
Last modified: 6 Jun 2009 9:28:56 UTC


As you can see from the screenshot, it looks like I'm my own wingman on numerous units. Also, will these complete or timeout?

Should they be aborted?


With which client is this? If it is with the 1.00 binary, I'm afraid it's best to abort the 100%-and-running ones.


It seems the 0.001 credit issue is fixed now, by the way. I also changed the project config settings so now it should no longer be possible to be your own wingman. I had forgotten to change this from when I was the only user on the project for early testing.



-Willem Jan

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10077 - Posted 6 Jun 2009 10:20:50 UTC

It seems the 0.001 credit issue is fixed now


Yes it has, Thanks ... :)

Sirius B User profile image
Avatar
private message
Joined: Mar 12, 2008
Posts: 13
ID: 16872
Credit: 1,007,534
RAC: 2,724
Message 10079 - Posted 6 Jun 2009 13:07:06 UTC - in response to Message ID 10076.


As you can see from the screenshot, it looks like I'm my own wingman on numerous units. Also, will these complete or timeout?

Should they be aborted?


With which client is this? If it is with the 1.00 binary, I'm afraid it's best to abort the 100%-and-running ones.


It seems the 0.001 credit issue is fixed now, by the way. I also changed the project config settings so now it should no longer be possible to be your own wingman. I had forgotten to change this from when I was the only user on the project for early testing.

-Willem Jan


Just got back from london & found that several have reported in, so will leave them running, unless you think it best to abort them?

I was thinking of letting them run until their deadline of 07:36 8/6/09 & aborting those left as I don't think that all will complete, even though they are running on 3 quads only.
____________

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10080 - Posted 6 Jun 2009 15:01:17 UTC - in response to Message ID 10077.

It seems the 0.001 credit issue is fixed now


Yes it has, Thanks ... :)


Yup, I can confirm my very 1st granted result, too! :-)
(The rest is all pending, so come on wing-men...! *LOL*)


PS. I noticed the quorum of two uses the lower value for both WUs to grant right now, but I guess the credits adjustment will come later so this isn't a complaint. *grin*
____________
Lovely greetings from Cori

[AF>EDLS>Ouest]_Damien User profile image
Avatar
private message
Joined: Nov 21, 2006
Posts: 13
ID: 95
Credit: 6,774
RAC: 0
Message 10081 - Posted 6 Jun 2009 17:18:15 UTC - in response to Message ID 10075.

As you can see from the screenshot, it looks like I'm my own wingman on numerous units. Also, will these complete or timeout?

Should they be aborted?


I had the same problem, but only after a reboot of my PC while they were being computed. Then I aborted these WUs that showed as "finished" but were still "in progress".

____________

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10082 - Posted 7 Jun 2009 10:15:45 UTC
Last modified: 7 Jun 2009 10:16:40 UTC

The WU's being sent are messing up the ability to get work from other Projects because the estimates of how long they're going to take to run are way too high. Some say 60+ Hours & then BOINC thinks you have all this work on your Computer & won't send you any work from the other Projects.

Sorry to say I'm just going to Abort them, even the ones in Progress because I think they are taking too long to run when compared to the ones I was getting a few days ago.

Also the Credits are a mess, you get 10 Credits Per Hour for 1 WU & 25-30 for the next, using the lowest claimed is the worst way there is to give Credits. I know it keeps the Cheaters in line somewhat but it screws everybody else too ... IMO

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10083 - Posted 7 Jun 2009 10:52:52 UTC

PoorBoy: I understand completely. Thanks for the computed workunits and the reports. They'll be very useful for solving this problem. Do you have the names (abc_sieve_wu_XXXXXXXX) of specific workunits that are wrong? That would help with determining the cause.


In any case, there definitely seems to be something wrong with the initial runtime estimates. One aspect of this appears to be that the number of FLOPS of a host (which BOINC bases the run time estimate on) is often not a good indicator of the actual speed, since for example 32/64 bit differences can be very large.
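
A rough sketch of why that goes wrong: as far as I understand it, BOINC's initial guess is essentially the workunit's estimated floating point operation count (`rsc_fpops_est`) divided by the host's benchmarked FLOPS. The numbers below are made up for illustration, and the `integer_speed_factor` parameter is a hypothetical stand-in for how well a host handles this application's 64 bit integer work compared to what its floating point benchmark suggests.

```python
def initial_runtime_estimate(rsc_fpops_est, host_flops):
    """Initial runtime guess in seconds, based only on the FLOPS benchmark."""
    return rsc_fpops_est / host_flops


def actual_runtime(rsc_fpops_est, host_flops, integer_speed_factor):
    """Runtime if the host runs this integer-heavy application
    'integer_speed_factor' times faster (or slower, if below 1) than
    its floating point benchmark would suggest."""
    return rsc_fpops_est / (host_flops * integer_speed_factor)


# Two hypothetical hosts with the same FLOPS benchmark get the same estimate...
estimate = initial_runtime_estimate(2.88e13, 2e9)
# ...but a 32 bit build that is, say, 4x slower on 64 bit integer
# arithmetic takes 4x longer than a 64 bit build on the same hardware.
runtime_64bit = actual_runtime(2.88e13, 2e9, 1.0)
runtime_32bit = actual_runtime(2.88e13, 2e9, 0.25)
```

Both hosts are told to expect 4 hours, yet the second one needs 16, which is the kind of mismatch that makes the work fetch misbehave.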


Does the progress counter itself work properly? I.e., does the completed percentage seem to increase smoothly?


My plan for the credits is to link that to the number of triples scanned in a work unit, but I need to work out a fair formula for that.
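
Purely to illustrate the difference between the two approaches discussed in this thread, here is a sketch that compares granting the lowest claim in a quorum with granting credit based on the amount of work in the workunit. Neither function is the project's actual formula; the names and the `credits_per_million_triples` constant are hypothetical.

```python
def grant_lowest_claim(claimed_credits):
    """Current behaviour as described in this thread: every result in the
    quorum is granted the lowest claimed value."""
    return min(claimed_credits)


def credit_from_triples(triples_scanned, credits_per_million_triples):
    """Work-based credit: depends only on the workunit, not on the host's
    benchmark or on what the wingman claimed."""
    return triples_scanned / 1_000_000 * credits_per_million_triples
```

With the second approach two hosts that return the same workunit always get the same credit, and a workunit with twice as many triples is worth twice as much, which is only fair if the runtime really scales with the number of triples.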

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10085 - Posted 7 Jun 2009 12:12:55 UTC - in response to Message ID 10083.

... Does the progress counter itself work properly? I.e., does the completed percentage seem to increase smoothly?

From what I have seen it went pretty smooth (under all my Windows x64 OSes: XP, Vista and Win7). :-)

My plan for the credits is to link that to the number of triples scanned in a work unit, but I need to work out a fair formula for that.

Now that sounds cool! :-)
But it will only be "fair" if a WU containing less triples is running much shorter than one with many triples to be scanned... I hope I did get this right? ;-)
____________
Lovely greetings from Cori

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10086 - Posted 7 Jun 2009 12:30:24 UTC - in response to Message ID 10085.

... Does the progress counter itself work properly? I.e., does the completed percentage seem to increase smoothly?

From what I have seen it went pretty smooth (under all my Windows x64 OSes: XP, Vista and Win7). :-)


That's good news. Hopefully there's just a miscomputation of the initial runtime guess on the server, then.


My plan for the credits is to link that to the number of triples scanned in a work unit, but I need to work out a fair formula for that.

Now that sounds cool! :-)
But it will only be "fair" if a WU containing less triples is running much shorter than one with many triples to be scanned... I hope I did get this right? ;-)


Yes, that's correct. And theoretically this is the case. I'll analyze the timing results from computed workunits tomorrow to see if reality matches theory :-)


I'm very happy to see that so far there hasn't been a single crash or inconsistent output (other than those caused by the resume problem).


-Willem Jan

P.S. There was a serious performance problem in the Mac OS X clients. Version 1.02 for OS X fixes this. (i386/ppc are up now; x86_64 will hopefully follow soon.)

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10087 - Posted 7 Jun 2009 13:11:58 UTC - in response to Message ID 10086.
Last modified: 7 Jun 2009 13:12:15 UTC

My plan for the credits is to link that to the number of triples scanned in a work unit, but I need to work out a fair formula for that.

Now that sounds cool! :-)
But it will only be "fair" if a WU containing less triples is running much shorter than one with many triples to be scanned... I hope I did get this right? ;-)


Yes, that's correct. And theoretically this is the case. I'll analyze the timing results from computed workunits tomorrow to see if reality matches theory :-)

...

P.S. There was a serious performance problem in the Mac OS X clients. Version 1.02 for OS X fixes this. (i386/ppc are up now; x86_64 will hopefully follow soon.)


I am happily waiting for an updated x64 app then.
It will have more optimization included I guess so I am looking forward to a little speed test... ;-)

And hopefully the "credit theory" will work. *smiles*
____________
Lovely greetings from Cori

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10088 - Posted 7 Jun 2009 13:45:31 UTC - in response to Message ID 10083.
Last modified: 7 Jun 2009 13:47:39 UTC

PoorBoy: I understand completely. Thanks for the computed workunits and the reports. They'll be very useful for solving this problem. Do you have the names (abc_sieve_wu_XXXXXXXX) of specific workunits that are wrong? That would help with determining the cause.


In any case, there definitely seems to be something wrong with the initial runtime estimates. One aspect of this appears to be that the number of FLOPS of a host (which BOINC bases the run time estimate on) is often not a good indicator of the actual speed, since for example 32/64 bit differences can be very large.


Does the progress counter itself work properly? I.e., does the completed percentage seem to increase smoothly?


My plan for the credits is to link that to the number of triples scanned in a work unit, but I need to work out a fair formula for that.


Sorry but I just Aborted them before reading the Thread again so I have no Idea which ones they were. I have to leave for awhile but this afternoon I'll download some more & see which ones they were/are.

They seem to Progress okay but then I haven't been watching that closely either, some would be at 9 hr's though & only show 25% done, I didn't think these things ran that long ???

Might have something to do with the AQUA CUDA WU's I'm running too, but it shouldn't, as I've given them their own Core to run on so they shouldn't interfere with the ABC WU's, or at least I wouldn't think they would.

They come all different though, 1 will say 14 hr's another 25 hr's & some as high as 60+ hr's. All my Systems are 64-Bit Windows XP Pro if that makes any difference ...

zombie67
Avatar
private message
Joined: Dec 27, 2006
Posts: 111
ID: 330
Credit: 1,735,320
RAC: 0
Message 10089 - Posted 8 Jun 2009 3:05:20 UTC - in response to Message ID 10074.
Last modified: 8 Jun 2009 3:05:47 UTC

I am testing now with:

64 bit leopard

Seems to have worked just fine. Waiting for my wingman to return his.

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=5255

32 bit leopard

Works, but credits are much less than claimed:

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=5298

PPC/tiger

It appears to be working. But it will not complete in time. After 26 hours, it is only 6% complete, and will not complete by the deadline tomorrow. So I have aborted it. With longer due dates, I think it would work.

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=5384
____________
Dublin, CA
Team SETI.USA

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10090 - Posted 8 Jun 2009 8:12:43 UTC - in response to Message ID 10089.

I am testing now with:

64 bit leopard

Seems to have worked just fine. Waiting for my wingman to return his.

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=5255

32 bit leopard

Works, but credits are much less than claimed:

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=5298

PPC/tiger

It appears to be working. But it will not complete in time. After 26 hours, it is only 6% complete, and will not complete by the deadline tomorrow. So I have aborted it. With longer due dates, I think it would work.

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=5384


Thanks for the report. There was a rather serious performance problem with the OS X clients that has been solved now. The new clients (1.02) are likely 5 times as fast as the old ones.

AMDave
private message
Joined: Dec 28, 2006
Posts: 1
ID: 348
Credit: 28,873
RAC: 0
Message 10092 - Posted 8 Jun 2009 12:16:14 UTC

Looked like all was ok with the client on Ubuntu x86_64, BOINC v 6.6.20

but the 'test' server has issues:

Mon 08 Jun 2009 22:09:17 EST abcbeta Message from server: Server error: can't attach shared memory

So I couldn't return anything.
____________

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10093 - Posted 8 Jun 2009 19:04:55 UTC - in response to Message ID 10090.

Thanks for the report. There was a rather serious performance problem with the OS X clients that has been solved now. The new clients (1.02) are likely 5 times as fast as the old ones.


Installed on my G5, one work unit downloaded and processing. Suspended everything else to get ABCbeta to run, now it's downloaded a ton of WUs. I think I'm over committed.
____________

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10095 - Posted 8 Jun 2009 22:28:18 UTC

New versions 1.03 are up for linux i686, linux x86_64, win32, windows x86_64.

They have a fix for a bug that could cause computation errors in corner cases. Also, the search range in which they can work has been expanded a bit.

The 1.03 win64 version now has improved arithmetic (using MSVC's arithmetic intrinsics) to match the other 64 bit clients. It gives a very nice 10% speed boost over 1.01.

Mac OS X versions for 1.03 will follow soon.

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10096 - Posted 8 Jun 2009 22:58:14 UTC

Cool! I've downloaded fresh WUs on my Vista and XP hosts (both x64). :-)

Only little problem: BOINC was downloading way too much work because the estimated time to completion was a bit too 'ambitious' (~30 minutes at the beginning)...
So I had to abort about half of the new WUs to make sure BOINC won't start panicking. ;-)
I hope my cancelled WUs will be resent to someone else?

PS. Will we be able to notice the speed increase? I mean are the WUs the same as before so we will have shorter runtimes now?
____________
Lovely greetings from Cori

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10097 - Posted 8 Jun 2009 23:08:04 UTC - in response to Message ID 10096.


PS. Will we be able to notice the speed increase? I mean are the WUs the same as before so we will have shorter runtimes now?


Yes, you're right. The WUs are the same, and the runtimes will be shorter on Win64. In a sample run I did with a (fake) tiny workunit the runtime was reduced from 225s to 200s.
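As a quick check of those sample numbers, the relative reduction can be worked out as below. This is only a sketch; the function name `runtime_reduction` is made up for the illustration.

```python
def runtime_reduction(old_seconds: float, new_seconds: float) -> float:
    """Return the fraction by which the runtime shrank (0.10 means 10% shorter)."""
    return (old_seconds - new_seconds) / old_seconds


# Sample run quoted above: 225 s with the old client, 200 s with 1.03.
reduction = runtime_reduction(225, 200)
print(f"runtime reduced by {reduction:.1%}")
```

The result is 25/225, or roughly 11%, which is in line with the "10% speed boost" mentioned for the win64 client.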

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10098 - Posted 9 Jun 2009 1:18:20 UTC - in response to Message ID 10096.
Last modified: 9 Jun 2009 1:21:10 UTC

Only little problem: BOINC was downloading way too much work because the estimated time to completion was a bit too 'ambitious' (~30 minutes at the beginning)...
So I had to abort about half of the new WUs to make sure BOINC won't start panicking. ;-)
I hope my cancelled WUs will be resent to someone else?

I suspect that I'm going to have to do the same thing: in three hours of computation the first WU is 14% complete, and I have about 40 more WUs due in 2½ days. {edit} 10.4.11, 2.0 GHz DP G5, 8 GB RAM. {/edit}
____________

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10099 - Posted 9 Jun 2009 1:34:41 UTC
Last modified: 9 Jun 2009 1:49:45 UTC

Wow, the new 64-bit app for Windoze is really fast!!
Have completed several WUs in under 3 hours!

Maybe I should have kept my big load of WUs instead of cancelling half of them.
But when I received almost 50 tasks at once (with a really low cache of 0.5 days) I was a bit worried...
Well, next time I'm going to watch the WUs better before cancelling. ;-)))



EDIT: Ok, before I get all too excited: there are some WUs which will need ~3.5 or 4 hours, but that's still quite an impressive speed improvement IMHO.
Guessing from a quick comparison with my older WUs I would say the new x64 app is ~30% faster than before.
____________
Lovely greetings from Cori

Sirius B User profile image
Avatar
private message
Joined: Mar 12, 2008
Posts: 13
ID: 16872
Credit: 1,007,534
RAC: 2,724
Message 10103 - Posted 9 Jun 2009 8:40:16 UTC

That's nice to hear. Been getting no work from ABC for over 36 hrs & run out of wu's, so switched rigs over to beta. XP X64 rig got plenty, but the rest keeps getting "shared memory error"
____________

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10105 - Posted 9 Jun 2009 9:33:52 UTC

Ooops, the "shared memory" error keeps me from reporting finished WUs! *sniff* ;-)))
____________
Lovely greetings from Cori

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10106 - Posted 9 Jun 2009 13:43:47 UTC

Anybody running these things, I have close to 24,000 Pending Credits piled up so far ... Getting the Shared memory message too when trying to connect ...

DoctorNow User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 11
ID: 24
Credit: 85,308
RAC: 1
Message 10107 - Posted 9 Jun 2009 18:29:45 UTC

Same here: "shared memory"... Waiting for reporting the WUs!
____________
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10108 - Posted 9 Jun 2009 21:00:20 UTC

Woo-hoo, servers are back! Just reported all my WUs! :-))
____________
Lovely greetings from Cori

zombie67
Avatar
private message
Joined: Dec 27, 2006
Posts: 111
ID: 330
Credit: 1,735,320
RAC: 0
Message 10109 - Posted 10 Jun 2009 3:18:54 UTC - in response to Message ID 10089.

PPC/tiger

It appears to be working, but it will not complete in time. After 26 hours it is only 6% complete, and it will not be complete by the deadline tomorrow, so I have aborted it. With longer due dates, I think it would work.


I ran a new one, with the *much* faster app 1.02. It validated.

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=6715

Credits are still much less than claimed.
____________
Dublin, CA
Team SETI.USA

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10110 - Posted 10 Jun 2009 3:26:46 UTC

abc_sieve_1.02_powerpc-apple-darwin is 80% through the first WU after 30 hours of computation, I've aborted all but the two WUs in progress.
____________

Sirius B User profile image
Avatar
private message
Joined: Mar 12, 2008
Posts: 13
ID: 16872
Credit: 1,007,534
RAC: 2,724
Message 10111 - Posted 10 Jun 2009 7:38:51 UTC

Server down again.
____________

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10113 - Posted 10 Jun 2009 10:28:31 UTC - in response to Message ID 10109.

I ran a new one, with the *much* faster app 1.02. It validated.

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=6715

Credits are still much less than claimed.

It looks to me like you finished the wu in 19 hours, is that correct? If so, I wonder why mine are running slower on a faster G5.
____________

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10114 - Posted 10 Jun 2009 11:29:45 UTC - in response to Message ID 10111.

Server down again.


The feeder seems to lose its connection to the mysql server from time to time, which is strange since it's an unmodified BOINC one. I'll try to add a keep-alive to hopefully prevent the mysql connection from dropping.
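The keep-alive idea mentioned here can be sketched in general terms as follows. This is not the BOINC feeder's code (the feeder is not written in Python) and it uses no real MySQL API: `keep_alive`, `FakeConnection`, and their parameters are hypothetical names that only illustrate the pattern of pinging a connection at intervals and reconnecting when a ping fails.

```python
import time
from typing import Callable


def keep_alive(ping: Callable[[], bool],
               reconnect: Callable[[], None],
               rounds: int,
               interval_seconds: float = 0.0) -> int:
    """Ping the connection `rounds` times and reconnect whenever a ping fails.

    Returns the number of reconnects that were needed.
    """
    reconnects = 0
    for _ in range(rounds):
        if not ping():
            reconnect()
            reconnects += 1
        time.sleep(interval_seconds)
    return reconnects


class FakeConnection:
    """Hypothetical stand-in for a database handle that drops after a few pings."""

    def __init__(self, drops_after: int) -> None:
        self.drops_after = drops_after
        self.pings_left = drops_after

    def ping(self) -> bool:
        if self.pings_left == 0:
            return False
        self.pings_left -= 1
        return True

    def reconnect(self) -> None:
        self.pings_left = self.drops_after


conn = FakeConnection(drops_after=2)
print("reconnects needed:", keep_alive(conn.ping, conn.reconnect, rounds=6))
```

In a real daemon the interval would be chosen to be shorter than the server's idle timeout, so the connection never sits unused long enough to be dropped.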

-Willem Jan

Sirius B User profile image
Avatar
private message
Joined: Mar 12, 2008
Posts: 13
ID: 16872
Credit: 1,007,534
RAC: 2,724
Message 10115 - Posted 10 Jun 2009 14:07:52 UTC - in response to Message ID 10114.

Server down again.


The feeder seems to lose its connection to the mysql server from time to time, which is strange since it's an unmodified BOINC one. I'll try to add a keep-alive to hopefully prevent the mysql connection from dropping.

-Willem Jan



Thanks Willem. It's not a problem though, as eventually, they'll all report in when ready.
____________

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10116 - Posted 10 Jun 2009 14:43:34 UTC - in response to Message ID 10113.
Last modified: 10 Jun 2009 15:07:20 UTC

I ran a new one, with the *much* faster app 1.02. It validated.

http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=6715

Credits are still much less than claimed.

It looks to me like you finished the wu in 19 hours, is that correct? If so, I wonder why mine are running slower on a faster G5.

I don't know if I'm reading it right but I think maybe your G5 is only running BOINC whereas mine is multi-tasking (this is my primary GP Mac)?

The first WU finally finished; it was slow and, like zombie67's WU, got less credit than claimed: http://abctest.math.leidenuniv.nl/abcbeta/result.php?resultid=6675 The second WU is still in progress.
____________

zombie67
Avatar
private message
Joined: Dec 27, 2006
Posts: 111
ID: 330
Credit: 1,735,320
RAC: 0
Message 10122 - Posted 12 Jun 2009 2:11:34 UTC

Your task finished faster than mine. Compare the times.
____________
Dublin, CA
Team SETI.USA

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10129 - Posted 12 Jun 2009 19:25:13 UTC - in response to Message ID 10122.

Your task finished faster than mine. Compare the times.

I guess I've been reading the times wrong. BOINCManager said I had 40 hours on each of my first two work units, but the two results I've finished were about 15.5 and 17.75 hours according to the results. Is BOINCManager really that far off in its elapsed time calculations?
____________

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10130 - Posted 12 Jun 2009 19:57:58 UTC - in response to Message ID 10129.
Last modified: 12 Jun 2009 19:58:21 UTC

Your task finished faster than mine. Compare the times.

I guess I've been reading the times wrong. BOINCManager said I had 40 hours on each of my first two work units, but the two results I've finished were about 15.5 and 17.75 hours according to the results. Is BOINCManager really that far off in its elapsed time calculations?

Elapsed time in BOINC manager should be correct but estimated time to completion can be way off. ;-)

Only other scenario I can think of is that the wall-clock time needed for the WUs was for some reason much higher than the CPU time.
Hm... did you run some AQUA mt-WUs on that host, too?
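To illustrate the difference between wall-clock time and CPU time, here is a minimal sketch using Python's standard `time` module. The helper name `measure` is made up for the example, and the sleep merely stands in for a task that sits waiting while other programs use the processor.

```python
import time


def measure(work):
    """Run `work()` and return (wall_seconds, cpu_seconds) it consumed."""
    wall_start = time.perf_counter()   # real elapsed time
    cpu_start = time.process_time()    # CPU time used by this process
    work()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)


# Sleeping uses almost no CPU, so wall-clock time ends up far above CPU time.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall-clock {wall:.2f} s, CPU {cpu:.2f} s")
```

A task that shares its processor with other work behaves the same way on a larger scale: the elapsed time keeps growing while the CPU time only counts the slices the task actually received.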
____________
Lovely greetings from Cori

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10131 - Posted 13 Jun 2009 1:03:08 UTC - in response to Message ID 10130.

Elapsed time in BOINC manager should be correct but estimated time to completion can be way off. ;-)

Only other scenario I can think of is that the wall-clock time needed for the WUs was for some reason much higher than the CPU time.
Hm... did you run some AQUA mt-WUs on that host, too?

No, I'm not familiar with AQUA mt, do they have a PPC Mac application?
____________

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10132 - Posted 13 Jun 2009 1:10:33 UTC - in response to Message ID 10131.

Elapsed time in BOINC manager should be correct but estimated time to completion can be way off. ;-)

Only other scenario I can think of is that the wall-clock time needed for the WUs was for some reason much higher than the CPU time.
Hm... did you run some AQUA mt-WUs on that host, too?

No, I'm not familiar with AQUA mt, do they have a PPC Mac application?

Hm... they have an app for "Mac OS 10.4 or later running on Intel".
But that's different from PPC Mac I guess. :-(
____________
Lovely greetings from Cori

Paul
private message
Joined: Feb 25, 2009
Posts: 1
ID: 29219
Credit: 7,747
RAC: 20
Message 10138 - Posted 16 Jun 2009 6:58:13 UTC

Hi

I downloaded 1 task and it showed 38 minutes to finish. It ran for over 3 hours and was only 30% complete, so it looked like at least a 9 hour run, but you only have 2 days to do it. Today it downloaded another 10, so that could have been about 100 hours of work to complete in 2 days.

I think they should either be quicker or be given more time to complete. As I run a lot of other projects, I had to abort all 11 of these, because otherwise there was no time to do anything else.
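The estimate above can be reproduced with a small sketch. It assumes progress is linear in time and that a single core works on these tasks; the function names and the `cores` parameter are made up for the illustration.

```python
def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate the total runtime, assuming progress is linear in time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


def fits_deadline(tasks: int, hours_per_task: float,
                  deadline_hours: float, cores: int = 1) -> bool:
    """True if all tasks can finish before the deadline on `cores` cores."""
    return tasks * hours_per_task <= deadline_hours * cores


# Numbers from the post: 3 hours elapsed at 30%, 11 tasks, 2-day deadline.
per_task = projected_total_hours(elapsed_hours=3, fraction_done=0.30)
print(f"about {per_task:.0f} hours per task")
print("fits deadline:", fits_deadline(tasks=11, hours_per_task=per_task,
                                      deadline_hours=48))
```

With these figures each task projects to about 10 hours, so 11 tasks need roughly 110 hours against a 48 hour deadline, which matches the conclusion that they could not all be finished in time.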

Paul

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10140 - Posted 16 Jun 2009 9:13:01 UTC
Last modified: 16 Jun 2009 9:21:02 UTC

Can we get an update on how/where the ABCbeta project is doing/going? I'm sitting on over 50,000 pending credits for the ABC beta project, and if the project's dead I'll divert the resources somewhere else. Not complaining, just asking how it's going.

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10143 - Posted 18 Jun 2009 15:55:39 UTC

A status update:

The plan is to wait until the weekend, and then analyze the timing results from all returned workunits. Hopefully a pattern will emerge in the running times, so that we can fix the expected run time of new workunits.

After that, I'll re-generate a set of new workunits for the beta test with the new heuristics. From the current results, it looks like I should make them a lot shorter than they are now. Does 2 or 4 times as short sound reasonable to you?

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10144 - Posted 18 Jun 2009 17:00:41 UTC

I am having to abort lots of WUs for the beta app because the actual runtime is so much longer than the estimated runtime. Is there any way either the deadlines can be extended or the estimated runtime be made more accurate?
____________

Paladin*
private message
Joined: Nov 21, 2006
Posts: 25
ID: 58
Credit: 7,328,000
RAC: 0
Message 10146 - Posted 18 Jun 2009 18:46:23 UTC - in response to Message ID 10143.

A status update:

The plan is to wait until the weekend, and then analyze the timing results from all returned workunits. Hopefully a pattern will emerge in the running times, so that we can fix the expected run time of new workunits.

After that, I'll re-generate a set of new workunits for the beta test with the new heuristics. From the current results, it looks like I should make them a lot shorter than they are now. Does 2 or 4 times as short sound reasonable to you?


Yes, that sounds good. A lot of the WUs were running 8-10 hrs on my Quads, so shortening them by a factor of 3 or 4 would put them in the 2-3 hr range.

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10147 - Posted 18 Jun 2009 18:52:06 UTC - in response to Message ID 10146.

A status update:

The plan is to wait until the weekend, and then analyze the timing results from all returned workunits. Hopefully a pattern will emerge in the running times, so that we can fix the expected run time of new workunits.

After that, I'll re-generate a set of new workunits for the beta test with the new heuristics. From the current results, it looks like I should make them a lot shorter than they are now. Does 2 or 4 times as short sound reasonable to you?


Yes, that sounds good. A lot of the WUs were running 8-10 hrs on my Quads, so shortening them by a factor of 3 or 4 would put them in the 2-3 hr range.


Yup, have to agree. :-)))
____________
Lovely greetings from Cori

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10148 - Posted 18 Jun 2009 21:52:51 UTC - in response to Message ID 10143.

A status update:

The plan is to wait until the weekend, and then analyze the timing results from all returned workunits. Hopefully a pattern will emerge in the running times, so that we can fix the expected run time of new workunits.

After that, I'll re-generate a set of new workunits for the beta test with the new heuristics. From the current results, it looks like I should make them a lot shorter than they are now. Does 2 or 4 times as short sound reasonable to you?

Will doing this cause BOINC to simply DL more WUs causing effectively the same situation?
____________

Crystal Pellet
private message
Joined: Jan 16, 2007
Posts: 14
ID: 1251
Credit: 689,531
RAC: 0
Message 10149 - Posted 19 Jun 2009 9:18:46 UTC - in response to Message ID 10148.

A status update:

The plan is to wait until the weekend, and then analyze the timing results from all returned workunits. Hopefully a pattern will emerge in the running times, so that we can fix the expected run time of new workunits.

After that, I'll re-generate a set of new workunits for the beta test with the new heuristics. From the current results, it looks like I should make them a lot shorter than they are now. Does 2 or 4 times as short sound reasonable to you?

Will doing this cause BOINC to simply DL more WUs causing effectively the same situation?

Yes it will.
If you attach to a project for the first time, it's always a good idea to reduce your additional work buffer and crunch a few WUs first. That way the BOINC client can calculate the Duration Correction Factor, taking into account the speed of your computer and the (mis-)estimation of the calculation duration that comes with the WU.
For the next work request, BOINC then knows how long the tasks will take (and so how much work should be requested), and it also calculates the debt of your new project against your other projects.
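The correction described above can be sketched as follows. This is a simplified illustration, not the BOINC client's code: the function name `update_dcf`, the `smoothing` value, and the rule of raising the factor at once but lowering it gradually are assumptions made for the example. The numbers are the ones reported earlier in the thread (an initial estimate of about 30 minutes for tasks that took about 3 hours).

```python
def update_dcf(dcf: float, estimated_hours: float, actual_hours: float,
               smoothing: float = 0.1) -> float:
    """Return a new correction factor after one task completes.

    `estimated_hours` is the uncorrected estimate shipped with the task.
    """
    ratio = actual_hours / estimated_hours
    if ratio > dcf:
        # Underestimates are costly (missed deadlines), so jump up at once.
        return ratio
    # Overestimates are harmless, so come down gradually.
    return (1 - smoothing) * dcf + smoothing * ratio


# First completed task: estimated 0.5 h, actually took 3 h.
dcf = update_dcf(1.0, estimated_hours=0.5, actual_hours=3.0)
print(f"correction factor: {dcf:.1f}")
print(f"corrected estimate for the next task: {0.5 * dcf:.1f} hours")
```

Once the factor has been learned from a few completed tasks, the corrected estimates are much closer to the real runtimes, so a work request of the same size in days translates into far fewer tasks.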

mikey User profile image
Avatar
private message
Joined: Aug 8, 2008
Posts: 812
ID: 21896
Credit: 4,323,646
RAC: 0
Message 10152 - Posted 20 Jun 2009 8:47:10 UTC - in response to Message ID 10148.

A status update:

The plan is to wait until the weekend, and then analyze the timing results from all returned workunits. Hopefully a pattern will emerge in the running times, so that we can fix the expected run time of new workunits.

After that, I'll re-generate a set of new workunits for the beta test with the new heuristics. From the current results, it looks like I should make them a lot shorter than they are now. Does 2 or 4 times as short sound reasonable to you?

Will doing this cause BOINC to simply DL more WUs causing effectively the same situation?

If you go into your default settings you can set it to only download, say, 0.25 or even 0.10 days of work by its own calculations. That should solve your immediate problem of having to abort units. Go to Your Account, General Preferences, and then look under Network Usage. If you set the first line to 0.00, I think it only downloads a unit or so at a time because it thinks you have an always-connected computer. If you then adjust the additional work line to, say, 0.10 or 0.25, it tries to add enough work for an additional 1/10 or 1/4 of a day. This is NOT perfect but it sorta works. When you start, in your case, leave the additional work line blank too, and BOINC should only get a unit or so at a time.
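A rough sketch of why a smaller buffer helps is given below. It is a deliberately simplified model, not how the BOINC scheduler actually sizes a work request (which also weighs resource shares, other projects, and more); the function name `tasks_requested` and the choice of 2 cores are assumptions for the example. The 0.5 hour estimate is the "~30 minutes" initial estimate reported earlier in the thread.

```python
import math


def tasks_requested(buffer_days: float, estimated_hours_per_task: float,
                    cores: int = 1) -> int:
    """Number of tasks needed to fill the buffer on every core (simplified)."""
    buffer_hours = buffer_days * 24 * cores
    return math.ceil(buffer_hours / estimated_hours_per_task)


# Compare a 0.5 day buffer with the smaller settings suggested above.
for buffer_days in (0.5, 0.25, 0.10):
    n = tasks_requested(buffer_days, estimated_hours_per_task=0.5, cores=2)
    print(f"{buffer_days} days of work -> {n} tasks")
```

Because the task count scales with the buffer size divided by the estimated runtime, an estimate that is several times too low inflates the request by the same factor, and shrinking the buffer limits the damage until the estimates improve.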
____________

ziegenmelker
private message
Joined: Jan 22, 2007
Posts: 3
ID: 1890
Credit: 146,036
RAC: 0
Message 10194 - Posted 29 Jun 2009 14:26:50 UTC

Server error:

Mo 29 Jun 2009 16:22:50 CEST|abcbeta|Message from server: Server error: can't attach shared memory

cu,
Michael

wjpalenstijn
Forum moderator
Project administrator
Project developer
Project scientist
private message
Joined: Dec 12, 2006
Posts: 55
ID: 306
Credit: 791
RAC: 0
Message 10196 - Posted 29 Jun 2009 18:59:15 UTC

Status update: after analyzing a number of workunits, it seems the speed heuristic actually does quite well, but the code that builds workunits makes many workunits much longer than intended.

Together with lowering the target size, I hope fixing that will make the new workunits behave much more reasonably.

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10197 - Posted 30 Jun 2009 3:35:16 UTC

This work unit stopped progress at about 12%, I paused and continued it in hopes it would finish but at 59:41 (elapsed, not CPU) I aborted it. It was about 8 hours past due at that time and still at 12% completed.
____________

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10198 - Posted 30 Jun 2009 5:01:56 UTC
Last modified: 30 Jun 2009 5:42:17 UTC

It looks like this work unit had similar issues. No mention of permission errors but it failed to detect a heartbeat several times before erroring out. Both are on my 06/2005 G5 2.0 DP.

(edit) And this work unit also errored out on my G5 Power Mac. I don't see any bad work units for either of my G4s, is there a particular issue with the PPC970?
____________

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10199 - Posted 30 Jun 2009 5:20:34 UTC
Last modified: 30 Jun 2009 5:43:28 UTC

(deleted)
____________

mikey User profile image
Avatar
private message
Joined: Aug 8, 2008
Posts: 812
ID: 21896
Credit: 4,323,646
RAC: 0
Message 10203 - Posted 30 Jun 2009 10:22:29 UTC

The only thing I am seeing, on both Windows and Linux machines, is that the time to completion starts out at wildly different numbers. I have several that think they will take 70 hours to finish and some that think they will only take 15 hours to finish. I don't know where to go to see how long they actually took though. My ABC account stats only show the non-beta units.
____________

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10204 - Posted 30 Jun 2009 10:57:29 UTC - in response to Message ID 10203.

... I don't know where to go to see how long they actually took though. My ABC account stats only shows the non beta units.

You can see them at your ABC Beta account: http://abctest.math.leidenuniv.nl/abcbeta/home.php ;-)
____________
Lovely greetings from Cori

Penguirl
private message
Joined: Nov 24, 2007
Posts: 15
ID: 12398
Credit: 57,887
RAC: 67
Message 10205 - Posted 30 Jun 2009 19:31:28 UTC

I have another work unit that appears to be doing the same thing. It ran up to 15% in a few hours, now it's seemingly stuck at 15% after 12 hours. I am going to abort it and take my G5 off the beta program until there is a fix.
____________

mikey User profile image
Avatar
private message
Joined: Aug 8, 2008
Posts: 812
ID: 21896
Credit: 4,323,646
RAC: 0
Message 10207 - Posted 1 Jul 2009 9:16:48 UTC - in response to Message ID 10204.

... I don't know where to go to see how long they actually took though. My ABC account stats only shows the non beta units.

You can see them at your ABC Beta account: http://abctest.math.leidenuniv.nl/abcbeta/home.php ;-)


thanks
____________

Cori User profile image
Avatar
private message
Joined: Nov 8, 2006
Posts: 2333
ID: 26
Credit: 1,094,817
RAC: 0
Message 10212 - Posted 3 Jul 2009 6:33:37 UTC
Last modified: 3 Jul 2009 6:38:32 UTC

I found several WUs which were successfully completed but didn't get granted - only my wingmen got credits.
Example: http://abctest.math.leidenuniv.nl/abcbeta/workunit.php?wuid=3053

Were my results really 'wrong' or why did this happen?
I noticed that my box needed longer to crunch the affected WUs than my wingmen's did, but does this mean my results are not matching either?
____________
Lovely greetings from Cori

Ondra@SpaceFamily.CZ User profile image
Avatar
private message
Joined: Mar 9, 2008
Posts: 3
ID: 16766
Credit: 10,629
RAC: 0
Message 10234 - Posted 13 Jul 2009 15:25:42 UTC

I have a question... when will new WUs be available?

Administrator
private message
Joined: Jun 13, 2010
Posts: 1
ID: 64184
Credit: 0
RAC: 0
Message 11243 - Posted 15 Jun 2010 3:28:26 UTC - in response to Message ID 10041.

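Why did the remaining time show 6 hours at the start, when I have now been running that task for two days and it still hasn't finished? It is only 25% complete, and the remaining time still shows 7 hours.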
____________

梁书豪 User profile image
private message
Joined: Jun 21, 2010
Posts: 1
ID: 64630
Credit: 0
RAC: 0
Message 11263 - Posted 21 Jun 2010 11:20:12 UTC

I have a question: if I finished my program, what will happen?
____________

mikey User profile image
Avatar
private message
Joined: Aug 8, 2008
Posts: 812
ID: 21896
Credit: 4,323,646
RAC: 0
Message 11268 - Posted 22 Jun 2010 11:10:38 UTC - in response to Message ID 11263.

I have a question: if I finished my program, what will happen?


Finished what program? Do you mean your workunits you got from this, or any other, project? If so they just get returned and you get new ones to work on. The Project itself decides what we work on and how many workunits they need to prove their theory right or wrong.


Copyright © 2010 University of Leiden