- DistributedDataMining: Shorter workunits for our medical application
06.06.2011 08:13
Hi there, as already announced by Daniel, we are going to continue our research in the field of Medical Data Analysis. Last night I generated about 100,000 new workunits for the medical application. We took the results of our latest poll into account and adapted our workunit structure: this time the WU runtime is significantly shorter. Depending on your CPU, a single experiment takes between 30 and 90 minutes. During the next few days the estimated WU runtime may be far from reality; we simply don’t have enough information yet to provide a valid estimate. Please let these WUs finish anyway. The more valid WUs we get, the better our estimates for the following units will be. I expect the issue to be resolved in two or three days. Thanks for your support.
- DistributedDataMining: Website changes
05.06.2011 17:47
Recently, I made some changes to our dDM website in order to honour the efforts of our dDM members:
- The Member of the day
- and the Donators for our server infrastructure
are now prominently presented on the website. I’d like to thank all members for their contributions to the dDM project.
- DistributedDataMining: State of affairs
01.06.2011 13:54
Hi there, this is Daniel, the guy who’s (more or less) responsible for the Medical Data Analysis application you’ve all been working on so impressively in the last months. First of all, I’d like to take this opportunity to thank all of you for your massive support and the computational power you’re generously donating to this project! So far, thousands of experiments have been successfully conducted with your help that would have taken me ages to do on my own. I really appreciate your work, which moves the whole project forward big time! Currently, I’m in the process of analyzing the huge amount of results you provided in order to make sense of it scientifically, since I’m working on a publication on that topic. As you can imagine, the analysis will take some time, but a first glimpse at the data has already revealed promising things. After having explored the data in a more “horizontal” manner up to now (investigating different combinations of parameters and configurations), in the next weeks I’d like to continue “vertically” with the experiments (testing the best combinations identified so far at an increasing level of detail, allowing for more profound conclusions about the validity of our results). I’m sure this will give really interesting insights. So I hope you are all set for the next round of the project and ready to crunch some numbers… Best, Daniel
- DistributedDataMining: New workunits for the Time Series Analysis Application
10.05.2011 12:23
Some of you might have noticed that there are again some workunits for the time series analysis application. Over the weekend I generated about 35 thousand WUs. These WUs were processed before but got cancelled for various reasons: mainly because the client didn’t have Java installed, and in some cases because of too little available memory. After these old WUs are finished, a new batch of WUs will be available. For the new batch we will use the same wrapper technology that is already in use for our medical application. In addition, we are going to integrate a newer version of the open source data mining suite RapidMiner.
- DistributedDataMining: Faster validation and less pending credit
02.05.2011 11:35
Some members might remember that we had problems with CPU time counting in versions of the medical data analysis application prior to 1.21. Since all WUs assigned to these faulty versions have been sent back to the dDM server, I decided to soften the safety mechanism responsible for plausibility checks. So far, a heuristic was used to hold back suspicious results for a manual check; as a consequence, the validation of affected results was delayed. From now on, the heuristic is less strict and the number of results I have to check manually will decrease significantly. As a result, validation will be faster and we will have less pending credit.
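The post doesn’t describe the plausibility heuristic in detail, but the idea can be sketched as follows. This is a minimal illustration only: the median-based baseline and the threshold factor are my assumptions, not the project’s actual rules.

```python
def flag_suspicious(cpu_time_s, recent_valid_times, factor=10.0):
    """Flag a reported CPU time for manual review if it exceeds a
    multiple of the median of recently validated results for the same
    application version. Hypothetical sketch of such a heuristic.
    """
    if not recent_valid_times:
        return False  # no baseline yet, so accept the result
    ordered = sorted(recent_valid_times)
    median = ordered[len(ordered) // 2]
    return cpu_time_s > factor * median
```

A stricter `factor` holds back more results for manual checking (slower validation, more pending credit); loosening it, as described above, lets more results validate automatically.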
- DistributedDataMining: New application version 1.23 for Medical Data Analysis
27.04.2011 14:44
Today, I’ve released version 1.23 of the Medical Data Analysis Application. Supported operating systems are Windows and Linux. Besides some bug fixes and minor changes to the error logging mechanism, the overall performance was improved by reducing the communication between the boinc client and Java. Comments and error reports are welcome in the forum.
- DistributedDataMining: Server problems
17.03.2011 23:26
Recently, we had some serious server trouble and the project went offline for a couple of hours. So far, I don’t know exactly what happened; I’ll look into it. As far as I can tell, we didn’t lose any data and the database is consistent. It was not necessary to restore the latest backup.
- DistributedDataMining: Wrong CPU-time counting in Medical Data Analysis
02.03.2011 22:44
We are facing a problem with CPU time counting. In some cases, the CPU time for a WU is not counted correctly: the boinc manager then reports hundreds or thousands of CPU hours, and consequently the credit is much too high. This problem was briefly discussed in our dDM forum, and I am working on solving it. In fact, I’ve recently released a couple of new application versions. It’s quite hard to find the error because it appears rarely and all my local tests work perfectly. In the meantime I’ve activated a safety mechanism: suspicious WUs are not credited automatically but checked manually. From time to time, a WU with a wrong CPU time gets credit anyway; this happens because the safety mechanism uses only heuristics to find affected WUs and doesn’t work in all cases. The credits of all affected WUs will be corrected at once, as soon as I’ve found the cause of the error. The latest version is 1.18. It should handle suspending/resuming correctly and also contains some changes to the CPU time counting code.
- DistributedDataMining: Errors in Medical Data Analysis – Application Version 1.10
15.02.2011 23:04
Recently, I released version 1.10 in order to overcome the resume/suspend problem. Even though we made some progress regarding suspending/resuming, other problems have occurred:
- Due to extensive logging, the error log file exceeds the upper size limit in some cases. The affected WUs won’t be uploaded to the dDM server and are marked as failures. I am going to grant the credit anyway.
- During suspending/resuming, CPU time is counted twice. As a result, the reported runtimes are way too high. A safety mechanism I implemented on the server a couple of months ago gets activated and puts the uploaded WU on hold. In the web interface these WUs appear as ‘Pending’. I have to figure out how to handle this situation and correct the CPU time.
Good news: I’ve found the error responsible for the double CPU time counting. In addition, I’ve decreased the number of messages in the error log file; it shouldn’t exceed its limit any longer. The new version will be 1.11. I am going to release it today.
- DistributedDataMining: New application version for Medical Data Analysis
10.02.2011 02:42
Today, I’ve released version 1.08 of our Medical Data Analysis Application. There are some minor changes:
- Improved error logging and handling
- Corrected CPU time counting
- Fixed suspending/resuming under Windows
- DistributedDataMining: New application for Medical Data Analysis
14.01.2011 23:31
As already announced here, we are continuing Daniel’s research in the field of Medical Data Analysis. For this purpose, we’ve implemented a new and more flexible Java wrapper. Now, after finishing our tests, a new application for laryngeal high-speed video classification is available for Windows and Linux operating systems. Please report any noticeable problems via the forum.
- DistributedDataMining: Team Challenge of The Knights Who Say Ni!
07.01.2011 22:17
Recently, the team The Knights Who Say Ni! started a team challenge on our DistributedDataMining project. The challenge was originally announced here. So far, 15 team members are participating in order to support our research and to increase their team credit. Today, our thanks go especially to the team The Knights Who Say Ni! for supporting our research. Due to the expected higher server load, our dDM project might suffer from performance losses. We are constantly working on overcoming these issues; please report any noticeable problems via the forum. We’d like to emphasize the correct project URL http://www.distributeddatamining.org/DistributedDataMining and the need for Java. Further information for new dDM members can be found here.
- DistributedDataMining: New poll about your preferred WU runtime
20.12.2010 04:14
As mentioned before, we are planning to continue our research in Medical Data Analysis. Our latest tests are promising, and we’ve already released a small number of Linux test WUs to the public. The characteristics of the data and the new features of our latest RapidMiner wrapper make it possible to determine the runtime of the new workunits in advance. In order to find out what runtime our dDM members prefer, we’ve started a new poll. Please vote for your preferred WU runtime and help us meet your demands.
- DistributedDataMining: Press report about dDM
18.12.2010 06:52
Recently, the popular German journal Handelsblatt briefly reported on the dDM project: http://www.handelsblatt.com/technologie/it-internet/verteiltes-rechnen-wenn-der-eigene-rechner-zur-alien-falle-wird;2680330;10#bgStart Alongside SETI@home, Einstein@home and other well-known BOINC projects, our DistributedDataMining project is listed as one of the most popular Distributed Computing projects. We are proud of the publicity and the appreciation.
- DistributedDataMining: Book announcement
15.12.2010 12:27
Today, it’s a great pleasure to announce the latest book by Dr. Daniel Voigt: Objective Analysis and Classification of Vocal Fold Dynamics from Laryngeal High-Speed Recordings. Aachen: Shaker Verlag GmbH; 2010. Daniel’s work in the field of Medical Data Analysis on laryngeal high-speed video classification was partially powered by the dDM project. As usual, the results of our efforts are publicly available: Daniel published his PhD thesis as a book at Shaker Verlag. Currently, we are planning to continue our research in this area. As soon as our final tests are finished, a new medical application will be available to all dDM members. Congratulations to Daniel, and special thanks to the dDM community for supporting our research!
- DistributedDataMining: Team Challenge of L'Alliance Francophone
16.09.2010 08:26
The team L’Alliance Francophone is running a team challenge on our dDM project from September 17th to October 1st. The challenge includes the whole team and was originally announced here. Today, our thanks go especially to the team L’Alliance Francophone for supporting our research. Due to the expected higher server load, our dDM project might suffer from performance losses. We are constantly working on overcoming these issues; please report any noticeable problems via the Number Crunching forum. We’d like to emphasize the correct project URL http://www.distributeddatamining.org/DistributedDataMining and the need for Java. Further information for new dDM members can be found here.
- DistributedDataMining: Aborted WUs and granted credit
30.08.2010 20:27
Today, I noticed that about 30 WUs were cancelled recently due to an unknown server problem. Because of the long runtime of these WUs, I have added the full credit to the affected user accounts anyway. I am now going to look into the problem in order to fix it. Sorry for the inconvenience.
- DistributedDataMining: More RAM for dDM servers
30.08.2010 08:53
Recently, I’ve noticed some performance problems with the dDM database. In some cases these problems might lead to slow or delayed connections from the boinc clients. The worst case was as follows: clients couldn’t connect to the server and waited one hour (because of load reduction) before trying again. In order to speed up the database, I’ve given more RAM to the servers. This includes the frontend server, which is responsible for the websites, and the backend server, which hosts the dDM database. Let’s see if it helps to normalize the situation.
- DistributedDataMining: New images for the Boinc Manager
27.08.2010 17:00
Today, I’ve added two new images that are shown in the simple view of the Boinc Manager. I’ve chosen these images to symbolize the Stock Price Prediction Application. The first one is a coloured version of the SPP logo that is also used on the website. The second one is less abstract and shows a stock price diagram. I hope the new images please you, even though most dDM members prefer the advanced view in the BM.
- DistributedDataMining: New AppVersion 4.30: 32-bit for Windows and Linux + 64-bit for Linux
01.08.2010 16:01
Today, I’ve released AppVersion 4.30 for Linux and Windows systems. It’s the first time dDM has released a 64-bit version for Linux. Besides that, there are no big changes, just small bug fixes. The most notable new feature addresses the problem of multiple Java processes remaining after a crash of the Java wrapper: every time a Java process is created, the wrapper stores the Java process ID in the checkpoint file. After a restart of the wrapper (for whatever reason), it checks whether the old Java process (based on the stored PID) is still running. This way we avoid two Java processes doing the same work and consuming double the CPU time.
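The duplicate-process guard described above can be sketched like this. The real wrapper is Java-based; the checkpoint file layout and function name here are hypothetical, and the `os.kill(pid, 0)` existence check is POSIX-specific.

```python
import os

def stale_process_running(checkpoint_path):
    """Return the PID stored in the checkpoint file if that process is
    still alive, else None. Illustrates the 4.30 guard against two
    worker processes doing the same work.
    """
    try:
        with open(checkpoint_path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return None  # no checkpoint or unreadable: nothing to wait for
    try:
        os.kill(pid, 0)  # signal 0 checks existence without killing
        return pid
    except OSError:
        return None  # stored process is already gone
```

On restart, the wrapper would wait for (or adopt) a still-running PID instead of spawning a second worker.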
- DistributedDataMining: Suggestions for website improvements
24.07.2010 07:02
Our new website has been online for several days, and dDM members could participate in a poll about the new layout. Most participants like the new website, but 20% don’t like the new layout. Therefore, I’d like to invite you to post comments, feedback and suggestions on how we could further improve our new website. Please use the Website Issues forum to share your ideas.
- DistributedDataMining: New operating system for the dDM backend server
18.07.2010 17:26
Today, I’ve updated the operating system of the dDM backend server. From now on, a 64-bit Debian (lenny) system is running in a XEN virtual machine. This way, the dDM software fully supports the 64-bit architecture of the XEN host system, which should result in better overall performance. In addition, the switch to a 64-bit OS makes the development of 64-bit applications for dDM much easier. Because of unforeseen problems with file and database permissions, the dDM backend server was offline for about 1 hour. During this time, the boinc functionality included in the new dDM website was also not available. Today’s architecture switch was the last step in a batch of changes: during the last days, the dDM project got a new website, a new project URL and a better server infrastructure. I hope any inconvenience you experienced is compensated by the new dDM features.
- DistributedDataMining: New dDM website
16.07.2010 19:22
Recently, I have been working on a new dDM website. After finishing the beta test phase, our new site is now online. The objective of the new site is to separate it from my personal website in order to focus on the project itself. Therefore, we have a new domain name: http://www.distributeddatamining.org In addition, I tried to achieve a better integration of the boinc functionality, so I’ve implemented a module for the popular content management system Drupal. From now on, all boinc content is shown inside the dDM website instead of opening in a new tab or window. To reach the non-public part of the website (your boinc settings, private messages, …) you have to log in using your usual boinc account data (email + password). Comments, hints and feedback are welcome in the Number Crunching forum.
- DistributedDataMining: New AppVersion 429
22.06.2010 15:00
Today, I’ve released AppVersion 4.29. The new version fixes the following bugs:
- The java process did not finish after the boinc manager was closed (reported by bloodrain).
- The java process allocated too much memory (reported by Augustine).
- The trickle message frequency was too high (reported by ritterm). Now, it’s set to once per hour for each WU.
- The update frequency of the status messages sent to the boinc manager was too low. Now it’s dynamically adapted based on the WU running time.
- In case of errors the WU was restarted even if the error was unrecoverable. From now on, an unrecoverable error will quit the WU immediately.
- There is still the problem of the 0xc0000005 errors. It seems to be caused by memory problems, and quite often they are recoverable. From now on, the wrapper will wait 120 seconds before restarting the WU. This should help to normalize the memory situation on the affected systems.
- In rare cases the checkpoint file can’t be read (reported by magyarficko). As a consequence, the cputime isn’t counted correctly. I added some debug code to find out what causes this problem.
- There was a bug in the on-demand trickle message mechanism that sends the error log file back to the server. Now it’s working on Linux as well as Windows machines.
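The dynamically adapted status update frequency from the list above could look roughly like this: frequent updates early in a WU, sparser ones as the runtime grows. The scaling constants are illustrative assumptions, not the wrapper’s actual values.

```python
def status_update_interval(elapsed_s, min_interval=10.0, max_interval=600.0):
    """Seconds to wait before the next status message to the boinc
    manager, scaled to roughly 1% of the elapsed WU runtime and clamped
    to a sane range. A hypothetical sketch of the 4.29 behaviour.
    """
    interval = elapsed_s / 100.0  # grow the gap as the WU runs longer
    return max(min_interval, min(max_interval, interval))
```

This keeps short WUs responsive in the manager while avoiding needless chatter from WUs that run for many hours.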
- DistributedDataMining: Bug fixing efforts and WU error rate
24.05.2010 12:31
During the last weeks, I was wondering whether our bug fixing efforts were successful and whether we were able to improve the dDM application. Therefore, I’ve implemented a script that calculates the fraction of CPU time lost because of WU errors. This calculation is done once per day for each AppVersion that has processed a significant number of WUs. Beta WUs and application-independent errors are not taken into account. The results are updated daily and shown at http://www.nicoschlitter.de/ddm_apps. I am glad to have reached a state where less than 0.5 percent of the total CPU time is lost because of WU aborts. Of course, we will continue our efforts to decrease the number of errors. I’d like to thank you for your patience, feedback and bug reports. Without your help, dDM couldn’t be successful.
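The script described above might compute the lost fraction roughly as follows. This is a sketch; the real script and its database schema are not shown in the post, and the tuple format here is invented for illustration.

```python
def lost_cpu_fraction(results):
    """Per app version, the fraction of total CPU time spent on WUs
    that ended in an error. `results` is an iterable of
    (app_version, cpu_time_s, ok) tuples; beta WUs and
    application-independent errors are assumed to be filtered out
    beforehand, as the post describes.
    """
    totals, lost = {}, {}
    for version, cpu, ok in results:
        totals[version] = totals.get(version, 0.0) + cpu
        if not ok:
            lost[version] = lost.get(version, 0.0) + cpu
    return {v: lost.get(v, 0.0) / t for v, t in totals.items() if t > 0}
```

Run once per day over the recent results table, this yields exactly the per-AppVersion percentages published on the stats page.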
- DistributedDataMining: WUs with beta-flag for a better quality control
21.05.2010 16:22
Recently, we had to deal with some WUs that didn’t work correctly because of a bug in the underlying data mining framework (RapidMiner). Regarding WUs, dDM is a special project, because dDM WUs differ not just in their data; they also differ in the applied data processing algorithm. Therefore, distributing WUs with new algorithms always brings the risk of WU errors. To minimize these risks, I test a small batch of the new WUs before I distribute them. As usual, these tests ran without any errors; later on, some dDM members reported problems, which are documented here and here. The problem occurs in WUs that use Support Vector Machines with polynomial and Epanechnikov kernels for time series analysis. I thought about how to avoid such problems in the future and came up with the following idea. From now on, each WU that uses new algorithms is tagged with a beta flag. The new dDM scheduler distributes these beta WUs only to users who have agreed to receive beta applications. This way, we can test more of the new WUs on different platforms while non-beta testers are not involved at all. After a certain time without problems, the beta WUs will lose their beta state and will be distributed to all dDM members. If you are willing to act as a beta tester, please enable “Run test applications?” in your account settings. Keep in mind that being a beta tester means a higher probability of getting WUs that finish with an error; therefore, beta testers get credit for beta WUs even if they error out. If you don’t want to be a beta tester, please reject running test applications.
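The scheduler policy described above is simple to state: a beta-flagged WU may only go to a user who opted in, while regular WUs go to everyone. A minimal sketch (function names and the tuple format are hypothetical; the actual BOINC scheduler logic is more involved):

```python
def eligible_for_wu(wu_is_beta, user_accepts_beta):
    """Beta WUs go only to opted-in beta testers; regular WUs go to all."""
    return user_accepts_beta or not wu_is_beta

def assign_wus(queued_wus, user_accepts_beta):
    """Return the names of queued WUs this user may receive.
    Each WU is a (name, is_beta) tuple.
    """
    return [name for name, is_beta in queued_wus
            if eligible_for_wu(is_beta, user_accepts_beta)]
```

With this filter in place, a bad batch of new algorithms can only ever reach the opted-in testers.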
- DistributedDataMining: New AppVersion 421
20.05.2010 15:56
I’ve released AppVersion 4.21. I did some bug fixes and extended the trickle message features:
- The app is now able to react to a specific server command and send the current error logfile (stderr.txt) back to the server. This way, I will be able to check the state of WUs in case there are noticeable problems.
- The problem regarding error code -1073741819 seems to be solved. It was caused by the decompression (unzip) of the input files.
- There was a problem where WUs ended in an infinite loop. This should now be solved.
- I’ve changed the mechanism that detects WUs that make no progress over a long period of time.
- In addition, I made some performance improvements to the java wrapper: uncompressing of input files is done just once, and IO load is reduced by better data caching.
- DistributedDataMining: New AppVersion 419
13.05.2010 13:16
I’ve released AppVersion 4.19. The new version deals with trickle messages: it is able to send messages to and receive messages from the server. The messages are exchanged when the boinc client sends an update request to the server. I’ve increased the update frequency to ensure steady communication. This mechanism provides the following features:
- In order to monitor the behaviour of single WUs, the application reports the cputime and the progress of a WU in answer to a specific server request. This feature helps to detect failing WUs.
- The application reports the start of a new WU to the dDM server. Therefore, it’s now possible to abort downloaded but unstarted WUs by sending a dedicated message from the server to the application. This will be used in case of failing WUs. This feature is a consequence of this situation.
In addition, the handling of the memory problem (described here and here) was improved, so the progress of a WU shouldn’t slow down after a long runtime. Unfortunately, the new version brings some negative side effects: in some rare cases there is a problem with errors (0xc0000005 and -1073741819). So far, I don’t know the cause, but I am going to look into it. The error occurs at the start of a WU and doesn’t cost much cputime; credit for these failing WUs will be granted. In spite of these known problems, I’ve released this version anyway because the advantages outweigh the drawbacks.
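A status-reporting trickle message, as in the first feature above, might carry a payload along these lines. The tag names and format are invented for illustration; the post doesn’t specify the real message schema.

```python
def progress_trickle(wu_name, cpu_time_s, fraction_done):
    """Format a hypothetical trickle-up payload reporting a WU's CPU
    time and progress, as the 4.19 wrapper might send in answer to a
    server request.
    """
    return (
        "<ddm_status>\n"
        f"  <wu>{wu_name}</wu>\n"
        f"  <cpu_time>{cpu_time_s:.0f}</cpu_time>\n"
        f"  <fraction_done>{fraction_done:.4f}</fraction_done>\n"
        "</ddm_status>"
    )
```

The server side can then compare successive reports to spot WUs whose progress has stalled.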
- DistributedDataMining: New AppVersion 412
07.05.2010 22:24
Today, I’ve released AppVersion 4.12. Besides some bug fixes, version 4.12 addresses a problem related to the Java garbage collector. Long-running WUs consume a lot of memory, and memory management becomes more complex. At some point the Java memory manager (garbage collector) takes most of the CPU time and the experiment makes no further progress. This isn’t a general Java issue; there might be an error in the RapidMiner framework. To overcome this problem, the new Java wrapper divides the WU into several parts, and each part is processed iteratively by RapidMiner. The number of splits is handled dynamically, depending on the WU runtime and the current progress. The speed-up is considerable: the runtime of the current WUs (SVM learner with parameter optimization) is reduced to 25% of the original duration! Consequently, we save 75% of the computational power. All users are encouraged to abort waiting tasks assigned to earlier application versions.
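The dynamic splitting could be sketched as follows: pick the next chunk of work so that each RapidMiner run stays short enough that garbage collection never dominates. The real wrapper is written in Java and its exact policy isn’t given; the constants and function name here are illustrative assumptions.

```python
def next_chunk_size(total_items, done_items, elapsed_s, target_chunk_s=1800.0):
    """How many experiment items to hand RapidMiner in the next part.
    Uses the observed processing rate to size a chunk of roughly
    `target_chunk_s` seconds, so no single run grows large enough for
    JVM garbage collection to thrash.
    """
    if done_items == 0:
        return max(1, total_items // 10)  # conservative first chunk
    rate = done_items / elapsed_s  # items per second observed so far
    remaining = total_items - done_items
    return max(1, min(remaining, int(rate * target_chunk_s)))
```

Because each part starts with a fresh, small working set, the per-part runtime stays near-constant instead of degrading as the WU ages.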
- DistributedDataMining: New AppVersion 409
04.05.2010 09:35
Yesterday, I released AppVersion 4.09 for Windows and Linux. It brings the following new features:
- In order to avoid short-running WUs (less than a minute), the new version supports parameter optimization of the underlying learning algorithms. This makes it possible to perform different experiments on the same data: the experiment setup is the same, but the parameters used differ. As a consequence, the runtime of a WU is increased by a factor that depends on the number of different parameter settings. Currently the parameter optimization is done by a simple grid search, but it’s easily extensible to other approaches, e.g. genetic algorithms.
- In addition, the new version allows specifying the amount of RAM usable by the java processes. Hopefully, the better memory management will reduce the number of -226 errors.
- The new version sends more information back to the server. Besides the experiment results (output) and the optimization-related files (opt_final, opt_grid), logfiles (stderr.txt, stdout) will also be transmitted to the server in order to find the cause of the -226 errors.
- In former versions, the consumed CPU time was counted incorrectly if the java process was killed while the java wrapper was suspended. The reported cpu time and the resulting credit were much too high, and I had to correct them manually. The new version solves this problem.
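The simple grid search mentioned in the first feature above works roughly like this: enumerate every combination of candidate parameter values and keep the best-scoring one. This is a generic Python sketch, not dDM’s actual RapidMiner-based implementation; the scoring callback and parameter names are placeholders.

```python
from itertools import product

def grid_search(train_eval, param_grid):
    """Exhaustive grid search over parameter settings.
    `train_eval` maps a parameter dict to a score (higher is better);
    `param_grid` maps parameter names to lists of candidate values.
    Returns the best parameter dict and its score.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = train_eval(params)  # e.g. cross-validated accuracy
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Each grid point is one experiment on the same data, which is exactly why one WU now carries several experiments and runs correspondingly longer.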