- DistributedDataMining: New Research Topic
23.03.2012 16:05 Uhr
Hi there, we are currently in the final stage of releasing a new application that simulates evolutionary processes. Using the Multi-Agent Simulation of Evolution, we investigate the biological phenomenon of aposematism (also referred to as warning coloration). This term describes the evolutionary strategy of certain animal species to signal their unpalatability or toxicity to potential predators by developing skin colors and patterns that predators can easily perceive. Prominent examples of toxic animals with distinct warning coloration are poison dart frogs, coral snakes and fire salamanders. To tackle this research challenge, we developed a distributed multi-agent model that simulates the dynamic interactions of predator and prey populations over time. By systematically testing different adaptation and learning strategies for the agents and exploring the parameter space of our simulation model using the computational power of the dDM project, we might be able to deepen the understanding of aposematism and the evolutionary paths leading to it. For now, dDM members won’t get WUs of this new application: we are finishing our final tests and distributing these WUs to preselected hosts only. Soon, I’ll send out these WUs to beta test members. After the beta test, the new application will be available to all members. Cheers, Nico
- DistributedDataMining: Website Restructuring
22.03.2012 18:58 Uhr
Hi there, as you may have seen, I’ve restructured the dDM website to make it clearer. The main changes are:
- I reworked the start page, which now provides a brief overview of the dDM project, its history and its objectives.
- The research challenges went to a separate page.
- I also created a new page that briefly reports on the dDM achievements.
- The menu structure was revised. All items related to the member account were moved to the right.
- I also created four different color schemes, which members can freely choose (bottom left corner).
Your comments, suggestions and wishes are much appreciated. I am going to add some more minor web site features soon and would be happy to include some of your ideas. Furthermore, I’ll update and add some research related information. That’s it for now. Cheers, Nico
- DistributedDataMining: Download problems caused by hardware issues
26.02.2012 19:17 Uhr
Recently, we had some hard disk issues, and as a result the data files of our medical application are temporarily unavailable. We are going to reconstruct the data from the latest backup. Since this will take some hours, only a few workunits will be available in the meantime. We do not expect any problems regarding the user data (e.g. credits) because our dDM database is not affected. Best regards Nico
- DistributedDataMining: Security Update to Drupal 6.24
08.02.2012 15:07 Uhr
Due to some reported security vulnerabilities, I’ve updated the content management system of our dDM website to Drupal version 6.24. Please report any problems related to the dDM website in our forum. Best regards Nico
- DistributedDataMining: Server Maintenance
20.01.2012 11:18 Uhr
Since we are moving the dDM server to new hardware, the project website and all BOINC functions won’t be available on Sunday (2012/01/22). So far, we don’t expect any problems. There should be no need for any changes in the client configuration. Best regards Nico
- DistributedDataMining: Scientific contribution
09.01.2012 11:56 Uhr
As already announced in an earlier post, some new research results from our Medical Data Analysis application were presented at the 162nd Meeting of the Acoustical Society of America which was held in San Diego, CA from 31 Oct – 4 Nov 2011. Our poster contribution was titled “Identifying relevant analysis parameters for the classification of vocal fold dynamics” and received lots of positive feedback from conference attendees and initiated interesting and inspiring scientific discussions. In the presented work, we systematically investigated the influence of a set of control parameters on the classification accuracy of the automatic diagnosis system for vocal fold dynamics based on high-speed videos. The particular suitability of certain parameter combinations was revealed in this study, helping to further improve the practical application of our diagnostic framework. The poster was heavily based on the results that we got from the extensive experiments conducted with the massive help of the DistributedDataMining community. An abstract of this work can be found in the accompanying conference proceedings: J. Acoust. Soc. Am. Volume 130, Issue 4, pp. 2550-2550 (2011). Thanks a lot for your support! Best, Daniel
- DistributedDataMining: New WUs and MD5 download errors
11.12.2011 09:44 Uhr
Currently, I am generating new workunits for our medical application. It turned out that I was a bit careless, because as a consequence many members are now getting MD5 download errors. The reason is quite simple. During WU generation, files are copied to the server. Sometimes these files already exist because we use the same data files for multiple workunits. In these cases we just change the learning parameters or the experiment setup. Anyway, overwriting these existing files leads to problems, because the old and the new file have different MD5 checksums and hence all WUs that are related to the old files error out with an MD5 download error. About 30,000 WUs are affected. I’ll take care of it, but it might take a while to identify the failing WUs. There is a chance that the dDM members error out the affected old WUs in the meantime. In that case, the new WUs should work fine. Lesson learned: Never generate new WUs if you are in a hurry – in the end it takes much more time and causes preventable trouble. I am sorry for the inconvenience! Best regards, Nico
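To illustrate the overwrite problem described above, here is a minimal Python sketch of one way to avoid it. It is not the dDM generation script: the helper names `md5_of` and `stage_input_file` and the naming scheme for the new copy are assumptions made for this example. The idea is simply to compare checksums before copying and never to overwrite a file whose content differs.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_input_file(source: Path, download_dir: Path) -> Path:
    """Copy source into download_dir without overwriting different content.

    Hypothetical helper: if a file with the same name but a different
    checksum already exists, the new copy gets part of its checksum in
    its name, so workunits that reference the old file keep a matching MD5.
    """
    target = download_dir / source.name
    if not target.exists():
        shutil.copy2(source, target)
        return target
    if md5_of(target) == md5_of(source):
        return target  # identical content, safe to reuse
    new_digest = md5_of(source)
    versioned = download_dir / f"{source.stem}.{new_digest[:8]}{source.suffix}"
    shutil.copy2(source, versioned)
    return versioned
```

New workunits would then reference the path returned by `stage_input_file`, while old workunits continue to download the untouched original file.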
- DistributedDataMining: Let dDM benefit from your Christmas shopping
07.12.2011 16:16 Uhr
Hi there, As you know, the dDM project does not get financial support from universities, research institutes or private commercial organizations. Thus, we depend on private help to keep the project and our research running. Since the dDM website became part of the Amazon Associates Program, the dDM project can benefit from your Christmas shopping at Amazon’s online shop. Amazon rewards dDM with up to 6% of any issued gift certificate and up to 10% of each sold store item. If you intend to buy some Christmas gifts at Amazon, please remember dDM and follow the URL provided on this page. The provided URL links to the closest Amazon store in your region and contains our dDM partner ID. In case the wrong store is chosen, please select the right one from the given country list. The dDM project gets a reward for each item bought after using these links. As always, all Amazon rewards and PayPal donations are used to pay for the server rent, maintenance and internet traffic. Any remainder is used to finance our scientific publications or conference presentations. Perhaps we can even buy some new hardware to replace the aged project server. Thank you in advance for all of your generous support! Best regards and have a nice Christmas season! Nico
- DistributedDataMining: New application version 1.35 for Medical Data Analysis
05.12.2011 13:22 Uhr
Recently, I’ve released version 1.35 of our medical data analysis application. I am happy to announce that this version solves the problem of never-ending workunits. The new version has been out for almost a week, and so far no workunits have had to be killed manually by our dDM members. Best regards, Nico
- DistributedDataMining: New application version 1.34 for Medical Data Analysis
29.11.2011 12:00 Uhr
Today, I’ve released a new version of our medical application. Version 1.34 should be able to detect the Java location automatically, so it should no longer be necessary to adapt the PATH variable manually. This change was necessary because it seems that the PATH variable we used until now was changed by an automatic Windows update. As a consequence, our wrapper couldn’t find Java and all workunits failed. Some users might have received an email notification suggesting that they install Java even though Java was already installed on their computers. I am sorry for the confusion and the inconvenience this caused. The new version uses the variables JAVA_HOME, JRE_HOME and JDK_HOME, which are usually set during Java installation. I did some tests and it worked fine here. Let’s see how it works out there. Any feedback and error reports are welcome in the forum related to the medical application. Regards, Nico
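The lookup described above could look roughly like the following Python sketch. It is only an illustration, not the wrapper’s actual code: the function name `find_java`, the order in which the variables are checked, and the final fallback to PATH are assumptions made for this example.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional

# Variables named in the announcement; the lookup order is an assumption.
JAVA_ENV_VARS = ("JAVA_HOME", "JRE_HOME", "JDK_HOME")


def find_java(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the path of a java executable, or None if none is found."""
    env = os.environ if env is None else env
    exe_name = "java.exe" if os.name == "nt" else "java"
    for var in JAVA_ENV_VARS:
        home = env.get(var)
        if not home:
            continue
        candidate = Path(home) / "bin" / exe_name
        if candidate.is_file():
            return candidate
    # Assumed fallback: search PATH last, as the older versions did.
    on_path = shutil.which("java", path=env.get("PATH"))
    return Path(on_path) if on_path else None
```

Calling `find_java()` without arguments inspects the current process environment. Passing a dictionary makes the lookup easy to test without touching the real environment.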
- DistributedDataMining: Update on Medical Application
29.08.2011 12:38 Uhr
Hi there, Here’s a quick update on our medical application: the latest results look really good! Lots of interesting relationships between the analyzed input variables were revealed and relevant research questions could be answered. Parts of these results will be presented at a scientific conference in November. Expect more details on that soon! We’ve learned a lot from the results regarding the use of parameter optimization and feature selection – which now allows us to create much more efficient (and hopefully even more reliable) experiments. For example, we’re planning to apply sophisticated search methods from the field of evolutionary computation combined with a more statistically sound validation approach. For that purpose, we developed a new version of our application, which will run the new experiments much faster and will take over quite a lot of the data analysis steps that were done manually before. This new version is being tested extensively at the moment and will be available soon. But we’re also thinking about implementing completely new ideas and exploring different research directions. For example, one potential project will be focusing on agent-based modeling in the field of simulating the emergence of biological and social phenomena. Well, that’s it for now. Thanks again for your support! Best, Daniel
- DistributedDataMining: Pending credit
17.08.2011 10:32 Uhr
As you might have noticed, we had some minor problems regarding pending credits in the past. Today, I’ve fixed this issue. The problem was caused by a safety mechanism that is used to prevent benchmark cheating. The algorithm marks suspicious workunits for further inspection, and the affected workunits remained pending until I had checked them manually. Unfortunately, there was a slight error in the cheating detection heuristic, and as a result many more workunits than necessary were marked. Today, I found the problem and corrected the code. At first glance, there won’t be any more pending workunits.
- DistributedDataMining: New application version 5.01 for Time Series Analysis Application
16.08.2011 14:39 Uhr
Today, I’ve released version 5.01 of the Time Series Analysis Application. The new version uses the same wrapper technology that is already in use for the medical application. It overcomes several problems and decreases the fraction of failing workunits. So far, the new version supports 32-bit and 64-bit Linux systems only. Versions for Windows will be published soon. As always, comments and error reports are welcome in the forum.
- DistributedDataMining: Website translations powered by Google
11.08.2011 18:22 Uhr
Hi there, recently, some users suggested providing this website in different languages. Unfortunately, translating it into other languages would be a huge effort that I can’t manage on my own. Therefore, I was looking for a simple solution and found Google Translate. Today, I’ve added this nice feature to our dDM website in order to translate the content into different languages. From now on, you will find a Translate section in the top right corner. I hope this motivates new users to join the dDM community and helps them understand what we are trying to achieve. Any comments are welcome in the website issues forum. Best regards Nico
- DistributedDataMining: Publication announcement
21.07.2011 08:50 Uhr
Recently, we were notified that our latest scientific paper, titled “DenGraph-HO: Density-based Hierarchical Community Detection for Explorative Visual Network Analysis”, has been accepted. In the context of our Social Network Analysis sub-project, we developed the new DenGraph-HO algorithm, which is able to detect hierarchical communities in social networks. The paper will be presented at the Thirty-first SGAI International Conference on Artificial Intelligence (AI-2011) in December 2011. After the presentation, it will be published in the conference proceedings and on the dDM website. Without the contribution of our dDM members this work wouldn’t have been possible. Thank you for your support.
- DistributedDataMining: Shorter workunits for our medical application
06.06.2011 08:13 Uhr
Hi there, as already announced by Daniel, we are going to continue our research in the field of Medical Data Analysis. Last night, I generated about 100,000 new workunits for the medical application. We took into account the results of our latest poll and adapted our workunit structure. This time, the WU runtime is significantly shorter. Depending on your CPU, a single experiment needs between 30 and 90 minutes. During the next few days, the estimated WU runtime might be far off the actual runtime. We simply don’t have enough information yet to provide a valid estimate. Please let the WU finish anyway. The more valid WUs we get, the better our estimate for the following units will be. I guess the issue will be solved in two or three days. Thanks for your support.
- DistributedDataMining: Website changes
05.06.2011 17:47 Uhr
Recently, I made some changes to our dDM website in order to honour the efforts of our dDM members:
- The Member of the day
- and the Donors for our server infrastructure
are now prominently presented on the website. I’d like to thank all members for their contribution to the dDM project.
- DistributedDataMining: State of affairs
01.06.2011 13:54 Uhr
Hi there, This is Daniel, the guy who’s (more or less) responsible for the Medical Data Analysis application you’ve been all working on so impressively in the last months. First of all, I’d like to take this opportunity to greatly thank all of you for your massive support and the computational power you’re generously donating to this project! So far, thousands of experiments were successfully conducted with your help that would have taken me ages to do on my own. I really appreciate your work, bringing forward the whole project big time! Currently, I’m in the process of analyzing this huge amount of results you provided in order to make sense of it scientifically, since I’m working on a publication on that topic. As you can imagine, the analysis part will take some time – but a first glimpse at the data already revealed promising things. After having explored the data in a more “horizontal” manner up to now (investigating different combinations of parameters and configurations), in the next weeks I’d like to continue “vertically” with the experiments (testing the best combinations identified so far at increasing level of detail, allowing for more profound conclusions about the validity of our results). I’m sure that this will give really interesting insights. So I hope you are all set for the next round of the project and to get some numbers crunched… Best, Daniel
- DistributedDataMining: New workunits for the Time Series Analysis Application
10.05.2011 12:23 Uhr
Some of you might have noticed that there are again some workunits for the time series analysis application. During the weekend I generated about 35 thousand WUs. These WUs were processed before and got cancelled for different reasons, mainly because the client didn’t have Java installed; in some cases there was too little available memory. After these old WUs are finished, a new batch of WUs will be available. For the new batch we will use the same wrapper technology that is already in use for our medical application. In addition, we are going to integrate a newer version of the open source data mining suite RapidMiner.
- DistributedDataMining: Faster validation and less pending credit
02.05.2011 11:35 Uhr
Some members might remember that we had some problems regarding the CPU time counting in versions prior to 1.21 of the medical data analysis application. Since all WUs assigned to these faulty versions have been sent back to the dDM server, I decided to soften the safety mechanism that was responsible for plausibility checks. So far, a heuristic was used to hold back suspicious results for a manual check. As a consequence, the validation of affected results was delayed. From now on, the heuristic is less strict and the number of results that have to be checked by me will decrease significantly. As a result, validation will be faster and we will have less pending credit.
- DistributedDataMining: New application version 1.23 for Medical Data Analysis
27.04.2011 14:44 Uhr
Today, I’ve released version 1.23 of the Medical Data Analysis Application. Supported operating systems are Windows and Linux. Besides some bug fixes and minor changes to the error logging mechanism, the overall performance was improved by reducing the communication between the BOINC client and Java. Comments or error reports are welcome in the forum.
- DistributedDataMining: Server problems
17.03.2011 23:26 Uhr
Recently, we had some serious server trouble and the project went offline for a couple of hours. So far, I don’t know what exactly happened. I’ll look into it. As far as I can tell, we didn’t lose any data and the database is consistent. It was not necessary to restore the latest backup.
- DistributedDataMining: Wrong CPU-time counting in Medical Data Analysis
02.03.2011 22:44 Uhr
We are facing a problem regarding CPU-time counting. In some cases, the CPU time for a WU is not counted correctly. The BOINC manager then reports hundreds or thousands of CPU hours, and consequently the credit is much too high. This problem was briefly discussed in our dDM forum and I am working on solving this issue. In fact, I’ve recently released a couple of new application versions. It’s quite hard to find the error because it appears rarely and all my local tests work perfectly. In the meantime I’ve activated a safety mechanism: suspicious WUs are not credited automatically but are checked manually. From time to time, a WU with a wrong CPU time gets credit anyway. This happens because the safety mechanism uses only heuristics to find faulty WUs and doesn’t work in all cases. The credits of all affected WUs will be corrected at once, as soon as I’ve found the cause of the error. The latest version is 1.18. It should handle suspending/resuming correctly and also has some changes in the CPU time counting parts.
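The post does not say which heuristics the safety mechanism uses, so the following Python sketch is purely hypothetical. It shows one plausible kind of check: a result is held back for manual inspection if its reported CPU time is far above the batch median, or if it exceeds the wall-clock time the host had the workunit (assuming a single-threaded application). The `Result` fields, the `max_factor` threshold and the function name are all assumptions.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Tuple


@dataclass
class Result:
    name: str
    cpu_hours: float      # CPU time reported by the client
    elapsed_hours: float  # wall-clock time between sending and reporting


def split_for_review(
    results: List[Result], max_factor: float = 10.0
) -> Tuple[List[Result], List[Result]]:
    """Split results into (credit automatically, hold for manual check)."""
    if not results:
        return [], []
    typical = median(r.cpu_hours for r in results)
    auto, held = [], []
    for r in results:
        too_long = r.cpu_hours > max_factor * typical
        # Assumes one thread per workunit: CPU time cannot exceed wall time.
        impossible = r.cpu_hours > r.elapsed_hours
        (held if too_long or impossible else auto).append(r)
    return auto, held
```

A check like this is only a heuristic: a wrong CPU time that stays below both thresholds would still be credited automatically, which matches the behaviour described above.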
- DistributedDataMining: Errors in Medical Data Analysis – Application Version 1.10
15.02.2011 23:04 Uhr
Recently, I’ve released version 1.10 in order to overcome the resume/suspend problem. Even though we made some progress regarding suspending/resuming, other problems have occurred:
- Due to extensive logging, the error log file exceeds the upper size limit in some cases. The affected WUs won’t be uploaded to the dDM server and are marked as failures. I am going to grant the credit anyway.
- During suspending/resuming, CPU time is counted twice. As a result, the reported run times are way too high. A safety mechanism that I implemented on the server a couple of months ago gets activated and puts the uploaded WU on hold. In the web interface these WUs appear as ‘Pending’. I have to figure out how to handle this situation and how to correct the CPU time.
Good news: I’ve found the error that is responsible for the double CPU time counting. In addition, I’ve decreased the number of messages in the error log file. It shouldn’t exceed its limits any longer. The new version will be 1.11. I am going to release it today.
- DistributedDataMining: New application version for Medical Data Analysis
10.02.2011 02:42 Uhr
Today, I’ve released version 1.08 of our Medical Data Analysis Application. There are some minor changes:
- Improved error logging and handling
- Corrected CPU time counting
- Suspending/Resuming under Windows
- DistributedDataMining: New application for Medical Data Analysis
14.01.2011 23:31 Uhr
As already announced here, we continue Daniel’s research in the field of Medical Data Analysis. Therefore, we’ve implemented a new and more flexible Java wrapper. Now, after finishing our tests, a new application for laryngeal high-speed video classification is available for Windows and Linux operating systems. Please report any noticeable problems via the forum.
- DistributedDataMining: Team Challenge of The Knights Who Say Ni!
07.01.2011 22:17 Uhr
Recently, the team The Knights Who Say Ni! started a team challenge on our DistributedDataMining project. The challenge was originally announced here. So far, 15 team members are participating in order to support our research and to increase their team credit. Today, our thanks go especially to the team The Knights Who Say Ni! for supporting our research. Due to the expected higher server load, our dDM project might suffer from performance losses. We are constantly working on overcoming these issues. Please report any noticeable problems via the forum. We would like to emphasize the correct project URL http://www.distributeddatamining.org/DistributedDataMining and the need for Java. Further information for new dDM members can be found here.
- DistributedDataMining: New poll about your preferred WU runtime
20.12.2010 04:14 Uhr
As mentioned before, we are planning to continue our research in Medical Data Analysis. Our latest tests are promising and we’ve already released a small number of Linux test WUs to the public. The characteristics of the data and the new features of our latest RapidMiner wrapper make it possible to determine the runtime of the new workunits in advance. In order to find out which runtime our dDM members prefer, we’ve started a new poll. Please vote for your preferred WU runtime and help us meet your needs.
- DistributedDataMining: Press report about dDM
18.12.2010 06:52 Uhr
Recently, the popular German newspaper Handelsblatt briefly reported on the dDM project: http://www.handelsblatt.com/technologie/it-internet/verteiltes-rechnen-wenn-der-eigene-rechner-zur-alien-falle-wird;2680330;10#bgStart Besides SETI@home, Einstein@home and other well-known BOINC projects, our DistributedDataMining project is listed as one of the most popular Distributed Computing projects. We are proud of the publicity and the appreciation.
- DistributedDataMining: Book announcement
15.12.2010 12:27 Uhr
Today, it’s a great pleasure to announce the latest book by Dr. Daniel Voigt: Objective Analysis and Classification of Vocal Fold Dynamics from Laryngeal High-Speed Recordings. Aachen: Shaker Verlag GmbH; 2010. Daniel’s work in the field of Medical Data Analysis on laryngeal high-speed video classification was partially powered by the dDM project. As usual, the results of our efforts are publicly available: Daniel published his PhD thesis as a book with Shaker. Currently, we are planning to continue our research in this area. As soon as our final tests are finished, a new medical application will be available for all dDM members. Congratulations to Daniel and special thanks to the dDM community for supporting our research!