Breakthroughs in genomic and proteomic data generation have led to the development and emergence of several disciplines. Up to 1 trillion bases can be sequenced in one 6-day run on an Illumina HiSeq 2500 (Rhoads & Au, 2015), while mass spectrometry can now completely analyse a proteome and quantify protein concentrations in an entire organism (Ahrne et al., 2015). Massive data are now generated affordably, easily and frequently by genomic and proteomic assays such as massively parallel sequencing and high-throughput mass spectrometry.

Due to the size of the data, a high performance computer (HPC) is required for analysis and interpretation. However, an HPC is expensive and difficult to access. Other means, such as cloud computing services and grid computing systems, were developed to give researchers the power of an HPC without the need to purchase and maintain one. In this study, we implemented grid computing in a computer training center environment, using the Berkeley Open Infrastructure for Network Computing (BOINC) as a job distributor and data manager that combines all desktop computers into a virtual HPC. Fifty desktop computers were used to set up the grid system during off-hours. To test the performance of the grid system, we adapted the Basic Local Alignment Search Tool (BLAST) to the BOINC system. Sequencing results from the Illumina platform were aligned to the human genome database by BLAST on the grid system. The results and processing time were compared with those from a single desktop computer and an HPC. The estimated durations of BLAST analysis for 4 million sequence reads on a desktop PC, the HPC and the grid system were 568, 24 and 5 days, respectively. Thus, the grid implementation of BLAST by BOINC is an efficient alternative to the HPC for sequence alignment. The grid implementation by BOINC also helped tap unused computing resources during off-hours and could be easily modified for other available bioinformatics software.
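To make the grid setup above more concrete, here is a minimal Python sketch of how sequence reads might be batched into independent work units before a job distributor hands them to desktop machines, together with the speedups implied by the reported durations. The function names (`read_fasta`, `make_work_units`, `speedup`) and the batch size are illustrative assumptions, not the study's actual code, and no BOINC or BLAST API is called here.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def make_work_units(records: Iterable[Record],
                    reads_per_unit: int) -> Iterator[List[Record]]:
    """Group reads into fixed-size batches (hypothetical work units).

    Each batch can be aligned independently, which is what makes the
    workload easy to spread over many desktop machines.
    """
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == reads_per_unit:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield batch


def speedup(baseline_days: float, system_days: float) -> float:
    """Ratio of baseline duration to the duration on another system."""
    return baseline_days / system_days


if __name__ == "__main__":
    # Durations quoted in the text: desktop PC, HPC, grid (in days).
    desktop, hpc, grid = 568, 24, 5
    print(f"grid vs desktop: {speedup(desktop, grid):.1f}x")
    print(f"grid vs HPC:     {speedup(hpc, grid):.1f}x")
```

With the figures quoted above, the grid comes out roughly 114 times faster than the single desktop PC and about 4.8 times faster than the HPC. The text does not say how many reads went into each work unit, so `reads_per_unit` would have to be tuned to the machines available.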
Development of high-throughput technologies, such as next-generation sequencing, allows thousands of experiments to be performed simultaneously while reducing resource requirements. Consequently, a massive amount of experimental data is now rapidly generated. Nevertheless, the data are not readily usable or meaningful until they are further analysed and interpreted.

In this study, we investigate a strategy to accelerate the data assimilation (DA) algorithm. Based on evaluations of the computational time, the analysis step of the assimilation turns out to be the most expensive part. After a study of the characteristics of the ensemble ash state, we propose a mask-state algorithm which records the sparsity information of the full ensemble state matrix and transforms the full matrix into a relatively small one. This reduces the computational cost of the analysis step. Experimental results show that the mask-state algorithm significantly speeds up the analysis step. Subsequently, the total computing time for volcanic ash DA is reduced to an acceptable level. The mask-state algorithm is generic and thus can be embedded in any ensemble-based DA framework. Moreover, ensemble-based DA with the mask-state algorithm is promising and flexible, because it implements exactly the standard DA without any approximation and achieves satisfactory performance without any change to the full model.

Delivering results for BOINC-based projects in a fast, on-demand manner is difficult to achieve using volunteers; at the same time, using scalable cloud resources for short, on-demand projects can optimize the use of the available resources. We show how this design can be used as an efficient distributed computing platform within the cloud.
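The mask-state idea described above (record which rows of the ensemble state matrix are non-zero, work on the smaller matrix, then restore the full shape) can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the analysis update is taken to be a right-multiplication of the n x N ensemble matrix by an N x N transform, which the text does not spell out. All function names and the example transform are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def build_mask(ensemble: Matrix) -> List[int]:
    """Indices of state rows where at least one member is non-zero."""
    return [i for i, row in enumerate(ensemble)
            if any(value != 0.0 for value in row)]


def compress(ensemble: Matrix, mask: List[int]) -> Matrix:
    """Keep only the masked (non-zero) rows."""
    return [ensemble[i] for i in mask]


def expand(reduced: Matrix, mask: List[int],
           n_rows: int, n_members: int) -> Matrix:
    """Scatter the reduced rows back into a full-size zero matrix."""
    full = [[0.0] * n_members for _ in range(n_rows)]
    for i, row in zip(mask, reduced):
        full[i] = list(row)
    return full


def apply_transform(states: Matrix, transform: Matrix) -> Matrix:
    """Right-multiply states (rows x N) by transform (N x N).

    Assumption: the analysis step can be written in this form, so
    all-zero rows stay zero and can be skipped without approximation.
    """
    n_members = len(transform)
    return [[sum(row[k] * transform[k][j] for k in range(n_members))
             for j in range(n_members)]
            for row in states]


def masked_analysis(ensemble: Matrix, transform: Matrix) -> Matrix:
    """Run the (assumed) analysis update on the non-zero rows only."""
    mask = build_mask(ensemble)
    reduced = compress(ensemble, mask)
    updated = apply_transform(reduced, transform)
    return expand(updated, mask, len(ensemble), len(transform))


if __name__ == "__main__":
    # Four state entries, three ensemble members; two rows carry no ash.
    ensemble = [[0.0, 0.0, 0.0],
                [1.0, 2.0, 3.0],
                [0.0, 0.0, 0.0],
                [0.0, 4.0, 0.0]]
    transform = [[0.5, 0.25, 0.25],
                 [0.25, 0.5, 0.25],
                 [0.25, 0.25, 0.5]]
    full = apply_transform(ensemble, transform)
    masked = masked_analysis(ensemble, transform)
    print(masked == full)
```

Under that assumption the masked update matches the full update while touching only the rows that contain ash, which is consistent with the claim that the algorithm reproduces the standard DA without approximation. The saving grows with the fraction of all-zero rows.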
Volunteer or crowd computing is becoming increasingly popular for solving complex research problems from an increasingly diverse range of areas. The majority of these projects have been built using the Berkeley Open Infrastructure for Network Computing (BOINC) platform, which provides a range of services to manage all computational aspects of a project. The BOINC system is ideal in cases where the research community involved needs low-cost access to massive computing resources and where there is also significant public interest in the research being done. We discuss the way in which cloud services can help BOINC-based projects deliver results in a fast, on-demand manner.