What is scientific benchmarking?
Scientific benchmarking helps determine the precision, recall and other performance metrics of bioinformatics resources (typically software tools) in unbiased scenarios, which are set up through reference databases, ad-hoc input data and test datasets reflecting specific scientific challenges. The chosen metrics allow the relative scientific performance of the participating resources to be evaluated objectively.
To conduct scientific benchmarking in OpenEBench, you need to understand the following concepts:
- Participant: A software application (program, server or pipeline) that is to be evaluated and compared with other participants. The evaluation is done by comparing the participant's results with a reference dataset. All the participants that are compared using the same reference dataset are grouped in a challenge.
- Challenge: A challenge comprises one reference dataset and a set of evaluation metrics that will be applied to all the challenge participants.
- Reference Dataset: The set of data that contains the expected (ground-truth) results for a participant.
- Evaluation metrics: The set of metrics used to compare the results of the participants with the reference dataset, for example precision and recall (see the sketch after this list).
- Benchmarking Event: Challenges are organised in the context of a benchmarking event, which can contain more than one challenge. A benchmarking event is defined as a time-bound contest where a tool, pipeline, service or product, i.e. the participant, is compared against other participants using a predefined collection of reference datasets and assessment metrics.
- Community: The organised group behind the benchmarking.
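To illustrate how evaluation metrics compare participant results against a reference dataset, here is a minimal Python sketch. It is not part of OpenEBench itself, and the gene identifiers used are purely hypothetical; it only shows how precision and recall follow from the overlap between predicted and expected results:

```python
def precision_recall(predicted: set[str], reference: set[str]) -> tuple[float, float]:
    """Compare a participant's result set against a reference dataset."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical example: gene identifiers reported by a participant tool,
# evaluated against the challenge's reference dataset.
reference = {"BRCA1", "TP53", "EGFR", "MYC"}
predicted = {"BRCA1", "TP53", "KRAS"}

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.67 recall=0.50
```

All participants in the same challenge are scored with the same function against the same reference dataset, which is what makes their results directly comparable.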
Communities are the central concept for scientific benchmarking in OpenEBench. Scientific benchmarking is seen as a community-driven effort to improve research software in the field. Communities organise benchmarking events, which are composed of several challenges; each challenge contains all the participants that are evaluated together.
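The containment hierarchy described above can be made concrete with a short sketch. The following Python dataclasses are illustrative assumptions only, not OpenEBench's actual schemas; they simply mirror the Community → Benchmarking Event → Challenge → Participant nesting:

```python
from dataclasses import dataclass, field

@dataclass
class Participant:
    name: str  # a tool, server or pipeline being evaluated

@dataclass
class Challenge:
    reference_dataset: str                 # expected results for the participants
    metrics: list[str]                     # e.g. ["precision", "recall"]
    participants: list[Participant] = field(default_factory=list)

@dataclass
class BenchmarkingEvent:
    name: str
    challenges: list[Challenge] = field(default_factory=list)

@dataclass
class Community:
    name: str
    events: list[BenchmarkingEvent] = field(default_factory=list)
```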