Dataset group

From CoGepedia

In CoGe's database, the dataset group table permits several datasets to be joined together to create one higher-order unit of data. In practice, such a group of datasets constitutes a genome for a particular organism. The term "dataset group" is still used for historical reasons, but the same unit is called a "genome" in CoGe and its documentation. A dataset group may contain a single dataset or multiple datasets, depending on how the genome was represented by its data files. For example, a large multi-chromosome genome often has each chromosome, with its associated genomic features and annotations, stored in a separate file; in other cases, all the data is stored in a single file. Tracking these data files separately as datasets and joining them together through dataset groups lets CoGe keep track of the provenance of the data. Dataset groups can also bring together data from different sources to create a genome within CoGe, for example when a separate research group characterizes a specific class of genomic features that was not described in the original datasets.
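The relationship described above can be illustrated with a minimal sketch in Python. This is not CoGe's actual schema or code: the class names, field names, file names, and source labels below are all hypothetical and chosen only to show how several separately tracked datasets might be joined into one genome while their origins stay visible.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    """One tracked data file (hypothetical fields, not CoGe's schema)."""
    name: str                   # file name, made up for illustration
    source: str                 # who provided the file
    chromosomes: tuple = ()     # chromosomes the file covers


@dataclass
class DatasetGroup:
    """A genome: a collection of datasets for one organism."""
    organism: str
    datasets: list = field(default_factory=list)

    def add(self, dataset: Dataset) -> None:
        self.datasets.append(dataset)

    def sources(self) -> set:
        # Provenance: every distinct source that contributed data.
        return {d.source for d in self.datasets}

    def chromosomes(self) -> list:
        # Union of chromosomes across all member datasets, sorted.
        return sorted({c for d in self.datasets for c in d.chromosomes})


# One file per chromosome from the original provider, plus an extra
# feature set contributed later by a different (hypothetical) group.
genome = DatasetGroup(organism="Example organism")
genome.add(Dataset("chr1.gbk", "Sequencing center A", ("1",)))
genome.add(Dataset("chr2.gbk", "Sequencing center A", ("2",)))
genome.add(Dataset("extra_features.gff", "Research group B", ("1", "2")))
```

Because each file remains its own dataset, a call such as `genome.sources()` can still report which provider contributed which data, even though the group as a whole is treated as a single genome.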