Data security model

From CoGepedia
Revision as of 15:06, 26 March 2014 by Elyons (Talk | contribs)

CoGe's genomes database schema. This database can store multiple genomes from multiple organisms. The data is partitioned into four major sections: genome sequence information (yellow), genomic feature information (blue), dataset information (green), and the user model (orange).

Overview

CoGe's data security model relies on UserGroups, which bind together a set of users and a set of genomes.

Details

The security model is represented by the orange tables in CoGe's database schema. Users may belong to UserGroups, and each UserGroup may be granted access to particular datasets and genomes. Each UserGroup has a role (Owner, Editor, or Reader) that determines what its members may do with those genomes.
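The relationship between users, UserGroups, roles, and genomes can be sketched as follows. This is a minimal illustration, not CoGe's actual implementation: the class and function names are hypothetical, and the assumption that the roles are strictly ordered (Owner above Editor above Reader) is not stated on this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


class Role(IntEnum):
    # Assumption: roles are ordered, so a higher role includes the lower ones.
    READER = 1
    EDITOR = 2
    OWNER = 3


@dataclass
class UserGroup:
    """Hypothetical stand-in for a UserGroup row: one role, many users, many genomes."""
    name: str
    role: Role
    users: set[str] = field(default_factory=set)
    genome_ids: set[int] = field(default_factory=set)


def can_access(user: str, genome_id: int, needed: Role, groups: list[UserGroup]) -> bool:
    """Return True if any group containing the user grants at least `needed` on the genome."""
    return any(
        user in g.users and genome_id in g.genome_ids and g.role >= needed
        for g in groups
    )


# Example data (made up for illustration).
groups = [
    UserGroup("lab", Role.EDITOR, {"alice", "bob"}, {101, 102}),
    UserGroup("collaborators", Role.READER, {"carol"}, {101}),
]
```

With this data, `can_access("carol", 101, Role.READER, groups)` is true, while the same call with `Role.EDITOR` is false, because carol's only group holds the Reader role.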

Data Specifics

Primary Sequence and Experiment Data

All primary sequence data and experimental data (e.g., transcriptomes and SNPs) are stored in directories that are not web accessible. The service responsible for retrieving sequence data requires authentication for every transaction involving privileged or restricted-access data. In addition, all data are stored without any information identifying the organism from which they were derived.

Derivative Sequence Data

These data include FASTA sequences derived from the primary sequence data (e.g., CDS sequences), results from whole-genome comparative analyses, processed experiments, etc. These data are stored in directories that are not web accessible. The service responsible for retrieving sequence data requires authentication for every transaction involving privileged or restricted-access data.
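The per-transaction authentication described above can be illustrated with a short sketch. Everything here is an assumption made for illustration: the `Genome` record, the `retrieve` function, and the `verify` and `load` callbacks are hypothetical names and do not describe CoGe's actual service. The point is only that the check runs on every call for restricted data, rather than once per session.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Genome:
    """Hypothetical record: an opaque numeric ID and a restricted flag."""
    genome_id: int
    restricted: bool


def retrieve(
    genome: Genome,
    credentials: Optional[str],
    verify: Callable[[str, int], bool],
    load: Callable[[int], str],
) -> str:
    """Return the sequence, authenticating on each call when the genome is restricted."""
    if genome.restricted:
        # Checked on every transaction; nothing is cached between calls.
        if credentials is None or not verify(credentials, genome.genome_id):
            raise PermissionError(f"access denied for genome {genome.genome_id}")
    # Data is looked up by ID only, from storage that is not web accessible.
    return load(genome.genome_id)
```

Passing `verify` and `load` in as callbacks keeps the sketch independent of any particular authentication provider or storage layout.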

Backups

All primary sequence data, all experiments, and all metadata stored in CoGe's main relational database are backed up daily. The backup process uses irsync (from iRODS). Daily backups are kept for a week, weekly backups for a month, and monthly backups for 6 months. The primary CoGe server's hard drives are configured as RAID 6. Backups are kept in iRODS and are multiply redundant (multiple servers replicate the data among separate data centers in different US states).
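The retention schedule can be expressed as a small rule that decides whether a given backup should still be kept. This is a sketch under stated assumptions, not the script CoGe runs: the function name is hypothetical, and the choices that a "weekly" backup is the one taken on Sunday, that a "monthly" backup is the one taken on the first of the month, and that a week, a month, and 6 months mean 7, 30, and 183 days are all assumptions.

```python
from datetime import date

# Assumed window lengths in days; the page only says "a week", "a month", "6 months".
DAILY_DAYS = 7
WEEKLY_DAYS = 30
MONTHLY_DAYS = 183


def keep_backup(taken: date, today: date) -> bool:
    """Return True if a backup taken on `taken` should still be retained on `today`."""
    age = (today - taken).days
    if age < 0:
        raise ValueError("backup date is in the future")
    if age <= DAILY_DAYS:
        return True  # every daily backup is kept for a week
    if age <= WEEKLY_DAYS and taken.weekday() == 6:
        return True  # assumption: the Sunday backup serves as the weekly one
    if age <= MONTHLY_DAYS and taken.day == 1:
        return True  # assumption: the first-of-month backup serves as the monthly one
    return False
```

For example, with `today = date(2014, 3, 26)`, a backup from Sunday 16 March 2014 is kept under the weekly rule, while one from Monday 17 March 2014 is no longer retained.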

User Accounts

CoGe does not manage user account information itself. User management and authentication are provided by iPlant. CoGe keeps a minimal amount of information about each user, including real name, user name, and email address. If you would like your user account removed from CoGe, please contact us.