Evolutionary age

From CSBLwiki

(Difference between revisions)
(Tree data)
(Pfam data (24.0))
Line 12: Line 12:
*Loading into MySQL
**Not all DBs were imported (only a few key DBs for this study)
- **pfamseq, genome_seqs, ncbi_taxonomy, genome_species, pfamA, gene_ontology
+ **pfamseq, genome_seqs (not available), ncbi_taxonomy, genome_species (not available), pfamA, gene_ontology
+ **Release 24 - pfamseq, pfamA, ncbi_taxonomy, taxonomy, pfamA_reg_full_significant
+ ***in_full should be "1"
+ <pre>
+ The pfamA_reg_full_significant and pfamA_reg_full_insignificant tables contain, as the names suggest, the significant and insignificant data respectively. Significant hits are those with a bits score above the curated threshold for the family, whilst insignificant matches are those that score below the curated threshold. With respect to the tables that contain significant data (pfamA_reg_full_significant and pfamA_reg_full), there is an extra column called 'in_full'. The matches that are present in the full alignment for a Pfam family have this column set to 1, while those that are not present in the full alignment have the 'in_full' column set to 0. Where there is an overlapping fragment match and a full length match to the same Pfam-A family, only one of the matches will be present in the full alignment for that Pfam-A family. </pre>
<pre>
<pre>
mysql -u user -p

Revision as of 06:19, 16 August 2010


Evolutionary age of protein domains

(Based on this reference)

  1. PMID 16959887

Pfam data (24.0)

# download the complete database dump (estimated ~2 days)
wget -c ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/database.tar &
The pfamA_reg_full_significant and pfamA_reg_full_insignificant tables contain, as the names suggest, the significant and insignificant data respectively. Significant hits are those with a bits score above the curated threshold for the family, whilst insignificant matches are those that score below the curated threshold. With respect to the tables that contain significant data (pfamA_reg_full_significant and pfamA_reg_full), there is an extra column called 'in_full'. The matches that are present in the full alignment for a Pfam family have this column set to 1, while those that are not present in the full alignment have the 'in_full' column set to 0. Where there is an overlapping fragment match and a full length match to the same Pfam-A family, only one of the matches will be present in the full alignment for that Pfam-A family. 
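# create the database from the MySQL prompt, then quit
mysql -u user -p
mysql> CREATE DATABASE pfam24; \q
# create the pfamseq table from its schema file
mysql -u user -p pfam24 < FULL_PATH/pfamseq.sql
# load the table data from the MySQL prompt
mysql -u user -p
mysql> USE pfam24
mysql> LOAD DATA LOCAL INFILE 'pfamseq.txt' INTO TABLE pfamseq FIELDS ENCLOSED BY "\'";
# run a script of statements non-interactively, in the background
mysql -u user -p'passwd' database < loadscript.sql &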
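
The contents of loadscript.sql are not shown on this page. As a minimal sketch, assuming it holds one LOAD DATA statement per imported table and that each table's dump is named <table>.txt (as pfamseq.txt is for pfamseq), it could be generated for the Release 24 tables listed in the notes like this:

```shell
#!/bin/sh
# Sketch only: write a loadscript.sql with one LOAD DATA statement per table.
# Assumption: each table's data file is named <table>.txt, like pfamseq.txt.
tables="pfamseq pfamA ncbi_taxonomy taxonomy pfamA_reg_full_significant"

: > loadscript.sql   # start from an empty file
for t in $tables; do
  # Unquoted here-document: $t expands, the backslash-quote is kept verbatim.
  cat >> loadscript.sql <<EOF
LOAD DATA LOCAL INFILE '$t.txt' INTO TABLE $t FIELDS ENCLOSED BY "\'";
EOF
done
```

Each table's schema file would still have to be loaded first, as is done for pfamseq.sql. Once pfamA_reg_full_significant is loaded, keeping only the matches present in the full alignment amounts to filtering on in_full = 1.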

Tree data

Results

Procedure
