Mesostate

From CSBLwiki

Latest revision as of 02:24, 6 September 2010

Procedure

Standard data set

We will use the SCOP database, with sequences and structures taken from the ASTRAL compendium.

Torsion angles

Structural alphabets are based on torsion angle distributions in the Ramachandran plot.

Figure (ramachandran.png): 5,000 randomly picked residues (gray), overlaid with 5,000 sheet residues (red) and 5,000 helix residues (green)

Mesostate

Data production by 배형섭
Torsion angle mesostate (http://roselab.jhu.edu/dist/manual/meso_lett.html) by LINUS (Rose Lab)

Figure (F1.large.jpg): Mesostate map; note that it contains an error. Source: www.pnas.org/content/102/45/16227/F1.large.jpg

Calculation

The following tools can be used to calculate backbone torsion angles:
- MMTK (Molecular Modelling Toolkit, a Python library)
- DSSP (by Kabsch and Sander)
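
For reference, the quantity these tools report can be sketched directly. Phi is the dihedral angle over the atoms C(i-1), N(i), CA(i), C(i), and psi is the dihedral over N(i), CA(i), C(i), N(i+1). The function below is a minimal, dependency-free sketch of a signed dihedral calculation; the name `dihedral` and the coordinate tuples are illustrative assumptions, not part of the MMTK or DSSP interfaces.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points.

    For phi pass (C of residue i-1, N, CA, C); for psi pass
    (N, CA, C, N of residue i+1). Hypothetical helper, not a library call.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1 = sub(p1, p0)          # first bond vector
    b2 = sub(p2, p1)          # central bond (rotation axis)
    b3 = sub(p3, p2)          # last bond vector
    n1 = cross(b1, b2)        # normal of the plane p0-p1-p2
    n2 = cross(b2, b3)        # normal of the plane p1-p2-p3
    x = dot(n1, n2)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (not real atoms): a cis arrangement and a quarter turn
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, quarter)
```

The atan2 form is used because it returns the sign of the angle directly and stays stable near 0 and 180 degrees, where an arccos-based formula loses precision.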

Alphabet assignment

Information theory
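
The page does not say which information-theoretic measure is intended here. One candidate is the Shannon entropy of the bin occupancies, H = -sum(p_i * log2(p_i)), which measures how evenly residues spread over the structural alphabet. The sketch below applies it to the 6 x 6 counts reported under Status(result); the function name `shannon_entropy` is a hypothetical helper, and the choice of measure is an assumption.

```python
import math

# 6 x 6 bin counts copied from the Status(result) section of this page
COUNTS = [
    [90545, 18116, 83067, 120228, 284531, 1369188],
    [100305, 84717, 3685965, 485872, 653175, 2066098],
    [4849, 86683, 1565006, 5826, 39358, 294463],
    [23516, 9348, 2594, 135671, 25882, 3456],
    [54491, 12777, 109935, 234363, 17244, 41903],
    [35495, 3524, 13284, 6641, 5944, 34984],
]

def shannon_entropy(counts):
    """Shannon entropy in bits of a 2D table of non-negative counts."""
    total = sum(sum(row) for row in counts)
    entropy = 0.0
    for row in counts:
        for c in row:
            if c > 0:                 # empty bins contribute nothing
                p = c / total
                entropy -= p * math.log2(p)
    return entropy

h = shannon_entropy(COUNTS)
h_max = math.log2(36)                 # entropy of 36 equally used bins
print("entropy: %.3f bits (maximum %.3f bits)" % (h, h_max))
```

Comparing the result with the maximum of log2(36) bits indicates how far the observed alphabet usage is from uniform.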

Profiling

Normalization?
Distance metric?

Applications

Status(result)

Check your data (pdbstyle-1.75.new) with a Perl script
- check.pl < pdbstyle-1.75.new > tmp.txt

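# check.pl: keep only rows that have exactly 9 tab-separated columns
# and whose 9th column does not contain "_"
use strict;
use warnings;
while (my $row = <>) {
  chomp $row;
  my @tmp = split /\t/, $row;
  next if scalar(@tmp) != 9;   # skip incomplete rows
  next if $tmp[8] =~ /_/;      # skip rows with "_" in column 9
  print $row, "\n";
}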

txt2csv.pl < tmp.txt > tmp1.csv

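# txt2csv.pl: remove square brackets, then write the
# whitespace-separated fields as a quoted, comma-separated row
use strict;
use warnings;
while (my $row = <>) {
  chomp $row;
  $row =~ s/[\[\]]//g;                      # drop "[" and "]"
  my @fields = split /\s+/, $row;           # split on runs of whitespace
  print '"', join('","', @fields), "\"\n";  # "a","b","c"
}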

Using R

meso = read.csv("tmp1.csv")
save(meso, file="meso.rdata")  # 'file' must be named; otherwise save() fails

## load saved R data
load("meso.rdata")
## analysis
dim(meso) # 11,810,116 rows, one per residue
meso[1:2,] # check the first two rows of the data
dom = unique(meso$Domain)
ndom = length(dom) # 65,485 SCOP domains
nrow(meso)/ndom # about 180 residues per domain on average
                # (note: the 1st and last residues are skipped)
##
## ramachandran plot
##
# randomly pick 5,000 residues and plot their Phi & Psi
rn = sample(nrow(meso),5000)
png(file="ramachandran.png")
plot(meso$Phi[rn],meso$Psi[rn],xlab="Phi",ylab="Psi",xlim=c(-180,180),ylim=c(-180,180),main="Ramachandran plot",col="gray")
# randomly picking 5,000 helices
rn = sample(which(meso$Structure=='H'),5000)
points(meso$Phi[rn],meso$Psi[rn],col="green")
# randomly picking 5,000 sheets
rn = sample(which(meso$Structure=='E'),5000)
points(meso$Phi[rn],meso$Psi[rn],col="red")
dev.off()
##

Figure (ramachandran.png): 5,000 randomly picked residues (gray), overlaid with 5,000 sheet residues (red) and 5,000 helix residues (green)

6 x 6 bins

library(ash)
x = as.matrix(meso[,7:8])
ab = matrix(c(-180,-180,180,180),2,2)
nbin = c(6,6)
bins = bin2(x,ab,nbin)
## console output: bin2() returns a list; the count matrix is bins$nc
> print(bins$nc)
       [,1]  [,2]    [,3]   [,4]   [,5]    [,6]
[1,]  90545 18116   83067 120228 284531 1369188
[2,] 100305 84717 3685965 485872 653175 2066098
[3,]   4849 86683 1565006   5826  39358  294463
[4,]  23516  9348    2594 135671  25882    3456
[5,]  54491 12777  109935 234363  17244   41903
[6,]  35495  3524   13284   6641   5944   34984

Figure (binsplot.png): 6 x 6 bins, heat-color density
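
The binning above also gives a simple way to assign one symbol per residue. The following is a minimal sketch, assuming 60-degree bins over [-180, 180) with Phi as the row index and Psi as the column index, as in the table. The 36-character `ALPHABET` and the helper names are hypothetical; they are not the LINUS mesostate letters. Wrapping an angle of exactly 180 to the first bin is also an assumption of this sketch.

```python
import string

NBIN = 6                     # bins per axis, as in the 6 x 6 table
WIDTH = 360.0 / NBIN         # 60 degrees per bin
ALPHABET = string.ascii_uppercase + string.digits   # 36 hypothetical symbols

def bin_index(angle):
    """Map an angle in degrees to a bin index 0..NBIN-1 over [-180, 180)."""
    wrapped = (angle + 180.0) % 360.0    # 180 wraps around to -180 (assumption)
    return min(int(wrapped // WIDTH), NBIN - 1)

def letter(phi, psi):
    """One symbol per residue: row = Phi bin, column = Psi bin."""
    return ALPHABET[bin_index(phi) * NBIN + bin_index(psi)]

def count_bins(pairs):
    """Count (phi, psi) pairs into an NBIN x NBIN table of bin occupancies."""
    counts = [[0] * NBIN for _ in range(NBIN)]
    for phi, psi in pairs:
        counts[bin_index(phi)][bin_index(psi)] += 1
    return counts

# Toy angles (illustrative values, not taken from the data set)
residues = [(-60.0, -45.0), (-120.0, 130.0), (-60.0, -45.0)]
word = "".join(letter(phi, psi) for phi, psi in residues)
table = count_bins(residues)
print(word)
```

Indices here are zero-based, so R's bin [2,3] corresponds to `table[1][2]` in this sketch.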

References

LFF: PMID 14985506
FPP: PMID 19188606

