Mesostate

From CSBLwiki

(Difference between revisions)
(Mesostate)
 
(22 intermediate revisions not shown)
Line 2: Line 2:
| __TOC__  
| __TOC__  
|}
|}
 +
 +
<!--
==Concept==
==Concept==
*Protein structures (3D) can be parsed into a finite number of structural alphabets (e.g. a torsion angle of each residue)
*Protein structures (3D) can be parsed into a finite number of structural alphabets (e.g. a torsion angle of each residue)
Line 8: Line 10:
*Using these structural ''words'', 1) building a profile, 2) relating evolution of molecules (proteins & folds) and organismic phylogeny (genomes; a catalog of words, sentences..)
*Using these structural ''words'', 1) building a profile, 2) relating evolution of molecules (proteins & folds) and organismic phylogeny (genomes; a catalog of words, sentences..)
*Assumptions - ...
*Assumptions - ...
-
*Expectation - Multiple Birth (Birth and death) model??
+
*Expectation - Multiple Birth (Birth and death) model??-->
==Procedure==
==Procedure==
Line 16: Line 18:
===Torsion angles===
===Torsion angles===
*Structural alphabets based on torsion angle distributions in the [[:en:ramachandran plot|ramachandran plot]]
*Structural alphabets based on torsion angle distributions in the [[:en:ramachandran plot|ramachandran plot]]
-
<pre>
+
[[file:ramachandran.png|thumb|left|5,000 randomly picked residues per set: gray = all residues, red = sheets, green = helices]]
-
images to be posted...
+
 
-
</pre>
+
====Mesostate====
====Mesostate====
*Data production by [[배형섭]]
*Data production by [[배형섭]]
-
*Torsion angle mesostate [http://www.roselab.jhu.edu/ (by GD Rose Group)]
+
*[http://roselab.jhu.edu/dist/manual/meso_lett.html Torsion angle mesostate] by LINUS (Rose Lab)
-
http://www.pnas.org/content/102/45/16227/F1.large.jpg
+
[[file:F1.large.jpg|150px|thumb|left|Mesostate figure (the figure contains an error); source: www.pnas.org/content/102/45/16227/F1.large.jpg]]
-
[[file:F1.large.jpg|150px|Mesostate]]
+
-
[[배형섭]]
+
====Calculation====
====Calculation====
Line 40: Line 39:
==Status(result)==
==Status(result)==
-
*TBA
+
*Check your data with a Perl script (input: pdbstyle-1.75.new)
 +
**check.pl < pdbstyle-1.75.new > tmp.txt
 +
<pre>
 +
while(<>) {
 +
  chomp;
 +
  @tmp = split/\t/,$_;
 +
  if($tmp[8]=~/_/) { next }
 +
  if(scalar(@tmp)==9) { print $_,"\n" }
 +
}
 +
</pre>
 +
*txt2csv.pl < tmp.txt > tmp1.csv
 +
<pre>
 +
while(<>) {
 +
  chomp;
 +
  $_=~s/\[|\]//g;
 +
  @line = split/\s+/,$_;
 +
  $tmp = '"'.join("\"\,\"",@line)."\"";
 +
  print $tmp,"\n";
 +
}
 +
</pre>
 +
 
 +
*Using [[R]]
 +
<pre>
 +
meso = read.csv("tmp1.csv")
 +
save(meso, file="meso.rdata")   # save() needs the file= argument by name
 +
</pre>
 +
 
 +
<pre>
 +
## load saved R data
 +
load("meso.rdata")
 +
## analysis
 +
dim(meso) # dimension 11,810,116 residues
 +
meso[1:2,] # check first two rows in the data (list)
 +
dom = unique(meso$Domain)
 +
ndom = length(dom) # 65,485 SCOP domains
 +
nrow(meso)/ndom # average 180 residues (domain size)
 +
                # consider 1st, last residues are skipped..
 +
##
 +
## ramachandran plot
 +
##
 +
# randomly pick Phi & Psi for 5,000 residues
 +
rn = sample(nrow(meso),5000)
 +
png(file="ramachandran.png")
 +
plot(meso$Phi[rn],meso$Psi[rn],xlab="Phi",ylab="Psi",xlim=c(-180,180),ylim=c(-180,180),main="Ramachandran plot",col="gray")
 +
# randomly picking 5,000 helices
 +
rn = sample(which(meso$Structure=='H'),5000)
 +
points(meso$Phi[rn],meso$Psi[rn],col="green")
 +
# randomly picking 5,000 sheets
 +
rn = sample(which(meso$Structure=='E'),5000)
 +
points(meso$Phi[rn],meso$Psi[rn],col="red")
 +
dev.off()
 +
##
 +
</pre>
 +
[[file:ramachandran.png|thumb|left|5,000 randomly picked residues per set: gray = all residues, red = sheets, green = helices]]
 +
*6 x 6 bins
 +
<pre>
 +
library(ash)
 +
x = as.matrix(meso[,7:8])
 +
ab = matrix(c(-180,-180,180,180),2,2)
 +
nbin = c(6,6)
 +
bins = bin2(x,ab,nbin)
 +
-----
 +
> print(bins)
 +
       [,1]  [,2]    [,3]   [,4]   [,5]    [,6]
 +
[1,]  90545 18116   83067 120228 284531 1369188
 +
[2,] 100305 84717 3685965 485872 653175 2066098
 +
[3,]   4849 86683 1565006   5826  39358  294463
 +
[4,]  23516  9348    2594 135671  25882    3456
 +
[5,]  54491 12777  109935 234363  17244   41903
 +
[6,]  35495  3524   13284   6641   5944   34984
 +
</pre>
 +
[[file:binsplot.png|thumb|left|6x6 bins - heat color density]]
 +
 
==References==
==References==
 +
<biblio>
 +
#LFF pmid=14985506
 +
#FPP pmid=19188606
 +
</biblio>

Latest revision as of 02:24, 6 September 2010

Contents


Procedure

Standard data set

Torsion angles

5,000 randomly picked residues per set: gray = all residues, red = sheets, green = helices

Mesostate

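Data production by 배형섭

Torsion angle mesostate by LINUS (Rose Lab): http://roselab.jhu.edu/dist/manual/meso_lett.html

Mesostate figure (the figure contains an error); source: www.pnas.org/content/102/45/16227/F1.large.jpg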

Calculation

Alphabet assignment
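
As a rough illustration of turning torsion angles into a structural alphabet, the R sketch below maps each (Phi, Psi) pair to a two-letter code on a 60-degree grid. The labelling scheme (upper-case letter for the Phi bin, lower-case letter for the Psi bin) and the helper names `bin_index` and `label` are hypothetical; this is not the LINUS mesostate table.

```r
# Hypothetical labelling on a 60-degree grid; NOT the LINUS mesostate table.

# Map an angle in degrees to a bin number 1..(360/width).
bin_index <- function(angle, width = 60) {
  a <- ((angle + 180) %% 360) - 180   # wrap into [-180, 180)
  floor((a + 180) / width) + 1
}

# Two-letter code: upper case for the Phi bin, lower case for the Psi bin.
label <- function(phi, psi, width = 60) {
  paste0(LETTERS[bin_index(phi, width)], letters[bin_index(psi, width)])
}

codes <- label(c(-60, -120), c(-45, 130))
print(codes)   # "Cc" "Bf"
```

Wrapping the angle first means that a value of exactly 180 falls into the same bin as -180, so every residue receives a code.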

Profiling

Applications

Status(result)

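Check your data with a Perl script (input: pdbstyle-1.75.new)
check.pl < pdbstyle-1.75.new > tmp.txt
use strict;
use warnings;

# check.pl: keep only tab-separated records with exactly 9 fields
# whose 9th field does not contain an underscore
while (<>) {
  chomp;
  my @tmp = split /\t/, $_;
  next unless @tmp == 9;      # drop incomplete records
  next if $tmp[8] =~ /_/;     # drop records with "_" in the last field
  print $_, "\n";
}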
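txt2csv.pl < tmp.txt > tmp1.csv
use strict;
use warnings;

# txt2csv.pl: strip square brackets, split on whitespace, emit quoted CSV
while (<>) {
  chomp;
  s/\[|\]//g;                                  # remove "[" and "]"
  my @line = split /\s+/, $_;
  print '"', join('","', @line), '"', "\n";    # "f1","f2",...
}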
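
The two Perl steps can also be tried out in R on a few in-memory lines. The sketch below roughly mirrors them: keep records with exactly 9 tab-separated fields whose last field has no underscore, then strip square brackets and quote every whitespace-separated field. The sample lines and the helper names `keep_row` and `to_csv` are made up for illustration; the real column layout of pdbstyle-1.75.new is not shown on this page.

```r
# Hypothetical sample records; the real field layout is an assumption.
lines <- c(
  "d1abca_\t1\tA\t[ALA]\t-60.0\t-45.0\tH\tx\tDe",   # kept
  "d1abca_\t2\tG\t[GLY]\t-120.0\t130.0\tE\tx\t_",   # dropped: "_" in field 9
  "d1abca_\t3\tS\t[SER]\t-70.0"                     # dropped: fewer than 9 fields
)

# Same test as check.pl: exactly 9 tab-separated fields, no "_" in the 9th.
keep_row <- function(line) {
  f <- strsplit(line, "\t", fixed = TRUE)[[1]]
  length(f) == 9 && !grepl("_", f[9], fixed = TRUE)
}

# Same transformation as txt2csv.pl: drop brackets, split on whitespace, quote.
to_csv <- function(line) {
  line <- gsub("\\[|\\]", "", line)
  f <- strsplit(line, "[[:space:]]+")[[1]]
  paste0('"', f, '"', collapse = ",")
}

kept <- lines[vapply(lines, keep_row, logical(1), USE.NAMES = FALSE)]
csv  <- vapply(kept, to_csv, character(1), USE.NAMES = FALSE)
print(csv)
```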
Using R
meso = read.csv("tmp1.csv")
save(meso, file="meso.rdata")   # save() needs the file= argument by name
## load saved R data
load("meso.rdata")
## analysis
dim(meso) # dimensions: 11,810,116 residues (rows)
meso[1:2,] # check the first two rows of the data frame
dom = unique(meso$Domain)
ndom = length(dom) # 65,485 SCOP domains
nrow(meso)/ndom # average domain size: about 180 residues
                # (note: the first and last residues are skipped)
##
## ramachandran plot
##
# randomly pick Phi & Psi for 5,000 residues
rn = sample(nrow(meso),5000)
png(file="ramachandran.png")
plot(meso$Phi[rn],meso$Psi[rn],xlab="Phi",ylab="Psi",xlim=c(-180,180),ylim=c(-180,180),main="Ramachandran plot",col="gray")
# randomly pick 5,000 helix residues
rn = sample(which(meso$Structure=='H'),5000)
points(meso$Phi[rn],meso$Psi[rn],col="green")
# randomly pick 5,000 sheet residues
rn = sample(which(meso$Structure=='E'),5000)
points(meso$Phi[rn],meso$Psi[rn],col="red")
dev.off()
##
5,000 randomly picked residues per set: gray = all residues, red = sheets, green = helices
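
The per-class sampling above assumes that each structure class has at least 5,000 residues. A slightly more defensive variant, sketched here on a small synthetic data frame, returns all rows of a class when fewer than the requested number exist. Only the column names `Phi`, `Psi` and `Structure` follow the page; `meso_demo` and `pick_rows` are hypothetical names.

```r
set.seed(1)   # fixed seed so the sketch is reproducible

# Synthetic stand-in for `meso`; the values are random, not real torsion angles.
meso_demo <- data.frame(
  Phi = runif(200, -180, 180),
  Psi = runif(200, -180, 180),
  Structure = sample(c("H", "E", "C"), 200, replace = TRUE)
)

# Return at most n row indices of one structure class.
pick_rows <- function(df, class, n) {
  idx <- which(df$Structure == class)
  if (length(idx) <= n) return(idx)      # not enough rows: return them all
  idx[sample.int(length(idx), n)]        # index sampling avoids the sample(x) pitfall for length-1 x
}

helix <- pick_rows(meso_demo, "H", 50)
sheet <- pick_rows(meso_demo, "E", 5000)   # fewer than 5,000 exist, so all are returned
```

Sampling positions with `sample.int` rather than calling `sample` on the index vector avoids the case where a single remaining index `k` is silently treated as `1:k`.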
6 x 6 bins
library(ash)
x = as.matrix(meso[,7:8])               # columns 7 and 8 (presumably Phi and Psi)
ab = matrix(c(-180,-180,180,180),2,2)   # one row per variable: lower and upper limit
nbin = c(6,6)                           # 6 bins of 60 degrees per axis
bins = bin2(x,ab,nbin)
-----
> print(bins)
       [,1]  [,2]    [,3]   [,4]   [,5]    [,6]
[1,]  90545 18116   83067 120228 284531 1369188
[2,] 100305 84717 3685965 485872 653175 2066098
[3,]   4849 86683 1565006   5826  39358  294463
[4,]  23516  9348    2594 135671  25882    3456
[5,]  54491 12777  109935 234363  17244   41903
[6,]  35495  3524   13284   6641   5944   34984
6x6 bins, with counts shown as a heat-color density map
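
The same 6 x 6 grid can be cross-checked without the ash package. The sketch below bins synthetic angles with base R `cut` and `table`, then locates the most populated cell of the count matrix printed above. The synthetic angles are made up, and reading rows as Phi bins and columns as Psi bins is an assumption about how the matrix is oriented.

```r
set.seed(2)                                   # reproducible synthetic angles
phi <- runif(1000, -180, 180)
psi <- runif(1000, -180, 180)

breaks  <- seq(-180, 180, by = 60)            # 7 edges give 6 bins of 60 degrees
phi_bin <- cut(phi, breaks, right = FALSE, include.lowest = TRUE)
psi_bin <- cut(psi, breaks, right = FALSE, include.lowest = TRUE)
counts  <- table(phi_bin, psi_bin)            # 6 x 6: rows = Phi bins, columns = Psi bins
frac    <- counts / sum(counts)               # counts as fractions of all binned points

# Count matrix copied from the output above.
page_counts <- matrix(c(
   90545, 18116,   83067, 120228, 284531, 1369188,
  100305, 84717, 3685965, 485872, 653175, 2066098,
    4849, 86683, 1565006,   5826,  39358,  294463,
   23516,  9348,    2594, 135671,  25882,    3456,
   54491, 12777,  109935, 234363,  17244,   41903,
   35495,  3524,   13284,   6641,   5944,   34984
), nrow = 6, byrow = TRUE)

peak <- which(page_counts == max(page_counts), arr.ind = TRUE)
print(peak)                                   # row 2, column 3
peak_share <- max(page_counts) / sum(page_counts)
```

If rows do follow Phi and columns Psi, the peak cell covers Phi in [-120, -60) and Psi in [-60, 0).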

References

  1. [LFF] PMID 14985506
  2. [FPP] PMID 19188606