Human Protein Tyrosine Phosphatase Sequence Similarity Network
The protein tyrosine phosphatase (PTP) family has a central role in signal transduction by controlling
the phosphorylation state of tyrosine, serine and threonine residues.
The power graph of the protein tyrosine phosphatase homology network is shown below. The
network consists of 279 nodes, each one representing a protein. An edge between two proteins corresponds
to a highly significant alignment of their sequences, with a BLASTP E-value of at most 10^-46. PTPs are
usually classified into classical specific phosphatases, dual specificity phosphatases, and other minor classes,
such as low molecular weight phosphatases and myotubularins. Classical specific phosphatases are further
subdivided into receptor type and non-receptor type. Unsurprisingly, because of their sequence similarities,
the categories of receptor, non-receptor, and dual-specificity phosphatases are delineated by the power graph
representation. For example, the receptor-type PTPs are grouped in one power node, signifying that they are all
similar to one another with E-values of at most 10^-46. The same holds for the different classes of non-receptor-type PTPs
and for other classes, such as myotubularins. Interestingly, the different classes of receptor PTPs, such as types A, B,
C, D, F, H and T, are discriminated solely on the basis of shared similarity to non-receptor PTPs.
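The network construction described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used for the figure: it assumes the BLASTP hits are already available as (query, subject, E-value) triples, and the function name, protein identifiers and E-values are all hypothetical. Only the cutoff of 10^-46 comes from the text.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-46  # threshold quoted in the text


def build_similarity_network(hits, cutoff=E_VALUE_CUTOFF):
    """Build an undirected adjacency map from (query, subject, evalue) hits.

    Hypothetical helper: keeps a link whenever the E-value is at most
    the cutoff, and ignores self-hits.
    """
    adjacency = defaultdict(set)
    for query, subject, evalue in hits:
        if query == subject:
            continue  # a protein aligned to itself carries no information
        if evalue <= cutoff:
            adjacency[query].add(subject)
            adjacency[subject].add(query)
    return dict(adjacency)


# Made-up hits, for illustration only.
example_hits = [
    ("PTP_A", "PTP_B", 1e-120),
    ("PTP_B", "PTP_A", 3e-118),  # reciprocal hit, same undirected edge
    ("PTP_A", "PTP_C", 1e-30),   # not significant enough, dropped
    ("PTP_C", "PTP_C", 0.0),     # self-hit, dropped
    ("PTP_B", "PTP_D", 1e-46),   # exactly at the cutoff, kept
]

network = build_similarity_network(example_hits)
# Each undirected edge is stored twice in the adjacency map.
edge_count = sum(len(neighbours) for neighbours in network.values()) // 2
print(edge_count)
```

Storing neighbours in sets makes reciprocal hits collapse into a single undirected edge, which matches the way the text counts edges between pairs of proteins.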
The cross-links between different regions of the hierarchy constitute a new insight with respect to traditional
clustering methods. For example, a group of 4 type G receptor PTPs is linked by a power edge
to two type 22 non-receptor PTPs. (These are the only nodes with labels.)
While the common PTP domains are aligned for the six sequences, we also observe that the second copy
of the tyrosine phosphatase domain of the two type G PTPs aligns to an unannotated region of about 370
amino acids with a sequence identity of 14% and a similarity of 39% (BLOSUM 62). This region corresponds
with high probability (NorMD = 1.014) to a non-receptor phosphatase domain listed in ProDom,
a database of automatically generated clusters of homologous sequence fragments.
This result suggests that the second phosphatase domain
of type 22 PTPs was eroded through the accumulation of mutations following a release of selection pressure.
The detection of similarity cross-links in the hierarchy is the contribution of Power Graph Analysis to
the analysis of homology networks. These cross-links constitute a weak signal in networks and are difficult
to detect. In this case, the evidence for this domain erosion is carried by only eight similarity links between
four and two proteins, whereas the original network has 4849 edges. In the power graph representation, it is
one power edge among only 209.
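The arithmetic behind this observation can be made explicit with a short sketch. A power edge between two power nodes stands for every pairwise link between their members, so a group of four proteins connected to a group of two accounts for 4 x 2 = 8 similarity links. The protein names below are placeholders and the helper function is hypothetical. The edge counts 4849 and 209 are the figures quoted above.

```python
from itertools import product


def expand_power_edge(group_a, group_b):
    """Return the ordinary edges represented by one power edge.

    Hypothetical helper: a power edge between two disjoint groups stands
    for all pairwise links between them (a complete bipartite subgraph).
    """
    return {frozenset(pair) for pair in product(group_a, group_b)}


# Placeholder names for the four receptor and two non-receptor PTPs.
receptor_group = ["R1", "R2", "R3", "R4"]
non_receptor_group = ["N1", "N2"]

links = expand_power_edge(receptor_group, non_receptor_group)
print(len(links))  # 8 similarity links carried by a single power edge

original_edges = 4849  # edges in the original network (from the text)
power_edges = 209      # power edges in the power graph (from the text)

# Fraction of the original edges that carries the cross-link signal.
signal_fraction = len(links) / original_edges
# Relative reduction in the number of edges to inspect.
edge_reduction = 1 - power_edges / original_edges
print(f"{signal_fraction:.4%} of edges, {edge_reduction:.1%} fewer edges")
```

Seen this way, the cross-link is a fraction of a percent of the original edges, but it is one item among 209 in the power graph, which is why it becomes visible there.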