The Flag-based Algorithm - A Novel Greedy Method that Optimizes Protein Communities Detection

Razvan Bocu, Sabin Tabirca

Abstract


Proteins and the networks they form, called interactome networks, have received considerable attention in recent years because they have been found to influence complex biological phenomena, including disorders such as cancer. This paper presents a contribution that aims to optimize the detection of protein communities through a greedy algorithm implemented in the C programming language. The optimization improves protein community detection on two levels: the algorithmic level and the programming level. The performance of the resulting implementation was tested on real biological data, and the results confirm the speedup that the optimization produces. Moreover, the results are in line with the findings of our previous research, as they reveal and confirm the existence of some important properties of the proteins that participate in the carcinogenesis process. Apart from being useful for research purposes, the novel community detection algorithm also substantially speeds up the analysis of proteomic databases compared to other sequential community detection approaches, and to the sequential algorithm of Newman and Girvan.
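The abstract does not describe the flag-based algorithm itself, so the following is only a generic sketch of greedy, modularity-driven community detection in C: every protein starts in its own community, and the pair of communities whose merge raises the modularity Q the most is merged repeatedly until no merge improves Q. The names `modularity`, `greedy_communities`, `demo_two_triangles`, the bound `MAX_N`, and the toy network are assumptions made for illustration; this is not the authors' implementation and it makes no attempt at their optimizations.

```c
#include <assert.h>

#define MAX_N 64  /* illustrative upper bound on the number of proteins */

/* Modularity Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j)
   for an undirected, unweighted graph given as a symmetric 0/1 matrix. */
double modularity(int n, int adj[MAX_N][MAX_N], const int comm[])
{
    int deg[MAX_N] = {0};
    double two_m = 0.0, q = 0.0;

    assert(n > 0 && n <= MAX_N);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            deg[i] += adj[i][j];
            two_m += adj[i][j];
        }
    if (two_m == 0.0)
        return 0.0;  /* no edges: Q is taken as 0 in this sketch */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (comm[i] == comm[j])
                q += adj[i][j] - (double)deg[i] * deg[j] / two_m;
    return q / two_m;
}

/* Greedy agglomeration: start from singletons, repeatedly apply the merge
   that raises Q the most, and stop when no merge improves Q.
   Writes community labels into comm[] and returns the community count. */
int greedy_communities(int n, int adj[MAX_N][MAX_N], int comm[])
{
    int trial[MAX_N];
    int count = 0;
    double best_q;

    assert(n > 0 && n <= MAX_N);
    for (int i = 0; i < n; i++)
        comm[i] = i;  /* every node starts alone */
    best_q = modularity(n, adj, comm);

    for (;;) {
        int best_a = -1, best_b = -1;
        double round_q = best_q;

        for (int a = 0; a < n; a++)
            for (int b = a + 1; b < n; b++) {
                int has_a = 0, has_b = 0;
                for (int i = 0; i < n; i++) {
                    has_a |= (comm[i] == a);
                    has_b |= (comm[i] == b);
                    trial[i] = (comm[i] == b) ? a : comm[i];
                }
                if (!has_a || !has_b)
                    continue;  /* one of the labels is no longer in use */
                double q = modularity(n, adj, trial);
                if (q > round_q + 1e-12) {  /* tolerance against float ties */
                    round_q = q;
                    best_a = a;
                    best_b = b;
                }
            }
        if (best_a < 0)
            break;  /* no merge improves Q: greedy stop */
        for (int i = 0; i < n; i++)
            if (comm[i] == best_b)
                comm[i] = best_a;
        best_q = round_q;
    }
    for (int c = 0; c < n; c++)
        for (int i = 0; i < n; i++)
            if (comm[i] == c) {
                count++;
                break;
            }
    return count;
}

/* Hypothetical toy network: triangles 0-1-2 and 3-4-5 joined by edge 2-3. */
int demo_two_triangles(int comm[])
{
    static const int edges[][2] = {
        {0, 1}, {0, 2}, {1, 2}, {3, 4}, {3, 5}, {4, 5}, {2, 3}
    };
    int n_edges = (int)(sizeof edges / sizeof edges[0]);
    int adj[MAX_N][MAX_N] = {{0}};

    for (int e = 0; e < n_edges; e++) {
        adj[edges[e][0]][edges[e][1]] = 1;
        adj[edges[e][1]][edges[e][0]] = 1;
    }
    return greedy_communities(6, adj, comm);
}
```

On the toy network, the greedy merges recover the two triangles as separate communities. Recomputing Q from scratch for every candidate merge keeps the sketch short but is far too slow for real interactome data; practical greedy methods update the modularity gain incrementally.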

Keywords


Interactome networks, protein-protein interactions, protein communities, cancer, greedy algorithm


References


J. Yoon, A. Blumer and K. Lee, An algorithm for modularity analysis of directed and weighted biological networks based on edge-betweenness centrality: Bioinformatics, 2006.
http://dx.doi.org/10.1093/bioinformatics/btl533

M. Girvan and M.E.J. Newman, Community structure in social and biological networks: State University of New Jersey, 2002.

D. Ucar et al., Improving functional modularity in protein-protein interactions graphs using hub-induced subgraphs: Ohio State University, 2007.

K. Lehmann and M. Kaufmann, Decentralized algorithms for evaluating centrality in complex networks: IEEE, 2002.

J. Griebsch et al., A fast algorithm for the iterative calculation of betweenness centrality: Technical University of Munchen, 2004.

G.T. Hart et al., How complete are current yeast and human protein-interaction networks?: Genome Biology, 2006.

R. Bunescu et al., Consolidating the set of known human protein-protein interactions in preparation for large-scale mapping of the human interactome: Genome biology, 2005.

U. Brandes, A faster algorithm for betweenness centrality: University of Konstanz, 2001.

B. Preiss, Data structures and algorithms with object-oriented design patterns in C++: John Wiley and sons, 1998.

EMBL-EBI, The IntAct protein interactions database. URL: http://www.ebi.ac.uk/intact/site/index.jsf, 2009.

A. Grama et al., Introduction to parallel computing, second edition: Addison-Wesley, 2003.

University of California, The DIP protein interactions database. URL: http://dip.doe-mbi.ucla.edu/, 2009.

R. Bocu and S. Tabirca, Betweenness Centrality Computation - A New Way for Analyzing the Biological Systems: Proceedings of the BSB 2009 conference, Leipzig, Germany, 2009.

L.C. Freeman, A set of measures of centrality based on betweenness: Sociometry, Vol. 40, 35-41, 1977.
http://dx.doi.org/10.2307/3033543

P.F. Jonsson and P.A. Bates, Global topological features of cancer proteins in the human interactome: Bioinformatics Advance Access, 2006.

Wellcome Trust Sanger Institute, The Pfam protein families database. URL: http://pfam.sanger.ac.uk/, 2009.

R. Bocu and S. Tabirca, Sparse networks-based speedup technique for proteins betweenness centrality computation: International Journal of Biological and Life Sciences, 2009.

S. Wachi et al., Interactome-transcriptome analysis reveals the high centrality of genes differentially expressed in lung cancer tissues: Bioinformatics, 21, 4205-4208, 2005.
http://dx.doi.org/10.1093/bioinformatics/bti688

G. Palla et al., Uncovering the overlapping community structure of complex networks in nature and society: Nature, 435, 814-818, 2005.
http://dx.doi.org/10.1038/nature03607

P.F. Jonsson et al., Cluster analysis of networks generated through homology: automatic identification of important protein communities involved in cancer metastasis: BMC Bioinformatics, 7, 2, 2006.
http://dx.doi.org/10.1186/1471-2105-7-2

R. Bocu and S. Tabirca, Proteomic Data Analysis Optimization Using a Parallel MPI C Approach: IEEE Computer Society, The First International Conference on Advances in Bioinformatics and Applications, 2010.
http://dx.doi.org/10.1109/biosciencesworld.2010.11

A. Clauset, M.E.J. Newman and Ch. Moore, Finding community structure in very large networks: Phys. Rev. E 70, 066111, 2004.
http://dx.doi.org/10.1103/PhysRevE.70.066111

S. Schnell, S. Fortunato and S. Roy, Is the intrinsic disorder of proteins the cause of the scale-free architecture of protein-protein interaction networks?: Proteomics 7 no. 6, 961-964, 2007.
http://dx.doi.org/10.1002/pmic.200600455

V. Batagelj and U. Brandes, Efficient generation of large random networks: Physical Review E. 71:036113-036118, 2005.
http://dx.doi.org/10.1103/PhysRevE.71.036113

M.E.J. Newman, Detecting community structure in networks: Eur. Phys. J. B 38, 321-330, 2004.
http://dx.doi.org/10.1140/epjb/e2004-00124-y

V.D. Blondel, J.-L. Guillaume, R. Lambiotte and E. Lefebvre, Fast unfolding of communities in large networks: Journal of Statistical Mechanics, arXiv:0803.0476v2 [physics.soc-ph], 2008.




DOI: https://doi.org/10.15837/ijccc.2011.1.2198



Copyright (c) 2017 Razvan Bocu, Sabin Tabirca

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL (IJCCC), With Emphasis on the Integration of Three Technologies (C & C & C),  ISSN 1841-9836.
