Please use this identifier to cite or link to this item: https://hdl.handle.net/1822/6634

Full metadata record
DC Field | Value | Language
dc.contributor.author | Exposto, José | -
dc.contributor.author | Macedo, Joaquim | -
dc.contributor.author | Pina, António Manuel Silva | -
dc.contributor.author | Alves, Albano Agostinho Gomes | -
dc.contributor.author | Amaro, José Carlos Rufino | -
dc.date.accessioned | 2007-06-18T16:10:04Z | -
dc.date.available | 2007-06-18T16:10:04Z | -
dc.date.issued | 2007-01 | -
dc.identifier.citation | INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, 21, Estoril, Portugal, 2007 – “ICOIN 2007 : proceedings of the 21st International Conference on Information Networking”. [S.l. : s.n., 2007?]. | eng
dc.identifier.uri | https://hdl.handle.net/1822/6634 | -
dc.description.abstract | This paper presents a multi-objective approach to Web space partitioning, aimed at improving distributed crawling efficiency. The investigation is supported by the construction of two different weighted graphs. The first models the topological communication infrastructure between crawlers and Web servers, and the second represents the number of link connections between servers’ pages. The edge values of the two graphs represent, respectively, computed RTTs and page links between nodes. The two graphs are then combined, using a multi-objective partitioning algorithm, to support Web space partitioning and load allocation for an adaptable number of geographically distributed crawlers. Partitioning strategies were evaluated by varying the number of partitions (crawlers) to obtain merit figures for: i) download time, ii) exchange time and iii) relocation time. The evaluation showed that our partitioning schemes outperform traditional hostname-hash-based counterparts in all evaluated metrics, achieving on average an 18% reduction in download time, a 78% reduction in exchange time and a 46% reduction in relocation time. | eng
dc.description.sponsorship | Fundação para a Ciência e a Tecnologia (FCT) | eng
dc.language.iso | eng | eng
dc.rights | openAccess | eng
dc.subject | Databases | eng
dc.subject | Computer communications and networks | eng
dc.title | Efficient partitioning strategies for distributed Web crawling | eng
dc.type | conferencePaper | eng
dc.peerreviewed | yes | eng
Appears in Collections: DI/CCTC - Artigos (papers)

Files in This Item:
File | Description | Size | Format
icoin2007-exp.pdf | Main document | 254.32 kB | Adobe PDF
