Please use this identifier to cite or link to this item: https://hdl.handle.net/1822/55212

Full record
DC Field | Value | Language
dc.contributor.author | Costa, Eduarda | por
dc.contributor.author | Costa, Carlos A. | por
dc.contributor.author | Santos, Maribel Yasmina | por
dc.date.accessioned | 2018-07-02T08:59:56Z | -
dc.date.issued | 2018 | -
dc.identifier.isbn | 9783319777115 | por
dc.identifier.issn | 2194-5357 | -
dc.identifier.uri | https://hdl.handle.net/1822/55212 | -
dc.description.abstract | Hive is a tool that allows the implementation of Data Warehouses for Big Data contexts, organizing data into tables, partitions and buckets. Some studies have been conducted to understand ways of optimizing the performance of data storage and processing techniques/technologies for Big Data Warehouses. However, few of these studies explore whether the way data is structured has any influence on how Hive responds to queries. Thus, this work investigates the impact of creating partitions and buckets on the processing times of Hive-based Big Data Warehouses. The results obtained with the application of different modelling and organization strategies in Hive reinforce the advantages associated with the implementation of Big Data Warehouses based on denormalized models and, also, the potential benefit of adequate partitioning that, once aligned with the filters frequently applied on data, can significantly decrease the processing times. In contrast, the use of bucketing techniques showed no evidence of significant advantages. | por
dc.description.sponsorship | This work is supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2013, and by European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project no. 002814; Funding Reference: POCI-01-0247-FEDER-002814]. | por
dc.language.iso | eng | por
dc.publisher | Springer Verlag | por
dc.relation | info:eu-repo/grantAgreement/FCT/5876/147280/PT | por
dc.rights | restrictedAccess | por
dc.subject | Big data | por
dc.subject | Big data warehouse | por
dc.subject | Buckets | por
dc.subject | Hive | por
dc.subject | Partitions | por
dc.title | Partitioning and bucketing in hive-based big data warehouses | por
dc.type | conferencePaper | por
dc.peerreviewed | yes | por
oaire.citationStartPage | 764 | por
oaire.citationEndPage | 774 | por
oaire.citationVolume | 746 | por
dc.date.updated | 2018-06-30T18:32:02Z | -
dc.identifier.doi | 10.1007/978-3-319-77712-2_72 | por
dc.description.publicationversion | info:eu-repo/semantics/publishedVersion | por
sdum.export.identifier | 5167 | -
sdum.journal | Advances in Intelligent Systems and Computing | por
Appears in collections: CAlg - Artigos em livros de atas/Papers in proceedings

Files in this item:
File | Description | Size | Format
Costa et al. - 2018 - Partitioning and Bucketing in Hive-Based Big Data .pdf | Restricted access | 484.15 kB | Adobe PDF
