Publication

Performance Evaluation Analysis of Spark Streaming Backpressure for Data-Intensive Pipelines

dc.contributor.author: Matteussi, Kassiano José
dc.contributor.author: Anjos, Julio
dc.contributor.author: LEITHARDT, VALDERI
dc.contributor.author: Resin Geyer, Claudio Fernando
dc.date.accessioned: 2023-02-01T18:32:23Z [PT]
dc.date.available: 2023-02-01T18:32:23Z [PT]
dc.date.issued: 2022-06-23 [PT]
dc.date.updated: 2022-06-28T08:44:32Z
dc.description.abstract: A significant rise in the adoption of streaming applications has changed decision-making processes in the last decade. This movement has led to the emergence of several Big Data technologies for in-memory processing, such as Apache Storm, Spark, Heron, Samza, Flink, and others. Spark Streaming, a widespread open-source implementation, processes data-intensive applications that often require large amounts of memory. However, Spark's Unified Memory Manager cannot properly manage sudden or intensive data surges and their related in-memory caching needs, resulting in performance and throughput degradation, high latency, a large number of garbage collection operations, out-of-memory issues, and data loss. This work presents a comprehensive performance evaluation of Spark Streaming backpressure to investigate the hypothesis that it could support data-intensive pipelines under specific pressure requirements. The results reveal that backpressure is suitable only for small and medium pipelines, for both stateless and stateful applications. Furthermore, the work points out the Spark Streaming limitations that lead to in-memory-based issues for data-intensive pipelines and stateful applications. In addition, the work indicates potential solutions. [pt_PT]
dc.description.version: N/A [pt_PT]
dc.identifier.doi: 10.3390/s22134756 [pt_PT]
dc.identifier.slug: cv-prod-3014543
dc.identifier.uri: http://hdl.handle.net/10400.26/43560 [PT]
dc.language.iso: eng [pt_PT]
dc.peerreviewed: yes [pt_PT]
dc.subject: backpressure [pt_PT]
dc.subject: big data [pt_PT]
dc.subject: spark streaming [pt_PT]
dc.subject: stream processing [pt_PT]
dc.title: Performance Evaluation Analysis of Spark Streaming Backpressure for Data-Intensive Pipelines [pt_PT]
dc.type: journal article
dspace.entity.type: Publication
oaire.citation.endPage: 4756 [pt_PT]
oaire.citation.issue: 13 [pt_PT]
oaire.citation.startPage: 4756 [pt_PT]
oaire.citation.title: Sensors [pt_PT]
oaire.citation.volume: 22 [pt_PT]
person.familyName: Matteussi
person.familyName: Anjos
person.familyName: REIS QUIETINHO LEITHARDT
person.familyName: Resin Geyer
person.givenName: Kassiano José
person.givenName: Julio
person.givenName: VALDERI
person.givenName: Claudio Fernando
person.identifier: 1487846
person.identifier: 967546
person.identifier: JsOq45sAAAAJ&hl=pt-PT
person.identifier.ciencia-id: 8C1B-D75D-360B
person.identifier.ciencia-id: 0614-5834-E7F3
person.identifier.orcid: 0000-0002-9131-6849
person.identifier.orcid: 0000-0003-3623-2762
person.identifier.orcid: 0000-0003-0446-9271
person.identifier.orcid: 0000-0002-8602-2336
person.identifier.rid: I-4821-2017
person.identifier.scopus-author-id: 56785790000
person.identifier.scopus-author-id: 35303109600
rcaap.cv.cienciaid: 0614-5834-E7F3 | Valderi Reis Quietinho Leithardt
rcaap.rights: openAccess [pt_PT]
rcaap.type: article [pt_PT]
relation.isAuthorOfPublication: f8cc4e91-0a81-436a-ac1e-ef6a2013128f
relation.isAuthorOfPublication: f83c9edd-0553-4d29-b170-4061024ff961
relation.isAuthorOfPublication: ab15f7c6-e882-406e-813d-2629e9cec5c8
relation.isAuthorOfPublication: 08e6899a-68bc-427c-b26f-c4abc4793647
relation.isAuthorOfPublication.latestForDiscovery: 08e6899a-68bc-427c-b26f-c4abc4793647

Files

Original bundle
Name: sensors-22-04756-v2.pdf
Size: 1.8 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.89 KB
Format: Item-specific license agreed upon to submission