Title: Data Analysis in Distributed Real-Time Systems (Ανάλυση Δεδομένων Σε Κατανεμημένα Συστήματα Πραγματικού Χρόνου)
Authors: Τζίμα Σοφία
Βαρβαρίγου Θεοδώρα
Keywords: real-time systems
apache storm
sentiment analysis
probabilistic topic models
latent dirichlet allocation
cluster computing
parallel processing
big data systems
social networks
Issue Date: 27-Mar-2015
Abstract: The scope of this thesis is the study and development of real-time systems that process large collections of documents and draw conclusions about their content and emotion. We studied algorithms of different kinds in order to determine both the performance of the tools used and the real-time response of each algorithm, using metrics such as memory usage, the number of data units processed per second, and the responsiveness of the system under severe time constraints. The usefulness of real-time systems that process document collections in order to draw conclusions about their content becomes obvious if we consider the rise of social networks as modern forms of communication and expression. Analyzing user-created content in social networks allows us to compute useful statistics about the feeling that prevails in public opinion around a particular theme. During the study, we analyzed algorithms from the areas of probabilistic topic modelling and sentiment analysis. We implemented those algorithms using Apache Storm and created topologies that run continuously in a Storm cluster. Those topologies accept data as a stream of events and export real-time information about them. Such systems are capable of monitoring events and making immediate decisions based on them.
Appears in Collections: Diploma Theses (Διπλωματικές Εργασίες - Theses)

Files in This Item:
File             Size     Format
DT2015-0067.pdf  1.28 MB  Adobe PDF

Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.