Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/15941
Full metadata record
DC Field | Value | Language
dc.contributor.author | Αναστασιαδης Αντωνιος
dc.date.accessioned | 2018-07-23T16:51:48Z | -
dc.date.available | 2018-07-23T16:51:48Z | -
dc.date.issued | 2011-3-14
dc.date.submitted | 2008-12-1
dc.identifier.uri | http://artemis-new.cslab.ece.ntua.gr:8080/jspui/handle/123456789/15941 | -
dc.description.abstract | This thesis develops methods for automatically extracting the posts found in blog pages on the internet and classifying them by the opinion they express on a specific topic. The methods exploit the syntactic information in the HTML code of blog web pages, as well as their feeds and the date strings they contain. A Support Vector Machine algorithm classifies the extracted posts into two collections, representing positive and negative opinions respectively. We also developed a standalone Java application that, given a corpus of blogs, extracts their posts automatically and efficiently, along with tools that convert the extracted data into a feature-vector representation ready for classification and that classify it. This work can serve as a basis for a more complex system that finds, separates, and classifies blogs using more advanced methods, such as linguistic analysis and machine learning, to extract and classify their posts.
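dc.language | Greek
dc.subject | blogs
dc.subject | data mining
dc.subject | classification
dc.title | Διαχωρισμός Και Κατηγοριοποίηση Καταχωρήσεων Ιστολογίων (Separation and Classification of Blog Posts)
dc.type | Diploma Thesis
dc.description.pages | 86
dc.contributor.supervisor | Σελλής Τιμολέων
dc.department | Τομέας Επικοινωνιών, Ηλεκτρονικής & Συστημάτων Πληροφορικής (Division of Communications, Electronics and Information Systems)
dc.organization | ΕΜΠ, Τμήμα Ηλεκτρολόγων Μηχανικών & Μηχανικών Υπολογιστών (NTUA, Department of Electrical and Computer Engineering)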
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File | Size | Format
DT2011-0040.pdf | 808.42 kB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.