Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/15941
Title: | Separation and Classification of Blog Posts |
Authors: | Αναστασιαδης Αντωνιος; Σελλής Τιμολέων |
Keywords: | blogs; data mining; classification |
Issue Date: | 14-Mar-2011 |
Abstract: | The scope of this thesis was the development of methods that automatically extract the posts found in blog pages on the internet and classify them according to the opinion they express on a specific topic. These methods take advantage of the syntactic information in the HTML code of the blog web pages, as well as their feeds and the date strings they contain. We also use a Support Vector Machine algorithm to classify the extracted posts into two collections that represent the positive and negative opinions, respectively. Moreover, we developed a standalone Java application that, given a corpus of blogs, extracts their posts automatically and efficiently. We also developed tools that format the extracted data as feature vectors ready for classification, as well as tools that classify it. This work can be used as a basis for a more complex system that finds, separates and classifies blogs, using more advanced methods such as linguistic analysis and machine learning to extract and classify their posts. |
URI: | http://artemis-new.cslab.ece.ntua.gr:8080/jspui/handle/123456789/15941 |
Appears in Collections: | Diploma Theses |
Files in This Item:
File | Size | Format | |
---|---|---|---|
DT2011-0040.pdf | 808.42 kB | Adobe PDF | View/Open |
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.