
dc.contributor.author: VOGEL, CARL
dc.contributor.editor: Markus, Aleksy [en]
dc.contributor.editor: et, al [en]
dc.date.accessioned: 2009-09-18T13:20:31Z
dc.date.available: 2009-09-18T13:20:31Z
dc.date.issued: 2003
dc.date.submitted: 2003 [en]
dc.identifier.citation: Cormac O'Brien and Carl Vogel 'Spam filters: Bayes vs. Chi-Squared; letters vs. Words' in International Symposium on Information and Communication Technologies, Dublin, eds. Markus Aleksy, et al., 2003, pp 298 - 303 [en]
dc.identifier.other: Y
dc.identifier.uri: http://hdl.handle.net/2262/32954
dc.description: PUBLISHED [en]
dc.description.abstract: We compare two statistical methods for identifying spam or junk electronic mail. Spam filters are classifiers which determine whether an email is junk or not. The proliferation of spam email has made electronic filtering vitally important. The magnitude of the problem is discussed. We examine the Naive Bayesian method in relation to the 'Chi by degrees of Freedom' approach, the latter used in the field of authorship identification. Both methods produce very promising results. However, the 'Chi by degrees of Freedom' has the advantage of providing significance measures, which will help to reduce false positives. Statistics based on character-level tokenization proves more effective than word-level. [en]
dc.format.extent: 298 [en]
dc.format.extent: 303 [en]
dc.format.extent: 71419 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en [en]
dc.publisher: ACM [en]
dc.rights: Y [en]
dc.subject: Computer Science [en]
dc.title: Spam filters: Bayes vs. Chi-Squared; letters vs. words [en]
dc.type: Conference Paper [en]
dc.type.supercollection: scholarly_publications [en]
dc.type.supercollection: refereed_publications [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/vogel


