
Changing the architecture of the component

I am in the middle of the development (a little more than 60% done) of my PHP Bayesian Naive Filter (a learning filter against spam in comments, guestbooks, and postings in general) for Joomla/Mambo, and I have been reading some papers found on the internet:

  • On Attacking Statistical Spam Filters, Gregory L. Wittel and S. Felix Wu – Department of Computer Science – University of California, Davis One Shields Avenue, Davis, CA 95616 USA
  • A Naive Bayes Spam Filter, Kai Wei, CS281A Project, Fall 2003
  • But there is more…

I have decided that my project will certainly be a failure if I rewrite or reuse a Bayesian filter engine that is not accurate or does not use the latest countermeasures. Since I do not want to spend three years developing an effective filter (would I ever be able to do it?), I came up with the idea of implementing the component com_bayesiannaivefilter in such a way that the core engine is abstracted away and I can use the work done by the best open source projects.
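To make the idea of an abstracted core engine concrete, here is a minimal sketch. It is written in Python only for brevity (the component itself is PHP), and every name in it (`FilterEngine`, `KeywordEngine`, `CommentFilter`, the `0.5` threshold) is a hypothetical illustration, not the actual design of com_bayesiannaivefilter.

```python
from abc import ABC, abstractmethod


class FilterEngine(ABC):
    """Hypothetical contract that every core engine (PHP, Java, Perl, ...) would satisfy."""

    @abstractmethod
    def train(self, message, is_spam):
        """Learn from one message that a human has already labeled."""

    @abstractmethod
    def spam_probability(self, message):
        """Return a score between 0.0 (ham) and 1.0 (spam)."""


class KeywordEngine(FilterEngine):
    """Toy stand-in backend: flags words that were seen in spam but never in ham."""

    def __init__(self):
        self.spam_words = set()
        self.ham_words = set()

    def train(self, message, is_spam):
        target = self.spam_words if is_spam else self.ham_words
        target.update(message.lower().split())

    def spam_probability(self, message):
        words = message.lower().split()
        if not words:
            return 0.0
        hits = sum(1 for w in words
                   if w in self.spam_words and w not in self.ham_words)
        return hits / len(words)


class CommentFilter:
    """The component only talks to the FilterEngine contract, never to a concrete engine."""

    def __init__(self, engine, threshold=0.5):
        self.engine = engine
        self.threshold = threshold  # assumed cut-off, for illustration only

    def is_spam(self, message):
        return self.engine.spam_probability(message) >= self.threshold
```

With this shape, swapping the toy engine for a better one (or for a remote one) would only mean passing a different `FilterEngine` object to `CommentFilter`; the rest of the component would stay unchanged.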

It has also been clear to me from the beginning that a spam filter must be trained on a very large volume of data (more than 1000 messages, the more the better) in order to categorize messages accurately. Web services have my preference, since an internet entity with the required CPU horsepower and data store should be able to offer the most effective message categorization.
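To show why the amount of training data matters, here is a minimal naive Bayes sketch: every probability it uses is estimated from word counts in the training messages, so few messages mean rough estimates. It is illustrative Python, not the component's PHP code; the class name, the `"spam"`/`"ham"` labels, and the add-one smoothing are my assumptions.

```python
import math
from collections import Counter


class NaiveBayes:
    """Minimal two-class naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, message, label):
        # label must be "spam" or "ham"
        self.message_counts[label] += 1
        self.word_counts[label].update(message.lower().split())

    def _log_score(self, words, label, vocabulary_size):
        total_messages = sum(self.message_counts.values())
        # Log prior of the class, with add-one smoothing over the two classes.
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        total_words = sum(self.word_counts[label].values())
        for word in words:
            count = self.word_counts[label][word]
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((count + 1) / (total_words + vocabulary_size))
        return score

    def spam_probability(self, message):
        words = message.lower().split()
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        size = len(vocabulary) + 1  # one extra slot for words never seen in training
        spam = self._log_score(words, "spam", size)
        ham = self._log_score(words, "ham", size)
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(ham - spam, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

An untrained instance returns 0.5 for any message, because it has no counts to tell the two classes apart.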

My component will be able to use the following Bayesian Naive Filter cores (planned but not done; if it is technically possible, I will do it):

| Plugins | Remarks | Possible open source project |
|---|---|---|
| JAVA | I am a J2EE developer, back to the roots 🙂 | Something to propose? Contact me! |
| PHP | Core done, but very simple tokenization and hashing of message. Volume of data small. | Something to propose? Contact me! |
| PERL | Can PHP call Perl code? | Something to propose? Contact me! |
| CGI-BIN | Should be easy to do | Something to propose? Contact me! |
| WEBSERVICES | Should be easy as soon as we find a WS provider. Data volume? | Something to propose? Contact me! |
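For readers wondering what "very simple tokenization and hashing of message" could look like, here is a rough sketch. It is not the actual PHP core: the token pattern, the choice of SHA-256, and the function names are all assumptions made for illustration, again in Python for brevity.

```python
import hashlib
import re
from collections import Counter

# Assumed token rule: runs of lowercase letters, digits, or apostrophes.
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def tokenize(message):
    """Lowercase the message and split it into simple word tokens."""
    return TOKEN_PATTERN.findall(message.lower())


def hash_token(token):
    """Map a token to a fixed-length key, e.g. for use as a database column value."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def hashed_counts(message):
    """Count how often each hashed token occurs in one message."""
    return Counter(hash_token(token) for token in tokenize(message))
```

Hashing gives every token a key of the same length, whatever the token looks like, which is convenient when the counts are stored in a MySQL table.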

Each technology may contain many core engines, or different versions of them. I will fill this table with possible candidates (you can help me by suggesting candidates or by speeding up development).

Core requirements:

  • Use MySQL,
  • Most internet providers allow the use of CGI-BIN, Perl, and Java

This project will soon be committed to the Joomla forge!

About The Author

Cédric Walter

I worked with various insurance companies across Switzerland on online applications handling billion premium volumes. I love to continuously spark my creativity in many different and challenging open-source projects fueled by my great passion for innovation and blockchain technology. In my technical role as a senior software engineer and blockchain consultant, I help to define and implement innovative solutions in the scope of both blockchain and traditional products, solutions, and services. I can support the full spectrum of software development activities, from analyzing ideas and business cases up to the production deployment of the solutions. I'm the Founder and CEO of Disruptr GmbH.