Automatic Thread Tagger

Description

When a user submits a new thread, this modification automatically takes keywords from the thread title and uses them as tags. Automatic Thread Tagger can either propose tags to the user via AJAX while the thread is being written, or assign the tags after the new thread has been saved. It can also add the translated thread prefix to the tags.
Additionally, you can tag existing threads via maintenance and via scheduled tasks.

This modification is the successor to the discontinued Automatic Thread Tagger by MrEyes:


As an example, if a user submits a thread with a title of:
"Fish Food for Cats!"

The thread will be automatically tagged with:

- Fish
- Food
- Cats

If the user also submits "Fish" as a tag of their own, it will not be duplicated. Any rules you have set up for tagging will be respected.
If you choose to do so, this product will also automatically tag threads created by incoming RSS feeds.
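
To make the example above concrete, here is a minimal sketch of title-based tagging. It is written in Python purely for illustration and is not the modification's actual code; the function name propose_tags, the setting names and their values are all assumptions.

```python
import re

# Hypothetical settings; the real option names and defaults are not part of this description.
MIN_TAG_LENGTH = 3
MAX_TAGS = 5
BAD_WORDS = {"for", "the", "and"}


def propose_tags(title, existing_tags=()):
    """Return new tags taken from the title, skipping tags the user already entered."""
    existing = list(existing_tags)
    seen = {tag.lower() for tag in existing}
    proposed = []
    for word in re.findall(r"\w+", title):
        # Respect the overall tag limit, counting the user's own tags as well.
        if len(existing) + len(proposed) >= MAX_TAGS:
            break
        key = word.lower()
        # Skip words that are too short, blacklisted, or already present.
        if len(word) < MIN_TAG_LENGTH or key in BAD_WORDS or key in seen:
            continue
        seen.add(key)
        proposed.append(word)
    return proposed
```

Under these assumptions, propose_tags("Fish Food for Cats!") yields Fish, Food and Cats, and passing ["Fish"] as the existing tags yields only Food and Cats.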

Demo
I cannot show you the process of creation, but here is a list of tags generated by Auto Thread Tagger:




Automatic Tagging of existing threads
You can tag existing threads via maintenance or via a scheduled task (cron). Tags created this way are stored with a special flag so they can be easily identified and deleted; manually assigned tags are not touched. Maintenance also works when Automatic Tagging is disabled in the settings, which is useful if you want to test some settings. Each auto tag is stored with the creation date of the thread and the user ID of the thread creator. The process can be automated by running a scheduled job once a night.

Please keep in mind that tags proposed via AJAX are not flagged as auto tags, so they cannot be identified as such and cannot be deleted automatically. If you want to retain the auto tag flag, disable AJAX and enable tagging after the thread has been saved. Alternatively, you can disable that as well and let new threads be tagged at night by the scheduled job.
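
The effect of the flag can be pictured with a small sketch. It is written in Python for illustration only; the ThreadTag record and its field names are hypothetical and do not reflect the real database layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreadTag:
    thread_id: int
    text: str
    user_id: int
    auto: bool  # True only for tags written by post-save, RSS, maintenance or cron tagging


def delete_auto_tags(tags):
    """Drop every tag flagged as auto; manual and AJAX-proposed tags are kept."""
    return [tag for tag in tags if not tag.auto]


tags = [
    ThreadTag(1, "Fish", user_id=7, auto=True),   # added after the thread was saved
    ThreadTag(1, "Pets", user_id=7, auto=False),  # typed in by the user
    ThreadTag(1, "Food", user_id=7, auto=False),  # proposed via AJAX, so not flagged
]
remaining = delete_auto_tags(tags)
```

In this sketch "Food" survives the cleanup even though the tagger proposed it, which is the AJAX limitation described above.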


Installation / Upgrade
1. Upload all files from "upload" to your server, preserving the directory structure
2. Import "product-auto_thread_tagger110.xml" as a product, allowing overwrite if it is already installed
3. Check settings
4. Run maintenance / Auto Tag Threads to tag existing threads (needed if you want to use the cron)

After installation the modification is disabled by default, which lets you experiment with the configuration before switching it on.


Troubleshooting
If you report a bug, please post the thread title that triggered it. Without it I cannot test the problem and improve the language parsers.

* If no threads are tagged you will have to check the following:
- Is the modification enabled? Is the action you are testing enabled? (vBulletin tagging, whole auto thread tagger system, AJAX, new threads)
- Are the words you are using badwords or filtered out?

* Cron/Scheduled Task is not tagging all threads.
- The cron is limited to 500 threads per run (you can change this via settings) to avoid a heavy impact on the server. Make sure you run the maintenance auto tagger before this to tag old threads. You can check the scheduled tasks log to see whether it is running correctly.
Important: If a thread title does not meet the minimum requirements to be included in tags (e.g. one-word thread titles, words that are too short), it will stay in this queue forever.
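
The batch limit and the stuck-thread behaviour can be sketched as follows. This is Python for illustration only, not the actual cron script; extract_tags, run_cron, the minimum length and the queue order are all assumptions.

```python
import re

BATCH_LIMIT = 500    # per-run limit named above; configurable via settings
MIN_TAG_LENGTH = 3   # hypothetical minimum word length


def extract_tags(title):
    """Hypothetical stand-in for the tagger: keep words that are long enough."""
    return [word for word in re.findall(r"\w+", title) if len(word) >= MIN_TAG_LENGTH]


def run_cron(untagged, limit=BATCH_LIMIT):
    """Process up to `limit` untagged threads; return (tagged, still_untagged)."""
    batch, rest = untagged[:limit], untagged[limit:]
    tagged = {}
    stuck = []
    for thread_id, title in batch:
        found = extract_tags(title)
        if found:
            tagged[thread_id] = found
        else:
            # No usable words, so the thread stays untagged and is picked up again next run.
            stuck.append((thread_id, title))
    return tagged, stuck + rest
```

With a limit of 2 and the queue [(1, "Fish Food for Cats!"), (2, "Hi"), (3, "New aquarium setup")], the first run tags thread 1, the second run tags thread 3, and thread 2 is still in the queue after every later run.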

* I'm using Polish, Arabic, Turkish, etc. and the tagger is not working as it should.
- If not already replaced, replace the filter replacement '&'=>'and' with ' & '=>'and' (a space before and after &)
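
This description does not explain why the bare '&' rule causes trouble. One plausible explanation, offered here only as an assumption, is that titles in these languages can contain characters stored as numeric entities such as "&#321;", and a bare '&' rule rewrites the ampersand inside each entity. The Python sketch below illustrates that assumption; apply_replacements is a hypothetical helper, not part of the modification.

```python
def apply_replacements(title, replacements):
    """Apply each search/replace pair to the title in order."""
    for search, replace in replacements.items():
        title = title.replace(search, replace)
    return title


# Assumption: the title stores non-Latin characters as numeric entities.
title = "&#321;&#243;d&#378; Warszawa"

old_rule = {"&": "and"}    # also matches the "&" that starts every entity
new_rule = {" & ": "and"}  # only matches an ampersand with a space on both sides

broken = apply_replacements(title, old_rule)
intact = apply_replacements(title, new_rule)
```

Here broken becomes "and#321;and#243;dand#378; Warszawa", while intact is identical to the original title.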



Todo
What comes next? You decide. Tell me what you are missing and I'll see whether it can be integrated.


Why thread title and not thread text?
Parsing the thread text for tags is an extremely unlikely addition, as it would require fairly heavy processing to ensure the quality of the tags.



History
1.1.0 GOLD, 31st July 2008
- Added tagging by AJAX. If enabled, tags are added to the tag field when the user leaves the title field. A setting controls whether this feature is available and whether leaving the title field erases all tags entered by the user.
- Added cron to auto tag untagged threads at night. Please run the maintenance auto tagger before this to tag old threads.
- Improved overall process of tagging
- Tags that were added via the new thread process, RSS or maintenance are marked as auto tagged (this does not apply to AJAX tagging) and can be removed
- No need to edit rssposter.php anymore. RSS feeds are being checked via maintenance or cron.
- Fixed Smart Tags for existing threads
- Obeying max tags limit while creating a new thread
- Tagging thread also after changing the title
- Stripping of HTML and BBCodes
- Possibility to add thread prefix as a tag
- Fixed not tagging threads if tag limits are set to unlimited (0)
- Fixed problems with other languages. Please keep in mind to replace the filter replacement '&'=>'and' with ' & '=>'and'
- Fixed problems with duplicate tags and enabled "Disable Auto Tag if Tagged?"
- Removed exclude Usergroup setting
- Maintenance Auto Tag Threads can now hard-delete all auto tags prior to new tagging. With this you can skip the Delete Auto Tags step.
- Smaller fixes and changes

1.0.1, 17th July 2008
- Added: Automatic tags created via maintenance are now associated with the user ID that created the thread. Just remove the auto tags and re-run auto tag to associate all tags with the users. Useful if you use vBExperience and want to reward users.
- Added: In addition to the auto tagger's own configuration, the vBulletin tag badwords are also used as a blacklist
- Added: Workaround for vBulletin 3.7.0 for the non-existent function "split_tag_list"
- Added: New setting to filter out dates like 01/02/2008, 05/06/08, 01.02.2008
- Changed: Location of settings, moved below Tagging Options. The new name of the setting group is "Tagging Options (Automatic Thread Tagger)"
- Changed: Behaviour of auto tagging: it no longer deletes old tags; please delete old tags before running it.

1.0.0 Beta 4, 16th July 2008
- First public release after this modification has been taken over by me
- Added: Maintenance to add and delete automatic tags. Maintenance also works if Auto Tagging is disabled.
- Fixed: Now obeying tag limits (max/min length, max tags)