Hi,

I have an application that gathers RSS news feeds from around the world into a staging table. From that staging table, only the "important" news articles need to move into the real RSSItems table.

I have been asked to exclude RSS items where words like "Kardashian" appear in the description. What we would rather have are topics such as disasters, terror, crime, politics, and scandal.

My first thought was to create a filter table with the serious words, but then I realized that even articles about reality stars may contain such words. Listing nonsense words is just as unreliable.

I am asking for just an idea of a common-sense approach. I want to left join the staging table with the filter table to fill the main RSSItems table with as much "real" news as possible.

Does anyone know of a ready-to-purchase topical filter list? Any ideas or direction will be appreciated.
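
To make the idea concrete, here is a rough sketch of the left join I have in mind. It uses Python with an in-memory SQLite database only so that it is self-contained. The names RSSStaging and FilterWords, the column names, and the sample rows are placeholders made up for illustration, not our real schema. The assumption is a single filter table holding both "include" (serious) and "exclude" (nonsense) words: an item is promoted only if its description contains at least one include word and no exclude word.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical tables and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE RSSStaging (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
        CREATE TABLE RSSItems   (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
        -- kind is 'include' for serious topics, 'exclude' for celebrity noise
        CREATE TABLE FilterWords (
            word TEXT NOT NULL,
            kind TEXT NOT NULL CHECK (kind IN ('include', 'exclude'))
        );
        """
    )
    conn.executemany(
        "INSERT INTO FilterWords (word, kind) VALUES (?, ?)",
        [
            ("disaster", "include"),
            ("terror", "include"),
            ("crime", "include"),
            ("politics", "include"),
            ("scandal", "include"),
            ("Kardashian", "exclude"),
        ],
    )
    conn.executemany(
        "INSERT INTO RSSStaging (id, title, description) VALUES (?, ?, ?)",
        [
            (1, "Quake", "Earthquake disaster strikes coastal region"),
            (2, "Gossip", "Kardashian caught in new scandal"),
            (3, "Local", "Bakery wins regional pie contest"),
            (4, "Budget", "Politics: parliament passes budget"),
        ],
    )
    return conn


def promote_items(conn):
    """Copy staging rows that hit an include word and no exclude word."""
    conn.execute(
        """
        INSERT INTO RSSItems (id, title, description)
        SELECT s.id, s.title, s.description
        FROM RSSStaging AS s
        -- Anti-join: keep only rows for which no exclude word matched.
        LEFT JOIN FilterWords AS ex
               ON ex.kind = 'exclude'
              AND s.description LIKE '%' || ex.word || '%'
        WHERE ex.word IS NULL
          -- Require at least one serious word in the description.
          AND EXISTS (
              SELECT 1
              FROM FilterWords AS inc
              WHERE inc.kind = 'include'
                AND s.description LIKE '%' || inc.word || '%'
          )
        """
    )
    conn.commit()
    return [row[0] for row in conn.execute("SELECT id FROM RSSItems ORDER BY id")]


if __name__ == "__main__":
    print(promote_items(build_demo_db()))
```

With the sample rows above, items 1 and 4 are promoted. Item 2 is dropped because the exclude word wins over "scandal", and item 3 is dropped because it matches no include word. This is plain substring matching, so a word like "crime" would also match "Crimea", which is part of why I doubt a word list alone is reliable.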