Joi Ito's conversation with the living web.

We talked about spam filters earlier. I use TMDA, which is based on whitelisting. The Controllable Regex Mutilator is a more technical filtering technology that works on the content itself. These technologies keep getting smarter. It sort of reminds me of the convolutions we used to go through at Infoseek to get rid of spam sites from our indexes. I remember that some sites used to produce different pages for the Infoseek search bot by looking at the id... Anyway, this "CRM114" looks interesting.

CRM114 - the Controllable Regex Mutilator
CRM-114 is a system to examine incoming e-mail, system log streams, data files or other data streams, and to sort, filter, or alter the incoming files or data streams according to whatever the user desires. Criteria for categorization of data can be by satisfaction of regexes, by sparse spectra, or by other means. Accuracy of the sparse spectra function has been seen in excess of 99 per cent, for 1/4 megabyte of learning text. In other words, CRM114 learns, and it learns fast.
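
To get a feel for what "categorization by satisfaction of regexes" might look like, here is a rough Python sketch. It is not CRM114's own language or its algorithm, and it skips the sparse spectra learning entirely; the rule list, the labels, and the `categorize` function are all made up for illustration.

```python
import re

# Hypothetical rules: (label, compiled regex). These are illustrative only
# and do not reflect CRM114 syntax or its built-in behavior.
RULES = [
    ("spam", re.compile(r"(?i)\b(viagra|free money|act now)\b")),
    ("logs", re.compile(r"^\w{3} +\d+ \d\d:\d\d:\d\d ")),
]

def categorize(text, rules=RULES, default="inbox"):
    """Return the label of the first rule whose regex matches the text."""
    for label, pattern in rules:
        if pattern.search(text):
            return label
    return default

if __name__ == "__main__":
    for sample in ("Act now for FREE MONEY",
                   "Jan  5 12:34:56 host sshd: ok",
                   "Lunch tomorrow?"):
        print(categorize(sample), "<-", sample)
```

The first matching rule wins, so the order of the list matters. A filter that learns would replace the hand-written rules with statistics gathered from example text.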
