This is an article to document how to output the most frequently used words in a MySQL database column.
How?
Note this only applies to MySQL. So the following query will return a count of each word in your database column by order of most used first (replace 'mydbprefix_content' with the name of your database table AND `introtext` with the name of your column):
SELECT SUM(`total_count`) as `total`, `value` FROM ( SELECT count(*) AS `total_count`, REPLACE(REPLACE(REPLACE(x.`value`,'?',''),'.',''),'!','') as `value` FROM ( SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(ExtractValue(t.`introtext`, '//text()'), ' ', n.n), ' ', -1) `value` FROM `mydbprefix_content` t CROSS JOIN ( SELECT a.N + b.N * 10 + 1 n FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b ORDER BY n ) n WHERE n.n <= 1 + (LENGTH(ExtractValue(t.`introtext`, '//text()')) - LENGTH(REPLACE(ExtractValue(t.`introtext`, '//text()'), ' ', ''))) ORDER BY `value` ) AS x GROUP BY x.`value` ) AS y WHERE LENGTH(`value`)>5 AND TRIM(`value`) NOT IN ('Article','This',' ','content','component','applies','because','doesn''t') GROUP BY `value` ORDER BY `total` DESC LIMIT 0,50;Additional Note(s):
- I have added a WHERE clause which omits words less than 5 characters in length, and some other words not to count.
- I have added the ORDER BY to give me the highest count first in descending order.
- I have added a LIMIT to only return the top 50 words.
Source(s):