[SQL Server] Keyword Tally

access_time Monday, February 24, 2014 8:18 pm
account_circle Andy Boot

Background

Our website show one or many table records on a page, one of the fields being a keywords input area. The input area contains words separated by a comma.

Goal

We would like to show visitors or admins the most commonly used keywords from all our pages. This can be implemented in the form of a widget, dedicated page or a reporting area.

Building The Test Table

In this example, we’ll be using a database named Playground, an build a table named TallyData (this will be recreated upon each query execution).

Use Playground

DROP Table TallyData
CREATE Table TallyData (docid int, keywords varchar(max) );
INSERT INTO TallyData
VALUES (100, ‘Test, SQL, ASP, .net, Help’),
(2010, ‘SQL, Andy, ASP’),
(4, ‘Andy, Test’),
(2340, ‘HelloWorld, Andy, Help’)

Code

DECLARE @document AS TABLE (
[Count] INT,
keyword VARCHAR(MAX))
;WITH cte(docid, word, keywords)
AS (SELECT docid,
LEFT(keywords, Charindex(‘,’, keywords + ‘,’) – 1),
Stuff(keywords, 1, Charindex(‘,’, keywords + ‘,’), ”)
FROM [Playground].[dbo].[TallyData]
UNION ALL
SELECT docid,
LEFT(keywords, Charindex(‘,’, keywords + ‘,’) – 1),
Stuff(keywords, 1, Charindex(‘,’, keywords + ‘,’), ”)
FROM cte
WHERE keywords > ”)
INSERT INTO @document ([Count], Keyword)
SELECT COUNT(docid),
Ltrim(Rtrim(word))
FROM cte
GROUP BY Ltrim(Rtrim(word))

SELECT [Count], Keyword FROM @document

Output

Count Keyword
1            .net
3           Andy
2           ASP
1           HelloWorld
2           Help
2           SQL
2           Test

Conclusion

Notice how our query has stacked up all the keywords, broken them down into individual words and then calculated how many of each there are.

There are endless possibilities you could do with this query with a few tweaks, for example you could target this at a field containing a few hundred words (such as forum posts), split by a space and/or other characters and you have yourself a word counter / popularity rating function.