I love reader’s contribution this blog as that brings variety in articles. I encourage my readers to provide their contribution and I will publish then with their name.
Blog Reader Ashish Jain has posted very simple script which will remove duplicate entry from comma delimited string. User Defined Function has very simple logic behind it. It takes comma delimited string and then converts it to table and runs DISTINCT operation on the table. DISTINCT operation removes duplicate value. After that it converts the table again into the string and it can be used.
I have modified original contribution from Ashish so now it completely covers the subject intended to cover. I would suggest that this UDF should be kept handy to perform this tedious task easily.
CREATE FUNCTION dbo.DistinctList
(
@List VARCHAR(MAX),
@Delim CHAR
)
RETURNS
VARCHAR(MAX)
AS
BEGIN
DECLARE @ParsedList TABLE
(
Item VARCHAR(MAX)
)
DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
SET @list = LTRIM(RTRIM(@list)) + @Delim
SET @pos = CHARINDEX(@delim, @list, 1)
WHILE @pos > 0
BEGIN
SET @list1 = LTRIM(RTRIM(LEFT(@list, @pos - 1)))
IF @list1 <> ''
INSERT INTO @ParsedList VALUES (CAST(@list1 AS VARCHAR(MAX)))
SET @list = SUBSTRING(@list, @pos+1, LEN(@list))
SET @pos = CHARINDEX(@delim, @list, 1)
END
SELECT @rlist = COALESCE(@rlist+',','') + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
RETURN @rlist
END
GO
SELECT dbo.DistinctList('342,34,456,34,3454,456,aa,bb,cc,aa',',') DistinctList
GO
I encourage my readers to send their contribution as well so I can include their contribution as well.
Reference : Pinal Dave (https://darkslategrey-bat-805937.hostingersite.com)
24 Comments. Leave new
Is there a reason that
SELECT @rlist = COALESCE(@rlist+‘,’,”) + item
is not
SELECT @rlist = COALESCE(@rlist+@Delim,”) + item
What should have probably been done here is to use a CLR function, as CLR routines are much better at procedural code than T-SQL.
This could have been done in a few lines of code, and performed much better as well.
Hi Dave,
Dispite being the SQL guy I think that this really looks like one of those problems that are best cracked with CLR.
Have a look at my post
where I provide a demonstation of how the CLR code executes at leat 3 times faster.
Hi,
I m wroking in MNC as SQL developer i required dis function but i m not getting it so can u help me out to get dis UDF and also wat it actually does.
Hi Guys,
Try out this simple solution.
CREATE FUNCTION DBO.RECU (@DATA NVARCHAR(1000))
RETURNS @RETTABLE TABLE
( COL1 NVARCHAR(200))
AS
BEGIN
IF LEN(@DATA) > 0 AND CHARINDEX (‘,’,@DATA) > 0
BEGIN
INSERT INTO @RETTABLE
SELECT LEFT(@DATA,CHARINDEX (‘,’,@DATA)-1) AS COL1
UNION ALL
SELECT COL1 FROM DBO.RECU(RIGHT(@DATA,LEN(@DATA)-CHARINDEX (‘,’,@DATA)))
END
RETURN
END
SELECT DISTINCT COL1 FROM DBO.RECU(‘342,34,456,34,3454,456,aa,bb,cc,aa,’)
Thanks and Regards
Muralikrishnan
Dave,
I am impressed with your blog pages. There is lot of materail and it is really helpful for SQL pro’s.
I have a requirement and wanted to know if you or anyone can help…
Table1
ID (Pkey) Column2 (Account numbers)
1001 U987654321
1002 123213121,3434334,881216665
1003 A123456789,C123456789
1004 U1818012, U808561773……
Column 1 is a unique ID number which is the Pkey. Column2 holds (Accoount numbers) and can have upto 12 account numbers. My requirement is convert them in to table like ….
1001 U987654321
1002 123213121
1002 3434334
1002 881216665
1003 A123456789
1003 C123456789
1004 U808561773
1004 U808561773
……..
Pls let me know how to accomplish this using SQL 2008.
Thanks,
Hari
@Hari,
How about using While loop or cursor, Because I dont know any other simple method to solve this.
1. Write a while loop,
2. Every time assign Column2 value to a variable.
3. use another while loop and Use substring function to reduce the size of variable and each time you reduce the size you put value into a temporary table.
4. At the end of while loop, do a select * from temp table.
~ IM.
Hi ,i am Aluri from bangalore and i am a dot net developer ,i hv a requirement that i couldn’t solve,hope you can solve my problem:
well my requirement is to count duplicate words in a sentense which is stored in a column example:
1. I am A software engineer.I am a dot net developer.
that is one record,what i need is,to count homany times repeted values in that record ,and the result would be like this:
I am (2)
Software(2)
like that,i realy tried many string functions but faild to get the right result,hope you can help me
my mail id is: [email id removed]
thank you
Aluri
I was looking for a way to do this and found your blog. The function worked great! So THANKS for that!
While I was searching for how to do something totally different, however, I actually revisited the dedupe problem and came up with the following:
declare @list nvarchar(max)
set @list = ‘342,34,456,34,3454,456,aa,bb,cc,aa’
declare @x xml
declare @delim nvarchar(1)
set @delim = ‘,’
set @x = ”+replace(lower(@list),@delim,”)+”
select replace(cast(@x.query(‘distinct-values(data(/list/delim))’) as varchar(max)),’ ‘,@delim) as result
Basically I turn the delimited list in to XML and use ‘distinct-values’ to get rid of dupes.
One gotcha to keep in mind is ‘distinct-values’ appears to be an exact match comparison and is why I included the ‘lower()’ command. So if @list = ”342,34,456,34,3454,456,aa,bb,cc,aa,AA’ AND you are not able to leave the ‘lower()’ command in there, you’d get both ‘aa’ and ‘AA’ in the result.
Enjoy!
Thanks for your post, how do you keep the same order in the comma delimited list? After using dbo.DistintList() it changes the order to SQL ordering, meaning numbers first then letters. I am trying to take individual rows of products and create the size/color array.
XS,S,M,L,XL,2XL,3XL turns into an output of 2XL,3XL,L,M,S,XS.
I need these to be in size order (S to 3XL). I have another column for SortPriority but this is not obeyed after running the function.
Use this one to split string
DECLARE @str varchar(100),@sep VARCHAR(10)
select @str = ‘1,2,3,4’ ,@sep = ‘,’
;
WITH Split(stpos,Endpos)
AS( SELECT 0 stpos
,CHARINDEX(@sep,@str) Endpos
UNION ALL
SELECT Endpos+1
,CHARINDEX(@sep,@str,Endpos+1)
FROM split
WHERE Endpos > 0
)
SELECT SUBSTRING(@str,stpos,COALESCE(NULLIF(Endpos,0),LEN(@str)+1)-stpos)
FROM split
CREATE FUNCTION DBO.DISTINCT_MSG_SEQ
(
@LIST VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @DISTINCT_MSGLIST TABLE
(
S_NO INT IDENTITY(1,1),
ITEM VARCHAR(MAX)
)
DECLARE @LIST1 VARCHAR(MAX),
@POS INT,
@RLIST VARCHAR(MAX),
@MAX VARCHAR(MAX),
@DELIM VARCHAR
SET @DELIM = ‘,’
SET @LIST = LTRIM(RTRIM(@LIST)) + @DELIM
SET @POS = CHARINDEX(@DELIM, @LIST, 1)
SET @LIST1 = LTRIM(RTRIM(LEFT(@LIST, @POS – 1)))
INSERT INTO @DISTINCT_MSGLIST VALUES (CAST(@LIST1 AS VARCHAR(MAX)))
SET @LIST = SUBSTRING(@LIST, @POS+1, LEN(@LIST))
SET @POS = CHARINDEX(@DELIM, @LIST, 1)
WHILE @POS > 0
BEGIN
SET @LIST1 = LTRIM(RTRIM(LEFT(@LIST, @POS – 1)))
IF @LIST1 ”
SELECT @MAX = ITEM FROM @DISTINCT_MSGLIST WHERE S_NO = ISNULL((SELECT MAX(S_NO) FROM @DISTINCT_MSGLIST) ,0)
IF @MAX @LIST1
INSERT INTO @DISTINCT_MSGLIST VALUES (CAST(@LIST1 AS VARCHAR(MAX)))
SET @LIST = SUBSTRING(@LIST, @POS+1, LEN(@LIST))
SET @POS = CHARINDEX(@DELIM, @LIST, 1)
END
SELECT @RLIST = COALESCE(@RLIST+’,’,”) + ITEM
FROM (SELECT ITEM FROM @DISTINCT_MSGLIST) T
RETURN @RLIST
END
The above function will display the unique msg sequence from the list
eg : – If you pass the string ‘A,A,B,B,A,B’ , this function gives the output as ‘A,B,A,B’
i have problem in solving the following case.can any one help regarding this one
Input = ‘A:1, 1, 2, 4 B:3, 5, 5, 6 C:7, 7, 9, 4’
Required Output = ‘A:1, 2, 4 B:3, 5, 6 C:4, 7, 9’
Good idea and code .. Just one problem though: .. having dynamic field names in the query, there are cases like “ISNULL(, ”)” – which the funtion will split by “,” ..
to remove duplicate id’s under columns i faced lot of strougle but it s very greatful to me
Create a Number Table with ID column
select SUBSTRING(@str,id+1,CHARINDEX(‘,’,@str,id+1)-id-1) from Tally
where id < LEN(@str)
and substring(@str,id,1) = ','
GROUP BY SUBSTRING(@str,id+1,CHARINDEX(',',@str,id+1)-id-1)
ORDER BY MIN(ID)
Hi, Can anyone help me out to delete consecutive duplicate words in a string through a procedure.
Appreciate dave. Thanks …
Thank your for this solution. It was very helpful !
I was change some logic
CREATE FUNCTION dbo.DistinctList
(
@List VARCHAR(MAX),
@Delim CHAR
)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @ParsedList TABLE
(
Item VARCHAR(MAX)
)
declare @rlist varchar(100)
Declare @op varchar(100)
DECLARE @FoundIndex INT=CHARINDEX(@Delim,@list)
WHILE (@FoundIndex 0)
Begin
set @op=SUBSTRING(@list,0,charindex(@Delim,@list))
Set @list=SUBSTRING(@list,charindex(@Delim,@list)+1,Len(@list))
SET @FoundIndex = CHARINDEX(@Delim, @list)
INSERT INTO @ParsedList VALUES (CAST(@op AS VARCHAR(MAX)))
End
SELECT @rlist = COALESCE(@rlist+’,’,”) + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
RETURN @rlist
END
Satish, your logic ignores the case of there not being any delimiters, where we would want the one item to be returned. To add that case, add the code
IF @FoundIndex = 0 INSERT INTO @ParsedList VALUES (@List)
Right above the loop.
SELECT dbo.DistinctList(‘342,34,456,34,3454,456,aa,bb,cc,aa’,’,’) DistinctList
This function remove duplicate and order by item. How can I just remove duplicate without ordering items, I am looking output like
342,34,456,3454,aa,bb,,cc
Please provide me updated function.