SQL SERVER – Remove Duplicate Entry from Comma Delimited String – UDF

I love reader’s contribution this blog as that brings variety in articles. I encourage my readers to provide their contribution and I will publish then with their name.

Blog Reader Ashish Jain has posted very simple script which will remove duplicate entry from comma delimited string. User Defined Function has very simple logic behind it. It takes comma delimited string and then converts it to table and runs DISTINCT operation on the table. DISTINCT operation removes duplicate value. After that it converts the table again into the string and it can be used.

I have modified original contribution from Ashish so now it completely covers the subject intended to cover. I would suggest that this UDF should be kept handy to perform this tedious task easily.

CREATE FUNCTION dbo.DistinctList
(
@List VARCHAR(MAX),
@Delim CHAR
)
RETURNS
VARCHAR
(MAX)
AS
BEGIN
DECLARE
@ParsedList TABLE
(
Item VARCHAR(MAX)
)
DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
SET @list = LTRIM(RTRIM(@list)) + @Delim
SET @pos = CHARINDEX(@delim, @list, 1)
WHILE @pos > 0
BEGIN
SET
@list1 = LTRIM(RTRIM(LEFT(@list, @pos - 1)))
IF @list1 <> ''
INSERT INTO @ParsedList VALUES (CAST(@list1 AS VARCHAR(MAX)))
SET @list = SUBSTRING(@list, @pos+1, LEN(@list))
SET @pos = CHARINDEX(@delim, @list, 1)
END
SELECT
@rlist = COALESCE(@rlist+',','') + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
RETURN @rlist
END
GO
SELECT dbo.DistinctList('342,34,456,34,3454,456,aa,bb,cc,aa',',') DistinctList
GO

SQL SERVER - Remove Duplicate Entry from Comma Delimited String - UDF distinctstring

I encourage my readers to send their contribution as well so I can include their contribution as well.

Reference : Pinal Dave (https://darkslategrey-bat-805937.hostingersite.com)

SQL Function, SQL Scripts, SQL String
Previous Post
SQL SERVER – Find Number of Rows and Disk Space Reserved – Using sp_spaceused Interesting Observation
Next Post
SQLAuthority News – Ahmedabad User Group Meeting January 17 2009

Related Posts

24 Comments. Leave new

  • Is there a reason that
    SELECT @rlist = COALESCE(@rlist+‘,’,”) + item

    is not

    SELECT @rlist = COALESCE(@rlist+@Delim,”) + item

    Reply
  • Nicholas Paldino
    January 15, 2009 9:57 pm

    What should have probably been done here is to use a CLR function, as CLR routines are much better at procedural code than T-SQL.

    This could have been done in a few lines of code, and performed much better as well.

    Reply
  • Hi Dave,

    Dispite being the SQL guy I think that this really looks like one of those problems that are best cracked with CLR.

    Have a look at my post
    where I provide a demonstation of how the CLR code executes at leat 3 times faster.

    Reply
  • Hi,

    I m wroking in MNC as SQL developer i required dis function but i m not getting it so can u help me out to get dis UDF and also wat it actually does.

    Reply
  • Hi Guys,

    Try out this simple solution.

    CREATE FUNCTION DBO.RECU (@DATA NVARCHAR(1000))
    RETURNS @RETTABLE TABLE
    ( COL1 NVARCHAR(200))
    AS
    BEGIN
    IF LEN(@DATA) > 0 AND CHARINDEX (‘,’,@DATA) > 0
    BEGIN
    INSERT INTO @RETTABLE
    SELECT LEFT(@DATA,CHARINDEX (‘,’,@DATA)-1) AS COL1
    UNION ALL
    SELECT COL1 FROM DBO.RECU(RIGHT(@DATA,LEN(@DATA)-CHARINDEX (‘,’,@DATA)))
    END
    RETURN
    END

    SELECT DISTINCT COL1 FROM DBO.RECU(‘342,34,456,34,3454,456,aa,bb,cc,aa,’)

    Thanks and Regards
    Muralikrishnan

    Reply
  • Dave,
    I am impressed with your blog pages. There is lot of materail and it is really helpful for SQL pro’s.

    I have a requirement and wanted to know if you or anyone can help…

    Table1

    ID (Pkey) Column2 (Account numbers)

    1001 U987654321
    1002 123213121,3434334,881216665
    1003 A123456789,C123456789
    1004 U1818012, U808561773……

    Column 1 is a unique ID number which is the Pkey. Column2 holds (Accoount numbers) and can have upto 12 account numbers. My requirement is convert them in to table like ….

    1001 U987654321
    1002 123213121
    1002 3434334
    1002 881216665
    1003 A123456789
    1003 C123456789
    1004 U808561773
    1004 U808561773
    ……..

    Pls let me know how to accomplish this using SQL 2008.

    Thanks,
    Hari

    Reply
  • Imran Mohammed
    August 25, 2009 9:18 am

    @Hari,

    How about using While loop or cursor, Because I dont know any other simple method to solve this.

    1. Write a while loop,
    2. Every time assign Column2 value to a variable.
    3. use another while loop and Use substring function to reduce the size of variable and each time you reduce the size you put value into a temporary table.
    4. At the end of while loop, do a select * from temp table.

    ~ IM.

    Reply
    • Hi ,i am Aluri from bangalore and i am a dot net developer ,i hv a requirement that i couldn’t solve,hope you can solve my problem:
      well my requirement is to count duplicate words in a sentense which is stored in a column example:
      1. I am A software engineer.I am a dot net developer.
      that is one record,what i need is,to count homany times repeted values in that record ,and the result would be like this:
      I am (2)
      Software(2)
      like that,i realy tried many string functions but faild to get the right result,hope you can help me

      my mail id is: [email id removed]
      thank you

      Aluri

      Reply
  • I was looking for a way to do this and found your blog. The function worked great! So THANKS for that!

    While I was searching for how to do something totally different, however, I actually revisited the dedupe problem and came up with the following:

    declare @list nvarchar(max)
    set @list = ‘342,34,456,34,3454,456,aa,bb,cc,aa’
    declare @x xml
    declare @delim nvarchar(1)
    set @delim = ‘,’

    set @x = ”+replace(lower(@list),@delim,”)+”
    select replace(cast(@x.query(‘distinct-values(data(/list/delim))’) as varchar(max)),’ ‘,@delim) as result

    Basically I turn the delimited list in to XML and use ‘distinct-values’ to get rid of dupes.

    One gotcha to keep in mind is ‘distinct-values’ appears to be an exact match comparison and is why I included the ‘lower()’ command. So if @list = ”342,34,456,34,3454,456,aa,bb,cc,aa,AA’ AND you are not able to leave the ‘lower()’ command in there, you’d get both ‘aa’ and ‘AA’ in the result.

    Enjoy!

    Reply
  • Thanks for your post, how do you keep the same order in the comma delimited list? After using dbo.DistintList() it changes the order to SQL ordering, meaning numbers first then letters. I am trying to take individual rows of products and create the size/color array.

    XS,S,M,L,XL,2XL,3XL turns into an output of 2XL,3XL,L,M,S,XS.

    I need these to be in size order (S to 3XL). I have another column for SortPriority but this is not obeyed after running the function.

    Reply
  • Use this one to split string

    DECLARE @str varchar(100),@sep VARCHAR(10)
    select @str = ‘1,2,3,4’ ,@sep = ‘,’
    ;
    WITH Split(stpos,Endpos)
    AS( SELECT 0 stpos
    ,CHARINDEX(@sep,@str) Endpos
    UNION ALL
    SELECT Endpos+1
    ,CHARINDEX(@sep,@str,Endpos+1)
    FROM split
    WHERE Endpos > 0
    )
    SELECT SUBSTRING(@str,stpos,COALESCE(NULLIF(Endpos,0),LEN(@str)+1)-stpos)
    FROM split

    Reply
  • CREATE FUNCTION DBO.DISTINCT_MSG_SEQ
    (
    @LIST VARCHAR(MAX)
    )
    RETURNS VARCHAR(MAX)
    AS
    BEGIN
    DECLARE @DISTINCT_MSGLIST TABLE
    (
    S_NO INT IDENTITY(1,1),
    ITEM VARCHAR(MAX)
    )
    DECLARE @LIST1 VARCHAR(MAX),
    @POS INT,
    @RLIST VARCHAR(MAX),
    @MAX VARCHAR(MAX),
    @DELIM VARCHAR
    SET @DELIM = ‘,’
    SET @LIST = LTRIM(RTRIM(@LIST)) + @DELIM
    SET @POS = CHARINDEX(@DELIM, @LIST, 1)
    SET @LIST1 = LTRIM(RTRIM(LEFT(@LIST, @POS – 1)))

    INSERT INTO @DISTINCT_MSGLIST VALUES (CAST(@LIST1 AS VARCHAR(MAX)))
    SET @LIST = SUBSTRING(@LIST, @POS+1, LEN(@LIST))
    SET @POS = CHARINDEX(@DELIM, @LIST, 1)
    WHILE @POS > 0
    BEGIN
    SET @LIST1 = LTRIM(RTRIM(LEFT(@LIST, @POS – 1)))
    IF @LIST1 ”
    SELECT @MAX = ITEM FROM @DISTINCT_MSGLIST WHERE S_NO = ISNULL((SELECT MAX(S_NO) FROM @DISTINCT_MSGLIST) ,0)
    IF @MAX @LIST1
    INSERT INTO @DISTINCT_MSGLIST VALUES (CAST(@LIST1 AS VARCHAR(MAX)))
    SET @LIST = SUBSTRING(@LIST, @POS+1, LEN(@LIST))
    SET @POS = CHARINDEX(@DELIM, @LIST, 1)
    END
    SELECT @RLIST = COALESCE(@RLIST+’,’,”) + ITEM
    FROM (SELECT ITEM FROM @DISTINCT_MSGLIST) T

    RETURN @RLIST

    END

    The above function will display the unique msg sequence from the list

    eg : – If you pass the string ‘A,A,B,B,A,B’ , this function gives the output as ‘A,B,A,B’

    Reply
  • i have problem in solving the following case.can any one help regarding this one

    Input = ‘A:1, 1, 2, 4 B:3, 5, 5, 6 C:7, 7, 9, 4’

    Required Output = ‘A:1, 2, 4 B:3, 5, 6 C:4, 7, 9’

    Reply
  • Good idea and code .. Just one problem though: .. having dynamic field names in the query, there are cases like “ISNULL(, ”)” – which the funtion will split by “,” ..

    Reply
  • to remove duplicate id’s under columns i faced lot of strougle but it s very greatful to me

    Reply
  • Create a Number Table with ID column

    select SUBSTRING(@str,id+1,CHARINDEX(‘,’,@str,id+1)-id-1) from Tally
    where id < LEN(@str)
    and substring(@str,id,1) = ','
    GROUP BY SUBSTRING(@str,id+1,CHARINDEX(',',@str,id+1)-id-1)
    ORDER BY MIN(ID)

    Reply
  • Hi, Can anyone help me out to delete consecutive duplicate words in a string through a procedure.

    Reply
  • Appreciate dave. Thanks …

    Reply
  • Thank your for this solution. It was very helpful !

    Reply
  • Satish Vellanki
    June 30, 2016 11:46 am

    I was change some logic

    CREATE FUNCTION dbo.DistinctList
    (
    @List VARCHAR(MAX),
    @Delim CHAR
    )
    RETURNS VARCHAR(MAX)
    AS
    BEGIN
    DECLARE @ParsedList TABLE
    (
    Item VARCHAR(MAX)
    )
    declare @rlist varchar(100)
    Declare @op varchar(100)
    DECLARE @FoundIndex INT=CHARINDEX(@Delim,@list)
    WHILE (@FoundIndex 0)
    Begin
    set @op=SUBSTRING(@list,0,charindex(@Delim,@list))
    Set @list=SUBSTRING(@list,charindex(@Delim,@list)+1,Len(@list))
    SET @FoundIndex = CHARINDEX(@Delim, @list)
    INSERT INTO @ParsedList VALUES (CAST(@op AS VARCHAR(MAX)))
    End
    SELECT @rlist = COALESCE(@rlist+’,’,”) + item
    FROM (SELECT DISTINCT Item FROM @ParsedList) t
    RETURN @rlist
    END

    Reply
    • Nathan Sutherland
      April 14, 2017 11:47 pm

      Satish, your logic ignores the case of there not being any delimiters, where we would want the one item to be returned. To add that case, add the code

      IF @FoundIndex = 0 INSERT INTO @ParsedList VALUES (@List)

      Right above the loop.

      Reply
  • SELECT dbo.DistinctList(‘342,34,456,34,3454,456,aa,bb,cc,aa’,’,’) DistinctList
    This function remove duplicate and order by item. How can I just remove duplicate without ordering items, I am looking output like
    342,34,456,3454,aa,bb,,cc

    Please provide me updated function.

    Reply

Leave a Reply