Convert to Proper Case in T-SQL

What?
So this is an article exploring how to convert UPPERCASE text into mixed case. The feed is originally for a personnel feed so it won't be converting long paragraphs of English text. Instead it will be applied to names and addresses as well as job titles and departments.

Why?
We wanted a T-SQL version despite having successfully built a custom script component for SSIS.

How?
You can search my site for how to do this in VB or C#. This version is entirely using T-SQL and we're going to use a function so no dynamic SQL (execute, evaluate).

Method #1: First version
This was my first attempt based on snippets around the web. This one has two sections, the first part capitalizes any letter after a non-alpha letter. The remainder of it is to deal with exceptions to the rule, words that you want in a specific case.
CREATE FUNCTION ufn_ProperCase(@Text AS VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN
   -- declare some variables
   DECLARE @Reset BIT;
   DECLARE @Ret VARCHAR(8000);
   DECLARE @i INT;
   DECLARE @c VARCHAR(2);

   -- specify reset
   SELECT @Reset = 1, @i=1, @Ret = '';

   IF (UPPER(@Text)=@Text Collate Latin1_General_CS_AI)
	BEGIN

		-- cycle through each character, 
		-- if non-alpha, uppercase next alpha character.
		-- if alpha then lowercase subsequent alphas.
		-- NON-ALPHA EXCEPTIONS ADDED TO REGEXP (eg. apostrophe)
		WHILE (@i <= len(@Text))
   			SELECT @c= substring(@Text,@i,1),
               @Ret = @Ret + CASE WHEN @Reset=1 THEN UPPER(@c) ELSE LOWER(@c) END,
               @Reset = CASE WHEN @c like '[a-zA-Z'']' THEN 0 ELSE 1 END,
               @i = @i +1

		-- add a leading and trailing space to indicate word delimiters (bol & eol)
		SET @Ret = ' ' + @Ret + ' ';

		-- Any character immediately after non-alpha is now in uppercase.
		-- Base the following overrides/exceptions on this.

		-- specify exceptions to be lowercased
		SET @Ret = REPLACE(@Ret, ' And ', LOWER(' And '));
		SET @Ret = REPLACE(@Ret, ' At ', LOWER(' At '));
		SET @Ret = REPLACE(@Ret, ' Of ', LOWER(' Of '));
		SET @Ret = REPLACE(@Ret, ' The ', LOWER(' The '));

		-- specify exceptions to be uppercased
		SET @Ret = REPLACE(@Ret, ' Bi ', UPPER(' BI '));  -- Business Inteligence
		SET @Ret = REPLACE(@Ret, ' Hr ', UPPER(' HR '));  -- Human Resources
		SET @Ret = REPLACE(@Ret, ' It ', UPPER(' IT '));  -- Info Tech Services

		-- specify exceptions to be custom cased: McDonalds, Mc Donalds
		IF(@Ret LIKE '% Mc%')
			BEGIN
	   			SELECT @c = SUBSTRING(@Ret,CHARINDEX(' Mc',@Ret)+3,1)
				SET @Ret = REPLACE(@Ret, ' Mc'+LOWER(@c), ' Mc'+UPPER(@c))
			END

		-- specify exceptions to be custom cased: O'Neils
		IF(@Ret LIKE '% O''%')
			BEGIN
	   			SELECT @c = SUBSTRING(@Ret,CHARINDEX(' O''',@Ret)+3,1)
				SET @Ret = REPLACE(@Ret, ' O'''+LOWER(@c), ' O'''+UPPER(@c))
			END

		-- specify exceptions to be custom cased: D'Artagnan
		IF(@Ret LIKE '% D''%')
			BEGIN
	   			SELECT @c = SUBSTRING(@Ret,CHARINDEX(' D''',@Ret)+3,1)
				SET @Ret = REPLACE(@Ret, ' D'''+LOWER(@c), ' D'''+UPPER(@c))
			END

		-- specify exceptions to be custom cased: I'Anson
		IF(@Ret LIKE '% I''%')
			BEGIN
	   			SELECT @c = SUBSTRING(@Ret,CHARINDEX(' I''',@Ret)+3,1)
				SET @Ret = REPLACE(@Ret, ' I'''+LOWER(@c), ' I'''+UPPER(@c))
			END

		-- remove leading and trailing spaces
		SET @Ret = LTRIM(RTRIM(@Ret));

	END
   ELSE
    -- return the string unaffected if it is not in uppercase
    SET @Ret=@Text

   RETURN @Ret

END
Tip: Snippet to remove this function
Add this to the beginning if you want to test the function a lot. I didn't include this in case you already have a function called this.
IF object_id(N'ufn_ProperCase', N'FN') IS NOT NULL
        DROP FUNCTION ufn_ProperCase
GO
Method #2: Revised version
This version will use a variable table but it could be adapted to feed off any database table. As before it capitalizes words after non-alpha letters as well as apostrophe, but some name exceptions are built into the same case statement (McDonalds, O'Neils, D'Artagnan) and hopefully a clear way on how to add more.

Then I have the custom exceptions, here simply type the word as it is intended to ultimately display as, the case-insensitive replace by T-SQL will find that instance and replace it with exactly how the word is written in the function.
CREATE FUNCTION ufn_ProperCase(@Text AS VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN
   -- declare some variables
   DECLARE @Reset BIT;
   DECLARE @Ret VARCHAR(8000);
   DECLARE @i INT;
   DECLARE @c0 CHAR(1);
   DECLARE @c1 CHAR(1);
   DECLARE @c2 CHAR(1);
   DECLARE @CaseExceptions TABLE(MyIndex int identity(1,1), MyString VARCHAR(8000));
   DECLARE @exceptionToFind VARCHAR(8000);

   -- specify reset
   SELECT @Reset = 1, @i=1, @Ret = '';

   -- only apply if all characters are already in uppercase
   IF (UPPER(@Text)=@Text COLLATE Latin1_General_CS_AI)
        BEGIN

                -- add a leading and trailing space to indicate word delimiters (bol & eol)
                SET @Text = ' ' + @Text + ' ';

                -- cycle through each character,
                -- if non-alpha, uppercase next alpha character.
                -- if alpha then lowercase subsequent alphas.
                WHILE (@i <= LEN(@Text))
                        SELECT
                                @c0=SUBSTRING(@Text,@i-2,1), @c1=SUBSTRING(@Text,@i-1,1), @c2=SUBSTRING(@Text,@i,1),
                                @Ret = @Ret + CASE WHEN @Reset=1 THEN UPPER(@c2) ELSE LOWER(@c2) END,
                                @Reset = CASE
                                                -- EXCEPTION: McDonalds and Mc Donalds but not Tomcat
                                                WHEN @c0 = ' ' AND @c1 = 'M' AND @c2 = 'c' THEN 1

                                                -- EXCEPTION: D'Artagnan and I'Anson
                                                -- EXCEPTION: O'Neils and O' Sullivan but not Mo'Money
                                                WHEN @c0 = ' ' AND @c1 IN ('D', 'I', 'O') AND @c2 = '''' THEN 1

                                                WHEN @c2 LIKE '[a-zA-Z'']' THEN 0       -- Apply LOWER to any character after alphas or apostrophes
                                                ELSE 1                                  -- Apply UPPER to any character after symbols/punctuation
                                        END,
                                @i = @i +1


                -- remove leading spaces so first words will not be lowercased
                -- (eg. "The" at the beginning of a sentence)
                SET @Ret = LTRIM(@Ret);

                -- custom exceptions: the search is case-insensitive and will
                -- convert the word to the case as written below.
                INSERT INTO @CaseExceptions (MyString) VALUES
                ('and'), ('at'), ('of'), ('the')                -- common conversions
                ,('BI')                                         -- Business Inteligence
                ,('HR')                                         -- Human Resources
                ,('IT')                                         -- Info Tech Services

                DECLARE exceptions_cursor CURSOR FOR
                        SELECT ' '+MyString+' ' FROM @CaseExceptions
                OPEN exceptions_cursor
                        FETCH NEXT FROM exceptions_cursor
                                INTO @exceptionToFind;
                WHILE @@FETCH_STATUS = 0
                        BEGIN
                                SET @Ret = REPLACE(@Ret, @exceptionToFind, @exceptionToFind)

                                FETCH NEXT FROM exceptions_cursor INTO @exceptionToFind;
                        END
                CLOSE exceptions_cursor
                DEALLOCATE exceptions_cursor


                -- remove any leading and trailing spaces
                SET @Ret = LTRIM(RTRIM(@Ret));

        END
   ELSE
        BEGIN
                -- return the string unaffected if it is not in uppercase
                SET @Ret=@Text
        END


   RETURN @Ret

END

Method #3: No Cursor
Personally, I felt the cursor allowed your exception list to be longer and to be run off a populated SQL table, but if it's a short one like I was tasked with, then you can put the words of mixed case in a comma delimited string. The string would contain the case of your choosing and using T-SQL case-insensitivity will overwrite whatever match it finds with what you typed in the code. Here's my final version:
CREATE FUNCTION dbo.ufn_ProperCase(@Text AS VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN
   -- declare some variables
   DECLARE @Reset BIT; DECLARE @Ret VARCHAR(8000); DECLARE @i INT;
   DECLARE @c0 CHAR(1); DECLARE @c1 CHAR(1); DECLARE @c2 CHAR(1);
   DECLARE @CaseLen INT;
   DECLARE @CaseExceptions VARCHAR(8000);
   DECLARE @CaseValue VARCHAR(8000);

   -- Set some default values
   SELECT @Reset = 1, @i=1, @Ret = '';

   -- only apply if all characters are already in uppercase
   IF (UPPER(@Text)=@Text COLLATE Latin1_General_CS_AI)
        BEGIN

                -- add a leading and trailing space to indicate word delimiters (bol & eol)
                SET @Text = ' ' + @Text + ' ';

                -- cycle through each character,
                -- if non-alpha, uppercase next alpha character.
                -- if alpha then lowercase subsequent alphas.
                WHILE (@i <= LEN(@Text))
                        SELECT
                                @c0=SUBSTRING(@Text,@i-2,1), @c1=SUBSTRING(@Text,@i-1,1), @c2=SUBSTRING(@Text,@i,1),
                                @Ret = @Ret + CASE WHEN @Reset=1 THEN UPPER(@c2) ELSE LOWER(@c2) END,
                                @Reset = CASE
                                                -- EXCEPTION: McDonalds and Mc Donalds but not Tomcat
                                                WHEN @c0 = ' ' AND @c1 = 'M' AND @c2 = 'c' THEN 1

                                                -- EXCEPTION: D'Artagnan, I'Anson, O'Neils and O' Sullivan but Mo'Money
                                                WHEN @c0 = ' ' AND @c1 IN ('D', 'I', 'O') AND @c2 = '''' THEN 1

                                                WHEN @c2 LIKE '[a-zA-Z'']' THEN 0               -- Apply LOWER to any character after alphas or apostrophes
                                                ELSE 1                                                                  -- Apply UPPER to any character after symbols/punctuation
                                        END,
                                @i = @i +1

                -- add a trailing space in case the previous rule changed this.
                SET @Ret = @Ret + ' ';

                -- custom exceptions: this search is case-insensitive and will
                -- replace the word to the case as it is written in the list.
                -- NOTE: this list has to end with a comma!
                SELECT @i=0, @CaseLen=0,
                        @CaseExceptions = 'and,at,de,for,in,of,or,the,to,van,y,'+
                        'BI,GP,HR,IT,NHS,OVC,'+
                        'PA,PR,PT,PTHP,UET,UK,USA,VC,'+
                        'AskMe,HoD,PO Box,SportMe,'

                -- Loop through exception cases
                WHILE CHARINDEX(',', @CaseExceptions, @i+1)>0
                        BEGIN
                                -- get the delimited word
                                SET @CaseLen = CHARINDEX(',', @CaseExceptions, @i+1) - @i
                                SET @CaseValue = SUBSTRING(@CaseExceptions, @i, @CaseLen)

                                -- replace it in the original text
                                SET @Ret = REPLACE(@Ret, ' '+@CaseValue+' ', ' '+@CaseValue+' ')

                                -- get position of next word
                                SET @i = CHARINDEX(',', @CaseExceptions, @i+@CaseLen) +1
                        END

                -- remove any leading and trailing spaces
                SET @Ret = LTRIM(RTRIM(@Ret));

                -- capitalize first character of data irrespective of previous rules
                SET @Ret = UPPER(SUBSTRING(@Ret,1,1)) + SUBSTRING(@Ret,2,LEN(@Ret));

        END
   ELSE
    BEGIN
                -- return the string unaffected if it is not in uppercase
                SET @Ret=@Text
        END

   RETURN @Ret
END

Examples of what this script does:
Reminder: The script only affects data already in UPPERCASE. Change the IF (UPPER(@Text)=@Text COLLATE Latin1_General_CS_AI) line if you want to change this.

OriginalProper-Cased
JOHN SMITH-JONESJohn Smith-Jones
HEAD OF IT SERVICESHead of IT Services
FLAT 123AFlat 123A
MRS. O'CONNOR-MCDONALDMrs. O'Connor-McDonald
ORG_DEPT_STATUSOrg_Dept_Status
SEND.HELP.NOWSend.Help.Now
PO BOX 1234PO Box 1234
B'MTH BH1 3ABB'mth Bh1 3Ab
123 THE WORLD'S END123 the World's End
'WONDERWALL''wonderwall'
"FAIRLANDS""Fairlands"
C/O DAD'S (JIMMY)C/O Dad's (Jimmy)
SampleTextSampleText
lowercase.textlowercase.text

Examples of usage:
SELECT
    ufn_ProperCase([Surname])
FROM
    employees


Additional:
Note how the above doesn't convert postal codes, specifically British postcodes, which isn't included in the examples above because the function was only going to be applied to names and titles rather than addresses, but here it is anyway, add this after the "loop through exception cases" while clause:
-- Now for some UK (British) PostCode recognition
SELECT @i=1, @CaseExceptions = @Ret;
WHILE CHARINDEX(' ', @CaseExceptions, @i+1)>0
        BEGIN
                -- get the delimited word
                SET @CaseLen = CHARINDEX(' ', @CaseExceptions, @i+1) - @i
                SET @CaseValue = SUBSTRING(@CaseExceptions, @i, @CaseLen)

                -- Is this word a UK postal outward code (2 letters and number(s))
                IF ((LEN(@CaseValue)<=4) AND (SUBSTRING(@CaseValue, 1, 1) LIKE '[a-zA-Z]') AND (SUBSTRING(@CaseValue, 2, 1) LIKE '[a-zA-Z]') AND (SUBSTRING(@CaseValue, 3, 1) LIKE '[0-9]'))
                        SET @CaseValue = UPPER(@CaseValue);

                -- Is this word a UK postal inward code (number followed by letters)
                IF ((LEN(@CaseValue)<=4) AND (SUBSTRING(@CaseValue, 1, 1) LIKE '[0-9]') AND (SUBSTRING(@CaseValue, 2, 1) LIKE '[a-zA-Z]') AND (SUBSTRING(@CaseValue, 3, 1) LIKE '[a-zA-Z]'))
                        SET @CaseValue = UPPER(@CaseValue);

                SET @Ret = REPLACE(@Ret, ' '+@CaseValue+' ', ' '+@CaseValue+' ')

                -- get position of next word
                SET @i = CHARINDEX(' ', @CaseExceptions, @i+@CaseLen) +1
        END



Related Articles

Joes Revolver Map

Accreditation

Badge - Certified Zoho Creator Associate
Badge - Certified Zoho Creator Associate

Donate & Support

If you like my content, and would like to support this sharing site, feel free to donate using a method below:

Paypal:
Donate to Joel Lipman via PayPal

Bitcoin:
Donate to Joel Lipman with Bitcoin - Valid till 8 May 2022 3QnhmaBX7LQSRsC9hh6Je9rGQKEGNQNfPb
© 2021 Joel Lipman .com. All Rights Reserved.