SOUNDEX (Transact-SQL)

THIS TOPIC APPLIES TO: yesSQL Server (starting with 2008)yesAzure SQL DatabaseyesAzure SQL Data Warehouse yesParallel Data Warehouse

Returns a four-character (SOUNDEX) code to evaluate the similarity of two strings.

Topic link icon Transact-SQL Syntax Conventions

Syntax

-- Syntax for SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse  

SOUNDEX ( character_expression )  

Arguments

character_expression
Is an alphanumeric expression of character data. character_expression can be a constant, variable, or column.

Return Types

varchar

Remarks

SOUNDEX converts an alphanumeric string to a four-character code that is based on how the string sounds when spoken. The first character of the code is the first character of character_expression, converted to upper case. The second through fourth characters of the code are numbers that represent the letters in the expression. The letters A, E, I, O, U, H, W, and Y are ignored unless they are the first letter of the string. Zeroes are added at the end if necessary to produce a four-character code. For more information about the SOUNDEX code, see The Soundex Indexing System.

SOUNDEX codes from different strings can be compared to see how similar the strings sound when spoken. The DIFFERENCE function performs a SOUNDEX on two strings, and returns an integer that represents how similar the SOUNDEX codes are for those strings.

SOUNDEX is collation sensitive. String functions can be nested.

SOUNDEX Compatibility

In previous versions of SQL Server, the SOUNDEX function applied a subset of the SOUNDEX rules. Under database compatibility level 110 or higher, SQL Server applies a more complete set of the rules.

After upgrading to compatibility level 110 or higher, you may need to rebuild the indexes, heaps, or CHECK constraints that use the SOUNDEX function.

  • A heap that contains a persisted computed column defined with SOUNDEX cannot be queried until the heap is rebuilt by running the statement ALTER TABLE <table> REBUILD.

  • CHECK constraints defined with SOUNDEX are disabled upon upgrade. To enable the constraint, run the statement ALTER TABLE <table> WITH CHECK CHECK CONSTRAINT ALL.

  • Indexes (including indexed views) that contain a persisted computed column defined with SOUNDEX cannot be queried until the index is rebuilt by running the statement ALTER INDEX ALL ON <object> REBUILD.

Examples

The following example shows the SOUNDEX function and the related DIFFERENCE function. In the first example, the standard SOUNDEX values are returned for all consonants. Returning the SOUNDEX for Smith and Smythe returns the same SOUNDEX result because all vowels, the letter y, doubled letters, and the letter h, are not included.

-- Using SOUNDEX  
SELECT SOUNDEX ('Smith'), SOUNDEX ('Smythe');  

Here is the result set. Valid for a Latin1_General collation.


----- -----   
S530  S530    

(1 row(s) affected)  

The DIFFERENCE function compares the difference of the SOUNDEX pattern results. The following example shows two strings that differ only in vowels. The difference returned is 4, the lowest possible difference.

-- Using DIFFERENCE  
SELECT DIFFERENCE('Smithers', 'Smythers');  
GO  

Here is the result set. Valid for a Latin1_General collation.

-----------   
4             

(1 row(s) affected)  

In the following example, the strings differ in consonants; therefore, the difference returned is 2, the greater difference.

SELECT DIFFERENCE('Anothers', 'Brothers');  
GO  

Here is the result set. Valid for a Latin1_General collation.

-----------   
2             

(1 row(s) affected)  

Examples: Azure SQL Data Warehouse and Parallel Data Warehouse

The following example shows the SOUNDEX function and the related DIFFERENCE function. In the first example, the standard SOUNDEX values are returned for all consonants. Returning the SOUNDEX for Smith and Smythe returns the same SOUNDEX result because all vowels, the letter y, doubled letters, and the letter h, are not included.

-- Using SOUNDEX  
SELECT SOUNDEX ('Smith'), SOUNDEX ('Smythe');  

Here is the result set. Valid for a Latin1_General collation.


----- -----   
S530  S530    

(1 row(s) affected)  

The DIFFERENCE function compares the difference of the SOUNDEX pattern results. The following example shows two strings that differ only in vowels. The difference returned is 4, the lowest possible difference.

-- Using DIFFERENCE  
SELECT DIFFERENCE('Smithers', 'Smythers');  
GO  

Here is the result set. Valid for a Latin1_General collation.

-----------   
4             

(1 row(s) affected)  

In the following example, the strings differ in consonants; therefore, the difference returned is 2, the greater difference.

SELECT DIFFERENCE('Anothers', 'Brothers');  
GO  

Here is the result set. Valid for a Latin1_General collation.

-----------   
2             

(1 row(s) affected)  

See Also

DIFFERENCE (Transact-SQL)
String Functions (Transact-SQL)
ALTER DATABASE Compatibility Level (Transact-SQL)