Find Duplicate Records and the Number of Duplicates in SQL (MySQL)

I have found the need to account for duplicates and the number of duplicates in many databases. I’ve come across many people who have a minimal understanding of SQL and create wild php scripts to find duplicates, when really all that is needed is a single SQL statement.

select COLUMN, count(COLUMN) as cnt from TABLE group by COLUMN order by cnt desc

Where COLUMN is the column (all the same) you wish to find duplicate entries for. It will display the number of duplicates as an integer in descending order grouped by each entry in the COLUMN column. TABLE is the database table you wish to analyze.

Ex. Find the number of instances of the same name in an employee database:

select FirstName, count(FirstName) as cnt from Employee_Info group by FirstName order by cnt desc

MySQL has a sense of Humor (old)

I was browsing around the MySQL bug section posting some bugs and looking for old ones that have been solved. I came across bug #2: Does not make Toast

I guess it’s funny in that nerdy way. So this begs the question: When will computers control toasters and other kitchen appliances? There are fridges with computers built into them, fridges with control systems, and computers put into fridges but no ovens or oven-fridge combinations that choose a recipe, mix it up and cook it based on a computer’s recommendation. I say, bring on the computer cook! Or the hydrating oven from Back to the Future Part II w/ the mini-pizzas.

Wordpress & PHP on Windows Server IIS 6

This is a quick tip to those who are having trouble installing or using Wordpress on Windows Server with PHP.

Make sure in the PHP.ini file that the following is set:

; Magic quotes for runtime-generated data, e.g. data from SQL, from exec(), etc.
magic_quotes_runtime = Off   

If this is ON, Wordpress will not allow you to login. It will spit out the following error:

“You do not have sufficient permissions to access this page.”

You CAN have magic_quotes ON though, which is a different setting. The above setting is for data being pulled from a MySQL database for example.

Also make sure the following is set:

; cgi.fix_pathinfo provides *real* PATH_INFO/PATH_TRANSLATED support for CGI.  PHP’s
; previous behaviour was to set PATH_TRANSLATED to SCRIPT_FILENAME, and to not grok
; what PATH_INFO is.  For more information on PATH_INFO, see the cgi specs.  Setting
; this to 1 will cause PHP CGI to fix it’s paths to conform to the spec.  A setting
; of zero causes PHP to behave as before.  Default is zero.  You should fix your scripts
; to use SCRIPT_FILENAME rather than PATH_TRANSLATED.
cgi.fix_pathinfo=1

Unable to load dynamic library php_mysql.dll (MySQL 5.0.41)

You might have seen this error before in the Windows event viewer. However, even if you haven’t seen it, you still might (if you upgrade MySQL). Below is the full error you might find:

The description for Event ID ( 2 ) in Source ( PHP-5.x ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: php[4560], PHP Warning:  PHP Startup: Unable to load dynamic library ‘c:\dir\php_mysql.dll’ - Access is denied.

The reason is that in MySQL 5.0.41 libmysql.dll cannot be dynamically loaded on windows. The symbol USE_TLS is no longer defined in the Windows build.  Windows is expected to allocate the storage for THR_KEY_mysys in mysys/my_thr_init.c using the pthread_key macro.

How to fix this? Downgrade to MySQL 5.0.40 or 5.0.37. You can find them by simply pasting the download URL of the current build and changing the numbers. They always leave old builds on the FTP server.

vBulletin / Photopost and MySQL 5

On a few sites I run I have vBulletin and / or Photopost installed. Upgrading from MySQL 4 to MySQL 5 is simple because these programs are 100% compatible with both (as they should be). Except for one issue. When installing MySQL 5 it defaults to asking to be installed in “Strict Mode” which is how a database should work. Strict Mode behaves in a way that when a malformed statement is entered into the database, it aborts the operation. This is great for helping to provide a structure and predictability to the data in the database. However, some scripts are not programmed in a way that would allow it to work well with strict mode. In vBulletin, this error will show up:

  MySQL is running in strict mode. While you may proceed, some areas of vBulletin may not function properly. It is strongly recommended that you set $config['Database']['force_sql_mode'] to true in your includes/config.php file!

 

I would disable strict mode if possible over forcing vBulletin to use it, as some add-ons don’t play well with it.

If you are using the MySQL Administrator, go to startup variables -> advanced -> sql mode and note the options in the field. If there are no options stated, you are NOT in strict mode. If you see any or all of the following, the database is operating in strict mode:

STRICT_TRANS_TABLES, NO_AUTO_CREATE_USER, NO_ENGINE_SUBSTITUTION

Just remove all 3 statements to disable strict mode. If your server has been operating in strict mode previously and scripts have been designed around it, be sure you can disable it without any adverse effects.

Here is an explanation from the MySQL manual on the above options:

STRICT_TRANS_TABLES

If a value could not be inserted as given into a transactional table, abort the statement. For a non-transactional table, abort the statement if the value occurs in a single-row statement or the first row of a multiple-row statement.

NO_AUTO_CREATE_USER

Prevents GRANT from automatically creating new users unless a non-empty password is specified.

NO_ENGINE_SUBSTITUTION

Control automatic substitution of the default storage engine when a statement such as CREATE TABLE or ALTER TABLE specifies a storage engine that is disabled or not compiled in.

With NO_ENGINE_SUBSTITUTION disabled, the default engine is used and a warning occurs if the desired engine is known but disabled or not compiled in. If the desired engine is invalid (not a known engine name), an error occurs and the table is not created or altered.

MySQL Performance Tip #1: Query Cache

This tip can be done in either Windows or Linux. In Windows using the MySQL Administrator go into the Health Tab and then the Performance Tab. Look at the key_reads and key_read_requests numbers. The ratio of key_reads to key_read_requests should be above 1:100 if you can spare the memory. It should be no lower than 1:10.

How do you increase the ratio for better performance? Increase the  key_buffer_size value to an acceptable level.