≡ Menu

Mysql Remove Duplicate Data or Rows With DISTINCT

How do I remove duplicate data or rows from a MySQL query?

If output from a MySQL query contains duplicate data or row and if you would like to remove the same use DISTINCT.

DISTINCT combined with ORDER BY needs a temporary table in many cases. Consider following query:
mysql> SELECT firstname FROM address;

If you need unique firstname i.e remove duplicate contains type the command:
mysql> SELECT DISTINCT firstname FROM address;

You can use DISTINCT with multiple column as follows:
mysql> SELECT DISTINCT firstname, city FROM address;

For DISTINCT optimization please see this help page.

Tweet itFacebook itGoogle+ itPDF itFound an error/typo on this page?

{ 83 comments… add one }

  • jafar January 21, 2008, 6:03 pm


    delete from table1
    USING table1, table1 as vtable
    WHERE (NOT table1.ID=vtable.ID)
    AND (table1.field_name=vtable.field_name)

    1. Here you tell mysql that there is a table1.
    2. Then you tell it that you will use table1 and a virtual table with the values of table1.
    3. This will let mysql not compare a record with itself!
    4. Here you tell it that there shouldn’t be records with the same field_name.

    • Susanta Sri October 26, 2010, 2:43 pm

      You saved my day Man!!!

      • Vikram December 18, 2010, 7:59 am

        Nice 1 bro…!!

    • pablo sanchez February 8, 2011, 3:44 pm

      Thank you!

      • overseer May 16, 2011, 9:56 am

        Totally awesome.

    • kalaivani July 7, 2011, 9:38 am

      Hi frnd,
      your query is gud, but i have one doubt while i’m deleting…….
      Suppose if i have two or more table and if that table contain more than 80000 records, and i find some duplicate records in my table & try to delete them means the other table will be suffer or not? because i kept all id’s as primary key & set to other table into foreign key, while i’m deleting a primary key, Will foreign key show error? or will other table be suffered?

    • manish September 4, 2011, 4:51 am

      thank you jafar bhayi .

    • kaushik December 1, 2011, 9:24 am

      thanx jafar..that is gr8 man..:)

    • Aaron April 18, 2012, 7:59 pm

      Works well, but be cautioned that both the duplicates are deleted, I almost got in trouble with this. Also seems to not work when there are null values being compared. Correct me if I am wrong.

  • jafar January 21, 2008, 6:22 pm

    The DISTINCT command is not working somehow!
    Also the delete rule what I send early is not good enough:
    It deletes all values which are duplicate,
    instead of leaving one behind!

  • Glen Rubin February 24, 2008, 12:29 am

    I have fixed this problem and have a tutorial on my website: http://www.weareallhumansnow.com

  • Amit Vyas March 3, 2008, 10:24 am

    select count(*) as num, name,id from table group by name having count(*)>1;

  • mikemon March 25, 2008, 6:05 pm

    The first answer by jafar is close, you just need some condition that is true for all except one of the duplicates.
    I ended up using:

    delete from table1
    USING table1, table1 as vtable
    WHERE (table1.ID > vtable.ID)
    AND (table1.field_name=vtable.field_name)

    Used “greater than” instead of “not equal”. This handles both the conditions that the records not be the same and that only one record will return false.
    Since we use greater than (or less than), only the first (or last) duplicate returns false, so it is not selected for deletion.

    • srsumon May 17, 2010, 6:31 am

      thx. mikemon. It really helps

    • Simon November 23, 2010, 2:32 am

      Thanks Mikemon and Jafar,

      It worked for me !

      /Simon

    • dave February 18, 2011, 1:30 pm

      Thanks mikemon and jafar!

  • SpyGM August 25, 2008, 2:36 pm

    here the code bro..

    tmp = table name;
    a = cols 1
    b = cols 2

    ALTER IGNORE TABLE tmp ADD UNIQUE INDEX(a,b);

    before :

    id | a | b
    1 | 1 | 2
    2 | 1 | 2
    3 | 2 | 3
    5 | 3 | 4

    after:

    id | a | b
    2 | 1 | 2
    3 | 2 | 3
    5 | 3 | 4

    tested… :D
    this not my idea..
    but i read it over net..
    and i`m used it..

    • Anonymous May 17, 2010, 1:26 am

      Adopted this solution, work like a charm! Thanks a lot

    • aeon October 31, 2010, 5:01 pm

      You’re a lifesaver man!

    • arjan October 12, 2012, 11:27 am

      Simply Wonderfull. This one worked and saved me hours! Thanx!

  • bardh7 September 16, 2008, 6:18 pm

    this could also help

    before :

    id | name
    ————-
    1 | test |
    2 | test |
    3 | php |
    4 | echo |

    after:

    id | name|
    ————
    2 | test |
    3 | php |
    4 | echo |

    <?php
    $cn= mysql_connect("localhost", "root", "") or die('Could not connect');
    $db= mysql_select_db(test, $cn) or die('Could not select database');
        $query="SELECT * FROM name";
        $result=mysql_query($query);
        echo mysql_error();
            while($row = mysql_fetch_array($result, MYSQL_BOTH))
            {
                    $query1="SELECT * FROM tablename where name = '".$row[1]."'";
                    $result1=mysql_query($query1);
                    $count = mysql_num_rows($result1) - 1;
                    mysql_query("DELETE FROM tablename WHERE name='".$row[1]."' LIMIT $count",$link);
                    echo "deleted $row[1] ";
    }
            echo mysql_error();
        ?>
    • MANNULAL August 26, 2011, 1:32 pm

      IT IS THE WAY OF HELPING EACH OTHER

  • Julia Valencia October 7, 2008, 11:06 am

    it is difficult to delete duplicates.

  • Caleb Gray December 31, 2008, 2:04 am

    This is the proper format, allowing for easy debugging of errors…
    DELETE
    FROM table1
    USING table1, table1 AS vtable
    WHERE vtable.id > table1.id
    AND table1.field_name = vtable.field_name

    If you’re nervous about deleteing everything, you can make sure it’s working first:
    SELECT *
    FROM table1, table1 AS vtable
    WHERE vtable.id > table1.id
    AND vtable.field_name = table1.field_name

    I just did this and it worked like a charm on hundreds of thousands of database entries.

    – Caleb

    • avo July 12, 2010, 11:40 am

      it’s been 10 minutes, still “loading” or “processing”… is this normal?

      • mkquake May 2, 2011, 6:59 pm

        Sure, if you are working with 80286 !

    • Tunc November 10, 2010, 4:49 pm

      Wow! Saved my life, thanks…

    • Ricky July 7, 2011, 4:05 am

      WOW your the MAN!!

      I used exactly what you said and then i checked with

      SELECT field_name,
      COUNT(field_name) AS NumOccurrences
      FROM cron_jobs
      GROUP BY field_name
      HAVING ( COUNT(field_name) > 1 )

      Worked like a charm!!

  • alex January 13, 2009, 1:56 pm

    Probably a stupid question, but what does the ‘id’ in table1.id and vtable.id stand for and does it have to be replaced individually?

  • blingin' January 15, 2009, 6:25 pm

    The “id” is the primary key column.

  • Ricardo Furtado January 20, 2009, 7:31 pm

    The Caleb Gray solution worked fine !

    Nice code ! Thank you for helping !!

  • James January 28, 2009, 1:38 pm

    SpyGM is spot on – and saved me a massive amount of time. His solution allows you to delete duplicates based on a combination of columns, rather than just a simple 1 to 1 comparison.

    The code I used for my table as follows (based on SpyGM’s comment):

    ALTER IGNORE TABLE `referrer` ADD UNIQUE INDEX(user_id,referrer_id);

    But as I understand it you could have more than 2 columns in here, or just the 1 if that’s all you need.

    Cheers

    • ganesh December 2, 2011, 5:54 am

      hey guys this work thanks………..

  • John February 10, 2009, 5:56 am

    The following fails if one of the unique index columns is more than 1000 characters long.

    ALTER IGNORE TABLE `referrer` ADD UNIQUE INDEX(user_id,referrer_id);

    Still useful…

  • Me April 12, 2009, 12:08 am

    SpyGM’s solution works for me.

    I’d guess it was Mike Chirico’s original he found: http://free.netartmedia.net/Databases/Databases8.html ;o)

  • Ruggie June 18, 2009, 9:10 pm

    Be careful using SpyGM’s suggestion, creating unique indexes on multiple columns may get rid of duplicates but may also have undesired results.

    For example, imagine a table like this:
    ID|Email|AcctID|DateAdded
    1|a@a.com|12|1/1/2009
    2|a@a.com|13|1/3/2009
    3|b@b.com|12|2/2/2009
    4|a@a.com|12|3/3/2009

    If this table is used to associate an email address to an account (that is, multiple email addresses associated to multiple accounts), you would want duplicates available in the email column and acctid column, but not duplicates where both cells are the same. (Record 4 would be the undesired record in this case)
    Forcing a unique index on the email and acctid column would wipe out row 2 and 4, when we only want to get rid of row 4.

    In situations like this, its a good idea to write your code to check for an existing record already containing those two pieces of information before you insert a new record.
    But if you already have a database full of mixed duplicates like this, then the best bet would be to compare every row like in the php example bardh7 listed above.

    Also, Indexes get updated every time a database gets updated, and they take additional space on top of your database. Used correctly, they can speed up searches on a database, used incorrectly they can have a negative impact on server performance.

  • Bruno Correia August 8, 2009, 10:05 pm

    This works:

    — remove duplicated entries
    — —————————————————————————————
    — step 1
    CREATE TABLE temp_table AS
    SELECT * FROM foo WHERE 1 GROUP BY bar;
    — step 2
    DROP TABLE foo;
    — step 3
    RENAME TABLE temp_table TO foo;
    — —————————————————————————————

  • Kyle August 20, 2009, 6:10 pm

    Caleb Gray’s solution also worked fine for me. The only problem I had was that because of the nature of the query a LIMIT clause could be used. I got around this by using a DELETE QUICK and following up after with an OPTIMIZE TABLE table1.

  • Staffo December 17, 2009, 2:43 pm

    Ruggie is wrong. I tested the ALTER IGNORE command, and it will not remove row with id 3 in his example – just id=4 will be deleted.

  • Ruggie December 19, 2009, 11:27 pm

    Turns out I did make a mistake.
    In the past when I create a unique index I’ve always done it per column. Which can cause undesired effects sometimes.
    (Example: ALTER IGNORE TABLE `test` ADD UNIQUE INDEX(user_id); )
    I didn’t know that by passing two column identifiers it would create a single index containing data from both columns. I assumed that it would create two indexes.

    This feature isn’t documented on the mysql ALTER syntax page but I did find it on the CREATE INDEX syntax page. Here is a quote: “A column list of the form (col1,col2,…) creates a multiple-column index. Index values are formed by concatenating the values of the given columns. ”

    Thanks Staffo, your response encouraged me take a second look!

  • Disappointed February 11, 2010, 8:51 am

    I could not even read the article because a banner ad was in the way that I could not close. This is a lame web site.

  • Ashish February 12, 2010, 11:34 am

    Thanks, jafar and mikemon for the right query.

  • mohsen March 20, 2010, 10:39 pm

    thank u so mush, this code is a life saver, and u too thanks

  • dana May 12, 2010, 7:17 pm

    Spygm’s worked great! Thanks,

    SpyGM:
    ALTER IGNORE TABLE tmp ADD UNIQUE INDEX(a,b);

  • sa June 5, 2010, 4:10 pm

    qive me mysql query which i delete onlu one duplicate row field thanks

    • Jim May 6, 2011, 1:39 am

      TRUNCATE TABLE bar WHERE duplicate IS true;

  • avo July 12, 2010, 11:41 am

    how long does it usually take for the query to finish its execution? :S

  • Gerbald September 12, 2010, 11:54 am

    This worked for me (no guarantees):

    DELETE FROM `MyTable`
    WHERE `DuplCol` IN (
    SELECT `DuplCol`
    FROM (

    SELECT `DuplCol` , count( * ) AS c
    FROM `MyTable`
    GROUP BY `DuplCol`
    HAVING c >1
    ) AS dctbl
    )
    AND `id` NOT IN (
    SELECT `mi`
    FROM (
    SELECT `DuplCol` , min( `id` ) AS mi
    FROM `MyTable`
    GROUP BY `DuplCol`
    ) AS idtbl
    );

    before:

    id | DuplCol
    ———–
    1 | aaa
    2 | bbb
    3 | bbb
    4 | aaa
    5 | ccc

    after:

    id | DuplCol
    ———–
    1 | aaa
    2 | bbb
    5 | ccc

    • John Komla December 8, 2011, 12:17 pm

      This saved my day. I even customized this to fit my project, as different users insert into that same table, only records attached to a particular logging session are removed from the table.

      This works great !

      Thank you again !

      • Neeraj August 23, 2013, 1:08 pm

        Gerbald has given a perfect answer !!

  • Dj November 18, 2010, 7:51 am

    thanx bro

  • INDRAJITH December 20, 2010, 7:43 am

    thanz bro………………………………………………….u saved me

  • Rangga December 24, 2010, 8:24 am

    Wow it’s really work . .

  • brij March 16, 2011, 1:12 pm

    DELETE FROM vehicle_makes USING vehicle_makes,
    vehicle_makes AS vtable WHERE (
    NOT vehicle_makes.id = vtable.id
    ) AND (
    vehicle_makes.name = vtable.name
    )

    =====================
    id name
    =====================
    2 Ottawa
    3 AM General/Hummer
    5 John Deere
    4 WRV 6 WRV
    17 Blue Bird
    20 Chance
    21 Crane Carrier
    22 Crane Carrier
    ======================

  • choudhury March 18, 2011, 3:00 am

    wow! This really worked for me…Thanks a lot Gerald…

    DELETE FROM `MyTable`
    WHERE `DuplCol` IN (
    SELECT `DuplCol`
    FROM (

    SELECT `DuplCol` , count( * ) AS c
    FROM `MyTable`
    GROUP BY `DuplCol`
    HAVING c >1
    ) AS dctbl
    )
    AND `id` NOT IN (
    SELECT `mi`
    FROM (
    SELECT `DuplCol` , min( `id` ) AS mi
    FROM `MyTable`
    GROUP BY `DuplCol`
    ) AS idtbl
    );

  • vikingu October 13, 2011, 10:16 am

    i’ll test at first, but looks like should workout!!

  • sane November 10, 2011, 11:31 am

    !!!!! I need synatax in mysql !!!!!!

    i create a table called studentaccount
    if i create a row at the end one column called “status” should hav value boolean value as true

    if i delete a row the status should be false

    what is the syntax plzzzz help me

  • Afshin November 17, 2011, 9:07 pm

    All of you assume that there is a unique ID or we can create a new table!
    Let’s say there is no ID and there is just one or two columns, and we don’t want to create a new table!

    any idea?

  • bob December 13, 2011, 5:19 am
    DELETE FROM table
     WHERE ID NOT IN (SELECT *
                        FROM (SELECT MIN(n.ID)
                                FROM table n
                            GROUP BY n.FIELDNAME) x)
    
    • Swamy December 18, 2014, 5:16 am

      I don’t know the reason right now,but I really want to say one thing that none of the query worked for my case except yours.

      Thanks alot…
      ur work saved my time…

    • Swamy December 18, 2014, 5:17 am

      Thanks alot bob..

  • John Ortiz December 22, 2011, 2:56 pm

    Thanks a lot for this mini-guide. It was useful for me.

  • Shairyar December 29, 2011, 9:07 am

    great thanks, exactly what i was looking for.

  • neeraj January 16, 2012, 8:16 am

    I like to below query if there is need to remove duplicate rows of table that does not contain primary key:

    DELETE FROM items WHERE GREATEST(0,@num := IF(NAME = @NAME, @num + 1, 0),LEAST(0, LENGTH(@NAME := NAME)))>0

    In my example items table just has one column “name” and above query removes the duplicate rows

  • Chicago Web January 17, 2012, 3:26 am

    Here is a much faster solution for large tables…. The other methods will take hours on a very large table.

    First we make a new table with distinct values for the field we are trying to remove duplicates for.

    CREATE TABLE new_table as
    SELECT * FROM old_table WHERE 1 GROUP BY column_to_remove_duplicates;

    Step 2: remove old table

    DROP TABLE old_table;

    Step 3: mv the new table back to the old one

    RENAME TABLE new_table TO old_table;

    Done!

  • Panama Red March 20, 2012, 5:50 pm

    I’m sure alot of these ways work, but Chicago Web’s way was quick, simple and did the trickt the first time. Thanks!

    PR

  • Mian Waqar April 23, 2012, 7:12 pm

    I want to ask something. I don’t understand how to get same data from table at only one time. Here are the example

    Like table name product_data and there are atleast six field.

    in data field the date look like this

    product Date
    23-04-2012
    23-04-2012
    23-04-2012
    23-04-2012
    23-04-2012
    23-04-2012
    18-04-2012
    18-04-2012

    But i want the result look like this.

    23-04-2012
    18-04-2012
    I only want to fetch the same date field in only one time. So anyone can tell me the solution.

    • Neeraj Kumar April 24, 2012, 2:53 am

      Mian, this is quite simple as you can get unique rows so use the below query:-

      SELECT DISTINCT productDate FROM product_data;

      • Mian Waqar April 24, 2012, 6:39 am

        thanks dear,

  • Kapil May 5, 2012, 10:34 am
    -----------------------------------------------------
    mysql> select * from md5_tbl;
    +----------------------------------+---------+
    | MD5                              | String  |
    +----------------------------------+---------+
    | 1F3870BE274F6C49B3E31A0C6728957F | apple   |
    | 7A10EA1B9B2872DA9F375002C44DDFCE | ball    |
    | B068931CC450442B63F5B3D276EA4297 | name    |
    | 1F3870BE274F6C49B3E31A0C6728957F | apple   |
    | 1F3870BE274F6C49B3E31A0C6728957F | apple   |
    | 791E0040179E55C81CAE0F69C0C9C662 | appleds |
    | 26402CC420AFFB03828FC87F89996F0B | apple1  |
    | C8453E1CD1329CFA0E6A938C7CC8012A | m1aball |
    | B8830D11F09FC99FD4786CB55098DEE0 | nameaq  |
    | AC05A6DD80CEAC53C3D9C4CCC00F2BFF | appled  |
    | 7A10EA1B9B2872DA9F375002C44DDFCE | ball    |
    | 84152737B78203F6117158A879E92B08 | m1abAll |
    | 1F3870BE274F6C49B3E31A0C6728957F | apple   |
    | 1F3870BE274F6C49B3E31A0C6728957F | apple   |
    | 87557F11575C0AD78E4E28ABEDC13B6E | End     |
    +----------------------------------+---------+
    15 rows in set (0.00 sec)
    -----------------------------------------------------
    100millions of data.
    

    How should I delete the duplicate row?

  • A Kader June 7, 2012, 8:37 am

    take dump of md5_tbl; (for safety)

    mysql>use ;
    create view temp_tb as select distinct MD5,String from md5_tbl;
    create table md5_tbl_cp like md5_tbl; insert into md5_tbl_cp select * from temp_tb;
    drop view temp_tb;
    delete from md5_tbl;
    insert into md5_tbl select * from md5_tbl_cp;
    drop table md5_tbl_cp;

  • shivam July 30, 2012, 7:29 am

    hi friends the above example which is given by mikemon is very good and it helps you alot

  • nikos August 5, 2012, 11:05 pm

    the following MySQL commands will create a temporary table and populate it with all columns GROUPED by one column name (the unique column that has duplicates) and order them by the primary key ascending. The second command simply renames the temporary table to the current used table, et voila you have removed all duplicates, only one entry of the former duplicates will stay in the database.

    that is a MUCH faster than jafars solution.
    Here are the two commands:

    CREATE TEMPORARY TABLE videos_temp AS SELECT * FROM videos GROUP by title ORDER BY videoid ASC;

    ALTER TABLE videos_temp RENAME videos;

  • nikos August 5, 2012, 11:30 pm

    > The following MySQL commands will create a temporary table and
    > populate it with all columns GROUPED by one column name (the column
    > that has duplicates) and order them by the primary key ascending. The
    > second command creates a real table from the temporary table. The
    > third command drops the table that is being used and finally the last
    > command renames the second temporary table to the current being used
    > table name.

    thats a really fast solution.
    Here are the four commands:

    1. CREATE TEMPORARY TABLE videos_temp AS SELECT * FROM videos GROUP by
    title ORDER BY videoid ASC;
    2. CREATE TABLE videos_temp2 AS SELECT * FROM videos_temp;
    3. DROP TABLE videos;
    4. ALTER TABLE videos_temp2 RENAME videos;

  • Allyssa August 29, 2012, 8:01 am

    Thank you! I got some precious information from your comment that you’ve shared guys. Thanks a lot! :-)

  • yogesh August 31, 2012, 7:40 am

    very good example thanks

  • Sanchari September 7, 2012, 6:04 am

    great…query works fine……..

  • bede October 16, 2012, 10:13 am

    thank you so much jafar! absolutely work even this year is 2012

  • parthiban December 24, 2012, 8:07 am

    mysql_query(“delete from mst_result using mst_result,mst_result as vtable where(mst_result.login > vtable.login) and(mst_result.login=vtable.login)”)

    its not working for me friends…

    i want to delete duplicate rows from..
    plz help me…

  • vamsi January 31, 2013, 4:10 pm

    I want to ask something. I don’t understand how to delete a single row from the below table
    Here are the example
    | user| filename | status |
    vk | Base paper.pdf | Failed |
    | vk | attendance.txt | Faile d |
    vamsi | ppt.pptx | Failed |
    | vamsi2dare621 | ppt.pptx | Failed |
    vamsi2dare621 | Clavister-security-in-the-cloud.pdf | Failed |
    vamsi | ms iv unit material mgt full.pptx | Failed |
    | vamsi2dare621 | ms iv unit material mgt full.pptx | Failed |

    IN the Above table i need to delete the user VK having filename attendence.txt but how to delete the row with out affecting the another user vk with file name Basepaper.pdf
    please give me a quick reply

  • Shahneel January 13, 2014, 2:05 pm

    Thanks a lot. I have been searching for this…

  • subrahmanyam August 2, 2014, 6:33 pm

    Simple Solution to Remove duplicates.
    I tried this simple query to remove duplicate title’s from my table – nap_news.

    DELETE n1 from nap_news n1, nap_news n2 WHERE n1.title= n2.title AND n1.id < n2.id

    Hope it will helps someone.

  • sreenath April 28, 2015, 9:24 am

    I have so many records in the tabel i have to print details based on the from date and to date.if from date is 01-01-14 and to date is 01-12-14. In this range i have only two records for the id:100. i.e
    no name sal month
    100 sree 10000 01-01-14
    100 sree 20000 01-12-14

    but i need out put like

    no name sal month
    100 sree 10000 01-01-14
    100 sree 0 01-02-14
    100 sree 0 01-03-14
    .
    .
    .
    .
    100 sree 20000 01-12-14

Leave a Comment