Every so often I need to remove duplicate rows from a table, and it always ends up being a hassle. Here are two simple ways to do it:

The first method uses my Ruby gem, rearmed. It is a collection of helpful methods and monkey patches that can be safely applied on an opt-in basis. The method in rearmed is called find_duplicates and can be used like so:


# Duplicates based on all attributes excluding id & timestamps
Model.find_duplicates

# Duplicates based on specific columns
Model.find_duplicates(:name, :description)

# Remove Duplicates
Model.find_duplicates(:name, delete: {keep: :first})

However, if you want to do this without adding a gem, you can use the following methods in your model to find and remove the duplicates. Note that you must decide which records to delete and which to keep.


class Model < ApplicationRecord

  # Returns one row per duplicated combination of the given columns.
  # The rows carry only those columns and a count (no id), so use them for inspection only.
  def self.get_duplicates(*columns)
    select(*columns, "COUNT(*) AS duplicate_count").group(*columns).having("COUNT(*) > 1")
  end

  def self.dedupe(*columns)
    # load the records oldest first and group them on the columns that should be unique
    order(:created_at).group_by { |record| columns.map { |col| record.public_send(col) } }.each_value do |duplicates|
      duplicates.shift # keep the first (oldest) one, or use pop to keep the last one instead

      # any records left in the group are duplicates, so destroy them
      duplicates.each(&:destroy)
    end
  end

end


columns = [:name, :description]
Model.get_duplicates(*columns) # inspect which combinations are duplicated
Model.dedupe(*columns)         # remove them; call this on the model itself, not on the grouped result
# or
Model.dedupe(:name, :description)
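
To see the keep-first versus keep-last decision in isolation, here is a minimal plain-Ruby sketch that runs without Rails. The Record struct, the duplicates_to_remove helper and its keep: option are hypothetical names made up for this illustration; they are not part of rearmed or ActiveRecord. It only shows the grouping logic and returns the records that would be deleted.

```ruby
# Record is a stand-in for a model row (hypothetical, for illustration only).
Record = Struct.new(:id, :name, :description, :created_at)

# Returns the records that would be removed, keeping one record per group.
# keep: :first keeps the oldest record in each group, keep: :last keeps the newest.
def duplicates_to_remove(records, columns, keep: :first)
  records
    .sort_by(&:created_at)
    .group_by { |record| columns.map { |col| record.public_send(col) } }
    .values
    .flat_map { |group| keep == :last ? group[0...-1] : group.drop(1) }
end

rows = [
  Record.new(1, "Widget", "Blue", 1),
  Record.new(2, "Widget", "Blue", 2),
  Record.new(3, "Gadget", "Red", 3),
  Record.new(4, "Widget", "Blue", 4)
]

duplicates_to_remove(rows, [:name, :description]).map(&:id)              # => [2, 4]
duplicates_to_remove(rows, [:name, :description], keep: :last).map(&:id) # => [1, 2]
```

In a real model you would call destroy on the returned records instead of just listing their ids.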


Related External Links: