All you need to know about Sidekiq
Processing Background Jobs with Rails
What is Sidekiq?
Sidekiq is a Ruby framework to perform background jobs. It is an open-source job scheduler written in Ruby that is very useful for handling expensive computations and processes that are better served outside of the main web application.
How does Sidekiq work?
Sidekiq requires three parts to work:
1. Client
The Sidekiq client runs in any Ruby process (typically a puma, unicorn, or Passenger process) and allows you to create jobs for processing later.
There are two ways to create a job in your application code:
2.6.2 :001 > TestWorker.perform_async('easy')
=> "280fc47747be01da2077cc6d"
or
2.6.2 :005 > Sidekiq::Client.push('class' => TestWorker, 'args' => ['easy'])
=> "66d2829dbba8f1b80212d05e"
These two methods are equivalent and create a Hash which represents the job. The client serializes the Hash to a JSON string and pushes that String into a queue in Redis. This means the arguments to your worker must be simple JSON data types (numbers, strings, boolean, array, hash). Complex Ruby objects (e.g. Date, Time, ActiveRecord models) will not serialize properly.
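To see why only simple JSON types survive the trip through Redis, here is a minimal sketch that uses only Ruby's standard json and time libraries, not Sidekiq itself. The payload Hash is a simplified stand-in for the job Hash (real jobs carry more keys), and round_trip is a hypothetical helper name.

```ruby
require "json"
require "time"

# Hypothetical helper: serialize a payload to JSON and parse it back,
# roughly what happens between the client and the server.
def round_trip(payload)
  JSON.parse(JSON.generate(payload))
end

# Simple types come back unchanged.
simple = round_trip("class" => "TestWorker", "args" => ["easy", 3, true])
p simple["args"]

# A Time object comes back as a plain String, not as a Time.
stamp = Time.utc(2019, 7, 5, 12, 0, 0)
complex = round_trip("class" => "TestWorker", "args" => [stamp])
p complex["args"].first.class

# One common workaround: pass a primitive (here an ISO 8601 string)
# and rebuild the object inside perform.
restored = Time.iso8601(round_trip("args" => [stamp.iso8601])["args"].first)
p restored == stamp
```

The same idea applies to ActiveRecord models: passing the record id and loading the record inside perform avoids the serialization problem.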
2. Redis
Redis provides data storage for Sidekiq. Jobs pushed by the Sidekiq client remain in a Redis queue until the server picks them up.
3. Server
The Sidekiq server process pulls jobs from the queue in Redis and processes them. Like your web processes, Sidekiq boots Rails so your jobs and workers have the full Rails API, including Active Record, available for use. The server will instantiate the worker and call perform with the given arguments. Everything else is up to your code.
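The "instantiate the worker and call perform" step can be pictured with a small plain-Ruby sketch. This is not Sidekiq's actual implementation; EchoWorker and dispatch are hypothetical names used only for illustration, and the payload is reduced to the two keys discussed above.

```ruby
require "json"

# Hypothetical worker used only for this illustration (no Sidekiq required).
class EchoWorker
  def perform(type, count)
    "#{type} x #{count}"
  end
end

# Sketch of the dispatch step: decode the JSON payload, look up the class
# by name, instantiate it, and call perform with the stored arguments.
def dispatch(json_payload)
  job = JSON.parse(json_payload)
  worker_class = Object.const_get(job.fetch("class"))
  worker_class.new.perform(*job.fetch("args"))
end

payload = JSON.generate("class" => "EchoWorker", "args" => ["easy", 2])
puts dispatch(payload)
```

Because a fresh worker instance is created for each job, any state the job needs has to travel in the arguments.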
The number of jobs that the Sidekiq server performs at a time depends on the number of available threads. We can set the concurrency in the sidekiq.yml file; it should match the pool size in database.yml (and must not exceed it) so that every thread can get a database connection.
Let's find out more with code:
I am assuming that you have Sidekiq and Redis installed for your Rails application.
Ensure the Redis and Sidekiq servers are running on your machine. To start the Redis server on your local machine, run 'redis-server', and for Sidekiq run 'bundle exec sidekiq' in your terminal.
This is how my sidekiq.yml file looks:
:concurrency: 5
:pidfile: tmp/pids/sidekiq.pid
staging:
  :concurrency: 10 # This should be the same as the pool size in database.yml
production:
  :concurrency: 10 # This should be the same as the pool size in database.yml
I am using a development environment with concurrency 5.
Let's create a worker under the app/workers/ directory and name it test_worker.rb:
class TestWorker
  include Sidekiq::Worker
  sidekiq_options :queue => :default, :retry => 1

  # Simulates jobs of different durations; 'error' fails after a delay.
  def perform(type)
    case type
    when 'easy'
      sleep 5
      puts 'this is easy job 5 sec wait'
    when 'medium'
      sleep 20
      puts 'this is medium job 20 sec wait'
    when 'hard'
      sleep 30
      puts 'this is hard job 30 sec wait'
    when 'error'
      sleep 15
      puts 'there is an error, wait for 15 sec'
      raise 'raised error'
    end
  end
end
Performing a job:
2.6.2 :058 > TestWorker.perform_async('easy')
=> "f6c5ea4baa2689c1742dcac3"
2.6.2 :059 > Sidekiq::Queue.new('default').size
=> 0
Wait, what? My default queue is empty? Then where is my job?
Sidekiq has already picked up the job and started executing it, so it is no longer in the queue. Now let's push 10 hard jobs into the queue. In this case, the Sidekiq server picks up 5 jobs at a time, which is equal to the concurrency, and the other five jobs wait in the queue.
2.6.2 :061 > 10.times {TestWorker.perform_async('hard')}
=> 10
2.6.2 :062 > Sidekiq::Queue.new('default').size
=> 5
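Why the queue reports 5 can be reproduced with a plain-Ruby simulation. This is only an illustration built on Ruby's Thread and Queue classes, not Sidekiq code; simulate_busy_workers is a hypothetical helper, and the release queue stands in for the 30-second sleep of the hard job.

```ruby
# Plain-Ruby simulation (not Sidekiq code): `concurrency` threads each take
# one job and stay busy, so the remaining jobs stay in the queue.
def simulate_busy_workers(concurrency:, job_count:)
  jobs = Queue.new
  job_count.times { |i| jobs << "hard-#{i}" }

  started = Queue.new   # workers announce that they picked up a job
  release = Queue.new   # workers block here, standing in for `sleep 30`

  workers = Array.new(concurrency) do
    Thread.new do
      loop do
        begin
          job = jobs.pop(true)   # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break                  # nothing left to do, stop this worker thread
        end
        started << job
        release.pop              # stay busy until the main thread lets us finish
      end
    end
  end

  # Wait until every thread that can hold a job is holding one.
  [concurrency, job_count].min.times { started.pop }
  waiting = jobs.size            # jobs still queued while all threads are busy

  job_count.times { release << :finish }
  workers.each(&:join)

  { waiting_while_busy: waiting, left_at_end: jobs.size }
end

p simulate_busy_workers(concurrency: 5, job_count: 10)
```

With 5 threads and 10 jobs, the simulation reports 5 jobs waiting while the threads are busy and an empty queue once they have all finished, which mirrors the console session above.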
We can also schedule a job in Sidekiq:
2.6.2 :063 > TestWorker.perform_at(10.minutes.from_now, 'easy')
=> "67fed0bd2bf3a6d67b7f68e4"
2.6.2 :064 > Sidekiq::Queue.new('default').size
=> 0
Hmm, the job is not in the default queue? Then where is it?
Where to find Scheduled Jobs?
Sidekiq places scheduled jobs in the ScheduledSet, a named set that inherits from SortedSet. The scheduled sorted set holds all scheduled jobs in chronological order. At the scheduled time, each job moves to its respective queue.
2.6.2 :067 > ss = Sidekiq::ScheduledSet.new
=> #<Sidekiq::ScheduledSet:0x00007f96549d5bd8 @name="schedule", @_size=1>
2.6.2 :068 > ss.size
=> 1
To remove all jobs:
ss.clear
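The idea of a chronologically sorted set that hands jobs over to their queues when they are due can be sketched in plain Ruby. This is an in-memory illustration only; Sidekiq keeps this data in Redis, and TinySchedule and enqueue_due are hypothetical names.

```ruby
# Illustration only: a tiny in-memory stand-in for a sorted set of scheduled jobs.
class TinySchedule
  def initialize
    @entries = []   # [run_at, job] pairs, kept sorted by run_at
  end

  def add(run_at, job)
    @entries << [run_at, job]
    @entries.sort_by!(&:first)
  end

  def size
    @entries.size
  end

  # Move every job whose time has come into its target queue.
  # Returns the number of jobs that were moved.
  def enqueue_due(now, queues)
    due, @entries = @entries.partition { |run_at, _job| run_at <= now }
    due.each { |_run_at, job| (queues[job["queue"]] ||= []) << job }
    due.size
  end
end

now = Time.at(0)
schedule = TinySchedule.new
schedule.add(now + 600, { "class" => "TestWorker", "args" => ["easy"], "queue" => "default" })

queues = {}
schedule.enqueue_due(now + 60, queues)    # too early: the job stays scheduled
schedule.enqueue_due(now + 600, queues)   # due: the job moves to "default"
p schedule.size
p queues.keys
```

This is why Sidekiq::Queue.new('default').size stays at 0 right after perform_at: the job only appears in its queue once its scheduled time has arrived.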
What happens to the job when it raises an exception?
When a job raises an error, Sidekiq places it in the RetrySet for automatic retry later. Jobs are sorted based on when they will next retry. Sidekiq will retry failures with an exponential backoff using the formula (retry_count ** 4) + 15 + (rand(30) * (retry_count + 1)) seconds (i.e. 15, 16, 31, 96, 271, ... seconds + a random amount of time). It will perform 25 retries over approximately 21 days. Assuming you deploy a bug fix within that time, the job will get retried and successfully processed. After 25 times, Sidekiq will move that job to the Dead Job queue, assuming that it will need manual intervention to work.
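The backoff formula quoted above can be checked with a few lines of plain Ruby. This sketch simply evaluates the formula as written in this post; base_delay and retry_delay are hypothetical helper names, and the jitter parameter stands in for rand(30) so the result is reproducible.

```ruby
# Deterministic part of the backoff formula quoted above.
def base_delay(retry_count)
  (retry_count**4) + 15
end

# Full formula; `jitter` stands in for rand(30) so results can be reproduced.
def retry_delay(retry_count, jitter: rand(30))
  base_delay(retry_count) + (jitter * (retry_count + 1))
end

# Base delays for the first five retries.
p (0..4).map { |n| base_delay(n) }   # => [15, 16, 31, 96, 271]

# Expected total wait across 25 retries, using the average jitter of 14.5
# (rand(30) returns integers from 0 to 29).
total_seconds = (0...25).sum { |n| retry_delay(n, jitter: 14.5) }
puts (total_seconds / 86_400.0).round(1)
```

Under these assumptions the total comes out between 20 and 21 days, in line with the "approximately 21 days" figure.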
The maximum number of retries can be globally configured by adding the following to your sidekiq.yml:
:max_retries: 1
Alternatively, we can pass retry as an optional parameter to sidekiq_options in the worker.
2.6.2 :076 > TestWorker.perform_async('error')
=> "df30d22e593318a28b63c2ef"
After 15 seconds, when the job fails, it gets added to the RetrySet:
2.6.2 :078 > rs = Sidekiq::RetrySet.new
=> #<Sidekiq::RetrySet:0x00007f9654356460 @name="retry", @_size=1>
2.6.2 :079 > rs.size
=> 1
To remove all jobs:
rs.clear
Dead Jobs
Like RetrySet and ScheduledSet, the DeadSet holds all jobs considered dead by Sidekiq, ordered by when they died. It supports the same basic operations as the others.
The DeadSet is a holding pen for jobs that have failed all their retries. Sidekiq will not retry those jobs. The DeadSet is limited by default to 10,000 jobs or 6 months so it doesn't grow infinitely. Only jobs configured with 0 or greater retries will go to the DeadSet. Use retry: false if you want a particular type of job to be executed only once, no matter what happens.
2.6.2 :082 > ds = Sidekiq::DeadSet.new
=> #<Sidekiq::DeadSet:0x00007f965435c0e0 @name="dead", @_size=1>
2.6.2 :083 > ds.size
=> 1
To remove all jobs:
ds.clear
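The rules for where a failed job ends up (retry again, dead set, or simply dropped) can be summarized in a small decision function. This is a simplified reading of the behaviour described in this post, not Sidekiq's own code; after_failure is a hypothetical name, and the job is reduced to a Hash with "retry" and "retry_count" keys.

```ruby
DEFAULT_MAX_RETRIES = 25

# Decide what happens to a failed job, following the rules described above.
# "retry" may be true, false, or an Integer; "retry_count" is the number of
# retries that have already been attempted.
def after_failure(job)
  setting = job.fetch("retry", true)
  return :discard if setting == false   # retry: false never reaches the dead set

  max = setting == true ? DEFAULT_MAX_RETRIES : setting
  job.fetch("retry_count", 0) < max ? :retry : :dead
end

p after_failure("retry" => 1, "retry_count" => 0)   # first failure of TestWorker
p after_failure("retry" => 1, "retry_count" => 1)   # the single retry failed too
p after_failure("retry" => 0)
p after_failure("retry" => false)
```

For the TestWorker above (retry set to 1), this gives :retry on the first failure and :dead after the single retry also fails, which is the path the 'error' job takes in the console session.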
Important console commands you must know:
Stats:
stats = Sidekiq::Stats.new
stats.processed # => 100
stats.failed # => 3
stats.queues # => { "default" => 1001, "email" => 50 }
Gets the number of jobs enqueued in all queues (does NOT include retries and scheduled jobs).
stats.enqueued # => 5
Stats History:
All dates are UTC and history stats are cleared after 5 years.
Get a history of failed/processed stats:
s = Sidekiq::Stats::History.new(2) # Indicates how many days of data you want starting from today (UTC)
s.failed # => { "2019-07-05" => 120, "2019-07-04" => 234 }
s.processed # => { "2019-07-05" => 1010, "2019-07-04" => 1500 }
Start from a different date:
s = Sidekiq::Stats::History.new( 3, Date.parse("2019-07-03") )
s.failed # => { "2019-07-03" => 10, "2019-07-02" => 24, "2019-07-01" => 4 }
s.processed # => { "2019-07-03" => 124, "2019-07-02" => 345, "2019-07-01" => 355 }
Queues:
Get all queues
Sidekiq::Queue.all
Get a queue
Sidekiq::Queue.new # the "default" queue
Sidekiq::Queue.new("queue_name")
Gets the number of jobs within a queue.
Sidekiq::Queue.new.size # => 4
Deletes all Jobs in a Queue, by removing the queue.
Sidekiq::Queue.new.clear
Delete jobs within the default queue that have a jid of 'abcdef1234567890' or whose worker name is 'TestWorker':
queue = Sidekiq::Queue.new("default")
queue.each do |job|
  job.klass # => 'TestWorker'
  job.args # => ['easy']
  job.delete if job.jid == 'abcdef1234567890' || job.klass == 'TestWorker'
end
************************** THANK YOU **************************
To read more about the options that Sidekiq provides, see: https://github.com/mperham/sidekiq/wiki/Advanced-Options
To understand Rack and Rack middleware, take a quick read:
https://medium.com/@shashwat12june/rack-and-rack-middleware-f93513ac92a6