OccamsRecord provides the power tools that ActiveRecord forgot. Specifically, advanced eager loading capabilities, full support for hand-written SQL, cursors, and high performance for read-only operations. Use it alongside ActiveRecord to unlock the power of your database.
Contribute to OccamsRecord's development at github.com/jhollinger/occams-record.
Full documentation is available at rubydoc.info/gems/occams-record.
Occam's Razor & Simplicity
Do not multiply entities beyond necessity. -- William of Ockham
This definition of simplicity is a core tenant of OccamsRecord. Good libraries are simple, fast, and stay out of your way.
Fast & read-only
OccamsRecord embraces simplicity by doing less. The vast majority of ActiveRecord objects are used read-only, yet each prepares and holds internal state just in case it's used for writing. By returning results as structs, OccamsRecord offers a baseline 3x-5x speed boost and 2/3 memory reduction.
No N+1 problem
OccamsRecord simply refuses to do lazy loading, aka the "N+1 query problem". If you want to use an association, eager load it up-front. While ActiveRecord now supports similar opt-in behavior, it still can't beat Occams in speed and the power of defaults.
No arbitrary limitations
OccamsRecord also embraces simplicity by making things easier for you, the person writing code. ActiveRecord has poor support for hand-written SQL, advanced eager loading scenarios, and advanced database features like cursors. OccamsRecord makes everything simpler by providing all of this in a way that's familiar and easy to use.
Installation
Simply add the occams-record
gem to your Gemfile
:
gem 'occams-record'
OccamsRecord has no configuration. As a companion library to ActiveRecord, all database configuration comes from ActiveRecord.
Your First Query
Docs: query
Write your queries like normal:
q = Order.
completed.
where("order_date > ?", 30.days.ago).
order("order_date DESC")
But run them with OccamsRecord:
orders = OccamsRecord.
query(q).
to_a
Now instead of bloated ActiveRecord objects, orders
is an array of fast, small structs!
You may use any Enumerable method to run your query and iterate results:
OccamsRecord.query(q).each { |order| ... }
OccamsRecord.query(q).map { |order| ... }
OccamsRecord.query(q).reduce([]) { |acc, order| ... }
Batching
OccamsRecord provides find_each and find_in_batches methods that work similarly to their ActiveRecord counterparts.
OccamsRecord.query(q).find_each { |order|
...
}
OccamsRecord.query(q).find_in_batches { |orders|
orders.each { |order|
...
}
}
Using PostgreSQL? Consider using find_each_with_cursor or find_in_batches_with_cursor for a performance boost. See Cursors for more info.
Eager Loading
Docs: eager_load
OccamsRecord's eager_load
method is similar to ActiveRecord's preload
(i.e. it uses a separate query instead of a join).
OccamsRecord.
query(q).
eager_load(:customer).
eager_load(:line_items).
find_each { |order|
puts order.customer.first_name
puts order.line_items[0].cost
}
Nested eager loading is done with blocks. Isn't it so much more readable?
OccamsRecord.
query(q).
eager_load(:customer).
eager_load(:line_items) {
eager_load(:product)
eager_load(:something_else) {
eager_load(:yet_another_thing)
}
}.
find_each { |order|
puts order.customer.first_name
order.line_items.each { |i|
puts i.product.name
puts i.something_else.yet_another_thing.description
}
}
There's lots more that eager_load
can do. We'll cover it in Advanced Eager Loading.
Advanced Eager Loading
Docs: eager_load
Select just the columns you need
Pulling back only the columns you need can be noticeably faster and use less memory, especially for wide tables.
OccamsRecord.
query(q).
eager_load(:customer, select: "id, name")
Fully customize the query
You can snag the eager load's query and customize it using your model's scopes or query builder methods (select
, where
, joins
, order
, etc).
OccamsRecord.
query(q).
eager_load(:customer, ->(q) { q.active.order(:name) })
There's a block-based syntax that's easier to read for long queries:
OccamsRecord.
query(q).
eager_load(:customer) {
scope { |q|
q.active.
joins(:account).
where("accounts.something = ?", true).
select("customers.id, customers.name")
}
}
Block-argument syntax
If you need to call methods from the surrounding environment, like params
in a Rails controller, use the block-argument syntax.
OccamsRecord.
query(q).
eager_load(:customer) { |c|
c.scope { |q| q.where(some_column: params[:some_value]) }
c.eager_load(:account) { |a|
a.eager_load(:something_else)
}
}
Ad Hoc Associations
Docs: eager_load_many, eager_load_one
On rare occasions you may need to eager load an association that doesn't actually exist in your models. Maybe it's too convoluted to represent with ActiveRecord. Or maybe it's just deeply nested and you don't want to waste time/memory loading all the intermediate records.
eager_load_many
The following example uses eager_load_many
to load a non-existent, has-many association on Product
called customers
. Each product will have a customers
attribute that contains the customers who bought the product.
OccamsRecord.
query(Product.all).
eager_load_many(:customers, {:id => :product_id}, "
SELECT DISTINCT product_id, customers.*
FROM line_items
INNER JOIN orders ON line_items.order_id = orders.id
INNER JOIN customers on orders.customer_id = customers.id
WHERE
line_items.product_id IN (:ids)
AND customers.created_at >= :date
", binds: {
date: params[:date]
})
That's a lot, so we'll break it down. The method call really just looks like this:
eager_load_many(:customers, {:id => :product_id}, "SOME SQL", binds: {date: some_date})
The first argument, :customers
, simply gives this made-up association a name. We'll call product.customers
to get a product's customers.
The second argument, {:id => :product_id}
defines the parent-child mapping. In this case it says, "The parent product records have an id
field, and it will match the product_id
field in the child customers."
The third argument is the SQL that loads customers. Notice the line_items.product_id IN (:ids)
section. That's ensuring we're only loading customers that are related to the products we've loaded. OccamsRecord will provide those ids for us - don't worry. (And it's only called :ids
because we defined the parent mapping as :id
. If the parent mapping was instead :code
, we'd put :codes
in the SQL.)
The forth argument is optional. It can be a Hash or Array of any other query parameters you need.
eager_load_one
eager_load_one
works exactly the same but for one-to-one relationships.
Nesting ad hoc associations
Like other eager loads, you can nest ad hoc ones. Here's an eager_load_many
with an eager_load_one
nested inside:
OccamsRecord.
query(Product.all).
eager_load_many(:customers, {:id => :product_id}, "SELECT...") {
eager_load_one(:something, {:id => :customer_id}, "SELECT...")
}
Here's an eager_load_many
with a regular eager_load
nested!
OccamsRecord.
query(Product.all).
eager_load_many(:customers, {:id => :product_id}, "SELECT...", model: Customer) {
eager_load(:profile)
}
Notice that we added model: Customer
to eager_load_many
's arguments. That annotates the ad hoc association with the model, allowing us to use the regular eager_load
on Customer
associations like :profile
.
Hand-written SQL
Docs: sql
Sometimes you have to write a big, gnarly SQL query by hand. Here's an example using Common Table Expressions (CTE).
OccamsRecord.
sql("
WITH regional_sales AS (
SELECT region, SUM(amount) AS total_sales
FROM orders
GROUP BY region
), top_regions AS (
SELECT region
FROM regional_sales
WHERE total_sales > :min_sales
)
SELECT
region,
product,
SUM(quantity) AS product_units,
SUM(amount) AS product_sales
FROM orders
WHERE region IN (:regions)
GROUP BY region, product;
", {
min_sales: 10_000,
regions: ["A", "B", "C"],
}).
each { |order|
...
}
OccamsRecord supports several query param syntaxes:
# Rails-style
OccamsRecord.sql("SELECT ... WHERE orders.date > :date", {date: date})
OccamsRecord.sql("SELECT ... WHERE orders.date > ?", [date])
# Ruby-style
OccamsRecord.sql("SELECT ... WHERE orders.date > %{date}", {date: date})
OccamsRecord.sql("SELECT ... WHERE orders.date > %s", [date])
Eager Loading
Docs: sql, eager_load, eager_load_many, eager_load_one
Unlike ActiveRecord, OccamsRecord lets you eager load associations when using hand-written SQL. There are two ways to do it.
Using a model
If your results are close enough to a model, you can annotate the query with the model and eager_load
its associations.
OccamsRecord.
sql("
SELECT * FROM orders
WHERE order_date > :date
ORDER BY order_date DESC, id
", {
date: 30.days.ago
}).
model(Order).
eager_load(:customer) {
eager_load(:profile)
}.
each { |order|
puts order.customer.profile.username
}
This works because the query is returning orders.customer_id
, and that's the foreign key for the Order#customer
relationship.
Using ad hoc associations
If your results don't resemble a model, or you need to load associations from various models, you can write the SQL yourself in an "ad hoc association". See Ad Hoc Associations for more details.
OccamsRecord.
sql("
SELECT * FROM orders
WHERE order_date > :date
ORDER BY order_date DESC, id
", {
date: 30.days.ago
}).
eager_load_one(:customer, {:customer_id => :id}, "
SELECT * FROM customers
WHERE id IN (:customer_ids)
").
each { |order|
puts order.customer.name
}
This will take the customer_id
column from the parent query and match it to the id
column in the eager load query.
Batched Loading
Docs: sql, find_each, find_in_batches, find_each_with_cursor, find_in_batches_with_cursor
Unlike ActiveRecord, OccamsRecord lets you use batched loading with hand-written SQL. There are two ways to do it.
Cursor based
If you're using PostgreSQL, using cursors for batched loading is faster and easy:
OccamsRecord.
sql("
SELECT * FROM orders
WHERE order_date > :date
ORDER BY order_date DESC, id
", {
date: 10.years.ago
}).
find_each_with_cursor(batch_size: 1000) { |order|
...
}
Read more about using Cursors in OccamsRecord.
OFFSET & LIMIT based
With other databases you'll need to fall back to the traditional (and potentially slower) OFFSET & LIMIT
approach.
OccamsRecord.
sql("
SELECT * FROM orders
WHERE order_date > :date
ORDER BY order_date DESC, id
LIMIT :batch_limit
OFFSET :batch_offset
", {
date: 10.years.ago
}).
find_each(batch_size: 1000) { |order|
...
}
OccamsRecord will provide the values for :batch_limit
and :batch_offset
. Just put the references in the right place.
Cursors
Docs: find_each_with_cursor, find_in_batches_with_cursor, cursor
Note: This section is only relevant to applications using PostgreSQL.
For batched loading, cursors perform better than the traditional OFFSET & LIMIT
approach. If you're using PostgreSQL, take advantage of them with find_each_with_cursor and find_in_batches_with_cursor.
OccamsRecord.
query(q).
eager_load(:customer).
find_each_with_cursor { |order|
...
}
OccamsRecord.
query(q).
eager_load(:customer).
find_in_batches_with_cursor { |orders|
orders.each { |order| ... }
}
If you need custom logic when using your cursor, use the lower-level cursor method:
OccamsRecord.
query(q).
eager_load(:customer).
cursor.
open { |cursor|
cursor.move(:forward, 300)
orders = cursor.fetch(:forward, 100)
orders.each { |order| ... }
}
The cursor
var is an instance of OccamsRecord::Cursor.
Cursors with hand-written SQL
Using cursors with hand-written SQL is a breeze with find_each_with_cursor and find_in_batches_with_cursor.
OccamsRecord.
sql("
SELECT * FROM orders
WHERE order_date > :date
ORDER BY order_date DESC, id
", {
date: 10.years.ago
}).
find_each_with_cursor(batch_size: 1000) { |order|
...
}
And you still have low-level access via cursor:
OccamsRecord.
sql("
SELECT * FROM orders
WHERE order_date > :date
ORDER BY order_date DESC, id
", {
date: 10.years.ago
}).
cursor.
open { |cursor|
cursor.move(:forward, 300)
orders = cursor.fetch(:forward, 100)
orders.each { |order| ... }
}
Instance Methods
OccamsRecords results are just plain structs; they don't have methods from your Rails models. (Separating your persistence layer from your domain is good thing!) But sometimes you need a few methods. OccamsRecord provides two ways of accomplishing this.
Injecting modules
You may also specify one or more modules to be included in your results:
module MyOrderMethods
def description
"#{order_number} - #{date}"
end
end
module MyProductMethods
def expensive?
price > 100
end
end
orders = OccamsRecord.
query(Order.all, use: MyOrderMethods).
eager_load(:line_items) {
eager_load(:product, use: [MyProductMethods, OtherMethods])
}.
find_each { |order|
puts order.description # MyOrderMethods#description
puts order.line_items[0].product.expensive? # MyProductMethods#expensive?
}
ActiveRecord fallback mode
This is an ugly hack of last resort if you can’t easily extract a method from your model into a shared module. Plugins, like carrierwave, are a good example. When you call a method that doesn’t exist on an OccamsRecord result, it will initialize an ActiveRecord object and forward the method call to it.
The active_record_fallback
option must be passed either :lazy
or :strict
(recommended). :strict
enables ActiveRecord’s strict loading option, helping you avoid N+1 queries in your model code. :lazy
allows them. (:strict
is only available for ActiveRecord 6.1 and later.)
The following will forward any nonexistent methods for Order
and Product
records:
orders = OccamsRecord.
query(Order.all, active_record_fallback: :strict).
eager_load(:line_items) {
eager_load(:product, active_record_fallback: :strict)
}.
run
Exceptions
Aside from various possible RuntimeError
and ArgumentError
exceptions, OccamsRecord has several well-defined exceptions for common errors. The most useful are described here.
OccamsRecord::MissingEagerLoadError
As mentioned in the introduction, OccamsRecord won't lazy load any associations for you. If you forget to eager load one and try to use it, it will throw an OccamsRecord::MissingEagerLoadError
exception.
OccamsRecord.
query(q).
eager_load(:line_items) {
eager_load(:product)
}.
find_each { |order|
# this throws because it tries to access "category", which we didn't eager load
puts order.line_items[0].product.category.name
}
The message contains helpful information telling us exactly where we forgot to eager load it:
Association 'category' is unavailable on Product because it was not eager loaded! Occams Record trace: root.line_items.product
OccamsRecord::MissingColumnError
Elsewhere we noted that your eager loads can specify a subset of columns to select (for performance reasons). If you try to access a column you didn't select, it will throw an OccamsRecord::MissingColumnError
exception.
OccamsRecord.
query(q).
eager_load(:line_items) {
eager_load(:product, select: "id, name")
}.
find_each { |order|
# this throws because it tries to access the "description" column, which we didn't select
puts order.line_items[0].product.description
}
The message contains helpful information telling us exactly where we forgot to select it:
Column 'description' is unavailable on Product because it was not included in the SELECT statement! Occams Record trace: root.line_items.product"