contains_eager()
is used in SQLAlchemy ORM when you’ve already joined the data you want to load, and you want to inform the ORM about this. This way, the ORM won’t need to send extra queries to the database to retrieve the related data.
Let’s look at an example in Flask-SQLAlchemy:
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.orm import contains_eager
db = SQLAlchemy()
class User(db.Model):
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String)
class Order(db.Model):
id = db.Column(db.Integer, primary_key=True)
user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
user = db.relationship('User', backref='orders')
# when querying for orders, also join users
orders = db.session.query(Order).join(User)
.options(contains_eager(Order.user)).all()
In this example, when we query for Order
, we also join
the User
and inform the ORM using contains_eager()
. Now, when you access order.user
for any Order
object in orders
, SQLAlchemy ORM will not issue a new query to fetch the user because it knows the user has already been eagerly loaded.
When you define a relationship in SQLAlchemy, the ORM automatically knows how to load related objects. However, by default, it does this using a loading technique known as “lazy loading”. Under lazy loading, the related objects are not loaded until they are actually accessed. This means that in your example, even though you are joining User
in the query, SQLAlchemy won’t actually load the User
objects when the Order
objects are fetched. Instead, it will issue a new query to fetch a User
object every time you access the user
attribute of an Order
object for example Order.user.name
The contains_eager()
option changes this behavior. It tells SQLAlchemy that the join should be used to load the User
objects eagerly, i.e., at the same time as the Order
objects. This way, when you access the user
attribute of an Order
object, SQLAlchemy won’t need to issue a new query, because the User
object has already been loaded.
This can be a significant performance optimization if you’re going to access the user
attribute of every Order
, because it can drastically reduce the number of queries that need to be issued. But if you’re only going to access the user
attribute of a few Order
objects, lazy loading might be more efficient, because it avoids loading User
objects that you don’t need.
Joinedload()
Both joinedload
and subqueryload
are options for eager loading, which is a concept in SQLAlchemy where you load all the data you need from the database at once, as opposed to lazy loading, where additional data is queried as it’s needed. Here’s how they differ:
joinedload
is used when you want SQLAlchemy to automatically join the related table and load the related objects in the same query when you load the main object. It is a form of eager loading. For example
users = User.query.options(joinedload(User.orders)).all()
This will fetch all users and their related orders in a single query, using a SQL JOIN.
contains_eager
, on the other hand, is used when you are manually joining the related table in the query, but you still want SQLAlchemy to treat the related objects as being eagerly loaded.
For example:
users = User.query.join(User.orders).options(contains_eager(User.orders)).all()
In this case, you are manually joining the Order
table, but you still want the orders for each user to be treated as if they were eagerly loaded.
In general, use joinedload
when you want SQLAlchemy to take care of the join and loading the related objects, and use contains_eager
when you are handling the join yourself but still want the related objects to be treated as eagerly loaded.
Joinedlod is efficient in terms of the number of queries (since everything is loaded in a single query), but it can create very large result sets because each row of the main table is repeated for each related object.
subqueryload()
This results in a separate SQL query for each relation, but these queries are constructed using a subquery so that only one round trip to the database is required. The main advantage of subqueryload
over joinedload
is that it does not result in duplication of the main object data. However, it can be slower for small numbers of objects because of the overhead of the additional SQL queries.
For example:
users = User.query.options(subqueryload(User.orders)).all()
This would fetch all users in one query, and all related orders for those users in a second query, using a subquery to match the orders to the users.
In general, joinedload
is most useful when loading a single object and its related objects, and subqueryload
is often better when loading multiple objects and their related objects. However, the best choice can depend on the specific structure and data distribution of your database.