is used in SQLAlchemy ORM when you’ve already joined the data you want to load, and you want to inform the ORM about this. This way, the ORM won’t need to send extra queries to the database to retrieve the related data.
Let’s look at an example in Flask-SQLAlchemy:
from flask_sqlalchemy import SQLAlchemy from sqlalchemy.orm import contains_eager db = SQLAlchemy() class User(db.Model): id = db.Column(db.Integer, primary_key=True) name = db.Column(db.String) class Order(db.Model): id = db.Column(db.Integer, primary_key=True) user_id = db.Column(db.Integer, db.ForeignKey('user.id')) user = db.relationship('User', backref='orders') # when querying for orders, also join users orders = db.session.query(Order).join(User) .options(contains_eager(Order.user)).all()
In this example, when we query for
Order, we also
User and inform the ORM using
contains_eager(). Now, when you access
order.user for any
Order object in
orders, SQLAlchemy ORM will not issue a new query to fetch the user because it knows the user has already been eagerly loaded.
When you define a relationship in SQLAlchemy, the ORM automatically knows how to load related objects. However, by default, it does this using a loading technique known as “lazy loading”. Under lazy loading, the related objects are not loaded until they are actually accessed. This means that in your example, even though you are joining
User in the query, SQLAlchemy won’t actually load the
User objects when the
Order objects are fetched. Instead, it will issue a new query to fetch a
User object every time you access the
user attribute of an
Order object for example Order.user.name
contains_eager() option changes this behavior. It tells SQLAlchemy that the join should be used to load the
User objects eagerly, i.e., at the same time as the
Order objects. This way, when you access the
user attribute of an
Order object, SQLAlchemy won’t need to issue a new query, because the
User object has already been loaded.
This can be a significant performance optimization if you’re going to access the
user attribute of every
Order, because it can drastically reduce the number of queries that need to be issued. But if you’re only going to access the
user attribute of a few
Order objects, lazy loading might be more efficient, because it avoids loading
User objects that you don’t need.
subqueryload are options for eager loading, which is a concept in SQLAlchemy where you load all the data you need from the database at once, as opposed to lazy loading, where additional data is queried as it’s needed. Here’s how they differ:
joinedload is used when you want SQLAlchemy to automatically join the related table and load the related objects in the same query when you load the main object. It is a form of eager loading. For example
users = User.query.options(joinedload(User.orders)).all()
This will fetch all users and their related orders in a single query, using a SQL JOIN.
contains_eager, on the other hand, is used when you are manually joining the related table in the query, but you still want SQLAlchemy to treat the related objects as being eagerly loaded.
users = User.query.join(User.orders).options(contains_eager(User.orders)).all()
In this case, you are manually joining the
Order table, but you still want the orders for each user to be treated as if they were eagerly loaded.
In general, use
joinedload when you want SQLAlchemy to take care of the join and loading the related objects, and use
contains_eager when you are handling the join yourself but still want the related objects to be treated as eagerly loaded.
Joinedlod is efficient in terms of the number of queries (since everything is loaded in a single query), but it can create very large result sets because each row of the main table is repeated for each related object.
This results in a separate SQL query for each relation, but these queries are constructed using a subquery so that only one round trip to the database is required. The main advantage of
joinedload is that it does not result in duplication of the main object data. However, it can be slower for small numbers of objects because of the overhead of the additional SQL queries.
users = User.query.options(subqueryload(User.orders)).all()
This would fetch all users in one query, and all related orders for those users in a second query, using a subquery to match the orders to the users.
joinedload is most useful when loading a single object and its related objects, and
subqueryload is often better when loading multiple objects and their related objects. However, the best choice can depend on the specific structure and data distribution of your database.