-
-
Notifications
You must be signed in to change notification settings - Fork 340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ability to export tables #4090
base: develop
Are you sure you want to change the base?
Conversation
@ghislaineguerin @kgodey I've assigned you to this PR to get a sign off on the UX. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall method seems reasonable. Nice work on that!
However, I think most of the SQL code and a large part of the Python code could be removed if you can get ServerCursor.fetchmany
working (or just use the builtin itersize
param and loop through rows one at a time clientside).
If there's a really good reason that doesn't work, what is it?
Also, please just pull apart the msar.list_records_from_table
result to get the actual records to return. I don't want to have to maintain that formatted query in multiple places if we can avoid it. Alternatively, you could extract the CTE-generating code and call it from both the record lister and the iterable record lister SQL functions. I prefer the former approach, since it prioritizes the efficiency of the hot path, i.e., listing records for display normally.
while True: | ||
cursor.execute(fetch_query) | ||
rows = cursor.fetchall() | ||
if not rows: | ||
break | ||
yield rows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the fetchmany
functionality of the psycopg.ServerCursor
class? This seems like it's reinventing the wheel.
See the docs for ServerCursor.
I think it has all the pieces you need, and it'd be more convenient than rolling our own if it works.
'SELECT %1$s FROM %2$I.%3$I %7$s %6$s LIMIT %4$L OFFSET %5$L', | ||
COALESCE(columns_sql, 'NULL'), | ||
msar.get_relation_schema_name(tab_id), | ||
msar.get_relation_name(tab_id), | ||
limit_, | ||
offset_, | ||
msar.build_order_by_expr(tab_id, order_), | ||
msar.build_where_clause(tab_id, filter_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code replication is avoidable by just calling the record lister, then pulling apart the result. Given that this is not exactly a "Hot Path", I think the very minor inefficiency involved is well worth the increase in maintainability.
Fixes #4067
This PR implements downloads for tables using the following approach:
Querying data:
Transferring:
Approaches tested
Querying data:
COPY (SELECT * FROM table) TO
instead ofCOPY table TO
(since we need filtering & sorting and support for explorations in the future). This syntax has negligible performance difference compared to reading from a cursor.COPY
is not supported inside a db function.Transferring:
What is not implemented in this PR:
UX changes from the design
Screen.Recording.2024-12-18.at.7.32.06.PM.mov
Checklist
Update index.md
).develop
branch of the repositoryvisible errors.
Developer Certificate of Origin
Developer Certificate of Origin