mediagoblin-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [GMG-Devel] Adventures in database migrations


From: Jessica Tallon
Subject: Re: [GMG-Devel] Adventures in database migrations
Date: Sun, 06 Dec 2015 22:31:10 +0100
User-agent: mu4e 0.9.15; emacs 24.5.1

Hey,

Sorry that you've had a bunch of issues. Hopefully I can commit some
fixes for some of this though.

ayleph writes:
> I recently upgraded a moderate-sized database from migration 26 to
> migration 43. I encountered a few issues along the way and had to attack
> this in steps. I wanted to share my experience in case we can make
> things easier for others.
>
> Migration 29 caused me a couple of issues.
>
> Issue 29-1: I had numerous errors of the type "AttributeError:
> 'NoneType' object has no attribute 'id'. These were caused by entries in
> my core__activities table which had a media_id column entry that no
> longer existed in core__media_entries. My core__activities table had
> roughly 9000 entries, and over 1300 of those referenced media entries
> which no longer existed on the server.

Yeah so what's happening here is we now use the "Graveyard"
model when you remove most objects (users, media, comments, etc.), a
Graveyard object/tombstone is created for the object with a small shell
of metadata which everything now points to. Prior to this coming about
if objects were deleted there were some not so great side effects you're
experiancing.

I think the solution is providing a migration which removes broken
activities. As for the problematic migrations such as this one, fix
them so if they find a broken activity, they just skip them and let the
broken activity migration take care of them.

> Solution 29-1: I did this the long way. I added a line to
> mediagoblin/db/migrations.py to print the id of each entry in
> core__activities that returned an error. I collected all of these and
> ran them through a bash script that connected to the psql database and
> ran "delete from core__activities where id=$id". There's probably a
> smarter/quicker way to do this in sql statements.
>
> Issue 29-2: Migration 29 ate my hard disk. This migration iterates
> through core__activities (which, as I mentioned, was about 9000 items on
> this instance). As the migration chugged through this giant for loop, it
> kept eating disk space until my /var partition was full. My entire
> database only takes up about 28M in a psql dump, but the live database
> grew to over 1GB during the migration.

Sounds like quite a problem, I never tested my migrations on such a
large number of records and the efficiancy of them was an oversight on
my part. I shall go through them tomorrow and look for places this could
happen and more or less do what you've suggested by moving the commits
inside the loops.

> Solution 29-2: I modified mediagoblin/db/migrations.py to perform
> db.commit() after each entry in the gigantic for loop instead of once at
> the very end. This probably created a pretty big IO hit, but I'll take
> that over a full partition and a failed migration.
>
> Migration 35 caused the same disk issue as 29-2. Again, I resolved this
> by moving the db.commit() statement inside the for loop.
>
> Migration 42 caused an issue similar to 29-1. I got errors saying "null
> value in column 'obj_pk' violates not-null constraint." In this case,
> the offending entries were rows inside core__reports_on_media whose
> media_entry_id was null and rows inside core__reports whose object_id
> was null. To get past this, I just nuked all the offending rows.
>
> If any of these sound like bugs, I'd be happy to file reports on the
> issue tracker. It's highly likely that some of my problems were caused
> by failed account deletions that left things behind in the database,
> such as errors deleting accounts that have created blogs [1]. I feel
> like the db.commit issues need a better solution than "get a bigger hard
> drive" though.

Both the efficiany issues and broken activities problems are issues. If
you can report them that'd be great otherwise I can always do it when I
come to fix them tomorrow. If you do file them can you reply with the
issue numbers just so i can make a note for when i get to them.

> Something else I should mention is that I had problems trying to upgrade
> using a series of commits instead of jumping straight to current master.
> To debug the migration failures, I would checkout a specific commit and
> see what happened during the migration. I found that in some cases,
> choosing the wrong commit would cause issues. For example, at some point
> in time the core__activities table definition changes from having a
> column named "object" to having one named "object_id". If you run a
> migration from an earlier commit, you end up with the "object" column
> which causes problems down then road. Just something to be aware of for
> anyone else debugging in steps like I tried to do.

Yeah in general, I would probably avoid checking out older commits and
running their migrations, they might contain bugs which were later
fixed.

> [1] https://issues.mediagoblin.org/ticket/5308


Thanks for all your testing Ayleph :) I'll try and get these fixed ASAP.


--
Thanks,
Jessica.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]