MediaGoblin in Docker¶
Since version 0.14.0, MediaGoblin natively supports Docker. We push release versions of MediaGoblin as Docker images to the project’s Docker Hub account. This makes it easy for anyone to spin up a new service in Docker with no more prerequisite than the Docker runtime itself.
You can start a single standalone container using the official images published to Docker Hub. For real deployments, however, we recommend deploying a multi-container stack using, e.g., Docker Compose.
This page documents how to do either of those things.
Quickstart¶
A standalone container in charge of both serving and processing content can simply be started with
docker run --interactive --tty \
    --publish=6543:6543 --volume=/PATH/TO/YOUR/DATA:/srv \
    mediagoblin/mediagoblin:0.14.0.dev
This will download the official image from Docker Hub, and create a container running Mediagoblin. It will be accessible at http://localhost:6543.
The --publish option (or -p for short) makes the container’s port 6543 available to the host.
The --volume option (or -v for short) mounts a path from the local filesystem (/PATH/TO/YOUR/DATA, in this example) into the container. This is where MediaGoblin will store all its data. The directory can be empty initially, or already have been initialised by a previous run.
The --interactive --tty (or -it) options are not strictly needed, but they should allow you to terminate the running process with Ctrl+C, rather than having to use docker kill.
Note
See further down in this section for more details on data persistence.
On first run of the container, the administrator’s password will be autogenerated, and shown (once, and only once) in the log output.
===============================================================================
NEW ADMINISTRATOR ACCOUNT CREATED
ADMIN_USER=admin
ADMIN_PASSWORD=<AUTOGENERATED PASSWORD>
ADMIN_EMAIL=admin@example.com
===============================================================================
Note
See further down in this section to learn how to choose or change the admin’s username, password or email.
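If the container runs detached, or the banner has scrolled out of view, the password can still be recovered from the container logs as long as the container from that first run exists. A minimal sketch, assuming the banner lines look exactly like the output above; the function name and the container name mediagoblin (as would be set with --name) are hypothetical:

```shell
# Read MediaGoblin start-up logs on stdin and print the first
# ADMIN_PASSWORD value found (assumes the banner format shown above).
extract_admin_password() {
    sed -n 's/^.*ADMIN_PASSWORD=//p' | head -n 1
}

# Example, for a container started with --name mediagoblin (hypothetical):
#   docker logs mediagoblin 2>&1 | extract_admin_password
```

The leading .* in the pattern is there so that any prefix added in front of the log line does not get in the way.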
First run¶
If all goes well, you should see the following output on first run.
usermod: no changes
Creating missing configuration file paste.ini ...
Creating missing configuration file mediagoblin.ini ...
Creating empty database mediagoblin.db ...
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
INFO [alembic.runtime.migration] Running upgrade -> 52bf0ccbedc1, initial revision
INFO [alembic.runtime.migration] Running upgrade 52bf0ccbedc1 -> a98c1a320e88, Image media type initial migration
INFO [alembic.runtime.migration] Running upgrade 52bf0ccbedc1 -> 101510e3a713, #5382 Removes graveyard items from collections
INFO [alembic.runtime.migration] Running upgrade 101510e3a713 -> 8429e33fdf7, Remove the Graveyard objects from CommentNotification objects
INFO [alembic.runtime.migration] Running upgrade 8429e33fdf7 -> 4066b9f8b84a, use_comment_link_ids_notifications
INFO [alembic.runtime.migration] Running upgrade 4066b9f8b84a -> 3145accb8fe3, remove tombstone comment wrappers
INFO [alembic.runtime.migration] Running upgrade 3145accb8fe3 -> 228916769bd2, ensure Report.object_id is nullable
INFO [alembic.runtime.migration] Running upgrade 228916769bd2 -> cc3651803714, add main transcoding progress column to MediaEntry
INFO [alembic.runtime.migration] Running upgrade 228916769bd2 -> afd3d1da5e29, Subtitle plugin initial migration
Laying foundations for __main__:
+ Laying foundations for Privilege table
Cannot link theme... no theme set
Linked asset directory for plugin "coreplugin_basic_auth":
/opt/mediagoblin/lib/python3.11/site-packages/mediagoblin/plugins/basic_auth/static
to:
/srv/user_dev/plugin_static/coreplugin_basic_auth
Creating admin user ...
User created (and email marked as verified).
The user admin is now an admin.
===============================================================================
NEW ADMINISTRATOR ACCOUNT CREATED
ADMIN_USER=admin
ADMIN_PASSWORD=<AUTOGENERATED PASSWORD>
ADMIN_EMAIL=admin@example.com
===============================================================================
Running /opt/mediagoblin/lazyserver.sh -c ./paste.ini --server-name=broadcast ...
Using paster config: ./paste.ini
Using paster from $PATH
+ export CELERY_ALWAYS_EAGER=true
+ paster serve ./paste.ini --server-name=broadcast --reload
Starting subprocess with file monitor
2024-07-14 08:09:30,760 INFO [mediagoblin.app] GNU MediaGoblin 0.14.0.dev main server starting
2024-07-14 08:09:31,054 INFO [mediagoblin.app] Setting up plugins.
2024-07-14 08:09:31,054 INFO [mediagoblin.init.plugins] Importing plugin module: mediagoblin.plugins.geolocation
2024-07-14 08:09:31,054 INFO [mediagoblin.init.plugins] Importing plugin module: mediagoblin.plugins.processing_info
2024-07-14 08:09:31,054 INFO [mediagoblin.init.plugins] Importing plugin module: mediagoblin.plugins.basic_auth
2024-07-14 08:09:31,054 INFO [mediagoblin.init.plugins] Importing plugin module: mediagoblin.media_types.image
2024-07-14 08:09:31,114 INFO [mediagoblin.init.celery] Setting celery configuration from object "mediagoblin.init.celery.dummy_settings_module"
Starting server in PID 58.
2024-07-14 08:09:31,122 INFO [waitress] Serving on http://0.0.0.0:6543
It will be terser on subsequent runs, because configuration and databases already exist, and data migrations aren’t necessary (unless upgrading to a new version of the container).
You can confirm that the container is running happily with the docker ps command, which will show the running containers, ports and health status (if configured).
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS                    PORTS                                       NAMES
541710f616d5   mediagoblin/mediagoblin:0.14.0.dev   "/opt/mediagoblin/en…"   37 seconds ago   Up 36 seconds (healthy)   0.0.0.0:6543->6543/tcp, :::6543->6543/tcp   vibrant_germain
At this point, you should be able to point your browser to http://localhost:6543 and be greeted by the MediaGoblin landing page.
Data persistence¶
Data in a Docker image is read-only. Any change made in a live container remains until the container is destroyed, and is lost thereafter. This includes all changes made in /srv, where the MediaGoblin data resides. This is obviously not desirable for media storage.
Docker has support for various types of storage mechanisms for this purpose. We saw in the previous section how to start the container in such a way that a local path is bind-mounted onto /srv.
-v /PATH/TO/YOUR/DATA:/srv
This means that any data written by the containers will be written to /PATH/TO/YOUR/DATA in the host filesystem. As it is outside of Docker’s control, this data will persist even if the Mediagoblin container is destroyed.
A new container instance can then be started with the same bind-mount volume option. It will resume serving the data transparently. This is useful for backups, as well as an upgrade path between subsequent versions of MediaGoblin without losing data.
Starting with an empty data directory, the container will create the configuration and the database on first run. You can confirm it with ls /PATH/TO/YOUR/DATA outside of the container.
$ ls /PATH/TO/YOUR/DATA
mediagoblin mediagoblin.db mediagoblin.ini paste.ini user_dev
You can also make manual changes to the data if needed.
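Because the bind-mounted directory is ordinary host data, backing it up needs no Docker-specific tooling. A minimal sketch, assuming the container is stopped first so that the SQLite database is not being written to while it is archived; the function name and the archive naming scheme are illustrative only:

```shell
# Archive a MediaGoblin data directory into a timestamped tarball and
# print the path of the archive that was created.
# Usage: backup_data_dir /PATH/TO/YOUR/DATA /PATH/TO/BACKUPS
backup_data_dir() {
    src=$1
    dest=$2
    archive="$dest/mediagoblin-data-$(date +%Y%m%d-%H%M%S).tar.gz"
    mkdir -p "$dest" &&
        tar -czf "$archive" -C "$src" . &&
        printf '%s\n' "$archive"
}
```

The backup directory should live outside the data directory, otherwise the archive would end up trying to include itself.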
Warning
The host path given to the --volume option must be an absolute path, otherwise it will be interpreted as the name of a Docker volume.
Using a Docker volume is another way to ensure data persistence across container recreation. Rather than writing data out into the specified host filesystem, Docker will manage the volume (volume-name, in the following example) internally.
-v volume-name:/srv
While this offers the same data persistence benefits, management of the data should be done with the docker volume command. Moreover, the data may not be as straightforward to access and back up without more Docker knowledge. This approach is therefore only recommended to users already familiar with it.
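For readers who do opt for a named volume, one common pattern for backing it up is to mount it read-only in a throwaway container, next to a host directory, and run tar there. This is a sketch only: the function name, the busybox helper image and the archive name are assumptions, not something MediaGoblin provides.

```shell
# Archive the contents of a named Docker volume into DEST_DIR on the host.
# DEST_DIR must be an absolute path, or Docker would treat it as a volume name.
# Usage: backup_volume volume-name /PATH/TO/BACKUPS
backup_volume() {
    volume=$1
    dest=$2
    docker run --rm \
        --volume "$volume":/srv:ro \
        --volume "$dest":/backup \
        busybox tar -cf /backup/mediagoblin-volume.tar -C /srv .
}

# Related management commands:
#   docker volume ls
#   docker volume inspect volume-name
```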
Administrator account¶
A default administrator account is created by the entrypoint script. The login is admin, and the password is automatically generated if unspecified. The details of the admin account are output in the logs the very first time a new instance is initialised.
You can override both those values on first run, by passing overrides via the environment.
docker run -p 6543:6543 -v /PATH/TO/YOUR/DATA:/srv \
    -e ADMIN_USER=myadmin -e ADMIN_PASSWORD=generateme \
    mediagoblin/mediagoblin:0.14.0.dev
Note
If ADMIN_PASSWORD is set to generateme (the default), the password will be auto-generated on first run, i.e., when no database exists in the data directory yet. The generated password will be output, once, in the startup logs.
Alternatively, you can change the current admin password at any time by using the gmg tool.
docker run -p 6543 -v /PATH/TO/YOUR/DATA:/srv \
    mediagoblin/mediagoblin:0.14.0.dev \
    gmg changepw admin '<GOOD STRONG PASSWORD>'
You can, of course, use gmg in this way for any other task you would generally perform in non-containerised environments.
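If you run gmg often, a small shell wrapper saves retyping the full docker run line. This is a sketch under assumptions: the function name and the MG_DATA and MG_IMAGE variables are made up for this example, and --rm is added so that each one-off container is removed once the command finishes.

```shell
# Run a gmg subcommand in a one-off container against the same data directory.
# MG_DATA and MG_IMAGE are hypothetical variables; adjust them to your set-up.
MG_DATA=${MG_DATA:-/PATH/TO/YOUR/DATA}
MG_IMAGE=${MG_IMAGE:-mediagoblin/mediagoblin:0.14.0.dev}

gmg_in_docker() {
    docker run --rm \
        --volume "$MG_DATA":/srv \
        "$MG_IMAGE" gmg "$@"
}

# Example:
#   gmg_in_docker changepw admin '<GOOD STRONG PASSWORD>'
```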
Configuring plugins¶
By default, no plugin is enabled in the example configuration file. As with non-containerised deployments of MediaGoblin, plugins can be enabled by adding the relevant sections to the mediagoblin.ini configuration file.
However, plugins can be preconfigured when a new containerised environment is initialised, by passing a snippet of configuration file, with embedded newlines, for the [plugins] section via the PLUGINS environment variable.
docker run --interactive --tty \
    -p 6543:6543 -v /PATH/TO/YOUR/DATA:/srv \
    -e PLUGINS='[[mediagoblin.media_types.audio]]\n[[mediagoblin.media_types.video]]\navailable_resolutions = 144p,240p\n' \
    mediagoblin/mediagoblin:0.14.0.dev
This mechanism is only active on first initialisation of an empty data directory. It can however be forced by setting the FORCE_RECONFIG environment variable to true.
... -e FORCE_RECONFIG=true ...
Warning
Force-reconfiguration has not been thoroughly tested, and may not behave flawlessly.
Docker Compose stack¶
Docker Compose lets you encode more details about how to run a container, such as volumes, ports and environment variables, in a configuration file instead of on the command line. It also allows spinning up more than one container at a time, and sets up the necessary network environment so they can communicate with each other.
Multiple configuration files can be used at the same time, to selectively configure or override various aspects of the desired stack. MediaGoblin takes this approach, providing a basic docker-compose.yml, which contains shared options.
Note
Historically, docker-compose was a command separate from docker itself, but its functionality has now been merged and extended. This guide therefore uses the docker compose subcommand.
Standalone service¶
Prior to delving into multi-container stacks, you can have a look at the standalone docker-compose.standalone.yml, which does very little more than the docker commands in the previous section. There are however two noteworthy differences.
version: '3'
services:
  lazyserver:
    image: mediagoblin/mediagoblin:0.14.0.dev
    ports:
      - "6543:6543"
    healthcheck:
      test: [ "CMD", "curl", "-sf", "http://localhost:6543" ]
      timeout: 30s
      interval: 10s
      retries: 5
    volumes:
      - mediagoblin-data:/srv:rw
    env_file: docker-compose.env
volumes:
  mediagoblin-data:
    driver_opts:
      # Present the local ./data directory as a named volume
      type: none
      o: bind
      device: ${PWD}/data
First, in the volumes section of the service, a named Docker volume, mediagoblin-data, is mounted on /srv. As discussed before, the volume will be reused every time the stack is brought up. At the end of the file, in the top-level volumes section, additional parameters are provided so the mediagoblin-data volume is actually mapped to a bind mount. It is configured to use the data subdirectory of the directory from which the stack was started.
Second, it uses an env_file, which is a convenient way to pass a number of environment variables to the container. Those can include ADMIN_PASSWORD or PLUGINS, as discussed previously.
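An env file is a plain list of KEY=VALUE lines, one per variable. A minimal sketch that writes one, using only variables discussed earlier on this page; the function name and the example values are placeholders:

```shell
# Write a docker-compose.env file in the current directory.
# Usage: write_env_file ADMIN_USER ADMIN_PASSWORD
write_env_file() {
    (
        # The file holds a password: keep it private to the current user.
        umask 077
        printf 'ADMIN_USER=%s\nADMIN_PASSWORD=%s\n' "$1" "$2" > docker-compose.env
    )
}

# Example:
#   write_env_file myadmin generateme
```

The subshell keeps the stricter umask from leaking into the rest of your shell session.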
These changes will be carried over through the next few sections.
Note
docker compose uses the file docker-compose.yml by default, which we’ll discuss later. To use the standalone variation, the -f option can be used.
docker compose -f docker-compose.standalone.yml up
Note
By default, docker compose will keep hold of the terminal, and output logs from the application. To regain use of the terminal, you can add the -d flag at the end of this command. To see the logs, you can then use docker compose logs -f.
As before, this will make the Mediagoblin instance available at http://localhost:6543/. You can log in as the admin, and upload a file before moving on to the next section.
You can shut the container down with
docker compose -f docker-compose.standalone.yml down
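Since every command for this variation needs the same -f option, a small wrapper can save some typing. A sketch only; the function name is hypothetical:

```shell
# Forward any Compose subcommand to the standalone configuration.
standalone() {
    docker compose -f docker-compose.standalone.yml "$@"
}

# Typical session:
#   standalone up -d      # start in the background
#   standalone ps         # list services and their status
#   standalone logs -f    # follow the application logs
#   standalone down       # stop and remove the container
```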
Multi-container stack¶
The previous section was a light introduction to docker-compose.yml files, but didn’t achieve much. We can now move on to defining more than one service in the stack: separate Paste and Celery containers, with a side of RabbitMQ and Nginx.
The basic docker-compose.yml file does just that.
version: '3'
services:
  paste:
    image: mediagoblin/mediagoblin:0.14.0.dev
    # build:
    #   context: .
    #   dockerfile: Dockerfile-debian-12-sqlite
    #   args:
    #     build_doc: no
    #     run_tests: no
    depends_on:
      rabbitmq:
        condition: service_started
    volumes:
      - mediagoblin-data:/srv:rw
    ports:
      - "6543:6543"
    env_file: docker-compose.env
    environment:
      - 'CELERY_ALWAYS_EAGER=false'
      - 'BROKER_URL=amqp://rabbitmq:5672'
    # XXX need host = 0.0.0.0 for server:main, or a way to select
    # server-name=broadcast
    # command: /opt/mediagoblin/bin/gmg -cf /srv/mediagoblin.ini serve /srv/paste.ini
    command: /opt/mediagoblin/bin/paster serve /srv/paste.ini --server-name=broadcast
    healthcheck:
      test: [ "CMD", "curl", "-sf", "http://localhost:6543" ]
  celery:
    image: mediagoblin/mediagoblin:0.14.0.dev
    depends_on:
      rabbitmq:
        condition: service_healthy
      paste:
        condition: service_started
    volumes:
      - mediagoblin-data:/srv:rw
    env_file: docker-compose.env
    environment:
      - 'CELERY_CONFIG_MODULE=mediagoblin.init.celery.from_celery'
      - 'MEDIAGOBLIN_CONFIG=/srv/mediagoblin.ini'
      - 'SKIP_MIGRATE=true'
      - 'BROKER_URL=amqp://rabbitmq:5672'
    # command: /opt/mediagoblin/bin/gmg -cf /srv/mediagoblin.ini celery
    command: /opt/mediagoblin/bin/celery worker
    healthcheck:
      test: [ "CMD", "/opt/mediagoblin/bin/celery", "inspect", "ping" ]
  rabbitmq:
    image: rabbitmq
    expose:
      - "5672"
    healthcheck:
      test: [ "CMD", "rabbitmq-diagnostics", "-q", "ping" ]
volumes:
  mediagoblin-data:
    driver_opts:
      # Present the local ./data directory as a named volume
      type: none
      o: bind
      device: ${PWD}/data
It is fairly similar to the standalone setup, except it defines three services. The paste and celery services are essentially the same, except for the command that is executed. Some additional environment variables are set in the environment section, most notably where to find RabbitMQ. The healthcheck of the Celery container is also adjusted to remain useful.
One last service, based on the official RabbitMQ image, is started to support communication between the two containers, and some start-up order rules are defined via the depends_on sections.
As this configuration is in the default docker-compose.yml file, starting the stack up is fairly straightforward.
docker compose up
As before, this stack uses the mediagoblin-data named volume, which is mounted in both the Paste and Celery containers. If you started a fresh lazyserver before, and uploaded some test data, you should still be able to access it now.
Nginx reverse-proxy¶
When running a non-test instance, it is not recommended to expose the application straight to the public internet. Instead, it is good practice to put a reverse-proxy in between, to handle the fine details of the HTTP protocol. Nginx tends to be a good choice.
As discussed in the deployment documentation, the Nginx configuration needs to be adjusted to best work with MediaGoblin. For ease of use, we build and publish a pre-configured Nginx image to Docker Hub alongside the MediaGoblin one.
You can extend your Compose stack from docker-compose.yml by also including the Nginx service defined in docker-compose.nginx.yml.
docker compose -f docker-compose.yml -f docker-compose.nginx.yml up
For simplicity in your own deployment, you can include all services in a single file.
Note
As the nginx container is added via an override, the paste container continues to expose its own port to the rest of the system.
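To combine all services into a single file, one option is to let Compose render the merged configuration for you. A sketch, assuming both files sit in the current directory; the function name and the output file name docker-compose.full.yml are hypothetical:

```shell
# Render the merged configuration of both Compose files into one file.
merge_compose_files() {
    docker compose -f docker-compose.yml -f docker-compose.nginx.yml config \
        > docker-compose.full.yml
}

# Afterwards, the stack can be started from the single file:
#   docker compose -f docker-compose.full.yml up
```

Review the generated file before relying on it: the config subcommand resolves variables such as ${PWD}, so the result is tied to the directory it was generated from.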
Third-party cloud providers¶
Containerisation of MediaGoblin offers a new way to run the service on third-party hosts. However, as the containerisation of MediaGoblin is still very recent, we haven’t explored the various cloud providers and deployment methods.
If you’ve had success with this type of deployment, please consider contributing your experience to the documentation!
Dockerised Build¶
It is possible (and perhaps even preferred) to build Mediagoblin within a container.
This will create a Docker image suitable to run on its own (using lazyserver), or as part of a Docker Compose stack with separate containers for Paste, Celery, and RabbitMQ, as well as the optional pre-configured Nginx.
Core container¶
Unlike a local build, the only dependency required by a Docker build is the docker tool itself. When present, the configure script will prefer this approach (unless --without-docker is explicitly passed).
The steps to perform a build nonetheless follow the familiar incantation.
./configure && make
This will create a build stage with the necessary build dependencies, such as bower and -dev packages, create a final image containing the built package, and run the tests within a container started from that image.
The name of the image will be mediagoblin/mediagoblin:<VERSION>, where <VERSION> is set from configure.ac, e.g., mediagoblin/mediagoblin:0.14.0.dev.
When building this way, the dependencies for most plugins (media types and core plugins) are included. Two notable exceptions are support for documents (other than PDFs) and for STL files. Their dependencies (unoconv and blender, respectively) were deemed too large to include by default.
While the make-based build is the simplest, it is possible to build custom containers, with a preferred set of dependencies, directly with the docker build command. Detailing this process is beyond the scope of this chapter. However, you can have a look at the Dockerfile to see which build arguments (ARG, configurable via --build-arg) are supported.
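To see which build arguments a Dockerfile declares without reading it in full, you can filter for its ARG instructions. A minimal sketch; the function name is hypothetical, and the commented docker build line assumes the Dockerfile-debian-12-sqlite file and the build_doc and run_tests arguments referenced in docker-compose.yml, plus a made-up image tag:

```shell
# List the ARG instructions declared in a Dockerfile (default: ./Dockerfile).
list_build_args() {
    grep -E '^[[:space:]]*ARG[[:space:]]' "${1:-Dockerfile}"
}

# Example of a custom build (the :custom tag is hypothetical):
#   docker build -f Dockerfile-debian-12-sqlite \
#       --build-arg build_doc=no --build-arg run_tests=no \
#       -t mediagoblin/mediagoblin:custom .
```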
Python wheel and documentation¶
It is also possible to build the Python Wheel and the docs out of the image, with
make dist
# and
make docs
respectively.
Note
While the wheel is getting built successfully, it is still a work in progress, and it has not been tested yet.
Preconfigured Nginx image¶
As part of the Docker-based build process, a dedicated Dockerfile.nginx is also created. This allows us to build the pre-configured Nginx Docker image which gets pushed to Docker Hub.
docker build -f Dockerfile.nginx . -t mediagoblin/nginx:0.14.0.dev