Describe the problem you would like to solve
The OJS application comes in many parts and each installation is an assemblage of many custom components. Its inherent design burdens application and system maintainers with having to deal with a high amount of mutable state and many, often synchronous, service integrations.
First published in 2011 and subsequently iterated upon, the patterns published as The Twelve-Factor App have significantly improved software development practices in distributed systems. Together with the widespread availability of cgroups-based containers, they ultimately led to the emergence of what would later be dubbed OCI (Open Container Initiative) containers in 2013 and of Kubernetes in 2014.
Beyond its first principle, a single source of truth in version control for multiple deployments, the 12 factors call out further dimensions of work for a distributed systems engineer (emphasis mine):
- Explicitly declare and isolate dependencies (II. Dependencies)
- Store config in the environment (III. Config)
- Treat backing services as attached resources (IV. Backing services)
- Strictly separate build and run stages (V. Build, release, run)
- Execute the app as one or more stateless processes (VI. Processes)
- Export services via port binding (VII. Port binding)
- Scale out via the process model (VIII. Concurrency)
- Maximize robustness with fast startup and graceful shutdown (IX. Disposability)
- Keep development, staging, and production as similar as possible (X. Dev/prod parity)
- Treat logs as event streams (XI. Logs)
- Run admin/management tasks as one-off processes (XII. Admin processes)
These pieces of practical advice, taken normatively, help to isolate side effects in your application and allow it to be deeply integrated into its execution environment, whether that is (a) a local shell, (b) a local container, (c) a remote deploy to an operating system or (d) a remote deploy into a container. They were originally formulated for PaaS environments by Heroku and have since been applied and extended to containers.
Describe the solution you’d like
I would like to see an official PKP stance on the dimensions outlined by the 12 factors, ideally in some kind of roadmap that allows tracking the implementation status of each of them. While some will probably already be accounted for, others will need deep reconsideration or refactoring of the systems to increase their robustness.
This activity spans the whole software development lifecycle.
Who is asking for this feature?
This feature is needed by technical DevOps people engaged in developing and running PKP apps following SRE (Site Reliability Engineering) principles.
Additional information
- Dependency management with Composer through composer.json instead of git submodules was first discussed and validated in Managing PKP depencies with Composer - #8 by asmecher.
- Storing the configuration in the environment is currently a concern for the development of the containers, but in the long term will require upstream support in pkp-lib, e.g. by using stock Laravel Configuration. Migrations will have to be considered here as well.
- https://github.com/pkp/containers/issues/2
- Please also find ideas about an early proof of concept in Consider implementing the phpdotenv usage to handle configuration · pkp/pkp-lib#8015 from 2021-05-19, which seems to be deprecated in favour of the native Laravel support.
- Add PHP error page and application env config value · Issue #10027 · pkp/pkp-lib · GitHub
- Add a config option to set the database engine to InnoDB · Issue #7311 · pkp/pkp-lib · GitHub
- Wrong or missing config.inc.php variable breaks the upgrade · Issue #9781 · pkp/pkp-lib · GitHub
- When DevOps engineers are concerned with handling the mutable state of their applications, they prefer decoupling it into network services, such as databases for structured data and object storage for BLOBs, only falling back to file systems on block volumes in the absence of the former. The state is then held by the backing services, which offer interaction through their protocols. Sending emails would be another special case of this.
Currently the application(s) store all state in the local file system and cannot decouple their sometimes large document repositories from the actual deployment of the application. If the document store lived in a file system abstraction, such as S3, the application process could be scheduled across separate compute instances (also see 6. Processes and 8. Concurrency). Some work allowing for such a mode of operation has already begun, other work is already being considered.
- Add support for other file storage drivers (Flysystem) · Issue #11120 · pkp/pkp-lib · GitHub
- pkp-lib#9009 also suggests to consider file system migrations.
- As with the prior kind of mutable state, database migrations track the evolution of the schema of the application domain and also need to be considered. They are reportedly in the process of being migrated to Laravel’s Eloquent integration.
- Eloquent has a single and very specific open issue at Port Genre and GenreDAO to Use Eloquent Model · Issue #10133 · pkp/pkp-lib · GitHub. Is there another one for tracking the overall migration from the current implementation over to it?
- Database: Seeding - Laravel 12.x - The PHP Framework For Web Artisans
- Database: Migrations - Laravel 12.x - The PHP Framework For Web Artisans
- Ideally, together with (3.), we can also provide the conventional pattern of supplying the access parameters for the database as a URL in a DATABASE_URL-like variable.
- Treating the build and run stages of the application as separate processes that work from different artifacts (source code in one case, immutable containers in the other, for example) means automating more of build and deploy in CI, to benefit from declarative configuration and isolation of side effects. Standardising development and deployment patterns around OCI containers is one way to achieve this. Getting rid of git submodules through (2.) will help here. Offering immutable and reusable build artifacts with declarative dependency management also improves security by providing observability into the structured data of the dependency tree in aggregated composer.json files, which can eventually be used to construct a BOM. Signing git commits and (container) builds will further increase confidence in the robustness of the deployment artifacts.
- Processes are stateless in so far as they don’t require a local file system and externalise all side effects (application state, configuration state) into (4.) backing services. This requires reading the configuration directly from the environment (3.), without involving stateful artifacts such as configuration files.
- This is made easier by declarative configuration (3.).
- Separate instances of the application can be run behind a load balancer in the (private) cloud, when at least (3.), (4.) and (6.) are met.
- This basically means that PID 1 in your container knows how to react to SIGTERM, so that it can shut down cleanly instead of receiving a SIGKILL once the grace period expires (10 seconds by default with docker stop).
- Declarative configuration of immutable and stateless deployment environments helps reproducibility and standardisation of their components. Strict adherence to principle (1.) is a sufficient condition for this.
- The log streams can then be fed into external log aggregation, correlation and analytics environments.
- This is often made easier by providing a CLI for the local server process and/or for the API. Task execution runners can temporarily help to standardise common actions in the absence of the former (see just). For Laravel this is called the Artisan Console.
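To make the DATABASE_URL convention from the list above more tangible, here is a minimal sketch of how a process could derive its database settings from a single environment variable. It is written in Python only for brevity and is not how pkp-lib or Laravel implement this; the function names and the returned dictionary keys are hypothetical.

```python
import os
from urllib.parse import urlsplit, unquote


def parse_database_url(url: str) -> dict:
    """Split a DATABASE_URL-style string into connection parameters."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError("expected something like scheme://user:pass@host:port/name")
    return {
        "driver": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL omits the port
        # Credentials may be percent-encoded inside a URL, so decode them.
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "name": parts.path.lstrip("/") or None,
    }


def database_config_from_env(environ=os.environ) -> dict:
    """Read the settings from the environment and fail loudly if they are absent."""
    url = environ.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL is not set")
    return parse_database_url(url)
```

With something like this in place, the same build artifact can be pointed at a different database per deployment just by changing one variable, which is the point of factors III and IV.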
When these things, and more, are in place, we can easily separate the side effects of a distributed application and scale them independently. Decoupling the application from the state in backing services further allows running stateless replicas that provide high availability and load balancing through horizontal scaling. Following conventional patterns seen elsewhere will also improve the Developer Experience (DevEx) and the recognisability of components in the OJS ecosystem. Open Systems Design emphasises loose coupling of components, to provide resilience and more easily manageable, locally bound state.
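Replicas are only cheap to add and remove if each one shuts down gracefully, as called for under Disposability above. The following is a minimal sketch of that pattern, again in Python purely for illustration: the main process installs a handler for the termination signal and finishes its current unit of work before exiting. The names serve, request_shutdown and handle_one_unit_of_work are hypothetical.

```python
import signal
import threading

# Set by the signal handler, polled by the main loop.
shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: only record the request, the main loop does the cleanup."""
    shutdown_requested.set()


def serve(handle_one_unit_of_work, poll_interval=0.1):
    """Process work until SIGTERM or SIGINT arrives, then return the units completed."""
    # signal.signal must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, request_shutdown)
    signal.signal(signal.SIGINT, request_shutdown)
    completed = 0
    while not shutdown_requested.is_set():
        handle_one_unit_of_work()  # never interrupted halfway by the handler
        completed += 1
        shutdown_requested.wait(poll_interval)  # returns early once shutdown is requested
    return completed
```

In a container this only helps if the process actually receives the signal, i.e. it runs as PID 1 or its init wrapper forwards signals to it.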
Other people have suggested extending the list of twelve patterns with further dimensions, which nowadays could read something like:
- Secrets management in distributed environments
- Unit testing of commits and end-to-end testing of deployments. Some might argue this to be part of (1.) already.
- Visibility and Observability through monitoring, logs and distributed tracing (OpenTelemetry), eventually as an extension to (11.).
next to probably other related SRE practices and Platform capabilities.
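As an illustration of treating logs as event streams (XI.) in a way that also serves the observability point above, a process can write one structured event per line to stdout and leave routing and aggregation to the platform. This is a sketch under assumptions, in Python rather than PHP; the class name, function name and field names are hypothetical and not part of any PKP component.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        event = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            event["exception"] = self.formatException(record.exc_info)
        return json.dumps(event)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to the stream (stdout by default)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # no log files, no rotation: the platform owns the stream
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as build_logger("ojs.example").info("submission %s stored", 42) then emits one JSON line that a log collector can parse without custom patterns.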
What comes to your mind when you think about making the PKP application platform more robust and resilient for the next decades to come?
—
PS: Links to open and ongoing issues related to the subjects above can be found in the Addendum: Identifying the 12 factors in the PKP application platform library and (OJS) container manifests - HedgeDoc