Another thought. I'm thinking I'll try to build it without the assumption of a SQL database. It seems to me one of the scaling issues that the community is having is around the fact that the database has to be a single node, and can only be scaled vertically, which works fine but has limits and becomes expensive.

I'm thinking I would like to design it with compatibility with a database option that's designed to run as a cluster to allow for easier horizontal scaling.

@andrewfeeney I've also been curious about using DynamoDB to back an ActivityPub server. Writes scale massively, and reads cache with DAX/Redis/etc. Should be easy to index for timelines.

Vendor lock-in is an issue though, unless persistence is modular. Then you're stuck with lowest common denominator database functionality.

ActivityPub does seem better suited for a clustered document database. I'm looking forward to following your progress. 😀

@jeff Thanks for these comments, these are great points!

I'll admit, I've been pigeonholed in SQL land for so long that I'm not really across the tradeoffs of other ways of approaching persistence. I gather there'll be consistency and sharding considerations, but what that looks like in the actual software architecture is all a mystery to me.

Building read / write splitting into the architecture from the start is something I hadn't thought of yet, but makes total sense.
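
A rough sketch of what read / write splitting could look like at the application layer, in Python for brevity. All names here (`Store`, `InMemoryStore`, `ReadWriteSplitStore`) are made up for illustration and are not from any existing ActivityPub server; a real setup would point the writer at a primary node and the reader at a replica or cache.

```python
from typing import Optional, Protocol


class Store(Protocol):
    """Hypothetical minimal persistence interface (an assumption)."""

    def get(self, key: str) -> Optional[dict]: ...

    def put(self, key: str, value: dict) -> None: ...


class InMemoryStore:
    """Stand-in for a real database node, so the sketch is runnable."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)

    def put(self, key: str, value: dict) -> None:
        self._data[key] = value


class ReadWriteSplitStore:
    """Sends every write to one backend and every read to another."""

    def __init__(self, writer: Store, reader: Store) -> None:
        self._writer = writer
        self._reader = reader

    def get(self, key: str) -> Optional[dict]:
        # Reads never touch the writer, so they can be served by a
        # replica or cache that may lag behind the primary.
        return self._reader.get(key)

    def put(self, key: str, value: dict) -> None:
        self._writer.put(key, value)
```

Because the reader may lag behind the writer, a read issued straight after a write can miss the new value. That is the kind of consistency consideration the calling code has to tolerate.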

Andrew Feeney

@jeff I agree that ActivityPub seems well suited to a clustered document database. It seems as though it's basically a massive distributed JSON-LD document.

Maybe I'm being naive about the complexity, but I'm wondering if I can decouple the persistence solution well enough at the application layer (using something like the Repository pattern, or some other approach) that it would be possible to implement drivers for any kind of proprietary store that made sense.

@jeff For instance you could have a DynamoDB driver, and an ill-advised flat file disk access driver.

I'm not going to personally implement drivers for every kind of database, but at least building for the possibility of different kinds of persistence from the start makes sense for me.
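
To make the driver idea concrete, here is a minimal sketch of the Repository pattern in Python. Everything in it is an assumption for illustration: the `ObjectRepository` interface, the two drivers, and the `publish_note` helper are hypothetical names, and the flat-file driver is the "ill-advised" kind, one JSON file per object. A DynamoDB or SQL driver would implement the same two methods.

```python
import hashlib
import json
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional


class ObjectRepository(ABC):
    """Hypothetical interface the application codes against."""

    @abstractmethod
    def save(self, obj: dict) -> None: ...

    @abstractmethod
    def find(self, object_id: str) -> Optional[dict]: ...


class InMemoryRepository(ObjectRepository):
    """Driver that keeps objects in a dict, keyed by their "id"."""

    def __init__(self) -> None:
        self._objects = {}

    def save(self, obj: dict) -> None:
        self._objects[obj["id"]] = obj

    def find(self, object_id: str) -> Optional[dict]:
        return self._objects.get(object_id)


class FlatFileRepository(ObjectRepository):
    """Driver that writes one JSON file per object to a directory."""

    def __init__(self, directory: Path) -> None:
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)

    def _path(self, object_id: str) -> Path:
        # IDs are URLs, so hash them to get a filesystem-safe name.
        digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest()
        return self._dir / (digest + ".json")

    def save(self, obj: dict) -> None:
        self._path(obj["id"]).write_text(json.dumps(obj), encoding="utf-8")

    def find(self, object_id: str) -> Optional[dict]:
        path = self._path(object_id)
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))


def publish_note(repo: ObjectRepository, note_id: str, content: str) -> dict:
    """Application code: it only knows the interface, not the driver."""
    note = {"id": note_id, "type": "Note", "content": content}
    repo.save(note)
    return note
```

The application function takes any `ObjectRepository`, so swapping the store is a matter of passing a different driver. The cost is that the interface can only expose what every driver is able to support.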

I think leaning too hard into vendor lock-in doesn't fit well with the ethos of the Fediverse, which is why I'd at least want to leave the possibility of options, ideally both SQL and NoSQL if possible.

@andrewfeeney Yeah, we do something very similar for volatile cache (Redis, Memcached, formerly APC), full-text search (MySQL, Elasticsearch, Sphinx), and object storage (disk, S3, database).

There's a high-level interface to implement for actions (e.g. get/set, query/index). Anyone can implement support for a new engine.

The main upside is we can pick simple defaults that 'just work' self-hosted w/o extra apps, and scale up for bigger clients. Downside is losing specialization.
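
For illustration only, a pluggable cache with a get/set interface and a zero-dependency default might look roughly like this in Python. This is not the actual code described above; `CacheEngine`, `InProcessCache`, `register_engine`, and `make_cache` are hypothetical names.

```python
from typing import Callable, Dict, Optional, Protocol


class CacheEngine(Protocol):
    """Hypothetical high-level interface every cache engine implements."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str) -> None: ...


class InProcessCache:
    """Simple default that needs no extra services to be installed."""

    def __init__(self) -> None:
        self._items = {}

    def get(self, key: str) -> Optional[str]:
        return self._items.get(key)

    def set(self, key: str, value: str) -> None:
        self._items[key] = value


# Registry mapping an engine name to a factory that builds it.
_ENGINES: Dict[str, Callable[[], CacheEngine]] = {"memory": InProcessCache}


def register_engine(name: str, factory: Callable[[], CacheEngine]) -> None:
    """Lets third parties add support for a new engine."""
    _ENGINES[name] = factory


def make_cache(name: str = "memory") -> CacheEngine:
    """Builds the configured engine, defaulting to the in-process one."""
    try:
        factory = _ENGINES[name]
    except KeyError:
        raise ValueError("unknown cache engine: " + name) from None
    return factory()
```

A Redis or Memcached engine would be registered the same way and selected by name in configuration, while a fresh install keeps working with the in-process default.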