My first experience with Lemmy was thinking that the UI was beautiful, and lemmy.ml (the first instance I looked at) was asking people not to join because they already had 1500 users and were struggling to scale.

1500 users just doesn’t seem like much; it seems like the kind of load you could handle with a Raspberry Pi in a dusty corner.

Are the Lemmy servers struggling to scale because of the federation process / protocols?

Maybe I underestimate how much compute goes into hosting user-generated content? Users generate very little text, though uploaded pictures take more space. Still, users are generating millions of bytes of content, and that is overloading computers that can handle billions of bytes with ease. What happened? Am I missing something here?

Or maybe the code is just inefficient?

Which brings me to the title’s question: Does Lemmy benefit from using Rust? None of the problems I can imagine are related to code execution speed.

If the federation process and protocols are inefficient, then everything is being built on sand. Popular protocols are hard to change. How often does HTTP change? Almost never: it has had only a handful of major revisions in about three decades. The language used for the code doesn’t matter in this case.

If the code is just inefficient, well, inefficient Rust is probably slower than efficient Python or JavaScript. Could the complexity of Rust have pushed the devs towards a simpler but less efficient solution that ends up being slower than what they would have written in a garbage-collected language? I’m sure this has happened before, but I don’t know anything about the Lemmy code.

Or, again, maybe I’m just underestimating the amount of compute required to support 1500 users sharing a little bit of text and a few images?

  • AggressivelyPassive@feddit.de
    2 years ago

    I’m pretty sure the fediverse needs a new kind of node at some point. If we assume that almost every larger instance is connected directly to almost every other larger instance, then there are a ton of duplicated and very small messages.

    There needs to be some kind of hub in between to aggregate and route this avalanche. Especially if, like you wrote, every upvote is a message, the overhead (I/O, unmarshalling, etc.) is huge.
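
    To get a feel for that overhead, here is a rough back-of-the-envelope sketch. The function names and the numbers (10,000 votes, 200 peers, batches of 100) are made up for illustration; they are not measurements from any real instance, and batching is only one way a hub could aggregate.

```python
def direct_deliveries(activities: int, peers: int) -> int:
    """Requests sent if every activity goes to every peer as its own message."""
    return activities * peers


def batched_deliveries(activities: int, peers: int, batch_size: int) -> int:
    """Requests sent if activities are grouped into batches before delivery."""
    batches = -(-activities // batch_size)  # ceiling division
    return batches * peers


if __name__ == "__main__":
    # Hypothetical load: 10,000 votes fanned out to 200 peer instances.
    print(direct_deliveries(10_000, 200))
    print(batched_deliveries(10_000, 200, 100))
```

    Under these assumed numbers, one-message-per-vote means 2,000,000 requests, while batches of 100 bring that down to 20,000. The payload is about the same either way; what shrinks is the per-request overhead.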

    • chris@l.roofo.cc
      2 years ago

      You mean like centralizing the fediverse? Who hosts the hub? Who maintains it? In which country? Who pays for it?

      • AggressivelyPassive@feddit.de
        2 years ago

        Not a single hub, multiple ones.

        Anyone can host a hub. Federated instances can negotiate the intersection of hubs they both trust and then send traffic that way. That could mean a single comment is sent to, say, five hubs, and each hub then forwards it to 50 instances or so.
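
        A minimal sketch of that fan-out, assuming a simple round-robin assignment of target instances to hubs. `plan_fanout` and the host names are hypothetical; nothing like this exists in the federation protocol today.

```python
def plan_fanout(instances: list[str], hubs: list[str]) -> dict[str, list[str]]:
    """Assign each target instance to exactly one hub, round-robin."""
    if not hubs:
        raise ValueError("at least one hub is required")
    plan: dict[str, list[str]] = {hub: [] for hub in hubs}
    for index, instance in enumerate(instances):
        plan[hubs[index % len(hubs)]].append(instance)
    return plan


if __name__ == "__main__":
    # Hypothetical names: 250 target instances, 5 mutually trusted hubs.
    instances = [f"instance-{i}.example" for i in range(250)]
    hubs = [f"hub-{i}.example" for i in range(5)]
    plan = plan_fanout(instances, hubs)
    print(len(plan), "requests from the origin instead of", len(instances))
    print({hub: len(targets) for hub, targets in plan.items()})
```

        With these numbers the origin instance makes 5 outgoing requests instead of 250, and each hub forwards to 50 instances.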

        Since the hubs are rather simple, they can scale very easily, and via cryptographic ratchets all instances can make sure they received the correct messages.
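
        One possible reading of the ratchet idea is a plain hash chain: every message is folded into a running digest, so a receiver that ends up with the same digest as the origin knows it saw the same messages in the same order. This is only a sketch under that assumption. How the origin would publish and sign its digest is left out, and the function names are made up.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed fixed starting value shared by everyone


def advance(previous: bytes, message: bytes) -> bytes:
    """Fold one message into the running digest."""
    return hashlib.sha256(previous + message).digest()


def chain_head(messages: list[bytes], start: bytes = GENESIS) -> bytes:
    """Digest after processing all messages in order."""
    head = start
    for message in messages:
        head = advance(head, message)
    return head


if __name__ == "__main__":
    sent = [b"comment 1", b"upvote on comment 1", b"comment 2"]
    via_hub = list(sent)
    tampered = [b"comment 1", b"downvote on comment 1", b"comment 2"]
    print(chain_head(via_hub) == chain_head(sent))   # unchanged stream matches
    print(chain_head(tampered) == chain_head(sent))  # altered stream does not
```

        A hub that drops, reorders, or alters a message changes the digest, so the receiver can detect it without trusting the hub.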

        • sznio@lemmy.world
          2 years ago

          Hmm. Does the federation protocol only send information directly between servers? By that I mean, when something happens on A, does A send it to all other federated servers by itself?

          If you could just proxy messages through other servers it would be an improvement. Essentially every instance would also be a hub. If you’re instance A, connected to B and C, then when B sends you something you pass it on to C, instead of having C communicate with B directly.

          To prevent spam you’d need a whitelist of the instances you’re willing to proxy for, and messages would have to be signed. Also, some protocol to discover the topology surrounding your server would be neat for optimizing delivery.
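
          A rough sketch of the relay check, assuming a shared secret per whitelisted instance and HMAC for signing. That is a simplification to keep the example in the standard library: a real deployment would more likely use public-key signatures, so relays can verify without being able to forge. The names and keys below are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 tag over the message body."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def should_relay(origin: str, body: bytes, signature: str,
                 whitelist: dict[str, bytes]) -> bool:
    """Relay only if the origin is whitelisted and the signature checks out."""
    key = whitelist.get(origin)
    if key is None:
        return False  # not an instance we agreed to proxy for
    return hmac.compare_digest(sign(key, body), signature)


if __name__ == "__main__":
    # Instance A's view: it proxies for B only (hypothetical key).
    whitelist = {"b.example": b"secret shared with b"}
    body = b'{"type": "Like", "object": "some post"}'
    tag = sign(whitelist["b.example"], body)
    print(should_relay("b.example", body, tag, whitelist))         # valid
    print(should_relay("b.example", body + b"x", tag, whitelist))  # tampered
    print(should_relay("spam.example", body, tag, whitelist))      # unknown origin
```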

      • topbroken@programming.dev
        2 years ago

        O(n*n) isn’t really scalable, so you either

        a - have a small number of nodes total

        b - have a small number of hubs with a larger number of leaf nodes.

        Either way, there’s going to be some nodes that become more influential than others.
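
        To put rough numbers on the two options, here is a small sketch that counts connections. It assumes undirected links, hubs that are fully connected to each other, and every leaf attached to exactly one hub; the 1000 nodes and 10 hubs are arbitrary.

```python
def full_mesh_links(nodes: int) -> int:
    """Links when every node connects to every other node: n*(n-1)/2."""
    return nodes * (nodes - 1) // 2


def hub_links(nodes: int, hubs: int) -> int:
    """Links when hubs form a full mesh and each leaf attaches to one hub."""
    if not 0 < hubs <= nodes:
        raise ValueError("need between 1 and `nodes` hubs")
    leaves = nodes - hubs
    return full_mesh_links(hubs) + leaves


if __name__ == "__main__":
    print(full_mesh_links(1000))  # every instance talks to every other
    print(hub_links(1000, 10))    # 10 hubs, 990 leaves
```

        With 1000 nodes the full mesh needs 499,500 links, while 10 hubs need 1,035. The price is that those 10 hubs carry everyone's traffic.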

    • topbroken@programming.dev
      2 years ago

      This is kinda how Usenet worked (well, still does). Rather than n*n federated connections, smaller providers tend to federate with central hubs that form backbones.

      I think it makes sense for the fediverse as well.