Lemmy Server Performance

lemmy_server uses the Diesel ORM, which automatically generates SQL statements. Serious performance problems in June and July 2023 are preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, client/server code, SQL data, server operator apps, the server operator API (performance and storage monitoring), etc.

Members: 437
Posts: 4
Active Today: 4
Created: 2 yr. ago
  • Lemmy Server Performance @lemmy.ml
    RoundSparrow @ .ee @lemm.ee

    Redis, Memcached, dragonfly for Lemmy and 2010 presentation on scaling....

    Lemmy is highly unusual in its stance of not using Redis, Memcached, Dragonfly... something. And look at all the CPU cores and RAM being used for what this week is reported as 57K active users across more than 1200 instance servers.

    Why no Redis, Memcached, or Dragonfly? These are staples of scaling an API.

    Anyway, Reddit too started with PostgreSQL and was open source.

    MONDAY, MAY 17, 2010

    http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html

    "and growing Reddit to 7.5 million users per month"

    Lesson 5: Memcache
    The essence of this lesson is: memcache everything.

    They store everything in memcache: 1. Database data 2. Session data 3. Rendered pages 4. Memoizing (remember previously calculated results) internal functions 5. Rate-limiting user actions, crawlers 6. Storing pre-computing listings/pages 7. Global locking.
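    The "memoizing" item in that list can be sketched in a few lines of Python. This is only an illustration of the idea: a plain in-process dict stands in for memcached, and `memoize_with_ttl`, `front_page_listing`, and `db_calls` are hypothetical names, not anything from Reddit or Lemmy.

```python
import time
from functools import wraps


def memoize_with_ttl(ttl_seconds, clock=time.monotonic):
    """Cache a function's results in process memory for ttl_seconds."""
    def decorator(func):
        cache = {}  # maps positional args -> (expiry_time, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = cache.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]        # fresh entry: skip the expensive call
            value = func(*args)      # miss or expired entry: recompute
            cache[args] = (now + ttl_seconds, value)
            return value

        return wrapper
    return decorator


# Hypothetical stand-in for an expensive database query.
db_calls = []


@memoize_with_ttl(ttl_seconds=30)
def front_page_listing(community):
    db_calls.append(community)  # record every real "query"
    return [f"{community}: post {n}" for n in range(3)]
```

    Calling `front_page_listing("lemmy")` twice within 30 seconds touches the fake database only once. A shared cache such as memcached would apply the same pattern across processes, which a per-process dict cannot do.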

    They store more data now in Memcachedb than Postgres. It’s like memcache but stores to disk. Very fast. All queries are gener

  • Lemmy Server Performance @lemmy.ml
    DreadTowel @lemmy.world

    De-facto memory leak

    It looks like the lack of persistent storage for the federated activity queue is leading to instances running out of memory in a matter of hours. See my comment for more details.

    Furthermore, this leads to data loss, since there is no other consistency mechanism. I think it might be a high priority issue, taking into account the current momentum behind growth of Lemmy...
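    To make the idea of persistent storage for an activity queue concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not how Lemmy implements or plans to implement its queue; the class name, table name, and columns are all assumptions made for illustration.

```python
import sqlite3


class PersistentActivityQueue:
    """A FIFO queue whose items survive restarts because they live on disk."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " target TEXT NOT NULL,"
            " payload TEXT NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, target, payload):
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO outbox (target, payload) VALUES (?, ?)",
                (target, payload),
            )

    def peek(self):
        """Return the oldest (id, target, payload) row, or None if empty."""
        return self.db.execute(
            "SELECT id, target, payload FROM outbox ORDER BY id LIMIT 1"
        ).fetchone()

    def ack(self, item_id):
        """Delete an item only after it has been delivered successfully."""
        with self.db:
            self.db.execute("DELETE FROM outbox WHERE id = ?", (item_id,))

    def __len__(self):
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
```

    Because an item is removed only by `ack`, a crash between `peek` and `ack` leaves it on disk to be retried, and memory use no longer grows with the backlog.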

  • Lemmy Server Performance @lemmy.ml
    phiresky @lemmy.world

    High CPU usage? How to profile Lemmy server process CPU usage

    Here's how you can profile the Lemmy server's CPU usage with perf (a standard Linux tool):

    Run this to record 10 s of CPU samples: perf record -p $(pidof lemmy_server) --call-graph=lbr -a -- sleep 10

    Then run this to show the output: perf report
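    If you want to repeat the two steps from a script, a small Python wrapper could look like the sketch below. The function names are made up for this example, and the flags are the same ones used in the command above. The `call_graph` parameter is there because LBR needs hardware support; perf also accepts `dwarf` and `fp` as call-graph modes.

```python
import subprocess


def build_perf_record_cmd(pid, seconds=10, call_graph="lbr"):
    """Return the argv that samples one process for a fixed duration."""
    return [
        "perf", "record",
        "-p", str(pid),
        f"--call-graph={call_graph}",
        "-a",
        "--", "sleep", str(seconds),
    ]


def profile_lemmy(seconds=10, call_graph="lbr"):
    """Record samples for lemmy_server, then open the interactive report."""
    # pidof prints one or more PIDs separated by spaces; take the first.
    pid = subprocess.check_output(["pidof", "lemmy_server"], text=True).split()[0]
    subprocess.run(build_perf_record_cmd(pid, seconds, call_graph), check=True)
    subprocess.run(["perf", "report"], check=True)
```

    Calling `profile_lemmy()` on the server writes perf.data to the current directory and then shows the report, so it needs the same permissions as running perf by hand.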

    Screenshot example:

    Post your result here. There might be something obvious causing high usage.

  • Lemmy Server Performance @lemmy.ml
    vapeloki @lemmy.ml

    Lemmy Benchmarking - Working on tooling

    Heyho,

    As a PostgreSQL guy, I'm currently working on a tooling environment to simulate load on a Lemmy instance and measure the database usage.

    The tooling is written in Go (just because it is easy to write parallel load generators with it), and I'm using tools like PoWA to take a deep look at what happens in the database.

    Currently, I have some trouble with Lemmy itself that makes it hard to really stress the database. For example, the worker pool is far too small to bring the database near any real performance bottlenecks. Rate limiting per IP is also an issue.

    I thought about bypassing the reverse proxy in front of the Lemmy API and just spoofing Forwarded-For headers to work around it.
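    For anyone who wants to experiment with the idea, here is a rough Python sketch of a parallel load generator that sends a different made-up client address with every request. It is not the Go tooling described in this post. The target URL, the `X-Forwarded-For` header name, and the assumption that the server trusts that header are all guesses for illustration, and every function name is hypothetical.

```python
import random
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: a local test instance that you operate yourself.
TARGET_URL = "http://localhost:8536/api/v3/post/list"


def random_ipv4(rng):
    """Make up a client address so each request appears to come from a new IP."""
    return ".".join(str(rng.randint(1, 254)) for _ in range(4))


def build_request(url, rng):
    # Assumption: the server trusts a client-supplied X-Forwarded-For header.
    return urllib.request.Request(url, headers={"X-Forwarded-For": random_ipv4(rng)})


def http_send(request):
    """Send one request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 429 if the rate limiter still applies


def run_load(url, total_requests, workers, send=http_send, seed=0):
    """Fire total_requests requests from a pool of worker threads."""
    rng = random.Random(seed)
    requests = [build_request(url, rng) for _ in range(total_requests)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, requests))
```

    Something like `run_load(TARGET_URL, total_requests=1000, workers=50)` returns the list of status codes, so counting 429 responses shows whether the rate limit was actually avoided. Only point this at an instance you run yourself.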

    Any ideas are welcome. Anyone willing to help is welcome.

    Goals of this should be:

    • Get a good feeling for the current database usage of lemmy
    • Optimize Statements and DB Layout
    • Find possible improvements by caching

    As your beloved core devs for Lemmy have a large issue tracker backlog, some folks that