In a recent post, I discussed implementing MooseFS across several cloud-based servers.
My goal was to achieve data redundancy and resiliency in my cloud lab. If you’re interested in more of the basics, check out that initial article.
Since my original deployment, I’ve been able to do a lot of tuning and playing with various settings.
I wanted to discuss some of the major things that I’ve learned about the solution that might help others. You can think of this article as sort of a “best practices” for cloud MooseFS deployments.
Let’s remember that in this case, I’m using MooseFS well outside the company’s “intended” use case. We’ve violated a number of their best practices to get here.
This article is more of my own best practices guidelines, learned from working with the platform.
MooseFS Is Poor On Security
First and foremost, MooseFS is not what I’d describe as a secure-by-default application.
In general, a cloud instance of MooseFS should absolutely be behind software (or hardware) firewalls with no direct public access from the internet (e.g. no public IPs).
If you’re hell bent on bad ideas, at least implement a software firewall like UFW or iptables to ensure the system is only accessible from desired locations!
There are several reasons any instance of MooseFS should be highly secured:
- The web front end doesn’t have any password protections
- The default configurations don’t use password authentication and allow mounting from any IP address (though they can be configured more securely, of course)
- The basic security functions all rely on configuration files with your passwords in plain text instead of rugged hash based systems
- The actual traffic between servers is not encrypted in any way and there doesn’t appear to be a way to do so
In my case, all of my virtual server instances exist in virtual LAN interfaces, protected by a software firewall.
I can access the resources via site to site VPNs and all of my MooseFS servers can also reach each other over site to site VPNs.
Just forget about doing any kind of port forwarding or traditional firewall pokes! Forget about it!
If your network has a lot of different users, you may want to give some thought to how you’ll secure it from users on your local network.
MooseFS is just not built for public exposure, so make sure everything is behind a firewall.
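To make the firewall advice a bit more concrete, here is a minimal Python sketch that prints UFW rules limiting the MooseFS ports to one trusted subnet. The port numbers are assumed to be the MooseFS defaults and the 10.10.0.0/24 subnet is a made-up example, so check both against your own configuration files before relying on anything like this.

```python
import ipaddress

# Assumed MooseFS default TCP ports; adjust if your configs override them.
MOOSEFS_PORTS = {
    9419: "master: metaloggers and supervisors",
    9420: "master: chunk servers",
    9421: "master: clients (mounts)",
    9422: "chunk server: data transfer",
    9425: "CGI monitoring interface",
}


def ufw_rules(trusted_subnet: str) -> list[str]:
    """Build 'ufw allow' commands that open MooseFS ports to one subnet only."""
    # Raises ValueError if the subnet is malformed (e.g. host bits set).
    ipaddress.ip_network(trusted_subnet)
    return [
        f"ufw allow from {trusted_subnet} to any port {port} proto tcp"
        for port in sorted(MOOSEFS_PORTS)
    ]


# Hypothetical VPN subnet, used purely for illustration.
for rule in ufw_rules("10.10.0.0/24"):
    print(rule)
```

Rules like these only help if the default policy denies everything else (for example, "ufw default deny incoming"), so the allow list is the only way in.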
Choose Your MooseFS Transport Wisely
If your MooseFS data will transit the public internet, be mindful of securing that traffic!
You could theoretically argue that data stored and transmitted is inherently secure. This is because data is broken up into “chunks” and only pieces of the data are stored or transmitted, likely to different destinations.
That’s a fair argument. But, let’s not confuse obscurity with security!
There is little that beats proper, authenticated security such as TLS, which MooseFS does not have. Wrap obscurity in security and you’re golden!
The traditional approach to securing traffic is a VPN. But, not all VPNs are created equal.
For my transport, I opted to go with WireGuard.
This VPN protocol is highly performant and is known to be able to transmit large volumes of traffic over the public internet. The overhead of WireGuard is generally minimal and even “potato grade” hardware is decent at encapsulating and decapsulating the packets.
With VPN solutions like IPsec, SSL VPN and OpenVPN, there’s a significant amount of overhead. They’re also much more demanding of hardware, leaning on things like AES acceleration and high end CPU performance.
You could definitely do this with properly configured IPsec or SSL VPNs, no doubt. It’s just that you might not be able to achieve maximum throughput between chunk servers.
Optimize Your Congestion Protocols
If you have any kind of latency in the network, which is common when using wide area networks, this one will be important.
Since my MooseFS instance is in the cloud, I have inherent latency involved.
Without going into the deep technical details, latency has a major impact on throughput. The concept behind this is the bandwidth-delay product: the higher the latency, the more data has to be “in flight” to keep the link full, so a sender that can’t keep enough in flight gets lower throughput.
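A quick worked example helps show the scale of the problem. This is a minimal sketch with illustrative numbers (a 1 gigabit link, a 100 ms round trip and a 64 KiB window are assumptions, not measurements from my cluster).

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a link of this speed full."""
    return bandwidth_bps * rtt_seconds / 8


def window_limited_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Best-case throughput when only one window can be sent per round trip."""
    return window_bytes * 8 / rtt_seconds


# Assumed example values: 1 Gbps link, 100 ms round trip time.
link_bps = 1_000_000_000
rtt = 0.100

print(f"{bdp_bytes(link_bps, rtt) / 1e6:.1f} MB in flight needed")        # 12.5 MB
print(f"{window_limited_bps(64 * 1024, rtt) / 1e6:.2f} Mbps with 64 KiB")  # 5.24 Mbps
```

In other words, a sender stuck at a small window gets a tiny fraction of the link, no matter how fast the link itself is.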
High latency used to be a death sentence for large throughput applications back in the day.
But, in 2025 we have “modern” congestion protocols that deal much better with high latency.
BBR is one such protocol. Developed at Google and put into Google’s production in 2015, it’s an entire re-thinking of how to transmit data over both lossy and highly latent connections.
I implemented BBR on all my MooseFS machines. This allowed for much better throughput in my chunk transfers, easily doubling the overall throughput that I could achieve.
The default congestion protocol for most machines at the time of this writing is called Cubic. It’s good and works really well. But, it falls down in high throughput, high latency situations.
You may also want to adjust the congestion protocol used by clients. Since they will be reaching out over this WAN to acquire data, an efficient congestion protocol will ensure higher performance.
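On Linux, the active algorithm is exposed through the net.ipv4.tcp_congestion_control sysctl, and BBR is commonly switched on by setting that key to bbr (often together with net.core.default_qdisc=fq). Here is a minimal sketch for checking where a machine currently stands. The helper names are my own, and it assumes the usual /proc/sys layout.

```python
from pathlib import Path

SYSCTL_DIR = "/proc/sys/net/ipv4"


def congestion_control(sysctl_dir: str = SYSCTL_DIR) -> tuple[str, list[str]]:
    """Return (active algorithm, list of algorithms the kernel offers)."""
    base = Path(sysctl_dir)
    current = (base / "tcp_congestion_control").read_text().strip()
    available = (base / "tcp_available_congestion_control").read_text().split()
    return current, available


def bbr_status(current: str, available: list[str]) -> str:
    """Summarize whether BBR is active, merely available, or missing."""
    if current == "bbr":
        return "BBR is active"
    if "bbr" in available:
        return f"BBR is available but {current} is active"
    return f"BBR is not listed (module may not be loaded); {current} is active"


# Only query the live system when the sysctl files actually exist.
if Path(SYSCTL_DIR, "tcp_congestion_control").exists():
    print(bbr_status(*congestion_control()))
```

Running the same check on clients and servers is a cheap way to confirm that both ends of the WAN are actually using what you think they are.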
Implement Mesh VPN For MooseFS
MooseFS is highly network aware. It works in both traditional layer 2 and layer 3 types of network implementations.
But, the important thing is that each server can directly talk to all other servers. It is a mesh model, not a hub and spoke (or star topology) model.
This means that your cloud network should be meshed as well.
Since this mesh model is at the basis of MooseFS, performance scales correspondingly as you add more chunk servers. More servers equals more potential connections.
So, if you’re operating from different sites and/or servers, each of these servers should have connectivity “directly” to one another.
This is often called “site to site” VPN or “site to multi-site” VPN.
If your VPN uses a star topology, your central hub is going to become a bottleneck. When you tunnel traffic back to your hub, every byte crosses the hub twice: once on ingress and again on egress.
It is far more efficient to let the traffic go its most direct path to each of the servers.
Remember, your VPN is going to obscure the “actual” path data has to take, but reducing the hops involved will simply be more performant.
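To put some rough numbers on the mesh versus star trade-off, here is a small sketch. The site count and the 400 Mbps flow are illustrative assumptions, and the model ignores VPN overhead entirely.

```python
def full_mesh_tunnels(sites: int) -> int:
    """Tunnels needed so every site talks directly to every other site."""
    return sites * (sites - 1) // 2


def star_tunnels(sites: int) -> int:
    """Tunnels needed when every site only connects to one central hub."""
    return max(sites - 1, 0)


def hub_traffic_mbps(spoke_to_spoke_mbps: float) -> float:
    """Traffic the hub handles for one spoke-to-spoke flow (in plus out)."""
    return 2 * spoke_to_spoke_mbps


sites = 5  # assumed example: five sites or servers
print(full_mesh_tunnels(sites))  # 10 tunnels to configure
print(star_tunnels(sites))       # 4 tunnels to configure
print(hub_traffic_mbps(400))     # 800 Mbps through the hub for a 400 Mbps flow
```

The mesh costs more tunnels to set up, but the star pays for its simplicity every time two chunk servers talk to each other.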
Allocate Appropriate Virtual Resources
As I discussed in the intro article, MooseFS recommends against virtualizing their solution.
I understand why that’s what they recommend, but I also think this is a silly approach in 2025, especially for cloud resources, which highly benefit from virtualization from a value proposition standpoint. (Examples include same-box firewall, local DNS, better hardware control and robust monitoring capabilities.)
One of my observations with MooseFS is that it is not extremely resource intensive (e.g. CPU utilization, memory), but it can still bring your system to its knees.
Here are some general virtual hardware recommendations at a hobbyist type level:
- Master server should have 1 to 2 vCPUs. Memory needs scale with the amount of data stored, but at least 1024 to 4096 megabytes of RAM is recommended.
- Chunk servers should not be allocated more than 1 vCPU. Though they aren’t CPU intensive, the chunk servers can put extreme load on your system. Even 3% CPU usage can place a load of 5 or more on your system. Memory needs are quite low, typically 512 to 1024 megabytes will be sufficient.
- Metalogger servers are the least “needy” of the bunch. 1 vCPU and 1024 megabytes of RAM is fine here. They barely get used; the metadata is only transferred to them once an hour.
- Memory requirements increase based on the number of chunks in the system. It is lessened by the number of chunk servers in the system. You may need to increase settings based on specific hardware and storage levels.
As discussed above, the load values that MooseFS can place on a system are quite extreme.
Load is different from CPU utilization. On Linux, the load average counts the tasks that are running, waiting for a CPU, or blocked in uninterruptible disk I/O. That last group is how a storage service can show low CPU usage and a high load at the same time.
As a rough reading, a load of 2 means two CPUs’ worth of work is queued. A load of 20 means 20 CPUs are theoretically needed.
However, MooseFS operates just fine in an IO starved environment from my experiments.
Since overall CPU usage is quite low, the system can easily swap loads between actual CPUs with relative efficiency and minimal impact to the system.
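A simple way to keep load figures in perspective is to divide them by the number of CPUs available. The sketch below uses the figures quoted in this article (a load of 5 on a single vCPU, and a load of nearly 80 on 16 threads), and the helper name is my own.

```python
import os


def load_per_cpu(load_average: float, cpus: int) -> float:
    """Load average divided by CPU count; above 1.0 means work is queuing."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return load_average / cpus


print(load_per_cpu(5.0, 1))    # 5.0 (chunk server with a single vCPU)
print(load_per_cpu(80.0, 16))  # 5.0 (heavily tuned 16 thread host)

# Live reading, where the platform provides load averages.
if hasattr(os, "getloadavg"):
    try:
        one_minute, _, _ = os.getloadavg()
        print(round(load_per_cpu(one_minute, os.cpu_count() or 1), 2))
    except OSError:
        pass  # load average not obtainable on this system
```

Both examples work out to the same per-CPU figure, which is a handy reminder that a scary absolute number may not be worse than a small one on a small machine.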
Were you building an actual high performance MooseFS cluster, the calculus about starving IO operations would be different. In that case, ensuring full resource provisioning is likely more critical.
But, what this does mean is that one should be cautious about implementing MooseFS in shared resource environments, such as a VPS (virtual private server). Your provider may kick you out if you rip it too hard!
Making The Network Hurt
One of the things that took me by surprise is how conservative a lot of the default values are.
After seeing the CPU usage vs. load behavior discussed above, it does make sense that MooseFS would opt for more conservative default values.
But, when it comes to my private cloud lab, I want to make my hardware hurt! So, let’s melt some CPUs!
The default values might work fine in a LAN with low latency. But, in the cloud, there are a lot of delays between chunks getting sent, received, stored and informing the master server of all the status changes.
My goal was to be able to utilize my contracted gigabit network to its maximum extent possible. The defaults were giving me between 25 and 150 Mbps on average.
I played around with most of the different settings in the MooseFS master server. (i.e. mfsmaster.cfg)
I found two settings that were primarily tied to actual network and overall server performance. These settings are:
CHUNKS_WRITE_REP_LIMIT
CHUNKS_READ_REP_LIMIT
These need to be tuned to any given network. They limit how many chunk replications the servers will write (receive) and read (send) at the same time. Both the configuration file and the man pages describe the settings and the relationship between them.
What I did to tune the system was double all of the values for each of these two settings, then restart the master server.
This increased performance, but not as much as I wanted. So, I doubled again. And so forth. Until the system could saturate a 1 gigabit network connection. I ended up with:
CHUNKS_WRITE_REP_LIMIT = 16,8,8,32,32
CHUNKS_READ_REP_LIMIT = 80,40,16,40,80
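The doubling procedure is simple enough to sketch in a few lines of Python. The starting values below are an assumption: they are what you get by halving the final values above three times, so compare them with what your own mfsmaster.cfg actually ships with.

```python
def double_limits(value: str, rounds: int = 1) -> str:
    """Double every number in a comma separated limit string, `rounds` times."""
    factor = 2 ** rounds
    return ",".join(str(int(part) * factor) for part in value.split(","))


# Assumed starting points (back-calculated, not taken from the docs).
write_start = "2,1,1,4,4"
read_start = "10,5,2,5,10"

print("CHUNKS_WRITE_REP_LIMIT =", double_limits(write_start, rounds=3))  # 16,8,8,32,32
print("CHUNKS_READ_REP_LIMIT =", double_limits(read_start, rounds=3))    # 80,40,16,40,80
```

Going one doubling at a time, with a master restart and a throughput check in between, makes it much easier to spot the point where extra load stops buying extra bandwidth.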
After tuning, the system can nearly balance and send data as fast as I can write to it. And that’s over a highly latent WAN!
But, fair warning, highly tuned values like this are going to make your system hurt! I saw loads of nearly 80 on a comparatively weak 8 core, 16 thread machine! That is a LOT, perhaps the most I’ve seen!
Also, be aware, before aggressive tuning like this, you should know your provider’s “fair use” policies. Unlimited is never unlimited. If your provider is sensitive to saturating your link, you should think twice about aggressively tuning your system like this.
I may actually back off on these aggressive settings once the bulk of my data is in the array.
Reason being, these aggressive settings also appear to impact the write performance slightly. Since each drive has to receive your data and aggressively send it back out, my “spinning disks” are really working harder than they should.
But, to be fair, MooseFS will also stabilize to almost no traffic between servers when it’s not rebalancing or moving data around.
Be Mindful Of Storage Classes
MooseFS has a concept of storage classes. These are used to determine where data is written and also ultimately stored.
There is a lot of flexibility in the design, allowing precise control over exactly where the chunks are stored.
However, it’s important to be mindful of where data will be written if you want to achieve decent write performance to the MooseFS cluster.
There are options for a thing called “topology” contained in MooseFS. However, at the time of this writing, I have determined that my version is currently bugged out and unable to use this properly.
That said, you can still control where data is initially written with the storage classes. (This is the “-C” bit or creation part of defining the storage class.)
Using these, you can ensure the data is initially ingested on drives locally close to you. Then, from there, the system can replicate and store the data in a different place later.
From my perspective, I care more about the time for an initial write than I do the performance of chunk replication later in the storage process. One is me waiting for something to happen, the latter is robots doing robot things.
I did note that the more I pushed the system to higher levels of chunk replication and performance, the worse the overall initial write performance became.
This is due to the fact that your drives have multiple, conflicting mandates. One is to ingest your data quickly, the other is to move data to other drives, also quickly.
So, there’s a balance that needs to be found for each system’s configuration.
Make Sure Your Network Is Up To The Task
When it comes to high performance networking, the details often matter. Which is why I talked about obscure things like congestion protocols previously.
I found it curious that MooseFS “official” best practices recommend things like LACP (multiple, redundant network uplinks) as this has zero bearing on actual network performance in almost all “properly designed” use cases.
Based on my experience, though, the quality of the network does matter.
In a wide area network (and internet based) MooseFS cluster, the quality of your provider and connectivity between datacenters will absolutely matter.
If you have poor peering between two endpoints, and intermediary carriers are redlining their capacity, there will be little you can actually do to increase performance.
As another example, if your server is on a shared 1 gigabit port, the actual bandwidth you’ll get will be a fraction of that.
The better way of saying this is that the quality of your provider(s) and their overall peering with the “rest of the internet” is going to be important for something like MooseFS.
Fortunately, modern day datacenters have improved dramatically in their connectivity over the last few years.
With 10 Gbps, 40 Gbps and 100 Gbps links being the “norm,” finding poorly connected data centers is definitely harder these days.
Less important are things like specific geography, especially if you use a congestion protocol like BBR. Even intercontinental storage clusters are possible (I know because I did it!), so long as the providers provide you quality throughput and peering.
Data storage both is and isn’t latency sensitive; it really depends on how you intend to use it.
For basic things like the storage of media, higher latencies are fine. But, if you want to operate virtual machines from that storage, then something like my project here is out of the question.
DNS Is The Way For Server Discovery
In the guide files, MooseFS recommends using each host’s /etc/hosts file to manage name resolution.
This is a good idea on your master server, as the system only uses this file to populate its own internal server entries. It doesn’t even look at DNS, at all.
However, for your mfsmaster entry, you’ll have a much better time if you implement this at the DNS server level.
Doing so means you can spin up a MooseFS client (or chunk server, metalogger) with minimal configuration and integration requirements. With DNS, they’ll just be inherently aware of the master server from each network.
In fact, there’s really no need to configure the host files of the chunk servers, metalogger servers or even the clients. If you have good DNS, that is.
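Before pointing a new client or chunk server at the cluster, it’s worth confirming that the master’s name actually resolves from that network. Here is a minimal sketch, assuming the default master hostname of mfsmaster and the default client port of 9421. The resolver argument exists only so the lookup can be swapped out for testing.

```python
import socket


def resolve_master(hostname: str = "mfsmaster", port: int = 9421,
                   resolver=socket.getaddrinfo) -> list[str]:
    """Return the sorted, de-duplicated addresses the master name resolves to."""
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []  # name does not resolve from this machine
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address for both IPv4 and IPv6.
    return sorted({result[4][0] for result in results})


addresses = resolve_master()
if addresses:
    print("mfsmaster resolves to:", ", ".join(addresses))
else:
    print("mfsmaster does not resolve here; check DNS before starting services")
```

In a multi-cluster network, running the same check with each cluster’s master name is a quick sanity test that a server is about to join the cluster you intended.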
I also briefly played around with multiple MooseFS clusters in the same network.
This can be done in the same network, but you do have to take care to associate each server with the appropriate master server. Mounts are also done to the appropriate master server.
A multi-cluster situation gets especially tricky and reinforces the need for good DNS management.
Minimize Your Changes To Config Files
There are tons of various options in the configuration files. Some of them sound quite enticing for squeezing out more and more performance.
However, in playing with a lot of these settings, I found very few that actually made a meaningful difference in regard to performance.
In my experience, most of these performance related settings are better suited to reducing the system impact than to increasing performance.
For example, increasing things like maximum workers, load thresholds and other resources appeared to do very little to change how well (or not well) the system operates.
The settings discussed above (about chunk replication) were the real magic to getting more performance and throughput out of the system.
But, if you’re in a situation where you want to reduce the impact of MooseFS, you might want to explore these options further.
Also, if you’re interested in securing your system further, this is definitely something you’ll want to spend some time digging into.
Integrating With Different Systems
One of the things I hit once I started bringing MooseFS into actual production is that it’s not always optimal for direct client access.
The MooseFS client uses what is called FUSE to mount the cluster.
Since most of my systems that would be interacting with this data are virtualized in Proxmox as LXC containers, this means my containers must be enabled for FUSE.
I encountered an inherent incompatibility between Proxmox, LXC containers and FUSE. Basically, LXC containers with FUSE mounts cannot be backed up with Proxmox Backup Server (PBS) using optimal settings.
Long story short, PBS uses a concept called “FS-FREEZE” and “FS-THAW” to hold a steady state of an LXC while it’s backing up. This just simply doesn’t work when FUSE based disks are mounted.
There is no great workaround, other than using “stop” mode which means you induce down time.
There are some hacks out there, but they’re less than optimal and hardly tried and true.
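If you’re not sure which containers are affected, one option is to look for FUSE entries in each container’s /proc/mounts. This is a minimal sketch that parses the standard mounts format. The sample line is hypothetical, and the exact filesystem type a MooseFS mount reports may differ between versions.

```python
def fuse_mount_points(mounts_text: str) -> list[str]:
    """Return mount points whose filesystem type is FUSE based."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()  # device, mount point, fs type, options, ...
        if len(fields) < 3:
            continue
        fs_type = fields[2]
        # "fusectl" is the kernel control filesystem, not a user mount.
        if fs_type in ("fuse", "fuseblk") or fs_type.startswith("fuse."):
            points.append(fields[1])
    return points


# Hypothetical sample of /proc/mounts content, for illustration only.
sample = """\
/dev/sda1 / ext4 rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
mfsmaster:9421 /mnt/mfs fuse rw,nosuid,nodev,relatime 0 0
"""
print(fuse_mount_points(sample))  # ['/mnt/mfs']
```

On a real container you would pass in the contents of /proc/mounts instead of the sample text.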
So, I found it better to interact with MooseFS using an NFS to MFS gateway, plus an SMB to MFS gateway so Windows machines can access the cluster.
In practice, this is a single machine (a full blown VM in my case) that has both MooseFS and also SMB or NFS. The mounts are sourced on the MFS cluster and shared out via the specified protocol.
What this allows you to do is to work with the MooseFS cluster via NFS or SMB as opposed to requiring MooseFS to be mounted directly on the VM or LXC.
This also solves the issue that MooseFS doesn’t provide a Windows based client for the community edition of MooseFS. No problem, I can work with good old fashioned Samba!
There is a performance hit when you do this. And, it’s not small.
An alternative is to simply use VMs and not LXCs, of course. I’m not going to do that! I love my LXCs and will defend them forever!
And if I have to have an SMB gateway for Windows, it’s a tiny jump to also build an NFS gateway.
That’s Pretty Much It For MooseFS
I think at this point, I’ve taken the proof of concept all the way to a final product.
I’ve been able to achieve the performance levels that are desired for the system and really rip some hardware to shreds!
Hopefully you’ve found some insights and have gathered some overall helpful information.
One of the reasons I wrote this series is that I found very little “human grade” information on MooseFS. Ultimately, practice is better than man pages.
There are very few folks talking about specific details about a MooseFS implementation, so I wanted to change that!
In closing, if you have any comments, feel free to slap them down below.
But, remember, I am not MooseFS technical support!

Hey! Very nice write-up on both MooseFS posts!
I didn’t consider touching the CHUNKS_WRITE_REP_LIMIT (and READ one). I’ll read further this week to see if it may help my setup. I’m running a 10 Gbps 3-node mesh all-flash cluster, but I couldn’t saturate it with the default settings 🙂
I do agree the documentation is a mess. For example, the man pages don’t document what exactly a chunkserver in ‘grace mode’ is. I’ll be contributing my grain of salt here
I found this in the 1.7 (1.7!!) changelog of MooseFS on GitHub
* MooseFS 1.7.0 (2013-07-05)
– (…)
– (master+cs) simple chunkserver overload detection
– (master) ‘grace’ state for chunkservers – after overload detection given chunkserver is in ‘grace’ state for some time – new chunks are not created on that server in this state
Thank you! It was a fun deep dive into a tech. I’m still using it today, been pretty solid and easy to support. Oh, and sorry for the delay, I only very rarely get comments on this site, it’s kind of a hobby thing I do.
These settings basically alter how many chunks are written and replicated simultaneously within the cluster. Even though compute resource utilization goes through the roof as you increase them, actual throughput increases are significant. This is why it’s great to virtualize as you can isolate and control those significant compute loads and still get pure data transfer.
Your project sounds super interesting. I’m primarily using rusty spinners, but I do have a few NVMe based chunkservers too. (Which, I haven’t really bothered to optimize.) If I had any advice, it’s to deep dive on each feature, one at a time, to see how it impacts performance. Also, it’s a different goal to prioritize data ingestion than it is to prioritize replication.
I probably don’t spend enough time discussing storage classes in this post. But, I found them very useful for controlling where data writes and many other things. Additionally, I made some comments in this post about bugs, which have since been fixed. Topology works just fine if you declare the networks and other settings correctly. I’d agree the documentation could be better, you really have to thread multiple sources of MAN files, their webpages/old docs and their forums. But, it’s still a fun hacking project if you’re into storage, networks and doing something off the beaten path.
Have fun!