Update Performance & Scaling

Matthias Klein 2025-08-03 20:25:36 +00:00
parent cdc82390e8
commit 9a9f8ae63f
2 changed files with 323 additions and 1 deletions

Performance-%26-Scaling.md Normal file

@ -0,0 +1,323 @@
# Performance & Scaling
Complete guide to optimizing GTS-HolMirDas for different server configurations and performance requirements.
## 🚀 RSS Feed Optimization (v1.1.0+)
GTS-HolMirDas accepts feed URLs with query parameters, so each feed can return more posts per fetch. This increases content discovery without additional API calls.
### RSS Feed Limits
Most Mastodon-compatible instances support the `?limit=X` parameter:
```bash
# Default behavior (20 posts per feed)
https://mastodon.social/tags/homelab.rss
# Increased limits (up to 100 posts per feed)
https://mastodon.social/tags/homelab.rss?limit=50
https://fosstodon.org/tags/docker.rss?limit=100
```
**Supported limits:** 20 (default) up to 100, with 50 and 75 as common intermediate steps. The maximum an instance honors is instance-dependent.
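
When a feed list is maintained by hand, it is easy to end up with a mix of URLs with and without `?limit=`. The following is a minimal Python sketch, not part of GTS-HolMirDas itself, that sets the parameter on a feed URL while preserving any other query parameters. The function name `with_limit` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_limit(feed_url: str, limit: int) -> str:
    """Return feed_url with its `limit` query parameter set to `limit`."""
    parts = urlsplit(feed_url)
    # Keep every other query parameter and drop any existing `limit`.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "limit"
    ]
    query.append(("limit", str(limit)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_limit("https://mastodon.social/tags/homelab.rss", 50))
# https://mastodon.social/tags/homelab.rss?limit=50
```

Whether an instance actually returns that many posts still depends on the instance, so the result is worth checking against a real feed.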
## 📊 Performance Impact Tables
### Configuration Comparison
| Configuration | Posts/Run | Feed Requests | Processing Time | Memory Impact |
|---------------|-----------|---------------|-----------------|---------------|
| **Conservative** | ~100 posts | 30+ feeds | 2-5 minutes | +50MB |
| **Standard (limit=20)** | ~200 posts | 30+ feeds | 3-8 minutes | +100MB |
| **Optimized (limit=50)** | ~500 posts | 30+ feeds | 5-12 minutes | +200MB |
| **Aggressive (limit=75)** | ~750 posts | 30+ feeds | 8-15 minutes | +300MB |
| **Maximum (limit=100)** | ~1000 posts | 30+ feeds | 10-20 minutes | +400MB |
### Real Production Data
Based on actual deployments with 102 RSS feeds:
```
📊 Conservative Setup (limit=20-50):
⏱️ Runtime: 4:14 | 📄 Posts: 245 | ⚡ 58 posts/minute
🌐 New instances: +15 | 💾 Memory: ~350MB total
📊 Balanced Setup (limit=50-75):
⏱️ Runtime: 6:32 | 📄 Posts: 487 | ⚡ 74 posts/minute
🌐 New instances: +28 | 💾 Memory: ~450MB total
📊 Aggressive Setup (limit=75-100):
⏱️ Runtime: 8:42 | 📄 Posts: 1045 | ⚡ 120 posts/minute
🌐 New instances: +45 | 💾 Memory: ~650MB total
```
## ⚙️ Configuration Tuning
### Environment Variables
```bash
# Processing Configuration
MAX_POSTS_PER_RUN=75 # Increase for higher limits
DELAY_BETWEEN_REQUESTS=1 # Balance speed vs. server load
RSS_URLS_FILE=/app/rss_feeds.txt
# Recommended combinations by server capacity.
# Uncomment one pair to replace the values above; only one pair should be active.
# Small VPS (1GB RAM):
# MAX_POSTS_PER_RUN=40
# DELAY_BETWEEN_REQUESTS=2
# Medium Server (2-4GB RAM):
# MAX_POSTS_PER_RUN=75
# DELAY_BETWEEN_REQUESTS=1
# Powerful Server (4GB+ RAM):
# MAX_POSTS_PER_RUN=100
# DELAY_BETWEEN_REQUESTS=1
```
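
To illustrate how these three variables might be consumed, here is a hedged Python sketch. It is not the tool's actual loading code: the function name `load_settings` is hypothetical, and the fallback defaults are simply the values from the example block above, not confirmed defaults of GTS-HolMirDas.

```python
import os


def load_settings(env=os.environ):
    """Read the documented variables, falling back to assumed defaults."""
    return {
        # Assumed default: the value shown in the example block.
        "max_posts_per_run": int(env.get("MAX_POSTS_PER_RUN", "75")),
        # Parsed as float so fractional delays such as 0.5 would also work.
        "delay_between_requests": float(env.get("DELAY_BETWEEN_REQUESTS", "1")),
        "rss_urls_file": env.get("RSS_URLS_FILE", "/app/rss_feeds.txt"),
    }


small_vps = load_settings({"MAX_POSTS_PER_RUN": "40", "DELAY_BETWEEN_REQUESTS": "2"})
print(small_vps["max_posts_per_run"], small_vps["delay_between_requests"])
```

Passing a plain dictionary instead of `os.environ` makes it easy to try a combination before writing it into the environment file.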
### RSS Feed Strategy
#### Progressive Scaling Approach
**Phase 1: Testing (Week 1)**
```bash
# Start with mixed limits to test performance
https://mastodon.social/tags/homelab.rss?limit=30
https://fosstodon.org/tags/selfhosting.rss?limit=40
https://chaos.social/tags/docker.rss?limit=50
```
**Phase 2: Optimization (Week 2-3)**
```bash
# Increase gradually based on server capacity
https://mastodon.social/tags/homelab.rss?limit=50
https://fosstodon.org/tags/selfhosting.rss?limit=75
https://chaos.social/tags/docker.rss?limit=100
```
**Phase 3: Production (Week 4+)**
```bash
# Full optimization based on monitoring results
https://mastodon.social/tags/homelab.rss?limit=100
https://fosstodon.org/tags/selfhosting.rss?limit=100
https://chaos.social/tags/docker.rss?limit=100
```
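
The phases above raise limits step by step rather than jumping straight to 100. A small Python sketch of that idea follows. It assumes the 20→50→75→100 ladder from the scaling checklist in this guide. The names `LADDER`, `current_limit`, and `next_limit` are hypothetical helpers, not part of GTS-HolMirDas.

```python
from urllib.parse import parse_qs, urlsplit

LADDER = [20, 50, 75, 100]  # steps taken from the scaling checklist


def current_limit(feed_url, default=20):
    """Read the `limit` parameter from a feed URL, or the default of 20."""
    values = parse_qs(urlsplit(feed_url).query).get("limit")
    return int(values[0]) if values else default


def next_limit(limit):
    """Return the smallest ladder step above `limit`, capped at the top step."""
    for step in LADDER:
        if step > limit:
            return step
    return LADDER[-1]


feed = "https://fosstodon.org/tags/selfhosting.rss?limit=40"
print(current_limit(feed), "->", next_limit(current_limit(feed)))
```

Applying one step per feed and then observing for a few days keeps each change small enough to attribute any slowdown to it.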
#### Instance Quality Assessment
**High-Quality Instances (recommended for aggressive limits):**
```bash
# Tech-focused instances (good signal-to-noise ratio)
https://fosstodon.org/tags/homelab.rss?limit=100
https://infosec.exchange/tags/security.rss?limit=100
https://social.tchncs.de/tags/linux.rss?limit=75
# Specialized communities
https://chaos.social/tags/ccc.rss?limit=50
https://pixelfed.social/tags/photography.rss?limit=50
```
**General Instances (moderate limits recommended):**
```bash
# Large general instances (more noise, use moderate limits)
https://mastodon.social/tags/technology.rss?limit=50
https://mstdn.social/tags/programming.rss?limit=40
```
## 📈 Monitoring & Optimization
### Performance Metrics
The statistics output shows real-time performance indicators:
```
📊 GTS-HolMirDas Run Statistics:
⏱️ Runtime: 0:08:42 # Target: <15 minutes
📄 Total posts processed: 487 # Scales with limits
🌐 Current known instances: 3150 # Cumulative growth
New instances discovered: +45 # Per-run discovery
📡 RSS feeds processed: 102 # Your feed count
⚡ Posts per minute: 56.0 # Processing efficiency
```
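
The posts-per-minute figure follows directly from the runtime and the post count. As a sanity check, this Python sketch recomputes it from the sample output above. The function names are hypothetical, and rounding to one decimal place is an assumption about how the value is displayed.

```python
def runtime_seconds(text):
    """Convert 'H:MM:SS' or 'M:SS' into seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def posts_per_minute(posts, runtime):
    """Posts processed per minute of runtime, rounded to one decimal."""
    return round(posts / (runtime_seconds(runtime) / 60), 1)


print(posts_per_minute(487, "0:08:42"))  # 56.0
```

The same helper can be applied to log lines from several runs to compare configurations on equal terms.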
### Key Performance Indicators
**Runtime Optimization:**
- **Target:** <15 minutes per run
- **Good:** 5-10 minutes
- **Excellent:** <5 minutes
**Discovery Efficiency:**
- **New instances per run:** 20-50+ (higher with more aggressive limits)
- **Posts per minute:** 30-100+ (depends on server and network speed)
- **Federation growth:** 100-200+ new instances per week
**Resource Utilization:**
- **Memory growth:** Linear with post count (~0.5MB per 100 posts)
- **Storage growth:** ~50-100MB per month (processed URLs tracking)
- **Network usage:** ~1-5MB per run (RSS fetching + API calls)
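
The runtime bands listed under Runtime Optimization can be turned into a simple check for monitoring scripts. This Python sketch is illustrative only. The function name and the labels are hypothetical, and treating 10-15 minutes as "within target" is an interpretation of the bands, since the guide does not name that range.

```python
def rate_runtime(minutes):
    """Map a run duration in minutes onto the runtime bands of this guide."""
    if minutes < 5:
        return "excellent"
    if minutes <= 10:
        return "good"
    if minutes < 15:
        return "within target"
    return "over target"


for sample in (4.2, 8.7, 12.0, 20.0):
    print(sample, rate_runtime(sample))
```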
### Optimization Guidelines
#### Memory Management
**Monitor GoToSocial Memory Usage:**
```bash
# Check memory usage during runs
docker stats gotosocial
docker stats gts-holmirdas
# Memory impact per configuration:
# Conservative: +50-100MB during processing
# Balanced: +100-200MB during processing
# Aggressive: +200-400MB during processing
```
**Memory Optimization Tips:**
- Each additional 100 posts ≈ 2-5MB additional RAM usage
- Peak memory usage occurs during duplicate detection
- Memory returns to baseline after run completion
- Recommended: 1GB+ total RAM for aggressive configurations
#### Processing Time Optimization
**Scales linearly with:**
- `MAX_POSTS_PER_RUN × number_of_feeds`
- Network latency to RSS sources
- GoToSocial API response times
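
To get a feel for these factors, the following Python sketch computes two rough bounds. Both function names are hypothetical. The delay estimate assumes one `DELAY_BETWEEN_REQUESTS` pause per request, which is an assumption about the tool's behavior rather than something this guide states.

```python
def worst_case_posts(max_posts_per_run, feed_count):
    """Upper bound on posts examined in one run (before duplicates are skipped)."""
    return max_posts_per_run * feed_count


def delay_floor_minutes(requests, delay_between_requests):
    """Minutes spent only on pauses, assuming one pause per request."""
    return requests * delay_between_requests / 60


print(worst_case_posts(75, 102))      # 7650
print(delay_floor_minutes(120, 2))    # 4.0
```

Real runs usually process far fewer posts than the upper bound, because posts already seen are skipped. Network latency and API response times add to the delay floor.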
**Optimization strategies:**
```bash
# If processing takes too long:
MAX_POSTS_PER_RUN=50 # Reduce from 75/100
DELAY_BETWEEN_REQUESTS=2 # Increase from 1
# If network timeouts occur:
DELAY_BETWEEN_REQUESTS=3 # More conservative timing
# Reduce RSS feed count temporarily
# If duplicate detection is slow:
# Clean processed URLs periodically (monthly):
docker-compose exec gts-holmirdas rm -f /app/data/processed_urls.json
```
#### Federation Growth Optimization
**Maximize Instance Discovery:**
- Higher `?limit=` parameters = more diverse instance discovery
- Expect 20-50+ new instances per optimized run
- Specialized hashtags often yield better quality content
- Mix of instance types (tech, general, niche) provides diversity
**Balance Discovery vs. Storage:**
- More instances = larger GoToSocial database
- Monitor database growth: ~10GB per year for active instances
- Consider storage capacity when planning aggressive scaling
## 🛠️ Troubleshooting High-Volume Setups
### Common Scaling Issues
#### Issue: Processing Takes Too Long
```bash
# Solution 1: Reduce volume
MAX_POSTS_PER_RUN=50 # Reduce from 75/100
DELAY_BETWEEN_REQUESTS=2 # Increase from 1
# Solution 2: Optimize feeds
# Remove low-quality or duplicate feeds
# Focus on high-signal instances
```
#### Issue: GoToSocial Uses Too Much Memory
```bash
# Solution 1: Reduce processing volume
# Lower ?limit= parameters to 50 instead of 100
# Reduce RSS feed count temporarily
# Solution 2: Increase run frequency instead of volume
# Run every 30 minutes with limit=25 instead of hourly with limit=75
```
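
The trade-off in Solution 2 can be made concrete with a little arithmetic. This Python sketch uses a hypothetical function name and assumes each run fetches at most `limit` posts from a feed. It compares how many posts a single feed can contribute per hour under the two schedules.

```python
def hourly_ceiling(limit, interval_minutes):
    """Most posts one feed can contribute per hour at the given schedule."""
    runs_per_hour = 60 / interval_minutes
    return limit * runs_per_hour


print(hourly_ceiling(25, 30))  # 50.0 posts per feed per hour, in batches of 25
print(hourly_ceiling(75, 60))  # 75.0 posts per feed per hour, in one batch of 75
```

Under that assumption, the half-hourly schedule cuts the per-run batch to a third and also lowers the hourly ceiling, which is why it eases memory pressure.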
#### Issue: Duplicate Detection Slow
```bash
# Solution: Storage cleanup (monthly maintenance)
docker-compose exec gts-holmirdas rm -f /app/data/processed_urls.json
# Note: This forces fresh state tracking
# Posts will be reprocessed once, then normal duplicate detection resumes
```
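
If losing the tracking state outright feels too risky, one alternative is to move the file aside instead of deleting it. This is a minimal Python sketch of that variation, not a feature of GTS-HolMirDas. The function name `archive_state` and the backup naming scheme are hypothetical, while the path is the one used in the command above.

```python
import shutil
import time
from pathlib import Path


def archive_state(path):
    """Move the state file to a timestamped backup; return its path or None."""
    source = Path(path)
    if not source.exists():
        return None  # nothing to archive
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.move(str(source), str(backup))
    return backup


print(archive_state("/app/data/processed_urls.json"))
```

The effect on the next run is the same as with `rm -f`, since the original file is gone, but the old state can be restored by renaming the backup.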
#### Issue: Network Timeouts
```bash
# Solution: More conservative timing
DELAY_BETWEEN_REQUESTS=3 # Increase from 1-2
MAX_POSTS_PER_RUN=40 # Reduce load
# Check network connectivity:
curl -I https://mastodon.social/tags/test.rss
```
## 🎯 Best Practices by Server Size
### Small VPS (1GB RAM, 1 CPU)
```bash
# Configuration
MAX_POSTS_PER_RUN=25
DELAY_BETWEEN_REQUESTS=2
# RSS Strategy
# 10-20 feeds with limit=30-50
# Focus on quality over quantity
# Monitor memory usage closely
```
### Medium Server (2-4GB RAM, 2+ CPU)
```bash
# Configuration
MAX_POSTS_PER_RUN=50
DELAY_BETWEEN_REQUESTS=1
# RSS Strategy
# 30-50 feeds with limit=50-75
# Good balance of discovery and performance
# Recommended for most deployments
```
### Powerful Server (4GB+ RAM, 4+ CPU)
```bash
# Configuration
MAX_POSTS_PER_RUN=100
DELAY_BETWEEN_REQUESTS=1
# RSS Strategy
# 50-100+ feeds with limit=75-100
# Maximum discovery and federation growth
# Monitor storage growth long-term
```
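
The three profiles above can be expressed as a lookup keyed on available RAM. The Python sketch below is illustrative. `PROFILES` and `pick_profile` are hypothetical names, and because the guide lists 4GB under both "Medium" and "Powerful", treating exactly 4GB as "powerful" is an assumption.

```python
PROFILES = {
    "small": {"MAX_POSTS_PER_RUN": 25, "DELAY_BETWEEN_REQUESTS": 2},
    "medium": {"MAX_POSTS_PER_RUN": 50, "DELAY_BETWEEN_REQUESTS": 1},
    "powerful": {"MAX_POSTS_PER_RUN": 100, "DELAY_BETWEEN_REQUESTS": 1},
}


def pick_profile(ram_gb):
    """Choose a profile name from the amount of RAM in gigabytes."""
    if ram_gb < 2:
        return "small"
    if ram_gb < 4:
        return "medium"
    return "powerful"  # assumption: exactly 4GB counts as powerful


name = pick_profile(2)
for key, value in PROFILES[name].items():
    print(f"{key}={value}")
```

CPU count is ignored here for brevity. On a machine with plenty of RAM but a single core, the medium profile may be the safer starting point.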
## 📋 Performance Checklist
### Pre-Scaling Checklist
- [ ] Monitor baseline resource usage for 1 week
- [ ] Verify GoToSocial has adequate RAM (1GB+ recommended)
- [ ] Test with small feed set before scaling up
- [ ] Set up monitoring/alerting for resource usage
- [ ] Plan storage capacity for database growth
### Scaling Process
- [ ] Increase limits gradually (20→50→75→100)
- [ ] Monitor each change for 2-3 days
- [ ] Adjust `MAX_POSTS_PER_RUN` based on processing time
- [ ] Balance discovery rate with server capacity
- [ ] Document optimal configuration for your setup
### Post-Scaling Monitoring
- [ ] Weekly resource usage review
- [ ] Monthly processed URLs cleanup
- [ ] Quarterly RSS feed quality assessment
- [ ] Database growth monitoring
- [ ] Performance metrics tracking
By following these guidelines, you can optimize GTS-HolMirDas for your specific server configuration and achieve maximum federation efficiency!

@ -1 +0,0 @@
Welcome to the Wiki.