Token Manager Service

Table of Contents

  1. Overview
  2. Why Do We Need This?
  3. How It Works
  4. Architecture
  5. Configuration
  6. Lifecycle Management
  7. Error Handling
  8. Testing
  9. Troubleshooting
  10. Examples

Overview

The Token Manager is a background service that automatically refreshes authentication tokens before they expire. This ensures users stay logged in without interruption and don't experience failed API requests due to expired tokens.

Key Benefits:

  • Seamless user experience (no sudden logouts)
  • No failed API requests due to expired tokens
  • Automatic cleanup on app shutdown
  • Graceful handling of refresh failures

Why Do We Need This?

The Problem

When you log in to MapleFile, the backend gives you two tokens:

  1. Access Token - Used for API requests (expires quickly, e.g., 1 hour)
  2. Refresh Token - Used to get new access tokens (lasts longer, e.g., 30 days)

Without Token Manager:

User logs in → Gets tokens (access token expires in 1 hour)
User works for 61 minutes
User tries to upload file → ❌ 401 Unauthorized!
User gets logged out → 😞 Lost work, has to login again

With Token Manager:

User logs in → Gets tokens (access token expires in 1 hour)
Token Manager checks every 30 seconds
At 59 minutes → Token Manager refreshes tokens automatically
User works for hours → ✅ Everything just works!

The Solution

The Token Manager runs in the background and:

  1. Checks token expiration every 30 seconds
  2. Refreshes tokens when < 1 minute remains
  3. Handles failures gracefully (3 strikes = logout)
  4. Shuts down cleanly when app closes

How It Works

High-Level Flow

┌─────────────────────────────────────────────────────────────┐
│                     Application Lifecycle                    │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
        ┌──────────────────────────────────────┐
        │  App Starts / User Logs In           │
        └──────────────────────────────────────┘
                              │
                              ▼
        ┌──────────────────────────────────────┐
        │  Token Manager Starts                │
        │  (background goroutine)              │
        └──────────────────────────────────────┘
                              │
                              ▼
        ┌──────────────────────────────────────┐
        │  Every 30 seconds:                   │
        │  1. Check session                    │
        │  2. Calculate time until expiry      │
        │  3. Refresh if < 1 minute            │
        └──────────────────────────────────────┘
                              │
                    ┌─────────┴─────────┐
                    ▼                   ▼
        ┌───────────────────┐  ┌──────────────────┐
        │  Refresh Success  │  │  Refresh Failed  │
        │  (reset counter)  │  │  (increment)     │
        └───────────────────┘  └──────────────────┘
                                         │
                                         ▼
                              ┌──────────────────┐
                              │  3 failures?     │
                              └──────────────────┘
                                         │
                                    Yes  │  No
                              ┌──────────┴──────┐
                              ▼                 ▼
                    ┌─────────────────┐  ┌──────────┐
                    │  Force Logout   │  │ Continue │
                    └─────────────────┘  └──────────┘
                              │
                              ▼
        ┌──────────────────────────────────────┐
        │  App Shuts Down / User Logs Out      │
        └──────────────────────────────────────┘
                              │
                              ▼
        ┌──────────────────────────────────────┐
        │  Token Manager Stops Gracefully      │
        │  (goroutine cleanup)                 │
        └──────────────────────────────────────┘

Detailed Process

1. Starting the Token Manager

When a user logs in OR when the app restarts with a valid session:

// In CompleteLogin or Startup
tokenManager.Start()

This creates a background goroutine that runs continuously.

2. Background Refresh Loop

The goroutine runs this logic every 30 seconds:

1. Get current session from LevelDB
2. Check if session exists and is valid
3. Calculate: timeUntilExpiry = session.ExpiresAt - time.Now()
4. If timeUntilExpiry < 1 minute:
   a. Call API to refresh tokens
   b. API returns new access + refresh tokens
   c. Tokens automatically saved to session
5. If refresh fails:
   a. Increment failure counter
   b. If counter >= 3: Force logout
6. If refresh succeeds:
   a. Reset failure counter to 0
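
Steps 3 and 4 come down to one duration comparison. The following is a minimal sketch of that decision, not the code in manager.go; needsRefresh is a hypothetical helper name, and the one-minute threshold is the default RefreshBeforeExpiry described in this document.

```go
package main

import (
	"fmt"
	"time"
)

// needsRefresh mirrors step 4: refresh once the remaining lifetime drops
// below the configured lead time. An already expired token has a negative
// remaining lifetime, so it also qualifies.
// (Hypothetical helper for illustration; not part of the package.)
func needsRefresh(timeUntilExpiry, refreshBefore time.Duration) bool {
	return timeUntilExpiry < refreshBefore
}

func main() {
	refreshBefore := time.Minute // default RefreshBeforeExpiry
	now := time.Now()

	expiries := []time.Time{
		now.Add(10 * time.Minute), // plenty of time left
		now.Add(45 * time.Second), // inside the refresh window
		now.Add(-5 * time.Second), // already expired
	}
	for _, expiresAt := range expiries {
		timeUntilExpiry := expiresAt.Sub(now) // step 3
		fmt.Printf("%v left, refresh needed: %v\n",
			timeUntilExpiry, needsRefresh(timeUntilExpiry, refreshBefore))
	}
}
```

Because the comparison is strict, a token with exactly one minute left is picked up on the following check rather than the current one.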

3. Stopping the Token Manager

When user logs out OR app shuts down:

// Create a timeout context (max 3 seconds)
ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
defer cancel()

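// Stop gracefully. Stop returns an error if the goroutine did not confirm in time.
if err := tokenManager.Stop(ctx); err != nil {
    // logger is assumed here to be the app's *zap.Logger
    logger.Warn("Token manager did not stop cleanly", zap.Error(err))
}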

This signals the goroutine to stop and waits for confirmation.


Architecture

Component Structure

internal/service/tokenmanager/
├── config.go          # Configuration settings
├── manager.go         # Main token manager logic
├── provider.go        # Wire dependency injection
└── README.md          # This file

Key Components

1. Manager Struct

type Manager struct {
    // Dependencies
    config      Config                    // Settings (intervals, thresholds)
    client      *client.Client            // API client for token refresh
    authService *auth.Service             // Auth service for logout
    getSession  *session.GetByIdUseCase   // Get current session
    logger      *zap.Logger               // Structured logging

    // Lifecycle management
    ctx       context.Context    // Manager's context
    cancel    context.CancelFunc // Cancel function
    stopCh    chan struct{}      // Signal to stop
    stoppedCh chan struct{}      // Confirmation of stopped
    running   atomic.Bool        // Is manager running?

    // Refresh state
    mu                  sync.Mutex  // Protects failure counter
    consecutiveFailures int         // Track failures
}

2. Config Struct

type Config struct {
    RefreshBeforeExpiry    time.Duration  // How early to refresh (default: 1 min)
    CheckInterval          time.Duration  // How often to check (default: 30 sec)
    MaxConsecutiveFailures int            // Failures before logout (default: 3)
}

Goroutine Management

Why Use Goroutines?

A goroutine is Go's way of running code concurrently with the rest of the program (similar to a very lightweight thread, scheduled by the Go runtime). We need this because:

  • Main app needs to respond to UI events
  • Token checking can happen in the background
  • No blocking of user actions

The Double-Channel Pattern

We use two channels for clean shutdown:

stopCh    chan struct{}  // We close this to signal "please stop"
stoppedCh chan struct{}  // Goroutine closes this to say "I stopped"

Why two channels?

// Without confirmation:
close(stopCh)  // Signal stop
// Goroutine might still be running! ⚠️
// App shuts down → goroutine is cut off mid-refresh or touches resources that are already closed

// With confirmation:
close(stopCh)             // Signal stop
<-stoppedCh               // Wait for confirmation
// Now we KNOW goroutine is done ✅

Thread Safety

Problem: Multiple parts of the app might access the token manager at once.

Solution: Use synchronization primitives:

  1. atomic.Bool for running flag

    // Atomic operations are thread-safe (no mutex needed)
    if !m.running.CompareAndSwap(false, true) {
        return // Already running, don't start again
    }
    
  2. sync.Mutex for failure counter

    // Lock before accessing shared data
    m.mu.Lock()
    defer m.mu.Unlock()
    m.consecutiveFailures++
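
The effect of CompareAndSwap is easiest to see when many callers race. This sketch uses a hypothetical starter type rather than the real Manager, and calls start from 100 goroutines at once; only one of them flips the flag.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// starter imitates the running flag in Manager (illustrative type only).
type starter struct {
	running atomic.Bool
	starts  atomic.Int64 // how many callers actually won the race
}

// start returns true only for the single caller that flips running
// from false to true.
func (s *starter) start() bool {
	if !s.running.CompareAndSwap(false, true) {
		return false // someone else already started it
	}
	s.starts.Add(1)
	return true
}

// raceToStart calls start from n goroutines and reports how many succeeded.
func raceToStart(n int) int64 {
	var s starter
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.start()
		}()
	}
	wg.Wait()
	return s.starts.Load()
}

func main() {
	fmt.Println("successful starts:", raceToStart(100)) // prints "successful starts: 1"
}
```

atomic.Bool and atomic.Int64 require Go 1.19 or newer.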
    

Configuration

Default Settings

Config{
    RefreshBeforeExpiry:    1 * time.Minute,   // Refresh with 1 min remaining
    CheckInterval:          30 * time.Second,  // Check every 30 seconds
    MaxConsecutiveFailures: 3,                 // 3 failures = logout
}

Why These Values?

Setting                 Value       Reasoning
----------------------  ----------  ---------------------------------------------------------------------------
RefreshBeforeExpiry     1 minute    Conservative buffer. Even if one check fails, we have time for next attempt
CheckInterval           30 seconds  Frequent enough to catch the 1-minute window, not too aggressive on resources
MaxConsecutiveFailures  3 failures  Balances between transient network issues and genuine auth problems
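
The "time for next attempt" reasoning is simple arithmetic: dividing the refresh window by the check interval gives roughly how many checks fall inside it. The sketch below is an estimate only (the exact count can differ by one depending on where the ticks land), and checksInWindow is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// checksInWindow estimates how many periodic checks fall inside the
// refresh window. Hypothetical helper; the result is approximate because
// tick alignment relative to the expiry time is not fixed.
func checksInWindow(refreshBefore, checkInterval time.Duration) int {
	if checkInterval <= 0 {
		return 0
	}
	return int(refreshBefore / checkInterval)
}

func main() {
	// Defaults: 1 minute window, 30 second interval.
	fmt.Println(checksInWindow(1*time.Minute, 30*time.Second)) // prints 2

	// A window shorter than the interval may be skipped entirely.
	fmt.Println(checksInWindow(30*time.Second, 1*time.Minute)) // prints 0
}
```

With the defaults, about two refresh attempts fit before the access token expires, which is what makes a single failed check survivable.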

Customizing Configuration

To change settings, modify provider.go:

func ProvideManager(...) *Manager {
    config := Config{
        RefreshBeforeExpiry:    2 * time.Minute,  // More conservative
        CheckInterval:          1 * time.Minute,  // Less frequent checks
        MaxConsecutiveFailures: 5,                // More tolerant
    }
    return New(config, client, authService, getSession, logger)
}
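
When changing these values it may help to sanity-check them before constructing the manager. The sketch below assumes a Validate method, which the package does not define; the rules it enforces are suggestions derived from the reasoning above, plus the fact that time.NewTicker panics on a non-positive interval.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Config repeats the struct from config.go so the example is self-contained.
type Config struct {
	RefreshBeforeExpiry    time.Duration
	CheckInterval          time.Duration
	MaxConsecutiveFailures int
}

// Validate is a hypothetical helper, not an existing method.
func (c Config) Validate() error {
	switch {
	case c.CheckInterval <= 0:
		// time.NewTicker panics for a non-positive duration.
		return errors.New("CheckInterval must be positive")
	case c.RefreshBeforeExpiry <= 0:
		return errors.New("RefreshBeforeExpiry must be positive")
	case c.MaxConsecutiveFailures < 1:
		return errors.New("MaxConsecutiveFailures must be at least 1")
	case c.CheckInterval > c.RefreshBeforeExpiry:
		// Assumption: a check should be able to land inside the window.
		return fmt.Errorf("CheckInterval (%v) exceeds RefreshBeforeExpiry (%v)",
			c.CheckInterval, c.RefreshBeforeExpiry)
	}
	return nil
}

func main() {
	good := Config{RefreshBeforeExpiry: 2 * time.Minute, CheckInterval: time.Minute, MaxConsecutiveFailures: 5}
	bad := Config{RefreshBeforeExpiry: 30 * time.Second, CheckInterval: time.Minute, MaxConsecutiveFailures: 3}

	fmt.Println("good:", good.Validate())
	fmt.Println("bad: ", bad.Validate())
}
```

A natural place to call such a check would be at the top of ProvideManager, before New is invoked.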

Lifecycle Management

1. Starting the Token Manager

Called from:

  • Application.Startup() - If valid session exists from previous run
  • Application.CompleteLogin() - After successful login

What happens:

func (m *Manager) Start() {
    // 1. Check if already running (thread-safe)
    if !m.running.CompareAndSwap(false, true) {
        return  // Already running, do nothing
    }

    // 2. Create context for goroutine
    m.ctx, m.cancel = context.WithCancel(context.Background())

    // 3. Create channels for communication
    m.stopCh = make(chan struct{})
    m.stoppedCh = make(chan struct{})

    // 4. Reset failure counter
    m.consecutiveFailures = 0

    // 5. Launch background goroutine
    go m.refreshLoop()
}

Why it's safe to call multiple times:

The CompareAndSwap operation ensures only ONE goroutine starts, even if Start() is called many times.

2. Running the Refresh Loop

The goroutine does this forever (until stopped):

func (m *Manager) refreshLoop() {
    // Ensure we always mark as stopped when exiting
    defer close(m.stoppedCh)
    defer m.running.Store(false)

    // Create ticker (fires every 30 seconds)
    ticker := time.NewTicker(m.config.CheckInterval)
    defer ticker.Stop()

    // Do initial check immediately
    m.checkAndRefresh()

    // Loop forever
    for {
        select {
        case <-m.stopCh:
            // Stop signal received
            return

        case <-m.ctx.Done():
            // Context cancelled
            return

        case <-ticker.C:
            // 30 seconds elapsed, check again
            m.checkAndRefresh()
        }
    }
}

The select statement explained:

Think of select like a switch statement for channels. It waits for one of these events:

  • stopCh closed → Time to stop
  • ctx.Done() → Forced cancellation
  • ticker.C → 30 seconds passed, do work
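
The same loop shape can be tried in isolation. This sketch is a simplified stand-in for refreshLoop: runLoop and demo are illustrative names, the interval is shortened to 5 ms, and the check function closes stopCh itself on its third call so the program ends on its own.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runLoop calls check once immediately and then on every tick until stopCh
// is closed or ctx is cancelled. It returns the reason it exited.
func runLoop(ctx context.Context, stopCh <-chan struct{}, interval time.Duration, check func()) string {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	check() // initial check, before the first tick
	for {
		select {
		case <-stopCh:
			return "stop signal"
		case <-ctx.Done():
			return "context cancelled"
		case <-ticker.C:
			check()
		}
	}
}

// demo runs the loop and requests a stop from inside the third check.
func demo() (checks int, reason string) {
	stopCh := make(chan struct{})
	check := func() {
		checks++
		if checks == 3 {
			close(stopCh)
		}
	}
	reason = runLoop(context.Background(), stopCh, 5*time.Millisecond, check)
	return checks, reason
}

func main() {
	checks, reason := demo()
	fmt.Printf("exited after %d checks: %s\n", checks, reason)
}
```

If several cases are ready at the same time, select picks one at random, so a tick that is already pending can still be served once after the stop signal arrives.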

3. Stopping the Token Manager

Called from:

  • Application.Shutdown() - App closing
  • Application.Logout() - User logging out

What happens:

func (m *Manager) Stop(ctx context.Context) error {
    // 1. Check if running
    if !m.running.Load() {
        return nil  // Not running, nothing to do
    }

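    // 2. Signal stop (close the channel)
    // Note: closing an already-closed channel panics, so two overlapping
    // Stop calls (for example Logout and Shutdown racing) must be avoided
    close(m.stopCh)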

    // 3. Wait for confirmation OR timeout
    select {
    case <-m.stoppedCh:
        // Goroutine confirmed it stopped
        return nil

    case <-ctx.Done():
        // Timeout! Force cancel
        m.cancel()

        // Give it 100ms more
        select {
        case <-m.stoppedCh:
            return nil
        case <-time.After(100 * time.Millisecond):
            return ctx.Err()  // Failed to stop cleanly
        }
    }
}

Why the timeout?

If the goroutine is stuck (e.g., in a long API call), we can't wait forever. The app needs to shut down!
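
The "confirmation OR timeout" step can be exercised on its own. In this sketch waitStopped and stopWorker are hypothetical helpers, and a sleep stands in for an in-flight API call that delays the goroutine's exit.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitStopped blocks until stoppedCh is closed or ctx expires, whichever
// comes first.
func waitStopped(ctx context.Context, stoppedCh <-chan struct{}) error {
	select {
	case <-stoppedCh:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// stopWorker starts a goroutine that needs busyFor to finish its current
// work after being told to stop, then waits at most timeout for it.
func stopWorker(busyFor, timeout time.Duration) error {
	stopCh := make(chan struct{})
	stoppedCh := make(chan struct{})
	go func() {
		defer close(stoppedCh)
		<-stopCh
		time.Sleep(busyFor) // stands in for an in-flight API call
	}()

	close(stopCh)
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return waitStopped(ctx, stoppedCh)
}

func main() {
	fmt.Println("quick worker:", stopWorker(time.Millisecond, time.Second))

	err := stopWorker(time.Second, 20*time.Millisecond)
	fmt.Println("stuck worker:", err, errors.Is(err, context.DeadlineExceeded))
}
```

The second call returns after about 20 ms with context.DeadlineExceeded instead of waiting the full second, which is the behavior the shutdown path relies on.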


Error Handling

1. Refresh Failures

Types of failures:

Failure Type      Cause                   Handling
----------------  ----------------------  --------------------------------------
Network Error     No internet connection  Increment counter, retry next check
401 Unauthorized  Refresh token expired   Increment counter, likely force logout
500 Server Error  Backend issue           Increment counter, retry next check
Timeout           Slow network            Increment counter, retry next check

Failure tracking:

func (m *Manager) checkAndRefresh() error {
    m.mu.Lock()
    defer m.mu.Unlock()

    // ... check if refresh needed ...

    // Attempt refresh
    if err := m.client.RefreshToken(ctx); err != nil {
        m.consecutiveFailures++

        if m.consecutiveFailures >= m.config.MaxConsecutiveFailures {
            // Too many failures! Force logout
            return m.forceLogout()
        }

        return err
    }

    // Success! Reset counter
    m.consecutiveFailures = 0
    return nil
}
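
The counting rule can be isolated from the network call. The following sketch is one possible arrangement, not the package's code: failureTracker is a hypothetical type, and it takes the lock only around the counter update, so the refresh call itself (represented here by a plain function) runs without holding the mutex.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errRefresh is a stand-in for any refresh error.
var errRefresh = errors.New("refresh failed")

// failureTracker is a hypothetical helper, not part of the package.
type failureTracker struct {
	mu          sync.Mutex
	consecutive int
	max         int
}

// record updates the counter for one refresh attempt and reports whether
// the caller should force a logout.
func (f *failureTracker) record(err error) (forceLogout bool) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if err == nil {
		f.consecutive = 0 // success resets the streak
		return false
	}
	f.consecutive++
	return f.consecutive >= f.max
}

// attempt runs refresh (a stand-in for m.client.RefreshToken) outside the
// lock, then records the outcome.
func (f *failureTracker) attempt(refresh func() error) bool {
	return f.record(refresh())
}

func main() {
	tracker := &failureTracker{max: 3}
	fail := func() error { return errRefresh }
	ok := func() error { return nil }

	fmt.Println(tracker.attempt(fail)) // false: 1 failure
	fmt.Println(tracker.attempt(fail)) // false: 2 failures
	fmt.Println(tracker.attempt(ok))   // false: counter reset
	fmt.Println(tracker.attempt(fail)) // false: 1 failure again
	fmt.Println(tracker.attempt(fail)) // false: 2 failures
	fmt.Println(tracker.attempt(fail)) // true: 3 consecutive failures
}
```

Only consecutive failures count: the success in the middle of the sequence pushes the logout from the fourth attempt to the sixth.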

2. Force Logout

When it happens:

  • 3 consecutive refresh failures
  • Session expired on startup

What it does:

func (m *Manager) forceLogout() error {
    m.logger.Warn("Forcing logout due to token refresh issues")

    // Use background context (not manager's context which might be cancelled)
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    // Clear session from LevelDB
    if err := m.authService.Logout(ctx); err != nil {
        m.logger.Error("Failed to force logout", zap.Error(err))
        return err
    }

    // User will see login screen on next UI interaction
    return nil
}

User experience:

When force logout happens, the user sees the login screen the next time they interact with the app. Their work is NOT lost (local files remain); they just need to log in again.

3. Session Not Found

Scenario: User manually deleted session file, or session expired.

Handling:

// Get current session
sess, err := m.getSession.Execute()
if err != nil || sess == nil {
    // No session = user not logged in
    // This is normal, not an error
    return nil  // Do nothing
}

Testing

Manual Testing

Test 1: Normal Refresh

  1. Log in to the app
  2. Watch logs for token manager start
  3. Wait ~30 seconds
  4. Check logs for "Token refresh not needed yet"
  5. Verify time_until_expiry is decreasing

Expected logs:

INFO  Token manager starting
INFO  Token refresh loop started
DEBUG Token refresh not needed yet {"time_until_expiry": "59m30s"}
... wait 30 seconds ...
DEBUG Token refresh not needed yet {"time_until_expiry": "59m0s"}

Test 2: Automatic Refresh

  1. Log in and get tokens with short expiry (if possible)
  2. Wait until < 1 minute remaining
  3. Watch logs for automatic refresh

Expected logs:

INFO  Token refresh needed {"time_until_expiry": "45s"}
INFO  Token refreshed successfully
DEBUG Token refresh not needed yet {"time_until_expiry": "59m30s"}

Test 3: Graceful Shutdown

  1. Log in (token manager running)
  2. Close the app (Cmd+Q on Mac, Alt+F4 on Windows)
  3. Check logs for clean shutdown

Expected logs:

INFO  MapleFile desktop application shutting down
INFO  Token manager stopping...
INFO  Token refresh loop received stop signal
INFO  Token refresh loop exited
INFO  Token manager stopped gracefully

Test 4: Logout

  1. Log in (token manager running)
  2. Click logout button
  3. Verify token manager stops

Expected logs:

INFO  Token manager stopping...
INFO  Token manager stopped gracefully
INFO  User logged out successfully

Test 5: Session Resume on Restart

  1. Log in
  2. Close app
  3. Restart app
  4. Check logs for session resume

Expected logs:

INFO  MapleFile desktop application started
INFO  Resuming valid session from previous run
INFO  Session restored to API client
INFO  Token manager starting
INFO  Token manager started for resumed session

Unit Testing (TODO)

// Example test structure (to be implemented)

func TestTokenManager_Start(t *testing.T) {
    // Test that Start() can be called multiple times safely
    // Test that goroutine actually starts
}

func TestTokenManager_Stop(t *testing.T) {
    // Test graceful shutdown
    // Test timeout handling
}

func TestTokenManager_RefreshLogic(t *testing.T) {
    // Test refresh when < 1 minute
    // Test no refresh when > 1 minute
}

func TestTokenManager_FailureHandling(t *testing.T) {
    // Test failure counter increment
    // Test force logout after 3 failures
    // Test counter reset on success
}

Troubleshooting

Problem: Token manager not starting

Symptoms:

  • No "Token manager starting" log
  • App works but might get logged out after token expires

Possible causes:

  1. No session on startup

    Check logs for: "No session found on startup"
    Solution: This is normal if user hasn't logged in yet
    
  2. Session expired

    Check logs for: "Session expired on startup"
    Solution: User needs to log in again
    
  3. Token manager already running

    Check logs for: "Token manager already running"
    Solution: This is expected behavior (prevents duplicate goroutines)
    

Problem: "Token manager stop timeout"

Symptoms:

  • App takes long time to close
  • Warning in logs: "Token manager stop timeout, forcing cancellation"

Possible causes:

  1. Refresh in progress during shutdown

    Goroutine might be in the middle of API call
    Solution: Wait for current API call to timeout (max 30s)
    
  2. Network issue

    API call hanging due to network problems
    Solution: Force cancellation (already handled automatically)
    

Problem: Getting logged out unexpectedly

Symptoms:

  • User sees login screen randomly
  • Logs show "Forcing logout due to token refresh issues"

Possible causes:

  1. Network connectivity issues

    Check logs for repeated: "Token refresh failed"
    Solution: Check internet connection, backend availability
    
  2. Backend API down

    All refresh attempts failing
    Solution: Check backend service status
    
  3. Refresh token expired

    Backend returns 401 on refresh
    Solution: User needs to log in again (this is expected)
    

Problem: High CPU/memory usage

Symptoms:

  • App using lots of resources
  • Multiple token managers running

Diagnosis:

# Check goroutines (quote the URL so the shell does not interpret the "?")
curl "http://localhost:34115/debug/pprof/goroutine?debug=1"

# Look for multiple "refreshLoop" goroutines

Possible causes:

  1. Token manager not stopping on logout

    Check logs for missing: "Token manager stopped gracefully"
    Solution: Bug in stop logic (report issue)
    
  2. Multiple Start() calls

    Should not happen (atomic bool prevents this)
    Solution: Report issue with reproduction steps
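
If no pprof HTTP endpoint is reachable, the same check can be made from inside the process with the standard runtime/pprof package. This is a sketch under stated assumptions: the refreshLoop here is a dummy that only blocks, and countGoroutines and leakDemo are hypothetical helpers that search the goroutine dump for a function name.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"sync"
)

// refreshLoop is a dummy stand-in for the real loop: it reports that it is
// running, then blocks until told to stop.
func refreshLoop(started, done *sync.WaitGroup, stop <-chan struct{}) {
	defer done.Done()
	started.Done()
	<-stop
}

// countGoroutines returns how many goroutine stacks contain a frame for a
// function with the given name, or -1 if the dump could not be written.
func countGoroutines(name string) int {
	var buf bytes.Buffer
	// debug=2 prints one full stack per goroutine, separated by blank lines.
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return -1
	}
	count := 0
	for _, stack := range strings.Split(buf.String(), "\n\n") {
		if strings.Contains(stack, "."+name+"(") {
			count++
		}
	}
	return count
}

// leakDemo starts n loops, counts them in the dump, then shuts them down.
func leakDemo(n int) int {
	var started, done sync.WaitGroup
	stop := make(chan struct{})
	for i := 0; i < n; i++ {
		started.Add(1)
		done.Add(1)
		go refreshLoop(&started, &done, stop)
	}
	started.Wait() // make sure every loop is running before counting

	found := countGoroutines("refreshLoop")

	close(stop)
	done.Wait()
	return found
}

func main() {
	fmt.Println("refreshLoop goroutines found:", leakDemo(3))
}
```

In a healthy app the count for the real refreshLoop should be 0 when logged out and 1 when logged in; anything higher points at the causes listed above.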
    

Examples

Example 1: Adding Custom Logging

Want to know exactly when refresh happens?

// In tokenmanager/manager.go, modify checkAndRefresh():

func (m *Manager) checkAndRefresh() error {
    // ... existing code ...

    // Before refresh
    m.logger.Info("REFRESH STARTING",
        zap.Time("now", time.Now()),
        zap.Time("token_expires_at", sess.ExpiresAt))

    if err := m.client.RefreshToken(ctx); err != nil {
        // Log failure details
        m.logger.Error("REFRESH FAILED",
            zap.Error(err),
            zap.String("error_type", fmt.Sprintf("%T", err)))
        return err
    }

    // After refresh
    m.logger.Info("REFRESH COMPLETED",
        zap.Time("completion_time", time.Now()))

    return nil
}

Example 2: Custom Failure Callback

Want to notify UI when logout happens?

// Add callback to Manager struct:

type Manager struct {
    // ... existing fields ...
    onForceLogout func(reason string)  // NEW
}

// In checkAndRefresh():
if m.consecutiveFailures >= m.config.MaxConsecutiveFailures {
    reason := fmt.Sprintf("%d consecutive refresh failures", m.consecutiveFailures)

    if m.onForceLogout != nil {
        m.onForceLogout(reason)  // Notify callback
    }

    return m.forceLogout()
}

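// onForceLogout is unexported, so Application (assuming it lives in a
// different package) cannot assign it directly. One option is a setter in
// manager.go (SetOnForceLogout is a suggested name, not an existing method):
func (m *Manager) SetOnForceLogout(fn func(reason string)) {
    m.onForceLogout = fn
}

// In Application, set callback:
func (a *Application) Startup(ctx context.Context) {
    // ... existing code ...

    // Set callback to emit Wails event
    a.tokenManager.SetOnForceLogout(func(reason string) {
        runtime.EventsEmit(a.ctx, "auth:logged-out", reason)
    })
}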

Example 3: Metrics Collection

Want to track refresh statistics?

type RefreshMetrics struct {
    TotalRefreshes      int64
    SuccessfulRefreshes int64
    FailedRefreshes     int64
    LastRefreshTime     time.Time
}

// Add to Manager:
type Manager struct {
    // ... existing fields ...
    metrics   RefreshMetrics
    metricsMu sync.Mutex
}

// In checkAndRefresh():
if err := m.client.RefreshToken(ctx); err != nil {
    m.metricsMu.Lock()
    m.metrics.TotalRefreshes++
    m.metrics.FailedRefreshes++
    m.metricsMu.Unlock()
    return err
}

m.metricsMu.Lock()
m.metrics.TotalRefreshes++
m.metrics.SuccessfulRefreshes++
m.metrics.LastRefreshTime = time.Now()
m.metricsMu.Unlock()

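// Expose a copy through a method (Metrics is a suggested name). The fields
// are unexported and must be read under metricsMu to avoid a data race:
func (m *Manager) Metrics() RefreshMetrics {
    m.metricsMu.Lock()
    defer m.metricsMu.Unlock()
    return m.metrics
}

// Export metrics via Wails:
func (a *Application) GetRefreshMetrics() map[string]interface{} {
    metrics := a.tokenManager.Metrics()
    return map[string]interface{}{
        "total":      metrics.TotalRefreshes,
        "successful": metrics.SuccessfulRefreshes,
        "failed":     metrics.FailedRefreshes,
    }
}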

Summary for Junior Developers

Key Concepts to Remember

  1. Goroutines are lightweight background tasks

    • They run concurrently with your main app
    • Need careful management (start/stop)
  2. Channels are for communication

    • close(stopCh) = "Please stop"
    • <-stoppedCh = "I confirm I stopped"
  3. Mutexes prevent race conditions

    • Lock before accessing shared data
    • Always defer unlock
  4. Atomic operations are thread-safe

    • Use for simple flags
    • No mutex needed
  5. Context carries deadlines

    • Respect timeouts
    • Use for cancellation

What NOT to Do

Don't call Start() in a loop

// Bad!
for {
    tokenManager.Start()  // No-op after the first call, but the loop spins and burns CPU!
}

Don't forget to Stop()

// Bad!
func Logout() {
    authService.Logout()  // Token manager still running!
}

Don't block on Stop() without timeout

// Bad!
tokenManager.Stop(context.Background())  // Could hang forever!

// Good!
ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
defer cancel()
tokenManager.Stop(ctx)

Getting Help

If you're stuck:

  1. Check the logs (they're very detailed)
  2. Look at the troubleshooting section above
  3. Ask senior developers for code review
  4. File an issue with reproduction steps

Changelog

v1.0.0 (2025-11-21)

  • Initial implementation
  • Background refresh every 30 seconds
  • Refresh when < 1 minute before expiry
  • Graceful shutdown with timeout handling
  • Automatic logout after 3 consecutive failures
  • Session resume on app restart