store.py

Storage Cluster Processor Documentation

The StorageClusterProcessor formats action-analysis results into structured context clusters for LLM inference requests, organizing contexts and their relationships into a single request payload.

Process Overview

  1. Formats action analysis results into context clusters

  2. Identifies relationships between contexts

  3. Creates structured output for LLM inference

Class Documentation

StorageClusterProcessor

StorageClusterProcessor(min_connection_strength: float = 0.4)

Formats processed data for LLM inference requests.

Parameters

  • min_connection_strength: Minimum strength a connection between two clusters must reach to be included in the output (default 0.4)

Core Methods

identify_core_tokens

def identify_core_tokens(self, patterns: Dict[str, Dict]) -> List[str]

Extracts core tokens from pattern data based on a score threshold.
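The documentation does not specify the pattern fields, so here is a minimal sketch of the extraction logic; the `score` key and the threshold value are assumptions, not documented fields:

```python
from typing import Dict, List

def identify_core_tokens(patterns: Dict[str, Dict], score_threshold: float = 0.5) -> List[str]:
    """Sketch: keep tokens whose pattern score clears the threshold."""
    core = [
        token
        for token, info in patterns.items()
        if info.get('score', 0.0) >= score_threshold
    ]
    # Return highest-scoring tokens first
    return sorted(core, key=lambda t: patterns[t]['score'], reverse=True)
```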

create_cluster

def create_cluster(self, pattern_results: Dict[str, Any], cluster_type: str) -> Dict[str, Any]

Creates a context cluster containing:

  • Unique identifier

  • Context type

  • Core tokens

  • Context strength

  • Pattern count
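A minimal sketch of how such a cluster might be assembled; the `items`, `token`, and `strength` field names are assumptions for illustration, and `context_strength` is computed here as a mean only by way of example:

```python
import uuid
from typing import Any, Dict

def create_cluster(pattern_results: Dict[str, Any], cluster_type: str) -> Dict[str, Any]:
    """Sketch: build a context cluster with the five documented fields."""
    items = pattern_results.get('items', [])
    strengths = [item.get('strength', 0.0) for item in items]
    return {
        'cluster_id': f'ctx_{uuid.uuid4().hex[:8]}',   # unique identifier
        'type': cluster_type,                          # context type
        'core_tokens': [item['token'] for item in items],
        'context_strength': sum(strengths) / len(strengths) if strengths else 0.0,
        'pattern_count': len(items),
    }
```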

establish_cluster_connections

def establish_cluster_connections(self, clusters: List[Dict[str, Any]],
                                  pattern_distribution: Dict[str, Dict]) -> List[Dict[str, Any]]

Identifies relationships between contexts for inference.
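A simplified sketch of the pairing logic, based on the shared-pattern and strength-threshold behavior described in the Implementation Notes. It omits the `pattern_distribution` argument, and the strength formula (shared tokens relative to the smaller cluster) is an assumption:

```python
from itertools import combinations
from typing import Any, Dict, List

def establish_cluster_connections(
    clusters: List[Dict[str, Any]],
    min_connection_strength: float = 0.4,
) -> List[Dict[str, Any]]:
    """Sketch: connect cluster pairs that share core tokens strongly enough."""
    connections = []
    for a, b in combinations(clusters, 2):
        shared = set(a['core_tokens']) & set(b['core_tokens'])
        if not shared:
            continue
        # Assumed strength measure: overlap relative to the smaller cluster
        strength = len(shared) / min(len(a['core_tokens']), len(b['core_tokens']))
        if strength >= min_connection_strength:
            connections.append({
                'connection_id': f"{a['cluster_id']}-{b['cluster_id']}",
                'contexts': [a['cluster_id'], b['cluster_id']],
                'shared_patterns': sorted(shared),
                'strength': round(strength, 3),
            })
    return connections
```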

process_storage

def process_storage(self, action_results: Dict[str, Any]) -> Dict[str, Any]

Main method that:

  1. Creates context clusters

  2. Builds pattern distribution

  3. Identifies context relationships

  4. Returns formatted structure for LLM request
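The four steps above can be sketched end to end as a standalone function. Everything beyond the documented input and output formats is an assumption: the per-item `token`/`strength` fields, the averaging, and the overlap-based relationship strength:

```python
import time
import uuid
from datetime import datetime, timezone
from typing import Any, Dict

def process_storage(action_results: Dict[str, Any]) -> Dict[str, Any]:
    """Sketch: clusters -> relationships -> formatted LLM request."""
    start = time.perf_counter()

    # 1-2. Create one context cluster per analysis type ('action' / 'non_action')
    contexts = []
    for ctx_type, data in action_results['action_analysis'].items():
        items = data.get('items', [])
        strengths = [i.get('strength', 0.0) for i in items]
        contexts.append({
            'cluster_id': f'ctx_{uuid.uuid4().hex[:8]}',
            'type': ctx_type,
            'core_tokens': [i['token'] for i in items],
            'context_strength': sum(strengths) / len(strengths) if strengths else 0.0,
            'pattern_count': len(items),
        })

    # 3. Relate context pairs that share core tokens (simplified)
    relationships = []
    for i, a in enumerate(contexts):
        for b in contexts[i + 1:]:
            shared = sorted(set(a['core_tokens']) & set(b['core_tokens']))
            if shared:
                relationships.append({
                    'connection_id': f"{a['cluster_id']}-{b['cluster_id']}",
                    'contexts': [a['cluster_id'], b['cluster_id']],
                    'shared_patterns': shared,
                    'strength': len(shared) / min(a['pattern_count'], b['pattern_count']),
                })

    # 4. Assemble the documented output structure
    return {
        'request_id': f'req_{uuid.uuid4().hex[:8]}',
        'contexts': contexts,
        'relationships': relationships,
        'metadata': {
            'context_count': len(contexts),
            'relationship_count': len(relationships),
            'processing_time': time.perf_counter() - start,
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'source_batch': action_results.get('summary', {}).get('batch_id', ''),
        },
    }
```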

Data Structures

Input Format

action_results = {
    'action_analysis': {
        'action': {
            'items': [...],
            'metadata': {...}
        },
        'non_action': {
            'items': [...],
            'metadata': {...}
        }
    },
    'summary': {
        'batch_id': str
    }
}

Output Format

{
    'request_id': str,
    'contexts': [
        {
            'cluster_id': str,
            'type': str,
            'core_tokens': List[str],
            'context_strength': float,
            'pattern_count': int
        }
    ],
    'relationships': [
        {
            'connection_id': str,
            'contexts': List[str],
            'shared_patterns': List[str],
            'strength': float
        }
    ],
    'metadata': {
        'context_count': int,
        'relationship_count': int,
        'processing_time': float,
        'timestamp': str,
        'source_batch': str
    }
}

Usage

# Initialize processor
processor = StorageClusterProcessor(min_connection_strength=0.4)

# Format data for LLM request
inference_request = processor.process_storage(action_results)

Implementation Notes

  1. Request Structure:

    • Organized by context type

    • Maintains pattern relationships

    • Includes relevant metadata

  2. Pattern Processing:

    • Aggregates patterns by context

    • Identifies core tokens

    • Creates context distribution

  3. Relationship Identification:

    • Based on shared patterns

    • Strength threshold filtering

    • Context relationship mapping
