TrainingDataManager.swift
TrainingDataManager Documentation
Overview
TrainingDataManager.swift provides a centralized system for managing, storing, and retrieving training data used in the machine learning pipeline. It persists entries to disk and offers methods for adding, updating, removing, and filtering them.
Core Components
TrainingData Structure
Properties
id (UUID): Unique identifier for the training data
text (String): The content to be processed
category (String?): Optional category label
timestamp (Date): When the data was created
metadata ([String: Any]?): Optional additional information (not persisted)
isProcessed (Bool): Processing status flag
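As a rough illustration, the properties listed above could be declared as follows. This is a minimal sketch: the choice of let versus var and the default values in the initializer are assumptions, not taken from the actual source file.

```swift
import Foundation

// Sketch of the TrainingData structure based on the documented properties.
// Mutability and initializer defaults are assumptions.
struct TrainingData {
    let id: UUID                    // unique identifier
    var text: String                // content to be processed
    var category: String?           // optional category label
    let timestamp: Date             // creation time
    var metadata: [String: Any]?    // extra information, not persisted
    var isProcessed: Bool           // processing status flag

    init(id: UUID = UUID(),
         text: String,
         category: String? = nil,
         timestamp: Date = Date(),
         metadata: [String: Any]? = nil,
         isProcessed: Bool = false) {
        self.id = id
        self.text = text
        self.category = category
        self.timestamp = timestamp
        self.metadata = metadata
        self.isProcessed = isProcessed
    }
}

// A new entry starts unprocessed and without metadata under these defaults.
let sample = TrainingData(text: "Great product", category: "positive")
```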
Primary Features
Data Management Methods
These methods handle the core CRUD operations:
add: Adds a single training data entry
addBatch: Adds multiple training data entries
remove: Removes an entry by ID
update: Updates an existing entry
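The four operations above might be implemented over an in-memory array along these lines. The parameter labels, the Bool return values, and the trimmed-down TrainingData struct are assumptions made to keep the sketch self-contained; the real manager may differ.

```swift
import Foundation

// Trimmed stand-in for TrainingData (only the fields this sketch needs).
struct TrainingData {
    var id = UUID()
    var text: String
    var category: String?
    var isProcessed = false
}

final class TrainingDataManager {
    private(set) var entries: [TrainingData] = []

    // Adds a single entry.
    func add(_ item: TrainingData) {
        entries.append(item)
    }

    // Adds several entries in one call.
    func addBatch(_ items: [TrainingData]) {
        entries.append(contentsOf: items)
    }

    // Removes the entry with the given ID; returns false if none matched.
    @discardableResult
    func remove(id: UUID) -> Bool {
        guard let index = entries.firstIndex(where: { $0.id == id }) else { return false }
        entries.remove(at: index)
        return true
    }

    // Replaces the stored entry that has the same ID; returns false if none matched.
    @discardableResult
    func update(_ item: TrainingData) -> Bool {
        guard let index = entries.firstIndex(where: { $0.id == item.id }) else { return false }
        entries[index] = item
        return true
    }
}

let manager = TrainingDataManager()
var first = TrainingData(text: "hello")
manager.add(first)
manager.addBatch([TrainingData(text: "a"), TrainingData(text: "b")])

first.isProcessed = true
let didUpdate = manager.update(first)
let didRemove = manager.remove(id: first.id)
```

Returning a Bool from remove and update lets callers notice an unknown ID without throwing; a throwing variant would be an equally reasonable design.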
Data Retrieval Methods
Provides filtered access to the data:
getAllTrainingData: Returns all stored entries
getUnprocessedData: Returns only unprocessed entries
getDataByCategory: Returns entries matching a specific category
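One plausible way to express these three accessors is as simple filters over the stored array. The TrainingDataStore class below is a hypothetical stand-in for the manager, and the signatures are assumed.

```swift
import Foundation

// Trimmed stand-in for TrainingData (only the fields this sketch needs).
struct TrainingData {
    var id = UUID()
    var text: String
    var category: String?
    var isProcessed = false
}

// Hypothetical read-only view of the manager's storage.
final class TrainingDataStore {
    private let entries: [TrainingData]

    init(entries: [TrainingData]) {
        self.entries = entries
    }

    // Every stored entry, in insertion order.
    func getAllTrainingData() -> [TrainingData] {
        entries
    }

    // Entries whose isProcessed flag is still false.
    func getUnprocessedData() -> [TrainingData] {
        entries.filter { !$0.isProcessed }
    }

    // Entries whose category equals the argument; entries without a category never match.
    func getDataByCategory(_ category: String) -> [TrainingData] {
        entries.filter { $0.category == category }
    }
}

let store = TrainingDataStore(entries: [
    TrainingData(text: "Loved it", category: "positive", isProcessed: true),
    TrainingData(text: "Not for me", category: "negative"),
    TrainingData(text: "No label yet")
])
let pending = store.getUnprocessedData()
let positives = store.getDataByCategory("positive")
```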
Data Persistence
Handles saving and loading data:
save: Persists current data to a JSON file
load: Loads data from the saved JSON file
getStorageURL: Manages storage location in app documents directory
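A minimal sketch of the persistence path, using JSONEncoder and JSONDecoder. The file name training_data.json, the free-function form, and the explicit URL parameters are assumptions for illustration; the demo writes to a temporary directory rather than the documents directory so that it can run anywhere.

```swift
import Foundation

// Trimmed, fully Codable stand-in for TrainingData (metadata omitted).
struct TrainingData: Codable {
    var id = UUID()
    var text: String
    var category: String?
    var timestamp = Date()
    var isProcessed = false
}

// Resolves the JSON file inside the app's documents directory.
// The file name is a hypothetical placeholder.
func getStorageURL() throws -> URL {
    let documents = try FileManager.default.url(for: .documentDirectory,
                                                in: .userDomainMask,
                                                appropriateFor: nil,
                                                create: true)
    return documents.appendingPathComponent("training_data.json")
}

// Encodes all entries as JSON and writes them atomically.
func save(_ entries: [TrainingData], to url: URL) throws {
    let data = try JSONEncoder().encode(entries)
    try data.write(to: url, options: .atomic)
}

// Reads the entries back; a missing file is treated as an empty data set.
func load(from url: URL) throws -> [TrainingData] {
    guard FileManager.default.fileExists(atPath: url.path) else { return [] }
    let data = try Data(contentsOf: url)
    return try JSONDecoder().decode([TrainingData].self, from: data)
}

// Round trip through a temporary file.
let demoURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("training_data_demo.json")
do {
    try save([TrainingData(text: "hello", category: "greeting")], to: demoURL)
    let restored = try load(from: demoURL)
    print("Restored \(restored.count) entries")
} catch {
    print("Persistence failed: \(error)")
}
```

In an app, the URL would typically come from getStorageURL() instead of the temporary directory.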
Codable Implementation
The TrainingData structure implements Codable with custom encoding/decoding:
Excludes metadata from persistence (since [String: Any] isn't Codable)
Handles encoding/decoding of all other properties
Uses custom CodingKeys for property mapping
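The approach described above can be sketched as follows: CodingKeys lists every property except metadata, and the hand-written init(from:) and encode(to:) skip it. The key names and the choice to reset metadata to nil on decode are assumptions.

```swift
import Foundation

struct TrainingData: Codable {
    let id: UUID
    var text: String
    var category: String?
    let timestamp: Date
    var metadata: [String: Any]?   // excluded from encoding and decoding
    var isProcessed: Bool

    // metadata is deliberately absent, so it never reaches the JSON output.
    enum CodingKeys: String, CodingKey {
        case id, text, category, timestamp, isProcessed
    }

    init(id: UUID = UUID(), text: String, category: String? = nil,
         timestamp: Date = Date(), metadata: [String: Any]? = nil,
         isProcessed: Bool = false) {
        self.id = id
        self.text = text
        self.category = category
        self.timestamp = timestamp
        self.metadata = metadata
        self.isProcessed = isProcessed
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        id = try container.decode(UUID.self, forKey: .id)
        text = try container.decode(String.self, forKey: .text)
        category = try container.decodeIfPresent(String.self, forKey: .category)
        timestamp = try container.decode(Date.self, forKey: .timestamp)
        isProcessed = try container.decode(Bool.self, forKey: .isProcessed)
        metadata = nil   // nothing was persisted, so there is nothing to restore
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(id, forKey: .id)
        try container.encode(text, forKey: .text)
        try container.encodeIfPresent(category, forKey: .category)
        try container.encode(timestamp, forKey: .timestamp)
        try container.encode(isProcessed, forKey: .isProcessed)
    }
}

// Metadata exists in memory but is dropped by the encode/decode round trip.
let original = TrainingData(text: "hi", metadata: ["source": "import"])
let json = try! JSONEncoder().encode(original)
let decoded = try! JSONDecoder().decode(TrainingData.self, from: json)
```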
Example Usage
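The following is an illustrative end-to-end flow using the method names documented above. The compact in-memory TrainingDataManager and the trimmed TrainingData struct are stand-ins so that the example runs on its own; the real signatures may differ.

```swift
import Foundation

// Stand-in types so the example is self-contained (assumed shapes).
struct TrainingData {
    var id = UUID()
    var text: String
    var category: String?
    var isProcessed = false
}

final class TrainingDataManager {
    private var entries: [TrainingData] = []

    func addBatch(_ items: [TrainingData]) {
        entries.append(contentsOf: items)
    }

    func update(_ item: TrainingData) {
        if let index = entries.firstIndex(where: { $0.id == item.id }) {
            entries[index] = item
        }
    }

    func getUnprocessedData() -> [TrainingData] {
        entries.filter { !$0.isProcessed }
    }

    func getDataByCategory(_ category: String) -> [TrainingData] {
        entries.filter { $0.category == category }
    }
}

let manager = TrainingDataManager()

// Insert several entries with one batch call.
manager.addBatch([
    TrainingData(text: "Loved it", category: "positive"),
    TrainingData(text: "Not for me", category: "negative"),
    TrainingData(text: "Would buy again", category: "positive")
])

// Work through everything that is still pending and write the flag back.
for var item in manager.getUnprocessedData() {
    // Hand item.text to the training pipeline here.
    item.isProcessed = true
    manager.update(item)
}

print("Pending entries:", manager.getUnprocessedData().count)
print("Positive samples:", manager.getDataByCategory("positive").map(\.text))
```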
Best Practices
Data Management
Always use unique IDs for entries
Check processing status before operations
Handle metadata separately from persistence
Error Handling
Implement proper error handling for save/load operations
Validate data before adding/updating
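As one possible reading of the validation advice, an entry's text could be checked before it is added or updated. TrainingDataError and validatedText are hypothetical names introduced only for this sketch.

```swift
import Foundation

// Hypothetical error type for rejected entries.
enum TrainingDataError: Error {
    case emptyText
}

// Trims surrounding whitespace and rejects text that ends up empty.
func validatedText(_ text: String) throws -> String {
    let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty else { throw TrainingDataError.emptyText }
    return trimmed
}

// Callers handle the failure explicitly instead of storing unusable data.
do {
    let text = try validatedText("   ")
    print("Accepted: \(text)")
} catch {
    print("Rejected entry: \(error)")
}
```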
Performance
Use batch operations for multiple entries
Consider memory usage with large datasets
Related Components
MLTrainingService: Uses this manager for training data
ModelPersistenceManager: Works alongside for model storage
TextEmbeddingsService: Processes the training data